
Unpublished Exercises

Statistical Mechanics: Entropy, Order Parameters, and Complexity

James P. Sethna
Laboratory of Atomic and Solid State Physics, Cornell University, Ithaca, NY 14853-2501

The author provides this version of this manuscript with the primary intention of making the text accessible electronically—through web searches and for browsing and study on computers. Oxford University Press retains ownership of the copyright. Hard-copy printing, in particular, is subject to the same copyright rules as they would be for a printed book.

CLARENDON PRESS · OXFORD
2017

Copyright James P. Sethna 2011

Contents

Contents
List of figures
Chapter 1: What is statistical mechanics?
Exercises
1.1 Emergence
1.2 The Birthday Problem
1.3 Quantum dice
1.4 Stirling's formula
1.5 Waiting time ensembles
Chapter 2: Random walks and emergent properties.
Exercises
2.1 Random walks in grade space
2.2 Modified random walk
2.3 Photon diffusion in the Sun
2.4 Modified diffusion
2.5 Density dependent diffusion
2.6 Local conservation
2.7 Absorbing boundary conditions
2.8 Random walks, generating functions, and diffusion
2.9 Continuous time random walks: Ballistic to diffusive transition
Chapter 3: Temperature and equilibrium
Exercises
3.1 Weirdness in high dimensions
3.2 Hard sphere gas
3.3 Entropy maximization and temperature
3.4 Large and very large numbers
3.5 Maxwell relations
3.6 Triple product relation
3.7 What is the chemical potential?
3.8 Undistinguished particles
3.9 Random energy model
Chapter 4: Phase-space dynamics and ergodicity
Exercises
4.1 Liouville vs. the damped pendulum
4.2 No attractors in Hamiltonian systems
4.3 Perverse initial conditions
4.4 Crooks: Exact results in non-equilibrium statistical mechanics
4.5 Jarzynski: Exact results in non-equilibrium statistical mechanics
4.6 The Arnol'd cat map, Expanded
4.7 Jupiter! and the KAM theorem
4.8 2D turbulence and Jupiter's Great Red Spot
Chapter 5: Entropy
Exercises
5.1 The Dyson sphere
5.2 Undistinguished particles and the Gibbs factor
5.3 How many shuffles?
5.4 Entropy of socks
5.5 Rubber band
5.6 Loaded dice and sticky spheres
5.7 Gravity and entropy
5.8 Aging, entropy, and DNA
5.9 Entropy of the galaxy
5.10 Nucleosynthesis and the arrow of time
Chapter 6: Free energies
Exercises
6.1 Pollen and hard squares
6.2 Rubber band Free Energy
6.3 Rubber band: Illustrating the formalism
6.4 Molecular motors and free energies
6.5 Langevin dynamics
6.6 Newton's theory of sound
6.7 Nucleosynthesis as a chemical reaction
Chapter 7: Quantum statistical mechanics
Exercises
7.1 Eigenstate thermalization
7.2 Does entropy increase in quantum systems?
7.3 Bosons are gregarious: superfluids and lasers
7.4 Is sound a quasiparticle?
7.5 Drawing wavefunctions
7.6 Nonsymmetric eigenstates
7.7 Universe of light baryons
7.8 Many-fermion wavefunction nodes
7.9 The Greenhouse effect and cooling coffee
Chapter 8: Calculation and computation
Exercises
8.1 Solving the Ising model with parallel updates
8.2 Detailed balance
8.3 Ising model cluster expansion
8.4 Low-temperature expansion of Ising model
8.5 Fruit flies, Markov matrices, and free energies
Chapter 9: Order parameters, broken symmetry, and topology
Exercises
9.1 Nematic order parameter half space
9.2 Ising order parameter
9.3 Pentagonal order parameter
9.4 Superfluids and Second Sound
9.5 No Heisenberg line defect
Chapter 10: Correlations, response, and dissipation
Exercises
10.1 Human correlations
10.2 Liquid free energy
10.3 Onsager Regression Hypothesis and the Harmonic Oscillator
10.4 Susceptibility and dissipation for the Harmonic Oscillator
10.5 Liquid susceptibility and dissipation
10.6 Fluctuation Dissipation for the Harmonic Oscillator
10.7 Static susceptibility and the fluctuation–dissipation theorem for liquids
10.8 Kramers–Kronig for the Harmonic Oscillator
10.9 Kramers–Kronig for liquids
10.10 Subway Bench Correlations
10.11 Gibbs free energy barrier
10.12 Susceptibilities and Correlations
Chapter 11: Abrupt phase transitions
Exercises
11.1 What is it unstable to?
11.2 Maxwell and van der Waals
11.3 Droplet Nucleation in 2D
11.4 Coarsening in the Ising model
11.5 Linear stability of a growing interface
11.6 Fracture nucleation: elastic theory has zero radius of convergence
Chapter 12: Continuous phase transitions
Exercises
12.1 Renormalization-group trajectories
12.2 Singular corrections to scaling
12.3 Nonlinear flows, analytic corrections, and hyperscaling
12.4 Beyond power laws: Nonlinear flows and logarithms in the 2D Ising model
12.5 The Gutenberg–Richter law
12.6 Period Doubling
12.7 Random Walks
12.8 Hysteresis and Barkhausen Noise
12.9 First to fail: Weibull
12.10 Biggest of bunch: Gumbel
12.11 Extreme value statistics: Gumbel, Weibull, and Frechet
12.12 Diffusion equation and universal scaling functions
12.13 Avalanche Size Distribution
12.14 Conformal Invariance
3 Advanced Statistical Mechanics
Exercises
3.1 A fair split? Number partitioning
3.2 Cardiac dynamics
3.3 Quantum dissipation from phonons
3.4 Variational mean-field derivation
3.5 Ising Lower Critical Dimension
3.6 XY Lower Critical Dimension and the Mermin–Wagner Theorem
3.7 Long-range Ising
3.8 Equilibrium Crystal Shapes
4 Numerical Methods
Exercises
4.1 Condition Number and Accuracy
4.2 Sherman–Morrison formula
4.3 Methods of interpolation
4.4 Numerical definite integrals
4.5 Numerical derivatives
4.6 Summing series
4.7 Random histograms
4.8 Monte Carlo integration
4.9 The Birthday Problem
4.10 Washboard Potential
4.11 Sloppy Minimization
4.12 Sloppy Monomials
4.13 Conservative differential equations: Accuracy and fidelity
5 Quantum
Exercises
5.1 Quantum Notation
5.2 Light Proton Atomic Size
5.3 Light Proton Tunneling
5.4 Light Proton Superfluid
5.5 Aharonov–Bohm Wire
5.6 Anyons
5.7 Bell
5.8 Parallel Transport, Frustration, and the Blue Phase
5.9 Crystal field theory: d-orbitals
5.10 Entangled Spins
5.11 Rotating Fermions
5.12 Lithium ground state symmetry
5.13 Matrices, wavefunctions, and group representations
5.14 Molecular Rotations
5.15 Propagators to Path Integrals
5.16 Three particles in a box
5.17 Rotation Matrices
5.18 Trace
5.19 Complex Exponentials
5.20 Dirac δ-functions
5.21 Eigen Stuff

List of figures

1.1 Quantum dice
3.2 Hard sphere gas
3.3 Excluded area around a hard disk
3.4 Hard spheres vs. ideal gas
4.5 Total derivatives
4.6 Crooks fluctuation theorem: evolution in time
4.7 Crooks fluctuation theorem: evolution backward in time
4.8 One particle in a piston
4.9 Collision phase diagram
4.10 Arnol'd cat map
4.11 Evolution of an initially concentrated phase-space density
4.12 The Earth's trajectory
4.13 Torus
4.14 The Poincaré section
4.15 Jupiter's Red Spot
4.16 Circles
5.17 Rubber band
5.18 The Universe as a piston
5.19 PV diagram for the universe
6.20 Square pollen
6.21 RNA polymerase molecular motor
6.22 Hairpins in RNA
6.23 Telegraph noise in RNA unzipping
8.24 Fruit fly behavior map
9.25 Projective Plane
9.26 Frustrated soccer ball
9.27 Cutting and sewing: disclinations
9.28 Sphere
9.29 Escape in the third dimension
10.30 Pulling
10.31 Hitting
11.32 P–V plot: van der Waals
11.33 Stretched block
11.34 Fractured block
11.35 Critical crack
11.36 Contour integral in complex pressure
12.37 Scaling in the period doubling bifurcation diagram
12.38 Random walk scaling
12.39 Two proteins in a critical membrane
12.40 Ising model under conformal rescaling
3.1 Atomic tunneling from a tip
4.1 Interpolation methods
5.1 Atom tunneling
5.2 Atoms with imaginary box of size equal to average space per atom
5.3 Feynman diagram: identical particles
5.4 Braiding of paths in two dimensions
5.5 Blue Phase Molecules
5.6 Twisted Parallel Transport
5.7 Local structures
5.8 Parallel transport frustration
5.9 Orange-peel carpet
5.10 Level diagram


Chapter 1: What is statistical mechanics?

Exercises

(1.1) Emergence. ©p
We begin with the broad statement "Statistical mechanics explains the simple behavior of complex systems. New laws emerge from bewildering interactions of constituents." Which of the following are clearly not an example of this kind of emergent behavior?
(a) The emergence of the wave equation from the collisions of atmospheric molecules,
(b) The emergence of Newtonian gravity from Einstein's general theory,
(c) The emergence of random stock price fluctuations from the behavior of traders,
(d) The emergence of a power-law distribution of earthquake sizes from the response of rubble in earthquake faults to external stresses.

(1.2) The Birthday Problem. (Sorting, Random numbers) ©2
Remember birthday parties in your elementary school? Remember those years when two kids had the same birthday? How unlikely!
How many kids would you need in class to get, more than half of the time, at least two with the same birthday?
(a) Numerical. Write BirthdayCoincidences(K, C), a routine that returns the fraction among C classes for which at least two kids (among K kids per class) have the same birthday. (Hint: By sorting a random list of integers, common birthdays will be adjacent.) Plot this probability versus K for a reasonably large value of C. Is it a surprise that your classes had overlapping birthdays when you were young?
One can intuitively understand this by remembering that to avoid a coincidence there are K(K − 1)/2 pairs of kids, all of whom must have different birthdays (probability 364/365 = 1 − 1/D, with D days per year):

P(K, D) ≈ (1 − 1/D)^{K(K−1)/2}   (1.1)

This is clearly a crude approximation – it doesn't vanish if K > D! Ignoring subtle correlations, though, it gives us a net probability

P(K, D) ≈ exp(−1/D)^{K(K−1)/2} ≈ exp(−K²/(2D))   (1.2)

Here we've used the fact that 1 − ε ≈ exp(−ε), and assumed that K/D is small.
(b) Analytical. Write the exact formula giving the probability, for K random integers among D choices, that no two kids have the same birthday. (Hint: What is the probability that the second kid has a different birthday from the first? The third kid has a different birthday from the first two?) Show that your formula does give zero if K > D. Converting the terms in your product to exponentials as we did above, show that your answer is consistent with the simple formula above, if K ≪ D. Inverting eqn 1.2, give a formula for the number of kids needed to have a 50% chance of a shared birthday.
Some years ago, we were doing a large simulation, involving sorting a lattice of 1000³ random fields (roughly, to figure out which site on the lattice would trigger first). If we want to make sure that our code is unbiased, we want different random fields on each lattice site – a giant birthday problem.
Old-style random number generators generated a random integer (2³² 'days in the year') and then divided by the maximum possible integer to get a random number between zero and one. Modern random number generators generate all 2⁵² possible doubles between zero and one.
(c) If there are 2³¹ − 1 distinct four-byte positive integers, how many random numbers would one have to generate before one would expect coincidences half the time? Generate lists of that length, and check your assertion. (Hints: It is faster to use array operations, especially in interpreted languages. I generated a random array with N entries, sorted it, subtracted the first N − 1 entries from the last N − 1, and then called min on the array.) Will we have to worry about coincidences with an old-style random number generator? How large a lattice L × L × L of random double precision numbers can one generate with modern generators before having a 50% chance of a coincidence? If you have a fast machine with a large memory, you might test this too.
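For part (a), here is one minimal Python sketch (NumPy assumed; the class count C = 2000 and the grid of K values are arbitrary illustrative choices, not from the exercise):

import numpy as np

def BirthdayCoincidences(K, C, D=365):
    """Fraction of C classes, each with K kids, in which at least two kids share a birthday."""
    shared = 0
    for _ in range(C):
        birthdays = np.sort(np.random.randint(0, D, size=K))
        if np.any(np.diff(birthdays) == 0):   # after sorting, shared birthdays are adjacent
            shared += 1
    return shared / C

for K in (10, 20, 23, 30, 40):                # probability of a shared birthday versus class size
    print(K, BirthdayCoincidences(K, 2000))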

(1.3) Quantum dice.¹ (Quantum) ©i
You are given two unusual 'three-sided' dice which, when rolled, show either one, two, or three spots. There are three games played with these dice: Distinguishable, Bosons, and Fermions. In each turn in these games, the player rolls the two dice, starting over if required by the rules, until a legal combination occurs. In Distinguishable, all rolls are legal. In Bosons, a roll is legal only if the second of the two dice shows a number that is larger than or equal to that of the first of the two dice. In Fermions, a roll is legal only if the second number is strictly larger than the preceding number. See Fig. 1.1 for a table of possibilities after rolling two dice.
Our dice rules are the same ones that govern the quantum statistics of identical particles.

[Fig. 1.1: a 3 × 3 table of the sum shown by the two dice, with Roll #1 along one axis and Roll #2 along the other; the entries run from 2 (both dice showing one spot) to 6 (both showing three).]
Fig. 1.1 Quantum dice. Rolling two dice. In Bosons, one accepts only the rolls in the shaded squares, with equal probability 1/6. In Fermions, one accepts only the rolls in the darkly-shaded squares (not including the diagonal from lower left to upper right), with probability 1/3.

(a) Presume the dice are fair: each of the three numbers of dots shows up 1/3 of the time. For a legal turn rolling a die twice in the three games (Distinguishable, Bosons, and Fermions), what is the probability ρ(4) of rolling a 4?
(b) For a legal turn in the three games, what is the probability of rolling a double? (Hint: There is a Fermi exclusion principle: when playing Fermions, no two dice can have the same number of dots showing.) Electrons are fermions; no two non-interacting electrons can be in the same quantum state. Bosons are gregarious (Exercise 7.9); non-interacting bosons have a larger likelihood of being in the same state.
Let us decrease the number of sides on our dice to N = 2, making them quantum coins, with a head H and a tail T. Let us increase the total number of coins to a large number M. In the rules for legal flips of quantum coins, let us make T < H. A legal Boson sequence, for example, is then a pattern TTTT···HHHH··· of length M; all legal sequences have the same probability.
(c) What is the probability that all the M flips of our quantum coin give the same value (HHHH··· or TTTT···), in the three games? (Hint: How many legal sequences are there for the three games? How many of these are all the same value?)
Let us now consider a biased coin, with probability p = 1/3 of landing H and thus 1 − p = 2/3 of landing T.
(d) What is the probability that all the M flips of our biased quantum coin give tails, in Distinguishable? What is the relative probability² in Bosons of getting M tails versus getting M − 1 tails? Of getting M − m tails in Bosons? As M gets large, what is the probability in Bosons that all coins flip tails?
We shall learn in Section (7.6.3) that the probability of a state in thermal equilibrium depends on its energy. We can view our quantum dice and coins as non-interacting particles, with the biased coin having a lower energy for T than for H. Having a non-zero probability of having all the bosons in the single-particle ground state T is Bose condensation, closely related to superfluidity.
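A brute-force Python sketch of the three games, useful for checking parts (a) and (b) (the sample size is an arbitrary choice):

import random

def roll_legal(game):
    """Roll two fair three-sided dice, starting over until the combination is legal."""
    while True:
        a, b = random.randint(1, 3), random.randint(1, 3)
        if (game == "Distinguishable"
                or (game == "Bosons" and b >= a)
                or (game == "Fermions" and b > a)):
            return a, b

N = 100000
for game in ("Distinguishable", "Bosons", "Fermions"):
    rolls = [roll_legal(game) for _ in range(N)]
    rho4 = sum(a + b == 4 for a, b in rolls) / N
    doubles = sum(a == b for a, b in rolls) / N
    print(game, "rho(4) =", round(rho4, 3), "P(double) =", round(doubles, 3))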

¹The original form of this exercise was developed in collaboration with Sarah Shandera.
²The probability of finding a particular legal sequence in Bosons is larger by a constant factor due to discarding the illegal sequences. This factor is just one over the probability of a given toss of the coins being legal, Z = Σ_α p_α over legal Boson sequences α. This normalization constant is called the partition function, and is surprisingly central to statistical mechanics.

(1.4) Stirling's formula. (Mathematics) ©i
Stirling's approximation, n! ∼ √(2πn)(n/e)ⁿ, is remarkably useful in statistical mechanics; it gives an excellent approximation for large n. In statistical mechanics the number of particles is so large that we usually care not about n!, but about its logarithm, so log(n!) ∼ n log n − n + ½ log(2πn). Finally, n is often so large that the final term is a tiny fractional correction to the others, giving the simple formula log(n!) ∼ n log n − n.
(a) Calculate log(n!) and these two approximations to it for n = 2, 4, and 8.
Clearly log(n!) = log(1 · 2 · 3 · · · n) = log(1) + log(2) + · · · + log(n) = Σ_{m=1}^{n} log(m).
(b) Convert the sum to an integral, Σ_{m=1}^{n} ≈ ∫₀ⁿ dm. Derive the simple form of Stirling's formula.
(c) Draw a plot of log(m) and a bar chart showing log(ceiling(m)). (Here ceiling(x) represents the smallest integer larger than x.) Argue that the integral under the bar chart is log(n!).
The difference between the sum and the integral in part (c) should look approximately like a collection of triangles, except for the region between zero and one. The sum of the areas equals the error in the simple form for Stirling's formula.
(d) Imagine doubling these triangles into rectangles on your drawing from part (c), and sliding them sideways (ignoring the error for m between zero and one). Explain how this relates to the term ½ log n in Stirling's formula log(n!) − (n log n − n) ≈ ½ log(2πn) = ½ log(2) + ½ log(π) + ½ log(n).
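A short Python check of part (a), comparing log(n!) with the two approximations:

from math import lgamma, log, pi

for n in (2, 4, 8):
    exact = lgamma(n + 1)                            # log(n!)
    full = n * log(n) - n + 0.5 * log(2 * pi * n)    # Stirling with the (1/2) log(2 pi n) term
    simple = n * log(n) - n                          # simple form
    print(n, round(exact, 4), round(full, 4), round(simple, 4))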

(1.5) Waiting time ensembles.³ (Mathematics) ©i
On a highway, the average numbers of cars and buses going east are equal: each hour, on average, there are 12 buses and 12 cars passing by. The buses are scheduled: each bus appears exactly 5 minutes after the previous one. On the other hand, the cars appear at random: in a short interval dt, the probability that a car comes by is dt/τ, with τ = 5 minutes.
A pedestrian repeatedly approaches a bus stop at random times t, and notes how long it takes before the first bus passes, and before the first car passes.
(a) Draw the probability density for the ensemble of waiting times δ₊^Bus to the next bus observed by the pedestrian. Draw the density for the corresponding ensemble⁴ of times δ₊^Car. What is the mean waiting time for a bus 〈δ₊^Bus〉_t? The mean time 〈δ₊^Car〉_t for a car?
In statistical mechanics, we shall describe specific physical systems (a bottle of N atoms with energy E) by considering ensembles of systems. Sometimes we shall use two different ensembles to describe the same system (all bottles of N atoms with energy E, or all bottles of N atoms at that temperature T where the mean energy is E). We have been looking at the time-averaged ensemble (the ensemble 〈· · ·〉_t over random times t). There is also in this problem an ensemble average over the gaps between vehicles (〈· · ·〉_gap over random time intervals); these two give different averages for the same quantity.
A traffic engineer sits at the bus stop, and measures an ensemble of time gaps ∆^Bus between neighboring buses, and an ensemble of gaps ∆^Car between neighboring cars.
(b) Draw the probability density of gaps she observes between buses. Draw the probability density of gaps between cars. (Hint: Is it different from the ensemble of car waiting times you found in part (a)? Why not?) What is the mean gap time 〈∆^Bus〉_gap for the buses? What is the mean gap time 〈∆^Car〉_gap for the cars?
You should find that the mean waiting time for a bus is half the mean bus gap time, but that the mean waiting time for a car equals the mean car gap time.
The traffic engineer now picks random times t, and notes the gap ∆^Car given by the sum δ₋^Car + δ₊^Car of the times since the last car and to the next car. She measures 〈∆^Car〉_t. Clearly this must be twice the mean waiting time for a car. How can the average gap between cars change by a factor of two depending on how you average it? The two averages are in different ensembles.
(c) Consider a short experiment, with three cars passing at times t = 0, 2, and 8 (so there are two gaps, of length 2 and 6). What is 〈∆^Car〉_gap? What is 〈∆^Car〉_t? Explain why they are different.
When we change between ensembles in statistical mechanics, we will need to check carefully that the properties we calculate are the same in the different ensembles, as we take the number of particles to be large.

³The original form of this exercise was developed in collaboration with Piet Brouwer.
⁴You may be familiar with the car probability density from radioactive decay, or from the Poisson distribution. If not, you can derive it by writing a differential equation for S(t), the survival probability that a car has not yet arrived at time t. What is P(t) in terms of S(t)?
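A direct Python simulation of the two waiting-time ensembles in part (a) (NumPy assumed; the total simulated time and sample size are arbitrary choices):

import numpy as np

tau, T = 5.0, 1.0e6                     # mean gap (minutes) and total simulated time
rng = np.random.default_rng(0)

bus_times = np.arange(0.0, T, tau)                                   # scheduled: one bus every 5 minutes
car_times = np.cumsum(rng.exponential(tau, size=int(2 * T / tau)))   # random arrivals at rate 1/tau

t = rng.uniform(0.0, T - 100.0, size=100000)              # pedestrian arrives at random times
bus_wait = bus_times[np.searchsorted(bus_times, t)] - t   # time to the next bus
car_wait = car_times[np.searchsorted(car_times, t)] - t   # time to the next car

print("mean wait for a bus:", bus_wait.mean())            # compare with your <delta_+^Bus>_t
print("mean wait for a car:", car_wait.mean())            # compare with your <delta_+^Car>_t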


Chapter 2: Random walks and emergent properties.

Exercises

(2.1) Random walks in grade space. ©p
Many students complain about multiple-choice exams, saying that it is easy to get a bad grade just by being unlucky in what questions get asked. While easy to grade and quite unbiased, are they good measures of knowledge and skill?
A course is graded using multiple choice exams, with ten points for each problem. A particular student has a probability 0.7 of getting each question correct.
(a) Generalize the coin-flip random walk discussion (eqns 2.1–2.5) to calculate the student's mean score 〈s_N〉, the mean square score 〈s_N²〉, and the standard deviation σ_N = √(〈(s_N − 〈s_N〉)²〉) after N questions. (Note that 〈s_{N−1} ℓ_n〉 ≠ 0 in this case.)
Measuring instruments in physics are often characterized by the signal-to-noise ratio. One can view the random choice of questions as a source of noise, and your calculation in part (a) as an estimate of that noise.
(b) An introductory engineering physics exam with ten ten-point multiple choice questions has a class mean of 70 and a standard deviation of 15. Is it a good test, comparing the noise to the total range?
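A quick Python check of part (a) for the specific case of N = 10 questions at 10 points each (NumPy assumed):

import numpy as np

N, p, points = 10, 0.7, 10
rng = np.random.default_rng(1)

scores = points * rng.binomial(N, p, size=100000)   # total scores of many simulated students
print("mean score        :", scores.mean())         # compare with your formula for <s_N>
print("standard deviation:", scores.std())          # compare with your formula for sigma_N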

(2.2) Modified random walk. ©i
Random walks with a constant drift term will have a net correlation between steps. This problem can be reduced to the problem without drift by shifting to a 'moving reference frame'. In particular, suppose we have a random walk with steps independently drawn from a uniform density ρ(ℓ) on [0, 1), but with a non-zero mean 〈ℓ〉 = ℓ̄ ≠ 0. Argue that the sums s′_N = Σ_{n=1}^{N} (ℓ_n − ℓ̄) describe random walks in a moving reference frame, with zero mean. Argue that the variance of these random walks (the squared standard deviation) is the same as the variance 〈(s_N − s̄_N)²〉 of the original random walks.

(2.3) Photon diffusion in the Sun. (Astrophysics) ©i
If fusion in the Sun turned off today, how long would it take for us to notice? This question became urgent some time back when the search for solar neutrinos failed. Neutrinos, created in the same fusion reaction that creates heat in the Solar core, pass through the Sun at near the speed of light without scattering – giving us a current picture. The rest of the energy takes longer to get out. (The missing electron neutrinos, as it happened, 'oscillated' into other types of neutrinos.)
Most of the fusion energy generated by the Sun is produced near its center. The Sun is 7 × 10⁵ km in radius. Convection probably dominates heat transport in approximately the outer third of the Sun, but it is believed that energy is transported through the inner portions (say to a radius R = 5 × 10⁸ m) through a random walk of X-ray photons. (A photon is a quantized package of energy; you may view it as a particle which always moves at the speed of light c. Ignore for this exercise the index of refraction of the Sun.)
There appears to be a widespread error in the popular science literature on how long this random walk takes. This error is reflected in Exercise (2.2) in my text.
(a) Research this. You should rapidly find a large range of time estimates. Find a reasonably plausible estimate for the mean free path of photons in the sun.
(b) About how many random steps N will the photon take of length ℓ to get to the radius R where convection becomes important? About how many years ∆t will it take for the photon to get there? (You may assume for this exercise that the photon takes steps in random directions, each of equal length given by the mean free path.) Related formulæ: c = 3 × 10⁸ m/s; 〈x²〉 ∼ 2Dt; 〈s_n²〉 = nσ² = n〈s₁²〉. There are 31 556 925.9747 ∼ π × 10⁷ ∼ 3 × 10⁷ s in a year.
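A back-of-the-envelope numerical sketch for part (b) in Python. The mean free path ℓ below is only a placeholder; part (a) asks you to research a plausible value, and the answer depends strongly on it:

R = 5.0e8      # m, radius where convection takes over (from the exercise)
c = 3.0e8      # m/s
ell = 1.0e-3   # m, ASSUMED mean free path -- replace with your researched estimate

N_steps = (R / ell) ** 2          # random-walk steps needed to wander a net distance R
t = N_steps * ell / c             # each step of length ell takes time ell/c
print("steps N ~ %.1e" % N_steps)
print("time    ~ %.1e years" % (t / 3.0e7))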


(2.4) Modified diffusion. ©i
Photons diffusing in clouds are occasionally absorbed by the water droplets. Neutrons diffusing in a reactor, or algae diffusing in the sea, may multiply as they move.
How would you modify the derivation of the diffusion equation in eqns (2.9–2.13) to allow for particle non-conservation? Which term in equation 2.9 should change? What would the new term in the diffusion equation look like?

(2.5) Density dependent diffusion. ©i
The diffusion constant can be density dependent; for example, proteins diffusing in a cell membrane are so crowded they can get in the way of one another.
What should the diffusion equation be for a conserved particle density ρ diffusing with diffusion constant D(ρ)? (Hint: see footnote 18 on page 21.)

(2.6) Local conservation. ©p
Tin is deposited on a surface of niobium. At high temperatures, the niobium atoms invade into the tin layer. Is the number of niobium atoms ρ_Nb(x) a locally conserved quantity? Politicians in 'red states' (primarily Republican) are concerned about migration of democrats into the Sun Belt (the southwest). Is the number of democrats ρ_Dem(x) locally conserved?

(2.7) Absorbing boundary conditions. ©p
A particle starting at x′ diffuses on the positive x axis for a time t, except that whenever it hits the origin it is absorbed. The resulting probability density gives the Green's function ρ(x, t) = G(x|x′, t); an arbitrary initial density then evolves as ρ(x, t) = ∫ G(x|x′, t) ρ(x′, 0) dx′.
Solve for G. (Hint: Use the 'method of images': add a negative δ function at −x′.) How would one use G to evolve an arbitrary initial probability distribution ρ(x, t₀) with this boundary condition?
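A minimal Python sketch of the image construction, useful for checking your G (NumPy assumed; D, x′, and t are arbitrary illustrative values):

import numpy as np

def G(x, xp, t, D=1.0):
    """Diffusion Green's function on x > 0 with an absorbing wall at the origin:
    a positive Gaussian source at xp plus a negative image source at -xp."""
    norm = 1.0 / np.sqrt(4.0 * np.pi * D * t)
    return norm * (np.exp(-(x - xp)**2 / (4*D*t)) - np.exp(-(x + xp)**2 / (4*D*t)))

x = np.linspace(0.0, 10.0, 2001)
rho = G(x, xp=1.0, t=0.5)
print("density at the wall  :", rho[0])                     # zero: the absorbing boundary
print("surviving probability:", rho.sum() * (x[1] - x[0]))  # < 1: some walkers were absorbed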

(2.8) Random walks, generating functions, and diffusion.⁵ (Math) ©3
Consider a one-dimensional random walk with step-size ±1 starting at the origin. What is the probability f_t that it first returns to the origin at t steps?
(a) Argue that the probability is zero unless t = 2m is even. How many total paths are there of length 2m? Calculate the probability f_{2m} for up to eight-step hops (m < 5) by drawing the different paths that touch the origin only at their endpoints. (Hint: You can save paper by drawing the paths starting to the right, and multiplying by two. Check your answer by comparing to the results for general m below.)
This first return problem is well-studied, and is usually solved using a generating function. Generating functions extend a series (here f_{2m}) into a function. The generating function for our one-dimensional random walk first return problem, it turns out, is

F(x) = Σ_{m=0}^{∞} f_{2m} x^m = 1 − √(1 − x) = Σ_m 2^{−2m}/(2m − 1) · (2m choose m) · x^m.   (2.3)

(Look up the derivation if you want an example of how it is done.)
(b) What is the exact probability of returning to the origin for the first time at t = 20 steps (m = 10)? Use Stirling's formula n! ∼ √(2πn)(n/e)ⁿ to give the probability f_{2m} of first returning at time t = 2m for large t. (Note that the rate per unit time P_hop(t) = f_{2m}/2.)
Now let us take the continuum limit for our random walker. Consider a diffusion equation for the probability ρ(r, t) for a particle starting at position ε with diffusion constant D and absorbing boundary conditions at the origin. Let P_cont be the probability density in time of our particle first touching the origin (and being absorbed) at time t (like P_hop but for the continuum limit).
(c) Write a relation between P_cont(t) and the probability of a random walker surviving to time t with absorbing boundary conditions. Write a relation between P_cont(t) and the current of probability at the origin. Finally, write a relation between P_cont(t) and the slope of the probability density with respect to position at the origin.
You solved for the Green's function for a diffusion equation with an absorbing boundary condition in a pre-class exercise using the method of images. (Feel free to check it with external sources.)
(d) Calculate D for the discrete hopping problem of parts (a) and (b), assuming each step takes time ∆t = 1. What value of ε (the initial position of the continuum walker) is needed to make P_cont correspond correctly to P_hop at long times t? (Warning: Remember that P_hop(t) is zero for odd times t.)
There are lots of other uses for generating functions: analyzing partial sums, finding moments, taking convolutions...
(e) Argue that F(1) is the probability that a particle will eventually return to the origin, and that F′(1) is 〈m〉, half the expected time to return to the origin. What is the probability that our one-dimensional walk will return to the origin? What is the mean time to return to the origin?

⁵This problem was created with the assistance of Archishman Raju.
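A small Python cross-check for parts (a) and (b): the coefficients read off from eqn 2.3 against a brute-force enumeration of walks (only feasible for small m):

from math import comb
from itertools import product

def f_exact(m):
    """First-return probability at t = 2m, from the generating-function coefficients of eqn 2.3."""
    return comb(2 * m, m) / ((2 * m - 1) * 4**m)

def f_brute(m):
    """Enumerate all 2^(2m) walks; count those touching the origin only at the endpoints."""
    hits = 0
    for steps in product((-1, 1), repeat=2 * m):
        pos, ok = 0, True
        for s in steps[:-1]:
            pos += s
            if pos == 0:          # returned too early
                ok = False
                break
        if ok and pos + steps[-1] == 0:
            hits += 1
    return hits / 2**(2 * m)

for m in range(1, 5):
    print(m, f_exact(m), f_brute(m))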

(2.9) Continuous time random walks: Ballistic to diffusive transition. ©3
Random walks at long times diffuse, with 〈x²〉 ∝ Dt, with a proportionality constant that depends on dimension. But what happens at short times? In many cases, the motion at short times is ballistic – straight line motion with constant speed. Perfume molecules move in straight lines between collisions with molecules in the air. E. coli and other bacteria alternate between runs and tumbles.⁶ Electrons⁷ in semiconductors can have long mean-free paths between collisions, and the ballistic to diffusive crossover we study here has practical implications for the behavior of electronic devices.
Let us consider the trajectory of a single particle moving through vacuum, subject to an external force F, and to collisions at random times t_α which reset the velocity to v_α. Under a constant external force F, the velocity v(t) = v_Ω + (F/m)(t − t_Ω), where t_Ω is the time of the collision immediately preceding the current time t. We can write this as an equation:

v(t′) = Σ_α (v_α + (F/m)(t′ − t_α)) Θ(t′ − t_α) Θ(t_{α+1} − t′).   (2.4)

Here the Heaviside step function Θ (zero for negative argument and one for positive argument) is used to select the time interval between collisions. We assume the velocities after each collision have mean zero and are uncorrelated with one another, with mean-square velocity σ_v², so

〈v_α〉 = 0,   (2.5)
〈v_α v_β〉 = σ_v² δ_{αβ}.   (2.6)

We assume that the scattering events happen at random with an average scattering time ∆t, so the intervals δ_α = t_{α+1} − t_α have an exponential probability distribution ρ(δ) = exp(−δ/∆t)/∆t. (Remember that for random events with rate 1/∆t, the gap-averaged probability of the interval length is ρ(δ), the waiting time δ to the next collision averaged over the current time t is ρ(δ), and the time δ since the last collision is also distributed by ρ(δ); see 'Waiting time ensembles' from the in-class exercises from week one.)
Let us first calculate the mobility γ. Again, let the collision immediately before the current time t be labeled by Ω.
(a) Show that the expectation value of the velocity 〈v(t)〉 is given by F/m times the mean time 〈t − t_Ω〉 since the last collision. Your answer should use eqns 2.4 and 2.5, and the probability distribution ρ(δ). Calculate the mobility in terms of ∆t and m. Why is your answer different from the calculation for fixed intervals ∆t considered in Section 2.3? (This is mostly just a warmup for part (b). See 'Waiting time ensembles' from the in-class exercises from week one, and Feynman's discussion [?, I.43].)
Now let us explore the crossover from ballistic to diffusive motion, in the absence of an external force. Let r(t) be the distance moved by our random walk since time zero. At times much less than ∆t, the particle will not yet have scattered, and the motion is in a straight line. At long times we expect diffusive motion.
How does the crossover happen between these two limits? In the next part we calculate 〈dr²/dt〉 as a function of time t for a continuous-time random walk starting at t = 0, in the absence of an external force, giving us the crossover and the diffusion constant D. Assume the random walk has been traveling since t = −∞, but that we measure the distance traveled since t = 0.
(b) Calculate 〈dr²/dt〉 as a function of time, where r(t) = x(t) − x(0) is the distance moved since t = 0. To avoid getting buried in algebra, we ask you to do this in a particular sequence. (1) Write r as an abstract integral over v. Square it, and differentiate with respect to time. (2) Substitute eqn 2.4 with F = 0, and use eqn 2.6 to ensemble average over the collision velocities. (Hint: Your formula at this point should only involve the last collision time t_Ω, and should be an integral from 0 to t involving a Θ function.) (3) Then take the ensemble average over the collision times to get 〈dr²/dt〉. What is the diffusion constant D? Does it satisfy the relation D/γ = mv² of footnote 20 on page 22? (Hint: You may end up with a double integral over t_Ω and t′. Examine the integration region in the (t_Ω, t′) plane, and exchange orders of integration.)
The crossover between ballistic motion and diffusion is easily found in the literature, and is a decaying exponential:

〈dr²/dt〉 = 2D(1 − exp(−t/∆t)).   (2.7)

(c) Use this to check your answer in part (b). Integrate, and provide a log-log plot of √(〈r²〉/D∆t) versus t/∆t, over the range 0.1 < t/∆t < 100. Solve for the asymptotic power-law dependence at short and long times, and plot them as straight lines on your log-log plot. Label your axes. (Are the power laws tangent to your curves as t → 0 and t → ∞?)

⁶During the 'run' periods, the bacteria move in roughly straight lines, diffusing slightly in the angle of their motion; during 'tumbles' they reorient randomly without significant spatial motion.
⁷More precisely, it is the electron and hole quasiparticles which have large mean-free paths. The bare electrons scatter rapidly with their screening clouds, but the electrons plus their clouds can move long distances ballistically.
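A Python sketch of the plot requested in part (c), integrating eqn 2.7 analytically (NumPy and matplotlib assumed; units are chosen so that D = ∆t = 1):

import numpy as np
import matplotlib.pyplot as plt

D, dt = 1.0, 1.0
t = np.logspace(-1, 2, 200)                        # 0.1 < t/dt < 100

# Integrating eqn 2.7 from 0 to t gives <r^2> = 2 D (t - dt (1 - exp(-t/dt))).
r2 = 2.0 * D * (t - dt * (1.0 - np.exp(-t / dt)))

plt.loglog(t / dt, np.sqrt(r2 / (D * dt)), label="crossover")
plt.loglog(t / dt, t / dt, "--", label="ballistic, slope 1")                   # short-time asymptote
plt.loglog(t / dt, np.sqrt(2.0 * t / dt), ":", label="diffusive, slope 1/2")   # long-time asymptote
plt.xlabel("t / dt")
plt.ylabel("sqrt(<r^2> / (D dt))")
plt.legend()
plt.show()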


Chapter 3: Temperature and equilibrium

Exercises

(3.1) Weirdness in high dimensions. ©p
We saw in momentum space that most of the surface area of a high-dimensional sphere is along the equator. Consider the volume of a high-dimensional sphere.
Is most of the volume near the center, or the surface? How might this relate to statistical mechanics treatments of momentum space, which approximate the volume of an energy shell with the volume of the entire sphere?

(3.2) Hard sphere gas. ©i

Fig. 3.2 Hard sphere gas.

We can improve on the realism of the ideal gas by giving the atoms a small radius. If we make the potential energy infinite inside this radius (hard spheres), the potential energy is simple (zero unless the spheres overlap, which is forbidden). Let us do this in two dimensions; three dimensions is no more complicated, but slightly harder to visualize.
A two-dimensional L × L box with hard walls contains a gas of N hard disks of radius r ≪ L (Fig. 3.2). The disks are dilute; the summed area Nπr² ≪ L². Let A be the effective area allowed for the disks in the box (Fig. 3.2): A = (L − 2r)².

Fig. 3.3 Excluded area around a hard disk.

(a) The area allowed for the second disk is A − π(2r)² (Fig. 3.3), ignoring the small correction when the excluded region around the first disk overlaps the excluded region near the walls of the box. What is the allowed 2N-dimensional volume in configuration space⁸ Ω^Q_HD of allowed zero-energy configurations of hard disks, in this dilute limit? Leave your answer as a product of N terms.
Our formula in part (a) expresses Ω^Q_HD strangely, with each disk in the product only feeling the excluded area from the former disks. For large numbers of disks and small densities, we can rewrite Ω^Q_HD more symmetrically, with each disk feeling the same excluded area A_excl.
(b) Write log(Ω^Q_HD) as a sum over the number of disks. Use log(1 − ε) ≈ −ε and⁹ Σ_{m=0}^{N−1} m = N(N − 1)/2 ≈ N²/2 to approximate log(Ω^Q_HD) ≈ N log A − Nε, solving for ε and evaluating any sums over disks. Then use −ε ≈ log(1 − ε) to write log(Ω^Q_HD) in terms of A − A_excl.

Fig. 3.4 Hard spheres vs. ideal gas.

Now consider a box of area A = L_x × L_y, with a sliding partition of negligible width separating N hard disks from N ideal-gas particles (Fig. 3.4). The position-space coordinates are thus 2N hard-disk positions, 2N ideal-gas positions, and the position of the partition. The total allowed position-space volume is then given by the volume of all allowed configurations of hard disks in an area A_HD, times the volume of all allowed configurations of ideal-gas particles in the area A − A_HD, integrated over the position of the partition x_HD = A_HD/L_y, so

Ω(A) = ∫ Ω^Q_HD(A_HD) Ω^Q_ideal(A − A_HD) dA_HD/L_y ≈ Ω^Q_HD(A^max_HD) Ω^Q_ideal(A − A^max_HD),   (3.8)

where A^max_HD is the maximum of the integrand.¹⁰
(c) Find A^max_HD in terms of A and A_excl.
Remember the ideal gas law PV = Nk_BT, where for us V must be A − A^max_HD. What is the corresponding equation of state for the hard-sphere gas? We could use the thermodynamic relation eqn (3.31) to evaluate the pressure for the hard-disk gas, but instead let us evaluate it by using the ideal gas as a pressure gauge. The partition will move until the pressure of the hard-sphere gas equals that of the ideal gas.
(d) Using your formula for A^max_HD, solve for A in terms of A^max_HD and A_excl. Solve for the coexistence pressure. Write the equation of state of the hard-disk gas, relating its pressure P_HD, temperature T, and area A_HD.

⁸Again, ignore small corrections when the excluded region around one disk overlaps the excluded regions around other disks, or near the walls of the box. You may choose to include the 1/N! correction for identical particles if you wish, but be consistent with the ideal gas in part (c).
⁹If this formula is not familiar, you can check the exact formula for the first few N, or convert the sum to an integral ∫₀^{N−1} m dm ≈ ∫₀^N m dm = N²/2.
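A crude Monte Carlo sanity check of the dilute-limit counting in parts (a) and (b): the fraction of uniformly placed configurations with no overlaps, compared with the product of per-disk excluded-area factors (NumPy assumed; the parameter values are arbitrary small-density choices):

import numpy as np

L, r, N, trials = 1.0, 0.02, 10, 20000
A = (L - 2 * r)**2                       # effective area for a disk center
a_excl = np.pi * (2 * r)**2              # excluded area around one disk

rng = np.random.default_rng(2)
accepted = 0
for _ in range(trials):
    pts = r + (L - 2 * r) * rng.random((N, 2))            # N disk centers in the allowed square
    d2 = np.sum((pts[:, None, :] - pts[None, :, :])**2, axis=-1)
    np.fill_diagonal(d2, np.inf)
    if d2.min() > (2 * r)**2:                              # no two centers closer than 2r
        accepted += 1

print("Monte Carlo allowed fraction:", accepted / trials)
print("product of per-disk factors :", np.prod([1 - m * a_excl / A for m in range(N)]))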

(3.3) Entropy maximization and temperature. ©p
Explain in words why, for two weakly coupled systems, eqn 3.23

ρ(E₁) = Ω₁(E₁)Ω₂(E₂)/Ω(E)   (3.9)

should be intuitively clear for an equilibrium system. Using S = k_B log(Ω), show in one step that maximizing the probability of E₁ maximizes the sum of the entropies S(E₁) + S(E₂). Show in one more step that 1/T = ∂S/∂E is equal for the two systems when the probability is maximized.

(3.4) Large and very large numbers. ©i
The numbers that arise in statistical mechanics can defeat your calculator. A googol is 10¹⁰⁰ (one with a hundred zeros after it). A googolplex is 10^googol.
Consider a monatomic ideal gas with one mole of particles (N = Avogadro's number, 6.02 × 10²³), room temperature T = 300 K, and volume V = 22.4 liters (corresponding to atmospheric pressure).
(a) Which of the properties (S, T, E, and Ω(E)) of our gas sample are larger than a googol? A googolplex? Does it matter what units you use, within reason?
If you double the size of a large equilibrium system (say, by taking two copies and weakly coupling them), some properties will be roughly unchanged; these are called intensive. Some, like the number N of particles, will roughly double; they are called extensive. Some will grow much faster with the size of the system.
(b) Which category (intensive, extensive, faster) does each of the properties from part (a) belong to?
For a large system of N particles, one can usually ignore terms which add a constant independent of N to extensive quantities. (Adding 2π to 10²³ does not change it enough to matter.) For properties which grow even faster, overall multiplicative factors often are physically unimportant.

¹⁰In the last step we ignore L_y as an unimportant multiplicative factor; see Exercise (3.4).
¹¹One can derive the formula by solving S = S(N, V, E) for E. It is the same surface in four dimensions as S(N, V, E) (Fig. 3.4) with a different direction pointing 'up'.
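For part (a), the trick is to work with logarithms, since Ω(E) ≈ e^{S/k_B} overflows any floating-point format. A Python sketch, ASSUMING the standard Sackur–Tetrode entropy for a monatomic ideal gas (that formula is not derived in this exercise set, and the atomic mass below is an arbitrary argon-like choice):

from math import log, pi

kB, h = 1.380649e-23, 6.62607015e-34             # J/K, J s
m, N, T, V = 6.63e-26, 6.02e23, 300.0, 22.4e-3   # kg (argon-like), particles, K, m^3

E = 1.5 * N * kB * T
S = N * kB * (log(V / N * (4 * pi * m * E / (3 * N * h**2))**1.5) + 2.5)   # Sackur-Tetrode
print("E =", E, "J")
print("S =", S, "J/K")
print("log10(Omega) ~", S / kB / log(10), " (a googol is 10^100)")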


(3.5) Maxwell relations. (Thermodynamics, Mathematics) ©i
Consider the microcanonical formula for the equilibrium energy E(S, V, N) of some general system.¹¹ One knows that the second derivatives of E are symmetric; at fixed N, we get the same answer whichever order we take partial derivatives with respect to S and V.
Use this to show the Maxwell relation

∂T/∂V |_{S,N} = − ∂P/∂S |_{V,N}.   (3.10)

(This should take two lines of calculus or less.) Generate two other similar formulæ by taking other second partial derivatives of E. There are many of these relations [?].

(3.6) Triple product relation. (Thermodynamics, Mathematics) ©i
In traditional thermodynamics, there are many useful formulas like

dE = T dS − P dV,   (3.11)

(see Section 6.4 and the inside front cover of this text). For example, if V is held constant (and hence dV = 0), then dE = T dS from eqn (3.11), giving ∂S/∂E|_V = 1/T (the definition of temperature, eqn (3.29)).
(a) Use eqn (3.11) to rederive the traditional formula for the pressure P.
Let us consider a general formula of this type,

A dx + B dy + C df = 0.   (3.12)

(b) What is ∂f/∂x|_y? ∂f/∂y|_x? ∂x/∂y|_f? Use these to derive the triple product relation eqn (3.33), (∂x/∂y)|_f (∂y/∂f)|_x (∂f/∂x)|_y = −1.
I have always been uncomfortable with manipulating dXs.¹² How can we derive these relations geometrically, with traditional partial derivatives? Our equation of state S(E, V, N) at fixed N is a surface embedded in three dimensions. Figure 3.4 shows a triangle on this surface, which we can use to derive the general triple-product relation between partial derivatives.
(c) Show, if f is a function of x and y, that (∂x/∂y)|_f (∂y/∂f)|_x (∂f/∂x)|_y = −1. (Hint: Consider the triangular path in Fig. 3.4. The first side starts at (x₀, y₀, f₀) and moves along a contour at constant f to y₀ + ∆y. The resulting vertex will thus be at (x₀ + (∂x/∂y)|_f ∆y, y₀ + ∆y, f₀). The second side runs at constant x back to y₀, and the third side runs at constant y back to (x₀, y₀, f₀). The curve must close to make f a single-valued function; the resulting equation should imply the triple-product relation.)

(3.7) What is the chemical potential? ©p
What the heck is a chemical potential? Explain.

(3.8) Undistinguished particles. ©p
Why should we divide the phase-space volume by N!, when we do not keep track of the differences between the N particles? Look up 'entropy of mixing'. In Fig. (5.4), can you see how to extract work from the mixing using methods that cannot tell white from black particles? If we had a door in the partition wall that let through only white particles, what would happen? Could we then extract work from the system?

(3.9) Random energy model.¹³ (Disordered systems) ©3
The nightmare of every optimization algorithm is a random landscape; if every new configuration has an energy uncorrelated with the previous ones, no search method is better than systematically examining every configuration. Finding ground states of disordered systems like spin glasses and random-field models, or equilibrating them at non-zero temperatures, is challenging because the energy landscape has many features that are quite random. The random energy model (REM) is a caricature of these disordered systems, where the correlations are completely ignored. While optimization of a single REM becomes hopeless, we shall see that the study of the ensemble of REM problems is quite fruitful and interesting.
The REM has M = 2^N states for a system with N 'particles' (like an Ising spin glass with N spins), each state with a randomly chosen energy. It describes systems in the limit when the interactions are so strong and complicated that flipping the state of a single particle completely randomizes the energy. The states of the individual particles then need not be distinguished; we label the states of the entire system by j ∈ {1, . . . , 2^N}. The energies of these states E_j are assumed independent, uncorrelated variables with a Gaussian probability distribution

P(E) = (1/√(πN)) e^{−E²/N}   (3.13)

of standard deviation √(N/2).
Microcanonical ensemble. Consider the states in a small range E < E_j < E + δE. Let the number of such states in this range be Ω(E)δE.
(a) Calculate the average

〈Ω(Nε)〉_REM   (3.14)

over the ensemble of REM systems, in terms of the energy per particle ε. For energies near zero, show that this average density of states grows exponentially as the system size N grows. In contrast, show that 〈Ω(Nε)〉_REM decreases exponentially for E < −Nε* and for E > Nε*, where the limiting energy per particle

ε* = √(log 2).   (3.15)

(Hint: The total number of states 2^N either grows faster or more slowly than the probability density per state P(E) shrinks.)
What does an exponentially growing number of states mean? Let the entropy per particle be s(ε) = S(Nε)/N. Then (setting k_B = 1 for notational convenience) Ω(E) = exp(S(E)) = exp(Ns(ε)) grows exponentially whenever the entropy per particle is positive.
What does an exponentially decaying number of states for ε < −ε* mean? It means that, for any particular REM, the likelihood of having any states with energy per particle near ε vanishes rapidly as the number of particles N grows large.
How do we calculate the entropy per particle s(ε) of a typical REM? Can we just use the annealed¹⁴ average

s_annealed(ε) = lim_{N→∞} (1/N) log〈Ω(E)〉_REM   (3.16)

computed by averaging over the entire ensemble of REMs?
(b) Show that s_annealed(ε) = log 2 − ε².
If the energy per particle is above −ε* (and below ε*), the expected number of states Ω(E) δE grows exponentially with system size, so the fractional fluctuations become unimportant as N → ∞. The typical entropy will become the annealed entropy. On the other hand, if the energy per particle is below −ε*, the number of states in the energy range (E, E + δE) rapidly goes to zero, so the typical entropy s(ε) goes to minus infinity. (The annealed entropy is not minus infinity because it gets a contribution from exponentially rare REMs that happen to have an energy level far into the tail of the probability distribution.) Hence

s(ε) = s_annealed(ε) = log 2 − ε²,  |ε| < ε*
s(ε) = −∞,  |ε| > ε*.   (3.17)

Notice why these arguments are subtle. Each REM model in principle has a different entropy. For large systems as N → ∞, the entropies of different REMs look more and more similar to one another¹⁵ (the entropy is self-averaging) whether |ε| < ε* or |ε| > ε*. However, Ω(E) is not self-averaging for |ε| > ε*, so the typical entropy is not given by the 'annealed' logarithm 〈Ω(E)〉_REM.
This sharp cutoff in the energy distribution leads to a phase transition as a function of temperature.
(c) Plot s(ε) versus ε, and illustrate graphically the relation 1/T = ∂S/∂E = ∂s/∂ε as a tangent line to the curve, using an energy in the range −ε* < ε < 0. What is the critical temperature T_c? What happens to the tangent line as the temperature continues to decrease below T_c?
When the energy reaches −ε*, it stops changing as the temperature continues to decrease (because there are no states¹⁶ below −ε*).
(d) (Needs some familiarity with parts of chapter 6.) Solve for the free energy per particle f(T) = ε − Ts, both in the high-temperature phase and the low-temperature phase. (Your formula for f should not depend upon ε.) What is the entropy in the low-temperature phase? (Warning: The microcanonical entropy is discontinuous at ε*. You'll need to reason out which limit to take to get the right canonical entropy below T_c.)
The REM has a glass transition at T_c. Above T_c the entropy is extensive and the REM acts much like an equilibrium system. Below T_c one can show [?, eqn 5.25] that the REM thermal population condenses onto a finite number of states (i.e., a number that does not grow as the size of the system increases), which goes to zero linearly as T → 0.
The mathematical structure of the REM also arises in other, quite different contexts, such as combinatorial optimization (Exercise 3.1) and random error correcting codes [?, chapter 6].

¹²They are really differential forms, which are mathematically subtle (see note 23 on p. 116).
¹³This exercise draws heavily from [?, chapter 5].
¹⁴Annealing a disordered system (like an alloy or a disordered metal with frozen-in defects) is done by heating it to allow the defects and disordered regions to reach equilibrium. By averaging Ω(E) not only over levels within one REM but also over all REMs, we are computing the result of equilibrating over the disorder—an annealed average.
¹⁵Mathematically, the entropies per particle of REM models with N particles approach that given by equation 3.17 with probability one [?, eqn 5.10].
¹⁶The distribution of ground-state energies for the REM is an extremal statistics problem, which for large N has a Gumbel distribution (Exercise 12.10).
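A small Python experiment with one sample REM, comparing the observed number of states at energy per particle ε with the annealed average 〈Ω(Nε)〉_REM δE = 2^N P(Nε) δE of part (a) (NumPy assumed; N = 20 keeps the 2^N levels manageable):

import numpy as np

N = 20
rng = np.random.default_rng(3)
E = rng.normal(0.0, np.sqrt(N / 2.0), size=2**N)    # one REM: 2^N Gaussian levels, std sqrt(N/2)

edges = np.linspace(-1.2, 1.2, 25)                  # bins in energy per particle eps = E/N
counts, _ = np.histogram(E / N, bins=edges)
centers = 0.5 * (edges[1:] + edges[:-1])
dE = N * (edges[1] - edges[0])

annealed = 2.0**N * np.exp(-N * centers**2) / np.sqrt(np.pi * N) * dE   # <Omega(N eps)> dE
for c, obs, ann in zip(centers, counts, annealed):
    print(f"eps = {c:+.2f}   observed {obs:8d}   annealed {ann:12.1f}")
print("eps* =", np.sqrt(np.log(2.0)), ": beyond +-eps* a typical REM has essentially no states")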


Chapter 4: Phase-space dynamics and ergodicity

Exercises

(4.1) Liouville vs. the damped pendulum.(Mathematics) ©iThe damped pendulum has a force −γp propor-tional to the momentum slowing down the pen-dulum. It satisfies the equations

ẋ = p/M,
ṗ = −γp − K sin(x).          (4.18)

At long times, the pendulum will tend to an equilibrium stationary state, zero velocity at x = 0 (or more generally at the equivalent positions x = 2mπ, for m an integer); (p, x) = (0, 0) is an attractor for the damped pendulum. An ensemble of damped pendulums is started with initial conditions distributed with probability ρ(p0, x0). At late times, these initial conditions are gathered together near the equilibrium stationary state; Liouville's theorem clearly is not satisfied.

Fig. 4.5 Total derivatives. The total derivative gives the local density as measured by a particle moving with the flow: dρ/dt = d/dt (ρ(x(t), p(t), t)). Applying the chain rule gives the definition of the total derivative, dρ/dt = ∂ρ/∂t + (∂ρ/∂x)ẋ + (∂ρ/∂p)ṗ.

(a) In the steps leading from eqn 4.5 to eqn 4.7, why does Liouville's theorem not apply to the damped pendulum? More specifically, what are ∂ṗ/∂p and ∂q̇/∂q?
(b) Find an expression for the total derivative dρ/dt in terms of ρ for the damped pendulum. If we evolve a region of phase space of initial volume A = ∆p∆x, how will its volume depend upon time?
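A minimal numerical sketch of part (b)'s answer (mine, not the text's): evolve a small loop of initial conditions under the damped-pendulum flow and watch its enclosed phase-space area shrink as exp(−γt).

import numpy as np

# Sketch: the divergence of the damped-pendulum flow is -gamma, so any region's
# area contracts as exp(-gamma*t).  We check this by evolving a loop of points.
M, K, gamma = 1.0, 1.0, 0.3

def flow(z):                      # z = [x, p]
    x, p = z
    return np.array([p / M, -gamma * p - K * np.sin(x)])

def rk4_step(z, dt):
    k1 = flow(z); k2 = flow(z + 0.5 * dt * k1)
    k3 = flow(z + 0.5 * dt * k2); k4 = flow(z + dt * k3)
    return z + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0

def loop_area(pts):               # shoelace formula on the ordered boundary points
    x, p = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(p, -1)) - np.dot(p, np.roll(x, -1)))

theta = np.linspace(0, 2 * np.pi, 200, endpoint=False)
loop = np.column_stack([1.0 + 0.05 * np.cos(theta), 0.5 + 0.05 * np.sin(theta)])

dt, T = 0.01, 5.0
A0 = loop_area(loop)
for _ in range(int(T / dt)):
    loop = np.array([rk4_step(z, dt) for z in loop])
print("A(T)/A0 =", loop_area(loop) / A0, "   exp(-gamma*T) =", np.exp(-gamma * T))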

(4.2) No attractors in Hamiltonian systems. ©p
Planetary motion is one of the prime examples of a Hamiltonian dynamical system. In the three-body problem with one body of small mass, Lagrange showed that there were five configurations (L1, L2, L3, L4, and L5) that rotated as a unit, forming fixed points in a rotating frame of reference. Three of these (L1-L3) are unstable, and one is stable. In dissipative dynamical systems, 'stable fixed point' implies that neighboring initial conditions will converge to the fixed point at long times.
Is that true here? What does it mean to be stable, or unstable, in a Hamiltonian system?
You might look up Lagrange points, and how unstable Lagrange points are used in low energy orbital transfer (how to get to the Moon cheaply, but slowly). Also, I've not seen a clear demonstration that L5 is truly stable; my simulations17 suggest otherwise, and the general theory of Arnol'd diffusion also suggests otherwise.

(4.3) Perverse initial conditions. ©p
If we start a gas of classical spherical particles in a square box all in a vertical line, and all moving exactly in the vertical direction, they will bounce back and forth vertically forever.
Does that mean a gas of particles in a square box is not ergodic? Why or why not?

17See http://pages.physics.cornell.edu/~sethna/teaching/sss/jupiter/Web/L5Poinc.htm.


(4.4) Crooks: Exact results in non-equilibrium statistical mechanics. ©3

Fig. 4.6 Crooks fluctuation theorem: evolution in time. The inner energy shell Ω(E) gets mapped by a time-dependent Hamiltonian evolution U into a region spanning a variety of energies (wiggly region); different initial conditions absorb different amounts of work. The region ΣC denotes the final states that landed in the energy shell Ω(E + W) (with work W).

Just at the end of the 20th century, striking new fluctuation theorems were developed for broad classes of systems driven out of equilibrium. We can derive a version of Crooks' fluctuation theorem [?] using Liouville's theorem in the microcanonical ensemble. See also Exercise (4.5).
Consider a system in microcanonical equilibrium in the energy shell (E, E + δE). (Think of a piston of gas, insulated from the outside world.) Let the system be subject to an external forcing, giving it some time-dependent potential energy taking the system from an initial state to a final state. In particular, if the system starts at a point in phase-space (Q, P) when the external force starts, let us denote the final point U(Q, P) as in Fig. 4.6. (Think of compressing the gas, perhaps very rapidly, sending it out of equilibrium.) The external force will do different amounts of work W on the system depending on the initial condition. (Think of just a few atoms of gas, which as the gas is compressed may collide more or less strongly with the moving piston depending on their initial state.)
You should check, in the derivation of Liouville's theorem (eqns 4.5-4.7), that phase-space volume is conserved also for time-dependent potential energies.

Fig. 4.7 Crooks fluctuation theorem: evolution backward in time. Starting from the outer energy shell Ω(E + W) transformed under the inverse evolution U⁻¹ (wiggly region), the region ΣC gets mapped back into ΣD (and vice versa under U).

(a) The system starts in microcanonical equilibrium in the energy shell (E, E + δE), and the probability density evolves under U into the wiggly region in Fig. 4.6. Inside this wiggly region, what is the final phase-space probability density? Conversely, if the system starts in microcanonical equilibrium in the energy shell (E + W, E + δE + W), what is the final phase-space density under the evolution U⁻¹ (Fig. 4.7)? Express your answer in terms of the function Ω(E).
Since the microscopic equations of motion are invariant under time reversal, the volume in phase space ΣC that absorbed work W starting near energy E gets sent by U into the volume ΣD that emitted work W under the reverse evolution U⁻¹. Liouville's theorem tells us that the phase-space volume of ΣC must therefore be equal to the phase-space volume of ΣD.
An experiment starts in a microcanonical state in the energy shell (E, E + δE), and measures the probability pC(W) that the evolution U leaves it in the energy shell (E + W, E + δE + W) (roughly doing work W). Another experiment starts in (E + W, E + δE + W), and measures the probability pD(−W) that the evolution U⁻¹ leaves it in the energy shell (E, E + δE).
(b) What is pC(W), in terms of Ω(E), Ω(E + W), and ΣC? What is pD(−W), in terms of the energy-shell volumes and ΣD? (See Figs. 4.6 and 4.7. Hint: the microcanonical ensemble fills its energy shell with uniform density in phase space.)
Now let us use Liouville's theorem to derive a way


of using this non-equilibrium experiment to measure the equilibrium entropy difference S(E + W) − S(E).
(c) Use your result from part (b) to write Ω(E + W)/Ω(E) in terms of the measured pC(W) and pD(−W). Use this to derive the entropy difference S(E + W) − S(E). (Hint: S(E) = kB log(Ω(E)).)
Who would have thought that anything could be proven about an arbitrary time-dependent system, driven out of equilibrium? Who would have thought that anything physical like pC(W) would ever be given in terms of the very-large-number Ω(E) (instead of its log)?
Your result in part (c) is a version of Crooks' fluctuation theorem [?], which states that

pD(−W)/pC(W) = exp(−∆S/kB).          (4.19)

(4.5) Jarzynski: Exact results in non-equilibrium statistical mechanics. ©3
You cannot create a perpetual motion machine – a machine that extracts energy from thermal vibrations to do useful work. Tiny machines, however, fluctuate – sometimes doing work on their environment. This exercise discusses a remarkable result, the Jarzynski equality [?], that quantifies the effects of fluctuations in small non-equilibrium systems. (See also Exercise 4.4.)
Suppose we compress a piston from length L to length L′, doing work Wc ≥ 0 on the gas in the piston, heating the gas. As we allow the gas to expand back to length L, the net work done by the outside world We ≤ 0 is negative, cooling the gas. Perpetual motion is impossible, so no net work can be done extracting kinetic energy from the gas:

〈Wc + We〉 ≥ 0.          (4.20)

Note the averages 〈〉 shown in eqn (4.20). Especially for small systems compressed quickly, there will be fluctuations in the work done. Sometimes it will happen that the gas will indeed do net work on the outside world. One version of the Jarzynski equality states that a thermally isolated system, starting in equilibrium at temperature T, evolved under a time-dependent Hamiltonian back to its initial state, will always satisfy

〈exp(−(Wc + We)/kBT)〉 = 1.          (4.21)

(a) Can this be true if the work Wc + We > 0 for all initial conditions? Must every system, evolved to a non-equilibrium state under a time-dependent Hamiltonian, sometimes do net work on the outside world?
Note that the Jarzynski equality eqn (4.21) implies the fact that the average work must be positive, eqn (4.20). Because e⁻ˣ is convex, 〈e⁻ˣ〉 ≥ e⁻⁽ˣ⁾ averaged; roughly, because e⁻ˣ is curved upward everywhere, its average is bigger than its value at the average x. Applying this to eqn (4.21) gives us 1 = 〈e^{−(Wc+We)/kBT}〉 ≥ e^{−〈Wc+We〉/kBT} (note the difference between parentheses () and angles 〈〉). Taking logs of both sides tells us 0 ≥ −〈Wc + We〉/kBT, so 0 ≤ 〈Wc + We〉; the net work done is not negative. Other forms of the Jarzynski equality give the change in free energy for non-equilibrium systems that start and end in different states.
We shall test a version of this equality in the smallest, simplest system we can imagine – one particle of an ideal gas inside a thermally insulated piston, Fig. (4.8), initially confined to 0 < x < L, compressed into a region L − L′ < x < L of length L′ at velocity V, and then immediately allowed to expand back to L at velocity −V. For simplicity, we shall assume that the particles which are hit by the piston during compression do not have time to bounce off the far wall and return.18

Fig. 4.8 One particle in a piston. The piston is pushed forward at velocity V (not assumed small) a distance ∆L, changing the piston length from L to L′ = L − ∆L. The particle is initially at position x and momentum p. If it is hit by the piston, it will recoil to a new velocity, absorbing work Wc(p) from the piston.

18Note that, to stay in equilibrium, the particle must bounce off the piston many times during the compression – we are working in the other extreme limit. Also note that we could just assume an infinitely long piston, except for the computational part of the exercise.


First, we carefully work through some elementary mechanics. How much energy is transferred during a collision with the piston? Which initial conditions will collide during contraction? During expansion?
(b) If a particle with momentum p elastically collides with the compressing piston moving with velocity V, as shown in Fig. (4.8), what is the final kinetic energy after the collision? (Hint: It is easy in the moving reference frame of the piston.) What work Wc(p) is done by the piston on the particle? If a particle with momentum p elastically collides with the expanding piston moving with velocity −V, what is the work done We(p) by the piston on the particle? Can the piston do negative work during compression? (Hint: What is the direction of the impulsive force during a collision?) Can We(p) be positive?

Fig. 4.9 Collision phase diagram. Initial conditions (x0, p0) in phase space for the particle, showing regions where it collides with the piston during the compression cycle and during the expansion cycle. The equations determining the boundaries are shown.

Many initial conditions (x0, p0) for the particle at t = 0 will not collide with the piston during compression, and many will not collide with it during the expansion. Fig. (4.9) shows the initial conditions in phase space from which a particle will collide with the piston during the compression and expansion.
(c) Which boundary describes the initial conditions for a particle that collides with the piston at the very start of the compression cycle? At the end of the compression cycle? Which boundary describes the initial conditions which collide at the beginning of the expansion cycle? At the end of the expansion cycle? Will any particle collide with the piston more than once, given our assumption that the particle does not have time to bounce off of the wall at x = L? Why or why not?
Our macroscopic intuition is that the fluctuations

that do net negative work should be rare. But these fluctuations, weighted exponentially as in the Jarzynski equality eqn (4.21), must balance those that do positive work.
(d) Which of the initial conditions of part (c) will do negative net work on the particle? At low temperatures and high velocities V, where kBT ≪ (1/2)mV², are these initial conditions rare?
The theoretical argument for the Jarzynski equality is closely related to that for the Crooks thermodynamic relation of Exercise (4.4). Here instead we shall test the answer numerically. Choose m = 2, kBT = 8, V = 2, L = 20, and L′ = 19 (so ∆L = 1).
We could calculate the expectation values 〈· · ·〉 by doing an integral over phase space (Fig. 4.9) weighted by the Boltzmann distribution. Instead, let us do an ensemble of experiments, starting particles with a Boltzmann distribution of initial conditions in phase space. (This is an example of a Monte-Carlo simulation; see Chapter 8.)
(e) Write a routine that generates N initial conditions (x0, p0) with a Boltzmann probability distribution for our ideal gas particle in the piston, and calculates Wc(x0, p0) + We(x0, p0) for each. Calculate the average work done by the piston 〈Wc + We〉. How big must N be before you get a reliable average? Does your average obey eqn. (4.20)?
(f) Use your routine with N = 1000 to calculate Jarzynski's exponentially weighted average 〈exp(−W/kBT)〉 in eqn (4.21), and compare to exp(−〈W〉/kBT). Do this 200 times, and plot a histogram of each. Does the Jarzynski equality appear to hold?
For big systems, the factor exp(−W/kBT) in the Jarzynski equality balances extremely rare fluctuations with low work against the vast majority of instances where the work has small fluctuations about the average.
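One possible starting point for parts (e) and (f) is sketched below. It is my own sketch, not the text's solution; the collision rules encode one reading of parts (b)-(d), so check them against your own answers before trusting the numbers.

import numpy as np

# Sketch for parts (e)-(f), under the stated assumptions (no bounce off the far
# wall; the piston is the x = 0 wall, moving at +V then -V).  The collision
# conditions and work formulas below are my own reading of parts (b)-(d).
m, kT, V, L, dL = 2.0, 8.0, 2.0, 20.0, 1.0
rng = np.random.default_rng(1)

def total_work(x0, p0):
    """Work Wc + We done on the particle during one compress/expand cycle."""
    v = p0 / m
    t_half = dL / V                        # duration of compression (and of expansion)
    # Compression: piston at V*t catches the particle if v < V, at t = x0/(V - v).
    if v < V and x0 / (V - v) <= t_half:
        return 2 * m * V * (V - v)         # v -> 2V - v; always positive
    # Expansion: the receding piston (at 2*dL - V*t) is caught only if v < -V.
    if v < -V:
        x1 = x0 + v * t_half               # particle position when expansion starts
        t_hit = (x1 - dL) / (-(v + V))     # time after expansion starts
        if t_hit <= t_half:
            return 2 * m * V * (V + v)     # v -> -2V - v; negative work
    return 0.0

def sample_works(N):
    x0 = rng.uniform(0.0, L, N)                   # uniform positions in the box
    p0 = rng.normal(0.0, np.sqrt(m * kT), N)      # Maxwell-Boltzmann momenta
    return np.array([total_work(x, p) for x, p in zip(x0, p0)])

W = sample_works(1000)
print("<Wc + We>    =", W.mean())                      # should be >= 0, eqn (4.20)
print("<exp(-W/kT)> =", np.exp(-W / kT).mean())        # Jarzynski: close to 1
print("exp(-<W>/kT) =", np.exp(-W.mean() / kT))        # generally below 1

With N = 1000 samples the exponentially weighted average fluctuates noticeably from run to run, which is exactly what part (f)'s histogram is meant to expose.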

(4.6) The Arnol'd cat map, Expanded. (Mathematics) ©3
Why do we suppose equilibrium systems uniformly cover the energy surface? Chaotic motion has sensitive dependence on initial conditions; regions on the energy surface get stretched into thin ribbons that twist and fold in complicated patterns, losing information about the initial conditions and leaving many systems with a uniform distribution of probability over all accessible states. Since energy is the only thing we know is conserved, we average over the energy surface.


Arnol'd developed a simple illustration of this stretching and folding using a function taking a two-dimensional square into itself (called the 'cat map', Figure 4.10). Liouville's theorem will tell us that Hamiltonian dynamics preserves the 6N-dimensional phase-space volume; the Arnol'd map preserves area in this 1 × 1 square, taking [0, 1) × [0, 1) in the plane onto itself:19

Γ(x, p) = ( 2 1 ; 1 1 ) (x, p) mod 1 = (2x + p, x + p) mod 1 = Mod1( M (x, p) ),          (4.22)

where M = ( 2 1 ; 1 1 ) stretches and squeezes our square 'energy shell' and Mod1 cuts and pastes it back into the square. We imagine a trajectory (xn+1, pn+1) = Γ(xn, pn) evolving with a discrete 'time' n. This map vividly illustrates how Hamiltonian dynamics leads to the microcanonical ensemble.

Fig. 4.10 Arnol'd cat map. Arnol'd, in a take-off on Schrödinger's cat, paints a cat on a 2D phase space, which gets warped and twisted as time evolves. From http://www-chaos.umd.edu/misc/catmap.html.

The cutting and pasting process confuses our analysis as we repeatedly apply the map (xn, pn) = Γⁿ(x0, p0) = Mod1(M(Mod1(M(. . . (x0, p0))))). Fortunately, we can view the time evolution as many applications of the stretching process, with a final cut-and-paste:

(xn, pn) = Mod1(Mⁿ(x0, p0)).          (4.23)

(a) Show that the last equation (4.23) is true. (Hint: Work by induction. You may use the fact that y mod 1 = y + n for some integer n and that (z + n) mod 1 = z mod 1 for any integer n.) Thus the stretching by M can be done in the entire (x, p) plane, and then the resulting thin strip can be folded back into the unit square.
Now we analyze how the map stretches and squeezes the rectangle.
(b) Verify that (γ, 1) and (−1/γ, 1) are eigenvectors of M, where γ = (√5 + 1)/2 is the Golden Ratio. What are the eigenvalues? Consider a square box, centered anywhere in the plane, tilted so that its sides are parallel to the two eigenvectors. How does its height and width change under application of M? Show that its area is conserved. (Hint: If you are doing this by hand, show that 1/γ = (√5 − 1)/2, which can simplify your algebra.)
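A quick numerical check of part (b) (my own helper, not part of the exercise):

import numpy as np

# Eigenvalues and the expanding direction of M; the product of the eigenvalues
# is det M = 1, which is the area conservation of part (b).
M = np.array([[2.0, 1.0], [1.0, 1.0]])
vals, vecs = np.linalg.eig(M)
gamma = (np.sqrt(5) + 1) / 2
print("eigenvalues:        ", vals)                  # gamma^2 and 1/gamma^2
print("gamma^2, 1/gamma^2: ", gamma**2, 1 / gamma**2)
print("expanding direction:", vecs[:, np.argmax(vals)])     # (up to an overall sign)
print("(gamma, 1) normed:  ", np.array([gamma, 1]) / np.hypot(gamma, 1))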

Fig. 4.11 Evolution of an initially concentrated phase-space density. Suppose we know our particle has initial position and momentum confined to a small region |(x, p)| < r. This small region is stretched along an irrational angle under the Arnol'd cat map. (This figure shows the origin x = 0, p = 0 as the center of the figure; in the map of eqn (4.22) shown in Fig. (4.10) the origin is at the corner.)

19Such maps arise as Poincaré sections of Hamiltonian flows, or as the periodic map given by a Floquet theory. There are many analogies between the behavior of dynamical systems with continuous motion and discrete maps.


The next question explores the map iteration when the initial point is at or near the origin (x0, p0) = (0, 0). As illustrated in Figure 4.11, a small, disk-shaped region of uncertainty, centered on the origin, evolves with time n into a thin strip. When the iteration would make the thin strip hit the square's boundary, it gets cut in two; further iterations stretch and chop our original circle into a bunch of parallel thin lines. (In the version of our map given by eqn 4.23, Mⁿ stretches the initial circle into a growing long thin line, which gets folded in by Mod1 to a series of parallel line segments.)
(c) Calculate the momentum h at which the direction of the thin strip (as measured by the expanding eigenvector in part (b)) first crosses the line x = 0. Note that h is irrational. The multiples of an irrational number, taken mod 1, can be shown to be dense in the interval [0, 1]. Using this, argue that the resulting thin strip, in the limit of many iterations, will become dense in the unit square (the maximum gap between line segments will go to zero as time goes to infinity).
These lines eventually cover the square (our 'energy shell') densely and uniformly – intuitively illustrating how the microcanonical ensemble typically describes the long-term behavior. (The small circle represents an uncertainty in the initial position. As for chaotic motion, Arnol'd's cat map amplifies the uncertainty in the initial position, which leads to the microcanonical ensemble probability distribution spread uniformly over the square.)
But there are many initial conditions whose trajectories do not cover the energy surface. The most obvious example is the origin, x0 = p0 = 0, which maps into itself. In that case, the time average of an operator20 O, lim_{T→∞} (1/T) Σ_{n=0}^{T} O(xn, pn) = O(0, 0), clearly will usually be different from its microcanonical average ∫₀¹ dx ∫₀¹ dp O(x, p). The next part of the

question asks about a less trivial example of aspecial initial condition leading to non-ergotic be-havior.(d) If x0 and p0 can be written as fractions a/qand b/q for positive integers q, a ≤ q, and b ≤ q,

show that xn and pn can be written as fractionswith denominator q as well. Show thus that thistrajectory must eventually21 settle into a periodicorbit, and give an upper bound to the period. Mustthe time average of an operator O for such a tra-jectory equal the microcanonical average?You probably have heard that there are infinitelymany more irrational numbers than rational ones.So the probability of landing on one of the peri-odic orbits is zero (points with rational x and pcoordinates are of measure zero). But perhapsthere are even more initial conditions which donot equilibrate?(e) Find an initial condition with a non-periodicorbit which goes to the origin as n → ∞. Whatwill the time average of O be for this initial con-dition?Arnol’d’s argument that the cat map (4.22) is er-godic shows that, even taking into account pointslike those you found in parts (d) and (e), theprobability of landing on a non-ergodic compo-nent is zero. For ‘almost all’ initial conditions, orany initial condition with a small uncertainty, theArnol’d cat map trajectory will be dense and uni-formly cover its square phase space at long times.
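Part (d) is easy to explore numerically with exact rational arithmetic; the sketch below (mine, not the text's) finds the period of an orbit that starts at fractions with denominator q:

from fractions import Fraction

# Iterate the cat map exactly on rationals with denominator q; since the map is
# invertible on at most q*q such states, the orbit must be periodic.
def cat_map(x, p):
    return (2 * x + p) % 1, (x + p) % 1

def period(x0, p0, max_steps=10_000):
    x, p = cat_map(x0, p0)
    n = 1
    while (x, p) != (x0, p0) and n < max_steps:
        x, p = cat_map(x, p)
        n += 1
    return n

q = 11
x0, p0 = Fraction(3, q), Fraction(5, q)
print(f"period of ({x0}, {p0}) is", period(x0, p0), "(at most q^2 =", q * q, "states)")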

(4.7) Jupiter! and the KAM theorem. (Astrophysics, Mathematics) ©3
See also the Jupiter web pages [?].
The foundation of statistical mechanics is the ergodic hypothesis: any large system will explore the entire energy surface. We focus on large systems because it is well known that many systems with a few interacting particles are definitely not ergodic.
The classic example of a non-ergodic system is the Solar System. Jupiter has plenty of energy to send the other planets out of the Solar System. Most of the phase-space volume of the energy surface has eight planets evaporated and Jupiter orbiting the Sun alone; the ergodic hypothesis would doom us to one long harsh winter. So, the big question is: why has the Earth not been kicked out into interstellar space?

20The operator O is some function of x and p. In a physics problem, it could be the energy, the momentum, or the square of the position. In our problem, we'd want O to obey the periodic boundary conditions of our square, so it could be something like O(x, p) = sin²(2πx)(cos(4πp) + 1).
21Since the cat map of eqn (4.22) is invertible, the orbit will be periodic without an initial transient, but that's harder to argue.


Fig. 4.12 The Earth's trajectory around the Sun if Jupiter's mass is abruptly increased by about a factor of 100.

Mathematical physicists have studied this problem for hundreds of years. For simplicity, they focused on the three-body problem: for example, the Sun, Jupiter, and the Earth. The early (failed) attempts tried to do perturbation theory in the strength of the interaction between planets. Jupiter's gravitational force on the Earth is not tiny, though; if it acted as a constant brake or accelerator, our orbit would be seriously perturbed in a few thousand years. Jupiter's effects must cancel out over time rather perfectly. . .
This exercise is mostly discussion and exploration; only a few questions need to be answered. Download the hints files for Jupiter from the text web site [?]. Check that Jupiter does not seem to send the Earth out of the Solar System. Try increasing Jupiter's mass to 35 000 Earth masses.
Reset Jupiter's mass back to 317.83 Earth masses. View Earth's trajectory, run for a while, and zoom in to see the small effects of Jupiter on the Earth. Note that the Earth's position shifts depending on whether Jupiter is on the near or far side of the Sun.
(a) Estimate the fraction that the Earth's radius from the Sun changes during the first Jovian year (about 11.9 years). How much does this fractional variation increase over the next hundred Jovian years?
Jupiter thus warps Earth's orbit into a kind of spiral around a tube. This orbit in physical three-dimensional space is a projection of the tube in 6N-dimensional phase space. The tube in phase space already exists for massless planets. . .
Let us start in the non-interacting planet approximation (where Earth and Jupiter are assumed to have zero mass). Both Earth's orbit and Jupiter's orbit then become circles, or more generally ellipses. The field of topology does not distinguish an ellipse from a circle; any stretched, 'wiggled' rubber band is a circle so long as it forms a curve that closes into a loop. Similarly, a torus (the surface of a doughnut) is topologically equivalent to any closed surface with one hole in it (like the surface of a coffee cup, with the handle as the hole). Convince yourself in this non-interacting approximation that Earth's orbit remains topologically a circle in its six-dimensional phase space.22
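If you would rather experiment outside the provided hints files, a bare-bones integrator along the following lines is enough for the part (a) estimate. It is my own sketch, with the Sun held fixed, the Earth treated as massless, and round numbers for the masses and radii; it is not the course's simulation.

import numpy as np

# Sun fixed at the origin; Earth and Jupiter start on circular orbits.  Units:
# AU and years, so G*M_sun = 4*pi^2.  Leapfrog (kick-drift-kick) integration.
GM_sun = 4 * np.pi**2
m_jup = 317.83 * 3.0e-6          # Jupiter in solar masses (Earth ~ 3.0e-6 M_sun)
a_earth, a_jup = 1.0, 5.2        # orbital radii in AU (round numbers)

def accel(r_e, r_j):
    """Acceleration of Earth (Sun + Jupiter) and of Jupiter (Sun only)."""
    acc_e = -GM_sun * r_e / np.linalg.norm(r_e) ** 3
    d = r_e - r_j
    acc_e += -GM_sun * m_jup * d / np.linalg.norm(d) ** 3
    acc_j = -GM_sun * r_j / np.linalg.norm(r_j) ** 3
    return acc_e, acc_j

r_e = np.array([a_earth, 0.0]); v_e = np.array([0.0, np.sqrt(GM_sun / a_earth)])
r_j = np.array([a_jup, 0.0]);   v_j = np.array([0.0, np.sqrt(GM_sun / a_jup)])

dt, t_end = 5e-4, 11.9           # integrate for one Jovian year
radii = []
acc_e, acc_j = accel(r_e, r_j)
for _ in range(int(t_end / dt)):
    v_e += 0.5 * dt * acc_e; v_j += 0.5 * dt * acc_j
    r_e += dt * v_e;         r_j += dt * v_j
    acc_e, acc_j = accel(r_e, r_j)
    v_e += 0.5 * dt * acc_e; v_j += 0.5 * dt * acc_j
    radii.append(np.linalg.norm(r_e))

radii = np.array(radii)
print("fractional variation of Earth's radius:",
      (radii.max() - radii.min()) / radii.mean())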

Fig. 4.13 Torus. The Earth's orbit around the Sun after adiabatically increasing the Jovian mass to 50 000 Earth masses.

(b) In the non-interacting planet approximation, what topological surface is it in the eighteen-dimensional phase space that contains the trajectory of the three bodies? Choose between (i) sphere, (ii) torus, (iii) Klein bottle, (iv) two-hole torus, and (v) complex projective plane. (Hint: It is a circle cross a circle, parameterized by two independent angles—one representing the time during Earth's year, and one representing the time during a Jovian year. Feel free to look at Fig. 4.13 and part (c) before committing yourself, if pure thought is not enough.) About how many times does Earth wind around this surface during each Jovian year? (This ratio of years is called the winding number.)
The three-body problem was only understood mathematically in the past hundred years or so, by Kolmogorov, Arnol'd, and Moser. Their proof focuses on the topological

22Hint: Plot the orbit in the (x, y), (x, px), and other planes. It should look like the projection of a circle along various axes.


integrity of this tube in phase space (called now the KAM torus). They were able to prove stability if the winding number (Jupiter year over Earth year) is sufficiently irrational.23 More specifically, they could prove in this case that for sufficiently small planetary masses there is a distorted torus in phase space, near the unperturbed one, around which the planets spiral with the same winding number (Fig. 4.13).
(c) About how large can you make Jupiter's mass before Earth's orbit stops looking like a torus (Fig. 4.12)? (Restart each new mass at the same initial conditions; otherwise, your answer will depend upon the location of Jupiter in the sky when you begin.) Admire the remarkable trajectory when the mass becomes too heavy.
Thus, for 'small' Jovian masses, the trajectory in phase space is warped and rotated a bit, so that its toroidal shape is visible looking at Earth's position alone. (The circular orbit for zero Jovian mass is looking at the torus on edge.)
The fact that the torus is not destroyed immediately is a serious problem for statistical mechanics! The orbit does not ergodically explore the entire allowed energy surface. This is a counterexample to Boltzmann's ergodic hypothesis. That means that time averages are not equal to averages over the energy surface; our climate would be very unpleasant, on the average, if our orbit were ergodic.
Let us use a Poincaré section to explore these tori, and the chaotic regions between them. If a dynamical system keeps looping back in phase space, one can take a cross-section of phase space and look at the mapping from that cross-section back into itself (see Fig. 4.14).

Fig. 4.14 The Poincaré section of a torus is a circle. The dynamics on the torus becomes a mapping of the circle onto itself.

The Poincaré section shown in Fig. 4.14 is a planar cross-section in a three-dimensional phase space. Can we reduce our problem to an interesting problem with three phase-space coordinates?
The original problem has an eighteen-dimensional phase space. In the center of mass frame it has twelve interesting dimensions. If we restrict the motion to a plane, it reduces to eight dimensions. If we assume that the mass of the Earth is zero (the restricted planar three-body problem) we have five relevant coordinates (Earth xy positions and velocities, and the location of Jupiter along its orbit). If we remove one more variable by going to a rotating coordinate system that rotates with Jupiter, the current state of our model can be described with four numbers: two positions and two momenta for the Earth. We can remove another variable by confining ourselves to a fixed 'energy'. The true energy of the Earth is not conserved (because Earth feels a periodic potential), but there is a conserved quantity which is like the energy in the rotating frame; more details are described on the web site [?] under 'Description of the three-body problem'. This leaves us with a trajectory in three dimensions (so, for small Jovian masses, we have a torus embedded

23A 'good' irrational number x cannot be approximated by rationals p/q to better than ∝ 1/q²: given a good irrational x, there is a C > 0 such that |x − p/q| > C/q² for all integers p and q. Almost all real numbers are good irrationals, and the KAM theorem holds for them. For those readers who have heard of continued fraction expansions, the best approximations of x by rationals are given by truncating the continued fraction expansion. Hence the most irrational number is the golden mean, (1 + √5)/2 = 1 + 1/(1 + 1/(1 + . . . )).


in a three-dimensional space). Finally, we take a Poincaré cross-section; we plot a point of the trajectory every time Earth passes directly between Jupiter and the Sun. We plot the distance to Jupiter along the horizontal axis, and the velocity component towards Jupiter along the vertical axis; the perpendicular component of the velocity is not shown (and is determined by the 'energy').
Set the view to Poincaré. Set Jupiter's mass to 2000, and run for 1000 years. You should see two nice elliptical cross-sections of the torus. As you increase the mass (resetting to the original initial conditions) watch the toroidal cross-sections as they break down. Run for a few thousand years at MJ = 22 000 Me; notice that the toroidal cross-section has become three circles.
Fixing the mass at MJ = 22 000 Me, let us explore the dependence of the planetary orbits on the initial condition. Set up a chaotic trajectory (MJ = 22 000 Me) and observe the Poincaré section. Launch trajectories at various locations in the Poincaré view at fixed 'energy'.24 You can thus view the trajectories on a two-dimensional cross-section of the three-dimensional constant-'energy' surface.
Notice that many initial conditions slowly fill out closed curves. These are KAM tori that have been squashed and twisted like rubber bands.25 Explore until you find some orbits that seem to fill out whole regions; these represent chaotic orbits.26

(d) Print out a Poincaré section with initial conditions both on KAM tori and in chaotic regions; label each. See Fig. 4.3 for a small segment of the picture you should generate.
It turns out that proving that Jupiter's effects cancel out depends on Earth's smoothly averaging over the surface of the torus. If Jupiter's year is a rational multiple of Earth's year, the orbit closes after a few years and you do not average over the whole torus; only a closed spiral. Rational winding numbers, we now know, lead to chaos when the interactions are turned on; the large chaotic region you found above is associated with an unperturbed orbit with a winding ratio of 3:1 (hence the three circles). Disturbingly, the rational numbers are dense; between any two KAM tori there are chaotic regions, just because between any two irrational numbers there are rational ones. It is even worse; it turns out that numbers which are extremely close to rational (Liouville numbers like 1 + 1/10 + 1/10¹⁰ + 1/10^(10¹⁰) + . . . ) may also lead to chaos. It was amazingly tricky to prove that lots of tori survive nonetheless. You can imagine why this took hundreds of years to understand (especially without computers to illustrate the tori and chaos graphically).

(4.8) 2D turbulence and Jupiter's Great Red Spot.27 (Astrophysics) ©3
Fully-developed turbulence is one of the outstanding challenges in science. The agitated motion of water spurting through a fire hose has a complex, fluctuating pattern of swirls and eddies spanning many length scales. Many have studied the velocity-velocity correlation functions in turbulence, and there are close analogies to the scaling behavior seen at continuous phase transitions. Nobody yet, though, has found the links to the renormalization group.
Turbulence in two-dimensional fluids is much better understood, with the latest fashion using conformal field theories (see A. M. Polyakov, Nucl. Phys. B 396 (1993) and much further work), but the original insight was a point vortex model introduced by Onsager and others that we will simulate here.
The model describes the motion of point vortices, which represent local rotational eddies in the fluid. It provides an analogy to Liouville's theorem, an example of a system with an infinite number of

24This is probably done by clicking on the point in the display, depending on implementation. In the current implementation, this launches a trajectory with that initial position and velocity towards Jupiter; it sets the perpendicular component of the velocity to keep the current 'energy'. Choosing a point where energy cannot be conserved, the program complains.
25Continue if the trajectory does not run long enough to give you a complete feeling for the cross-section; also, increase the time to run. Zoom in and out.
26Notice that the chaotic orbit does not throw the Earth out of the Solar System. The chaotic regions near infinity and near our initial condition are not connected. This may be an artifact of our low-dimensional model; in other larger systems it is believed that all chaotic regions (on a connected energy surface) are joined through Arnol'd diffusion.
27This exercise was developed in collaboration with Jaron Kent-Dobias.


conserved quantities, and a system which can have a negative temperature. And it provides a plausible explanation of Jupiter's Giant Red Spot, and cyclones, hurricanes, and typhoons on Earth.28 We draw on many sources, but our inspiration came from the work of Jonathan Miller, Peter B. Weichman, and M. C. Cross ("Statistical mechanics, Euler's equation, and Jupiter's Red Spot", Phys. Rev. A 45, 2328 (1992)).

Fig. 4.15 Jupiter's Red Spot. Jupiter's atmosphere is stormy, with winds that can exceed 400 miles per hour in a complex, turbulent pattern. Jupiter's 'red spot' is a giant storm 3 1/2 times the size of the earth, that has lasted for at least 186 years. (Image from Voyager I, NASA's Goddard Space Flight Center, courtesy of NASA/JPL, http://www.nasa.gov/content/jupiters-great-red-spot-viewed-by-voyager-i.)

The flow of most liquids is almost incompressible; if u(x) is the velocity field, then to an excellent approximation ∇ · u = 0. Helmholtz's theorem tells us that a smooth field u is determined by its divergence and curl, if it dies away fast enough at infinity and the region has no holes.29 So knowing how the vorticity ω = ∇ × u evolves is enough to determine the evolution of the fluid. In two dimensions, the vorticity is a scalar ω(x) = ∂uy/∂x − ∂ux/∂y.
(a) Show that

u0(x) = Γ θ̂/(2πr)          (4.24)

in two dimensions has both zero curl and zero divergence, except at zero. (This is the velocity field around one of Onsager's vortices.) Use Stokes' theorem to show that the curl of this velocity field is ω(x) = Γδ(x) = Γδ(x)δ(y). Argue using this solution that the velocity field in an annulus is not uniquely defined by its divergence and curl alone (as described by footnote 29). (Hint: Consider the annulus bounded by two concentric circles surrounding our vortex.)
We provide a simulation of the dynamics of interacting vortices. Download the hints file.
(b) Run the simulation with one vortex with Γ1 = 1 and a 'tracer vortex' with Γ = 0. Does the tracer vortex rotate around the test vortex?
At low velocities, high viscosity, and small distances, fluids behave smoothly as they move; turbulence happens at high Reynolds numbers, where velocities are big, distances small, and viscosities are low. If we reduce the viscosity to zero, the kinetic energy of the fluid is conserved.
We can write u as a nonlocal function of ω

ux(r) = −(1/2π) ∫ dr′ (y − y′)/(r − r′)² ω(r′)          (4.25)
uy(r) = (1/2π) ∫ dr′ (x − x′)/(r − r′)² ω(r′)           (4.26)

(the Biot-Savart law).
(c) Show that the Biot-Savart law agrees with your answer from part (a) for the case of a δ-function vorticity.
We can write the kinetic energy H in terms of the vorticity, which turns out to be a non-local convolution

H = ∫ (1/2) u(r)² dr          (4.27)
  = −(1/4π) ∫ dr ∫ dr′ ω(r) ω(r′) log |r − r′|

where we set the density of the fluid equal to one and ignore an overall constant.30 We can also write the equations of motion for the vorticity ω in terms of itself and the velocity u

∂ω/∂t = −u · ∇ω,          (4.28)

28Cyclones, hurricanes, and typhoons are different names used for the same weather patterns.
29More generally, if the vorticity is defined in a non-simply connected region S with n holes, and if u is parallel to the boundaries, then knowing ω(r) and the circulations ∫_{∂Sn} u · dℓ around the edges ∂Sn of the holes specifies the velocity u inside.
30Miller, Weichman, and Cross tell us that log(|r − r′|/R0) corresponds to free boundary conditions, where R0 is a constant with dimensions of length.


To set up our simulation, we need to find a numerical representation for the vorticity field. Onsager suggested a discretization into vortices (part (a)), ω(r) = Σ_{i=1}^{N} Γi δ(r − ri). For point vortices, the energy becomes

H = −(1/2π) Σ_i Σ_{j≠i} Γi Γj log((ri − rj)²).          (4.29)

Note that there is no 'kinetic energy' for the vortices. The energy in the velocity field is purely kinetic energy; the 'potential' energy of the vortices is the kinetic energy of the fluid.
The vortices move according to the local velocity field,

dri/dt = u(ri).          (4.30)

This should be intuitively clear (the fluid drags the vortices with it). This means

dxi/dt = −(1/2π) Σ_{j≠i} Γj (yi − yj)/(ri − rj)²          (4.31)
dyi/dt = (1/2π) Σ_{j≠i} Γj (xi − xj)/(ri − rj)².

If there is no kinetic energy for the vortices, what are the 'conjugate variables' analogous to x and p for regular particles?
(d) Check from eqn 4.31 and eqn 4.29 that dxi/dt = (1/4Γi) ∂H/∂yi and dyi/dt = −(1/4Γi) ∂H/∂xi. Thus 2√Γi xi and 2√Γi yi are analogous to x and p in a regular phase space.
Note that the continuum vorticity (eqn 4.28) has total derivative zero: dω/dt = ∂ω/∂t + u · ∇ω = 0.
(d) First, if a packet of fluid travels along a path r(t) with ṙ = u(r), how will the vorticity ω(r(t)) of this packet change with time? Second, in the limit of a δ-function vorticity, is this compatible with eqn (4.30)? (Hints: Remember the total derivative is zero.)
Launch the simulation.
(e) Start with n = 20 vortices with a random distribution of vortex strengths Γi ∈ [−1, 1] and random positions ri within the unit circle.31 Print the original configuration. Run for a time t = 10, and print the final configuration. Do you see the spontaneous formation of a giant whirlpool? Are the final positions roughly also randomly arranged? Measure and report the energy H of your vortex configuration. (Warning: Sometimes the differential equation solver crashes. Just restart again with a new set of random initial conditions.)
Note: we are not keeping the vortices inside the unit circle during our dynamical evolution.
Hurricanes and Jupiter's Red Spot can be thought of as large concentrations of vorticity – all the plus or minus vortices concentrated into one area, making a giant whirlpool. What do we need to do to arrange this? Let's consider the effect of the total energy.
(f) For a given set of vortices of strength Γi, would clustering the positive vortices and the negative vortices each into their own clump (counter-rotating hurricanes) be a high-energy or a low-energy configuration, as the size of the clumps goes to zero? (Hint: You will be able to check this with the simulation later.) Why?
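If you are not using the provided hints file, the dynamics of part (e) can be sketched directly from eqn (4.31). This is my own sketch, not the course's implementation; it uses the sampling recipe of footnote 31 and scipy's general-purpose integrator.

import numpy as np
from scipy.integrate import solve_ivp

# Integrate the point-vortex equations of motion, eqn (4.31), for n = 20 vortices.
rng = np.random.default_rng(3)
n = 20
Gamma = rng.uniform(-1.0, 1.0, n)
theta = rng.uniform(-np.pi, np.pi, n)
r = np.sqrt(rng.uniform(0.0, 1.0, n))        # footnote 31: r = sqrt(p) is uniform in the disc
x0, y0 = r * np.cos(theta), r * np.sin(theta)

def rhs(t, z):
    x, y = z[:n], z[n:]
    dx, dy = x[:, None] - x[None, :], y[:, None] - y[None, :]
    d2 = dx**2 + dy**2
    np.fill_diagonal(d2, np.inf)             # exclude self-interaction
    u = -(1 / (2 * np.pi)) * np.sum(Gamma[None, :] * dy / d2, axis=1)
    v = +(1 / (2 * np.pi)) * np.sum(Gamma[None, :] * dx / d2, axis=1)
    return np.concatenate([u, v])

def energy(x, y):                            # eqn (4.29)
    dx, dy = x[:, None] - x[None, :], y[:, None] - y[None, :]
    d2 = dx**2 + dy**2
    np.fill_diagonal(d2, 1.0)                # log 1 = 0 removes the i = j terms
    return -(1 / (2 * np.pi)) * np.sum(np.outer(Gamma, Gamma) * np.log(d2))

sol = solve_ivp(rhs, (0.0, 10.0), np.concatenate([x0, y0]), rtol=1e-8, atol=1e-10)
xf, yf = sol.y[:n, -1], sol.y[n:, -1]
print("H(initial) =", energy(x0, y0), "   H(final) =", energy(xf, yf))   # H is conserved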

Fig. 4.16 Circles. It is unlikely that the vortices would collect into clumps by accident.

31You can generate random points in the unit circle by choosing θ ∈ (−π, π] and picking r = √p with p uniformly distributed ∈ [0, 1).


You chose points at random for part (e), but did not see the vortices separated into clumps. How did we know this was likely to happen?
(g) Estimate the entropy difference between a state where twenty vortices are confined within the unit circle, and a state where ten positive vortices and ten negative vortices each are confined inside the particular circles of radius R = 1/2 shown in Fig. 4.16.32 Leave your answer as a multiple of kB. (Note: The vortices have different values of Γ, so are distinguishable.) How many tries would you need to see this clumping happen?
Onsager's problem is one of the best examples of negative temperature.33

(h) Using your results from part (f) and part (g), in the energy range where the hurricanes will form, will the change in entropy be positive or negative as the energy increases? Is the temperature negative?
In most simulations of Onsager's vortices, one selects for states that form a big hurricane by starting with several small ones, and watching them combine. (The small hurricanes must be tightly packed; as they combine they gain entropy because the vortices can spread out more.) Instead, we shall set the energy of our configuration by doing a Metropolis Monte-Carlo simulation at a fixed temperature.
(i) Thermalize the Monte-Carlo simulation for your n = 20 vortices at low temperature β = 1/(kBT) = 2, report the final energy, and print out your configuration. Does the thermalized vortex state look similar to the initial conditions you generated in part (e)? Thermalize again at temperature β = −2, report the energy, and print out your final configuration. Do the vortices separate out into clumps of positive and negative vorticity?
The time it takes to run a simulation is roughly determined by the minimum distance between vortices. We can use this to keep our simulation from crashing so much.
(j) Re-run the Monte Carlo simulation with n = 20 vortices until you find a configuration with a clear separation of positive and negative vortices, but where the minimum distance between vortices is not too small (say, bigger than 0.01). (This should not take lots of tries, so long as you thermalize with an energy in the right region.) Is the energy you need to thermalize to positive, or negative? Print this initial configuration of vortices. Animate the simulation for t = 10 with this state as the initial condition. How many hurricanes do you find? Print out the final configuration of vortices. (Hint: In Mathematica, one can right-click in the animation window to print the current configuration (Save Graphic As. . . ). In Python, just plot xOft[-1], yOft[-1]; an array evaluated at [-1] gives the last entry. You can copy the axis limits and colors and sizes from the animation, to make a nice plot.)
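A bare-bones Metropolis sampler for part (i) might look like the following (my own sketch, not the hints file). Nothing special is needed to handle β < 0 beyond using it directly in the acceptance rule.

import numpy as np

# Metropolis Monte Carlo on the vortex positions at fixed beta (possibly negative),
# using the energy of eqn (4.29) and keeping proposed moves inside the unit circle.
rng = np.random.default_rng(5)
n = 20
Gamma = rng.uniform(-1.0, 1.0, n)
theta = rng.uniform(-np.pi, np.pi, n)
pos = (np.sqrt(rng.uniform(0, 1, n)) * np.array([np.cos(theta), np.sin(theta)])).T

def energy(pos):
    d2 = np.sum((pos[:, None, :] - pos[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(d2, 1.0)                  # log 1 = 0 removes the i = j terms
    return -(1 / (2 * np.pi)) * np.sum(np.outer(Gamma, Gamma) * np.log(d2))

def metropolis(pos, beta, sweeps=4000, step=0.05):
    E = energy(pos)
    for _ in range(sweeps):
        i = rng.integers(n)
        trial = pos.copy()
        trial[i] += step * rng.standard_normal(2)
        if np.hypot(*trial[i]) > 1.0:          # keep the vortices in the unit circle
            continue
        dE = energy(trial) - E
        if dE * beta <= 0 or rng.random() < np.exp(-beta * dE):
            pos, E = trial, E + dE
    return pos, E

for beta in (2.0, -2.0):
    final, E = metropolis(pos.copy(), beta)
    print(f"beta = {beta:+.0f}:  thermalized energy H = {E:.3f}")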

32It is another interesting question how many small circles could be formed, but that is beyond the scope of this question.
33Because the phase space is finite, the volume of the energy shell at high energies goes to zero, instead of continuing to increase.


Chapter 5: Entropy

Exercises

(5.1) The Dyson sphere. (Astrophysics) ©i
Life on Earth can be viewed as a heat engine, taking energy from a hot bath (the Sun at temperature TS = 6000 K) and depositing it into a cold bath (interstellar space, at a microwave background temperature TMB = 2.725 K, Exercise 7.15). The outward solar energy flux at the Earth's orbit is ΦS = 1370 W/m², and the Earth's radius is approximately 6400 km, rE = 6.4 × 10⁶ m.
(a) If life on Earth were perfectly efficient (a Carnot cycle with a hot bath at TS and a cold bath at TMB), how much useful work (in watts) could be extracted from this energy flow? Compare that to the estimated world marketed energy consumption of 4.5 × 10²⁰ J/year. (Useful constant: There are about π × 10⁷ s in a year.)
Your answer to part (a) suggests that we have some ways to go before we run out of solar energy. But let's think big.
(b) If we built a sphere enclosing the Sun at a radius equal to Earth's orbit (about 150 million kilometers, RES ≈ 1.5 × 10¹¹ m), by what factor would the useful work available to our civilization increase?
This huge construction project is called a Dyson sphere, after the physicist who suggested [?] that we look for advanced civilizations by watching for large sources of infrared radiation.
Earth, however, does not radiate at the temperature of interstellar space. It radiates roughly as a black body at near TE = 300 K (about 27 °C; see, however, Exercise 7.9).
(c) How much less effective are we at extracting work from the solar flux, if our heat must be radiated effectively to a 300 K cold bath instead of one at TMB, assuming in both cases we run Carnot engines?
There is an alternative point of view, though, which tracks entropy rather than energy. Living beings maintain and multiply their low-entropy states by dumping the entropy generated into the energy stream leading from the Sun to interstellar space. New memory storage also intrinsically involves entropy generation (Exercise 5.2); as we

move into the information age, we may eventually care more about dumping entropy than about generating work. In analogy to the 'work effectiveness' of part (c) (ratio of actual work to the Carnot upper bound on the work, given the hot and cold baths), we can estimate an entropy-dumping effectiveness (the ratio of the actual entropy added to the energy stream, compared to the entropy that could be conceivably added given the same hot and cold baths).
(d) How much entropy impinges on the Earth from the Sun, per second per square meter cross-sectional area? How much leaves the Earth, per second per cross-sectional square meter, when the solar energy flux is radiated away at temperature TE = 300 K? By what factor f is the entropy dumped to outer space less than the entropy we could dump into a heat bath at TMB? From an entropy-dumping standpoint, which is more important, the hot-bath temperature TS or the cold-bath temperature (TE or TMB, respectively)?
For generating useful work, the Sun is the key and the night sky is hardly significant. For dumping the entropy generated by civilization, though, the night sky is the giver of life and the realm of opportunity. These two perspectives are not really at odds. For some purposes, a given amount of work energy is much more useful at low temperatures. Dyson later speculated about how life could make efficient use of this by running at much colder temperatures (Exercise 5.1).
A hyper-advanced information-based civilization would hence want not to radiate in the infrared, but in the microwave range. To do this, it needs to increase the area of the Dyson sphere; a bigger sphere can re-radiate the Solar energy flow as black-body radiation at a lower temperature. Interstellar space is a good insulator, and one can only shove so much heat energy through it to get to the Universal cold bath. A body at temperature T radiates the largest possible energy if it is completely black. We will see in Exercise 7.7 that a black body radiates an energy σT⁴ per square meter per second,


where σ = 5.67 × 10⁻⁸ J/(s m² K⁴) is the Stefan–Boltzmann constant.
(e) How large a radius RD must the Dyson sphere have to achieve 50% entropy-dumping effectiveness? How does this radius compare to the distance to Pluto (RPS ≈ 6 × 10¹² m)? If we measure entropy in bits (using kS = (1/log 2) instead of kB = 1.3807 × 10⁻²³ J/K), how many bits per second of entropy can our hyper-advanced civilization dispose of? (You may ignore the relatively small entropy impinging from the Sun onto the Dyson sphere, and ignore both the energy and the entropy from outer space.)
The sun wouldn't be bright enough to read by at that distance, but if we had a well-insulated sphere we could keep it warm inside—only the outside need be cold. Alternatively, we could just build the sphere for our computers, and live closer in to the Sun; our re-radiated energy would be almost as useful as the original solar energy.
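For orientation, the part (a) comparison is quick to set up numerically; this helper is mine, not part of the exercise.

import numpy as np

# Back-of-envelope helper: Carnot-limited power from the sunlight intercepted by
# the Earth's cross-section, compared to world energy consumption.
Phi_S, r_E = 1370.0, 6.4e6            # W/m^2, m
T_S, T_MB = 6000.0, 2.725             # K
seconds_per_year = np.pi * 1e7

intercepted = Phi_S * np.pi * r_E**2          # power through Earth's cross-section
carnot = 1.0 - T_MB / T_S                     # Carnot efficiency between T_S and T_MB
work = carnot * intercepted
consumption = 4.5e20 / seconds_per_year       # world energy use, in watts
print(f"intercepted solar power = {intercepted:.2e} W")
print(f"Carnot-limited work     = {work:.2e} W  ({work / consumption:.0f}x consumption)")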

(5.2) Undistinguished particles and the Gibbs factor. ©p
If we have N particles in a box of volume V and do not distinguish between the particles, then the effective configurational entropy is V^N/N!, where N! is sometimes called the Gibbs factor. We saw that this factor keeps the entropy of mixing small for undistinguished particles.
(a) Use Stirling's formula to show that this configurational entropy is related to N times the entropy of a particle in a region roughly given by the distance to neighboring particles.
Does eqn (3.58) for the ideal gas entropy, S = NkB(5/2 − log(ρλ³)), also have a similar interpretation?

(5.3) How many shuffles? (Mathematics) ©i
For this exercise, you will need a deck of cards, either a new box or sorted into a known order (conventionally with an ace at the top and a king at the bottom).34

How many shuffles does it take to randomize a deck of 52 cards?
The answer is a bit controversial; it depends on how one measures the information left in the cards. Some suggest that seven shuffles are needed; others say that six are enough.35 We will follow reference [?], and measure the growing randomness using the information entropy.
We imagine the deck starts out in a known order (say, A♠, 2♠, . . . , K♣).
(a) What is the information entropy of the deck before it is shuffled? After it is completely randomized?
(b) Take a sorted deck of cards. Pay attention to the order (in particular, note the top and bottom cards). Riffle it exactly once, separating it into two roughly equal portions and interleaving the cards in the two portions. Examine the card sequence, paying particular attention to the top few and bottom few cards. Can you tell which cards came from the top portion and which came from the bottom?
The mathematical definition of a riffle shuffle is easiest to express if we look at it backward.36

Consider the deck after a riffle; each card in the deck either came from the top portion or the bottom portion of the original deck. A riffle shuffle makes each of the 2⁵² patterns tbbtbttb . . . (denoting which card came from which portion) equally likely.
It is clear that the pattern tbbtbttb . . . determines the final card order: the number of t's tells us how many cards were in the top portion, and then the cards are deposited into the final pile according to the pattern in order bottom to top. Let us first pretend the reverse is also true: that every pattern corresponds one-to-one with a unique final card ordering.
(c) Ignoring the possibility that two different riffles could yield the same final sequence of cards, what is the information entropy after one riffle?

34Experience shows that those who do not do the experiment in part (b) find it challenging to solve part (c) by pure thought.
35More substantively, as the number of cards N → ∞, some measures of information show an abrupt transition near (3/2) log₂ N, while by other measures the information vanishes smoothly and most of it is gone by log₂ N shuffles.
36In the forward definition of a riffle shuffle, one first cuts the deck into two portions, according to a binomial distribution: the probability that n cards are chosen for the top portion is 2⁻⁵² · (52 choose n). (This mimics an expert riffler, who will split the deck roughly, but not precisely, into two equal portions.) We then drop cards in sequence from the two portions into a pile, with the probability of a card being dropped proportional to the number of cards remaining in its portion. You can check that this makes each of the 2⁵² choices in the backward definition equally likely.


You can convince yourself that the only way two riffles can yield the same sequence is if all the cards in the bottom portion are dropped first, followed by all the cards in the top portion.
(d) How many riffles drop the entire bottom portion and then the entire top portion, leaving the card ordering unchanged? What fraction of the 2⁵² riffles does this correspond to? (Hint: 2¹⁰ = 1024 ≈ 10³. Indeed, this approximation underlies measures of computing resources: a gigabyte is not 10⁹ bytes, but (1024)³ = 2³⁰ bytes.) Hence, what is the actual information entropy after one riffle shuffle?
We can put a lower bound on the number of riffles needed to destroy all information by assuming the entropy increase stays constant for future shuffles.
(e) Continuing to ignore the possibility that two different sets of m riffles could yield the same final sequence of cards, how many riffles would it take for the entropy to pass that of a completely randomized deck?
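The entropies involved are quick to tabulate; the helper below is mine, not the text's, and it ignores the repeated-ordering correction of part (d) just as parts (c) and (e) do.

import math

# Information entropies in bits (k_S = 1/log 2).
bits_random_deck = math.log2(math.factorial(52))   # entropy of a fully shuffled deck
bits_per_riffle = 52.0                             # 2^52 equally likely t/b patterns
print(f"entropy of a random deck : {bits_random_deck:.1f} bits")
print(f"entropy after one riffle : {bits_per_riffle:.1f} bits (ignoring repeats)")
print(f"riffles to pass it       : {bits_random_deck / bits_per_riffle:.2f}")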

(5.4) Entropy of socks. ©p
If a billion children neaten their bedrooms, how much does the entropy of their toys and clothing decrease? Assume that they each have one hundred distinguishable items weighing 20 gm, initially randomly placed in a bedroom with V = 5 m³, and that they end up located in their place in a drawer with center-of-mass location accuracy of 1 cm. For this exercise, assume that the internal entropies of the objects are unchanged in the process (atomic vibrations, folding entropy for socks, etc.). Compare this entropy change to that of contracting the air in a one liter balloon by 1% in volume.

(5.5) Rubber band. (Condensed matter) ©i
Figure 5.17 shows a one-dimensional model for rubber. Rubber is formed from long polymeric molecules, which undergo random walks in the undeformed material. When we stretch the rubber, the molecules respond by rearranging their random walk to elongate in the direction of the external stretch. In our model, the molecule is represented by a set of N links of length d, which with equal energy point either parallel or antiparallel to the previous link. Let the total change in position to the right, from the beginning of the polymer to the end, be L.
As the molecule extent L increases, the entropy of our rubber molecule decreases.
(a) Find an exact formula for the entropy of this system in terms of d, N, and L. (Hint: How

many ways can one divide N links into M right-pointing links and N − M left-pointing links, so that the total length is L?)

Fig. 5.17 Rubber band. Simple model of a rubber band with N = 100 segments. The beginning of the polymer is at the top; the end is at the bottom; the vertical displacements are added for visualization.

The external world, in equilibrium at temperature T, exerts a force pulling the end of the molecule to the right. The molecule must exert an equal and opposite entropic force F.
(b) Find an expression for the force F exerted by the molecule on the bath in terms of the bath entropy. (Hint: The bath temperature 1/T = ∂Sbath/∂E, and force times distance is energy.) Using the fact that the length L must maximize the entropy of the Universe, write a general expression for F in terms of the internal entropy S of the molecule.
(c) Using our model of the molecule from part (a), the general law of part (b), and Stirling's formula log(n!) ≈ n log n − n, write the force law F(L) for our molecule for large lengths N. What is the spring constant K in Hooke's law F = −KL for our molecule, for small L?
Our model has no internal energy; this force is entirely entropic. Note how magical this is – we never considered the mechanism of how the segments would generate a force. Statistical mechanics tells us that the force generated by our segmented chain is independent of the mechanism. The joint angles in the chain may jiggle from thermal motion, or the constituent polymer monomers may execute thermal motion – so long as the configuration space is segment orientations and the effective potential energy is zero, the force will be given by our calculation. For the same reason, the pressure due to compressing an ideal gas is independent of the mechanism. The kinetic energy of particle collisions for real dilute gases gives the same pressure as the complex


water-solvent interactions give for osmotic pressure (Section 5.2.2).
(d) If we increase the temperature of our rubber band while it is under tension, will it expand or contract? Why?
In a more realistic model of a rubber band, the entropy consists primarily of our configurational random-walk entropy plus a vibrational entropy of the molecules. If we stretch the rubber band without allowing heat to flow in or out of the rubber, the total entropy should stay approximately constant. (Rubber is designed to bounce well; little irreversible entropy is generated in a cycle of stretching and compression, so long as the deformation is not too abrupt.)
(e) True or false?
(T) (F) When we stretch the rubber band, it will cool; the configurational entropy of the random walk will decrease, causing the entropy in the vibrations to decrease, causing the temperature to decrease.
(T) (F) When we stretch the rubber band, it will cool; the configurational entropy of the random walk will decrease, causing the entropy in the vibrations to increase, causing the temperature to decrease.
(T) (F) When we let the rubber band relax, it will cool; the configurational entropy of the random walk will increase, causing the entropy in the vibrations to decrease, causing the temperature to decrease.
(T) (F) When we let the rubber band relax, there must be no temperature change, since the entropy is constant.
This more realistic model is much like the ideal gas, which also had no configurational energy.
(T) (F) Like the ideal gas, the temperature changes because of the net work done on the system.
(T) (F) Unlike the ideal gas, the work done on the rubber band is positive when the rubber band expands.
You should check your conclusions experimentally; find a rubber band (thick and stretchy is best), touch it to your lips (which are very sensitive to temperature), and stretch and relax it.

(5.6) Loaded dice and sticky spheres. ©p
(a) A loaded three-sided die has probability 1/2 of rolling a one, and probability 1/4 of rolling a two or a three. What is the entropy (in units of kB) per roll?

(b) Let θ, φ be the latitude and longitude on the Earth, which we assume to be a sphere. If comets hit the earth with uniform probability density ρ_c(θ, φ) = 1/(4π) and asteroids hit the earth with probability density ρ_a(θ, φ) = cos(θ)/π^2 that is maximum at the equator and zero at the poles, which distribution has higher entropy? What is the entropy difference between the distribution of asteroids and comets? (Hints: The integral of f(θ) over a sphere is ∫_{−π/2}^{π/2} 2π cos(θ) f(θ) dθ. You may also want to know that ∫_{−π/2}^{π/2} cos^2(θ) dθ = π/2 and ∫_{−π/2}^{π/2} cos^2(θ) log(cos(θ)) dθ = π(1 − log(4))/4.)
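A small numeric sketch (my own check, not part of the exercise) of the continuous entropies in part (b), evaluated as S = −∫ ρ ln ρ over the sphere by quadrature (scipy is assumed for the integration); the die entropy of part (a) is included for comparison:

```python
import math
from scipy.integrate import quad

# Part (a): loaded three-sided die, S = -sum p ln p (in units of kB)
p = [0.5, 0.25, 0.25]
S_die = -sum(pi * math.log(pi) for pi in p)

# Part (b): S = -integral over the sphere of rho ln(rho),
# with area element 2*pi*cos(theta) dtheta
def S_sphere(rho):
    def integrand(th):
        r = rho(th)
        return 0.0 if r <= 0 else 2 * math.pi * math.cos(th) * r * math.log(r)
    val, _ = quad(integrand, -math.pi / 2, math.pi / 2)
    return -val

S_comets = S_sphere(lambda th: 1 / (4 * math.pi))
S_asteroids = S_sphere(lambda th: math.cos(th) / math.pi**2)

print(f"die entropy per roll  = {S_die:.4f} kB")        # (3/2) ln 2
print(f"comet distribution    = {S_comets:.4f} kB")
print(f"asteroid distribution = {S_asteroids:.4f} kB")
print(f"difference S_a - S_c  = {S_asteroids - S_comets:.4f} kB")
```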

(5.7) Gravity and entropy. ©p
The motion of gravitationally bound collections of particles would seem a natural application of statistical mechanics: chaotic motion of many particles. But there are challenges.
(a) The gravitational interaction is long range. One of our key assumptions in deriving the entropy was that it should be extensive. Should the entropy of the galaxy be a sum of the entropies of pieces of the galaxy?
Suppose we cut off the long-range force, say with a Yukawa potential. The gravitational interaction is still very attractive at short distances.
(b) Write the Boltzmann distribution for point particles in three dimensions interacting via gravity, at a finite temperature T. (This temperature might be determined by an average kinetic energy of the stars, for example.) Write the partition function Z as an integral over the distance r between the two particles. Does the integral converge as r → 0? Interpret.

(5.8) Aging, entropy, and DNA. ©3
Is human aging inevitable? Does the fact that entropy must increase mean that our cells must run down? In particular, as we get older the DNA in our cells gradually builds up damage – thought to be a significant contribution to the aging process (and a key cause of cancer). Can we measure DNA damage using entropy, to quantify the challenge of keeping ourselves young?
There are roughly thirty trillion (3×10^13) cells in the human body, and about three billion (3×10^9) nucleotides in the DNA of each cell. Each nucleotide comes in four types (C, T, A, and G). The damaged DNA of each cell will be different. The repair job for fixing all of our DNA cannot be worse than changing totally random sequences back into exact copies of the correct sequence.


(a) How many bits of information is it possible to store in the nucleotide sequence of the DNA in an entire human body? How much entropy, in joules/Kelvin, is associated with a completely randomized sequence in every cell? (Boltzmann's constant is 1.38×10^−23 J/K.)
Life exists by consuming low-entropy sources and turning them into higher-entropy byproducts. A small cookie has 100 Calories. (Be warned: a Calorie is 1000 calories. Food calories are measured in kilocalories, and then the kilo is conventionally dropped in favor of a capital C.) Body temperature is about 310 K.
(b) Calculate the minimum free energy needed to repair a human's DNA if it starts in a completely scrambled state. How many cookies would one need to consume? (There are 4.184 joules per calorie, and 1000 calories per Calorie.)
Entropy does not discriminate between important and trivial information. Knowing that air molecules are confined in a balloon is much less useful than knowing that your kid's toys are put away neatly (or knowing the contents of the Library of Congress), and there are a lot of air molecules. . .
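A rough numeric sketch of the estimates in parts (a) and (b) (my own illustration; it takes the minimum repair free energy to be TΔS for a fully scrambled sequence):

```python
import math

kB = 1.380649e-23      # J/K
T_body = 310.0         # K

n_nucleotides = 3e13 * 3e9            # cells x nucleotides per cell
bits = n_nucleotides * math.log2(4)   # two bits per nucleotide
S = n_nucleotides * kB * math.log(4)  # entropy of a completely random sequence

F_min = T_body * S                    # minimum free energy to erase that entropy
cookie = 100 * 1000 * 4.184           # 100 Calories, in joules

print(f"information capacity ~ {bits:.2e} bits")
print(f"entropy              ~ {S:.2f} J/K")
print(f"minimum free energy  ~ {F_min:.0f} J ~ {F_min / cookie:.4f} cookies")
```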

(5.9) Entropy of the galaxy.^37 (Astrophysics) ©i
What dominates the entropy of our Galaxy? The known sources of entropy include stars, interstellar gas, the microwave background photons inside the Galaxy, and the supermassive black hole at the galactic center.
(a) Stars. Presume there are 100 billion stars (10^11) in our galaxy, with mass m = 2×10^30 kg, volume V = 10^45 m^3, and temperature T = 10^7 K. Assume that they are made of hydrogen, and form an ideal gas (a poor approximation). What is the entropy of one star? Compare to Bekenstein's estimate of 10^35 J/K (Jacob D. Bekenstein, "Black holes and entropy", Phys. Rev. D 7, 2333 (1973)). What is your estimate for the total entropy S_stars of the stars in the Galaxy?
(b) Gas. If the volume of the galaxy is V = 10^61 m^3, with interstellar gas of hydrogen atoms of total mass 10% that in the stars, and temperature 10 K, what is its entropy S_gas?
(c) Microwave photons. What is the total entropy for the cosmic microwave background radiation at T_CMB = 2.75 K in the volume V of the galaxy?
(d) Galactic black hole. What is the entropy for the supermassive black hole at the center of the galaxy, if its mass is M_BH = 10^37 kg?
(e) Where to live? Where would a technological civilization locate, if finding a place to deposit entropy was the bottleneck for their activities?

(5.10) Nucleosynthesis and the arrow of time.^38 (Astrophysics) ©3
In this exercise, we shall explain how the first few minutes after the Big Bang set up a low-entropy state that both fuels our life on Earth and provides an arrow of time. To do so, we shall explore the statistical mechanics of reaction rates in the early universe, use the equilibrium number density and entropy of the non-interacting photon gas, and model the Universe as a heat engine.
The microscopic laws of physics are invariant under time reversal.^39 It is statistical mechanics that distinguishes future from past – the future is the direction in which entropy increases.
This must arise because the early universe started in a low-entropy state. Experimentally, however, measurements of the cosmic microwave background radiation tell us that the matter in the universe^40 started out as nearly uniform, hot, dense gas (Exercises 7.15 and 10.1) – an equilibrium state and hence of maximum entropy. How can a state of maximal entropy become our initial low-entropy state?
Some cosmologists suggest that gravitational effects that lead matter to cluster into galaxies, stars, and eventually black holes explain the problem; the initial uniform distribution of matter and energy maximizes the entropy of the particles, but is a low entropy state for space-time and gravitational

^37 This exercise was developed in collaboration with Katherine Quinn.
^38 This exercise was developed in collaboration with Katherine Quinn.
^39 Technically, the weak interaction is not time-reversal symmetric. But the laws of physics are invariant under CPT, where charge-conjugation reverses matter and antimatter and parity reflects spatial coordinates. There are no indications that entropy of antimatter decreases with time – the arrow of time for both appears the same.
^40 We do not have direct evidence for equilibration of neutrinos with the other massive particles, as they must have fallen out of equilibrium before the decoupling of light from matter which led to the microwave background. But estimates suggest that they too started in equilibrium.


degrees of freedom. It is certainly true on a cosmological scale that the formation of black holes can dominate the creation of entropy. However, gravity has not been a significant source of free energy (sink for entropy) apart from the vicinity of black holes; a thorough critique is given by Wallace, Brit. J. Phil. Soc. 61, 513-540 (2010). The dominant source of free energy in the Solar System is not gravitation, but the hydrogen which fuels the Sun.^41 Hydrogen predominates now because the Universe fell out of equilibrium as it expanded.
When thermally isolated systems are changed slowly enough to remain in equilibrium, we say the evolution is adiabatic.^42 Closed systems evolving adiabatically do not change in entropy.
(a) If a closed, thermally isolated system changes too rapidly to stay in equilibrium, will the entropy change be positive or negative? Does a fall from equilibrium by itself explain the low-entropy initial state that determines the arrow of time?

Fig. 5.18 The Universe as a piston. Imagine the expansion of the Universe as the decompression of a gas of photons and nucleons. The Universe does not exchange heat with the 'space-time piston' as it expands, but it can do work on the piston. (The image is galaxy cluster MACS 1206, a portion of the image taken by the Hubble telescope (NASA, ESA, M. Postman (STScI) and the CLASH Team). We shall ignore the formation of galaxies and stars, however, and compare the formation of an H-Universe of hydrogen gas to a Fe-Universe consisting solely of iron atoms.)

The very early Universe was so hot that any heavier nuclei quickly evaporated into protons and neutrons. Between a few seconds and a couple of minutes after the Big Bang, protons and neutrons began to fuse into light nuclei – mostly helium. The rates for conversion to heavier elements were too slow, however, by the time they became entropically favorable. Almost all of the heavier elements on Earth were formed later, inside stars (see Exercise (6.7)).

Fig. 5.19 PV diagram for the universe. In our Universe, the baryons largely did not fuse into heavier elements, because the reaction rates were too slow. Only in the eons since have stars been fusing hydrogen into heavier elements, giving off non-equilibrium flows of low-entropy sunlight. Here we contrast a simplified version of our Universe with that of an equilibrium universe where nucleosynthesis finished at early times. This schematic (not drawn to scale) shows the pressure-volume paths of three hypothetical scenarios as the 'piston' of Fig. (5.18) is moved. In the 'H-Universe', nucleosynthesis is completely suppressed; the proton gas adiabatically cools into hydrogen atoms, until a time near the present when hydrogen is fused by intelligent beings (dashed line). In the 'Fe-Universe', nucleosynthesis stayed in equilibrium as the universe expanded, leaving only iron atoms. Finally, we consider a 'Big Crunch' where the H-Universe is recompressed, eventually re-forming a hot proton gas.

How would our Universe compare with one with faster nucleosynthesis reactions, which remained

^41 Helmholtz and Kelvin in the last half of the 1800's considered whether the Sun could shine from the energy gained by gravitational collapse. They concluded that the Sun could last for a few million years. Even then, geology and biology made it clear that the Earth had been around for billions of years; gravity is not enough.
^42 Adiabatic is sometimes used to describe systems that do not exchange heat with their surroundings. It is sometimes used to describe systems that change slowly enough to stay in their quantum ground states, or more generally stay in equilibrium. We are using it to mean both.


in equilibrium? To simplify the problem, we shall ignore gravity (except insofar as it drives the expansion of the Universe), chemistry (no molecule or iron lump formation), and the distinction between protons and neutrons (just keep track of the baryons). We shall compare two universes (see Fig. (5.19)). In the Fe-Universe nucleosynthesis is very fast; this universe begins as a gas of protons at high temperatures but stays in equilibrium, forming iron atoms early on in the expansion. In the H-Universe we imagine nucleosynthesis is so slow that no fusion takes place; it evolves as an adiabatically^43 expanding gas of hydrogen atoms.^44

We want to compare the entropy of the H-Universe with that of the Fe-Universe.
(b) How much does the entropy of the Fe-Universe change as it adiabatically expands from an initial gas of hot protons, through nucleosynthesis, to a gas of iron atoms at the current baryon density? How much does the entropy of the H-Universe change as it expands adiabatically into a gas of hydrogen atoms at the current density? (No calculations should be needed, but a cogent argument with a precise answer is needed for both entropy changes.)
Some facts and parameters. Wikipedia tells us that nucleosynthesis happened "A few minutes into the expansion, when the temperature was about a billion . . . Kelvin and the density was about that of air". The current microwave background radiation temperature is T_CMB = 2.725 K; let us for convenience set the temperature when nucleosynthesis would happen in equilibrium to a billion times this value, T_reaction = 2.725×10^9 K. (We estimate the actual temperature in Exercise (6.7).) The thermal photons and the baryons are roughly conserved during the expansion, and the baryon-to-(microwave background photon) ratio is estimated to be η = [Baryons]/[Photons] ∼ 5×10^−10 (Exercise 6.7). Hence the temperature, energy density, and pressure are almost completely set by the photon density during this era (see part (c) below). The

red shift of the photons increases their wavelength proportional to the expansion, and thus reduces their energy (and hence the temperature) inversely with the expansion rate. Thus the number density of photons and baryons goes as T^3. The current density of baryons is approximately one per cubic meter,^45 and thus we assume that the equilibrium synthesis happens at a density of 10^27 per cubic meter. Fusing 56 protons into ^56Fe releases an energy of about ∆E = 5×10^8 eV, or about ∆E/56 = 1.4×10^−12 joules/baryon.
Truths about the photon gas (see Exercise (7.15)). The number density for the photon gas is N/V = (2ζ(3)/π^2)(k_B T)^3/(c^3 ℏ^3), the internal energy density is E/V = (π^2/15)(k_B T)^4/(c^3 ℏ^3) = 4σT^4/c, and the pressure is P = E/3V.^46 The baryons do not contribute a significant pressure or kinetic energy density.
(c) Calculate the temperature rise of the Fe-Universe when the hydrogen fuses into iron in equilibrium at high temperature, pressure, and density, releasing thermal photons. Calculate the difference in temperature for the Fe-Universe filled with iron and the H-Universe filled with hydrogen gas, when both reach the current baryon density. (Hint: How does the temperature change with volume?)
(d) Calculate the temperature rise in the H-Universe if we fuse all the hydrogen into iron at current densities and pressures (dashed line in Figure 5.19), using our heat engines to turn the energy into thermal photons. Calculate the pressure rise. Calculate how much entropy would be released per baryon, in units of kB.
The key is not that our Universe started in a low entropy state. It is that it has an untapped source of energy. Why, by postponing our fusion, do we end up with so much more energy? The energy for the matter and photons in an expanding universe is not conserved.
(e) Calculate the difference in work done per baryon by the photon gas during the expansion, between the Fe-Universe and the H-Universe. Compare it to the energy released by nucleosynthesis.

^43 The nuclear degrees of freedom, which do not evolve, can be viewed as decoupled from the photons and the atomic motions.
^44 Without stars, the intelligent beings in the H-Universe will need to scoop up hydrogen with Bussard ramjets, fuse it into heavier elements, and use the resulting heavy elements (and abundant energy) to build their progeny and their technological support system. But they are better off than the inhabitants of the Fe-Universe.
^45 Very roughly.
^46 Here σ = π^2 k_B^4/(60 ℏ^3 c^2) = 5.67×10^−8 J/(m^2 s K^4) is the Stefan-Boltzmann constant.


(Hint: Calculate the temperature T(V) and the difference ∆T(V) as a function of volume. Use that to determine ∆P(V).)
(f) Consider a hypothetical 'big crunch', where the universe contracts slowly after expanding (and nothing irreversible happens at low densities). In the post-burn H-Universe, will the pressure and volume return to their original values at volumes per baryon much smaller than 10^−27 m^3? (That is, will the P-V loop close in Fig. 5.19?) If not, why not? What will the 'Fe-Universe Big Crunch' return path be, assuming the return path is also in equilibrium?
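As a numeric aside (my own sketch, not part of the exercise), the photon-gas formulas quoted in the "Truths about the photon gas" paragraph above can be evaluated directly, here at T_reaction = 2.725×10^9 K and at T_CMB = 2.725 K:

```python
import math

hbar, c, kB = 1.0546e-34, 2.998e8, 1.381e-23   # SI units
zeta3 = 1.2020569                               # Riemann zeta(3)

def photon_gas(T):
    """Number density, energy density, and pressure of a thermal photon gas."""
    n = (2 * zeta3 / math.pi**2) * (kB * T / (hbar * c))**3
    e = (math.pi**2 / 15) * (kB * T)**4 / (c**3 * hbar**3)
    return n, e, e / 3

for T in (2.725e9, 2.725):
    n, e, P = photon_gas(T)
    print(f"T = {T:.3e} K: N/V = {n:.3e} /m^3, E/V = {e:.3e} J/m^3, P = {P:.3e} Pa")
```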


Chapter 6: Free energies

Exercises

(6.1) Pollen and hard squares. ©i

Fig. 6.20 Square pollen grain in fluid of oriented square molecules, next to a wall. The thin lines represent the exclusion region around the pollen grain and away from the wall.

Objects embedded in a gas will have an effective attractive force at short distances, when the gas molecules can no longer fit between the objects. This is called the depletion force, and is a common tool in physics to get large particles to clump together. One can view this force as a pressure imbalance (no collisions from one side) or as an entropic attraction.
Let us model the entropic attraction between a pollen grain and a wall using a two-dimensional ideal gas of classical indistinguishable particles as the fluid. For convenience, we imagine that the pollen grain and the fluid are formed from square particles lined up with the axes of the box, of lengths B and b, respectively (Fig. 6.20). We assume no interaction between the ideal gas molecules (unlike in Exercise 3.2), but the potential energy is infinite if the gas molecules overlap with the pollen grain or with the wall. The container as a whole has one pollen grain, N gas molecules, and total area L×L.

Assume the pollen grain is close to only one wall. Let the distance from the surface of the wall to the closest face of the pollen grain be Q. (A similar square-particle problem with interacting small molecules is studied in [?].)
(a) Plot the area A(Q) available for the gas molecules, in units of (length)^2, in three steps. Plot A(Q ≫ 0) when the pollen grain is far from the wall. At what distance Q0 does this change? Plot A(0), the area available when the pollen grain is touching the wall. Finally, how does the area vary with Q (linearly, quadratically, step function . . . ) between zero and Q0? Plot this, and give formulæ for A(Q) as a function of Q for the two relevant regions, Q < Q0 and Q > Q0.
(b) What is the configuration-space volume Ω(Q) for the gas, in units of (length)^{2N}? What is the configurational entropy of the ideal gas, S(Q)? (Write your answers here in terms of A(Q).)
Your answers to part (b) can be viewed as giving a free energy for the pollen grain after integrating over the gas degrees of freedom (also known as a partial trace, or coarse-grained free energy).
(c) What is the resulting coarse-grained free energy of the pollen grain, F(Q) = E − T S(Q), in the two regions Q > b and Q < b? Use F(Q) to calculate the force on the pollen grain for Q < b. Is the force positive (away from the wall) or negative? Why?
(d) Directly calculate the force due to the ideal gas pressure on the far side of the pollen grain, in terms of A(Q). Compare it to the force from the partial trace in part (c). Why is there no balancing force from the other side? Effectively how 'long' is the far side of the pollen grain?

(6.2) Rubber band free energy. (Condensed matter) ©p
Consider again the rubber band of Exercise (5.12). View it as a collection of segments s_n = ±1 of length d, so L = d Σ_n s_n.

(a) Write a Hamiltonian for our rubber band under an external force. Show that it can be written as a sum of uncoupled Hamiltonians, one for each link of the rubber band model. Solve for the free


energy of one link, and then use this to solve for the partition function X(T, F).
(b) Calculate the associated thermodynamic potential χ(T, F). Write the formula relating the spring constant k = ∂F/∂L to χ, and evaluate it at L = 0.
The calculations of this exercise, and of the original rubber band Exercise 5.5, are closely related to Exercise (6.3) on negative temperatures.

(6.3) Rubber band: Illustrating the formalism. (Condensed matter) ©p
Consider the rubber band of Exercise (5.12). View it as a system that can exchange 'length' L_M = d(2M − N) with the external world, with the force F being an energy cost for buying length.
(a) Write the partition function X(T, F) for our model rubber band under an external force, as a sum over the number of configurations Ω(L_M) at fixed length. (See the analogous third line in eqn (6.37).)
(b) Write the corresponding thermodynamic potential χ(T, F) in terms of X. Write χ in terms of E, T, S, F, and L. (For example, eqn (6.17) says that a system connected to an external bath is described by 〈E〉 − TS; what analogous expression applies here?)

(6.4) Molecular motors and free energies.^47 (Biology) ©i
Figure 6.21 shows the set-up of an experiment on the molecular motor RNA polymerase that transcribes DNA into RNA.^48 Choosing a good ensemble for this system is a bit involved. It is under two constant forces (F and pressure), and involves complicated chemistry and biology. Nonetheless, you know some things based on fundamental principles. Let us consider the optical trap and the distant fluid as being part of the external environment, and define the 'system' as the local region of DNA, the RNA, motor, and the fluid and local molecules in a region immediately enclosing the region, as shown in Fig. 6.21.

Fig. 6.21 RNA polymerase molecular motor attached to a glass slide is pulling along a DNA molecule (transcribing it into RNA). The opposite end of the DNA molecule is attached to a bead which is being pulled by an optical trap with a constant external force F. Let the distance from the motor to the bead be x; thus the motor is trying to move to decrease x and the force is trying to increase x.

(a) Without knowing anything further about the chemistry or biology in the system, which of the following must be true on average, in all cases?
(T) (F) The total entropy of the Universe (the system, bead, trap, laser beam, . . . ) must increase or stay unchanged with time.
(T) (F) The entropy S_s of the system cannot decrease with time.
(T) (F) The total energy E_T of the Universe must decrease with time.
(T) (F) The energy E_s of the system cannot increase with time.
(T) (F) G_s − Fx = E_s − TS_s + PV_s − Fx cannot increase with time, where G_s is the Gibbs free energy of the system. Related formula: G = E − TS + PV.
(Hint: Precisely two of the answers are correct.)
The sequence of monomers on the RNA can encode information for building proteins, and can also cause the RNA to fold into shapes that are important to its function. One of the most important such structures is the hairpin (Fig. 6.22). Experimentalists study the strength of these hairpins

^47 This exercise was developed with the help of Michelle Wang.
^48 RNA, ribonucleic acid, is a long polymer like DNA, with many functions in living cells. It has four monomer units (A, U, C, and G: Adenine, Uracil, Cytosine, and Guanine); DNA has T (Thymine) instead of Uracil. Transcription just copies the DNA sequence letter for letter into RNA, except for this substitution.


by pulling on them (also with laser tweezers). Under a sufficiently large force, the hairpin will unzip. Near the threshold for unzipping, the RNA is found to jump between the zipped and unzipped states, giving telegraph noise^49 (Fig. 6.23). Just as the current in a telegraph signal is either on or off, these systems are bistable and make transitions from one state to the other; they are a two-state system.

Fig. 6.22 Hairpins in RNA. (Reprinted with permission from Liphardt et al. [?], ©2001 AAAS.) A length of RNA attaches to an inverted, complementary strand immediately following, forming a hairpin fold.

The two RNA configurations presumably have different energies (E_z, E_u), entropies (S_z, S_u), and volumes (V_z, V_u) for the local region around the zipped and unzipped states, respectively. The environment is at temperature T and pressure P. Let L = L_u − L_z be the extra length of RNA in the unzipped state. Let ρ_z be the fraction of the time our molecule is zipped at a given external force F, and ρ_u = 1 − ρ_z be the unzipped fraction of time.
(b) Of the following statements, which are true, assuming that the pulled RNA is in equilibrium?
(T) (F) ρ_z/ρ_u = exp((S^tot_z − S^tot_u)/k_B), where S^tot_z and S^tot_u are the total entropy of the Universe when the RNA is in the zipped and unzipped states, respectively.
(T) (F) ρ_z/ρ_u = exp(−(E_z − E_u)/k_B T).
(T) (F) ρ_z/ρ_u = exp(−(G_z − G_u)/k_B T), where G_z = E_z − TS_z + PV_z and G_u = E_u − TS_u + PV_u are the Gibbs energies in the two states.
(T) (F) ρ_z/ρ_u = exp(−(G_z − G_u + FL)/k_B T), where L is the extra length of the unzipped RNA and F is the applied force.

Fig. 6.23 Telegraph noise in RNA unzipping. (Reprinted with permission from Liphardt et al. [?], ©2001 AAAS.) As the force increases, the fraction of time spent in the zipped state decreases.

(6.5) Langevin dynamics. (Computational) ©p
Even though energy is conserved, macroscopic objects like pendulums and rubber balls tend to minimize their potential and kinetic energies unless they are externally forced. Section (6.5) explains that these energies get transferred into heat – they get lost into the 6N internal degrees of freedom. Equation (6.49) suggests that these microscopic degrees of freedom can produce both friction γ and noise ξ(t).
(a) First consider the equation v̇ = ξ(t) for the velocity of a particle of mass m. Assume ξ(t) gives a 'kick' to the particle at regular intervals separated by ∆t: ξ(t) = Σ_{j=−∞}^{∞} ξ_j δ(t − j∆t), with 〈ξ_i^2〉 = σ^2 and 〈ξ_i ξ_j〉 = 0 for i ≠ j. Argue that this leads to a random walk in momentum space. Will this alone lead to a thermal equilibrium state?
(b) Now include the effects of damping, which reduces the momentum by exp(−γ∆t) during each time step, so p_j = exp(−γ∆t) p_{j−1} + m ξ_j. By summing the geometrical series, find 〈p_j^2〉 in terms of the mass m, the noise σ, the damping γ, and the time step ∆t.

^49 Like a telegraph key going on and off at different intervals to send dots and dashes, a system showing telegraph noise jumps between two states at random intervals.


(c) What is the relation between the noise and the damping needed to make 〈p^2/2m〉 = (1/2) k_B T, as needed for thermal equilibrium?
This is Langevin dynamics, which is both a way of modeling the effects of the environment on macroscopic systems and a numerical method for simulating systems at fixed temperature (i.e., in the canonical ensemble).
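A short simulation sketch (my own illustration, with arbitrary parameter values and Gaussian kicks of variance σ^2 assumed) of the kicked-and-damped update of part (b), comparing the measured 〈p^2〉 against the geometric-series result m^2 σ^2 / (1 − e^{−2γ∆t}):

```python
import numpy as np

rng = np.random.default_rng(0)
m, sigma, gamma, dt = 1.0, 0.3, 0.5, 0.1   # illustrative values
n_steps = 200_000

p = 0.0
p2_samples = []
decay = np.exp(-gamma * dt)
for _ in range(n_steps):
    p = decay * p + m * rng.normal(0.0, sigma)   # damp, then kick
    p2_samples.append(p * p)

p2_measured = np.mean(p2_samples[n_steps // 10:])              # drop the transient
p2_predicted = m**2 * sigma**2 / (1 - np.exp(-2 * gamma * dt))

print(f"<p^2> measured  = {p2_measured:.4f}")
print(f"<p^2> predicted = {p2_predicted:.4f}")
```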

(6.6) Newton's theory of sound. ©p
Note that our derivation of the diffusion equation (6.67) for perfume in still air ignored the air molecule degrees of freedom. Without saying so, we have integrated out the air degrees of freedom – allowing the perfume molecules to exchange momentum and energy locally with a 'bath' of air molecules. Let us consider the effective equation of motion for an ideal gas with a locally conserved momentum, but at constant temperature (exchanging heat but not momentum with its environment).
If the mass of the ideal gas atoms is m, the momentum density is Π(x) = mJ(x), where J(x) = ρ(x)v(x) is the particle current and v(x) is the mean particle velocity. Newton's law F = ṗ = ma, turned into densities, tells us that Π̇ = F(x), where the force density F(x) = −ρ(x) ∂µ/∂x is the density of particles times the force per particle.

Using ∂ρ/∂t = −∂J/∂x, find ∂^2ρ/∂t^2 in terms of ρ and ∂µ/∂x. Use the ideal gas law eqn (6.65) for µ(x) to find ∂^2ρ/∂t^2. What law emerges? What is the speed of sound?
(Newton originally proposed this calculation in the Principia; it was later pointed out by Laplace that sound vibrations are too fast to exchange heat with their surroundings.)
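As an illustrative aside (mine, not the exercise's solution), one can compare the isothermal estimate √(k_B T/m) that an isothermal ideal gas gives with the adiabatic value √(γ k_B T/m) from Laplace's correction; the air parameters below are assumed values:

```python
import math

kB = 1.380649e-23           # J/K
m_air = 29e-3 / 6.022e23    # average mass of an air molecule, kg (~29 g/mol)
T = 293.0                   # K
gamma = 1.4                 # adiabatic index for diatomic air

c_isothermal = math.sqrt(kB * T / m_air)          # Newton's (isothermal) estimate
c_adiabatic = math.sqrt(gamma * kB * T / m_air)   # Laplace's (adiabatic) correction

print(f"isothermal sound speed ~ {c_isothermal:.0f} m/s")
print(f"adiabatic  sound speed ~ {c_adiabatic:.0f} m/s  (measured: ~343 m/s)")
```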

(6.7) Nucleosynthesis as a chemical reaction.^50 (Astrophysics) ©3
The very early Universe was so hot that any heavier nuclei quickly evaporated into protons and neutrons. Between a few seconds and a couple

of minutes after the Big Bang, protons and neutrons began to fuse into light nuclei – mostly helium. The energy barriers made conversion to heavier elements too slow, however, by the time those elements became entropically favorable. Almost all of the heavier elements on Earth were formed later, inside stars.
In this exercise, we explore a hypothetical universe with faster nucleosynthesis reactions, or slower expansion, so that the reactions remained in equilibrium. Clearly the nucleons for the equilibrated Universe will all fuse into their most stable form.^51 The binding energy ∆E released by fusing nucleons into ^56Fe is about 5×10^8 eV (about half a proton mass). In this exercise, we shall ignore the difference between protons and neutrons,^52 ignore nuclear excitations (assuming no internal entropy for the nuclei, so ∆E is the free energy difference), and ignore the electrons (so protons are the same as hydrogen atoms, etc.)
The creation of ^56Fe from nucleons involves a complex cascade of reactions. We argued in Section (6.6) that however complicated the reactions, they must in net be described by the overall reaction, here

56 p → ^56Fe,    (6.32)

releasing an energy of about ∆E = 5×10^8 eV, or about ∆E/56 = 1.4×10^−12 joules/baryon. We used the fact that most chemical species are dilute (and hence can be described by an ideal gas) to derive the corresponding law of mass-action, here

[Fe]/[p]^56 = K_eq(T).    (6.33)

(Note that this equality, true in equilibrium no matter how messy the necessary reaction pathways, can be rationalized using the oversimplified picture that 56 protons must simultaneously collect in a small region to form an iron nucleus.) We then used the Helmholtz free energy for an ideal gas to calculate the reaction constant K_eq(T) explicitly for a similar reaction, in the ideal gas limit.
(a) Give symbolic formulas for K_eq(T) in

^50 This exercise was developed in collaboration with Katherine Quinn.
^51 It is usually said that ^56Fe is the most stable nucleus, but actually ^62Ni has a higher nuclear binding energy per nucleon. Iron-56 is favored by nucleosynthesis pathways and conditions inside stars, and we shall go with tradition and use it for our calculation.
^52 The entropy cost for our reaction in reality depends upon the ratio of neutrons to protons. Calculations imply that this ratio falls out of equilibrium a bit earlier, and is about 1/7 during this period. By ignoring the difference between neutrons and protons, we avoid considering reactions of the form M p + (56 − M) n + (M − 26) e → ^56Fe.


eqn (6.33), assuming the nucleons form an approximate ideal gas. Reduce it to expressions in terms of ∆E, k_B T, h, m_p, and the atomic number A = 56. (For convenience, assume m_Fe = A m_p, ignoring the half-proton-mass energy release. Check the sign of ∆E: should it be positive or negative for this reaction?) Evaluate K_eq(T) in MKS units, as a function of T. (Note: Do not be alarmed by the large numbers.)
Wikipedia tells us that nucleosynthesis happened "A few minutes into the expansion, when the temperature was about a billion . . . Kelvin and the density was about that of air". To figure out when half of the protons would have fused ([Fe] = [p]/A), we need to know the temperature and the net baryon density [p] + A[Fe]. We can use eqn (6.33) to give one equation relating the temperature to the baryon density.
(b) Give the symbolic formula for the net baryon density^53 at which half of the protons would have fused, in terms of K_eq and A.
What can we use for the other equation relating temperature to baryon density? One of the fundamental constants in the Universe as it evolves is the baryon to photon number ratio. The total number of baryons has been conserved in accelerator experiments of much higher energy than those during nucleosynthesis. The cosmic microwave background (CMB) photons have mostly been traveling in straight lines^54 since the

decoupling time, a few hundred thousand years after the Big Bang when the electrons and nucleons combined into atoms and became transparent (decoupled from photons). The ratio of baryons to CMB photons η ∼ 5×10^−10 has been constant since that point: two billion photons per nucleon.^55 Between the nucleosynthesis and decoupling eras, photons were created and scattered by matter, but there were so many more photons than baryons that the number of photons (and hence the baryon to photon ratio η) was still approximately constant.
We know the density of blackbody photons at a temperature T; we integrate eqn (7.66) for the number of photons per unit frequency to get

ρ_photons(T) = (2ζ(3)/π^2)(k_B T/ℏc)^3;    (6.34)

the current density ρ_photons(T_CMB) of microwave background photons at temperature T_CMB = 2.725 K is thus a bit over 400 per cubic centimeter, and hence the current baryon density is a fraction of a baryon per cubic meter.
(c) Numerically matching your formula for the net baryon density at the halfway point in part (b) to η ρ_photons(T_reaction), derive the temperature T_reaction at which our reaction would have occurred. (You can do this graphically, if needed.) Is it roughly a billion degrees Kelvin? Is the nucleon density roughly equal to that of air? (Hint: Air has a density of about 1 kg/m^3.)
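A quick numeric check (my own sketch) of the two densities just quoted, using eqn (6.34) and the baryon-to-photon ratio η:

```python
import math

hbar, c, kB = 1.0546e-34, 2.998e8, 1.381e-23
zeta3 = 1.2020569
eta = 5e-10                      # baryons per CMB photon

def rho_photons(T):
    """Photon number density of a blackbody at temperature T, eqn (6.34)."""
    return (2 * zeta3 / math.pi**2) * (kB * T / (hbar * c))**3

T_CMB = 2.725
n_gamma = rho_photons(T_CMB)
print(f"CMB photons today: {n_gamma * 1e-6:.0f} per cm^3")   # a bit over 400 per cm^3
print(f"baryons today:     {eta * n_gamma:.2f} per m^3")     # a fraction of a baryon per m^3
```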

^53 Here we ask you to assume, unphysically, that all baryons are either free protons or in iron nuclei – ignoring the baryons forming helium and other elements. These other elements will be rare at early (hot) times, and rare again at late (cold) times, but will spread out the energy release from nucleosynthesis over a larger range than our one-reaction estimate would suggest.
^54 More precisely, they have traveled on geodesics in space-time.
^55 It is amusing to note that this ratio is estimated using the models of nucleosynthesis that we are mimicking in this exercise.


Chapter 7: Quantum statistical mechanics

Exercises

(7.1) Eigenstate thermalization. ©p
Footnote (10) on page 139 discusses many-body eigenstates as 'weird delicate superpositions of states with photons being absorbed by the atom and the atom emitting photons'. It now appears that something much cooler often happens. Progress has been made since the text was written. Look up 'eigenstate thermalization'. Find a discussion of a single pure state of a large system A+B, tracing out the 'bath' B and leaving a density matrix for a small subsystem A. Is the discussion near the footnote still correct? What should be added or changed?

(7.2) Does entropy increase in quantum systems? (Mathematics, Quantum) ©i
We saw in Exercise 5.7 that in classical Hamiltonian systems the non-equilibrium entropy S_nonequil = −k_B ∫ ρ log ρ is constant. Here you will show that in the microscopic evolution of an isolated quantum system, the entropy is also time independent, even for general, time-dependent density matrices ρ(t).
Using the evolution law (eqn 7.19) ∂ρ/∂t = [H, ρ]/(iℏ), prove that S = Tr(ρ log ρ) is time independent, where ρ is any density matrix. (Two approaches: (1) Go to an orthonormal basis ψ_i which diagonalizes ρ. Show that ψ_i(t) is also orthonormal, and take the trace in that basis. (2) Let U(t) = exp(−iHt/ℏ) be the unitary operator that time evolves the wave function ψ(t). Show that ρ(t) = U(t)ρ(0)U†(t). Write S(t) as a formal power series in ρ(t). Show, term-by-term in the series, that S(t) = U(t)S(0)U†(t). Then use the cyclic invariance of the trace.)
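A numeric sanity check (my own sketch, with a randomly chosen Hamiltonian and density matrix, and ℏ set to 1) that the von Neumann entropy −Tr ρ log ρ is unchanged under unitary evolution ρ(t) = U ρ(0) U†; the overall sign and k_B prefactor do not affect the check:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
n = 6

# Random Hermitian Hamiltonian and random (positive, trace-one) density matrix
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
H = (A + A.conj().T) / 2
B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
rho0 = B @ B.conj().T
rho0 /= np.trace(rho0).real

def von_neumann_entropy(rho):
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log(evals)))

for t in (0.0, 0.5, 2.0):
    U = expm(-1j * H * t)
    rho_t = U @ rho0 @ U.conj().T
    print(f"t = {t:.1f}:  S = {von_neumann_entropy(rho_t):.10f}")
```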

(7.3) Bosons are gregarious: superfluids and lasers. (Quantum, Optics, Atomic physics) ©3
Adding a particle to a Bose condensate. Suppose we have a non-interacting system of bosonic atoms in a box with single-particle eigenstates ψ_n. Suppose the system begins in a Bose-condensed state with all N bosons in a state ψ_0, so

Ψ^[0]_N(r_1, . . . , r_N) = ψ_0(r_1) · · · ψ_0(r_N).    (7.35)

Suppose a new particle is gently injected into the system, into an equal superposition of the M lowest single-particle states.^56 That is, if it were injected into an empty box, it would start in state

φ(r_{N+1}) = (1/√M)(ψ_0(r_{N+1}) + ψ_1(r_{N+1}) + . . . + ψ_{M−1}(r_{N+1})).    (7.36)

The state Φ(r_1, . . . , r_{N+1}) after the particle is inserted into the non-interacting Bose condensate is given by symmetrizing the product function Ψ^[0]_N(r_1, . . . , r_N) φ(r_{N+1}) (eqn 7.30).

(a) Symmetrize the initial state of the system after the particle is injected. (For simplicity, you need not calculate the overall normalization constant C.) Calculate the ratio of the probability that all particles in the symmetrized state are in the state ψ_0 to the probability that one of the particles is in another particular state ψ_m for 0 < m < M. (Hint: First do it for N = 1. The enhancement should be by a factor N + 1.)
So, if a macroscopic number of bosons are in one single-particle eigenstate, a new particle will be much more likely to add itself to this state than to any of the microscopically populated states. Notice that nothing in your analysis depended on ψ_0 being the lowest energy state. If we started with a macroscopic number of particles in a single-particle

^56 For free particles in a cubical box of volume V, injecting a particle at the origin φ(r) = δ(r) would be a superposition of all plane-wave states of equal weight, δ(r) = (1/V) Σ_k e^{ik·x}. (In second-quantized notation, a†(x = 0) = (1/V) Σ_k a†_k.) We 'gently' add a particle at the origin by restricting this sum to low-energy states. This is how quantum tunneling into condensed states (say, in Josephson junctions or scanning tunneling microscopes) is usually modeled.


state with wavevector k (that is, a superfluid with a supercurrent in direction k), new added particles, or particles scattered by inhomogeneities, will preferentially enter into that state. This is an alternative approach to understanding the persistence of supercurrents, complementary to the topological approach (Exercise 9.7).
Adding a photon to a laser beam. This 'chummy' behavior between bosons is also the principle behind lasers.^57 A laser has N photons in a particular mode. An atom in an excited state emits a photon. The photon it emits will prefer to join the laser beam rather than go off into one of its other available modes, by a factor N + 1. Here the N represents stimulated emission, where the existing electromagnetic field pulls out the energy from the excited atom, and the +1 represents spontaneous emission which occurs even in the absence of existing photons.
Imagine a single atom in a state with excitation energy E and decay rate Γ, in a cubical box of volume V with periodic boundary conditions for the photons. By the energy-time uncertainty principle, 〈∆E ∆t〉 ≥ ℏ/2, the energy of the atom will be uncertain by an amount ∆E ∝ ℏΓ. Assume for simplicity that, in a cubical box without pre-existing photons, the atom would decay at an equal rate into any mode in the range E − ℏΓ/2 < ℏω < E + ℏΓ/2.
(b) Assuming a large box and a small decay rate Γ, find a formula for the number of modes M per unit volume V competing for the photon emitted from our atom. Evaluate your formula for a laser with wavelength λ = 619 nm and the line-width Γ = 10^4 rad/s. (Hint: Use the density of states, eqn 7.65.)
Assume the laser is already in operation, so there are N photons in the volume V of the lasing material, all in one plane-wave state (a single-mode laser).
(c) Using your result from part (a), give a formula for the number of photons per unit volume N/V there must be in the lasing mode for the atom to have 50% likelihood of emitting into that mode.
The main task in setting up a laser is providing a population of excited atoms. Amplification can occur if there is a population inversion, where the number of excited atoms is larger than the number of atoms in the lower energy state (definitely

a non-equilibrium condition). This is made possible by pumping atoms into the excited state by using one or two other single-particle eigenstates.

(7.4) Is sound a quasiparticle? ©p
Sound waves in the harmonic approximation are non-interacting – a general solution is given by a linear combination of the individual frequency modes. Landau's Fermi liquid theory (footnote (23), page 144) describes how the non-interacting electron approximation can be effective even though electrons are strongly coupled to one another. The quasiparticles are electrons with a screening cloud; they develop long lifetimes near the Fermi energy; they are described as poles of Green's functions.
(a) Do phonons have lifetimes? Do their lifetimes get long as the frequency goes to zero? (Look up ultrasonic attenuation and Goldstone's theorem.)
(b) Are they described as poles of a Green's function? (We will cover this in Chapter 10. Browse in particular Exercise (10.9), 'Quasiparticle poles and Goldstone's theorem'.)
(c) Can you think of an analogy to a screening cloud?

(7.5) Drawing wavefunctions. ©i
(a) Draw a typical first excited state Ψ(x, y) for two particles at positions x and y in a one-dimensional potential well, given that they are (i) distinguishable, (ii) bosons, and (iii) fermions. For the fermion wavefunction, assume the two spins are aligned. Do the plots in the 2D region −L/2 < x, y < L/2, emphasizing the two regions where the wavefunction is positive and negative, and the nodal curve where Ψ(x, y) = 0.
(b) Which nodal lines in your plots for part (a) are fixed? Which can vary depending on the many-body wavefunction? In particular, for the Bose excited state, must the nodal curve be straight? Which nodal lines in the fermion excited state must remain in position independent of the Hamiltonian?

(7.6) Nonsymmetric eigenstates. (Computational) ©2
Consider a three-particle Hamiltonian H that is suitable for identical particles (so it treats all particles the same — it has permutation symmetry). Any distinguishable eigenstate of H can be symmetrized or antisymmetrized, and if this does not

^57 Laser is an acronym for 'light amplification by the stimulated emission of radiation'.


give zero we get a boson or fermion eigenstate, respectively. Consider now H applied to three distinguishable particles.
Since permutations commute with H, can we choose eigenstates to be either totally symmetric or totally antisymmetric? Or are some distinguishable eigenstates of some kind of mixed symmetry? (Hint: Consider

Φ(x, y, z) = 1/2 (ψ(x)φ(y)ζ(z) − ψ(x)φ(z)ζ(y) + ψ(y)φ(x)ζ(z) − ψ(y)φ(z)ζ(x)).    (7.37)

Is it an eigenstate? Does it symmetrize to zero? Antisymmetrize to zero?)

(7.7) Universe of light baryons. ©2
Presume a parallel universe where neutrons and protons were as light as electrons (2000 times lighter than they are now). Estimate the superfluid transition temperature T_c for water in the parallel universe. Will it be superfluid at room temperature? (Presume the number of molecules of water per unit volume is comparable to the number of atoms of liquid helium per unit volume in the real world, and the water molecule stays about the same size in the parallel world.)

(7.8) Many-fermion wavefunction nodes. ©i
Consider an N-electron atom or molecule in three dimensions, treating the nuclei as fixed (and hence the Coulomb attraction to the electrons as a fixed external potential), with electron ground-state wavefunction Ψ(r), r ∈ R^{3N}. We shall explore the structure of the zeros of this 3N-dimensional wave function.
In high-dimensional spaces, one often measures the co-dimension of hypersurfaces by counting down from the dimension of the embedding space. Thus the unit hypersphere is co-dimension one, whatever the dimension of space is.
(a) What is the co-dimension of the sets where Ψ(r) = 0?
(b) What zeros are 'pinned in place' by the Fermi statistics of the electrons? What is the co-dimension of this lattice of pinned nodes? (Assume all electrons have parallel spins, for simplicity.)
I view the nodal surfaces as being like a soap bubble, pinned on the fixed nodes and then smoothly

interpolating to minimize the energy of the wavefunction. Diffusion quantum Monte Carlo, in fact, makes use of this. One starts with a carefully constructed variational wavefunction Ψ_0(r), optimizes the wavefunction in a 'fixed-node' approximation (finding the energy in a path integral restricted to one local connected region where Ψ_0(r) > 0), and then at the end relaxes the fixed-node approximation.
(c) Consider a connected region in configuration space surrounded by nodal surfaces where Ψ(r) > 0. Do you expect that Ψ in this region will define, through antisymmetry, Ψ everywhere in space? (Hint: For bosons and distinguishable particles, the ground state wavefunction has no nodes.)

(7.9) The Greenhouse effect and cooling coffee. (Astrophysics, Ecology) ©2
Vacuum is an excellent insulator. This is why the surface of the Sun can remain hot (T_S = 6000 K) even though it faces directly onto outer space at the microwave background radiation temperature T_MB = 2.725 K (Exercise 7.15). The main way^58 in which heat energy can pass through vacuum is by thermal electromagnetic radiation (photons). We will see in Exercise 7.7 that a black body radiates an energy σT^4 per square meter per second, where σ = 5.67×10^−8 J/(s m^2 K^4).

A vacuum flask or Thermos bottle™ keeps coffee warm by containing the coffee in a Dewar—a double-walled glass bottle with vacuum between the two walls.
(a) Coffee at an initial temperature T_H(0) = 100 °C of volume V = 150 mL is stored in a vacuum flask with surface area A = 0.1 m^2 in a room of temperature T_C = 20 °C. Write down symbolically the differential equation determining how the difference between the coffee temperature and the room temperature ∆(t) = T_H(t) − T_C decreases with time, assuming the vacuum surfaces of the dewar are black and remain at the current temperatures of the coffee and room. Solve this equation symbolically in the approximation that ∆ is small compared to T_C (by approximating T_H^4 = (T_C + ∆)^4 ≈ T_C^4 + 4∆T_C^3). What is the exponential decay time (the time it takes for the coffee to cool by a factor of e), both symbolically and numerically in seconds? (Useful conversion: 0 °C = 273.15 K.)
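A rough numeric sketch (mine, under one plausible reading of part (a): black surfaces, heat capacity of 150 mL of water, and the linearized loss C dΔ/dt ≈ −4AσT_C^3 Δ, so τ = C/(4AσT_C^3)):

```python
sigma = 5.67e-8           # Stefan-Boltzmann constant, J/(s m^2 K^4)
A = 0.1                   # radiating area, m^2
T_C = 20 + 273.15         # room temperature, K

# Heat capacity of the coffee, approximated as 150 mL of water
C_heat = 0.150 * 4184.0   # J/K

tau = C_heat / (4 * A * sigma * T_C**3)
print(f"decay time tau ~ {tau:.0f} s (~{tau / 60:.0f} minutes)")
```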

^58 The Sun and stars can also radiate energy by emitting neutrinos. This is particularly important during a supernova.


Real Dewars are not painted black! They are coated with shiny metals in order to minimize this radiative heat loss. (White or shiny materials not only absorb less radiation, but they also emit less radiation; see Exercise 7.7.)
The outward solar energy flux at the Earth's orbit is Φ_S = 1370 W/m^2, and the Earth's radius is approximately 6400 km, r_E = 6.4×10^6 m. The Earth reflects about 30% of the radiation from the Sun directly back into space (its albedo α ≈ 0.3). The remainder of the energy is eventually turned into heat, and radiated into space again. Like the Sun and the Universe, the Earth is fairly well described as a black-body radiation source in the infrared. We will see in Exercise 7.7 that a black body radiates an energy σT^4 per square meter per second, where σ = 5.67×10^−8 J/(s m^2 K^4).
(b) What temperature T_A does the Earth radiate at, in order to balance the energy flow from the Sun after direct reflection is accounted for? Is that hotter or colder than you would estimate from the temperatures you've experienced on the Earth's surface? (Warning: The energy flow in is proportional to the Earth's cross-sectional area, while the energy flow out is proportional to its surface area.)
The reason the Earth is warmer than would be expected from a simple radiative energy balance is the greenhouse effect.^59 The Earth's atmosphere is opaque in most of the infrared region in which the Earth's surface radiates heat. (This

frequency range coincides with the vibration frequencies of molecules in the Earth's upper atmosphere. Light is absorbed to create vibrations, collisions can exchange vibrational and translational (heat) energy, and the vibrations can later again emit light.) Thus it is the Earth's atmosphere which radiates at the temperature T_A you calculated in part (b); the upper atmosphere has a temperature intermediate between that of the Earth's surface and interstellar space.
The vibrations of oxygen and nitrogen, the main components of the atmosphere, are too symmetric to absorb energy (the transitions have no dipole moment), so the main greenhouse gases are water, carbon dioxide, methane, nitrous oxide, and chlorofluorocarbons (CFCs). The last four have significantly increased due to human activities; CO2

by ∼ 30% (due to burning of fossil fuels and clearing of vegetation), CH4 by ∼ 150% (due to cattle, sheep, rice farming, escape of natural gas, and decomposing garbage), N2O by ∼ 15% (from burning vegetation, industrial emission, and nitrogen fertilizers), and CFCs from an initial value near zero (from former aerosol sprays, now banned to spare the ozone layer). Were it not for the Greenhouse effect, we'd all freeze (like Mars)—but we could overdo it, and become like Venus (whose deep and CO2-rich atmosphere leads to a surface temperature hot enough to melt lead).
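A one-line numeric check (my own, balancing absorbed sunlight π r_E^2 (1 − α) Φ_S against blackbody emission from the full sphere 4π r_E^2 σ T_A^4) of the radiating temperature asked for in part (b):

```python
sigma = 5.67e-8      # J/(s m^2 K^4)
Phi_S = 1370.0       # solar flux at Earth's orbit, W/m^2
albedo = 0.3

# Cross-section absorbs, full sphere emits: (1 - albedo) Phi_S / 4 = sigma T_A^4
T_A = ((1 - albedo) * Phi_S / (4 * sigma)) ** 0.25
print(f"T_A ~ {T_A:.0f} K (~{T_A - 273.15:.0f} C)")
```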

^59 The glass in greenhouses also is transparent in the visible and opaque in the infrared. This, it turns out, isn't why it gets warm inside; the main insulating effect of the glass is to forbid the warm air from escaping. The greenhouse effect is in that sense poorly named.


Chapter 8: Calculation and computation

Exercises

(8.1) Solving the Ising model with parallel updates. ©p
Our description of the heat-bath and Metropolis algorithm equilibrates one spin at a time. This is inefficient even in Fortran and C++ (memory nonlocality); in interpreted languages like Python or Mathematica conditional loops like this are really slow.
Could we update all the spins at once, thermalizing them according to their current neighborhood? If not, can you figure out a way of bringing many of the spins to equilibrium with their neighborhoods in one set of array operations (without 'looping' over spins)? (No implementation is needed – just a description of the method.)
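One standard scheme of this kind (a sketch of my own, not the text's, with k_B = 1 units and illustrative parameters) is a checkerboard decomposition: spins on one sublattice have all their neighbors on the other sublattice, so each sublattice can be heat-bath updated in a single vectorized operation:

```python
import numpy as np

rng = np.random.default_rng(0)
L, J, T = 64, 1.0, 2.5
spins = rng.choice([-1, 1], size=(L, L))

# Checkerboard masks: a site and its four neighbors never share a sublattice
ii, jj = np.indices((L, L))
sublattice = [(ii + jj) % 2 == p for p in (0, 1)]

def heat_bath_sweep(s):
    for mask in sublattice:
        # Sum of the four neighbors (periodic boundaries) for every site at once
        h = (np.roll(s, 1, 0) + np.roll(s, -1, 0) +
             np.roll(s, 1, 1) + np.roll(s, -1, 1))
        p_up = 1.0 / (1.0 + np.exp(-2.0 * J * h / T))   # heat-bath probability of s = +1
        new = np.where(rng.random((L, L)) < p_up, 1, -1)
        s[mask] = new[mask]
    return s

for _ in range(100):
    spins = heat_bath_sweep(spins)
print("magnetization per spin:", spins.mean())
```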

(8.2) Detailed balance. ©i
In an equilibrium system, for any two states α and β with equilibrium probabilities ρ*_α and ρ*_β, detailed balance states (eqn 8.14) that

P_{β⇐α} ρ*_α = P_{α⇐β} ρ*_β,    (8.38)

that is, the equilibrium flux of probability from α to β is the same as the flux backward from β to α. It is both possible and elegant to reformulate the condition for detailed balance so that it does not involve the equilibrium probabilities. Consider three states of the system, α, β, and γ.
(a) Assume that each of the three types of transitions among the three states satisfies detailed balance. Eliminate the equilibrium probability densities to derive

P_{α⇐β} P_{β⇐γ} P_{γ⇐α} = P_{α⇐γ} P_{γ⇐β} P_{β⇐α}.    (8.39)

Viewing the three states α, β, and γ as forming a circle, you have derived a relationship between the rates going clockwise and the rates going counter-clockwise around the circle.
It is possible to show conversely that if every triple of states in a Markov chain satisfies the condition 8.39 then it satisfies detailed balance (that there is at least one probability density ρ*

which makes the probability fluxes between all pairs of states equal), except for complications arising when some of the rates are zero.
(b) Suppose P is the transition matrix for some Markov chain satisfying the condition 8.39 for every triple of states α, β, and γ. Assume that there is a state α_0 with non-zero transition rates from all other states δ. Construct a probability density ρ*_δ that demonstrates that P satisfies detailed balance (eqn 8.38). (Hint: Assume you know ρ*_{α_0}; use eqn (8.38) to write a formula for ρ*_δ to ensure detailed balance for the pair. Solve for ρ*_{α_0} to make the probability distribution normalized. Then use the cyclic condition eqn (8.39) to show that ρ* satisfies detailed balance for any two states β and δ.)
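A small numeric sketch (my own) of the construction in part (b): it builds a transition matrix that is known to obey detailed balance with respect to a hidden distribution (Metropolis rates), then reconstructs ρ* purely from the rates into and out of a reference state α_0 and checks eqn (8.38) for every pair:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5

# Markov matrix P[beta, alpha] = P(beta <= alpha) built to obey detailed balance
# with respect to a random distribution pi (Metropolis rates); then "forget" pi.
pi = rng.random(n); pi /= pi.sum()
q = 1.0 / n                                   # symmetric proposal rate
P = np.zeros((n, n))
for a in range(n):
    for b in range(n):
        if a != b:
            P[b, a] = q * min(1.0, pi[b] / pi[a])
    P[a, a] = 1.0 - P[:, a].sum()             # columns sum to one

# Part (b) construction: fix alpha0 and apply eqn (8.38) pairwise, then normalize
alpha0 = 0
rho = np.array([P[d, alpha0] / P[alpha0, d] for d in range(n)])
rho[alpha0] = 1.0
rho /= rho.sum()

flux_err = max(abs(P[b, a] * rho[a] - P[a, b] * rho[b])
               for a in range(n) for b in range(n))
print("max detailed-balance violation:", flux_err)
print("reconstructed rho:", np.round(rho, 4))
print("hidden pi        :", np.round(pi, 4))
```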

(8.3) Ising model cluster expansion. ©p
Consider the low-temperature cluster expansion of eqn (8.19). Which zero-temperature ground state (spin-up or spin-down) is it perturbing about? What cluster gives the first term? Explain the power x^6 and the coefficient −2. What cluster(s) contribute to the second term, proportional to x^10?

(8.4) Low-temperature expansion of Ising model. ©3
Consider the Ising model on a two-dimensional square lattice.60 The lattice is finite, having periodic boundary conditions and dimensions L columns by L rows, so that there are N = L × L sites. First note that the Hamiltonian can be written

H = −2JN + (2J) N↑↓   (8.40)

where N↑↓ is the number of bonds where the spins are not parallel – which must be rare, at low T. (The cost is (2J) because this is the difference between +J for antiparallel and −J for parallel.)
(a) Explain in one sentence why the canonical partition function Z is a polynomial in u, where

u ≡ e^{−2βJ}.   (8.41)

60Not the cubic lattice discussed in the text!


The low temperature expansion is based on simply organizing Z as a sum over states of increasing energy, assuming a background in which almost all spins are up. The series has the form

Z = Z0 (1 + c_m N u^m + ...)   (8.42)

where the prefactor Z0 is exp(2βJN) from the trivial first term in (8.40), and the second term of the sum in (8.42) comes from the lowest excited state.
(b) Please draw an example of that state, assuming the net magnetization is up, on (say) a 4 × 4 lattice. What is the excitation energy, and hence what is the power u^m in (8.42)?
(c) What is the degeneracy of that state (i.e., the number of distinct spin configurations that have the same excitation energy, while having almost all spins up)? Hence what is the coefficient c_m? (A brute-force numerical check is sketched after this exercise.)
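As a cross-check (a minimal sketch, assuming brute-force enumeration of a small lattice is acceptable; it is not part of the exercise), one can enumerate every configuration of a small periodic lattice, histogram the number of antiparallel bonds N↑↓, and read off Z/Z0 as a polynomial in u.

import itertools
from collections import Counter
import numpy as np

def antiparallel_bond_histogram(L=4):
    """Count configurations of an L x L periodic Ising lattice by their number of antiparallel bonds."""
    counts = Counter()
    for bits in itertools.product((1, -1), repeat=L * L):
        s = np.array(bits).reshape(L, L)
        # Each bond counted once, via a single roll in each lattice direction.
        n_ud = int(np.sum(s != np.roll(s, 1, 0)) + np.sum(s != np.roll(s, 1, 1)))
        counts[n_ud] += 1
    return counts

counts = antiparallel_bond_histogram(4)
# Z = exp(2 beta J N) * sum_n counts[n] * u**n with u = exp(-2 beta J):
# manifestly a polynomial in u, whose low-order terms check parts (b) and (c).
print(sorted(counts.items())[:3])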

(8.5) Fruit flies, Markov matrices, and free energies.61 (Biology) ©3
Fruit flies exhibit several stereotyped behaviors. A Markov model for the transitions between these states has been used by Gordon Berman and collaborators to describe the transitions between behaviors.62 Let us simplify the fruit fly behavior into three states: idle (α = 1), grooming (α = 2), and locomotion (α = 3). Assume that measurements are done at regular time intervals (say one minute), and let Pβα be the probability that a fly in state α in time interval n goes into state β in the next interval n + 1, so the probability distribution ρβ(n + 1) = Σα Pβα ρα(n).

Fig. 8.24 Fruit fly behavior map, from G.J. Berman, W. Bialek, and J.W. Shaevitz, “Hierarchy and predictability in Drosophila behavior” (soon to be submitted). Fruit fly stereotyped behaviors were analyzed using an unsupervised machine-learning platform and projected into a two dimensional space to preserve local similarity. The resulting behavior clusters were labeled by hand. Black lines represent transitions between states, with right-handed curvatures indicating direction of transmission, and line thickness proportional to the transition rates. Anterior, abdomen, and wing motions are basically all grooming. We lump together slow motions (consisting of small leg twitches and proboscis extension) with the idle state. The transition matrix here does not faithfully represent the actual behavior. Courtesy of G.J. Berman.

Assume that the transition matrix is

P = [ 29/32   1/8     1/32
      1/12    55/64   1/32
      1/96    1/64    15/16 ] .   (8.43)

(a) Show that probability is conserved under the transition matrix P, both directly and by checking the appropriate left eigenvector and eigenvalue.
(b) Is the dynamics given by the transition matrix P Markov? Ergodic? Must the dynamics of a fruit fly be Markov? Why or why not? Is real fruit fly dynamics ergodic? If so, why? If not, give a state of the fly (not included in our model) that can never return to the locomotion state. (Hint: Flies are not immortal.)

61This problem was created with the assistance of Gordon Berman (a former Cornell physics grad).
62See Berman, G.J., Choi, D. M., Bialek, W., and Shaevitz, J.W., “Mapping the stereotyped behaviour of freely moving fruit flies”, J. Royal Soc. Interface 11, 20140672 (2014), and Berman, G.J., Choi, D. M., Bialek, W., and Shaevitz, J.W., “Mapping the structure of drosophilid behavior”, arXiv:1310.4249v1 (2013).


(c) Find the stationary state giving the probability distribution ρ∗α = ρα(∞) of fruit fly behavior at long times. Is it a left or a right eigenvector? What is its eigenvalue?
(d) Does the transition matrix P satisfy detailed balance? First check this directly by demanding that no net probability flows around cycles,

Pα⇐β Pβ⇐γ Pγ⇐α = Pα⇐γ Pγ⇐β Pβ⇐α,   (8.44)

(eqn (8.22) in Exercise 8.5). Then check it using your stationary solution in part (c) by demanding that the stationary state has equal probability flows forward and backward,

Pβ⇐α ρ∗α = Pα⇐β ρ∗β,   (8.45)

(eqn (8.14)). (Hint: the second method should serve as a check for your answer to part (c).)
(e) Must fruit flies obey detailed balance in their transition rates? If so, why? If not, discuss how the fruit fly might plausibly have a net probability flow around a cycle from idle to grooming to locomotion.
In the morning, when the lights are turned on, 60% of the fruit flies are idle (α = 1), 10% are grooming (α = 2), and 30% are undergoing locomotion (α = 3).

(f) What is the net entropy per fly of this initial probability distribution? What is the entropy per fly of the stationary state ρ∗? Does fly entropy increase or decrease from morning to mid-day? Plot the entropy per fly for the first hour, by iterating our Markov transition matrix directly on the computer (a minimal sketch follows this exercise). Is the entropy change monotonic?
Fruit flies are not isolated systems: they can exchange energy with their environment. We would like to check if the fruit fly evolves to minimize its free energy F = E − TS.
(g) Must fruit flies minimize a free energy as time progresses? Why or why not?
How can we estimate the energy for each fly state?
(h) Assume that the stationary state of the flies is a Boltzmann distribution ρ∗α ∝ exp(−Eα/kBT). Calculate Eα/kBT for the three fly states. (Note: Only the differences are determined; choose the zero of energy to make the partition function equal to one.) What is the average fly energy first thing in the morning? In the stationary state? Does the fly energy increase or decrease from morning to mid-day? Plot the energy for the first hour; is it monotonic?
(i) Let the free energy be F = E − TS for the flies in our simple model. Plot F/kBT for the first hour. Does it monotonically decrease?
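A minimal numerical sketch for the plots in parts (f)–(i), using the transition matrix of eqn 8.43, the stated morning distribution, and one-minute time steps; the remaining details (eigenvector construction, number of steps) are illustrative choices.

import numpy as np

# Transition matrix of eqn 8.43: column alpha -> row beta.
P = np.array([[29/32, 1/8,   1/32],
              [1/12,  55/64, 1/32],
              [1/96,  1/64,  15/16]])

rho = np.array([0.6, 0.1, 0.3])          # morning distribution (idle, grooming, locomotion)

# Stationary state: right eigenvector of P with eigenvalue 1, normalized to sum to one.
vals, vecs = np.linalg.eig(P)
rho_star = np.real(vecs[:, np.argmax(np.real(vals))])
rho_star /= rho_star.sum()

E = -np.log(rho_star)                    # E_alpha / k_B T, with the zero chosen so Z = 1 (part h)

entropy, energy = [], []
for minute in range(61):                 # one hour of one-minute steps
    entropy.append(-np.sum(rho * np.log(rho)))   # S / k_B per fly
    energy.append(np.dot(E, rho))                # <E> / k_B T per fly
    rho = P @ rho

free_energy = np.array(energy) - np.array(entropy)   # F / k_B T
print(entropy[0], entropy[-1], free_energy[0], free_energy[-1])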


Chapter 9: Order parameters, broken symmetry, and topology

Exercises

(9.1) Nematic order parameter half space. ©p
Let us develop some intuition for the projective plane, the order parameter space of a nematic.

Fig. 9.25 Projective Plane. Just like the torus, the order parameter space RP² for a nematic can be described as a square with opposite sides identified, but now with a twist. (Think of the square as a top-view of the hemisphere, when the latter is pinched outward along the four diagonals.) The dashed line, for example, connects two points on the edge of the square that correspond to the same nematic orientation. What geometrical figure (disk, cylinder, Moebius strip, sphere with a hole, two disks) is formed when you cut along the dashed line?

(a) If one cuts the order parameter space for a two-dimensional crystal (Fig. 9.7) around the edge, it becomes a cylinder. What topological object is formed by cutting the order parameter space for a nematic along the curved path shown in Fig. (9.5b)?
This provides us an excuse to exercise our skills

with scissors and tape. Unfortunately, the projective plane cannot be pasted together in ordinary three dimensions: we will need to improvise.
(b) Get a printed, elongated version of Fig. (9.25), suitable for cutting.63 Tape the short ends together, aligning the arrows. What object do you form? Cut along the dotted line. Can you now tape the long ends together? (Remember, in higher dimensions one can pass one paper strip through another (moving it into the fourth dimension). You can mimic this by cutting a strip and passing it (without twisting) past another before taping it back together.) What object do you form in the end?

(9.2) Ising order parameter. ©p
What would the topological order parameter space be for the Ising model? (Hint: what is the space of possible ground states?)

(9.3) Pentagonal order parameter. ©i
Glasses are rigid, like crystals, but have disordered atomic positions, like liquids. We do not have a definitive theory of glasses, but one feature appears to be frustration – good glass-forming systems can form local low energy structures that are difficult to arrange into crystalline patterns. The classic example is the packing of spheres. Four spheres can neatly pack into a low-energy regular tetrahedron, but tetrahedra cannot fill space (Fig. 12.18).
The regular tetrahedron has six edges, each with a dihedral angle of arccos(1/3) ≈ 70.5°.64 Undistorted tetrahedra cannot fit together around an edge without leaving a gap.
(a) How many tetrahedra meet at a vertex in the frustrated icosahedron of Fig. (12.18a)? What is the angle of the gap formed by this many tetrahedra meeting at an edge?

63If you are not provided with one, download it from http://pages.physics.cornell.edu/~sethna/teaching/562/HW/ProjectivePlaneCutout2Pages.pdf.
64A dihedral angle is the angle between two intersecting planes – the ‘opening angle’ of the edge.


We can create a frustrated order parameter for this kind of glassy system, by considering an ‘ideal glass’ formed by curving space to close the gap. Consider the analogous problem – a two-dimensional material whose lowest-energy local configurations form the faces of a soccer ball, Fig. (9.26).

Fig. 9.26 Frustrated soccer ball. A material that prefers to form alternating pentagons and hexagons cannot fill two-dimensional flat space, but does live happily on the soccer-ball sphere S² in three dimensions. Note the distortion of the pentagons and hexagons becomes more and more severe as one works outward from the origin.

Figure (9.26) shows a low energy region. To describe the order at a point x on the plane, we must both identify the orientations of the polygons and the translational positions within the polygons. How can we combine both into a single order parameter field?
(b) Get a soccer ball. Generate a copy of Fig. (9.26) blown up so that the polygons roughly match the size of the faces of the soccer ball. Pick a point x on the sheet, and align the ball atop the sheet so as to match the local material structure to the ideal soccer-ball template. What is the order parameter space you need to describe the local order? (Hint: You need a three-dimensional rotation matrix to define the orientation of the soccer ball tangent to x. Is there more than one rotation matrix that can give an equivalent local fit to the structure?)
This is a challenging question. Constructing this

order parameter, and exploring the consequences, consumed a large portion of the author's postdoctoral years. In particular, the natural low energy structure in Fig. (9.26) is given by rolling the sphere without slipping (a kind of ‘parallel transport’).
One cannot fill 2D space by flattening the sphere without stretching (a longstanding complaint of map-makers). One must introduce disclinations, by adding a wedge of material chosen to sew naturally into the gaps created by flattening the ideal template (Fig. 12.18(a) and Fig. 9.27). For our soccer-ball material, one disclination is needed at the center of each pentagon.

Fig. 9.27 Cutting and sewing: disclinations. The natural defect needed to flatten the soccer ball is a disclination, introducing a wedge of material at the center of a pentagon. This produces a net change in angle as one follows the order parameter around a loop containing the pentagon's center.

(c) Generate a copy of Fig. (9.27) blown up so that the polygons match the faces on the soccer ball. Place a pentagon of the soccer ball atop the central black polygon, tilted so that one edge on the ball lies atop the corresponding edge in the material. Roll the soccer ball around the central polygon so as to trace out its perimeter. As you traverse a path x(θ) an angle θ = 2π around the center of a pentagon in the plane, what is the net angle φ that the soccer ball rotates? If we define the strength of the defect line as s = φ/2π in analogy with the nematic defect line of Fig. (9.19), what is s?


(9.4) Superfluids and Second Sound. ©p
There are two different order parameters for superfluids. The ‘soft-spin’ Landau order parameter is a complex number as a function of position. (Remember the Bose condensate Ψ(r) = ψ(x1)ψ(x2) . . . of non-interacting bosons; the quantum single-particle state ψ(x) gives the Landau order parameter.) The ‘topological’ order parameter is the phase φ(x) of this complex function ψ(x) = ψ0 exp(iφ(x)). Superfluids break gauge symmetry (the arbitrary phase φ of the quantum wavefunction).
Under most circumstances, for every continuous broken symmetry there is an elementary excitation consisting of a long wavelength, sinusoidal variation of the broken symmetry direction of the corresponding order parameter (Goldstone's theorem).
Argue that the elementary excitation for the superfluid order is a slowly-varying small amplitude oscillation in the phase, as in φ(x) = φ0 + a cos(kx). Using the single-particle formula J = (ℏ/2mi)(ψ∗∇ψ − ψ∇ψ∗) (as in Exercise 9.7a), argue that the oscillation φ(x) will change the density of superfluid bosons periodically in space.
There is already another density wave in fluids – the longitudinal phonon that gives ordinary sound in liquids and gases. One often uses a two-fluid model with a superfluid density and a density of ‘normal fluid’ (quasiparticles); sound waves are an in-phase oscillation of the two fluids (with net density changes), and ‘second sound’ has the normal and superfluid components oscillating out of phase (with little net density change). Since the superfluid carries no heat, the ‘normal fluid’ component of second sound forms an oscillation of heat entropy. . .

(9.5) No Heisenberg line defect. (Mathematics) ©i

The magnetization M of a Heisenberg ferromagnet is of fixed magnitude |M| = M0, but can point with equal energy in any direction. Thus the order parameter space for this magnet is a sphere (Fig. 9.28(b)). Mathematicians sometimes construct the sphere topologically by pasting together a square as in Fig. (9.28a), with the corresponding sides matched according to the style of arrows (single or double), and with the sides pasted so that the arrows point in the same directions.

Fig. 9.28 Sphere. The sphere (b) can be topologically constructed by gluing together the edges of a square (a) as shown.

(a) To test that you understand this cutting and pasting, sketch directly onto Fig. 9.28(b) the seams where the edges are pasted together, turning the square (a) into the sphere (b). Include the arrows along the seams, with the correct styles and orientation. Assume that the seams are predominantly on the visible (front) side of the sphere, and that the north pole N and south pole S are as shown. (Did you draw the arrows?)
There are no topological line defects for Heisenberg ferromagnets. This is because Π1(S²) = 0; any closed path on the sphere can be contracted to a point, so a loop in real space never surrounds something that cannot be smoothly filled in. We can give a tangible example of this – sometimes called ‘escape into the third dimension’.
A two-dimensional Heisenberg ferromagnet has an order parameter field

M = M0 (z λ(r) + r √(1 − λ²)),   (9.46)

expressed in cylindrical coordinates (Fig. 9.29). Here λ(r) decreases rapidly from one at r = 0 to zero at larger r, so that M points radially outward at long distances from the origin. One may thus measure a winding number s = 1 for this magnetization pattern at large r, but it nonetheless has a smooth magnetic field everywhere inside.


Fig. 9.29 Escape in the third dimension. The magnetic field for a planar two-dimensional material points upward (along z) at the origin, and points almost radially outward (along r) far away (radius r3). As one circles r around the origin counter-clockwise at distance r3, the magnetization M(r) circles the order parameter space around the equator, as shown.

(b) Just as the image M(r) of the magnetization along r = r3 is shown in order parameter space, sketch on the right the approximate curves described by the magnetization at distances r2 and r1 from the origin, as denoted in real space on the left. Label these M(r1) and M(r2), in analogy with M(r3) shown.
(c) Does the path in order parameter space contract to a point, as the loop in real space shrinks to r = 0? What is the homotopy group element of the sphere, g ∈ Π1(S²), corresponding to M(r3)?


Chapter 10: Correlations, response, and dissipation

Exercises

(10.1) Human correlations. ©i
Consider a person-to-person correlation function for the horizontal center-of-mass positions hα = (xα, yα) of people labeled by α, in various public places. In particular, let Ccrowd(r) = 〈ρ(h) ρ(h + r)〉, where ρ(h) = Σα δ(h − hα), and the ensemble average 〈. . .〉 is over a large number of similar events.
(a) Sketch a rough, two-dimensional contour plot for Ccrowd(r), for a subway car of length L = 10 m with two benches along the edges x = 0 and x = 3 m. Assume the people are all seated, with random distances δy = 0.5 ± 0.2 m between neighbors.
(b) Sketch a rough contour plot for Ccrowd(r, t) in the same subway car, where t = 20 minutes is long enough that most people will have gotten off at their stops.
(c) Sketch a rough contour plot for the Fourier transform Ccrowd(k) for the correlation function in part (a).

(10.2) Liquid free energy. ©p
The free energy F of a liquid (unlike an ideal gas) resists rapid changes in its density fluctuations ρ(x):

F(ρ) = ∫ [1/2 K(∇ρ)² + 1/2 α(ρ − ρ0)²] dx.

(See Exercise (9.4), where a similar term led to a surface tension for domain walls in magnets.) Thus in a periodic box of length L,

F(ρ) = Σm 1/2 (K km² + α) |ρm|² L,   (10.47)

with km = 2πm/L.
(a) Show that the equal-time correlation function is C(km) = 〈|ρ(km)|²〉 = kBT/((K km² + α)L). (Hint: F is a sum of harmonic oscillators; use equipartition.) How should we express it in the continuum limit, where L → ∞? (Hint: Σm ≡ L/(2π) ∫dk. Alternatively, as a function of continuous k, C(k) = kBT/(K k² + α), but remember the 1/2π in the continuum inverse Fourier transform of eqn (A.7).)
(b) Transform to real space. Does the surface tension introduce correlations (as opposed to those in the ideal gas)? Do the correlations fall off to zero at long distances? (Hint: ∫_{−∞}^{∞} exp(iqy)/(1 + q²) dq = π exp(−|y|).65)
The density of a liquid moving through a porous medium will macroscopically satisfy the diffusion equation.
(c) According to Onsager's regression hypothesis, what will the correlation C(k, t) be? (See footnote 12 and the new footnote 13.) It is non-trivial to Fourier transform this back into real space, but the short-distance fluctuations (as they do in the ideal gas) will get smoothed out even more with time.

(10.3) Onsager Regression Hypothesis and the Harmonic Oscillator. ©i
A harmonic oscillator with mass M, position Q and momentum P has frequency ω. The Hamiltonian for the oscillator is thus H = P²/2M + 1/2 M ω² Q². The correlation function is CQ(τ) = 〈Q(t + τ) Q(t)〉.
(a) What is CQ(0)? (Hint: Use equipartition.)
The harmonic oscillator is coupled to a heat bath, and macroscopically has overdamped dynamics dQ/dt = −λQ.
(b) What is CQ(τ)? (Hint: Use the Onsager regression hypothesis to derive an equation for dCQ/dτ.)
Now consider a system of two uncoupled harmonic oscillators, one as described above and another with mass m, position q, momentum p,

65The function 1/(1 + q²) is often called a Lorentzian or a Cauchy distribution. Its Fourier transform can be done using Cauchy's residue theorem, using the fact that 1/(1 + q²) = 1/((q + i)(q − i)) has poles at q = ±i. So ∫_{−∞}^{∞} exp(iqy)/(1 + q²) dq = π exp(−|y|). The absolute value appears because one must close the contour differently for y > 0 and y < 0.


and frequency Ω, coupled to a heat bath with overdamped decay rate Λ. (Capital letters are meant to indicate size: this is a small mass with a high frequency and a quick decay time.) Here we only observe X = Q + q. Over the following few exercises, we shall compute properties of this combination X.
(c) What is CX(0)? CX(τ)? Roughly sketch a graph of the latter, assuming λ ≈ Λ/10 and Mω² ≈ mΩ²/2 (a small plotting sketch follows this exercise). (Hint: They are completely uncoupled, and so also uncorrelated random variables.)
A consequence of the Onsager regression hypothesis is that the correlation function decays ‘according to the same laws as [a deviation] that has been produced artificially’. But clearly there are different ways in which X(t) could evolve from an artificially imposed initial condition, depending on the way we start Q and q.
(d) How would X(t) decay from an initial state Q(0) = X(0), q(0) = 0? From a state q(0) = X(0), Q(0) = 0? Can you describe how one might produce a state like one of these at t = 0 purely by applying a force F(t) to X at negative times?
Onsager's macroscopically perturbed system must be perturbed slowly compared to the internal dynamical relaxation times, for its future evolution to be determined by its current state, and for his hypothesis to be well posed.
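A small plotting sketch for part (c), assuming the decaying-exponential forms quoted in Exercise (10.6), CX(τ) = A exp(−λ|τ|) + B exp(−Λ|τ|), with the equal-time amplitudes fixed by equipartition; kBT and the numerical values below are illustrative choices.

import numpy as np
import matplotlib.pyplot as plt

kT = 1.0
M_w2, m_W2 = 1.0, 2.0            # M*omega^2 and m*Omega^2, with M omega^2 = m Omega^2 / 2
lam, Lam = 0.1, 1.0              # lambda = Lambda / 10

A = kT / M_w2                    # equipartition: <Q^2> = k_B T / (M omega^2)
B = kT / m_W2                    # equipartition: <q^2> = k_B T / (m Omega^2)

tau = np.linspace(-10.0, 10.0, 801)
C_X = A * np.exp(-lam * np.abs(tau)) + B * np.exp(-Lam * np.abs(tau))

plt.plot(tau, C_X)
plt.xlabel("tau")
plt.ylabel("C_X(tau)")
plt.show()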

(10.4) Susceptibility and dissipation for the Harmonic Oscillator. ©i
This is a continuation of exercise (10.3), where we considered damped harmonic oscillators. Consider one harmonic oscillator (P, Q) with H = P²/2M + 1/2 M ω0² Q² − F(t) Q(t), overdamped with decay rate λ.
The static susceptibility χ0 is defined as the ratio of response Q to force F.
(a) Under a constant external force F0, what is the equilibrium position Q0? What is the static susceptibility χ0? (Hint: Ignoring fluctuations, this is just freshman mechanics. What is the spring constant for our harmonic oscillator?)
Generally speaking, a susceptibility or compliance will be the inverse of the elastic modulus or stiffness – sometimes a matrix inverse.
Consider our harmonic oscillator held with a constant force F for all t < 0, and then released. For t > 0, the dynamics is clearly an exponential decay (releasing the overdamped harmonic oscillator), and can also be written as an integral of the time-dependent susceptibility χ(τ) for the oscillator:

Q(t) = Q0 exp(−λt) = ∫_{−∞}^{0} dt′ χ(t − t′) F.   (10.48)

(Note that the integral ends at t′ = 0, where the force is released.)
(b) Solve eqn (10.48) for χ(τ). (Hint: You can take derivatives of both sides and integrate the right-hand side by parts. Or you can plug in χ(τ) as an exponential decay A exp(−Bt) and solve.)
We can now Fourier transform χ to find χ(ω) = ∫dt exp(iωt) χ(t) = χ′(ω) + iχ′′(ω).

(c) What is χ(ω)? χ′(ω)? χ′′(ω)? Is the power dissipated for an arbitrary frequency-dependent force fω (given by eqn 10.37) always positive?

(10.5) Liquid susceptibility and dissipation. ©p
(Continuation of the Liquid free energy exercise.)
Looking ahead to equation 10.60, calculate the liquid susceptibility in k-space χ(k, t) using your Onsager regression analysis from the Liquid free energy exercise, and do a Fourier transform to find the real and imaginary parts χ(k, ω) = χ′(k, ω) + iχ′′(k, ω). Show in particular that

χ′′(k, ω) = D k² ω / [(K k² + α)(D² k⁴ + ω²)].

(Remember that χ = 0 for t < 0.) Is the imaginary part positive at all k and ω? Is the power dissipated (eqn 10.37) positive?

(10.6) Fluctuation Dissipation for the Harmonic Oscillator. ©i
This is a continuation of Exercise (10.3), where you derived the correlation function CQ(τ) = A exp(−λτ) for the overdamped harmonic oscillator with rate λ, and the correlation function CX(τ) = A exp(−λτ) + B exp(−Λτ) for the sum X = Q + q of two uncoupled oscillators with overdamped decay rates λ and Λ. In Exercise (10.4) we calculated the susceptibility directly. Now we shall calculate it using the fluctuation-dissipation theorem.
(a) Using the classical fluctuation-dissipation theorem, eqn (10.60), derive the time-dependent susceptibilities χQ(τ) and χX(τ). You may leave your answer in terms of A and B.
(b) Sketch a graph of χX(τ) for −1/λ < τ < 1/λ, assuming that λ ≈ Λ/10 and A ≈ 2B.


(10.7) Static susceptibility and the fluctuation–dissipation theorem for liquids. ©p
(Continuation of the Liquid free energy exercise.)
(a) Use your equal-time correlation function for the liquid free energy to derive the static density profile ρ(x) of our liquid if a static force f is applied at x = 0.
(b) Check the AC version of the classical fluctuation-dissipation theorem using the correlation function C(k, t) and χ′′(k, ω).

(10.8) Kramers–Kronig for the Harmonic Oscillator. ©i
In general, susceptibilities are causal: χ(τ) = 0 for τ < 0, because the response cannot precede the force. We saw in general that this implied that χ(ω) had to be analytic in the upper half plane, Im(ω) > 0.
Let us check this for the overdamped harmonic oscillator, where χ(τ) = C exp(−λτ).
What is χ(ω)? Where are its poles? Are they in the upper half plane? Are there any singularities in the lower half plane? (Beware: The conventions on Fourier transforms are pretty strange (Appendix 13). Whether the poles are in the upper or lower half plane depends on whether going from t to ω multiplies by exp(iωt) or by exp(−iωt). Equation A.6 tells us the former choice is correct, but eqn A.9 tells us the opposite convention holds for spatial Fourier transforms.)

(10.9) Kramers–Kronig for liquids. ©p
(Continuation of the Liquid free energy exercise.)
Check your formula for the liquid susceptibility χ(k, ω). Is it analytic in the upper half plane? Explicitly check the Kramers–Kronig formulas for the real and imaginary parts χ′ and χ′′. (Hint: Integrating through a pole, using the principal value, gives the average of an integral passing below the pole and above the pole.)

(10.10) Subway Bench Correlations. (Mathematics) ©3
We shall study the correlation function between the centers of passengers sitting on a bench alongside the windows in a subway car. We model the system as a one-dimensional equilibrium hard-disk gas of N disks with length a, contained in a car 0 < y < L.66
If Na = L, there is no free space on the bench, and the passengers have centers of mass at packed positions

ypacked = [a/2, 3a/2, . . . , (2N − 1)a/2].   (10.49)

Call the free space Lfree = L − Na. We can relate the ensemble of hard disks in one dimension in a box of length L and an ideal gas in a box of length Lfree.
(a) Show that the configuration y1 < y2 < · · · < yN of passengers sitting on the bench of length L corresponds precisely to legal configurations z1 < · · · < zN of ideal gas particles in a box of length Lfree, with zn measuring the free space to the left67 of passenger n, by giving an invertible linear mapping from z → y. Is the configurational entropy the same for the ideal gas and the subway car? Why? (Feel free to look up the entropy of the Tonks gas to check your answer. But do explain your reasoning. Would the entropy be the same if the mapping were nonlinear, but invertible?)
The correlation functions for the ideal gas and the subway car are definitely not the same. But the mapping to the ideal gas provides a very efficient way of sampling the equilibrium ensemble of subway passengers (a minimal sampling sketch follows this exercise):
(b) For L = 10, a = 0.4, and N = 20, generate a sample from the equilibrium ideal gas ensemble (that is, N uniformly distributed random numbers zn in the range [0, Lfree]). Generate a sample yn from the subway car ensemble. (Hint: Use zn and ypacked.) Plot a function that is one if a person is occupying the space, and zero if the space is empty, by generating a curve with four points (0, yn − a/2), (1, yn − a/2), (1, yn + a/2), (0, yn + a/2) for each passenger.
We can now collect the distances between each pair of passengers for this snapshot.
(c) Take your sample yn, and make a list of all the (signed) separations ∆y = yn − ym between pairs of passengers. Plot a histogram of this distribution, using the standard normalization (the number of counts in each bin) and a bin width of one. Explain why it has an overall triangular shape. Zoom in to −2 < ∆y < 2, using a bin width of 0.1. Explain briefly the two striking features you see near ∆y = 0 – the spike, and the gap.

66In one dimension, a disk of diameter a is really a rod of length a, so this is oftencalled a hard-rod gas. It is also known as the Tonks gas.67That is, in the direction of decreasing y.


To explore the correlation function more quantitatively, we now want to take an average of an ensemble of many passenger packings.
(d) Generate n = 100 ensembles of passengers as in part (b), and collect all the pair separations. Make a histogram of the pair separations, in the range −2 < ∆y < 2 with a bin width of 0.02. Adjust the vertical range of the plot to show the entire first bump (but clipping the central spike). Explain how the bumps in the correlation function are related to the distances to neighbors seated m passengers to the right.
At short distances, you can show that your histogram is related68 to the density-density correlation function C(r) = 〈ρ(y)ρ(y + r)〉 evaluated at ∆y = r, as follows.
(e) In an infinite system with ρ particles per unit length, given there is a particle at y, write in terms of C(r) the probability per unit length P(y + r|y) that there is another particle at y + r. If we take n samples from the ensemble with N passengers each, and compute all the pair separations, what is the number per unit length of separations we expect at r, in terms of C(r)? Show that the expected number of separations in a narrow bin of width B centered at r is (nNB/ρ)C(r). (Hint: The probability density P(A|B) of A (person at r) given B (person at 0) is the probability density P(A&B) of both happening divided by the probability density P(B) of B happening. Multiply by the number of people and the bin size to get the expected number in the bin.)
Next, we will decompose C(r) between all pairs into a sum

C(r) = Σm Cm(r)   (10.50)

of terms collecting distances to the mth neighbor to the right (with negative m corresponding to the left). We calculate the shapes of these bumps for an infinite system at density ρ.
Let us start with C1(r). The separations between nearest neighbors m = 1 in the subway are just a larger than the separations between nearest neighbors in the corresponding one-dimensional ideal gas, ∆ynn = a + ∆znn. The corresponding ideal gas, remember, has density ρideal = N/Lfree.
(f) Argue that the m = 1 nearest-neighbor contribution to the correlation function is an exponential decay

C1(r) = ρ ρideal exp[−ρideal(r − a)] Θ(r − a),   (10.51)

where Θ is the Heaviside function (one for positive argument, zero otherwise). (Hint: How is it related to Cideal1(r) for the ideal gas?) Using the relation between C(r) and your histogram derived in part (d), plot your prediction against the first bump at positive ∆y in your histogram.
(g) Calculate C1(k) from eqn 10.51, using the conventions of eqn A.9 in the Fourier appendix. (Hint: It is the integral of an exponential over a half-space.)
The gaps between passengers in the infinite subway are just the same as the gaps between particles in the ideal gas – and hence are independent. The second-neighbor distance is thus the sum of two first-neighbor distances independently drawn from the same distribution, so the probability distribution P2(y + r|y) is the convolution of two first-neighbor probability distributions. Fourier transforms become useful here (see Exercise 12.11 and eqn A.23).
(h) Using this and your answer to part (e), write the Fourier spectrum of the second neighbor peak C2(k), first in terms of the Fourier spectrum of the first neighbor peak C1(k), and then using your answer from part (g). What is Cn(k)? (Hint: Start with calculating P2(k).)
By either doing inverse Fourier transforms back to r or convolutions, one may derive the correlation function for the subway problem ignoring the finite size of the car:69 We present the answer here:

68C(r) and our histogram are also closely related to the pair distribution function g(r) we introduced in Exercise 10.2.
69These are just the mth neighbor correlations in the ideal gas, shifted to the right by ma, the sum of the widths of m passengers.
70In Mathematica, you may need to increase PlotPoints and MaxRecursion to avoid missing features in the curve.


Cm(r) = ρ ρideal^m (r − ma)^{m−1} / (m − 1)! · exp[−ρideal(r − ma)] Θ(r − ma),   m > 0,
C0(r) = ρ δ(r),
Cm(r) = C−m(−r),   m < 0.   (10.52)

(i) Plot70 the theoretical prediction of eqn 10.52 against your measured histogram, for −2 < r < 2. Explain any discrepancy you find. (Hint: Remember the triangle plot in part (c).)
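A minimal sampling sketch for parts (b)–(d), using the mapping to the one-dimensional ideal gas from part (a); L, a, N, and the 100 samples are the values suggested in the text, and everything else (seed, binning details) is an illustrative choice.

import numpy as np
import matplotlib.pyplot as plt

L, a, N, n_samples = 10.0, 0.4, 20, 100
L_free = L - N * a
y_packed = a / 2 + a * np.arange(N)            # eqn 10.49
rng = np.random.default_rng(0)

def subway_sample():
    """Map a sorted ideal-gas configuration in [0, L_free] onto passenger centers."""
    z = np.sort(rng.uniform(0.0, L_free, N))
    return y_packed + z                         # the invertible linear mapping of part (a)

# Collect all signed pair separations over many samples (parts (c) and (d)).
seps = []
for _ in range(n_samples):
    y = subway_sample()
    dy = y[:, None] - y[None, :]
    seps.append(dy.ravel())
seps = np.concatenate(seps)

plt.hist(seps, bins=np.arange(-2, 2.02, 0.02))
plt.xlim(-2, 2)
plt.xlabel("pair separation dy")
plt.ylabel("counts")
plt.show()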

(10.11) Gibbs free energy barrier. (Chemistry) ©p
Figure 11.14 shows the chemical potential for a theory (due to van der Waals) for the liquid-gas transition, as a function of density ρ = N/V, at a pressure and temperature where gas and liquid coexist.
(a) Which well is the gas? Which is the liquid? How can we tell that the two coexist?
This figure is analogous to Fig. 11.2, except that instead of the Helmholtz potential A we plot the Gibbs free energy per particle µ = G/N. In Exercise 11.3, we shall use the chemical potential barrier between the two wells to estimate the surface tension of water.
But what does this plot mean? Normally, G(T, P, N) = A(T, V, N) + PV is not a function of V, and so not a function of ρ = N/V.
(b) In equilibrium, show that d/dV (A + PV) = 0, and hence dµ/dρ = 0. Which points on this graph are in local equilibrium? Which local equilibrium is unstable to uniform changes in density?
We often want to study the free energy of systems that are not in a global equilibrium. In Section 6.7, for example, we studied the free energy of a spatially varying ideal gas by constraining the local density ρ(x) and minimizing the free energy with respect to other degrees of freedom. Here we constrain the local density of a van der Waals gas, in an environment kept at a fixed pressure.
Knowing the equilibrium Helmholtz free energy A(T, V, N) for a local volume ρ = N/V, can we calculate the Gibbs free energy density G(T, P, N, V)? We can think of this as two subsystems (as in Fig. 6.4), each in equilibrium, but where we have not yet allowed the subsystems to exchange volume.
(c) Argue that G(T, P, N, V) = A(T, V, N) + PV is the free energy for the local subsystem incorporating the cost of stealing volume from the environment at pressure P.
The Helmholtz free energy for the van der Waals gas is

A(T, V, N) = N kB T [log(N λ³/(V − bN)) − 1] − a N²/V.   (10.53)

(d) Verify eqn 11.22.

(10.12) Susceptibilities and Correlations. (Mathematics) ©i

Fig. 10.30 Pulling. An order parameter y is pulled upward slightly by a time-independent point force F at the origin; the resulting height Y(x) is shown. Without the force, the order parameter exhibits small fluctuations about height zero.

(a) The height fluctuations are measured in the same system of Fig. 10.30, at a temperature T, in the absence of an external force. Calculate the thermal ensemble average 〈y(0, t) y(∆x, t)〉 of these height fluctuations without the force, in terms of Y(x), F, and kBT. Plot it over x ∈ (−10, 10), labeling the limits of your vertical axis in terms of these same variables. (Hints: The fluctuations should be larger for higher temperature, and the response should be larger for higher forces. Section (10.7) may be useful.)


Fig. 10.31 Hitting. The height y(x, t) at the origin x = 0 after the same system is hit upward with an impulsive force gδ(x)δ(t). The decay is exponential, y(0, t) = y(0, 0) exp(−γt) for t > 0.

(b) The same system is hit (Fig. 10.31) and the time decay at the origin is measured. The height fluctuations are measured in the system at x = 0 as a function of time in the absence of an external hit. Calculate the thermal ensemble average 〈y(0, t) y(0, t + ∆t)〉 without the hit, in terms of y(0, 0), g, kBT, ∆t, and γ. Plot it for ∆t ∈ (−4/γ, 4/γ), labeling the limits of your vertical axis in terms of these same variables.
(c) Give a formula for Y0 in Fig. 10.30 (the response to a time-independent pull), in terms of y(0, 0) (the response to a δ-function hit), g, F, γ, and kBT.


Chapter 11: Abrupt phase transitions

Exercises

(11.1) What is it unstable to? ©p
In Figure 11.3a, suppose a system held at constant pressure P < Pv and temperature starts in the metastable liquid state. Draw this initial state on a copy of the figure, and draw an arrow to the final state.

(11.2) Maxwell and van der Waals. (Chemistry) ©i

The van der Waals (vdW) equation

(P + N²a/V²)(V − Nb) = N kB T   (11.55)

is often applied as an approximate equation of state for real liquids and gases. The term V − Nb arises from short-range repulsion between molecules (Exercise 3.2); the term N²a/V² incorporates the leading effects71 of long-range attraction between molecules.

[Figure: isotherms from 500 K to 750 K; pressure (dynes/cm², 0 to 6 × 10⁸) versus volume per mole (cm³, 50 to 300).]

Fig. 11.32 P–V plot: van der Waals. Van der Waals (vdW) approximation (eqn 11.55) to H2O, with a = 0.55 J m³/mol² (a = 1.52 × 10⁻³⁵ erg cm³/molecule), and b = 3.04 × 10⁻⁵ m³/mol (b = 5.05 × 10⁻²³ cm³/molecule), fit to the critical temperature and pressure for water.

Figure 11.32 shows the pressure versus volume curves for one mole of H2O, within the vdW model. A real piston of water held at constant temperature would in equilibrium pass through three regimes as it expanded – first a decompressing liquid, then coexisting liquid and vapor at constant pressure (as the liquid evaporates or boils to fill the piston), and then a decompressing gas. The Maxwell construction tells us what the vapor pressure and the two densities for the coexisting liquid and gas are at each temperature. Trace over the figure, or download a version from the book web site [?]. By hand, roughly implement the Maxwell construction for each curve, and sketch the region in the P–V plane where liquid and gas can coexist.
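If you would rather regenerate Fig. 11.32 than trace it, here is a minimal plotting sketch of the isotherms of eqn 11.55, with a and b taken from the figure caption; the volume grid and unit conversions are illustrative choices.

import numpy as np
import matplotlib.pyplot as plt

# Van der Waals constants for water, from the caption of Fig. 11.32 (SI units, per mole).
a, b = 0.55, 3.04e-5                    # J m^3 / mol^2 and m^3 / mol
kB, NA = 1.380649e-23, 6.02214076e23
R = kB * NA                             # N k_B for one mole

V = np.linspace(40e-6, 300e-6, 1000)    # volume per mole, in m^3 (40-300 cm^3)
for T in (500, 550, 575, 600, 625, 650, 700, 750):
    P = R * T / (V - b) - a / V**2      # eqn 11.55 rearranged for one mole
    plt.plot(V * 1e6, P * 10, label=f"{T} K")   # cm^3 and dynes/cm^2 (1 Pa = 10 dyn/cm^2)

plt.xlabel("Volume per mole (cm^3)")
plt.ylabel("Pressure (dynes/cm^2)")
plt.ylim(0, 6e8)
plt.legend()
plt.show()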

(11.3) Droplet Nucleation in 2D. ©p
Abrupt phase transitions can also arise in two-dimensional systems. Lipid bilayer membranes, such as the ones surrounding your cells, often undergo a miscibility phase transition. At high temperatures, two types of lipids may dissolve in one another, but at low temperatures they may be immiscible. If the transition is abrupt, it will happen via the same kind of critical droplet nucleation that Section 11.3 analyzed in three dimensions.
Calculate the critical radius Rc and the free energy barrier B for the critical droplet in a two-dimensional abrupt phase transition. Let the transition be at temperature Tm, with latent heat per unit area ℓ, and line tension λ per unit length between the two phases.

71These corrections are to leading orders in the density; they are small for dilute gases.


It is something of a surprise that living cell membranes appear to be quite near this kind of miscibility transition (five percent above the critical temperature). What is even more interesting is that the transition upon cooling is not abrupt, but almost continuous – in the Ising universality class we shall study in Chapter 12.

(11.4) Coarsening in the Ising model. ©i
Coarsening is the process by which phases separate from one another; the surface tension drives tiny fingers and droplets to shrink, leading to a characteristic length scale that grows with time.
Start up the Ising model again.72 Run with a fairly large system, demagnetize the system to a random initial state (T = ∞), and set T = 1.5 (below Tc) and run one sweep at a time. (As the pattern coarsens, you may wish to reduce the graphics refresh rate.) The pattern looks statistically the same at different times, except for a typical coarsening length that is growing. How can we define and measure the typical length scale L(t) of this pattern?
(a) Argue that at zero temperature the total energy above the ground-state energy is proportional to the perimeter separating up-spin and down-spin regions.73 Argue that the inverse of the perimeter per unit area is a reasonable definition for the length scale of the pattern.
(b) With a random initial state, set temperature and external field to zero. Measure the mean energy E(t) per spin as a function of time. Plot your estimate for L(t) in part (a) versus t on a log-log plot. (You might try plotting a few such curves, to estimate errors.) What power law does it grow with? What power law did we expect? (A minimal do-it-yourself simulation sketch follows this exercise.)
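A minimal do-it-yourself sketch for part (b), assuming single-spin-flip dynamics at T = 0 (each updated spin aligns with its local field, ties broken at random) and using the energy-based length estimate of part (a) up to constant factors; the lattice size, number of sweeps, and seed are illustrative, and the on-line simulation of footnote 72 works equally well.

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
L = 64
s = rng.choice([-1, 1], size=(L, L))             # random initial state (T = infinity)

def energy_per_spin(s, J=1.0):
    """E/N for H = -J sum over bonds of s_i s_j, with periodic boundaries."""
    return -J * np.mean(s * (np.roll(s, 1, 0) + np.roll(s, 1, 1)))

def sweep_T0(s):
    """One zero-temperature sweep: each randomly chosen spin aligns with its local field."""
    for _ in range(s.size):
        i, j = rng.integers(L, size=2)
        h = s[(i + 1) % L, j] + s[(i - 1) % L, j] + s[i, (j + 1) % L] + s[i, (j - 1) % L]
        s[i, j] = np.sign(h) if h != 0 else rng.choice([-1, 1])

times, length = [], []
E0 = -2.0                                        # ground-state energy per spin for J = 1
for t in range(1, 101):
    sweep_T0(s)
    times.append(t)
    length.append(1.0 / (energy_per_spin(s) - E0))   # L(t) ~ 1 / (perimeter per unit area)

plt.loglog(times, length)
plt.xlabel("t (sweeps)")
plt.ylabel("L(t) (arbitrary units)")
plt.show()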

(11.5) Linear stability of a growing interface. ©p

A plant disease has infested one side of a cornfield in Nebraska, infecting all plants with y < u(x), where u(x) is small (a nearly flat front). The disease propagates at different speeds depending on whether the surrounding plants are healthy. The curvature of the front tells one whether the nearby plants along the x direction are sick or not.
(a) In regions where u′′(x) > 0, are the surrounding plants more or less healthy than regions where u′′(x) < 0?
We model the initial time evolution of the front with a simple, linear growth law:

∂u/∂t = v0 + w ∂²u/∂x².   (11.56)

Let us assume that the disease spreads more rapidly in regions surrounded by healthy plants.
(b) According to your answer in part (a), is w positive or negative under this assumption?
To solve this evolution law, it is convenient to work in Fourier space, writing

u(k) = ∫_{−∞}^{∞} u(x) exp(−ikx) dx.   (11.57)

(c) Find the differential equation for ∂u(k)/∂t, giving the evolution of the wiggles in the crop growth boundary as it grows. (Hint: As you have seen in other linear problems, the evolution law will only depend upon u(k) at the same wavevector k. Take the second derivative of both sides of eqn 11.57, use eqn 11.56 to evolve u(x), and integrate by parts twice. You may want to consider k = 0 separately.)
(d) Solve the differential equation for u(k)(t) in terms of the initial u(k) at t = 0. For what values of w does an initial irregularity in the front smoothen out with time? For what values does it grow? (Hint: With the assumption that the disease spreads more rapidly when surrounded by healthy plants, the tips of a sinusoidal u(x) should advance faster than the valleys, leading to the growth of small initial waves.)
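A numerical illustration of the growth law eqn 11.56 (a minimal finite-difference sketch, not a substitute for the Fourier-space analysis the exercise asks for); the ripple wavelength, step sizes, and the two trial signs of w are illustrative choices.

import numpy as np
import matplotlib.pyplot as plt

N, dx, dt, v0 = 256, 1.0, 0.1, 0.0        # v0 only shifts the whole front, so set it to zero
x = dx * np.arange(N)
u0 = 0.1 * np.cos(2 * np.pi * 20 * x / (N * dx))   # small initial ripple on the front

for w, style in [(1.0, "-"), (-1.0, "--")]:
    u = u0.copy()
    for _ in range(80):
        lap = (np.roll(u, 1) + np.roll(u, -1) - 2 * u) / dx**2   # periodic second derivative
        u = u + dt * (v0 + w * lap)                              # explicit Euler step of eqn 11.56
    plt.plot(x, u, style, label=f"w = {w:+.0f}, t = 8")

plt.plot(x, u0, ":", label="initial ripple")
plt.xlabel("x")
plt.ylabel("u(x, t)")
plt.legend()
plt.show()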

(11.6) Fracture nucleation: elastic theory has zero radius of convergence.74 (Condensed matter) ©3
In this exercise, we shall use methods from quantum field theory to tie together two topics which American science and engineering students study in their first year of college: Hooke's law and the convergence of infinite series.
Consider a large steel cube, stretched by a moderate strain ε = ∆L/L (Figure 11.33). You may assume ε ≪ 0.1%, where we can ignore plastic deformation.

72E.g., Bierbaum’s simulation at http://mattbierbaum.github.io/ising.js.73At finite temperatures, there is a contribution from thermally flipped spins, whichshould not really count as perimeter for coarsening.74This exercise draws heavily on Alex Buchel’s work [?,?].


(a) At non-zero temperature, what is the equilibrium ground state for the cube as L → ∞ for fixed ε? (Hints: Remember, or show, that the free energy per unit (undeformed) volume of the cube is 1/2 Y ε². Notice figure 11.34 as an alternative candidate for the ground state.) For steel, with Y = 2 × 10^11 N/m², γ ≈ 2.5 J/m²,75 and density ρ = 8000 kg/m³, how much can we stretch a beam of length L = 10 m before the equilibrium state is broken in two? How does this compare with the amount the beam stretches under a load equal to its own weight?

Fig. 11.33 Stretched block of elastic material, length L and width W, elongated vertically by a force F per unit area A, with free side boundaries. The block will stretch by ∆L/L = F/(Y A) vertically and shrink by ∆W/W = σ ∆L/L in both horizontal directions, where Y is Young's modulus and σ is Poisson's ratio, linear elastic constants characteristic of the material. For an isotropic material, the other elastic constants can be written in terms of Y and σ; for example, the (linear) bulk modulus κlin = Y/[3(1 − 2σ)].

Fig. 11.34 Fractured block of elastic material, as in figure 11.33 but broken in two. The free energy here is 2γA, where γ is the free energy per unit area A of (undeformed) fracture surface.

Why don’t bridges fall down? The beams in the bridge are in a metastable state. What is the barrier separating the stretched and fractured beam states? Consider a crack in the beam, of length ℓ. Your intuition may tell you that tiny cracks will be harmless, but a long crack will tend to grow at small external stress.
For convenient calculations, we will now switch problems from a stretched steel beam to a taut two-dimensional membrane under an isotropic tension, a negative pressure P < 0. That is, we are calculating the rate at which a balloon will spontaneously pop due to thermal fluctuations.

Fig. 11.35 Critical crack of length ℓ, in a two-dimensional material under isotropic tension (negative hydrostatic pressure P < 0).

75This is the energy for a clean, flat [100] surface, a bit more than 1 eV/surface atom [?]. The surface left by a real fracture in (ductile) steel will be rugged and severely distorted, with a much higher energy per unit area. This is why steel is much harder to break than glass, which breaks in a brittle fashion with much less energy left in the fracture surfaces.


The crack costs a surface free energy 2αℓ, where α is the free energy per unit length of membrane perimeter. A detailed elastic theory calculation shows that a straight crack of length ℓ will release a (Gibbs free) energy πP²(1 − σ²)ℓ²/4Y.
(b) What is the critical length ℓc of the crack, at which it will spontaneously grow rather than heal? What is the barrier B(P) to crack nucleation? Write the net free energy change in terms of ℓ, ℓc, and α. Graph the net free energy change ∆G due to the crack, versus its length ℓ.
The point at which the crack is energetically favored to grow is called the Griffith threshold, of considerable importance in the study of brittle fracture.
The predicted fracture nucleation rate R(P) per unit volume from homogeneous thermal nucleation of cracks is thus

R(P) = (prefactors) exp(−B(P)/kBT).   (11.58)

One should note that thermal nucleation of fracture in an otherwise undamaged, undisordered material will rarely be the dominant failure mode. The surface tension is of order an eV per bond (> 10³ K/Å), so thermal cracks of area larger than tens of bond lengths will have insurmountable barriers even at the melting point. Corrosion, flaws, and fatigue will ordinarily lead to structural failures long before thermal nucleation will arise.

Advanced topic: Elastic theory has zero radius of convergence.
Many perturbative expansions in physics have zero radius of convergence. The most precisely calculated quantity in physics is the gyromagnetic ratio of the electron [?]

(g − 2)theory = α/(2π) − 0.328478965 . . . (α/π)² + 1.181241456 . . . (α/π)³ − 1.4092(384) (α/π)⁴ + 4.396(42) × 10⁻¹²,   (11.59)

a power series in the fine structure constant α = e²/ℏc = 1/137.035999 . . . . (The last term is an α-independent correction due to other kinds of interactions.) Freeman Dyson gave a wonderful argument that this power-series expansion, and quantum electrodynamics as a whole, has zero radius of convergence. He noticed that the theory is sick (unstable) for any negative α (corresponding to a pure imaginary electron charge e). The series must have zero radius of convergence since any circle in the complex plane about α = 0 includes part of the sick region.
How does Dyson's argument connect to fracture nucleation? Fracture at P < 0 is the kind of instability that Dyson was worried about for quantum electrodynamics for α < 0. It has implications for the convergence of nonlinear elastic theory.
Hooke's law tells us that a spring stretches a distance proportional to the force applied: x − x0 = F/K, defining the spring constant 1/K = dx/dF. Under larger forces, Hooke's law will have corrections with higher powers of F. We could define a ‘nonlinear spring constant’ K(F) by

1/K(F) = [x(F) − x(0)]/F = k0 + k1 F + . . .   (11.60)

Instead of a spring constant, we'll calculate a nonlinear version of the bulk modulus κnl(P), giving the pressure needed for a given fractional change in volume, ∆P = −κ ∆V/V. The linear isothermal bulk modulus76 is given by 1/κlin = −(1/V)(∂V/∂P)|T; we can define a nonlinear generalization by

1/κnl(P) = −(1/V(0)) [V(P) − V(0)]/P = c0 + c1 P + c2 P² + · · · + cN P^N + · · ·   (11.61)

This series can be viewed as higher and higher-order terms in a nonlinear elastic theory.
(c) Given your argument in part (a) about the stability of materials under tension, would Dyson argue that the series in eqn 11.61 has a zero or a non-zero radius of convergence?
In Exercise ?? we saw the same argument holds for Stirling's formula for N!, when extended to a series in 1/N; any circle in the complex 1/N plane contains points 1/(−N) from large negative integers, where we can show that (−N)! = ∞. These series are asymptotic expansions. Convergent expansions Σ cn x^n converge for fixed x as n → ∞; asymptotic expansions need only converge to order O(x^{n+1}) as x → 0

76Warning: For many purposes (e.g. sound waves) one must use the adiabatic elastic constant 1/κ = −(1/V)(∂V/∂P)|S. For most solids and liquids these are nearly the same.


for fixed n. Hooke's law, Stirling's formula, and quantum electrodynamics are examples of how important, powerful, and useful asymptotic expansions can be.
Buchel [?,?], using a clever trick from field theory [?, Chapter 40], was able to calculate the large-order terms in elastic theory, essentially by doing a Kramers–Kronig transformation on your formula for the decay rate (eqn 11.58) in part (b). His logic works as follows.
• The Gibbs free energy density 𝒢 of the metastable state is complex for negative P. The real and imaginary parts of the free energy for complex P form an analytic function (at least in our calculation) except along the negative P axis, where there is a branch cut.
• Our isothermal bulk modulus for P > 0 can be computed in terms of 𝒢 = G/V(0). Since dG = −S dT + V dP + µ dN, V(P) = (∂G/∂P)|T and hence77

1/κnl(P) = −(1/V(0)) [(∂G/∂P)|T − V(0)]/P = −(1/P) [(∂𝒢/∂P)|T − 1].   (11.62)

(d) Write the coefficients cn of eqn 11.61 in terms of the coefficients gm in the nonlinear expansion

𝒢(P) = Σm gm P^m.   (11.63)

• The decay rate R(P) per unit volume is proportional to the imaginary part of the free energy Im[𝒢(P)], just as the decay rate Γ for a quantum state is related to the imaginary part iℏΓ of the energy of the resonance. More specifically, for P < 0 the imaginary part of the free energy jumps as one crosses the real axis:

Im[𝒢(P ± iε)] = ±(prefactors) R(P).   (11.64)

Fig. 11.36 Contour integral in complex pressure. The free energy density 𝒢 of the elastic membrane is analytic in the complex P plane except along the negative P axis. This allows one to evaluate 𝒢 at positive pressure P0 (where the membrane is stable and 𝒢 is real) with a contour integral as shown, running through the labeled points A, B, C, D, E, F.

• Buchel then used Cauchy's formula to evaluate the real part of 𝒢 in terms of the imaginary part, and hence the decay rate R per unit volume:

𝒢(P0) = (1/2πi) ∮_{ABCDEF} 𝒢(P)/(P − P0) dP
       = (1/2πi) ∫_B^0 [𝒢(P + iε) − 𝒢(P − iε)]/(P − P0) dP + ∫_{EFA} + ∫_{BCD}
       = (1/π) ∫_B^0 Im[𝒢(P + iε)]/(P − P0) dP + (unimportant)   (11.65)

where the integral over the small semicircle vanishes as its radius ε → 0 and the integral over the large circle is convergent and hence unimportant to high-order terms in perturbation theory.
The decay rate (eqn 11.58) for P < 0 should be of the form

R(P) ∝ (prefactors) exp(−D/P²),   (11.66)

where D is some constant characteristic of the material. (You may use this to check your answer to part (b).)

77Notice that this is not the (more standard) pressure-dependent linear bulk modulus κlin(P), which is given by 1/κlin(P) = −(1/V)(∂V/∂P)|T = −(1/V)(∂²G/∂P²)|T. This would also have a Taylor series in P with zero radius of convergence at P = 0, but it has a different interpretation; κnl(P) is the nonlinear response at P = 0, while κlin(P) is the pressure-dependent linear response.


(e) Using eqns 11.64, 11.65, and 11.66, and assuming the prefactors combine into a constant A, write the free energy for P0 > 0 as an integral involving the decay rate over −∞ < P < 0. Expanding 1/(P − P0) in a Taylor series in powers of P0, and assuming one may exchange sums and integration, find and evaluate the integral for g_m in terms of D and m. Calculate from g_m the coefficients c_n, and then use the ratio test to calculate the radius of convergence of the expansion for 1/κ_nl(P), eqn 11.61. (Hints: Use a table of integrals, a computer algebra package, or change variable to P = −√(D/t) to make your integral into the Γ function,

Γ(z) = (z − 1)! = ∫_0^∞ t^{z−1} exp(−t) dt.   (11.67)

If you wish, you may use the ratio test on every second term, so the radius of convergence is the value lim_{n→∞} √|c_n/c_{n+2}|.)
(Why is this approximate calculation trustworthy? Your formula for the decay rate is valid only up to prefactors that may depend on the pressure; this dependence (some power of P) won't change the asymptotic ratio of terms c_n. Your formula for the decay rate is an approximation, but one which becomes better and better for smaller values of P; the integral for the high-order terms g_m (and hence c_n) is concentrated at small P, so your approximation is asymptotically correct for the high-order terms.)
Thus the decay rate of the metastable state can be used to calculate the high-order terms in perturbation theory in the stable phase! This is a general phenomenon in theories of metastable states, both in statistical mechanics and in quantum physics.


Chapter 12: Continuous phase transitions

Exercises

(12.1) Renormalization-group trajectories. ©3
This exercise provides an early introduction to how we will derive power laws and universal scaling functions in Section 12.2 from universality and coarse-graining.
An Ising model near its critical temperature Tc is described by two variables: the distance to the critical temperature t = (T − Tc)/Tc, and the external field h = H/J. Under coarse-graining, changing lengths to x′ = (1 − ε)x, the system is observed to be similar to itself at a shifted temperature t′ = (1 + aε)t and a shifted external field h′ = (1 + bε)h, with ε infinitesimal and a > b > 0 (so there are two relevant eigendirections, with the temperature more strongly relevant than the external field).
The curves shown below connect points that are similar up to some rescaling factor.
(a) Which diagram below has curves consistent with this flow, for a > b > 0? Is the flow under coarse-graining inward or outward from the origin? (No math should be required. Hint: After coarse-graining, how does h/t change?)

[Figure: five candidate flow diagrams (A)–(E), each showing curves in the (t, h) plane.]

The solid dots are at temperature t0; the open circles are at temperature t = 2t0.
(b) In terms of ε and a, by what factor must x be rescaled to relate the systems at t0 and 2t0? (Algebraic tricks: Use (1 + δ) ≈ exp(δ) everywhere. If you rescale multiple times until exp(naε) = 2, you can solve for (1 − ε)^n ≈ exp(−nε) without solving for n.) If one of the solid dots in the appropriate figure from part (a) is at (t0, h0), what is the field h for the corresponding open circle, in terms of a, b, ε, and the original coordinates? (You may use the relation between h and h0 to check your answer for part (a).)
The magnetization M(t, h) is observed to rescale under this same coarse-graining operation to


M′ = (1 + cε)M, so M((1 + aε)t, (1 + bε)h) = (1 + cε)M(t, h).
(c) Suppose M(t, h) is known at (t0, h0), one of the solid dots. Give a formula for M(2t0, h) at the corresponding open circle, in terms of M(t0, h0), the original coordinates, a, b, c, and ε. (Hint: Again, rescale n times.) Substitute your formula for h into the formula, and solve for M(t0, h0).
You have now basically derived the key result of the renormalization group; the magnetization curve at t0 can be found from the magnetization curve at 2t0. In Section 12.2, we shall coarse-grain not to t = 2t0, but to t = 1. We shall see that the magnetization everywhere can be predicted from the magnetization where the invariant curve crosses t = 1.
(d) Substitute 2 → 1/t0 in your formula from part (c). Show that M(t, h) = t^β 𝓜(h/t^{βδ}) (the standard scaling form for the magnetization in the Ising model). What are β and δ in terms of a, b, and c? How is 𝓜 related to M(t, h) where the curve crosses t = 1?
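For intuition about part (a), one can simply iterate the coarse-graining map numerically and plot where a few points flow; the values a = 2, b = 1, ε = 0.01 and the starting points in the sketch below are illustrative assumptions, not part of the exercise.

```python
import numpy as np
import matplotlib.pyplot as plt

a, b, eps = 2.0, 1.0, 0.01   # illustrative choices with a > b > 0
steps = 400

# Follow a few starting points (t, h) under repeated coarse-graining
for t0, h0 in [(1e-3, 5e-3), (1e-3, -5e-3), (2e-3, 1e-3)]:
    t, h = t0, h0
    traj = [(t, h)]
    for _ in range(steps):
        t, h = (1 + a * eps) * t, (1 + b * eps) * h
        traj.append((t, h))
    traj = np.array(traj)
    plt.plot(traj[:, 0], traj[:, 1], '.-', ms=2)

plt.xlabel('t'); plt.ylabel('h')
plt.title('Coarse-graining flow (illustrative parameters)')
plt.show()
```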

(12.2) Singular corrections to scaling. (Condensed matter) ©3
The renormalization group says that the number of relevant directions at the fixed point in system space is the number of parameters we need to tune to see a critical point, and that the critical exponents depend on the eigenvalues of these relevant directions. Do the irrelevant directions matter?
Let the Ising model in zero field be described by flow equations

dt_ℓ/dℓ = t_ℓ/ν,   du_ℓ/dℓ = −y u_ℓ,   (12.68)

where t_ℓ describes the renormalization of the reduced temperature t = (Tc − T)/Tc after a coarse-graining by a factor b = exp(ℓ), and u and u_ℓ represent a slowly-decaying irrelevant perturbation under the renormalization group. In Fig. 12.8, one may view t as the expanding eigendirection running roughly horizontally, and u as the contracting, irrelevant coordinate running roughly vertically. Thus our model starts with a value u associated to the distance in system space between Rc and S*.
(a) What is the invariant combination z = u t^ω that stays constant under the renormalization group? What is ω in terms of the eigenvalues −y and 1/ν?
Properties near critical points have universal power-law singularities, but the corrections to these power laws also have universal properties predicted by the renormalization group. These come in two types – analytic corrections to scaling and singular corrections to scaling.
Let us consider corrections to the susceptibility. In analogy with other systems we have studied, we would expect that the susceptibility

χ(t, u) = t^{−γ} X(z)   (12.69)

with X(z) a universal function of the invariant combination you found in part (a). As a function of t, χ(t, u) has singularities at small t. But we expect properties to be analytic as we vary u, since the irrelevant direction is not being tuned to a special value, so we expect that a Taylor series of χ(t, u) in powers of u should make sense. Since z ∝ u, we thus expect that X(z) will be an analytic function of z for small z.78

(b) Show that for small t, your z from part (a) goes to zero. Taylor expand X(z). What corrections do you predict for the susceptibility from the first- and second-order terms in the series? These are the singular corrections to scaling due to the irrelevant perturbation u.
An Ising magnet on a sample holder is loaded into a magnetometer, and the susceptibility is measured79 at zero external field as a function of reduced temperature t = (T − Tc)/Tc. It is found to be well approximated by

χ(T) = A t^{−1.24} + B t^{−0.83} + C t^{0.42} + D + E t + . . .   (12.70)

You may ignore any errors due to the magnetometer.
(c) The exponent ω ≈ 0.407 for the 3D Ising universality class, and γ ≈ 1.237. Which terms are explained as singular corrections to scaling?
(d) Can you provide a physical interpretation for the terms in eqn 12.70 that are not explained by your theory? For example, how do we expect the susceptibility of the sample holder to depend on temperature?

78 Had we used a scaling variable Z = t u^{1/ω}, for example, we would not have expected the corresponding scaling function to be analytic in small Z.
79 The accuracy of the quoted exponents is not experimentally realistic.


So far, we have relied on universality and rescaling to derive the universal power laws and scaling forms for properties near critical points. We can derive these in a mathematical way by including the flows of the predictions along with the flows of the control parameters under coarse-graining:

dχ_ℓ/dℓ = −(γ/ν) χ_ℓ,
dt_ℓ/dℓ = t_ℓ/ν,
du_ℓ/dℓ = −y u_ℓ.   (12.71)

How do we derive the universal scaling function X(z) from these renormalization-group flows? Consider the flows illustrated in Fig. 12.8, except now with a third dimension involving the prediction χ. Consider a point (t0, u0, χ0) in the system space, and the invariant curve defined by z = u0 t0^ω (dashed lines). Our renormalization group allows us to calculate χ_ℓ(t_ℓ, u_ℓ) along these curves – relating the behavior everywhere near the critical manifold (vertical swath flowing toward S*) to the properties along the outgoing trajectories, which approach closer and closer to the unstable manifold (the horizontal swath flowing away from S*).
For example, we can define the universal scaling function X(z) (for positive t) to be the χ_ℓ* where the flow crosses t_ℓ* = 1.
(e) Solve eqns 12.71 for u_ℓ and t_ℓ. Setting t_ℓ* = 1, what is u_ℓ* in terms of your invariant combination z?
So we label each invariant scaling curve by the value of the vertical position u_ℓ* where it crosses t_ℓ* = 1.
(f) Solve eqns 12.71 for χ_ℓ*(1, u_ℓ*), in terms of z, t0, and χ0(t0, u0). Use your solution to solve for the physical behavior χ0(t0, u0) in terms of t and X(z). Express X(z) in terms of χ_ℓ*(1, u_ℓ*).
Remember the critical manifold is co-dimension one (or two, if you include temperature and external field), and the unstable manifold is dimension one (or two) – so we get universal predictions for a huge variety of systems, by observing the outgoing trajectories near a narrow tube or surface emitted from the fixed point.
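A small numerical sketch can make the construction in parts (e) and (f) concrete: integrate eqns 12.71 until t_ℓ reaches 1 and record χ_ℓ and u_ℓ there. The exponent values and starting points below are arbitrary illustrative choices, and the printed values can be checked against your closed-form answers.

```python
gamma_, nu, y = 1.24, 0.63, 0.8   # illustrative exponent values, not fitted to any model

def flow_to_tl_equals_1(t0, u0, chi0, dl=1e-4):
    """Euler-integrate eqns 12.71 until t_l reaches 1; return (chi_l*, u_l*)."""
    t, u, chi = t0, u0, chi0
    while t < 1.0:
        chi += dl * (-(gamma_ / nu) * chi)
        u   += dl * (-y * u)
        t   += dl * (t / nu)
    return chi, u

# A few starting points; compare the printed (chi_l*, u_l*) with what you
# derive in parts (e) and (f).
for t0, u0, chi0 in [(1e-3, 0.3, 1.7), (1e-2, 0.3, 1.7), (1e-1, 0.3, 1.7)]:
    print(t0, flow_to_tl_equals_1(t0, u0, chi0))
```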

(12.3) Nonlinear flows, analytic corrections, and hyperscaling.80 ©3
We consider the effects of nonlinear terms in renormalization-group flows. The Ising model in zero field has one relevant variable (the deviation t of the temperature from Tc). To calculate the specific heat, we shall also consider the flow of the free energy per spin f under the renormalization group. Instead of a discrete coarse-graining by a factor b, here we use a continuous coarse-graining measured by ℓ. One can think of one coarse-graining step by b = (1 + ε) as incrementing ℓ → ℓ + ε; equivalently, coarse-graining to ℓ changes length scales by exp(ℓ).
Consider the particular flow equations81

df_ℓ/dℓ = D f_ℓ − a t_ℓ²,   dt_ℓ/dℓ = t_ℓ/ν,   (12.72)

where D is the dimension of space and a t_ℓ² is a nonlinear term that will be important in two dimensions.
Notice there are several free energies here. We shall call the free energy per spin of the actual system either f or f0, and the temperature of the actual system either t or t0. The coarse-grained free energy and temperature are f_ℓ and t_ℓ, after being coarse-grained by a factor exp(ℓ). (Hence at ℓ = 0 we have not yet coarse-grained, so f0 = f and t0 = t.) Notice here that the free energy of our system is the initial condition f0(t0) of this differential equation, and f_ℓ = f(ℓ) and t_ℓ = t(ℓ) are the renormalization-group flows of the two variables. To derive the scaling behavior, we shall coarse-grain to ℓ* where t_ℓ* = 1, at which point the coarse-grained free energy is f_ℓ*.
Let us start with the linear case a = 0.
(a) Solve for f_ℓ and t_ℓ for a = 0. Setting t_ℓ* = 1, solve for f0 in terms of f_ℓ*, t0, D, and ν. Solve for the specific heat per spin c = T ∂²f/∂T², where t = t0 = (T − Tc)/Tc and f = f0. (Hint: Use the chain rule ∂f/∂T = (∂f/∂t)(dt/dT).) Show that the specific heat near Tc has a power-law singularity c ∝ t^{−α}, with α = 2 − Dν. (For example, in D = 3, ν ≈ 0.63, so α = 2 − Dν ≈ 0.11; the specific heat diverges at Tc.) Writing

c = t^{−α}(c0 + c1 t + c2 t² + . . . ),   (12.73)

80 This exercise was developed in collaboration with Colin Clement.
81 Note that these are total derivatives. So the first equation tells us the total change in f after coarsening by a factor 1 + dℓ. f(t) then will coarse-grain to f_ℓ(t_ℓ) without needing to worry about the chain rule df(t)/dℓ = ∂f/∂ℓ + (∂f/∂t)(dt/dℓ).


what is the first correction c1 to the specific heat near t = 0 in the absence of the nonlinear term?
Why is the linear term in the free energy flow equal to the dimension, df/dℓ = D f + . . . , where all other terms are hard-to-compute critical exponents? There is no simple answer to this question.82 Indeed, in other models of disordered systems and glasses, and in models above the upper critical dimension, the linear term is not given by the dimension. The relation 2 − α = Dν is called a hyperscaling relation (to emphasize that it involves the dimension D), and these other models are said to violate hyperscaling.
(b) In the case a = 0, show that the singular free energy f contained in a correlated volume ξ^D near the critical temperature becomes independent of the distance to the critical point. (Hint: Look up the critical exponent describing how the correlation length ξ diverges as t → 0.)
Glassy and disordered systems become extremely sluggish as they are cooled. In at least some cases, this is precisely because the energy barriers needed to continue equilibration diverge as their correlation lengths grow – they are glassy because their RG flows violate the hyperscaling relation.
So much for the power-law singularity – what about the correction term c1 in part (a)? It is an analytic correction to scaling.83 Here it is subdominant – near the critical temperature where t → 0, it is less singular than the leading term. One expects in a real physical system that the microscopic bond free energy J between spins will be some analytic function of temperature, and the physical free energy and specific heat will be multiplied by J. Expanding J in a Taylor series about t = 0 would also give us terms like those in eqn 12.73.

Does the introduction of the higher-order nonlinear term a t_ℓ² in eqn 12.72 change the behavior in an important way? Rather than exercising your expertise in analytic solutions of nonlinear differential equations, eqn 12.74 provides not f_ℓ and t_ℓ as functions of ℓ, but the relation between the two:

f_ℓ(t_ℓ) = f0 (t_ℓ/t0)^{Dν} − [aν t_ℓ²/(2 − Dν)] (1 − (t_ℓ/t0)^{−(2−Dν)}).   (12.74)

(c) Show that f_ℓ(t_ℓ) in eqn 12.74 satisfies the differential equation given by eqn 12.72, using

df_ℓ/dt_ℓ = (df_ℓ/dℓ) / (dt_ℓ/dℓ).   (12.75)

Show that it has the correct initial conditions at ℓ = 0. What is f_ℓ* at ℓ*, where t_ℓ = 1? Show your method.
Again, it is important to remember that f_ℓ(t_ℓ) is not the free energy as a function of temperature – it is the coarse-grained free energy as a function of the coarse-grained temperature, for a system starting at free energy f0 and temperature t0. It is f0(t0) that we want to know. Since here we have only one relevant variable (in zero field), all the flows lead to the same84 final point f_ℓ*(t_ℓ* = 1).
(d) Solve for f0 in terms of f_ℓ* and t0. Solve for the specific heat c = T ∂²f/∂T², where t = (T − Tc)/Tc. Show that it can be written in the form

c = c^+_analytic(t) + t^{−α} c^*_analytic(t),   (12.76)

where the additive correction c^+_analytic(t) and the multiplicative correction c^*_analytic(t) have a simple Taylor series about t = 0. Write these two corrections, in terms of f_ℓ*, Tc, a, ν, and D.
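One way to carry out the check in part (c) is to let a computer-algebra package do the bookkeeping. The sympy sketch below is a verification under the flow equations as stated (the numerical spot-check values are arbitrary), not a derivation of eqn 12.74.

```python
import sympy as sp

tl, t0, f0, a, nu, D = sp.symbols('t_l t_0 f_0 a nu D', positive=True)

# Eqn 12.74: the coarse-grained free energy as a function of t_l
fl = f0 * (tl / t0)**(D * nu) \
     - a * nu * tl**2 / (2 - D * nu) * (1 - (tl / t0)**(-(2 - D * nu)))

# Eqn 12.75: df_l/dt_l should equal (df_l/dl)/(dt_l/dl) = (D f_l - a t_l^2)/(t_l/nu)
lhs = sp.diff(fl, tl)
rhs = (D * fl - a * tl**2) / (tl / nu)

# Spot-check the identity at a few numerical points (should print values ~1e-15)
vals = {f0: 1.3, a: 0.7, nu: 0.63, D: 3, t0: 0.01}
for tval in (0.02, 0.1, 0.5):
    print(float((lhs - rhs).subs({**vals, tl: tval})))

# Initial condition: at l = 0, t_l = t0, and f_l must reduce to f0 (should print 0)
print(sp.simplify(fl.subs(tl, t0) - f0))
```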

82 The total free energy of a system stays the same under coarse-graining (tracing out some degrees of freedom), and one can see that the free energy per spin thus must increase as df_coarsened/dℓ = D f due to the reduction in the number of spins. But the RG has both coarse-graining and rescaling. Why is the total free energy not rescaled? Also, part of the coarse-grained free energy is left in an analytic background – why is the singular part governed by eqn 12.72 scaled to preserve the total singular free energy?
83 These are distinct from singular corrections to scaling that arise, for example, from irrelevant terms under the renormalization group, which would produce subdominant corrections to c with powers t^{−α+Δ}, where Δ is not an integer and is bigger than zero (hence subdominant).
84 Remember that for systems with more than one variable (say t and h), the free energy depends on the invariant curve departing from the fixed point (labeled, say, by h/t^{βδ} = h_ℓ*(t_ℓ* = 1)). We solve for f0(t0, h0) in terms of f_ℓ*(1, h_ℓ*) just as we do in part (b).


Here the nonlinear term a gives us not only a smooth multiplicative term in the specific heat, but also a smooth additive background. This clearly is also expected in a real physical system – other degrees of freedom uncoupled to the magnetization, or even the box holding the magnet, will contribute a specific heat that is analytic near Tc.

(12.4) Beyond power laws: Nonlinear flows and logarithms in the 2D Ising model.85 ©3
The two-dimensional Ising model has a logarithmic singularity in the specific heat. The exact result shows that the specific heat per spin is

c(T) = kB (2/π) (2J/(kB Tc))² [−log(1 − T/Tc) + log(kB Tc/(2J)) − (1 + π/4)]
     = −(8/(π kB Tc²)) log( t / ((1/2) kB Tc exp(−(1 + π/4))) )
     = −c0 log(t/τ),   (12.77)

where t = (T − Tc)/Tc and we set J = 1. (Remember log(t) is negative for small t.) Linearized flows around the renormalization-group fixed point predict c ∼ t^{−α}, and when α → 0 one often observes logarithmic corrections. But such corrections are not predicted by the linearized flows. The key nonlinear term is the term a t_ℓ² of eqn 12.72 in Exercise 12.3.
(a) Is the solution for f_ℓ(t_ℓ) in eqn 12.74 useful in D = 2? Why or why not? (Hint: The exponent ν = 1 for the two-dimensional Ising model.)
Again, we provide the solution of the nonlinear RG eqns 12.72 for D = 2:

f_ℓ(t_ℓ) = f0 (t_ℓ/t0)² − a t_ℓ² log(t_ℓ/t0).   (12.78)

(b) Show that f_ℓ(t_ℓ) in eqn 12.78 satisfies the differential equation given by eqn 12.72, with the correct initial conditions. Solve for f0 in terms of f_ℓ* and t0, where t_ℓ* = 1. Solve for the specific heat c = T ∂²f/∂T², where t = t0 = (T − Tc)/Tc and f = f0. (Remember the chain rule: ∂f/∂T = (∂f/∂t)(dt/dT).) Does it agree asymptotically with the exact result in eqn 12.77? What are c0 and τ, in terms of a and f_ℓ*?

(12.5) The Gutenberg–Richter law. (Scaling) ©3
Power laws often arise at continuous transitions in non-equilibrium extended systems, particularly when disorder is important. We don't yet have a complete understanding of earthquakes, but they seem clearly related to the transition between pinned and sliding faults as the tectonic plates slide over one another.
The size S of an earthquake (the energy radiated, shown in the upper axis of Figure 12.3b) is traditionally measured in terms of a 'magnitude' M ∝ log S (lower axis). The Gutenberg–Richter law tells us that the number of earthquakes of magnitude M ∝ log S goes down as their size S increases. Figure 12.3b shows that the number of avalanches of magnitude between M and M + 1 is proportional to S^{−B} with B ≈ 2/3. However, it is traditional in the physics community to consider the probability density P(S) of having an avalanche of size S.
If P(S) ∼ S^{−τ}, give a formula for τ in terms of the Gutenberg–Richter exponent B. (Hint: The bins in the histogram have different ranges of size S. Use P(M) dM = P(S) dS.)

(12.6) Period Doubling. (Scaling) ©p
Most of you will be familiar with the period-doubling route to chaos, and the bifurcation diagram shown below. (See also Section 12.3.3.)

Fig. 12.37 Scaling in the period-doubling bifurcation diagram. [Figure: bifurcation diagram with arrows marking the rescaling factors δ and α.] Shown are the points x on the attractor (vertical) as a function of the control parameter µ (horizontal), for the logistic map f(x) = 4µx(1 − x), near the transition to chaos.

85 This exercise was developed in collaboration with Colin Clement.


The self-similarity here is not in space, but in time. It is discrete instead of continuous; the behavior is similar if one rescales time by a factor of two, but not by a factor 1 + ε. Hence instead of power laws we find a discrete self-similarity as we approach the critical point µ∞.
(a) From the diagram shown, roughly estimate the values of the Feigenbaum numbers δ (governing the rescaling of µ − µ∞) and α (governing the rescaling of x − xp, where xp = 1/2 is the peak of the logistic map). (Hint: Be sure to check the signs.)
Remember that the relaxation time for the Ising model became long near the critical temperature; it diverges as t^{−ζ} where t measures the distance to the critical temperature. Remember that the correlation length diverges as t^{−ν}. Can we define ζ and ν for period doubling?
(b) If each rescaling shown doubles the period T of the map, and T grows as T ∼ (µ∞ − µ)^{−ζ} near the onset of chaos, write ζ in terms of α and δ. If ξ is the smallest typical length scale of the attractor, and we define ξ ∼ (µ∞ − µ)^{−ν} (as is traditional at thermodynamic phase transitions), what is ν in terms of α and δ? (Hint: Be sure to check the signs.)

(12.7) Random Walks. (Scaling) ©3
Self-similar behavior also emerges without proximity to any obvious transition. One might say that some phases naturally have self-similarity and power laws. Mathematicians have a technical term generic which roughly translates to 'without tuning a parameter to a special value', and so this is termed generic scale invariance. The simplest example of generic scale invariance is that of a random walk. Figure 12.38 shows that a random walk appears statistically self-similar.

Fig. 12.38 Random walk scaling. Each box shows the first quarter of the random walk in the previous box. While each figure looks different in detail, they are statistically self-similar. That is, an ensemble of medium-length random walks would be indistinguishable from an ensemble of suitably rescaled long random walks.

Let X(T) = Σ_{t=1}^T ξ_t be a random walk of length T, where the ξ_t are independent random variables chosen from a distribution of mean zero and finite standard deviation. Derive the exponent ν governing the growth of the root-mean-square end-to-end distance d(T) = √⟨(X(T) − X(0))²⟩ with T. Explain the connection between this and the formula from freshman lab courses for the way the standard deviation of the mean scales with the number of measurements.

(12.8) Hysteresis and Barkhausen Noise. (Scaling) ©i
Hysteresis is associated with abrupt phase transitions. Supercooling and superheating are examples (as temperature crosses Tc). Magnetic recording, the classic place where hysteresis is studied, is also governed by an abrupt phase transition – here the hysteresis in the magnetization, as the external field H is increased (to magnetize the system) and then decreased


again to zero. Magnetic hysteresis is characterized by crackling (Barkhausen) electromagnetic noise. This noise is due to avalanches of spins flipping as the magnetic interfaces are jerkily pushed past defects by the external field (much like earthquake faults jerkily responding to the stresses from the tectonic plates). It is interesting that when dirt is added to this abrupt magnetic transition, it exhibits the power-law scaling characteristic of continuous transitions.
Our model of magnetic hysteresis (unlike the experiments) has avalanches and scaling only at a special critical value of the disorder Rc ∼ 2.16 (Figure 12.13). The integrated probability distribution D(S, R) has a power law D(S, Rc) ∝ S^{−τ̄} at the critical point (where τ̄ = τ + σβδ for our model), but away from the critical point takes the scaling form

D(S, R) ∝ S^{−τ̄} 𝒟(S^σ (R − Rc)).   (12.79)

Note from eqn (12.79) that at the critical disorder R = Rc the distribution of avalanche sizes is a power law D(S, Rc) ∝ S^{−τ̄}. The scaling form controls how this power law is altered as R moves away from the critical point. From Figure 12.13 we see that the main effect of moving above Rc is to cut off the largest avalanches at a typical largest size Smax(R), and another important effect is to form a 'bulge' of extra avalanches just below the cut-off.
Using the scaling form from eqn 12.79, with what exponent does Smax diverge as r = (Rc − R) → 0? (Hint: At what size S is D(S, R), say, one millionth of S^{−τ̄}?) Given τ̄ ≈ 2.03, how do the mean ⟨S⟩ and the mean-square ⟨S²⟩ avalanche size scale with r = (Rc − R)? (Hint: Your integral for the moments should have a lower cutoff S0, the smallest possible avalanche, but no upper cutoff, since that is provided by the scaling function 𝒟. Assume 𝒟(0) > 0. Change variables to Y = S^σ r. Which moments diverge?)

(12.9) First to fail: Weibull.86 (Mathematics, Statistics, Engineering) ©3
Suppose you have a brand-new supercomputer with N = 1000 processors. Your parallelized code, which uses all the processors, cannot be restarted in mid-stream. How long a time t can you expect to run your code before the first processor fails?
This is an example of extreme value statistics (see also Exercises 12.10 and 12.11), where here we are looking for the smallest value of N random variables that are all bounded below by zero. For large N the probability distribution ρ(t) and survival probability S(t) = ∫_t^∞ ρ(t′) dt′ are often given by the Weibull distribution

S(t) = e^{−(t/α)^γ},
ρ(t) = −dS/dt = (γ/α)(t/α)^{γ−1} e^{−(t/α)^γ}.   (12.80)

Let us begin by assuming that the processors have a constant rate Γ of failure, so the probability density of a single processor failing at time t is ρ1(t) = Γ exp(−Γt) (approaching Γ as t → 0), and the survival probability for a single processor is S1(t) = 1 − ∫_0^t ρ1(t′) dt′ ≈ 1 − Γt for short times.
(a) Using (1 − ε) ≈ exp(−ε) for small ε, show that the probability SN(t) at time t that all N processors are still running is of the Weibull form (eqn 12.80). What are α and γ?
Often the probability of failure per unit time goes to zero or infinity at short times, rather than to a constant. Suppose the probability of failure for one of our processors is

ρ1(t) ∼ B t^k   (12.81)

with k > −1. (So, k < 0 might reflect a breaking-in period, where survival for the first few minutes increases the probability for later survival, and k > 0 would presume a dominant failure mechanism that gets worse as the processors wear out.)
(b) Show that the survival probability for N identical processors, each with a power-law failure rate (eqn 12.81), is of the Weibull form for large N, and give α and γ as a function of B and k.
The parameter α in the Weibull distribution just sets the scale or units for the variable t; only the exponent γ really changes the shape of the distribution. Thus the form of the failure distribution at large N only depends upon the power law k for the failure of the individual components at short times, not on the behavior of ρ1(t) at longer times. This is a type of universality,87 which here has a physical interpretation; at large N the system will break down soon, so only early times matter.

86 Developed with the assistance of Paul (Wash) Wawrzynek.
87 The Weibull distribution forms a one-parameter family of universality classes; see Chapter 12.


The Weibull distribution, we must mention, is often used in contexts not involving extremal statistics. Wind speeds, for example, are naturally always positive, and are conveniently fit by Weibull distributions.
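A quick Monte Carlo check of parts (a) and (b) is to sample the first-failure time of N processors many times and fit the result to a Weibull form. In the sketch below, the single-processor density ρ1(t) = (k + 1) t^k on [0, 1] and the parameter values are illustrative assumptions; the fitted shape can be compared with your answer to part (b).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
N, trials, k = 1000, 5000, 1.0        # processors, repetitions, wear-out exponent (illustrative)

# Single-processor failure times with density rho_1(t) = (k+1) t^k on [0, 1],
# sampled by inverting the CDF: t = u**(1/(k+1)).
u = rng.random((trials, N))
t_single = u ** (1.0 / (k + 1.0))

t_first = t_single.min(axis=1)        # time of the first failure among the N processors

# Fit a Weibull (scipy's weibull_min) with the location pinned at zero and
# compare the fitted shape with the gamma you derive in part (b).
gamma_fit, loc, alpha_fit = stats.weibull_min.fit(t_first, floc=0)
print("fitted gamma =", gamma_fit, " fitted alpha =", alpha_fit)
```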

Advanced discussion: Weibull and fracture toughness
Weibull developed his distribution when studying the fracture of materials under external stress. Instead of asking how long a time t a system will function, Weibull asked how big a load σ the material can support before it will snap.88
Fracture in brittle materials often occurs due to pre-existing microcracks, typically on the surface of the material. Suppose we have an isolated89 microcrack of length L in a (brittle) concrete pillar, lying perpendicular to the external stress. It will start to grow when the stress on the beam reaches a critical value roughly90 given by

σc(L) ≈ Kc/√(πL).   (12.82)

Here Kc is the critical stress intensity factor, a material-dependent property which is high for steel and low for brittle materials like glass. (Cracks concentrate the externally applied stress σ at their tips into a square-root singularity; longer cracks have more stress to concentrate, leading to eqn 12.82.)
The failure stress for the material as a whole is given by the critical stress for the longest pre-existing microcrack. Suppose there are N microcracks in a beam. The length L of each microcrack has a probability distribution ρ(L).
(c) What is the probability distribution ρ1(σ) for the critical stress σc for a single microcrack, in terms of ρ(L)? (Hint: Consider the population in a small range dσ, and the same population in the corresponding range dL.)
The distribution of microcrack lengths depends on how the material has been processed. The simplest choice, an exponential decay ρ(L) ∼ (1/L0) exp(−L/L0), perversely does not yield a Weibull distribution, since the probability of a small critical stress does not vanish as a power law B σ^k (eqn 12.81).
(d) Show that an exponential decay of microcrack lengths leads to a probability distribution ρ1(σ) that decays faster than any power law at σ = 0 (i.e., is zero to all orders in σ). (Hint: You may use the fact that e^x grows faster than x^m for any m as x → ∞.)
Analyzing the distribution of failure stresses for a beam with N microcracks with this exponentially decaying length distribution yields a Gumbel distribution [?, section 8.2], not a Weibull distribution.
Many surface treatments, on the other hand, lead to power-law distributions of microcracks and other flaws, ρ(L) ∼ C L^{−η} with η > 1. (For example, fractal surfaces with power-law correlations arise naturally in models of corrosion, and on surfaces exposed by previous fractures.)
(e) Given this form for the length distribution of microcracks, show that the distribution of fracture thresholds ρ1(σ) ∝ σ^k. What is k in terms of η?
According to your calculation in part (b), this immediately implies a Weibull distribution of fracture strengths as the number of microcracks in the beam becomes large.

(12.10) Biggest of bunch: Gumbel. (Mathematics, Statistics, Engineering) ©3
Much of statistical mechanics focuses on the average behavior in an ensemble, or the mean-square fluctuations about that average. In many

88 Many properties of a steel beam are largely independent of which beam is chosen. The elastic constants, the thermal conductivity, and the specific heat depend to some (or a large) extent on the morphology and defects in the steel, but nonetheless vary little from beam to beam—they are self-averaging properties, where the fluctuations due to the disorder average out for large systems. The fracture toughness of a given beam, however, will vary significantly from one steel beam to another. Self-averaging properties are dominated by the typical disordered regions in a material; fracture and failure are nucleated at the extreme point where the disorder makes the material weakest.
89 The interactions between microcracks are often not small, and are a popular research topic.
90 This formula assumes a homogeneous, isotropic medium as well as a crack orientation perpendicular to the external stress. In concrete, the microcracks will usually be associated with grain boundaries, second-phase particles, porosity. . .


cases, however, we are far more interested in the extremes of a distribution.
Engineers planning dike systems are interested in the highest flood level likely in the next hundred years. Let the high-water mark in year j be Hj. Ignoring long-term weather changes (like global warming) and year-to-year correlations, let us assume that each Hj is an independent and identically distributed (IID) random variable with probability density ρ1(Hj). The cumulative distribution function (cdf) is the probability that a random variable is less than a given threshold. Let the cdf for a single year be F1(H) = P(H′ < H) = ∫^H ρ1(H′) dH′.
(a) Write the probability FN(H) that the highest flood level (largest of the high-water marks) in the next N = 1000 years will be less than H, in terms of the probability F1(H) that the high-water mark in a single year is less than H.
The distribution of the largest or smallest of N random numbers is described by extreme value statistics [?]. Extreme value statistics is a valuable tool in engineering (reliability, disaster preparation), in the insurance business, and recently in bioinformatics (where it is used to determine whether the best alignments of an unknown gene to known genes in other organisms are significantly better than those one would generate randomly).
(b) Suppose that ρ1(H) = exp(−H/H0)/H0 decays as a simple exponential (H > 0). Using the formula

(1 − A) ≈ exp(−A), small A,   (12.83)

show that the cumulative distribution function FN for the highest flood after N years is

FN(H) ≈ exp[−exp((µ − H)/β)]   (12.84)

for large H. (Why is the probability FN(H) small when H is not large, at large N?) What are µ and β for this case?
The constants β and µ just shift the scale and zero of the ruler used to measure the variable of interest. Thus, using a suitable ruler, the largest of many events is given by a Gumbel distribution

F(x) = exp(−exp(−x)),
ρ(x) = ∂F/∂x = exp(−(x + exp(−x))).   (12.85)

How much does the probability distribution for the largest of N IID random variables depend on the probability density of the individual random variables? Surprisingly little! It turns out that the largest of N Gaussian random variables also has the same Gumbel form that we found for exponentials. Indeed, any probability distribution that has unbounded possible values for the variable, but that decays faster than any power law, will have extreme value statistics governed by the Gumbel distribution [?, section 8.3]. In particular, suppose

F1(H) ≈ 1 − A exp(−B H^δ)   (12.86)

as H → ∞ for some positive constants A, B, and δ. It is in the region near H*[N], defined by F1(H*[N]) = 1 − 1/N, that FN varies in an interesting range (because of eqn 12.83).
(c) Show that the extreme value statistics FN(H) for this distribution is of the Gumbel form (eqn 12.84) with µ = H*[N] and β = 1/(Bδ H*[N]^{δ−1}). (Hint: Taylor expand F1(H) at H* to first order.)
The Gumbel distribution is universal. It describes the extreme values for any unbounded distribution whose tails decay faster than a power law.91 (This is quite analogous to the central limit theorem, which shows that the normal or Gaussian distribution is the universal form for sums of large numbers of IID random variables, so long as the individual random variables have finite variance.)
The Gaussian or standard normal distribution ρ1(H) = (1/√(2π)) exp(−H²/2), for example, has a cumulative distribution F1(H) = (1/2)(1 + erf(H/√2)), which at large H has the asymptotic form F1(H) ∼ 1 − (1/(√(2π) H)) exp(−H²/2). This is of the general form of eqn 12.86 with B = 1/2 and δ = 2, except that A is a slowly varying function of H. This slow variation does not change the asymptotics. Hints for the numerics are available in the computer exercises section of the text Web site [?].
(d) Generate M = 10000 lists of N = 1000 random numbers distributed with this Gaussian probability distribution. Plot a normalized histogram of the largest entries in each list. Plot also the predicted form ρN(H) = dFN/dH from

91 The Gumbel distribution can also describe extreme values for a bounded distribution, if the probability density at the boundary goes to zero faster than a power law [?, section 8.2].


part (c). (Hint: H*(N) ≈ 3.09023 for N = 1000; check this if it is convenient.)
Other types of distributions can have extreme value statistics in different universality classes (see Exercise 12.11). Distributions with power-law tails (like the distributions of earthquakes and avalanches described in Chapter 12) have extreme value statistics described by Frechet distributions. Distributions that have a strict upper or lower bound92 have extreme value distributions that are described by Weibull statistics (see Exercise 12.9).
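For the numerics in part (d), a minimal sketch (arbitrary random seed and plotting choices; µ and β are taken from the Gumbel form quoted in part (c) with B = 1/2, δ = 2 for the Gaussian tail) might look like:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
M, N = 10000, 1000

# Largest of N standard Gaussians, repeated M times
maxima = rng.standard_normal((M, N)).max(axis=1)

# Gumbel prediction from part (c): mu = H*[N], beta = 1/(B*delta*mu**(delta-1)),
# with B = 1/2 and delta = 2 quoted in the text for the Gaussian tail.
mu = 3.09023            # H*(N) for N = 1000, from the hint
beta = 1.0 / mu
H = np.linspace(2.0, 5.0, 400)
rhoN = (1.0 / beta) * np.exp((mu - H) / beta) * np.exp(-np.exp((mu - H) / beta))

plt.hist(maxima, bins=60, density=True, alpha=0.5, label='sampled maxima')
plt.plot(H, rhoN, 'k-', label='Gumbel prediction (part (c))')
plt.xlabel('H'); plt.legend(); plt.show()
```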

(12.11) Extreme value statistics: Gumbel, Weibull, and Frechet. (Mathematics, Statistics, Engineering) ©3
Extreme value statistics is the study of the maximum or minimum of a collection of random numbers. It has obvious applications in the insurance business (where one wants to know the biggest storm or flood in the next decades, see Exercise 12.10) and in the failure of large systems (where the weakest component or flaw leads to failure, see Exercise 12.9). Recently extreme value statistics has become of significant importance in bioinformatics. (In guessing the function of a new gene, one often searches entire genomes for good matches (or alignments) to the gene, presuming that the two genes are evolutionary descendents of a common ancestor and hence will have similar functions. One must understand extreme value statistics to evaluate whether the best matches are likely to arise simply at random.)
The limiting distribution of the biggest or smallest of N random numbers as N → ∞ takes one of three universal forms, depending on the probability distribution of the individual random numbers. In this exercise we understand these forms as fixed points in a renormalization group.
Given a probability distribution ρ1(x), we define the cumulative distribution function (CDF) as F1(x) = ∫_{−∞}^x ρ(x′) dx′. Let us define ρN(x) to be the probability density that, out of N random variables, the largest is equal to x. Let FN(x) be the corresponding CDF.
(a) Write a formula for F2N(x) in terms of FN(x). If FN(x) = exp(−gN(x)), show that g2N(x) = 2 gN(x).
Our renormalization-group coarse-graining operation will remove half of the variables, throwing away the smaller of every pair, and returning the resulting new probability distribution. In terms of the function g(x) = −log ∫_{−∞}^x ρ(x′) dx′, it therefore will return a rescaled version of 2g(x). This rescaling is necessary because, as the sample size N increases, the maximum will drift upward—only the form of the probability distribution stays the same; the mean and width can change. Our renormalization-group coarse-graining operation thus maps function space into itself, and is of the form

T[g](x) = 2 g(ax + b).   (12.87)

(This renormalization group is the same as the one we use for sums of random variables in Exercise 12.11, where g(k) is the logarithm of the Fourier transform of the probability density.)
There are three distinct types of fixed-point distributions for this renormalization-group transformation, which (with an appropriate linear rescaling of the variable x) describe most extreme value statistics. The Gumbel distribution (Exercise 12.10) is of the form

Fgumbel(x) = exp(−exp(−x)),
ρgumbel(x) = exp(−x) exp(−exp(−x)),
ggumbel(x) = exp(−x).

The Weibull distribution (Exercise 12.9) is of the form

Fweibull(x) = { exp(−(−x)^α),  x < 0
              { 1,             x ≥ 0,
gweibull(x) = { (−x)^α,  x < 0
              { 0,       x ≥ 0,   (12.88)

and the Frechet distribution is of the form

Ffrechet(x) = { 0,              x ≤ 0
              { exp(−x^{−α}),   x > 0,
gfrechet(x) = { ∞,       x < 0
              { x^{−α},  x ≥ 0,   (12.89)

where α > 0 in each case.
(b) Show that these distributions are fixed points for our renormalization-group transformation eqn 12.87. What are a and b for each distribution, in terms of α?

92 More specifically, bounded distributions that have power-law asymptotics have Weibull statistics; see note 91 and Exercise 12.9, part (d).


In parts (c) and (d) you will show that there are only these three fixed points g*(x) for the renormalization transformation, T[g*](x) = 2g*(ax + b), up to an overall linear rescaling of the variable x, with some caveats. . .
(c) First, let us consider the case a ≠ 1. Show that the rescaling x → ax + b has a fixed point x = µ. Show that the most general form for the fixed-point function is

g*(µ ± z) = z^{α′} p±(γ log z)   (12.90)

for z > 0, where p± is periodic and α′ and γ are constants such that p± has period equal to one. (Hint: Assume p(y) ≡ 1, find α′, and then show g*/z^{α′} is periodic.) What are α′ and γ? Which choice for a, p+, and p− gives the Weibull distribution? The Frechet distribution?
Normally the periodic function p(γ log(x − µ)) is assumed or found to be a constant (sometimes called 1/β, or 1/β^{α′}). If it is not constant, then the probability density must have an infinite number of oscillations as x → µ, forming a weird essential singularity.
(d) Now let us consider the case a = 1. Show again that the fixed-point function is

g*(x) = e^{−x/β} p(x/γ)   (12.91)

with p periodic of period one, and with suitable constants β and γ. What are the constants in terms of b? What choice for p and β yields the Gumbel distribution?
Again, the periodic function p is often assumed to be a constant (e^µ), for reasons which are not as obvious as in part (c).
What are the domains of attraction of the three fixed points? If we want to study the maximum of many samples, and the initial probability distribution has F(x) as its CDF, to which universal form will the extreme value statistics converge? Mathematicians have sorted out these questions. If ρ(x) has a power-law tail, so 1 − F(x) ∝ x^{−α}, then the extreme value statistics will be of the Frechet type, with the same α. If the initial probability distribution is bounded above at µ and if 1 − F(µ − y) ∝ y^α, then the extreme value statistics will be of the Weibull type. (More commonly, Weibull distributions arise as the smallest value from a distribution of positive random numbers, Exercise 12.9.) If the probability distribution decays faster than any polynomial (say, exponentially) then the extreme value statistics will be of the Gumbel form. (Gumbel extreme-value statistics can also arise for bounded random variables if the probability decays to zero faster than a power law at the bound.)
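As a numerical illustration of these domains of attraction (a sketch; the Pareto-type tail exponent and the sample sizes below are arbitrary choices), one can check that the maximum of N power-law-tailed random numbers, rescaled by N^{1/α}, follows the Frechet form above.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
alpha, N, trials = 1.5, 10000, 5000   # tail exponent and sample sizes (illustrative)

# Variables with 1 - F(x) = x**(-alpha) for x >= 1, sampled by inverting the CDF,
# then the maximum of N of them, rescaled by N**(1/alpha).
m = np.empty(trials)
for i in range(trials):
    u = rng.random(N)
    m[i] = ((1.0 - u) ** (-1.0 / alpha)).max()
m /= N ** (1.0 / alpha)

grid = np.linspace(0.3, 8.0, 400)
plt.hist(m, bins=200, range=(0, 8), density=True, cumulative=True,
         histtype='step', label='empirical CDF of rescaled maxima')
plt.plot(grid, np.exp(-grid ** (-alpha)), 'k--', label='Frechet CDF')
plt.xlabel('rescaled maximum'); plt.legend(); plt.show()
```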

(12.12) Diffusion equation and universal scaling functions.93 ©2
The diffusion equation universally describes microscopic hopping systems at long length scales. We will investigate how to write the evolution in a universal scaling form.
The solution to a diffusion problem with a non-zero drift velocity is given by ρ(x, t) = (1/√(4πDt)) exp(−(x − vt)²/(4Dt)). We will coarse-grain by throwing away half the time points. We will then rescale the distribution so it looks like the original distribution. We can just write these two operations as t′ = t/2, x′ = x/√2, ρ′ = √2 ρ.94 These three together constitute our renormalization-group operation.
(a) Write an expression for ρ′(x′, t′) in terms of D, v, x′, and t′ (not in terms of D′ and v′). Use it to determine the new renormalized velocity v′ and diffusion constant D′. Are v and D relevant, irrelevant, or marginal variables?
Typically, whenever writing properties in a scaling function, there is some freedom in deciding which invariant combinations to use. Here let us use the invariant combination of variables X = x/√t and V = √t v. We can then write

ρ(x, t) = t^{−α} 𝒫(X, V, D),   (12.92)

a power law times a universal scaling function of the invariant combinations of variables.
(b) Show that X and V are invariant under our renormalization-group operation. What is α? Write an expression for 𝒫, in terms of X, V, and D (and not x, v, or t).
(Note that we need to solve the diffusion equation to find the universal scaling function 𝒫, but we can learn a lot from just knowing that it is a fixed point of the renormalization group. So, the universal exponent α and the invariant scaling combinations X, V, and D are determined just by the coarsening and rescaling steps in the renormalization group. In experiments and simulations, one often uses data to extract the

93 This exercise was developed in collaboration with Archishman Raju.
94 Because ρ is a density, we need to rescale ρ′ dx′ = ρ dx.


universal critical exponents and universal scaling functions, relying on emergent scale invariance to tell us that a scaling form like eqn (12.92) is expected.)
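The scaling form of eqn 12.92 can be seen directly in a plot. The sketch below (with arbitrary choices of D, the drift v, and the times shown) rescales the explicit solution and plots it against X = x/√t; with v = 0 the curves collapse, and turning on a drift shows how the growing combination V = √t v spoils the collapse.

```python
import numpy as np
import matplotlib.pyplot as plt

D, v = 1.0, 0.0          # illustrative diffusion constant and drift (try v != 0 as well)

def rho(x, t):
    """Drifting Gaussian solution of the diffusion equation quoted in the exercise."""
    return np.exp(-(x - v * t) ** 2 / (4 * D * t)) / np.sqrt(4 * np.pi * D * t)

X = np.linspace(-6, 6, 400)
for t in (0.5, 1.0, 2.0, 4.0):
    x = X * np.sqrt(t)                      # sample x at fixed X = x/sqrt(t)
    plt.plot(X, np.sqrt(t) * rho(x, t), label=f't = {t}')

plt.xlabel('X = x / sqrt(t)')
plt.ylabel('sqrt(t) * rho(x, t)')
plt.legend(); plt.show()
```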

(12.13) Avalanche Size Distribution. (Scaling function) ©3
One can develop a mean-field theory for avalanches in non-equilibrium disordered systems by considering a system of N Ising spins coupled to one another by an infinite-range interaction of strength J0/N, with an external field H and each spin also having a local random field h:

ℋ = −(J0/N) Σ_{i,j} S_i S_j − Σ_i (H + h_i) S_i.   (12.93)

We assume that each spin flips over when it is pushed over; i.e., when its change in energy

H_i^loc = ∂ℋ/∂S_i = (J0/N) Σ_j S_j + H + h_i = J0 m + H + h_i

changes sign.95 Here m = (1/N) Σ_j S_j is the average magnetization of the system. All spins start by pointing down. A new avalanche is launched when the least stable spin (the unflipped spin of largest local field) is flipped by increasing the external field H. Each spin flip changes the magnetization by 2/N. If the magnetization change from the first spin flip is enough to trigger the next-least-stable spin, the avalanche will continue.
We assume that the probability density for the random field ρ(h) during our avalanche is a constant

ρ(h) = (1 + t)/(2J0).   (12.94)

The constant t will measure how close the density is to the critical density 1/(2J0).
(a) Show that at t = 0 each spin flip will trigger on average one other spin to flip, for large N. Can you qualitatively explain the difference between the two phases with t < 0 and t > 0?
We can solve exactly for the probability D(S, t) of having an avalanche of size S. To have an avalanche of size S triggered by a spin with random field h, you must have precisely S − 1 spins with random fields in the range (h, h + 2J0S/N) (triggered when the magnetization changes by 2S/N). The probability of this is given by the Poisson distribution. In addition, the random fields must be arranged so that the first spin triggers the rest. The probability of this turns out to be 1/S.
(b) (Optional) By imagining putting periodic boundary conditions on the interval (h, h + 2J0S/N), argue that exactly one spin out of the group of S spins will trigger the rest as a single avalanche. (Hint from Ben Machta: For simplicity, we may assume96 the avalanche starts at H = m = 0. Try plotting the local field H^loc(h′) = J0 m(h′) + h′ that a spin with random field h′ would feel if the spins between h′ and h were flipped. How would this function change if we shuffle all the random fields around the periodic boundary conditions?)
(c) Show that the distribution of avalanche sizes is thus

D(S, t) = (S^{S−1}/S!) (t + 1)^{S−1} e^{−S(t+1)}.   (12.95)

With t small (near to the critical density) and for large avalanche sizes S we expect this to have a scaling form

D(S, t) = S^{−τ} 𝒟(S/t^{−x})   (12.96)

for some mean-field exponent x. That is, taking t → 0 and S → ∞ along a path with S t^x fixed, we can expand D(S, t) to find the scaling function.
(d) Show that τ = 3/2 and x = 2. What is the scaling function 𝒟? (Hint: You'll need to use Stirling's formula S! ∼ √(2πS) (S/e)^S for large S, and that 1 + t = exp(log(1 + t)) ≈ e^{t − t²/2 + t³/3 − ···}.) This is a bit tricky to get right. Let's check it by doing the plots.
(e) Plot S^τ D(S, t) versus Y = S/t^{−x} for t = 0.2, 0.1, and 0.05 in the range Y ∈ (0, 10). Does it converge to 𝒟(Y)?
(See 'Hysteresis and hierarchies: dynamics of disorder-driven first-order phase transformations', J. P. Sethna, K. Dahmen, S. Kartha, J. A. Krumhansl, B. W. Roberts, and J. D. Shore, PRL 70, 3347 (1993) for more information.)
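For the check in part (e), a short sketch (using scipy's log-gamma to keep S! from overflowing; the exponents and plotting ranges follow the exercise text) is:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.special import gammaln

tau, x = 1.5, 2.0

def D(S, t):
    """Avalanche size distribution of eqn 12.95, evaluated via logarithms."""
    return np.exp((S - 1) * np.log(S) - gammaln(S + 1)
                  + (S - 1) * np.log(1 + t) - S * (1 + t))

for t in (0.2, 0.1, 0.05):
    S = np.arange(1, int(10 / t**x) + 1)      # sizes out to Y = S * t^x = 10
    Y = S * t**x
    plt.plot(Y, S**tau * D(S, t), label=f't = {t}')

plt.xlabel('Y = S / t^{-x}')
plt.ylabel('S^tau D(S, t)')
plt.legend(); plt.show()
```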

95 We ignore the self-interaction, which is unimportant at large N.
96 Equivalently, measure the random fields with respect to h0 of the triggering spin, and let m be the magnetization change since the avalanche started.


(12.14) Conformal Invariance.97 (Mathematics) ©3
Emergence in physics describes new laws that arise from complex underpinnings, and often exhibits a larger symmetry than the original model. The diffusion equation emerges as a continuum limit from complex random microscopic motion, and diffusion on a square lattice has circular symmetry. Critical phenomena emerge near continuous phase transitions, and the resultant symmetry under dilations is exploited by the renormalization group to predict universal power laws and scaling functions.
The Ising model on a square lattice at the critical point, like the diffusion equation, also has an emergent circular symmetry: the complex patterns of up and down spins look the same on long length scales also when rotated by an angle. Indeed, making use of the symmetries under changes of length scale, position, and angle (plus one spatially nonuniform transformation), systems at their critical points have a conformal symmetry group.
In two dimensions, the conformal symmetry group becomes huge. Roughly speaking, any complex analytic function f(z) = u(x + iy) + iv(x + iy) takes a snapshot of an Ising model M(x, y) and warps it into a new magnetization pattern at (u, v) that 'looks the same'. (Here u, v, x, and y are all real.)
You may remember that most ordinary functions (like z², √z, sin(z), log(z), and exp(iz)) are analytic. All of them yield cool transformations of the Ising model – weird and warped when magnified until you see the pixels, but recognizably Ising-like on long scales. This exercise will generate an example.

Fig. 12.39 Two proteins in a critical membrane. The figure shows the pixels of a critical Ising model simulated in a square, conformally warping the square onto the exterior of two circles (representing two proteins in a cell membrane). The warped pixels vary in size – largest in the upper and lower left, smallest near the smaller circle. They also locally rotate and translate the square lattice, but notice the pixels remain looking square – the angles and aspect ratios remain unchanged. The pixels are gray rather than black and white, with only the smallest pixels pure black and white; we must not only warp the pixels conformally, but also rescale the magnetization. Ignoring the pixelation, the different regions look statistically similar. A gray large pixel mimics the average color of similar-sized regions of tiny pixels.

(a) What analytic function shrinks a region uniformly by a factor b, holding z = 0 fixed? What analytic function translates the lattice by a vector r0 = (u0, v0)? What analytic function rotates a region by an angle θ?
(b) Expanding f(z + δ) = f(z) + δ f′(z), show that an analytic function f transforms a local region about z to linear order in δ by first rotating and dilating δ about z and then translating. What complex number gives the net translation?
Figure 12.39 shows how one can use this conformal symmetry to study the interactions between circular 'proteins' embedded in a two-dimensional membrane at an Ising critical point.98
In the renormalization group, we first coarse-grain the system (shrinking by a factor b) and then rescale the magnetization (by some power b^{y_M}) in order to return to statistically the same

97 This exercise is based on a project of Benjamin Machta's.
98 We use this transformation to study the effective interaction between two circular 'proteins' in a two-dimensional cell membrane near its critical point. The energy of attraction between two 'up' proteins is derivable from the energy of the square-lattice Ising model with the two side boundaries set to 'up'. (See B. B. Machta, S. L. Veatch, and J. P. Sethna, 'Critical Casimir forces in cellular membranes', PRL 109, 138101 (2012).)


critical state: M′ = b^{y_M} M. This rescaling turns the larger pixels in Fig. 12.39 more gray; a mostly up-spin 'white' region with tiny pixels is mimicked by a large single pixel with the statistically averaged gray color.
We can discover the correct power for M by examining the rescaling of the correlation function C(r) = ⟨M(x)M(x + r)⟩. In the Ising model at its critical point the correlation function C(r) ∼ r^{−(d−2+η)}. In dimension d = 2, η = 1/4. We expect that the correlation function for the conformally transformed magnetization will be the same as the original correlation function.
(c) If we coarse-grain by a constant factor b, what power of b must we multiply M by to make C(r) = ⟨M(f(z))M(f(z) + r)⟩? Explain your reasoning.
When our conformal transformation takes a pixel at M(z) to a warped pixel of area A at f(z), it rescales the magnetization by

M(f(z)) = |A|^{−1/16} M(z).   (12.97)

The pixel area for a locally uniform compression by b changes by |df/dz|² = 1/b². You may use this to check your answer to part (c).
Figure 12.40 illustrates the self-similarity for the Ising model under rescaling, in analogy with Fig. 12.11's treatment of scale invariance in the random-field avalanche model. Here, unlike in Fig. 12.11, we incorporate the renormalization of the magnetization by changing the grayscale as we blow up successive lower-right-hand corners.

Fig. 12.40 Ising model under conformal rescaling. The 'powers of two' rescaling of the avalanches in Fig. 12.11 ignored this rescaling of M. Here we show the Ising model, again with the lower right-hand quarter of each square inflated to fill the next – but now properly faded according to eqn 12.97.

Let us now explore the image of the Ising model under various analytic functions. Generate a snapshot of an Ising model, equilibrated at the critical temperature.99
Notice that Mathematica's notebook has trouble with plotting large numbers of polygons. Use Python. Or, if you prefer – save often, try not using a second display (a known problem), try not opening other programs in the background (including second Mathematica files), and avoid typing or otherwise disturbing the Mathematica window while it is running a long task. Debug everything with small systems, directly save your graphs of larger systems to a file, and display them outside Mathematica. In either language,

99 You can get one from Matt Bierbaum's Web simulation at http://mattbierbaum.github.io/ising.js/ of size at least 256x256, or download one from the course Web site (along with hints files) at http://pages.physics.cornell.edu/∼sethna/teaching/562/HW.html.
100 If you work in Mathematica or Fortran, where the indices of arrays run from (1. . . L), zmn = ((m − 1/2) + i(n − 1/2))/L.


512x512 will strain your machine; anything over 128x128 is acceptable, but larger systems will make the physics a bit clearer.
The hints files will allow you to import an Ising image, convert it into a two-dimensional array Smn = ±1, and select an L × L subregion (if you wish, especially while you debug your code). We imagine these spins spread over the unit square in the complex plane; our code generates a list of spins and square polygons associated to each, with the spin Smn sitting at the center100 zmn = ((m + 1/2) + i(n + 1/2))/L of a 1/L × 1/L square. The code will allow you to provide a function f(z) of your choice, and will return the deformed quadrilaterals101 centered at f(zmn) with areas Amn, and their associated rescaled spins Amn^{−1/16} Smn.102 The software will also provide example routines showing how to output and shade in these quadrilaterals.

The software will also provide example routines showing how to output and shade in these quadrilaterals.
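A minimal Python sketch of this pipeline (an illustration only, not the distributed hints code; the random snapshot and the map f(z) = z² are stand-ins for your own choices):

    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib.collections import PolyCollection

    def warp_ising(S, f):
        """Map each spin's pixel through f(z); return warped quadrilaterals and
        spins rescaled by (area ratio)**(-1/16), as in eqn 12.97."""
        L = S.shape[0]
        quads, shades = [], []
        for m in range(L):
            for n in range(L):
                z0 = (m + 1j * n) / L            # lower-left corner of the pixel
                corners = [z0, z0 + 1/L, z0 + (1 + 1j)/L, z0 + 1j/L]
                fc = [f(z) for z in corners]
                w = np.array([[c.real, c.imag] for c in fc])
                x, y = w[:, 0], w[:, 1]
                area = 0.5 * np.sum(x * np.roll(y, -1) - y * np.roll(x, -1))
                ratio = area * L**2              # warped area / original pixel area
                if not np.isfinite(ratio) or ratio <= 0:
                    continue                     # drop inverted or infinite pixels
                quads.append(w)
                shades.append(ratio ** (-1.0 / 16) * S[m, n])
        return quads, np.array(shades)

    # stand-in snapshot (random spins); use a critical-temperature snapshot instead
    S = np.random.choice([-1, 1], size=(64, 64))
    quads, shades = warp_ising(S, lambda z: z**2)
    coll = PolyCollection(quads, edgecolors='none', cmap='gray')
    coll.set_array(shades)
    fig, ax = plt.subplots()
    ax.add_collection(coll)
    ax.autoscale()
    ax.set_aspect('equal')
    plt.show()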

(d) Generate a conformal transformation with a nonlinear analytic function of your choice, warping the Ising model in an interesting way. Zoom in to regions occupied by lots of small pixels, where the individual pixels are not obvious.^103 Include both your entire plot and a cropped figure showing the expanded zoom. Discuss your zoomed plot critically – does it appear that the Ising correlations and visual structures have been faithfully retained by your analytic transformation?
(e) Load an Ising model equilibrated above the critical temperature at T = 100 (random noise), and one at T = 3 (short-range correlations). Distort and zoom each using your chosen conformal transformation. Analyze each and include your images. If you 'blur your eyes' enough to ignore the individual pixels, can you tell how much the system has been dilated? Are your conformally transformed images faithfully retaining the correlations and visual structures away from the critical point? For T = 3, which regions look qualitatively like Tc? Which regions look like T = 100?
(f) Invent a non-analytic function, and use it to distort your Ising model. (Warning: most functions you write down, like log(cosh^4(z + 1/z²)), will be analytic except at a few singularities. The author tried two methods: inventing real-valued functions u(x, y) and v(x, y) and forming f = u + iv, and picking two different analytic functions g(z) and h(z) and using f(z) = Re(g(z)) + i Im(h(z)). Make sure your function is non-analytic almost everywhere (e.g., violates the Cauchy-Riemann equations), not just at a point.)^104 Find an example that makes for an interesting picture; include your images, including a zoom into a region with many pixels that range in size. As above, examine the images critically – does it appear that the Ising correlations and visual structures have been faithfully retained by your non-analytic transformation? Describe the distortions you see. (Are the pixels still approximately square?)
Conformal symmetry in two dimensions was studied in an outgrowth of string theory. The representations of the conformal group allowed field theorists to deduce the exact critical exponents for all the usual two-dimensional statistical mechanical models – reproducing Onsager's result for the 2D Ising model, known solutions for the 2D tricritical Ising model, 2D percolation, . . . More recently, field theorists [?] have used conformal invariance in higher dimensions (with a strategy called 'conformal bootstrap') to produce bounds

^101 The routine will drop quadrilaterals that extend to infinity, and also will remove quadrilaterals with 'negative area'. The latter are associated with pixels which get inverted by f(z); plotting packages usually do not provide routines that color the exterior of a polygon.
^102 The list of quadrilaterals and spins will be linear, not two-dimensional, since graphics routines plotting polygons usually want flattened lists.
^103 You can zoom either using the graphics software or (sometimes more efficient) by saving a vector-graphics figure (like pdf) and viewing it separately. Start with L = 64 or so to make plots and zooming efficient, but for your final plots use L as large as feasible.
^104 Warning: The program automatically drops quadrilaterals with 'negative area', which usually happen when an internal point goes to infinity (and the polygon should be shaded 'outside'). This will also happen for the conjugate of an analytic function (e.g., f(x + iy) = y + ix = i z̄); you will get either errors about Transpose[] in Mathematica or an empty plot in Python. If this happens for your choice of u(x, y) + i v(x, y), try using u − i v.


for critical exponents. They now hold the record on accuracy for the exponents of the 3D Ising model, giving β = 0.326419(3), ν = 0.629971(4), and δ = 4.78984(1).


Advanced Statistical Mechanics 3

Exercises

These exercises cover topics in more advanced statistical mechanics and the renormalization group, algorithms, and computer science.

(3.1) A fair split? Number partitioning.^1

(Computer science, Mathematics, Statistics) ©3
A group of N kids want to split up into two teams that are evenly matched. If the skill of each player is measured by an integer, can the kids be split into two groups such that the sum of the skills in each group is the same?
This is the number partitioning problem (NPP), a classic and surprisingly difficult problem in computer science. To be specific, it is NP-complete—a category of problems for which no known algorithm can guarantee a resolution in a reasonable time (bounded by a polynomial in their size). If the skill a_j of each kid j is in the range 1 ≤ a_j ≤ 2^M, the 'size' of the NPP is defined as NM. Even the best algorithms will, for the hardest instances, take computer time that grows faster than any polynomial in MN, getting exponentially large as the system grows.
In this exercise, we shall explore connections between this numerical problem and the statistical mechanics of disordered systems. Number partitioning has been termed 'the easiest hard problem'. It is genuinely hard numerically; unlike some other NP-complete problems, there are no good heuristics for solving NPP (i.e., that work much better than a random search). On the other hand, the random NPP problem (the

ensembles of all possible combinations of skills a_j) has many interesting features that can be understood with relatively straightforward arguments and analogies. Parts of the exercise are to be done on the computer; hints can be found on the computer exercises portion of the book Web site [?].
We start with the brute-force numerical approach to solving the problem.
(a) Write a function ExhaustivePartition(S) that inputs a list S of N integers, exhaustively searches through the 2^N possible partitions into two subsets, and returns the minimum cost (difference in the sums). Test your routine on the four sets [?] S1 = [10, 13, 23, 6, 20], S2 = [6, 4, 9, 14, 12, 3, 15, 15], S3 = [93, 58, 141, 209, 179, 48, 225, 228], and S4 = [2474, 1129, 1388, 3752, 821, 2082, 201, 739]. Hint: S1 has a balanced partition, and S4 has a minimum cost of 48. You may wish to return the signs of the minimum-cost partition as part of the debugging process.
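A minimal Python sketch of part (a), assuming nothing beyond the problem statement (the quoted answers are those given in the hint above):

    from itertools import product

    def ExhaustivePartition(S):
        """Exhaustively search all 2**N sign assignments; return the minimum
        cost |sum_j a_j s_j| (the difference between the two team sums)."""
        best = None
        for signs in product([1, -1], repeat=len(S)):
            cost = abs(sum(a * s for a, s in zip(S, signs)))
            if best is None or cost < best:
                best = cost
        return best

    S1 = [10, 13, 23, 6, 20]
    S4 = [2474, 1129, 1388, 3752, 821, 2082, 201, 739]
    print(ExhaustivePartition(S1))   # 0: a balanced partition exists
    print(ExhaustivePartition(S4))   # 48, as quoted in the hint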

^1 This exercise draws heavily from [?, chapter 7].


What properties emerge from studying ensembles of large partitioning problems? We find a phase transition. If the range of integers (M digits in base two) is large and there are relatively few numbers N to rearrange, it is unlikely that a perfect match can be found. (A random instance with N = 2 and M = 10 has one chance in 2^10 = 1024 of a perfect match, because the second integer needs to be equal to the first.) If M is small and N is large it should be easy to find a match, because there are so many rearrangements possible and the sums are confined to a relatively small number of possible values. It turns out that it is the ratio κ = M/N that is the key; for large random systems with M/N > κ_c it becomes extremely unlikely that a perfect partition is possible, while if M/N < κ_c a fair split is extremely likely.
(b) Write a function MakeRandomPartitionProblem(N,M) that generates N integers randomly chosen from 1, . . . , 2^M, rejecting lists whose sum is odd (and hence cannot have perfect partitions). Write a function pPerf(N,M,trials), which generates trials random lists and calls ExhaustivePartition on each, returning the fraction p_perf that can be partitioned evenly (zero cost). Plot p_perf versus κ = M/N, for N = 3, 5, 7 and 9, for all integers M with 0 < κ = M/N < 2, using at least a hundred trials for each case. Does it appear that there is a phase transition for large systems where fair partitions go from probable to unlikely? What value of κ_c would you estimate as the critical point?
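A sketch of part (b) under the stated conventions (integers uniform on 1, . . . , 2^M, odd-sum lists rejected); it builds on the ExhaustivePartition sketch above, and the plotting of p_perf versus κ is left to your graphics package:

    import random

    def MakeRandomPartitionProblem(N, M):
        """Return N integers uniform on [1, 2**M] whose total is even."""
        while True:
            S = [random.randint(1, 2**M) for _ in range(N)]
            if sum(S) % 2 == 0:
                return S

    def pPerf(N, M, trials):
        """Fraction of random instances with a perfect (zero-cost) partition."""
        hits = sum(ExhaustivePartition(MakeRandomPartitionProblem(N, M)) == 0
                   for _ in range(trials))
        return hits / trials

    # p_perf drops from near one to near zero as kappa = M/N passes kappa_c
    N = 9
    for M in range(1, 2 * N + 1):
        print(M / N, pPerf(N, M, trials=100))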

Should we be calling this a phase transition? It emerges for large systems; only in the 'thermodynamic limit' where N gets large is the transition sharp. It separates two regions with qualitatively different behavior. The problem is much like a spin glass, with two kinds of random variables: the skill levels of each player a_j are fixed, 'quenched' random variables for a given random instance of the problem, and the assignment to teams can be viewed as spins s_j = ±1 that can be varied ('annealed' random variables)^2 to minimize the cost C = |Σ_j a_j s_j|.
(c) Show that the square of the cost C² is of the same form as the Hamiltonian for a spin glass, H = Σ_{i,j} J_ij s_i s_j. What is J_ij?

The putative phase transition in the optimization problem (part (b)) is precisely a zero-temperature phase transition for this spin-glass Hamiltonian, separating a phase with zero ground-state energy from one with non-zero energy in the thermodynamic limit.
We can understand both the value κ_c of the phase transition and the form of p_perf(N, M) by studying the distribution of possible 'signed' costs E_s = Σ_j a_j s_j. These energies are distributed over a maximum total range of E_max − E_min = 2 Σ_{j=1}^N a_j ≤ 2N 2^M (all players playing on the plus team, through all on the minus team). For the bulk of the possible team choices s_j, though, there will be some cancellation in this sum. The probability distribution P(E) of these energies for a particular NPP problem a_j is not simple, but the average probability distribution 〈P(E)〉 over the ensemble of NPP problems can be estimated using the central limit theorem. (Remember that the central limit theorem states that the sum of N random variables with mean zero and standard deviation σ converges rapidly to a normal (Gaussian) distribution of standard deviation √N σ.)

(d) Estimate the mean and variance of a single term s_j a_j in the sum, averaging over both the spin configurations s_j and the different NPP problem realizations a_j ∈ [1, . . . , 2^M], keeping only the most important term for large M. (Hint: Approximate the sum as an integral, or use the explicit formula Σ_{k=1}^K k² = K³/3 + K²/2 + K/6 and keep only the most important term.) Using the central limit theorem, what is the ensemble-averaged probability distribution P(E) for a team with N players? Hint: Here P(E) is non-zero only for even integers E, so for large N, P(E) ≈ (2/(√(2π) σ)) exp(−E²/2σ²); the normalization is doubled.
Your answer to part (d) should tell you that the possible energies are mostly distributed among integers in a range of size ∼ 2^M around zero, up to a factor that goes as a power of N. The total number of states explored by a given system is 2^N. So, the expected number of zero-energy states should be large if N ≫ M, and go to zero rapidly if N ≪ M. Let us make this more precise.
(e) Assuming that the energies for a specific system are randomly selected from the ensemble average P(E), calculate the expected number of zero-energy states as a function of M and N for large N. What value of κ = M/N should form the phase boundary separating likely from

^2 Quenched random variables are fixed terms in the definition of the system, representing dirt or disorder that was frozen in as the system was formed (say, by quenching the hot liquid material into cold water, freezing it into a disordered configuration). Annealed random variables are the degrees of freedom that the system can vary to explore different configurations and minimize its energy or free energy.


unlikely fair partitions? Does that agree well with your numerical estimate from part (b)?
The assumption we made in part (e) ignores the correlations between the different energies due to the fact that they all share the same step sizes a_j in their random walks. Ignoring these correlations turns out to be a remarkably good approximation.^3 We can use the random-energy approximation to estimate p_perf that you plotted in part (b).
(f) In the random-energy approximation, argue that p_perf = 1 − (1 − P(0))^{2^{N−1}}. Approximating (1 − A/L)^L ≈ exp(−A) for large L, show that

p_perf(κ, N) ≈ 1 − exp[−√(3/(2πN)) 2^{−N(κ−κ_c)}].     (3.1)

Rather than plotting the theory curve through each of your simulations from part (b), we change variables to x = N(κ − κ_c) + (1/2) log_2 N, where the theory curve

p_perf^scaling(x) = 1 − exp[−√(3/(2π)) 2^{−x}]     (3.2)

is independent of N. If the theory is correct, your curves should converge to p_perf^scaling(x) as N becomes large.
(g) Reusing your simulations from part (b), make a graph with your values of p_perf(x, N) versus x and p_perf^scaling(x). Does the random-energy approximation explain the data well?
Rigorous results show that this random-energy approximation gives the correct value of κ_c. The entropy of zero-cost states below κ_c, the probability distribution of minimum costs above κ_c (of the Weibull form, exercise 12.9), and the probability distribution of the k lowest cost states are also correctly predicted by the random-energy approximation. It has also been shown that the correlations between the energies of different partitions vanish in the large (N, M) limit

so long as the energies are not far into the tails of the distribution, perhaps explaining the successes of ignoring the correlations.
What does this random-energy approximation imply about the computational difficulty of NPP? If the energies of different spin configurations (arrangements of kids on teams) were completely random and independent, there would be no better way of finding zero-energy states (fair partitions) than an exhaustive search of all states. This perhaps explains why the best algorithms for NPP are not much better than the exhaustive search you implemented in part (a); even among NP-complete problems, NPP is unusually unyielding to clever methods.^4 It also lends credibility to the conjecture in the computer science community that P ≠ NP; any polynomial-time algorithm for NPP would have to ingeniously make use of the seemingly unimportant correlations between energy levels.

(3.2) Cardiac dynamics.^5 (Computation, Biology, Complexity) ©4
Reading: References [?, ?], Niels Otani, various web pages on cardiac dynamics, http://otani.vet.cornell.edu, and Arthur T. Winfree, 'Varieties of spiral wave behavior: An experimentalist's approach to the theory of excitable media', Chaos, 1, 303-334 (1991). See also spiral waves in Dictyostelium by Bodenschatz and Franck, http://newt.ccmr.cornell.edu/Dicty/diEp47A.mov and http://newt.ccmr.cornell.edu/Dicty/diEp47A.avi.
The cardiac muscle is an excitable medium. In each heartbeat, a wave of excitation passes through the heart, compressing first the atria, which pushes blood into the ventricles, and then compressing the ventricles, pushing blood into the body. In this exercise we will study simplified models of heart tissue that exhibit spiral waves similar to those found in arrhythmias.

^3 More precisely, we ignore correlations between the energies of different teams s = {s_i}, except for swapping the two teams s → −s. This leads to the N − 1 in the exponent of the exponent for p_perf in part (f). Notice that in this approximation, NPP is a form of the random energy model (REM, exercise 3.9), except that we are interested in states of energy near E = 0, rather than minimum energy states.
^4 The computational cost does peak near κ = κ_c. For small κ ≪ κ_c it's relatively easy to find a good solution, but this is mainly because there are so many solutions; even random search only needs to sample until it finds one of them. For κ > κ_c, showing that there is no fair partition becomes slightly easier as κ grows [?, fig 7.3].
^5 This exercise and the associated software were developed in collaboration with Christopher Myers.


An excitable medium is one which, when triggered from a resting state by a small stimulus, responds with a large pulse. After the pulse there is a refractory period during which it is difficult to excite a new pulse, followed by a return to the resting state. The FitzHugh-Nagumo equations provide a simplified model for the excitable heart tissue:^6

∂V/∂t = ∇²V + (1/ε)(V − V³/3 − W)
∂W/∂t = ε(V − γW + β),     (3.3)

where V is the transmembrane potential, W is the recovery variable, and ε = 0.2, γ = 0.8, and β = 0.7 are parameters. Let us first explore the behavior of these equations ignoring the spatial dependence (dropping the ∇²V term, appropriate for a small piece of tissue). The dynamics can be visualized in the (V, W) plane.
(a) Find and plot the nullclines of the FitzHugh-Nagumo equations: the curves along which dV/dt and dW/dt are zero (ignoring ∇²V). The intersection of these two nullclines represents the resting state (V*, W*) of the heart tissue. We apply a stimulus to our model by shifting the transmembrane potential to a larger value—running from initial conditions (V* + ∆, W*). Simulate the equations for stimuli ∆ of various sizes; plot V and W as a function of time t, and also plot V(t) versus W(t) along with the nullclines. How big a stimulus do you need in order to get a pulse?
Excitable systems are often close to regimes where they develop spontaneous oscillations. Indeed, the FitzHugh-Nagumo equations are equivalent to the van der Pol equation (which arose in the study of vacuum tubes), a standard system for studying periodic motion.
(b) Try changing to β = 0.4. Does the system oscillate? The threshold where the resting state becomes unstable is given when the nullcline intersection lies at the minimum of the V nullcline, at β_c = 7/15.
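A minimal sketch for part (a), assuming only eqn 3.3 with the Laplacian dropped: crude Euler integration of the two ODEs from a stimulated initial condition, plotted against the two nullclines (the stimulus size 1.0 is just an example to vary):

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.optimize import brentq

    eps, gamma, beta = 0.2, 0.8, 0.7

    def fhn_0d(V0, W0, dt=0.01, tmax=100.0):
        """Euler-integrate eqn 3.3 with the Laplacian term dropped."""
        n = int(tmax / dt)
        V, W = np.empty(n), np.empty(n)
        V[0], W[0] = V0, W0
        for i in range(n - 1):
            V[i+1] = V[i] + dt * (V[i] - V[i]**3 / 3 - W[i]) / eps
            W[i+1] = W[i] + dt * eps * (V[i] - gamma * W[i] + beta)
        return V, W

    # resting state: intersection of W = V - V**3/3 (dV/dt = 0) and W = (V + beta)/gamma (dW/dt = 0)
    Vstar = brentq(lambda V: (V - V**3 / 3) - (V + beta) / gamma, -3.0, 3.0)
    Wstar = Vstar - Vstar**3 / 3

    V, W = fhn_0d(Vstar + 1.0, Wstar)          # stimulus Delta = 1.0; try several sizes
    v = np.linspace(-2.5, 2.5, 200)
    plt.plot(v, v - v**3 / 3, label='dV/dt = 0 nullcline')
    plt.plot(v, (v + beta) / gamma, label='dW/dt = 0 nullcline')
    plt.plot(V, W, label='trajectory')
    plt.xlabel('V'); plt.ylabel('W'); plt.legend(); plt.show()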

Each portion of the tissue during a contraction wave down the heart is stimulated by its neighbors to one side, and its pulse stimulates the neighbor to the other side. This triggering in our model is induced by the Laplacian term ∇²V. We simulate the heart on a two-dimensional grid V(x_i, y_j, t), W(x_i, y_j, t), and calculate an approximate Laplacian by taking differences between the local value of V and values at neighboring points.
There are two natural choices for this Laplacian. The five-point discrete Laplacian is a generalization of the one-dimensional second derivative, ∂²V/∂x² ≈ (V(x+dx) − 2V(x) + V(x−dx))/dx²:

∇²_[5] V(x_i, y_j) ≈ (V(x_i, y_{j+1}) + V(x_i, y_{j−1}) + V(x_{i+1}, y_j) + V(x_{i−1}, y_j) − 4V(x_i, y_j))/dx²
                  ↔ (1/dx²) [[0, 1, 0], [1, −4, 1], [0, 1, 0]]     (3.4)

where dx = x_{i+1} − x_i = y_{j+1} − y_j is the spacing between grid points and the last expression is the stencil by which you multiply the point and its neighbors to calculate the Laplacian. The nine-point discrete Laplacian has been fine-tuned for improved circular symmetry, with stencil

∇²_[9] V(x_i, y_j) ↔ (1/dx²) [[1/6, 2/3, 1/6], [2/3, −10/3, 2/3], [1/6, 2/3, 1/6]].     (3.5)

We will simulate our partial-differential equation (PDE) on a square 100 × 100 grid with a grid spacing dx = 1.^7 As is often done in PDEs, we will use the crude Euler time-step scheme V(t + ∆) ≈ V(t) + ∆ ∂V/∂t (see Exercise 3.12): we find ∆ ≈ 0.1 is the largest time step we can get away with. We will use 'no-flow' boundary conditions, which we implement by setting the Laplacian terms on the boundary to zero (the boundaries, uncoupled from the rest of the system, will quickly turn to their resting state). If you are not supplied with example code that does the two-dimensional plots, you may find them at the text web site [?].
(c) Solve eqn 3.3 for an initial condition equal to the fixed-point (V*, W*) except for a 10 × 10 square at the origin, in which you should apply a stimulus ∆ = 3.0. (Hint: Your simulation should show a pulse moving outward from the origin, disappearing as it hits the walls.)
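A sketch of the two-dimensional simulation just described (Euler time steps, the five-point Laplacian of eqn 3.4, and the crude no-flow boundary condition of zeroing the Laplacian on the edges); the resting-state values are approximate numbers taken from the zero-dimensional calculation, and plotting is left to the course-provided routines:

    import numpy as np

    eps, gamma, beta = 0.2, 0.8, 0.7
    L, dx, dt = 100, 1.0, 0.1

    def laplacian5(V, dx):
        """Five-point Laplacian (eqn 3.4), zeroed on the boundary ('no-flow')."""
        lap = np.zeros_like(V)
        lap[1:-1, 1:-1] = (V[2:, 1:-1] + V[:-2, 1:-1] + V[1:-1, 2:] + V[1:-1, :-2]
                           - 4.0 * V[1:-1, 1:-1]) / dx**2
        return lap

    Vstar, Wstar = -1.1994, -0.6244    # approximate resting state for eps, gamma, beta above
    V = np.full((L, L), Vstar)
    W = np.full((L, L), Wstar)
    V[:10, :10] += 3.0                 # part (c): stimulate a 10 x 10 square

    for step in range(2000):
        dV = laplacian5(V, dx) + (V - V**3 / 3 - W) / eps
        dW = eps * (V - gamma * W + beta)
        V += dt * dV
        W += dt * dW
        # plot V every ~100 steps to watch the pulse propagate and die at the walls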

^6 Nerve tissue is also an excitable medium, modeled using different Hodgkin-Huxley equations.
^7 Smaller grids would lead to less grainy waves, but slow down the simulation a lot.


If you like, you can mimic the effects of the sinoatrial (SA) node (your heart's natural pacemaker) by stimulating your heart model periodically (say, with the same 10 × 10 square). Realistically, your period should be long enough that the old beat finishes before the new one starts.
We can use this simulation to illustrate general properties of solving PDEs.
(d) Accuracy. Compare the five and nine-point Laplacians. Does the latter give better circular symmetry? Stability. After running for a while, double the time step ∆. How does the system go unstable? Repeat this process, reducing ∆ until just before it goes nuts. Do you see inaccuracies in the simulation that foreshadow the instability?
This checkerboard instability is typical of PDEs with too high a time step. The maximum time step in this system will go as dx², the lattice spacing squared—thus to make dx smaller by a factor of two and simulate the same area, you need four times as many grid points and four times as many time points—giving us a good reason for making dx as large as possible (correcting for grid artifacts by using improved Laplacians). Similar but much more sophisticated tricks have been used recently to spectacularly increase the performance of lattice simulations of the interactions between quarks [?].
As mentioned above, heart arrhythmias are due to spiral waves. To generate spiral waves we need to be able to start up more asymmetric states—stimulating several rectangles at different times. Also, when we generate the spirals, we would like to emulate electroshock therapy by applying a stimulus to a large region of the heart. We can do both by writing code to interactively stimulate a whole rectangle at one time. Again, the code you have obtained from us should have hints for how to do this.
(e) Add the code for interactively stimulating a general rectangle with an increment to V of size ∆ = 3. Play with generating rectangles in different places while other pulses are going by: make some spiral waves. Clear the spirals by giving a stimulus that spans the system.
There are several possible extensions of this model, several of which involve giving our model spatial structure that mimics the structure of the heart. (One can introduce regions of inactive 'dead' tissue. One can introduce the atrium and ventricle compartments to the heart, with the SA node in the atrium and an AV node connecting the two chambers . . . ) Niels Otani has an exercise with further explorations of a number of these extensions, which we link to from the Cardiac Dynamics web site.

(3.3) Quantum dissipation from phonons. (Quantum) ©2
Electrons cause overlap catastrophes (X-ray edge effects, the Kondo problem, macroscopic quantum tunneling); a quantum transition of a subsystem coupled to an electron bath ordinarily must emit an infinite number of electron-hole excitations because the bath states before and after the transition have zero overlap. This is often called an infrared catastrophe (because it is low-energy electrons and holes that cause the zero overlap), or an orthogonality catastrophe (even though the two bath states aren't just orthogonal, they are in different Hilbert spaces). Phonons typically do not produce overlap catastrophes (Debye-Waller, Franck-Condon, Mössbauer). This difference is usually attributed to the fact that there are many more low-energy electron-hole pairs (a constant density of states) than there are low-energy phonons (ω_k ∼ ck, where c is the speed of sound and the wave-vector density goes as (V/(2π)³) d³k).


Fig. 3.1 Atomic tunneling from a tip. Any internal transition among the atoms in an insulator can only exert a force impulse (if it emits momentum, say into an emitted photon), or a force dipole (if the atomic configuration rearranges); these lead to non-zero phonon overlap integrals only partially suppressing the transition. But a quantum transition that changes the net force between two macroscopic objects (here a surface and a STM tip) can lead to a change in the net force (a force monopole). We ignore here the surface, modeling the force as exerted directly into the center of an insulating elastic medium.^8
^8 See 'Atomic Tunneling from a STM/AFM Tip: Dissipative Quantum Effects from Phonons', Ard A. Louis and James P. Sethna, Phys. Rev. Lett.


74, 1363 (1995), and 'Dissipative tunneling and orthogonality catastrophe in molecular transistors', S. Braig and K. Flensberg, Phys. Rev. B 70, 085317 (2004).

However, the coupling strength to the low-energy phonons has to be considered as well. Consider a small system undergoing a quantum transition which exerts a net force at x = 0 onto an insulating crystal:

H = Σ_k [p_k²/2m + (1/2) m ω_k² q_k²] + F · u_0.     (3.6)

Let us imagine a kind of scalar elasticity, to avoid dealing with the three phonon branches (two transverse and one longitudinal); we thus naively write the displacement of the atom at lattice site x_n as u_n = (1/√N) Σ_k q_k exp(−i k x_n) (with N the number of atoms), so q_k = (1/√N) Σ_n u_n exp(i k x_n).

Substituting for u_0 in the Hamiltonian and completing the square, find the displacement ∆_k of each harmonic oscillator. Write the formula for the likelihood 〈F|0〉 that the phonons will all end in their ground states, as a product over k of the phonon overlap integral exp(−∆_k²/8a_k²) (with a_k = √(ħ/2mω_k) the zero-point motion in that mode). Converting the product to the exponential of a sum, and the sum to an integral Σ_k ∼ (V/(2π)³) ∫ d³k, do we observe an overlap catastrophe?

Mean Field: Introduction

Mean field theory can be derived and motivated in several ways.

(a) The interaction (field) from the neighbors can be approximated by the average (mean) field of all sites in the system, effectively as described in Cardy section 2.1. This formulation makes it easy to understand why mean-field theory becomes more accurate in high dimensions and for long-range interactions, as a spin interacts with more and more neighbors.
(b) The free energy can be bounded above by the free energy of a noninteracting mean-field model. This is based on a variational principle I first learned from Feynman (Exercise 3.4, based on Cardy's exercise 2.1).
(c) The free energy can be approximated by the contribution of one order parameter configuration, ignoring fluctuations. (From a path-integral point of view, this is a 'zero loop' approximation.) For the Ising model, one needs first to change away from the Ising spin variables to some continuous order parameter, either by coarse-graining (as in Ginzburg-Landau theory, Sethna 9.5 below) or by introducing Lagrange multipliers (the Hubbard-Stratonovich transformation, not discussed here).
(d) The Hamiltonian can be approximated by an infinite-range model, where each spin does interact with the average of all other spins. Instead of an approximate calculation for the exact Hamiltonian, this is an exact calculation for an approximate Hamiltonian – and hence is guaranteed to be at least a sensible physical model. See Exercise 12.13 for an application to avalanche statistics.
(e) The lattice of sites in the Hamiltonian can be approximated as a branching tree (removing the loops) called the Bethe lattice (not described here). This yields a different, solvable, mean-field theory, which ordinarily has the same mean-field critical exponents but different non-universal features.
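For the nearest-neighbor Ising ferromagnet, approaches (a) and (d) both lead to the standard mean-field self-consistency condition m = tanh(Jzm/k_B T), where z is the number of neighbors (this is a generic textbook result, not a derivation specific to this text). A minimal numerical sketch, with k_B = 1:

    import numpy as np

    def mean_field_m(T, J=1.0, z=4):
        """Solve m = tanh(J*z*m/T) by simple iteration (z = number of neighbors)."""
        m = 1.0                       # start from the fully ordered state
        for _ in range(10000):
            m = np.tanh(J * z * m / T)
        return m

    # magnetization versus temperature; the mean-field Tc sits at T = J*z
    for T in [1.0, 2.0, 3.0, 3.9, 4.1, 5.0]:
        print(T, mean_field_m(T))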

(3.4) Variational mean-field derivation. (Mean field) ©3
Just as the entropy is maximized by a uniform probability distribution on the energy shell for a fixed-energy (microcanonical) ensemble, it is a known fact that the Boltzmann distribution exp(−βH)/Z minimizes the free energy F for a fixed-temperature (canonical) ensemble described by Hamiltonian energy function H. If a general ensemble ρ′ gives the probability of every state X in a system, and we define for an observable B the average 〈B〉_ρ′ = Σ_X B(X) ρ′(X), then what this says is that

F = −(1/β) log Σ_X exp(−βH(X)) ≤ 〈F〉_ρ′ = 〈E − TS〉_ρ′ = 〈H〉_ρ′ − T〈S〉_ρ′.     (3.7)

Feynman, in his work on the polaron problem, derived a variational formula for what appears to be a related problem. If we define 〈B〉_H′ to


be the Boltzmann average of B under the Hamiltonian H′, Feynman says that

Tr e^{−βH} ≥ Tr e^{−β(H′ + 〈H − H′〉_H′)}.     (3.8)

Feynman’s formula varies not among generalprobability distributions ρ′, but only among ap-proximating Hamiltonians H′.(a) Show that eqn 3.7 implies eqn 3.8. (Hint:〈S〉H′ is not dependent on H, and F ′ = E′ −T S′.)(b: Cardy exercise 2.1) Derive the mean fieldequations for the Ising ferromagnet using Feyn-man’s inequality (eqn 3.8), where H is the fullHamiltonian and H′ = h

∑i Si is a trial Hamil-

tonian, with h chosen to maximize the right-handside.Notice that having a variational bound on thefree energy is not as useful as it sounds. Thefree energy itself is usually not of great inter-est; the objects of physical importance are al-most always derivatives (first derivatives like thepressure, energy and entropy, and second deriva-tives like the specific heat) or differences (tellingwhich phase is more stable). Knowing an upperbound to the free energy doesn’t tell us muchabout these other quantities. Like most varia-tional methods, mean-field theory is useful notbecause of its variational bound, but because (ifdone well) it provides qualitative insight into thebehavior.

(3.5) Ising Lower Critical Dimension. (Dimension dependence) ©3
What is the lower critical dimension of the Ising model? If the total energy ∆E needed to destroy long-range order is finite as the system size L goes to infinity, and the associated entropy grows with system size, then surely long-range order is possible only at zero temperature.
(a) Ising model in D dimensions. Consider the Ising model in dimension D on a hypercubic lattice of length L on each side. Estimate the energy^9 needed to create a domain wall splitting the system into two equal regions (one spin up, the other spin down). In what dimension will this wall have finite energy as L → ∞? Suggest a bound for the lower critical dimension of the Ising model.
The scaling at the lower critical dimension is often unusual, with quantities diverging in ways different from power laws as the critical temperature Tc is approached.
(b) Correlation length in 1D Ising model. Estimate the number of domain walls at temperature T in the 1D Ising model. How does the correlation length ξ (the distance between domain walls) grow as T → Tc = 0? (Hint: Change variables to η_i = S_i S_{i+1}, which is −1 if there is a domain wall between sites i and i+1.) The correlation exponent ν satisfying ξ ∼ (T − Tc)^{−ν} is 1, ∼ 0.63, and 1/2 in dimensions D = 2, 3, and ≥ 4, respectively. Is there an exponent ν governing this divergence in one dimension? How does ν behave as D → 1?

(3.6) XY Lower Critical Dimension and the Mermin-Wagner Theorem. (Dimension dependence) ©3
Consider a model of continuous unit-length spins (e.g., XY or Heisenberg) on a D-dimensional hypercubic lattice of length L. Assume a nearest-neighbor ferromagnetic bond energy

−J S_i · S_j.     (3.9)

Estimate the energy needed to twist the spins at one boundary 180° with respect to the other boundary (the energy difference between periodic and antiperiodic boundary conditions along one axis). In what dimension does this energy stay finite in the thermodynamic limit L → ∞? Suggest a bound for the lower critical dimension for the emergence of continuous broken symmetries in models of this type.
Note that your argument produces only one thick domain wall (unlike the Ising model, where the domain wall can be placed in a variety of places). If in the lower critical dimension its energy is fixed as L → ∞ at a value large compared to k_B T, one could imagine most of the time the order might maintain itself across the system. The actual behavior of the XY model in its lower critical dimension is subtle.
On the one hand, there cannot be long-range order. This can be seen convincingly, but not rigorously, by estimating the effects of fluctuations at finite temperature on the order parameter, within linear response. Pierre Hohenberg, David Mermin and Herbert Wagner proved it rigorously (including nonlinear effects) using an inequality due to Bogoliubov. One should

^9 Energy, not free energy! Think about T = 0.


note, though, that the way this theorem is usually quoted ('continuous symmetries cannot be spontaneously broken at finite temperatures in one and two dimensions') is too general. In particular, for two-dimensional crystals one has long-range order in the crystalline orientations, although one does not have long-range broken translational order.
On the other hand, the XY model does have a phase transition in its lower critical dimension at a temperature Tc > 0. The high-temperature phase is a traditional paramagnetic phase, with exponentially decaying correlations between orientations as the distance increases. The low-temperature phase indeed lacks long-range order, but it does have a stiffness – twisting the system (as in your calculation above) by 180° costs a free energy that goes to a constant as L → ∞. In this stiff phase the spin-spin correlations die away not exponentially, but as a power law.
The corresponding Kosterlitz-Thouless phase transition has subtle, fascinating scaling properties. Interestingly, the defect that destroys the stiffness (a vortex) in the Kosterlitz-Thouless transition does not have finite energy as the system size L gets large. We shall see that its energy grows ∼ log L, while its entropy times the temperature grows ∼ T log L, so entropy wins over energy as the temperature rises, even though the latter is infinite.

(3.7) Long-range Ising. (Dimension dependence) ©3
The one-dimensional Ising model can have a finite-temperature transition if we give each spin an interaction with distant spins.
Long-range forces in the 1d Ising model. Consider an Ising model in one dimension, with long-range ferromagnetic bonds

H = −Σ_{i>j} (J/|i − j|^σ) S_i S_j.     (3.10)

For what values of σ will a domain wall between up and down spins have finite energy? Suggest a bound for the 'lower critical power law' for this long-range one-dimensional Ising model, below which a ferromagnetic state is only possible when the temperature is zero. (Hint: Approximate the sum by a double integral. Avoid i = j.)
The long-range 1D Ising model at the lower critical power law has a transition that is closely related to the Kosterlitz-Thouless transition. It is in the same universality class as the famous (but obscure) Kondo problem in quantum phase transitions. And it is less complicated to think about and less complicated to calculate with than either of these other two cases.

(3.8) Equilibrium Crystal Shapes. (Crystal shapes) ©3
What is the equilibrium shape of a crystal? There are nice experiments on single crystals of salt, gold, and lead crystals (see http://www.lassp.cornell.edu/sethna/CrystalShapes/Equilibrium Crystal Shapes.html). They show beautiful faceted shapes, formed by carefully annealing single crystals to equilibrium at various temperatures. The physics governing the shape involves the anisotropic surface tension γ(n) of the surface, which depends on the orientation n of the local surface with respect to the crystalline axes.
We can see how this works by considering the problem for atoms on a 2D square lattice with near-neighbor interactions (a lattice gas which we map in the standard way onto a conserved-order-parameter Ising model). Here γ(θ) becomes the line tension between the up and down phases – the interfacial free energy per unit length between an up-spin and a down-spin phase. We draw heavily on Craig Rottman and Michael Wortis, Phys. Rev. B 24, 6274 (1981), and on W. K. Burton, N. Cabrera and F. C. Frank, Phil. Trans. R. Soc. Lond. A 243, 299-358 (1951).
(a) Interfacial free energy, T = 0. Consider an interface at angle θ between up spins and down spins. Show that the energy cost per unit length of the interface is

γ_0(θ) = 2J(cos(|θ|) + sin(|θ|)),     (3.11)

where length is measured in lattice spacings.
(b) Interfacial free energy, low T. Consider an interface at angle θ = arctan(N/M), connecting the origin to the point (M, N). At zero temperature, this will be an ensemble of staircases, with N steps upward and M steps forward. Show that the total number of such staircases is (M + N)!/(M! N!). Hint: The number of ways of putting k balls into ℓ jars allowing more than one ball per jar (the 'number of combinations with repetition') is (k + ℓ − 1)!/(k!(ℓ − 1)!). Look up the argument. Using Stirling's formula, show


that the entropy per unit length is

s_0(θ) = k_B ((cos(|θ|) + sin(|θ|)) log(cos(|θ|) + sin(|θ|)) − cos(|θ|) log(cos(|θ|)) − sin(|θ|) log(sin(|θ|))).     (3.12)

How would we generate an equilibrium crystal for the 2D Ising model? (For example, see 'The Gibbs-Thomson formula at small island sizes - corrections for high vapour densities', Badrinarayan Krishnamachari, James McLean, Barbara Cooper, and James P. Sethna, Phys. Rev. B 54, 8899 (1996).) Clearly we want a conserved order parameter simulation (otherwise the up-spin 'crystal' cluster in a down-spin 'vapor' environment would just flip over). The tricky part is that an up-spin cluster in an infinite sea of down-spins will evaporate – it's unstable.^10 The key is to do a simulation below Tc, but with a net (conserved) negative magnetization slightly closer to zero than expected in equilibrium. The extra up-spins will (in equilibrium) mostly find one another, forming a cluster whose time-average will give an equilibrium crystal.
Rottman and Wortis tell us that the equilibrium crystal shape (minimizing the perimeter energy for fixed crystal area) can be found as a parametric curve

x = cos(θ) γ(θ, T) − sin(θ) dγ/dθ
y = sin(θ) γ(θ, T) + cos(θ) dγ/dθ

where γ(θ, T) = γ_0(θ) − T s(θ, T) is the free energy per unit length of the interface. Deriving this is cool, but somewhat complicated.^11

(c) Wulff construction. Using the energy of eqn 3.11 and approximating the entropy at low temperatures with the zero-temperature form eqn 3.12, plot the Wulff shape for T = 0.01, 0.1, 0.2, 0.4, and 0.8 J for one quadrant (0 < θ < π/2). Hint: Ignore the parts of the face outside the quadrant; they are artifacts of the low-temperature approximation. Does the shape become circular^12 near Tc = 2J/log(1 + √2) ∼ 2.269 J? Why not? If you're ambitious, Rottman and Wortis's article above gives the exact interfacial free energy for the 2D Ising model, which should fix this problem. What is the shape at low temperatures, where our approximation is good? Do we ever get true facets? The approximation of the interface as a staircase is a solid-on-solid model, which ignores overhangs. It is a good approximation at low temperatures.
Our model does not describe faceting – one needs three dimensions to have a roughening transition, below which there are flat regions on the equilibrium crystal shape.^13

^10 If you put on an external field favoring up spins, then large clusters will grow and small clusters will shrink. The borderline cluster size is the critical droplet (see Entropy, Order Parameters, and Complexity, section 11.3). Indeed, the critical droplet will in general share the equilibrium crystal shape.
^11 See Burton, Cabrera, and Frank, Appendix D. This is their equation D4, with p ∝ γ(θ) given by equation D7.
^12 This is the emergent spherical symmetry at the Ising model critical point.
^13 Flat regions demand that the free energy for adding a step onto the surface become infinite. Steps on the surfaces of three-dimensional crystals are long, and if they have positive energy per unit length the surface is faceted. To get an infinite energy for a step on a two-dimensional surface you need long-range interactions.


Numerical Methods 4

Exercises

These exercises cover numerical methods, and are designed to complement the text Numerical Recipes, by William H. Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery.

(4.1) Condition Number and Accuracy.^1 (Numerical) ©3
You may think this exercise, with a 2x2 matrix, hardly demands a computer. However, it introduces tools for solving linear equations, condition numbers, singular value decomposition, all while illustrating subtle properties of matrix solutions. Use whatever linear algebra packages are provided in your software environment. Consider the equation Ax = b, where

A = ((0.780, 0.563), (0.913, 0.659))   and   b = (0.217, 0.254).     (4.1)

The exact solution is x = (1, −1). Consider the two approximate solutions x_α = (0.999, −1.001) and x_β = (0.341, −0.087).

V T maps the errors xα−x and xβ −x into whatcombinations of the two singular values?)(c) Use a black-box linear solver to solve for x.Subtract your answer from the exact one. Doyou get within a few times the machine accuracyof 2.2 × 10−16? Is the problem the accuracy ofthe solution, or rounding errors in calculating Aand b? (Hint: try calculating Ax− b.)

(4.2) Sherman–Morrison formula. (Numeri-cal) ©3Consider the 5x5 matrices

T =

E − t t 0 0 0t E t 0 00 t E t 00 0 t E t0 0 0 t E − t

(4.2)

and

C =

E t 0 0 tt E t 0 00 t E t 00 0 t E tt 0 0 t E

. (4.3)

These matrices arise in one-dimensional modelsof crystals.3 The matrix T is tridiagonal: its en-tries are zero except along the central diagonal

1Adapted from Saul Teukolsky, 2003.2See section 2.6.2 for a technical definition of the condition number, and how it isrelated to singular value decomposition. Look on the Web for the more traditionaldefinition(s), and how they are related to the accuracy.3As the Hamiltonian for electrons in a one-dimensional chain of atoms, t is thehopping matrix element and E is the on-site energy. As the potential energy for lon-gitudinal vibrations in a one-dimensional chain, E = −2t = K is the spring constantbetween two neighboring atoms. The tridiagonal matrix T corresponds to a kindof free boundary condition, while C corresponds in both cases to periodic boundaryconditions.


and the entries neighboring the diagonal. Tridiagonal matrices are fast to solve; indeed, many routines will start by changing basis to make the array tridiagonal. The matrix C, on the other hand, has a nice periodic structure: each basis element has two neighbors, with the first and last basis elements now connected by t in the upper-right and lower-left corners. This periodic structure allows for analysis using Fourier methods (Bloch waves and k-space).
For matrices like C and T which differ in only a few matrix elements^4 we can find C^{−1} from T^{−1} efficiently using the Sherman-Morrison formula (section 2.7).
Compute the inverse^5 of T for E = 3 and t = 1. Compute the inverse of C. Compare the difference ∆ = T^{−1} − C^{−1} with that given by the Sherman-Morrison formula

∆ = (T^{−1} u ⊗ v T^{−1}) / (1 + v · T^{−1} · u).     (4.4)
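A numpy sketch checking eqn 4.4 for E = 3 and t = 1, with u = v = (1, 0, 0, 0, 1) as in footnote 4 (so that C = T + u ⊗ v):

    import numpy as np

    E, t = 3.0, 1.0
    T = np.diag([E - t, E, E, E, E - t]) + t * (np.eye(5, k=1) + np.eye(5, k=-1))
    C = np.diag([E] * 5) + t * (np.eye(5, k=1) + np.eye(5, k=-1))
    C[0, 4] = C[4, 0] = t

    u = v = np.array([1.0, 0, 0, 0, 1.0])      # C = T + outer(u, v) for t = 1

    Tinv = np.linalg.inv(T)
    Cinv = np.linalg.inv(C)
    delta_direct = Tinv - Cinv
    delta_sm = np.outer(Tinv @ u, v @ Tinv) / (1.0 + v @ Tinv @ u)   # eqn 4.4
    print(np.max(np.abs(delta_direct - delta_sm)))   # should be near machine precision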

(4.3) Methods of interpolation. (Numerical) ©3
We've implemented four different interpolation methods for the function sin(x) on the interval (−π, π). On the left, we see the methods using the five points ±π, ±π/2, and 0; on the right we see the methods using ten points. The graphs show the interpolation, its first derivative, its second derivative, and the error. The four interpolation methods we have implemented are (1) Linear, (2) Polynomial of degree three (M = 4), (3) Cubic spline, and (4) Barycentric rational interpolation. Which set of curves (A, B, C, or D) in Figure 4.1 corresponds with which method?

(4.4) Numerical definite integrals. (Numerical) ©3
In this exercise we will integrate the function you graphed in the first, warmup exercise:

F(x) = exp(−6 sin(x)).     (4.5)

As discussed in Numerical Recipes, the word integration is used both for the operation that is inverse to differentiation, and more generally for finding solutions to differential equations. The old-fashioned term specific to what we are doing in this exercise is quadrature.
(a) Black-box. Using a professionally written black-box integration routine of your choice, integrate F(x) between zero and π. Compare your answer to the analytic integral^6 (≈ 0.34542493760937693) by subtracting the analytic form from your numerical result. Read the documentation for your black box routine, and describe the combination of algorithms being used.
(b) Trapezoidal rule. Implement the trapezoidal rule with your own routine. Use it to calculate the same integral as in part (a). Calculate the estimated integral Trap(h) for N + 1 points spaced at h = π/N, with N = 1, 2, 4, . . . , 2^10. Plot the estimated integral versus the spacing h. Does it extrapolate smoothly to the true value as h → 0? With what power of h does the error vanish? Replot the data as Trap(h) versus h². Does the error now vanish linearly?
Numerical Recipes tells us that the error is an even polynomial in h, so we can extrapolate the results of the trapezoidal rule using polynomial interpolation in powers of h².
(c) Simpson's rule (paper and pencil). Consider a linear fit (i.e., A + Bh²) to two points at 2h_0 and h_0 on your Trap(h) versus h² plot. Notice that the extrapolation to h → 0 is A, and show that A is 4/3 Trap(h_0) − 1/3 Trap(2h_0). What is the net weight associated with the even points and odd points? Is this Simpson's rule?
(d) Romberg integration. Apply M-point polynomial interpolation (here extrapolation) to the data points (h², Trap(h)) for h = π/2, . . . , π/2^M, with values of M between two and ten. (Note that the independent variable is h².) Make a log-log plot of the absolute value of the error versus N = 2^M. Does this extrapolation improve convergence?
(e) Gaussian Quadrature. Implement Gaussian quadrature with N points optimally chosen on

^4 More generally, this works whenever the two matrices differ by the outer product u ⊗ v of two vectors. By taking the two vectors to each have one non-zero component u_i = u δ_ia, v_j = v δ_jb, the matrices differ at one matrix element ∆_ab = uv; for our matrices u = v = (1, 0, 0, 0, 1) (see section 2.7.2).
^5 Your software environment should have a solver for tridiagonal systems, rapidly giving u in the equation T · u = r. It likely will not have a special routine for inverting tridiagonal matrices, but our matrix is so small it's not important.
^6 π(BesselI[0, 6] − StruveL[0, 6]), according to Mathematica.


Fig. 4.1 Interpolation methods. [Figure: two columns of panels (five data points on the left, ten data points on the right) showing the interpolated sin(x), its first derivative, its second derivative, and the error sin(x) − approximation, each with four curves labeled A, B, C, and D.]


the interval (0, π), with N = 1, 2, . . . , 5. (You may find the points and the weights appropriate for integrating functions on the interval (−1, 1) on the course Web site; you will need to rescale them for use on (0, π).) Make a log-log plot of the absolute value of your error as a function of the number of evaluation points N, along with the corresponding errors from the trapezoidal rule and Romberg integration.
(f) Integrals of periodic functions. Apply the trapezoidal rule to integrate F(x) from zero to 2π, and plot the error on a log plot (log of the absolute value of the error versus N) as a function of the number of points N up to N = 20. (The true value should be around 422.44623805153909946.) Why does it converge so fast? (Hint: Don't get distracted by the funny alternation of accuracy between even and odd points.)
The location of the Gauss points depends upon the class of functions one is integrating. In part (e), we were using Gauss-Legendre quadrature, appropriate for functions which are analytic at the endpoints of the range of integration. In part (f), we have a function with periodic boundary conditions. For functions with periodic boundary conditions, the end-points are no longer special. What corresponds to Gaussian quadrature for periodic functions is just the trapezoidal rule: equally-weighted points at equally spaced intervals.
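A sketch for parts (b) and (f): a hand-written trapezoidal rule applied to F(x) = exp(−6 sin x) on (0, π) and on the periodic interval (0, 2π), using the reference values quoted above:

    import numpy as np

    F = lambda x: np.exp(-6.0 * np.sin(x))
    exact_0_pi = 0.34542493760937693          # analytic value quoted in part (a)

    def trap(f, a, b, N):
        """Trapezoidal rule with N intervals (N + 1 equally spaced points)."""
        x = np.linspace(a, b, N + 1)
        y = f(x)
        h = (b - a) / N
        return h * (0.5 * y[0] + y[1:-1].sum() + 0.5 * y[-1])

    # part (b): the error vanishes as h^2 on (0, pi)
    for N in [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]:
        print(N, trap(F, 0, np.pi, N) - exact_0_pi)

    # part (f): on the periodic interval (0, 2 pi) the convergence is far faster
    for N in range(1, 21):
        print(N, trap(F, 0, 2 * np.pi, N) - 422.44623805153909946)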

(4.5) Numerical derivatives. (Rounding errors, Accuracy) ©2
Calculate the numerical first derivative of the function y(x) = sin(x), using the centered two-point formula dy/dx ∼ (y(x + h) − y(x − h))/(2h), and plot the error y′(x) − cos(x) in the range (−π, π) at 100 points. (Do a good job! Use the step-size h described in Numerical Recipes section 5.7 to optimize the sum of the truncation error and the rounding error. Also, make sure that the step-size h is exactly representable on the machine.) How does your actual error compare to the fractional error estimate given in NR section 5.7? Calculate and plot the numerical second derivative using the formula

d²y/dx² ∼ (y(x + h) − 2y(x) + y(x − h))/h²,     (4.6)

again optimizing h and making it exactly representable. Estimate your error again, and compare to the observed error.
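A sketch of the centered differences, with rough rounding-aware step sizes of the kind Numerical Recipes 5.7 suggests (h of order ε_m^{1/3} for the first derivative and ε_m^{1/4} for the second, here used without the curvature-scale refinement) and the make-h-exactly-representable trick:

    import numpy as np

    eps_m = np.finfo(float).eps

    def centered_derivative(f, x, h):
        temp = x + h
        h = temp - x                      # make h exactly representable
        return (f(x + h) - f(x - h)) / (2 * h)

    def second_derivative(f, x, h):
        temp = x + h
        h = temp - x
        return (f(x + h) - 2 * f(x) + f(x - h)) / h**2   # eqn 4.6

    x = np.linspace(-np.pi, np.pi, 100)
    h1 = eps_m ** (1.0 / 3.0)             # rough optimum for the two-point formula
    err1 = centered_derivative(np.sin, x, h1) - np.cos(x)
    print(np.max(np.abs(err1)))           # of order eps_m**(2/3)

    h2 = eps_m ** 0.25                    # rough optimum for the second derivative
    err2 = second_derivative(np.sin, x, h2) + np.sin(x)
    print(np.max(np.abs(err2)))           # of order sqrt(eps_m)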

(4.6) Summing series. (Efficiency) ©2
Write a routine to calculate the sum

s_n = Σ_{j=0}^{n} (−1)^j / (2j + 1).     (4.7)

As n → ∞, s_∞ = π/4. About how many terms do you need to sum to get convergence to within 10^{−7} of this limit? Now try using Aitken's ∆² process to accelerate the convergence:

s′_n = s_n − (s_{n+1} − s_n)² / (s_{n+2} − 2s_{n+1} + s_n).     (4.8)

About how many terms do you need with Aitken's method to get convergence to within 10^{−7}?
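A sketch of eqns 4.7 and 4.8: partial sums of the alternating series and their Aitken-accelerated counterparts (the cutoff of 60 terms is just an example to vary):

    import numpy as np

    def partial_sums(nmax):
        terms = np.array([(-1.0)**j / (2*j + 1) for j in range(nmax + 1)])
        return np.cumsum(terms)          # s_0, s_1, ..., s_nmax  (eqn 4.7)

    def aitken(s):
        """Aitken Delta^2 acceleration (eqn 4.8); returns s'_0 ... s'_{n-2}."""
        return s[:-2] - (s[1:-1] - s[:-2])**2 / (s[2:] - 2*s[1:-1] + s[:-2])

    s = partial_sums(60)
    sp = aitken(s)
    print(np.abs(s - np.pi/4)[-1])       # raw sum: still far from 1e-7 after 60 terms
    print(np.abs(sp - np.pi/4)[-1])      # accelerated: much smaller with the same terms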

(4.7) Random histograms. (Random numbers) ©2

(a) Investigate the random number generator for your system of choice. What is its basic algorithm? Its period?
(b) Plot a histogram with 100 bins, giving the normalized probability density of 100,000 random numbers sampled from (a) a uniform distribution in the range 0 < x < 2π, (b) an exponential distribution ρ(x) = 6 exp(−6x), and (c) a normal distribution of mean x̄ = 3π/2 and standard deviation σ = 1/√6. Before each plot, set the seed of your random number generator. Do you now get the same plot when you repeat?

(4.8) Monte Carlo integration. (Random numbers, Robust algorithms) ©3
How hard can numerical integration be? Suppose the function f is wildly nonanalytic, or has a peculiar or high-dimensional domain of integration? In the worst case, one can always try Monte Carlo integration. The basic idea is to pepper points at random in the integration interval. The integration volume times the average of the function, V〈f〉, is the estimate of the integral.
As one might expect, the expected error in the integral after N evaluations is given by 1/√(N − 1) times the standard deviation of the sampled points (NR equation 7.7.1).
(a) Monte Carlo in one-dimensional integrals. Use Monte Carlo integration to estimate the integral of the function introduced in the preliminary exercises

y(x) = exp(−6 sin(x))     (4.9)

over 0 ≤ x < 2π. (The correct value of the integral is around 422.446.) How many points do you


need to get 1% accuracy? Answer this last question both by experimenting with different random number seeds, and by calculating the expected number from the standard deviation. (You may use the fact that 〈y²〉 = (1/2π) ∫_0^{2π} y²(x) dx ≈ 18948.9.)
Monte Carlo integration is not the most efficient method for calculating integrals of smooth functions like y(x). Indeed, since y(x) is periodic in the integration interval, equally spaced points weighted equally (the trapezoidal rule) gives exponentially rapid convergence; it takes only nine points to get 1% accuracy. Even for smooth functions, though, Monte Carlo integration is useful in high dimensions.
(b) Monte Carlo in many dimensions (No detailed calculations expected). For a hypothetical ten-dimensional integral, if we used a regular grid with nine points along each axis, how many function evaluations would we need for equivalent accuracy? Does the number of Monte Carlo points needed depend on the dimension of the space, presuming (perhaps naively) that the variance of the function stays fixed?
Our function y(x) is quite close to a Gaussian. (Why? Taylor expand sin(x) about x = 3π/2.) We can use this to do importance sampling. The idea is to evaluate the integral of h(x)g(x) by randomly sampling h with probability g, picking h(x) = y(x)/g(x). The variance is then 〈h²〉 − 〈h〉². In order to properly sample the tail near x = π/2, we should mix a Gaussian and a uniform distribution:

g(x) = ε/(2π) + ((1 − ε)/√(π/3)) exp(−6(x − 3π/2)²/2).     (4.10)

I found minimum variance around ε = 0.005.
(c) Importance Sampling (Optional for 480). Generate 1000 random numbers with probability distribution g.^7 Use these to estimate the integral of y(x). How accurate is your answer?
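A sketch of part (a), plus an importance-sampled estimate in the spirit of part (c) using g(x) from eqn 4.10 (sampling as described in footnote 7, simply discarding draws that fall outside (0, 2π)):

    import numpy as np

    rng = np.random.default_rng(0)
    y = lambda x: np.exp(-6.0 * np.sin(x))

    # (a) plain Monte Carlo: V <f> with V = 2 pi
    N = 100000
    x = rng.uniform(0.0, 2*np.pi, N)
    samples = y(x)
    estimate = 2*np.pi * samples.mean()
    error = 2*np.pi * samples.std() / np.sqrt(N - 1)
    print(estimate, '+/-', error)        # compare with ~422.446

    # (c) importance sampling with the mixed distribution g(x) of eqn 4.10
    eps = 0.005
    M = 1000
    n_gauss = int((1 - eps) * M)
    pts = np.concatenate([rng.normal(3*np.pi/2, 1/np.sqrt(6), n_gauss),
                          rng.uniform(0.0, 2*np.pi, M - n_gauss)])
    pts = pts[(pts > 0) & (pts < 2*np.pi)]          # footnote 7: drop out-of-range draws
    g = eps/(2*np.pi) + (1 - eps)/np.sqrt(np.pi/3) * np.exp(-6*(pts - 3*np.pi/2)**2 / 2)
    h = y(pts) / g
    print(h.mean(), '+/-', h.std() / np.sqrt(len(h) - 1))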

(4.9) The Birthday Problem. (Sorting, Random numbers) ©2
Remember birthday parties in your elementary school? Remember those years when two kids had the same birthday? How unlikely!

How many kids would you need in class to get, more than half of the time, at least two with the same birthday?
(a) Numerical. Write BirthdayCoincidences(K, C), a routine that returns the fraction among C classes for which at least two kids (among K kids per class) have the same birthday. (Hint: By sorting a random list of integers, common birthdays will be adjacent.) Plot this probability versus K for a reasonably large value of C. Is it a surprise that your classes had overlapping birthdays when you were young?
One can intuitively understand this, by remembering that to avoid a coincidence there are K(K − 1)/2 pairs of kids, all of whom must have different birthdays (probability 364/365 = 1 − 1/D, with D days per year).

P(K, D) ≈ (1 − 1/D)^{K(K−1)/2}     (4.11)

This is clearly a crude approximation – it doesn't vanish if K > D! Ignoring subtle correlations, though, it gives us a net probability

P(K, D) ≈ exp(−1/D)^{K(K−1)/2} ≈ exp(−K²/(2D))     (4.12)

Here we've used the fact that 1 − ε ≈ exp(−ε), and assumed that K/D is small.
(b) Analytical. Write the exact formula giving the probability, for K random integers among D choices, that no two kids have the same birthday. (Hint: What is the probability that the second kid has a different birthday from the first? The third kid has a different birthday from the first two?) Show that your formula does give zero if K > D. Converting the terms in your product to exponentials as we did above, show that your answer is consistent with the simple formula above, if K ≪ D. Inverting equation 4.12, give a formula for the number of kids needed to have a 50% chance of a shared birthday.
Some years ago, we were doing a large simulation, involving sorting a lattice of 1000³ random fields (roughly, to figure out which site on the lattice would trigger first). If we want to make sure that our code is unbiased, we want different random fields on each lattice site – a giant birthday problem.

7 Take (1 − ε)M Gaussian random numbers and εM random numbers uniformly distributed on (0, 2π). The Gaussian has a small tail that extends beyond the integration range (0, 2π), so the normalization of the second term in the definition of g is not quite right. You can fix this by simply throwing away any samples that include points outside the range.


Old-style random number generators generated a random integer (2³² 'days in the year') and then divided by the maximum possible integer to get a random number between zero and one. Modern random number generators generate all 2⁵² possible doubles between zero and one.

(c) If there are 2³¹ − 1 distinct four-byte positive integers, how many random numbers would one have to generate before one would expect coincidences half the time? Generate lists of that length, and check your assertion. (Hints: It is faster to use array operations, especially in interpreted languages. I generated a random array with N entries, sorted it, subtracted the first N − 1 entries from the last N − 1, and then called min on the array.) Will we have to worry about coincidences with an old-style random number generator? How large a lattice L × L × L of random double-precision numbers can one generate with modern generators before having a 50% chance of a coincidence? If you have a fast machine with a large memory, you might test this too.
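A minimal sketch of parts (a) and (c), assuming Python with numpy; BirthdayCoincidences follows the name in the exercise, and the coincidence test uses the sort-and-difference trick from the hints.

import numpy as np

def BirthdayCoincidences(K, C, D=365, seed=0):
    """Fraction of C classes of K kids (D days per year) with at least one shared birthday."""
    rng = np.random.default_rng(seed)
    shared = 0
    for _ in range(C):
        bdays = np.sort(rng.integers(0, D, K))
        if np.any(bdays[1:] == bdays[:-1]):     # sorted, so duplicates are adjacent
            shared += 1
    return shared / C

def doubles_coincide(N, seed=0):
    """Part (c): do N doubles from a modern generator ever repeat?"""
    rng = np.random.default_rng(seed)
    r = np.sort(rng.random(N))
    return np.min(r[1:] - r[:-1]) == 0.0        # zero spacing means a repeated value

# Example usage:
# print(BirthdayCoincidences(30, 10000))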

(4.10) Washboard Potential. (Solving) ©2
Consider a washboard potential⁸

V(r) = A₁ cos(r) + A₂ cos(2r) − F r    (4.13)

with A₁ = 5, A₂ = 1, and F initially equal to 1.5.

(a) Plot V(r) over (−10, 10). Numerically find the local maximum of V near zero, and the local minimum of V to the left (negative side) of zero. What is the potential energy barrier for moving from one well to the next in this potential?

Usually finding the minimum is only a first step – one wants to explore how the minimum moves and disappears. . .

(b) Increasing the external tilting field F, graphically roughly locate the field F_c where the barrier disappears, and the location r_c at this field where the potential minimum and maximum merge. (This is a saddle-node bifurcation.) Give the criterion on the first derivative and the second derivative of V(r) at F_c and r_c. Using these two equations, numerically use a root-finding routine to locate the saddle-node bifurcation F_c and r_c.
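One possible setup for the root-finding in part (b), assuming Python with scipy (the function names here are mine): the saddle-node condition is that both V′(r) and V″(r) vanish, and the starting guess would come from the graphical estimate the exercise asks for.

import numpy as np
from scipy.optimize import fsolve

A1, A2 = 5.0, 1.0

def Vprime(r, F):
    """V'(r) for V(r) = A1 cos r + A2 cos 2r - F r."""
    return -A1 * np.sin(r) - 2.0 * A2 * np.sin(2.0 * r) - F

def Vdoubleprime(r, F):
    """V''(r); F drops out."""
    return -A1 * np.cos(r) - 4.0 * A2 * np.cos(2.0 * r)

def saddle_node(z):
    r, F = z
    return [Vprime(r, F), Vdoubleprime(r, F)]   # both vanish where minimum and maximum merge

# rc, Fc = fsolve(saddle_node, [r_guess, F_guess])   # guesses read off the plots of part (b)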

(4.11) Sloppy Minimization. (Minimization) ©3
"With four parameters I can fit an elephant. With five I can make it waggle its trunk." This statement, attributed to many different sources (from Carl Friedrich Gauss to Fermi), reflects the problems found in fitting multiparameter models to data. One almost universal problem is sloppiness – the parameters in the model are poorly constrained by the data.⁹

Consider the classic ill-conditioned problem of fitting exponentials to radioactive decay data. If you know that at t = 0 there are equal quantities of N radioactive materials with decay rates γ_n, the radioactivity that you would measure is

y(t) = Σ_{n=0}^{N−1} γ_n exp(−γ_n t).    (4.14)

Now, suppose you don't know the decay rates γ_n. Can you reconstruct them by fitting y(t) to experimental data y₀(t)?

Let's consider the problem with two radioactive decay elements, N = 2. Suppose the actual decay constants for y₀(t) are α_n = n + 2 (so the experiment has γ₀ = α₀ = 2 and γ₁ = α₁ = 3), and we try to minimize the least-squared error¹⁰ in the fits, C:

C[γ] = ∫_0^∞ (y(t) − y₀(t))² dt.    (4.15)

You can convince yourself that the least-squared error is

C[γ] = Σ_{n=0}^{N−1} Σ_{m=0}^{N−1} [ γ_n γ_m/(γ_n + γ_m) + α_n α_m/(α_n + α_m) − 2 γ_n α_m/(γ_n + α_m) ]
     = 49/10 + γ₀/2 − 4γ₀/(2 + γ₀) − 6γ₀/(3 + γ₀) + γ₁/2 − 4γ₁/(2 + γ₁) − 6γ₁/(3 + γ₁) + 2γ₀γ₁/(γ₀ + γ₁).

8 A washboard is what people used to hand-wash clothing. It is held at an angle, and has a series of corrugated ridges; one holds the board at an angle and rubs the wet clothing on it. Washboard potentials arise in the theory of superconducting Josephson junctions, in the motion of defects in crystals, and in many other contexts.
9 'Elephantness' says that a wide range of behaviors can be exhibited by varying the parameters over small ranges. 'Sloppiness' says that a wide range of parameters can yield approximately the same behavior. Both reflect a skewness in the relation between parameters and model behavior.
10 We're assuming perfect data at all times, with uniform error bars.


(a) Draw a contour plot of C in the square 1.5 < γ_n < 3.5, with enough contours (perhaps non-equally spaced) so that the two minima can be distinguished. (You'll also need a fairly fine grid of points.)

One can see from the contour plot that measuring the two rate constants separately would be a challenge. This is because the two exponentials have similar shapes, so increasing one decay rate and decreasing the other can almost perfectly compensate for one another.

(b) If we assume both elements decay with the same decay constant γ = γ₀ = γ₁, minimize the cost to find the optimum choice for γ. Where is this point on the contour plot? Plot y₀(t) and y(t) with this single-exponent best fit on the same graph, over 0 < t < 2. Do you agree that it would be difficult to distinguish these two fits?

This problem can become much more severe in higher dimensions. The banana-shaped ellipses in your contour plot can become needle-like, with aspect ratios of more than a thousand to one (about the same as a human hair). Following these thin paths down to the true minima can be a challenge for multidimensional minimization programs.

(c) Find a method for storing and plotting the locations visited by your minimization routine (values of (γ₀, γ₁) at which it evaluates C while searching for the minimum). Starting from γ₀ = 1, γ₁ = 4, minimize the cost using as many methods as is convenient within your programming environment, such as Nelder-Mead, Powell, Newton, Quasi-Newton, conjugate gradient, Levenberg-Marquardt (nonlinear least squares). . . . (Try to use at least one that does not demand derivatives of the function). Plot the evaluation points on top of the contour plot of the cost for each method. Compare the number of function evaluations needed for the different methods.
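A sketch of one way to do parts (a) and (c), assuming Python with numpy, scipy, and matplotlib; the cost function is the N = 2 closed form quoted above, and the wrapper that records visited points is my own naming.

import numpy as np
from scipy.optimize import minimize
import matplotlib.pyplot as plt

def cost(g):
    g0, g1 = g
    return (49.0/10.0 + g0/2.0 - 4.0*g0/(2.0+g0) - 6.0*g0/(3.0+g0)
            + g1/2.0 - 4.0*g1/(2.0+g1) - 6.0*g1/(3.0+g1)
            + 2.0*g0*g1/(g0+g1))

# (a) contour plot on 1.5 < gamma_n < 3.5, with unevenly spaced levels near the minima
grid = np.linspace(1.5, 3.5, 201)
G0, G1 = np.meshgrid(grid, grid)
C = cost([G0, G1])
plt.contour(G0, G1, C, levels=np.linspace(0.05, np.sqrt(C.max()), 30)**2)

# (c) record every point where the minimizer evaluates the cost
visited = []
def tracked_cost(g):
    visited.append(np.array(g, dtype=float))
    return cost(g)

res = minimize(tracked_cost, x0=[1.0, 4.0], method='Nelder-Mead')  # derivative-free
pts = np.array(visited)
plt.plot(pts[:, 0], pts[:, 1], '.-', ms=3)
plt.xlabel('gamma_0'); plt.ylabel('gamma_1')
plt.show()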

(4.12) Sloppy Monomials.¹¹ (Eigenvalues, Fitting) ©3
We have seen that the same function f(x) can be approximated in many ways. Indeed, the same function can be fit in the same interval by the same type of function in several different ways! For example, in the interval [0, 1], the function sin(2πx) can be approximated (badly) by a fifth-order Taylor expansion, a Chebyshev polynomial, or a least-squares (Legendre¹²) fit:

f(x) = sin(2πx)
     ≈ 0.000 + 6.283x + 0.000x² − 41.342x³ + 0.000x⁴ + 81.605x⁵    (Taylor)
     ≈ 0.007 + 5.652x + 9.701x² − 95.455x³ + 133.48x⁴ − 53.39x⁵    (Chebyshev)
     ≈ 0.016 + 5.410x + 11.304x² − 99.637x³ + 138.15x⁴ − 55.26x⁵    (Legendre)

It is not a surprise that the best fit polynomial differs from the Taylor expansion, since the latter is not a good approximation. But it is a surprise that the last two polynomials are so different. The maximum error for Legendre is less than 0.02, and for Chebyshev is less than 0.01, even though the two polynomials differ by

Chebyshev − Legendre = −0.009 + 0.242x − 1.603x² + 4.182x³ − 4.67x⁴ + 1.87x⁵,    (4.16)

a polynomial with coefficients two hundred times larger than the maximum difference!

This flexibility in the coefficients of the polynomial expansion is remarkable. We can study it by considering the dependence of the quality of the fit on the parameters. Least-squares (Legendre) fits minimize a cost C^Leg, the integral of the squared difference between the polynomial and the function:

C^Leg = (1/2) ∫_0^1 (f(x) − Σ_{m=0}^{M} a_m x^m)² dx.    (4.17)

How quickly does this cost increase as we move the parameters a_m away from their best-fit values? Varying any one monomial coefficient will of course make the fit bad. But apparently certain coordinated changes of coefficients don't cost much – for example, the difference between least-squares and Chebyshev fits given in eqn (4.16).

11 Thanks to Joshua Waterfall, whose research is described here.
12 The orthogonal polynomials used for least-squares fits on [−1, 1] are the Legendre polynomials, assuming continuous data points. Were we using orthogonal polynomials for this exercise, we would need to shift them for use in [0, 1].


How should we explore the dependence in arbitrary directions in parameter space? We can use the eigenvalues of the Hessian to see how sensitive the fit is to moves along the various eigenvectors. . .

(a) Note that the first derivative of the cost C^Leg is zero at the best fit. Show that the Hessian second derivative of the cost is

H^Leg_{mn} = ∂²C^Leg/∂a_m∂a_n = 1/(m + n + 1).    (4.18)

This Hessian is the Hilbert matrix, famous for being ill-conditioned (having a huge range of eigenvalues). Tiny eigenvalues of H^Leg correspond to directions in polynomial space where the fit doesn't change.

(b) Calculate the eigenvalues of the 6 × 6 Hessian for fifth-degree polynomial fits. Do they span a large range? How big is the condition number?

(c) Calculate the eigenvalues of larger Hilbert matrices. At what size do your eigenvalues seem contaminated by rounding errors? Plot them, in order of decreasing size, on a semi-log plot to illustrate these rounding errors.

Notice from eqn 4.18 that the dependence of the polynomial fit on the monomial coefficients is independent of the function f(x) being fitted. We can thus vividly illustrate the sloppiness of polynomial fits by considering fits to the zero function f(x) ≡ 0. A polynomial given by an eigenvector of the Hilbert matrix with small eigenvalue must stay close to zero everywhere in the range [0, 1]. Let's check this.

(d) Calculate the eigenvector corresponding to the smallest eigenvalue of H^Leg, checking to make sure its norm is one (so the coefficients are of order one). Plot the corresponding polynomial in the range [0, 1]: does it stay small everywhere in the interval? Plot it in a larger range [−1, 2] to contrast its behavior inside and outside the fit interval.

This turns out to be a fundamental property that is shared with many other multiparameter fitting problems. Many different terms are used to describe this property. The fits are called ill-conditioned: the parameters a_n are not well constrained by the data. The inverse problem is challenging: one cannot practically extract the parameters from the behavior of the model. Or, as our group describes it, the fit is sloppy: only a few directions in parameter space (eigenvectors corresponding to the largest eigenvalues) are constrained by the data, and there is a huge space of models (polynomials) varying along sloppy directions that all serve well in describing the data.

At root, the problem with polynomial fits is that all monomials x^n have similar shapes on [0, 1]: they all start flat near zero and bend upward. Thus they can be traded for one another; the coefficient of x⁴ can be lowered without changing the fit if the coefficients of x³ and x⁵ are suitably adjusted to compensate. Indeed, if we change basis from the coefficients a_n of the monomials x^n to the coefficients ℓ_n of the orthogonal (shifted Legendre) polynomials, the situation completely changes. The Legendre polynomials are designed to be different in shape (orthogonal), and hence cannot be traded for one another. Their coefficients ℓ_n are thus well determined by the data, and indeed the Hessian for the cost C^Leg in terms of this new basis is the identity matrix.

Numerical Recipes states a couple of times that using equally-spaced points gives ill-conditioned fits. Will the Chebyshev fits, which emphasize the end-points of the interval, give less sloppy coefficients? The Chebyshev polynomial for a function f (in the limit where many terms are kept) minimizes a different cost: the squared difference weighted by the extra factor 1/√(x(1 − x)):

C^Cheb = ∫_0^1 (f(x) − Σ_{m=0}^{M} c_m x^m)² / √(x(1 − x)) dx.    (4.19)

One can show that the Hessian giving the dependence of C^Cheb on c_m is

H^Cheb_{mn} = ∂²C^Cheb/∂c_m∂c_n = 2^{1−2(m+n)} π (2(m + n) − 1)! / [(m + n − 1)! (m + n)!],    (4.20)

with H^Cheb_{00} = π (doing the integral explicitly, or taking the limit m, n → 0).

(e) (Optional for 480) Calculate the eigenvalues of H^Cheb for fifth-degree polynomial fits.

So, sloppiness is not peculiar to least-squares fits; Chebyshev polynomial coefficients are sloppy too.
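A short sketch for parts (b)–(d), assuming Python with numpy and matplotlib: build the Hilbert-matrix Hessian of eqn 4.18, examine its eigenvalue range, and plot the polynomial built from the eigenvector with the smallest eigenvalue.

import numpy as np
import matplotlib.pyplot as plt

def hilbert(N):
    """Hessian H_mn = 1/(m+n+1) for degree N-1 polynomial fits (eqn 4.18)."""
    m, n = np.meshgrid(np.arange(N), np.arange(N), indexing='ij')
    return 1.0 / (m + n + 1.0)

# (b) eigenvalues and condition number of the 6x6 Hessian
vals, vecs = np.linalg.eigh(hilbert(6))      # eigenvalues in ascending order
print(vals, vals[-1] / vals[0])

# (c) larger Hilbert matrices; nonpositive eigenvalues are pure rounding noise
for N in (8, 12, 16, 20):
    ev = np.sort(np.linalg.eigvalsh(hilbert(N)))[::-1]
    plt.semilogy(ev, 'o-', label=f'N={N}')   # nonpositive values drop off the log plot
plt.legend(); plt.show()

# (d) polynomial from the sloppiest (smallest-eigenvalue) unit-norm eigenvector
a = vecs[:, 0]
x = np.linspace(-1.0, 2.0, 400)
poly = sum(a[m] * x**m for m in range(6))
plt.plot(x, poly)
plt.axvspan(0.0, 1.0, alpha=0.2)             # shade the fit interval [0, 1]
plt.show()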

(4.13) Conservative differential equations: Accuracy and fidelity. (Ordinary differential equations) ©3
In this exercise, we will solve for the motion of a particle of mass m = 1 in the potential

V(y) = (1/8) y² (−4A² + log²(y²)).    (4.21)


That is,

d²y/dt² = −dV/dy = −(1/4) y (−4A² + 2 log(y²) + log²(y²)).    (4.22)

We will start the particle at y₀ = 1, v₀ = dy/dt|₀ = −A, and choose A = 6.

(a) Show that the solution to this differential equation is¹³

F(t) = exp(−6 sin(t)).    (4.23)

Note that the potential energy V(y) is zero at the five points y = 0, y = ±exp(±A).

(b) Plot the potential energy for −(3/2) exp(±A) < y < (3/2) exp(±A) (both zoomed in near y = 0 and zoomed out). The correct trajectory should oscillate in the potential well with y > 0, turning at two points whose energy is equal to the initial total energy. What is this initial total energy for our particle? How much of an error in the energy would be needed, for the particle to pass through the origin when it returns? Compare this error to the maximum kinetic energy of the particle (as it passes the bottom of the well). Small energy errors in our integration routine can thus cause significant changes in the trajectory.

(c) (Black-box) Using a professionally written black-box differential equation solver of your choice, solve for y(t) between zero and 4π, at high precision. Plot your answer along with F(t) from part (a). (About half of the solvers will get the answer qualitatively wrong, as expected from part (b).) Expand your plot to examine the region 0 < y < 1; is energy conserved? Finally, read the documentation for your black box routine, and describe the combination of algorithms being used. You may wish to implement and compare more than one algorithm, if your black-box has that option.
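One possible black-box setup for part (c), assuming Python with scipy's solve_ivp as the packaged solver; the right-hand side is eqn 4.22 with A = 6, and F(t) from part (a) is overlaid for comparison.

import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

A = 6.0

def rhs(t, state):
    y, v = state
    a = -(1.0 / 4.0) * y * (-4.0 * A**2 + 2.0 * np.log(y**2) + np.log(y**2)**2)
    return [v, a]

t = np.linspace(0.0, 4.0 * np.pi, 2000)
sol = solve_ivp(rhs, (0.0, 4.0 * np.pi), [1.0, -A], t_eval=t, rtol=1e-9, atol=1e-12)

plt.plot(sol.t, sol.y[0], label='black-box solver')
plt.plot(t, np.exp(-6.0 * np.sin(t)), '--', label='exact F(t)')
plt.legend(); plt.show()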

Choosing an error tolerance for your differential equation limits the error in each time step. If small errors in early time steps lead to important changes later, your solution may be quite different from the correct one. Chaotic motion, for example, can never be accurately simulated on a computer. All one can hope for is a faithful simulation – one where the motion is qualitatively similar to the real solution. Here we find an important discrepancy – the energy of the numerical solution is drifting upward or downward, where energy should be exactly conserved in the true solution. Here we get a dramatic change in the trajectory for a small quantitative error, but any drift in the energy is qualitatively incorrect.

The Leapfrog algorithm is a primitive looking method for solving for the motion of a particle in a potential. It calculates the next position from the previous two:

y(t + h) = 2y(t) − y(t − h) + h² f[y(t)]    (4.24)

where f[y] = −dV/dy. Given an initial position y(0) and an initial velocity v(0), one can initialize and finalize

y(h) = y(0) + h(v(0) + (h/2) f[y(0)])
v(T) = (y(T) − y(T − h))/h + (h/2) f[y(T)]    (4.25)

Leapfrog is one of a variety of Verlet algorithms. In a more general context, this is called Stoermer's rule, and can be extrapolated to zero step-size h as in the Bulirsch–Stoer algorithms. A completely equivalent algorithm, with more storage but less roundoff error, is given by computing the velocity v at the midpoints of each time step:

v(h/2) = v(0) + (h/2) f[y(0)]
y(h) = y(0) + h v(h/2)
v(t + h/2) = v(t − h/2) + h f[y(t)]
y(t + h) = y(t) + h v(t + h/2)    (4.26)

where we may reconstruct v(t) at integer time steps with v(t) = v(t − h/2) + (h/2) f[y(t)].

(d) (Leapfrog and symplectic methods) Show that y(t + h) and y(h) from eqns 4.24 and 4.25 converge to the true solutions as h → 0 – that is, the time-step error compared to the solution of d²y/dt² = f[y] vanishes faster than h. To what order in h are they accurate after one time step?¹⁴ Implement Leapfrog, and apply it to solving equation 4.22 in the range 0 < t < 4π, starting with step size h = 0.01.

13 I worked backward to do this. I set the kinetic energy to (1/2)(dF/dt)², set the potential energy to minus the kinetic energy, and then substituted y for t by solving y = F(t).
14 Warning: for subtle reasons, the errors in Leapfrog apparently build up quadratically as one increases the number of time steps, so if your error estimate after one time step is hⁿ the error after N = T/h time steps can't be assumed to be N hⁿ ∼ hⁿ⁻¹, but is actually N² hⁿ ∼ hⁿ⁻².


How does the accuracy compare to your more sophisticated integrator, for times less than t = 2? Zoom in to the range 0 < y < 1, and compare with your packaged integrator and the true solution. Which has the more accurate period (say, by measuring the time to the first minimum in y)? Which has the smaller energy drift (say, by measuring the change in depth between subsequent minima)?
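A minimal Leapfrog sketch for part (d), assuming Python with numpy, written in the midpoint-velocity form of eqn 4.26 for the same potential; the function names are mine.

import numpy as np

A = 6.0

def f(y):
    """f[y] = -dV/dy for the potential of eqn 4.21."""
    return -(1.0 / 4.0) * y * (-4.0 * A**2 + 2.0 * np.log(y**2) + np.log(y**2)**2)

def leapfrog(y0, v0, h, T):
    """Leapfrog in the velocity (midpoint) form of eqn 4.26; returns times and positions."""
    n = int(round(T / h))
    y = np.empty(n + 1)
    y[0] = y0
    vhalf = v0 + 0.5 * h * f(y0)          # v(h/2)
    for i in range(n):
        y[i + 1] = y[i] + h * vhalf       # y(t+h) = y(t) + h v(t+h/2)
        vhalf = vhalf + h * f(y[i + 1])   # v(t+3h/2) = v(t+h/2) + h f[y(t+h)]
    return np.arange(n + 1) * h, y

# Example usage:
# t, y = leapfrog(1.0, -A, 0.01, 4.0 * np.pi)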

Fidelity is often far more important than accuracy in numerical simulations. Having an algorithm that has a small time-step error but gets the behavior qualitatively wrong is less useful than a cruder answer that is faithful to the physics. Leapfrog here is capturing the oscillating behavior far better than vastly more sophisticated algorithms, even though it gets the period wrong.

How does Leapfrog do so well? Systems like particle motion in a potential are Hamiltonian systems. They not only conserve energy, but they also have many other striking properties like conserving phase-space volume (Liouville's theorem, the basis for statistical mechanics). Leapfrog, in disguise, is also a Hamiltonian system. (Eqns 4.26 can be viewed as a composition of two canonical transformations – one advancing the velocities at fixed positions, and one advancing the positions at fixed velocities.) Hence it exactly conserves an approximation to the energy – and thus doesn't suffer from energy drift, satisfies Liouville's theorem, etc. Leapfrog and the related Verlet algorithm are called symplectic because they conserve the symplectic form that mathematicians use to characterize Hamiltonian systems.

It is often vastly preferable to do an exact simulation (apart from rounding errors) of an approximate system, rather than an approximate analysis of an exact system. That way, one can know that the results will be physically sensible (and, if they are not, that the bug is in your model or implementation, and not a feature of the approximation).


Chapter 5: Quantum

Exercises

These exercises were designed for a graduate quantum mechanics course at Cornell.

(5.1) Quantum Notation. (Notation) ©2
For each entry in the first column, match the corresponding entries in the second and third columns.

Schrodinger:
  ψ(x)
  ψ*(x)
  |ψ(x)|²
  ∫ φ*(x) ψ(x) dx
  |∫ φ*(x) ψ(x) dx|²
  ∫ |ψ(x)|² dx
  (ℏ/2mi)(ψ*∇ψ − ψ∇ψ*)

Dirac:
  A. ⟨ψ|x⟩ p/2m ⟨x|ψ⟩ − ⟨x|ψ⟩ p/2m ⟨ψ|x⟩
  B. Ket |ψ⟩ in position basis, or ⟨x|ψ⟩
  C. Bra ⟨ψ| in position basis, or ⟨ψ|x⟩
  D. ⟨ψ|x⟩⟨x|ψ⟩
  E. Braket ⟨φ|ψ⟩
  F. ⟨ψ|ψ⟩
  G. ⟨ψ|φ⟩⟨φ|ψ⟩

Physics:
  I. The number one
  II. Probability density at x
  III. Probability ψ is in state φ
  IV. Amplitude of ψ at x
  V. Amplitude of ψ in state φ
  VI. Current density
  VII. None of these.

(5.2) Light Proton Atomic Size. (Dimensional Analysis) ©3
In this exercise, we examine a parallel world where the proton and neutron masses are equal to the electron mass, instead of ∼2000 times larger.

In solving the hydrogen atom in your undergraduate quantum course, you may have noted that in going from the 6-dimensional electron-proton system into the three-dimensional center-of-mass coordinates, the effective mass gets shifted to a reduced mass m_red = μ = 1/(1/m_e + 1/m_nucleus), and is otherwise the hydrogen potential with a fixed (infinite-mass) nucleus. Let us assume that the atomic sizes and the excitation energies are determined solely by this mass shift.

What is the reduced mass for the hydrogen atom in the parallel world of light protons, compared to the electron mass? How much larger will the atom be? How much will the binding energy of the atom change? (You may approximate M_p ∼ ∞ when appropriate.) (Units hint: [ℏ] = ML²/T, [ke²] = Energy · L = ML³/T², and [m_e] = M. Here k = 1 in CGS units, and k = 1/(4πε₀) in SI units.)

(5.3) Light Proton Tunneling. (Dimensional Analysis) ©3

Fig. 5.1 Atom tunneling. A hydrogen atom tunnels a distance a₀, breaking a bond of strength E_bind equal to its ionization energy.


In this exercise, we continue to examine a parallel world where the proton and neutron masses are equal to the electron mass, instead of ∼2000 times larger.

With everything two thousand times lighter, will atomic tunneling become important? Let's make a rough estimate of the tunneling suppression (given by the approximate WKB formula exp(−√(2MV) Q/ℏ)).

Imagine an atom hopping between two positions, breaking and reforming a chemical bond in the process – an electronic energy barrier, and an electronic-scale distance. The distance will be some fraction of a Bohr radius a₀ and the barrier energy will be some fraction of a Rydberg, but the atomic mass would be some multiple of the proton mass.

In our world, what would the suppression factor be for a hydrogen atom of mass ∼M_p tunneling through a barrier of height V of one Rydberg = ℏ²/(2m_e a₀²), and width Q equal to the Bohr radius a₀? How would this change in the parallel world where M_p → m_e? (Simplify your answer as much as possible.) (Use the real-world¹ a₀ and Rydberg for the parallel world, not your answers from a previous exercise. Also please use the simple formula above: don't do the integral. Your answer should involve only two of the fundamental constants.)

(5.4) Light Proton Superfluid. (Dimensional Analysis) ©3
In this exercise, we yet again examine a parallel world where the proton and neutron masses are equal to the electron mass, instead of ∼2000 times larger.

Fig. 5.2 Atoms with imaginary box of size equal to average space per atom.

Atoms Bose condense when the number density n is high and the temperature T is low. One can view this condensation point as where temperature and the confinement energy become competitive. Up to a constant of order one, the thermal energy at the transition temperature k_B T_c equals the confinement energy needed to put a helium atom in a box with impenetrable walls of length the mean nearest-neighbor spacing L = n^{−1/3}.

Calculate the ratio of the confinement energy E_{He,ours} of helium in our world and of water E_{H2O,light} in the parallel world. Helium goes superfluid at ∼2 K in our universe. From that, estimate the superfluid transition temperature T_c for water in the parallel universe. Will it be superfluid at room temperature? (Assume the box sizes for He in our world and water in the parallel world are the same.² Room temperature is about 300 K. Helium 4, the isotope that goes superfluid, has two neutrons and two protons, with roughly equal masses. Oxygen 16 has eight protons, eight neutrons, and eight electrons.)

(5.5) Aharonov-Bohm Wire. (Parallel Transport) ©3
What happens to the electronic states in a thin metal loop as a magnetic flux Φ_B is threaded through it? This was a big topic in the mid-1980's, with experiments suggesting that the loops would develop a spontaneous current, that depended on the flux Φ_B/Φ₀, with Φ₀ = hc/e the 'quantum of flux' familiar from the Bohm-Aharonov effect. In particular, Nandini Trivedi worked on the question while she was a graduate student here:

Nandini Trivedi and Dana Browne, 'Mesoscopic ring in a magnetic field: Reactive and dissipative response', Phys. Rev. B 38, 9581-9593 (1988);

she’s now a faculty member at Ohio State.Some of the experiments clearly indicated thatthe periodicity in the current went as Φ0/2 =hc/2e – half the period demanded by Bohm andAharonov from fundamental principles. (Thisis OK; having a greater period would cause oneto wonder about fractional charges.) Others

1 The reduced mass effects you found in the earlier exercise will be much less important for larger atoms and molecules, so we shall not include them here.
2 The number of molecules of water per unit volume is comparable to the number of atoms of liquid helium per unit volume in the real world, and the water molecule will stay about the same size in the parallel world.


Others found (noisier) periods of Φ₀. Can we do a free-particle-on-a-ring calculation to see if for some reason we get half the period too?

Consider a thin wire of radius R along x² + y² = R². Let a solenoid containing magnetic flux Φ_B, thin compared to R, lie along the z axis. Let φ be the angle around the circle with respect to the positive x-axis. (Don't confuse the flux Φ_B with the angle φ!) We'll assume the wire confines the electron to lie along the circle, so we're solving a one-dimensional Schrodinger's equation along the coordinate s = Rφ around the circle. Assume the electrons experience a random potential V(s) along the circumference C = 2πR of the wire.

(a) Choose a gauge for A so that it points along φ. What is the one-dimensional time-independent Schrodinger equation giving the eigenenergies for electrons on this ring? What are the boundary conditions for the electron wavefunction ψ at s = 0 and s = C? (Hint: the wire is a circle; nothing fancy yet. I'm not asking you to solve the equation – only to write it down.)

Deriving the Bohm-Aharonov effect using Schrodinger's equation is easiest done using a singular gauge transformation.

(b) Consider the gauge transformation Λ(r, φ) = −φ Φ_B/(2π). Show that A′ = A + ∇Λ is zero along the wire for 0 < s < C, so that we are left with a zero-field Schrodinger equation. What happens at the endpoint C? What is the new boundary condition for the electron wave function ψ′ after this gauge transformation? Does the effect vanish for Φ_B = nΦ₀ for integer n, as the Bohm-Aharonov effect says it should?

Realistically, the electrons in a large, room-temperature wire get scattered by phonons or electron-hole pairs (effectively, a quantum measurement of sorts) long before they propagate around the whole wire, so these effects were only seen experimentally when the wires were cold (to reduce phonons and electron-hole pairs) and 'mesoscopic' (tiny, so the scattering length is comparable to or larger than the circumference). Finally, let's assume free electrons, so V(s) = 0. What's more, to make things simpler, let's imagine that there is only one electron in the wire.

(c) Ignoring the energy needed to confine the electrons into the thin wire, solve the one-dimensional Schrodinger equation to give the ground state of the electron as a function of Φ_B. Plot the current in the wire as a function of Φ_B. Is it periodic with period Φ₀, or periodic with period Φ₀/2?

In the end, it was determined that there were two classes of experiments. Those that measured many rings at once (measuring an average current, an easier experiment) got periodicity of hc/2e, while those that attempted the challenge of measuring one mesoscopic ring at a time found hc/e.

(5.6) Anyons. (Statistics) ©3

Frank Wilczek, "Quantum mechanics of fractional-spin particles", Phys. Rev. Lett. 49, 957 (1982).
Steven Kivelson, Dung-Hai Lee, and Shou-Cheng Zhang, "Electrons in Flatland", Scientific American, March 1996.

In quantum mechanics, identical particles are truly indistinguishable (Fig. 5.3). This means that the wavefunction for these particles must return to itself, up to an overall phase, when the particles are permuted:

Ψ(r₁, r₂, · · · ) = exp(iχ) Ψ(r₂, r₁, · · · ),    (5.1)

where · · · represents potentially many other identical particles.

We can illustrate this with a peek at an advanced topic mixing quantum field theory and relativity. Here is a scattering event of a photon off an electron, viewed in two reference frames; time is vertical, a spatial coordinate is horizontal. On the left we see two 'different' electrons, one which is created along with an anti-electron or positron e⁺, and the other which later annihilates the positron. On the right we see the same event viewed in a different reference frame; here there is only one electron, which scatters two photons. (The electron is virtual, moving faster than light, between the collisions; this is allowed in intermediate states for quantum transitions.)

3 This idea is due to Feynman's thesis advisor, John Archibald Wheeler. As Feynman quotes in his Nobel lecture, I received a telephone call one day at the graduate college at Princeton from Professor Wheeler, in which he said, "Feynman, I know why all electrons have the same charge and the same mass." "Why?" "Because, they are all the same electron!" And, then he explained on the telephone, "suppose that the world lines which we were ordinarily considering before in time and space - instead of only going up in time were a tremendous knot, and then, when we cut through the knot, by the plane corresponding to a fixed time, we would see many, many world lines and that would represent many electrons, except for one thing. If in one section this is an ordinary electron world line, in the section in which it reversed itself and is coming back from the future we have the wrong sign to the proper time - to the proper four velocities - and that's equivalent to changing the sign of the charge, and, therefore, that part of a path would act like a positron."


The two electrons on the left are not only indistinguishable, they are the same particle! The antiparticle is also the electron, traveling backward in time.³

Fig. 5.3 Feynman diagram: identical particles.

In three dimensions, χ must be either zero or π, corresponding to bosons and fermions. In two dimensions, however, χ can be anything: anyons are possible! Let's see how this is possible.

In a two-dimensional system, consider changing from coordinates r₁, r₂ to the center-of-mass vector R = (r₁ + r₂)/2, the distance between the particles r = |r₂ − r₁|, and the angle φ of the vector between the particles with respect to the x axis. Now consider permuting the two particles counter-clockwise around one another, by increasing φ at fixed r. When φ = 180° ≡ π, the particles have exchanged positions, leading to a boundary condition on the wavefunction

Ψ(R, r, φ, · · · ) = exp(iχ) Ψ(R, r, φ + π, · · · ).    (5.2)

Permuting them clockwise (backward along the same path) must then⁴ give Ψ(R, r, φ, · · · ) = exp(−iχ) Ψ(R, r, φ − π, · · · ). This in general makes for a many-valued wavefunction (similar to Riemann sheets for complex analytic functions).

Why can't we get a general χ in three dimensions?

(a) Show, in three dimensions, that exp(iχ) = ±1, by arguing that a counter-clockwise rotation and a clockwise rotation must give the same phase. (Hint: The phase change between φ and φ + π cannot change as we wiggle the path taken to swap the particles, unless the particles hit one another during the path. Try rotating the counter-clockwise path into the third dimension: can you smoothly change it to clockwise? What does that imply about exp(iχ)?)

Fig. 5.4 Braiding of paths in two dimensions. In two dimensions, one can distinguish swapping clockwise from counter-clockwise. Particle statistics are determined by representations of the Braid group, rather than the permutation group.

Figure 5.4 illustrates how in two dimensions rotations by π and −π are distinguishable; the trajectories form 'braids' that wrap around one another in different ways. You can't change from a counter-clockwise braid to a clockwise braid without the braids crossing (and hence the particles colliding).

An angular boundary condition multiplying by a phase should seem familiar: it's quite similar to that of the Bohm-Aharonov effect we studied in exercise 2.4. Indeed, we can implement fractional statistics by producing composite particles, by threading a magnetic flux tube of strength Φ through the center of each 2D boson, pointing out of the plane.

(b) Remind yourself of the Bohm-Aharonov phase incurred by a particle of charge e encircling counter-clockwise a tube of magnetic flux Φ. If a composite particle of charge e and flux Φ encircles another identical composite particle, what will the net Bohm-Aharonov phase be? (Hint: You can view the moving particle as being in a fixed magnetic field of all the other particles. The moving particle doesn't feel its own flux.)

4 The phase of the wave-function doesn't have to be the same for the swapped particles, but the gradient of the phase of the wavefunction is a physical quantity, so it must be minus for the counter-clockwise path what it was for the clockwise path.


(c) Argue that the phase change exp(iχ) upon swapping two particles is exactly half that found when one particle encircles the other. How much flux is needed to turn a boson into an anyon with phase exp(iχ)? (Hint: The phase change can't depend upon the precise path, so long as it braids the same way. It's homotopically invariant, see chapter 9 of "Entropy, Order Parameters, and Complexity".)

Anyons are important in the quantum Hall effect. What is the quantum Hall effect? At low temperatures, a two dimensional electron gas in a perpendicular magnetic field exhibits a Hall conductance that is quantized, when the filling fraction ν (electrons per unit flux in units of Φ₀) passes near integer and rational values.

Approximate the quantum Hall system as a bunch of composite particles made up of electrons bound to flux tubes of strength Φ₀/ν. As a perturbation, we can imagine later relaxing the binding and allowing the field to spread uniformly.⁵

(d) Composite bosons and the integer quantum Hall effect. At filling fraction ν = 1 (the 'integer' quantum Hall state), what are the effective statistics of the composite particle? Does it make sense that the (ordinary) resistance in the quantum Hall state goes to zero?

• The excitations in the fractional quantum Hall effect are anyons with fractional charge. (The ν = 1/3 state has excitations of charge e/3, like quarks, and their wavefunctions gain a phase exp(iπ/3) when excitations are swapped.)

• It is conjectured that, at some filling fractions, the quasiparticles in the fractional quantum Hall effect have non-abelian statistics, which could become useful for quantum computation.

• The composite particle picture is a central tool both conceptually and in calculations for this field.

(5.7) Bell.⁶ (Quantum, Qbit) ©3
Consider the following cooperative game played by Alice and Bob: Alice receives a bit x and Bob receives a bit y, with both bits uniformly random and independent. The players win if Alice outputs a bit a and Bob outputs a bit b, such that (a + b = xy) mod 2. They can agree on a strategy in advance of receiving x and y, but no subsequent communication between them is allowed.

(a) Give a deterministic strategy by which Alice and Bob can win this game with 3/4 probability.

(b) Show that no deterministic strategy lets them win with more than 3/4 probability. (Note that Alice has four possible deterministic strategies [0, 1, x, ∼x], and Bob has four [0, 1, y, ∼y], so there's a total of 16 possible joint deterministic strategies.)

(c) Show that no probabilistic strategy lets them win with more than 3/4 probability. (In a probabilistic strategy, Alice plays her possible strategies with some fixed probabilities p₀, p₁, p_x, p_{∼x}, and similarly Bob plays his with probabilities q₀, q₁, q_y, q_{∼y}.)

The upper bound of 75% on the fraction of the time that Alice and Bob can win this game provides, in modern terms, an instance of the Bell inequality, where their prior cooperation encompasses the use of any local hidden variable. Let's see how they can beat this bound of 3/4, by measuring respective halves of an entangled state, thus quantum mechanically violating the Bell inequality.⁷

5 This is not nearly as crazy as modeling metals and semiconductors as non-interacting electrons, and adding the electron interactions later. We do that all the time – 'electrons and holes' in solid-state physics, '1s, 2s, 2p' electrons in multi-electron atoms, all have obvious meanings only if we ignore the interactions. Both the composite particles and the non-interacting electron model are examples of how we use adiabatic continuity – you find a simple model you can solve, that can be related to the true model by turning on an interaction.
6 This exercise was developed by Paul Ginsparg, based on an example by Bell '64 with simplifications by Clauser, Horne, Shimony, & Holt ('69).
7 There's another version for the GHZ state, where three people have to get a + b + c mod 2 = x or y or z. Again one can achieve only 75% success classically, but they can win every time sharing the right quantum state.


Suppose Alice and Bob share the entangled state (1/√2)(|↑⟩ℓ|↑⟩r + |↓⟩ℓ|↓⟩r), with Alice holding the left Qbit and Bob holding the right Qbit. Suppose they use the following strategy: if x = 1, Alice applies the unitary matrix

R_{π/6} = ( cos(π/6)  −sin(π/6)
            sin(π/6)   cos(π/6) )

to her Qbit, otherwise doesn't, then measures in the standard basis and outputs the result as a. If y = 1, Bob applies the unitary matrix

R_{−π/6} = (  cos(π/6)  sin(π/6)
             −sin(π/6)  cos(π/6) )

to his Qbit, otherwise doesn't, then measures in the standard basis and outputs the result as b. (Note that if the Qbits were encoded in photon polarization states, this would be equivalent to Alice and Bob rotating measurement devices by π/6 in opposite directions before measuring.)

(d) Using this strategy:
(i) Show that if x = y = 0, then Alice and Bob win the game with probability 1.
(ii) Show that if x = 1 and y = 0 (or vice versa), then Alice and Bob win with probability 3/4.
(iii) Show that if x = y = 1, then Alice and Bob win with probability 3/4.
(iv) Combining parts (i)–(iii), conclude that Alice and Bob win with greater overall probability than would be possible in a classical universe.
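A numerical check of the probabilities in part (d), assuming Python with numpy: build the entangled state, apply the rotations for each input pair (x, y), and add up the probabilities of the winning outcomes a + b ≡ xy (mod 2).

import numpy as np

def R(theta):
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

# amplitudes psi[a, b] of the shared state (|up,up> + |down,down>)/sqrt(2)
psi = np.eye(2) / np.sqrt(2.0)

def win_probability(x, y, alice=np.pi/6, bob=-np.pi/6):
    UA = R(alice) if x == 1 else np.eye(2)
    UB = R(bob) if y == 1 else np.eye(2)
    amp = UA @ psi @ UB.T                 # amplitudes after (U_A tensor U_B) acts on the state
    prob = np.abs(amp)**2
    return sum(prob[a, b] for a in (0, 1) for b in (0, 1) if (a + b) % 2 == (x * y) % 2)

for x in (0, 1):
    for y in (0, 1):
        print(x, y, win_probability(x, y))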

This proves an instance of the CHSH/Bell Inequality, establishing that "spooky action at a distance" cannot be removed from quantum mechanics. Alice and Bob's ability to win the above game more than 3/4 of the time using quantum entanglement was experimentally confirmed in the 1980s (A. Aspect et al.).⁸

(e) (Bonus) Consider a slightly different strategy, in which before measuring her half of the entangled pair Alice does nothing or applies R_{π/4}, according to whether x is 0 or 1, and Bob applies R_{π/8} or R_{−π/8}, according to whether y is 0 or 1. Show that this strategy does even better than the one analyzed above, with an overall probability of winning equal to cos²(π/8) = (1 + √(1/2))/2 ≈ 0.854.

(Extra bonus) Show this latter strategy is optimal within the general class of strategies in which before measuring Alice applies R_{α₀} or R_{α₁}, according to whether x is 0 or 1, and Bob applies R_{β₀} or R_{β₁}, according to whether y is 0 or 1.

This will demonstrate that no local hidden variable theory can reproduce all predictions of quantum mechanics for entangled states of two particles.

(5.8) Parallel Transport, Frustration, and the Blue Phase. (Liquid Crystals) ©3

"Relieving Cholesteric Frustration: The Blue Phase in a Curved Space," J. P. Sethna, D. C. Wright and N. D. Mermin, Phys. Rev. Lett. 51, 467 (1983).
"Frustration, Curvature, and Defect Lines in Metallic Glasses and the Cholesteric Blue Phases," James P. Sethna, Phys. Rev. B 31, 6278 (1985).
"Frustration and Curvature: the Orange Peel Carpet", http://www.lassp.cornell.edu/sethna/FrustrationCurvature/
"The Blue Phases, Frustrated Liquid Crystals and Differential Geometry", http://www.lassp.cornell.edu/sethna/LiquidCrystals/BluePhase/BluePhases.html.

(Optional: for those wanting a challenge.) Both the Aharonov-Bohm effect and Berry's phase (later) are generalizations of the idea of parallel transport. Parallel transport, from differential geometry, tells one how to drag tangent vectors around on smooth surfaces. Just as we discovered that dragging the phase of a wavefunction around a closed loop in space gave it a net rotation due to the magnetic field (Aharonov-Bohm), the phase can rotate also when the Hamiltonian is changed around a closed curve in Hamiltonian space (Berry's phase). Here we discuss how vectors rotate as one drags them around closed loops, leading us to the curvature tensor.

(a) Parallel transport on the sphere. Imagine you're in St. Louis (longitude 90° W, latitude ∼40° N), pointing north.

8 Ordinarily, an illustration of these inequalities would appear in the physics literature not as a game but as a hypothetical experiment. The game formulation is more natural for computer scientists, who like to think about different parties optimizing their performance in various abstract settings. As mentioned, for physicists the notion of a classical strategy is the notion of a hidden variable theory, and the quantum strategy involves setting up an experiment whose statistical results could not be predicted by a hidden variable theory.


You walk around a triangle, first to the North Pole, then take a right-angle turn, walk down through Greenwich, England and Madrid, Spain (longitude ∼0° W) down to the equator, turn right, walk back to longitude 90° W along the equator, turn north, and walk back to St. Louis. All during the walk, you keep pointing your finger in the same direction as far as feasible (i.e., straight ahead on the first leg, off to the left on the second, and so on). What angle does the vector formed by your final pointing finger make with respect to your original finger? If you turned west in Madrid and walked along that latitude (∼40° N) to St. Louis (yes, it's just as far north), would the angle be the same? (Hint: what about transport around a tiny quarter circle centered at the north pole?)

Lightning Intro to Differential Geometry. Parallel transport of vectors on surfaces is described using a covariant derivative

(D_i v)^j = ∂_i v^j + Γ^j_{ik} v^k

involving the Christoffel symbol Γ^j_{ik}. The amount a vector changes when taken around a tiny loop ∆s ∆t (−∆s)(−∆t) is given by the Riemannian curvature tensor and the area of the loop:

v′^i − v^i = R^i_{jkℓ} v^j ∆s^k ∆t^ℓ.    (5.3)

The algebra gets messy on the sphere, though (spherical coordinates are pretty ugly). Instead, we'll work with a twisty kind of parallel transport, that my advisor and I figured out describes how molecules of the Blue Phase like to locally align. These long, thin molecules locally line up with their axes in a direction n(r), and are happiest when that axis twists so as to make zero the covariant derivative

(D_i n)_j = ∂_i n_j − q ε_{ijk} n_k,    (5.4)

where ε_{ijk} is the totally antisymmetric tensor⁹ with three indices (Fig. 5.5 and 5.6). Since the Blue Phases are in flat space, you don't need to worry about the difference between upper and lower indices.

Fig. 5.5 Blue Phase Molecules, long thin molecules aligned along an axis n(r), like to sit at a slight angle with respect to their neighbors. To align the threads and grooves on the touching surface between the molecules demands a slight twist. This happens because, like threaded bolts or screws, these molecules are chiral.

Fig. 5.6 Twisted Parallel Transport. The natural parallel transport, (D_i n)_j = ∂_i n_j − q ε_{ijk} n_k, twists perpendicular to the long axis, but doesn't twist when moving along the axis.

9 Hence Γ^j_{ik} = −q ε_{ijk}. Warning: this connection has torsion: it's not symmetric under interchange of the two bottom indices. This is what allows it to have a non-zero curvature even when the liquid crystal lives in flat space. But it will make some of the traditional differential geometry formulas incorrect, which usually presume zero torsion.


(b) What direction will a molecule with n = x̂ like to twist as one moves along the x-direction? The y-direction? The z-direction? Is the local low energy structure more like that of Fig. 5.7(a) (the low-temperature state of this system), or that of Fig. 5.7(b)?


Fig. 5.7 Local structures of cholesteric liquid crystals. (a) Helical. (b) Tube.

So, what’s the curvature tensor (eqn 5.3) for theblue phase? I figured out at the time that itcame out to be

Rijkℓ = q2(δiℓδkj − δikδℓj). (5.5)

This curvature means that the blue phases are frustrated; you can't fill space everywhere with material that's 'happy', with Dn = 0.

(c) If n starts at x̂, and we transport along the tiny loop ∆x, ∆y, −∆x, −∆y, calculate the vector n′ as it returns to the origin according to the curvature tensor (eqns 5.3 and 5.5). Is the shift in n along the same direction we observe for the larger loop in Fig. 5.8?

Fig. 5.8 Parallel transport frustration. We can check your answer to part (c) qualitatively by thinking about a larger loop (Fig. 5.8). Consider starting from two places separated ∆x = d apart along the axis of a double-twisting region (the center of the tube in Fig. 5.7(b)). If we move a distance ∆y = π/(2q) radially outward, the orientation of both will now point along z. Transporting them each inward by d/2 then will cause a further twist.

We can use the curvature tensor of the Blue Phase to calculate a scalar curvature R^{ij}_{ij} = −6q². Thus the blue phases are negatively curved, even though they live in flat space. We looked to see what would happen if we put the blue phases into a positively curved space. Picking the sphere in four dimensions with the correct radius, we could make the curvature (and the frustration) go away, and find the ideal template for the blue phases. We think of the real blue phase as pieces of this ideal template, cut, flattened, and sewn together to fill space, like an orange-peel carpet (Fig. 5.9).


Fig. 5.9 Orange-peel carpet. (Copyright Pamela Davis Kivelson)

(5.9) Crystal field theory: d-orbitals. (Group reps) ©3
The vector space of functions f(x, y, z) on the unit sphere transforms into itself under rotations f(x) →_R f(R⁻¹x). These transformations are linear (a f(x) + g(x) →_R a f(R⁻¹x) + g(R⁻¹x)), and obey the group composition rule, and thus form a representation of the rotation group.

(a) Argue that the homogeneous polynomials of degree ℓ,

f(x, y, z) = Σ_{m=0}^{ℓ} Σ_{n=0}^{ℓ−m} f_{ℓmn} x^m y^n z^{ℓ−m−n},    (5.6)

form a subspace that is invariant under rotations.

Thus the irreducible representations are contained in these invariant subspaces. Sakurai indeed mentions in his section 3.11 on tensor operators that Y_1^0 = √(3/π) z/r and Y_1^{±1} = √(3/2π)(x ± iy)/r, and also gives a formula for Y_2^{±2}; since r = 1 on our unit sphere these are homogeneous polynomials of degree one.

(b) Look up the ℓ = 2 spherical harmonics (e.g. in Sakurai's appendix B) and write them as quadratic polynomials in x, y, and z.

The ℓ = 2 spherical harmonics are the angular parts of the wavefunctions for electrons in d orbitals (e.g. of transition metal atoms).¹⁰ Electrons in d-orbitals are much more tightly contained near the nucleus than p and s orbitals. In molecules and solids, the s and p orbitals usually hybridize (superimpose) into chemical bonds and broad electron bands, where the original orbitals are strongly distorted. In contrast, d-electrons rarely participate in chemical bonds, and their electron bands are narrow – almost undistorted orbitals with small hopping rates. The energy levels of the five d-orbitals are, however, shifted from one another by their environments. (For crystals, these shifts are called crystal field splittings.)

We can use group representation theory to understand how the d-orbitals are affected by their molecular or crystalline environment. First, we need to calculate the character χ(R) = χ(n, φ) of the ℓ = 2 representation. Remember that the character is the trace of the (here 5×5) matrix corresponding to the rotation. Remember that this trace depends only on the conjugacy class of R – that is, if S is some other group element then χ(S⁻¹RS) = χ(R). Remember that any two rotations by the same angle φ are conjugate to one another.¹¹

10 Here we use the common independent-electron language, where the complex many-body wavefunction of an atom, molecule, or solid is viewed as filling single-electron states, even though the electron-electron repulsion is almost as strong as the electron-nuclear attraction. This idea can be dignified in three rather different ways. First, one can view each electron as feeling an effective potential given by the nucleus plus the average density of electrons. This leads to mean-field Hartree-Fock theory. Second, one can show that the ground state energy can be written as an unknown functional of the electron density (the Hohenberg-Kohn theorem), and then calculate the kinetic energy terms as an effective single-body Schrodinger equation in the resulting effective potential due to the net electron density (the Kohn-Sham equations). Third, one can start with independent electrons (or Hartree-Fock electrons) and slowly 'turn on' the electron-electron repulsion. The independent-electron excited eigenstates develop lifetimes and become resonances. For atoms these lifetimes represent Auger decay rates. For crystals these resonances are called quasiparticles and the theory is called Landau Fermi liquid theory. Landau Fermi-liquid theory is usually derived using Green's functions and Feynman diagrams, but it has recently been re-cast as a renormalization-group flow.
11 If the two rotations have axes n and n′, choose S to rotate n′ into n.


In class, we found the character of the ℓ = 1 representation by using the Cartesian x, y, z basis, where

R_z(φ) = ( cos(φ)  −sin(φ)  0
           sin(φ)   cos(φ)  0
             0        0     1 ).

Hence χ^(1)(φ) = 1 + 2 cos(φ). We can do this same calculation in the m_z basis of the spherical harmonics, where Y_1^0 is unchanged under rotations and Y_1^{±1} → exp(±iφ) Y_1^{±1}. Here

R_z(φ) = ( exp(iφ)   0      0
              0      1      0
              0      0   exp(−iφ) ),

and again χ^(1)(φ) = 1 + 2 cos(φ).

2 cos(φ).(c) Calculate χ(2)(φ). Give the characters forrotations Cn by 2π/n, for n = 1, 2, 3, and 4(the important rotations for crystalline symme-try groups.)The most common symmetry groups for d-electron atoms in crystals is O, the octahedralgroup (the symmetry group of a cube). Lookup the character tables for the irreducible repre-sentations of this finite group. (To simplify thecalculation, we’ll assume that inversion symme-try is broken; otherwise we should use Oh, whichhas twice the number of group elements.)(d) Use the orthogonality relations of the charac-ters of irreducible representations for O, decom-pose the ℓ = 2 representation above into irre-ducible representations of the octahedral group.How will the energies of the single-particle d-orbitals of a transition metal atom split in an oc-tahedral environment? What will the degenera-cies and symmetries (A1, A2, E, . . . ) be of thedifferent levels? (Hint: If you are doing it right,the dot product of the characters should equal amultiple of the number of octahedral group el-ements o(O) = 24, and the dimensions of thesub-representations should add up to five.)The five d-orbitals are often denoted dxy, dxz,dyz, dz2 and dx2−y2 . This is a bit of a cheat;really dz2 should be written d2z2−x2−y2 or some-thing like that.(e) Figure out which of these orbitals are in eachof the two representations you found in part (d).(Hint: Check how these five orbitals transformunder the octahedral symmetries that permutex, y, and z among themselves.)

(5.10) Entangled Spins. (Spins) ©3
In class, we studied the entanglement of the singlet spin state |S⟩ = (1/√2)(|↑⟩ℓ|↓⟩r − |↓⟩ℓ|↑⟩r) of electrons of a diatomic molecule as the atoms L and R are separated;¹² the spins on the two atoms are in opposite directions, but the system is in a superposition of the two choices. Here we discuss another such superposition, but with a different relative phase for the two choices:

|χ⟩ = (1/√2)(|↑⟩ℓ|↓⟩r + |↓⟩ℓ|↑⟩r)    (5.7)

You should know from angular momentum addition rules that the space of product wavefunctions of two spin 1/2 states can be decomposed into a spin 1 and a spin 0 piece: 1/2 ⊗ 1/2 = 1 ⊕ 0. So there are two orthogonal eigenstates of S_z with eigenvalue zero: one of total spin zero and one of total spin one.

(a) Total spin. Which state is which? (If you don't know from previous work, calculate!) Why do we call |S⟩ a singlet?

Now, is the spin wavefunction compatible with what we know about electron wavefunctions?

(b) Symmetry. When the two spins are exchanged, how does |χ⟩ change? If the total wavefunction Ψ(x_L, s_L, x_R, s_R) is a product of this spin wavefunction χ(s_L, s_R) and a two-particle spatial wavefunction ψ(x_L, x_R), what symmetry must ψ have under interchange of electrons?

We noted in class that two other spin-1 product states, |↑⟩ℓ|↑⟩r and |↓⟩ℓ|↓⟩r, do not form entangled states when L and R separate. Is |χ⟩ like these spin-1 states, or is it entangled like |S⟩ is?

(c) Entanglement. Give the Schmidt decomposition of |χ⟩. What are the singular values? What is the entanglement entropy? (Hint: The steps should be very familiar from class.)

(d) Singular Value Decomposition (SVD). Let M be the matrix which gives |χ⟩ as a product of left and right spin states:

|χ⟩ = ( |↑⟩ℓ  |↓⟩ℓ ) M ( |↑⟩r
                         |↓⟩r ).    (5.8)

12We assumed that, when separated, one electron is localized basically on each ofthe two atoms, and the spin kets are labeled based on the primary locale of thecorresponding spatial wavefunction for that electron.13Remember that the SVD guarantees that U and V have orthonormal columns, andΣ is a diagonal matrix whose diagonal elements σi are all positive and decreasing (soσi ≥ σi+1 ≥ 0). There is some flexibility in the singular vectors (i.e., matched pairscan both have their signs changed), but the singular values are unique and hence aproperty of the matrix.

Page 117: Copyright James P. Sethna 2011pages.physics.cornell.edu/sethna/StatMech/NewExercises.pdf · 4.13 Conservative differential equations: Accuracy and fidelity 94 ... 6.22 Hairpins

Copyright James P. Sethna 2011 -- Copyright James P. Sethna 2011 --

Exercises 107

composition13 M = UΣV T of the matrixM . Ex-plain how the SVD gives us the Schmidt decom-position of part (c).
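
For readers who want to see the SVD machinery in action, here is a minimal sketch, assuming Python with numpy; the 2×2 matrix used is a made-up example, deliberately not the M of part (d). It shows how the singular values give Schmidt coefficients and an entanglement entropy:

import numpy as np

# A hypothetical 2x2 coefficient matrix, NOT the answer to part (d); it only
# illustrates how an SVD turns such a matrix into Schmidt data.
M_example = np.array([[0.6, 0.0],
                      [0.0, 0.8]])

U, sigma, Vt = np.linalg.svd(M_example)   # M_example = U @ np.diag(sigma) @ Vt
p = sigma**2 / np.sum(sigma**2)           # probabilities are the squared Schmidt coefficients
S = -np.sum(p * np.log(p))                # von Neumann entanglement entropy

print("singular values:", sigma)
print("entanglement entropy:", S)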

(5.11) Rotating Fermions. (Group theory) ©3
In this exercise, we'll explore the geometry of the space of rotations.
Spin-1/2 fermions transform upon rotations under SU(2), the unitary 2×2 matrices with determinant one. Vectors transform under SO(3), the ordinary 3×3 rotation matrices you know of. Sakurai argues that a general SU(2) matrix can be written

U = (  a    b
      −b*   a* )    with |a|² + |b|² = 1.

Viewing Re(a), Im(a), Re(b), Im(b) as a vector in four dimensions, SU(2) then geometrically is the unit sphere S³ in R⁴.
Remember that for every matrix R in SO(3), there are two unitary matrices U and −U corresponding to the same physical rotation. The matrix −U has coordinates (−a, −b) – it is the antipodal point on S³, exactly on the opposite side of the sphere. So SO(3) is geometrically the unit sphere with antipodal points identified. This is called (for obscure reasons) the projective space RP³.
Feynman's plate (in Feynman's plate trick), as it rotates by 360°, travels in rotation space from one orientation to its antipode. While I'm not sure anyone has figured out whether arms, shoulders, and elbows duplicate the properties of fermions under rotations, the plate motion illustrates the possibility of a minus sign.
But we can calculate this trajectory rather neatly by mapping the rotations not to the unit sphere, but to the space R³ of three-dimensional vectors. (Just as the 2-sphere S² can be projected onto the plane, with the north pole going to infinity, so can the 3-sphere S³ be projected onto R³.) Remember the axis-angle variables, where a rotation of angle φ about an axis n is given by

exp(−i S·n φ/ℏ) = exp(−i σ·n φ/2) = exp(−i J·n φ/ℏ),    (5.9)

where the middle formula works for SU(2) (where S = ℏσ/2, because the particles have spin 1/2) and the last formula is appropriate for SO(3).14 We figure n will be the direction of the vector in R³ corresponding to the rotation, but how will the length depend on φ? Since all rotations by 360° are the same, it makes sense to make the length of the vector go to infinity as φ → 360°. We thus define the modified Rodrigues coordinate for a rotation to be the vector p = n tan(φ/4).

(a) Fermions, when rotated by 360°, develop a phase change of exp(iπ) = −1 (as discussed in Sakurai & Napolitano p. 165, and as we illustrated with Feynman's plate trick). Give the trajectory of the modified Rodrigues coordinate for the fermion's rotation as the plate is rotated by 720° about the axis n = ẑ. (We want the continuous trajectory on the sphere S³, which perhaps passes through the point at ∞. Hint: The trajectory is already defined by the modified Rodrigues coordinate: just describe it.)

(b) For a general Rodrigues point p parameterizing a rotation in SO(3), what antipodal point p′ corresponds to the same rotation? (Hint: A rotation by φ and a rotation by φ + 2π should be identified.)
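
To make the coordinate concrete (without answering part (a)), here is a minimal sketch, assuming Python with numpy, of the map from axis–angle variables to the modified Rodrigues coordinate p = n tan(φ/4):

import numpy as np

def modified_rodrigues(axis, phi):
    """Modified Rodrigues coordinate p = n tan(phi/4) for a rotation by phi about 'axis'."""
    n = np.asarray(axis, dtype=float)
    n = n / np.linalg.norm(n)
    return n * np.tan(phi / 4.0)

print(modified_rodrigues([0, 0, 1], np.pi))             # 180 degrees about z: |p| = 1
print(modified_rodrigues([0, 0, 1], 2 * np.pi - 1e-6))  # near 360 degrees: |p| diverges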

(5.12) Lithium ground state symmetry. (Quantum) ©3
A simple model for heavier atoms that is surprisingly useful is to ignore the interactions between electrons (the independent electron approximation):15

H_Z = Σ_{i=1}^{Z} ( p_i²/2m − kε Z e²/r_i ).    (5.10)

Remember that the eigenstates of a single electron bound to a nucleus with charge Z are the hydrogen levels (ψ^Z_n = ψ^Z_1s, ψ^Z_2s, ψ^Z_2p, ...), except shrunken and shifted upward in binding energy (E^Z more negative):

H_Z ψ^Z_n = E^Z_n ψ^Z_n,
ψ^Z_n(r) = ψ^H_n(λ_r r),
E^Z = λ_E E^H.    (5.11)

(a) By what factor λ_r do the wavefunctions shrink? By what factor λ_E do the energies grow? (Hint: Dimensional arguments are preferred over looking up the formulas.)

14 The factor of two is there for SU(2) because the spin is 1/2. For SO(3), the infinitesimal generators are antisymmetric matrices, so S^(i)_jk = J^(i)_jk = iℏ ε_ijk in the xyz basis; in the usual quantum basis mz = (−1, 0, 1) the formula will be different.
15 Here kε = 1 in CGS units, and kε = 1/(4πε0) in SI units. We are ignoring the slight shift in effective masses due to the motion of the nucleus.


In the independent electron approximation, the many-body electron eigenstates are created from products of single-electron eigenstates. The Pauli exclusion principle (which appears useful only in this independent electron approximation) says that exactly two electrons can fill each of the single-particle states.

(b) Ignoring identical particle statistics, show that a product wavefunction

Ψ(r1, r2, r3, ...) = ψ^Z_{n1}(r1) ψ^Z_{n2}(r2) ψ^Z_{n3}(r3) ...    (5.12)

has energy E = Σ_i E^Z_{n_i}.

The effect of the electron–electron repulsion in principle completely destroys this product structure. But for ground-state and excited-state quantum numbers, the language of filling independent electron orbitals is quite useful.16 However, the energies of these states are strongly corrected by the interactions between the electrons.

(c) Consider the 2s and 2p states of an atom with a filled 1s shell (one electron of each spin in 1s states). Which state feels a stronger Coulomb attraction from the nucleus? Argue heuristically that the 2s state will generally have lower (more negative) energy and fill first.

Let's check something I asserted, somewhat tentatively, in lecture. There I said that, for atoms with little spin-orbit coupling, the ground state wavefunction can be factored into a spatial and a spin piece:

Ψ(r1, s1; r2, s2; r3, s3; ...) ?= ψ(r1, r2, r3, ...) χ(s1, s2, s3, ...).    (5.13)

We'll check this in the first non-trivial case – the lithium atom ground state, in the independent electron approximation. From part (c), we know that two electrons should occupy the 1s orbital, and one electron should occupy the 2s orbital. The two spins in the 1s orbital must be antiparallel; let us assume the third spin is pointing up, ↑3:

Ψ0(r1, s1; r2, s2; r3, s3) = ψ^Li_1s(r1) ψ^Li_1s(r2) ψ^Li_2s(r3) ↑1 ↓2 ↑3.    (5.14)

But this combination is not antisymmetric under permutations of the electrons.

(d) Antisymmetrize Ψ0 with respect to electrons 1 and 2. Show that the resulting state is a singlet with respect to these two electrons. Antisymmetrize Ψ0 with respect to all three electrons (a sum of six terms). Does it go to zero (in some obvious way)? Can it be written as a product as in eqn 5.13?
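
A small symbolic sketch of the six-term antisymmetrization mentioned in part (d), assuming Python with sympy; f, g, and h are abstract one-particle states, so this shows only the mechanics and does not presuppose the answer:

from itertools import permutations
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')     # abstract particle labels (space and spin together)
f, g, h = sp.Function('f'), sp.Function('g'), sp.Function('h')
labels = [x1, x2, x3]

# Six-term antisymmetrization of a generic product f(1) g(2) h(3), with the
# permutation sign supplied by the Levi-Civita symbol.
psi = sum(sp.LeviCivita(i, j, k) * f(labels[i]) * g(labels[j]) * h(labels[k])
          for i, j, k in permutations(range(3)))
print(psi)

# Putting two electrons in the same one-particle state makes the sum vanish:
psi_same = sum(sp.LeviCivita(i, j, k) * f(labels[i]) * f(labels[j]) * h(labels[k])
               for i, j, k in permutations(range(3)))
print(sp.simplify(psi_same))            # 0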

(5.13) Matrices, wavefunctions, and group representations. (Group reps) ©3
In this exercise, we shall explore the tensor product of two vector spaces, and how they transform under rotations. We'll draw analogies between two examples: vectors → matrices, and single-particle states → two-particle wavefunctions.
The tensor product between two vectors is (v ⊗ w)_ij = v_i w_j. The tensor product between two single-particle wavefunctions ζ(x) for particle A and φ(y) for particle B is the product wavefunction Ψ(x, y) = ζ(x)φ(y). If H(A) and H(B) are the Hilbert spaces for particles A and B, the tensor product space H(AB) = H(A) ⊗ H(B) is the space of all linear combinations of tensor products Ψ(x, y) = ζ(x)φ(y) of states in H(A) and H(B). Two-particle wavefunctions live in H(AB).
Let e_i = x, y, z be an orthonormal basis for R³, and let ζ_i and φ_j be orthonormal bases for the Hilbert spaces H(A) and H(B) for particles A and B.

(a) Show that the tensor products Ψ_ij(x, y) = ζ_i(x)φ_j(y) are orthonormal. (The dot product is the usual ∫dx dy Ψ*Ψ.) With some more work, it is possible to show that they are also complete, forming an orthonormal basis of H(AB).

Suppose the two particles are both in states with total angular momentum LA = LB = 1, and are then coupled with a small interaction. Angular momentum addition rules then say that the two-particle state can have angular momentum L(AB) equal to 2 or 1 or 0: 1 ⊗ 1 = 2 ⊕ 1 ⊕ 0. In group representation theory, this decomposition corresponds to finding three subspaces that are invariant under rotations.
The tensor product space R³ ⊗ R³ is normally written as 3×3 matrices M_ij, where M = Σ³_{i=1} Σ³_{j=1} M_ij e_i e_j.

16 The excited states of an atom aren't energy eigenstates; they are resonances, with a finite lifetime. If you think of starting with the independent electron eigenstates and gradually turning on the Coulomb interaction and the interaction with photons, the true ground state and the resonances are adiabatic continuations of the single-particle product eigenstates – inheriting their quantum numbers.


Vectors transform under L = 1, so we would expect matrices to decompose into L = 2, 1, and 0.

(b) Show that the antisymmetric matrices, the multiples of the identity matrix, and the traceless symmetric matrices are all invariant under rotation (i.e., R⁻¹MR is in the same subspace as M for any rotation R). Which subspace corresponds to which angular momentum?

(c) Consider the L = 1 subspace of matrices M. Provide the (standard) formula taking this space into vectors in R³. Why are these called pseudovectors?

I always found torque τ = r × F quite mysterious. (Its direction depends on whether you are right- or left-handed!) Fundamentally, we see now that this is because torque isn't a vector – it is an antisymmetric 3×3 matrix.
How does this relate back to quantum wavefunctions? Suppose our two L = 1 particles are identical, with spins in the same state.

(d) Which angular momentum states are allowed for spin-aligned fermions? For spin-aligned or spinless bosons?

Many physical properties are described by symmetric matrices: the dielectric constant in electromagnetism, the stress and strain tensors in elastic theory, and so on.
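
The invariance claims in part (b) are easy to probe numerically before proving them. A minimal check, assuming Python with numpy (the rotation is built from Rodrigues' rotation formula, a standard construction not given in the text):

import numpy as np

def rotation(axis, phi):
    """3x3 rotation by angle phi about the unit vector 'axis' (Rodrigues' formula)."""
    n = np.asarray(axis, dtype=float)
    n = n / np.linalg.norm(n)
    K = np.array([[0.0, -n[2], n[1]],
                  [n[2], 0.0, -n[0]],
                  [-n[1], n[0], 0.0]])
    return np.eye(3) + np.sin(phi) * K + (1.0 - np.cos(phi)) * (K @ K)

R = rotation([1.0, 2.0, 2.0], 0.7)
A = np.array([[0.0, 1.0, -2.0],           # an antisymmetric matrix (a candidate L = 1 piece)
              [-1.0, 0.0, 3.0],
              [2.0, -3.0, 0.0]])
A_rot = R.T @ A @ R                        # R^{-1} A R, since R^{-1} = R^T for a rotation

print(np.allclose(A_rot, -A_rot.T))        # True: antisymmetry survives the rotation
print(np.isclose(np.trace(A_rot), np.trace(A)))  # the trace (L = 0 piece) is unchanged too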

(5.14) Molecular Rotations. (Quantum) ©3
In class, we estimated the frequency of atomic vibrations by generating a simple model of an atom of mass A MP in a harmonic potential whose length and energy scales were set by electron physics (a Bohr radius and a fraction of a Rydberg). In the end, we distilled the answer that atomic vibrations were lower in frequency than those of electrons by a factor √(MP/me), times constants of order one.
Here we consider the frequencies of molecular rotations.

(a) By a similar argument, derive the dependence of molecular rotation energy splittings on the mass ratio MP/me.

(b) Find some molecular rotation energy splittings in the literature. Are they in the range you expect from your estimates of part (a)?

(5.15) Propagators to Path Integrals. (Path Integrals) ©3
In class, we calculated the propagator for free particles, which Sakurai also calculates (eqn 2.6.16):

K(x′, t; x0, t0) = √( m / (2πiℏ(t − t0)) ) exp[ i m (x′ − x0)² / (2ℏ(t − t0)) ].    (5.15)

Sakurai also gives the propagator for the simple harmonic oscillator (eqn 2.6.18):

K(x′, t; x0, t0) = √( mω / (2πiℏ sin[ω(t − t0)]) )
    × exp[ ( imω / (2ℏ sin[ω(t − t0)]) ) ( (x′² + x0²) cos[ω(t − t0)] − 2 x′ x0 ) ].    (5.16)

In deriving the path integral, Feynman approximates the short-time propagator in a potential V(x) using the 'trapezoidal' rule:

K(x0 + ∆x, t0 + ∆t; x0, t0) = N_∆t exp[ (i∆t/ℏ) { (1/2) m (∆x/∆t)² − V(x0) } ],    (5.17)

where the expression in the curly brackets is the straight-line approximation to the Lagrangian (1/2) m ẋ² − V(x). Check Feynman's approximation: is it correct to first order in ∆t for the free particle and the simple harmonic oscillator? For simplicity, let's ignore the prefactors (coming from the normalizations) and focus on the terms inside the exponentials.
Taking t = t0 + ∆t and x′ = x0 + ẋ∆t, expand to first order in ∆t the terms in the exponential for the free particle propagator (eqn 5.15) and the simple harmonic oscillator (eqn 5.16). Do they agree with Feynman's formula? (Hint: For the simple harmonic oscillator, the first term is proportional to 1/∆t, so you'll need to keep the second term to second order in ∆t.)
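
Readers who like symbolic algebra can let a computer do the bookkeeping of this expansion. A scaffold (not the solution), assuming Python with sympy; it sets up the SHO exponent of eqn 5.16 with t = t0 + ∆t and x′ = x0 + ẋ∆t and expands in ∆t, leaving the comparison with eqn 5.17 to the reader:

import sympy as sp

m, w, hbar = sp.symbols('m omega hbar', positive=True)
x0, xdot, dt = sp.symbols('x0 xdot Deltat', positive=True)

xp = x0 + xdot * dt          # x' = x0 + xdot * Deltat, as in the problem statement

# Exponent of the SHO propagator, eqn 5.16, with t - t0 = Deltat:
exponent = sp.I * m * w / (2 * hbar * sp.sin(w * dt)) * ((xp**2 + x0**2) * sp.cos(w * dt) - 2 * xp * x0)

# Expand in Deltat (the leading 1/Deltat piece appears automatically), then compare
# by hand with (I*Deltat/hbar) * (m*xdot**2/2 - V(x0)) from eqn 5.17.
print(sp.series(exponent, dt, 0, 2))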

(5.16) Three particles in a box. (Quantum) ©3
(Adapted from Sakurai, problem 4.1.)
Consider free, noninteracting particles of mass m in a one-dimensional box of length L with infinitely high walls.

(a) What are the lowest three energies of the single-particle energy eigenstates?

If the particles are assumed non-interacting, the quantum eigenstates can be written as suitably symmetrized or antisymmetrized single-particle eigenstates. One can use a level diagram, such as in Fig. 5.10, to denote the fillings of the single-particle states for each many-particle eigenstate.


Fig. 5.10 Level diagram, showing one of the ground states for each of the three cases (Bosons, Fermions, Distinguished).

(b) If three distinguishable spin-1/2 particles of the same mass are added to the box, what is the energy of the three-particle ground state? What is the degeneracy of the ground state? What is the first three-particle energy eigenvalue above the ground state? Its degeneracy? The degeneracy and energy of the second excited state? Draw a level diagram for one of the first excited states, and one of the second excited states (the ground state being shown on the left in Fig. 5.10).

(c) The same as part (b), but for three identical spin-1/2 fermions.

(d) The same as part (b), but for three identical spin-zero bosons.

(5.17) Rotation Matrices. (Math, ×1) ©2
A rotation matrix R takes an orthonormal basis x, y, z into another orthonormal triad u, v, w, with u = Rx, v = Ry, and w = Rz.

(a) Which is another way to write the matrix R?
I.   R = ( u1v1 + v1w1 + w1u1   · · ·
                  · · ·              );
II.  R = the matrix whose rows are u, v, and w;
III. R = the matrix whose columns are u, v, and w;
IV.  R = u ⊗ v + v ⊗ w + w ⊗ u.

Rotation matrices are to real vectors what unitary transformations (common in quantum mechanics) are to complex vectors. A unitary transformation satisfies U†U = 1, where the 'dagger' gives the complex conjugate of the transpose, U† = (U^T)*. Since R is real, R† = R^T.

(b) Argue that R^T R = 1.

Thus R is an orthogonal matrix, with transpose equal to its inverse.

(c) In addition to (b), what other condition do we need to know that R is a proper rotation (i.e., in SO(3)), and not a rotation-and-reflection with determinant −1?
(I) u, v, and w must form a right-handed triad (presuming as usual that x, y, and z are right-handed);
(II) u · v × w = 1;
(III) w · u × v = 1;
(IV) All of the above.

One of the most useful tricks in quantum mechanics is multiplying by one. The operator |k〉〈k| can be viewed as a projection operator: |k〉〈k|ψ〉 is the part of |ψ〉 that lies along direction |k〉. If k labels a complete set of orthogonal states (say, the eigenstates of the Hamiltonian), then the original state can be reconstructed by adding up the components along the different directions: |ψ〉 = Σ_k |k〉〈k|ψ〉. Hence the identity operator is 1 = Σ_k |k〉〈k|. We'll use this to derive the path-integral formulation of quantum mechanics, for example. Let's use it here to derive the standard formula for rotating matrices.
Under a change of basis R, a matrix A transforms to R^T A R. We are changing from the basis x1, x2, x3 = |xi〉 to the basis |uj〉, with |un〉 = R|xn〉. Since |uj〉 = R|xj〉, we know 〈xi|uj〉 = 〈xi|R|xj〉 = Rij, and similarly 〈ui|xj〉 = R^T_ij. Let the original components of the operator A be Akℓ = 〈xk|A|xℓ〉 and the new components be A′_ij = 〈ui|A|uj〉.

(d) Multiplying by one twice in the formula for A′, A′_ij = 〈ui|1A1|uj〉, and expanding the first and second identities in terms of the xk and xℓ, derive the matrix transformation formula A′_ij = R^T_ik Akℓ Rℓj, i.e. A′ = R^T A R, where we use the Einstein summation convention over repeated indices.
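
A minimal numerical check of this transformation rule, assuming Python with numpy (the particular R and A below are arbitrary examples):

import numpy as np

phi = 0.3
R = np.array([[np.cos(phi), -np.sin(phi), 0.0],     # rotation by phi about z
              [np.sin(phi),  np.cos(phi), 0.0],
              [0.0,          0.0,         1.0]])

A = np.array([[1.0, 2.0, 0.0],                       # an arbitrary operator in the old basis
              [0.0, 3.0, 1.0],
              [4.0, 0.0, 5.0]])

A_prime = np.einsum('ki,kl,lj->ij', R, A, R)         # A'_ij = R^T_ik A_kl R_lj
print(np.allclose(A_prime, R.T @ A @ R))             # True: same as the matrix form R^T A R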

(5.18) Trace. (Math, ×1) ©2
The trace of a matrix A is Tr(A) = Σ_i Aii = Aii, where the last form makes use of the Einstein summation convention.

(a) Show the trace has a cyclic invariance: Tr(ABC) = Tr(BCA). (Hint: Write it out as a sum over components. Matrices don't commute, but products of components of matrices are just numbers, and do commute.) Is Tr(ABC) = Tr(ACB) in general?

Remember from exercise 5.17(b) that a rotation matrix R has its inverse equal to its transpose, so R^T R = 1, and that a matrix A transforms into R^T A R under rotations.


(b) Using the cyclic invariance of the trace, show that the trace is invariant under rotations.

Rotation invariance is the primary reason that the trace is so important in mathematics and physics.
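
Both facts are easy to spot-check numerically; a minimal sketch, assuming Python with numpy:

import numpy as np

rng = np.random.default_rng(0)
A, B, C = (rng.standard_normal((3, 3)) for _ in range(3))

# Cyclic invariance: Tr(ABC) = Tr(BCA).
print(np.isclose(np.trace(A @ B @ C), np.trace(B @ C @ A)))   # True

# Invariance under rotations: Tr(R^T A R) = Tr(A).
phi = 0.5
R = np.array([[np.cos(phi), -np.sin(phi), 0.0],
              [np.sin(phi),  np.cos(phi), 0.0],
              [0.0,          0.0,         1.0]])
print(np.isclose(np.trace(R.T @ A @ R), np.trace(A)))         # True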

(5.19) Complex Exponentials. (Math) ©2
You can prove the double angle formulas using complex exponentials: just take the real and imaginary parts of the identity cos(A + B) + i sin(A + B) = e^{i(A+B)} = e^{iA} e^{iB} = (cos A + i sin A)(cos B + i sin B) = cos A cos B − sin A sin B + i(cos A sin B + sin A cos B).
In a similar way, use complex exponentials and (x + y)³ = x³ + 3x²y + 3xy² + y³ to derive the triple angle formulas. Which is true?
(A) cos(3θ) = cos³(θ) − 3 cos(θ) sin²(θ),  sin(3θ) = sin³(θ) − 3 cos²(θ) sin(θ);
(B) sin(3θ) = cos³(θ) − 3 cos(θ) sin²(θ),  cos(3θ) = 3 cos²(θ) sin(θ) − sin³(θ);
(C) cos(3θ) = cos³(θ) − 3 cos(θ) sin²(θ),  sin(3θ) = 3 cos²(θ) sin(θ) − sin³(θ);
(D) sin(3θ) = 3 cos²(θ) sin(θ) + sin³(θ),  cos(3θ) = cos³(θ) + 3 cos(θ) sin²(θ);
(E) cos(3θ) = cos³(θ) − 3i cos(θ) sin²(θ),  sin(3θ) = 3i cos²(θ) sin(θ) − sin³(θ).
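
The worked angle-addition identity above can be reproduced symbolically; a minimal sketch, assuming Python with sympy (it deliberately stops short of the triple-angle case the exercise asks for):

import sympy as sp

A, B = sp.symbols('A B', real=True)
product = sp.expand((sp.cos(A) + sp.I * sp.sin(A)) * (sp.cos(B) + sp.I * sp.sin(B)))

# Real and imaginary parts reproduce the addition formulas quoted in the text.
print(sp.simplify(sp.re(product) - sp.cos(A + B)))   # 0
print(sp.simplify(sp.im(product) - sp.sin(A + B)))   # 0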

(5.20) Dirac δ-functions. (Math) ©3
Quantum bound-state wavefunctions are unit vectors in a complex Hilbert space. If there are N particles in 3 dimensions, the Hilbert space is the space of complex-valued functions ψ(x) with x ∈ R^{3N} whose absolute squares are integrable: ∫dx |ψ(x)|² < ∞.

But what about unbound states? For example, the propagating plane-wave states ψ(x) = |k〉 ∝ exp(−ikx) for a free particle in one dimension? Because unbound states are spread out over an infinite volume, their probability density at any given point is zero – but we surely don't want to normalize |k〉 by multiplying it by zero.
Mathematicians incorporate continuum states by extending the space into a rigged Hilbert space. The trick is that the unbound states form a continuum, rather than a discrete spectrum – so instead of summing over states to decompose wavefunctions, |φ〉 = 1|φ〉 = Σ_n |n〉〈n|φ〉, we integrate over states, |φ〉 = 1|φ〉 = ∫dk |k〉〈k|φ〉. This tells us how we must normalize our continuum wavefunctions: instead of the Kronecker δ-function 〈m|n〉 = δmn enforcing orthonormal states, we demand |k′〉 = 1|k′〉 = ∫dk |k〉〈k|k′〉 = ∫dk |k〉 δ(k − k′) = |k′〉, telling us that 〈k|k′〉 = δ(k − k′) is needed to ensure the useful decomposition 1 = ∫dk |k〉〈k|.

Let's work out how this works as physicists, by starting with the particles in a box, and then taking the box size to infinity. For convenience, let us work in a one-dimensional box 0 ≤ x < L, and use periodic boundary conditions, so ψ(0) = ψ(L) and ψ′(0) = ψ′(L). This choice allows us to continue to work with plane-wave states |n〉 ∝ exp(−ik_n x) in the box. (We could have used a square well with infinite sides, but then we'd need to fiddle with wavefunctions ∝ sin(kx).)

(a) What values of k_n are allowed by the periodic boundary conditions? What is the separation ∆k between successive wavevectors? Does it go to zero as L → ∞, leading to a continuum of states?

To figure out how to normalize our continuum wavefunctions, we now start with the relation 〈m|n〉 = δmn and take the continuum limit. We want the normalization N_k of the continuum wavefunctions to give ∫_{−∞}^{∞} N_k N_{k′} exp(−i(k′ − k)x) dx = δ(k′ − k).

(b) What is the normalization 〈x|n〉 = N_n exp(ik_n x) for the discrete wavefunctions in the periodic box, to make 〈n|n′〉 = ∫_0^L N_n N_{n′} exp(−i(k_{n′} − k_n)x) dx = δ_{nn′}? Write 1 = Σ_n |n〉〈n|, and change the sum to an integral in the usual way (∫dk |k〉〈k| ≈ Σ_n |n〉〈n| ∆k). Show that the normalization of the continuum wavefunctions must be N_k = 1/√(2π), so ψ_k(x) = 〈x|k〉 = exp(ikx)/√(2π). (Hint: If working with operators is confusing, ensure that 〈x|1|x′〉 for 0 < x, x′ < L is the same for 1 = Σ_n |n〉〈n| (valid in the periodic box) and for 1 = ∫dk |k〉〈k| (valid for all x).)

Notice some interesting ramifications:

I. The fact that our continuum plane waves have normalization 1/√(2π) incidentally tells us one form of the δ-function:

δ(k′ − k) = 〈k|k′〉 = (1/2π) ∫_{−∞}^{∞} e^{−i(k′−k)x} dx.    (5.18)

Also, δ(x′ − x) = (1/2π) ∫_{−∞}^{∞} dk exp(ik(x′ − x)).

II. The same normalization is used for 'position eigenstates' |x〉, so 〈x′|x〉 = δ(x′ − x) and 1 = ∫dx |x〉〈x|.

III. The Fourier transform can be viewed as a change of variables from the basis |x〉 to the basis |k〉:

φ(k) = 〈k|φ〉 = 〈k|1|φ〉 = ∫dx 〈k|x〉〈x|φ〉    (5.19)
     = (1/√(2π)) ∫dx exp(−ikx) φ(x).

Note that this definition is different from that I used in the appendix of my book (Statistical Mechanics: Entropy, Order Parameters, and Complexity, http://pages.physics.cornell.edu/~sethna/StatMech/EntropyOrderParametersComplexity.pdf); there the 2π is placed entirely on the inverse Fourier transform, whereas here it is split symmetrically between the two, so the inverse Fourier transform is

φ(x) = 〈x|φ〉 = 〈x|1|φ〉    (5.20)
     = ∫dk 〈x|k〉〈k|φ〉    (5.21)
     = (1/√(2π)) ∫dk exp(ikx) φ(k).

IV. The Dirac δ-function can be written in many different ways. It is basically the limit17 as ε → 0 of sharply-peaked, integral-one functions of width ε and height 1/ε centered at zero. Let's use this to derive the useful relation:

lim_{ε→0⁺} 1/(x − iε) = p.v. (1/x) + iπ δ(x).    (5.22)

Here all these expressions are meant to be inside integrals, and p.v. is the Cauchy principal value of the integral:18

p.v. ∫_{−∞}^{∞} = lim_{ε→0⁺} ( ∫_{−∞}^{−ε} + ∫_{ε}^{∞} ).    (5.23)

(c) Note that ε/(x² + ε²) has half-width ε at half-maximum, height 1/ε, and integrates to π, so basically (i.e., in the weak limit) lim_{ε→0} ε/(x² + ε²) = π δ(x). Argue that lim_{ε→0} ∫ f(x)/(x − iε) dx = p.v. ∫ f(x)/x dx + iπ f(0). (Hints: The integral of 1/(1 + y²) is arctan(y), which becomes ±π/2 at y = ±∞. Multiply numerator and denominator by x + iε.)
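
Equation 5.22 can also be checked numerically before it is derived. A minimal sketch, assuming Python with numpy, using the smooth test function f(x) = exp(−x²) (an arbitrary choice):

import numpy as np

f = lambda x: np.exp(-x**2)               # smooth test function with f(0) = 1
eps = 1e-3
x, dx = np.linspace(-10.0, 10.0, 2_000_001, retstep=True)   # spacing 1e-5, well below eps

integral = np.sum(f(x) / (x - 1j * eps)) * dx   # simple Riemann sum of f(x)/(x - i*eps)

print(integral.real)                  # -> p.v. of f(x)/x, which vanishes here since f is even
print(integral.imag, np.pi * f(0))    # -> pi * f(0), as eqn 5.22 predicts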

(5.21) Eigen Stuff. (Math, ×1) ©2
Consider an operator for a two-state system,

O = (  0  −4
      −4   6 ).

Its eigenvectors are |e1〉 = (2, 1)^T/√5 and |e2〉 = (−1, 2)^T/√5.

(a) What are the associated eigenvalues o1 and o2?

(b) Use |e1〉 and |e2〉 to construct a rotation matrix R that diagonalizes O, so that

R^T O R = ( o1   0
             0  o2 ).

(Hint: See exercise 5.17(a). We want R to rotate the axes into u = |e1〉 and v = |e2〉.) What angle does R rotate by?

(c) Assume that the system is in the state |L〉 = (1, 0)^T. Decompose |L〉 into the eigenvectors of O. (Hint: As in exercise 5.17(d), multiplying |L〉 by one is useful.) If the observable corresponding to the operator O is measured for state |L〉, what is the probability of finding the value o1? Does the probability of finding either o1 or o2 sum to one?
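
The mechanics of parts (a)–(b) can be rehearsed on a different matrix first. A minimal sketch, assuming Python with numpy, applied to a made-up symmetric 2×2 matrix (deliberately not the operator O above):

import numpy as np

M = np.array([[2.0, 1.0],        # a made-up symmetric operator, not the O of the exercise
              [1.0, 2.0]])

vals, vecs = np.linalg.eigh(M)   # columns of 'vecs' are orthonormal eigenvectors
R = vecs                         # candidate rotation whose columns are the eigenvectors
if np.linalg.det(R) < 0:         # flip one column if needed so det(R) = +1 (a proper rotation)
    R[:, 0] = -R[:, 0]

print(np.round(R.T @ M @ R, 12)) # diagonal, with the eigenvalues on the diagonal
print(vals)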

17 Clearly this is not a limit in the ordinary sense: the difference between functions does not go to zero as ε goes to zero, but rather (within ε of the origin) has large positive and negative values that cancel. It is a weak limit – when integrated against any smooth function, the differences go to zero.
18 If f(x) is positive at zero, ∫ f(x)/x dx is the sum of minus infinity for x < 0 and positive infinity for x > 0; taking the principal value tells us to find the sum of these two canceling infinities by chopping them symmetrically about zero.