
PHYS 559 - Advanced Statistical Mechanics

Martin Grant
Physics Department, McGill University

© MG, 2004, 2009, version 1.0


Preface

These are lecture notes for PHYS 559, Advanced Statistical Mechanics, which I've taught at McGill for many years. I'm intending to tidy this up into a book, or rather the first half of a book. This half is on equilibrium; the second half would be on dynamics.

These were handwritten notes which were heroically typed by Ryan Porter over the summer of 2004, and may someday be properly proof-read by me. Even better, maybe someday I will revise the reasoning in some of the sections. Some of it can be argued better, but it was too much trouble to rewrite the handwritten notes. I am also trying to come up with a good book title, amongst other things. The two titles I am thinking of are "Dirty tricks for statistical mechanics" and "Valhalla, we are coming!". Clearly, more thinking is necessary, and suggestions are welcome.

While these lecture notes have gotten longer and longer until they are almost self-sufficient, it is always nice to have real books to look over. My favorite modern text is "Lectures on Phase Transitions and the Renormalization Group", by Nigel Goldenfeld (Addison-Wesley, Reading, Mass., 1992). This is referred to several times in the notes. Other nice texts are "Statistical Physics", by L. D. Landau and E. M. Lifshitz (Addison-Wesley, Reading, Mass., 1970), particularly Chaps. 1, 12, and 14; "Statistical Mechanics", by S.-K. Ma (World Scientific, Philadelphia, 1986), particularly Chaps. 12, 13, and 28; and "Hydrodynamic Fluctuations, Broken Symmetry, and Correlation Functions", by D. Forster (W. A. Benjamin, Reading, Mass., 1975), particularly Chap. 2. These are all great books, with Landau's being the classic. Unfortunately, except for Goldenfeld's book, they're all a little out of date. Some other good books are mentioned in passing in the notes.

Finally, at the present time while I figure out what I am doing, you can read these notes as many times as you like, make as many copies as you like, and refer to them as much as you like. But please don't sell them to anyone, or use these notes without attribution.

Contents

1 Introduction
  1.1 Random variables
    1.1.1 Gaussian distribution
  1.2 Multivariate distributions
    1.2.1 Gaussian distribution

2 Large systems, and independence of parts
  2.1 Independence of parts
  2.2 Self averaging
  2.3 Central Limit Theorem
    2.3.1 Probability of the sum of two independent variables
    2.3.2 Central Limit Theorem
  2.4 Correlations in Space
  2.5 (*) Central Limit Theorem through explicit calculation of moments

3 Self similarity and Fractals
  3.1 Von Koch Snowflake
  3.2 Sierpinskii Triangle
  3.3 Correlations in Self-Similar Objects

4 Review of Statistical Mechanics and Fluctuations
  4.1 Thermodynamics
  4.2 Statistical Mechanics
    4.2.1 Closed system
    4.2.2 System at constant temperature
    4.2.3 Other ensembles
  4.3 Fluctuations
    4.3.1 Simple mechanical example
  4.4 Thermodynamic Fluctuations
    4.4.1 Correlation functions and response functions

5 Fluctuations of Surfaces
  5.1 Lattice models of fluctuating surfaces
  5.2 A continuum model of fluctuating surfaces
    5.2.1 Review of Fourier series
    5.2.2 Height-height correlation function
    5.2.3 Partition function, free energy, and surface tension
  5.3 Impossibility of Phase Coexistence in d=1
  5.4 Order of magnitude calculations in d=3
  5.5 (*) A different view of gradient terms

6 Broken Symmetry and Correlation Functions
  6.1 Goldstone's Theorem

7 Equilibrium correlation functions and Scattering
  7.1 Measurement of g(r⃗) by diffraction
  7.2 Scattering off a flat interface
  7.3 Scattering from a line defect
  7.4 Scattering from rough interfaces
  7.5 (*) Scattering from many planar and randomly oriented surfaces

8 Fluctuations and Crystalline Order
  8.1 Calculation of 〈ρ〉 and 〈ρ(0)ρ(r⃗)〉
    8.1.1 Scattering intensity

9 Ising Model
  9.1 (*) Lattice Gas Model
  9.2 One dimensional Ising Model. Transfer matrix solution
    9.2.1 No magnetic field
    9.2.2 One dimensional Ising model in a magnetic field
    9.2.3 Pair correlation function. Exact solution
  9.3 Mean Field Theory

10 Continuum theories of the Ising model
  10.1 Landau theory
  10.2 Ornstein-Zernicke theory of correlations
  10.3 Ginzburg criterion

11 Scaling Theory
  11.1 Scaling With ξ
    11.1.1 Derivation of exponent equalities

12 Renormalization Group
  12.1 Decimation in one dimension
  12.2 Block scaling renormalization group
    12.2.1 Monte Carlo Renormalization Group
    12.2.2 Scaling form, and critical exponents
  12.3 φ⁴ model: dimensional analysis
  12.4 Momentum shell RG. Gaussian fixed point
  12.5 (*) ε expansion

13 Master Equation and the Monte Carlo method
  13.1 Other ensembles

Chapter 1

Introduction

Why is a coin toss random?

The response that you would receive in a Statistical Mechanics course is more or less as follows: individual states follow the "equal a priori probability" postulate. All many-body quantum states with the same energy (for a given fixed volume) have equal weight or probability when computing macroscopic averages.

This is a mouthful, but at least it gives the right answer. There are two microscopic states, namely HEADS and TAILS, and since they are energetically equivalent, they appear with equal probability after a coin toss. So the result of the toss is "random".

However nice it is to get the right answer, and however satisfying it is to use this prescription in the many other problems which we will cover in this class, it is somewhat unsatisfying. It is worth going a little deeper to understand how microscopic states (HEADS or TAILS) become macroscopic observations (randomly HEADS or TAILS).

To focus our thinking it is useful to do an experiment. Take a coin and flip it, always preparing it as heads. If you flip it at least 10–20 cm in the air, the result is random. If you flip it less than this height, the result is not random. Say you make N attempts, and the number of heads is n_heads and the number of tails is n_tails, for different flipping heights h. Then you obtain something like this:

Figure 1.1: Bias of flips with height h

There are a lot of other things one could experimentally measure. In fact, my cartoon should look exponential, for reasons we will see later on in the course.

Now we have something to think about: why is the coin toss biased for flips of small height? In statistical mechanics language we would say, "why is the coin toss correlated with its initial state?" Remember the coin was always prepared as heads. (Of course, if it was prepared randomly, why flip it?) There is a length scale over which those correlations persist. For our example, this scale is ∼2–3 cm, which we call ξ, the correlation length. The tailing off of the graph above follows

Correlations ∼ e^{−h/ξ},

which we need more background to show. But if we did a careful experiment, we could show this is true for large h. It is sensible that

ξ ≥ coin diameter.

Coin tosses appear random because we flip the coin to a much larger height than the correlation length ξ. Therefore we learn that true randomness requires ξ/h → 0, which is analogous to requiring an infinite system in the thermodynamic limit. [In fact, in passing, note that for ξ/h = O(1) the outcome of the flip (a preponderance of heads) does not reflect the equivalence of the microscopic states (HEADS and TAILS). To stretch the argument a bit, we would call the preponderance of HEADS a "broken symmetry"; that is, the macroscopic state (preponderance of HEADS) has a lower symmetry than the microscopic state (in which HEADS and TAILS are equally likely). This also happens in phase transitions, where ξ/L = O(1) because ξ ≈ L as L → ∞, where L is the linear dimension of the system. It also happens in quantum mesoscopic systems, where ξ/L = O(1) because L ≈ ξ and L is somewhat small. This latter case is more like the present example of coin tossing.]

But what happened to Newton's laws and the fact that one would expect to be able to calculate the trajectory of the coin exactly, given some initial condition? It is true that given an initial configuration, and a precise force, we can calculate the trajectory of the coin if we neglect inhomogeneities in the air. However, your thumb cannot apply a precise force for flipping to within one correlation length of height, and in practice computing the trajectory by using Newton's laws is impossible. The analogous physical observation is that a macroscopic state (say, at a fixed pressure or energy) does not determine the underlying microscopic states. Many microscopic states are completely consistent with the specified macroscopic state.

These ideas, correlations on a microscopic scale, broken symmetries at a macroscopic scale, and correlation lengths, will come up throughout the rest of the course. The rest of the course follows this scenario:

• Fluctuations and correlations: What can we do without statistical mechanics?

• What can we do with thermodynamics and statistical mechanics?

• Detailed study of interfaces.

• Detailed study of phase transitions.

The topics we will cover are as current as any issue of Physical Review E. Just take a look.

1.1 Random variables

In order to define a random variable x one needs to specify

1. The set of values that x can adopt (the sample space, or phase space later). An event is a subset of outcomes from the sample space. For example, in casting a die, the sample space is {1, 2, 3, 4, 5, 6}. A possible event E would be that the outcome is an even number, E = {2, 4, 6}.


2. Each event (or single outcome) is assigned a probability p(E) (or p(x)). For any event or value of x, we have

p(x) ≥ 0,  ∫ dx p(x) = 1,

and the additive property for mutually exclusive events,

p(E₁ or E₂) = p(E₁) + p(E₂).

We interpret p(x) dx as the probability that the value of x lies between x and x + dx.
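These rules can be made concrete with a short numerical sketch of the die example (the code and the names `prob`, `E1`, `E2` are illustrative, not part of the notes):

```python
from fractions import Fraction

# Sample space for casting a die, with equal a priori probabilities 1/6.
sample_space = {1, 2, 3, 4, 5, 6}
p = {x: Fraction(1, 6) for x in sample_space}

def prob(event):
    """Probability of an event, i.e. a subset of the sample space."""
    return sum(p[x] for x in event)

E1 = {2, 4, 6}   # the outcome is even
E2 = {1, 3}      # odd and less than 5; mutually exclusive with E1

assert prob(sample_space) == 1                  # normalization
assert prob(E1 | E2) == prob(E1) + prob(E2)     # additivity (disjoint events)
print(prob(E1))  # 1/2
```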

Of course, the integral above assumes that the sample space is continuous. In many cases, the values that x can adopt are discrete. In this case we could write

p(x) = ∑ₙ pₙ δ(x − xₙ),

so that

∫ dx p(x) = ∑ₙ pₙ = 1,

where pₙ is the probability of occurrence of xₙ.

Given a function of x, f(x), we define its average as

〈f(x)〉 = ∫ f(x) p(x) dx.

In particular, we introduce the moments of x as

µ₁ = 〈x〉 = ∫ x p(x) dx,

the average of x, and

µₙ = 〈xⁿ〉 = ∫ xⁿ p(x) dx,

the nth moment of x. A special role is played by the variance

σ² = 〈(x − 〈x〉)²〉 = µ₂ − µ₁².

The quantity σ is also known as the standard deviation.

Note, in passing, that not all probability distributions have variances or higher moments. See, for example, the Lorentzian (or Cauchy) distribution

p(x) = (1/π) γ/[(x − a)² + γ²],  −∞ < x < ∞.

Although the distribution has a peak at x = a and is symmetric around that value, not even 〈x〉 exists.
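A quick numerical sketch makes this vivid (the sampling recipe, the inverse transform x = a + γ tan(π(u − 1/2)), is standard but not from the notes):

```python
import math
import random

# Draws from p(x) = (1/π) γ/((x − a)² + γ²) via the inverse transform
# x = a + γ tan(π(u − 1/2)), with u uniform on (0, 1).
random.seed(1)
a, gamma = 2.0, 1.0

def sample(n):
    return [a + gamma * math.tan(math.pi * (random.random() - 0.5))
            for _ in range(n)]

xs = sorted(sample(100_000))
median = xs[len(xs) // 2]
print(round(median, 2))   # the median does converge to the peak at x = a

# The sample mean, however, never settles down, since 〈x〉 does not exist:
means = [sum(sample(10_000)) / 10_000 for _ in range(5)]
print([round(m, 1) for m in means])   # batch means typically scatter widely
```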

We next define the characteristic function G(k) as the Fourier transform of the probability p(x):

G(k) = ∫ dx e^{ikx} p(x) = 〈e^{ikx}〉. (1.1)

Expanding the exponential in a Taylor series leads to the following result:

G(k) = ∑_{m=0}^{∞} (ik)ᵐ/m! µₘ. (1.2)

It is also customary to define the cumulants of x through the following relation,

ln G(k) = ∑_{m=1}^{∞} (ik)ᵐ/m! 〈〈xᵐ〉〉,

where 〈〈xᵐ〉〉 is the mth cumulant of x. The lowest order cumulants are as follows:

〈〈x〉〉 = µ₁
〈〈x²〉〉 = µ₂ − µ₁² = σ²
〈〈x³〉〉 = µ₃ − 3µ₂µ₁ + 2µ₁³
〈〈x⁴〉〉 = µ₄ − 4µ₃µ₁ − 3µ₂² + 12µ₂µ₁² − 6µ₁⁴.

1.1.1 Gaussian distribution

A very common distribution in what follows is the Gaussian. It is given by

p(x) = (1/√(2π)) e^{−x²/2}.

With this definition, one has

µ_{2n+1} = 0,
µ_{2n} = (2n − 1)!! = (2n − 1)(2n − 3)(2n − 5) · · · 1,
〈〈xᵐ〉〉 = 0 for m > 2.
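The moment formulas are easy to check by plain Monte Carlo sampling (a sketch under the assumption that sample moments suffice; the code is not from the notes):

```python
import random

# Sample the standard Gaussian and check µ_{2n+1} = 0 and µ_{2n} = (2n−1)!!:
# for 2n = 2 and 4 the exact values are 1 and 3.
random.seed(0)
n_samples = 200_000
xs = [random.gauss(0.0, 1.0) for _ in range(n_samples)]

def moment(m):
    """Sample estimate of µ_m = 〈x^m〉."""
    return sum(x**m for x in xs) / n_samples

for m, exact in [(1, 0.0), (2, 1.0), (3, 0.0), (4, 3.0)]:
    print(m, round(moment(m), 2), exact)
```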

1.2 Multivariate distributions

Most of what we will do in what follows involves multidimensional (or infinite dimensional) spaces. Let x₁, x₂, …, x_r be random variables; one defines p(x₁, x₂, …, x_r) as the joint probability distribution. This is the probability that all the variables x₁, x₂, …, x_r have exactly a specified value. It is also useful to define the marginal probability as a partial integral,

p(x₁, …, x_s) = ∫ dx_{s+1} … dx_r p(x₁, x₂, …, x_r).

Of course, this is the probability that x₁, …, x_s have specified values, for any value of the remaining variables x_{s+1}, …, x_r.

A conditional probability p(x₁, …, x_s | x_{s+1}, …, x_r) can be defined, which is the probability that x₁, …, x_s have specified values, given also specified values of x_{s+1}, …, x_r. Bayes' theorem asserts that

p(x₁, …, x_r) = p(x₁, …, x_s | x_{s+1}, …, x_r) p(x_{s+1}, …, x_r).

We introduce now the concept of statistical independence. The sets x₁, …, x_s and x_{s+1}, …, x_r are said to be statistically independent if

p(x₁, …, x_r) = p(x₁, …, x_s) p(x_{s+1}, …, x_r),

that is, the joint probability factorizes. An equivalent statement is that

p(x₁, …, x_s | x_{s+1}, …, x_r) = p(x₁, …, x_s),

so that the distribution of x₁, …, x_s is not affected by prescribing different values of x_{s+1}, …, x_r.

Generalized moments are introduced now as

〈x₁ᵐ x₂ⁿ〉 = ∫ dx₁ … dx_r x₁ᵐ x₂ⁿ p(x₁, …, x_r).

Note that if x₁ and x₂ are independent, then

〈x₁ᵐ x₂ⁿ〉 = 〈x₁ᵐ〉〈x₂ⁿ〉,


that is, multivariate moments factorize into products of the moments of the individual variables. One therefore defines correlations as deviations from statistical independence,

〈〈x₁x₂〉〉 = 〈x₁x₂〉 − 〈x₁〉〈x₂〉.

The quantity 〈〈x₁x₂〉〉, formally analogous to a cumulant, is identically zero if x₁ and x₂ are statistically independent. A nonzero cumulant indicates a statistical correlation between the two variables.

In general, if two variables x₁ and x₂ are independent, then

〈〈x₁ᵐ x₂ⁿ〉〉 = 0 for m ≠ 0 and n ≠ 0.
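The second cumulant as a measure of correlation can be illustrated numerically (the particular distributions below are arbitrary choices, not from the notes):

```python
import random

# 〈〈x₁x₂〉〉 = 〈x₁x₂〉 − 〈x₁〉〈x₂〉 vanishes for independent variables and is
# nonzero for correlated ones.
random.seed(0)
n = 100_000

def cum2(pairs):
    """Covariance 〈x₁x₂〉 − 〈x₁〉〈x₂〉 estimated from a list of (x₁, x₂) pairs."""
    mx = sum(x for x, _ in pairs) / len(pairs)
    my = sum(y for _, y in pairs) / len(pairs)
    mxy = sum(x * y for x, y in pairs) / len(pairs)
    return mxy - mx * my

independent = [(random.random(), random.random()) for _ in range(n)]
correlated = []
for _ in range(n):
    x = random.random()
    correlated.append((x, x + 0.1 * random.random()))   # x₂ tracks x₁

print(round(cum2(independent), 3))  # ≈ 0
print(round(cum2(correlated), 3))   # ≈ Var(x₁) = 1/12 ≈ 0.083
```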

1.2.1 Gaussian distribution

A multivariate Gaussian distribution is given by

p(x₁, …, x_r) = C e^{−(1/2) ∑ᵢⱼ Aᵢⱼxᵢxⱼ − ∑ᵢ Bᵢxᵢ},

where the matrix A has constant coefficients and is positive definite, and the vector B also has constant coefficients. The normalization constant is given by

C = (2π)^{−r/2} √|A| e^{−(1/2) B·A⁻¹·B}.

Averages are given by

〈xᵢ〉 = −∑ⱼ (A⁻¹)ᵢⱼ Bⱼ,

and second moments by

〈xᵢxⱼ〉 − 〈xᵢ〉〈xⱼ〉 = 〈〈xᵢxⱼ〉〉 = (A⁻¹)ᵢⱼ.

Therefore, if the variables are uncorrelated, the matrix A⁻¹ is diagonal.

An important result concerns the sum of random variables that are Gaussianly distributed. In general, if x₁, …, x_r are random variables, and we define y = x₁ + … + x_r, then

〈〈y〉〉 = ∑ᵢ 〈〈xᵢ〉〉,  〈〈y²〉〉 = ∑ᵢⱼ 〈〈xᵢxⱼ〉〉.

If, on the other hand, x₁, …, x_r are Gaussianly distributed, then y = x₁ + … + x_r is also Gaussianly distributed. If, in addition, all the xᵢ are statistically independent, then

〈y〉 = ∑ᵢ 〈xᵢ〉,  σ_y² = ∑ᵢ σ_{xᵢ}².
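The addition rule for independent Gaussians can be checked directly (the particular means and standard deviations below are arbitrary, for illustration only):

```python
import random

# y = x₁ + … + x_r for independent Gaussians has 〈y〉 = Σ〈xᵢ〉 and
# σ_y² = Σ σ_{xᵢ}².
random.seed(0)
params = [(1.0, 0.5), (-2.0, 1.0), (0.5, 2.0)]   # (mean, std) for each xᵢ
n = 200_000

ys = [sum(random.gauss(mu, s) for mu, s in params) for _ in range(n)]
mean_y = sum(ys) / n
var_y = sum((y - mean_y) ** 2 for y in ys) / n

print(round(mean_y, 2))   # theory: 1.0 − 2.0 + 0.5 = −0.5
print(round(var_y, 2))    # theory: 0.25 + 1.00 + 4.00 = 5.25
```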


Chapter 2

Large systems, and independence of parts

The systems that we will study are composed of N entities (particles, most often), with N very large (macroscopic). In order to accomplish a statistical description of such a system, we must first introduce the sample space. It is over this space (often called phase space or configuration space) that probabilities need to be defined, and averages calculated.

In order to put in perspective the difficulty of the task of just enumerating the states (sample points), and then computing averages, we recall a few results from Statistical Mechanics. In the canonical ensemble, the probability of being in state ν is

p_ν = (1/Z) e^{−E_ν/k_BT} (2.1)

where the partition function is

Z = ∑_ν e^{−E_ν/k_BT}, (2.2)

where the sum extends over all possible microscopic states of the system. The connection to thermodynamics is from the Helmholtz free energy F = E − TS, where S is the entropy, and

e^{−F/k_BT} = Z. (2.3)

Note that since F is extensive, that is F ∼ O(N), we have

Z = (e^{−f/k_BT})^N

where F = fN. This exponential factor of N arises from the sum over states, which is typically

Number of states = ∑_ν 1 ∼ e^{O(N)}. (2.4)

For example, for a set of discrete magnetic dipoles which each can have an orientation parallel or antiparallel to a given direction, there are evidently

dipole #1 × dipole #2 × · · · × dipole #N = 2 × 2 × · · · × 2 = 2^N = e^{N ln 2} (2.5)

total states. Another example concerns the total number of states of a set of N particles in a volume V:

V^N/N! ≈ V^N/N^N = e^{N ln(V/N)} (2.6)

where the factor N! accounts for indistinguishability.
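The counting above is easy to check numerically; this sketch (with illustrative N and V, and Python's `math.lgamma` supplying ln N!) compares the exact count against the full Stirling formula and against the cruder estimate N! ≈ N^N used above:

```python
import math

# ln(V^N/N!) ≈ N ln(V/N) + N by Stirling's formula; the rougher N! ≈ N^N
# used in Eq. (2.6) keeps only the N ln(V/N) term.
N = 1_000
V = 10_000.0   # dimensionless "volume" for illustration

exact = N * math.log(V) - math.lgamma(N + 1)   # ln(V^N/N!); lgamma(N+1) = ln N!
stirling = N * math.log(V / N) + N             # full Stirling approximation
rough = N * math.log(V / N)                    # N! ≈ N^N, as in Eq. (2.6)

print(round(exact, 1), round(stirling, 1), round(rough, 1))
# exact and stirling agree to O(ln N); the rough estimate drops a term of size N.
```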


These numbers are huge in a way which cannot be easily imagined. Microscopic sizes are O(Å) and microscopic times O(10⁻¹² s). Astronomical sizes, like the size and age of the universe, are O(10³⁰ m) and O(10¹⁸ s) respectively. The ratio of these numbers is

10³⁰ m / 10⁻¹⁰ m ∼ 10⁴⁰. (2.7)

This is in no way comparable to (!)

e^{10²³}, (2.8)

the number of states for a system of N = 10²³ particles. In fact, it seems a little odd (and unfair) that a theorist must average over e^N states to explain what an experimentalist does in one experiment.

2.1 Independence of parts

In order to illustrate the concept of independence of parts (or subsystems) of a macroscopic system, it is worth thinking about experiments a bit. Say an experimentalist measures the speed of sound c in a sample, as shown. When doing the experiment, the researcher obtains some definite value c.

Figure 2.1: Experiment 1 on the whole system.

Afterwards, the experimentalist might cut up the sample into many parts so that his or her friends can repeat the same experiment. That is as shown in Fig. 2.2. Each experimenter will measure the

Figure 2.2: Experiment 2, on many parts of the same system as in experiment 1.

sound speed as shown, and they will all obtain slightly different values, but all close to c, the value obtained for the whole system. Why are the values different? Which one is more accurate?

It would seem that in experiment 2, since four independent values are obtained, averaging over them would give a more accurate determination of c because of improved statistics. In fact, one can


continue breaking up the system like this, as shown in Fig. 2.3, and potentially keep improving the statistical accuracy in the determination of c for the material in question.

Figure 2.3: Experiment 3

Of course, one would expect that the expected value (and accuracy) of the determination of c in one experiment on the big system must be equivalent to many experiments conducted on individual parts or subsystems of it, as long as these individual parts are independent. By this it is meant that a system "self-averages" all its independent parts: the value obtained for the system as a whole is an average over the values obtained over the independent parts or subsystems. You can visualize this by imagining that in order to measure c a sound wave must traverse the entire system so that, in effect, the speed of propagation over the entire system will be an average over the speeds of propagation in all the subsystems along the path of the wave. To the extent that they are independent, then the speed measured over the whole system will be a simple average over the speeds in the various parts.

The number of independent parts N of a given system (to temporarily use the same notation as if they were particles) can be estimated to be

N = O(L³/ξ³) (2.9)

for a three dimensional system of linear scale L. The new quantity introduced here is the correlation length ξ, the length over which the motion of the microscopic constituents of our system are correlated, and hence cannot be taken as statistically independent. The correlation length must be at least as large as the size of an atom, so ξ ≥ O(Å), and indeed usually ξ ≈ O(Å). In short, if the microscopic motion of the system constituents is correlated over a volume of the order of ξ³, then our macroscopic system of volume L³ can be approximately decomposed into L³/ξ³ independent subsystems.


2.2 Self averaging

In this section we explore the mathematical consequences of independence and self averaging, and in particular the consequences of the Central Limit Theorem of Probability Theory on Statistical Mechanics.

In order to do so, consider an extensive thermodynamic variable X. If a system consists of N independent parts as shown, then

Figure 2.4: N independent parts

X = ∑_{i=1}^{N} Xᵢ (2.10)

where Xᵢ is the value of X in part i. This definition applies to any microscopic configuration of the system. The statistical average of X is

〈X〉 = ∑_{i=1}^{N} 〈Xᵢ〉 (2.11)

The average 〈·〉 is understood over all possible microscopic configurations of the system, with probability distribution given by some appropriate statistical ensemble. Of course, the expectation is that in equilibrium, the statistical average 〈X〉 coincides with the spatial average of X over all the independent parts.

More precisely, the statistical average must equal

〈X〉 = (1/T) ∫₀ᵀ dt X(t) = (1/T) ∫₀ᵀ dt ∑_{i=1}^{N} Xᵢ(t), (2.12)

where T is a microscopically very long time (although it could be macroscopically short).

Figure 2.5: Time Average

Of course, X(t) is a fluctuating quantity as the system evolves over phase space.

The concept of self averaging can now be stated as follows: the values of X(t) over the entire system deviate negligibly from 〈X〉 when N is large and the parts are independent. If so, the statistical average equals the sum over the subsystems. In order to prove this statement, let us calculate the deviations of X from the average, 〈(X − 〈X〉)²〉^{1/2} (see Fig. 2.5). First define ∆X = X − 〈X〉, so that 〈∆X〉 = 〈X〉 − 〈X〉 = 0. The variance is

〈(∆X)²〉 = 〈(X − 〈X〉)²〉 = 〈X² − 2X〈X〉 + 〈X〉²〉 = 〈X²〉 − 〈X〉². (2.13)

This gives the condition, in passing,

〈X²〉 ≥ 〈X〉² (2.14)

since 〈(∆X)²〉 ≥ 0. By using the definition of X in terms of a sum over parts, we find

〈(∆X)²〉 = 〈∑_{i=1}^{N} ∆Xᵢ ∑_{j=1}^{N} ∆Xⱼ〉,

which can be rewritten as

〈(∆X)²〉 = ∑_{i=1}^{N} 〈(∆Xᵢ)²〉 + ∑_{i≠j} 〈(∆Xᵢ)(∆Xⱼ)〉. (2.15)

But part i is independent of part j (for i ≠ j), so that

〈∆Xᵢ ∆Xⱼ〉 = 〈∆Xᵢ〉〈∆Xⱼ〉 = 0. (2.16)

Therefore,

〈(∆X)²〉 = ∑_{i=1}^{N} 〈(∆Xᵢ)²〉.

Since the variance of each subsystem, 〈(∆Xᵢ)²〉, is finite and independent of N (say of O(1)), then

〈(∆X)²〉 = O(N), (2.17)

or the variance of X is proportional to the number of independent parts N. This implies that the size of the fluctuations relative to the average is

〈(∆X)²〉/〈X〉² = O(1/N), (2.18)

and hence the relative size of fluctuations is very small if N is large. This is the self averaging result. Note that, since N = V/ξ³, we have

〈(∆X)²〉/〈X〉² = O(ξ³/V), (2.19)

so that fluctuations measure microscopic correlations. They would only be appreciable in the special limit in which ξ³/V ∼ O(1), that is, when the system cannot be decomposed into independent parts. This occurs in quantum mesoscopic systems (where V is small), and in phase transitions (where ξ is large).
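The 1/N scaling of Eq. (2.18) can be verified by direct simulation; the sketch below takes each part to be (arbitrarily) uniform on [0.5, 1.5], so that each part has mean 1 and variance 1/12, giving 〈(∆X)²〉/〈X〉² = 1/(12N):

```python
import random

# Monte Carlo sketch of self averaging: X = sum of N independent parts.
random.seed(0)
trials = 2_000

def relative_fluctuation(n_parts):
    """Estimate 〈(∆X)²〉/〈X〉² for a system of n_parts independent parts."""
    samples = [sum(random.uniform(0.5, 1.5) for _ in range(n_parts))
               for _ in range(trials)]
    mean = sum(samples) / trials
    var = sum((s - mean) ** 2 for s in samples) / trials
    return var / mean**2

for n_parts in (10, 100, 1000):
    print(n_parts, relative_fluctuation(n_parts))
# Each tenfold increase in the number of parts shrinks the ratio tenfold: O(1/N).
```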

2.3 Central Limit Theorem

In this section we show that the limiting distribution for X when the system is comprised of a large number of independent parts is a Gaussian distribution. This is the so-called Central Limit Theorem.


2.3.1 Probability of the sum of two independent variables

Consider x₁ and x₂ to be two independent variables, and define their sum y = x₁ + x₂. By definition,

p(y)∆y = ∫_{y<x₁+x₂<y+∆y} dx₁ dx₂ p(x₁, x₂), (2.20)

or

p(y) = ∫ dx₁ dx₂ p(x₁, x₂) δ(y − x₁ − x₂). (2.21)

If x₁ and x₂ are independent, then p(x₁, x₂) = p_{x₁}(x₁) p_{x₂}(x₂). Hence

p(y) = ∫ dx₁ p_{x₁}(x₁) p_{x₂}(y − x₁). (2.22)

The probability of the sum is the convolution of the individual probability distributions. By taking the Fourier transform, one has

G_y(k) = G_{x₁}(k) G_{x₂}(k). (2.23)

For the particular case of Gaussian variables, we note that if p(x) is a Gaussian distribution of mean µ and variance σ², its characteristic function is

G_x(k) = e^{iµk − σ²k²/2}. (2.24)

Therefore,

G_y(k) = e^{i(µ₁+µ₂)k − (σ₁²+σ₂²)k²/2}. (2.25)
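The factorization of the characteristic function, Eq. (2.23), can be checked with empirical characteristic functions; the uniform and exponential distributions below are arbitrary choices for illustration:

```python
import cmath
import math
import random

# Check G_y(k) = G_x1(k) G_x2(k) for y = x1 + x2 with independent x1, x2.
random.seed(0)
n = 100_000
x1 = [random.uniform(-1.0, 1.0) for _ in range(n)]
x2 = [random.expovariate(1.0) for _ in range(n)]

def G(samples, k):
    """Empirical characteristic function 〈e^{ikx}〉 estimated from samples."""
    return sum(cmath.exp(1j * k * x) for x in samples) / len(samples)

k = 0.7
lhs = G([a + b for a, b in zip(x1, x2)], k)   # G_y(k) for y = x1 + x2
rhs = G(x1, k) * G(x2, k)                     # product of the two factors

# Exact closed forms: G_uniform(k) = sin(k)/k and G_exp(k) = 1/(1 − ik).
exact = (math.sin(k) / k) / (1 - 1j * k)
print(abs(lhs - rhs), abs(lhs - exact))   # both differences are O(1/√n)
```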

2.3.2 Central Limit Theorem

Consider a set of N independent variables Xᵢ. We assume for simplicity that 〈X₁〉 = 0 (i.e., we are really considering the variables ∆Xᵢ defined earlier). Otherwise, Xᵢ is characterized by a completely arbitrary distribution function, not necessarily a Gaussian. We now define y = ∑_{i=1}^{N} Xᵢ and z = y/√N. Note that

G_z(k) = G_y(k/√N) = [G_{X₁}(k/√N)]^N.

The first equality follows from the definition of z, and the second from the fact that the variables Xᵢ are independent. We have further assumed that all the distributions p_{Xᵢ} are identical. This is the appropriate case for a macroscopic physical system. By definition,

G_z(k) = ∫ dz e^{ikz} p(z) = ∫ (dy/√N) e^{iky/√N} p(z) = ∫ dy e^{i(k/√N)y} p(z)/√N.

Since p(z) dz = p(y) dy, we finally find G_z(k) = G_y(k/√N). This is the result used above.

For large N, we will only need the dependence of G_{X₁} for small values of its argument. We therefore expand in a Taylor series,

G_{X₁}(k) = ∫ e^{ikX₁} p(X₁) dX₁ = ∫ (1 + ikX₁ + (ik)²X₁²/2 + …) p(X₁) dX₁
= 1 + ik〈X₁〉 + (ik)²〈X₁²〉/2 + …
= 1 − σ²_{X₁} k²/2 + … (2.26)

Therefore,

G_z(k) = [1 − (σ²_{X₁}/2)(k²/N) + O((k/√N)³)]^N.


Now recall that

lim_{N→∞} (1 + z/N)^N = e^z,

so that for large N,

G_z(k) = e^{−σ²_{X1} k²/2}.

In short, z = (∑ Xi)/√N is a Gaussian variable with zero average and variance σ²_{X1}, quite independently of the form of p(X1).

In summary, the central limit theorem states that if X1, . . . , XN are independent, arbitrary but identically distributed random variables of zero mean, then

〈(∑ Xi)²〉 = Nσ², (2.27)

where σ² is the variance of each of the variables Xi.
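The theorem is easy to see at work numerically. This sketch (my own, not from the notes; standard-library Python, with the uniform distribution and sample sizes as arbitrary choices) builds z = (∑Xi)/√N from decidedly non-Gaussian uniform variables and checks that its variance matches the variance σ² = 1/12 of a single Xi:

```python
import random
import statistics

random.seed(1)

N = 400         # number of independent parts (arbitrary)
trials = 5_000  # number of realizations of z (arbitrary)

# X_i uniform on [-1/2, 1/2]: zero mean, variance sigma^2 = 1/12, and
# non-Gaussian; any distribution with finite variance works.
z = [sum(random.uniform(-0.5, 0.5) for _ in range(N)) / N**0.5
     for _ in range(trials)]

var_z = statistics.pvariance(z)
print(var_z)  # close to sigma^2 = 1/12, as Eq. (2.27) predicts for <z^2>
```

A histogram of z would also look Gaussian, even though each Xi is flat, which is the content of the theorem.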

2.4 Correlations in Space

The results obtained so far apply to a large system, and perhaps it is not surprising that global fluctuations in a large system are small. The results for global fluctuations have implications for the correlations of local quantities, however. Consider the density of X, x = X/N. If we divide the system into elements of volume that are microscopically large, but macroscopically small, then x = x(~r) is a function of position. Consider now the correlation function

C(~r, ~r′) = 〈x(~r) x(~r′)〉 − 〈x(~r)〉〈x(~r′)〉 = 〈∆x(~r) ∆x(~r′)〉. (2.28)

We can say a few things about C(~r, ~r′) from general considerations.

1. If the system is translationally invariant, then the origin of the coordinate system for ~r and ~r′ is arbitrary. C can only depend on the separation between ~r and ~r′, not on their absolute values:

Figure 2.6: Translational invariance (only ~r − ~r′ is important)

C(~r, ~r′) = C(~r − ~r′) = C(~r′ − ~r), (2.29)

and, of course,

〈∆x(~r) ∆x(~r′)〉 = 〈∆x(~r − ~r′) ∆x(0)〉. (2.30)

2. If the system is isotropic, then the direction of ∆x(~r) from ∆x(~r′) is unimportant, and only the magnitude of the distance |~r − ~r′| is physically relevant. Hence

C(~r, ~r′) = C(|~r − ~r′|). (2.31)

So what was a function of six variables, (~r, ~r′), is now a function of only one: |~r − ~r′|.

3. Usually correlations fall off with distance, so that points that are far apart are statistically independent. That is, since both averages of ∆x vanish,

〈∆x(r → ∞) ∆x(0)〉 = 〈∆x(r → ∞)〉〈∆x(0)〉 = 0, (2.32)


Figure 2.7: Isotropy (only |~r − ~r′| is important)

or

C(r → ∞) = 0. (2.33)

The decay to zero will occur over distances of the order of the correlation length ξ, precisely because of its definition.

Figure 2.8: Typical correlation function, decaying over a distance O(ξ)

Finally, we can estimate in general the magnitude of the function C. The deviation of the density x = X/N from its average can be written as

∆x = ∫ d~r ∆x(~r) / ∫ d~r = (1/V) ∫ d~r ∆x(~r). (2.34)

Therefore

(1/V²) ∫ d~r ∫ d~r′ 〈∆x(~r) ∆x(~r′)〉 = 〈(∆x)²〉 ∼ O(1/N), (2.35)

according to the central limit theorem, or

∫ d~r ∫ d~r′ C(~r − ~r′) ∼ O(V²/N).

We now introduce the following coordinate transformation: ~R = ~r − ~r′, ~R′ = (~r + ~r′)/2. The Jacobian of the transformation is one, and we obtain

∫ d~R′ ∫ d~R C(~R) = V ∫ d~R C(~R) ∼ O(V²/N), (2.36)

or, reintroducing the variable ~r,

∫ d~r C(~r) ∼ O(V/N).

Since N = V/ξ³, we have

∫ d~r C(~r) ∼ O(ξ³). (2.37)

This result provides the order of magnitude of the function C. Although many functions satisfy the constraint of Eq. (2.37) (see Fig. 2.8), it turns out that, usually,

C(r) ∼ e^{−r/ξ} (2.38)


for large r, a function that satisfies Eq. (2.37). For systems comprised of a large number of independent subsystems, the correlation function C(r) usually decays exponentially, with a characteristic decay length equal to ξ.

In summary, we have focused on a system having N → ∞ independent parts or subsystems. For x = X/N, where X is an extensive thermodynamic variable, and N = V/ξ³ is the number of independent parts in the volume V, with ξ the correlation length, we have

〈x〉 ∼ O(1) (2.39)

〈(∆x)²〉 ∼ O(1/N) (2.40)

p(x) ∝ e^{−(x−〈x〉)²/2〈(∆x)²〉} (2.41)

∫ d~r 〈∆x(r) ∆x(0)〉 ∼ O(ξ³) (2.42)

This is the generic situation in classical statistical mechanics, in which fluctuations are small. The purpose of this course, on the other hand, is to examine those cases in which V/ξ³ ∼ O(1), and hence these results break down!
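A one-line consistency check of Eq. (2.42): for the exponential form C(r) = e^{−r/ξ} in three dimensions the integral can be done exactly, ∫ d³r e^{−r/ξ} = 4π ∫₀^∞ r² e^{−r/ξ} dr = 8πξ³, which is indeed O(ξ³). The sketch below (standard-library Python; ξ and the grid spacing are arbitrary choices) confirms this with a simple Riemann sum:

```python
import math

xi = 2.0   # correlation length (arbitrary)
dr = 1e-3  # radial grid spacing (arbitrary)

# Riemann sum for  int d^3r e^{-r/xi} = 4*pi int_0^inf r^2 e^{-r/xi} dr,
# cut off at r = 30*xi, where the integrand is negligible.
integral = 4.0 * math.pi * sum(
    (i * dr) ** 2 * math.exp(-i * dr / xi) * dr
    for i in range(1, int(30 * xi / dr))
)

exact = 8.0 * math.pi * xi**3  # since int_0^inf r^2 e^{-r/xi} dr = 2 xi^3
print(integral, exact)         # the two agree closely: the integral is O(xi^3)
```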

2.5 (*) Central Limit Theorem through explicit calculation of moments

We present an alternative derivation of the central limit theorem that considers the explicit calculation of the moments of the variable X. For simplicity we will assume that all the odd moments vanish,

〈(∆X)^{2n+1}〉 = 0

for integer n. This means the distribution p(X) is symmetric about the mean 〈X〉, as shown in Fig. 2.9. It turns out that one can show that the distribution function becomes symmetric as N → ∞, but let's just assume it so that the derivation is simpler.

Figure 2.9: Symmetric distribution functions ρ(x)

In order to obtain the distribution of X, p(X), we could first calculate its characteristic function G_X(k), which in turn can be obtained by calculating all the moments of X, as shown in Eq. (1.2). We therefore proceed to evaluate all the even moments

〈(∆X)^{2n}〉

for positive integers n. We have already evaluated the case n = 1 (Eq. (2.17)), so we start now by considering n = 2,

〈(∆X)⁴〉 = 〈∑_{i=1}^N ∆Xi ∑_{j=1}^N ∆Xj ∑_{k=1}^N ∆Xk ∑_{l=1}^N ∆Xl〉
        = ∑_{i=1 (i=j=k=l)}^N 〈(∆Xi)⁴〉 + 3 ∑_{i,j=1 (i≠j)}^N 〈(∆Xi)²(∆Xj)²〉 + terms proportional to 〈∆Xi〉 = 0
        ∼ O(N) + O(N²). (2.43)


Therefore, as N → ∞,

〈(∆X)⁴〉 = 3 〈∑_{i=1}^N (∆Xi)²〉 〈∑_{j=1}^N (∆Xj)²〉,

or

〈(∆X)⁴〉 = 3〈(∆X)²〉² ∼ O(N²). (2.44)

Note that the dominant term as N → ∞ originated from pairing up all the ∆Xi's appropriately. A little thought makes it clear that the dominant term in 〈(∆X)^{2n}〉 will result from this sort of pairing, giving

〈(∆X)^{2n}〉 ∼ 〈(∆X)²〉^n ∼ O(N^n). (2.45)

The constant of proportionality in Eq. (2.45) can be determined exactly. Note that in

〈(∆X)^{2n}〉 = 〈∆X ∆X ∆X . . . ∆X ∆X〉

there are (2n − 1) ways to pair the first ∆X with another one. Once that pair is accounted for, there remain (2n − 3) ways to pair the next ∆X, and so on, until finally the last two ∆X's have only one way to be paired. Hence

〈(∆X)^{2n}〉 = (2n − 1)(2n − 3)(2n − 5) · · · (1) 〈(∆X)²〉^n = (2n − 1)!! 〈(∆X)²〉^n,

or, more conventionally,

〈(∆X)^{2n}〉 = ((2n)!/2^n n!) 〈(∆X)²〉^n. (2.46)

In short, we have computed all the even moments of ∆X in terms of 〈(∆X)²〉. Of course, we also need the average 〈X〉 that enters the definition of ∆X. Given the definition of the characteristic function, we now have

p(X) = (1/2π) ∫_{−∞}^{∞} dk e^{ikX} ∑_{n=0}^{∞} ((ik)^n/n!) 〈X^n〉. (2.47)

In our case, all odd moments vanish, so that

p(X) = (1/2π) ∫_{−∞}^{∞} dk e^{ikX} ∑_{n=0}^{∞} ((−k²)^n/(2n)!) 〈X^{2n}〉.

But 〈X^{2n}〉 = ((2n)!/2^n n!) 〈X²〉^n (assuming 〈X〉 = 0 for simplicity; the same derivation holds in the more general case). Hence,

p(X) = (1/2π) ∫_{−∞}^{∞} dk e^{ikX} ∑_{n=0}^{∞} (1/n!) (−k²〈X²〉/2)^n = (1/2π) ∫_{−∞}^{∞} dk e^{ikX} e^{−k²〈X²〉/2}. (2.48)

This integral can be done by completing the square, that is, adding and subtracting X²/2〈X²〉 in the exponent, so that

p(X) = (1/2π) e^{−X²/2〈X²〉} ∫_{−∞}^{∞} dk e^{−(k〈X²〉^{1/2}/2^{1/2} − iX/2^{1/2}〈X²〉^{1/2})²}
     = (1/2π) e^{−X²/2〈X²〉} (2/〈X²〉)^{1/2} ∫_{−∞}^{∞} du e^{−u²} = (2π〈X²〉)^{−1/2} e^{−X²/2〈X²〉},

using ∫ du e^{−u²} = √π. The final result, allowing for a nonzero average 〈X〉, is

p(X) = (2π〈(∆X)²〉)^{−1/2} e^{−(X−〈X〉)²/2〈(∆X)²〉}, (2.49)

a Gaussian distribution.
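The moment identity (2.46) is easy to verify numerically. This sketch (my own illustration; standard-library Python, with the variance and sample size as arbitrary choices) compares Monte Carlo estimates of 〈X⁴〉 and 〈X⁶〉 for a Gaussian sample against (2n)!/(2ⁿn!) 〈X²〉ⁿ, i.e., 3〈X²〉² and 15〈X²〉³:

```python
import math
import random

random.seed(2)

sigma2 = 1.7         # variance <X^2> (arbitrary)
n_samples = 500_000  # sample size (arbitrary)

x = [random.gauss(0.0, math.sqrt(sigma2)) for _ in range(n_samples)]

def moment(p):
    """Monte Carlo estimate of <X^p>."""
    return sum(v**p for v in x) / n_samples

def predicted(n):
    """Eq. (2.46): <X^{2n}> = (2n)!/(2^n n!) <X^2>^n = (2n-1)!! <X^2>^n."""
    return math.factorial(2 * n) / (2**n * math.factorial(n)) * sigma2**n

m4, m6 = moment(4), moment(6)
print(m4 / predicted(2))  # close to 1 (predicted(2) = 3 <X^2>^2)
print(m6 / predicted(3))  # close to 1 (predicted(3) = 15 <X^2>^3)
```

Higher moments fluctuate more strongly, so agreement for 〈X⁶〉 requires a noticeably larger sample than for 〈X⁴〉.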


In passing, it is worth remembering the following important corollary for a Gaussian distribution of zero mean,

〈e^{ikX}〉 = e^{−k²〈X²〉/2}. (2.50)

This is again the central limit theorem for a system of many independent parts. Its consequences are most transparent if one deals with an intensive density x = X/N. Then,

〈x〉 ∼ O(1), 〈(∆x)²〉 ∼ O(1/N). (2.51)

Let

〈(∆x)²〉 ≡ a²/N, (2.52)

where a² ∼ O(1). Then, up to normalization, we have

p(x) ∝ e^{−N(x−〈x〉)²/2a²}, (2.53)

so the width of the Gaussian is exceedingly small. As stated earlier, fluctuations around the average

Figure 2.10: Gaussian of width O(1/N^{1/2}) around 〈x〉 = O(1) (width exaggerated)

Figure 2.11: Gaussian (width (sort of) to scale)

are small, and hence the average value is well defined for a system of N → ∞ independent parts. The distribution of x is Gaussian regardless of the distribution of the Xi in each of the parts or subsystems. This result applies to the equilibrium distribution of any thermodynamic variable, with the only assumption that the system be composed of many independent subsystems.


Chapter 3

Self similarity and Fractals

This chapter allows us to explore one situation in which a large object is not comprised of many independent parts. More precisely, by looking at objects that are self similar, we will uncover a class of systems in which spatial correlation functions do not decay exponentially with distance, the hallmark of statistical independence among the parts.

Fractals are mathematical constructs that seem to exist in fractional dimensions between 1 and 2, 2 and 3, etc. They provide a useful starting point for thinking about systems in which V/ξ³ ∼ O(1), or, if V = L³, ξ/L ∼ O(1). Figure 3.1 schematically shows a fractal object. The perimeter appears very jagged, or, as it is said in physics, rough.

Figure 3.1: Self affine fractal

The self-affine fractal has the property that its perimeter length P satisfies

P ∼ L^{ds} as L → ∞, (3.1)

where ds > 1 is the so-called self-affine fractal exponent. Note that for a circle of diameter L, P_circle = πL, while for a square of side L, P_square = 4L. In fact, P ∼ L¹ for any run-of-the-mill, non-fractal object in two spatial dimensions. More generally, the surface of a compact (non-fractal) object in d spatial dimensions is

P_non fractal ∼ L^{d−1}. (3.2)

It is usually the case that

ds > d − 1. (3.3)

Another example is given in Fig. 3.2. This object has a distribution of holes such that its mass M satisfies

M ∼ L^{df} as L → ∞, (3.4)


Figure 3.2: Volume fractal

where df is the fractal dimension. Note again that for a circle of diameter L, M = (π/4)ρL², while for a square of side L, M = ρL². In fact, for run-of-the-mill compact objects,

M_non fractal ∼ L^d (3.5)

in d dimensions. Usually

df < d. (3.6)

Since the mass density is normally defined as M/L^d in d dimensions, this would imply that

ρ_f ∼ 1/L^{d−df} → 0 (3.7)

as L → ∞.

3.1 Von Koch Snowflake

We construct in this section a specific example of a fractal object, and compute its fractal dimension. In order to construct a von Koch fractal (or snowflake), follow the steps outlined in Fig. 3.3.

Figure 3.3: Von Koch snowflake

In order to find the self-affine or fractal exponent for the perimeter, imagine we had drawn one of these fractals recursively. Say the smallest straight element has length l. We calculate the perimeter P as we go through the recursion indicated in Fig. 3.4.

From the figure, we find that the perimeter is Pn = 4^n l after n iterations, whereas the end-to-end distance after n iterations is Ln = 3^n l. By taking logarithms, we find ln(P/l) = n ln 4 and ln(L/l) = n ln 3. Hence,

ln(P/l)/ln(L/l) = ln 4/ln 3, or P/l = (L/l)^{ds},


Figure 3.4:

with

ds = ln 4/ln 3 ≈ 1.26 > d − 1 = 1. (3.8)
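The scaling argument can be checked with a few lines of code. This sketch (a purely illustrative example of mine; standard-library Python) runs the perimeter recursion Pn = 4ⁿl, Ln = 3ⁿl and recovers ds = ln 4/ln 3 from the data:

```python
import math

l = 1.0  # length of the smallest straight element (arbitrary units)

# One Koch iteration replaces each segment by 4 segments of 1/3 the length,
# so the perimeter gains a factor 4 while the end-to-end distance gains 3.
for n in (5, 10, 20):
    P = 4**n * l  # perimeter after n iterations
    L = 3**n * l  # end-to-end distance after n iterations
    ds = math.log(P / l) / math.log(L / l)
    print(n, ds)  # ds = ln4/ln3 at every n

print(math.log(4) / math.log(3))  # 1.2618...
```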

3.2 Sierpinski Triangle

The second example that we discuss is the Sierpinski triangle. It can also be constructed recursively by following the steps outlined in Fig. 3.5.

Figure 3.5: Sierpinski triangle

The iteration used to construct this fractal is also shown in Fig. 3.6. Define the mass of the smallest filled triangle as m and its volume as v. Successive masses are given by Mn = 3^n m after n iterations, with a volume Vn = 4^n v. Since Vn = ½ Ln² and v = ½ l², we have Ln = 2^n l.

Figure 3.6:

Taking logarithms, we obtain ln(M/m) = n ln 3 and ln(L/l) = n ln 2. Taking the ratio,

ln(M/m)/ln(L/l) = ln 3/ln 2, or M/m = (L/l)^{df},


with

df = ln 3/ln 2 ≈ 1.585 < d = 2. (3.9)
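The mass scaling Mn = 3ⁿm can also be checked by direct counting. One standard trick (an illustrative sketch of my own, not from the notes) is that the cells (x, y) of a 2ⁿ × 2ⁿ grid with x & y == 0 form a discrete Sierpinski triangle, so their number is exactly 3ⁿ, and the fractal dimension follows as ln(M/m)/ln(L/l):

```python
import math

# Discrete Sierpinski triangle: on a 2^n x 2^n grid, the cells (x, y) whose
# coordinates share no set bit (x & y == 0) reproduce the construction; each
# bit position allows 3 of the 4 combinations, so the count is exactly 3^n.
for n in (4, 6, 8):
    L = 2**n
    mass = sum(1 for x in range(L) for y in range(L) if x & y == 0)
    df = math.log(mass) / math.log(L)
    print(n, mass, df)  # mass = 3**n, df = ln3/ln2

print(math.log(3) / math.log(2))  # 1.5849...
```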

3.3 Correlations in Self-Similar Objects

Figure 3.7:

The defining feature of a fractal is that it "looks the same" on different length scales: if you take the fractal in Fig. 3.7, select a small portion of it and magnify it, the magnified structure will be identical to the original fractal. This is the case irrespective of the size of the portion that is being magnified. Although this is true in this particular example, we will focus in what follows on a less restrictive definition: "statistical self similarity". In this case, configurations and their magnified counterparts are geometrically similar only in a statistical sense. More specifically, any parameter of the structure that is invariant under uniform magnification has to be, on average, independent of scale.

This statement can be quantified by introducing the correlation function C(r) of the density n(r) in Fig. 3.7,

C(r) = 〈∆n(r) ∆n(0)〉.

The average is, say, over all orientations in space for convenience. In the case of a physical system, the average would be the usual statistical average. If the structure is uniformly magnified by a factor λ: r → λr, then statistical self-similarity is defined as

C(λr) = λ^{−p} C(r), (3.10)

where the exponent p is a constant. Both the magnified C(λr) and the original C(r) are functionally the same, except for a constant scale factor λ^{−p}.

We first prove this relationship. The assumption of self similarity implies that C(r) ∝ C(r/b), or C(r) = f(b) C(r/b), with b = 1/λ. If we take the derivative with respect to b, and let r* = r/b, we find

∂C(r)/∂b = 0 = (df/db) C(r*) + f (dC/dr*)(dr*/db),

or, since dr*/db = −r/b² = −r*/b,

(df/db) C(r*) = f (r*/b)(dC/dr*).

Therefore,

d ln f(b)/d ln b = d ln C(r*)/d ln r*. (3.11)


Since the left-hand side only depends on b, and the right-hand side only on r*, they both must equal a constant,

d ln f(b)/d ln b = const. = −p, so that f(b) ∝ b^{−p}, (3.12)

and the recursion relation is

C(r) = b^{−p} C(r/b), (3.13)

up to a multiplicative constant.

Equation (3.13) is sufficient to determine the functional form of C(r). Since the magnification factor b is arbitrary, let us consider the special value

b = 1 + ε, ε ≪ 1. (3.14)

Then, by expanding Eq. (3.13) in a Taylor series, we have

C(r) = (1 + ε)^{−p} C(r/(1 + ε)) ≈ (1 − pε)[C(r) − εr ∂C/∂r] + . . .

At first order in ε we find

−p C(r) = r ∂C/∂r,

the solution of which is

C ∝ 1/r^p. (3.15)

In summary, self similarity implies that correlation functions are generalized homogeneous functions of their arguments,

C(r) = b^{−p} C(r/b),

a relation that immediately leads to a power-law dependence of the correlation function on distance,

C(r) ∝ 1/r^p. (3.16)

Usually, these relations apply in physical systems at large distances. At shorter distances (say, the scale of an elemental triangle in the case at hand) self-similarity breaks down, and with it the mathematical relations that we have derived. In addition, for systems of finite size, self similarity also breaks down at scales of the order of the size of the system.
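Both defining properties of the power law are trivial to verify numerically. This sketch (my own illustration; standard-library Python, with p and the amplitude as arbitrary choices) checks the homogeneity relation and extracts p as the log-log slope:

```python
import math

p = 1.3  # scaling exponent (arbitrary)
A = 2.0  # amplitude (arbitrary; drops out of both checks)

def C(r):
    """Model power-law correlation function C(r) = A / r^p."""
    return A / r**p

# Homogeneity, Eq. (3.10): C(lambda*r) = lambda^{-p} C(r) for any lambda and r.
lam, r = 3.7, 0.9
print(C(lam * r), lam**-p * C(r))  # equal

# The exponent is minus the slope on a log-log plot:
r1, r2 = 1.0, 100.0
slope = (math.log(C(r2)) - math.log(C(r1))) / (math.log(r2) - math.log(r1))
print(-slope)  # recovers p
```

The same slope-extraction idea is how p is measured from data in practice, where C(r) is only a power law over an intermediate range of r.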

Figure 3.8: Decay of correlations (∼ 1/r^p) in a self similar object

Note that the dependence of C(r) on r that we have just found is quite different from that which occurs in a system of many independent parts,

C(r) ∼ e^{−r/ξ}.


Figure 3.9: Decay of correlations (∼ e^{−r/ξ}) in a system of many independent parts

In particular, the relation

∫ d^d r C(r) = O(ξ^d)

in d dimensions that is satisfied for a system comprised of many independent parts is evidently not satisfied for a self similar system in which

C(r) ∼ 1/r^p

if d > p. In this case, the integral diverges (which formally implies a divergence in the variance of x):

∫ d^d r C(r) ∝ ∫_a^L r^{d−1} dr (1/r^p) ∼ L^{d−p}, (3.17)

where a is a microscopic length (irrelevant here), and L → ∞ is the size of the system.

In summary, fractals (self similar or rough objects) do not satisfy the central limit theorem, since the integral in Eq. (3.17) diverges. The study of rough surfaces is one of the focuses of the rest of the course.
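The divergence in Eq. (3.17) is easy to see numerically. The sketch below (my own; standard-library Python, with d, p, and the cutoff as arbitrary choices satisfying d > p) evaluates I(L) = ∫_a^L r^{d−1−p} dr and checks that doubling L multiplies it by ≈ 2^{d−p}, i.e., I(L) ∼ L^{d−p} with no finite limit:

```python
d, p, a = 3, 1, 0.1  # dimension, exponent, microscopic cutoff (arbitrary, d > p)

def I(L, steps=100_000):
    """Midpoint Riemann sum for int_a^L r^{d-1-p} dr (angular factors omitted)."""
    dr = (L - a) / steps
    return sum((a + (i + 0.5) * dr) ** (d - 1 - p) * dr for i in range(steps))

for L in (10.0, 20.0, 40.0):
    print(L, I(L), I(2 * L) / I(L))  # ratio near 2**(d-p) = 4: I(L) ~ L^{d-p}
```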


Chapter 4

Review of Statistical Mechanics and Fluctuations

4.1 Thermodynamics

The thermodynamic state of a system in equilibrium is completely specified by a finite set of extensive variables called thermodynamic variables. A class of variables is related to conservation laws: e.g., energy E and number of particles N. Others relate to the system's extent: the volume V. This set also needs to be complete, a question that is easy to answer a posteriori for any given system, but that is not trivial to address in newly discovered systems.

The thermodynamic description entails a huge reduction in the number of degrees of freedom needed to describe the state of the system in the so-called thermodynamic limit (N → ∞, V → ∞, such that the number density n = N/V is finite). Each classical particle, for example, has in three dimensions six degrees of freedom (position and momentum). So there are ∼ 10²³ microscopic degrees of freedom, yet the equilibrium state of a fluid needs only three variables.

The first law of thermodynamics states that the energy of a closed system is only a function ofthe thermodynamic variables in equilibrium. It is extensive, and a function of state: every time thesystem is in the same macroscopic state, it has the same internal energy.

If the system is not closed, and can exchange work with the environment, the change in internalenergy must equal the work exchanged, dE = −dW , where the work dW > 0 if done by the system.For a fluid system, mechanical work has the familiar form dW = pdV where p is the (uniform)pressure of the fluid, and dV the change in volume. This latter relation only applies to a quasi-staticprocess (very slow compared to microscopic relaxation times). If the process is not quasi static, thepressure in the fluid is not uniform. The work done still equals the change in energy, but one mustuse the equations of hydrodynamics to actually calculate it.

If the system is not closed, and can exchange more than work with the environment, then thefirst law adopts the form,

dQ = dE + dW,

where dQ is the heat exchanged between the system and the environment. By convention, dQ > 0 if it flows into the system. Note that whereas dE is the differential form of a function of state (E), neither dW nor dQ is. Both depend on the specific thermodynamic path of the process.

The second law of thermodynamics introduces a new function of state, the entropy S. It is an extensive function of the thermodynamic variables, so that for a monocomponent fluid S = S(E, N, V). The second law defines the state of thermodynamic equilibrium by saying that, given the macroscopic constraints on a closed system (i.e., E, V, N), the state of equilibrium is the one that maximizes the entropy over the manifold of constrained states.

In particular, if the state A of the system includes some internal constraint (e.g., no particles in its left bottom corner), and the resulting entropy of this state is S(A), then removing the internal constraint, thus leading to a new equilibrium state B, the fact that S needs to be maximized


immediately implies S(B) ≥ S(A); i.e., the entropy can only increase in spontaneous processes in a closed system.

The second law has immediate consequences for the state of equilibrium. Consider a closed system, and imagine it decomposed into two subsystems, as indicated in Fig. 4.1.

Figure 4.1: A system with two parts.

If the system in question is a fluid, then the three thermodynamic variables of the whole system (1+2), V, E, N, satisfy

V = V1 + V2,
E = E1 + E2,
N = N1 + N2.

We are now going to search for the equilibrium state by requiring that S be maximized over any changes in V1, E1, N1, i.e., over the way in which the three variables can be apportioned over the manifold of constrained states given by fixed (E, V, N).

Since S is a function of the thermodynamic variables and is extensive, we can write

dS = dS1 + dS2 = (∂S1/∂E1) dE1 + (∂S1/∂V1) dV1 + (∂S1/∂N1) dN1 + (∂S2/∂E2) dE2 + (∂S2/∂V2) dV2 + (∂S2/∂N2) dN2.

Given the constraints, we have

dV1 + dV2 = 0,
dE1 + dE2 = 0,
dN1 + dN2 = 0.

Therefore,

dS = [(∂S1/∂V1) − (∂S2/∂V2)] dV1 + [(∂S1/∂E1) − (∂S2/∂E2)] dE1 + [(∂S1/∂N1) − (∂S2/∂N2)] dN1.

Since E, V, and N are independent variables, their variations are also independent. The condition of equilibrium is that S be maximized over any variation of internal variables, or dS = 0 in this equation. Therefore, the conditions for equilibrium are

(∂S1/∂V1) = (∂S2/∂V2) = p/T uniform,

(∂S1/∂E1) = (∂S2/∂E2) = 1/T uniform,

(∂S1/∂N1) = (∂S2/∂N2) = −µ/T uniform.

In short, the condition of equilibrium leads to three intensive variables T, µ, p that must be spatially uniform in equilibrium. If T is not uniform, energy flows will occur. If p is not uniform, the relative volume of the two subsystems will change. If µ is not uniform, mass transport will take place. When the system regains equilibrium, all three intensive variables must be uniform.

Note that this derivation admits a straightforward generalization to more complex systems inwhich more than three thermodynamic variables are needed to specify the state of the system. In


this respect, given the number of necessary thermodynamic variables, the second law leads to as many intensive variables that must be uniform in equilibrium, each corresponding to the derivative of S with respect to one of the thermodynamic variables.

Further progress is achieved by the introduction of the so-called equations of state. These relate the values of the thermodynamic variables in the equilibrium state, and are system dependent. Most often they are determined experimentally. For example, the ideal gas equation of state is

p = p(V, T) = N kB T/V,

supplemented with the condition (∂E/∂V)_{N,T} = 0.

In general, equations of state are not known. Instead one obtains differential equations of state of the form

dp = (∂p/∂V)_{N,T} dV + (∂p/∂T)_{N,V} dT

and

dE = (∂E/∂V)_{N,T} dV + (∂E/∂T)_{N,V} dT.

Clearly, if we know the full (T, V) dependence of these derivatives, such as (∂E/∂T)_{N,V}, we know the equations of state. In fact, most work focuses on determining the derivatives. Some of these are so important they are given their own names:

Isothermal compressibility: kT = −(1/V)(∂V/∂p)_T
Adiabatic compressibility: kS = −(1/V)(∂V/∂p)_S
Volume expansivity: α = (1/V)(∂V/∂T)_p
Heat capacity at constant volume: CV = T(∂S/∂T)_V
Heat capacity at constant pressure: Cp = T(∂S/∂T)_p

Of course, kT = kT(T, V), etc.

These results rest on such firm foundations, and follow from such simple principles, that any microscopic theory must be consistent with them. Furthermore, a microscopic theory should be able to address the unanswered questions of thermodynamics, e.g., how to calculate kT or CV from first principles.
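For the ideal gas, the response functions above follow directly from p = NkBT/V. A short sketch (my own; standard-library Python, with units where NkB = 1 as an arbitrary choice) evaluates kT and α by finite differences of V(p, T) = NkBT/p and compares with the closed forms kT = 1/p and α = 1/T:

```python
NkB = 1.0  # work in units where N*k_B = 1 (arbitrary choice)

def V(p, T):
    """Ideal gas equation of state solved for the volume: V = N*kB*T/p."""
    return NkB * T / p

p0, T0, h = 2.0, 300.0, 1e-6  # state point and finite-difference step (arbitrary)

# kT = -(1/V)(dV/dp)_T by central difference in p:
kT = -(V(p0 + h, T0) - V(p0 - h, T0)) / (2 * h) / V(p0, T0)

# alpha = (1/V)(dV/dT)_p by central difference in T:
alpha = (V(p0, T0 + h) - V(p0, T0 - h)) / (2 * h) / V(p0, T0)

print(kT, 1 / p0)     # kT = 1/p for the ideal gas
print(alpha, 1 / T0)  # alpha = 1/T for the ideal gas
```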

We finally mention that these quantities can have very striking behavior. Consider the phase diagram of a pure substance (Fig. 4.2). The lines denote first-order phase transition lines. The dot denotes a continuous phase transition, often called a second-order transition. This particular point is known as the critical point. Imagine preparing a system at the pressure corresponding to the continuous transition, and increasing the temperature from zero through that transition (see Fig. 4.3). Figure 4.4 shows the expected behavior of kT and CV along that path. At the first-order transition, kT has a delta-function divergence while CV has a cusp. At the continuous transition, both CV and kT are known to have power-law divergences. We will spend a lot of time on continuous, so-called second-order transitions in this course.

4.2 Statistical Mechanics

The summary just given about thermodynamics has a parallel in statistical mechanics. In principle,we believe that the observable value of a macroscopic variable G is an average over the temporal


Figure 4.2: Solid, liquid, vapor phases of a pure substance

Figure 4.3: Path at constant pressure P = Pc, from T1 through T2

Figure 4.4: kT and CV along the path at P ≡ Pc


evolution of the system over the manifold in phase space that corresponds to the thermodynamic state of equilibrium. Schematically,

G_obs = (1/N) ∑_{i=1}^{N} G(ti),

where G(ti) is the value of G over a small interval around time ti. The starting point is rewriting this equation as

G_obs = ∑_ν ((1/N) × number of times the system is in state ν) G_ν.

The quantity in parentheses is the probability of occurrence of state ν while the system evolves in thermodynamic equilibrium, or p_ν. Therefore,

G_obs = 〈G〉 = ∑_ν p_ν G_ν.

The remarkable fact is that while the detailed trajectory over phase space is impossible to compute, the probability p_ν is known to great accuracy without ever solving the equations of motion (provided that G is a macroscopic variable). Several cases need to be distinguished.

4.2.1 Closed system

If Ω(N, V, E) is the total number of microscopic states for a closed system of fixed N, V, E, then

p_ν = 1/Ω(N, V, E).

All states of the same energy occur with equal probability.

The connection with thermodynamics is through Boltzmann's formula,

S(N, V, E) = kB ln Ω(N, V, E). (4.1)

With this definition the entropy is extensive: if the system is decomposed into two subsystems, then Ω = Ω1 Ω2, and hence S = S1 + S2.

Spontaneous processes lead to entropy increases as well. Removing an internal constraint in an otherwise closed system necessarily increases the number of accessible microscopic states Ω, and hence the entropy S.

4.2.2 System at constant temperature

In order to maintain a system at constant temperature it must be kept in equilibrium with a large heat bath with which it can exchange energy. When the combined system plus heat bath reaches equilibrium (the whole is a closed system), the temperature of both must be the same.

Consider now a state ν of the system, of energy E_ν. The probability of such a state must be proportional to the number of states of the heat bath that are compatible with the system having energy E_ν. If the combined energy of the system plus heat bath is E, then

p_ν ∝ Ω(E_bath) = Ω(E − E_ν).

We also know that E ≫ E_ν since we are considering a large heat bath. We now expand,

p_ν ∝ exp[ln Ω(E − E_ν)] = exp[ln Ω(E) − (∂ ln Ω/∂E) E_ν + . . .].

Recalling thermodynamics, β = 1/kBT = ∂ ln Ω/∂E. Therefore, after appropriate normalization of p_ν by Z, the partition function, we find

p_ν = (1/Z) e^{−βE_ν}. (4.2)

Therefore, the states of an equilibrium system held at constant temperature are quite simply exponentially distributed according to their energy.
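Eq. (4.2) is easy to play with for a small discrete spectrum. The sketch below (my own illustration; standard-library Python, with a three-level spectrum and temperatures chosen arbitrarily) normalizes pν = e^{−βEν}/Z and shows the low-temperature collapse onto the ground state:

```python
import math

energies = [0.0, 1.0, 2.5]  # an arbitrary three-level spectrum (units of k_B)

def boltzmann(beta):
    """Normalized probabilities p_nu = exp(-beta*E_nu)/Z of Eq. (4.2)."""
    weights = [math.exp(-beta * E) for E in energies]
    Z = sum(weights)  # the partition function
    return [w / Z for w in weights]

for beta in (0.1, 1.0, 10.0):
    print(beta, boltzmann(beta))
# Small beta (high T): the three states are nearly equally likely.
# Large beta (low T): the ground state dominates.
```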

4.2.3 Other ensembles

Imagine now a system that can exchange both E and X with a bath. The first law of thermodynamics in this case is

T dS = dE + (kBTξ) dX, (4.3)

so that

(T ∂S/∂X)_E = kBTξ,

i.e., kBξ is the variable conjugate to X.

Following the same argument given above we now have

p_ν ∝ Ω(E − E_ν, X − X_ν).

Expanding in a Taylor series, we have

p_ν ∝ exp[ln Ω(E − E_ν, X − X_ν)] = exp[ln Ω(E, X) − (∂ ln Ω/∂E) E_ν − (∂ ln Ω/∂X) X_ν + . . .].

Therefore, given that S = kB ln Ω for the combined closed system, and introducing the partition function Z as normalization, we find

p_ν = (1/Z) e^{−βE_ν − ξX_ν}, with Z(β, ξ) = ∑_ν e^{−βE_ν − ξX_ν}. (4.4)

A particular example is the grand canonical distribution, in which X = N, and hence ξ = −µ/kBT, and

Z(β, µ) = ∑_ν e^{−βE_ν + βµN_ν}. (4.5)
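As a minimal illustration of Eq. (4.5) (an example of my own, not from the notes): a single site that can hold N = 0 or 1 particle with energy εN has Z = 1 + e^{−β(ε−µ)}, so the mean occupation is 〈N〉 = 1/(e^{β(ε−µ)} + 1):

```python
import math

def occupation(beta, eps, mu):
    """Mean occupation <N> of a single site that holds N = 0 or 1 particle.

    The two states have (E, N) = (0, 0) and (eps, 1), so Eq. (4.5) gives
    Z = 1 + exp(-beta*(eps - mu)) and <N> = 1/(exp(beta*(eps - mu)) + 1).
    """
    w1 = math.exp(-beta * (eps - mu))  # weight of the occupied state
    return w1 / (1.0 + w1)

beta, eps = 2.0, 1.0  # arbitrary inverse temperature and site energy
for mu in (0.0, 1.0, 2.0):
    print(mu, occupation(beta, eps, mu))
# <N> = 1/2 exactly at mu = eps; <N> -> 1 for mu >> eps and -> 0 for mu << eps.
```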

4.3 Fluctuations

Landau and Lifshitz are credited with having organized the foundations of the theory of fluctuations,although they attribute the theory to Einstein in their book on statistical physics. We follow theirtreatment.

Consider the density x of a thermodynamic variable X in a closed system. Let 〈x〉 be the equilibrium value, with S(〈x〉) its entropy. The value of 〈x〉 is such that it maximizes the entropy or, equivalently, corresponds to the largest number of microscopic states Ω(〈x〉) given the macroscopic constraints on the entire system. Imagine that the local density fluctuates to a different value x, which itself can correspond to Ω(x) different microscopic states that would yield the same local value of x. Obviously, Ω(x) < Ω(〈x〉), as 〈x〉 is the equilibrium value.

The key insight is to use Boltzmann's formula to assign the probability of occurrence of the fluctuation (while the system remains in equilibrium). First, the probability of occurrence of x is defined as

p(x) = Ω(x)/Ω(〈x〉),

since the system is closed, and hence, by using Boltzmann's formula,

p(x) = e^{(S(x) − S(〈x〉))/kB}. (4.6)


x〈x〉

S

Figure 4.5:

This situation is schematically shown in Fig. 4.5.

Near the maximum of S we have

S(x) = S(〈x〉) + (∂S/∂x)|_{x=〈x〉} (x − 〈x〉) + (1/2)(∂²S/∂x²)|_{x=〈x〉} (x − 〈x〉)² + . . .

But (∂S/∂x)_{x=〈x〉} = 0 as a condition for equilibrium, and ∂²S/∂x² is negative since S is a maximum. Hence,

(S(x) − S(〈x〉))/kB = −(1/2σ²)(x − 〈x〉)²,

where we have defined

σ² = −kB/(∂²S/∂x²)_{x=〈x〉}.

The negative sign is introduced anticipating that the second derivative is negative. Therefore the probability of the fluctuation is

p(x) ∝ e^{−(x − 〈x〉)²/2σ²},

i.e. a Gaussian distribution. Since x is a local density (not microscopic) the distribution is sharply peaked at x = 〈x〉, and one normally allows integrals over p(x) to run from x = −∞ to x = ∞. Then

〈(x − 〈x〉)²〉 = σ²,

i.e.,

〈(x − 〈x〉)²〉 = −kB/(∂²S/∂x²)_{x=〈x〉}.

This is called a fluctuation-dissipation relation of the first kind, and is a central result of fluctuation theory. It relates the variance of fluctuations of x to thermodynamic derivatives taken in equilibrium. They are also called correlation-response relations, as the second derivative of the entropy generically gives rise to response functions (heat capacities, compressibilities, etc.). In this way, e.g., the variance of fluctuations in energy is proportional to the equilibrium heat capacity.
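A quick numerical check of this relation (the specific entropy function and parameter values below are hypothetical, chosen only for illustration): average x² over p(x) ∝ exp[S(x)/kB] for an entropy with a small quartic correction, and compare with the Gaussian prediction σ² = −kB/(∂²S/∂x²):

```python
import math

# Entropy with a maximum at x = 0: S(x)/kB = -x^2/(2*s2) + c4*x^4, c4 small,
# so the fluctuation relation predicts a variance of approximately s2.
def variance(s2, c4, xmax=0.5, n=200001):
    """Numerically average x^2 over p(x) ∝ exp[S(x)/kB] (rectangle rule)."""
    dx = 2 * xmax / (n - 1)
    num = den = 0.0
    for i in range(n):
        x = -xmax + i * dx
        w = math.exp(-x * x / (2 * s2) + c4 * x**4)
        num += w * x * x * dx
        den += w * dx
    return num / den

v = variance(s2=1e-3, c4=0.1)
print(abs(v - 1e-3) / 1e-3 < 0.01)  # → True: Gaussian prediction holds to ~1%
```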

Note that since x is intensive we have

〈(∆x)²〉 ∼ O(1/N) = O(ξ³/L³).

Therefore

−kB/(∂²S/∂〈x〉²) = O(ξ³/L³),

so that thermodynamic second derivatives are determined by microscopic correlations. This is why we need statistical mechanics to calculate thermodynamic second derivatives (or response functions). We need more degrees of freedom than simply x to do anything all that interesting.


Figure 4.6: Simple pendulum of mass m, length l, displaced an angle θ.

4.3.1 Simple mechanical example

Consider the simple pendulum shown in Fig. 4.6. We are going to study fluctuations in its height x, induced presumably by random collisions with the air molecules surrounding it. The amount of work done by the thermal background to raise the pendulum a height x is mgx. Hence W = −mgx = −mglθ²/2 for small θ is the work done by the system. The quantity g is the gravitational acceleration, and we have used the fact that x = l − l cos θ ≈ lθ²/2.

We assume next that, the pendulum being a mechanical system, nothing has been done to its internal degrees of freedom so as to change its internal energy. Therefore the work done has to equal T times the change in entropy,

∆S = −mglθ²/2T.

Since 〈θ〉 = 0 is the equilibrium state, we have

〈θ²〉 = kBT/mgl.

This is a very small fluctuation for a pendulum of macroscopic mass m. However, if the mass is sufficiently small, fluctuations are appreciable. This is the origin, for example, of fluctuations of small particles immersed in water, called Brownian motion, as studied by Einstein.
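Plugging in numbers makes the contrast explicit (the masses and lengths below are merely representative, not from the notes):

```python
import math

kB = 1.380649e-23  # Boltzmann constant, J/K

def theta_rms(m, l, T=300.0, g=9.81):
    """Root-mean-square angular fluctuation, <theta^2>^{1/2} = sqrt(kB*T/(m*g*l))."""
    return math.sqrt(kB * T / (m * g * l))

# Macroscopic pendulum: fluctuations are utterly negligible...
print(theta_rms(m=0.1, l=1.0))      # ~6e-11 rad
# ...but at Brownian scales (micron-sized mass and length) they are appreciable.
print(theta_rms(m=1e-15, l=1e-6))   # ~0.6 rad
```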

Figure 4.7: Time to fall as a function of temperature T (schematic).


There are many other similar situations described in the book by Landau and Lifshitz. A particularly intriguing one, not given in Landau's book, is the problem of a pencil balanced on its tip. It is interesting to estimate how long it takes to fall as a function of temperature (at zero temperature, one needs to use the uncertainty principle).

4.4 Thermodynamic Fluctuations

We now review the classical treatment of thermodynamic fluctuations. Imagine we have two subsystems making up a larger system. We imagine a fluctuation in which the two subsystems exchange energy and volume. Because of the changes in energy and volume of each subsystem, there is also a change in entropy in both, as well as a change in the total entropy of the system as a consequence of the fluctuation. Since the entropy is extensive, we write the entropy change due to the fluctuation

Figure 4.8: A system ("1") in contact with a reservoir ("2"); N fixed, common T and P.

as,

∆S = ∆S1 + ∆S2 = ∆S1 + (dE2 + p dV2)/T,

where the last equality is the second law for the reservoir.

When the reservoir increases its energy, system 1 decreases it by the same amount. The same is true of the volume. Because 1 and 2 are in thermal equilibrium, they both have the same pressure p and temperature T. Therefore we have

∆S = ∆S1 − ∆E1/T − (p/T)∆V1.

According to Eq. (4.6), the probability that such a fluctuation happens spontaneously is e^{∆S/kB}, or

p ∝ e^{−(∆E − T∆S + p∆V)/kBT} (4.7)

where we have dropped the subscript "1" when referring to the system. In the case of the monocomponent fluid that we are considering, further reduction to only two independent variables is possible. Consider,

∆E = ∆E(S, V) = (∂E/∂S)∆S + (∂E/∂V)∆V + (1/2)[(∂²E/∂S²)(∆S)² + 2(∂²E/∂S∂V)∆S∆V + (∂²E/∂V²)(∆V)²]

to second order in the fluctuation. Rearranging we have,

∆E = (∂E/∂S)∆S + (∂E/∂V)∆V + (1/2)[∆(∂E/∂S)∆S + ∆(∂E/∂V)∆V].


Recall that (∂E/∂S)_V = T and (∂E/∂V)_S = −p. We then obtain,

∆E = T∆S − p∆V + (1/2)(∆T∆S − ∆p∆V).

Note the similarity with the differential form of the second law of thermodynamics. The extra term in brackets is of second order in the variations, and hence disappears from the differential form. However, it gives the lowest order correction to thermodynamics when considering fluctuations that are finite. Substituting back into Eq. (4.7) we obtain,

p ∝ e^{−(∆T∆S − ∆p∆V)/2kBT}

in which only second order terms in ∆ remain. Further manipulation is possible by considering,

∆S = (∂S/∂T)_V ∆T + (∂S/∂V)_T ∆V = (CV/T)∆T + (∂p/∂T)_V ∆V

where we have used the definition of CV and a Maxwell relation. Likewise

∆p = (∂p/∂T)_V ∆T + (∂p/∂V)_T ∆V = (∂p/∂T)_V ∆T − (1/κTV)∆V,

where κT is the isothermal compressibility. Putting these expressions together and simplifying yields

p(∆T, ∆V) ∝ exp[−(CV/2kBT²)(∆T)² − (1/2kBTκTV)(∆V)²].

As expected, the probability of the fluctuations is Gaussian. Furthermore, the probabilities of fluctuations in T and V are independent since

p(∆T, ∆V) = pT(∆T) · pV(∆V).

By inspection we have,

〈(∆T)²〉 = kBT²/CV,  〈(∆V)²〉 = kBTκTV,  〈∆T∆V〉 = 0. (4.8)

The choice of T and V as independent variables is obviously arbitrary. Many other combinations are possible. For example, if we had chosen ∆S and ∆p as our independent variables, we would have obtained

p(∆S, ∆p) ∝ exp[−(1/2kBCp)(∆S)² − (κSV/2kBT)(∆p)²]

so that

〈(∆S)²〉 = kBCp,  〈(∆p)²〉 = kBT/κSV,  〈∆S∆p〉 = 0. (4.9)

It is also possible to obtain higher order moments from the Gaussian distributions. So, for example,

〈(∆S)⁴〉 = 3〈(∆S)²〉² = 3(kBCp)².
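The Gaussian moment relation 〈x⁴〉 = 3〈x²〉² used here is easy to confirm by sampling:

```python
import random
import statistics

# Monte Carlo check of the Gaussian moment relation <x^4> = 3 <x^2>^2.
random.seed(1)
xs = [random.gauss(0.0, 2.0) for _ in range(200_000)]
m2 = statistics.fmean(x * x for x in xs)
m4 = statistics.fmean(x**4 for x in xs)
print(abs(m4 / (3 * m2 * m2) - 1.0) < 0.05)  # → True (within sampling error)
```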

Note from all the relations derived that

〈(∆x)²〉 ∼ O(1/N)

for intensive variables, and

〈(∆X)²〉 ∼ O(N)

for extensive variables. This agrees with our general treatment of systems comprised of many independent parts, and is a general feature of thermodynamic fluctuations in any physical system. Also note the explicit relationship between fluctuations (〈(∆T)²〉), thermodynamic derivatives (CV = T(∂S/∂T)_V), and microscopic correlations (1/N = ξ³/L³).


4.4.1 Correlation functions and response functions

Consider the result

〈(∆V)²〉 = kBTκTV

and rewrite it in terms of the number density n = N/V for fixed N. Since N is fixed, ∆n = ∆(N/V) = −(N/V²)∆V. Hence

〈(∆V)²〉 = (V⁴/N²)〈(∆n)²〉,

so

〈(∆n)²〉 = (N²/V⁴) kBTκTV = n²kBTκT/V.

We have on the other hand that ∆n can be written as

∆n = (1/V) ∫ d~r ∆n(~r).

Hence

〈(∆n)²〉 = (1/V²) ∫ d~r ∫ d~r′ 〈∆n(~r)∆n(~r′)〉.

If translational invariance of space is assumed, we can carry out one of the integrals,

〈(∆n)²〉 = (1/V) ∫ d~r C(r).

But 〈(∆n)²〉 = n²kBTκT/V, so we obtain a so-called thermodynamic "sum rule":

∫ d~r C(r) = ∫ d~r 〈∆n(r)∆n(0)〉 = n²kBTκT.

The spatial integral of the correlation function equals a response function. This relationship can be rewritten in Fourier space. If

C(k) = ∫ d~r e^{i~k·~r} C(~r),

we then have

lim_{k→0} C(k) = n²kBTκT. (4.10)

The quantity C(k) is called the structure factor. It is directly observable by X-ray, neutron, or light scattering, as we will see later. For a liquid, one typically obtains a function of the type shown in Fig. 4.9. The point at k = 0 is determined by the result Eq. (4.10).
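A sketch of the k → 0 limit in the simplest case (an ideal gas, my own example): there κT = 1/(nkBT), so the sum rule gives lim C(k) = n, i.e. number fluctuations in an open subvolume are Poissonian, 〈(∆N)²〉 = 〈N〉. This can be checked directly:

```python
import math
import random
import statistics

# For an ideal gas kappa_T = 1/(n*kB*T), so Eq. (4.10) gives
# lim_{k->0} C(k) = n^2 kB T kappa_T = n: subvolume number counts are Poissonian.
def poisson(lam):
    """Knuth's Poisson sampler (adequate for moderate lam)."""
    thresh, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= thresh:
            return k
        k += 1

random.seed(3)
counts = [poisson(100.0) for _ in range(5000)]
mean = statistics.fmean(counts)
var = statistics.pvariance(counts)
print(abs(var / mean - 1.0) < 0.1)  # → True: <(dN)^2>/<N> = 1
```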

Figure 4.9: Typical structure factor C(k) for a liquid; the peaks correspond to short scale order.


Chapter 5

Fluctuations of Surfaces

A striking fact is that, although most natural systems are inhomogeneous, with complicated form and structure, thermodynamic studies often focus on uniform systems.

As an example, consider a monocomponent system which can be in the homogeneous phases of solid, liquid, or vapor. If the system is at constant temperature and pressure, the thermodynamic description is based on the Gibbs free energies gs(T, p), gl(T, p), or gv(T, p). In equilibrium, the state of the system is that which minimizes the Gibbs free energy at each point (T, p). However, there is a possibility for inhomogeneous configurations along lines and points:

gs(T, p) = gl(T, p),

gives the solidification line

gl(T, p) = gv(T, p),

gives the vaporization line, while

gs(T, p) = gl(T, p) = gv(T, p)

gives the triple point of solid–liquid–vapor coexistence.

Figure 5.1: Phase diagram in the p–T plane, with solid (s), liquid (l), and vapor (v) regions.

These inhomogeneous states along the lines in the phase diagram are a set of measure zero relative to the area taken up by the p-T plane.

Much modern work in condensed matter physics concerns the study of inhomogeneous configurations. To understand their basic properties, we shall consider the simplest case, namely two equilibrium coexisting phases separated by a surface.

As everyone knows, finite volume droplets are spherical, so that very large droplets are locally planar. We will consider first the case of a planar surface separating two coexisting phases. The phenomenology to be discussed is the same for liquid-vapor systems, oil-water systems, magnetic domain walls, as well as many other systems in nature. The reason for this is simple thermodynamics: A system with a surface has an extra free energy that is proportional to the area of that


Figure 5.2: Phase diagram (left) and cartoon of the system (right): liquid and vapor coexist along the l–v line, which ends at the critical point (Tc, Pc).

surface. Using the Helmholtz free energy for a system at constant temperature and volume, we can write,

F = FBulk + σ ∫ d~S, (5.1)

in which the integral is the total surface area, and

where the positive constant σ is called the surface tension. Note that this extra free energy

Figure 5.3: Coordinate system for the interface: ~x lies in the surface plane, y is normal to it; the lateral dimension of the system is L.

due to the existence of a surface is minimized by minimizing the surface area. If constraints allow it, the surface will be planar. Note also that this surface energy is negligible in the thermodynamic limit. The bulk free energy grows as the volume of the system, whereas the surface excess energy only grows as the surface area. Hence, the second is negligible in a large system. Nevertheless, when fluctuations are allowed, fluctuations in the surface are important, and they will in fact be the subject matter of this chapter.

A coordinate system is introduced as sketched in Fig. (5.3). The coordinate y is perpendicular to the plane of the surface, while ~x is a (d−1)-dimensional vector parallel to it (in d spatial dimensions). Furthermore, the lateral dimension of the system is L, so its total volume is V = L^d, while the total surface area is A = L^{d−1}.

We shall limit our investigations to long length scale phenomena, namely surface configurations with characteristic wavelengths much larger than molecular dimensions,

r > a = a few Angstroms,

where a is an approximate molecular size,

a ≈ 5 Å. (5.2)

Because of this constraint, there is the associated upper limit in wavenumbers,

k ≤ Λ = 2π/a.

A remarkable observation in statistical mechanics is that many microscopic systems can give rise to the same behavior on the long length scales just described. A familiar example is a Newtonian


fluid: all such fluids behave in the same way irrespective of their molecular composition. In fact, it is commonplace in statistical mechanics to go much further than this, and simply introduce new microscopic models that, although seemingly unrelated to the microscopic system in question, share the same long wavelength properties.

5.1 Lattice models of fluctuating surfaces

As a first example of a simple microscopic model of a fluctuating surface, let us consider the model schematically shown in Fig. (5.4). We have discretized the ~x and y axes. Instead of

Figure 5.4: Physical surface (left) and its discretized microscopic abstraction (right).

having a continuous variable where −L/2 ≤ x ≤ L/2 and −L/2 ≤ y ≤ L/2 (in d = 2), we have discrete elements of surface at i = −L/2, . . . , −3, −2, −1, 0, 1, 2, 3, . . . , L/2, where L could be measured in units of a, the small length scale cutoff. We can likewise discretize the y direction as j = −L/2, . . . , −3, −2, −1, 0, 1, 2, 3, . . . , L/2. Actually, for greater convenience, let the index along the x axis vary as

i = 1, 2, 3, . . . , L. (5.3)

All the possible states of the system form a discrete set that can easily be enumerated. Any configuration of this fluctuating surface corresponds to a set of heights {hi}, where hi is an integer, −L/2 ≤ hi ≤ L/2, as shown in Fig. 5.5. Statistical mechanical sums over states ν can then be written as

Figure 5.5: A configuration of the discrete surface specified by the heights hi; here h1 = 0, h2 = 1, h3 = 0, h4 = 1, h5 = 4.

∑_ν = ∑_{hi} = ∏_{i=1}^{L} ∑_{hi=−L/2}^{L/2} (5.4)

This equation deserves some thought; its meaning is further illustrated in Fig. 5.5.

This model represents a striking simplification relative to a true microscopic description of a liquid-vapor interface, but as L → ∞, it becomes perfectly accurate.


Second, we need to introduce the energy of each state of the system, Eν = E{hi}, in a way that is consistent with the free energy ∆F = σS of the physical interface. Physically, we want to discourage surfaces with too much area, since the free energy is minimized by a flat surface. We

Figure 5.6: Configurations with excess area are discouraged by the energy of the state; flat configurations are encouraged.

therefore encourage local planarity by defining the excess energy of a surface as

Eν = J ∑_{i=1}^{L} (hi − hi+1)² (5.5)

where J is a positive constant. This particular choice is called the discrete Gaussian solid-on-solid model. With the choice of energy Eq. (5.5), the partition function is

Z = ∏_i ∑_{hi} e^{−(J/kBT) ∑_{i′=1}^{L} (hi′ − hi′+1)²} (5.6)

This is for d = 2, but the generalization to d = 3 is easy. Determination of this partition function produces the equilibrium macroscopic properties of the coexisting system, with due allowance for all the possible configurations of the fluctuating surface that separates the two bulk phases. Note that this expression does not contain the bulk free energy of the phases, as we have focused here on the surface part.

Although it is straightforward to solve for Z numerically, it is not possible to do so analytically.It turns out to be easier to solve for the continuum version of Eq. 5.6 which we shall consider next.
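For instance, Z in Eq. (5.6) can be enumerated exactly for a tiny system (the periodic boundary condition and the truncation of the height sum below are my own choices, made only so the enumeration is finite):

```python
import math
from itertools import product

def sos_Z(L, hmax, J_over_kT):
    """Exact enumeration of the discrete Gaussian solid-on-solid partition
    function, Eq. (5.6), with periodic boundaries and heights |h_i| <= hmax.
    Feasible only for tiny systems; hmax truncates the height sum."""
    Z = 0.0
    for h in product(range(-hmax, hmax + 1), repeat=L):
        E = sum((h[i] - h[(i + 1) % L]) ** 2 for i in range(L))
        Z += math.exp(-J_over_kT * E)
    return Z

# At J/kBT -> infinity only the flat configurations survive:
print(round(sos_Z(4, 1, 50.0)))  # → 3  (h = all -1, all 0, or all +1)
```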

5.2 A continuum model of fluctuating surface

Let h(~x) be the height of the interface at any point ~x, as shown in Fig. (5.7). Note that this description

Figure 5.7: The interface is described by a single valued function y = h(~x).

is not completely general from the outset, as it does not allow for overhangs or bubbles (Fig. 5.8). Neither one can be described with a single valued function y = h(~x). Of course, although it was not mentioned above, the "lattice model" (Fig. 5.5) also could not have overhangs nor bubbles.

We now have a continuum of states ν, each corresponding to a different configuration of the surface h(~x). Without being mathematically precise, let us mention that the continuum analogue of


Figure 5.8: An overhang and a bubble; both give a multivalued h(~x).

Eq. (5.4) is

∑_ν = ∑_{h(~x)} = ∫ ∏_~x dh(~x) = ∫ Dh(·) (5.7)

where the last step defines a functional integral. Things are a little subtle because ~x is a continuous variable, and hence the integration is over all possible configurations (or fields) h(~x). In the sections that follow, we will discretize functional integrals before we actually compute them, so we do not need to go into the details of functional calculus here. Also note that whereas the discrete sum over states is dimensionless, ∏_~x ∫ dh(~x) has dimensions of length raised to an infinite power. This constant factor of dimensionality does not affect the results, and we shall ignore it.

We next need to specify the energy of each state ν, or configuration. Since now each state is labeled by a function h(~x), the energy becomes a functional of h(~x). For the same reasons as outlined in the discrete model, we choose to assign each configuration an energy which is proportional to its surface area, E{h(~x)} = σS, where S is the (d−1) dimensional surface area for a system in d dimensions, and σ is the (constant) interfacial tension.

In order to compute the surface area S of an arbitrary configuration h(~x), we first focus on the simple case of d = 2 (Fig. 5.9). Each surface element (perimeter in d = 2) is

Figure 5.9: A line element of length √((dh)² + (dx)²) in d = 2.

dS = √((dh)² + (dx)²) = dx √(1 + (dh/dx)²) (5.8)

In d dimensions one has

d^{d−1}S = d^{d−1}x √(1 + (∂h/∂~x)²) (5.9)

so that the total surface area is

S = ∫ d^{d−1}x √(1 + (∂h/∂~x)²)


so that

E{h(~x)} = σ ∫ d^{d−1}~x √(1 + (∂h/∂~x)²) (5.10)

It turns out that this model is not yet analytically tractable. One introduces the further restriction that

(∂h/∂~x)² ≪ 1 (5.11)

that is, that the surface fluctuations considered are only those of small amplitude. Expanding Eq. (5.10) in power series gives

E{h(~r)} ≃ σL^{d−1} + (σ/2) ∫ d^{d−1}~x (∂h/∂~x)² (5.12)

where the second term is the extra surface energy due to the fluctuation.

Finally the partition function of this model is

Z = ∑_ν e^{−Eν/kBT} = e^{−σL^{d−1}/kBT} ∫ (∏_~x dh(~x)) exp[−(σ/2kBT) ∫ d^{d−1}~x (∂h/∂~x)²] (5.13)

which is an infinite dimensional functional integral. Although it looks quite formidable, it is straightforward to evaluate. In particle physics a similar integral appears in the so-called free-field theory.

5.2.1 Review of Fourier series

We will compute the functional integrals by Fourier transformation. We first review a few basic expressions related to Fourier transforms. It is convenient to work with both continuous and discrete Fourier transforms. They are defined as

h(~x) = ∫ (d^{d−1}~k/(2π)^{d−1}) e^{i~k·~x} h(~k)  (continuous transform)

h(~x) = (1/L^{d−1}) ∑_{~k} e^{i~k·~x} h~k  (discrete series), (5.14)

where

h(~k) = ∫ d^{d−1}~x e^{−i~k·~x} h(~x) = h~k. (5.15)

For the particular case of the Dirac delta function, we have

δ(~x) = ∫ (d^{d−1}~k/(2π)^{d−1}) e^{i~k·~x}. (5.16)

Consider now three dimensional space and a function F = F(~k) of the wavevector ~k. Given a system of lateral dimension L, the corresponding discretization of wavenumber is ∆k = 2π/L. Hence, the element of volume in Fourier space (the inverse of the density of states) is ∆~k = (2π)³/L³. In the limit L → ∞ we have ∆k → 0 and the result,

lim_{L→∞} (1/L³) ∑_{~k} F(~k) = ∫ (d~k/(2π)³) F(~k). (5.17)

In the particular case that F is the Kronecker delta, F(~k) = δ_{~k,~k0}, we obtain

δ_{~k,~k0} → ((2π)³/L³) δ(~k − ~k0). (5.18)


We begin by using these results to simplify the form of Eh(~x) in Eq. (5.12). Consider

∫ d^{d−1}~x (∂h/∂~x)² = ∫ d~x (∂/∂~x ∫ (d~k/(2π)^{d−1}) e^{i~k·~x} h(~k))²

= ∫ d~x ∫ (d~k/(2π)^{d−1}) ∫ (d~k′/(2π)^{d−1}) (−~k·~k′) h(~k)h(~k′) e^{i(~k+~k′)·~x}

= ∫ (d~k/(2π)^{d−1}) ∫ (d~k′/(2π)^{d−1}) (−~k·~k′) h(~k)h(~k′) [∫ d~x e^{i(~k+~k′)·~x}]

= ∫ (d~k/(2π)^{d−1}) k² h(~k)h(−~k)

after doing the integral over ~k′. But h(~x) is real, hence h*(~k) = h(−~k) (from Eq. (5.15)), and we have

∫ d^{d−1}~x (∂h/∂~x)² = ∫ (d^{d−1}~k/(2π)^{d−1}) k²|h(~k)|²  (continuum case)

or

∫ d^{d−1}x (∂h/∂~x)² = (1/L^{d−1}) ∑_{~k} k²|h~k|²  (discrete case) (5.19)

Hence the modes h~k are uncoupled in Fourier space: the energy is a direct sum over ~k where all the terms in the sum are independent of each other. Therefore we have for the energy of each configuration

E({h~k}) = σL^{d−1} + (σ/2L^{d−1}) ∑_{~k} k²|h~k|², (5.20)

where the configuration is expressed in terms of the Fourier coefficients h~k. In order to compute the partition function we need to replace the sum over all configurations h(~x) by a sum over the Fourier modes h~k. This is a little subtle because h(~x) is real while h~k is complex, and we also have the relationship h*~k = h−~k. This is standard in Fourier transform theory. Briefly,

∑_ν = ∫ ∏_~x dh(~x) = ∫ ∏′_{~k} d²h~k (5.21)

where d²h~k = dℜ(h~k) dℑ(h~k), that is, the product of the differentials of the real and imaginary parts, and ∏′_{~k} means a restriction to only those modes that are independent; i.e., not related through h*~k = h−~k. In two spatial dimensions, the sum extends over one half of the ~k plane, as shown in Fig. 5.10.

Figure 5.10: The restricted product ∏′_{~k} extends over one half of the (kx, ky) plane, integrating over ky > 0 only, for example.


5.2.2 Height-height correlation function

We will not attempt yet the calculation of the full partition function. As will be seen later, our approximation leads to Gaussian distributions and hence it is sufficient to compute second moments of the variable h. We address in this section the calculation of the correlation function

〈h(~x)h(~x′)〉,

the correlation between the surface deflection at points ~x and ~x′. If we assume translational invariance in the plane of the interface, and for convenience we choose

Figure 5.11: Two points ~x1 and ~x2, separated by |~x1 − ~x2|, in the ~x plane.

〈h〉 = 0, (5.22)

our only task is to evaluate the correlation function

G(~x) = 〈h(~x)h(0)〉 = 〈h(~x + ~x′)h(~x′)〉 (5.23)

We first write,

〈h(~k)h(~k′)〉 = ∫ d^{d−1}~x ∫ d^{d−1}~x′ e^{−i~k·~x − i~k′·~x′} 〈h(~x)h(~x′)〉 = ∫ d~x ∫ d~x′ e^{−i~k·~x − i~k′·~x′} G(~x − ~x′).

Because of translational invariance, one of the integrals can be eliminated by letting,

~y = ~x − ~x′,  ~y′ = (1/2)(~x + ~x′),  so that  ~x = ~y′ + ~y/2,  ~x′ = ~y′ − ~y/2.

The Jacobian of this transformation is one: d~x d~x′ = d~y d~y′, and

〈h(~k)h(~k′)〉 = ∫ d~y e^{−i(~k−~k′)·~y/2} G(y) [∫ d~y′ e^{−i(~k+~k′)·~y′}].

The last integral is a delta function,

〈h(~k)h(~k′)〉 = ∫ d~y e^{−i~k·~y} G(y) (2π)^{d−1} δ(~k + ~k′). (5.24)

Carrying out the integral over ~y to define the Fourier transform of G, we obtain,

〈h(~k)h(~k′)〉 = G(~k) (2π)^{d−1} δ(~k + ~k′). (5.25)

In short, the function 〈h(~k)h(~k′)〉 is related to the Fourier transform of the correlation function G(~x) that we are trying to compute. Instead of calculating the function G(~x) directly, we proceed now to calculate the function 〈h(~k)h(~k′)〉. And in order to do so, we turn to a discrete representation, so that

〈h(~k)h(~k′)〉 = G~k L^{d−1} δ_{~k+~k′,0}. (5.26)


Because of the delta function, all terms are zero except

〈h(~k)h(−~k)〉 = 〈|h~k|²〉.

It is this latter second moment of h~k that we now proceed to calculate. By definition of statistical average, we have (in the discrete representation)

〈|h~k|²〉 = [∏′_{~k′} ∫ d²h_{~k′} |h~k|² e^{−(σ/2kBTL^{d−1}) ∑_{~k′′} k′′²|h_{~k′′}|²}] / [∏′_{~k′} ∫ d²h_{~k′} e^{−(σ/2kBTL^{d−1}) ∑_{~k′′} k′′²|h_{~k′′}|²}] (5.27)

Note that we have a multidimensional integral involving the d²h_{~k′} as independent variables. Except for the integrals involving h~k, identical terms in the numerator and denominator cancel. In addition, we have to be careful because for any ~k we have to consider two terms in the sum that do not cancel. This is so because

∑_{~k′′} k′′²|h_{~k′′}|² = ∑_{~k′′} k′′² h_{~k′′} h_{−~k′′}.

Consider for example the particular term involving h15:

(15)²h15h−15 + (−15)²h−15h15 = 2(15)²h15h−15,

giving a factor of 2 for each ~k. With both considerations in mind, Eq. (5.27) reduces to,

〈|h~k|²〉 = [∫ d²h~k |h~k|² e^{−(2σk²/2kBTL^{d−1})|h~k|²}] / [∫ d²h~k e^{−(2σk²/2kBTL^{d−1})|h~k|²}].

(These steps are given in Goldenfeld's book in Section 6.3, p. 174.) Now, since h~k is a complex variable, let us write

h~k = Re^{iθ}

(of course, both R and θ depend on ~k, but this does not matter in this calculation), so that

〈|h~k|²〉 = [∫₀^∞ R dR R² e^{−R²/a}] / [∫₀^∞ R dR e^{−R²/a}],

as the integrals over θ in the numerator and denominator cancel (the integrand does not depend on θ). We have defined,

a = (kBT/σk²) L^{d−1}. (5.28)

The Gaussian integral can be done exactly, to find our final result

〈|h~k|²〉 = (kBT/σk²) L^{d−1}. (5.29)

Therefore, from Eq. 5.26 we have

G(~k) = (kBT/σ)(1/k²). (5.30)

The spatial correlation function

G(~x) = 〈h(~x)h(0)〉 (5.31)

can be found by inverse Fourier transform of Eq. (5.30).
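The key Gaussian average behind Eq. (5.29) can be checked numerically: with weight e^{−R²/a} on the radial variable R = |h~k|, the average 〈R²〉 equals a = (kBT/σk²)L^{d−1}. In the sketch below, the constant c stands in for σk²/(kBT L^{d−1}); its value is arbitrary:

```python
import math

# Numerical check of Eq. (5.29): with weight exp(-c*|h_k|^2), where
# c = sigma*k^2/(kB*T*L^{d-1}), the Gaussian average is <|h_k|^2> = 1/c.
def mean_sq(c, Rmax=20.0, n=100000):
    """<R^2> over the radial measure R dR exp(-c R^2), midpoint rule."""
    dR = Rmax / n
    num = den = 0.0
    for i in range(n):
        R = (i + 0.5) * dR
        w = R * math.exp(-c * R * R)
        num += w * R * R * dR
        den += w * dR
    return num / den

c = 2.5   # stands in for sigma*k^2/(kB*T*L^{d-1}); arbitrary value
print(abs(mean_sq(c) - 1.0 / c) < 1e-6)  # → True: <|h_k|^2> = kBT L^{d-1}/(sigma k^2)
```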


Figure 5.12: Schematic representation of a fluctuating surface and the definition of its width W.

Surface width and roughening

Prior to obtaining G(~x), we obtain the equilibrium width of the fluctuating surface. We define the width w as the root mean square of the fluctuations of h averaged over the ~x plane,

w² = (1/L^{d−1}) ∫ d^{d−1}~x 〈(h(~x) − 〈h(~x)〉)²〉 (5.32)

We set the reference surface at a point such that 〈h(~x)〉 = 0. Note that since 〈h(~x)h(~x)〉 = G(~x = 0), it follows that w² = G(~0) with, by definition of the Fourier transform,

G(0) = ∫ (d^{d−1}~k/(2π)^{d−1}) kBT/σk². (5.33)

Before we carry out the integral we note that it potentially has divergences as ~k → 0 or ~k → ∞. In order to avoid them, we will always restrict our domain of integration to 2π/L ≤ |~k| ≤ Λ = π/a. The lattice constant a provides a natural cutoff at large k (or short length scales), as our calculations and models are not really meaningful below those scales. At the lower end, we introduce the inverse lateral dimension of the system L as the cutoff. We will carry out the integrals with this lower limit, and then explore the consequence of taking the thermodynamic limit L → ∞.

With these two cutoffs in mind, it is easy to calculate the integral Eq. 5.33. First in d = 2 (so that the interface is one dimensional) we find,

G(0) = (kBT/σ)(1/2π) ∫_{−Λ}^{Λ} dk/k² = (kBT/πσ) ∫_{2π/L}^{Λ} dk/k² ≃ (kBT/2π²σ) L (5.34)

in the limit of large L and finite Λ. Therefore in d = 2,

w = (kBT/2π²σ)^{1/2} L^{1/2}. (5.35)

Note the perhaps surprising result that the width w diverges in the thermodynamic limit of L → ∞. This result is a direct consequence of the spectrum of surface fluctuations scaling as 〈|h~k|²〉 ∼ 1/k², which in turn is a direct consequence of the fact that the energy spectrum of surface fluctuations is proportional to (∂h/∂~x)². In physical terms, surface fluctuations (i.e., deviations from planarity) increase the energy of the configuration by an amount proportional to its gradient squared. This increase is small enough for small gradients that the presence of these distortions at finite temperature dominates the averages, and leads to a divergent variance of surface fluctuations. We will return to this issue several times in this course, especially when we discuss the consequences of broken symmetries.

Before proceeding, it is instructive to point out that the surface width has a strong dependence on the dimensionality of space. Indeed, in order to compute w in d = 3, we first note that

G(0) = (kBT/σ)(1/4π²)(2π) ∫_{2π/L}^{Λ} k dk/k² = (kBT/2πσ) ln(LΛ/2π). (5.36)


In d = 3, a two dimensional surface has a width

w = [(kBT/2πσ) ln(LΛ/2π)]^{1/2}. (5.37)

We further illustrate the dependence of w on dimensionality by quoting results that have been obtained when d is considered to be a continuous variable (we will not give the details here). It has been found that,

w = undefined for d ≤ 1;  w ∼ L^x for 1 < d < 3;  w ∼ (ln L)^{1/2} for d = 3;  w = constant for d > 3, (5.38)

where the exponent x is given by,

x = (3 − d)/2 (5.39)

and is called the roughening exponent. The fact that w is undefined for d ≤ 1 is explained below.

The divergence of w with L for d < 3 shown in Eq. 5.38 is called roughening; the surface is macroscopically rough. Note that although the width for 1 < d < 3 diverges, it is in fact not large

Figure 5.13: In d = 2 the width W ∼ L^{1/2} remains small compared to the system size L as L → ∞.

compared to the lateral dimension of the system itself as L → ∞:

w/L = 1/L^{1−x} = 1/L^{(d−1)/2}. (5.40)

Hence the width is small for d > 1. In d = 1, it would appear that the width is comparable to the system size. This means the interface, and indeed phase coexistence itself, does not exist. This is one signature of what is called the lower critical dimension, at and below which phase coexistence cannot occur at any nonzero temperature. Fluctuations are strong enough to prevent the two phases from being separated by a well defined surface.

In more general cases (not just the one considered), a roughening exponent x is defined by the relation w ∼ L^x as L → ∞. This can be shown to be equivalent to a definition of the surface correlation exponent ηs as

G(k) ∼ 1/k^{2−ηs} (5.41)

as k → 0. (In the case discussed above ηs = 0, of course.) The two exponents are then seen to be related by the formula

x = (3 − d)/2 − ηs/2. (5.42)

The surface turns out to be self-affine in exactly the same way that we discussed earlier for fractal structures (the Von Koch snowflake), since correlation functions like G(~k) obey power laws in k. The self-affine or fractal dimension of the rough surface can be shown to be

ds = x + d − 1. (5.43)


Spatial correlation function

The spatial correlation function G(x) can be obtained by inverse Fourier transformation of Eq. (5.30). It is not necessary to do the calculation, however, as one can rely on known results from electrostatics. The Poisson equation for the electrostatic potential G due to a point charge at \vec r_2 is

\nabla^2 G(\vec r_1, \vec r_2) = -\delta(\vec r_1 - \vec r_2).

The Fourier transform of this equation yields

-k^2 G(\vec k) = -1, \qquad G(\vec k) = \frac{1}{k^2}.

On the other hand, the electrostatic potential is known:

G(\vec r_1, \vec r_2) = \frac{1}{4\pi} \frac{1}{|\vec r_1 - \vec r_2|}, \quad d = 3   (5.44)

G(\vec r_1, \vec r_2) = -\frac{1}{2\pi} \ln|\vec r_1 - \vec r_2|, \quad d = 2.   (5.45)

Therefore in d = 3, i.e., for a two dimensional surface, one has

G(\vec x_1, \vec x_2) = -\frac{k_B T}{2\pi\sigma} \ln|\vec x_1 - \vec x_2|   (5.46)

In the case of d = 2, or a one dimensional interface, the correlation function is

G(x_1, x_2) \sim |x_1 - x_2|,

that is, correlations grow with distance.

It is instructive to redo this calculation explicitly, albeit for a slightly different correlation function,

g(\vec x) = \langle (h(\vec x) - h(0))^2 \rangle   (5.47)

which reduces to

g(\vec x) = 2\left( \langle h^2 \rangle - \langle h(\vec x) h(0) \rangle \right) = 2\left( G(0) - G(\vec x) \right) = 2\left( w^2 - G(\vec x) \right)

The interpretation of g(\vec x) from Eq. (5.47) is simple: it is the (squared) difference in the heights between two points separated by \vec x, as shown in Fig. 5.14. Of course, g(x \to L) = 2w^2 because

Figure 5.14: The correlation g(\vec x - \vec x') compares the heights h(\vec x) and h(\vec x').

G(L) = \langle h(L)\, h(0) \rangle = \langle h(L) \rangle \langle h(0) \rangle = 0, as the points 0 and L are so far apart as to be uncorrelated (and each average vanishes); hence G(L) = 0. Substituting explicitly, we find

g(\vec x) = 2\,\frac{k_B T}{\sigma}\,\frac{1}{(2\pi)^{d-1}} \int d^{d-1}\vec k\; \frac{1 - e^{i\vec k\cdot\vec x}}{k^2} = 2\,\frac{k_B T}{\sigma}\,\frac{1}{(2\pi)^{d-1}} \int d^{d-1}\vec k\; \frac{1 - \cos(\vec k\cdot\vec x)}{k^2}   (5.48)


where the equality follows from the fact that the integrand (apart from the exponential term) is even in k, so that the sine part of the Fourier transform vanishes.

Consider first the case d = 2, or a one dimensional surface.

g(x) = 2\,\frac{k_B T}{\sigma}\,\frac{1}{2\pi}\int_{-\Lambda}^{\Lambda} dk\, \frac{1-\cos kx}{k^2} = 2\,\frac{k_B T}{\pi\sigma}\int_{2\pi/L}^{\Lambda} dk\, \frac{1-\cos kx}{k^2}

Let u = kx,

g(x) = 2\,\frac{k_B T}{\pi\sigma}\, x \left[ \int_{2\pi x/L}^{\Lambda x} du\, \frac{1-\cos u}{u^2} \right].

Now let \Lambda = \infty, as there is no divergence at short distances,

g(x) = 2\,\frac{k_B T}{\pi\sigma}\, x \left[ \int_0^{\infty} du\, \frac{1-\cos u}{u^2} - \int_0^{2\pi x/L} du\, \frac{1-\cos u}{u^2} \right] = 2\,\frac{k_B T}{\pi\sigma}\, x \left[ \frac{\pi}{2} - \int_0^{2\pi x/L} du\, \frac{1-\cos u}{u^2} \right]

where we have used the result

\int_0^{\infty} du\, \frac{1-\cos u}{u^2} = \frac{\pi}{2}.

Therefore, we can write for d = 2

g(x, L) = \frac{k_B T}{\sigma}\, x\, f(x/L)   (5.49)

where we have defined

f(y) = 1 - \frac{2}{\pi}\int_0^{2\pi y} du\, \frac{1-\cos u}{u^2}   (5.50)
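The scaling function of Eq. (5.50) is easy to evaluate numerically. The sketch below (a simple trapezoidal rule, not from the notes; the integrand tends to 1/2 as u \to 0) confirms the two limits discussed next:

```python
import numpy as np

# f(y) = 1 - (2/pi) * Integral_0^{2*pi*y} (1 - cos u)/u^2 du  (Eq. 5.50)
def f_scaling(y, n=200001):
    u = np.linspace(0.0, 2.0 * np.pi * y, n)
    integrand = np.full_like(u, 0.5)          # limiting value of the integrand at u = 0
    mask = u > 1e-8
    integrand[mask] = (1.0 - np.cos(u[mask])) / u[mask] ** 2
    integral = np.sum((integrand[1:] + integrand[:-1]) / 2.0 * np.diff(u))
    return 1.0 - (2.0 / np.pi) * integral

print(f_scaling(0.01))   # close to 1: short distances, x << L
print(f_scaling(100.0))  # close to 0: x comparable to L
```

The printed values approach 1 for small argument and 0 for large argument, matching the limits quoted below.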

Equation (5.49) is our final result. It shows that for a one dimensional interface, the correlation function grows linearly with x, with a prefactor that is a function of x/L. It is instructive to consider the limits of the function f explicitly:

f(y \to \infty) = 0, \qquad f(y \to 0) = 1.

This function is called a scaling function and is shown schematically in Fig. 5.15. At short distances compared with the size of the system (x/L \to 0), f \to 1 and correlations indeed grow linearly.

Figure 5.15: The scaling function f(x/L), decreasing from 1 at x/L = 0 toward 0.

A similar calculation can be carried out in d = 3, to find

g(r) = 2\,\frac{k_B T}{\sigma}\,\frac{1}{4\pi^2}\int_0^{2\pi} d\theta \int_{2\pi/L}^{\Lambda} k\, dk\, \frac{1-\cos(kr\cos\theta)}{k^2}

Evaluating this integral is a bit more complicated, and we just quote the answer

g(r) = \frac{k_B T}{4\pi\sigma} \ln(\Lambda r)   (5.51)


The calculation can be carried out for a continuous dimension d, and it can be shown that the result is

g(r) = \begin{cases} \text{undefined}, & d = 1 \\ r^{2x}, & 1 < d < 3 \\ \ln r, & d = 3 \\ \text{constant}, & d > 3 \end{cases}   (5.52)

where, again, x = (3 - d)/2. Correlations are power law functions.

5.2.3 Partition function, free energy, and surface tension

Since the probability distribution function of h_{\vec k} is a Gaussian, one can in this case calculate the partition function and the free energy exactly. We do this next. Recall Eq. (5.13),

Z = e^{-\sigma L^{d-1}/k_B T} \int \prod_{\vec x} dh(\vec x)\; \exp\left( \frac{-\sigma}{2k_B T}\int d^{d-1}\vec x' \left(\frac{\partial h}{\partial \vec x'}\right)^2 \right)

Switching to Fourier space, and using Eqs. (5.19) and (5.21), we find

e^{-F/k_B T} = Z = e^{-\sigma L^{d-1}/k_B T} \int {\prod_{\vec k}}' d^2 h_{\vec k}\; \exp\left[ \frac{-\sigma}{2k_B T}\,\frac{1}{L^{d-1}} {\sum_{\vec k}}' k^2 |h_{\vec k}|^2 \right]   (5.53)

where we have used the definition of the free energy F in terms of the partition function Z. Equation (5.53) can be rewritten as

Z = e^{-\sigma L^{d-1}/k_B T} {\prod_{\vec k}}' \int d^2 h_{\vec k}\; e^{-\frac{\sigma k^2}{2k_B T L^{d-1}} |h_{\vec k}|^2}

since all modes are independent. The Fourier amplitudes h_{\vec k} are complex, so we write

\int d^2 h_{\vec k} = \int_{-\infty}^{\infty} d\Re(h_{\vec k}) \int_{-\infty}^{\infty} d\Im(h_{\vec k}) = \int_0^{2\pi} d\theta \int_0^{\infty} R\, dR,   (5.54)

if h_{\vec k} = Re^{i\theta}. Hence

Z = e^{-\sigma L^{d-1}/k_B T} {\prod_{\vec k}}' \int_0^{2\pi} d\theta \int_0^{\infty} R\, dR\; e^{-\left(\frac{\sigma k^2}{2k_B T L^{d-1}}\right) R^2}

Let u = \left(\frac{\sigma k^2}{2 k_B T L^{d-1}}\right) R^2. Then, using \int_0^{\infty} du\, e^{-u} = 1,

Z = e^{-\sigma L^{d-1}/k_B T} {\prod_{\vec k}}' \frac{2\pi}{2\left(\frac{\sigma k^2}{2k_B T L^{d-1}}\right)} \int_0^{\infty} du\, e^{-u}
= e^{-\sigma L^{d-1}/k_B T} {\prod_{\vec k}}' \frac{2\pi k_B T L^{d-1}}{\sigma k^2}
= e^{-\sigma L^{d-1}/k_B T}\, e^{\ln {\prod_{\vec k}}' \frac{2\pi k_B T L^{d-1}}{\sigma k^2}}
= \exp\left[ \frac{-\sigma L^{d-1}}{k_B T} + {\sum_{\vec k}}' \ln \frac{2\pi k_B T L^{d-1}}{\sigma k^2} \right]


This is our final result (eliminating the restriction on the sum over \vec k),

Z = \exp\left[ \frac{-\sigma L^{d-1}}{k_B T} + \frac{1}{2}\sum_{\vec k} \ln \frac{2\pi k_B T L^{d-1}}{\sigma k^2} \right]   (5.55)

and hence the free energy is

F = \sigma L^{d-1} - \frac{k_B T}{2} \sum_{\vec k} \ln \frac{2\pi k_B T L^{d-1}}{\sigma k^2}.   (5.56)

This is a very interesting result. The free energy consists of two parts. The first is the energy of the planar surface, and it is proportional to the interfacial tension σ. This is the zero temperature limit of Eq. (5.56), in which only the energy (and not the entropy) contributes to the free energy. At T ≠ 0, fluctuations contribute to the entropy as the system explores a larger region of configuration space, leading to the second term in the equation. Note how the contribution from each mode k appears weighted by 1/k².
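The per-mode Gaussian integral behind Eq. (5.55) can be checked numerically. The sketch below (illustrative parameter values, not from the notes) integrates over the complex amplitude h_k in polar form and compares with the closed form 2\pi k_B T L^{d-1}/(\sigma k^2):

```python
import numpy as np

# With a = sigma*k^2/(2*kBT*L^(d-1)), the polar-form Gaussian integral over
# h_k gives pi/a = 2*pi*kBT*L^(d-1)/(sigma*k^2), the factor inside the log
# of Eq. (5.55). Parameter values below are arbitrary.
kBT, sigma, L, d = 1.0, 2.0, 50.0, 2

def mode_integral(k, Rmax=200.0, n=200001):
    a = sigma * k**2 / (2.0 * kBT * L ** (d - 1))
    R = np.linspace(0.0, Rmax, n)
    f = 2.0 * np.pi * R * np.exp(-a * R**2)   # angular integral already done
    return np.sum((f[1:] + f[:-1]) / 2.0 * np.diff(R))

k = 2.0 * np.pi / L                            # smallest nonzero mode
closed_form = 2.0 * np.pi * kBT * L ** (d - 1) / (sigma * k**2)
print(mode_integral(k), closed_form)           # agree to high accuracy
```

The two printed numbers agree to a fraction of a percent, confirming the mode-by-mode factorization used above.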

The thermodynamic surface tension σ* is defined as the change in free energy with respect to a change in surface area A = L^{d-1},

\sigma^* = \left(\frac{\partial F}{\partial A}\right)_{T,V,N} = \left(\frac{\partial F}{\partial L^{d-1}}\right)_{T,V,N}   (5.57)

which includes, as it should, the effect of fluctuations. In computing the derivative with respect to L it is important to note that the wavevectors \vec k implicitly depend on the system size L (see Fig. 5.16). The dependence of k on L is simple, and it follows that the combination (kL) is precisely independent of L. We therefore rewrite Eq. (5.56) as

Figure 5.16: Note that each allowed wavevector shifts from k to k' as L \to L + dL.

F = \sigma L^{d-1} - \frac{k_B T}{2}\sum_{\vec k} \ln \frac{2\pi k_B T L^{d+1}}{\sigma (kL)^2}

or,

F = \sigma L^{d-1} - \frac{k_B T}{2}\sum_{\vec k} \ln (L^{d-1})^{(d+1)/(d-1)} - \frac{k_B T}{2}\sum_{\vec k} \ln \frac{2\pi k_B T}{\sigma (kL)^2}

or,

F = \sigma L^{d-1} - \frac{k_B T}{2}\left(\frac{d+1}{d-1}\right)\left(\ln L^{d-1}\right) \sum_{\vec k} 1 - \frac{k_B T}{2}\sum_{\vec k} \ln \frac{2\pi k_B T}{\sigma (kL)^2}.   (5.58)

The second term on the right hand side is independent of \vec k, and hence has been removed from the sum, leaving only the factor \sum_{\vec k} 1, which equals the total number of states. Also note that the


third term on the right hand side is independent of L, as only the combination (kL) appears there. Therefore,

\sigma^* = \sigma - \frac{k_B T}{2}\,\frac{d+1}{d-1}\,\frac{1}{L^{d-1}} \sum_{\vec k} 1,

in the discrete representation that we are using. In order to compute the total number of states, it is better to use a continuum representation,

\sigma^* = \sigma - \frac{k_B T}{2}\,\frac{d+1}{d-1} \int \frac{d^{d-1}\vec k}{(2\pi)^{d-1}},   (5.59)

which yields,

\int \frac{d^{d-1}\vec k}{(2\pi)^{d-1}} = \frac{1}{2\pi}\int_{-\Lambda}^{\Lambda} dk = \frac{\Lambda}{\pi}, \quad d = 2   (5.60)

\int \frac{d^{d-1}\vec k}{(2\pi)^{d-1}} = \frac{1}{4\pi^2}\int_0^{2\pi} d\theta \int_0^{\Lambda} k\, dk = \frac{1}{4\pi^2}\,\pi\Lambda^2 = \frac{\Lambda^2}{4\pi}, \quad d = 3   (5.61)

where \Lambda = \pi/a. We find for the surface tension,

\sigma^* = \sigma - \frac{3 k_B T}{2\pi}\Lambda, \quad d = 2,
\qquad
\sigma^* = \sigma - \frac{k_B T}{4\pi}\Lambda^2, \quad d = 3.   (5.63)

Note that the surface tension decreases with temperature. As T increases, fluctuations away from planarity become more important, and therefore the surface appears, on average, to be more floppy. This is of course a measurable effect, one that follows directly from the effect of thermal fluctuations on macroscopic quantities. Note also that by increasing the temperature, the surface tension σ* eventually becomes zero. At this point, fluctuations are so large that there is effectively no restoring force to bring the surface back to planarity. In effect, the surface disappears, and with it, the coexistence of two separate phases.

5.3 Impossibility of Phase Coexistence in d=1

The surface tension σ* is not defined for d = 1. In our model, σ* actually becomes −∞ at d = 1. This anomaly signals the lack of two-phase coexistence in one dimension. In fact, this is a theorem due to Landau and Peierls, which is worth quickly proving. (See Goldenfeld's book, p. 46.)

A homogeneous phase has a bulk energy E_B ∝ +L^d. Adding a surface to the system adds an extra energy E_S ∝ +L^{d−1}. Therefore, if one adds an interface to an initially uniform phase, there is a net increase in the energy of the system of ∆E ∝ +L^{d−1}. Since the energy increases by

Figure 5.17: The bulk energy scales as E_{bulk} ∝ L^d, while the surface energy scales as E_{surface} ∝ L^{d−1}.


adding interfaces, it looks like the system wants to have the smallest possible number of them. Inconventional coexistence, this would mean two phases separated by a single surface.

However, at finite temperature the equilibrium state of the system is the one that minimizes thefree energy F = E − TS, not just the energy E. In order to estimate the entropy change due tointroducing an interface in the system, we need to count the number of microscopic states that areconsistent with one single macroscopic interface (see Fig. 5.18). The total number of such states

Figure 5.18: The many equivalent positions available to a single interface.

is ∼ L along the y axis as shown, as the interface could be anywhere along that axis. Of course, the interface could also be at any location along the remaining axes, and hence there are ∼ L^d ways to place a single interface in our d dimensional system. The entropy gained by introducing a single surface is therefore

\Delta S \sim k_B \ln L^d = k_B\, d \ln L.   (5.64)

Hence, the total change in the free energy due to introducing a single surface is

\Delta F = O(L^{d-1}) - k_B T\, d\, O(\ln L)   (5.65)

For d > 1 and L → ∞, the free energy increases if a surface is added to the system. Under normal circumstances, a system with no interfaces will then be the equilibrium state. But for d = 1 (and T > 0) the energy cost of introducing an interface is finite, whereas the entropy gain increases as ln L. Therefore, for sufficiently large L, the system can always reduce its free energy by adding one interface.

Of course, once an interface is added, the same argument can be repeated for either of the two bulk phases, with the conclusion that it is favorable to add a second interface, and a third, and so on. This means that coexisting phases are not stable for d = 1. Since adding one interface decreases F, we can add another and another and another, and keep on decreasing F until there are no longer bulk coexisting phases, just a mishmash of tiny regions of phase fluctuations. Hence, for systems with short-ranged interactions (consistent with our assumption that the energy of an interface is simply ∼ L^{d−1}), and T > 0 (so that entropy matters), there can be no phase coexistence in d = 1.

5.4 Order of magnitude calculations in d=3

Some of our earlier considerations may seem a little academic, including the fact that they are done in arbitrary dimension d. We mention, for example, that our results for d = 2 apply to films adsorbed on liquid surfaces and to ledges on crystal surfaces, as shown in Fig. 5.19. Experiments have been done on these systems confirming our d = 2 results, such as w ∼ L^{1/2}.

In d = 3, it is worth looking again at two of our main results, Eqs. (5.37) and (5.63),

w = \left[ \frac{k_B T}{2\pi\sigma} \ln\frac{\Lambda L}{2\pi} \right]^{1/2}

\sigma^* = \sigma - \frac{k_B T}{4\pi}\Lambda^2 = \sigma\left(1 - \frac{4\pi k_B T}{\sigma}\left(\frac{\Lambda}{2\pi}\right)^2\right) = \sigma(1 - T/T^*)

where we have defined

T^* = \frac{\sigma}{4\pi k_B (\Lambda/2\pi)^2}   (5.66)


Figure 5.19: Examples of d = 2 interfaces: films adsorbed on liquid surfaces and ledges on crystal surfaces.

Figure 5.20: Phase diagram of a pure substance in the P–T plane, showing the triple point and the critical point (P_c, T_c).


Consider now the phase diagram of a pure substance (Fig. 5.20). Our analysis describes the behavior along the liquid–vapor coexistence line, where an interface separates the two bulk equilibrium phases. Near the triple point of a typical fluid, one has approximately \sqrt{k_B T_t/2\pi\sigma} \approx 1 \text{ Å}, where T_t is the triple point temperature. Similarly, one roughly has 2\pi/\Lambda \approx 10 \text{ Å}. Therefore,

w = 1\,\text{Å} \left( \ln\frac{L}{10\,\text{Å}} \right)^{1/2}   (5.67)

In real systems, it turns out that the divergence is very weak:

w = 2.6 Å for L = 10000 Å (light wavelength)
w = 4.3 Å for L = 10 cm (coffee cup)
w = 5.9 Å for L = 1000 km (ocean)
w = 9.5 Å for L = 10^{30} m (universe)
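These numbers follow directly from Eq. (5.67); a short check (lengths converted to Angstroms) reproduces them to within rounding:

```python
import numpy as np

# Eq. (5.67): w = 1 Angstrom * sqrt(ln(L / 10 Angstrom)), with L in Angstroms.
def w_angstrom(L):
    return np.sqrt(np.log(L / 10.0))

for L, label in [(1e4, "light wavelength"), (1e9, "coffee cup"),
                 (1e13, "ocean"), (1e40, "universe")]:
    print(f"L = {L:.0e} A ({label}): w = {w_angstrom(L):.1f} A")
```

The doubly logarithmic growth is why a coffee cup and the observable universe differ by only a few Angstroms of interfacial width.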

Of course, in a liquid–vapor system on Earth, gravity plays a role. The extra energy due to gravity is

E_G = \frac{1}{2}\int d^{d-1}\vec x\; (\rho_l - \rho_v)\, g\, h^2(\vec x)

where \rho_{l,v} are the mass densities of the liquid and vapor, and g is the gravitational acceleration. This extra energy changes our results in a way you should work out for yourself.

As far as the surface tension is concerned, note that σ* vanishes at T* (assuming σ ≠ σ(T) is a constant). It is tempting, although a little too rough, to interpret T* = T_c as the critical point. With numbers, one gets T* ≈ 1.4 T_t, which is the right order of magnitude for T_c. One indication that our model is only approximate is the fact that it is known that

\sigma^* \sim \left(1 - \frac{T}{T_c}\right)^{\mu}

for T → T_c, where μ ≈ 1.24 in d = 3. Our theory, with the interpretation of T* as T_c, gives μ = 1 instead. This is not too bad, considering we only went to lowest order in the expansion of E_\nu = \sigma\int d\vec x \sqrt{1 + (\partial h/\partial \vec x)^2}, which clearly underestimates the fluctuations of h(\vec x) near the critical point, and did not consider bulk properties at all.

5.5 (*) A different view of gradient terms

Quite obviously, terms proportional to gradients of h(\vec x) in the energy E increase the energy of nonuniform configurations, and hence favor a planar surface. We briefly discuss here a different interpretation of these terms, to show that they can be understood as a nonlocal interaction between neighboring points h(\vec x) and h(\vec x'). Consider

\int d\vec x \left(\frac{\partial h}{\partial \vec x}\right)^2 = \int d\vec x \left[ \frac{\partial}{\partial \vec x}\int d\vec x'\, h(\vec x')\,\delta(\vec x - \vec x') \right]^2
= \int d\vec x \int d\vec x' \int d\vec x''\; h(\vec x')\, h(\vec x'') \left( \frac{\partial}{\partial \vec x}\delta(\vec x - \vec x') \right)\cdot\left( \frac{\partial}{\partial \vec x}\delta(\vec x - \vec x'') \right)

where we have used simple properties of the Dirac delta function. Now relabel

\vec x \to \vec x'', \qquad \vec x' \to \vec x, \qquad \vec x'' \to \vec x'


since they are just dummy integration variables. We find

= \int d\vec x \int d\vec x' \int d\vec x''\; h(\vec x)\, h(\vec x') \left( \frac{\partial}{\partial \vec x''}\delta(\vec x'' - \vec x) \right)\cdot\left( \frac{\partial}{\partial \vec x''}\delta(\vec x'' - \vec x') \right)
= \int d\vec x \int d\vec x' \int d\vec x''\; h(\vec x)\, h(\vec x')\, \frac{\partial}{\partial \vec x}\cdot\frac{\partial}{\partial \vec x'} \left( \delta(\vec x'' - \vec x)\,\delta(\vec x'' - \vec x') \right)
= \int d\vec x \int d\vec x'\; h(\vec x)\, h(\vec x')\, \frac{\partial}{\partial \vec x}\cdot\frac{\partial}{\partial \vec x'}\, \delta(\vec x - \vec x')
= \int d\vec x \int d\vec x'\; h(\vec x)\, h(\vec x') \left[ -\frac{\partial^2}{\partial \vec x^2}\,\delta(\vec x - \vec x') \right].

We have repeatedly integrated by parts, recalling that surface terms at infinity vanish. Hence,

\int d^{d-1}\vec x \left(\frac{\partial h}{\partial \vec x}\right)^2 = \int d^{d-1}\vec x \int d^{d-1}\vec x'\; h(\vec x)\, M(\vec x - \vec x')\, h(\vec x')   (5.68)

where the kernel M is

M(\vec x) = -\frac{\partial^2}{\partial \vec x^2}\,\delta(\vec x)   (5.69)

If the delta function is approximated by a continuous function of small but finite width, then M has the form shown in Fig. 5.21. This type of interaction matrix appears in many different contexts. It

Figure 5.21: The kernel M(\vec x) = -\partial^2\delta(\vec x)/\partial \vec x^2, with the delta function smoothed to a finite width.

causes h(~x) to interact with h(~x′) in their contribution to the energy E.
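A discrete sketch makes Eqs. (5.68)–(5.69) concrete: on a periodic lattice of spacing 1 (an illustrative setup, not from the notes), the sum of squared height differences equals a quadratic form h · M · h, where M is the second-difference matrix, i.e., the lattice version of −∂²/∂x²:

```python
import numpy as np

# sum_i (h_{i+1} - h_i)^2  ==  h . M . h,  M = 2*I - S - S^T (periodic lattice),
# with S the one-site shift matrix; M is the discrete -d^2/dx^2.
rng = np.random.default_rng(0)
N = 64
h = rng.standard_normal(N)

grad_sq = np.sum((np.roll(h, -1) - h) ** 2)

M = 2 * np.eye(N) - np.roll(np.eye(N), 1, axis=0) - np.roll(np.eye(N), -1, axis=0)
quad = h @ M @ h

print(np.allclose(grad_sq, quad))  # True
```

The agreement is exact up to round-off, showing that the gradient-squared term really is a nonlocal pairwise interaction mediated by the kernel M.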


Chapter 6

Broken Symmetry and Correlation Functions

A major concept in modern physics is symmetry. The results obtained for fluctuating surfaces are but examples of more general properties of correlations in systems with broken symmetries. We explore such general properties in this chapter.

First, let us elaborate on what a spontaneously broken symmetry is. In the case of a fluctuating surface, note that in the partition function

Z = \sum_{\nu} e^{-E_\nu/k_B T}

the energy of a state has the following "symmetry" or invariance,

E_\nu(h + \text{constant}) = E_\nu(h),   (6.1)

because

E_\nu = \sigma L^{d-1} + \frac{\sigma}{2}\int d^{d-1}\vec x \left(\frac{\partial h}{\partial \vec x}\right)^2

so that it depends on derivatives of h, but not on the value of h itself. In other words, the energy of an interface is invariant under a uniform translation of the entire interface by an arbitrary constant, and so the partition function has the same invariance. Therefore we would expect this symmetry to be respected by any quantity obtained from the partition function Z. However, in a given experiment the average location of the physical interface ⟨h⟩ has one and only one value. In other words, even though by symmetry the interface could on average be at any location in the system, in a physical experiment it picks only one specific value. This is referred to as a spontaneously broken symmetry, as in one physical realization the configuration of the interface breaks the symmetry of the system (under translation or rotation in this case). Correlations and averages do not respect the full symmetry of the energy E_ν or the partition function Z.

Homogeneous systems, whether liquid or vapor, are translationally and rotationally invariant.However, a system with a surface, as drawn in Fig. 6.1, only has translational invariance along theplane of the interface ~x but not along the direction normal to the interface y. Hence the systemwith a surface has a lower symmetry compared to the bulk liquid or vapor systems, so that somesymmetry must have been broken. The perfect symmetry of the homogeneous system is lost preciselybecause h, the broken symmetry variable, has a definite value in any given experimental realization,and hence the state breaks the symmetry of the underlying system’s energy.

It is instructive to argue qualitatively that the fact that Z and E_ν have a particular symmetry has implications for the form of the correlation functions involving the broken symmetry variable h. Consider the case of d = 2. Since the surface can be anywhere with equal probability, it will be at different locations in different systems or in different experiments. This is illustrated in Fig. 6.2. Imagine now a very large system thought of as being composed of the parts shown in Fig. 6.2. In a part


Figure 6.1:

Figure 6.2: Different realizations of the interface position.

of the system ⟨h⟩ might have one particular value, while far away along the x axis it might have another value (both of the same energy, and hence both appearing with equal probability). This hypothetical situation is depicted schematically in Fig. 6.3. Imagine we now approximate the interface in the figure by a sinusoidal function of wavenumber k, with k ≪ 1. Since we know that the energy cost of producing this distortion is

\Delta E(k) = \frac{\sigma}{2} k^2 |h_k|^2,

we find that ∆E(k) → 0 as k → 0. Indeed, if one has a very large system encompassing the four parts shown in Fig. 6.2, with the four subsystems behaving like independent subsystems because of

Figure 6.3: Surface of a very big system.

the symmetry, the energy cost of such a fluctuation is infinitesimal as k → 0. If we now use the equipartition of energy, we find

\frac{k_B T}{2} = \frac{1}{L^{d-1}}\,\frac{\sigma}{2}\, k^2 \langle|h_k|^2\rangle, \qquad \langle|h_k|^2\rangle \sim \frac{k_B T}{\sigma}\,\frac{1}{k^2}.

As we know, the Fourier transform of the correlation function of the broken symmetry variable h diverges as 1/k² as k → 0. However, one need not know the form of E_ν to obtain the result ⟨|h_k|²⟩ ∼ 1/k². Rather, it is sufficient to know that the interface breaks a symmetry of Z and E_ν. This result is the main topic of this chapter, and it is known as Goldstone's theorem.

Let us recast this argument in a way that does not require the use of the equipartition theorem. As shown in Fig. 6.3, fluctuations of size, say, ±∆y occur over a long distance ∆x. The actual values of ∆y (which could be ∼ Å) and of ∆x (which could be ∼ meters) are not important for the argument. Since h is a broken symmetry variable, if one considers a displacement along the interface of the order of ∆x to the right, then the interface will be distorted up or down at random by


Figure 6.4: Interface between liquid and vapor (in units of ∆x and ∆y), built from successive random steps.

an amount ∆y. This is shown schematically in Fig. 6.4. Of course, the sequence of displacements constitutes a one dimensional random walk along the y axis, with x playing the role of time. For a random walk, the root mean square deviation of y from its initial value is

y_{RMS} \sim x^{1/2}

for large "time" x. The deviation y_{RMS} is related to the correlation function (g(x))^{1/2} of Eq. (5.47), so we have g(x) ∼ x. Fourier transforming gives G(k) ∼ 1/k² for small k, and ⟨|h_k|²⟩ ∼ 1/k². (The Fourier transform is not simple; g(x) is a divergent function of x, leading to the singularity of G(k) at small k.) This argument is only for d = 2, but it can be generalized to higher dimensions.
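The random-walk picture is easy to simulate. The sketch below (an illustration, with unit steps) generates many interfaces h(x) as one dimensional random walks and checks that g(x) = ⟨(h(x) − h(0))²⟩ grows linearly in x, which is the real-space statement of G(k) ∼ 1/k²:

```python
import numpy as np

# Interfaces as 1d random walks: h(x) = cumulative sum of +/-1 steps,
# averaged over many independent realizations.
rng = np.random.default_rng(1)
samples, L = 2000, 1000
steps = rng.choice([-1.0, 1.0], size=(samples, L))
h = np.cumsum(steps, axis=1)

xs = [10, 100, 1000]
g = [np.mean((h[:, x - 1] - h[:, 0]) ** 2) for x in xs]
print(g)  # grows roughly linearly with x
```

Each factor-of-10 increase in x increases g by roughly a factor of 10, as the broken-symmetry argument predicts.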

It is the fact that h is a broken symmetry variable (and hence the random walk of the variable y) which gives rise to the 1/k² dependence; no specific details about the interaction energies are necessary. Note also that, in a way, fluctuations react to a symmetry breaking by "shaking" the interface so as to try to restore the broken symmetry. The rough width of the surface, w ∼ L^x, is this attempt to remove the interface from the system and so restore the full translational invariance in the y direction, as shown in Fig. 6.5.

Figure 6.5: A broken symmetry variable ends up having its fluctuations try to restore the symmetry.

In fact, for d = 1 the fluctuations are so strong that the full symmetry is restored, and one has an equilibrium state with full translational invariance in the y direction (since coexisting phases cannot exist). In 1 < d ≤ 3, fluctuations do not restore the symmetry, but they make the surface rough, or diffuse, as L → ∞. In d > 3, fluctuations do not affect the flat interface.

Finally, it should be noted that our cartoons of rough surfaces attempt to show them as self-similar and self-affine objects, much in the same sense as the von Koch snowflake discussed earlier. Note an important new ingredient: broken symmetry. Spontaneously broken symmetry leads to power law correlations and to self-similar structures.

6.1 Goldstone’s Theorem

Let B(\vec r) be a general broken continuous symmetry variable, with \vec r a field point in d dimensions. For a system with translational invariance,

\langle B(\vec k)\, B(\vec k')\rangle = (2\pi)^d\,\delta(\vec k + \vec k')\, G(k).


The theorem states that

\lim_{k\to 0} G(k) \to \infty

with a divergence no weaker than

G(k) \sim 1/k^2.

We will not give a proof of the theorem, but rather present its physical underpinnings. For simplicity, we will consider ⟨B⟩ = 0. If this is not the case, one can define B' = B − ⟨B⟩. We start by expanding the free energy as a Taylor series in B around the equilibrium state,

F(B) = F_0 + \left(\frac{\partial F}{\partial B}\right)_0 B + \frac{1}{2}\left(\frac{\partial^2 F}{\partial B^2}\right)_0 B^2 + \dots

The first term on the right hand side, F_0, is a constant, and (\partial F/\partial B)_0 = 0 if the system is in equilibrium. Hence the Taylor series is just

F(B) = \frac{1}{2}\left(\frac{\partial^2 F}{\partial B^2}\right)_0 B^2 + \dots   (6.2)

We have to be a bit more careful with the Taylor expansion given that B is really a function of \vec r. Treated as a function of many variables, the correct expansion reads

F(B) = \frac{1}{2}\int d^d\vec r \int d^d\vec r' \left(\frac{\partial^2 F}{\partial B(\vec r)\,\partial B(\vec r')}\right)_0 B(\vec r)\, B(\vec r')   (6.3)

Since \partial^2 F/\partial B(\vec r)\partial B(\vec r') is a thermodynamic derivative, in a system which is translationally invariant we write

\frac{\partial^2 F}{\partial B(\vec r)\,\partial B(\vec r')} = C(\vec r, \vec r') = C(\vec r - \vec r')   (6.4)

By using Parseval's theorem for Fourier transforms, one has

F(B) = \frac{1}{2}\int \frac{d^d\vec k}{(2\pi)^d} \int \frac{d^d\vec k'}{(2\pi)^d} \left(\frac{\partial^2 F}{\partial B(\vec k)\,\partial B(\vec k')}\right)_0 B(\vec k)\, B(\vec k').   (6.5)

Translational invariance implies

\frac{\partial^2 F}{\partial B(\vec k)\,\partial B(\vec k')} = (2\pi)^d\,\delta(\vec k + \vec k')\, C(k)   (6.6)

If B(\vec x) is real, then B^*(-\vec k) = B(\vec k). We finally obtain

F(B) = \frac{1}{2}\int \frac{d^d k}{(2\pi)^d}\, C(k)\, |B(k)|^2.   (6.7)

If we consider a discrete system, we would obtain instead

F(B) = \frac{1}{2L^d}\sum_{\vec k} C_{\vec k}\, |B_{\vec k}|^2   (6.8)

So far this is straightforward, and we have not used the knowledge that B is a broken symmetry variable. This has important implications for the dependence of the coefficient C(\vec k) on \vec k. Anticipating the singular nature of the response near k = 0, let us expand

C(\vec k) = \text{constant} + \text{constant}\; k^2 + \text{constant}\; k^4 + \dots   (6.9)

where we have assumed that C is analytic near \vec k = 0. Analyticity of this function of many variables allows terms in the expansion of the form \vec k \cdot \vec k = k^2 and (\vec k\cdot\vec k)^2 = k^4, but not non-analytic terms such as (\vec k\cdot\vec k)|\vec k| = k^3 (i.e., terms involving an absolute value).


We now use the fact that B breaks a symmetry of F. Hence F can have no dependence on B in the limit \vec k = 0. Otherwise, the free energy would depend on the spatial average of B, which would contradict the assumption that B represents a symmetry of the system. Therefore

\lim_{k\to 0} C(\vec k) = 0   (6.10)

and, if we have analyticity,

C(k) \sim k^2 \quad \text{as } k \to 0   (6.11)

This gives

F(B) \propto \sum_{\vec k} k^2 |B_{\vec k}|^2   (6.12)

This is the same as Eq. (5.20) for the case of a fluctuating surface, so we have

\langle B(\vec k)\, B(\vec k')\rangle = (2\pi)^d\,\delta(\vec k + \vec k')\, G(\vec k)

where

G(\vec k) \propto \frac{1}{k^2}, \quad \vec k \to 0.

If C(\vec k) is not analytic, as was assumed in Eq. (6.9), we get the weaker condition

\lim_{k\to 0} G(\vec k) \to \infty

Finally, it is interesting to note that if B is not a broken symmetry variable, from Eq. (6.9) we have

F(B) = \text{constant} \sum_{\vec k} (k^2 + \xi^{-2})\, |B_{\vec k}|^2

where \xi^{-2} is a constant. Taken back to \langle B(\vec k) B(\vec k')\rangle = (2\pi)^d \delta(\vec k + \vec k') G(\vec k), this leads to

G(\vec k) \propto \frac{1}{k^2 + \xi^{-2}}   (6.13)

as k → 0. Inverse Fourier transformation gives

\langle B(\vec r)\, B(0)\rangle = G(\vec r) \propto e^{-r/\xi}

as r → ∞. In short, absence of the broken symmetry removes power law correlations, and brings instead an exponential decay of the correlation function. This was the signature of a system comprised of many independent subsystems. The quantity ξ can be identified with the correlation length discussed earlier.
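The exponential decay implied by Eq. (6.13) can be verified directly in one dimension, where the inverse Fourier transform of 1/(k² + ξ⁻²) is (ξ/2)e^{−|x|/ξ}. The sketch below (illustrative numbers, brute-force integration rather than the exact contour result) compares the two:

```python
import numpy as np

# G(x) = (1/2pi) * Integral dk e^{ikx} / (k^2 + 1/xi^2)  in d = 1,
# compared with the closed form (xi/2) * exp(-|x|/xi).
xi = 2.0
m = 1.0 / xi

def G_x(x, K=400.0, n=400001):
    k = np.linspace(-K, K, n)
    f = np.exp(1j * k * x) / (k**2 + m**2)
    return (np.sum((f[1:] + f[:-1]) / 2.0 * np.diff(k)) / (2.0 * np.pi)).real

for x in [0.0, 2.0, 4.0]:
    print(x, G_x(x), (xi / 2.0) * np.exp(-abs(x) / xi))
```

The numerical transform matches the exponential form closely, in contrast to the power-law (1/k², ξ → ∞) broken-symmetry case.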


Chapter 7

Equilibrium correlation functions and Scattering

Equilibrium fluctuations can be understood as local deviations in a subsystem that is part of a larger system in equilibrium. We study in this chapter how to describe these equilibrium fluctuations.

Consider as an example a system of N classical, indistinguishable particles at constant temperature. The partition function is

Z(N, V, T) = \frac{1}{N!}\int \frac{d^N\vec r\, d^N\vec p}{h^{3N}}\; e^{-\beta E(\vec r^N, \vec p^N)}.

It is useful to separate the energy into kinetic and potential parts, so that the joint probability of positions and momenta factorizes,

p(\vec r^N, \vec p^N) = \phi(\vec p^N)\, P(\vec r^N),

with

P(\vec r^N) = \frac{e^{-\beta V(\vec r^N)}}{\int d^N\vec r\; e^{-\beta V(\vec r^N)}}   (7.1)

where V is the potential energy function. By definition, P is the probability of observing the system in a specific spatial configuration as it evolves in configuration space in equilibrium. For an interacting system, P does not factorize into simpler distribution functions involving a single particle, and hence one often resorts to defining marginal probability distribution functions. The function P itself contains too much information to be of direct use, even if it were possible to compute it.

The first such function is

\rho_1(\vec r) = N \int d\vec r_2 \dots d\vec r_N\; P(\vec r^N).   (7.2)

The integral alone is the probability that particle 1 is at \vec r. Since all particles are equivalent, the prefactor N makes \rho_1 the probability that any particle is at \vec r. The definition is such that if the system is uniform, P(\vec r^N) = 1/V^N, and \rho_1 = N/V = \rho, the equilibrium density.

Analogously, one defines

\rho_2(\vec r_1, \vec r_2) = N(N-1)\int d\vec r_3 \dots d\vec r_N\; P(\vec r^N),   (7.3)

which is the joint probability of finding any one particle at \vec r_1 and any other particle at \vec r_2. If the system is uniform, then \rho_2 = N(N-1)/V^2 \simeq \rho^2. In terms of the density \rho, it is customary to define the correlation function g as

g(\vec r_1, \vec r_2) = \frac{\rho_2(\vec r_1, \vec r_2)}{\rho^2}.   (7.4)


This function has an easy interpretation from probability theory. Recall that P(A, B) = P(A|B)P(B). The joint probability distribution (for a system with translational invariance) is P(A, B) = \rho_2(0, \vec r), and since P(B) = \rho, then

P(A|B) = \frac{P(A, B)}{P(B)} = \rho\, g(\vec r).   (7.5)

That is, \rho g(\vec r) is the conditional probability of finding a particle at \vec r given that there is a particle at the origin.

7.1 Measurement of g(~r) by diffraction

Consider a radiation scattering experiment as indicated in Fig. 7.1. Scattering experiments involve

Figure 7.1:

the change in momentum of some incident particle or radiation by a sample. The radiation of choicedepends on the scale of the fluctuations that one wishes to probe. In general, the wavelength of theradiation must be of the same order of magnitude as the spatial scale of the fluctuations.

The scattered wave following a scattering event in the sample is a radial outgoing wave. Its electric field is

E \sim E_0\, \frac{e^{i\vec k_f\cdot\vec r}}{r},

where \vec k_f is the outgoing wavevector and ω is the angular frequency of the wave. We consider here only elastic scattering, in which the frequency of the outgoing wave is the same as that of the incident wave (inelastic scattering is also used as a diagnostic tool, but we will not address it here). The wave that arrives at the detector after scattering off an element of volume of the sample at \vec r_j is

E \sim E_0\, \frac{1}{|\vec R_D - \vec r_j|}\; e^{i\vec k_i\cdot\vec r_j + i\vec k_f\cdot(\vec R_D - \vec r_j)}\, e^{-i\omega t},   (7.6)

accounting for the path that the radiation has traveled from the source to the sample and on to the detector. We now define the scattering wavevector as \vec k = \vec k_f - \vec k_i and rewrite

E \sim E_0\, \frac{1}{|\vec R_D - \vec R_S|}\; e^{i\vec k_f\cdot\vec R_D}\, e^{-i\vec k\cdot\vec r_j}\, e^{-i\omega t},

where we have assumed that the detector is far from the sample, and hence |\vec R_D - \vec r_j| \approx |\vec R_D - \vec R_S|, where \vec R_S is the location of the sample.


If one now adds the scattering from all elements of volume in the sample, the radiation that arrives at the detector is

E ∼ E0 (1/|~R_D − ~R_S|) e^{i~k_f·~R_D} Σ_j e^{−i~k·~r_j} e^{−iωt}.

We now compute the intensity of the radiation, I(~k) = E*(~k)E(~k), together with its statistical average, to find

I(~k) ∼ (1/|~R_D − ~R_S|²) N S(~k),   (7.7)

where we have defined the so-called structure factor

S(~k) = (1/N) ⟨ Σ_{j=1}^{N} Σ_{m=1}^{N} e^{i~k·(~r_m − ~r_j)} ⟩.   (7.8)

The scattering intensity is a function of ~k, the scattering wavevector.

We now show how to relate the structure factor to the correlation function. Write

S(~k) = (1/N) ⟨ Σ_{l=j} 1 + Σ_{l≠j} e^{i~k·(~r_l − ~r_j)} ⟩ = (1/N) N + (1/N) N(N−1) ⟨e^{i~k·(~r_l − ~r_j)}⟩.

By writing the thermal average explicitly, we find

S(~k) = 1 + (N(N−1)/N) [∫ d^N~r e^{−βV(~r^N)} e^{i~k·(~r_1 − ~r_2)}] / [∫ d^N~r e^{−βV(~r^N)}]

     = 1 + (N(N−1)/N) [∫ d~r_1 ∫ d~r_2 e^{i~k·(~r_1 − ~r_2)} ∫ d~r_3 … d~r_N e^{−βV(~r^N)}] / [∫ d^N~r e^{−βV(~r^N)}].

The integrals over ~r_3 … ~r_N in the numerator allow us to introduce the correlation function ρ2:

S(~k) = 1 + (1/N) ∫ d~r_1 d~r_2 ρ2(~r_1, ~r_2) e^{i~k·(~r_1 − ~r_2)}.

Change variables to ~r_2 = ~r_1 + ~r, assume translational invariance, and integrate over ~r_1 to obtain our final result,

S(~k) = 1 + (1/N) ∫ d~r_1 ρ² ∫ d~r g(~r) e^{i~k·~r} = 1 + ρ ∫ d~r g(~r) e^{i~k·~r}.   (7.9)

In short, the scattered intensity is proportional to the structure factor, Eq. (7.7), and the structure factor is related to the Fourier transform of the pair correlation function, Eq. (7.9). Therefore the scattered intensity measured in a diffraction experiment directly probes the Fourier transform of the pair correlation function.
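As a concrete check, Eq. (7.8) can be evaluated directly for a small set of particles: for a single configuration the double sum collapses to S(~k) = |Σ_j e^{i~k·~r_j}|²/N. A minimal sketch in Python (the 4×4 lattice is an arbitrary toy configuration, not anything from the text):

```python
import cmath
import math

def structure_factor(k, positions):
    """S(k) = (1/N) |sum_j exp(i k . r_j)|^2 for one configuration (Eq. 7.8)."""
    n = len(positions)
    amp = sum(cmath.exp(1j * (k[0] * x + k[1] * y)) for x, y in positions)
    return abs(amp) ** 2 / n

# Toy configuration: a 4 x 4 square lattice with unit spacing (N = 16).
lattice = [(i, j) for i in range(4) for j in range(4)]

# At k = 0 all phases add coherently, so S(0) = N; the same happens at a
# reciprocal lattice vector such as k = (2*pi, 0).
print(structure_factor((0.0, 0.0), lattice))          # 16.0
print(structure_factor((2 * math.pi, 0.0), lattice))  # ~16.0
```

At a generic wavevector the phases largely cancel and S(~k) is of order one, which is the featureless background of an uncorrelated arrangement.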

In practice, experiments measure the diffraction intensity as a function of the scattering angle θ, the angle between the incident wavevector ~k_i and the outgoing wavevector ~k_f. Since the scattering wavevector is ~k = ~k_f − ~k_i, and since we focus on elastic scattering, |~k_i| = |~k_f|, we then have k = k_i sin(θ/2) + k_f sin(θ/2), or

k = (4π/λ) sin(θ/2),

where λ is the wavelength of the radiation. Hence changing the location of the detector relative to the sample (and hence changing θ) produces the scattered intensity for different wavevectors ~k.

One can distinguish two cases for the density ρ: (i) it corresponds to a local density or local concentration that is uniform in equilibrium, or (ii) it corresponds to a more general order parameter that may not be uniform in equilibrium (e.g., periodic). In the first case, Fourier transformation of a uniform density gives

S(~k) ∝ δ(~k). (7.10)


Figure 7.2: A uniform "boring" system: ⟨ρ(x)ρ(0)⟩ = ρ², and the scattering S(q) ∝ δ(q) along q_y or q_x (shaded region).

Figure 7.3: A perfectly ordered A−B alloy.


The cartoon in Fig. 7.2 may make this clearer. The second case is more subtle. An example is the sublattice concentration of a binary alloy. Say the two atomic species are A and B, and assume that locally A wants to be a neighbor of B, but not of another A. This makes the equilibrium state look like a checkerboard, as shown in Fig. 7.3.

Assuming that A and B atoms scatter the incident wave differently, this equilibrium structure can be seen by analyzing the structure factor S(~k). Note that the A atoms occupy a sublattice, which has twice the lattice spacing of the original lattice, as shown in Fig. 7.4. Hence the density

Figure 7.4: Sublattice 1 (shaded) and sublattice 2 (unshaded); the sublattice spacing 2a is twice the original lattice spacing a.

of A atoms is uniform on this sublattice, with twice the lattice constant of the original system, and the scattering will show peaks not at ~k = 0 but at wavevectors ~k = ~k0 corresponding to that structure,

S(~k) ∝ Σ_{~k0} δ(~k − ~k0),   (7.11)

where k0 = 2π/(2a) and a is the original lattice spacing.

Figure 7.5: Peaks in S(q) at ±q0, on the q_y axis or on the q_x axis.

The same scattered intensity is observed for a crystal, in which Bragg peaks form at specific positions in ~k space. By monitoring the height and width of such peaks, the degree of order can be determined.

A further complication is that usually the sample will not be a single crystal perfectly aligned with the incident radiation, so that one obtains spots on the k_x or k_y axes as shown. Instead, the spots appear at some random orientation (Fig. 7.6). This might seem trivial (one could just realign the sample), except that often a real sample is a polycrystal; that is, it is composed of many single-crystal domains of different orientations that scatter simultaneously. In this case, all the orientations indicated above are smeared into a ring of radius k0 (Fig. 7.7).
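The sublattice peak of Eq. (7.11) is easy to see in a direct computation. A one-dimensional sketch (the chain length and the alternating occupation are arbitrary illustrative choices): with A atoms on every other site of a chain of lattice constant a = 1, the scattered intensity of the A density peaks at k0 = 2π/(2a) = π, in addition to the trivial peak at k = 0.

```python
import cmath
import math

N = 32
# Occupation of A atoms on an alternating A-B chain (lattice constant a = 1).
rho = [1.0 if j % 2 == 0 else 0.0 for j in range(N)]

def intensity(k):
    amp = sum(r * cmath.exp(-1j * k * j) for j, r in enumerate(rho))
    return abs(amp) ** 2 / N

# Sharp peak at the sublattice wavevector k0 = 2*pi/(2a) = pi ...
print(intensity(math.pi))   # ~8.0
# ... and essentially nothing at a generic wavevector.
print(intensity(1.0))       # small
```

The peak at k0 is the one-dimensional analogue of the superlattice Bragg spots of the ordered alloy.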

7.2 Scattering off a flat interface

The two examples just given concern systems that are single phase. Consider now the case of a flat interface (without roughening) separating two uniform phases, as shown in Fig. 7.8. For simplicity


Figure 7.6: For a misaligned single crystal, the pair of peaks at ±q0 appears at some random orientation in the (q_x, q_y) plane.

Figure 7.7: Average over many crystallites: a ring of radius q0.

Figure 7.8: A flat interface at y = 0 separating two uniform phases, shown as a grey scale and as the density profile ρ(y).


let us focus on an interface embedded in dimension d = 2. If the density is uniform in the bulk phases, we can write ρ(x, y) = A θ(y) + B, where θ(y) is the Heaviside step function, and A and B are constants. For simplicity, we will set A = 1 and B = 0 from now on, so that ρ(x, y) = θ(y).

We now proceed to compute the scattering intensity that follows from this configuration. In order to do so, we first need the Fourier transform of the Heaviside function θ. Note that ∂yρ(y) = δ(y); hence ik_y ρ(k_y) = 1, and ρ(k_y) = 1/(ik_y). This result is correct only up to a constant: ∂y(ρ(y) + C) = δ(y) as well, for any constant C, and in Fourier space the general solution is ρ(k_y) = 1/(ik_y) + Cδ(k_y), since k_y δ(k_y) = 0. In order to determine the constant C we note that

θ(y) + θ(−y) = 1.

Let us compute

θ(k_y) = ∫_{−∞}^{∞} dy θ(y) e^{−ik_y y}.

Changing variables y → −y gives

θ(k_y) = ∫_{−∞}^{∞} dy θ(−y) e^{−i(−k_y)y} = ∫_{−∞}^{∞} dy (1 − θ(y)) e^{−i(−k_y)y},

and hence,

θ(k_y) = ∫_{−∞}^{∞} dy e^{−i(−k_y)y} − ∫_{−∞}^{∞} dy θ(y) e^{−i(−k_y)y} = δ(k_y) − θ(−k_y).

Finally, then, θ(k_y) + θ(−k_y) = δ(k_y). This relation applied to ρ gives

ρ(k_y) + ρ(−k_y) = δ(k_y).

Substituting the expression ρ(k_y) = 1/(ik_y) + Cδ(k_y) from above, we have

1/(ik_y) + Cδ(k_y) + 1/(−ik_y) + Cδ(k_y) = δ(k_y),   or   C = 1/2.

The final result for the Fourier transform of the configuration with an interface is

ρ(k_y) = 1/(ik_y) + (1/2)δ(k_y).   (7.12)

More properly, in two dimensions we should write

ρ(~k) = (1/(ik_y))δ(k_x) + (1/2)δ(~k),   (7.13)

as the system is uniform in the x direction. Therefore,

S(~k) = |ρ(~k)|² ∝ (1/k_y²) δ(k_x),   (7.14)

where we have omitted the delta function arising from bulk scattering.

Assume now, more generally, that the unit vector normal to the interface is n, and that the unit tangential vector is n⊥ (Fig. 7.9). Then

S(~k) ∝ (1/(~k · n)²) δ(~k · n⊥).   (7.15)

The scattered intensity then appears as shown in Fig. 7.10: a streak pointed in the direction of n (y in Fig. 7.10).
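The 1/k_y² streak of Eq. (7.14) can be checked numerically by Fourier transforming a discretized step profile along y. A sketch (box size and grid are arbitrary choices; odd harmonics of the box are compared, since at even harmonics the finite-box transform of the step vanishes):

```python
import cmath
import math

L, M = 64.0, 256                 # box size and number of grid points (arbitrary)
dy = L / M
ys = [-L / 2 + dy * m for m in range(M)]
rho = [1.0 if y >= 0 else 0.0 for y in ys]   # theta(y)

def rho_k(ky):
    return dy * sum(r * cmath.exp(-1j * ky * y) for r, y in zip(rho, ys))

# Compare |rho(ky)|^2 at two odd harmonics of the box: it should scale as 1/ky^2.
k1, k2 = 2 * math.pi * 3 / L, 2 * math.pi * 9 / L
ratio = (abs(rho_k(k1)) / abs(rho_k(k2))) ** 2
print(ratio)   # ~9, i.e. (k2/k1)^2
```

The ratio reproduces (k2/k1)² to within the discretization error of the grid.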

7.3 Scattering from a line defect

Assume an idealized model of a line defect (density inhomogeneity) given by,


Figure 7.9: The unit normal n and unit tangent n⊥ of an interface in the (x, y) plane.

Figure 7.10: Scattering from the interface at y = 0: a streak of S(~k) along the k_y axis (k_x = 0), shown as a grey scale.

Figure 7.11: A line defect at y = 0: grey scale and density profile ρ(y).

Figure 7.12: Scattering from a line defect at y = 0: a uniform streak of S(~k) along the k_y axis (k_x = 0), shown as a grey scale.


ρ(x, y) = δ(y).   (7.16)

Then ρ(~k) = δ(k_x), and S(~k) ∝ δ(k_x). If the unit vector normal to the line defect is n, and the tangential vector is n⊥, the proper expression is S(~k) ∝ δ(~k · n⊥) for a line defect.

For a defect of small (but finite) extent, we can imagine a situation in which the density within the defect region is not uniform, but rather corresponds to a modulated phase of the type discussed above for the binary alloy. We then have to incorporate the wavenumber ~k0 corresponding to the modulated phase. One finds

S(~k) ∝ δ((~k − ~k0) · n⊥).   (7.17)

A similar argument can be advanced for the case of an interface now separating modulated phases. One would find

S(~k) ∝ (1/|(~k − ~k0) · n|²) δ((~k − ~k0) · n⊥).   (7.18)

7.4 Scattering from rough interfaces

So far we have focused on idealized situations in which the interface is simply a line of discontinuity, and we have ignored any possible structure.

Before addressing the case of a rough interface, let us consider a diffuse interface, in which the density profile is smooth instead of discontinuous, as shown in Fig. 7.13.

Figure 7.13: A diffuse interface of width ξ, with no roughness.

Although the precise functional dependence does not matter, let us consider the following case:

ρ(x, y) ∼ tanh(y/ξ),   (7.19)

with ξ, assumed small, setting the width of the interface. In this case, one can show that

S(~k) = |ρ(~k)|² ∼ (δ(k_x)/k_y²) (1 + O(ξk_y)²),   (7.20)

for an interface at y = 0. Diffuseness only affects the scattering at k_x = 0, and only for large k_y, of order 1/ξ (again, assuming that ξ is small, so that ξk_y becomes comparable to 1 only at large k_y). The resulting scattering is sketched in Fig. 7.14. This result is general, and does not depend on our choice of tanh for ρ.

Figure 7.14: Extra scattering at large k_y due to diffuseness.


Figure 7.15: A very rough interface, with no diffuseness

A rough interface is not planar, but is rather described by some profile y = h(x) in, say, d = 2. If the rough interface is not diffuse, that is, if it can be locally approximated by a surface of discontinuity, the density satisfies

ρ(x, y) = θ(y − h(x)).   (7.21)

If h(x) involves only small distortions, then by expanding in a Taylor series we can write

ρ ≈ θ(y) − h(x)δ(y) + …,   (7.22)

since dθ/dy = δ(y). Taking the Fourier transform and averaging the square gives

S(~k) = ⟨|ρ(~k)|²⟩ ∼ δ(k_x)/k_y² + ⟨|h(k_x)|²⟩,   (7.23)

where we have omitted unimportant delta functions at the origin. We obtain the rather remarkable result that the intensity of the scattered radiation depends on the spectrum of surface fluctuations ⟨|h(k_x)|²⟩. (In the earlier cases of this chapter we did not consider thermal averages, as the configuration of the interface was unique and fixed. For a rough surface we average over all the equilibrium configurations that the system adopts during the duration of the scattering experiment.)
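The roughness contribution to Eq. (7.23) can be illustrated numerically with a single height mode, h(x) = ε cos(qx), on a grid (the grid size, ε, and q below are all arbitrary illustrative choices, not taken from the text). After subtracting the flat-interface part θ(y), the intensity at k_x = q is set by |h(q)|² = (εL/2)², roughly independently of (small) k_y:

```python
import cmath
import math

L = 128                       # grid size, spacing 1 (illustrative)
eps = 8.0                     # height amplitude (illustrative)
q = 2 * math.pi / L
h = [eps * math.cos(q * m) for m in range(L)]

def rho_k(kx, ky):
    """Fourier transform of theta(y - h(x)) - theta(y): the flat-interface
    part is subtracted, leaving only the roughness contribution."""
    amp = 0.0 + 0.0j
    for m in range(L):
        nmax = math.ceil(h[m])
        for n in range(min(nmax, 0), max(nmax, 0)):
            sign = -1.0 if n >= 0 else 1.0   # theta(n - h) - theta(n)
            amp += sign * cmath.exp(-1j * (kx * m + ky * n))
    return amp

# Intensity at kx = q for two small ky: ~|h(q)|^2 = (eps*L/2)^2 in both cases.
I1 = abs(rho_k(q, 2 * math.pi / L)) ** 2
I2 = abs(rho_k(q, 4 * math.pi / L)) ** 2
print(I1 / (eps * L / 2) ** 2, I1 / I2)   # both close to 1
```

The approximate k_y-independence of this term is what fills in the scattering away from the δ(k_x)/k_y² streak; the agreement here is only up to the quantization error of the integer grid.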

Figure 7.16: Extra scattering due to roughness.

7.5 (*) Scattering from many planar and randomly oriented surfaces

Evidently, the scattering from many randomly oriented surfaces gives a pinwheel-like cluster of streaks, as shown in Fig. 7.17. Angularly averaging over all orientations of the interface gives

S = ∫ dn |ρ_~k|² / ∫ dn,   (7.24)

where n is the unit vector normal to the surface. We write

n = −sin θ x + cos θ y   (7.25)


Figure 7.17: A pinwheel-like cluster of streaks in the (k_x, k_y) plane, and its angular average, which decays as a power of k.

and

n⊥ = cos θ x + sin θ y,   (7.26)

with

~k = k cos β x + k sin β y.   (7.27)

Then, for scattering from a surface, where |ρ_~k|² = δ(~k · n⊥)/|~k · n|², we have (in d = 2, and assuming a uniform distribution of orientations θ)

S = (1/2π) ∫_0^{2π} dθ δ[k cos β cos θ + k sin β sin θ] / |−k cos β sin θ + k sin β cos θ|²

  = (1/2π) (1/k³) ∫_0^{2π} dθ δ[cos(θ − β)] / |sin(θ − β)|²

  = (1/2π) (1/k³) ∫_0^{2π} dθ δ[cos θ] / |sin θ|².

But cos θ = 0 at θ = π/2 and 3π/2, where |sin θ|² = 1, so that

S(k) = (1/π) (1/k³).   (7.28)

Similarly, in arbitrary dimension d one has

S(k) ∝ δ(~k · n⊥)/|~k · n|² ∼ 1/(k^{d−1} k²) ∼ 1/k^{d+1},   (7.29)

for scattering from a surface. We have used the facts that the tangent space to the surface is (d − 1)-dimensional, so that δ(~k · n⊥) stands for d − 1 delta functions, and that δ(ax) = δ(x)/a.

For scattering from a line defect, for which |ρ(~k)|² ∝ δ(k_x), one similarly obtains

S(k) ∝ 1/k^{d−1}.   (7.30)

We shall call these results Porod's law. They involve only geometrical considerations about the interface.

The case where ρ is not the order parameter, but a modulated structure exists instead, is more subtle. If the wavenumber of the modulated phase is ~k0, then this introduces a new angle α via

~k0 = k0 cos α x + k0 sin α y.   (7.31)

Now there are two possible averages:

1. α or β: averages over crystallite orientation, or over the angles in a non-angularly-resolved detector. These are equivalent (look at the figure for a few seconds), so we only need to do one of these two averages.

2. θ: averages over surface orientation.


Figure 7.18:

Figure 7.19: Averaging over α or β for a fixed interface normal n = y.

If we average over α or β for a fixed n, it is easy to anticipate the answer, as shown in Fig. 7.19, where in the cartoon the regions around (k_x = k0, k_y = 0) retain the singularity of the original δ(k_x − k0)/(k_y − k0)². This is basically a detector problem, so instead we will consider averaging over θ first. In fact, it is obvious what the result of such an average is: it must be the previous result, but shifted from ~k = 0

Figure 7.20: The θ average shifts the pinwheel of streaks from the origin to the ring |~k| = k0.

to ~k = ~k0, that is,

S_surface(k) ∝ 1/|~k − ~k0|^{d+1},   S_line(k) ∝ 1/|~k − ~k0|^{d−1}.   (7.32)

If a further average is done over crystallite orientation, we can take these expressions and conduct an angular average over the angle between ~k and ~k0. Clearly, after averaging we will have

S = S((|k| − |k0|)/|k0|) ≡ S(∆k).   (7.33)

First, let

φ = β − α,   (7.34)


so that

(~k − ~k0)² = k² − 2kk0 cos φ + k0²,

or, on using

k ≡ k0(∆k + 1),   (7.35)

we have

(~k − ~k0)² = k0² [(∆k)² + 2(1 − cos φ)(∆k + 1)].   (7.36)

Hence in two dimensions, for a surface, we have

S(∆k) = (1/2πk0³) ∫_0^{2π} dφ 1/|(∆k)² + 2(1 − cos φ)(∆k + 1)|^{3/2},   (7.37)

which is a complicated integral. We will work out its asymptotic limits ∆k ≫ 1 and ∆k ≪ 1. First, if ∆k ≫ 1, then

S(∆k) = (1/2πk0³) (1/(∆k)³) ∫_0^{2π} dφ + …,

or (in arbitrary d)

S(∆k) ∼ 1/(∆k)^{d+1},   ∆k ≫ 1.   (7.38)

If ∆k ≪ 1, consider setting ∆k = 0 directly:

S(∆k → 0) =? (1/2πk0³) ∫_0^{2π} dφ 1/|2(1 − cos φ)|^{3/2},

which diverges due to the φ = 0 (and φ = 2π) contribution. So we keep a small ∆k. For convenience, we will also expand around φ = 0:

S(∆k) ≈ (1/2πk0³) ∫_0^{2π} dφ 1/|(∆k)² + φ²|^{3/2} + …

Now let u = φ/∆k:

S(∆k) = (1/2πk0³) (1/(∆k)²) ∫_0^{2π/∆k → ∞} du 1/|1 + u²|^{3/2},

so

S(∆k) ∼ 1/(∆k)²   as ∆k → 0,   (7.39)

for all dimensions d. If one does this in d dimensions with

S = 1/|~k − ~k0|^γ,   (7.40)

then

S ∼ 1/(k − k0)^{γ−(d−1)},   (k − k0)/k0 ≪ 1,
S ∼ 1/(k − k0)^γ,   (k − k0)/k0 ≫ 1,   (7.41)

provided γ − (d − 1) > 0. For the case γ = d − 1, one obtains

S ∼ −ln(k − k0),   (k − k0)/k0 ≪ 1,
S ∼ 1/(k − k0)^{d−1},   (k − k0)/k0 ≫ 1.   (7.42)

It should be noted that the actual scattering is given by a difficult integral like Eq. (7.37), which is pretty close to an elliptic integral.

Figure 7.21: Scattered intensity near k0: ∼ 1/|k − k0|² near the center of the ring, ∼ 1/|k − k0|^{d+1} in the tails.
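Rather than evaluating that integral in closed form, one can check the two asymptotic regimes of Eq. (7.37) numerically. A sketch (the grid resolution and the probe values of ∆k are arbitrary):

```python
import math

def S(dk, steps=200000):
    """Integral of Eq. (7.37) without the 1/(2*pi*k0^3) prefactor."""
    total, dphi = 0.0, 2 * math.pi / steps
    for i in range(steps):
        phi = (i + 0.5) * dphi
        total += dphi / (dk ** 2 + 2 * (1 - math.cos(phi)) * (dk + 1)) ** 1.5
    return total

# Effective exponents d ln S / d ln(dk) in the two regimes:
small = math.log(S(0.02) / S(0.01)) / math.log(2.0)
large = math.log(S(200.0) / S(100.0)) / math.log(2.0)
print(small, large)   # ~ -2 (Eq. 7.39) and ~ -3 (Eq. 7.38 with d = 2)
```

The crossover between the 1/(∆k)² center and the 1/(∆k)³ tails is exactly what the cartoon of the ring of scattering shows.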


Chapter 8

Fluctuations and Crystalline Order

We explore in this chapter an important example of a spontaneously broken symmetry: the formation of a solid crystalline lattice. We will pay special attention to the role of thermal fluctuations in lattice structure, and to how they can be measured through diffraction experiments.

Consider a regular crystalline lattice such as the one shown in Fig. 8.1. The atomic density ρ

Figure 8.1: Atoms on a lattice, and the resulting density ρ = ρ(~r).

is a function of ~r as shown. A crystalline lattice, although it looks quite symmetric in a common-sense way, actually has very little symmetry compared to a liquid or a vapor. Both liquid and vapor are translationally invariant and isotropic. A crystalline solid, on the other hand, has a very restricted symmetry. As you can see from Fig. 8.1, the lattice is invariant only under a discrete set of translations:

~r → ~r + r0(nx + my),   (8.1)

where n and m are integers (positive or negative), and r0 is the lattice constant.

Figure 8.2: Invariance holds only for discrete translations ~r → ~r′; shown here is the translation by r0(+1x − 1y).

Clearly the crystal

has less symmetry than both a liquid and a vapor, so that the formation of a crystalline solid from the liquid phase must necessarily break the translational and rotational invariance of the liquid. As a consequence, there must be a broken symmetry variable that satisfies Goldstone's theorem.

In order to pursue this further, one introduces the displacement field ~u(~r). Let ~r0 be the equilibrium position of a lattice point, and assume that under the influence of thermal fluctuations the particle located at ~r0 in equilibrium is displaced to ~r. The displacement is then simply ~u = ~r − ~r0. This same quantity can be defined for each lattice point, and in the continuum limit the displacement becomes a continuum field ~u(~r) that describes the local lattice distortion with


reference to its equilibrium configuration. One such distorted configuration and the associated displacements are shown in Fig. 8.3. The instantaneous density of any given configuration can therefore be described as ρ(~r + ~u(~r)), where ~u(~r) is the local displacement vector.

Figure 8.3: Distorted configuration due to thermal fluctuations: ~u(~r) connects the equilibrium averaged position to the actual position.

We are now in a position to understand what symmetry is broken spontaneously by the formation of a crystalline lattice. Consider a uniform displacement ~u. Figure 8.4 shows two such cases. If all three configurations have the same lattice constant, they must have the same energy. In other words, the energy must be invariant under displacements of the entire lattice by a constant ~u.

Figure 8.4:

However, in any given experiment a definite value of ⟨~u(~r)⟩ is obtained, corresponding to a particular location of the lattice. Therefore, the actual lattice breaks the translational and rotational invariance of the energy of the system. In all respects, the variable ~u plays the same role in crystalline solids as the variable h(~x) played in our earlier analysis of interface fluctuations.

By appealing now to Goldstone’s Theorem, we may anticipate that in Fourier space, if all direc-tions of fluctuation are equally likely, then

〈~u(~k)~u(~k′)〉 ∝ (2π)dδ(~k + ~k′)(1

k2) (8.2)

In order to pursue this analogy further, consider for a moment a lattice modeled as a one-dimensional chain of particles connected by harmonic springs. The argument can easily be generalized to higher dimensions and to more realistic interatomic potentials. Let u_n be the instantaneous displacement of the n-th particle in the chain. The energy of any configuration is given by

E = (C/2) Σ_n (u_{n+1} − u_n)²,   (8.3)

where C is the spring constant, and we have neglected chain-end effects by, for example, assuming periodic boundary conditions. Let a be the equilibrium separation between particles, or lattice constant. Then

E = (Ca²/2) Σ_n (u_{n+1} − u_n)²/a².


We now take the continuum limit a → 0, and note that in this limit (u_{n+1} − u_n)/a → du/dx, where x is the coordinate along the chain; we further define K = Ca², held finite. In this limit,

E = (K/2) ∫ dx (du/dx)².

This energy is formally the same as the energy of an interface fluctuation in the limit of small distortions. The important observation is that the energy does not depend on u directly, but rather on its derivative du/dx. Therefore, any two configurations that differ by a constant displacement have the same energy. The energy is invariant under uniform displacement.

Given Eq. (8.2), we immediately find

⟨u²⟩ = ∫_{2π/L}^{Λ} d^d k/(2π)^d (1/k²),

so that ⟨u²⟩ is finite in d = 3, whereas ⟨u²⟩ ∼ ln L in d = 2 and ⟨u²⟩ ∼ L in d = 1. Due to the broken symmetry, fluctuations in displacement diverge with the system size below d = 3, and are finite in d = 3. We will examine below the consequences of this behavior.
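The dimension dependence can be seen directly by doing the radial part of this integral numerically (angular factors are dropped, and the cutoff Λ = 10 is an arbitrary choice):

```python
import math

def u2(d, L, cutoff=10.0, steps=100000):
    """<u^2> ~ integral from 2*pi/L to cutoff of dk k^(d-1) / k^2,
    with the angular factors dropped."""
    kmin = 2 * math.pi / L
    dk = (cutoff - kmin) / steps
    return sum((kmin + (i + 0.5) * dk) ** (d - 3) * dk for i in range(steps))

for L in (1e2, 1e4):
    print([round(u2(d, L), 2) for d in (1, 2, 3)])
# d = 1 grows like L, d = 2 grows like ln L, d = 3 saturates as L -> infinity
```

Increasing L by 100 leaves the d = 3 value essentially unchanged, adds ln 100 in d = 2, and multiplies the d = 1 value by roughly 100.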

8.1 Calculation of ⟨ρ⟩ and ⟨ρ(0)ρ(~r)⟩

We consider in this section the consequences of Eq. (8.2) on ⟨ρ(~r)⟩ and on the results of scattering experiments.

Figure 8.5: Rough depiction of a one-dimensional periodic variation of density, ρ(x) ∝ (cos Gx + 1), with period 2π/G.

First note that the density of a perfectly periodic crystalline lattice can be written as

ρ(x) = Σ_G e^{iGx} a_G,   (8.4)

following Fourier's decomposition theorem. The wavenumbers G are called reciprocal lattice vectors in d = 3, and the a_G are the corresponding Fourier coefficients. Each particular crystalline structure (e.g., cubic, tetragonal, etc.) is characterized by a definite set of vectors ~G, and the details of the molecular structure of the solid are contained in the a_{~G}.

Figure 8.6: A more accurate depiction of a one-dimensional variation in density, ρ(x) = Σ_G e^{iGx} a_G, with period 2π/G.

The instantaneous density of any distorted configuration can be written in terms of the displacement field,

ρ(~r + ~u(~r)) = Σ_{~G} e^{i~G·(~r + ~u(~r))} a_{~G},   (8.5)


where both the set of ~G's and the a_{~G} correspond to the undistorted structure. The statistical average is simply

⟨ρ(~r)⟩ = Σ_{~G} e^{i~G·~r} a_{~G} ⟨e^{i~G·~u(~r)}⟩.   (8.6)

In order to calculate the average, we assume that ~u(~r) is a Gaussian variable, and recall a mathematical property of Gaussianly distributed variables: if x is Gaussianly distributed with zero mean, then

⟨e^{ξx}⟩ = e^{(ξ²/2)⟨x²⟩}   (8.7)

for any constant ξ. This result can be generalized to a multivariate Gaussian distribution as follows:

⟨e^{Σ_n ξ_n x_n}⟩ = e^{(1/2) Σ_{m,n} ξ_m ξ_n ⟨x_m x_n⟩}.
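The identity Eq. (8.7) is easy to verify by direct Monte Carlo sampling; a sketch (the values of the standard deviation and of ξ are arbitrary):

```python
import math
import random

random.seed(1)
sigma, xi, n = 0.7, 0.9, 200000
samples = [random.gauss(0.0, sigma) for _ in range(n)]

lhs = sum(math.exp(xi * x) for x in samples) / n    # <exp(xi x)> by sampling
rhs = math.exp(0.5 * xi ** 2 * sigma ** 2)          # exp(xi^2 <x^2> / 2)
print(lhs, rhs)   # the two agree closely
```

The sampled average matches the closed form to within the Monte Carlo error of the finite sample.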

With this property and the assumption that ~u is Gaussian, one has

⟨e^{i~G·~u}⟩ = e^{−(1/2) ~G~G : ⟨~u~u⟩}.   (8.8)

If fluctuations of ~u along different directions are uncorrelated, ⟨u_α u_β⟩ ∝ δ_{αβ}, then

⟨e^{i~G·~u}⟩ ≈ e^{−(1/2)G²⟨u²⟩}.

The variance of the displacement field follows from our earlier result, Eq. (8.2):

⟨u²⟩ = ∫_{2π/L}^{Λ} d^d k/(2π)^d (1/k²) ∝ constant (d = 3);  ln L (d = 2);  L (d = 1).   (8.9)

Substitution of this result into Eq. (8.6) leads to the desired result,

⟨ρ⟩ = Σ_{~G} e^{−(1/2)G²⟨u²⟩} e^{i~G·~r} a_{~G},   (8.10)

where e^{−(1/2)G²⟨u²⟩} is the so-called Debye-Waller factor. This factor is a constant for each G in d = 3, and hence produces only relatively small changes in the average density. In d = 2 one has instead

⟨ρ⟩ = Σ_{~G} e^{i~G·~r} a_G e^{−(1/2)G² ln L},   (8.11)

except for unimportant constants. Note that in the strict limit L → ∞ the exponential term vanishes, and hence ⟨ρ⟩ = constant. The periodicity of the lattice is completely destroyed by fluctuations in the displacement field. Since the divergence with L is only logarithmic, this effect is weak in practice. It is more pronounced, however, in d = 1:

⟨ρ⟩ = Σ_{~G} e^{i~G·~r} a_G e^{−(1/2)G²L},   (8.12)

which is a large effect: the exponential suppression overwhelms the periodicity as L → ∞. This is also related to the Landau-Peierls theorem, which states that a one-dimensional variation in density ρ(x) (whether a crystal structure or phase coexistence) is impossible: the only G which can survive in Eq. (8.12) as L → ∞ is G = 0, which gives ρ(x) → constant, due to fluctuations in ~u(~r).


8.1.1 Scattering intensity

We now turn to the calculation of the scattering intensity I(~k), where

I(~k) ∝ |ρ_~k|²,   (8.13)

as indicated earlier. As a reference, we consider first a perfect lattice (in the absence of fluctuations or displacements). Then, from Eq. (8.4),

ρ_~k = ∫ d^d r e^{−i~k·~r} Σ_{~G} e^{i~G·~r} a_{~G} = Σ_{~G} a_G ∫ d^d r e^{−i(~k−~G)·~r} = Σ_{~G} a_G δ(~k − ~G).

Therefore,

|ρ_~k|² = Σ_{~G,~G′} a_G a*_{G′} δ(~k − ~G) δ(~k − ~G′) ∝ Σ_{~G} |a_G|² δ(~k − ~G),   (8.14)

since the delta functions enforce ~k = ~G and ~k = ~G′, hence ~G = ~G′. The quantity |a_G|² is called the atomic form factor, and describes the molecular arrangements within a unit cell of the crystal. The scattering intensity corresponding to Eq. (8.14) displays well-defined peaks at ~k = ~G, the reciprocal lattice vectors. This is shown schematically in Fig. 8.7.

Figure 8.7: In the absence of thermal fluctuations, the scattering I_k = |ρ_k|² gives sharp peaks at the various ~G's.

We next consider the general case of a fluctuating crystal, in which the displacement field is not zero. Then

ρ_~k = ∫ d^d~r e^{−i~k·~r} Σ_{~G} e^{i~G·~r} a_G e^{i~G·~u(~r)},

or,

ρ_~k = Σ_{~G} a_G ∫ d^d~r e^{−i(~k−~G)·~r} e^{i~G·~u(~r)},   (8.15)

so that

ρ*_~k = Σ_{~G′} a*_{G′} ∫ d^d~r′ e^{i(~k−~G′)·~r′} e^{−i~G′·~u(~r′)}.

Hence the structure factor is

⟨|ρ_~k|²⟩ = Σ_{~G,~G′} a_G a*_{G′} ∫ d^d~r ∫ d^d~r′ e^{−i(~k−~G)·~r} e^{i(~k−~G′)·~r′} ⟨e^{i~G·~u(~r) − i~G′·~u(~r′)}⟩.   (8.16)

If the system overall is invariant under translation, we have that

⟨e^{i~G·~u(~r) − i~G′·~u(~r′)}⟩ = f(~r − ~r′),   (8.17)


so that one of the integrals can be eliminated:

⟨|ρ_~k|²⟩ = Σ_{~G,~G′} a_G a*_{G′} (2π)^d δ(~G − ~G′) ∫ d^d~r e^{−i(~k−~G)·~r} f(~r),   (8.18)

where

f(~r) = ⟨e^{i~G·~u(~r) − i~G·~u(0)}⟩.

Evaluating the square, assuming as before that the displacement field is a Gaussian variable and that different components of ~u are uncorrelated, we find

f(~r) = e^{−(G²/2)⟨(u(~r) − u(0))²⟩} = e^{−G²⟨u²⟩} e^{G²⟨u(~r)u(0)⟩},   (8.19)

and therefore,

⟨|ρ_~k|²⟩ = Σ_{~G} |a_{~G}|² e^{−G²⟨u²⟩} ∫ d^d r e^{−i(~k−~G)·~r} e^{G²⟨u(~r)u(0)⟩}.   (8.20)

The correlation function ⟨u(~r)u(0)⟩ appearing in the exponent can be calculated by inverse Fourier transformation of ⟨|u(k)|²⟩ ∼ 1/k²:

⟨u(~r)u(0)⟩ ∝ 1/r (d = 3);  −ln r (d = 2);  −r (d = 1).   (8.21)

In d = 3, the Bragg peaks are attenuated by the Debye-Waller factor but do not change location: the decaying correlation function in the exponent produces no qualitative effect, and the scattering retains the same delta function peaks.

Figure 8.8: Delta function singularities of I(k) at the ~G's in d = 3.

In d = 2, the situation is more interesting:

⟨|ρ_~k|²⟩ = Σ_{~G} e^{−G²⟨u²⟩} |a_G|² ∫ d^d r e^{−i(~k−~G)·~r} e^{−c ln r}

         = Σ_{~G} e^{−G²⟨u²⟩} |a_G|² ∫ d^d r e^{−i(~k−~G)·~r}/r^c   (the Fourier transform of a power law)

         = Σ_{~G} e^{−G²⟨u²⟩} |a_G|² 1/|~k − ~G|^{d−c},

where c is a constant. It is customary to set d − c = 2 − η, and write

⟨|ρ_~k|²⟩ = Σ_{~G} e^{−G²⟨u²⟩} |a_G|² 1/|~k − ~G|^{2−η}.   (8.22)


Thermal fluctuations in d = 2 change the delta function singularities at the Bragg peak locations, characteristic of long-ranged crystalline order, into power law singularities, still at wavevectors equal to the reciprocal lattice vectors (Fig. 8.9). This broadening of the spectral lines reflects the enhanced effect of thermal fluctuations in d = 2.

Figure 8.9: Power law singularities of I(k) at the ~G's in d = 2.

The role of fluctuations is even more dramatic in d = 1. We first recall that

∫_{−∞}^{∞} dx e^{−b|x|} e^{−ikx} = 2b/(b² + k²),

so that

⟨|ρ_k|²⟩ ∝ Σ_{~G} |a_G|² ∫ d^d r e^{−i(~k−~G)·~r} e^{−cr}.

This is the Fourier transform of an exponential: ∫ d^d r e^{i~k′·~r} e^{−r/ξ} ∝ 1/((k′)² + ξ^{−2}) for small k′. Hence,

⟨|ρ_k|²⟩ ∝ Σ_G |a_G|² 1/((k − G)² + ξ^{−2}),   (8.23)

where ξ^{−2} is a constant. The resulting scattering intensity is sketched in Fig. 8.10. Thermal fluctuations have removed any trace of a singularity at k = G, which indicates the lack of any long range order (any periodicity) in a one-dimensional crystal.

Figure 8.10: No singularities of I(k) at the G's in d = 1.
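The Fourier transform pair used above can be verified numerically (the values of b and k, and the integration box, are arbitrary):

```python
import math

def ft_exp(b, k, X=40.0, steps=200000):
    """Numerical integral of exp(-b|x|) cos(k x) dx over [-X, X]: the real
    part of the transform of exp(-b|x|); the imaginary part vanishes by symmetry."""
    dx = 2 * X / steps
    total = 0.0
    for i in range(steps):
        x = -X + (i + 0.5) * dx
        total += math.exp(-b * abs(x)) * math.cos(k * x) * dx
    return total

b, k = 0.5, 1.3
print(ft_exp(b, k), 2 * b / (b ** 2 + k ** 2))   # both ~0.515
```

The numerical transform reproduces the Lorentzian 2b/(b² + k²), the line shape that replaces the Bragg delta functions in d = 1.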

There are two other important consequences associated with spontaneously broken symmetries which we will not discuss here, but that you can look up for yourself. The first is that continuous broken symmetries of the type we have encountered are usually associated with so-called soft modes: normal modes corresponding to long-wavelength excitations or fluctuations that cost vanishingly little energy. For fluctuating surfaces, these modes are called capillary waves; for crystalline solids, they are known as phonons. The second important concept is that of "rigidity". For a crystalline solid, this is the everyday notion of rigidity: the solid appears rigid because any


distortion requires that many atoms move in concert. This is a direct consequence of the broken symmetry. The same concept applies to other broken symmetries.


Chapter 9

Ising Model

This is the first in a sequence of chapters in which we will explore the consequences of a spontaneously broken symmetry as it appears at a phase transition. We begin in this chapter by describing the Ising model of ferromagnetism, which we will use to illustrate the general concepts.

There are many phenomena in nature in which a large number of constituents interact to give some global behavior. A magnet, for example, is composed of many interacting magnetic dipoles of molecular origin. A liquid is made up of many interacting molecules. A computer program has many interacting commands. The economy, the psychology of road rage, plate tectonics, and many other phenomena share the fact that all involve global behavior due to local interactions between a large number of entities.

The Ising model was introduced as a simple model of ferromagnetism in an attempt to describethe collective behavior that emerges at the Curie point of magnets: at this point the material turnsfrom a paramagnet into a permanent magnet (or ferromagnet). It is defined on a regular lattice,each site of which contains a magnetic dipole or spin. In the simplest version, each spin can onlyappear in two microscopic states +1 (for up, in some direction) or −1 (down) (Fig. 9.1). The valueof the spin variable at any lattice site is determined by the spin’s (usually short range) interactionwith the other neighboring spins. A sketch of the model in two dimensions on a square lattice isshown in Fig. 9.2.

Figure 9.1: The two spin states: up arrow, S = +1; down arrow, S = −1.

Figure 9.2: Possible configurations for an N = 4² = 16 system.

In order to calculate the partition function of this model, we would express the sum over states as

∑_ν = ∑_{S_1=−1}^{+1} ∑_{S_2=−1}^{+1} · · · ∑_{S_N=−1}^{+1} = ∏_{i=1}^{N} ∑_{S_i=−1}^{+1}

where N is the total number of spins in a lattice of arbitrary dimensionality. The energy E_ν of any configuration, entering through the Boltzmann factor e^{−E_ν/k_BT}, must now be prescribed. Since we want to model a ferromagnet, where all the spins are aligned either +1 or −1, we make the spin at site i want to orient parallel to its neighbors. That is, for the spin at site i we define the following interaction energy,

E_i = ∑_{neigh of i} (S_i − S_neigh)² = ∑_{neigh of i} (S_i² − 2 S_i S_neigh + S_neigh²).  (9.1)

But S_i² = 1 always, since S_i = ±1, so up to a constant (additive and multiplicative)

E_i = −∑_{neigh of i} S_i S_neigh.

On a square lattice, the neighbors of i are the spins surrounding it (see Fig. 9.3 for the two dimensional case).

Figure 9.3: Four nearest neighbors on a square lattice: the middle spin is i, and its neighbors are to the north, south, east, and west.

This choice of interaction energy reflects the assumption that interactions at the microscopic level are local in nature. The choice of the 4 spins above, below, and to the right and left is special (restricted to "nearest neighbors"), and other choices exist in the literature. For example, one can include interactions with spins on the diagonals (NW, NE, SE, SW) that are "next nearest neighbors". It does not actually matter for our purposes how many are included, so long as the interaction is local. The total energy of any microscopic configuration is the sum over all spins of the interactions just described. We write

E_ν = −J ∑_{⟨ij⟩} S_i S_j

where J is a positive interaction constant that sets the energy scale, and ⟨ij⟩ denotes a sum over all i and j = 1, 2, ..., N, provided i and j are nearest neighbors, with no double counting. Note that the sum over spins has

∑_{⟨ij⟩} = (1/2) ∑_{i=1}^{N} ∑_{j nn of i} = N · 4/2

terms on the square lattice. If there were "q" nearest neighbors instead, corresponding to a different interaction pattern, to a different lattice (say hexagonal), or to a different dimension of space, then

∑_{⟨ij⟩} has (q/2)N terms.

The value of q is q = 2 for the one dimensional Ising model, q = 4 for a two dimensional square lattice, and q = 6 for a simple cubic lattice in three dimensions. The lowest energy is attained when all spins are aligned along the same orientation, in which case E = −(qJ/2)N.
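This counting is easy to check by brute force. The sketch below (not part of the notes; the lattice size L and coupling J are illustrative) builds the nearest-neighbor bond list of a small periodic square lattice and verifies that there are qN/2 bonds, so that the fully aligned state has energy −qJN/2.

```python
# Count nearest-neighbour bonds of an L x L square lattice with periodic
# boundaries: each site has q = 4 neighbours, and counting each bond once
# gives qN/2 bonds, so the aligned ground state has E = -qJN/2.
L, J, q = 4, 1.0, 4
N = L * L
bonds = []
for x in range(L):
    for y in range(L):
        i = x * L + y
        bonds.append((i, ((x + 1) % L) * L + y))  # bond in the +x direction
        bonds.append((i, x * L + (y + 1) % L))    # bond in the +y direction

assert len(bonds) == q * N // 2        # qN/2 bonds, no double counting
E_ground = -J * len(bonds)             # aligned state: S_i S_j = 1 on every bond
assert E_ground == -q * J * N / 2
```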

Page 91: PHYS 559 - Advanced Statistical Mechanics - …coish/courses/phys559/phys559...ii Preface These are lecture notes for PHYS 559, Advanced Statistical Mechanics, which I’ve taught

87

One can also include in this model the effect of an externally imposed magnetic field H, which tends to align spins parallel to the field. The energy of a configuration in a magnetic field is

E_ν = −J ∑_{⟨ij⟩} S_i S_j − H ∑_i S_i.  (9.2)

The corresponding partition function is

Z(N, J/k_BT, H/k_BT) = ∏_{i=1}^{N} ∑_{S_i=−1}^{+1} e^{(J/k_BT) ∑_{⟨jk⟩} S_j S_k + (H/k_BT) ∑_i S_i}.  (9.3)
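For a tiny system, Eq. (9.3) can be evaluated by brute-force enumeration. A minimal sketch (not from the notes; the 2×2 open-boundary lattice and all parameter values are illustrative) which also confirms that flipping every spin maps H → −H, so that Z(H) = Z(−H) exactly:

```python
import math
from itertools import product

# Brute-force partition function for a 2 x 2 Ising system with open
# boundaries (4 bonds), following Eq. (9.3). Parameters are illustrative.
J, H, kBT = 1.0, 0.5, 1.0
bonds = [(0, 1), (2, 3), (0, 2), (1, 3)]  # nearest-neighbour pairs

def Z(h):
    total = 0.0
    for S in product((+1, -1), repeat=4):
        E = -J * sum(S[i] * S[j] for i, j in bonds) - h * sum(S)
        total += math.exp(-E / kBT)
    return total

# Flipping every spin maps H -> -H, so Z(H) = Z(-H) exactly.
assert abs(Z(H) - Z(-H)) < 1e-10 * Z(H)
```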

The one dimensional model was solved by Ising in his doctoral thesis (we will solve it below). The two dimensional model was solved by Onsager for H = 0, in one of the premier works of mathematical physics. We will not give that solution; you can read it in Landau and Lifshitz's "Statistical Physics", which gives a readable, physical, but still difficult treatment. The three dimensional Ising model has not yet been solved.

The most interesting consequence of the Ising model is that it exhibits a broken symmetry (for H = 0). In the absence of a field, the energy of a state, and hence the partition function, have an exact symmetry S_i ↔ −S_i (the group associated with this symmetry is Z₂). Hence ⟨S_i⟩ ↔ −⟨S_i⟩, and therefore

⟨S_i⟩ = 0.

In fact, for dimensions d = 2 and 3 this is not true. An experimental realization spontaneously breaks this symmetry, exhibiting either positive or negative average magnetization ⟨S_i⟩. The exact result for the magnetization in d = 2 is surprisingly simple:

m = ⟨S_i⟩ = ±(1 − cosech⁴(2J/k_BT))^{1/8} for T < T_c, and m = 0 for T > T_c.

The critical temperature T_c ≈ 2.269 J/k_B is where (1 − cosech⁴(2J/k_BT))^{1/8} vanishes, i.e., where sinh(2J/k_BT_c) = 1.
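The numerical value of T_c follows from the vanishing condition; a quick check (not part of the notes) by bisection in the reduced temperature t = k_BT/J, compared against the closed form t_c = 2/ln(1 + √2):

```python
import math

# Solve sinh(2J / kB Tc) = 1 for the reduced temperature t = kB Tc / J
# by bisection; the closed form is t_c = 2 / ln(1 + sqrt(2)) ≈ 2.269.
f = lambda t: math.sinh(2.0 / t) - 1.0
lo, hi = 1.0, 4.0  # f(lo) > 0, f(hi) < 0, so the root is bracketed
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if f(mid) > 0:
        lo = mid
    else:
        hi = mid
t_c = 0.5 * (lo + hi)

assert abs(t_c - 2.0 / math.log(1.0 + math.sqrt(2.0))) < 1e-12
assert abs(t_c - 2.269) < 1e-3
```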

Figure 9.4: Two dimensional square lattice Ising model magnetization per spin: m = ⟨S_i⟩ is zero above T_c and rises toward 1 as T decreases from T_c to 0.

A qualitative interpretation of this transition from zero to nonzero average magnetization at T_c is easy to give. Consider the Helmholtz free energy F = E − TS, where S is now the entropy. At low temperatures, F is minimized by minimizing E. The largest negative value E can attain is −(q/2)JN, when all spins are aligned +1 or −1. So at low temperatures (and rigorously at T = 0), ⟨S_i⟩ → +1 or ⟨S_i⟩ → −1. At high temperatures, entropy maximization dominates the minimization of F. Clearly entropy is maximized by having the spins helter-skelter at random: ⟨S_i⟩ = 0, since plus and minus one appear with the same probability.

The transition temperature between ⟨S_i⟩ = 1 and ⟨S_i⟩ = 0 should be where the thermal energy is comparable to the energy per spin, k_BT = O((q/2)J), which is in the right ballpark. However, there is no simple explanation for the abrupt increase of the magnetization per spin m = ⟨S_i⟩ at T_c. This abruptness is in contrast with the case of a paramagnet, which shows a gradual decrease in magnetization with increasing temperature (in the presence of an external field H). In the ferromagnet, the fundamental up-down symmetry is broken at exactly T_c.


Figure 9.5: Paramagnet: for H > 0, m = ⟨S_i⟩ decreases gradually with T from 1; there is no broken symmetry.

Figure 9.6: Phase diagram of the Ising model. Only on the line H = 0, T < T_c is there spontaneous magnetization (and coexisting phases); the region between the ±m branches below T_c is forbidden for a single phase. For H > 0 (H < 0), m is positive (negative) at all temperatures.


The phase diagram for the Ising model (or indeed for any ferromagnet) is sketched in Fig. 9.6. The true breaking of the up-down symmetry only appears for H = 0. Otherwise, there is always a finite, residual magnetization at any temperature.

9.1 (*) Lattice Gas Model

In this model, one still has a lattice with sites labeled i = 1, 2, ..., N, but now there is a concentration variable c_i = 0 or 1 defined at each site (i.e., the site is either empty or occupied). The simplest model of interaction is the lattice gas model, defined by

E_ν^LG = −ϑ ∑_{⟨ij⟩} c_i c_j − μ ∑_i c_i

where ϑ is an interaction energy, and μ is the constant chemical potential.

One can think of this as a model of a binary alloy (where c_i is the local concentration of one of the two chemical species), or of a liquid-vapor mixture (where c_i is the local density, e.g., zero for the vapor). This model can be mapped onto the Ising model by introducing the transformation

S_i = 2c_i − 1.

Then

E_ν^Ising = −J ∑_{⟨ij⟩} (2c_i − 1)(2c_j − 1) − H ∑_i (2c_i − 1)

= −4J ∑_{⟨ij⟩} c_i c_j + 4J ∑_{⟨ij⟩} c_i + constant − 2H ∑_i c_i + constant

= −4J ∑_{⟨ij⟩} c_i c_j + 4J (q/2) ∑_i c_i − 2H ∑_i c_i + constant

and

E_ν^Ising = −4J ∑_{⟨ij⟩} c_i c_j − 2(H − Jq) ∑_i c_i + constant.

Since the additive constant is unimportant, this is the same interaction energy as E_ν^LG, provided that

ϑ = 4J,  μ = 2(H − Jq).

The phase transition occurs at a specific value of the chemical potential, μ_c = −2Jq (corresponding to H = 0). It is sometimes said that the liquid-vapor transition, as opposed to the Ising model, does not involve a change of symmetry (associated with the group Z₂). This is not correct; however, the external chemical potential must be carefully tuned to the value μ_c for that symmetry to appear.
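The mapping can be checked directly: under S_i = 2c_i − 1, with ϑ = 4J and μ = 2(H − Jq), the two energies should differ by the same constant for every configuration. A minimal sketch (not part of the notes) on a small periodic chain, where q = 2; the values of N, J, and H are illustrative:

```python
import random

# Under S_i = 2c_i - 1 with theta = 4J, mu = 2(H - Jq), the lattice-gas
# and Ising energies differ only by a configuration-independent constant.
# Check on a small periodic chain (q = 2) with random configurations.
N, J, H, q = 8, 1.0, 0.4, 2
theta, mu = 4 * J, 2 * (H - J * q)
rng = random.Random(0)

def e_ising(S):
    return -J * sum(S[i] * S[(i + 1) % N] for i in range(N)) - H * sum(S)

def e_lg(c):
    return -theta * sum(c[i] * c[(i + 1) % N] for i in range(N)) - mu * sum(c)

diffs = set()
for _ in range(20):
    c = [rng.randint(0, 1) for _ in range(N)]
    S = [2 * ci - 1 for ci in c]
    diffs.add(round(e_ising(S) - e_lg(c), 9))

assert len(diffs) == 1  # the difference is the same for every configuration
```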

9.2 One dimensional Ising Model. Transfer matrix solution

It is easy to solve the one dimensional Ising model exactly. Of course, we already know that it cannot exhibit long range order, and hence cannot have a true phase transition. We therefore expect

⟨S_i⟩ = m = 0

for all T > 0.


Figure 9.7: The lattice-gas transition: in the μ versus T plane, the coexistence line separating ⟨c⟩ ≈ 1 from ⟨c⟩ ≈ 0 ends at T_c. The real liquid-gas transition looks the same in the P versus T plane, with the coexistence line ending at (P_c, T_c).

Figure 9.8: Ising ring: spins 1, 2, 3, ..., N − 1, N with periodic boundary conditions.

9.2.1 No magnetic field

Consider a one dimensional chain of spin variables S_i = ±1, i = 1, . . . , N. We also impose so-called periodic boundary conditions, such that S_{N+1} = S_1. This system is shown in Fig. 9.8. The partition function is

Z = ∑_{{S_i}} e^{βJ ∑_{i=1}^{N} S_i S_{i+1}} = ∑_{S_1=±1} ∑_{S_2=±1} · · · ∑_{S_N=±1} e^{βJ ∑_{i=1}^{N} S_i S_{i+1}}.

The exponential of the sum can be written as a product of exponentials:

Z = ∑_{S_1=±1} ∑_{S_2=±1} · · · ∑_{S_N=±1} [e^{βJ S_1 S_2}] [e^{βJ S_2 S_3}] · · · [e^{βJ S_N S_1}].  (9.4)

The transfer matrix method involves representing each term in the product as a matrix element:

T_{S_i S_{i+1}} = e^{βJ S_i S_{i+1}}.  (9.5)

In this representation, the elements of the matrix are labeled by the variable S (instead of just the conventional positive integers):

T_{S_i S_{i+1}} = ( T_{1 1}   T_{1 −1}  )   ( e^{βJ}   e^{−βJ} )
                  ( T_{−1 1}  T_{−1 −1} ) = ( e^{−βJ}  e^{βJ}  ).

With the definition given in Eq. (9.5), the partition function can now be written as

Z = ∑_{S_1=±1} ∑_{S_2=±1} · · · ∑_{S_N=±1} T_{S_1S_2} T_{S_2S_3} · · · T_{S_NS_1}.


Next consider the usual rules of matrix multiplication (A, B and C are matrices):

A = B · C, or explicitly, A_{ik} = ∑_j B_{ij} C_{jk}.  (9.6)

Therefore, in our notation,

∑_{S_3=±1} T_{S_2S_3} T_{S_3S_4}

can be understood as the product of the two transfer matrices in question, exactly as in Eq. (9.6), with the spin variables being the matrix indices. With this in mind, Eq. (9.4) can be written as

Z = ∑_{S_1=±1} (T^N)_{S_1S_1},

since all the matrices T_{S_iS_{i+1}} are identical, after having performed the N matrix products implied by the sums over the N spin variables, and taking into account that S_{N+1} = S_1. All that is left is the final sum over S_1.

Given the definition of the trace of a matrix,

Tr(A) = ∑_i A_{ii},

then

Z = Tr(T^N).

The trace of T^N can be calculated by diagonalization. The matrix T is symmetric and hence can be written in diagonal form with real eigenvalues. Let D be the diagonal matrix; then

D = B^{−1} T B,

where B is the matrix implementing the basis change that diagonalizes T. Given the cyclic property of the trace,

Tr(ABC) = Tr(CAB),

one has that

Tr(D^N) = Tr(B^{−1}TB B^{−1}TB · · ·) = Tr(B^{−1} T^N B) = Tr(B B^{−1} T^N) = Tr(T^N).

All we have to do now is diagonalize T, so that

Z = Tr(D^N) = λ_1^N + λ_2^N,

where λ_1 and λ_2 are the two eigenvalues of T. In order to find the eigenvalues, we compute

| e^{βJ} − λ    e^{−βJ}     |
| e^{−βJ}       e^{βJ} − λ  | = 0.

so that the free energy,

F = −kBT lnZ = −kBT ln[λN1 + λN2

]= −kBT

[

lnλN1 + ln

(

1 +

(λ1

λ2

)N)]

Assume that λ1 > λ2 and that N is very large (thermodynamic limit), then,

F = −kBT lnλN1 = −kBTN ln (2 coshβJ) . (9.7)

This is the exact partition function for the Ising model in one dimension.
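The chain of identities Z = Tr(T^N) = λ_1^N + λ_2^N holds for any finite N, so it can be checked against direct enumeration of all 2^N states. A minimal sketch (not part of the notes; N, J, and T are illustrative):

```python
import math
from itertools import product

# Check Z = λ1^N + λ2^N (with λ1 = 2 cosh βJ, λ2 = 2 sinh βJ) against
# brute-force enumeration for a small periodic chain at H = 0.
N, J, kBT = 6, 1.0, 1.5
beta = 1.0 / kBT
lam1, lam2 = 2.0 * math.cosh(beta * J), 2.0 * math.sinh(beta * J)
Z_tm = lam1**N + lam2**N

Z_bf = 0.0
for S in product((+1, -1), repeat=N):
    E = -J * sum(S[i] * S[(i + 1) % N] for i in range(N))
    Z_bf += math.exp(-beta * E)

assert abs(Z_tm - Z_bf) / Z_bf < 1e-12
```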


9.2.2 One dimensional Ising model in a magnetic field

The partition function in this case is

Z = ∑_{{S_i}} e^{β(J ∑_{i=1}^{N} S_i S_{i+1} + H ∑_{i=1}^{N} S_i)} = ∑_{{S_i}} ∏_i e^{βJ S_i S_{i+1} + (βH/2)(S_i + S_{i+1})}.

The product can be expressed again in terms of a transfer matrix, now given by

T_{S_i S_{i+1}} = ( e^{β(J+H)}   e^{−βJ}    )
                  ( e^{−βJ}      e^{β(J−H)} ).

Diagonalization of this matrix leads to the two eigenvalues

λ_{1,2} = e^{βJ} cosh(βH) ± √( e^{2βJ} sinh²(βH) + e^{−2βJ} ),

with the free energy in this case given by

F = −k_BT N ln λ_1,

which reduces to Eq. (9.7) in the limit H = 0. The average magnetization is given by

⟨S⟩ = (k_BT/N) ∂ln Z/∂H = (k_BT/λ_1) ∂λ_1/∂H = sinh βH / √( sinh²βH + e^{−4βJ} ).
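With a field the identity Z = λ_1^N + λ_2^N still holds exactly for the periodic chain, which gives a direct check of the eigenvalue formula. A minimal sketch (not part of the notes; N, J, H, and T are illustrative):

```python
import math
from itertools import product

# Compare Z = λ1^N + λ2^N, with the eigenvalues quoted above, against
# brute-force enumeration for a small periodic chain in a field H.
N, J, H, kBT = 6, 1.0, 0.3, 1.2
b = 1.0 / kBT
root = math.sqrt(math.exp(2 * b * J) * math.sinh(b * H) ** 2 + math.exp(-2 * b * J))
lam1 = math.exp(b * J) * math.cosh(b * H) + root
lam2 = math.exp(b * J) * math.cosh(b * H) - root
Z_tm = lam1**N + lam2**N

Z_bf = sum(
    math.exp(b * sum(J * S[i] * S[(i + 1) % N] + H * S[i] for i in range(N)))
    for S in product((+1, -1), repeat=N)
)

assert abs(Z_tm - Z_bf) / Z_bf < 1e-12
```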

Figure 9.9: Magnetization of the d = 1 Ising model versus field H: m sweeps from −1 to +1, steeply at low T and gradually at high T.

The exact solution shows that as the field vanishes, H → 0, the order parameter vanishes, ⟨S⟩ = 0, for all temperatures. This model in one dimension does not exhibit a phase transition. Note, however, that something peculiar happens at T = 0. Taking H → 0 first and then T → 0 gives a different answer than taking T → 0 first and then H → 0 (i.e., the limit is singular). In particular, for β → ∞, ⟨S⟩ = 1 for any field H > 0.

In fact the system of spins is perfectly ordered at T = 0 and H = 0. At T = 0 the exact free energy is

F = −k_BT N ln(e^{βJ}) = −NJ,

which corresponds to all spins being parallel to each other (a perfectly ordered state). Therefore it is commonly said that the Ising model in one dimension has a phase transition only at T = 0, and that the system is disordered at any finite temperature.

9.2.3 Pair correlation function. Exact solution

It is also possible to compute exactly the pair correlation function of the Ising model in one dimension. It is useful to examine its behavior with spin separation, especially in the limit T → 0. We define the spin pair correlation function as

g(j) = ⟨S_i S_{i+j}⟩ − ⟨S_i⟩⟨S_{i+j}⟩,


the analog of the pair distribution function h(r) defined for fluids. The calculation is greatly simplified by letting the interaction constant change from site to site, J → J_i, and setting all J_i = J at the end of the calculation. In this case, we have (for H = 0)

⟨S_i S_{i+j}⟩ = (1/Z) ∑_{{S_i}} S_i S_{i+j} e^{β ∑_l J_l S_l S_{l+1}} = (1/Z) ∑_{{S_i}} (S_i S_{i+1})(S_{i+1} S_{i+2}) · · · (S_{i+j−1} S_{i+j}) e^{β ∑_l J_l S_l S_{l+1}},

by taking advantage of the fact that, since S_l = ±1, then S_l² = 1.

Now note that each pair in the product can be written as

(S_i S_{i+1}) e^{β ∑_l J_l S_l S_{l+1}} = ∂/∂(βJ_i) e^{β ∑_l J_l S_l S_{l+1}},

and therefore

⟨S_i S_{i+j}⟩ = (1/Z) (1/β^j) ( ∂^j Z / ∂J_i ∂J_{i+1} · · · ∂J_{i+j−1} ) |_{J_l=J}.

Since the exact partition function for this case, in which J is a function of the lattice site l, is

Z = ∏_{l=1}^{N} (2 cosh βJ_l),

one has

∂Z/∂J_i = ∏_{l≠i} (2 cosh βJ_l) · 2β sinh βJ_i,

and

∂²Z/∂J_i∂J_{i+1} = ∏_{l≠i, l≠i+1} (2 cosh βJ_l) (2β sinh βJ_i)(2β sinh βJ_{i+1}),

and repeating the process for the j derivatives,

⟨S_i S_{i+j}⟩ = (1/∏_l 2 cosh βJ_l) (1/β^j) ∏_{l≠i,...,i+j−1} (2 cosh βJ_l) (2β)^j (sinh βJ)^j,

where in the last term of the right hand side we have already taken into account the fact that at the end of the calculation all J_l = J. Taking this into account in the remaining terms, one has

⟨S_i S_{i+j}⟩ = (1/(2 cosh βJ)^N) (1/β^j) (2 cosh βJ)^{N−j} (2β)^j (sinh βJ)^j = (tanh βJ)^j.

Finally, one defines a correlation length

ξ = −1/ln tanh βJ,

so that the final result for the correlation function is

⟨S_i S_{i+j}⟩ = e^{−j/ξ},

namely, correlations decay exponentially as a function of the separation between the spins, with a characteristic decay length ξ. There are no power law correlations, as there is no phase transition at any finite temperature.
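For a chain with free ends, the result ⟨S_iS_{i+j}⟩ = (tanh βJ)^j is exact at any N, which makes a direct numerical check easy. A minimal sketch (not part of the notes; the chain length, sites, and temperature are illustrative):

```python
import math
from itertools import product

# Brute-force pair correlation of an open N-spin chain at H = 0,
# compared with the exact result (tanh βJ)^j.
N, J, kBT = 8, 1.0, 1.5
b = 1.0 / kBT
i, j = 2, 3  # measure <S_2 S_5>

Z = corr = 0.0
for S in product((+1, -1), repeat=N):
    w = math.exp(b * J * sum(S[k] * S[k + 1] for k in range(N - 1)))
    Z += w
    corr += S[i] * S[i + j] * w
corr /= Z

assert abs(corr - math.tanh(b * J) ** j) < 1e-12
```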

Consider now the magnetic susceptibility, given by

χ = β ∑_j ⟨S_i S_{i+j}⟩,

and given the exponential decay of the correlations, the susceptibility is finite. Of course, this is in agreement with the observation that there is no phase transition at finite temperature in the one dimensional Ising model. However, note that ξ behaves anomalously as one approaches T = 0, also consistent with long ranged order appearing exactly at T = 0. In particular, as β → ∞, tanh βJ → 1, so that the correlation length ξ = −1/ln tanh βJ diverges (as ξ ≈ e^{2βJ}/2) as T = 0 is approached, reflecting the emergence of long ranged correlations and the eventual ordered state at T = 0.
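The low temperature growth of ξ follows from the expansion ln tanh x ≈ −2e^{−2x} for large x, which gives ξ ≈ e^{2βJ}/2. A quick numerical check of this asymptote (not part of the notes):

```python
import math

# ξ = -1/ln(tanh βJ) versus the low-temperature asymptote e^{2βJ}/2.
J = 1.0
for beta in (2.0, 4.0, 6.0):
    xi = -1.0 / math.log(math.tanh(beta * J))
    asymptote = 0.5 * math.exp(2.0 * beta * J)
    # the relative error of the asymptote is O(e^{-4βJ})
    assert abs(xi / asymptote - 1.0) < math.exp(-4.0 * beta * J)
```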


9.3 Mean Field Theory

The one dimensional Ising model can be solved exactly, but its usefulness is limited, as it does not show the existence of a finite temperature phase transition. Short of an exact solution in three dimensions, there is an approximate scheme, mean field theory, that yields quite useful results, although its predictions disagree with experiments in several important respects.

The energy of each spin in the lattice depends on the local field created by its neighbors. The energy of the neighbors depends on the spin in question and on that of their own neighbors, thus leading to a complex coupled system. However, when the ferromagnet is globally magnetized or demagnetized, the local field does not fluctuate very much from spin to spin. The mean field approximation is based on replacing the local field on any given spin by an average determined over the entire system.

We first calculate the partition function of the Ising model in the mean field approximation. The partition function is given by

Z = ∑_{{S_i}} e^{β(J ∑_{⟨ij⟩} S_i S_j + H ∑_i S_i)}.

The difficulty in the evaluation of the partition function resides in the coupling term S_i S_j. The approximation involves replacing S_j by the (yet unknown) average ⟨S⟩:

Z ≈ ∑_{{S_i}} e^{β(J ⟨S⟩ ∑_{⟨ij⟩} S_i + H ∑_i S_i)} = ∑_{{S_i}} e^{β(Jq⟨S⟩ + H) ∑_i S_i}.

The sum in the exponent can be turned into a product of uncoupled exponentials, and the product can then be interchanged with the sum over states:

Z = ∑_{{S_i}} ∏_i e^{β(Jq⟨S⟩+H) S_i} = ∏_i ∑_{S_i=±1} e^{β(Jq⟨S⟩+H) S_i}.

Finally,

Z = [ e^{β(Jq⟨S⟩+H)} + e^{−β(Jq⟨S⟩+H)} ]^N.  (9.8)

This completes our calculation of the partition function. The solution has not yet been found, as Z still depends on the unknown average ⟨S⟩. We obtain the latter quantity self-consistently from the partition function just obtained:

⟨S⟩ = (1/N) ∑_i ⟨S_i⟩ = (k_BT/N) ∂ln Z/∂H.  (9.9)

By combining Eqs. (9.9) and (9.8) one finds

⟨S⟩ = tanh[β(Jq⟨S⟩ + H)],  (9.10)

which is an implicit equation for the average magnetization ⟨S⟩.

Equation (9.10) does lead to a phase transition point, at which the average magnetization ⟨S⟩ abruptly deviates from zero. Let us expand (at H = 0) around ⟨S⟩ = 0 in the vicinity of this point, recalling that tanh x ≃ x − x³/3 + . . . for small x:

⟨S⟩ ≃ βJq⟨S⟩ − (1/3)(βJq)³⟨S⟩³ + . . .

This equation has a solution of zero magnetization, ⟨S⟩ = 0, but also a pair of solutions of nonzero magnetization as long as βJq > 1. Therefore, the mean field approximation leads to a transition temperature

T_c = Jq/k_B.


The other two solutions are

⟨S⟩² = 3(βJq − 1)/(βJq)³.

It is instructive to examine their temperature dependence near T_c. By expanding β = β_c + δβ and noting that δβ = −δT/(k_BT²), we find

⟨S⟩ ∼ ±(−δT)^{1/2} = ±(T_c − T)^{1/2}.
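Both T_c = qJ/k_B and the square-root vanishing of ⟨S⟩ can be checked by iterating Eq. (9.10) numerically at H = 0. A minimal sketch (not part of the notes; the units k_B = J = 1 and the choice q = 4 are illustrative):

```python
import math

# Solve m = tanh(q J m / T) by fixed-point iteration (H = 0, kB = 1),
# so Tc = qJ. Near Tc the expansion above gives m^2 ≈ 3(Tc - T)/Tc.
q, J = 4, 1.0
Tc = q * J

def solve_m(T, m0=0.9):
    m = m0
    for _ in range(10000):
        m = math.tanh(q * J * m / T)
    return m

assert solve_m(1.2 * Tc) < 1e-6          # disordered above Tc
assert 0.9 < solve_m(0.5 * Tc) < 1.0     # ordered below Tc

T = 0.99 * Tc
m = solve_m(T)
assert abs(m**2 / (3 * (Tc - T) / Tc) - 1.0) < 0.05  # exponent 1/2 near Tc
```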

In summary, the mean field approximation not only predicts a critical temperature separating a high temperature phase of zero magnetization from a low temperature phase with two (degenerate) states of nonzero magnetization, but it also predicts the dependence of the magnetization on temperature near T_c. Specifically, the magnetization vanishes as T_c is approached from below as a power law with an exponent of 1/2. In what follows, we characterize this by a magnetization exponent β,

m ∝ (1 − T/T_c)^β,

where the critical exponent in the mean field approximation is β = 1/2.

Figure 9.10: Close to T_c, mean field theory gives m ∝ (1 − T/T_c)^{1/2} (H = 0).

The parabolic dependence of m versus temperature near T_c is shown in Fig. 9.11.

Figure 9.11: Near m = 0, T = T_c, H = 0, the curve m(T) is a parabola on its side; the "classical" exponent β = 1/2 is the same as for a parabola.

The mean field results just derived depend on the dimensionality of the system, but only through the number of neighbors q. We find

k_BT_c = qJ = 2J (d = 1), 4J (d = 2), 6J (d = 3).

The result for d = 1 is clearly incorrect, since in fact T_c = 0 there. In d = 2, Onsager's exact solution yields k_BT_c ≈ 2.27J, so the mean field estimate is rather poor. Note also that Onsager's solution yields β = 1/8 in d = 2, quite different from the mean field result β = 1/2. In three dimensions, numerical results yield k_BT_c ∼ 4.5J and β = 0.313..., closer to the mean field estimates, but still quantitatively incorrect. Strangely enough, numerical studies conducted for lattices in high dimensions (d ≥ 4) do find β = 1/2 (although there are logarithmic corrections in d = 4).


There are a number of other predictions made by the mean field approximation. For example, for T = T_c but H ≠ 0, we have

m = tanh(m + H/k_BT_c)

or

H = k_BT_c (tanh⁻¹ m − m).

Expanding around m = 0, we find

H = k_BT_c (m + m³/3 + ... − m)

or

H = (k_BT_c/3) m³.
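The cubic behavior of the critical isotherm can be confirmed numerically: with k_B = T_c = 1, H(m) = tanh⁻¹(m) − m, and H/(m³/3) → 1 as m → 0. A minimal sketch (not part of the notes; the sample values of m are illustrative):

```python
import math

# On the critical isotherm H = kB*Tc*(atanh m - m); for small m this is
# (kB*Tc/3) m^3, i.e. H ∝ m^δ with δ = 3 (kB = Tc = 1 here).
ratios = []
for m in (0.2, 0.1, 0.05):
    H = math.atanh(m) - m
    ratios.append(H / (m**3 / 3.0))

# the ratio approaches 1 from above as m decreases
assert ratios[0] > ratios[1] > ratios[2] > 1.0
assert abs(ratios[-1] - 1.0) < 0.01
```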

This dependence defines another critical exponent:

H ∝ m^δ at T = T_c,

where δ = 3 in the mean field approximation.

Figure 9.12: Shape of the critical isotherm (T = T_c): a cubic, H ∝ m³, in the classical theory.

A second example concerns the magnetic susceptibility

χ_T = (∂m/∂H)_T.

The susceptibility is the analog of the compressibility in liquid-gas systems,

κ_T = −(1/V)(∂V/∂P)_T.

The magnetic susceptibility measures the change in magnetization brought about by a change in magnetic field. A large χ_T means the material is very "susceptible" to being magnetized. Given the mean field solution

m = tanh[(k_BT_c m + H)/k_BT],

we have

H = k_BT tanh⁻¹ m − k_BT_c m,

so that

(∂H/∂m)_T = k_BT/(1 − m²) − k_BT_c.

Therefore, at m = 0,

χ_T = 1/[k_B(T − T_c)] ∝ (T − T_c)^{−γ}


Figure 9.13: The susceptibility χ_T diverges near T_c.

close to T_c, with γ = 1 in the mean field approximation. The divergence of the susceptibility near T_c is shown in Fig. 9.13.

It is interesting to dwell some more on the origin of the divergence of the susceptibility near the transition point. From its definition,

χ = ∂⟨S⟩/∂H = ∂/∂H [ (1/Z) ∑_{{S_i}} S_0 e^{β(J ∑_{⟨ij⟩} S_i S_j + H ∑_i S_i)} ].

All spins are equivalent, so we have picked one of them (S_0) to represent the average spin in the system. Taking the derivative with respect to H, we find

χ = −(1/Z²)(∂Z/∂H) ∑_{{S_i}} S_0 e^{β(J ∑_{⟨ij⟩} S_iS_j + H ∑_i S_i)} + (1/Z) ∑_{{S_i}} S_0 e^{β(J ∑_{⟨ij⟩} S_iS_j + H ∑_i S_i)} ( β ∑_i S_i ).

Recalling that ⟨S⟩ = (k_BT/N) ∂ln Z/∂H, we find

χ = −(N/k_BT)⟨S⟩² + (β/Z) ∑_{{S_i}} S_0 (∑_i S_i) e^{β(J ∑_{⟨ij⟩} S_iS_j + H ∑_i S_i)} = −(N/k_BT)⟨S⟩² + β ∑_i ⟨S_0 S_i⟩.

Rearranging, we find

χ = β ∑_i ( ⟨S_0S_i⟩ − ⟨S_0⟩⟨S_i⟩ ).  (9.11)

Equation (9.11) is the analog of the sum rule found for liquids,

∫ d r⃗ ⟨Δn(r)Δn(0)⟩ = n² k_BT κ_T.

These are but two examples of equations that relate thermodynamic response functions to integrals of the corresponding correlation functions.

We have just argued that the susceptibility diverges near T_c, yet, given that S_i = ±1, each term in Eq. (9.11) is smaller than one. The origin of the divergence must therefore be in the number of terms that contribute to the sum, or equivalently, in very slowly decaying correlations of S. Long range correlations among the spins near T_c must be responsible for the divergence of χ. In three dimensions, for example, the correlation function must decay as

⟨S_0 S_i⟩ ∼ 1/|r⃗_0 − r⃗_i|³

or slower for the susceptibility to diverge. Certainly, exponentially decaying correlations are inconsistent with a divergent susceptibility.

As a third example we consider the energy and its derivative, the heat capacity:

⟨E⟩ = ⟨ −J ∑_{⟨ij⟩} S_i S_j − H ∑_i S_i ⟩.

In the mean field approximation all the spins are independent, so

⟨S_i S_j⟩ = ⟨S_i⟩⟨S_j⟩.

Hence

⟨E⟩ = −(Jq/2) N m² − H N m.

The specific heat per spin is defined to be

c = (1/N)(∂E/∂T)_N.

Since near T_c, at H = 0,

m² ∝ (T_c − T) for T < T_c, and m² = 0 for T > T_c,

we have

⟨E⟩ = −constant · (T_c − T) for T < T_c, and ⟨E⟩ = 0 for T > T_c,

and hence

c = constant for T < T_c, and c = 0 for T > T_c.

If we had included the full temperature dependence away from T_c, the specific heat would look more like a sawtooth, as drawn in Fig. 9.14.

Figure 9.14: Discontinuity in c in mean field theory.

More generally, the specific heat diverges near the critical temperature, and the divergence is used to define another critical exponent through

c ∝ |T − T_c|^{−α}.

This discontinuity in c, a second derivative of the free energy, is one reason continuous phase transitions were called "second order". Now it is known that the discontinuity is an artifact of mean field theory.

To summarize, near the critical point we have

m ∼ (T_c − T)^β, T < T_c, H = 0
H ∼ m^δ, T = T_c
χ_T ∼ |T − T_c|^{−γ}, H = 0
c ∼ |T − T_c|^{−α}, H = 0

We summarize in the following table the predictions of mean field theory, the results of Onsager's exact solution in d = 2, and approximate numerical results obtained in d = 3.

      mean field   d = 2 (exact)   d = 3 (num)   d ≥ 4
  α   0 (disc.)    0 (log)         0.125...      0 (disc.)
  β   1/2          1/8             0.313...      1/2
  γ   1            7/4             1.25...       1
  δ   3            15              5.0...        3


Figure 9.15: Variation of the critical exponent β with spatial dimension d: no transition in d = 1, non-trivial exponents in d = 2 and 3, and mean field theory works (β = 1/2) for d ≥ 4.

We will address later why mean field theory works for d ≥ 4. Figure 9.15 sketches, for example, the dependence of the critical exponent β on a continuous dimension variable d. For now we leave the Ising model behind.


Chapter 10

Continuum theories of the Ising model

Nowadays, it is well established that, near the critical point, one can either solve the Ising model in its original discrete form,

E = −J ∑_{⟨ij⟩} S_i S_j − H ∑_i S_i

with

Z = ∑_{{S_i}} e^{−E/k_BT},

or the so-called ψ⁴ field theory,

F = ∫ d^d x [ (K/2)(∇ψ)² + (r/2)ψ² + (u/4)ψ⁴ ] − H ∫ d^d x ψ(x⃗),

where K, u, and H are constants, and r ∝ T − T_c, with the partition function

Z = ∫ Dψ e^{−βF}.

The continuum approximation here is in the same spirit as the one introduced when we used a Gaussian, continuum model for interface fluctuations instead of a discrete version based on interface steps. Although it may not appear that way at first sight, the field theory description of the Ising model has proved to be a very useful simplification near the critical point, yielding the same quantitative behavior as the original, discrete model.

It is possible to obtain one from the other in an approximate way, as we now show. First, define ψ(r⃗) to be the spin density over a small volume V:

(1/N) ∑_{i∈V} S_i ≈ (1/V) ∫_V d r⃗ ψ(r⃗).

With this in mind, we write

H ∑_i S_i = (HN/V) ∫ d^d x⃗ ψ(x⃗).

Next, note that (S_i − S_j)² = 2(1 − S_i S_j), since S_i² = S_j² = 1. Therefore

−J ∑_{⟨i,j⟩} S_i S_j = −(J/2) ∑_{i,j} S_i S_j = (J/2) ∑_{i,j} [ (1/2)(S_i − S_j)² − 1 ] = (J/4) ∑_{i,j} (S_i − S_j)² − JNq/2,

where ∑_{i,j} runs over ordered nearest-neighbor pairs (counting each bond twice), and the last term is an additive constant.


We next approximate

(S_i − S_j)²/Δx² ≈ (∂ψ(x)/∂x)²,

where Δx is the lattice spacing. This is clear in one dimension, but it also holds in more than one dimension for the nearest-neighbor Ising model. We then find

−J ∑_{⟨i,j⟩} S_i S_j ≈ (JΔx²N/2V) ∫ d^d x⃗ (∇ψ)² − JNq/2.

We neglect the second term on the right hand side, as it is only an additive constant, and define the constant K = JΔx²N/V. Therefore, in the continuum limit, the energy of any spin configuration is given by

E = (K/2) ∫ d^d x⃗ (∇ψ)² − H ∫ d^d x⃗ ψ(x⃗).

There is, however, an additional complication in the calculation of the partition function. In the original model, the discrete spins are constrained to be ±1, whereas ψ is a continuum variable. The standard procedure is to allow arbitrary values of ψ in the configurations, but to penalize, via an increased energy, those configurations that locally deviate from ψ = ±1. Schematically,

∑_{{S_i=±1}} = ∏_i ∑_{S_i=−∞}^{∞} δ_{S_i², 1} ∝ ∏_{x⃗} ∫ dψ(x⃗) δ(ψ²(x⃗) − 1),

where we have extended the sum but introduced a Kronecker delta, followed by the continuum approximation. In order to incorporate the constraint given by the delta function into the energy, we introduce an exponential representation of the delta function,

δ(x) = lim_{ε→0} (1/(ε√π)) e^{−x²/ε²},

leading to

∏_{x⃗} ∫ dψ(x⃗) (1/(ε√π)) e^{−(1/ε²)(ψ²−1)²}.

Now, by expanding the square in the exponent, redefining the constants appropriately, and removing constant terms, we arrive at our field theory model:

E = ∫ d^d x⃗ [ (K/2)(∇ψ)² + f(ψ) ],  f(ψ) = (r/2)ψ² + (u/4)ψ⁴ − Hψ.  (10.1)

This is the so-called ψ⁴ field theory. The critical point of this model is at r = 0, with r > 0 for T > T_c and r < 0 for T < T_c. With this choice, the function f(ψ) is shown in Fig. 10.1.

Figure 10.1: Uniform part of the energy E in the absence of magnetic field: a single well for r > 0 (T > T_c) and a double well for r < 0 (T < T_c).

For r > 0, there is a single minimum of the energy, corresponding to an average value ψ = 0. Below T_c there are two minima in the energy that correspond to states of finite magnetization. It is customary to


assume r ∝ (T − T_c) and u > 0, so that the minima sit at ψ = ±√(−r/u), i.e., ψ ∼ √(T_c − T) in the absence of fluctuations, as given by mean field theory. We will elaborate on these issues below when we describe Landau's theory.
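A quick check of the minima of the uniform part f(ψ) at H = 0: f′(ψ) = rψ + uψ³ vanishes at ψ = ±√(−r/u) for r < 0, and those minima lie below f(0). A minimal sketch (not part of the notes; the values of r and u are illustrative):

```python
import math

# Minima of f(ψ) = (r/2)ψ² + (u/4)ψ⁴ at H = 0: f'(ψ) = rψ + uψ³ = 0.
# For r < 0 there are two degenerate minima at ψ = ±sqrt(-r/u).
r, u = -2.0, 1.0
psi_star = math.sqrt(-r / u)

f = lambda p: 0.5 * r * p * p + 0.25 * u * p**4
fp = lambda p: r * p + u * p**3

assert abs(fp(psi_star)) < 1e-12   # stationary point
assert f(psi_star) < f(0.0)        # broken-symmetry states lie below ψ = 0
```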

The ultimate goal of the theory is the determination of the critical exponents α, β, γ, δ, η, and ν, and to understand the observed experimental deviations from mean field theory. For reference, we list here the definitions of the various exponents:

F(T) ∼ |T − T_c|^{2−α}, H = 0, so that C = ∂²F/∂T² ∼ |T − T_c|^{−α}
ψ ∼ (T_c − T)^β, H = 0, T < T_c
χ_T = (∂ψ/∂H)_T ∼ |T − T_c|^{−γ}, H = 0
ψ ∼ H^{1/δ}, T = T_c
⟨ψ(x⃗)ψ(0)⟩ ∼ 1/x^{d−2+η}, T = T_c, H = 0
⟨ψ(x⃗)ψ(0)⟩ ∼ e^{−x/ξ} for T > T_c, and ⟨ψ(x⃗)ψ(0)⟩ ∼ ⟨ψ⟩² + O(e^{−x/ξ}) for T < T_c

and

ξ ∼ |T − T_c|^{−ν}, H = 0.

This is the first time we have mentioned η and ν in this context.

10.1 Landau theory

Landau introduced the ψ^4 theory in a different way. He asserted that the continuous phase transition at the critical point involves going from a disordered phase with high symmetry to an ordered phase with lower symmetry. To describe the transition and its symmetry reduction he introduced an order parameter, nonzero in the ordered phase only. For example, the order parameter in a ferromagnet is the magnetization. In general, identifying the order parameter and the symmetry broken at Tc can be quite subtle. For the time being we will restrict ourselves to a scalar order parameter, where the symmetry broken is Z_2, that is, where the ψ ↔ −ψ symmetry is spontaneously broken. A slightly more complicated example would involve a magnetic dipole or spin confined to a plane (dimension d = 2). In this case all spin orientations would be equivalent, so that the order parameter is now a vector in the plane, ψ = ψ_x x̂ + ψ_y ŷ. The group corresponding to this rotation symmetry is O(2). A spin in d = 3 would have ψ = ψ_x x̂ + ψ_y ŷ + ψ_z ẑ, and if all directions are equivalent the symmetry group is O(3).

Landau argued that near the point at which the symmetry is spontaneously broken, the free energy must be a function of the order parameter ψ. Hence

F = ∫ d^d x f(ψ),

and he further assumed that near the transition (where ψ first deviates from zero) the free energy must have an expansion in powers of ψ,

f(ψ) = (r/2)ψ^2 + (u/4)ψ^4 + ...

In this case, the expansion has been chosen to preserve the symmetry ψ ↔ −ψ, so that no odd terms in ψ appear. More complicated order parameters and symmetries would lead to different expansions of f. In the event that an external magnetic field breaks the symmetry, another term is added to the free energy,

−H ∫ d^d x ψ.

Since at high temperatures, in the disordered phase, ψ = 0, one chooses r > 0 for T > Tc. However, at low temperatures we have an ordered phase with ψ ≠ 0, and hence r < 0. To make


sure that the free energy cannot be reduced by further increasing ψ, he set u > 0. There is no need for u to have any special dependence on T near Tc, so it is taken to be a constant. It is customary to expand r in a Taylor series,

r = r0(T − Tc) + ...

and keep only the lowest order term, with r0 > 0.

Finally, spatial fluctuations in the order parameter cost energy and are therefore suppressed. A way to incorporate this into F is to assume that it also depends on gradients of ψ. This dependence can in turn be expanded in powers of ∇ and ψ, so that the lowest-order nontrivial term is

(K/2) ∫ d^d x |∇ψ|^2,

with K > 0. Note that this term raises the free energy of inhomogeneous configurations (∇ψ ≠ 0). In summary, the complete Landau free energy is

F = ∫ d^d x [ f(ψ) + (K/2)|∇ψ|^2 ] − H ∫ d^d x ψ,   f(ψ) = (r/2)ψ^2 + (u/4)ψ^4.   (10.2)

In high energy physics, this form of the energy is called a ψ^4 field theory. The ∇^2 term corresponds to the kinetic energy of a particle; r is proportional to the square of its mass; and ψ^4 describes interactions. van der Waals also introduced the same free energy (the case with K ≡ 0 is the one normally given in textbooks). He was thinking of the liquid-gas transition, so that ψ = n − nc, where n is the density and nc is the density at the critical point.

It is worth emphasizing that the whole machinery of quadratic, quartic, and square-gradient terms in the free energy is necessary. For example, note that the critical point is defined in thermodynamics as

(∂P/∂V)_{Tc} = 0,   (∂^2P/∂V^2)_{Tc} = 0

for a liquid-gas system. In a ferromagnet, one similarly has

(∂H/∂m)_{Tc} = 0,   (∂^2H/∂m^2)_{Tc} = 0.

One can write a series expansion for the equation of state at T = Tc in a liquid as follows,

p(n, Tc) = pc + (∂p/∂n)_{Tc}(n − nc) + (1/2)(∂^2p/∂n^2)_{Tc}(n − nc)^2 + (1/6)(∂^3p/∂n^3)_{Tc}(n − nc)^3 + ...

and similarly for a ferromagnet

H(m, Tc) = (∂H/∂m)_{Tc} m + (1/2)(∂^2H/∂m^2)_{Tc} m^2 + (1/6)(∂^3H/∂m^3)_{Tc} m^3 + ...,

where we have used Hc = 0 and mc = 0 at the ferromagnetic critical point.

If the dependence of p or H on the order parameter is analytic at the critical point, then, since the first and second derivatives vanish there, for both systems one must have

(p − pc) ∝ (n − nc)^δ,   H ∝ m^δ,

with δ = 3. Experimentally, however, δ ≈ 5.0 in three dimensions, while δ = 15 in d = 2. This simple argument indicates that there is quite a subtle problem to solve.

We return to classical Landau theory by considering first homogeneous states of the system (in which the gradients of the order parameter vanish). In this case, we need only consider

F/V = f(ψ) − Hψ = (r0/2)(T − Tc)ψ^2 + (u/4)ψ^4 − Hψ.   (10.3)

The equilibrium state can be found by direct minimization of the free energy with respect to the order parameter. For H = 0 we find

∂f/∂ψ = 0 = r0(T − Tc)ψ + uψ^3.


This cubic equation has three solutions,

ψ = 0,   ψ = ±√(r0(Tc − T)/u).   (10.4)

These three solutions are minima in some range of parameters, and maxima in others. This can be checked by computing the sign of ∂^2f/∂ψ^2. It is found that ψ = 0 is a minimum of the free energy for T > Tc, and ψ = ±√(r0(Tc − T)/u) are minima for T < Tc. Therefore this phenomenological expansion of F does provide a phase transition at T = Tc between a high temperature phase in which the order parameter vanishes, and a low temperature phase of nonzero order parameter. We also note that Eq. (10.4) implies β = 1/2, the same value predicted by mean field theory.

In the case H ≠ 0, there is only one solution,

∂(f(ψ) − Hψ)/∂ψ = 0 = ∂f/∂ψ − H,

or

H = r0(T − Tc)ψ + uψ^3.

Note that this gives

1/χ_T = (∂H/∂ψ)_T = r0(T − Tc)

close to the critical point, or

χ_T ∝ |T − Tc|^{−1}.

Hence, γ = 1 in Landau theory, also the same value given by the mean field approximation. At T = Tc we have H ∝ ψ^3, or δ = 3.

Finally, since for H = 0

f = (1/2)rψ^2 + (1/4)uψ^4

and

ψ^2 = 0, T > Tc;   ψ^2 = −r/u, T < Tc,

we have

f = 0, T > Tc;   f = −(1/4)(r^2/u) ∝ (Tc − T)^2, T < Tc.

Since the specific heat c ∝ ∂^2f/∂T^2, we find

c = 0, T > Tc;   c = constant, T < Tc.

The fact that there is no divergence implies α = 0.

In summary, the values of the critical exponents given by Landau's theory are the same as those given by mean field theory:

α = 0 (discontinuity),   β = 1/2,   γ = 1,   δ = 3.
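The β = 1/2 result above is easy to check with a direct numerical minimization of the homogeneous free energy (10.3). The following sketch is not from the notes; r0 = u = Tc = 1 are arbitrary test values.

```python
# Minimize f(psi) = (r0/2)(T - Tc) psi^2 + (u/4) psi^4 on a grid of psi values
# and compare the location of the minimum with the Landau prediction
# psi = sqrt(r0 (Tc - T)/u) below Tc; r0, u, Tc are arbitrary test values.
r0, u, Tc = 1.0, 1.0, 1.0

def psi_min(T, n=20001, span=2.0):
    """|psi| at the global minimum of f over a uniform grid in [-span, span]."""
    grid = [span * (2 * k / (n - 1) - 1) for k in range(n)]
    f = lambda p: 0.5 * r0 * (T - Tc) * p * p + 0.25 * u * p ** 4
    return abs(min(grid, key=f))

for T in (0.96, 0.99, 1.02):
    pred = (r0 * (Tc - T) / u) ** 0.5 if T < Tc else 0.0
    print(f"T = {T}: grid minimum {psi_min(T):.3f}, Landau prediction {pred:.3f}")
```

Below Tc the minimizer follows √(Tc − T), i.e. β = 1/2; above Tc it stays at ψ = 0.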


10.2 Ornstein-Zernicke theory of correlations

We next extend Landau's theory by allowing spatially dependent fluctuations. We follow the treatment by Ornstein and Zernicke. It works well except near the critical point. The essential point is that in a bulk phase for, say, T ≫ Tc and H = 0, ψ has an average value of zero, ⟨ψ⟩ = 0, so that we expect ψ^4 ≪ ψ^2. Therefore above Tc we approximate the free energy by

F ≈ ∫ d^d x [ (K/2)(∇ψ)^2 + (r/2)ψ^2 ].

Far below Tc, we expect ψ to be close to its average value

⟨ψ⟩ = ±√(−r/u),

where r = r0(T − Tc) < 0 below Tc. Let ∆ψ = ψ − 〈ψ〉. Then

∇ψ = ∇(∆ψ + ⟨ψ⟩) = ∇(∆ψ).

Also,

ψ^2 = (∆ψ + ⟨ψ⟩)^2 = (∆ψ)^2 + 2⟨ψ⟩∆ψ + ⟨ψ⟩^2,

and

ψ^4 = (∆ψ + ⟨ψ⟩)^4 = (∆ψ)^4 + 4⟨ψ⟩(∆ψ)^3 + 6⟨ψ⟩^2(∆ψ)^2 + 4⟨ψ⟩^3(∆ψ) + ⟨ψ⟩^4.

Grouping terms in powers of ∆ψ, we find

(r/2)ψ^2 + (u/4)ψ^4 = (r/2)⟨ψ⟩^2 + (u/4)⟨ψ⟩^4 + (r⟨ψ⟩ + u⟨ψ⟩^3)∆ψ + (r/2 + (3u/2)⟨ψ⟩^2)(∆ψ)^2 + O((∆ψ)^3).

But ⟨ψ⟩ = ±√(−r/u), so the linear term vanishes and

(r/2)ψ^2 + (u/4)ψ^4 = (r/2)⟨ψ⟩^2 + (u/4)⟨ψ⟩^4 − r(∆ψ)^2 + O((∆ψ)^3).

The first two terms on the right-hand side are a constant independent of the order parameter, and we omit them. Also (∆ψ)^3 ≪ (∆ψ)^2, hence we approximate

F ≈ ∫ d^d x [ (K/2)(∇∆ψ)^2 + |r|(∆ψ)^2 ],

where we have used the fact that r < 0 below Tc.

Both approximate free energies (above and below Tc) are formally the same as the energy for the fluctuating interface,

(σ/2) ∫ d^{d−1} x (∂h/∂x)^2.

Averages and correlation functions can be calculated in exactly the same way as we did in that case. The definitions of the Fourier transforms are the same (except that now d^{d−1}x → d^d x and L^{d−1} → L^d, for example), and the arguments concerning translational invariance and isotropy of space still hold. Let

C(x) = ⟨ψ(x)ψ(0)⟩

be the spin-spin correlation function; then

C(k) = kB T/(K k^2 + r),   T > Tc,


and since

⟨∆ψ(x)∆ψ(0)⟩ = ⟨ψ(x)ψ(0)⟩ − |r|/u,

by using ⟨ψ⟩^2 = |r|/u, we have

C(k) = kB T/(K k^2 + 2|r|) + (|r|/u)(2π)^d δ(k),   T < Tc.

Inverse Fourier transformation will give the correlation function in real space. Recall that the Green's function of the Helmholtz equation is defined as

(∇^2 − ξ^{−2})G(x) = −δ(x),

where ξ > 0 is a constant. The Fourier transform of this equation is

G(k) = 1/(k^2 + ξ^{−2}),

exactly the same form as we have obtained from the Ornstein-Zernicke theory. The Green's function of the Helmholtz equation is

G(x) = (ξ/2) e^{−x/ξ}, d = 1;   G(x) = (1/2π) K_0(x/ξ), d = 2;   G(x) = (1/4πx) e^{−x/ξ}, d = 3.

K_0(y) is the zeroth-order modified Bessel function; it satisfies K_0(y → 0) ∼ −ln y and K_0(y → ∞) ∼ (π/2y)^{1/2} e^{−y}. This completes our calculation of the correlation function as a function of the dimensionality of the system.
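The d = 3 entry can be checked directly: for x > 0, G(x) = e^{−x/ξ}/(4πx) satisfies (∇^2 − ξ^{−2})G = 0, using the radial Laplacian ∇^2 G = (1/x) d^2(xG)/dx^2 for spherically symmetric G. A quick finite-difference sketch (not from the notes; ξ = 2 is an arbitrary test value):

```python
import math

xi = 2.0  # arbitrary test value of the correlation length
G = lambda x: math.exp(-x / xi) / (4 * math.pi * x)  # d = 3 Green's function

def residual(x, h=1e-3):
    """(laplacian - xi^-2) applied to G at radius x > 0, via the radial Laplacian."""
    xG = lambda s: s * G(s)
    lap = (xG(x + h) - 2 * xG(x) + xG(x - h)) / (h * h) / x
    return lap - G(x) / xi ** 2

for x in (0.5, 1.0, 3.0):
    print(f"x = {x}: residual = {residual(x):.2e}")
```

The residuals are at the level of finite-difference error, consistent with G solving the homogeneous equation away from the origin.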

In discussions of critical phenomena, these three relations are consolidated into

G(x) ∝ (1/x^{(d−1)/2}) (1/ξ^{(d−3)/2}) e^{−x/ξ}.

Hence

G(x) ∼ e^{−x/ξ},

unless ξ = ∞, in which case

G(x) ∼ 1/x^{d−2}

for large x. In summary, up to a constant of proportionality, we have

C(x) = e^{−x/ξ}, T > Tc;   C(x) = 1/x^{d−2}, T = Tc;   C(x) = ⟨ψ⟩^2 + O(e^{−x/ξ}), T < Tc,

where the correlation length ξ is

ξ = √(K/r), T > Tc;   ξ = √(K/(2|r|)), T < Tc.

Note that the result for T < Tc,

C(x) = ⟨ψ(x)ψ(0)⟩ = ⟨ψ⟩^2 + O(e^{−x/ξ}),

is a rewriting of

⟨∆ψ(x)∆ψ(0)⟩ ∼ e^{−x/ξ}.

As sketched in Fig. 10.2, the correlation length diverges as ξ ∼ 1/√|r| above and below Tc, so

ξ ∼ |T − Tc|^{−1/2}.


Figure 10.2: Divergence of the correlation length near Tc: ξ ∼ |T − Tc|^{−1/2}, or ξ ∼ 1/√|r|.

Figure 10.3: Correlation function C(x) above, at, and below Tc in Ornstein-Zernicke theory. The correlation function becomes a power law at Tc:

C(x) = 1/x^{d−2},   T = Tc, H = 0.

Since in general we have defined the exponents

ξ ∼ |T − Tc|^{−ν}

and

C(x) ∼ 1/x^{d−2+η},

we have that Ornstein-Zernicke theory predicts ν = 1/2 and η = 0.

In summary:

exponent   Mean field   Landau / O-Z   d = 2 Ising   d = 3 Ising   d ≥ 4 Ising
α          0 (disc.)    0 (disc.)      0 (log)       0.125...      0
β          1/2          1/2            1/8           0.313...      1/2
γ          1            1              7/4           1.25...       1
δ          3            3              15            5.0...        3
η          0            0              1/4           0.041...      0
ν          1/2          1/2            1             0.638...      1/2

Therefore the continuum formulations of the Ising model given so far make exactly the same predictions as the mean field approximation, and hence disagree in fundamental ways with experiments and approximate numerical solutions, as well as with Onsager's exact solution in two dimensions.

10.3 Ginzburg criterion

In the Ornstein-Zernicke theory we have explicitly used the fact that fluctuations are small, (∆ψ)^3 ≪ (∆ψ)^2, so that the free energy considered is quadratic in the order parameter. We show, however, that this assumption is inconsistent with the values of the critical exponents given by the theory itself, and hence the results obtained can be considered suspect. In order to quantify the magnitude of the fluctuations, we compute

⟨∆ψ^2⟩/⟨ψ⟩^2   (10.5)


in the vicinity of Tc (r = 0). We also define

t = |T − Tc|/Tc

and examine the behavior of the ratio (10.5) as t → 0.

By definition we have ⟨ψ⟩ ∼ t^β as t → 0. On the other hand, the magnetic susceptibility

χ_T = ∫ d^d x ⟨∆ψ(x)∆ψ(0)⟩ ∼ t^{−γ},

by definition of γ. We estimate the order of magnitude of the integral as ⟨∆ψ^2⟩ξ^d, and therefore, using the definition of the correlation length exponent, ξ ∼ t^{−ν}, we find ⟨∆ψ^2⟩t^{−νd} ∼ t^{−γ}. Therefore,

⟨∆ψ^2⟩/⟨ψ⟩^2 ∼ t^{−γ+νd−2β}.

This relation is known as the Ginzburg criterion. Note that if we substitute the mean field (or Landau) values of the critical exponents, we find

⟨∆ψ^2⟩/⟨ψ⟩^2 ∼ t^{(d−4)/2}.

If d > 4, the approximation of neglecting higher order terms in the expansion in the fluctuations is justified, as the fluctuations become negligible relative to the average as the critical point is approached. For d < 4, on the other hand, the relative variance of the fluctuations diverges as the critical point is approached. The quadratic approximation for the free energy used in this chapter is therefore not valid.
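The role of d = 4 as the marginal dimension can be made explicit by evaluating the Ginzburg exponent νd − γ − 2β with the mean field values (a small sketch, not from the notes):

```python
# Ginzburg criterion with mean-field exponents: <dpsi^2>/<psi>^2 ~ t^((d-4)/2).
beta, gamma, nu = 0.5, 1.0, 0.5  # Landau / mean field values

def ginzburg_exponent(d):
    """Exponent of t in <dpsi^2>/<psi>^2 ~ t^(nu*d - gamma - 2*beta)."""
    return nu * d - gamma - 2 * beta

for d in (2, 3, 4, 5):
    e = ginzburg_exponent(d)
    verdict = ("fluctuations diverge" if e < 0
               else "fluctuations negligible" if e > 0 else "marginal")
    print(f"d = {d}: exponent = {e:+.1f} -> {verdict}")
```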


Chapter 11

Scaling Theory

Historically, the field of phase transitions was in a state of turmoil in the 1960's. Sensible, down-to-earth theories like the mean field approach, Landau's approach, and Ornstein-Zernicke theory were not working. A lot of work was put into refining these approaches, with limited success.

For example, the Bragg-Williams mean field theory given earlier is quite crude. Refinements can be made by incorporating more local structure. Pathria's book on Statistical Mechanics reviews some of this work. Consider for example the specific heat of the two-dimensional Ising model. Since Onsager's solution, this has been known exactly: it obeys C ∼ ln |T − Tc| near Tc, whereas our mean field calculations predicted a discontinuity at the critical temperature. Further refinements of the mean field approach have led to much better predictions of C away from, but not close to, Tc. At Tc, all mean field theories, no matter how refined, give a discontinuous C.

Figure 11.1: Specific heat C near Tc for increasingly refined mean field theories. All mean field theories give a discontinuous C at Tc.

Experiments also show a divergence, as given in Fig. 11.2, taken from Stanley's book, "Introduction to Phase Transitions and Critical Phenomena". Experiments consistently did not (and do not, of course) agree with Landau's prediction for the liquid-gas or binary liquid critical points. As discussed in the previous chapter, Landau's theory gives

n − nc ∼ (Tc − T)^β

with β = 1/2. Experiments in three dimensions give β ≈ 0.3 for many liquids (usually quoted as 1/3 historically). Figure 11.3 shows this result. The figure is taken from the most comprehensive experimental survey, given in Guggenheim, J. Chem. Phys. 13, 253 (1945), and reproduced in Stanley's book.

If one examines typical configurations of a ferromagnet near Tc (where ξ → ∞), the result is striking. Figure 11.4 shows some quite old simulations of the Ising model in two dimensions, also taken from Stanley's book. Spins up are shown in white in the figure, and spins down in black. As Tc is approached (frames (e) and (f) in the figure), domains of either phase appear to adopt a fractal structure. The structure of the configurations is similar for a fluid near its critical point. Imagine we heat it through Tc while fixing the pressure so that it moves on the coexistence line


Figure 11.2: Experimental specific heat data showing a divergence near Tc (from Stanley's book).

Figure 11.3: Guggenheim's survey of liquid coexistence curves, giving β ≈ 1/3 (reproduced in Stanley's book).


Figure 11.4: Configurations of the two-dimensional Ising model as Tc is approached (from Stanley's book).


Figure 11.5: Liquid-vapor phase diagram (P vs. T); the system is heated along the liquid-vapor coexistence line.

of the phase diagram (Fig. 11.5). The appearance of the configurations will be quite different at different temperatures. The system is transparent, except for the interface and meniscus separating

Figure 11.6: Appearance of a fluid held at the liquid-vapor coexistence pressure: transparent for T < Tc and T > Tc, but an opaque, milky-looking mess at T = Tc.

liquid and vapor, for T < Tc and T > Tc (Fig. 11.6). However, as we approach Tc, it becomes turbid or milky looking. This is called "critical opalescence". The correlation length ξ is so large that density fluctuations become of the order of the wavelength of light (or larger) and hence scatter visible light. This is not unlike the size of spin fluctuations in Fig. 11.4.

Given the difficulties in reconciling experimental observations near the critical point of many systems with theoretical treatments based on mean field approaches or Landau's theory, a significant amount of effort was devoted to proving rigorous results that would follow from the laws of thermodynamics alone. Unfortunately, thermodynamic arguments rely on system stability at the critical point (entropy is maximized, free energy is minimized, etc.), and hence only inequalities can be obtained. These inequalities follow from the curvature of the thermodynamic potential at the critical point, and the fact that any fluctuation in the system has to obey them. Such studies led to rigorous inequalities among the six critical exponents. (One should mention the assumption that any divergence is the same for T → Tc^+ or T → Tc^−. This is nontrivial, but turns out to be true.) For example, Rushbrooke proved

α + 2β + γ ≥ 2.

Griffiths proved

α + β(1 + δ) ≥ 2.

These are rigorous results. Other inequalities were proved with some plausible assumptions. It turns out that all the inequalities are satisfied by all the exponents quoted previously (be they theoretically derived or experimentally observed), but as equalities: all the thermodynamic inequalities derived turned out to be obeyed as equalities.

We summarize all the independent relations among the exponents:

α + 2β + γ = 2,
γ = ν(2 − η),
γ = β(δ − 1),


and

νd = 2 − α.

Given the number of exponents involved and the relationships given, it turns out that for the Ising model there are only two independent exponents. The last equation is called the "hyperscaling" relation, as it is the only one that explicitly involves the spatial dimension d. The others are simply called "scaling" relations.
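As a quick numerical sanity check (not in the notes), the exactly known d = 2 Ising exponents from the table of the previous chapter satisfy all four relations as equalities:

```python
# Exact d = 2 Ising exponents (Onsager): check the scaling and hyperscaling laws.
alpha, beta, gamma, delta, eta, nu, d = 0.0, 1 / 8, 7 / 4, 15, 1 / 4, 1.0, 2

assert abs(alpha + 2 * beta + gamma - 2) < 1e-12  # Rushbrooke
assert abs(gamma - nu * (2 - eta)) < 1e-12        # Fisher
assert abs(gamma - beta * (delta - 1)) < 1e-12    # Widom
assert abs(nu * d - (2 - alpha)) < 1e-12          # hyperscaling
print("All four relations hold for the d = 2 Ising exponents.")
```

The mean field values with d = 4 pass the same checks, consistent with d = 4 being the marginal dimension found from the Ginzburg criterion.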

11.1 Scaling With ξ

A breakthrough in the theory of critical phenomena occurred when Widom phenomenologically introduced the idea of scaling, which led to the derivation of the equalities in the thermodynamic inequalities given above. While his ideas also made possible the interpretation of experimental evidence, his scaling assumption remains an assumption, and it did not predict the values of the exponents, only relationships among them. In essence, he assumed that all thermodynamic functions have a scale invariant contribution as T → Tc, and that this scale invariant contribution becomes a generalized homogeneous function of its arguments. Consider, for example, the equilibrium correlation function C = C(x), which in the case of the Ising model is a function of T and H as well (the two other intensive parameters on which the partition function depends). Instead of using T as the independent variable, let us introduce ξ instead; near Tc they are related through ξ ∼ |T − Tc|^{−ν}. The assumption that C is a generalized homogeneous function of its arguments is written as

C(x, ξ, H) = (1/ξ^{d−2+η}) C̃(x/ξ, Hξ^y).   (11.1)

Note that physically this equation embodies the idea of scale invariance: correlations do not depend on x alone, but on the ratio x/ξ instead. Similarly, since blocks of spins are correlated over a distance ξ, only a little H is needed to significantly magnetize the system: the effect of H is magnified by ξ^y.

We can also introduce the temperature explicitly into the scaling relation (11.1). Since ξ ∼ |T − Tc|^{−ν}, which we write as ξ ∼ t^{−ν} by introducing t = |T − Tc|/Tc, we have

C(x, T, H) = (1/x^{d−2+η}) C̃(x t^ν, H/t^∆),   (11.2)

with

∆ = yν

being called the "gap exponent".

11.1.1 Derivation of exponent equalities

We now show how the scaling relation (11.2) allows one to derive equalities among critical exponents. First we start with the magnetic susceptibility sum rule,

χ_T ∝ ∫ d^d x C(x, T, H)
    = ∫ d^d x (1/x^{d−2+η}) C̃(x t^ν, H/t^∆)
    = (t^ν)^{−(2−η)} ∫ d^d(x t^ν) C̃(x t^ν, H/t^∆)/(x t^ν)^{d−2+η},

so that

χ_T(T, H) = t^{−(2−η)ν} χ̃(H/t^∆),


where we have defined

∫ d^d(x t^ν) (1/(x t^ν)^{d−2+η}) C̃(x t^ν, H/t^∆) = χ̃(H/t^∆).

But by definition of the exponent γ,

χ_T(T, H = 0) ∼ t^{−γ},

hence

γ = (2 − η)ν.

This exponent equality is called Fisher's law.

Another equality can be derived by using the definition of the magnetic susceptibility,

∂⟨ψ(T, H)⟩/∂H = χ_T(T, H),

a relation that can be integrated to give

⟨ψ(T, H)⟩ = ∫_0^H dH′ χ_T(T, H′)
          = ∫_0^H dH′ t^{−γ} χ̃(H′/t^∆)
          = t^{∆−γ} ∫_0^H d(H′/t^∆) χ̃(H′/t^∆)
          = t^{∆−γ} ψ̃(H/t^∆).

But by definition of the exponent β we have

⟨ψ(T, H = 0)⟩ = t^β.

Therefore

β = ∆ − γ,

and the gap exponent is

∆ = β + γ,

or equivalently, y = (β + γ)/ν. So we have

⟨ψ(T, H)⟩ = t^β ψ̃(H/t^{β+γ}).   (11.3)

We mention next a useful transformation of the scaling relations that allows one to change the scaling variables, or scaling fields as they are sometimes called. We illustrate it with a manipulation of Eq. (11.3). We clearly do not know the functional form of ψ̃; all we know is that it is not a function of the two arguments separately, but rather a function of the combination H/t^{β+γ}. Therefore, if we multiply or divide ψ̃ by any function of H/t^{β+γ}, we obtain another unknown function, which is still a function of the same argument H/t^{β+γ}. In particular, a crafty choice can cancel the t^β factor as follows:

⟨ψ(T, H)⟩ = t^β (H/t^{β+γ})^{β/(β+γ)} (H/t^{β+γ})^{−β/(β+γ)} ψ̃(H/t^{β+γ}).

We combine the last two factors on the right-hand side into a new function ψ̂ and write

⟨ψ(T, H)⟩ = H^{β/(β+γ)} ψ̂(H/t^{β+γ}),


with H^{−β/(β+γ)} ψ̃(H) = ψ̂(H).

This new form of the scaling relation allows us to determine yet another exponent equality. Recall that, by definition of δ,

⟨ψ(Tc, H)⟩ ∼ H^{1/δ}

gives the shape of the critical isotherm. Therefore 1/δ = β/(β + γ), i.e. δ = 1 + γ/β, or

γ = β(δ − 1).

This is called Widom's scaling law.

Yet one more relation can be derived as follows. Since ⟨ψ⟩ = ∂F/∂H,

F = ∫ dH ⟨ψ⟩ = ∫ dH t^β ψ̃(H/t^{β+γ}) = t^{2β+γ} ∫ d(H/t^{β+γ}) ψ̃(H/t^{β+γ}).

Hence

F(T, H) = t^{2β+γ} F̃(H/t^{β+γ}).

But by definition of the exponent α,

F(T, 0) = t^{2−α},

therefore 2 − α = 2β + γ, or

α + 2β + γ = 2,

which is Rushbrooke's law.

Finally, we analyze the behavior of successive derivatives of the free energy. This is important in the analysis we carried out earlier regarding a Taylor series expansion of thermodynamic variables, and how it failed to predict basic properties near the critical point. This calculation clarifies the issue. The form of the free energy is

F(T, H) = t^{2−α} F̃(H/t^{β+γ}),

therefore

lim_{H→0} ∂^n F/∂H^n = F^{(n)} = t^{2−α} t^{−n(β+γ)}.

Therefore we obtain the result that successive derivatives of F are separated by a constant "gap" exponent (β + γ), the exponent of the n-th derivative being

y_n = 2 − α − n(β + γ).

In short, the free energy is not analytic at the critical point, and hence cannot be expanded in a Taylor series of its arguments.

We finally discuss two physically intuitive, though not rigorous, arguments to derive the hyperscaling relation. The free energy is extensive and has units of energy, so

F ∼ kB T V/ξ^d,

as one can think of V/ξ^d as the number of independent parts of the system. Since ξ ∼ |T − Tc|^{−ν}, then F ∼ |T − Tc|^{νd}. But the singular behavior in F defines α via

F ∼ |T − Tc|^{2−α}.


Figure 11.7: Gap exponent: the exponents y_n of successive derivatives of the free energy (free energy, magnetization, susceptibility, ...) decrease in equal steps of β + γ.

Hence

νd = 2 − α.

The hyperscaling relation works for all systems, except that mean field solutions must have d = 4.

Another argument, not as compelling but interesting, is as follows. One can imagine that near Tc fluctuations in the order parameter are of the same order as ⟨ψ⟩ itself. If so,

⟨(∆ψ)^2⟩/⟨ψ⟩^2 = O(1).

This assumption is the weak part of the argument, as we do not know the relative size of the fluctuations compared to the average. As argued earlier,

⟨ψ⟩^2 ∼ t^{2β}.

Also, as given in the previous chapter,

∫ d^d x ⟨∆ψ(x)∆ψ(0)⟩ = χ_T.

The integral is effectively nonzero only for x ≤ ξ. In that range, ⟨∆ψ(x)∆ψ(0)⟩ ∼ O(⟨(∆ψ)^2⟩), so that

χ_T = ∫ d^d x ⟨∆ψ(x)∆ψ(0)⟩ ∼ ξ^d ⟨(∆ψ)^2⟩.

Since χ_T ∼ t^{−γ} and ξ ∼ t^{−ν}, then ⟨(∆ψ)^2⟩ ∼ t^{νd−γ}. The relative fluctuation therefore obeys

⟨(∆ψ)^2⟩/⟨ψ⟩^2 ∼ t^{νd−γ−2β},

which we assumed is an order one quantity. Therefore, νd = γ + 2β. To put it into the same form as before, use Rushbrooke's scaling law, α + 2β + γ = 2. This gives νd = 2 − α.


Chapter 12

Renormalization Group

The renormalization group is a theory that builds on scaling ideas, and provides a procedure to calculate critical exponents. It exploits the observation that phase transitions can be described by a few long wavelength properties. These include the symmetry of the order parameter and the dimensionality of space (for systems without long range forces or quenched impurities). Short length scale properties are systematically eliminated by the renormalization group transformation.

The basic ideas behind the renormalization group were given by Kadanoff in 1966 in his "block spin" treatment of the Ising model. A more general method, also free of inconsistencies, was given by Wilson (who won the Nobel prize for his work). Furthermore, Wilson and Fisher devised a very useful implementation of the renormalization group in the so-called ε (= 4 − d) expansion, where the small parameter ε is eventually extrapolated to unity to predict properties in d = 3.

At a critical point, Kadanoff reasoned, spins in the Ising model act together up to distances of the order of the correlation length. He argued that one could compute the partition function recursively, by progressively eliminating (or coarse graining) scales smaller than ξ. From the recursion relation describing scale changes, one can in fact solve the Ising model. This is only an asymptotic solution valid for long length scales, but it is sufficient to derive the correct thermodynamic behavior.

12.1 Decimation in one dimension

In order to illustrate these ideas with a specific example, we will consider a special coarse graining transformation in the one-dimensional Ising model that can be carried out exactly. Of course, this is an academic exercise as there is no phase transition in one dimension, but the renormalization procedure can be followed in detail. We first introduce a coupling constant K = J/kB T and write the partition function as

Z = Σ_{{S_i}} e^{K Σ_i S_i S_{i+1}}.

We then rewrite e^{K S_i S_{i+1}} = cosh K (1 + u S_i S_{i+1}), with u = tanh K. This relationship is easy to verify by explicitly substituting the values corresponding to all four combinations of spin values. We rewrite the partition function as

Z = Σ_{{S_i}} Π_i cosh K (1 + u S_i S_{i+1}).
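The identity e^{K S S′} = cosh K (1 + u S S′) can indeed be verified exhaustively over the four spin pairs, as the text suggests (a sketch, not from the notes; K = 0.7 is an arbitrary test value):

```python
import math

K = 0.7          # arbitrary test coupling
u = math.tanh(K)
for S in (+1, -1):
    for Sp in (+1, -1):
        lhs = math.exp(K * S * Sp)
        rhs = math.cosh(K) * (1 + u * S * Sp)
        assert abs(lhs - rhs) < 1e-12, (S, Sp)
print("Identity verified for all four spin configurations.")
```

For S S′ = +1 both sides equal cosh K + sinh K = e^K; for S S′ = −1 both equal cosh K − sinh K = e^{−K}.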

We now introduce a specific renormalization transformation called "decimation": in the sum leading to the partition function, all spins on odd sites i = 2n + 1, n = 0, 1, ..., are singled out, and the sum is carried out over them. This leads to a sum that will depend only on the even-numbered spins:

Z = Σ_{{S_{2n}}} Σ_{{S_{2n+1}}} Π_i cosh K (1 + u S_i S_{i+1}) = Σ_{{S_{2n}}} Π_n Σ_{S_{2n+1}} e^{K(S_{2n} S_{2n+1} + S_{2n+1} S_{2n+2})}.


The sum over, say, S_3 can be done because both S_2 and S_4 are held fixed. That is, for any given spin configuration of S_2 and S_4, we can sum over all possible states of S_3. The same can be done for all the odd spins while holding the even spins constant. Explicitly,

Σ_{S_{2n+1}} e^{K(S_{2n} S_{2n+1} + S_{2n+1} S_{2n+2})} = cosh^2 K Σ_{S_{2n+1}} (1 + u S_{2n} S_{2n+1})(1 + u S_{2n+1} S_{2n+2}) = 2 cosh^2 K (1 + u^2 S_{2n} S_{2n+2}).

Substituting back into the partition function, one now has

Z = Σ_{{S_{2n}}} Π_n 2 cosh^2 K (1 + u^2 S_{2n} S_{2n+2}).

This is the partition function of the decimated system (odd spins summed over). Given the partial summation carried out, Z no longer includes any dependence on fluctuations at length scales of the order of the lattice spacing: the sums over them have already been computed. The remarkable observation is that this partition function is the same as the partition function of the original system, except for a multiplicative constant, cosh K → 2 cosh^2 K, and the replacement of u by u^2. In fact, to indicate this transformation of u after one iteration of our transformation, one writes

u^{(1)} = u^2.

This is called a recursion relation, as it relates the interaction constant of the decimated system to that of the original lattice. The remaining even spins interact with a coupling constant u^{(1)} which already contains the effect of all the possible configurations of the intervening spins that have been eliminated. You can think of this as an effective coupling constant that includes the effects of the odd spins.

In the original variables, the recursion relation is tanh K^{(1)} = (tanh K)^2. Since the inverse of tanh is known, we can write

K^{(1)} = (1/2) ln cosh(2K).

This transformation can be iterated over and over again, until the entire sum leading to the partition function has been carried out. At each level of iteration, the effective length scale of the fluctuations is halved, as only the remaining effective spins interact, with an effective coupling constant K^{(i)}. We write this iteration as

K^{(i+1)} = R(K^{(i)}),   R(x) = (1/2) ln cosh(2x).   (12.1)

Note that under iteration,

lim_{n→∞} K^{(n)} = 0,

unless K^{(0)} = ∞. Under iteration, the partition function is transformed into an Ising model of effective spins with an ever-decreasing coupling constant K^{(n)}. Since K = J/kB T, this can be interpreted as meaning that under iteration the remaining effective spins appear to be at higher and higher temperature. In short, the partition function is that of a disordered, infinite temperature Ising model. This is of course the only stable phase for the Ising model in one dimension. The other solution of this recursion is the particular case in which the original system of spins has K^{(0)} = ∞, that is, the system is at T = 0. This is the perfectly ordered solution, which we know is only stable at T = 0.
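The flow toward the high-temperature fixed point is easy to watch numerically by iterating the recursion relation (12.1); a sketch (not from the notes) with a few arbitrary starting couplings:

```python
import math

def R(K):
    """Decimation recursion K^(i+1) = (1/2) ln cosh(2 K^(i))."""
    return 0.5 * math.log(math.cosh(2.0 * K))

# Every finite starting coupling flows to the trivial fixed point K* = 0,
# i.e. toward the infinite-temperature (disordered) Ising model.
for K0 in (0.1, 1.0, 3.0):
    K = K0
    for _ in range(20):
        K = R(K)
    print(f"K(0) = {K0}: K(20) = {K:.3e}")
```

For small K the recursion behaves as R(K) ≈ K^2, so the collapse toward K* = 0 is extremely fast once the coupling becomes weak.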

With this simple transformation, we have already determined that the partition function of the Ising model has only two possibilities: one corresponding to the infinite-temperature Ising model, for any arbitrary starting Ising model (of fixed temperature), and the zero-temperature Ising model. For later reference, the solutions of the recursion relation are called fixed points. A fixed point is one in which the system remains invariant under the transformation:
$$K^* = R(K^*).$$

In the case given, the high-temperature fixed point is said to be stable: trajectories with any starting $K$ flow into it under the action of the transformation $R$. The low-temperature fixed point is said to be unstable: only a trajectory that starts exactly at the fixed point remains on it, and any small perturbation will cause the trajectory to flow toward the stable fixed point.
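As a quick numerical check, the recursion relation (12.1) can be iterated directly. This is a minimal sketch; the starting couplings are arbitrary choices for illustration:

```python
import math

def R(K):
    """One decimation step for the 1D Ising chain: K' = (1/2) ln cosh(2K)."""
    return 0.5 * math.log(math.cosh(2.0 * K))

def flow(K0, n):
    """Iterate the recursion n times starting from coupling K0."""
    K = K0
    for _ in range(n):
        K = R(K)
    return K

# Any finite starting coupling flows to the stable K* = 0
# (infinite-temperature) fixed point.
print(flow(1.0, 50))    # essentially zero
print(flow(5.0, 200))   # even a strong initial coupling decays away
```

Only an exactly infinite starting coupling (the $T = 0$ fixed point) escapes this flow, in agreement with the discussion above.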


12.2 Block scaling renormalization group

Decimation can only be done exactly in one dimension. We turn here to a different transformation:block scaling. We write the partition function as

$$Z(N, K, h) = \sum_{\{S_i\}} e^{K\sum_{\langle ij\rangle} S_i S_j + h\sum_i S_i}$$

where $K = J/k_BT$ and $h = H/k_BT$. Now, let us reorganize this sum as follows:

1. Divide the lattice into blocks of linear size $b$ ($b \times b$ in the two-dimensional cartoon below; we let $b = 2$ in Fig. 12.3).

2. Each block will have an internal sum done on all the spins it contains, leading to an effective block spin $S^{(1)} = \pm 1$. There are many different block transformations in use, but for the purposes of this discussion we can think of a majority-rule transformation defined as
$$S^{(1)}_i = \begin{cases} +1, & \sum_{i\in b^d} S_i > 0 \\ -1, & \sum_{i\in b^d} S_i < 0 \\ \pm 1 \text{ randomly}, & \sum_{i\in b^d} S_i = 0 \end{cases}$$

Clearly, if $N$ is the number of spins in the original lattice, only $N/b^d$ spins will remain after blocking. It is easy to see why this is a good idea. Any feature correlated on a finite length scale $l$ before blocking lives on the scale $l/b$ after blocking once. Blocking $m$ times means this feature is now on the small scale $l/b^m$. In particular, consider the correlation length $\xi$. The renormalization group (RG) recursion relation for $\xi$ is
$$\xi^{(m+1)} = \xi^{(m)}/b.$$

If the RG transformation has a fixed point, then
$$\xi^* = \xi^*/b.$$
The equation defining the fixed point has only two possible solutions:
$$\xi^* = \begin{cases} 0, & \text{trivial fixed points: } T = 0, \infty \\ \infty, & \text{nontrivial fixed point: } T = T_c \end{cases}$$

Note that the critical fixed point (the fixed point at $T_c$) is unstable, while the $T = 0$ and $T = \infty$ fixed points are stable. If one does not start at a state in which the correlation length is exactly infinite, the flow defined by the blocking transformation will lead to the fixed point $\xi^* = 0$. This is shown in Fig. 12.1.

Figure 12.1: Flow of the correlation length $\xi$ under the RG; the arrows show the flow away from $T_c$ toward the stable fixed points.

As was the case with the decimation transformation, the coarse-graining eliminates information on small scales in a controlled way. Consider Fig. 12.2, and the blocking transformation shown in Fig. 12.3. The states in the new renormalized system have $S^{(m+1)}_i = \pm 1$, on sites $i = 1, 2, \ldots, N^{(m+1)}$. Of course, $N^{(m+1)} = N^{(m)}/b^d$. In the figure, $N^{(m+1)} = 16$, $b^d = 2^2$, and $N^{(m)} = 64$.


Figure 12.2: A system at $T > T_c$ is blocked into a renormalized system with fewer fluctuations, resembling a system with only a few fluctuations (lower $T$).

Figure 12.3: Original system at RG level $m$, blocked with $b = 2$ into the renormalized configuration at RG level $(m+1)$.

It is simple to define and label all the states in the sum for $Z$ as we renormalize, but how do the spins interact after renormalization? In other words, how do we determine the coupling constants in the renormalized lattice as a function of the original coupling constant? It is clear that, if the original system only has short-ranged interactions, the renormalized system will as well.

At this point, historically, Kadanoff reasoned that if $E^{(m)}(S^{(m)}_i)$ was that of an Ising model, then $E^{(m+1)}(S^{(m+1)}_i)$ was as well. The coupling constants $K^{(m)}$ and $h^{(m)}$ would depend on the level of transformation $(m)$, but not the functional form of the energy $E$. This is not correct, as we will see shortly, but the idea and approach are dead on target. If this assumption were correct, the partition function after $(m)$ iterations of the transformation would be simply,

$$Z^{(m)}(N^{(m)}, K^{(m)}, h^{(m)}) = \sum_{\{S^{(m)}_i\}} e^{K^{(m)}\sum_{\langle ij\rangle} S^{(m)}_i S^{(m)}_j + h^{(m)}\sum_i S^{(m)}_i}$$

But
$$Z^{(m)} = e^{-N^{(m)} f^{(m)}(h^{(m)}, t^{(m)})}$$

where $t^{(m)}$ is the reduced temperature obtained from $K^{(m)}$, and $f^{(m)}$ is the free energy per unit volume. But since we are only doing partial sums to compute the partition function, it must be the case that $Z^{(m)} = Z^{(m+1)}$. Therefore,
$$N^{(m)} f^{(m)}(h^{(m)}, t^{(m)}) = N^{(m+1)} f^{(m+1)}(h^{(m+1)}, t^{(m+1)})$$
or,
$$f^{(m)}(h^{(m)}, t^{(m)}) = b^{-d} f^{(m+1)}(h^{(m+1)}, t^{(m+1)}) \tag{12.2}$$

This is the recursion relation for the free energy, under the assumption that the original lattice and all the renormalized lattices have an interaction of the Ising type.

Since $h$ and $t$ change under the RG, and the transformation only depends on the parameter $b$, we assume
$$t^{(m+1)} = b^A t^{(m)}, \qquad h^{(m+1)} = b^B h^{(m)}. \tag{12.3}$$


Hence,
$$f^{(m)}(h, t) = b^{-d} f^{(m+1)}(h b^B, t b^A).$$
Now Kadanoff assumed (incorrectly) that $f^{(m)}$ and $f^{(m+1)}$ are both simply free energies of the Ising model, and hence the same function. Hence
$$f(h, t) = b^{-d} f(h b^B, t b^A). \tag{12.4}$$
Hence, the free energy is a generalized homogeneous function of its arguments. This is the same result assumed by Widom. We do not have a derivation for the exponents $A$ and $B$, and hence the possibility remains open for the existence of nontrivial exponents.

The recursion relation quickly leads to a scaling form for the free energy. Since $b$ is arbitrary, choose $t b^A = 1$, i.e., $b = t^{-1/A}$. Then
$$f(h, t) = t^{d/A} f(h t^{-B/A}, 1) = t^{d/A} \tilde f(h t^{-B/A}),$$
which is of the same form as the scaling relation previously obtained for $F$.
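The homogeneity relation (12.4) is easy to verify numerically for any scaling function. In the sketch below, the function g and the exponent values A, B, d are arbitrary stand-ins chosen just to illustrate the algebra; nothing here is specific to the Ising model:

```python
import math

A, B, d = 1.0, 1.875, 2.0   # illustrative exponent values, not fitted to anything

def g(x):
    """Any smooth scaling function will do for this check."""
    return math.exp(-x * x)

def f(h, t):
    """Scaling form f(h, t) = t^(d/A) * g(h * t^(-B/A))."""
    return t ** (d / A) * g(h * t ** (-B / A))

# f(h, t) = b^(-d) f(h b^B, t b^A) should hold for every b.
h, t = 0.3, 0.5
for b in (2.0, 3.0, 7.5):
    lhs = f(h, t)
    rhs = b ** (-d) * f(h * b ** B, t * b ** A)
    assert abs(lhs - rhs) < 1e-9 * abs(lhs)
print("homogeneity relation verified")
```

The check works for any choice of g, which is the point: the scaling form is equivalent to generalized homogeneity, independently of the shape of the scaling function.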

It is now known that, in general, configuration blocking of the Ising model does not lead to Ising-type interactions among the block spins. Consequently, the arguments above do not apply, as at each iteration the functional form of the energy as a function of the block spins is different. In order to see why, let us consider the interactions among spins before and after application of the RG transformation. For simplicity, let $H = 0$. The original system has interactions spanning only its nearest neighbors, as shown schematically in Fig. 12.4:
$$\frac{E^{(m)}}{k_BT} = -K^{(m)}\sum_{\langle ij\rangle} S^{(m)}_i S^{(m)}_j \quad \text{vs.} \quad \frac{E^{(m+1)}}{k_BT} = -K^{(m+1)}\sum_{\langle ij\rangle} S^{(m+1)}_i S^{(m+1)}_j + (\text{longer range interactions}).$$

Figure 12.4: Interactions of the original system vs. those of the renormalized system.

The blocked spin treats four spins as interacting together, with identical strength, with other blocked spins. This interaction is shown in Fig. 12.5.

Figure 12.5: Nearest-neighbor interaction (shown by a solid line) and next-nearest-neighbor interaction along the diagonal (shown by a wavy line).

Note that the direct part of the nearest-neighbor interaction to the right in the renormalized system has 4 spins interacting equivalently with 4 other spins. There are 16 different configurations of the original system which are replaced by one global, effective, interaction of the block spin. Those sixteen cases all contribute equally to the block-spin nearest


neighbor interaction despite the fact that, counting pairs by their (Manhattan) separation, there are:

2 interactions separated by 1 lattice constant
6 interactions separated by 2 lattice constants
6 interactions separated by 3 lattice constants
2 interactions separated by 4 lattice constants

Let's compare this to the next-nearest-neighbor interaction, along the diagonals (Fig. 12.6). There are again 16 configurations of the original system that are treated equally in the block-spin representation:

Figure 12.6: Nearest-neighbor interaction vs. next-nearest-neighbor interaction.

1 interaction separated by 2 lattice constants
4 interactions separated by 3 lattice constants
6 interactions separated by 4 lattice constants
4 interactions separated by 5 lattice constants
1 interaction separated by 6 lattice constants
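These tallies can be checked by brute force. The sketch below counts the Manhattan separations between the 16 spin pairs of two 2x2 blocks, assuming (as above) that separation is measured in lattice constants:

```python
from collections import Counter
from itertools import product

# Spins of a 2x2 block at the origin.
block = list(product(range(2), range(2)))

def separations(offset):
    """Tally Manhattan distances over all 4x4 = 16 spin pairs of two blocks."""
    other = [(x + offset[0], y + offset[1]) for x, y in block]
    return Counter(abs(p[0] - q[0]) + abs(p[1] - q[1])
                   for p in block for q in other)

# Nearest-neighbour blocks (offset by one block width along x):
print(sorted(separations((2, 0)).items()))   # [(1, 2), (2, 6), (3, 6), (4, 2)]
# Next-nearest-neighbour blocks (offset along the diagonal):
print(sorted(separations((2, 2)).items()))   # [(2, 1), (3, 4), (4, 6), (5, 4), (6, 1)]
```

In both cases the counts sum to 16, one term for each underlying spin pair that the single block-spin coupling replaces.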

Let us define the block-spin next-nearest-neighbor coupling constant $K^{(m+1)}_{nn}$. It subsumes all the cases above into a single effective interaction. It is clear from looking at the interactions in the original system that underlie the block-spin interactions that $K^{(m+1)}$ (nearest neighbor) and $K^{(m+1)}_{nn}$ (next-nearest neighbor) are in fact quite similar, although it appears that $K^{(m+1)}_{nn} < K^{(m+1)}$. However, Kadanoff's assumption implies that $K^{(m+1)}_{nn} = 0$ (the renormalized spins have an Ising-type interaction). This assumption turns out to be incorrect.

It is nowadays possible to numerically track what happens to the Ising model interactions under

the block-spin renormalization group. It turns out that longer-range interactions between effective spins are generated by the block transformation, much as we have described above, as well as three- and higher-spin interactions (terms like $K_3\sum S_i S_j S_k$ or higher). Although the interactions remain short ranged among the block spins, it turns out they become quite complicated.

The appearance of complex interaction terms under renormalization, and of their corresponding coupling constants, is most usefully illustrated as a flow in a space of coupling constants. For $H = 0$ and the original Ising model, the phase diagram is shown in Fig. 12.7. For a generalized Ising

Figure 12.7: Usual Ising model at $H = 0$: the axis runs from $K = 0$ (infinite $T$) through $K_c$ to low $T$. Recall $K = J/k_BT$.

model with only next-nearest-neighbor interactions ($J = 0$), one gets the same phase diagram for $H = 0$ (Fig. 12.8). Imagine now what happens to the phase diagram of a generalized Ising model with both $K$ and $K_{nn}$, namely,
$$\frac{E}{k_BT} = -K\sum_{\langle ij\rangle} S_i S_j - K_{nn}\sum_{\langle\langle ij\rangle\rangle} S_i S_j$$
for $H = 0$. The phase diagram is sketched in Fig. 12.9. These two phase diagrams illustrate the


Figure 12.8: Ising model with no nearest-neighbor interaction: the axis runs from $K_{nn} = 0$ (infinite $T$) through $(K_{nn})_c$ to low $T$, with $K_{nn} = J_{nn}/k_BT$.

Figure 12.9: Phase diagram in the $(K, K_{nn})$ plane: a line of second-order transitions through $(K_c, 0)$ and $(0, (K_{nn})_c)$ separates the disordered (infinite-$T$) region from the low-$T$ ordered region.

complexity that can be expected under renormalization. If we start with the Ising model, the phase diagram is given in Fig. 12.7. Recall the existence of one unstable fixed point at $T_c$, and of two trivial, stable fixed points. Under one iteration of the renormalization transformation, the energy of the block spins will contain nearest-neighbor interactions, but also next-nearest-neighbor ones, and possibly higher. In that case, the phase diagram of the renormalized system could be that of Fig. 12.9 (if only next-nearest-neighbor terms were generated), with a more complicated fixed-point structure.

It is customary then to introduce a many-dimensional space of coupling constants, and to examine the flow under renormalization in this space. Figure 12.10 illustrates possible flows on a two-dimensional projection (for simplicity). Three fixed points are schematically indicated: $(K^*, K^*_{nn}) = (0, 0)$ (the infinite-temperature, disordered fixed point); $(K^*, K^*_{nn}) = (\infty, \infty)$ (the zero-temperature fixed point); and a nontrivial $(K^*, K^*_{nn})$, a fixed point which is stable along the direction tangent to the critical line, but unstable in the normal direction. This is the fixed point that corresponds to the critical point of the renormalized model, and hence of the system. It is from the fixed-point structure, and from the flows in coupling-constant space, that one can compute scaling functions and associated exponents.

Figure 12.10: Flows in the $(K, K_{nn})$ plane, showing the critical fixed point of the RG transformation.

One should point out that the effective interactions developed by the RG depend on the specific RG transformation. For example, blocking with $b = 3$ would give a different sequence of interactions, and a different fixed-point result, than $b = 2$. But it turns out that, since we are computing the same partition function by doing the partial sums in different ways, the exponents obtained must all be the same.

The way to formalize our discussion so far was developed by Wilson. The RG transformation takes us from one set of coupling constants $\vec K^{(m)}$ to another $\vec K^{(m+1)}$, via a length-scale change of $b$ entering a recursion relation: $\vec K^{(m+1)} = R(\vec K^{(m)})$. We have introduced a vector notation to accommodate the many axes used in coupling-constant space, $\vec K = (K, K_{nn}, K_3, \ldots)$. By iteration, one may reach a fixed point $\vec K^* = R(\vec K^*)$. Let us now imagine that after a sufficient number of iterations, we are close enough to the fixed point $\vec K^*$ that we can linearize the transformation $R$


around it. In principle this is clear. In practice, because of all the coupling constants generated on the way to the fixed point, it is quite complex. Usually it is done by numerical methods, or by the $\epsilon = 4 - d$ expansion described later.

Linearization of the recursion relation near the fixed point can be obtained from (we use Greek subindices to denote components of $\vec K$)
$$K^{(m+1)}_\alpha - K^*_\alpha = R_\alpha(\vec K^{(m)}) - R_\alpha(\vec K^*),$$
so that by linearization,
$$\delta K^{(m+1)}_\alpha = \sum_\beta M_{\alpha\beta}\,\delta K^{(m)}_\beta,$$
with
$$M_{\alpha\beta} = \left(\frac{\partial R_\alpha}{\partial K_\beta}\right)_{\vec K^*} \qquad \text{and} \qquad \delta K_\alpha = K_\alpha - K^*_\alpha.$$
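In practice this linearization is done numerically. As an illustration only, the sketch below linearizes a hypothetical two-coupling recursion R (invented for this example, not derived from any model) by finite differences and reads off the eigenvalues of M:

```python
import numpy as np

def R(K):
    """A made-up two-coupling recursion with a fixed point at K* = (0, 0)."""
    K1, K2 = K
    return np.array([np.tanh(1.2 * K1 + 0.5 * K2),
                     0.3 * K1 * K2 + 0.2 * K2])

def jacobian(R, Kstar, eps=1e-6):
    """Finite-difference estimate of M_ab = dR_a/dK_b at the fixed point K*."""
    n = len(Kstar)
    M = np.zeros((n, n))
    for b in range(n):
        dK = np.zeros(n)
        dK[b] = eps
        M[:, b] = (R(Kstar + dK) - R(Kstar - dK)) / (2 * eps)
    return M

Kstar = np.zeros(2)
assert np.allclose(R(Kstar), Kstar)      # confirm it really is a fixed point
M = jacobian(R, Kstar)
eigenvalues = np.linalg.eigvals(M)
print(eigenvalues)                       # one eigenvalue > 1 (relevant), one < 1 (irrelevant)
```

For this toy map the eigenvalues come out near 1.2 and 0.2, so one perturbation grows under iteration and the other shrinks, exactly the distinction exploited below.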

12.2.1 Monte Carlo Renormalization Group

Before we continue our analysis of the transformation, it is useful to consider a purely numerical method to obtain the matrix $M_{\alpha\beta}$. This method was developed by Swendsen (Phys. Rev. Lett. 42, 859 (1979)). First, write the energy formally as
$$E = \sum_\alpha K_\alpha S_\alpha,$$
where the $K_\alpha$ are the components of $\vec K$, so that $S_1 = \sum_{\langle ij\rangle} S_i S_j$, and so on for the higher-order interactions. By using the chain rule,

$$\frac{\partial \langle S^{(m+1)}_\gamma\rangle}{\partial K^{(m)}_\beta} = \sum_\alpha \frac{\partial K^{(m+1)}_\alpha}{\partial K^{(m)}_\beta}\,\frac{\partial \langle S^{(m+1)}_\gamma\rangle}{\partial K^{(m+1)}_\alpha}.$$

This relation involves the matrix elements that we need to compute. The other derivatives are in fact correlation functions:
$$\frac{\partial \langle S^{(m+1)}_\gamma\rangle}{\partial K^{(m+1)}_\alpha} = \frac{\partial}{\partial K^{(m+1)}_\alpha}\left[\frac{1}{Z}\sum S^{(m+1)}_\gamma\, e^{\sum_\alpha K_\alpha S_\alpha}\right].$$

Explicitly taking the derivatives leads to,
$$\frac{\partial \langle S^{(m+1)}_\gamma\rangle}{\partial K^{(m+1)}_\alpha} = -\frac{1}{Z}\frac{\partial Z}{\partial K^{(m+1)}_\alpha}\langle S^{(m+1)}_\gamma\rangle + \frac{1}{Z}\sum S^{(m+1)}_\gamma S^{(m+1)}_\alpha\, e^{\sum_\alpha K_\alpha S_\alpha} = -\langle S^{(m+1)}_\gamma\rangle\langle S^{(m+1)}_\alpha\rangle + \langle S^{(m+1)}_\gamma S^{(m+1)}_\alpha\rangle.$$
The right-hand side is an equilibrium correlation function that can be calculated by simulation (the details of the simulation method are described in the chapter on Monte Carlo methods). In essence, the Ising model is simulated, and the block transformation is carried out numerically so that the correlation functions for the block configurations can also be computed.

Analogously, one can also find that,
$$\frac{\partial \langle S^{(m+1)}_\gamma\rangle}{\partial K^{(m)}_\beta} = -\langle S^{(m+1)}_\gamma\rangle\langle S^{(m)}_\beta\rangle + \langle S^{(m+1)}_\gamma S^{(m)}_\beta\rangle.$$

Therefore, the simulation, plus a judicious truncation in the number of components of $\vec K$ to be considered (neglecting very long-range interactions that will remain small), leads to a numerical estimate of the required matrix elements $M_{\alpha\beta}$.
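A heavily simplified sketch of the procedure follows. It is not Swendsen's full method: it uses a small lattice, a single blocking level, short runs, and only the nearest-neighbor operator S_1; the lattice size, coupling, and sample counts are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def metropolis_sweep(spins, K):
    """One Metropolis sweep of the 2D Ising model at coupling K = J/kT, H = 0."""
    L = spins.shape[0]
    for _ in range(L * L):
        i, j = rng.integers(0, L, size=2)
        nn = (spins[(i + 1) % L, j] + spins[(i - 1) % L, j]
              + spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
        dE = 2.0 * K * spins[i, j] * nn        # cost of flipping spin (i, j)
        if dE <= 0 or rng.random() < np.exp(-dE):
            spins[i, j] *= -1

def block(spins, b=2):
    """Majority-rule block-spin transformation; ties broken randomly."""
    L = spins.shape[0]
    s = spins.reshape(L // b, b, L // b, b).sum(axis=(1, 3))
    out = np.sign(s)
    ties = out == 0
    out[ties] = rng.choice([-1, 1], size=int(ties.sum()))
    return out.astype(int)

def nn_sum(spins):
    """The operator S_1 = sum over nearest-neighbour pairs of S_i S_j."""
    return ((spins * np.roll(spins, 1, axis=0)).sum()
            + (spins * np.roll(spins, 1, axis=1)).sum())

L, K, nsamp = 16, 0.3, 40
spins = rng.choice([-1, 1], size=(L, L))
for _ in range(20):                            # crude equilibration
    metropolis_sweep(spins, K)
s0 = s1 = s01 = 0.0
for _ in range(nsamp):
    metropolis_sweep(spins, K)
    a, ab = nn_sum(spins), nn_sum(block(spins))
    s0 += a; s1 += ab; s01 += a * ab
cov = s01 / nsamp - (s0 / nsamp) * (s1 / nsamp)
print("connected correlation of S_1 across one blocking level:", cov)
```

The real method accumulates such connected correlators for a set of operators and blocking levels, solves the chain-rule relation for the matrix elements, and repeats over many long runs.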


12.2.2 Scaling form, and critical exponents

Assume now that we have obtained a fixed point of a given RG transformation, and computed the matrix $M_{\alpha\beta}$. We now diagonalize $M$, and call $\lambda_\alpha$ the eigenvalues. Assume for simplicity of exposition that the $\delta \vec K$ above are directly the eigenvectors (that is, that the original matrix is already diagonal; otherwise, the discussion that follows applies to the actual eigenvectors). Then,
$$M_{\alpha\beta} = \begin{pmatrix} \lambda_1 & & 0 \\ & \lambda_2 & \\ 0 & & \ddots \end{pmatrix}$$
so that
$$\delta K^{(m+1)}_\alpha = \lambda_\alpha\, \delta K^{(m)}_\alpha.$$

The functional dependence of $\lambda_\alpha$ on $b$ is easy to find. It is clear that acting with the RG on scale $b$, followed by acting again with $b'$, is the same as acting once with $b\cdot b'$. Hence:
$$M(b)M(b') = M(bb').$$
This is precisely one of the requirements for the transformation to be a “group” in the mathematical sense. Or,
$$\lambda(b)\lambda(b') = \lambda(bb').$$
The solution of this equation is $\lambda = b^y$, since $(bb')^y = b^y b'^y$. Therefore the renormalization group transformation leads to
$$\delta K^{(m+1)}_\alpha = b^{y_\alpha}\, \delta K^{(m)}_\alpha \tag{12.5}$$

near the fixed point.

If $y_\alpha > 0$, $\delta K_\alpha$ is called relevant, i.e., its magnitude increases under renormalization. If $y_\alpha = 0$, $\delta K_\alpha$ is called marginal, as it neither increases nor decreases under renormalization. If $y_\alpha < 0$, then $\delta K_\alpha$ is called irrelevant, as it becomes smaller under renormalization: after infinitely many iterations, this coupling constant would disappear from the energy of the block spins.

The critical part of the argument is that we require that the transformation preserve the partition function. This partition function is, after all, what we wish to compute by the partial summation involved in the renormalization transformation. This requirement and the recursion relation (12.5) imply for the free energy
$$f^{(m)}(\delta \vec K^{(m)}) = b^{-d} f^{(m+1)}(\delta \vec K^{(m+1)}),$$
or, at the fixed point,
$$f^*(\delta K_1, \delta K_2, \ldots) = b^{-d} f^*(b^{y_1}\delta K_1, b^{y_2}\delta K_2, \ldots),$$
which proves that the free energy is a generalized homogeneous function of its arguments, and provides explicit values of the scaling exponents from the eigenvalues of the matrix $M$.
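Once the relevant exponents y_alpha are known, the critical exponents follow from this homogeneity via the standard scaling relations. As an illustration, feeding in the exactly known 2D Ising values y_t = 1 and y_h = 15/8 recovers the exact 2D exponents:

```python
from fractions import Fraction

d   = 2
y_t = Fraction(1)        # thermal eigenvalue exponent (exact, 2D Ising)
y_h = Fraction(15, 8)    # magnetic eigenvalue exponent (exact, 2D Ising)

# Standard scaling relations from homogeneity of the free energy:
alpha = 2 - d / y_t            # specific heat
beta  = (d - y_h) / y_t        # order parameter
gamma = (2 * y_h - d) / y_t    # susceptibility
nu    = 1 / y_t                # correlation length
delta = y_h / (d - y_h)        # critical isotherm
eta   = d + 2 - 2 * y_h        # correlation-function decay

print(alpha, beta, gamma, nu, delta, eta)
```

The output reproduces the exact 2D Ising exponents: alpha = 0 (logarithmic), beta = 1/8, gamma = 7/4, nu = 1, delta = 15, eta = 1/4.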

In practice, the problem is to find the fixed point. A complete solution was provided by the so-called $\epsilon = 4 - d$ expansion, discussed later in this chapter. From our discussion above of the Ginzburg criterion, mean field theory works for $d > 4$ (it is marginal, with logarithmic corrections, in $d = 4$). Therefore there should be an easy-to-find and easy-to-analyze fixed point in $d \geq 4$. The assumption of this theory is that the fixed-point analysis in $d = 4$ can be analytically continued to $d < 4$ in a controlled $\epsilon = 4 - d$ expansion.

We finally list some results that extend our calculation to the general case in which the matrix $M$ is not diagonal, which is normally the case. We assume that $M$ is symmetric to make things easier. (If not, we have to define right and left eigenvectors. This is done in Goldenfeld's book.) We define an orthonormal, complete set of eigenvectors by
$$\sum_\beta M_{\alpha\beta}\, e^{(\gamma)}_\beta = \lambda^{(\gamma)} e^{(\gamma)}_\alpha$$


where the $\lambda^{(\gamma)}$ are the eigenvalues and the $e^{(\gamma)}$ are the eigenvectors. We expand $\delta K$ via
$$\delta K_\alpha = \sum_\gamma \delta \mathcal{K}_\gamma\, e^{(\gamma)}_\alpha,$$
or,
$$\delta \mathcal{K}_\gamma = \sum_\alpha \delta K_\alpha\, e^{(\gamma)}_\alpha,$$

where the $\delta \mathcal{K}$ are the “scaling fields”, linear combinations of the original interactions. Now we act on the linearized RG transformation with $\sum_\alpha e^{(\gamma)}_\alpha(\ldots)$. This gives
$$\sum_\alpha e^{(\gamma)}_\alpha\, \delta K^{(m+1)}_\alpha = \sum_\alpha e^{(\gamma)}_\alpha \Big(\sum_\beta M_{\alpha\beta}\, \delta K^{(m)}_\beta\Big).$$
It is a little confusing, but the superscripts on the $e$'s have nothing to do with the superscripts on the $\delta K$'s. We get
$$\delta \mathcal{K}^{(m+1)}_\gamma = \lambda^{(\gamma)} \sum_\beta e^{(\gamma)}_\beta\, \delta K^{(m)}_\beta,$$
so,
$$\delta \mathcal{K}^{(m+1)}_\gamma = \lambda^{(\gamma)}\, \delta \mathcal{K}^{(m)}_\gamma.$$
This is what we had (for the $\delta \mathcal{K}$'s rather than the $\delta K$'s) a few pages back. Everything follows the same from there.

12.3 φ4 model: dimensional analysis

We start from
$$F[\psi] = \int d^d\vec x\, \left[\frac{K}{2}(\nabla\psi)^2 + \frac{r}{2}\psi^2 + \frac{u}{4}\psi^4\right]$$
near $T_c$, where $K$ and $u$ are constants and $r \propto T - T_c$. We introduce the following change of variables: $F' = \beta F$, $\psi' = \sqrt{\beta K}\,\psi$, $r' = r/K$, $u' = u/\beta K^2$. After some algebra, removing the primes, and renaming $r' = r_0$ and $u' = u_0$, we have,
$$F = \int d^d\vec x\, \left[\frac{1}{2}|\nabla\psi|^2 + \frac{r_0}{2}\psi^2 + \frac{u_0}{4}\psi^4\right].$$

Let us examine the dimensions of the various terms (denoted by $[\,\cdot\,]$). By definition, $F$ is dimensionless, $[F] = 1$. Therefore
$$\left[\int d^d\vec x\, |\nabla\psi|^2\right] = L^d L^{-2} [\psi]^2 = 1,$$
or $[\psi] = L^{1-d/2}$. Furthermore, $[r_0] = L^{-2}$, and $[u_0] = L^{d-4}$. These are called the engineering dimensions.

Critical phenomena involve different scalings with length, as we have seen, and therefore the actual dimensions of variables near the critical point do not agree with the engineering dimensions. This is one of the manifestations of the complexity of critical phenomena. For example,
$$[\psi^2] = \frac{1}{L^{d-2}},$$
but we know that near $T_c$
$$\langle \psi(0)\psi(x)\rangle = \frac{1}{x^{d-2+\eta}},$$
where $\eta \neq 0$, contrary to what the engineering dimension would lead one to believe. Also, since $r \propto T - T_c$, near $T_c$
$$[T - T_c] = \frac{1}{L^2},$$


but $[\xi] = L$, so
$$\xi = (T - T_c)^{-1/2}$$
near $T_c$. But in fact
$$\xi = (T - T_c)^{-\nu}$$
with $\nu \neq 1/2$. The results of our scaling theory give as scaling dimensions
$$[\psi]_{\rm scaling} = \frac{1}{L^{(d-2+\eta)/2}} = \frac{1}{L^{\frac{d}{2}-1}}\,\frac{1}{L^{\eta/2}}$$
and
$$[r]_{\rm scaling} = \frac{1}{L^{1/\nu}} = \frac{1}{L^2}\,\frac{1}{L^{\frac{1}{\nu}-2}}.$$
One can interpret these anomalous dimensions as following from a dependence on another length in the problem, the lattice constant $a \approx 5\,$Å, and rewrite
$$\langle \psi(0)\psi(x)\rangle = \frac{a^\eta}{x^{d-2+\eta}},$$
so that the dimensions of $\psi$ are those given by dimensional analysis (engineering dimensions). The exponent $\eta/2$ is called the anomalous dimension of $\psi$, while $\frac{1}{\nu} - 2$ is the anomalous dimension of $r$.

It is useful at this point to rescale variables again, using $L = r_0^{-1/2}$ as the unit of length. We define $\psi' = \psi/L^{1-d/2}$, $x' = x/L$ and $u' = u_0/L^{d-4}$. Simple algebraic substitutions lead to (removing primes, except on $u$),
$$F = \int d^d\vec x\, \left[\frac{1}{2}|\nabla\psi|^2 + \frac{1}{2}\psi^2 + \frac{u'}{4}\psi^4\right].$$

Therefore the partition function of the $\phi^4$ model is a function of the single parameter $u'$. Since it is the coefficient of the quartic term in the energy, the first approximation is to treat it as a perturbation. This leads to the Ornstein-Zernike theory, and hence to mean field exponents. However, since $r_0 \propto (T - T_c)$ near the critical temperature, one has $u' \sim t^{(d-4)/2}$. This simple analysis shows that for $d > 4$, $u' \to 0$ as $T_c$ is approached, and hence the quartic term in the energy can be truly neglected, so the Ornstein-Zernike theory can be expected to hold. On the other hand, for $d < 4$ the nonlinear term diverges and this simplification is incorrect. This is in fact a restatement of the Ginzburg criterion.

This analysis is the starting point of the renormalization group treatment of the $\phi^4$ model. We expect that a “Gaussian fixed point” will exist, given by $r^* = 0$, $u^* = 0$, and $h^* = 0$, which will be correct for $d > 4$. The success of the $\epsilon$ expansion is to show that there exists a new fixed point for $d < 4$, so that the Gaussian fixed point becomes unstable in this range. This new fixed point does yield the exponents observed experimentally.
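The behavior of $u' \sim t^{(d-4)/2}$ near $T_c$ is easy to tabulate (the overall normalization $u_0 = 1$ is an arbitrary choice):

```python
# Effective quartic coupling u' ~ t^((d-4)/2) as the reduced temperature t -> 0.
u_prime = lambda t, d: t ** ((d - 4) / 2)

for t in (1e-1, 1e-2, 1e-3):
    print(f"t = {t:8.0e}   d=3: u' = {u_prime(t, 3):10.1f}   d=5: u' = {u_prime(t, 5):8.4f}")
# For d = 3 the effective coupling grows without bound as t -> 0
# (perturbation theory fails); for d = 5 it vanishes (mean field becomes exact).
```

This is just the Ginzburg criterion restated as a table of numbers.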

12.4 Momentum shell RG. Gaussian fixed point

Before proceeding to a more complete calculation, we illustrate the technique used (momentum-shell RG) in the case of the Gaussian fixed point. Consider,
$$F = \int d^d\vec x\, \left[\frac{1}{2}|\nabla\psi|^2 + \frac{r_0}{2}\psi^2\right] = \int_0^\Lambda \frac{d^d\vec k}{(2\pi)^d}\, \frac{1}{2}\left(r_0 + k^2\right)|\psi_{\vec k}|^2.$$

We now introduce the following transformation. Split the integral $\int_0^\Lambda$ into a long-wavelength component $\int_0^{\Lambda/l}$ and a short-wavelength component $\int_{\Lambda/l}^\Lambda$. With the respective integrals we define a long-wavelength field and a perturbation,
$$\psi(\vec x) = \bar\psi(x) + \delta\psi(x) = \int_0^{\Lambda/l} \frac{d^d\vec k}{(2\pi)^d}\, \psi_{\vec k}\, e^{i\vec k\cdot\vec x} + \int_{\Lambda/l}^{\Lambda} \frac{d^d\vec k}{(2\pi)^d}\, \psi_{\vec k}\, e^{i\vec k\cdot\vec x}.$$


In the example of block spins, the field $\bar\psi$ would correspond to the block spins, whereas $\delta\psi$ would describe fluctuations within blocks. Since at the Gaussian level all modes decouple, we can write the partition function as
$$Z = \int \prod_{0<k<\Lambda/l} d\bar\psi_{\vec k} \prod_{\Lambda/l<k<\Lambda} d\,\delta\psi_{\vec k}\; e^{-\int_0^{\Lambda/l} \frac{d^d\vec k}{(2\pi)^d}\, \frac{1}{2}(r_0+k^2)|\bar\psi_{\vec k}|^2}\; e^{-\int_{\Lambda/l}^{\Lambda} \frac{d^d\vec k}{(2\pi)^d}\, \frac{1}{2}(r_0+k^2)|\delta\psi_{\vec k}|^2}.$$

The integral over $\delta\psi$ is Gaussian and can be done. This is the “renormalization” step of the transformation, which removes a subset of the degrees of freedom.

The second step is rescaling the variables so that the renormalized system is of the same size as the original system. We introduce,
$$k_l = l k, \qquad \psi_l(k_l) = \frac{1}{z}\bar\psi_k.$$

Calling $Z_{\delta\psi}$ the partial partition function arising from the integration over $\delta\psi$, we find,
$$Z = Z_{\delta\psi} \int \prod_k d\psi_l(k)\, \exp\left[-\int_0^\Lambda l^{-d}\frac{d^dk_l}{(2\pi)^d}\, \frac{1}{2}\left(r_0 + \frac{k_l^2}{l^2}\right)|\psi_l(k_l)|^2 z^2\right].$$
If we now require that this partition function reduce to the same form as the original partition function, we need $l^{-d}z^2/l^2 = 1$, or $z = l^{1+d/2}$. Then
$$Z = Z_{\delta\psi} \int \prod_k d\psi_l(k)\, \exp\left[-\int_0^\Lambda \frac{d^dk_l}{(2\pi)^d}\, \frac{1}{2}\left(l^{-d}z^2 r_0 + k_l^2\right)|\psi_l(k_l)|^2\right],$$
which is the original partition function if we define
$$r_l = l^{-d}z^2 r_0, \qquad \text{i.e.,} \qquad r_l = l^2 r_0.$$

The partial summation and rescaling leave the partition function invariant if we allow a scale-dependent coefficient $r$. In effect, this is the recursion relation for this transformation.

We immediately find two Gaussian fixed points: $r^* = 0$ (the critical point) and $r^* = \infty$. Since the equation is already linear around the fixed point $r^* = 0$, we find that the critical exponent is $y_r = 2$. Therefore we would write for the free energy,
$$f^*(\delta r) = l^{-d} f^*(l^2\,\delta r).$$
By definition of $r$, $\delta r \sim t$, the reduced temperature away from the critical point, and since $l$ is a length we can choose $l = \xi$ and write,
$$f^*(t) = \xi^{-d} f^*(\xi^2 t).$$
From this scaling form we conclude that
$$\xi^2 \sim t^{-1}, \qquad \text{or} \qquad \xi \sim t^{-1/2},$$
the known mean field result.
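The same conclusion can be checked numerically by identifying $\xi$ with the scale $l^*$ at which $r_{l^*} \approx 1$ under the recursion $r_l = l^2 r_0$, and fitting the exponent from a log-log slope:

```python
import math

def xi(r0):
    """Scale l* where the running coupling r_l = l^2 r0 becomes of order one."""
    return r0 ** -0.5

# Fit the exponent nu from the log-log slope of xi(t), with t ~ r0:
t1, t2 = 1e-3, 1e-5
slope = (math.log(xi(t2)) - math.log(xi(t1))) / (math.log(t2) - math.log(t1))
print("nu =", -slope)   # ~ 0.5
```

The fitted exponent is 1/2, i.e. $\nu = 1/y_r$ with $y_r = 2$, as read off above.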

12.5 (*) ǫ expansion

It is worthwhile to consider how the coefficient of the quartic term (with its engineering dimension removed) varies as the scale $L$ is varied. Consider the “beta function”: since $u' = uL^{4-d}$ at fixed $u$,
$$-L\frac{du'}{dL} = -(4-d)L^{4-d}u = -(4-d)u'.$$


Figure 12.11: The beta function $-L\,du'/dL$ vs. $u'$ for $d > 4$, $d = 4$, and $d < 4$: for $d > 4$ the initial value of $u'$ is decreased as $L$ increases, while for $d < 4$ it is increased.

The direction of the arrows follows from
$$\frac{du'}{d\ln L} = -(d-4)u', \quad d > 4; \qquad \frac{du'}{d\ln L} = (4-d)u', \quad d < 4.$$
Note that $u' = 0$ is a stable fixed point for $d > 4$ and an unstable fixed point for $d < 4$. This sort of picture implicitly assumes $u'$ is small. The renormalization group will give us a more elaborate flow, similar to

$$-L\frac{du'}{dL} = -(4-d)u' + O(u'^2),$$

so that the flow near the fixed point determines the dimension of a quantity. Clearly
$$-L\frac{du'}{dL} = (\text{Dim. of } u)\, u'$$
and similarly
$$-L\frac{dr'}{dL} = (\text{Dim. of } r)\, r'.$$
So the anomalous dimension can be determined by the “flow” near the fixed point.

Figure 12.12: Flow of $u'$: the Gaussian fixed point $u' = 0$ is stable for $d > 4$; for $d < 4$ a new Wilson fixed point appears at $u' > 0$.

The analysis above is fine for $d > 4$, and only good for $u'$ close to zero for $d < 4$. To get some idea of the behavior below 4 dimensions, we stay close to $d = 4$ and expand in $\epsilon = 4 - d$. (See the note below.)

So consider
$$F = \int d^dx\, \left[\frac{K}{2}(\nabla\psi)^2 + \frac{r}{2}\psi^2 + \frac{u}{4}\psi^4\right];$$
we will find


exponent    first order in $\epsilon$    $\epsilon = 1$ ($d = 3$)    Exact ($d = 3$)
$\alpha$    $\epsilon/6$                 0.17                        0.125
$\beta$     $1/2 - \epsilon/6$           0.33                        0.313
$\gamma$    $1 + \epsilon/6$             1.17                        1.25
$\delta$    $3 + \epsilon$               4                           5.0
$\nu$       $1/2 + \epsilon/12$          0.58                        0.638
$\eta$      $0 + O(\epsilon^2)$          0                           0.041

(Note: Irrelevant couplings. Consider
$$\int d^dx\, \left[\frac{1}{2}(\nabla\psi)^2 + \frac{r}{2}\psi^2 + \frac{u}{4}\psi^4 + \frac{v}{6}\psi^6 + \frac{w}{8}\psi^8\right].$$
As before,
$$[\psi] = 1/L^{\frac{d}{2}-1}, \qquad [r] = 1/L^2, \qquad [u] = 1/L^{4-d}.$$
But similarly,
$$[v] = 1/L^{2(3-d)}, \qquad [w] = 1/L^{8-3d},$$
so,
$$F = \int d^dx'\, \left[\frac{1}{2}(\nabla'\psi')^2 + \left(\frac{r}{2}L^2\right)\psi'^2 + \left(\frac{u}{4}L^{4-d}\right)\psi'^4 + \left(\frac{v}{6}L^{2(3-d)}\right)\psi'^6 + \left(\frac{w}{8}L^{8-3d}\right)\psi'^8\right].$$
Hence $v$ and $w$ look quite unimportant (at least for the Gaussian fixed point) near $d = 4$, where their rescaled coefficients vanish as $L$ grows. This is called power counting. For the remainder of these notes we will simply assume that $v$ and $w$ are always irrelevant.)

We will do a scale transformation in Fourier space,
$$\psi(x) = \int_{|k|<\Lambda} \frac{d^dk}{(2\pi)^d}\, e^{i\vec k\cdot\vec x}\,\psi_k,$$
where (before we start) $\Lambda = \pi/a$, since we shall need the lattice constant to get anomalous dimensions. We will continually apply the RG to integrate out small-length-scale behavior which, in Fourier space, means integrating out large-wavenumber behavior. This is shown schematically in Fig. 12.13. Let

Figure 12.13: Momentum-shell RG in Fourier space: the domain of nonzero $\psi_k$, $|k| < \Lambda$, is reduced to $|k| < \Lambda/b$ by the RG, and a scale transformation by a factor of $b$ then restores the cutoff.


$$\psi(x) = \bar\psi(x) + \delta\psi(x),$$
where $\psi(x)$ carries all the degrees of freedom with $|k| < \Lambda$ (after several transformations), $\bar\psi(x)$ the degrees of freedom of the “slow” modes with $|k| < \Lambda/b$, and $\delta\psi(x)$ those of the “fast” modes with $\Lambda/b < |k| < \Lambda$, to be integrated out. Also, for convenience,
$$\frac{\Lambda}{b} = \Lambda - \delta\Lambda,$$
so that the scale factor is
$$b = \frac{\Lambda}{\Lambda - \delta\Lambda} \geq 1.$$

Then we explicitly have
$$\bar\psi(x) = \int_{|k|<\Lambda-\delta\Lambda} \frac{d^dk}{(2\pi)^d}\, e^{i\vec k\cdot\vec x}\,\psi_k$$
and
$$\delta\psi(x) = \int_{\Lambda-\delta\Lambda<|k|<\Lambda} \frac{d^dk}{(2\pi)^d}\, e^{i\vec k\cdot\vec x}\,\psi_k.$$

Schematically, refer to Fig. 12.14. This is called the “momentum-shell” RG.

Figure 12.14: Decomposition of $\psi(x)$ into slow modes $\bar\psi(x)$, supported on $|k| < \Lambda - \delta\Lambda$, and fast modes $\delta\psi(x)$, supported on the shell $\Lambda - \delta\Lambda < |k| < \Lambda$.

We do this in momentum

space rather than real space because it is convenient: factors of
$$\underbrace{\int d^dx\, (\nabla\psi)^2}_{\text{couples different } \psi(x)\text{'s}} = \underbrace{\int \frac{d^dk}{(2\pi)^d}\, k^2 |\psi_k|^2}_{\text{Fourier modes uncoupled}}$$
are trivial in momentum space. The size of the momentum shell is
$$\int_{\Lambda-\delta\Lambda<|k|<\Lambda} \frac{d^dk}{(2\pi)^d} = a\Lambda^{d-1}\delta\Lambda,$$
where $a$ is a number that we will later set to unity for convenience, although $a(d=3) = 1/2\pi^2$ and $a(d=2) = 1/2\pi$. Let us now integrate out the modes $\delta\psi$:

$$\underbrace{e^{-\bar F[\bar\psi]}}_{\text{scales } |k| < \Lambda-\delta\Lambda} \;\equiv\; \underbrace{\sum_{\text{states } \delta\psi}}_{\text{shell scales}}\; \underbrace{e^{-F[\bar\psi+\delta\psi]}}_{\text{scales } |k| < \Lambda}.$$
Essentially we are, very slowly, solving the partition function by integrating out the large-$k$ degrees of freedom bit by bit.


We will only do this once, and thus obtain recursion relations for how to change scales.
$$F[\bar\psi + \delta\psi] = \int d^dx\, \left[\frac{1}{2}\big(\nabla(\bar\psi+\delta\psi)\big)^2 + \frac{r}{2}(\bar\psi+\delta\psi)^2 + \frac{u}{4}(\bar\psi+\delta\psi)^4\right]$$

We will expand this out, ignoring terms odd in $\delta\psi$ (which must give zero contribution, since they would change the symmetry of $F$), constant terms (which only shift the zero of the free energy), and terms of order higher than quartic, like $\psi^6$ and $\psi^8$ (which are irrelevant near $d = 4$ by power counting, as we did above).

In that case,
$$\frac{1}{2}\big[\nabla(\bar\psi+\delta\psi)\big]^2 = \frac{1}{2}(\nabla\bar\psi)^2 + \frac{1}{2}(\nabla\delta\psi)^2 + \text{odd term},$$
$$\frac{r}{2}\big[\bar\psi+\delta\psi\big]^2 = \frac{r}{2}\bar\psi^2 + \frac{r}{2}(\delta\psi)^2 + \text{odd term},$$
$$\frac{u}{4}\big[\bar\psi+\delta\psi\big]^4 = \frac{u}{4}\bar\psi^4\left[1 + \frac{\delta\psi}{\bar\psi}\right]^4 = \frac{u}{4}\bar\psi^4\left[1 + 4\frac{\delta\psi}{\bar\psi} + 6\frac{(\delta\psi)^2}{\bar\psi^2} + 4\frac{(\delta\psi)^3}{\bar\psi^3} + \frac{(\delta\psi)^4}{\bar\psi^4}\right]$$
$$= \frac{u}{4}\bar\psi^4 + \frac{3u}{2}\bar\psi^2(\delta\psi)^2 + O\big((\delta\psi)^4\big) + \text{odd terms}.$$

We will neglect the $(\delta\psi)^4$ term, since these modes make only a small contribution to the $k = 0$ behavior. This can be checked self-consistently. Hence

$$F[\bar\psi + \delta\psi] = \underbrace{F[\bar\psi]}_{|k|<\Lambda-\delta\Lambda} + \int d^dx\, \left[\frac{1}{2}(\nabla\delta\psi)^2 + \frac{1}{2}\left(r + 3u\bar\psi^2\right)(\delta\psi)^2\right],$$
but $\delta\psi = \int_{\Lambda-\delta\Lambda<|\vec k|<\Lambda} \frac{d^dk}{(2\pi)^d}\, e^{i\vec k\cdot\vec x}\psi_k$, so
$$(\nabla\delta\psi)^2 = \Lambda^2(\delta\psi)^2$$
for a thin shell, and
$$F[\bar\psi + \delta\psi] = F[\bar\psi] + \int d^dx\, \frac{1}{2}\left(\Lambda^2 + r + 3u\bar\psi^2\right)(\delta\psi)^2,$$

and we have,
$$e^{-\bar F[\bar\psi]} = e^{-F[\bar\psi]} \sum_{\text{states } \delta\psi} e^{-\int d^dx\, \frac{\Lambda^2 + r + 3u\bar\psi^2}{2}(\delta\psi)^2},$$
where
$$\sum_{\text{states } \delta\psi} = \prod_{\text{all } \delta\psi \text{ states}} \int d\,\delta\psi(x).$$

The integrals are trivially Gaussian in $\delta\psi$, so
$$e^{-\bar F[\bar\psi]} = e^{-F[\bar\psi]} \prod_x^{\text{total no. of } \delta\psi \text{ states}} \frac{\text{const.}}{\sqrt{\Lambda^2 + r + 3u\bar\psi^2}}.$$
We know the total number of $\delta\psi$ states because the total number of Fourier modes in the shell is
$$\int_{\Lambda-\delta\Lambda<|k|<\Lambda} \frac{d^dk}{(2\pi)^d} = a\Lambda^{d-1}\delta\Lambda,$$
so
$$\prod_x^{\text{total no.}} (\ldots) \to e^{\left(\int d^dx\right)\left(a\Lambda^{d-1}\delta\Lambda\right)\ln(\ldots)}$$


and
$$e^{-\bar F[\bar\psi]} = e^{-F[\bar\psi]}\, e^{-\int d^dx\; a\Lambda^{d-1}\delta\Lambda\; \frac{1}{2}\ln\left(\Lambda^2 + r + 3u\bar\psi^2\right) + \text{const.}}$$

Now consider
$$\frac{1}{2}\ln\left[\Lambda^2 + r + 3u\bar\psi^2\right] = \frac{1}{2}\ln\left(1 + \frac{r}{\Lambda^2} + \frac{3u}{\Lambda^2}\bar\psi^2\right) + \text{const.}$$
We expand using $\ln(1+\epsilon) = \epsilon - \frac{\epsilon^2}{2} + \frac{\epsilon^3}{3} - \ldots$:
$$= \frac{1}{2}\left[\frac{r}{\Lambda^2} + \frac{3u}{\Lambda^2}\bar\psi^2 - \frac{1}{2}\left(\frac{r}{\Lambda^2} + \frac{3u}{\Lambda^2}\bar\psi^2\right)^2 + \ldots\right]$$
$$= \frac{1}{2}\left[\underbrace{\frac{r}{\Lambda^2}}_{\text{const.}} + \frac{3u}{\Lambda^2}\bar\psi^2 - \underbrace{\frac{1}{2}\left(\frac{r}{\Lambda^2}\right)^2}_{\text{const.}} - \frac{3ur}{\Lambda^4}\bar\psi^2 - \frac{9u^2}{2\Lambda^4}\bar\psi^4 + \ldots\right]$$
$$= \frac{1}{2}\left(\frac{3u}{\Lambda^2} - \frac{3ur}{\Lambda^4}\right)\bar\psi^2 + \frac{1}{4}\left(-\frac{9u^2}{\Lambda^4}\right)\bar\psi^4 + \text{const.}$$

So that
$$\bar F[\bar\psi] = F[\bar\psi] + \int d^dx\, \left[\frac{1}{2}\left(\frac{3u}{\Lambda^2} - \frac{3ur}{\Lambda^4}\right)\left(a\Lambda^{d-1}\delta\Lambda\right)\bar\psi^2 + \frac{1}{4}\left(-\frac{9u^2}{\Lambda^4}\right)\left(a\Lambda^{d-1}\delta\Lambda\right)\bar\psi^4\right],$$
so,
$$\bar F[\bar\psi] = \int d^dx\, \left[\frac{1}{2}(\nabla\bar\psi)^2 + \frac{\bar r}{2}\bar\psi^2 + \frac{\bar u}{4}\bar\psi^4\right],$$
where,
$$\bar r = r + \left(\frac{3u}{\Lambda^2} - \frac{3ur}{\Lambda^4}\right)\left(a\Lambda^{d-1}\delta\Lambda\right), \qquad \bar u = u + \left(-\frac{9u^2}{\Lambda^4}\right)\left(a\Lambda^{d-1}\delta\Lambda\right).$$

Butr = r(Λ− δΛ) etc,

sor(Λ− δΛ)− r(Λ)

δΛ= (

3u

Λ2− 3ur

Λ4)aΛd−1

andu(Λ− δΛ)− u(Λ)

δΛ= (−9u2

Λ4)aΛd−1

or,

−1

a

∂r

∂Λ= (

3u

Λ2− 3ur

Λ4)Λd−1

−1

a

∂u

∂Λ= (−9u2

Λ4)Λd−1

Set a = 1 (or let u → u/a), and set Λ = 1/L, to obtain

  −L ∂r/∂L = −3uL^{2−d} + 3urL^{4−d}
  −L ∂u/∂L = 9u²L^{4−d}

We need to include the trivial change in length due to the overall rescaling of space, with engineering dimensions [r] = 1/L², [u] = 1/L^{4−d}, by letting r′ = rL², u′ = uL^{4−d}. Note

  dr/dL = d(r′/L²)/dL = (dr′/dL − 2r′/L)(1/L²)


so

  −L dr/dL = (−L dr′/dL + 2r′)(1/L²)

and

  −L du/dL = (−L du′/dL + (4 − d)u′)(1/L^{4−d})

But

  −3uL^{2−d} = −3u′ L^{2−d}/L^{4−d} = −3u′(1/L²)

and

  3urL^{4−d} = 3u′r′ L^{4−d}/(L^{4−d}L²) = 3u′r′(1/L²)

and

  9u²L^{4−d} = 9u′² L^{4−d}/(L^{4−d})² = 9u′²(1/L^{4−d})

Hence we obtain

  −L dr′/dL = −2r′ − 3u′ + 3u′r′

and

  −L du′/dL = −(4 − d)u′ + 9u′²

These are the recursion relations, to first order. Evidently there is still the trivial Gaussian fixed point r∗ = u∗ = 0, but there is also a nontrivial "Wilson–Fisher" fixed point which is stable for d < 4!

Figure 12.15: Flow of −L du′/dL versus u′ for d > 4, d = 4, and d < 4, showing the Gaussian fixed point (stable for d > 4) and the Wilson–Fisher fixed point u∗ (stable for d < 4).

From the u′ equation (using ε = 4 − d),

  u∗ = ε/9

and from the r′ equation (using u∗ ∼ r∗ ∼ ε with ε ≪ 1), 2r∗ = −3u∗, so

  r∗ = −ε/6
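The flow toward these fixed points can be seen by integrating the u′ recursion relation directly. A minimal numerical sketch (the function name and step sizes are our choices, not from the notes): rewriting −L du′/dL = −(4 − d)u′ + 9u′² as du′/ds = (4 − d)u′ − 9u′² with s = ln L, and Euler-stepping:

```python
def flow_u(u0, d, s_max=40.0, ds=1e-3):
    """Euler-integrate du'/ds = (4 - d)u' - 9 u'^2, with s = ln L."""
    eps = 4.0 - d
    u, s = u0, 0.0
    while s < s_max:
        u += ds * (eps * u - 9.0 * u * u)
        s += ds
    return u

print(flow_u(0.01, d=3))   # flows to u* = eps/9 = 0.111..., the Wilson-Fisher value
print(flow_u(0.01, d=5))   # flows back to u* = 0, the Gaussian value
```

For d < 4 any small initial u′ is driven to the Wilson–Fisher value ε/9; for d > 4 it decays back to the Gaussian fixed point, as in Fig. 12.15.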

Linearizing around this fixed point gives the dimensions of r and u as that fixed point is approached. Let δr = r′ − r∗, δu = u′ − u∗. In the u equation,

  −L dδu/dL = −εu∗ − εδu + 9(u∗)²(1 + 2δu/u∗)
            = [18u∗ − ε]δu

so

  −L dδu/dL = εδu   (Wilson–Fisher fixed pt.)


So the dimension of u is ε. In the r equation,

  −L dδr/dL = −2r∗ − 3u∗ − 2δr + 3u∗δr

but δr ≫ δu as L → ∞, so

  −L dδr/dL = (−2 + 3u∗)δr

or,

  −L dδr/dL = −2(1 − ε/6)δr   (Wilson–Fisher fixed pt., stable for d < 4)

So the dimension of r is −2(1 − ε/6). Similarly, near the Gaussian fixed point, where u∗ = r∗ = 0, we have (as we did before)

  −L dδu/dL = −εδu
  −L dδr/dL = −2δr

giving the dimensions of r and u there. To get the physical exponents, recall

  [r] = 1/L^{1/ν}

Hence

  ν = 1/2,           Gaussian fixed pt.
  ν = ½(1 + ε/6),    Wilson–Fisher fixed pt.

To get another exponent we use

  [ψ] = 1/L^{(d−2+η)/2}

or more precisely

  [ψ] = (1/L^{(d−2+η)/2}) f(rL^{1/ν}, uL^{−y_u})

where

  y_u = −ε (Gaussian),  y_u = ε (Wilson–Fisher)

is the dimension of u. Choosing the scale

  rL^{1/ν} ≡ 1

gives

  [ψ] = r^{(ν/2)(d−2+η)} f(1, u r^{ν y_u})

For the Wilson–Fisher fixed point this is straightforward. Since ν y_u > 0, the term u r^{ν y_u} is irrelevant as r → 0. Hence

  [ψ] = r^{(ν/2)(d−2+η)} f(1, 0) ≡ r^β

since f(1, 0) is a constant. Now it turns out η = 0 + O(ε²) (which can be calculated later via scaling relations), so

  β = ½ · ½(1 + ε/6)(2 − ε)
    = ½(1 + ε/6)(1 − ε/2)


or

  β = ½(1 − ε/3)   (Wilson–Fisher fixed pt.)

The Gaussian fixed point is more tricky (although academic, since it applies for d > 4). We again obtain

  [ψ] = r^{(ν/2)(d−2+η)} f(1, u r^{ν y_u})

but now ν = 1/2, y_u = −ε = d − 4, η = 0, so

  [ψ] = r^{(d−2)/4} f(1, u r^{(d−4)/2})

This looks like β = (d − 2)/4! Not the well-known β = 1/2. The problem is that u is irrelevant, but dangerous, for T < Tc (where ψ is nonzero): without the quartic term, the free energy is ill defined. For the equation above, f(1, u → 0) is singular. Its form is easy to see. By dimensions, ψ ∼ √(r/u) below Tc, so

  f(1, u → 0) = 1/u^{1/2}

and

  [ψ]_{u→0} = r^{(d−2)/4}/(u r^{(d−4)/2})^{1/2} = r^{(d−2)/4 − (d−4)/4}/u^{1/2} = (r/u)^{1/2}

So, β = 1/2 for the Gaussian fixed point. Hence for d ≥ 4 the Gaussian fixed point is stable and α = 0, β = 1/2, δ = 3, γ = 1, η = 0, ν = 1/2 (the remainder are obtained carefully with the scaling relations).

For d < 4, the Wilson–Fisher fixed point is stable, and since α + 2β + γ = 2, γ = ν(2 − η), γ = β(δ − 1), and νd = 2 − α, we have:

                        ε = 1    Ising d = 3
  α = ε/6               0.17     0.125…
  β = ½(1 − ε/3)        0.33     0.313…
  γ = 1 + ε/6           1.17     1.25…
  δ = 3(1 + ε/3)        4        5.0…
  ν = ½(1 + ε/6)        0.58     0.638…
  η = 0 + O(ε²)         0        0.041…
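The ε = 1 column and the scaling relations quoted above can be checked mechanically. A small sketch (the helper name is our choice), using exact fractions:

```python
from fractions import Fraction as F

def exponents(eps):
    """O(eps) epsilon-expansion exponents from the table above."""
    alpha = eps / F(6)
    beta  = F(1, 2) * (1 - eps / F(3))
    gamma = 1 + eps / F(6)
    delta = 3 * (1 + eps / F(3))
    nu    = F(1, 2) * (1 + eps / F(6))
    eta   = F(0)
    return alpha, beta, gamma, delta, nu, eta

a, b, g, dl, nu, eta = exponents(F(1))      # epsilon = 1, i.e. d = 3
print(float(a), float(b), float(g), float(dl), float(nu))
print(a + 2 * b + g == 2)                   # Rushbrooke: alpha + 2 beta + gamma = 2
print(g == nu * (2 - eta))                  # gamma = nu (2 - eta)
```

These two relations happen to hold exactly for the truncated forms; the others, γ = β(δ − 1) and νd = 2 − α, hold only to O(ε), which is why the ε = 1 column is internally consistent only to first order.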


Chapter 13

Master Equation and the Monte Carlo method

The Monte Carlo method is a class of random or stochastic methods used to simulate a large variety of physical phenomena. It has wide applicability in Statistical Mechanics, as it affords an almost unique route to the approximate computation of statistical averages in complex nonlinear systems. We focus below on its most common implementation in Statistical Mechanics, and leave out applications to other fields.

We have defined a canonical average as

  ⟨A⟩ = (1/Z) Σ_ν A_ν e^{−βE_ν}.   (13.1)

We mentioned earlier that, in order to compute macroscopic averages, it is not necessary to solve for the time-dependent trajectory of the system in its phase space and then compute temporal averages of the quantities of interest. Rather, we compute averages over known probability measures, such as that given in Eq. (13.1).

We can easily devise a scheme to approximately compute the average in Eq. (13.1) by numerical means. The scheme could be as follows: given a specific physical system and the energy as a function of its microscopic variables, choose a configuration ν completely at random, and evaluate A_ν and p_ν ∼ e^{−βE_ν}. Repeat a large number of times until a good estimate of ⟨A⟩ has been obtained.

As you may imagine, the problem with this approach is that the sample space is huge (∼ e^{10²³}) for a macroscopic system. Furthermore, since E is extensive (E ∼ N ∼ 10²³), the probability of each configuration p_ν is extremely small. Therefore the procedure just outlined would require adding an extremely large number of extremely small quantities.
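To make the difficulty concrete, the naive sum in Eq. (13.1) can actually be carried out exactly for a very small lattice, where every state can be enumerated. A sketch (function name ours; J = k_B = 1, periodic boundaries, anticipating the Ising model of Eq. (13.10)):

```python
import math
import itertools

def exact_average_energy(T, L=3):
    """Brute-force <E> for an L x L periodic Ising model via Eq. (13.1)."""
    Z, Esum = 0.0, 0.0
    for spins in itertools.product((-1, 1), repeat=L * L):
        s = [spins[i * L:(i + 1) * L] for i in range(L)]
        # each bond counted once: right and down neighbor of every site
        E = -sum(s[i][j] * (s[(i + 1) % L][j] + s[i][(j + 1) % L])
                 for i in range(L) for j in range(L))
        w = math.exp(-E / T)
        Z += w
        Esum += E * w
    return Esum / Z

print(exact_average_energy(T=0.5))   # close to the ground-state energy -2N = -18
```

For L = 3 this is only 2⁹ = 512 terms; for N ∼ 10²³ the same sum over 2^N states is hopeless, which is exactly the point.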

Both difficulties could be alleviated if we had a way of sampling only the most likely states, those that contribute the most to the average. The Monte Carlo method provides a way to do precisely that: sample states not with uniform probability as outlined above, but rather with probability precisely equal to p_ν. If a sequence of N states ν′ is created, distributed according to the canonical distribution function, then

  ⟨A⟩ = (1/N) Σ_{ν′} A_{ν′}

The Monte Carlo method which we describe next is a particular way of generating configurations that are distributed according to a predefined probability distribution function. Imagine that the system of interest is initially in a state S (we use S instead of ν in the following). This state might, for example, be all the spins +1 if the system in question is the Ising model. We imagine next that we generate a process in configuration space that prescribes transitions from one state to another. We define the transition probability of going from S′ to S in a unit time as

  W(S, S′) ≡ W(S ← S′)   (13.2)



This transition is schematically shown in Fig. 13.1. Of course, one needs to keep in mind transitions out of S into any other state S′ with probability

  W(S′, S) ≡ W(S′ ← S).   (13.3)

Figure 13.1: Transitions between states S′ and S, with rates W(S, S′) and W(S′, S).

The master equation is the equation that governs the evolution of the probability distribution P(S) under the transitions just discussed. It is customary to assume that the W are transition rates per unit time, and hence that P(S) changes as a function of time. However, time can be a fictitious variable simply denoting a sequence of transitions in phase space. The equation reads

  ∂P(S, t)/∂t = Σ_{S′} [W(S, S′)P(S′) − W(S′, S)P(S)].   (13.4)

The transition rates in the master equation are assumed to depend only on the current state, but not on the prior trajectory. This defines a so-called Markov process. Note also that the master equation is not invariant under time reversal.
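As a concrete illustration of Eq. (13.4), one can evolve a two-level system (the energies, temperature, and rate choice below are ours, for illustration) with rates satisfying detailed balance, and watch P relax to the Boltzmann distribution from a far-from-equilibrium start:

```python
import math

T = 1.0
E = [0.0, 1.0]          # two levels; k_B = 1

def W(a, b):
    """Rate for b -> a, Glauber-type, chosen to satisfy detailed balance."""
    return 0.5 * (1.0 - math.tanh((E[a] - E[b]) / (2.0 * T)))

# Euler-integrate the master equation, starting entirely in level 0
P = [1.0, 0.0]
dt = 0.01
for _ in range(10000):
    dP0 = W(0, 1) * P[1] - W(1, 0) * P[0]
    P = [P[0] + dt * dP0, P[1] - dt * dP0]

Z = math.exp(-E[0] / T) + math.exp(-E[1] / T)
print(P[0], math.exp(-E[0] / T) / Z)   # both ~ 0.731: P relaxes to Boltzmann
```

Note that probability is conserved exactly at each step, since the gain term for one level is the loss term for the other.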

We now examine restrictions on the master equation such that it generates states distributed according to the canonical probability distribution. First, we want P(S) to approach a time-independent distribution, and so we require that

  ∂P_eq(S)/∂t = 0.   (13.5)

A sufficient condition for the right hand side of Eq. (13.4) to be zero is

  W(S, S′)P₀(S′) = W(S′, S)P₀(S)   (13.6)

where P₀ is the stationary distribution function. This equation is known as the detailed balance condition. We now require that P₀(S) = (1/Z)e^{−βE(S)}. Therefore,

  W(S, S′)/W(S′, S) = e^{−(E(S)−E(S′))/k_B T}   (13.7)

This is a surprisingly simple result for the transition rates W: it is sufficient (but not necessary) that they satisfy Eq. (13.7) for the process to asymptotically yield a stationary distribution of states P(S) which equals the canonical distribution. In fact, if we are only concerned with equilibrium properties, any set of W's which satisfies detailed balance will produce a sequence of states distributed according to the canonical distribution. All one needs to do is pick a convenient W.

The two most common choices of W used in numerical work are the Metropolis rule

  W(S, S′) = { e^{−ΔE/k_B T},  ΔE > 0
             { 1,              ΔE ≤ 0     (13.8)

where ΔE = E(S) − E(S′), and the Glauber rule

  W(S, S′) = ½ [1 − tanh(ΔE/2k_B T)]   (13.9)


Let us quickly verify that W for the Metropolis rule satisfies the detailed balance condition. If E(S) > E(S′),

  W(S, S′)/W(S′, S) = e^{−(E(S)−E(S′))/k_B T}/1 = e^{−ΔE/k_B T}

Now if E(S) < E(S′),

  W(S, S′)/W(S′, S) = 1/e^{−(E(S′)−E(S))/k_B T} = e^{−ΔE/k_B T}

where ΔE = E(S) − E(S′) as above. So the detailed balance condition is satisfied by this transition rule. You can check yourself that the Glauber rule also satisfies detailed balance. The form of the transition rate as a function of the energy change is shown in Figs. 13.2 and 13.3.
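Both rules are one-liners, and the detailed balance ratio of Eq. (13.7) can be verified numerically as well as algebraically. A sketch (function names ours; k_B = 1):

```python
import math

def metropolis(dE, T):
    """Metropolis rate W(S, S'), Eq. (13.8), with dE = E(S) - E(S')."""
    return math.exp(-dE / T) if dE > 0 else 1.0

def glauber(dE, T):
    """Glauber rate W(S, S'), Eq. (13.9)."""
    return 0.5 * (1.0 - math.tanh(dE / (2.0 * T)))

# Detailed balance, Eq. (13.7): W(dE)/W(-dE) must equal exp(-dE/T)
T = 1.5
for dE in (-4.0, -0.5, 0.7, 4.0):
    for W in (metropolis, glauber):
        assert abs(W(dE, T) / W(-dE, T) - math.exp(-dE / T)) < 1e-9
print("detailed balance verified for both rules")
```

For the Glauber rule the identity is (1 − tanh x)/(1 + tanh x) = e^{−2x} with x = ΔE/2k_BT, which is exactly Eq. (13.7).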

Figure 13.2: Metropolis rate W versus ΔE at low, moderate, and high T.

Figure 13.3: Glauber rate W versus ΔE at low, moderate, and high T (note W = 1/2 at ΔE = 0).

In practice, one introduces transitions from state S′ → S that take place at the rate W(S, S′). One usually wants the states S′ and S to be "close" in configuration space, so that W is not too small (close in energy for the Metropolis and Glauber rules). For example, for the Ising model, a state S is close to S′ if the two states differ by no more than one spin flip, as drawn in Fig. 13.4.

Figure 13.4: A single spin flip takes the source state S′ to the target state S, an outcome with probability W(S, S′).


Let us consider the explicit steps for the two-dimensional Ising model in zero field. The energy of any configuration is

  E_ν = −J Σ_{⟨ij⟩} S_i S_j   (13.10)

where S_i = ±1 are the spins on sites i = 1, 2, …, N, and ⟨ij⟩ means only nearest neighbors are summed over. The positive constant J is the coupling interaction. If the transitions to be explored involve only one spin, there are only five cases to consider, as shown in Fig. 13.5.

Figure 13.5: The five single-spin-flip cases; the flip probabilities are e^{−8J/k_B T}, e^{−4J/k_B T}, 1, 1, and 1 for Cases 1–5 respectively.

The resulting transition probabilities are as follows (assuming we use the Metropolis rule):


For Case 1: E_{S′} = −4J, E_S = +4J, so ΔE = 8J and W(S, S′) = Prob. of flip = e^{−8J/k_B T}.

For Case 2: E_{S′} = −2J, E_S = +2J, so ΔE = 4J and W(S, S′) = Prob. of flip = e^{−4J/k_B T}.

For Case 3: E_{S′} = 0, E_S = 0, so ΔE = 0 and W(S, S′) = Prob. of flip = 1.

For Case 4: E_{S′} = +2J, E_S = −2J, so ΔE = −4J and W(S, S′) = Prob. of flip = 1.

For Case 5: E_{S′} = +4J, E_S = −4J, so ΔE = −8J and W(S, S′) = Prob. of flip = 1.

What to do next is simple: if W = 1, the new state S is accepted with probability one. If on the other hand W = e^{−4J/k_B T}, say for example W = 0.1, the spin flips 10% of the time on average. To implement this, the standard procedure is to compare W = 0.1 to a random number uniformly distributed between 0 and 1. If the random number is less than or equal to W, then you flip the spin. With this rule, the spin has exactly a probability of 0.1 of being flipped, exactly as required. Of course, this is repeated over and over again until a sufficiently large number of configurations have been generated. Because detailed balance is satisfied, the configurations given by this algorithm will be distributed according to the canonical distribution, as desired.
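The compare-to-a-uniform-random-number step can be written down directly. A tiny sketch (the helper name is ours):

```python
import random

def accept(W, rng):
    """Return True with probability W, by the comparison described above."""
    return rng.random() <= W

rng = random.Random(1)
flips = sum(accept(0.1, rng) for _ in range(100000))
print(flips / 100000)   # close to 0.1, as required
```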

Here is an outline of a computer code. If you set the temperature of the system to be high and initially set S_i = 1 for all spins, you will see that the magnetization quickly relaxes to zero (Fig. 13.7). If the system temperature is below Tc, the magnetization quickly relaxes to its equilibrium value (see Fig. 13.8). Doing this for many temperatures, for a large enough lattice, gives the phase diagram shown in Fig. 13.9. One can also compute the running average of the energy of the system, to find the curve shown in Fig. 13.10. By taking the numerical derivative, one can estimate the heat capacity per spin

  C = (1/J) ∂⟨E⟩/∂T

Try it. Your results will be very close to Onsager's even with fairly small systems, N = 10 × 10 or 100 × 100, over a thousand or so time steps, and you'll do much better than mean field theory.

As an exercise, think about the code that you would write to simulate the discrete model of interface fluctuations:

  Z = Σ_ν e^{−E_ν/k_B T}


Initialize the system:
- How many spins? How long to average?
- How hot (give T = (…)J/k_B)?
- How much field H?
- Give all N spins some initial value (such as S_i = 1); use periodic boundary conditions.

Do time = 0 to maximum time:
    Do i = 1 to N:
    - Pick a spin at random
    - Check if it flips
    - Update the system
    (Doing this N times, each spin is flipped once on average.)
    Calculate the magnetization per spin ⟨M⟩ = (1/N)⟨Σ_i S_i⟩
    Calculate the energy per spin ⟨E/J⟩ = −(1/N)⟨Σ_{⟨ij⟩} S_i S_j⟩
    averaging over time (after throwing away some early time).

Figure 13.6: Flow of the program: initialize at time t = 0, then grind, where one time step (1 mcs) corresponds to N updates, then output.
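The outline above translates almost line by line into a short program. A minimal sketch (names, lattice size, and temperatures are our choices; k_B = J = 1, zero field, Metropolis rule, periodic boundary conditions):

```python
import math
import random

def ising_metropolis(L=10, T=1.5, steps=2000, seed=0):
    """Minimal 2D Ising Monte Carlo following the outline above.
    Returns the time-averaged |magnetization| per spin."""
    rng = random.Random(seed)
    s = [[1] * L for _ in range(L)]           # all spins up initially
    # pre-tabulate acceptance probabilities for dE = 8J and 4J (Cases 1, 2)
    w = {8: math.exp(-8.0 / T), 4: math.exp(-4.0 / T)}
    m_sum, n_meas = 0.0, 0
    for t in range(steps):
        for _ in range(L * L):                # one Monte Carlo step = N attempts
            i, j = rng.randrange(L), rng.randrange(L)
            nn = (s[(i + 1) % L][j] + s[(i - 1) % L][j] +
                  s[i][(j + 1) % L] + s[i][(j - 1) % L])
            dE = 2 * s[i][j] * nn             # energy change if s[i][j] flips
            if dE <= 0 or rng.random() <= w[dE]:
                s[i][j] = -s[i][j]
        if t >= steps // 4:                   # throw away the early transient
            m_sum += abs(sum(map(sum, s))) / (L * L)
            n_meas += 1
    return m_sum / n_meas

print(ising_metropolis(T=1.5))   # below Tc ~ 2.269: stays near 1
print(ising_metropolis(T=5.0))   # well above Tc: small
```

Run at a scan of temperatures, this reproduces the behavior of Figs. 13.7–13.9.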

Figure 13.7: M versus t for high T > Tc: the magnetization relaxes to zero. Here t is in Monte Carlo steps; one unit of time corresponds to attempting N spin flips.


Figure 13.8: M versus t for low T < Tc: after a transient, the time average gives ⟨M⟩.

Figure 13.9: The phase diagram ⟨M⟩ versus T, with M ∼ |T − Tc|^β, β = 1/8, and Tc ≈ 2.269 J/k_B.

Figure 13.10: Average energy ⟨E/J⟩ versus T.

Figure 13.11: Heat capacity C versus T, diverging as ln|T − Tc| near Tc ≈ 2.269 J/k_B.


where

  E_ν = J Σ_{i=1}^{N} |h_i − h_{i+1}|

and h_i is the interface displacement at site i.

13.1 Other ensembles

The Monte Carlo method outlined above can be generalized to other ensembles beyond the canonical distribution considered so far. For example, if we have an ensemble at fixed (N, p, T), in which both the volume and the energy of the system can fluctuate, the corresponding probability distribution is

  p_ν = (1/Z) e^{−β(E_ν + pV_ν)}.

The condition of detailed balance for this case would be

  W(ν′, ν)/W(ν, ν′) = e^{−β(E_{ν′} − E_ν) − βp(V_{ν′} − V_ν)}.

The sequence of configurations generated by the algorithm would then change not only the configuration of the system but also its overall volume. This is, of course, not quite physical in many cases, but nevertheless it will produce a sequence of states distributed according to the correct equilibrium probability distribution.
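The corresponding Metropolis-style acceptance rule is a one-line change from the canonical case. A sketch (function name ours; k_B = 1; this implements detailed balance for the simple distribution written above, leaving out the volume-move bookkeeping of a production (N, p, T) code):

```python
import math

def npt_accept_prob(dE, dV, p, T):
    """Acceptance probability min(1, exp(-(dE + p*dV)/T)) for a trial move
    that changes the energy by dE and the volume by dV."""
    x = -(dE + p * dV) / T
    return 1.0 if x >= 0 else math.exp(x)

# The forward/backward ratio reproduces the detailed balance condition above
dE, dV, p, T = 0.7, 0.3, 2.0, 1.5
ratio = npt_accept_prob(dE, dV, p, T) / npt_accept_prob(-dE, -dV, p, T)
print(abs(ratio - math.exp(-(dE + p * dV) / T)) < 1e-12)   # True
```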