Thermodynamics and Statistical Physics
B. Zheng 1
Zhejiang Institute of Modern Physics, Zhejiang University, Hangzhou 310027, P.R. China
1e-mail: [email protected]; tel: 0086-571-87952753; Fax: 0086-571-87952754
Chapter 1
Thermodynamics
Chapter 2
Classical statistical mechanics
From thermodynamics alone one does not understand everything. E.g., what is pressure, and especially what are temperature and entropy? The kinetic theory of molecules offers some understanding, but it is not sufficiently general. Thermodynamics does not look fundamental and systematic, and its laws seem isolated from each other.
2.1 Postulates
Statistical physics
• equilibrium
• non-equilibrium
Statistical mechanics here is mainly concerned with the equilibrium state.
• Calculate macroscopic parameters from microstructures and interactions
• Derive thermodynamics, e.g., define and calculate
∗ temperature
∗ internal energy
∗ entropy
and prove the laws of thermodynamics
• Statistical mechanics goes beyond thermodynamics
The system: a classical system, isolated in the sense that the energy is conserved or nearly conserved, and composed of a large number N of elements, typically
$$N \simeq 10^{23} \text{ molecules}$$
Thermodynamic limit:
$$N \to \infty, \quad V \to \infty, \quad \text{but} \quad \frac{V}{N} \to v \ (\text{a constant})$$
Here V is measured in a microscopic unit, e.g.,
$$V \simeq 10^{23} \text{ molecular volumes}$$
The thermodynamic properties of the system may be tackled at three levels.
• Fundamental level
Solve the microscopic equations of motion, such as Newton's or Heisenberg's equations.
Difficulties:
∗ too many degrees of freedom
∗ microscopic initial conditions and boundary conditions
∗ irregular disturbance from the environment
Here the time t is microscopic.
• Quasi-fundamental level
Do not trace the motion of each molecule, and consider only the probability distribution $\rho(p_i, q_i, t)$, with
$$\rho(p_i, q_i, t) \prod_i d^3p_i\, d^3q_i$$
being the number of molecules at time t lying within a volume $\prod_i d^3p_i\, d^3q_i$ around the coordinates $q_i$ and momenta $p_i$.
Equations of motion:
∗ Liouville's eq.
∗ Boltzmann eq.
These eqs. can be solved only in some simple cases, such as dilute gases.
The time t is mesoscopic.
• Statistical mechanics
Do not solve any eqs. of motion, but assume a form of $\rho(p_i, q_i, t)$ in the equilibrium state, i.e.
$$\rho(p_i, q_i, \infty) \equiv \rho(p, q)$$
∗ it can be tested by experiments
∗ it can be derived from eqs. of motion in some special cases.
More strictly, this is the so-called ensemble theory.
The time t is macroscopic.
Γ space: the phase space spanned by (p, q); each point in Γ space represents a microscopic state of the system.
An ensemble: a collection of systems, identical in composition and macroscopic conditions, but existing in different microscopic states and not interacting with each other.
A system can be represented by a point in Γ space; then
$$\rho(p, q)\, dp\, dq \equiv \rho(p_i, q_i) \prod_i dp_i\, dq_i$$
is the number of systems in the volume element dp dq. Physically, it is reasonable to do this because a macroscopic state may correspond to many microscopic states.
The ensemble is introduced to replace the dynamic evolution of the microscopic states.
Postulate of equal a priori probability
When an isolated system is in thermodynamic equilibrium, its state is equally likely to be any state satisfying the macroscopic conditions, i.e.
$$\rho(p, q) = \begin{cases} \text{const.} & \text{if } E < H(p, q) < E + \Delta \\ 0 & \text{otherwise} \end{cases}$$
Here $\Delta \ll E$ and H is the Hamiltonian. The ensemble described by this distribution is the so-called microcanonical ensemble.
• Why is Δ introduced?
∗ Theoretically, it is convenient
∗ Practically, isolation is not strict
• Why is ρ(p, q) a constant for possible states?
One may understand this from ergodicity, which may be achieved by
∗ the measure of non-ergodic states being negligibly small
∗ disturbance
∗ boundary conditions
∗ interactions
∗ ...
If there is ergodicity, it is natural that the practical path of a system should be a simple loop in Γ space.
The ensemble average of a measurable property f(p, q) is defined as
$$\langle f \rangle \equiv \frac{\int dp\, dq\, f(p, q)\, \rho(p, q)}{\int dp\, dq\, \rho(p, q)}$$
In the thermodynamic limit, it is usually assumed that
$$\frac{\langle f^2 \rangle - \langle f \rangle^2}{\langle f \rangle^2} \ll 1 \qquad (*)$$
We may define the most probable value of f(p, q) as the value $f_0$ with the maximum probability $P(f_0)$. Here we should note that P(f) is different from ρ(p, q), as shown in Fig. 2.1.
In the thermodynamic limit, the ensemble average and the most probable value should be nearly the same. Otherwise, statistical mechanics should be questioned.
Question: in what case is Eq. (*) not valid?
Answer: strongly correlated systems.
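As a rough illustration of condition (*), one may take the hypothetical case where f is a sum of N independent, identically distributed single-molecule contributions. This independence is an assumption made only for the sketch (it is exactly what fails for strongly correlated systems); the function name and the sample values are likewise illustrative.

```python
from fractions import Fraction

def relative_fluctuation(values, n_molecules):
    """Return (<f^2> - <f>^2) / <f>^2 for f = sum of n_molecules
    independent contributions, each drawn uniformly from `values`.
    Assumption of this sketch: independent, identical molecules."""
    m = len(values)
    mean_one = Fraction(sum(values), m)                    # single-molecule mean
    mean_sq_one = Fraction(sum(v * v for v in values), m)  # single-molecule <x^2>
    var_one = mean_sq_one - mean_one ** 2                  # single-molecule variance
    mean_f = n_molecules * mean_one                        # means add
    var_f = n_molecules * var_one                          # variances add (independence)
    return var_f / mean_f ** 2

for n in (10, 10**3, 10**6):
    print(n, float(relative_fluctuation([1, 2, 3], n)))
```

Under this assumption the relative fluctuation falls off as 1/N, so it is negligible for N of order $10^{23}$.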
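Figure 2.1: [plot of P against f, with the value f_0 marked on the f axis]
Figure 2.2: [plot in the (q, p) plane, with curves labelled by the energy E]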
2.2 Microcanonical ensemble
H is naturally defined as the internal energy. So the important question is how to define the entropy and the temperature.
Let us denote the volume in Γ space of the microcanonical ensemble by
$$\Gamma(E) \equiv \int_{E < H(p, q) < E + \Delta} dp\, dq$$
Γ(E) is understood to depend on N, V and also Δ. For example, it is shown in Fig. 2.2 for $H(p, q) = p^2 + q^2$.
The entropy is defined by
$$S(E, V) \equiv k \log \Gamma(E)$$
where k is a universal constant eventually shown to be Boltzmann's constant.
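For the example $H(p, q) = p^2 + q^2$, the region H < E is a disk of area πE, so Γ(E) = π(E + Δ) − πE = πΔ. The following minimal sketch checks this by counting grid cells in the shell; the grid spacing h and the function name are choices made for the illustration only.

```python
import math

def shell_volume(E, delta, h=0.005):
    """Estimate Gamma(E) = area of {E < p^2 + q^2 < E + delta}
    by counting points of a square grid with spacing h."""
    r_max = math.sqrt(E + delta)
    n = int(r_max / h) + 2          # grid half-width, in grid steps
    count = 0
    for i in range(-n, n + 1):
        p = i * h
        for j in range(-n, n + 1):
            q = j * h
            if E < p * p + q * q < E + delta:
                count += 1
    return count * h * h            # each counted point stands for one cell

gamma = shell_volume(E=1.0, delta=0.1)
print(gamma, math.pi * 0.1)         # grid estimate vs. exact annulus area pi * delta
```

For this particular Hamiltonian the shell volume does not depend on E, which the same function can confirm for another value of E.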
To justify the definition of S, we should show that
(a) S is extensive
(b) S satisfies the properties of the entropy as required by the second law of thermodynamics
Proof: (a)
Let the system be divided into two subsystems:
1: $N_1, V_1, H_1$
2: $N_2, V_2, H_2$
with
$$N = N_1 + N_2, \qquad V = V_1 + V_2$$
and
$$H(p, q) = H_1(p_1, q_1) + H_2(p_2, q_2)$$
Here it is assumed that the interaction between the two subsystems is negligible. $(p_1, q_1)$ are the coordinates of the particles in subsystem 1, and $(p_2, q_2)$ are the coordinates of the particles in subsystem 2.
This holds, e.g., if the intermolecular potential is finite-range and the surface-to-volume ratio of each subsystem is negligibly small.
If we define
$$S_1(E_1, V_1) = k \log \Gamma_1(E_1, \Delta)$$
$$S_2(E_2, V_2) = k \log \Gamma_2(E_2, \Delta)$$
and
$$S(E, V) = k \log \Gamma(E, 2\Delta)$$
we should show that, in the thermodynamic limit,
$$S(E, V) = S_1(E_1, V_1) + S_2(E_2, V_2)$$
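If $E = E_1 + E_2$ is a decomposition of the energy, the volume in Γ space of the whole system is
$$\int_{E_1 + E_2 < H_1 + H_2 < E_1 + E_2 + 2\Delta} dp_1\, dp_2\, dq_1\, dq_2 = \int_{E_1 < H_1 < E_1 + \Delta} dp_1\, dq_1 \int_{E_2 < H_2 < E_2 + \Delta} dp_2\, dq_2 = \Gamma_1(E_1)\, \Gamma_2(E_2)$$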
This means that the extensive property of S has been proven if the two subsystems are added up without interactions.
The key point is that the possible decomposition of the energy $E = E_1 + E_2$ is not unique when the system is divided into two subsystems.
Let us divide E into E/Δ intervals with $E_i = i\Delta$, $i = 1, \cdots, E/\Delta$; then
$$\Gamma(E) = \sum_{i=1}^{E/\Delta} \Gamma_1(E_i)\, \Gamma_2(E - E_i)$$
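We will show, or have to show, that only one term in the sum is dominating.
Reading materials:
Let the largest term be $\Gamma_1(\bar E_1)\Gamma_2(\bar E_2)$ with
$$\bar E_1 + \bar E_2 = E$$
then
$$\Gamma_1(\bar E_1)\Gamma_2(\bar E_2) \le \Gamma(E) \le \frac{E}{\Delta}\, \Gamma_1(\bar E_1)\Gamma_2(\bar E_2)$$
or
$$S_1(\bar E_1, V_1) + S_2(\bar E_2, V_2) \le S(E, V) \le S_1(\bar E_1, V_1) + S_2(\bar E_2, V_2) + k \log \frac{E}{\Delta}$$
In the thermodynamic limit, we expect
$$\log \Gamma_1 \propto N_1, \qquad \log \Gamma_2 \propto N_2, \qquad E \propto N_1 + N_2 \qquad (*)$$
then
$$S(E, V) = S_1(\bar E_1) + S_2(\bar E_2) + O(\log N)$$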
In summary, the extensive property of S rests on the fact that one decomposition $E = \bar E_1 + \bar E_2$ of the energy dominates when the system is divided into two subsystems with fixed $N_1, N_2$ and $V_1, V_2$. Such a dominating decomposition is expected from Eqs. (∗).
In thermodynamic equilibrium, the system is homogeneous, therefore
$$N \propto V$$
In other words, $E \propto N_1 + N_2 = N$ is simply the extensive property of E. However, $\log \Gamma_1 \propto N_1 \propto V_1$ in (∗) already indicates that $S_1$ is extensive.
Therefore, the derivation based on $E = \bar E_1 + \bar E_2$ does not sound very convincing. It tells us only that if $S_1$ and $S_2$ are extensive, then S is also extensive.
End reading materials
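The bound between the sum over energy decompositions and its largest term can be illustrated numerically. The sketch below assumes, purely for illustration, the ideal-gas-like scaling $\log \Gamma_a(E) = (3N_a/2)\log E$ with all constants dropped; the subsystem sizes and the number of intervals are arbitrary choices, and the function names are hypothetical.

```python
import math

def log_gamma_sub(E, n):
    """Assumed model: log Gamma_a(E) = (3 n / 2) log E, constants dropped."""
    return 1.5 * n * math.log(E)

def compare_sum_and_largest_term(n1, n2, E=1.0, m_intervals=1000):
    """Return (log of the sum, log of the largest term, index of largest term)
    for sum_i Gamma_1(E_i) Gamma_2(E - E_i), with E_i = i * E / m_intervals."""
    delta = E / m_intervals
    # i runs over 1 .. m_intervals - 1 so that both energies stay positive
    logs = [log_gamma_sub(i * delta, n1) + log_gamma_sub(E - i * delta, n2)
            for i in range(1, m_intervals)]
    log_max = max(logs)
    # log-sum-exp keeps the very large exponents under control
    log_sum = log_max + math.log(sum(math.exp(x - log_max) for x in logs))
    return log_sum, log_max, logs.index(log_max) + 1

log_sum, log_max, i_star = compare_sum_and_largest_term(n1=300, n2=700)
print(log_sum - log_max, math.log(1000), i_star)
```

In this model the logarithm of the sum exceeds the logarithm of the largest term by at most the logarithm of the number of terms, while both logarithms themselves grow in proportion to the particle number.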
Understanding:
Alternatively, we may think of $\bar E_1$ and $\bar E_2$ as the averaged energies of the two subsystems. In the thermodynamic limit, the fluctuation δ of $E_1$ around $\bar E_1$ should be much smaller than $\bar E_1$. Let us take δ < Δ; then only $\Gamma_1(\bar E_1)\Gamma_2(\bar E_2)$ is dominating in
$$\Gamma(E) = \sum_{i=1}^{E/\Delta} \Gamma_1(E_i)\, \Gamma_2(E - E_i)$$
Therefore
$$S(E, V) = S_1(\bar E_1, V_1) + S_2(\bar E_2, V_2)$$
with $E = \bar E_1 + \bar E_2$.
This also explains why we can add two systems up in equilibrium.
Question: why is it only "an understanding"?
Answer: $\Gamma_1(E_i)\Gamma_2(E - E_i)$ is just (up to normalization) $P(E_1 = E_i)$, the probability that $E_1$ takes the value $E_i$. When we assume δ < Δ, it already indicates that all other terms are negligible.
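Reading materials:
Why
$$\Gamma(E) = \sum_{i=1}^{E/\Delta} \Gamma_1(E_i)\, \Gamma_2(E - E_i)\ ?$$
E.g., N = 2:
$$H = \frac{1}{2m} \sum_{i=1}^{2} p_i^2 + \frac{1}{2}\omega^2 \sum_{i=1}^{2} q_i^2$$
$$H_1 = \frac{1}{2m} p_1^2 + \frac{1}{2}\omega^2 q_1^2$$
$$H_2 = \frac{1}{2m} p_2^2 + \frac{1}{2}\omega^2 q_2^2$$
$$\Gamma(E) = \int_{E < H < E + 2\Delta} dp_1\, dq_1\, dp_2\, dq_2$$
$$\Gamma_1(E_1) = \int_{E_1 < H_1 < E_1 + \Delta} dp_1\, dq_1$$
$$\Gamma_2(E - E_1) = \int_{E - E_1 < H_2 < E - E_1 + \Delta} dp_2\, dq_2$$
If $E_1$ is fixed,
$$\Gamma(E) = \Gamma_1(E_1)\, \Gamma_2(E - E_1)$$
If not,
$$\Gamma(E) = \sum_{i=1}^{E/\Delta} \Gamma_1(E_i)\, \Gamma_2(E - E_i)$$
since $E_1$ takes values from 0 to E, corresponding to different states.
End reading materials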
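Actually, this implies that the decomposition $E = \bar E_1 + \bar E_2$ maximizes the function $\Gamma_1(E_1)\Gamma_2(E_2)$ under the restriction $\delta E = \delta E_1 + \delta E_2 = 0$, i.e.
$$\delta\left(\Gamma_1(E_1)\Gamma_2(E_2)\right) = 0 \quad \text{with} \quad \delta E_1 + \delta E_2 = 0$$
This leads to
$$\left.\frac{\partial}{\partial E_1} \log \Gamma_1(E_1)\right|_{E_1 = \bar E_1} = \left.\frac{\partial}{\partial E_2} \log \Gamma_2(E_2)\right|_{E_2 = \bar E_2}$$
or
$$\left.\frac{\partial S_1(E_1)}{\partial E_1}\right|_{E_1 = \bar E_1} = \left.\frac{\partial S_2(E_2)}{\partial E_2}\right|_{E_2 = \bar E_2}$$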
We define the temperature of any system by
$$\frac{\partial S(E, V)}{\partial E} \equiv \frac{1}{T}$$
Then
$$\frac{\partial S_1}{\partial E_1} = \frac{\partial S_2}{\partial E_2}$$
is simply the zeroth law
$$T_1 = T_2$$
T defined in this way is an intensive parameter, and ∂S/∂E = 1/T is also one of the Maxwell relations in thermodynamics.
If S is correctly defined, T should also be correct.
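The statement that the most probable energy split equalizes ∂S/∂E in the two subsystems can be checked in a toy model. The sketch assumes $S_a(E) = k\,(3N_a/2)\log E$ with constants dropped and k = 1 (both assumptions of this illustration, not results of the text); the search routine and its name are also hypothetical.

```python
import math

K_B = 1.0  # Boltzmann's constant in arbitrary units (assumption of the sketch)

def entropy(E, n):
    """Assumed model entropy S_a(E) = k * (3 n / 2) * log E, constants dropped."""
    return K_B * 1.5 * n * math.log(E)

def most_probable_split(E, n1, n2, iterations=200):
    """Ternary search for the E1 maximizing S1(E1) + S2(E - E1) (concave in E1)."""
    lo, hi = 1e-9 * E, (1 - 1e-9) * E
    for _ in range(iterations):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if entropy(a, n1) + entropy(E - a, n2) < entropy(b, n1) + entropy(E - b, n2):
            lo = a
        else:
            hi = b
    return 0.5 * (lo + hi)

def inverse_temperature(E, n, h=1e-6):
    """1/T = dS/dE, estimated by a central finite difference."""
    return (entropy(E + h, n) - entropy(E - h, n)) / (2 * h)

E1 = most_probable_split(E=100.0, n1=30, n2=70)
print(E1, inverse_temperature(E1, 30), inverse_temperature(100.0 - E1, 70))
```

In this model the maximum sits where the energy per particle is the same in both subsystems, and the two values of 1/T printed at that point agree.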
2.3 Thermodynamics
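Let us define
$$\Sigma(E) = \int_{H(p, q) < E} dp\, dq$$
$$\omega(E) = \frac{\partial \Sigma(E)}{\partial E}$$
Then
$$\Gamma(E) = \omega(E)\, \Delta$$
$$\Gamma(E) = \Sigma(E + \Delta) - \Sigma(E)$$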
It can be proved that, up to an additive constant of order O(log N), the following definitions are equivalent:
$$S = k \log \Gamma(E)$$
$$S = k \log \omega(E)$$
$$S = k \log \Sigma(E)$$
Why? (A question to think about.)
Keeping in mind that the energy does not fluctuate much, it is obvious that
$$\Sigma(E) = \Sigma_1(\bar E_1)\, \Sigma_2(\bar E_2).$$
(b) With the definition
$$S(E, V) = k \log \Sigma(E)$$
it is easy to show that S never decreases, i.e., the second law of thermodynamics for a thermally isolated system.
Proof: For the system considered, the parameters are N, E, V. By the definition of an isolated system, N and E cannot change, and V cannot decrease. Therefore, the second law here is simply the statement that S is a non-decreasing function of V. This is obvious, for Σ(E) is a non-decreasing function of V by its definition.
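S is really the entropy.
Assume that the system is changed slowly by coupling it to an external environment. Then it is a quasi-static process:
$$dS(E, V) = \left(\frac{\partial S}{\partial E}\right)_V dE + \left(\frac{\partial S}{\partial V}\right)_E dV = \frac{1}{T}\, dE + \left(\frac{\partial S}{\partial V}\right)_E dV$$
Define the pressure of the system to be
$$P \equiv T \left(\frac{\partial S}{\partial V}\right)_E$$
then
$$dS = \frac{1}{T}\,(dE + P\, dV)$$
or
$$dE = T\, dS - P\, dV$$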
This looks like the first law.
Question: Is it reasonable to define
$$P \equiv T \left(\frac{\partial S}{\partial V}\right)_E\ ?$$
Hint:
• P is intensive
• Exercise: prove
$$P = -\left(\frac{\partial E}{\partial V}\right)_S$$
If we accept the definition of P, then S should be the entropy by the first law.
In other words, the first law should also be an assumption in statistical mechanics.
2.4 Equipartition Theorem
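Reading materials.
$$H = H(p, q) = H(p_j, q_j), \quad j = 1, \cdots, 3N$$
e.g.
$$H = \frac{1}{2m} \sum_i p_i^2$$
or
$$H = \frac{1}{2m} \sum_i p_i^2 + \frac{1}{2}\omega^2 \sum_i q_i^2$$
Let $x_i$ be either $p_i$ or $q_i$. Then
$$\left\langle x_i \frac{\partial H}{\partial x_j} \right\rangle = \frac{1}{\Gamma(E)} \int_{E < H < E + \Delta} dp\, dq\; x_i \frac{\partial H}{\partial x_j}$$
$$= \frac{\Delta}{\Gamma(E)}\, \frac{1}{\Delta} \left( \int_{H < E + \Delta} dp\, dq - \int_{H < E} dp\, dq \right) x_i \frac{\partial H}{\partial x_j}$$
$$= \frac{\Delta}{\Gamma(E)}\, \frac{\partial}{\partial E} \int_{H < E} dp\, dq\; x_i \frac{\partial H}{\partial x_j}$$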
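Since E is a constant inside the integral,
$$\int_{H < E} dp\, dq\; x_i \frac{\partial H}{\partial x_j} = \int_{H < E} dp\, dq\; x_i \frac{\partial}{\partial x_j}(H - E)$$
$$= \int_{H < E} dp\, dq\; \frac{\partial}{\partial x_j}\left[x_i (H - E)\right] - \delta_{ij} \int_{H < E} dp\, dq\, (H - E)$$
The first term reduces to an integral over the surface H = E and vanishes:
$$\text{the first term} = \int_{H = E} dp\, dq\; x_i (H - E) = 0$$
Using Γ(E) = ω(E)Δ,
$$\therefore \left\langle x_i \frac{\partial H}{\partial x_j} \right\rangle = \frac{\delta_{ij}}{\omega(E)}\, \frac{\partial}{\partial E} \int_{H < E} dp\, dq\, (E - H)$$
$$= \frac{\delta_{ij}}{\omega(E)} \left( \int_{H < E} dp\, dq + \frac{1}{\Delta} \int_{E < H < E + \Delta} dp\, dq\, (E - H) \right)$$
(the second term is negligible)
$$= \frac{\delta_{ij}}{\omega(E)}\, \Sigma(E) = \delta_{ij}\, \frac{1}{\frac{\partial}{\partial E} \log \Sigma(E)} = \delta_{ij}\, \frac{k}{\partial S / \partial E} = \delta_{ij}\, kT$$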
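If i = j,
$$\left\langle x_i \frac{\partial H}{\partial x_i} \right\rangle = kT$$
If
$$H = \sum_i A_i P_i^2 + \sum_i B_i Q_i^2$$
where $P_i, Q_i$ are canonically conjugate variables, then
$$\sum_i \left( P_i \frac{\partial H}{\partial P_i} + Q_i \frac{\partial H}{\partial Q_i} \right) = 2H$$
If f of the constants $A_i$ and $B_i$ are non-zero,
$$\langle H \rangle = \frac{1}{2} f kT$$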
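A numerical sanity check (not a proof) of the result for a single oscillator $H = p^2/2m + \omega^2 q^2/2$ may be helpful. Here Σ(E) is the area of an ellipse, $2\pi E \sqrt{m}/\omega$, so Σ(E)/(∂Σ/∂E) = E plays the role of kT, and the shell average of $p\,\partial H/\partial p = p^2/m$ should approach E as Δ → 0. The grid spacing, the parameter values m = ω = 1 and the function name are assumptions of this sketch.

```python
def shell_average_p_dH_dp(E, delta, m=1.0, omega=1.0, h=0.004):
    """Microcanonical average of p * dH/dp = p^2 / m over the shell
    E < H < E + delta, for H = p^2/(2m) + omega^2 q^2 / 2,
    estimated on a square grid of spacing h in the (q, p) plane."""
    p_max = (2.0 * m * (E + delta)) ** 0.5
    q_max = (2.0 * (E + delta)) ** 0.5 / omega
    n_p, n_q = int(p_max / h) + 2, int(q_max / h) + 2
    total, count = 0.0, 0
    for i in range(-n_p, n_p + 1):
        p = i * h
        kinetic = p * p / (2.0 * m)
        for j in range(-n_q, n_q + 1):
            q = j * h
            H = kinetic + 0.5 * omega * omega * q * q
            if E < H < E + delta:
                total += p * p / m      # accumulate p * dH/dp inside the shell
                count += 1
    return total / count

avg = shell_average_p_dH_dp(E=1.0, delta=0.05)
print(avg)   # close to E + delta/2 for a shell of finite thickness
```

The same loop with $\omega^2 q^2$ in place of $p^2/m$ gives the average of $q\,\partial H/\partial q$.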
Exercise: Prove the equipartition theorem for
$$H = \frac{1}{2m} p^2 + \frac{1}{2}\omega^2 q^2$$
by explicitly calculating the ensemble average.
2.5 Classical ideal gas
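$$H = \frac{1}{2m} \sum_{i=1}^{N} p_i^2$$
$$\Sigma(E) = \int_{H < E} d^3p_1 \cdots d^3p_N\, d^3q_1 \cdots d^3q_N = V^N \int_{H < E} d^3p_1 \cdots d^3p_N$$
Let
$$R = \sqrt{2mE}$$
then
$$\Sigma(E) = V^N\, \Omega_{3N}(R)$$
where $\Omega_n(R)$ is the volume of an n-sphere of radius R,
$$\Omega_n(R) = C_n R^n, \qquad C_n = \frac{\pi^{n/2}}{\Gamma(n/2 + 1)}$$
and Γ(z) is the gamma function. For large n,
$$\log C_n \xrightarrow[n \to \infty]{} \frac{n}{2} \log \pi - \frac{n}{2} \log \frac{n}{2} + \frac{n}{2}$$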
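hence
$$\Sigma(E) = C_{3N} \left[ V (2mE)^{3/2} \right]^N$$
$$S(E, V) = k \log \Sigma(E) \simeq Nk \log \left[ V \left( \frac{4\pi m E}{3N} \right)^{3/2} \right] + \frac{3}{2} Nk$$
$$U(S, V) \equiv E = \frac{3N}{4\pi m V^{2/3}} \exp\left( \frac{2S}{3Nk} - 1 \right)$$
$$T = \left( \frac{\partial U}{\partial S} \right)_V = \frac{2U}{3Nk}, \qquad U = \frac{3}{2} NkT$$
$$C_V = \frac{3}{2} Nk$$
$$P = -\left( \frac{\partial U}{\partial V} \right)_S = \frac{NkT}{V}$$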
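The large-N entropy formula can be compared with the exact $\log \Sigma(E)$. The sketch below takes $C_n$ to be the standard unit-ball volume coefficient $\pi^{n/2}/\Gamma(n/2 + 1)$ and evaluates it through `math.lgamma`; the numerical values of N, V, E and the units m = k = 1 are arbitrary assumptions of the illustration.

```python
import math

def log_sigma_exact(N, V, E, m=1.0):
    """log Sigma(E) = N log V + log C_3N + (3N/2) log(2 m E),
    with C_n = pi^(n/2) / Gamma(n/2 + 1) evaluated through lgamma."""
    n = 3 * N
    log_c_n = 0.5 * n * math.log(math.pi) - math.lgamma(0.5 * n + 1.0)
    return N * math.log(V) + log_c_n + 0.5 * n * math.log(2.0 * m * E)

def entropy_large_n(N, V, E, m=1.0, k=1.0):
    """Large-N form: S = N k log[V (4 pi m E / 3N)^(3/2)] + (3/2) N k."""
    return N * k * math.log(V * (4.0 * math.pi * m * E / (3.0 * N)) ** 1.5) + 1.5 * N * k

N, V, E = 10_000, 2.0e4, 1.5e4
s_exact = log_sigma_exact(N, V, E)        # entropy in units of k
s_approx = entropy_large_n(N, V, E)
# 1/T = dS/dE by central difference, to compare with U = (3/2) N k T
dE = 1e-3 * E
inv_T = (entropy_large_n(N, V, E + dE) - entropy_large_n(N, V, E - dE)) / (2 * dE)
print((s_exact - s_approx) / N, 1.0 / inv_T, 2.0 * E / (3.0 * N))
```

The difference per particle between the two entropies is of order (log N)/N, and the finite-difference temperature reproduces kT = 2E/3N.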
2.6 Maxwell-Boltzmann distribution
N identical molecules, volume V,
$$H(p, q) = E$$
The system may be described by a microcanonical ensemble. For quasi-independent systems:
$$H = \sum_i H_i(p_i, q_i)$$
where $H_i(p_i, q_i)$ represents the Hamiltonian of each molecule. There could be a number of atoms in a molecule, but the interactions between two molecules are negligible. E.g., the simplest case is
$$H = \frac{1}{2m} \sum_i p_i^2$$
µ space: the phase space spanned by the single-molecule coordinates (p, q).
A microscopic state of a single molecule can be represented by a point in the µ space. A microscopic state of the system is described by a set of such points.
Since the energy of a molecule is bounded by E, the points are confined to a finite region of µ space. We divide the region into K elements of volume $\omega = d^3p\, d^3q$, and denote the number of molecules in element l by $n_l$; then
$$\sum_{l=1}^{K} n_l = N$$
$$\sum_{l=1}^{K} \varepsilon_l n_l = E.$$
Here the molecules are assumed to be quasi-independent, and $\varepsilon_l$ is the energy level of a single particle.
It is important to note that a microscopic state of the system may be described by a set $\{n_l\}$, but a set $\{n_l\}$ corresponds to more than one microscopic state; e.g., the interchange of two molecules leads to new states. That is, a given set $\{n_l\}$ corresponds to a volume in Γ space, which is called the volume occupied by $\{n_l\}$.
To describe a macroscopic state, we need to average over the microcanonical ensemble, i.e., over all possible microscopic states. In other words, we should obtain $\langle n_l \rangle$ and then calculate all macroscopic quantities.
However, it is difficult to perform this average.
We assume that the equilibrium state is described by the most probable distribution $\{n_l\}$, which occupies the largest volume in Γ space.
Why?
• This is somewhat similar to the case when we calculate $\Gamma(E) \simeq \Gamma_1(\bar E_1)\Gamma_2(\bar E_2)$ by dividing the system into two subsystems.
• For a fixed l, if the relative fluctuation
$$\frac{\langle n_l^2 \rangle - \langle n_l \rangle^2}{\langle n_l \rangle^2}$$
is still not small enough, we increase the total number N to reduce it. Finally, $\langle n_l \rangle$ should be equal to the most probable $n_l$.
The procedure:
(a) Calculate the volume of $\{n_l\}$
(b) Maximize it to obtain the most probable $\{n_l\}$.
The volume of $\{n_l\}$:
$$\Omega\{n_l\} \propto \frac{N!}{\prod_{l=1}^{K} n_l!} \prod_{k=1}^{K} g_k^{n_k}$$
where $g_k$ is introduced for convenience and will finally be set to 1.
Understanding:
There are N! ways of distributing N distinguishable molecules over N positions. However, the N positions form K groups according to the distribution $\{n_l\}$. Inside a group, there are $n_l!$ ways of distributing the $n_l$ molecules, and these must be divided out.
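The counting factor $N!/\prod_l n_l!$ can be verified by brute force for a small case. The sketch below enumerates every assignment of labelled molecules to cells and counts those with the required occupation numbers; the occupation (2, 2, 1) and the function names are illustrative choices.

```python
from itertools import product
from math import factorial

def count_assignments(occupation):
    """Brute force: count the ways to place N distinguishable molecules
    into K cells so that cell l holds occupation[l] molecules."""
    n_molecules, n_cells = sum(occupation), len(occupation)
    target = list(occupation)
    count = 0
    # each tuple assigns a cell index to every labelled molecule
    for cells in product(range(n_cells), repeat=n_molecules):
        if [cells.count(l) for l in range(n_cells)] == target:
            count += 1
    return count

def multinomial(occupation):
    """N! / prod_l n_l! for the same occupation numbers."""
    result = factorial(sum(occupation))
    for n_l in occupation:
        result //= factorial(n_l)
    return result

print(count_assignments((2, 2, 1)), multinomial((2, 2, 1)))
```

Both numbers agree (30 for this occupation), which is the factor multiplying $\prod_k g_k^{n_k}$ in the volume of $\{n_l\}$.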
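For a large $n_l$,
$$\log n_l! \simeq n_l (\log n_l - 1) \quad (\text{i.e., } n_l! \simeq (n_l/e)^{n_l})$$
$$\log \Omega\{n_l\} = N(\log N - 1) - \sum_{l=1}^{K} n_l (\log n_l - 1) + \sum_{l=1}^{K} n_l \log g_l + \text{const}$$
Now we vary $n_l$ under the conditions $\sum_{l=1}^{K} n_l = N$ and $\sum_{l=1}^{K} \varepsilon_l n_l = E$, to find the most probable $\{n_l\}$.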
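We introduce the Lagrange multipliers α and β, and calculate
$$\delta\left[\log \Omega\{n_l\}\right] - \delta\left( \alpha \sum_{l=1}^{K} n_l + \beta \sum_{l=1}^{K} \varepsilon_l n_l \right) = 0$$
Now we can consider all $n_l$ as independent of each other:
$$\sum_{l=1}^{K} \left[ -\log n_l + \log g_l - \alpha - \beta \varepsilon_l \right] \delta n_l = 0$$
Setting $g_l = 1$,
$$\therefore \log n_l = -\alpha - \beta \varepsilon_l$$
$$n_l = e^{-\alpha - \beta \varepsilon_l}$$
Finally, α and β can be determined by the conservation of the total number of particles and of the total energy.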
To prove that $n_l$ maximizes $\Omega\{n_l\}$, we can simply calculate the second variation
$$-\sum_{l=1}^{K} \frac{1}{n_l} (\delta n_l)^2 < 0$$
Note that
∗ $n_l$ is a function of $\varepsilon_l$ only, and does not otherwise depend on (p, q). This is important in the equilibrium state.
∗ The molecules tend to gather in the lower energy states.
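How α and β follow from the two conservation conditions can be sketched numerically. The example assumes a hypothetical set of five equally spaced levels with $g_l = 1$ and solves for β by bisection, using the fact that the mean energy per molecule decreases with β; the level values, N, E and the search interval are choices made for the illustration.

```python
import math

def boltzmann_occupations(levels, N, E, beta_lo=-50.0, beta_hi=50.0):
    """Find alpha, beta with sum_l n_l = N and sum_l eps_l n_l = E for
    n_l = exp(-alpha - beta * eps_l), using bisection on beta.
    `levels` is a hypothetical list of single-molecule energies (g_l = 1)."""
    def mean_energy(beta):
        weights = [math.exp(-beta * eps) for eps in levels]
        return sum(e * w for e, w in zip(levels, weights)) / sum(weights)

    target = E / N
    for _ in range(200):                      # mean_energy decreases with beta
        mid = 0.5 * (beta_lo + beta_hi)
        if mean_energy(mid) > target:
            beta_lo = mid                     # energy too high: beta must grow
        else:
            beta_hi = mid
    beta = 0.5 * (beta_lo + beta_hi)
    norm = sum(math.exp(-beta * eps) for eps in levels)
    alpha = math.log(norm / N)                # from N = exp(-alpha) * norm
    return alpha, beta, [math.exp(-alpha - beta * eps) for eps in levels]

levels = [0.0, 1.0, 2.0, 3.0, 4.0]
alpha, beta, n = boltzmann_occupations(levels, N=1000, E=1200.0)
print(beta, sum(n), sum(e * x for e, x in zip(levels, n)))
```

With the energy per molecule below the plain average of the levels, β comes out positive and the occupations decrease with increasing $\varepsilon_l$.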
2.7 Boltzmann statistical theory
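$$N = \sum_l n_l = \sum_l e^{-\alpha - \beta \varepsilon_l}$$
$$E = \sum_l \varepsilon_l n_l = \sum_l \varepsilon_l e^{-\alpha - \beta \varepsilon_l}$$
Let us assume
$$\varepsilon_l = \varepsilon(p, q, y)$$
where $y = \{y_k\}$ represents macroscopic external parameters.
Define the partition function of a single particle
$$Z(\beta, y) = \sum_l e^{-\beta \varepsilon_l} = \int_{\varepsilon \le E} dp\, dq\, e^{-\beta \varepsilon(p, q, y)}$$
then
$$N = e^{-\alpha} Z(\beta, y) \quad \text{or} \quad \alpha = \log \frac{Z(\beta, y)}{N}$$
$$E = -N \frac{\partial \log Z(\beta, y)}{\partial \beta}$$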
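The relation $E = -N\, \partial \log Z / \partial \beta$ can be checked against the direct sum $\sum_l \varepsilon_l n_l$. The sketch uses a hypothetical set of discrete levels and a central finite difference for the derivative; the level values, β and N are illustrative assumptions.

```python
import math

def log_Z(beta, levels):
    """log of the single-particle partition function Z = sum_l exp(-beta * eps_l)."""
    return math.log(sum(math.exp(-beta * eps) for eps in levels))

def energy_from_log_Z(beta, levels, N, h=1e-5):
    """E = -N d(log Z)/d(beta), derivative taken by central difference."""
    return -N * (log_Z(beta + h, levels) - log_Z(beta - h, levels)) / (2 * h)

def energy_direct(beta, levels, N):
    """E = sum_l eps_l n_l with n_l = (N / Z) exp(-beta * eps_l)."""
    Z = math.exp(log_Z(beta, levels))
    return sum(eps * N / Z * math.exp(-beta * eps) for eps in levels)

levels = [0.0, 1.0, 2.0, 3.0, 4.0]   # hypothetical energy levels
print(energy_from_log_Z(0.7, levels, 1000), energy_direct(0.7, levels, 1000))
```

The two values agree up to the finite-difference error.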
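Suppose the system is changed very slowly. Then
$$dE = \sum_l n_l\, d\varepsilon_l + \sum_l \varepsilon_l\, dn_l$$
The first term represents the interaction with the external environment:
$$\sum_l n_l\, d\varepsilon_l = \sum_{l,k} n_l \frac{\partial \varepsilon_l}{\partial y_k}\, dy_k = \sum_k \left( \sum_l n_l \frac{\partial \varepsilon_l}{\partial y_k} \right) dy_k$$
Question: why not differentiate with respect to p and q?
Answer: p and q are integration variables.
Here
$$Y_k \equiv \sum_l n_l \frac{\partial \varepsilon_l}{\partial y_k} = -\frac{N}{\beta} \frac{\partial \log Z(\beta, y)}{\partial y_k}$$
are the forces acting on the system from the environment.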
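Proof: since
$$e^{-\alpha} = \frac{N}{Z(\beta, y)}$$
therefore
$$-\frac{N}{\beta} \frac{\partial \log Z(\beta, y)}{\partial y_k} = -\frac{N}{\beta} \frac{1}{Z(\beta, y)} \frac{\partial Z(\beta, y)}{\partial y_k} = -\frac{e^{-\alpha}}{\beta} \frac{\partial}{\partial y_k} \sum_l e^{-\beta \varepsilon_l} = \sum_l e^{-\alpha - \beta \varepsilon_l} \frac{\partial \varepsilon_l}{\partial y_k}$$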
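For example, if $y_k = -V$, then
$$P = +Y = \frac{N}{\beta} \frac{\partial \log Z(\beta, V)}{\partial V}$$
If
$$d\varepsilon_l = 0$$
then
$$dQ = \sum_l \varepsilon_l\, dn_l$$
Therefore, in general
$$dQ = \sum_l \varepsilon_l\, dn_l = dE - \sum_k Y_k\, dy_k = -N\, d\left( \frac{\partial \log Z(\beta, y)}{\partial \beta} \right) + \frac{N}{\beta} \sum_k \frac{\partial \log Z(\beta, y)}{\partial y_k}\, dy_k$$
Since
$$d \log Z(\beta, y) = \frac{\partial \log Z(\beta, y)}{\partial \beta}\, d\beta + \sum_k \frac{\partial \log Z(\beta, y)}{\partial y_k}\, dy_k$$
hence
$$dQ = \frac{N}{\beta}\, d\left( \log Z(\beta, y) - \beta \frac{\partial \log Z(\beta, y)}{\partial \beta} \right)$$
Define
$$\beta = \frac{1}{kT}$$
$$dS = Nk\, d\left( \log Z(\beta, y) - \beta \frac{\partial \log Z(\beta, y)}{\partial \beta} \right)$$
i.e.,
$$S = Nk \left( \log Z(\beta, y) - \beta \frac{\partial \log Z(\beta, y)}{\partial \beta} \right) + \text{const}$$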
Then dQ = T dS, i.e., assuming the first law, we may prove the second law.
Exercise: Perform the calculations for an ideal gas with
$$H = \frac{1}{2m} \sum_i p_i^2$$
Hint: Since E ∝ N, we neglect the bound E in calculating
$$Z(\beta, y) = \sum_l e^{-\beta \varepsilon_l}$$
since $\varepsilon_l \ll E$.
Question: What is the relation between the Boltzmann theory and the microcanonical theory?
Answer:
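$$\log \Omega\{n_l\} = N \log N - \sum_l n_l \log n_l$$
$$= N \log N + \sum_l (\alpha + \beta \varepsilon_l)\, e^{-\alpha - \beta \varepsilon_l}$$
$$= N \log N + \alpha \sum_l e^{-\alpha - \beta \varepsilon_l} + \beta \sum_l \varepsilon_l e^{-\alpha - \beta \varepsilon_l}$$
$$= N \log N + N \log \frac{Z(\beta, y)}{N} - N \beta \frac{\partial \log Z(\beta, y)}{\partial \beta}$$
$$\therefore S = Nk \left( \log Z(\beta, y) - \beta \frac{\partial \log Z(\beta, y)}{\partial \beta} \right)$$
Fix y:
$$dS = Nk \frac{\partial \log Z(\beta, y)}{\partial \beta}\, d\beta - Nk \frac{\partial \log Z(\beta, y)}{\partial \beta}\, d\beta + k\beta\, dE$$
$$\therefore \frac{1}{T} = \left( \frac{\partial S}{\partial E} \right)_y = k\beta$$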
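The identity between $k \log \Omega\{n_l\}$ (in its Stirling form, with $g_l = 1$) and $Nk(\log Z - \beta\, \partial \log Z / \partial \beta)$ can be confirmed numerically for the most probable distribution. The levels, β and N below are hypothetical values chosen for the sketch, and k is set to 1.

```python
import math

def entropy_two_ways(levels, beta, N, k=1.0):
    """Return (k * log Omega, N k (log Z - beta dlogZ/dbeta)) for the
    most probable distribution n_l = (N / Z) exp(-beta * eps_l),
    using log Omega = N log N - sum_l n_l log n_l (Stirling, g_l = 1)."""
    boltz = [math.exp(-beta * eps) for eps in levels]
    Z = sum(boltz)
    n = [N * w / Z for w in boltz]
    log_omega = N * math.log(N) - sum(x * math.log(x) for x in n)
    # d(log Z)/d(beta) = -(sum_l eps_l exp(-beta eps_l)) / Z, taken analytically
    dlogZ_dbeta = -sum(eps * w for eps, w in zip(levels, boltz)) / Z
    s_partition = N * k * (math.log(Z) - beta * dlogZ_dbeta)
    return k * log_omega, s_partition

print(entropy_two_ways([0.0, 1.0, 2.0, 3.0, 4.0], beta=0.7, N=1000))
```

The two numbers coincide to rounding error, in line with the chain of equalities above.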
2.8 Summary
Ergodicity
→ the probability distribution of a microcanonical ensemble
A special case: the most probable distribution of quasi-independent particles.
→ T, S, P and thermodynamics
The drawback is that the calculation of observables is clumsy because of the restriction on the energy.
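Exercises:
Assuming
$$P \equiv T \left( \frac{\partial S}{\partial V} \right)_E, \qquad \left( \frac{\partial S}{\partial E} \right)_V = \frac{1}{T},$$
prove
$$P = -\left( \frac{\partial E}{\partial V} \right)_S.$$
Prove the equipartition theorem for
$$H = \frac{1}{2m} p^2 + \frac{1}{2}\omega^2 q^2$$
by explicitly calculating the ensemble average.
Perform the calculations of the Boltzmann statistical theory for an ideal gas with
$$H = \frac{1}{2m} \sum_i p_i^2$$
Exercises 6.1 and 6.3 in the textbook.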