
The Fundamentals of Density Functional Theory

(revised and extended version)

H. Eschrig

Institute for Solid State and Materials Research Dresden
and
University of Technology Dresden


Preface

Density functional methods form the basis of a diversified and very active area of present-day computational atomic, molecular, solid state and even nuclear physics. A large number of computational physicists use these methods merely as a recipe, without reflecting much upon their logical basis. One also observes, despite their tremendous success, a certain reservation in their acceptance on the part of the more theoretically oriented researchers in the above-mentioned fields. On the other hand, in the seventies (Thomas-Fermi theory) and in the eighties (Hohenberg-Kohn theory), density functional concepts became subjects of mathematical physics.

In 1994 a number of activities took place to celebrate the thirtieth anniversary of Hohenberg-Kohn-Sham theory. I took this as an occasion to give lectures on density functional theory to senior students and postgraduates in the winter term of 1994, focusing particularly on the logical basis of the theory. While preparing these lectures, the impression grew that, although there is a wealth of monographs and reviews in the literature devoted to density functional theory, the focus is nearly always placed upon extending the practical applications of the theory and on the development of improved approximations. The logical foundation of the theory is found somewhat scattered in the existing literature, and is not always satisfactorily presented. This situation led to the idea of preparing a printed version of the lecture notes, which resulted in the present text.

It is my intention to provide a thorough introduction to the theoretical basis of density functional methods in a form which is both rigorous and concise. It is aimed not only at those who are already or are going to be active in the field, but also at those who just want to get a deeper insight into the meaning of the results of practical calculations, and last but not least it is meant to provide the interested mathematician with the physicist's view on the logical roots of the theory. High value is put on the self-containment of the text, so that it should be accessible to anybody who has been through the standard course in quantum mechanics.


The shorter Part II of these notes deals with the relativistic theory. This part of the theory is logically less rigorous, due to the unsolved basic problems in quantum field theory, but it is nevertheless physically very important because heavier atoms and many problems of magnetism need a relativistic treatment.

Finally, as is always the case, many things had to be omitted. I decided in particular to omit everything which seemed to me not yet to have its final form or at least to be in some stabilized shape. This does not exclude that important points are missing, just because I was unaware of them.

The author's views on the subject have been sharpened by discussions with many colleagues, but particularly during the course of scientific cooperation with M. Richter and P. M. Oppeneer on the applied side of the story.

Dresden, July 1996 Helmut Eschrig

In the present updated and extended version, errors and misprints in the original text have been corrected. Throughout, an additional effort has been made to improve overall clarity. This includes in particular major changes in Chapter 4. Here, the non-collinear spin case, presently a very important issue, was also added as Section 4.8. Chapter 6 and Section 9.2, however, have been completely rewritten, since in the opinion of the author the material gains much in systematics and clarity if, against tradition, one starts with the dependence on the particle number and afterwards treats the density-potential interrelation. Although not yet completely developed, the presently most common versions of LDA+U have been included as Section 7.5. The last section of the first edition, on the other hand, has been omitted as it is no longer relevant: with the drastic increase of computer data storage capacities, radial basis functions are now stored numerically, and the direct projection onto the electron sector of the Dirac-Fock space is no longer a problem.

Dresden, August 2003 Helmut Eschrig


Contents

Introduction

Part I: NON-RELATIVISTIC THEORY

1 Many-Body Systems
1.1 The Schrödinger Representation, N Fixed
1.2 The Momentum Representation, N Fixed
1.3 The Heisenberg Representation, N Fixed
1.4 Hartree-Fock Theory
1.5 The Occupation Number Representation, N Varying
1.6 Field Quantization

2 Density Matrices and Density Operators
2.1 Single-Particle Density Matrices
2.2 Two-Particle Density Matrices
2.3 Density Operators
2.4 Expectation Values and Density Matrices
2.5 The Exchange and Correlation Hole
2.6 The Adiabatic Principle
2.7 Coulomb Systems

3 Thomas-Fermi Theory
3.1 The Thomas-Fermi Functional and Thomas-Fermi Equation
3.2 The Thomas-Fermi Atom
3.3 The Thomas-Fermi Screening Length
3.4 Scaling Rules
3.5 Correction Terms

4 Hohenberg-Kohn Theory
4.1 The Basic Theorem by Hohenberg and Kohn
4.2 The Kohn-Sham Equation
4.3 The Link to the Hartree-Fock-Slater Approximation
4.4 Constrained Search Density Functionals
4.5 Ensemble State Density Functionals
4.6 Dependence on Particle Number N
4.7 Spin Polarization
4.8 Non-Collinear Spin Configurations

5 Legendre Transformation
5.1 Elementary Introduction
5.2 Prelude on Topology
5.3 Prelude on Lebesgue Integral
5.4 Banach Space
5.5 Dual Space
5.6 Conjugate Functionals
5.7 The Functional Derivative
5.8 Lagrange Multipliers

6 Density Functional Theory by Lieb
6.1 The Ground State Energy
6.2 The Hohenberg-Kohn Variational Principle
6.3 The Functionals F, G, and H
6.4 The Kohn-Sham Equation

7 Approximative Variants
7.1 The Homogeneous Electron Liquid
7.2 The Local Density Approximation
7.3 Generations of Kohn-Sham Type Equations
7.4 The Self-Interaction Correction
7.5 The LDA+U Approach

Part II: RELATIVISTIC THEORY

8 A Brief Introduction to Quantum Electrodynamics
8.1 Classical Electrodynamics
8.2 Lorentz Covariance
8.3 Lagrange Formalism
8.4 Relativistic Kinematics
8.5 Relativistic Mechanics
8.6 The Principles of Relativistic Quantum Theory
8.7 The Dirac Field

9 Current Density Functional Theory
9.1 QED Ground State in a Static External Field
9.2 Current Density Functionals and Kohn-Sham-Dirac Equation
9.3 The Gordon Decomposition and Spin Density
9.4 Approximative Variants

Bibliography

Index


Introduction

Density functional theory provides a powerful tool for computations of the quantum state of atoms, molecules and solids, and of ab-initio molecular dynamics. It was conceived in its initial naïve and approximative version by Thomas and Fermi immediately after the foundation of quantum mechanics, in 1927. In the middle of the sixties, Hohenberg, Kohn and Sham on the one hand established a logically rigorous density functional theory of the quantum ground state on the basis of quantum mechanics, and on the other hand, guided by this construction, introduced an approximative explicit theory called the local-density approximation, which for computations of the quantum ground state of many-particle systems proved to be superior to both Thomas-Fermi and Hartree-Fock theories. From that time on, density functional theory has grown vastly in popularity, and a flood of computational work in molecular and solid state physics has been the result. Motivated by its success, there has always been a tendency to widen the fields of application of density functional theory, and in these developments some points which were left somewhat obscure in the basic theory were brought into focus from time to time. This led in the early eighties to a deepening of the logical basis, essentially by Levy and Lieb, and finally Lieb gave the basic theory a form of final mathematical rigor. Since that treatment, however, is based on the tools of modern convex functional analysis, its implications only gradually became known to the many people who apply density functional theory. A thorough treatment of the dependence on particle number on the basis of Lieb's theory is given for the first time in the present text.

While quite a number of high-quality and up-to-date surveys and monographs on the variants and applications of density functional theory exist, the aim of these lecture notes is to provide a careful introduction to the safe basis of this type of theory, both for beginners and for those working in the field who want to deepen their understanding of its logical basis. To this end, references are given not only to further introductory texts, surveys and some key original papers regarding the physics, but also to literature treating the underlying mathematics. Only Chapter 5 of this text provides a more systematic and self-explanatory introduction to a piece of modern mathematics appropriate in this context, namely convex (non-linear) functional analysis, and in particular duality theory. This chapter was included for those physicists who either are for one reason or another not willing to dive into comprehensive treatments of modern mathematics or who want some guidance into those parts of mathematics necessary for a deeper understanding of the logic of density functional theory. Even the material of this chapter is presented on an introductory level: proofs are only sketched, and only if they are essentially enlightening as regards the logical interrelations.

The material has been divided into two parts. Part I contains the non-relativistic theory, and, because this part has to a larger extent achieved some final logical state, it contains a discussion of the central ideas and constructions of density functional theory, including some relevant mathematical aspects. The shorter Part II is devoted to the relativistic extensions, which have not yet reached the same level of rigor, mainly because realistic quantum field theory is in a much less explicit state than quantum mechanics.

Chapters 1, 2 and at least part of 3 are elementary and provide a physical introduction for students who have been through the standard first course in quantum mechanics. Those who are already to some extent familiar with many-body quantum theory may immediately start with Chapter 4 and use the first chapters only for reference (for instance to explain the notation used). Care has been taken to present the material of Part II in such a way that it should also be readable for somebody who is not very familiar with quantum field theory, yet without going smoothly around the reefs. Some remarks scattered throughout the text address the more physics-minded mathematician.

Notation is systematic throughout the text. Bold math ($\mathbf r$) denotes a usually three-dimensional vector, $z^*$ is the complex conjugate to the complex number $z$, $\hat A$ is the operator corresponding to the quantum observable $A$ (for brevity the same symbol is used to denote an operator in different representation spaces), $\hat A^\dagger$ is the Hermitian conjugate to $\hat A$, $\det\|A_{ik}\|$ means the determinant of the matrix $\|A_{ik}\|$, while $A_{ik}$ or $\langle i|\hat A|k\rangle$ is the matrix element. In cross-references, (2.1) refers to Formula(e) 1 in Chapter 2. (In some cases the formula number refers to a couple of lines of mathematical expressions not interrupted by text lines; the formula number mostly appears on or below the last of those lines.) Citations are given as the author(s) and year in square brackets. In case the reader is confused by some notion he has no definite idea of, it can most probably be found in the index at the end of the book with a reference to places in the text where the notion is explained or put into context.

Density functional theory can be built up in several versions: (i) as a theory with particle densities (summed over spin variables) and spin-independent external potentials only, irrespective of whether the quantum state is spin-polarized or not; (ii) as a theory with spin-up and spin-down densities and external potentials which possibly act differently on spin-up and spin-down particles, for collinear polarization situations with one global spin quantization direction (generalization to more than two eigenvalues of the $z$-component of the spin is straightforward); and finally, (iii) as a general theory with (spatially diagonal) spin-density matrices and general doubly indexed spin-dependent potentials. All three versions are treated in the present text; the two former cases are considered in parallel throughout by consistently using a combined variable $x = (\mathbf r, s)$ of spatial position $\mathbf r$ and $z$-component of spin $s$. Most expressions given refer immediately to the spin-independent cases if $x$ is read as synonymous with $\mathbf r$. The reason for presenting the material in this way is that practically all formulae thereby refer simultaneously to both cases, whereas the most general case of spin-density matrices does not lead to essentially new aspects in many questions addressed in large parts of the text.

These are lecture notes; hence no attempt is made to give a complete reference list of the existing literature, nor are questions of priority addressed. The cited literature was selected exclusively (and subjectively, of course) to facilitate access to the subject, and to give advice for delving deeper into related subjects. Nevertheless, key original papers which have introduced a new approach or a new theoretical tool are cited, since the author is convinced that students in particular will benefit from reading the key original papers.


Part I:

NON-RELATIVISTIC THEORY


1 Many-Body Systems

The dynamics of a quantum system is governed by the Hamiltonian $\hat H$. If $|\Psi\rangle$ is a quantum state of the system in an abstract Hilbert space representation, its time evolution is given by

$$ -\frac{\hbar}{i}\,\frac{\partial}{\partial t}\,|\Psi\rangle = \hat H\,|\Psi\rangle. \tag{1.1} $$

Stationary states with definite energies, particularly the ground state of the system, are obtained as solutions of the eigenvalue equation

$$ \hat H\,|\Psi\rangle = |\Psi\rangle\,E, \qquad \langle\Psi|\Psi\rangle = 1. \tag{1.2} $$

Alternatively they are obtained as stationary solutions of the variational problem

$$ \frac{\langle\Psi|\hat H|\Psi\rangle}{\langle\Psi|\Psi\rangle} \;\Rightarrow\; \text{stationary}. \tag{1.3} $$

The variation (with respect to $|\Psi\rangle$) of the numerator on the left-hand side, with the denominator kept fixed equal to unity, leads immediately to (1.2), the energy $E$ thereby appearing as a Lagrange multiplier corresponding to the latter constraint.
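As a finite-dimensional illustration of how (1.3) reproduces (1.2), the following sketch minimizes the quotient for a real symmetric 2×2 matrix standing in for $\hat H$ and compares the result with the lowest eigenvalue. The matrix entries and all function names are assumptions chosen for this example only; nothing here is taken from the text.

```python
import math

# Hypothetical 2x2 real symmetric "Hamiltonian" (illustrative numbers only).
H = [[1.0, 0.5],
     [0.5, -1.0]]

def rayleigh_quotient(h, psi):
    """Return <psi|H|psi> / <psi|psi> for a real two-component vector psi."""
    h_psi = [sum(h[i][j] * psi[j] for j in range(2)) for i in range(2)]
    numerator = sum(psi[i] * h_psi[i] for i in range(2))
    denominator = sum(c * c for c in psi)
    return numerator / denominator

def eigenvalues_2x2(h):
    """Closed-form eigenvalues of a real symmetric 2x2 matrix, ascending."""
    mean = 0.5 * (h[0][0] + h[1][1])
    half_gap = math.hypot(0.5 * (h[0][0] - h[1][1]), h[0][1])
    return mean - half_gap, mean + half_gap

def scan_minimum(h, steps=20000):
    """Minimize the quotient over trial vectors (cos t, sin t), 0 <= t < pi."""
    return min(
        rayleigh_quotient(h, (math.cos(math.pi * n / steps),
                              math.sin(math.pi * n / steps)))
        for n in range(steps)
    )

e_ground, e_excited = eigenvalues_2x2(H)
e_scan = scan_minimum(H)
print(e_ground, e_scan)
```

The scanned minimum approaches the lowest eigenvalue from above, which is the finite-dimensional counterpart of the statement that the ground state is a stationary (here: minimal) solution of (1.3).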

We are interested in systems of $N$ identical particles (electrons, say) moving in a given external field and interacting with each other with pair forces. The Hamiltonian for this case consists of the kinetic energy operator $\hat T$, the potential operator $\hat U$ of the interaction of the particles with the external field, and the two-particle interaction operator $\hat W$:

$$ \hat H = \hat T + \hat U + \hat W. \tag{1.4} $$

The case $\hat W = 0$ of particles which do not interact with each other, i.e.

$$ \hat H_0 = \hat T + \hat U, \tag{1.5} $$

is often considered as a reference system.

In the following we introduce, one after another, the most important representations of the Hilbert space of quantum states in actual use. (For details see standard textbooks of quantum theory as, e.g., [Sakurai, 1985, Dawydow, 1987, Dirac, 1958, von Neumann, 1955].)


1.1 The Schrödinger Representation, N Fixed

This spatial representation formally uses the eigenstates of the coordinates $\mathbf r$ (and possibly of the spin projection $s$ with respect to some given quantization axis $z$) of the particles

$$ \hat{\mathbf r}\,|\mathbf r\rangle = |\mathbf r\rangle\,\mathbf r, \qquad \hat\sigma\,|s\rangle = |s\rangle\,s \tag{1.6} $$

as basis vectors in the Hilbert space of one-particle quantum states. Here, $\hat\sigma$ is the $z$-component of the spin operator. The subscript $z$ will generally be omitted in order not to overload the notation. A combined variable

$$ x \stackrel{\mathrm{def}}{=} (\mathbf r, s), \qquad \int dx \stackrel{\mathrm{def}}{=} \sum_s \int d^3r \tag{1.7} $$

will be used throughout for both position and spin of a particle. We will generally have in mind $N$-electron systems; large parts of what follows apply, however, to the general case of identical particles with or without spin (or some other internal degree of freedom). In expressions applying to both spinless particles and particles with spin, $x$ and $\mathbf r$ may be considered synonyms in the spinless case.

The $N$-particle quantum state is now represented by a (spinor-)wavefunction

$$ \Psi(x_1 \dots x_N) = \langle x_1 \dots x_N|\Psi\rangle = \langle \mathbf r_1 s_1 \dots \mathbf r_N s_N|\Psi\rangle. \tag{1.8} $$

The spin variable $s_i$ runs over a finite number of values only ($2S + 1$ values for spin-$S$ particles). For one spin-half particle, e.g., the spinor part of the wavefunction (for fixed $\mathbf r$),

$$ \chi(s) = \langle s|\chi\rangle = \begin{pmatrix} \chi_+ \\ \chi_- \end{pmatrix}, \tag{1.9} $$

consists of two complex numbers $\chi_+$ and $\chi_-$, forming the components of a spinor. This latter statement means that a certain linear transformation of those two components is linked to every spatial rotation of the $\mathbf r$-space (see, e.g., [Landau and Lifshitz, 1977], or any textbook on quantum mechanics). The operator $\hat\sigma \equiv \hat\sigma_z$ is now represented by a $2 \times 2$ matrix. For later use we also give the operators for the $x$- and $y$-components of the spin in this representation:

$$ \hat\sigma_z = \frac12 \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad \hat\sigma_x = \frac12 \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \hat\sigma_y = \frac12 \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}. \tag{1.10} $$
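The matrices (1.10) can be checked against the standard angular momentum algebra. The sketch below is an illustration added here, not part of the original text: it verifies the commutation relation $[\hat\sigma_x, \hat\sigma_y] = i\hat\sigma_z$ and its cyclic permutations (in units of $\hbar$), and that the sum of squares equals $S(S+1) = 3/4$ times the unit matrix. The helper names are hypothetical.

```python
# Spin-1/2 matrices of (1.10), in units of hbar, as nested lists.
SZ = [[0.5, 0.0], [0.0, -0.5]]
SX = [[0.0, 0.5], [0.5, 0.0]]
SY = [[0.0, -0.5j], [0.5j, 0.0]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def scaled(c, a):
    """Scalar multiple c * a."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    """Element-wise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))

# Sum of squares of the three components.
squares = [matmul(m, m) for m in (SX, SY, SZ)]
S_SQUARED = [[sum(sq[i][j] for sq in squares) for j in range(2)]
             for i in range(2)]

print(close(commutator(SX, SY), scaled(1j, SZ)))
```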


The eigenstates of $\hat\sigma_z$,

$$ \chi_+(s) = \langle s|\chi_+\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \chi_-(s) = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \tag{1.11} $$

form a complete set for the $s$-dependence at a given space-point $\mathbf r$:

$$ \langle\chi_+|\chi\rangle = \sum_s \langle\chi_+|s\rangle\langle s|\chi\rangle = (1\;\;0) \begin{pmatrix} \chi_+ \\ \chi_- \end{pmatrix} = \chi_+, \qquad \langle\chi_-|\chi\rangle = \chi_-. \tag{1.12} $$

The full wavefunction of a spin-half particle is given by

$$ \phi(x) = \phi(\mathbf r s) = \begin{pmatrix} \phi_+(\mathbf r) \\ \phi_-(\mathbf r) \end{pmatrix}. \tag{1.13} $$

It is called a spin-orbital.

For fermions (half-integer spin), only wavefunctions which are antisymmetric with respect to particle exchange are admissible:

$$ \Psi(x_1 \dots x_i \dots x_k \dots x_N) = -\Psi(x_1 \dots x_k \dots x_i \dots x_N). \tag{1.14} $$

In the non-interacting case ($\hat H_0$), Slater determinants

$$ \Phi_L(x_1 \dots x_N) = \frac{1}{\sqrt{N!}} \det\|\phi_{l_i}(x_k)\| \tag{1.15} $$

of single-particle wavefunctions $\phi_{l_i}(x_k)$ (spin-orbitals) are appropriate. The subscript $L$ denotes an orbital configuration $L \stackrel{\mathrm{def}}{=} (l_1 \dots l_N)$. The determinant (1.15) can be non-zero only if the orbitals $\phi_{l_i}$ are linearly independent; it is normalized if the orbitals are orthonormal. Furthermore, if the $\phi_{l_i}$ may be written as $\phi_{l_i} = \phi'_{l_i} + \phi''_{l_i}$, where $\langle\phi'_{l_i}|\phi'_{l_k}\rangle \sim \delta_{ik}$, and the $\phi''_{l_i}$ are linearly dependent on the $\phi'_{l_k}$, $k \neq i$, then the value of the determinant depends on the mutually orthogonal parts $\phi'_{l_i}$ only. These statements, which follow from simple determinant rules, comprise Pauli's exclusion principle for fermions. For a given fixed complete set of spin-orbitals $\phi_k(x)$, i.e. for a set with the property

$$ \sum_k \phi_k(x)\,\phi_k^*(x') = \delta(x - x'), \tag{1.16} $$

the Slater determinants for all possible orbital configurations span the antisymmetric sector of the $N$-particle Hilbert space. In particular, the general state (1.14) may be expanded according to

$$ \Psi(x_1 \dots x_N) = \sum_L C_L\,\Phi_L(x_1 \dots x_N) \tag{1.17} $$

('configuration interaction').
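The determinant rules invoked for (1.14) and (1.15) can be made concrete with a small numerical sketch. It evaluates a three-particle Slater determinant by expanding it into a signed sum over permutations and checks that it changes sign under particle exchange, vanishes when two particles share the same coordinate, and vanishes for a linearly dependent set of orbitals. The one-dimensional, spinless "orbitals" $1, x, x^2$ and all function names are assumptions made for illustration; they are not normalized.

```python
import math
from itertools import permutations

def permutation_sign(perm):
    """Sign (+1 or -1) of a permutation, obtained by counting inversions."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign

def slater_determinant(orbitals, xs):
    """Evaluate (1.15) at the configuration xs as a signed sum of products."""
    n = len(orbitals)
    total = 0.0
    for perm in permutations(range(n)):
        term = permutation_sign(perm)
        for k in range(n):
            term *= orbitals[perm[k]](xs[k])
        total += term
    return total / math.sqrt(math.factorial(n))

# Hypothetical linearly independent orbitals.
orbitals = [lambda x: 1.0, lambda x: x, lambda x: x * x]
# Hypothetical dependent set: the third orbital combines the first two.
dependent = [lambda x: 1.0, lambda x: x, lambda x: 2.0 - 3.0 * x]

psi = slater_determinant(orbitals, (0.1, 0.7, -0.4))
psi_swapped = slater_determinant(orbitals, (0.7, 0.1, -0.4))    # x1 <-> x2
psi_coincident = slater_determinant(orbitals, (0.1, 0.1, -0.4)) # x1 = x2
psi_dependent = slater_determinant(dependent, (0.1, 0.7, -0.4))
print(psi, psi_swapped, psi_coincident, psi_dependent)
```

The explicit sum over permutations scales as $N!$ and is used here only because it mirrors the structure of the formula; for larger $N$ one would evaluate the determinant by elimination instead.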


Bosonic (integer, in particular zero spin) wavefunctions must be symmetric with respect to particle exchange:

$$ \Psi(x_1 \dots x_i \dots x_k \dots x_N) = \Psi(x_1 \dots x_k \dots x_i \dots x_N). \tag{1.18} $$

The corresponding symmetric sector of the $N$-particle Hilbert space is formed by the product states

$$ \Phi_L(x_1 \dots x_N) = \mathcal N \sum_{\mathcal P} \prod_i \phi_{l_{\mathcal P i}}(x_i), \tag{1.19} $$

where $\mathcal N$ is a normalization factor, and $\mathcal P$ means a permutation of the subscripts $1\,2 \dots N$ into $\mathcal P1\,\mathcal P2 \dots \mathcal PN$. The subscripts $l_i$, $i = 1 \dots N$, need not be different from each other in this case. In particular, all $\phi_{l_i}$ might be equal to each other (Bose condensation).

For both fermionic and bosonic systems, the probability density of a given configuration $(x_1 \dots x_N)$,

$$ p(x_1 \dots x_N) = \Psi^*(x_1 \dots x_N)\,\Psi(x_1 \dots x_N), \tag{1.20} $$

is independent of particle exchange.

The Hamiltonian acting on Schrödinger wavefunctions is now explicitly given as

$$ \hat H = -\frac{\hbar^2}{2m} \sum_{i=1}^{N} \nabla_i^2 + \sum_{i=1}^{N} v(x_i) + \frac{\lambda}{2} \sum_{i \neq j}^{N} w(|\mathbf r_i - \mathbf r_j|), \tag{1.21} $$

where $v(x_i)$ is the potential of the external field acting on the particle with position and spin $x_i$. For later considerations we allow for a spin-dependent field, which may be visualized as an external magnetic field $B(\mathbf r)$ in $z$-direction acting according to a potential term $-2sB(\mathbf r)$ on the spin only, while its effect on the orbital motion is neglected. Note that this is not even the most general spin dependence of the external field, because we let $v$ depend on the same $z$-component of the spin operator (1.10) everywhere in $\mathbf r$-space. (The most general spin-dependent external field would consist of four spatial functions, indexed by two spinor indices $(ss')$ of a general (spatially local) spin-half operator $v_{ss'}(\mathbf r)$; this is used in density functional treatments of ground states with non-collinear spin structure, spiral spin structures for example. The restriction to a fixed $z$-direction of the external spin-coupled field allows for a treatment completely in parallel to the spin-independent case and is therefore used throughout in the basic text. It applies to many cases of magnetic structure.) A coupling constant $\lambda$ of the pair interaction was introduced, which is equal to $e^2$ in the case of the Coulomb interaction $w(|\mathbf r_i - \mathbf r_j|) = |\mathbf r_i - \mathbf r_j|^{-1}$ of particles with charge $\pm e$.

Natural units will be used throughout by putting

$$ \hbar = m = \lambda = 1. \tag{1.22} $$

This means for electrons that energies are given in units of Hartree and lengths in units of the Bohr radius $a_{\mathrm{Bohr}}$,

$$ 1\,\mathrm{Hartree} = 2\,\mathrm{Rydberg} = 27.212\,\mathrm{eV}, \qquad 1\,a_{\mathrm{Bohr}} = 0.52918 \cdot 10^{-10}\,\mathrm{m}. \tag{1.23} $$
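For converting results obtained in natural units back to conventional units, a few helper functions suffice. The sketch below uses the rounded factors quoted in (1.23); the function and constant names are hypothetical, and the hydrogen ground-state energy of $-1/2$ Hartree (non-relativistic, infinitely heavy nucleus) is used only as a familiar test value.

```python
# Conversion factors as quoted in (1.23); rounded values for illustration.
HARTREE_IN_EV = 27.212
RYDBERG_IN_EV = HARTREE_IN_EV / 2.0
BOHR_IN_ANGSTROM = 0.52918      # 1 a_Bohr = 0.52918e-10 m

def hartree_to_ev(energy_hartree):
    """Convert an energy from Hartree to electron volts."""
    return energy_hartree * HARTREE_IN_EV

def hartree_to_rydberg(energy_hartree):
    """Convert an energy from Hartree to Rydberg."""
    return 2.0 * energy_hartree

def bohr_to_angstrom(length_bohr):
    """Convert a length from Bohr radii to Angstrom."""
    return length_bohr * BOHR_IN_ANGSTROM

# Hydrogen ground state in natural units: E = -1/2 Hartree.
hydrogen_ground_state_ev = hartree_to_ev(-0.5)
print(hydrogen_ground_state_ev, hartree_to_rydberg(-0.5))
```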

The Hamiltonian (1.21) reduces in these units to

$$ \hat H = -\frac12 \sum_{i=1}^{N} \nabla_i^2 + \sum_{i=1}^{N} v(x_i) + \frac12 \sum_{i \neq j}^{N} w(|\mathbf r_i - \mathbf r_j|). \tag{1.24} $$

The formal connection with (1.1–1.4) is expressed as $\hat H\,\Psi(x_1 \dots x_N) = \langle x_1 \dots x_N|\hat H|\Psi\rangle$, where for brevity the same symbol $\hat H$ was used for the Schrödinger operator (1.24) on the left-hand side and for the abstract Hilbert space operator on the right-hand side.

1.2 The Momentum Representation, N Fixed

The momentum or plane wave representation formally uses the eigenstates of the particle momentum operator

$$ \hat{\mathbf p}\,|\mathbf k\rangle = |\mathbf k\rangle\,\hbar\mathbf k \tag{1.25} $$

instead of the position vector eigenstates from (1.6) as basis vectors in the Hilbert space of one-particle quantum states.

In Schrödinger representation, the particle momentum operator is given by

$$ \hat{\mathbf p} = \frac{\hbar}{i}\,\nabla, \tag{1.26} $$

and the momentum eigenstate (1.25) is represented by a wavefunction

$$ \phi_{\mathbf k}(\mathbf r) = \langle\mathbf r|\mathbf k\rangle = \frac{1}{\sqrt V}\,e^{i\mathbf k\cdot\mathbf r}. \tag{1.27} $$

$V$ is the total volume or the normalization volume.

In order to avoid formal mathematical problems with little physical relevance, the infinite position space $\mathbb R^3$ of the particles is to be replaced by a large torus $T^3$ of volume $V = L^3$ defined by

$$ x + L \equiv x, \qquad y + L \equiv y, \qquad z + L \equiv z, \tag{1.28} $$

where $(x, y, z)$ are the components of the position vector (periodic or Born-von Kármán boundary conditions). The meaning of (1.28) is that any function of $x, y, z$ must fulfill the periodicity conditions $f(x + L) = f(x)$, and so on. As is immediately seen from (1.27), this restricts the spectrum of eigenvalues $\mathbf k$ of (1.25) to the values

$$ \mathbf k = \frac{2\pi}{L}\,(n_x, n_y, n_z) \tag{1.29} $$

with integers $n_x, n_y, n_z$. The $\mathbf k$-values (1.29) form a simple cubic mesh in the wavenumber space ($\mathbf k$-space) with a $\mathbf k$-space density of states (number of $\mathbf k$-vectors (1.29) within a unit volume of $\mathbf k$-space)

$$ D(\mathbf k) = \frac{V}{(2\pi)^3}, \quad\text{i.e.}\quad \sum_{\mathbf k} \;\to\; \frac{V}{(2\pi)^3} \int d^3k. \tag{1.30} $$
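The replacement of the sum over mesh points by an integral in (1.30) can be checked numerically. The following sketch counts the mesh vectors (1.29) inside a sphere of radius $k_{\max}$ and compares the count with $V/(2\pi)^3$ times the volume of that sphere. The box length, the cutoff, and the function names are hypothetical values chosen for illustration.

```python
import math

def count_k_points(box_length, k_max):
    """Count mesh vectors k = (2*pi/L)*(nx, ny, nz) with |k| <= k_max."""
    radius = k_max * box_length / (2.0 * math.pi)   # cutoff in units of 2*pi/L
    n_max = int(radius)
    limit = radius * radius
    count = 0
    for nx in range(-n_max, n_max + 1):
        for ny in range(-n_max, n_max + 1):
            for nz in range(-n_max, n_max + 1):
                if nx * nx + ny * ny + nz * nz <= limit:
                    count += 1
    return count

def continuum_estimate(box_length, k_max):
    """V/(2*pi)^3 times the volume of the k-sphere, following (1.30)."""
    volume = box_length ** 3
    sphere = 4.0 / 3.0 * math.pi * k_max ** 3
    return volume / (2.0 * math.pi) ** 3 * sphere

L_BOX = 40.0   # hypothetical box length (natural units)
K_MAX = 3.0    # hypothetical cutoff wavenumber

exact = count_k_points(L_BOX, K_MAX)
estimate = continuum_estimate(L_BOX, K_MAX)
relative_error = abs(exact - estimate) / estimate
print(exact, estimate, relative_error)
```

The discrepancy stems from mesh points near the surface of the sphere, so the relative error decreases as $L$ grows at fixed $k_{\max}$.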

Again we introduce a combined variable

$$ q \stackrel{\mathrm{def}}{=} (\mathbf k, s), \qquad \sum_q \stackrel{\mathrm{def}}{=} \sum_s \sum_{\mathbf k} = \frac{V}{(2\pi)^3} \sum_s \int d^3k \tag{1.31} $$

for both momentum and spin of a particle.

In analogy to (1.8), the $N$-particle quantum state is now represented by a (spinor-)wavefunction in momentum space

$$ \Psi(q_1 \dots q_N) = \langle q_1 \dots q_N|\Psi\rangle = \langle \mathbf k_1 s_1 \dots \mathbf k_N s_N|\Psi\rangle \tag{1.32} $$

expressing the probability amplitude of a particle momentum (and possibly spin) configuration $(q_1 \dots q_N)$ in complete analogy to (1.20). Everything that was said in the preceding section between (1.6) and (1.20) transfers accordingly to the present representation. In particular, in the case of fermions, $\Psi$ of (1.32) is totally antisymmetric with respect to permutations of the $q_i$, and it is totally symmetric in the case of bosons. Its spin dependence is in complete analogy to that of the Schrödinger wavefunction (1.8).


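Equation (1.2) reads in momentum representation

$$ \sum_{(q'_1 \dots q'_N)} \langle q_1 \dots q_N|\hat H|q'_1 \dots q'_N\rangle \langle q'_1 \dots q'_N|\Psi\rangle = \langle q_1 \dots q_N|\Psi\rangle\,E, \tag{1.33} $$

where summation runs over the physically distinguished states only. This may be achieved by introducing some linear order in the discrete set $\{\mathbf k s\}$ with $\mathbf k$'s from (1.29) and only considering ascending sequences $(q_1 \dots q_N)$ and $(q'_1 \dots q'_N)$, having in mind the (anti-)symmetry of (1.32) with respect to particle exchange.

With this rule, the Hamiltonian (1.24) for a fermionic system is represented by (see footnote 1)

$$
\begin{aligned}
\langle q_1 \dots q_N|\hat H|q'_1 \dots q'_N\rangle
 ={}& \frac12 \sum_i k_i^2 \prod_j \delta_{q_j q'_j} \\
 &+ \sum_i \delta_{s_{\bar\imath} s'_i}\, v^{s_{\bar\imath}}_{\mathbf k_{\bar\imath} - \mathbf k'_i}\, (-1)^{\mathcal P} \prod_{j(\neq i)} \delta_{q_{\bar\jmath} q'_j} \\
 &+ \frac12 \sum_{i \neq j} \delta_{\mathbf k_{\bar\imath} + \mathbf k_{\bar\jmath},\, \mathbf k'_i + \mathbf k'_j}
 \Bigl[ \delta_{s_{\bar\imath} s'_i} \delta_{s_{\bar\jmath} s'_j}\, w_{|\mathbf k_{\bar\imath} - \mathbf k'_i|}
 - \delta_{s_{\bar\imath} s'_j} \delta_{s_{\bar\jmath} s'_i}\, w_{|\mathbf k_{\bar\imath} - \mathbf k'_j|} \Bigr]
 (-1)^{\mathcal P} \prod_{k(\neq i,j)} \delta_{q_{\bar k} q'_k},
\end{aligned}
\tag{1.34}
$$

where $i = \mathcal P\bar\imath$, and $\mathcal P$ is a permutation of the subscripts which puts $\bar\imath$ in the position $i$ (and puts $\bar\jmath$ in the position $j$ in the last sum) and leaves the order of the remaining subscripts unchanged. There is always at most one permutation $\mathcal P$ (up to an irrelevant interchange of $\bar\imath$ and $\bar\jmath$) for which the product of Kronecker $\delta$'s can be non-zero. For the diagonal matrix element, that is $q'_i = q_i$ for all $i$, $\mathcal P$ is the identity and can be omitted.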

The Fourier transforms of $v(x)$ and $w(r)$ are given by

$$ v^s_{\mathbf k} = \frac{1}{V} \int d^3r\; v(\mathbf r, s)\, e^{-i\mathbf k\cdot\mathbf r} \tag{1.35} $$

and

$$ w_{|\mathbf k|} = \frac{1}{V} \int d^3r\; w(r)\, e^{-i\mathbf k\cdot\mathbf r}, \tag{1.36} $$

where the latter integral may readily be further simplified with spherical coordinates.

Footnote 1: The reader not quite familiar with the present considerations may for instance think of five momenta $\mathbf k_a, \mathbf k_b, \mathbf k_c, \mathbf k_d, \mathbf k_e$ combined with spin up and followed by the same momenta combined with spin down in an ascending sequence of the assumed linear order of states, with $\mathbf k_a + \mathbf k_d = \mathbf k_b + \mathbf k_e$. As an example with $N = 3$ he may take the case $q'_1 = (\mathbf k_a +)$, $q'_2 = (\mathbf k_c +)$, $q'_3 = (\mathbf k_d -)$ and $q_1 = (\mathbf k_c +)$, $q_2 = (\mathbf k_e +)$, $q_3 = (\mathbf k_b -)$ and then play with other combinations of spins.
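The simplification of (1.36) with spherical coordinates can be sketched as follows. This is an illustrative derivation under the assumption that the range of $w$ is small compared with $L$, so that the integral over the torus may be extended to all of space. The convergence factor $e^{-\mu r}$ used for the Coulomb case is likewise an assumption of this sketch and does not appear in the text.

```latex
\begin{align*}
w_{|\mathbf k|}
 &= \frac{1}{V}\int d^3r\, w(r)\, e^{-i\mathbf k\cdot\mathbf r}
  = \frac{2\pi}{V}\int_0^\infty dr\, r^2\, w(r)
    \int_{-1}^{1} d(\cos\theta)\, e^{-ikr\cos\theta} \\
 &= \frac{4\pi}{V k}\int_0^\infty dr\, r\, w(r)\,\sin(kr),
    \qquad k = |\mathbf k| \neq 0 .
\end{align*}
% Coulomb case w(r) = 1/r, regularized by exp(-mu r) with mu -> +0:
\begin{equation*}
w_{|\mathbf k|}
 = \lim_{\mu\to+0}\frac{4\pi}{V k}\int_0^\infty dr\, e^{-\mu r}\sin(kr)
 = \lim_{\mu\to+0}\frac{4\pi}{V}\,\frac{1}{k^2+\mu^2}
 = \frac{4\pi}{V k^2} .
\end{equation*}
```

Only the modulus $k$ enters the result, which is why the Fourier transform in (1.36) carries the subscript $|\mathbf k|$.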

The first sum in (1.34) is over the kinetic energies of the particles in their momentum eigenstates (1.25, 1.27). Since the momentum operator of single particles commutes with the kinetic energy operator $\hat T$ of the system, this part is diagonal in the momentum representation, which is formally expressed by the $j$-product over the Kronecker symbols $\delta_{q_j q'_j}$. The next sum contains the individual interaction events of particles with the external potential $v(x)$. The amplitude of this interaction process is given by the Fourier transform of the potential, and is assumed to be diagonal in the $z$-component of the particle spin according to the text after (1.21). Since the interaction of each particle with the external field is furthermore assumed to be independent of the other particles ($\hat U$ is assumed to be a sum over individual items $v(x_l)$ in (1.21)), all remaining particle states $j \neq i$ are kept unchanged in an interaction event in which one particle makes a transition from the state $q'_i$ to the state $q_{\bar\imath}$. This is again expressed by the $j$-product. Finally, the classical image of an elementary pair-interaction event is that particles in states $\mathbf k'_i s'_i$ and $\mathbf k'_j s'_j$ transfer a momentum $\mathbf k_{\bar\imath} - \mathbf k'_i$ and keep their spin states unchanged, because the interaction potential $w$ is spin-independent. Hence, $\mathbf k_{\bar\imath} + \mathbf k_{\bar\jmath} = \mathbf k'_i + \mathbf k'_j$ and $s_{\bar\imath} = s'_i$, $s_{\bar\jmath} = s'_j$. Quantum-mechanically, one cannot decide which of the particles, formerly in states $q'_i$, $q'_j$, is afterwards in the state $q_{\bar\imath}$ and which one is in the state $q_{\bar\jmath}$. This leads to the second term in the square brackets of (1.34), the exchange term with $q_{\bar\imath}$ and $q_{\bar\jmath}$ reversed. For a bosonic system, the exchange term would appear with a positive sign. The elementary processes corresponding to the terms of (1.34) are depicted in Fig. 1.

(As a formal exercise the reader may cast the stationary Schrödinger equation (1.2) into the Schrödinger representation and the Hamiltonian (1.24) into a form analogous to (1.33, 1.34), as an integral operator with a $\delta$-like kernel.)

Figure 1: Elementary events corresponding to the terms contained in (1.34) (diagrams not reproduced; the vertices carry the amplitudes $v^{s_{\bar\imath}}_{\mathbf{k}_{\bar\imath} - \mathbf{k}'_i}$ in b), $w_{|\mathbf{k}_{\bar\imath} - \mathbf{k}'_i|}$ in c) and $w_{|\mathbf{k}_{\bar\imath} - \mathbf{k}'_j|}$ in d)):
a) Propagation of a particle with (conserved) momentum $\mathbf{k}_i$ and spin $s_i$, and having kinetic energy $k_i^2/2$.
b) Scattering event of a particle having initially momentum $\mathbf{k}'_i$ and (conserved) spin $s_{\bar\imath}$, on the potential $v(x)$.
c) Direct interaction event of a pair of particles having initially momenta $\mathbf{k}'_i$ and $\mathbf{k}'_j$, and transferring a momentum $\mathbf{k}_{\bar\imath} - \mathbf{k}'_i$ from one particle to the other, the spins remaining unchanged.
d) Exchange interaction event of a pair of particles.

Both Hamiltonians (1.24) and (1.34) may be considered as deduced from experiment. However, they are also formally equivalent. This is obtained by a derivation encountered frequently, and although it requires some effort it should be performed at least once in life. We consider for example the term $\langle q_1 \ldots q_N | w_{mn} | q'_1 \ldots q'_N \rangle$, where $w_{mn}$ means the interaction term of the Hamiltonian which in Schrödinger representation is equal to $w(|\mathbf{r}_m - \mathbf{r}_n|)$. To compute it we need the Schrödinger representation of the states $|q_1 \ldots q_N\rangle$, which is composed of plane-wave spin-orbitals

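$$\phi_q(x) = \phi_{\mathbf{k}s}(\mathbf{r}\bar s) = \frac{1}{\sqrt{V}}\, e^{i\mathbf{k}\cdot\mathbf{r}} \chi_s(\bar s) = \frac{1}{\sqrt{V}}\, e^{i\mathbf{k}\cdot\mathbf{r}} \delta_{s\bar s} = \langle \mathbf{r}\bar s | \mathbf{k}s \rangle. \tag{1.37}$$

(We have denoted the spin variable of the $x$-state by $\bar s$ in order to distinguish it from the spin variable $s$ of the $q$-state. $\chi_s(\bar s)$ is one of the two spinors (1.11).) According to (1.15),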

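$$\langle x_1 \ldots x_N | q_1 \ldots q_N \rangle = \frac{1}{\sqrt{N!}} \det \|\phi_{q_k}(x_l)\| \tag{1.38}$$

and

$$\begin{aligned}
\langle q_1 \ldots q_N | w_{mn} | q'_1 \ldots q'_N \rangle
&= \frac{1}{N!} \int dx_1 \ldots dx_N \left( \det \|\phi_{q_k}(x_l)\| \right)^* w(|\mathbf{r}_m - \mathbf{r}_n|) \det \|\phi_{q'_k}(x_l)\| \\
&= \frac{1}{N!} \int dx_1 \ldots dx_N \sum_{\mathcal{P}} (-1)^{\mathcal{P}} \prod_l \phi^*_{q_{\mathcal{P}l}}(x_l)\, w(|\mathbf{r}_m - \mathbf{r}_n|) \sum_{\mathcal{P}'} (-1)^{\mathcal{P}'} \prod_l \phi_{q'_{\mathcal{P}'l}}(x_l).
\end{aligned} \tag{1.39}$$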

The two determinants have been expanded into sums of products. $\mathcal{P}$ means a permutation of the subscripts $12 \ldots N$ into $\mathcal{P}1\,\mathcal{P}2 \ldots \mathcal{P}N$ with the indicated sign factor being $\pm 1$ for even and odd orders of permutations, respectively. Due to the orthonormality of the plane-wave spin-orbitals (1.37) the integral over $dx_l$ for $l \neq m, n$ is equal to unity if and only if $q_{\mathcal{P}l} = q'_{\mathcal{P}'l}$. With our convention on the sequences $(q_1 \ldots q_N)$ and $(q'_1 \ldots q'_N)$ introduced after (1.33), this is only possible, if after removing the items with $l = m, n$, the sequences $\{\mathcal{P}l\}$ and $\{\mathcal{P}'l\}$ become identical, and furthermore $q'_{\mathcal{P}'l} = q_{\mathcal{P}l}$, $l \neq m, n$. Hence, if the sets $\{q_i\}$ and $\{q'_i\}$ differ in more than two $q$'s, the matrix element is zero. Otherwise, for a given $\mathcal{P}$, only two items of the $\mathcal{P}'$-sum are to be retained: with $\mathcal{P}'l = \bar{P}\mathcal{P}l$ for $l \neq m, n$ and $\bar{P}$ as defined in the text below (1.34), and with $\mathcal{P}'m = \bar{P}\mathcal{P}m \stackrel{\mathrm{def}}{=} i$, $\mathcal{P}'n = \bar{P}\mathcal{P}n \stackrel{\mathrm{def}}{=} j$ and $\mathcal{P}'m = \bar{P}\mathcal{P}n = j$, $\mathcal{P}'n = \bar{P}\mathcal{P}m = i$, respectively. (There are two possible choices to fix $\bar{P}$, both giving the same result.) With fixed $i$ and $j$, $(N-2)!$ permutations $\mathcal{P}$ of the remaining integration variables yield identical results. If we finally denote $\mathcal{P}l$ by $\bar k$ and $\mathcal{P}'l$ by $k = \bar{P}\bar k$, we are left with

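$$\frac{1}{N(N-1)} \sum_{i \neq j} \int dx_m\, dx_n\, \phi^*_{q_{\bar\imath}}(x_m)\, \phi^*_{q_{\bar\jmath}}(x_n)\, w(|\mathbf{r}_m - \mathbf{r}_n|) \left( \phi_{q'_i}(x_m)\phi_{q'_j}(x_n) - \phi_{q'_j}(x_m)\phi_{q'_i}(x_n) \right) (-1)^{\bar{P}} \prod_{k(\neq i,j)} \delta_{q_{\bar k} q'_k}. \tag{1.40}$$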

This result is independent of $m$ and $n$, because the denotation of integration variables is irrelevant. This fact is a direct consequence of the total symmetry of $\Psi^*(x_1 \ldots x_N)\Psi'(x_1 \ldots x_N)$ with respect to permutations of subscripts, whereupon each term of the sums of (1.24) yields the same expectation value; in our case, the last sum of (1.24) yields $N(N-1)$ times the result (1.40). The rest is easy: introducing instead of $\mathbf{r}_m$, $\mathbf{r}_n$ coordinates of the center of gravity and distance of the pair, the result contained in (1.34, 1.36) is immediately obtained. The single-particle part of (1.34) is obtained along the same lines, but this time both permutations $\mathcal{P}'$ and $\bar{P}\mathcal{P}$ must be completely identical for a non-zero contribution.

An important reference system is formed by interaction-free fermions ($w = 0$) in a constant external potential $v(x) = 0$: the homogeneous interaction-free fermion gas with the Hamiltonian $H_f$:

$$\langle q_1 \ldots q_N | H_f | q'_1 \ldots q'_N \rangle = \frac{1}{2} \sum_i k_i^2 \prod_j \delta_{q_j q'_j}. \tag{1.41}$$

Its ground state is a Slater determinant of plane waves so that the sum $\sum_{i=1}^{N} k_i^2$ is minimum. This is obviously the case if all $\mathbf{k}_i$ lie inside a sphere of radius $k_f$ determined by

$$N = \sum_{q_i}^{k_i \le k_f} 1 = \frac{2V}{(2\pi)^3} \int_{k \le k_f} d^3k, \tag{1.42}$$

where the factor 2 in front of the last expression comes from summation over the two spin values for each $\mathbf{k}$. Hence

$$\frac{N}{V} = n = \frac{k_f^3}{3\pi^2} \tag{1.43}$$

with $n$ denoting the constant particle density in position space of this ground state, related to the Fermi radius $k_f$. The Fermi sphere of radius $k_f$ separates in $\mathbf{k}$-space the occupied orbitals $\langle \mathbf{r} | \mathbf{k} \rangle$ from the unoccupied ones. The ground state energy is

$$E = 2 \sum_{k \le k_f} \frac{k^2}{2} = \frac{3}{10}\, k_f^2 N \tag{1.44}$$

implying an average energy per particle

$$\varepsilon = \frac{E}{N} = \frac{3}{10}\, k_f^2. \tag{1.45}$$

Energies are given in natural units (1.23) in both cases.
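The relations (1.43) and (1.45) can be checked by direct counting. The following is a minimal sketch, assuming a periodic cube of side $L$ so that the allowed wave vectors are $(2\pi/L)$ times integer triples; the function name `fermi_gas_in_box` and the choice $R = 20$ for the Fermi radius in units of $2\pi/L$ are illustrative assumptions, not part of the text. For a finite box the counted values agree with the continuum formulas only up to surface corrections.

```python
import numpy as np

def fermi_gas_in_box(R=20, L=1.0):
    """Fill all plane-wave states with |k| <= k_f in a periodic cube.

    Wave vectors are k = (2*pi/L) * m with integer triples m, and the
    Fermi radius is chosen as k_f = R * 2*pi/L (R is an assumed parameter).
    """
    m = np.arange(-R, R + 1)
    mx, my, mz = np.meshgrid(m, m, m, indexing="ij")
    m2 = mx**2 + my**2 + mz**2
    inside = m2 <= R**2                      # occupied k-points

    unit = 2.0 * np.pi / L
    k_f = R * unit
    V = L**3
    N = 2 * np.count_nonzero(inside)         # factor 2: two spin values per k
    E = 2 * np.sum(0.5 * unit**2 * m2[inside])   # E = 2 * sum_k k^2 / 2
    return k_f, V, N, E

k_f, V, N, E = fermi_gas_in_box()
print("N/V counted    :", N / V)
print("k_f^3/(3 pi^2) :", k_f**3 / (3.0 * np.pi**2))
print("E/N counted    :", E / N)
print("3 k_f^2 / 10   :", 0.3 * k_f**2)
```

Increasing `R` at fixed `L` corresponds to approaching the limit in which the sum over $\mathbf{k}$ may be replaced by the integral of (1.42).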


1.3 The Heisenberg Representation, N Fixed

The above matrix notation (1.33, 1.34) of the momentum representation may be considered as a special case of a more general scheme.

Let $\{|L\rangle\}$ be any given complete orthonormal set of $N$-particle states labeled by some multi-index $L$. Any state may then be expanded according to

$$|\Psi\rangle = \sum_L |L\rangle C_L = \sum_L |L\rangle \langle L|\Psi\rangle, \tag{1.46}$$

and the stationary Schrödinger equation (1.2) takes on the form of a matrix problem:

$$\sum_{L'} \left[ H_{LL'} - E\,\delta_{LL'} \right] C_{L'} = 0, \qquad H_{LL'} = \langle L | H | L' \rangle, \tag{1.47}$$

with an infinite matrix $\|H\|$ and the eigenstate represented by a column vector $C$. This is Heisenberg's matrix mechanics.
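In numerical practice the infinite matrix of (1.47) is truncated to a finite basis and diagonalized. The sketch below only illustrates the structure of the matrix problem: the matrix `H` is a random Hermitian model matrix (an assumption for illustration, not a Hamiltonian from the text), and `numpy.linalg.eigh` supplies eigenvalues $E$ and coefficient columns $C$.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 6                                    # size of the truncated basis (assumed)

# Random Hermitian model matrix standing in for H_{LL'} = <L|H|L'>
A = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
H = (A + A.conj().T) / 2

# eigh returns ascending eigenvalues E and orthonormal eigenvectors as columns of C
E, C = np.linalg.eigh(H)

# Residual of sum_{L'} [H_{LL'} - E delta_{LL'}] C_{L'} for the lowest state
residual = H @ C[:, 0] - E[0] * C[:, 0]
print("lowest eigenvalue:", E[0])
print("max |residual|   :", np.max(np.abs(residual)))
```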

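To be a bit more specific let $\{\phi_l(x)\}$ be a complete orthonormal set of single-particle (spin-)orbitals, and let $\Phi_L$, $L = (l_1 \ldots l_N)$ run over the $N$-particle Slater determinants (1.15) of all possible orbital configurations (again using some linear order of the $l$-labels). In analogy to (1.34–1.36) one now gets

$$\begin{aligned}
H_{LL'} = {} & \sum_i \left\langle l_{\bar\imath} \left| -\frac{1}{2}\nabla^2 + v(x) \right| l'_i \right\rangle (-1)^{\bar{P}} \prod_{j(\neq i)} \delta_{l_{\bar\jmath}\, l'_j} \\
& + \frac{1}{2} \sum_{i \neq j} \left[ \langle l_{\bar\imath} l_{\bar\jmath} | w | l'_i l'_j \rangle - \langle l_{\bar\imath} l_{\bar\jmath} | w | l'_j l'_i \rangle \right] (-1)^{\bar{P}} \prod_{k(\neq i,j)} \delta_{l_{\bar k}\, l'_k}
\end{aligned} \tag{1.48}$$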

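with $i = \bar{P}\bar\imath$, and $\bar{P}$ is defined in the same way as in (1.34), particularly again $\bar{P} = $ identity for $L' = L$. The orbital matrix elements are

$$\langle l | h | m \rangle \stackrel{\mathrm{def}}{=} \left\langle l \left| -\frac{1}{2}\nabla^2 + v(x) \right| m \right\rangle = \sum_s \int d^3r\, \phi_l^*(\mathbf{r}, s) \left[ -\frac{1}{2}\nabla^2 + v(\mathbf{r}, s) \right] \phi_m(\mathbf{r}, s) \tag{1.49}$$

and

$$\langle lm | w | pq \rangle = \sum_{ss'} \int d^3r\, d^3r'\, \phi_l^*(\mathbf{r}, s)\, \phi_m^*(\mathbf{r}', s')\, w(|\mathbf{r} - \mathbf{r}'|)\, \phi_q(\mathbf{r}', s')\, \phi_p(\mathbf{r}, s). \tag{1.50}$$

Clearly, $\langle lm | w | pq \rangle = \langle ml | w | qp \rangle$.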


1.4 Hartree-Fock Theory

For an interacting $N$-fermion system, a single Slater determinant (1.15) can of course not be a solution of the stationary Schrödinger equation (1.2). However, referring to (1.3), one can ask for the best Slater determinant approximating the true $N$-particle ground state as that one which minimizes the expectation value of the Hamiltonian $H$ among Slater determinants. The corresponding minimum value will estimate the true ground state energy from above.

This would, however, in general be too restrictive a search. The point is that in most cases of interest the Hamiltonian $H$ does not depend on the spins of the particles: $v(\mathbf{r}, s) = v(\mathbf{r})$. Consequently, the true ground state has a definite total spin $S^2 = \langle \sum_{\alpha}^{x,y,z} (\sum_i \sigma_{i\alpha})^2 \rangle$, whereas a Slater determinant of spin-orbitals in general does not have a definite total spin; rather such a spin eigenstate can be built as a linear combination of Slater determinants with the same spatial orbitals but different single-particle spin states occupied. Depending on whether the total spin of the ground state is zero or non-zero, the approach is called the closed-shell and open-shell Hartree-Fock method, respectively.

We restrict our considerations to the simpler case of closed shells and will see in a minute that in this special case a determinant of spin-orbitals would do. In this case, the number $N$ of spin-half particles must be even because otherwise the total spin would again be half-integer and could not be zero. A spin-zero state of two spin-half particles is obtained as the antisymmetric combination of a spin-up and a spin-down state:

$$\langle s_1 s_2 | S = 0 \rangle = \frac{1}{\sqrt{2}} \left( \chi^+(s_1)\chi^-(s_2) - \chi^-(s_1)\chi^+(s_2) \right). \tag{1.51}$$

This is easily seen by successively operating with $\sigma_{1\alpha} + \sigma_{2\alpha}$, $\alpha = x, y, z$ (see (1.10, 1.11)) on it, giving a zero result in all cases. Hence, a simple product of $N/2$ spin pairs in states (1.51) provides a normalized $N$-particle $S = 0$ spin state, which is antisymmetric with respect to particle exchange within the pair and symmetric with respect to exchange of pairs. (It cannot in general be symmetric or antisymmetric with respect to exchange between different pairs.)
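The statement that all three components of the total spin annihilate the state (1.51) can be verified with a few lines of linear algebra. This is a small sketch assuming that the spinors $\chi^\pm$ of (1.11) are the standard eigenvectors of $\sigma_z$ and that (1.10) denotes the usual Pauli matrices; the variable names are illustrative.

```python
import numpy as np

# Pauli matrices (assumed to be the matrices of (1.10))
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
one = np.eye(2, dtype=complex)

up = np.array([1, 0], dtype=complex)       # chi^+
down = np.array([0, 1], dtype=complex)     # chi^-

# Two-particle spin state (1.51): antisymmetric combination of up and down
singlet = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)

# Apply sigma_{1 alpha} + sigma_{2 alpha} for alpha = x, y, z
residuals = {}
for name, s in (("x", sx), ("y", sy), ("z", sz)):
    total = np.kron(s, one) + np.kron(one, s)
    residuals[name] = np.linalg.norm(total @ singlet)

print(residuals)
print("norm of the state:", np.linalg.norm(singlet))
```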

The two particles in the spin state (1.51) may occupy the same spatial orbital $\phi(\mathbf{r})$, maintaining the antisymmetric character of the pair wavefunction

$$\Phi(x_1 x_2) = \phi(\mathbf{r}_1)\phi(\mathbf{r}_2) \langle s_1 s_2 | S = 0 \rangle. \tag{1.52}$$


If, for even $N$, we consider a Slater determinant of spin orbitals, where each spatial orbital is occupied twice with spin up and down, and expand the determinant into a sum over permutations of products, then a permutation within a doubly occupied pair of states does not change the spatial part of those terms. For the spin part, those permutations just combine to a product of $N/2$ spin-zero states (1.51). The total Slater determinant is thus a linear combination of products of spin-zero states, hence it is itself a spin-zero state in this special case. Moreover, as a Slater determinant it has the correct antisymmetry with respect to all particle exchange operations. Therefore, such a single Slater determinant can provide a spin-zero approximation to a closed-shell ground state.

Now, take (for even $N$) $N/2$ spatial orbitals $\phi_i$ and build a Slater determinant with spin-orbitals $\phi_{i+}(x_k)$ in the first $N/2$ rows and with spin-orbitals $\phi_{i-}(x_k)$ in the lower $N/2$ rows. Using the Laplace expansion, this Slater determinant may be written as

$$\Phi_{\mathrm{HF}}(x_1 \ldots x_N) = \frac{(N/2)!}{\sqrt{N!}} \sum_{\{k|k'\}} \frac{(-1)^{\{k|k'\}}}{(N/2)!} \det \|\phi_{i+}(x_k)\| \det \|\phi_{i-}(x_{k'})\|, \tag{1.53}$$

where $\{k|k'\}$ means a selection of $N/2$ numbers $k$ among the numbers $1, \ldots, N$, the remaining unselected numbers being denoted by $k'$. There are $N!/((N/2)!)^2$ different selections to be summed up with an appropriately chosen sign for each item. The items of the sum are normalized, and they are orthogonal to each other with respect to their spin dependence, because they differ in the selection of the variables of spin-up particles. Hence there are no crossing matrix elements for any spin-independent operator, and its expectation value may be calculated just with one of the terms in the sum of (1.53), all terms giving the same result.

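With the help of (1.48–1.50) the expectation value of the Hamiltonian (with $s$-independent external potential $v$) in the state (1.53) is easily obtained to be

$$E_{\mathrm{HF}} = \langle \Phi_{\mathrm{HF}} | H | \Phi_{\mathrm{HF}} \rangle = 2 \sum_{i=1}^{N/2} \langle i | h | i \rangle + 2 \sum_{i,j=1}^{N/2} \langle ij | w | ij \rangle - \sum_{i,j=1}^{N/2} \langle ij | w | ji \rangle. \tag{1.54}$$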

The three terms are called in turn one-particle energy, Hartree energy, and exchange energy. Summation over both spin directions for each orbital $\phi_i$ results in factors 2 for the one-particle term, 4 for the Hartree term, but only 2 for the exchange term because the contained matrix element is only nonzero if both interacting particles have the same spin direction. (Recall that the interaction part of the Hamiltonian comes with a prefactor 1/2.) Note that both the Hartree and exchange terms for $i = j$ contain (seemingly erroneously) the self-interaction of a particle in the orbital $\phi_i$ with itself. Actually those terms of the Hartree and exchange parts mutually cancel in (1.54) thus not posing any problem. (There is of course one term with $i = j$ but opposite spin directions remaining correctly in the Hartree part.)

In order to find the minimum of this type of expression one must vary the orbitals keeping them orthonormal. However, as we already know, the determinant remains unchanged upon an orthogonalization of the orbitals, and hence it suffices to keep the orbitals normalized while varying them. Adding the normalization integral for $\phi_k$, multiplied with a Lagrange multiplier $2\varepsilon_k$, to (1.54) and then varying $\phi_k^*$ leads to the minimum condition

$$h\,\phi_k(\mathbf{r}) + v_H(\mathbf{r})\,\phi_k(\mathbf{r}) + (v_X \phi_k)(\mathbf{r}) = \phi_k(\mathbf{r})\,\varepsilon_k \tag{1.55}$$

with the Hartree potential

$$v_H(\mathbf{r}) = 2 \sum_{j=1}^{N/2} \int d^3r'\, \phi_j^*(\mathbf{r}')\, w(|\mathbf{r} - \mathbf{r}'|)\, \phi_j(\mathbf{r}') \tag{1.56}$$

and the exchange potential operator

$$(v_X \phi_k)(\mathbf{r}) = - \sum_{j=1}^{N/2} \int d^3r'\, \phi_j^*(\mathbf{r}')\, w(|\mathbf{r} - \mathbf{r}'|)\, \phi_k(\mathbf{r}')\, \phi_j(\mathbf{r}). \tag{1.57}$$

($\phi_k^*$ is varied independently of $\phi_k$ which is equivalent to independently varying the real and imaginary parts of $\phi_k$; the variation then is carried out by using the simple rule $\delta/\delta\phi_k^*(x) \int dx'\, \phi_k^*(x') F(x') = F(x)$ for any expression $F(x)$ independent of $\phi_k^*(x)$.)

The Hartree-Fock equations (1.55) have the form of effective single-particle Schrödinger equations

$$F \phi_k = \phi_k\, \varepsilon_k, \tag{1.58}$$

where the Fock operator $F = -(1/2)\nabla^2 + v_{\mathrm{eff}}$ consists of the kinetic energy operator and an effective potential operator

$$v_{\mathrm{eff}} = v + v_H + v_X \tag{1.59}$$

called the mean field or molecular field operator.

For a given set of $N/2$ occupied orbitals $\phi_i$ the Fock operator $F$ as an integral operator is the same for all orbitals. Hence, from (1.58), the Hartree-Fock orbitals may be obtained orthogonal to each other. From (1.55) it then follows that

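$$\sum_{i=1}^{N/2} \varepsilon_i = \sum_{i=1}^{N/2} \langle i | h | i \rangle + 2 \sum_{i,j=1}^{N/2} \langle ij | w | ij \rangle - \sum_{i,j=1}^{N/2} \langle ij | w | ji \rangle. \tag{1.60}$$

Comparison with (1.54) yields

$$E_{\mathrm{HF}} = \sum_{i=1}^{N/2} \left( \varepsilon_i + \langle i | h | i \rangle \right) = 2 \sum_{i=1}^{N/2} \varepsilon_i - \langle W \rangle \tag{1.61}$$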

for the total Hartree-Fock energy. The sum over all occupied $\varepsilon_i$ (including the spin sum) double-counts the interaction energy.

Coming back to the expression (1.54), one can ask for its change, if one removes one particle in the Hartree-Fock orbital $\phi_k$ (of one spin direction) while keeping all orbitals $\phi_j$ unrelaxed. This change is easily obtained to be $-\langle k|h|k\rangle - 2\sum_j \langle kj|w|kj\rangle + \sum_j \langle kj|w|jk\rangle$, which is just $-\varepsilon_k$ as seen from (1.55–1.57). For a given set of occupied $\phi_i$, (1.58) yields also unoccupied orbitals as solutions. The change of (1.54), if one additionally occupies one of those latter orbitals $\phi_k$, is analogously found to be $+\varepsilon_k$. This result, which may be written as

$$\left( \frac{\partial E_{\mathrm{HF}}}{\partial n_k} \right)_{\phi_j} = \varepsilon_k \tag{1.62}$$

with $n_k$ denoting the occupation number of the Hartree-Fock orbital $\phi_k$ and the subscript $\phi_j$ indicating the constancy of the orbitals, goes under the name Koopmans' theorem [Koopmans, 1934]. It guarantees in most cases that the minimum of $E_{\mathrm{HF}}$ is obtained if one occupies the orbitals with the lowest $\varepsilon_i$, because removing a particle from $\phi_i$ and occupying instead a state $\phi_j$ yields a change of $E_{\mathrm{HF}}$ equal to $\varepsilon_j - \varepsilon_i$ plus the orbital relaxation energy, which is usually smaller than $\varepsilon_j - \varepsilon_i$ in closed shell situations. There may otherwise, however, be situations where levels $\varepsilon_i$ cross each other when the orbitals are allowed to relax after a re-occupation. If this happens for the highest occupied and lowest unoccupied level (called HOMO and LUMO, where MO stands for molecular orbital), then a convergence problem may appear in the solution process of the closed shell Hartree-Fock equations.
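The bookkeeping behind (1.54), (1.61) and Koopmans' theorem for unrelaxed orbitals can be made explicit with model numbers. In the sketch below the arrays `h`, `J` and `K` are random stand-ins for $\langle i|h|i\rangle$, $\langle ij|w|ij\rangle$ and $\langle ij|w|ji\rangle$ (hypothetical values, not integrals of a real system), and `determinant_energy` is an illustrative helper which sums the spin-orbital contributions directly, including the rule that exchange acts only between equal spins.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 4                                   # doubly occupied spatial orbitals, N = 2M

# Hypothetical model integrals (random numbers, for bookkeeping only)
h = rng.normal(size=M)                  # <i|h|i>
J = rng.uniform(0.5, 1.0, size=(M, M))
J = (J + J.T) / 2                       # <ij|w|ij>
K = rng.uniform(0.0, 0.5, size=(M, M))
K = (K + K.T) / 2                       # <ij|w|ji>
np.fill_diagonal(K, np.diag(J))         # <ii|w|ii> is both a Hartree and an exchange term

def determinant_energy(occ):
    """Energy of the determinant built from the spin-orbitals (i, spin) in occ."""
    e = sum(h[i] for i, _ in occ)
    for i, s in occ:
        for j, t in occ:
            exchange = K[i, j] if s == t else 0.0   # exchange needs equal spins
            e += 0.5 * (J[i, j] - exchange)         # prefactor 1/2 of the pair sum
    return e

closed_shell = [(i, s) for i in range(M) for s in (+1, -1)]
E_hf = determinant_energy(closed_shell)

eps = h + 2 * J.sum(axis=1) - K.sum(axis=1)   # orbital energies following (1.55)
W = 2 * J.sum() - K.sum()                     # interaction energy <W>

k = 2                                         # remove one spin-up particle from orbital k
removed = [so for so in closed_shell if so != (k, +1)]
delta = determinant_energy(removed) - E_hf

print("E_HF, (1.54)          :", E_hf, 2 * h.sum() + 2 * J.sum() - K.sum())
print("E_HF, (1.61)          :", (eps + h).sum(), 2 * eps.sum() - W)
print("removal energy, -eps_k:", delta, -eps[k])
```

The self-interaction terms $i = j$ with equal spin drop out inside `determinant_energy`, which is the cancellation between Hartree and exchange parts mentioned after (1.54).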


Note that the similarity of the Hartree-Fock equations (1.55) with a single-electron Schrödinger equation is rather formal and not very far-reaching: the Fock operator is not a linear operator in contrast to any Hamiltonian. It depends on the $N/2$ Hartree-Fock orbitals $\phi_i$, lowest in energy, which appear as solutions to the Hartree-Fock equations. Thus, those equations are highly nonlinear and must be solved iteratively by starting with some guessed effective potential, solving (linearly) for the $\phi_i$, recalculating the effective potential, and iterating until self-consistency is reached. The molecular field is therefore also called the self-consistent field.
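The iteration just described may be sketched for a toy problem. The model below is an assumption made purely for illustration: four particles on a one-dimensional grid in a harmonic well, with a soft-Coulomb pair interaction $1/\sqrt{(x-x')^2+1}$, and a simple damped update of the density matrix (damping parameter `alpha`). None of these choices comes from the text; they merely give a small closed-shell system on which (1.55–1.57) and (1.61) can be exercised.

```python
import numpy as np

def scf_closed_shell(n_grid=41, n_pairs=2, alpha=0.3, tol=1e-10, max_iter=2000):
    """Closed-shell Hartree-Fock on a 1D grid (toy model, see assumptions above)."""
    x = np.linspace(-5.0, 5.0, n_grid)
    dx = x[1] - x[0]
    # One-particle operator h: finite-difference kinetic energy plus harmonic well
    off = np.full(n_grid - 1, -0.5 / dx**2)
    h = np.diag(1.0 / dx**2 + 0.5 * x**2) + np.diag(off, 1) + np.diag(off, -1)
    # Pair interaction between grid points (soft Coulomb, assumed)
    w = 1.0 / np.sqrt((x[:, None] - x[None, :])**2 + 1.0)

    def fock(D):
        # D[p, q] = sum over occupied spatial orbitals of phi_j(p) phi_j(q)
        v_hartree = 2.0 * (w @ np.diag(D))     # local potential, cf. (1.56)
        v_exchange = -w * D                    # non-local operator, cf. (1.57)
        return h + np.diag(v_hartree) + v_exchange

    def occupied(F):
        eps, C = np.linalg.eigh(F)
        return eps[:n_pairs], C[:, :n_pairs]   # the N/2 lowest orbitals

    _, occ = occupied(h)                       # initial guess: no interaction
    D = occ @ occ.T
    converged = False
    for _ in range(max_iter):
        _, occ = occupied(fock(D))             # linear step at fixed potential
        D_out = occ @ occ.T
        change = np.max(np.abs(D_out - D))
        D = (1.0 - alpha) * D + alpha * D_out  # damped update of the mean field
        if change < tol:
            converged = True
            break

    F = fock(D)
    eps, occ = occupied(F)
    e_direct = np.sum(D * (h + F))             # equals (1.54) for this D
    h_ii = np.einsum("pi,pq,qi->i", occ, h, occ)
    e_orbital = np.sum(eps + h_ii)             # first form of (1.61)
    return converged, e_direct, e_orbital, eps

converged, e_direct, e_orbital, eps = scf_closed_shell()
print("converged:", converged)
print("E_HF from (1.54):", e_direct)
print("E_HF from (1.61):", e_orbital)
print("occupied orbital energies:", eps)
```

The two energy expressions agree only at self-consistency, when the density matrix used to build the Fock operator coincides with the one obtained from its lowest eigenvectors.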

More details can for instance be found in [Slater, 1960, Chapters 12ff].

1.5 The Occupation Number Representation, N Varying

Up to here we considered representations of quantum mechanics with the particle number $N$ of the system fixed. If this number is macroscopically large, it cannot be fixed at a single definite number in experiment. Zero-mass bosons such as photons may be emitted or absorbed in systems of any scale. (In a relativistic description any particle may be created or annihilated, possibly together with its antiparticle, in a vacuum region just by applying energy.) From a merely technical point of view, quantum statistics of identical particles is much simpler to formulate with the grand canonical ensemble with varying particle number than with the canonical one. Hence there are many good reasons to consider quantum dynamics with changes in particle number.

In order to do so, we start with building the Hilbert space of quantum states of this wider frame. The Hilbert space considered up to now, of all $N$-particle states having the appropriate symmetry with respect to particle exchange, will be denoted by $\mathcal{H}_N$. In Section 1.3 an orthonormal basis $\{|L\rangle\}$ of (anti-)symmetrized products of single-particle states out of a given fixed complete and orthonormalized set $\{\phi_i\}$ of such single-particle states was introduced. The set $\{\phi_i\}$ with some fixed linear order $(\phi_1, \phi_2, \ldots)$ of the orbitals will play a central role in the present section. The states $|L\rangle$ will alternatively be denoted by

$$| n_1 \ldots n_i \ldots \rangle, \qquad \sum_i n_i = N, \tag{1.63}$$

where $n_i$ denotes the occupation number of the $i$-th single-particle orbital in the given state. For fermions, $n_i = 0, 1$, for bosons $n_i = 0, 1, 2, \ldots$. Two states (1.63) not coinciding in all occupation numbers $n_i$ are orthogonal. $\mathcal{H}_N$ is the complete linear space spanned by the basis vectors (1.63), i.e. the states of $\mathcal{H}_N$ are either linear combinations $\sum |L\rangle C_L$ of states (1.63) or Cauchy sequences of such linear combinations. (A Cauchy sequence is a sequence $\{|\Psi_n\rangle\}$ with $\lim_{m,n\to\infty} \langle \Psi_m - \Psi_n | \Psi_m - \Psi_n \rangle = 0$. The inclusion of such sequences into $\mathcal{H}_N$ means realizing the completeness property of the Hilbert space, being extremely important in all considerations of limits; cf. Section 5.2. This completeness is not to be confused with the completeness of a basis set $\{\phi_i\}$.) The extended Hilbert space $\mathcal{F}$ (Fock space) of all states with the particle number $N$ not fixed is now defined as the completed direct sum of all $\mathcal{H}_N$. It is spanned by all state vectors (1.63) for all $N$ with the above given definition of orthogonality retained, and is completed by corresponding Cauchy sequences. Its vectors are given by all series $\sum |L\rangle C_L$ with $|L\rangle$ running over all states (1.63) for all $N$, for which the series $\sum |C_L|^2$ converges to a finite number. (A mathematically rigorous treatment can for instance be found in [Cook, 1953, Berezin, 1965].)

Note that now $\mathcal{F}$ contains not only quantum states which are linear combinations with varying $n_i$ so that $n_i$ does not have a definite value in the quantum state (occupation number fluctuations), but also linear combinations with varying $N$ so that now quantum fluctuations of the total particle number are allowed too. For bosonic fields (e.g. laser light) those quantum fluctuations can become important experimentally even for macroscopic $N$.

In order to introduce the possibility of a dynamical change of $N$, operators must be introduced providing such a change. For bosons those operators are introduced as

$$b_i | \ldots n_i \ldots \rangle = | \ldots n_i - 1 \ldots \rangle \sqrt{n_i}, \tag{1.64}$$

$$b_i^\dagger | \ldots n_i \ldots \rangle = | \ldots n_i + 1 \ldots \rangle \sqrt{n_i + 1}. \tag{1.65}$$

These operators annihilate and create, respectively, a particle in the orbital $\phi_i$ and multiply by a factor chosen for the sake of convenience. Particularly, in (1.64) it prevents producing states with negative occupation numbers. (Recall that the $n_i$ are integers; application of $b_i$ to a state with $n_i = 0$ gives zero instead of a state with $n_i = -1$.) Considering all possible matrix elements with the basis states (1.63) of $\mathcal{F}$, one easily proves that $b$ and $b^\dagger$ are Hermitian conjugate to each other. In the same way the key relations

$$\hat n_i | \ldots n_i \ldots \rangle \stackrel{\mathrm{def}}{=} b_i^\dagger b_i | \ldots n_i \ldots \rangle = | \ldots n_i \ldots \rangle\, n_i, \tag{1.66}$$


and

$$[b_i, b_j^\dagger] = \delta_{ij}, \qquad [b_i, b_j] = 0 = [b_i^\dagger, b_j^\dagger] \tag{1.67}$$

are proven, where the brackets in standard manner denote the commutator $[b_i, b_j^\dagger] \stackrel{\mathrm{def}}{=} [b_i, b_j^\dagger]_- = b_i b_j^\dagger - b_j^\dagger b_i$. The occupation number operator $\hat n_i$ is Hermitian and can be used to define the particle number operator

$$\hat N = \sum_i \hat n_i \tag{1.68}$$

having arbitrarily large but always finite expectation values in the basis states (1.63) of the Fock space $\mathcal{F}$. The Fock space itself is the complete hull (in the above described sense) of the linear space spanned by all possible states obtained from the vacuum state

$$|\rangle \stackrel{\mathrm{def}}{=} | 0 \ldots 0 \ldots \rangle, \qquad b_i |\rangle = 0 \ \text{ for all } i \tag{1.69}$$

by applying polynomials of the $b_i^\dagger$ to it. This situation is expressed by saying that the vacuum state is a cyclic vector of $\mathcal{F}$ with respect to the algebra of the $b_i$ and $b_i^\dagger$. Obviously, any operator in $\mathcal{F}$, that is any operation transforming vectors of $\mathcal{F}$ linearly into new ones, can be expressed as a power series of operators $b_i^\dagger$ and $b_i$. This all together means that the Fock space provides an irreducible representation space for the algebra of operators $b_i^\dagger$ and $b_i$, defined by (1.67).
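For a single orbital the definitions (1.64, 1.65) are easily written as matrices. The sketch below assumes a truncation of the occupation number at `n_max` (an assumption needed for any finite matrix; the true operators are unbounded), so the commutator (1.67) is reproduced everywhere except in the last, truncated row and column.

```python
import numpy as np

n_max = 8                                  # truncation (assumed): n = 0, ..., n_max
n = np.arange(n_max + 1)

# (1.64): b |n> = sqrt(n) |n-1>, i.e. matrix elements <n-1| b |n> = sqrt(n)
b = np.diag(np.sqrt(n[1:]), k=1)
b_dag = b.conj().T                         # (1.65) as the Hermitian conjugate

number = b_dag @ b                         # occupation number operator (1.66)
comm = b @ b_dag - b_dag @ b               # commutator of (1.67)

print("diagonal of b^dagger b :", np.diag(number))
print("diagonal of [b, b^dagger]:", np.diag(comm))
```

The last diagonal element of the commutator equals `-n_max` instead of 1; this is an artifact of cutting off the basis, not a property of the operators.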

For fermions, the definition of creation and annihilation operators must have regard for the antisymmetry of the quantum states and for Pauli's exclusion principle following from this antisymmetry. They are defined by

$$c_i | \ldots n_i \ldots \rangle = | \ldots n_i - 1 \ldots \rangle\, n_i\, (-1)^{\sum_{j<i} n_j}, \tag{1.70}$$

$$c_i^\dagger | \ldots n_i \ldots \rangle = | \ldots n_i + 1 \ldots \rangle\, (1 - n_i)\, (-1)^{\sum_{j<i} n_j}. \tag{1.71}$$

Again by considering the matrix elements with all possible occupation number eigenstates (1.63), it is easily seen that these operators have all the needed properties; in particular they do not create non-fermionic states (that is, states with occupation numbers $n_i$ different from 0 or 1 do not appear: application of $c_i$ to a state with $n_i = 0$ gives zero, and application of $c_i^\dagger$ to a state with $n_i = 1$ gives zero as well). The $c_i$ and $c_i^\dagger$ are mutually Hermitian conjugate, obey the key relations

$$\hat n_i | \ldots n_i \ldots \rangle \stackrel{\mathrm{def}}{=} c_i^\dagger c_i | \ldots n_i \ldots \rangle = | \ldots n_i \ldots \rangle\, n_i \tag{1.72}$$


and

$$[c_i, c_j^\dagger]_+ = \delta_{ij}, \qquad [c_i, c_j]_+ = 0 = [c_i^\dagger, c_j^\dagger]_+ \tag{1.73}$$

with the anticommutator $[c_i, c_j^\dagger]_+ = c_i c_j^\dagger + c_j^\dagger c_i$ defined in the standard way. Their role in the fermionic Fock space $\mathcal{F}$ is completely analogous to the bosonic case. (The $c^\dagger$- and $c$-operators of the fermionic case form a normed complete algebra provided with a norm-conserving adjoint operation $\dagger$, called a $C^*$-algebra in mathematics. Such a (normed) $C^*$-algebra can be formed out of the bosonic operators $b^\dagger$ and $b$, which themselves are not bounded in $\mathcal{F}$ and hence have no norm, by complex exponentiation.)
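For a finite number of orbitals the fermionic Fock space is finite-dimensional, and the definition (1.70) can be turned into matrices directly. The following sketch assumes three orbitals and enumerates the $2^3$ occupation number states; the helper names `fermion_operators` and `anticommutator` are illustrative.

```python
import itertools
import numpy as np

def fermion_operators(n_orb):
    """Matrices of the annihilators c_i in the basis |n_1 ... n_M>, following (1.70)."""
    states = list(itertools.product((0, 1), repeat=n_orb))
    index = {s: a for a, s in enumerate(states)}
    ops = []
    for i in range(n_orb):
        c = np.zeros((len(states), len(states)))
        for s in states:
            if s[i] == 1:                              # factor n_i in (1.70)
                sign = (-1) ** sum(s[:i])              # (-1)^{sum_{j<i} n_j}
                t = s[:i] + (0,) + s[i + 1:]           # state with n_i lowered by 1
                c[index[t], index[s]] = sign
        ops.append(c)
    return ops

def anticommutator(a, b):
    return a @ b + b @ a

n_orb = 3
c = fermion_operators(n_orb)
dim = 2 ** n_orb
for i in range(n_orb):
    for j in range(n_orb):
        same = np.eye(dim) if i == j else np.zeros((dim, dim))
        ok_mixed = np.allclose(anticommutator(c[i], c[j].T), same)
        ok_plain = np.allclose(anticommutator(c[i], c[j]), 0.0)
        print(i, j, ok_mixed, ok_plain)
```

The diagonal of `c[i].T @ c[i]` lists the occupation numbers $n_i$ of the basis states, as in (1.72).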

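As an example, the Hamiltonian (1.24) is expressed in terms of creation and annihilation operators and orbital matrix elements (1.49, 1.50) as

$$H = \sum_{ij} c_i^\dagger \langle i | h | j \rangle c_j + \frac{1}{2} \sum_{ijkl} c_i^\dagger c_j^\dagger \langle ij | w | kl \rangle c_l c_k. \tag{1.74}$$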

Observe that the order of operators is important in expressions of that type. This form is easily verified by considering the matrix element $\langle L | H | L' \rangle$ with $|L\rangle$ and $|L'\rangle$ represented in notation (1.63), and comparing the result with (1.48).

In order to write down some useful relations holding accordingly in both the bosonic and fermionic cases, we use operator notations $a_i$ and $a_i^\dagger$ denoting either a bosonic or a fermionic operator. One easily obtains

$$[\hat n_i, a_i] = -a_i, \qquad [\hat n_i, a_i^\dagger] = a_i^\dagger \tag{1.75}$$

with the commutator in both the bosonic and fermionic cases.

Sometimes it is useful (or simply hard to avoid) to use a non-orthogonal basis $\{\phi_i\}$ of single-particle orbitals. The whole apparatus may be generalized to this case by merely generalizing the first relations (1.67) and (1.73) to

$$[a_i, a_j^\dagger]_\pm = \langle \phi_i | \phi_j \rangle, \tag{1.76}$$

a generalization which of course comprises the previous relations of the orthogonal cases. Even with a non-orthogonal basis $\{\phi_i\}$ the form of the original relations (1.67) and (1.73) may be retained, if one defines the operators $a_i$ with respect to the $\phi_i$ and replaces the operators $a_i^\dagger$ by modified creation operators $a_i^+$ with respect to a contragredient basis $\{\chi_i\}$, $\langle \phi_i | \chi_j \rangle = \delta_{ij}$. Of course, this way the $a_i^+$ are no longer Hermitian conjugate to the $a_i$.


1.6 Field Quantization

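Finally, a spatial representation may be introduced by defining field operators

$$\psi(x) = \sum_i \phi_i(x)\, a_i, \qquad \psi^\dagger(x) = \sum_i \phi_i^*(x)\, a_i^\dagger \tag{1.77}$$

providing a spatial particle density operator

$$\hat n(x) = \psi^\dagger(x)\psi(x) \tag{1.78}$$

and obeying the relations

$$[\psi(x), \psi^\dagger(x')]_\pm = \delta(x - x'), \qquad [\psi(x), \psi(x')]_\pm = 0 = [\psi^\dagger(x), \psi^\dagger(x')]_\pm. \tag{1.79}$$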

These relations are readily obtained from those of the creation and annihilation operators, and by taking into account the completeness relation

$$\sum_i \phi_i(x)\phi_i^*(x') = \delta(x - x') \tag{1.80}$$

of the basis orbitals.

In terms of field operators, the Hamiltonian (1.24) reads

$$H = \int dx\, \psi^\dagger(x) \left[ -\frac{1}{2}\nabla^2 + v(x) \right] \psi(x) + \frac{1}{2} \int dx\, dx'\, \psi^\dagger(x)\psi^\dagger(x')\, w(|\mathbf{r} - \mathbf{r}'|)\, \psi(x')\psi(x), \tag{1.81}$$

which is easily obtained by combining (1.74) and (1.77).

Field-quantized interaction terms contain higher-order than quadratic expressions in the field operators and hence yield operator forms of equations of motion (in Heisenberg picture) which are nonlinear. Note, however, that, contrary to the Fock operator of (1.55), the Hamiltonians (1.74, 1.81) are linear operators in the Fock space of states $|\Psi\rangle$ as demanded by the superposition principle of quantum theory. In this respect, the Fock operator rather compares to those operator equations of motion than to a Hamiltonian. (See also the comment at the end of Section 1.4.)

We do not dive here into the subtle mathematical problems connected with the thermodynamic limit $N \to \infty$, $N/V = \mathrm{const.}$ (see, e.g. [Sewell, 1986] and references given therein). In the simplest possible case of fermions not interacting with each other, where the ground state may be a single determinant, this ground state may be redefined as a new vacuum state $|0\rangle$ on which the former creation operators $c_i^\dagger$ of occupied states $i$ now act as annihilation operators $h_i = c_i^\dagger$ (yielding as previously the result zero), and the former annihilation operators of those states act now as hole state creation operators $h_i^\dagger = c_i$. For unoccupied states the previous operators are retained. This redefinition, if done in the Hamiltonian also, puts the ground state energy to zero and introduces instead of the former fermions two types of particle excitations: former particles and holes. Both types of excitation have now positive excitation energies equal to the absolute value of the energy distance of the former states from the Fermi level. This sign reversal for hole state energies in the single-particle part of the Hamiltonian (1.74) comes about by bringing the operator product $h_i h_i^\dagger$ appearing after the redefinition back to the normal order $h_i^\dagger h_i$ and observing the anticommutation rule. By this elementary renormalization, for zero temperature the Fock space is saved in the thermodynamic limit. Recall that by definition of the Fock space, every state vector of the Fock space can be norm-approximated arbitrarily close by a state vector which holds a finite number of particles. For finite temperatures or for interacting particles much more complex constructs are necessary for the thermodynamic limit. These remarks conclude our short survey over the representations of quantum mechanics.


2 Density Matrices and Density Operators

As was repeatedly seen in the last chapter, for computing most of the usual matrix elements only a few of the many coordinates of an $N$-particle wavefunction are used, where due to the symmetry properties it is irrelevant which ones. Density matrices are a tool to extract the relevant information out of such monstrous constructs as are $N$-particle wavefunctions. (For reviews, see [Erdahl and Smith, 1987, Coleman, 1963, McWeeny, 1960]; in the text below we try to keep the notation as canonical as possible.)

In this chapter, we consider exclusively reduced density matrices related to an $N$-particle wavefunction. In a wider sense of quantum states, $N$-particle wavefunction states are called pure states as distinct from ensemble states described by $N$-particle density matrices. The latter are considered in Sections 4.5 and 4.6.

The first four sections of the present chapter deal directly with reduced density matrices to the extent to which it is needed in the context of our program. The last three sections of this chapter also contain material indispensable in this context, and they are placed in the present chapter because they are in one or the other way connected to the notion of density matrices or densities, and logically they must precede the following chapters.

2.1 Single-Particle Density Matrices

The spin-dependent single-particle density matrix of a state $|\Psi\rangle$ is defined as (recall that $\Psi\Psi^*$ is always symmetric in its arguments)

$$\gamma_1(x; x') = N \int dx_2 \ldots dx_N\, \Psi(x x_2 \ldots x_N)\, \Psi^*(x' x_2 \ldots x_N). \tag{2.1}$$

Its spin-independent version is

$$\gamma_1(\mathbf{r}; \mathbf{r}') = \sum_s \gamma_1(\mathbf{r}s; \mathbf{r}'s). \tag{2.2}$$

The pre-factor $N$, the particle number, was chosen in order that the diagonal of the latter density matrix gives the spatial density of particles, $n(\mathbf{r})$. By definition, $n(\mathbf{r})$ is the probability density of measuring one of the particle coordinates $\mathbf{r}_1 \ldots \mathbf{r}_N$ at point $\mathbf{r}$. This is just $N$ times the probability density to measure the coordinate $\mathbf{r}_1$ at $\mathbf{r}$, hence

$$n(\mathbf{r}) = \gamma_1(\mathbf{r}; \mathbf{r}), \qquad \mathrm{tr}\, \gamma_1 = \int dx\, \gamma_1(x; x) = \int d^3r\, \gamma_1(\mathbf{r}; \mathbf{r}) = N. \tag{2.3}$$

Treating the arguments of $\gamma_1$ as (continuous) matrix indices, the trace tr of the matrix is to be understood as the integral over its diagonal.

The spatial diagonal of the spin-dependent single-particle density matrix is the spin-density matrix

$$n_{ss'}(\mathbf{r}) = \gamma_1(\mathbf{r}s; \mathbf{r}s') \tag{2.4}$$

containing the information on the direction and spatial density of spin polarization (cf. Section 2.4).

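With a fixed basis of single-particle orbitals $\{\phi_i\}$, the Heisenberg representation of these density matrices is

$$\langle i | \gamma_1 | j \rangle = \int dx\, dx'\, \phi_i^*(x)\, \gamma_1(x; x')\, \phi_j(x') \tag{2.5}$$

for spin-orbitals $\phi_i(x)$ and

$$\langle i | \gamma_1 | j \rangle = \int d^3r\, d^3r'\, \phi_i^*(\mathbf{r})\, \gamma_1(\mathbf{r}; \mathbf{r}')\, \phi_j(\mathbf{r}') \tag{2.6}$$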

for spatial orbitals $\phi_i(\mathbf{r})$. Note that the index sets for both cases are different: in (2.5) the subscript $i$ distinguishes spin-orbitals, which may have the same spatial parts but different spin parts of the orbitals, whereas in (2.6) the equally denoted subscript only distinguishes between spatial orbitals. The trace over the spin variables has already been performed in (2.6) according to (2.2). Expressions like $\langle i | \gamma_1 | j \rangle$ are to be understood always in the context of the orbital sets considered. Due to this difference the first expression is a spin-dependent density matrix whereas the second is a spatial density matrix. Their diagonals yield the occupation numbers of spin-orbitals and spatial orbitals, respectively.

If Ψ of (2.1) is an (anti-)symmetrized product of those orbitals, these density matrices (2.5) and (2.6) are diagonal with integer diagonal elements, restricted to 1 or 0 in the case of fermions. In general the eigenvalues of the single-particle density matrix of fermions are real numbers (because, as is easily seen, the density matrix is Hermitian) between 0 and 1. Furthermore, for any (anti-)symmetrized product Ψ of single-particle orbitals φi(x),

$$\gamma_1^2 = \int dx''\,\gamma_1(x;x'')\,\gamma_1(x'';x') = \sum_k \gamma_1|k\rangle\langle k|\gamma_1 = \gamma_1, \tag{2.7}$$

and this property is decisive for a state to be an (anti-)symmetrized product of single-particle orbitals. These latter statements are easily obtained with the help of the definition (2.1), yielding

$$\gamma_1(x;x') = \sum_{i=1}^{N} \phi_i(x)\,\phi_i^*(x') \tag{2.8}$$

in the considered case.

As an example, the density matrix in momentum representation

$$\langle \mathbf{k}|\gamma_1|\mathbf{k}'\rangle = \delta_{\mathbf{k}\mathbf{k}'}\,2\,\Theta(k_f - |\mathbf{k}|) \tag{2.9}$$

corresponding to the ground state of the Hamiltonian Hf of (1.41), the homogeneous interaction-free fermion gas, is considered. It is diagonal because the particles occupy k-eigenstates, one per spin for k-vectors inside the Fermi sphere. Fourier back-transformation yields it in spatial representation:

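$$\begin{aligned}
\gamma_1(\mathbf{r};\mathbf{r}') &= \frac{1}{V}\sum_{\mathbf{k}\mathbf{k}'} e^{i\mathbf{k}\cdot\mathbf{r}}\,\langle \mathbf{k}|\gamma_1|\mathbf{k}'\rangle\, e^{-i\mathbf{k}'\cdot\mathbf{r}'} \\
&= \frac{2}{8\pi^3}\int_{k<k_f} d^3k\; e^{i\mathbf{k}\cdot(\mathbf{r}-\mathbf{r}')} \\
&= \frac{k_f^3}{\pi^2}\,\frac{\sin(k_f|\mathbf{r}-\mathbf{r}'|) - (k_f|\mathbf{r}-\mathbf{r}'|)\cos(k_f|\mathbf{r}-\mathbf{r}'|)}{(k_f|\mathbf{r}-\mathbf{r}'|)^3}.
\end{aligned} \tag{2.10}$$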

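In agreement with (1.43) one finds for r′ → r

$$\gamma_1(\mathbf{r};\mathbf{r}) = \frac{k_f^3}{3\pi^2}, \tag{2.11}$$

the previous connection between the Fermi radius and the particle density.

In the language of the algebra of representations, the first line of (2.10) may be understood as $\gamma_1(\mathbf{r};\mathbf{r}') = \langle \mathbf{r}|\gamma_1|\mathbf{r}'\rangle = \sum_{\mathbf{k}\mathbf{k}'}\langle \mathbf{r}|\mathbf{k}\rangle\langle \mathbf{k}|\gamma_1|\mathbf{k}'\rangle\langle \mathbf{k}'|\mathbf{r}'\rangle$ on the basis of completeness of the k-states (1.27).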
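As a numerical cross-check of (2.10) and (2.11), the following Python sketch compares the closed form with a direct trapezoidal evaluation of the radial k-integral, and with the limit kf³/(3π²) at small distance. It assumes NumPy; the function names, the grid size and the value of kf are illustrative choices, not part of the text.

```python
import numpy as np

def gamma1_closed(R, kf):
    """Closed form (2.10) as a function of R = |r - r'| > 0."""
    x = kf * R
    return kf**3 / np.pi**2 * (np.sin(x) - x * np.cos(x)) / x**3

def gamma1_numeric(R, kf, nk=20001):
    """Angular-integrated second line of (2.10):
    (1/pi^2) * int_0^kf k^2 sin(kR)/(kR) dk, by the trapezoidal rule."""
    k = np.linspace(0.0, kf, nk)
    # np.sinc(t) = sin(pi t)/(pi t), so sinc(kR/pi) = sin(kR)/(kR), regular at k = 0
    f = k**2 * np.sinc(k * R / np.pi)
    dk = k[1] - k[0]
    return (np.sum(f) - 0.5 * (f[0] + f[-1])) * dk / np.pi**2

kf = 1.3                              # illustrative Fermi radius (natural units)
density = kf**3 / (3 * np.pi**2)      # right-hand side of (2.11)

for R in (0.5, 2.0, 7.0):
    print(R, gamma1_closed(R, kf), gamma1_numeric(R, kf))
print("R -> 0:", gamma1_closed(1e-3, kf), "expected:", density)
```

The two columns printed for each R should agree up to the quadrature error, and the last line illustrates the limit r′ → r.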


2.2 Two-Particle Density Matrices

The spin-dependent two-particle density matrix is defined as (some authors omit the factor 2! in the denominator; our definitions pursue the idea that $\gamma_N = \Psi_N\Psi_N^*$ for any closed N-particle system)

$$\gamma_2(x_1x_2;x_1'x_2') = \frac{N(N-1)}{2!}\int dx_3\ldots dx_N\;\Psi(x_1x_2x_3\ldots x_N)\,\Psi^*(x_1'x_2'x_3\ldots x_N). \tag{2.12}$$

It is related to the single-particle density matrix by the integral

$$\gamma_1(x;x') = \frac{2}{N-1}\int dx_2\;\gamma_2(xx_2;x'x_2). \tag{2.13}$$

The spin-independent relations are obtained accordingly by taking the trace over spin variables. Analogously to the single-particle case, the diagonal of the two-particle density matrix gives the pair density n2(x1, x2), i.e. the probability density to find one particle at x1 and another at x2. This is the probability density to measure one of the particle coordinates at x1 and a second one at x2, being just N(N−1) times the probability density that the original coordinates x1 and x2 of the wavefunction are measured at those points:

$$n_2(x_1,x_2) = 2\,\gamma_2(x_1x_2;x_1x_2). \tag{2.14}$$

Matrices with respect to single-particle orbital sets like (2.5, 2.6) may be defined in an analogous manner.

Consider as an example the two-particle density matrix of the determinant state (1.15) of non-interacting fermions with the first N spin-orbitals occupied (li = i). It is explicitly given by

$$\begin{aligned}
\gamma_2(x_1x_2;x_1'x_2') &= \frac{N(N-1)}{2}\int dx_3\ldots dx_N\;\frac{1}{N!}\sum_{\mathcal{P}}(-1)^{\mathcal{P}}\Big[\prod_i \phi_{\mathcal{P}i}(x_i)\Big]\sum_{\mathcal{P}'}(-1)^{\mathcal{P}'}\Big[\prod_i \phi^*_{\mathcal{P}'i}(x_i')\Big] \\
&= \frac{N(N-1)}{2\,N!}\sum_{ij=1}^{N}\Big[\phi_i(x_1)\phi_j(x_2)\phi_i^*(x_1')\phi_j^*(x_2') - \phi_i(x_1)\phi_j(x_2)\phi_j^*(x_1')\phi_i^*(x_2')\Big](N-2)! \\
&= \frac{1}{2}\sum_{ij=1}^{N}\Big[\phi_i(x_1)\phi_j(x_2)\phi_i^*(x_1')\phi_j^*(x_2') - \phi_i(x_1)\phi_j(x_2)\phi_j^*(x_1')\phi_i^*(x_2')\Big].
\end{aligned} \tag{2.15}$$

This follows from just a somewhat simplified variant of the algebra of (1.39, 1.40). In the first expression the two determinants of spin-orbitals are expanded. P means a permutation of the subscripts 12 . . . N into P1 P2 . . . PN, with the indicated sign factor being ±1 for even and odd orders of permutations, respectively. For i > 2, x′i = xi. Due to the orthonormality of the spin-orbitals the integral is equal to unity if and only if Pi = P′i for i = 3 . . . N ((N−2)! cases). In each of those cases either P1 = P′1, P2 = P′2 or P1 = P′2, P2 = P′1, with P and P′ having the same order in the first possibility and differing by one order in the second. Summation over all P1 = i, P2 = j yields the final result. (Although, of course, P1 ≠ P2, retaining the i = j terms in the double sum is harmless, because the corresponding square bracket expressions are zero. See the corresponding discussion after (1.54).)

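Suppose now that N/2 spatial orbitals are occupied for both spin directions. From the last expression it immediately follows that

$$2\gamma_2(\mathbf{r}_1\mathbf{r}_2;\mathbf{r}_1'\mathbf{r}_2') = \sum_{ij=1}^{N/2}\Big[4\,\phi_i(\mathbf{r}_1)\phi_j(\mathbf{r}_2)\phi_i^*(\mathbf{r}_1')\phi_j^*(\mathbf{r}_2') - 2\,\phi_i(\mathbf{r}_1)\phi_j(\mathbf{r}_2)\phi_j^*(\mathbf{r}_1')\phi_i^*(\mathbf{r}_2')\Big]. \tag{2.16}$$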

According to the rules, summation was taken over s′1 = s1 and s′2 = s2, giving a factor 4 in the first term, but only a factor 2 in the second because this is nonzero only if the spins of φi(x) and φj(x) were equal. Considering

$$2\sum_{i=1}^{N/2}\phi_i(\mathbf{r})\phi_i^*(\mathbf{r}) = n(\mathbf{r}) \tag{2.17}$$

and a spin-summed variant of (2.14) yields

$$n_2(\mathbf{r}_1,\mathbf{r}_2) = n(\mathbf{r}_1)\,n(\mathbf{r}_2) - \frac{1}{2}\,|\gamma_1(\mathbf{r}_1;\mathbf{r}_2)|^2. \tag{2.18}$$

As is seen, even for non-interacting fermions the pair density does not reduce to the product of single-particle densities as it would be for non-correlated particles. The correlation expressed by the last term of (2.18) is called exchange.²
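The step from (2.16) to (2.18) can be checked numerically. The sketch below is only an illustration under stated assumptions: space is replaced by a discrete set of M points with unit integration weight, the doubly occupied orbitals are random orthonormal vectors, and all variable names are hypothetical. It builds the pair density once from the orbital sums of (2.16) and once from the right-hand side of (2.18).

```python
import numpy as np

rng = np.random.default_rng(0)
M, n_occ = 40, 3              # grid points; doubly occupied spatial orbitals
N = 2 * n_occ                 # particle number

# Orthonormal columns from a QR factorisation of a random complex matrix:
# phi[r, i] plays the role of phi_i(r), with sum_r conj(phi_i) phi_j = delta_ij.
A = rng.normal(size=(M, n_occ)) + 1j * rng.normal(size=(M, n_occ))
phi, _ = np.linalg.qr(A)

gamma1 = 2.0 * phi @ phi.conj().T     # spin-summed gamma_1(r; r')
n = np.real(np.diag(gamma1))          # density, cf. (2.17)

# Diagonal of (2.16): n2(r1, r2) = 2 gamma_2(r1 r2; r1 r2)
direct = 4.0 * np.einsum("ai,bj,ai,bj->ab", phi, phi, phi.conj(), phi.conj())
exchange = 2.0 * np.einsum("ai,bj,aj,bi->ab", phi, phi, phi.conj(), phi.conj())
n2_orbitals = np.real(direct - exchange)

# Right-hand side of (2.18)
n2_formula = np.outer(n, n) - 0.5 * np.abs(gamma1) ** 2

print("max deviation between (2.16) and (2.18):",
      np.max(np.abs(n2_orbitals - n2_formula)))
print("sum of n2:", n2_formula.sum(), " N(N-1):", N * (N - 1))
```

On this discrete model the pair density also sums to N(N−1), as expected from the normalization in (2.12) and (2.14).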

Two different pair correlation functions are introduced in the general case:

$$g(\mathbf{r}_1,\mathbf{r}_2) = \frac{n_2(\mathbf{r}_1,\mathbf{r}_2)}{n(\mathbf{r}_1)\,n(\mathbf{r}_2)}, \qquad h(\mathbf{r}_1,\mathbf{r}_2) = n_2(\mathbf{r}_1,\mathbf{r}_2) - n(\mathbf{r}_1)\,n(\mathbf{r}_2). \tag{2.19}$$

For large spatial distances |r1 − r2|, g usually tends to unity and h to zero. For non-interacting fermions, h is just given by the last term of (2.18).

For the homogeneous non-interacting fermion gas, h is given by minus half the square of (2.10), and

$$g(r) = 1 - \frac{9}{2}\left[\frac{\sin k_f r - k_f r\cos k_f r}{(k_f r)^3}\right]^2 \tag{2.20}$$

with r = |r1 − r2|, g(0) = 1/2. An exchange hole around a given fermion is dug out of the distribution of all the other fermions, half of the average density in depth and oscillating with the wavelength π/kf at large distances (cf. Fig. 2). Its depth has to do with only the particles of equal spin direction taking part in the exchange.

² In the probability-theoretical sense of particle distributions, exchange in the pair density is of course a correlation. It is a particular type of correlation which has its origin solely in quantum kinematics (symmetry of the many-particle wavefunction), therefore it appears even in non-interacting systems. Of course, it does not change the energy of a system of non-interacting particles, as this energy depends only on the single-particle density matrix (cf. Section 2.5). In the context of many-particle physics, however, the word ‘correlation’ is used in a narrower meaning and is reserved for particle correlation due to interaction and beyond exchange. For interacting systems, both exchange and correlation contribute to the energy.

Figure 2: Pair correlation function g(r) of the homogeneous non-interacting electron gas, plotted against kf r (one panel for 0 ≤ kf r ≤ 4 with g between 0 and 1, a second panel for 5 ≤ kf r ≤ 15 with g between 0.995 and 1.005). In this case, h(r) = n² [g(r) − 1]. If the volume per electron is represented by a ball of radius rs, i.e. 4πrs³/3 = n⁻¹, then, from (1.43), kf rs = (9π/4)^{1/3} = 1.919.

2.3 Density Operators

The particle density may be represented as the expectation value of a particle density operator, which, for an N-particle system and with the spin dependence retained, is formally defined as

$$\hat{n}(x) = \sum_{i=1}^{N}\delta(\mathbf{r}-\hat{\mathbf{r}}_i)\,\delta_{s\hat{\sigma}_i}. \tag{2.21}$$

Here, $\hat{\mathbf{r}}_i$ is the position operator and $\hat{\sigma}_i$ is the spin operator of the i-th particle. In the Schrödinger representation, $\hat{\mathbf{r}}_i$ reduces simply to the position vector ri. As $\int d^3r\,\delta(\mathbf{r}-\mathbf{r}_i) = 1$ and $\sum_s \delta_{s\sigma_i} = 1$, one has

$$\int dx\;\hat{n}(x) = N. \tag{2.22}$$

To be precise, 1 and N, respectively, have to be understood as the real number multiplied with the identity operator in the corresponding representation space. The spin-dependent number density in the many-body quantum state Ψ is

$$n(x) = \langle\Psi|\hat{n}(x)|\Psi\rangle = \langle\hat{n}(x)\rangle. \tag{2.23}$$

The equivalence of this expression with the definition of n(x) = nss(r) as the diagonal of (2.4) is easily seen from (2.21) and the (anti-)symmetry of the wavefunction Ψ with respect to an interchange of particle coordinates and spin variables. Both definitions of n(x) yield immediately

$$\int dx\;n(x) = N, \tag{2.24}$$

and hence

$$\int dx\,\big(\hat{n}(x) - n(x)\big) = 0, \tag{2.25}$$

i.e., not only the quantum average $\langle\hat{n}(x) - n(x)\rangle$ of density fluctuations is zero but also the spatial average.

Summing $\hat{n}(x)$ over the spin variable s, one obtains a spin-independent particle density operator $\hat{n}(\mathbf{r})$ with the relations

$$\hat{n}(\mathbf{r}) = \sum_{i=1}^{N}\delta(\mathbf{r}-\hat{\mathbf{r}}_i), \qquad \int d^3r\;\hat{n}(\mathbf{r}) = N, \tag{2.26}$$

$$n(\mathbf{r}) = \langle\Psi|\hat{n}(\mathbf{r})|\Psi\rangle = \langle\hat{n}(\mathbf{r})\rangle, \tag{2.27}$$

$$\int d^3r\;n(\mathbf{r}) = N, \qquad \int d^3r\,\big(\hat{n}(\mathbf{r}) - n(\mathbf{r})\big) = 0 \tag{2.28}$$

in analogy to (2.21-2.25).

The same operators (2.21, 2.26) in a Fock-space representation may be expressed as

$$\hat{n}(x) = \hat{\psi}^\dagger(x)\hat{\psi}(x), \qquad \hat{n}(\mathbf{r}) = \sum_s \hat{\psi}^\dagger(x)\hat{\psi}(x) \tag{2.29}$$

through field operators (cf. (1.78)).

2.4 Expectation Values and Density Matrices

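We start these considerations with the Schrödinger representation. Given some single-particle operator

$$T_1 = \sum_{i=1}^{N} t_1(\mathbf{r}_i), \tag{2.30}$$

its expectation value in the many-body state Ψ is

$$\begin{aligned}
\langle T_1\rangle &= \int dx_1\ldots dx_N\;\Psi^*(x_1\ldots x_N)\sum_{i=1}^{N} t_1(\mathbf{r}_i)\,\Psi(x_1\ldots x_N) \\
&= N\int dx\,\Big[t_1(\mathbf{r})\int dx_2\ldots dx_N\;\Psi(xx_2\ldots x_N)\,\Psi^*(x'x_2\ldots x_N)\Big]_{x'=x} \\
&= \int d^3r\,\big[t_1(\mathbf{r})\,\gamma_1(\mathbf{r};\mathbf{r}')\big]_{\mathbf{r}'=\mathbf{r}} = \operatorname{tr}(t_1\gamma_1).
\end{aligned} \tag{2.31}$$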

In the second line, the symmetry of ΨΨ∗ with respect to an interchange of particle variables xi and x1 was used. After summing over s = s′, from which t1 is assumed independent, the spin-independent single-particle density matrix (2.2) appears. To ensure that t1 only operates on the variable r coming from Ψ, r′ is to be put equal to r at the very end. Hence, the knowledge of this single-particle density matrix suffices to calculate ⟨T1⟩. The kinetic energy, for instance, is obtained as

$$\langle T\rangle = -\frac{1}{2}\int d^3r\,\big[\nabla^2\gamma_1(\mathbf{r};\mathbf{r}')\big]_{\mathbf{r}'=\mathbf{r}}. \tag{2.32}$$
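As an illustration of (2.32), the following sketch evaluates the integrand for the homogeneous fermion gas, using the closed form (2.10) of γ1. It assumes NumPy; the finite-difference step and variable names are illustrative. Since γ1 depends on R = |r − r′| only and is even in R, its Laplacian at R = 0 is 3f″(0), approximated here by 6 (f(h) − f(0))/h². The reference value kf⁵/(10π²) is the direct k-space integral (2/8π³)∫ d³k k²/2 over the Fermi sphere.

```python
import numpy as np

def gamma1(R, kf):
    """Closed form (2.10) as a function of R = |r - r'| > 0."""
    x = kf * R
    return kf**3 / np.pi**2 * (np.sin(x) - x * np.cos(x)) / x**3

kf = 1.0
n = kf**3 / (3 * np.pi**2)            # gamma1 at R = 0, cf. (2.11)
h = 0.05 / kf                         # small finite-difference step

# Laplacian of a radial, even function at the origin: 3 f''(0) ~ 6 (f(h) - f(0)) / h^2
laplacian_at_0 = 6.0 * (gamma1(h, kf) - n) / h**2
t_density = -0.5 * laplacian_at_0     # integrand of (2.32), kinetic energy per volume

t_expected = kf**5 / (10 * np.pi**2)  # (2 / 8 pi^3) * int_{k<kf} d^3k k^2 / 2
print(t_density, t_expected, t_density / n)
```

The last printed number is the kinetic energy per particle, which should be close to (3/10) kf² in natural units.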


Similarly, expectation values of a spin operator are expressed by means of the spin-dependent single-particle density matrix. For instance, from (1.10), a vector spin operator $\hat{\boldsymbol{\sigma}} = \sum_\alpha \mathbf{e}_\alpha\hat{\sigma}_\alpha$ may be defined with Cartesian coordinate unit vectors eα. (It is this construct which connects SU(2), the transformation group of the two-dimensional unitary space of spinors, with SO(3), the transformation group of the orthogonal space of three-dimensional Euclidean vectors.) We define the operator of the vector spin density in Schrödinger representation as

$$\hat{\mathbf{S}}(\mathbf{r}) = \sum_{i=1}^{N}\hat{\boldsymbol{\sigma}}_i\,\delta(\mathbf{r}-\mathbf{r}_i). \tag{2.33}$$

Twice its expectation value, which is equal to the spin magnetization density in units of µBohr (the Bohr magneton; the factor two is the gyromagnetic factor), is obtained from the spin-density matrix (2.4):

$$\mathbf{m}(\mathbf{r}) \overset{\mathrm{def}}{=} 2\langle\hat{\mathbf{S}}(\mathbf{r})\rangle = 2\sum_{ss'}\boldsymbol{\sigma}_{s's}\,n_{ss'}(\mathbf{r}) = 2\operatorname{tr}_s\big(\boldsymbol{\sigma}\,n(\mathbf{r})\big). \tag{2.34}$$

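With the help of (1.10), the components of m are expressed through the elements of the Hermitian spin-density matrix

$$n_{ss'}(\mathbf{r}) = \begin{pmatrix} n_{++}(\mathbf{r}) & n_{+-}(\mathbf{r}) \\ n_{-+}(\mathbf{r}) & n_{--}(\mathbf{r}) \end{pmatrix} \tag{2.35}$$

as

$$m_x(\mathbf{r}) = 2\operatorname{Re} n_{-+}(\mathbf{r}), \qquad m_y(\mathbf{r}) = 2\operatorname{Im} n_{-+}(\mathbf{r}), \qquad m_z(\mathbf{r}) = n_{++}(\mathbf{r}) - n_{--}(\mathbf{r}). \tag{2.36}$$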

(Formally, the spin-density matrix is a symmetric second rank spinor; every symmetric second rank spinor can be related to a vector as every symmetric even rank spinor can be related to a symmetric irreducible tensor.) It is easy to verify that $\mathbf{m}^2 = (\operatorname{tr} n_{ss'})^2 - 4\det n_{ss'}$. The trace (being equal to the particle density n) and the determinant of the spin-density matrix are the two invariants with respect to spatial rotations. Likewise, two independent invariants are n and |m|. The degree of spin polarization ζ and its direction em are defined as

$$\zeta \overset{\mathrm{def}}{=} \frac{|\mathbf{m}|}{n}, \qquad 0 \le \zeta \le 1, \qquad \mathbf{e}_m \overset{\mathrm{def}}{=} \frac{\mathbf{m}}{|\mathbf{m}|}. \tag{2.37}$$

Particle density n(r) and degree of spin polarization ζ(r) form another pair of invariants of the spin-density matrix. Together with the magnetization direction em(r), which contains two further independent real functions of r, they comprise the four independent real functions of r contained in the Hermitian spin-density matrix (2.35). Sometimes it makes sense to fix only an undirected magnetization axis, and to let ζ vary between -1 and 1.
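The relations (2.34) to (2.37) can be tried out at a single point r with a few lines of Python. This is a sketch under assumptions: a random positive semidefinite Hermitian 2×2 matrix stands in for the spin-density matrix, and the spin operator of (1.10) is taken to be one half of the Pauli matrices (the text does not reproduce (1.10) here, so this form is assumed). Variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for n_{ss'}(r) at one point r; rows and columns ordered as (+, -).
a = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
n_mat = a @ a.conj().T                 # Hermitian and positive semidefinite

# Pauli matrices; the spin operator is assumed to be sigma = pauli / 2.
pauli = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]])

# (2.34): m = 2 tr_s(sigma n) = tr_s(pauli n)
m = np.array([np.trace(p @ n_mat).real for p in pauli])
n = np.trace(n_mat).real               # particle density
invariant = n**2 - 4.0 * np.linalg.det(n_mat).real   # (tr n)^2 - 4 det n

zeta = np.linalg.norm(m) / n           # degree of spin polarization, (2.37)
e_m = m / np.linalg.norm(m)            # magnetization direction, (2.37)

print("m =", m)
print("m^2 =", m @ m, " (tr n)^2 - 4 det n =", invariant)
print("zeta =", zeta, " e_m =", e_m)
```

The components of m should coincide with (2.36), and ζ stays within [0, 1] because the stand-in matrix has non-negative eigenvalues.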

A simpler situation appears if the spin quantization axis (magnetization axis) is fixed in the whole space (collinear spin situation). In this case, given some single-particle operator

$$B_1 = \sum_{i=1}^{N} B_1(x_i), \tag{2.38}$$

which is merely an ordinary function of the particle variables xi, its expectation value is readily obtained from the number density n(x) as

$$\langle B_1\rangle = \int dx\;B_1(x)\,n(x). \tag{2.39}$$

An important example is

$$B_1(x) = v(\mathbf{r}) - 2sB(\mathbf{r}) \tag{2.40}$$

with an external potential v and an external magnetic field B in z-direction coupled to the spin only (z being the quantization axis of the spin).

A prototype of a two-particle operator is a pair-potential operator, which in the case of a spin-independent interaction, the Coulomb interaction say, has the form

$$A_2 = \frac{1}{2}\sum_{i\neq j} A_2(\mathbf{r}_i,\mathbf{r}_j). \tag{2.41}$$

(In the Coulomb case A2(ri, rj) = 1/|ri − rj|.) In analogy to (2.31) one finds

$$\langle A_2\rangle = \int d^3r\,d^3r'\;A_2(\mathbf{r},\mathbf{r}')\,\gamma_2(\mathbf{r},\mathbf{r}';\mathbf{r},\mathbf{r}') = \frac{1}{2}\int d^3r\,d^3r'\;A_2(\mathbf{r},\mathbf{r}')\,n_2(\mathbf{r},\mathbf{r}') \tag{2.42}$$

which expresses the expectation value of a two-particle operator by means of the two-particle density matrix γ2 (more specifically its diagonal part, in correspondence with the expectation value (2.39) being expressed through the density, i.e. the diagonal part of the single-particle density matrix). The factor 1/2 does not appear in the first expression because it was already absorbed into the definition (2.12) of the two-particle density matrix.

An alternative way of deriving (2.39) uses the particle density operator (2.21):

$$\begin{aligned}
\langle B_1\rangle &= \Big\langle \int dx\;B_1(x)\sum_{i=1}^{N}\delta(\mathbf{r}-\mathbf{r}_i)\,\delta_{ss_i}\Big\rangle \\
&= \int dx\;B_1(x)\,\Big\langle\sum_{i=1}^{N}\delta(\mathbf{r}-\mathbf{r}_i)\,\delta_{ss_i}\Big\rangle \\
&= \int dx\;B_1(x)\,n(x).
\end{aligned} \tag{2.43}$$

The first equality just rewrites the operator (2.38) with the help of an integral over a sum of δ-functions. Taking out of the brackets terms not depending on the particle variables ri, si leaves inside the brackets just the particle density operator (in the Schrödinger representation used for (2.38)), the expectation value of which is the particle density.

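The expectation value (2.42) may also be expressed via the particle density operator (2.21). This needs only an additional little trick consisting in adding to and subtracting from the double sum of (2.41) ‘self-interaction’ terms:

$$\begin{aligned}
\langle A_2\rangle &= \Big\langle \frac{1}{2}\int d^3r\,d^3r'\;A_2(\mathbf{r},\mathbf{r}')\Big[\sum_{ij}\delta(\mathbf{r}-\mathbf{r}_i)\,\delta(\mathbf{r}'-\mathbf{r}_j) - \sum_{i}\delta(\mathbf{r}-\mathbf{r}_i)\,\delta(\mathbf{r}'-\mathbf{r}_i)\Big]\Big\rangle \\
&= \frac{1}{2}\int d^3r\,d^3r'\;A_2(\mathbf{r},\mathbf{r}')\,\Big\langle\sum_{ij}\delta(\mathbf{r}-\mathbf{r}_i)\,\delta(\mathbf{r}'-\mathbf{r}_j) - \sum_{i}\delta(\mathbf{r}-\mathbf{r}_i)\,\delta(\mathbf{r}'-\mathbf{r})\Big\rangle \\
&= \frac{1}{2}\int d^3r\,d^3r'\;A_2(\mathbf{r},\mathbf{r}')\,\big[\langle\hat{n}(\mathbf{r})\,\hat{n}(\mathbf{r}')\rangle - n(\mathbf{r})\,\delta(\mathbf{r}-\mathbf{r}')\big].
\end{aligned} \tag{2.44}$$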

(Those usually infinite ‘self-interaction terms’ of a point particle ‘on place’ are not to be confused with the finite self-interaction terms of a particle in a spatial orbital appearing in (1.54) of the Hartree-Fock theory.) In the second sum of the second line, the argument of the second δ-function was changed in accordance with the argument of the first one. After this change, the second δ-function can be taken out of the brackets. Note that for a singular interaction potential such as the Coulomb one, the separate r-integral over the last item containing the δ-function would be infinite. This shows that the splitting of the expression for n2(r, r′) in square brackets into two items is formal: the expectation value of the first item must contain the same δ-function contribution canceling the second, since the Coulomb interaction energy of an N-electron quantum system is finite. The formal splitting allows, however, an explicit expression of ⟨A2⟩ in terms of the particle density operator only. (Of course it cannot be expressed in terms of the particle density n(r) only.) Integration of the expression in square brackets over r and r′ yields with the help of (2.26)

$$\begin{aligned}
\int d^3r\,d^3r'\;n_2(\mathbf{r},\mathbf{r}') &= \int d^3r\,d^3r'\,\big[\langle\hat{n}(\mathbf{r})\,\hat{n}(\mathbf{r}')\rangle - n(\mathbf{r})\,\delta(\mathbf{r}-\mathbf{r}')\big] \\
&= \int d^3r\,\big[\langle\hat{n}(\mathbf{r})\,N\rangle - n(\mathbf{r})\big] \\
&= \int d^3r\;n(\mathbf{r})\,(N-1) = N(N-1)
\end{aligned} \tag{2.45}$$

in accordance with (2.14) and (2.12).

Since the density operators (2.21) and (2.26) were defined in a representation-independent way, the result (2.44) likewise holds in every representation.

2.5 The Exchange and Correlation Hole

The results of the last section show that knowledge of γ1 (comprising the knowledge of n as its diagonal) and of n2 suffices to calculate the total energy of the system as the expectation value of the Hamiltonian (1.24). Considering additionally (2.19) yields

$$\begin{aligned}
E = \langle H\rangle ={}& -\frac{1}{2}\int d^3r\,\big[\nabla^2\gamma_1(\mathbf{r};\mathbf{r}')\big]_{\mathbf{r}'=\mathbf{r}} + \int dx\;v(x)\,n(x) \\
& + \frac{1}{2}\int d^3r\,d^3r'\;n(\mathbf{r}')\,w(|\mathbf{r}'-\mathbf{r}|)\,n(\mathbf{r}) + \frac{1}{2}\int d^3r\,d^3r'\;w(|\mathbf{r}'-\mathbf{r}|)\,h(\mathbf{r}',\mathbf{r}) \\
={}& E_{\mathrm{kin}} + E_{\mathrm{pot}} + E_{\mathrm{H}} + W_{\mathrm{XC}}.
\end{aligned} \tag{2.46}$$

In a natural way the energy is decomposed into the kinetic energy Ekin, the interaction energy Epot with the external potential v(x), the so-called Hartree energy EH being the classical interaction energy of a density n(r) with itself, and an exchange and correlation energy WXC, which may be visualized as appearing from the interaction energy ωXC(r) of a particle at r with an exchange and correlation hole (as used in Quantum Chemistry, cf. however (2.68))

$$h_{\mathrm{XC}}(\mathbf{r}',\mathbf{r}) = h(\mathbf{r}',\mathbf{r})/n(\mathbf{r}) \tag{2.47}$$

surrounding the particle:

$$\omega_{\mathrm{XC}}(\mathbf{r}) = \frac{1}{2}\int d^3r'\;w(|\mathbf{r}'-\mathbf{r}|)\,h_{\mathrm{XC}}(\mathbf{r}',\mathbf{r}). \tag{2.48}$$

The exchange and correlation energy

$$W_{\mathrm{XC}} = \int d^3r\;n(\mathbf{r})\,\omega_{\mathrm{XC}}(\mathbf{r}) \tag{2.49}$$

is obtained by summing ωXC(r) over all particles, i.e. integrating n(r)ωXC(r) over the r-space. The factor 1/2 in ωXC appears because each particle is considered twice: once as the particle interacting with the exchange and correlation hole and once as taking part in the composition of this hole.

The number of particles missing in the exchange and correlation hole (2.47) can easily be obtained from (2.14) and (2.13) implying

$$\int d^3r'\;n_2(\mathbf{r}',\mathbf{r}) = (N-1)\,n(\mathbf{r}), \tag{2.50}$$

which immediately leads to the general sum rule (cf. (2.19))

$$\int d^3r'\;h_{\mathrm{XC}}(\mathbf{r}',\mathbf{r}) = \int d^3r'\left[\frac{n_2(\mathbf{r}',\mathbf{r})}{n(\mathbf{r})} - n(\mathbf{r}')\right] = -1, \tag{2.51}$$

that is, exactly one particle is missing in any exchange and correlation hole, independent of the system and of the interaction, and independent of the position r of the particle which sees the hole.

In the case of an uncorrelated N-fermion wavefunction (1.15) only an exchange hole appears. From (2.16–2.18) it is obtained as

$$h_{\mathrm{X}}(\mathbf{r}',\mathbf{r}) = -\frac{\sum_{ij=1}^{N/2}\phi_i(\mathbf{r}')\phi_j(\mathbf{r})\phi_j^*(\mathbf{r}')\phi_i^*(\mathbf{r})}{\sum_{i=1}^{N/2}\phi_i(\mathbf{r})\phi_i^*(\mathbf{r})}. \tag{2.52}$$

For the homogeneous non-interacting fermion gas one finds from (2.19, 2.20) and (1.43)

$$h_{\mathrm{X}}(r) = -\frac{3k_f^3}{2\pi^2}\left[\frac{\sin k_f r - k_f r\cos k_f r}{(k_f r)^3}\right]^2, \qquad h_{\mathrm{X}}(0) = -\frac{k_f^3}{6\pi^2} = -\frac{n}{2} \tag{2.53}$$

for the exchange hole (cf. the end of Section 2.2). Of course, in this hole also just one particle is missing, because the result (2.51) was obtained on completely general grounds. The additional amount of density, by particle repulsion expelled from the region of small r-values as compared to the pure exchange hole of Fig. 2, is piled up at larger r-values, where in case of particle repulsion g(r) may exceed unity. Compared to Fig. 2, the pair correlation function of an electron liquid with Coulomb interaction at a typical valence density starts out at r = 0 at a value well below 1/2 and approaches unity at kf r well before 4, then slightly overshooting. The relevance of this issue will be discussed in Section 7.3.
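The sum rule (2.51) can be checked numerically for the exchange hole (2.53) of the homogeneous gas. In the dimensionless variable x = kf r, and with n = kf³/(3π²), the integral of hX over space becomes (4/(3π)) ∫ x² [g(x) − 1] dx. The sketch below assumes NumPy and a finite cut-off of the slowly converging integral (the neglected tail decays like 1/x_max), so only approximate agreement with −1 is to be expected; the grid parameters are illustrative.

```python
import numpy as np

x = np.linspace(0.0, 2000.0, 200001)       # x = kf * r, step 0.01
j = np.empty_like(x)
j[0] = 1.0 / 3.0                           # limit of (sin x - x cos x) / x^3 at x = 0
j[1:] = (np.sin(x[1:]) - x[1:] * np.cos(x[1:])) / x[1:]**3

g = 1.0 - 4.5 * j**2                       # pair correlation function (2.20)
hole_over_n = g - 1.0                      # hX(r) / n, cf. (2.53)

# int d^3r hX(r) = n * (4 pi / kf^3) * int x^2 (hX / n) dx, with n = kf^3 / (3 pi^2)
integrand = 4.0 / (3.0 * np.pi) * x**2 * hole_over_n
dx = x[1] - x[0]
missing = (np.sum(integrand) - 0.5 * (integrand[0] + integrand[-1])) * dx  # trapezoid

print("g(0) =", g[0])
print("particles in the exchange hole:", missing)
```

Increasing the cut-off moves the second number closer to −1, in line with the statement that exactly one particle is missing.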

The physical idea behind the notion of an exchange and correlation hole is that due to Pauli’s exclusion principle electrons of parallel spin cannot come arbitrarily close to each other in space, and that the repulsive interaction leads to a further reduction of the pair density for small distances of the particles in the pair. In fact (2.53) results from the first of those two mechanisms. (The exchange and correlation hole hXC of an interacting homogeneous electron liquid at metallic densities is close to −n for r = 0.) Formally however, the exchange and correlation hole is defined by (2.47) on the basis of (2.46–2.49), and those expressions make no reference to the actual total particle number N, or to the actual particle number contributing to the particle density in a particular spatial region. This gives rise to situations where the above physical idea becomes irrelevant, and the exchange and correlation hole takes on a different, more formal meaning. (A detailed discussion may be found in [Perdew and Zunger, 1981].) The extreme case is a single particle bound in an external potential well. In this case, of course,

$$E_{\mathrm{H}} + W_{\mathrm{XC}} = 0, \tag{2.54}$$

since there is no second particle to interact with. Nevertheless there is a nonzero particle density n(r) = φ∗(r)φ(r) and hence, formally, a Hartree energy which must exactly be compensated by the WXC-term. This is achieved by putting

$$h_{\mathrm{XC}}(\mathbf{r}',\mathbf{r}) = -n(\mathbf{r}') \tag{2.55}$$

in this case. (Note that hXC is by definition not symmetric in its arguments. It describes a hole distribution over vectors r′ around a given position r.) The above result can be brought in accordance with the relations of Section 2.2 by putting

$$\gamma_2 \overset{\mathrm{def}}{=} 0 \quad \text{for } N < 2 \tag{2.56}$$

and hence h(r′, r) = −n(r′)n(r) in this case. The sum rule (2.51) is obviously obeyed also by this pathological case N = 1. The ‘exchange and correlation hole’ of that single particle is independent of the particle’s position r, and is just equal to minus its own density.

Consider next an H2 molecule in its singlet ground state. This ground state is reasonably well described by a Slater determinant in which both electrons occupy the lowest molecular orbital φ(r). From (2.52) we find in this case

$$h_{\mathrm{XC}}(\mathbf{r}',\mathbf{r}) = -\phi(\mathbf{r}')\phi^*(\mathbf{r}') = -\frac{1}{2}\,n(\mathbf{r}'), \tag{2.57}$$

again in agreement with the sum rule. This result is easily understood. No exchange repulsion takes place, because the two electrons have opposite spin. Correlation due to interaction was neglected by the determinant ansatz. As −hXC equals the density of one of the electrons, n + hXC is the density of the second one. Thus, with the help of this hXC, WXC just subtracts the self-interaction of one electron in its orbital from the Hartree energy, containing by definition those self-interaction contributions. Again, in the considered approximation of the ground state, the ‘exchange and correlation hole’ is independent of the position r of the particle it surrounds. It provides again only a correction term and not really a correlation term.

Let now the H2 molecule slowly dissociate. As is well known, for larger distances between the two protons the Slater determinant of molecular orbitals loses its value as an approximation for the electronic ground state; rather, the latter is now well approximated by a Heitler-London ansatz

$$\Psi(\mathbf{r}'\mathbf{r}) = \frac{1}{\sqrt{2}}\big(\phi_A(\mathbf{r}')\phi_B(\mathbf{r}) + \phi_B(\mathbf{r}')\phi_A(\mathbf{r})\big), \tag{2.58}$$

where φA and φB are now atomic orbitals around the centers A and B of the two protons. The two-particle density matrix is just the product of this two-particle wavefunction with its complex conjugate. The exchange and correlation hole can be written down in a straightforward way from this wavefunction although it is rather involved. For large distances, however, for which the two orbitals no longer overlap, φA(r)φB(r) = 0, it simplifies to

$$h_{\mathrm{XC}}(\mathbf{r}',\mathbf{r}) = -\frac{\phi_A(\mathbf{r}')\phi_A^*(\mathbf{r}')\,\phi_A(\mathbf{r})\phi_A^*(\mathbf{r}) + \phi_B(\mathbf{r}')\phi_B^*(\mathbf{r}')\,\phi_B(\mathbf{r})\phi_B^*(\mathbf{r})}{\phi_A(\mathbf{r})\phi_A^*(\mathbf{r}) + \phi_B(\mathbf{r})\phi_B^*(\mathbf{r})}. \tag{2.59}$$

If now both position vectors r′ and r are close to A, then only the first terms of the numerator and denominator are nonzero, and we obtain a result around A similar to (2.55), likewise for B. If one of the positions is close to A and the other close to B, the expression (2.59) vanishes. Given a position vector r close to one atomic site, the exchange and correlation hole extends only over that site. The Heitler-London ansatz contains only configurations with one electron at each atomic site. Hence there is only locally on each site a self-interaction correction needed to correct for the Hartree energy.
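The localization of the hole in the Heitler-London limit can be made concrete with a small numerical model. The sketch below is an illustration under assumptions that are not in the text: one spatial dimension, Gaussian stand-ins for the atomic orbitals, and a uniform grid. It builds the pair density from the spatial wavefunction (2.58) with N = 2, forms hXC from (2.19) and (2.47), and integrates the hole separately over the two sites for a reference position r at site A.

```python
import numpy as np

x = np.linspace(-12.0, 12.0, 481)
dx = x[1] - x[0]

def orbital(center):
    """Normalised 1D Gaussian standing in for an atomic orbital (illustrative)."""
    phi = np.exp(-0.5 * (x - center) ** 2)
    return phi / np.sqrt(np.sum(phi**2) * dx)

phi_a, phi_b = orbital(-6.0), orbital(+6.0)    # centers A and B, negligible overlap

# Spatial part of (2.58): psi[p, q] = Psi(r' = x[p], r = x[q])
psi = (np.outer(phi_a, phi_b) + np.outer(phi_b, phi_a)) / np.sqrt(2.0)
psi /= np.sqrt(np.sum(psi**2) * dx * dx)       # renormalise on the grid

n2 = 2.0 * psi**2                    # N(N-1)|Psi|^2 with N = 2, cf. (2.12), (2.14)
n = np.sum(n2, axis=0) * dx          # (2.50) with N - 1 = 1
h = n2 - np.outer(n, n)              # (2.19): h(r', r)
mask = n > 1e-8                      # skip positions r with vanishing density
h_xc = h[:, mask] / n[mask]          # (2.47): columns label the position r

q = np.argmin(np.abs(x[mask] + 6.0))             # reference position r at site A
hole_on_a = np.sum(h_xc[x < 0.0, q]) * dx        # hole weight around A
hole_on_b = np.sum(h_xc[x > 0.0, q]) * dx        # hole weight around B
print("hole around A:", hole_on_a, " hole around B:", hole_on_b)
```

For a reference electron at site A, the whole hole of weight −1 should sit at A and essentially none at B, which is the local self-interaction correction described above.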

As we have seen, for few-particle systems a large part if not all of the exchange and correlation hole has formally to be introduced in order to self-interaction correct the Hartree term. For large systems of essentially delocalized particles the self-interaction correction of the Hartree term tends to zero, and the exchange and correlation hole takes on its physically intuitive meaning discussed above. However, even in large systems, if the particles remain essentially in localized orbitals as in a Heitler-London situation, self-interaction corrections keep up at a non-zero level.


2.6 The Adiabatic Principle

This section introduces an important tool for the formal development of many-body theory, called integration over the coupling constant, which yields a basis for many proofs and representations. To this end, in the present section we will write down explicitly the coupling constant λ of (1.21), which is otherwise put to unity in our context. In the Hamiltonian (1.4) we accordingly replace W by λW and denote the Hamiltonian itself by Hλ. This notation is in accordance with (1.5).

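Consider now the ground state solution of (1.2) at some given value of λ, denoted by |Ψλ⟩ and corresponding to the ground state energy

$$E_\lambda = \langle\Psi_\lambda|H_\lambda|\Psi_\lambda\rangle. \tag{2.60}$$

Differentiating the second equation (1.2) yields

$$\Big\langle\frac{\partial\Psi_\lambda}{\partial\lambda}\Big|\Psi_\lambda\Big\rangle + \Big\langle\Psi_\lambda\Big|\frac{\partial\Psi_\lambda}{\partial\lambda}\Big\rangle = 0, \tag{2.61}$$

hence

$$\Big\langle\frac{\partial\Psi_\lambda}{\partial\lambda}\Big|H_\lambda\Big|\Psi_\lambda\Big\rangle + \Big\langle\Psi_\lambda\Big|H_\lambda\Big|\frac{\partial\Psi_\lambda}{\partial\lambda}\Big\rangle = \Big\langle\frac{\partial\Psi_\lambda}{\partial\lambda}\Big|\Psi_\lambda\Big\rangle E_\lambda + E_\lambda\Big\langle\Psi_\lambda\Big|\frac{\partial\Psi_\lambda}{\partial\lambda}\Big\rangle = 0. \tag{2.62}$$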

By merely applying a phase factor, the matrix elements of these relations can be made real. Then, due to its normalization, a parametric change of the ground state will be orthogonal to that ground state.

We now assume that the derivative ∂Ψλ/∂λ indeed exists. This can generally be assumed to hold true if Ψλ is non-degenerate. In case of level crossing on varying λ, at least a special choice among the degenerate states is needed as in perturbation theory, implying that |Ψλ⟩ can generally no longer be the ground state for all values of λ considered. Under the assumption that ∂Ψλ/∂λ exists, from (2.60, 2.62),

$$\frac{dE_\lambda}{d\lambda} = \Big\langle\Psi_\lambda\Big|\frac{dH_\lambda}{d\lambda}\Big|\Psi_\lambda\Big\rangle = \langle\Psi_\lambda|W|\Psi_\lambda\rangle \overset{\mathrm{def}}{=} \langle W\rangle_\lambda. \tag{2.63}$$

This relation, although derived by very elementary means, is of enormous importance in quantum physics, wherefore it carries the names of two, sometimes even three, famous people who were among the first to put focus on it: it is called the (Pauli-)Hellmann-Feynman theorem³ [Pauli, 1933, Section A11], [Hellmann, 1937, Feynman, 1939].

The next important assumption is that there is a unique differentiable path of ground states |Ψλ⟩ in the state space for the whole λ-interval between zero and unity. This is called the adiabaticity assumption because in important applications one has in mind a time-dependent λ and considers the case dλ/dt → 0.

In our context, if we call E0 the ground state energy of the interaction-free reference system (1.5) and denote by Eint the total energy change due to particle interaction, we find

$$E = E_0 + E_{\mathrm{int}}, \qquad E_{\mathrm{int}} = \int_0^1 d\lambda\;\langle W\rangle_\lambda. \tag{2.64}$$

Note that Eint is different from EH + WXC: in contrast to the latter, Eint contains kinetic energy changes due to correlation and possibly also changes in the interaction energy with an external field due to a particle-interaction dependence of the ground state density.
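Both (2.63) and (2.64) can be illustrated on a finite-dimensional toy model. The sketch below is not a many-body calculation: H0 and W are hypothetical Hermitian matrices (a diagonal H0 with well separated levels and a weak random W, so that the ground state stays non-degenerate on the whole λ-interval), and NumPy is assumed. It compares a finite-difference derivative of Eλ with ⟨W⟩λ, and the energy change E(1) − E(0) with the coupling-constant integral evaluated by Simpson's rule.

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 6

def random_hermitian(d):
    a = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return 0.5 * (a + a.conj().T)

H0 = np.diag(np.arange(dim, dtype=float))   # hypothetical reference Hamiltonian
W = 0.3 * random_hermitian(dim)             # hypothetical weak interaction

def ground_state(lam):
    """Lowest eigenpair of H_lambda = H0 + lambda * W."""
    energies, vectors = np.linalg.eigh(H0 + lam * W)
    return energies[0], vectors[:, 0]

def w_expectation(lam):
    _, psi = ground_state(lam)
    return np.vdot(psi, W @ psi).real       # <W>_lambda

# (2.63): dE/dlambda = <W>_lambda, checked by a central finite difference
lam, eps = 0.4, 1e-5
dE = (ground_state(lam + eps)[0] - ground_state(lam - eps)[0]) / (2 * eps)

# (2.64): E(1) - E(0) = int_0^1 dlambda <W>_lambda, by Simpson's rule
lams = np.linspace(0.0, 1.0, 201)
w_vals = np.array([w_expectation(l) for l in lams])
step = lams[1] - lams[0]
E_int = step / 3 * (w_vals[0] + w_vals[-1]
                    + 4 * w_vals[1:-1:2].sum() + 2 * w_vals[2:-1:2].sum())

print("dE/dlambda:", dE, " <W>:", w_expectation(lam))
print("E(1) - E(0):", ground_state(1.0)[0] - ground_state(0.0)[0], " integral:", E_int)
```

The toy model has no density constraint and no external potential path, so it illustrates only the bare theorem and its integrated form, not the Kohn-Sham construction.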

Besides (2.46), there is another decomposition of the total ground state energy of the system due to Kohn and Sham [Kohn and Sham, 1965], namely

$$E = T + E_{\mathrm{pot}} + E_{\mathrm{H}} + E_{\mathrm{XC}}, \tag{2.65}$$

where T is the kinetic energy of an interaction-free reference system in such an external potential v0(x) that it has the same ground state density n(x) as the considered interacting system in the external potential v(x). The contributions Epot and EH are defined as previously with the actual ground state density n(x). The last term EXC is defined as the difference between the left-hand side and the sum of the preceding terms, and is called the Kohn-Sham exchange and correlation energy. It differs from WXC by the difference between T and Ekin, the exchange and correlation contribution to the kinetic energy:

$$T + E_{\mathrm{XC}} = E_{\mathrm{kin}} + W_{\mathrm{XC}}. \tag{2.66}$$

At this point one further assumption is needed: There should be such a path vλ(x), 0 ≤ λ ≤ 1, in the functional space of potentials, that the ground state density nλ(x) ≡ n(x) is kept constant for 0 ≤ λ ≤ 1. Then, dHλ/dλ = dVλ/dλ + W, and hence, from the first relation (2.63),

$$\frac{dE_\lambda}{d\lambda} = \int dx\;n(x)\,\frac{\partial v_\lambda(x)}{\partial\lambda} + E_{\mathrm{H}} + \frac{1}{2}\int d^3r\,d^3r'\;n(\mathbf{r})\,w(|\mathbf{r}-\mathbf{r}'|)\,h_{\mathrm{XC},\lambda}(\mathbf{r}',\mathbf{r}).$$

³ In fact, the relation was considered even earlier, probably for the first time by P. Güttinger, Z. Phys. 73, 169 (1931).

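The essential point in obtaining the second term of this relation, EH, was that nλ(x) is kept constant as a function of λ. Integration over λ yields

$$\begin{aligned}
E - E_0 &= \int_0^1 d\lambda\;\frac{dE_\lambda}{d\lambda} \\
&= \int dx\;n(x)\,\big(v(x) - v_0(x)\big) + E_{\mathrm{H}} + \frac{1}{2}\int d^3r\,d^3r'\;n(\mathbf{r})\,w(|\mathbf{r}-\mathbf{r}'|)\int_0^1 d\lambda\;h_{\mathrm{XC},\lambda}(\mathbf{r}',\mathbf{r}), \\
E_0 &= T + \int dx\;n(x)\,v_0(x).
\end{aligned}$$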

Comparison with (2.65) results in

$$E_{\mathrm{XC}} = \frac{1}{2}\int d^3r\,d^3r'\;n(\mathbf{r})\,w(|\mathbf{r}-\mathbf{r}'|)\,h_{\mathrm{KS}}(\mathbf{r}',\mathbf{r}) \tag{2.67}$$

with the Kohn-Sham exchange and correlation hole

$$h_{\mathrm{KS}}(\mathbf{r}',\mathbf{r}) = \int_0^1 d\lambda\;h_{\mathrm{XC},\lambda}(\mathbf{r}',\mathbf{r}) \tag{2.68}$$

[Perdew and Zunger, 1981, Gunnarsson and Lundqvist, 1976].

Coming back to theorem (2.63), it, of course, holds for any parameter λ the Hamiltonian depends on. A great variety of applications considers a change in time of the external field v(x). For instance, think of a sufficiently slow motion of nuclei, with electrons quickly moving in the Coulomb field of those nuclei. The adiabatic principle says that a system in the ground state |Ψλ⟩ will remain all the time in its ground state, if dλ/dt → 0, the adiabaticity assumption mentioned above is valid, and that ground state is separated by a gap from the excitation spectrum of the system [Gell-Mann and Low, 1951]. Among many other applications this adiabatic principle forms a basis for a quantum mechanical treatment of adiabatic (in this sense) forces for the ionic motion in molecules and solids.


2.7 Coulomb Systems

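In this section we further specialize the Hamiltonian (1.24) to

$$\begin{aligned}
H_{\mathrm{Coul}} &= H_{\mathrm{Coul}}(\mathbf{r}_i;Z_\mu,\mathbf{R}_\mu) = T + V_{\mathrm{Coul}} \\
&= -\frac{1}{2}\sum_{i=1}^{N}\nabla_i^2 - \sum_{i=1}^{N}\sum_{\mu=1}^{M}\frac{Z_\mu}{|\mathbf{r}_i-\mathbf{R}_\mu|} + \frac{1}{2}\sum_{i\neq j}^{N}\frac{1}{|\mathbf{r}_i-\mathbf{r}_j|} + \frac{1}{2}\sum_{\mu\neq\nu}^{M}\frac{Z_\mu Z_\nu}{|\mathbf{R}_\mu-\mathbf{R}_\nu|}.
\end{aligned} \tag{2.69}$$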

The external potential v is thought to be created by the Coulomb field of M nuclei fixed at positions Rµ and having electric charges Zµ. The N electrons mutually interact with Coulomb forces too, and, for fixed Rµ, the last term is just a constant energy, the Coulomb interaction energy of the nuclei. Natural units are used as in (1.24). The nuclear positions Rµ and the nuclear charges Zµ are treated as parameters upon which the Hamiltonian (2.69), the total ground state energy

$$E = E(Z_\mu,\mathbf{R}_\mu) = E_{\mathrm{kin}} + E_{\mathrm{Coul}} \tag{2.70}$$

and the N-electron state

$$\Psi(x_i;Z_\mu,\mathbf{R}_\mu) = \langle x_1\ldots x_N|\Psi(Z_\mu,\mathbf{R}_\mu)\rangle \tag{2.71}$$

depend. (Ekin = ⟨T⟩ is as previously defined, and ECoul = ⟨VCoul⟩ is the total Coulomb energy; the notation in (2.71) is as in (1.8).)

We now consider various re-scalings of those parameters. Replace first the charges $Z_\mu$ by $\lambda Z_\mu$, $0 \le \lambda \le 1$. For $\lambda = 0$ the $N$ electrons move in a constant (zero) external potential and repel each other by Coulomb forces. They move to infinite mutual distances, with the infimum of the total energy being $E(\lambda = 0) = 0$. Simply by integrating the first equation of (2.63) over $\lambda$ from 0 to 1 and taking into account that this time $W$ does not depend on $\lambda$, one immediately gets [Wilson, 1962]

\[
E = E(\lambda = 1) =
 -\sum_{\mu=1}^{M} \int_0^1 d\lambda \int d^3r\, \frac{Z_\mu\, n(\mathbf{r};\lambda)}{|\mathbf{r} - \mathbf{R}_\mu|}
 + \frac{1}{2}\sum_{\mu \neq \nu}^{M} \frac{Z_\mu Z_\nu}{|\mathbf{R}_\mu - \mathbf{R}_\nu|}
\tag{2.72}
\]

for the total energy of the Coulomb system (2.69). Here, $n(\mathbf{r};\lambda)$ is the density of the $N$ interacting electrons in the Coulomb field of the scaled nuclear charges $\lambda Z_\mu$.
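As a quick sanity check of the coupling-constant formula (2.72), consider a single electron bound to one nucleus of charge $Z$ (a case not treated explicitly in the text). For the scaled charge $\lambda Z$ the hydrogen-like ground state has $\langle 1/r \rangle = \lambda Z$ in natural units, so the $\lambda$-integration should reproduce the familiar energy $-Z^2/2$. The following minimal sketch does the integration numerically; the function names are illustrative assumptions.

```python
def mean_inverse_r(lam, Z):
    """<1/r> in the hydrogen-like ground state with nuclear charge lam*Z
    (standard textbook result in atomic units; an input assumption here)."""
    return lam * Z


def wilson_energy(Z, n_steps=1000):
    """E = -Z * integral_0^1 d(lam) <1/r>_lam, evaluated by the midpoint rule.

    For a single nucleus the nucleus-nucleus term of (2.72) is absent.
    """
    h = 1.0 / n_steps
    total = sum(mean_inverse_r((k + 0.5) * h, Z) for k in range(n_steps))
    return -Z * total * h


for Z in (1, 2, 3):
    # second and third columns should agree: numerical integral vs. -Z^2/2
    print(Z, wilson_energy(Z), -0.5 * Z ** 2)
```

The integrand is linear in $\lambda$ here, so the midpoint rule is exact up to rounding; for a many-electron system $n(\mathbf{r};\lambda)$ would have to come from a separate calculation at each $\lambda$.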


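In order to derive virial theorems we next scale all coordinate vectors according to

\[
H_{\mathrm{Coul},\lambda} \stackrel{\mathrm{def}}{=} H_{\mathrm{Coul}}(\lambda\mathbf{r}_i; Z_\mu, \lambda\mathbf{R}_\mu)
 = \lambda^{-2}\, T(\mathbf{r}_i) + \lambda^{-1}\, V_{\mathrm{Coul}}(\mathbf{r}_i; Z_\mu, \mathbf{R}_\mu)
\tag{2.73}
\]

and

\[
\Psi_\lambda = \Psi(\lambda\mathbf{r}_i s_i; Z_\mu, \lambda\mathbf{R}_\mu).
\tag{2.74}
\]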

While $E$ of (2.70) is obtained as the lowest eigenvalue of the Schrödinger equation $H_{\mathrm{Coul}}\Psi = \Psi E$, now the equation

\[
H_{\mathrm{Coul},\lambda}\Psi_\lambda = \Psi_\lambda E_\lambda
\tag{2.75}
\]

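is to be considered. It differs from the original Schrödinger equation for $E$ only by the notation of the dynamical variable, which is now $\lambda\mathbf{r}_i$, and by the replacement $\mathbf{R}_\mu \to \lambda\mathbf{R}_\mu$ of parameters. While the former change of notation has no effect on $E$, the latter replacement has. Hence,

\[
E_\lambda = E(Z_\mu, \lambda\mathbf{R}_\mu), \qquad
\left. \frac{dE_\lambda}{d\lambda} \right|_{\lambda=1} = \sum_\mu \mathbf{R}_\mu \cdot \frac{\partial E}{\partial \mathbf{R}_\mu} .
\tag{2.76}
\]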

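Alternatively, this derivative may be obtained by applying the first equation of (2.63) to (2.73). Comparing both results yields

\[
\sum_\mu \mathbf{R}_\mu \cdot \frac{\partial E}{\partial \mathbf{R}_\mu} = -2E_{\mathrm{kin}} - E_{\mathrm{Coul}} .
\tag{2.77}
\]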
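The intermediate step can be written out explicitly. Assuming that the "first equation of (2.63)" is the Hellmann-Feynman relation $dE/d\lambda = \langle \partial H/\partial\lambda \rangle$ (it is not reproduced in this section, so this reading is an assumption), differentiating (2.73) gives:

```latex
\frac{dE_\lambda}{d\lambda}
  = \Big\langle \frac{\partial H_{\mathrm{Coul},\lambda}}{\partial\lambda} \Big\rangle
  = -2\lambda^{-3}\,\langle T \rangle - \lambda^{-2}\,\langle V_{\mathrm{Coul}} \rangle ,
\qquad
\left.\frac{dE_\lambda}{d\lambda}\right|_{\lambda=1}
  = -2E_{\mathrm{kin}} - E_{\mathrm{Coul}} .
```

Equating this with the right-hand side of (2.76) reproduces (2.77).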

If the positions $\mathbf{R}_\mu$ of the nuclei are chosen so that the adiabatic forces on them, $\mathbf{F}_\mu = -\partial E/\partial \mathbf{R}_\mu$, are zero (adiabatic equilibrium positions), then

\[
E_{\mathrm{Coul}} = -2E_{\mathrm{kin}} = 2E
\tag{2.78}
\]

follows.

Instead of (2.69), the total Hamiltonian for the motion of both nuclei and electrons

\[
H_{\mathrm{tot}} = T_{\mathrm{nucl}} + H_{\mathrm{Coul}}, \qquad
T_{\mathrm{nucl}} = -\sum_{\mu=1}^{M} \frac{\nabla_\mu^2}{2M_\mu}
\tag{2.79}
\]


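may be considered, where now the nuclear positions $\mathbf{R}_\mu$ are dynamical variables too, rather than parameters. If the above scaling is performed in the Schrödinger equation

\[
H_{\mathrm{tot}}\Psi_{\mathrm{tot}} = \Psi_{\mathrm{tot}} E_{\mathrm{tot}},
\tag{2.80}
\]

then this scaling merely results in a change of notation of the dynamical variables, hence $E_{\mathrm{tot}}$ remains independent of $\lambda$. Application of (2.63) now yields

\[
\begin{aligned}
& E_{\mathrm{COUL}} = -2E_{\mathrm{KIN}} = 2E_{\mathrm{tot}}, \\
& E_{\mathrm{COUL}} \stackrel{\mathrm{def}}{=} \langle \Psi_{\mathrm{tot}} | V_{\mathrm{Coul}} | \Psi_{\mathrm{tot}} \rangle, \qquad
E_{\mathrm{KIN}} \stackrel{\mathrm{def}}{=} \langle \Psi_{\mathrm{tot}} | T_{\mathrm{nucl}} + T | \Psi_{\mathrm{tot}} \rangle .
\end{aligned}
\tag{2.81}
\]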

The difference between $E_{\mathrm{COUL}}$ and $E_{\mathrm{Coul}}$ is caused by correlation between the nuclei and by a change of the correlation between nuclei and electrons due to nuclear motion, which is not contained in the adiabatic theory. This correlation energy is again negative and roughly equal to $-2$ times the kinetic energy of the nuclei, as can be seen from comparing (2.78) with (2.81): the kinetic energy of the nuclei is roughly the difference $E_{\mathrm{KIN}} - E_{\mathrm{kin}}$. The heavier the masses $M_\mu$ of the nuclei, the smaller their kinetic energy compared to that of the electrons, and hence the better the adiabatic approximation. (Non-adiabatic energy corrections are systematically of the order of $M_\mu^{-1/2}$; cf. e.g. [Born and Huang, 1968], Section 14.)

A comprehensive review of the various virial theorems may be found in [Marc and McMillan, 1985].

Another general problem of Coulomb systems is connected with the Hamiltonian (2.69), which is not suited for the thermodynamic limit $N, M \to \infty$, $N/V, M/V = \mathrm{const}$. The problem arises because in this limit each of the three double sums of (2.69) already diverges if one fixes one particle index and sums over the second. As a consequence we obtain infinite contributions (with varying sign) to the total energy per particle, and the energy per volume $E/V = \langle H_{\mathrm{Coul}} \rangle / V$ is not defined.

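This problem can be circumvented in the following manner: Consider the torus $T^3$ defined by

\[
\mathbf{r} \equiv \mathbf{r} + L\mathbf{R}_1, \qquad
\mathbf{r} \equiv \mathbf{r} + L\mathbf{R}_2, \qquad
\mathbf{r} \equiv \mathbf{r} + L\mathbf{R}_3
\tag{2.82}
\]

with $(\mathbf{R}_1, \mathbf{R}_2, \mathbf{R}_3)$ linearly independent. Replace the Coulomb potential by a Yukawa potential in the torus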

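\[
v_{\mathrm{Coul}}(r) = \frac{1}{r} \;\longrightarrow\; v_{\alpha L}(\mathbf{r}) = \sum_{\mathbf{l}} v_\alpha(|\mathbf{r} + \mathbf{R}_{\mathbf{l}}|),
\tag{2.83}
\]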


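\[
v_\alpha(r) = \frac{e^{-\alpha r}}{r}, \qquad
\mathbf{R}_{\mathbf{l}} = L(l_1\mathbf{R}_1 + l_2\mathbf{R}_2 + l_3\mathbf{R}_3), \quad l_i \ \text{integer},
\tag{2.84}
\]

which also means that the Poisson equation is to be replaced according to

\[
\Delta v_{\mathrm{Coul}} = -4\pi\delta(\mathbf{r}) \;\longrightarrow\; (\Delta - \alpha^2)\, v_\alpha = -4\pi\delta(\mathbf{r}).
\tag{2.85}
\]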

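Instead of the Hamiltonian (2.69) consider now the Hamiltonian on the torus

\[
\begin{aligned}
H_{\alpha L} &= H_{\alpha L}(\mathbf{r}_i; Z_\mu, \mathbf{R}_\mu) = \\
&= -\frac{1}{2}\sum_{i=1}^{N} \nabla_i^2
 - \sum_{i=1}^{N}\sum_{\mu=1}^{M} Z_\mu\, v_{\alpha L}(|\mathbf{r}_i - \mathbf{R}_\mu|)
 + \frac{1}{2}\sum_{i \neq j}^{N} v_{\alpha L}(|\mathbf{r}_i - \mathbf{r}_j|) \\
&\quad + \frac{1}{2}\sum_{\mu \neq \nu}^{M} Z_\mu Z_\nu\, v_{\alpha L}(|\mathbf{R}_\mu - \mathbf{R}_\nu|),
\end{aligned}
\tag{2.86}
\]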

where $N$ and $M$ now mean the number of electrons and the number of nuclei, respectively, in the torus $T^3$. In this Hamiltonian, with $\alpha$ non-zero, for $\mathbf{r}_i \neq \mathbf{R}_\mu$ and $\mathbf{r}_i \neq \mathbf{r}_j$ all sums remain finite as they were in (2.69) for a finite system, i.e. with $N$ and $M$ finite in the whole space. Denote the ground state energy of this torus by $E_{\alpha L}(Z_\mu, \mathbf{R}_\mu)$, which for $\alpha$ non-zero is finite, too.

For a finite system, a molecule or a radical say, these considerations would be irrelevant. Nevertheless, it is easily seen that the energy (2.70) is

\[
E(Z_\mu, \mathbf{R}_\mu) = \lim_{\alpha \downarrow 0}\, \lim_{L \to \infty} E_{\alpha L}(Z_\mu, \mathbf{R}_\mu).
\tag{2.87}
\]

In the limit $L \to \infty$, due to the exponential decay of the interaction with distance, only one item of the $\mathbf{l}$-sum in (2.83) survives, and the subsequent limit $\alpha \downarrow 0$ replaces $v_\alpha$ by $v_{\mathrm{Coul}}$ again. The essential point in this case is that $\lim_{L\to\infty} E_{\alpha L}(Z_\mu, \mathbf{R}_\mu)$ is a finite continuous function of $\alpha \ge 0$. Picking a sufficiently large $L$ and a sufficiently small $\alpha$, but $\alpha \gg 1/L$, one can approximate $E(Z_\mu, \mathbf{R}_\mu)$ by $E_{\alpha L}(Z_\mu, \mathbf{R}_\mu)$ arbitrarily closely. For a finite $L$, on the other hand, the limit

\[
\lim_{\alpha \downarrow 0} E_{\alpha L}(Z_\mu, \mathbf{R}_\mu)
\]

does not exist unless the torus is perfectly neutral with zero dipole moment. The individual $\mathbf{l}$-sums of (2.83) diverge in that limit. Only for higher multipole distributions does the Coulomb potential decay fast enough with distance to ensure convergence of all lattice sums, provided the summation over the neutral assembly is carried out before the limit is taken. In a ferroelectric crystal, this further needs a domain configuration with zero total moment (or alternatively the separation of an internal macroscopic mean electric field).

Hence, for an extended system, a different limit has to be considered. Here, for $L = 1$ one picks a large enough torus $T_1^3$ and ensures that the torus is perfectly neutral, i.e., $N = \sum_\mu Z_\mu$. Then, for integer $L > 1$, the positions $\mathbf{R}_\mu$ are assumed to form $L^3$ identical copies of the original torus, attached to each other. Finally,

\[
(E/V)(Z_\mu, \mathbf{R}_\mu) =
\lim_{\substack{L \to \infty \\ N/L^3 = \mathrm{const.}}}
\frac{1}{L^3 |T_1^3|}\, \lim_{\alpha \downarrow 0} E_{\alpha L}(Z_\mu, \mathbf{R}_\mu)
\tag{2.88}
\]

is the ground state energy per volume, if the right side exists and is finite. In (2.88), $|T_1^3|$ is the volume of the torus $T_1^3$ the limiting process was started with (e.g. a unit cell in a crystal or a cutout of a disordered solid large enough to be representative). By this prescription, the last term of (2.86) as well as the external potential $\sum_\mu Z_\mu v_{\alpha L}(|\mathbf{r} - \mathbf{R}_\mu|)$ on a given electron are independent of $L$. In the limit $L \to \infty$, merely the accessible electron momenta according to (a slightly generalized) relation (1.29) become infinitely dense: the electrons are allowed to delocalize with the electrostatics kept perfectly balanced. Picking a sufficiently large $L$, one can approximate $(E/V)(Z_\mu, \mathbf{R}_\mu)$ by $\lim_{\alpha \downarrow 0} E_{\alpha L}(Z_\mu, \mathbf{R}_\mu) / (L^3 |T_1^3|)$ arbitrarily closely.

Again, one can consider $(E/V)$ as the ground state energy per volume for fixed $\mathbf{R}_\mu$ as just described, or alternatively, by adding $T_{\mathrm{nucl}}$, as the ground state energy of the dynamical system of electrons and nuclei. That the latter is finite for an infinite system was recognized as an important solvable problem only in the mid sixties, and $(E/V)$ was first proven to be bounded from below at the end of the sixties [Dyson and Lenard, 1967, Lenard and Dyson, 1968], whereby the proof turned out to be astonishingly complicated. A shorter proof, sharpening the estimate by more than ten orders of magnitude, was given in [Lieb, 1976] with the help of some results of Thomas-Fermi theory. Those considerations essentially had to fight with the short-range part of the Coulomb potential (proof of absence of a Coulomb collapse), a problem that was not considered above because we fixed the nuclear positions. In the latter work by Lieb, however, the existence of the thermodynamic limit of a neutral system was also proven, which is complicated due to the long-range part of the Coulomb potential.

None of those proofs explicitly treated the Coulomb potential as the limit of an exponentially decaying potential, but a kind of two-step limit like (2.88) is effectively used in any numerical calculation of Coulomb energies of extended systems: the long-range part of the Coulomb potential is Fourier transformed, and the $q = 0$ Fourier component is treated as if it were finite. (Note from (2.85) that $v_{\mathrm{Coul}}(q) \sim 1/q^2$ while $v_\alpha(q) \sim 1/(q^2 + \alpha^2)$.) A standard numerical approach to Coulomb energies of extended systems is described in [Fuchs, 1935, Ewald, 1921]. The total energy is then calculated self-consistently using a discrete $k$-mesh, and the convergence with increasing $k$-point density is considered. In this approach, the above considerations are reflected in a delicacy of the limit $q \to 0$, $\alpha \to 0$, leaving e.g. a potential constant fundamentally undetermined. Only the local electro-neutrality ensures that the total energy per volume does not depend on such a constant.
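The remark on the Fourier components can be checked numerically. The sketch below (an illustration, not part of the original text) evaluates the three-dimensional Fourier transform of the Yukawa potential through its radial form $(4\pi/q)\int_0^\infty dr\, \sin(qr)\, e^{-\alpha r}$ and compares it with $4\pi/(q^2+\alpha^2)$; the function name, grid parameters and the chosen $\alpha$ are assumptions made for the example.

```python
import math


def yukawa_ft(q, alpha, r_max=60.0, n_steps=60000):
    """Fourier transform of exp(-alpha*r)/r in three dimensions.

    Uses the radial reduction (4*pi/q) * int_0^inf sin(q r) exp(-alpha r) dr
    and a plain midpoint rule; r_max must satisfy alpha*r_max >> 1.
    """
    h = r_max / n_steps
    s = 0.0
    for k in range(n_steps):
        r = (k + 0.5) * h
        s += math.sin(q * r) * math.exp(-alpha * r)
    return 4.0 * math.pi * s * h / q


alpha = 0.5
for q in (2.0, 1.0, 0.5, 0.1):
    numeric = yukawa_ft(q, alpha)
    screened = 4.0 * math.pi / (q * q + alpha * alpha)
    bare = 4.0 * math.pi / (q * q)   # Coulomb limit alpha -> 0, diverges as q -> 0
    print(q, numeric, screened, bare)
```

For fixed non-zero $\alpha$ the screened values approach the finite number $4\pi/\alpha^2$ as $q \to 0$, whereas the bare Coulomb values grow without bound; this is the delicacy of the combined limit referred to above.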


3 Thomas-Fermi Theory

After having started in Chapter 1 with the $N$-particle wavefunction, in the preceding chapter it was shown that the single-particle and two-particle density matrices suffice to calculate most ground state properties of a many-body system described by the Hamiltonian (1.4). If one were able to classify the set of all admissible two-particle density matrices (the corresponding single-particle density matrices follow from them via (2.13)), then the ground state energy, $E$, of the many-body system could be obtained as the absolute minimum of (2.46) over this set in a variational way, without reference to many-body wavefunctions as needed in (1.3).

Single-particle density matrices may be classified as all possible self-adjoint trace-class operators (in the Hilbert space of the respective representation) with real non-negative eigenvalues between zero and unity in the fermion case, and arbitrarily large in the boson case. Unfortunately, for two-particle density matrices the situation is much more complicated [Coleman, 1963, Ando, 1963].

The modern progress of density functional theory uses a philosophy which in a manner of speaking starts from the other end. Exploiting the relation between particle densities $n(\mathbf{r})$ and many-body wavefunctions $\Psi(x_1 \ldots x_N)$ of ground states, one tries to find a functional expression of the ground state energy $E$ through the ground state density $n(\mathbf{r})$ instead of the two-particle density matrix, and then to base a variational principle for the density on that functional relation.

Thomas-Fermi theory [Thomas, 1927, Fermi, 1927] is the earliest and most naïve version of such theories. It is, however, still of considerable conceptual importance, and it is up to now the only explicit density functional theory that is asymptotically exact in a certain sense [Lieb, 1981]. A recent survey from the user's standpoint may be found in [Parr and Yang, 1989].


3.1 The Thomas-Fermi Functional and Thomas-Fermi Equation

Thomas and Fermi independently considered the first three terms of the energy expression (2.46). At that time they were not aware of the exchange energy and neglected the correlation term. Hence the only contribution not readily expressed through the particle density $n(\mathbf{r})$ was the kinetic energy. There is, however, one model situation where even the kinetic energy is readily expressed in terms of the particle density. This is the homogeneous interaction-free fermion gas (with spin 1/2), for which (1.45) and (1.43) yield

\[
\varepsilon_{\mathrm{kin}}(n) = C_F\, n^{2/3}, \qquad C_F = \frac{3}{10}(3\pi^2)^{2/3} = 2.8712
\tag{3.1}
\]

for the average kinetic energy per particle as a function of the (constant in space) particle density $n$. The kinetic energy per unit volume in this situation is $n\varepsilon_{\mathrm{kin}}$, and if in a real case the particle density varies sufficiently slowly in space, then

\[
E_{\mathrm{kin}} \approx \int d^3r\, n(\mathbf{r})\, \varepsilon_{\mathrm{kin}}(n(\mathbf{r})) = C_F \int d^3r\, n^{5/3}(\mathbf{r})
\tag{3.2}
\]

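may serve as a workable approximation for the kinetic energy functional of the particle density (see footnote 4). Hence, the Thomas-Fermi functional for the total energy is

\[
\begin{aligned}
E_{\mathrm{TF}}[n(\mathbf{r}); v(\mathbf{r})] ={}& C_F \int d^3r\, n^{5/3}(\mathbf{r}) + \int d^3r\, v(\mathbf{r})\, n(\mathbf{r}) \\
&+ \frac{1}{2} \int d^3r\, d^3r'\, n(\mathbf{r}')\, w(|\mathbf{r}' - \mathbf{r}|)\, n(\mathbf{r}).
\end{aligned}
\tag{3.3}
\]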

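It is an explicitly given functional of both the density $n(\mathbf{r})$ and the external potential $v(\mathbf{r})$ (and of course also depends on the form of the pair interaction $w(r)$). For the Coulomb system of Section 2.7 it is

\[
\begin{aligned}
E_{\mathrm{TF}}[n(\mathbf{r}); Z_\mu, \mathbf{R}_\mu] ={}&
 C_F \int d^3r\, n^{5/3}(\mathbf{r})
 - \sum_{\mu=1}^{M} \int d^3r\, \frac{Z_\mu\, n(\mathbf{r})}{|\mathbf{r} - \mathbf{R}_\mu|} \\
&+ \frac{1}{2} \int d^3r\, d^3r'\, \frac{n(\mathbf{r}')\, n(\mathbf{r})}{|\mathbf{r}' - \mathbf{r}|}
 + \frac{1}{2} \sum_{\mu \neq \nu}^{M} \frac{Z_\mu Z_\nu}{|\mathbf{R}_\mu - \mathbf{R}_\nu|},
\end{aligned}
\tag{3.4}
\]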

Footnote 4: $E_{\mathrm{kin}} \ge C_{\mathrm{LT}} \int d^3r\, n^{5/3}(\mathbf{r})$ with $C_{\mathrm{LT}} = (3/10)(3\pi/4)^{2/3}$ is a rigorous result by [Lieb and Thirring, 1975].


where, as in (2.69), the last constant (i.e. density independent) term of the Coulomb interaction energy of the nuclei is added in order to be prepared for the possibility of a thermodynamic limit (cf. Section 2.7).
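To make the functional concrete, the following sketch evaluates the kinetic, electron-nucleus and Hartree terms of the Thomas-Fermi functional for a spherical trial density around a single nucleus. The trial density (the hydrogen 1s density $e^{-2r}/\pi$), the radial shell discretization and all function names are assumptions chosen for illustration; the trial density is not the Thomas-Fermi minimizer.

```python
import math

C_F = 0.3 * (3.0 * math.pi ** 2) ** (2.0 / 3.0)   # eq. (3.1), about 2.8712


def tf_terms(density, Z=1.0, r_max=30.0, n_grid=20000):
    """Return (K, U, W) for a spherical density n(r) and one nucleus of charge Z.

    K = C_F * int n^(5/3), U = -Z * int n/r, W = Hartree energy,
    all evaluated on thin radial shells (midpoint rule).
    """
    h = r_max / n_grid
    K = U = W = 0.0
    q_inside = 0.0                          # electrons in shells closer to the nucleus
    for i in range(n_grid):
        r = (i + 0.5) * h                   # midpoint of the i-th shell
        n = density(r)
        q = 4.0 * math.pi * r * r * h * n   # electron number in this shell
        K += C_F * q * n ** (2.0 / 3.0)
        U -= Z * q / r
        # shell-shell energy q_i*q_j/r_> for inner shells, plus self term q_i^2/(2 r_i)
        W += q * (q_inside + 0.5 * q) / r
        q_inside += q
    return K, U, W


def hydrogen_1s(r):
    """Trial density integrating to N = 1 (an assumption for this example)."""
    return math.exp(-2.0 * r) / math.pi


K, U, W = tf_terms(hydrogen_1s)
print(K, U, W, K + U + W)
```

With a single nucleus the nucleus-nucleus constant is absent. For this trial density the electron-nucleus and Hartree terms are known analytically ($-1$ and $5/16$), which gives a simple check of the discretization.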

We are going to use (3.3) and (3.4), respectively, as variational expressions, in which the density $n(\mathbf{r})$ will be varied in order to find the ground state. To determine the range of admissible variations, the next step is to characterize the domain of definition of the above two functionals of $n(\mathbf{r})$ [Lieb, 1981, mathematical details of the present chapter will follow this basic paper]; cf. also [Lieb and Simon, 1977]. As most considerations in our context will concern some complete (in the sense of the discussion in Section 1.5) functional space, integration should usually be understood in the Lebesgue sense [Reed and Simon, 1973], see also Chapter 5 of the present text. In this context, given some real number $p$, $1 \le p \le \infty$, $f \in L^p$ means that

\[
\|f\|_p \stackrel{\mathrm{def}}{=} \left[ \int dx\, |f(x)|^p \right]^{1/p}, \qquad (1 \le p < \infty)
\tag{3.5}
\]

is finite, where integration is over the respective Lebesgue measurable space, e.g. $\mathbb{R}^3$ in the present context. Functions coinciding almost everywhere (a.e.), i.e. everywhere except possibly on a set of points of Lebesgue measure zero, are considered equal (their difference does not cause any difference in the integrals in which they appear), and $f \in L^\infty$ means

\[
\|f\|_\infty \stackrel{\mathrm{def}}{=} \operatorname{ess\,sup} |f(x)|
\tag{3.6}
\]

being finite, where the essential supremum ess sup is the smallest real number $r$ so that $|f(x)| \le r$ a.e. (A reader not really interested in the more mathematical depths may on an intuitive heuristic level take the integrals of the present section in their usual sense and ignore the appendage a.e.)

A more systematic treatment of the $L^p$-spaces will be given in Chapter 5. Here we list only a few important issues without proof: If $f \in L^p$ and $f \in L^q$ with $p < q$, then $f \in L^t$ for all $p \le t \le q$. If $f \in L^p$ and $g \in L^{p'}$ with $1/p + 1/p' = 1$, then $fg \in L^1$, i.e., $\int dx\, |fg| < \infty$, and hence a fortiori $|\int dx\, fg| < \infty$. More generally, if $f \in L^p \cap L^q$ and $g = g_1 + g_2$, $g_1 \in L^{p'}$, $g_2 \in L^{q'}$, where again $1/p + 1/p' = 1 = 1/q + 1/q'$, then $fg \in L^1$.

The kinetic energy integral of (3.3) or (3.4) is finite if $n(\mathbf{r}) \in L^{5/3}$. On the other hand, $\int d^3r\, n(\mathbf{r}) = N$ demands $n(\mathbf{r}) \in L^1$. Let now $n(\mathbf{r}) \in L^{5/3} \cap L^1$, $v = v_1 + v_2$, $v_1(\mathbf{r}) \in L^{5/2}$, $v_2(\mathbf{r}) \in L^\infty$, $w = w_1 + w_2$, $w_1(|\mathbf{r}|) \in L^{5/2}$, $w_2(|\mathbf{r}|) \in L^\infty$. Then $\int d^3r\, v_1(\mathbf{r})\, n(\mathbf{r})$ is finite because $v_1 \in L^{5/2}$ and $n \in L^{5/3}$, $2/5 + 3/5 = 1$, and $\int d^3r\, v_2(\mathbf{r})\, n(\mathbf{r})$ is finite because $v_2 \in L^\infty$ and $n \in L^1$. Hence $\int d^3r\, v(\mathbf{r})\, n(\mathbf{r})$ is finite. Furthermore, $\int d^3r'\, n(\mathbf{r}')\, w(|\mathbf{r}' - \mathbf{r}|)$ is finite for all $\mathbf{r}$ for the same reason, and approaches zero for $r \to \infty$ if $w(r)$ approaches zero for $r \to \infty$. Hence the integral as a function of $\mathbf{r}$ is again in $L^\infty$, and $E_{\mathrm{TF}}[n(\mathbf{r}); v(\mathbf{r})]$ of (3.3) exists and is finite. Furthermore, in the decomposition

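\[
\frac{1}{|\mathbf{r}|} = \frac{e^{-|\mathbf{r}|}}{|\mathbf{r}|} + \frac{1 - e^{-|\mathbf{r}|}}{|\mathbf{r}|}
\tag{3.7}
\]

the first part is in $L^{5/2}$ for the 3-dimensional $\mathbf{r}$-space ($\int d^3r\, r^{-5/2} e^{-5r/2} \sim \int dr\, r^2 r^{-5/2} e^{-5r/2} = \int dr\, r^{-1/2} e^{-5r/2} < \infty$), and the second part is bounded and hence in $L^\infty$. As we see, the above assumption on $n(\mathbf{r})$ guarantees the existence and finiteness of (3.4), too.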
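The integrability claim for the short-range part of the decomposition can be verified numerically. This small sketch (illustrative only; the substitution $r = t^2$ and the function name are choices made here) computes $\int d^3r\, (e^{-r}/r)^{5/2}$ and compares it with the closed form $4\pi\sqrt{2\pi/5}$ that follows from a Gaussian integral.

```python
import math


def short_range_norm_power(t_max=8.0, n_steps=8000):
    """int d^3r (exp(-r)/r)^(5/2) = 4*pi * int_0^inf r^(-1/2) exp(-5r/2) dr.

    With r = t^2 the integrable singularity at r = 0 disappears:
    the integral becomes 8*pi * int_0^inf exp(-5 t^2 / 2) dt.
    """
    h = t_max / n_steps
    s = sum(math.exp(-2.5 * ((k + 0.5) * h) ** 2) for k in range(n_steps))
    return 8.0 * math.pi * s * h


numeric = short_range_norm_power()
closed_form = 4.0 * math.pi * math.sqrt(2.0 * math.pi / 5.0)
print(numeric, closed_form)

# The long-range part (1 - exp(-r))/r stays below 1 for all r > 0, hence it is bounded.
print(max((1.0 - math.exp(-r)) / r for r in (1e-6, 1e-3, 0.1, 1.0, 10.0, 100.0)))
```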

The following type of argument is repeatedly used in the context of density functional theory: Given the particle number $N$, let admissible densities $n(\mathbf{r})$ be all non-negative $L^{5/3} \cap L^1$-functions with $\int d^3r\, n(\mathbf{r}) = N$. Assume that every admissible density is a ground state density for some external potential $v(\mathbf{r})$ with the corresponding ground state denoted by $\Psi_v$. If $v'$ is some other potential, then obviously $E[v] = \langle \Psi_v | H_v | \Psi_v \rangle \le \langle \Psi_{v'} | H_v | \Psi_{v'} \rangle$, from which one can expect $E[v] \approx E_{\mathrm{TF}}[n_v; v] \stackrel{\mathrm{def}}{=} \inf_n E_{\mathrm{TF}}[n; v]$ (because every $n$ is assumed to come from some $\Psi_{v'}$). Purists may instead take the following relation for a definition:

\[
E^{\mathrm{TF}}_N[v] = \inf_n \left\{ E_{\mathrm{TF}}[n; v] \;\middle|\; n \in L^{5/3} \cap L^1,\ n(\mathbf{r}) \ge 0,\ \int d^3r\, n(\mathbf{r}) = N \right\}.
\tag{3.8}
\]

Here, $E^{\mathrm{TF}}_N[v]$ is the $N$-particle Thomas-Fermi ground state energy related to the external potential $v$. If $v$ comes from nuclear charges as in (3.4), then the left-hand side of the last relation depends on the $Z_\mu$ and on the nuclear configuration $\mathbf{R}_\mu$. On the right-hand side, $\{a \mid b\}$ is the usual notation for the set of elements $a$ for which the relations $b$ hold.

In practically all cases of interest, $v(\mathbf{r})$ is negative and approaches zero at infinity:

\[
v(\mathbf{r}) \le 0, \qquad \lim_{|\mathbf{r}| \to \infty} v(\mathbf{r}) = 0.
\tag{3.9}
\]

If one adds one further particle in this case, then either the potential can bind it, and hence $E^{\mathrm{TF}}_{N+1}[v] < E^{\mathrm{TF}}_N[v]$, or it cannot bind it, the additional particle disappears at infinity in an $E \to 0$ state, and hence $E^{\mathrm{TF}}_{N+1}[v] = E^{\mathrm{TF}}_N[v]$. In this latter case there is no minimal $n(\mathbf{r})$ for $N + 1$ particles (wherefore the infimum of (3.8) is in general not a minimum), but in this case there is always a maximum number $N_{\max}$ of particles which the potential $v$ can bind, and the corresponding particle density $n(\mathbf{r})$ minimizes $E_{\mathrm{TF}}[n; v]$ for all $N$ equal to or larger than that maximum number of bound particles (the excess particles disappearing at infinity in $E \to 0$ states). Hence, the following minimum relation

\[
E^{\mathrm{TF}}_N[v] = \min_n \left\{ E_{\mathrm{TF}}[n; v] \;\middle|\; n \in L^{5/3} \cap L^1,\ n(\mathbf{r}) \ge 0,\ \int d^3r\, n(\mathbf{r}) \le N \right\}
\tag{3.10}
\]

is always true under the condition (3.9), where the sign of equality in the last constraint of (3.8) had to be changed into $\le$. The minimum is attained for some $n$ with $\int d^3r\, n = N$ if $N$ is smaller than $N_{\max}$, and it is attained for some $n$ with $\int d^3r\, n = N_{\max}$ in all other cases.

It is also easy to see that the functional $E_{\mathrm{TF}}[n; v]$ is strictly convex in $n$ for every fixed $v$ and for every non-negative pair potential $w(r)$, i.e.,

\[
E_{\mathrm{TF}}[\, c\, n_1 + (1 - c)\, n_2; v \,] < c\, E_{\mathrm{TF}}[n_1; v] + (1 - c)\, E_{\mathrm{TF}}[n_2; v], \qquad 0 < c < 1.
\tag{3.11}
\]

This implies that the particle density minimizing the Thomas-Fermi energy functional is unique. This uniqueness is, of course, an artifact of the approximation, because in real quantum physics the ground state may be degenerate, which in general implies several different degenerate ground state densities, too.

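Adding the integral condition on the density, multiplied by a Lagrange multiplier $\mu$, to the energy functional (3.3) and varying it with respect to the density yields the Thomas-Fermi equation

\[
\frac{5}{3}\, C_F\, n^{2/3}(\mathbf{r}) = \max\{ \mu - v(\mathbf{r}) - v_H(\mathbf{r}),\, 0 \} \stackrel{\mathrm{def}}{=} [\, \mu - v(\mathbf{r}) - v_H(\mathbf{r}) \,]_+ ,
\tag{3.12}
\]

where, as previously, the Hartree potential

\[
v_H(\mathbf{r}) = \int d^3r'\, n(\mathbf{r}')\, w(|\mathbf{r}' - \mathbf{r}|)
\tag{3.13}
\]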

was introduced. If, at some point $\mathbf{r}$, $n(\mathbf{r}) > 0$, then the minimum condition demands $\delta E_{\mathrm{TF}}/\delta n - \mu = 0$ there. If, however, $n(\mathbf{r}) = 0$ for the minimal $n$, then the slope may be positive, $\delta E_{\mathrm{TF}}/\delta n - \mu \ge 0$ at that point $\mathbf{r}$, because the density $n(\mathbf{r})$ is not allowed to become negative (case of a minimum at the boundary of the domain of definition; cf. Fig. 3, where for the sake of simplicity $\mu = 0$ is assumed).

Figure 3: Minimum of $E_{\mathrm{TF}}(n)$ in the domain $0 \le n \le \infty$, a) inner point, b) left domain boundary.

In the form (3.12), the Thomas-Fermi equation is an integral equation because its solution $n(\mathbf{r})$ appears under the integral of $v_H$. The solving density $n(\mathbf{r})$ is non-zero everywhere where $v(\mathbf{r}) + v_H(\mathbf{r}) < \mu$. Otherwise $n$ is zero. The chemical potential (Fermi level) $\mu$ is to be determined such that $\int d^3r\, n = N$. The Thomas-Fermi equation may be transformed into a differential equation for the Hartree potential with the help of Poisson's equation

\[
-\nabla^2 v_H(\mathbf{r}) = 4\pi n(\mathbf{r}).
\tag{3.14}
\]

With the numerical value of $C_F$ from (3.1) and by inserting $n(\mathbf{r})$ from (3.12) into (3.14) one finds

\[
-\nabla^2 v_H(\mathbf{r}) = \frac{8\sqrt{2}}{3\pi}\, [\, \mu - v(\mathbf{r}) - v_H(\mathbf{r}) \,]_+^{3/2} .
\tag{3.15}
\]

Appropriate boundary conditions have to be added of course, which take on a simple form in problems of high symmetry.
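The step from (3.12) and (3.14) to (3.15) amounts to inverting the local relation for the density and collecting constants. A minimal sketch (the function name `tf_density` is an illustrative assumption) confirms that $4\pi\,[3/(5C_F)]^{3/2}$ equals the prefactor $8\sqrt{2}/(3\pi)$:

```python
import math

C_F = 0.3 * (3.0 * math.pi ** 2) ** (2.0 / 3.0)   # eq. (3.1)


def tf_density(mu, v_eff):
    """Invert (3.12): n = [ 3 (mu - v_eff)_+ / (5 C_F) ]^(3/2).

    v_eff stands for v + v_H at the point considered; n vanishes where v_eff >= mu.
    """
    return (3.0 * max(mu - v_eff, 0.0) / (5.0 * C_F)) ** 1.5


prefactor_from_cf = 4.0 * math.pi * (3.0 / (5.0 * C_F)) ** 1.5
prefactor_in_text = 8.0 * math.sqrt(2.0) / (3.0 * math.pi)
print(prefactor_from_cf, prefactor_in_text)

# density inside (v_eff < mu) and outside (v_eff > mu) the classically allowed region
print(tf_density(0.0, -1.0), tf_density(0.0, 0.5))
```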

Consider the enormous gain of the Thomas-Fermi approximation: Instead of the solution of the $N$-particle Schrödinger equation, which is an eigenvalue problem in $3N$ coordinates (where $N$ can be arbitrarily large, depending on the case of application), approximants of the ground state energy and the ground state density are now obtained from the solution of a single 3-dimensional integral equation, or alternatively from the direct solution of a 3-dimensional variational problem.


3.2 The Thomas-Fermi Atom

Since an excellent textbook treatment of the Thomas-Fermi atom is given in [Landau and Lifshitz, 1977, §70], we keep the discussion brief here.

For a neutral atom,

\[
N = Z, \qquad v = -\frac{Z}{r}, \qquad \nabla^2 v = 4\pi Z \delta(\mathbf{r}).
\tag{3.16}
\]

The Thomas-Fermi equation is solved by a spherical $n(r)$ in this case, and since the minimizing density of the Thomas-Fermi functional is unique, the Thomas-Fermi atom is always spherical. Suppose now that $n(r)$ is zero outside of some radius $r_0$. Then, for a neutral atom, by ordinary electrostatics,

\[
v_{\mathrm{eff}} \stackrel{\mathrm{def}}{=} v + v_H = 0 \quad \text{for } r > r_0,
\tag{3.17}
\]

too, because of the neutrality. Hence, from (3.12), $\mu = 0$ in this case. (The argument remains valid for $r_0 \to \infty$.) For a positively charged ion, $v_{\mathrm{eff}} < 0$ for $r > r_0$, and hence $\mu = v_{\mathrm{eff}}(r_0) < 0$. (The right hand side of (3.12) is equal to $\mu - v_{\mathrm{eff}}$ for $r \le r_0$, where $n(r) \ge 0$, and from $n(r_0) = 0$ follows $\mu = v_{\mathrm{eff}}(r_0)$.) For a negatively charged ion, it is easily seen that a finite $r_0$ is impossible, because now $v_{\mathrm{eff}} > 0$ would follow from electrostatics for $r > r_0$, where $n(r) = 0$, and hence (3.12) could not be fulfilled. Furthermore, $\mu \neq 0$ is also impossible, since both the effective potential and the density approach zero for $r \to \infty$.

Since we found $\mu = 0$ for $N \ge Z$ (neutral atom or negatively charged ion), and considering (3.16), the Thomas-Fermi equation for this case takes on the form

\[
-\nabla^2 v_{\mathrm{eff}} = \frac{8\sqrt{2}}{3\pi}\, (-v_{\mathrm{eff}})^{3/2} - 4\pi Z \delta(\mathbf{r}).
\tag{3.18}
\]

The $\delta$-function leads to the boundary condition $r v_{\mathrm{eff}} \to -Z$ for $r \to 0$. Furthermore, we already discussed $v_{\mathrm{eff}} \to 0$ for $r \to \infty$. These two conditions determine the solution of (3.18) for $v_{\mathrm{eff}}$ uniquely. Since, moreover, $n(r)$ is uniquely obtained from $v_{\mathrm{eff}}$ via Poisson's equation, a unique relation $N = N(Z)$ is obtained under the presupposition $N \ge Z$. For $r \neq 0$, (3.18) reads

\[
-\frac{1}{r} \frac{d^2}{dr^2}\, r v_{\mathrm{eff}} = \frac{8\sqrt{2}}{3\pi}\, (-v_{\mathrm{eff}})^{3/2},
\tag{3.19}
\]

and for large $r$, its solution is $v_{\mathrm{eff}} \sim r^{-4}$. On the other hand, if $N(r)$ denotes the electron number inside the radius $r$, then Gauss' theorem of electrostatics yields $Z - N(r) = r^2\, dv_{\mathrm{eff}}/dr$. Hence, for the solution of (3.19), $Z - N = Z - N(\infty) = 0$. Summarizing, under the assumption $N \ge Z$ we found $N = Z$. There is a unique solution of the Thomas-Fermi theory for every neutral atom, but none for negatively charged ions. For $N < Z$, a finite $r_0$ is always obtained. The positively charged Thomas-Fermi ion has a finite radius.

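With the ansatz

\[
v_{\mathrm{eff}}(r) = -\frac{Z}{r}\, \chi(\alpha r), \qquad
\alpha = \frac{3}{5}\, (4\pi)^{2/3}\, C_F^{-1}\, Z^{1/3} = 1.1295\, Z^{1/3},
\tag{3.20}
\]

the Thomas-Fermi equation for the neutral atom is cast into the universal equation

\[
\frac{d^2\chi(x)}{dx^2} = \frac{1}{x^{1/2}}\, [\chi(x)]^{3/2}, \qquad \chi(0) = 1, \quad \chi(\infty) = 0,
\tag{3.21}
\]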

which may be solved numerically (Fig. 4).

The electron density $n(r)$ is obtained as

\[
n(r) = \frac{32}{9\pi^3} \left[ \frac{\chi(\alpha r)}{\alpha r} \right]^{3/2} Z^2
\tag{3.22}
\]

with the asymptotics

\[
n(r) \sim r^{-3/2} \quad (r \to 0), \qquad n(r) \sim r^{-6} \quad (r \to \infty).
\tag{3.23}
\]

(For a positively charged ion, $N < Z$, the density $n(r)$ has a bounded support, i.e., $n(r) = 0$ outside of some radius $r_0$ as discussed above.)

The Thomas-Fermi energy of the atom (in the notation of (3.8) and the text thereafter) is

\[
E^{\mathrm{TF}}_Z[Z] = -0.7687\, Z^{7/3}.
\tag{3.24}
\]

Compared to accurate quantum mechanical results, this is 54% too low for hydrogen (for which the exact value is $-0.5 = -1$ Rydberg), 35% too low for helium, and about 15% too low for heavy elements ($Z \approx 100$).
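The universal equation for $\chi$ can be solved with a simple shooting method. The sketch below is one possible approach, not the method used for the original figure: it substitutes $x = t^2$ to remove the $x^{-1/2}$ singularity at the origin, integrates with a classical Runge-Kutta step, and bisects on the unknown initial slope $\chi'(0)$. The bracket $[-2, -1]$, the step size, and the final relation $E = \tfrac{3}{7}\,\chi'(0)\,\alpha_1 Z^{7/3}$ with $\alpha_1 = (128/9\pi^2)^{1/3}$ (a standard result for the neutral Thomas-Fermi atom, not derived in this text) are assumptions of the example.

```python
import math


def shoot(slope, t_max=8.0, dt=0.005):
    """Integrate chi'' = chi^(3/2)/sqrt(x), chi(0) = 1, chi'(0) = slope.

    With x = t^2, u = chi and w = dchi/dx the system becomes regular:
        du/dt = 2 t w,   dw/dt = 2 u^(3/2).
    Returns -1 if chi reaches zero (slope too steep), +1 otherwise.
    """
    def rhs(t, u, w):
        return 2.0 * t * w, 2.0 * max(u, 0.0) ** 1.5

    t, u, w = 0.0, 1.0, slope
    while t < t_max:
        k1u, k1w = rhs(t, u, w)
        k2u, k2w = rhs(t + 0.5 * dt, u + 0.5 * dt * k1u, w + 0.5 * dt * k1w)
        k3u, k3w = rhs(t + 0.5 * dt, u + 0.5 * dt * k2u, w + 0.5 * dt * k2w)
        k4u, k4w = rhs(t + dt, u + dt * k3u, w + dt * k3w)
        u += dt * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
        w += dt * (k1w + 2.0 * k2w + 2.0 * k3w + k4w) / 6.0
        t += dt
        if u <= 0.0:
            return -1      # chi crosses zero at finite x (ion-like solution)
        if w >= 0.0:
            return +1      # chi passes a minimum and grows again
    return +1


def initial_slope(lo=-2.0, hi=-1.0, n_iter=50):
    """Bisect on chi'(0) between a too-steep (lo) and a too-shallow (hi) slope."""
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if shoot(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


B = initial_slope()
alpha_1 = (128.0 / (9.0 * math.pi ** 2)) ** (1.0 / 3.0)   # alpha / Z^(1/3)
energy_coefficient = 3.0 / 7.0 * alpha_1 * B               # E = coefficient * Z^(7/3)
print(B, alpha_1, energy_coefficient)
```

The two failure modes used for the bisection mirror the discussion above: slopes that are too steep give solutions vanishing at a finite radius (positive ions), while slopes that are too shallow give solutions that do not decay at infinity.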

The monotonic function $\chi(x)$ of Fig. 4 yields a monotonic charge density with no indication of any shell structure of the Thomas-Fermi atom. The charge density diverges erroneously for $r \to 0$ and has a wrong asymptotic behavior $\sim r^{-6}$, where an atom in quantum mechanics must have an exponential falloff [Hoffmann-Ostenhof et al., 1980]. (Check that the pre-factor of the asymptotics $n(r) \sim r^{-6}$ does not depend on $Z$; the 'size of the Thomas-Fermi atom' is independent of $Z$.)

Figure 4: Universal Thomas-Fermi function $\chi(x)$ of neutral atoms (plotted for $0 \le x \le 12$, with $\chi$ decreasing from 1 towards 0).

Everything discussed in this section transfers accordingly to molecules, except the universality of a function $\chi$, but including the asymptotics close to nuclei and at infinity, and including the non-existence of negatively charged entities [Lieb, 1981].

3.3 The Thomas-Fermi Screening Length

Consider a homogeneous electron liquid of density $n(\mathbf{r}) = n^{(0)} = \mathrm{const.}$ in a positive neutralizing background density $n_+ = n^{(0)}$, so that the effective potential $v^{(0)}_{\mathrm{eff}}(\mathbf{r}) = v^{(0)}(\mathbf{r}) + v^{(0)}_H(\mathbf{r}) = \mathrm{const.}$, too. The external potential $v$ produced by the background charge density, and the Hartree potential $v_H$ separately are, of course, not constant in space, since they must fulfill the respective Poisson equations so that their Laplacian derivative is constant, and from (3.14, 3.15)

\[
4\pi n^{(0)} = -\nabla^2 v^{(0)}_H(\mathbf{r}) = \frac{8\sqrt{2}}{3\pi}\, [\, \mu - v^{(0)}_{\mathrm{eff}} \,]^{3/2} = \frac{8\sqrt{2}}{3\pi}\, \varepsilon_f^{3/2},
\tag{3.25}
\]


where the Fermi energy $\varepsilon_f$ is the distance from the constant effective potential to the chemical potential $\mu$. It is, according to these relations, connected with the density by

\[
\varepsilon_f = \frac{1}{2}\, (3\pi^2 n^{(0)})^{2/3},
\tag{3.26}
\]

exactly as in (1.43), from which the Thomas-Fermi kinetic energy functional was derived.

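Put now a small additional test charge $\delta Z$ in the coordinate origin, so that an additional external potential $\delta v = -\delta Z/r$ appears, for which $\nabla^2 \delta v = 0$ and hence $\nabla^2 \delta v_H = \nabla^2 \delta v_{\mathrm{eff}}$ at $r \neq 0$. Thus, now (3.15) yields

\[
-\nabla^2 (v^{(0)}_H + \delta v_{\mathrm{eff}}) = \frac{8\sqrt{2}}{3\pi}\, (\varepsilon_f - \delta v_{\mathrm{eff}})^{3/2}
\approx \frac{8\sqrt{2}}{3\pi}\, \varepsilon_f^{3/2} \left( 1 - \frac{3}{2} \frac{\delta v_{\mathrm{eff}}}{\varepsilon_f} \right).
\tag{3.27}
\]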

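The last expression has been linearized with respect to the perturbation. Subtraction of (3.25) leads to

\[
\nabla^2 \delta v_{\mathrm{eff}} = \frac{4\sqrt{2\varepsilon_f}}{\pi}\, \delta v_{\mathrm{eff}} \stackrel{\mathrm{def}}{=} \frac{1}{\lambda_{\mathrm{TF}}^2}\, \delta v_{\mathrm{eff}} .
\tag{3.28}
\]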

As is readily obtained from this relation, the screened potential perturbation has a Yukawa form

\[
\delta v_{\mathrm{eff}}(r) \sim \frac{1}{r}\, e^{-r/\lambda_{\mathrm{TF}}}
\tag{3.29}
\]

with the Thomas-Fermi screening length $\lambda_{\mathrm{TF}}$. Strictly speaking, (3.29) gives the asymptotics for large $r$, because the linearization in (3.27) was justified for sufficiently large $r$ only. With the electron density parameter $r_s$ defined by

\[
\frac{4\pi}{3}\, r_s^3 \stackrel{\mathrm{def}}{=} n^{-1}
\tag{3.30}
\]

one finally obtains

\[
\lambda_{\mathrm{TF}}^2 = \frac{\pi}{4 k_f} = \left( \frac{\pi}{12} \right)^{2/3} r_s
\tag{3.31}
\]

for the Thomas-Fermi screening length. For ordinary metals, $r_s = 2 \ldots 6$ and hence $\lambda_{\mathrm{TF}} \approx 0.9 \ldots 1.6$ in natural units.
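The numbers quoted for metals follow directly from the screening-length formula. The short sketch below assumes the usual definition of $r_s$ as the radius of the sphere containing one electron, $(4\pi/3)\, r_s^3 = 1/n$, and evaluates $\lambda_{\mathrm{TF}}$ both via $k_f$ and via the closed form in $r_s$; the function names are illustrative.

```python
import math


def k_fermi(r_s):
    """Fermi momentum k_f = (3 pi^2 n)^(1/3) with n = 3 / (4 pi r_s^3)."""
    n = 3.0 / (4.0 * math.pi * r_s ** 3)
    return (3.0 * math.pi ** 2 * n) ** (1.0 / 3.0)


def lambda_tf(r_s):
    """Thomas-Fermi screening length from lambda^2 = pi / (4 k_f)."""
    return math.sqrt(math.pi / (4.0 * k_fermi(r_s)))


for r_s in (2.0, 4.0, 6.0):
    closed_form = math.sqrt((math.pi / 12.0) ** (2.0 / 3.0) * r_s)
    print(r_s, lambda_tf(r_s), closed_form)
```

Both columns agree, and the screening length stays of the order of one atomic unit over the whole metallic density range, i.e. comparable to or shorter than the interelectronic distance $r_s$.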


3.4 Scaling Rules

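In this section we analyze the dependence of the functional (3.4) on the parameters $Z_\mu$, $\mathbf{R}_\mu$. Using ideas of physical similarity, we scale lengths and densities according to $\mathbf{R}_\mu \to \tilde{\mathbf{R}}_\mu = \gamma \mathbf{R}_\mu$, $n(\mathbf{r}) \to \tilde{n}(\mathbf{r}) = n(\gamma^{-1}\mathbf{r})$, so that density profiles are stretched on the same scale as the atom position vectors $\mathbf{R}_\mu$. We assume that this length scaling can be compensated by corresponding scalings of density amplitudes, charges and energies with appropriate factors. Thus, assuming

\[
E_{\mathrm{TF}}[n(\mathbf{r}); Z_\mu, \mathbf{R}_\mu] = \eta\, E_{\mathrm{TF}}[\alpha\, n(\gamma^{-1}\mathbf{r}); \beta Z_\mu, \gamma \mathbf{R}_\mu],
\tag{3.32}
\]

from the four terms of (3.4) the conditions

\[
\eta = \gamma^{-3}\alpha^{-5/3} = \gamma^{-2}\beta^{-1}\alpha^{-1} = \gamma^{-5}\alpha^{-2} = \gamma\beta^{-2}
\tag{3.33}
\]

follow, hence

\[
E_{\mathrm{TF}}[n(\mathbf{r}); Z_\mu, \mathbf{R}_\mu] = \gamma^7\, E_{\mathrm{TF}}[\gamma^{-6} n(\gamma^{-1}\mathbf{r}); \gamma^{-3} Z_\mu, \gamma \mathbf{R}_\mu].
\tag{3.34}
\]

With this result, from (3.8, 3.12, 3.13) and (3.17) the following scaling relations are obtained:

\[
n(\mathbf{r}; Z_\mu, \mathbf{R}_\mu) = \gamma^6\, n(\gamma\mathbf{r}; \gamma^{-3} Z_\mu, \gamma \mathbf{R}_\mu),
\tag{3.35}
\]

\[
N(Z_\mu, \mathbf{R}_\mu) = \gamma^3\, N(\gamma^{-3} Z_\mu, \gamma \mathbf{R}_\mu),
\tag{3.36}
\]

\[
v_{\mathrm{eff}}(\mathbf{r}; Z_\mu, \mathbf{R}_\mu) = \gamma^4\, v_{\mathrm{eff}}(\gamma\mathbf{r}; \gamma^{-3} Z_\mu, \gamma \mathbf{R}_\mu),
\tag{3.37}
\]

\[
\mu(Z_\mu, \mathbf{R}_\mu) = \gamma^4\, \mu(\gamma^{-3} Z_\mu, \gamma \mathbf{R}_\mu),
\tag{3.38}
\]

\[
E^{\mathrm{TF}}_N(Z_\mu, \mathbf{R}_\mu) = \gamma^7\, E^{\mathrm{TF}}_{\gamma^{-3}N}(\gamma^{-3} Z_\mu, \gamma \mathbf{R}_\mu).
\tag{3.39}
\]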

The density minimizing the left side of (3.34) is $n(\mathbf{r}; Z_\mu, \mathbf{R}_\mu)$; the one minimizing the right side is $n(\mathbf{r}; \gamma^{-3} Z_\mu, \gamma \mathbf{R}_\mu)$. Eq. (3.34) also says that this latter density is expressed by the former one as $n(\mathbf{r}; \gamma^{-3} Z_\mu, \gamma \mathbf{R}_\mu) = \gamma^{-6} n(\gamma^{-1}\mathbf{r}; Z_\mu, \mathbf{R}_\mu)$. This way, (3.35) is obtained. With its use, the remaining relations follow. The result (3.24) is just a special case of the last scaling relation.
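The exponents in the scaling relation can be checked with exact rational arithmetic. In the sketch below (an illustration; the variable names are assumptions), $\alpha = \gamma^{a}$ and $\beta = \gamma^{b}$, and each of the four terms of the Coulomb Thomas-Fermi functional gives one expression for the exponent of $\gamma$ in $\eta$:

```python
from fractions import Fraction

# alpha = gamma**a and beta = gamma**b, with the values claimed in the text
a = Fraction(-6)
b = Fraction(-3)

# exponent of gamma in eta, term by term
e_kinetic = -3 - Fraction(5, 3) * a   # eta = gamma^-3 alpha^-5/3
e_external = -2 - b - a               # eta = gamma^-2 beta^-1 alpha^-1
e_hartree = -5 - 2 * a                # eta = gamma^-5 alpha^-2
e_nuclear = 1 - 2 * b                 # eta = gamma beta^-2

print(e_kinetic, e_external, e_hartree, e_nuclear)
```

All four exponents coincide, so a single factor $\eta = \gamma^7$ compensates the scaling of every term, which is the content of the scaling law for the functional.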

There is one key issue of the modern treatment of Thomas-Fermi theory, connected with this type of scaling. If $E_N(Z_\mu, \mathbf{R}_\mu)$ is the exact quantum mechanical ground state energy corresponding to an assembly (3.4), i.e., the infimum of the spectrum of the corresponding $N$-electron Hamiltonian with Coulomb interactions and the nuclei at fixed positions, then

\[
\lim_{\gamma \to 0} \gamma^7 E_{\gamma^{-3}N}(\gamma^{-3} Z_\mu, \gamma \mathbf{R}_\mu) = E^{\mathrm{TF}}_N(Z_\mu, \mathbf{R}_\mu).
\tag{3.40}
\]

This is an $N \to \infty$ limit with all $z_\mu = Z_\mu/N$ fixed and interatomic distances uniformly decreasing [Lieb and Simon, 1977, Lieb, 1981]. There is a corresponding relation (in a certain weak topological sense) for the densities [ibid.]. Thomas-Fermi theory is asymptotically exact for large nuclear charges and electron numbers. Unfortunately, the convergence of (3.40) is very slow on a scale of the real world (cf. the numbers given after (3.24)).

Another exact relation is obtained by multiplying the Thomas-Fermi equation (3.12) by $n(\mathbf{r})$ and integrating over $\mathbf{r}$-space. If we denote the three terms of (3.3) for the minimizing density in turn by $K$, $U$ and $W$, then, since $\int d^3r\, n\, v_H = 2W$, the result may be expressed as

\[
\frac{5}{3} K = \mu N - U - 2W.
\tag{3.41}
\]

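The virial theorem is proved equally simply in Thomas-Fermi theory for an atom: Let $n(\mathbf{r})$ be the solution of the Thomas-Fermi equation for an atom, and let $n_\lambda(\mathbf{r}) = \lambda^3 n(\lambda\mathbf{r})$ (so that $N$ is unchanged when varying $\lambda$). Under this scaling $K$ is multiplied by $\lambda^2$, while $U$ and $W$ are multiplied by $\lambda$. Then $E_{\mathrm{TF}}[n_\lambda; v]$ has its minimum at $\lambda = 1$,

\[
0 = \left. \frac{\partial E_{\mathrm{TF}}[n_\lambda(\mathbf{r}); v(\mathbf{r})]}{\partial \lambda} \right|_{\lambda=1} = 2K + U + W.
\tag{3.42}
\]

The $n^{5/3}$-dependence ensures the correct scaling behavior of the kinetic energy term $K$, thus establishing the virial theorem.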
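The scaling argument can be tried out on any trial density, since $K \to \lambda^2 K$ and $U, W \to \lambda U, \lambda W$ under $n_\lambda(\mathbf{r}) = \lambda^3 n(\lambda \mathbf{r})$ for Coulomb potentials. The sketch below uses the hydrogen 1s density $e^{-2r}/\pi$ with $Z = 1$ as an assumed example (it is not the Thomas-Fermi solution); its three energy terms are entered in closed form, and the optimal scale factor is the point where $2K + U + W = 0$ holds.

```python
import math

C_F = 0.3 * (3.0 * math.pi ** 2) ** (2.0 / 3.0)

# Closed-form terms for the assumed trial density exp(-2r)/pi and Z = 1:
K1 = C_F * 0.216 * math.pi ** (-2.0 / 3.0)   # C_F * int n^(5/3) d^3r
U1 = -1.0                                     # -int n/r d^3r
W1 = 5.0 / 16.0                               # Hartree energy of the 1s density


def e_scaled(lam):
    """Thomas-Fermi energy of the scaled density lam^3 n(lam r)."""
    return lam ** 2 * K1 + lam * (U1 + W1)


lam_opt = -(U1 + W1) / (2.0 * K1)             # stationary point of e_scaled
K, U, W = lam_opt ** 2 * K1, lam_opt * U1, lam_opt * W1
print(lam_opt, e_scaled(lam_opt), 2.0 * K + U + W)
```

The optimal scale factor differs from 1 because the trial density is not a minimizer of the functional; for the true Thomas-Fermi density the stationary point sits at $\lambda = 1$, which is the virial theorem.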

One could try to derive the virial theorem for molecules, too. However, as another basic defect, there is no stable molecule in Thomas-Fermi theory. The Thomas-Fermi energy (3.10) of a molecule is monotonically decreasing for increasing interatomic distances. This is Teller's famous no-binding theorem [Teller, 1962], the mathematically rigorous proof of which again is given in [Lieb, 1981].

3.5 Correction Terms

A comprehensive description (with mathematical rigor) of the state of the art with respect to the content of this section may again be found in [Lieb, 1981].


For quantum chemistry applications see also [Parr and Yang, 1989]. We content ourselves with brief comments only.

As was mentioned at the beginning of Section 3.1, Thomas and Fermi did not consider the exchange and correlation energy. According to (2.46) and for a homogeneous electron liquid of density $n$, this energy may be written as

\[
W_{XC} = \frac{1}{2} \int d^3r\, d^3r'\, w(|\mathbf{r}' - \mathbf{r}|)\, h(\mathbf{r}', \mathbf{r}) = \int d^3r'\; 2\pi \int_0^\infty dr\, r^2\, w(r)\, h(r).
\tag{3.43}
\]

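For the Coulomb interaction $w(r) = 1/r$ and replacing the pair correlation function $h(r)$ with the density times the exchange hole (2.53), the last integral gives

$$\begin{aligned}
2\pi n \int_0^\infty dr\, r\, h_X(r)
&= -9\pi n^2 \int_0^\infty dr\, r \left[\frac{\sin k_f r - k_f r \cos k_f r}{(k_f r)^3}\right]^2 \\
&= -\frac{9\pi n^2}{k_f^2} \int_0^\infty dx\, x \left[\frac{\sin x - x\cos x}{x^3}\right]^2 \\
&= -\frac{9\pi}{4}\,\frac{n^2}{k_f^2} = -C_X\, n^{4/3}, \qquad C_X = \frac{3}{4}\left(\frac{3}{\pi}\right)^{1/3} = 0.7386.
\end{aligned} \tag{3.44}$$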
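
The two numerical ingredients of (3.44), the value $1/4$ of the dimensionless integral and the constant $C_X = 0.7386$, can be cross-checked with a short script. The sketch below is only an illustration: the cutoff `X_MAX`, the step count, and the asymptotic tail estimate $1/(4X^2)$ are choices made here, not part of the text.

```python
import math


def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def integrand(x):
    """x * [(sin x - x cos x) / x^3]^2, the dimensionless integrand of (3.44)."""
    if x < 1e-2:
        # Taylor expansion of the bracket near x = 0 avoids numerical cancellation
        bracket = 1.0 / 3.0 - x * x / 30.0
    else:
        bracket = (math.sin(x) - x * math.cos(x)) / x ** 3
    return x * bracket * bracket


X_MAX = 200.0
# Numerical part on [0, X_MAX] plus the leading asymptotic tail estimate 1/(4 X_MAX^2)
integral = simpson(integrand, 0.0, X_MAX, 400000) + 1.0 / (4.0 * X_MAX ** 2)

k_f_prefactor = (3.0 * math.pi ** 2) ** (1.0 / 3.0)  # k_f = prefactor * n^(1/3)
c_x_numeric = 9.0 * math.pi * integral / k_f_prefactor ** 2
c_x_closed = 0.75 * (3.0 / math.pi) ** (1.0 / 3.0)

# integral should be close to 1/4, and both C_X values close to 0.7386
print(integral, c_x_numeric, c_x_closed)
```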

For the final expression, the Fermi momentum $k_f = (3\pi^2 n)^{1/3}$ was replaced by the density expression according to (1.43). In the spirit of the Thomas-Fermi approximation, Dirac [Dirac, 1930] suggested a correction term

$$-C_X \int d^3r\, n^{4/3}(\mathbf{r}) \approx W_X \tag{3.45}$$

to be added to the Thomas-Fermi functional. This Thomas-Fermi-Dirac theory is based on the functional

$$E_{\mathrm{TFD}}[n(\mathbf{r}); v(\mathbf{r})] = E_{\mathrm{TF}}[n(\mathbf{r}); v(\mathbf{r})] - C_X \int d^3r\, n^{4/3}(\mathbf{r}) \tag{3.46}$$

and leads to the Thomas-Fermi-Dirac equation

$$\frac{5}{3}\, C_F\, n^{2/3}(\mathbf{r}) - \frac{4}{3}\, C_X\, n^{1/3}(\mathbf{r}) = [\,\mu - v(\mathbf{r}) - v_H(\mathbf{r})\,]_+. \tag{3.47}$$
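
At a fixed point $\mathbf{r}$, (3.47) is a quadratic equation in $n^{1/3}$. The following sketch solves it for a given value $t \ge 0$ of the right-hand side. It is an illustration under stated assumptions: $C_F$ is taken as the standard Thomas-Fermi constant $\frac{3}{10}(3\pi^2)^{2/3}$ in atomic units, the function name `tfd_density` is hypothetical, and only the positive root is returned (whether the trivial solution $n = 0$ applies instead where $t = 0$ is not addressed here).

```python
import math

C_F = 0.3 * (3.0 * math.pi ** 2) ** (2.0 / 3.0)   # assumed standard value (a.u.)
C_X = 0.75 * (3.0 / math.pi) ** (1.0 / 3.0)        # from (3.44)


def tfd_density(t):
    """Positive root n of (5/3) C_F n^(2/3) - (4/3) C_X n^(1/3) = t, for t >= 0."""
    if t < 0.0:
        raise ValueError("t is the value of [mu - v - v_H]_+ and must be >= 0")
    a = 5.0 / 3.0 * C_F
    b = 4.0 / 3.0 * C_X
    # Quadratic a*y^2 - b*y - t = 0 in y = n^(1/3); take the root with y > 0
    y = (b + math.sqrt(b * b + 4.0 * a * t)) / (2.0 * a)
    return y ** 3


def residual(n, t):
    """Left-hand side minus right-hand side of the local equation."""
    return (5.0 / 3.0 * C_F * n ** (2.0 / 3.0)
            - 4.0 / 3.0 * C_X * n ** (1.0 / 3.0) - t)


for t in (0.0, 0.5, 1.0, 5.0):
    n = tfd_density(t)
    print(t, n, residual(n, t))
```

Note that the positive root stays finite as $t \to 0$, in contrast to the Thomas-Fermi case $C_X = 0$, where it vanishes.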

The functional space for $n$ is the same as previously; the theory, however, is considerably complicated by the negatively curved additional functional term, which spoils the convexity of $E_{\mathrm{TFD}}[n]$. This difficulty is overcome by introducing a convexified functional which leads, for a definite class of external potentials including those of assemblies of point charges, to the same energy minimum $E^{\mathrm{TFD}}_N$ and minimizing density $n$ as the original functional $E_{\mathrm{TFD}}[n; v]$ (in cases where the minimizing density exists). Moreover, even for zero external potential $v$ the Dirac exchange term leads to unphysical negative total energy values for densities leaking away in space with small corrugations. There is no minimizing density in this case; the energy approaches $-\varepsilon N$ asymptotically with a certain positive constant $\varepsilon$. Hence, if $N_{\max}$ is the maximum number of particles $v$ can bind (in the considered theory), then for $N > N_{\max}$ the Thomas-Fermi-Dirac theory makes no physical sense.

Unfortunately, the Thomas-Fermi-Dirac theory does not remedy any of the defects of ordinary Thomas-Fermi theory. The total energy of atoms comes out even lower (since the Dirac term is always negative), there are no negatively charged ions or radicals either, and Teller’s no-binding theorem remains in force, although neutral atoms or molecules now have a finite radius. All these defects derive from the poor kinetic energy functional (3.2) rather than from the Dirac term. The Dirac term was, however, very effective in a different context (see Section 4.3 below), and in this respect it should be mentioned that the value of $C_X$ given in (3.44) is not sacrosanct, since a Hartree-Fock state, from which the exchange hole (2.53) and hence the Dirac term was derived, always overestimates the exchange energy due to neglect of correlations.

The basic deficiency of Thomas-Fermi theory is the poor approximation of the kinetic energy. In order to improve upon it (originally in the problem of nucleon motion in a nucleus), von Weizsäcker [von Weizsäcker, 1935] considered modified plane waves $(1 + \mathbf{a}\cdot\mathbf{r})\exp(i\mathbf{k}\cdot\mathbf{r})$ in order to have an inhomogeneous situation and found a gradient correction term

$$K_W[n(\mathbf{r})] = \frac{1}{2}\int d^3r\, [\nabla n^{1/2}(\mathbf{r})]^2 \tag{3.48}$$

to the kinetic energy functional. Later on Kirshnits [Kirshnits, 1957] showed that the correct term in a systematic expansion of the kinetic energy functional is $(1/9)K_W$. One may, however, from different viewpoints come up with different coefficients $\lambda$, $(1/9) \le \lambda \le 1$, in front of $K_W$. Without diving into the details of this theory, we mention that considerable qualitative improvement is achieved in various respects: The electron density at atomic nuclei is now finite, and at infinity it decays exponentially. Negative ions are formed and molecules may bind. It is, however, not yet known to what extent the Thomas-Fermi-$\lambda$Weizsäcker theory provides accurate correction terms to the asymptotics (3.40).
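
As a concrete illustration of the gradient term (3.48), the sketch below evaluates $K_W$ for the hydrogen 1s density $n(r) = e^{-2r}/\pi$ in atomic units. For a single-orbital density such as this one, $K_W$ coincides with the exact kinetic energy, here $1/2$ Hartree. The example density, the radial grid and the finite-difference step are choices made for this sketch and are not taken from the text.

```python
import math


def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def sqrt_density(r):
    """Square root of the hydrogen 1s density n(r) = exp(-2r)/pi (atomic units)."""
    return math.exp(-r) / math.sqrt(math.pi)


def weizsaecker_energy(sqrt_n, r_max=40.0, steps=40000, dr=1e-5):
    """K_W = (1/2) int d^3r |grad n^(1/2)|^2 for a spherically symmetric density."""

    def integrand(r):
        # central finite difference for the radial derivative of n^(1/2)
        slope = (sqrt_n(r + dr) - sqrt_n(r - dr)) / (2.0 * dr)
        return 4.0 * math.pi * r * r * slope * slope

    return 0.5 * simpson(integrand, 0.0, r_max, steps)


k_w = weizsaecker_energy(sqrt_density)
print(k_w)  # expected to be close to the exact hydrogen kinetic energy of 0.5 Hartree
```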

The gradient expansion of the kinetic energy functional can be followed up to higher orders. We do not consider this here, since it seems, unfortunately, to be a dead end, and the sixth order already diverges. (See [Dreizler and Gross, 1990, Chapter 5] for a comprehensive and up-to-date survey.)

4 Hohenberg-Kohn Theory

The history of the Thomas-Fermi and Hohenberg-Kohn theories presents an instructive example of the way knowledge is gathered in many-body physics (or in physics in general). Thomas and Fermi introduced their ideas on a naive and pragmatic level with the aim of testing how the structure of heavier atoms, and hopefully chemistry, may be obtained from the newly developed apparatus of quantum physics, using the numerical means available at that time. After the promising first results, it was only about thirty years later that the absence of molecular binding in numerical results (connected with the rapid post-war growth in numerical capabilities) became a pressing problem, and only 35 years after the foundation of Thomas-Fermi theory was it analytically shown by Teller that there is no chemical binding in that theory (without the Weizsäcker term). Partly because of its destructive character, Teller’s theorem was doubted on grounds of rigor, and certainly this situation played a role in keeping alive the interest of mathematicians in the theory. Besides confirmation of Teller’s theorem, a great number of constructive and very important results have appeared up to now from those activities.

Meanwhile, quantum chemistry had developed on the basis of Hartree-Fock theory. Again a naive approximation had to be introduced due to numerical limitations despite the rapid development of computer techniques. Slater’s $X\alpha$ method was born, this time, however, with an astonishing success, sometimes even jealously noticed by many-body theorists.

The Hohenberg-Kohn theory is formal in nature, since there is hardly any hope of acquiring rigorous knowledge of the basic ingredient, the Hohenberg-Kohn functional $F[n]$. Nevertheless, with a clever trick contributed by Kohn and Sham, it simultaneously put the $X\alpha$ method on a much broader theoretical basis than the Hartree-Fock-Slater approximation had been (and hence at least partially explained its success) and opened many ways to generalize both the Thomas-Fermi and the $X\alpha$ approaches. In that situation, the marriage of this approach with sophisticated many-body theory and the development of the mathematical basis became extremely fruitful. To stress this again, the gap between the formally rigorous part of the theory and its pragmatic approximate versions will probably never be closed. (At present there is a tendency to increase the gap by the rapidly growing fields of application of approximate versions of the theory.) Nevertheless, the role of the formal part as a guide can hardly be overestimated.

In the whole of this chapter the theory is developed on a heuristic level with focus on the underlying physics. The mathematically strict development comes in Chapter 6.

Monographs and surveys on density functional theory in the understanding of Hohenberg and Kohn appear rather regularly. Here we cite a more or less representative selection: [Dreizler and da Providencia, 1985], [Parr and Yang, 1989], [Trickey, 1990], [Dreizler and Gross, 1990], and more recently [Gross and Dreizler, 1995].

4.1 The Basic Theorem by Hohenberg and Kohn

In this chapter we fix the particle number $N$ and consider Hamiltonians as given in (1.24):⁵

$$H[v] = -\frac{1}{2}\sum_{i=1}^{N}\nabla_i^2 + \sum_{i=1}^{N} v(x_i) + \frac{1}{2}\sum_{i\neq j}^{N} w(|\mathbf{r}_i - \mathbf{r}_j|) = T + U + W, \tag{4.1}$$

where the external potential $v$ is explicitly indicated as a functional variable. The functional dependence of $H$ on $v$ is of course affine-linear:

$$H[\alpha v_1 + \beta v_2] = \alpha H[v_1] + \beta H[v_2] \quad\text{for}\quad \alpha + \beta = 1. \tag{4.2}$$

As a reference system,

$$H^0[v] = -\frac{1}{2}\sum_{i=1}^{N}\nabla_i^2 + \sum_{i=1}^{N} v(x_i) \tag{4.3}$$

will be needed. In the following, a superscript 0 will systematically refer to the interaction-free case $w \equiv 0$. All Hamiltonians considered will be assumed bounded below, and, for a given interaction $w$ not explicitly indicated, the ground state energy

$$E[v] \stackrel{\mathrm{def}}{=} \inf\{\langle\Psi|H[v]|\Psi\rangle \mid \Psi \in \mathcal{W}_N\}, \tag{4.4}$$

$$\mathcal{W}_N \stackrel{\mathrm{def}}{=} \{\Psi \mid \Psi(x_1 \dots x_N)\ \text{(anti)symmetric},\ \langle\Psi|\Psi\rangle = 1,\ \langle\nabla_i\Psi|\nabla_i\Psi\rangle < \infty\ \text{for}\ i = 1 \dots N\}, \tag{4.5}$$

is defined even if there is no ground state $\Psi_0[v]$ minimizing (4.4) (for instance when $v$ cannot bind $N$ particles as discussed for the Thomas-Fermi case before (3.10)). If there is such a ground state (not necessarily unique; degeneracy is permitted), it is obviously gauge invariant with respect to potential constants, i.e., it is the same for all potentials $v + \mathrm{const.}$, and

$$E[v + \mathrm{const.}] = E[v] + N\cdot\mathrm{const.} \tag{4.6}$$

Therefore, potentials essentially differing only by such a gauge constant are not considered different in the following, and $v_1 \neq v_2$ means that $v_1 - v_2$ is not a.e. a constant.

⁵ Only in the last section is the more general potential case $v_{ss'}(\mathbf{r})$, due to a general external magnetic field which couples to the spin, considered.

For the rest of this section we restrict the consideration to spin-independent potentials. Nevertheless, in all that follows we keep the notation $v(x)$ and $n(x)$, with $x$ meaning $\mathbf{r}$ in the spin-independent case. The implications of spin dependence are discussed in Sections 4.7 and 4.8. We now consider potentials like those discussed in the text around (3.7), denoting this situation as $v \in \oplus L^p$ for some $p$'s ($p = \infty$ allowed), which do have an $N$-particle ground state. We define this class of potentials:

$$\mathcal{V}_N \stackrel{\mathrm{def}}{=} \{v \mid v \in \oplus L^p\ \text{for some}\ p\text{'s},\ H[v]\ \text{has a ground state}\}. \tag{4.7}$$

(Of course, $\mathcal{V}_N$ depends on $w$: $\mathcal{V}^0_N$ for the interaction-free case is generally different from $\mathcal{V}_N$. E.g. $v = -Z/|\mathbf{r}| \in \mathcal{V}^0_N$ for every $N < \infty$: There is an infinite number of interaction-free single-particle states bound by the Coulomb potential of a nucleus of charge $Z$, their energies clustering towards the continuum edge. However, with $w(r) = 1/r$, $v = -Z/|\mathbf{r}| \in \mathcal{V}_N$ for $N \le N_{\max}(Z) < \infty$ only, where $N_{\max}(Z)$ is the maximum number of mutually repelling particles which the potential $v = -Z/|\mathbf{r}|$ can bind.)

Now, for $v \in \mathcal{V}_N$, we have

$$H[v]\,\Psi_0[v] = \Psi_0[v]\,E[v] \tag{4.8}$$

with at least one ground state $\Psi_0[v]$. In case of degeneracy, $\Psi_0[v]$ denotes in the following any one of the degenerate ground states. Consider $v_1 \neq v_2 \in \mathcal{V}_N$ and suppose $\Psi_0[v_1] = \Psi_0[v_2] \equiv \Psi_0$. Subtracting the two Schrödinger equations from each other yields

$$\sum_i \big(v_1(x_i) - v_2(x_i)\big)\Psi_0 = \Psi_0\,(E[v_1] - E[v_2]) = \Psi_0\cdot\mathrm{const.} \tag{4.9}$$

Since the left side is different from a constant on a domain of non-zero measure, $\Psi_0$ must be zero there. It is a conjecture beyond doubt, although not easily proved mathematically, that this cannot be for $v \in \oplus L^p$. ($v \in \oplus L^p$ excludes hard potential barriers behind which $\Psi_0 \equiv 0$ would be possible.) Hence, $\Psi_0[v_1] \neq \Psi_0[v_2]$. Since $\Psi_0[v_i]$ is non-zero where the potentials are different, $\Psi_0[v_2]$ does not satisfy the Schrödinger equation for $v_1$, and hence a strict inequality

$$E[v_1] < \langle\Psi_0[v_2]|H[v_1]|\Psi_0[v_2]\rangle = E[v_2] + \int dx\, n[v_2]\,(v_1 - v_2) \tag{4.10}$$

holds, where $n[v]$ is the particle density corresponding to the ground state $\Psi_0[v]$. The same strict inequality holds with subscripts 1 and 2 reversed.

Suppose that still $n[v_1] = n[v_2]$ a.e. Then, adding the inequality (4.10) and that obtained by interchanging subscripts 1 and 2 yields

$$E[v_1] + E[v_2] < E[v_2] + E[v_1], \tag{4.11}$$

which is a contradiction. Hence,

$$n[v_1] \neq n[v_2] \quad\text{for}\quad v_1 \neq v_2, \tag{4.12}$$

or, in other words, for every given $n(x)$ (taken as a function on the whole $x$-space) there is at most one potential function $v(x)\ \mathrm{mod}(\mathrm{const.})$ for which $n(x)$ is the ground state density. This is the basic theorem by Hohenberg and Kohn:

$v(x)\ \mathrm{mod}(\mathrm{const.}) \in \mathcal{V}_N$ is a unique function of the ground state density $n(x)$.

($v(x)\ \mathrm{mod}(\mathrm{const.})$ is a family of potentials, $\mathcal{V}_N$ is the set of such families, and it is one of these families which is uniquely defined by $n(x)$.) For this reason, $\mathcal{V}_N$ may be called the set of $n$-representable potentials, although this name is not much in use in the literature.
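
The step leading to (4.11) can be written out explicitly. Writing $n$ for the common density $n[v_1] = n[v_2]$ assumed in the argument, the inequality (4.10) and its counterpart with interchanged subscripts read as follows; on adding them, the two integrals cancel.

```latex
\begin{align*}
E[v_1] &< E[v_2] + \int dx\, n\,(v_1 - v_2), \\
E[v_2] &< E[v_1] + \int dx\, n\,(v_2 - v_1), \\
\Longrightarrow\quad E[v_1] + E[v_2] &< E[v_2] + E[v_1].
\end{align*}
```

The last line is impossible, so the assumption of a common ground state density for $v_1 \neq v_2$ cannot hold.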

The use of the theorem is to transfer every functional dependence on $v$ into a functional dependence on $n$ by substituting $v[n]$. To this end we have to define the class of densities

$$\mathcal{A}_N \stackrel{\mathrm{def}}{=} \{n(x) \mid n\ \text{comes from an}\ N\text{-particle ground state}\}, \tag{4.13}$$

called the class of pure-state $v$-representable densities in the literature. Now, define the density functional by Hohenberg and Kohn as

$$F_{\mathrm{HK}}[n] \stackrel{\mathrm{def}}{=} E[v[n]] - \int dx\, v[n]\, n, \qquad n \in \mathcal{A}_N. \tag{4.14}$$

It is clear from (4.6) that the undefined potential constant drops out on the r.h.s. of (4.14); $F_{\mathrm{HK}}[n]$ is uniquely and well defined on $\mathcal{A}_N$. Now consider $F_{\mathrm{HK}}[n] + \int dx\, nv$ as a functional of the two independent variables $n$ and $v$ ($v$ in the integral not necessarily being $v[n]$). Let $n \in \mathcal{A}_N$. This implies that there is some potential $v_n = v[n]$ for which $n$ is a ground state density. Pick $v \in \mathcal{V}_N$ independently. Then,

$$\begin{aligned}
F_{\mathrm{HK}}[n] + \int dx\, nv
&= F_{\mathrm{HK}}[n] + \int dx\, nv_n + \int dx\, n(v - v_n) = E[v_n] + \int dx\, n(v - v_n) \\
&= \langle\Psi_0[v_n]|H[v_n]|\Psi_0[v_n]\rangle + \int dx\, n(v - v_n) = \langle\Psi_0[v_n]|H[v]|\Psi_0[v_n]\rangle \ge E[v].
\end{aligned}$$

This is just the same argument as used before (3.8), now on a rigorous basis. The result is the Hohenberg-Kohn variational principle

$$E[v] = \min_{n\in\mathcal{A}_N}\left\{F_{\mathrm{HK}}[n] + \int dx\, nv\right\} \tag{4.15}$$

for $v \in \mathcal{V}_N$.

Recall that in these considerations neither $\Psi_0[v]$ nor $n[v]$ were supposed unique. The ground state may be degenerate, and so may be the ground state density. The basic theorem states the mapping $v = v[n]$ to be single-valued, and that is enough for the whole theory. This point was first made by Lieb [Lieb, 1983] (cf. also [Kohn, 1985]). In the original paper [Hohenberg and Kohn, 1964] the analysis was confined to the class $\mathcal{V}'_N$ of potentials having a non-degenerate ground state and to densities $n \in \mathcal{A}'_N$ coming from a non-degenerate ground state. In that case, the mapping between $n$ and $v$ is one-to-one, and, consequently, there is even a one-to-one mapping between $n$ and $\Psi_0$. This makes the alternative definition

$$F_{\mathrm{HK}}[n] = \langle\Psi_0[n]|T + W|\Psi_0[n]\rangle, \qquad n \in \mathcal{A}'_N \tag{4.16}$$

of the Hohenberg-Kohn density functional possible, which of course coincides with (4.14) on the narrower class $\mathcal{A}'_N \subset \mathcal{A}_N$. As degeneracy of the ground state is quite common in physics and the theory makes little profit from the mappings being one-to-one, (4.14) should be taken as the basic definition of the Hohenberg-Kohn density functional, with the restriction to non-degenerate ground states released. Note, however, the discussion at the very end of this chapter.

The really serious problems remaining are connected with the fact that neither the classes $\mathcal{V}_N$ and $\mathcal{A}_N$ nor the functional $F_{\mathrm{HK}}[n]$ are known explicitly. From an information-theoretical point of view, one could even say that nothing was gained by the variational principle (4.15), because a guess of $F_{\mathrm{HK}}[n]$ seems just as hopeless as a direct guess of $E[v]$, both functionals being no doubt extremely involved. On the other hand, the very simple guess

$$F_{\mathrm{HK}}[n] \approx C_F\int d^3r\, n^{5/3}(\mathbf{r}) + \frac{1}{2}\int d^3r\, d^3r'\, n(\mathbf{r}')\, w(|\mathbf{r}' - \mathbf{r}|)\, n(\mathbf{r}) \tag{4.17}$$

in connection with the variational principle (4.15) yields precisely the Thomas-Fermi theory, which was shown in Chapter 3 to produce a number of encouraging estimates. Also the corrected versions of Thomas-Fermi theory from Section 3.5 all fit in the frame of Hohenberg-Kohn theory.

We are faced with two fundamentally different types of theory in this field: The Hohenberg-Kohn variational principle and its subsequently considered versions are rigorously based on quantum theory but are not given explicitly. Thomas-Fermi theory, its variants, and the related local-density approximation considered below are explicitly given variational principles, but only a few rigorous statements can be made about their connection to exact quantum theory, the strongest being (3.40). It is, however, a general experience in physics that a guess in a variational expression can be much more fruitful than a direct guess of a quantity to be estimated. The recent developments in density functional theory may serve as a brilliant example of this experience.

4.2 The Kohn-Sham Equation

The real breakthrough in modern density functional theory, which put it at once on one level with the Hartree-Fock-Slater approximation of many-fermion theory and tightly linked it to that approximation (but has meanwhile led far beyond it), came from Kohn and Sham [Kohn and Sham, 1965] with the suggestion of the Kohn-Sham equation. Recall that the weakest part of Thomas-Fermi theory was the treatment of the kinetic energy functional, and we are now going to explain the Kohn-Sham trick in handling this part.

The considerations of the last section can be carried through for any reasonable $w$ (so that the considered Hamiltonians are bounded below). Particularly for $w \equiv 0$ the Hohenberg-Kohn functional

$$T[n] \stackrel{\mathrm{def}}{=} E^0[v^0[n]] - \int dx\, v^0[n]\, n, \qquad n \in \mathcal{A}^0_N \tag{4.18}$$

is just the kinetic energy of the ground state of the interaction-free $N$-particle system as a functional of the ground state density. Even this functional is not explicitly known, but its existence (i.e. the property of the mapping $n \mapsto T$ to be single-valued) is again guaranteed by the Hohenberg-Kohn theorem. Among the ground states of an interaction-free fermion system in an external field $v$ there are always determinantal states (1.15). In case of degeneracy, linear combinations of degenerate determinantal states may also serve as ground states. A subtlety here is that a density derived from a linear combination of degenerate determinantal ground states may not necessarily be derivable from a single determinantal ground state [Lieb, 1983]. Therefore, slightly deviating from the general scheme, we define

$$\mathcal{A}^0_N \stackrel{\mathrm{def}}{=} \{n(x) \mid n\ \text{comes from a determinantal}\ N\text{-particle ground state}\} \tag{4.19}$$

as the domain of $T[n]$.

The density of a determinantal state is

$$n(x) = \sum_{i=1}^{N}\phi_i(x)\,\phi_i^*(x) \quad\text{for}\quad \Psi^0_0(x_1 \dots x_N) = \frac{1}{\sqrt{N!}}\det\|\phi_i(x_k)\|, \quad \langle\phi_i|\phi_j\rangle = \delta_{ij}. \tag{4.20}$$

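Its kinetic energy is

$$E^0_{\mathrm{kin}} = \langle\Psi^0_0|T|\Psi^0_0\rangle = -\frac{1}{2}\sum_{i=1}^{N}\langle\phi_i|\nabla^2|\phi_i\rangle. \tag{4.21}$$

Hence,

$$T\Big[n = \sum\phi_i\phi_i^*\Big] = -\frac{1}{2}\sum_{i=1}^{N}\langle\phi_i|\nabla^2|\phi_i\rangle \tag{4.22}$$

under the orthonormality conditions of (4.20) and under the condition that the orbitals $\phi_i$, $i = 1, \dots, N$ form a determinantal ground state. In other words,

$$T[n] = \min_{\phi_i^*,\,\phi_i}\left\{-\frac{1}{2}\sum_{i=1}^{N}\langle\phi_i|\nabla^2|\phi_i\rangle \;\middle|\; \langle\phi_i|\phi_j\rangle = \delta_{ij},\ \sum_{i=1}^{N}\phi_i\phi_i^* = n\right\}. \tag{4.23}$$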

For $n$ fixed, the minimum of $E^0$ is attained at minimal $E^0_{\mathrm{kin}}$. Actually this expression for $T$ is already a continuation of the definition (4.18) beyond $\mathcal{A}^0_N$, since it also gives a value for $T$ when no potential $v^0[n]$ exists for that density $n$. We come back to this type of generalization in a more general context in the next but one section. The relation (4.23) may now be inserted into the Hohenberg-Kohn variational principle (4.15), which in the considered case reads

$$E^0[v] = \min_{n}\left\{T[n] + \int dx\, nv\right\}. \tag{4.24}$$

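The density $n$ in the last integral may also be replaced with the conditions of (4.23), $\int dx\, nv = \sum_i \langle\phi_i|v|\phi_i\rangle$, $\langle\phi_i|\phi_j\rangle = \delta_{ij}$, and then, instead of a two-step minimization, first of (4.23) and then of (4.24), the minimum may likewise be sought in one step:

$$E^0[v] = \min_{\phi_i^*,\,\phi_i}\left\{\sum_{i=1}^{N}\left(-\frac{1}{2}\langle\phi_i|\nabla^2|\phi_i\rangle + \langle\phi_i|v|\phi_i\rangle\right) \;\middle|\; \langle\phi_i|\phi_j\rangle = \delta_{ij}\right\}. \tag{4.25}$$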

Now, after introducing Lagrange multipliers $\varepsilon_i$ for the side conditions of orbital normalization (orthogonality will automatically be provided), each term of the $i$-sum may be varied separately, which yields just the one-particle Schrödinger equation

$$\left(-\frac{\nabla^2}{2} + v(x)\right)\phi_i(x) = \phi_i(x)\,\varepsilon_i \tag{4.26}$$

for the $N$ orbitals lowest in energy of non-interacting electrons.

In the interacting case $w \not\equiv 0$, we decompose the Hohenberg-Kohn functional according to

$$F_{\mathrm{HK}}[n] = T[n] + E_H[n] + E_{\mathrm{XC}}[n], \qquad n \in \mathcal{A}_N, \tag{4.27}$$

with the Hartree energy $E_H$ as introduced in (2.46) (its variation with respect to the density $n$ giving the Hartree potential (3.13)). This decomposition defines $E_{\mathrm{XC}}$ as a functional of $n$ on the considered domain, since all the remaining terms of (4.27) were already defined as functionals of $n$ on that domain. (The Hartree energy (2.46) was defined for any density $n$, and the domain of definition of $T[n]$ by (4.23) is a whole functional space containing all of $\mathcal{A}_N$, as will be discussed in Section 4.4. Since, however, $F_{\mathrm{HK}}[n]$ was only defined on $\mathcal{A}_N$, $E_{\mathrm{XC}}$ is so far likewise only defined on that set of densities.) Compare the corresponding decomposition (2.65).

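Now, to proceed one replaces $T[n]$ with the r.h.s. of (4.23) and $n$ in $E_H$ and in $E_{\mathrm{XC}}$ with (4.20). Then one can again combine the variations of (4.23) and of (4.15) into a one-step variation

$$E[v] = \min_{\phi_i^*,\,\phi_i}\left\{\sum_{i=1}^{N}\left(-\frac{1}{2}\langle\phi_i|\nabla^2|\phi_i\rangle + \langle\phi_i|v|\phi_i\rangle\right) + E_H[n] + E_{\mathrm{XC}}[n] \;\middle|\; \langle\phi_i|\phi_j\rangle = \delta_{ij}\right\}. \tag{4.28}$$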

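This time, however, some functional analysis is necessary. The variation of $E_{\mathrm{XC}}[n]$ with respect to a variation of $\phi_i^*$ may be expressed via (4.20), the latter implying $\delta n(x)/\delta\phi_i^*(x') = \delta(x - x')\,\phi_i(x')$:

$$\frac{\delta E_{\mathrm{XC}}}{\delta\phi_i^*(x)} = \int dx'\, \frac{\delta E_{\mathrm{XC}}}{\delta n(x')}\,\frac{\delta n(x')}{\delta\phi_i^*(x)} = \frac{\delta E_{\mathrm{XC}}}{\delta n(x)}\,\phi_i(x). \tag{4.29}$$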

(In the existing literature the whole approach is often presented rather sketchily by formally treating the kinetic energy this way:

$$-\frac{\nabla^2}{2}\,\phi_i(x) = \frac{\delta E^0_{\mathrm{kin}}}{\delta\phi_i^*(x)} = \int dx'\, \frac{\delta T}{\delta n(x')}\,\frac{\delta n(x')}{\delta\phi_i^*(x)} = \frac{\delta T}{\delta n(x)}\,\phi_i(x). \tag{4.30}$$

This chain of equations does not mean that $\delta T/\delta n(x) = -\nabla^2/2$, since it does not hold for every function $\phi_i(x)$ and for every variation $\delta\phi_i^*(x)$ of a complete set. It just holds for those orbitals $\phi_i(x)$ which make up the determinantal ground state with density $n(x)$ and under an integral with $\delta\phi_i^*(x)$ orthogonal to all of them. Such a sketchy presentation of course requires a certain knowledge of convex analysis on the side of the reader. The simple but essential point is that, given $n$, $T[n]$ is the minimum (4.23) of (4.22) rather than (4.22) itself for a general set of $\phi_i$.)

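The derivations given here are still a bit formal and on a heuristic level. On this level one may orient oneself by partial derivatives and use the following formal correspondences:

$$\begin{aligned}
n_i,\ i = 1 \dots M \quad &: \quad n(x),\ x \in \mathbb{R}^3, \\
\frac{\partial n_i}{\partial n_j} = \delta_{ij} \quad &: \quad \frac{\delta n(x)}{\delta n(y)} = \delta(x - y), \\
\frac{\partial}{\partial n_i}\sum_j a_j n_j = a_i \quad &: \quad \frac{\delta}{\delta n(x)}\int dx'\, f(x')\,n(x') = f(x), \\
\frac{\partial}{\partial\phi_k} f(n_i(\phi_k)) = \sum_i \frac{\partial f}{\partial n_i}\,\frac{\partial n_i}{\partial\phi_k} \quad &: \quad \frac{\delta}{\delta\phi(x)} F[n[\phi]] = \int dx'\, \frac{\delta F}{\delta n(x')}\,\frac{\delta n(x')}{\delta\phi(x)}.
\end{aligned} \tag{4.31}$$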

(The third line is just a special case of Euler’s lemma for the first variation: $(\delta/\delta n(x))\int dx'\, f(x')\,n(x') = 0 \Rightarrow f(x) \equiv 0$. Generally, the linear increment (total differential) $\delta F[\delta n] = \int dx\, (\delta F/\delta n(x))\, \delta n(x)$ is to be considered as a linear functional of $\delta n$, which may be viewed as represented by some (generalized) integral kernel $\delta F/\delta n(x)$, often a generalized function such as a $\delta$-function. A correct definition of functional derivatives will be given in Section 5.7.)
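
The correspondence between partial and functional derivatives can be made concrete on a grid. The sketch below discretizes the Dirac exchange functional $-C_X\int dx\, n^{4/3}$ on a hypothetical one-dimensional grid (a toy setting chosen here; the text works in three dimensions) and compares the finite-difference derivative with respect to a single grid value $n_i$, divided by the cell size, with the analytic result $-(4/3)\,C_X\, n^{1/3}$. The sample density values are arbitrary.

```python
import math

C_X = 0.75 * (3.0 / math.pi) ** (1.0 / 3.0)  # Dirac exchange constant, cf. (3.44)

DX = 0.1                                                     # grid spacing (cell size)
GRID = [0.5 + 0.3 * math.sin(0.7 * i) for i in range(50)]   # arbitrary positive n_i


def exchange_energy(n_values):
    """Discretized Dirac functional: -C_X * sum_i n_i^(4/3) * DX."""
    return -C_X * sum(n ** (4.0 / 3.0) for n in n_values) * DX


def functional_derivative(i, h=1e-6):
    """(1/DX) * dE/dn_i by central differences: grid analogue of dE/dn(x)."""
    up = list(GRID)
    down = list(GRID)
    up[i] += h
    down[i] -= h
    return (exchange_energy(up) - exchange_energy(down)) / (2.0 * h * DX)


def dirac_potential(n):
    """Analytic functional derivative -(4/3) C_X n^(1/3)."""
    return -(4.0 / 3.0) * C_X * n ** (1.0 / 3.0)


for i in (0, 10, 25):
    # the two columns should agree to finite-difference accuracy
    print(i, functional_derivative(i), dirac_potential(GRID[i]))
```

The division by `DX` reflects that a change of one grid value $n_i$ corresponds to a change of $n(x)$ over a cell of that size.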

The variation of (4.28) is now straightforward. Again the side conditions are treated with Lagrange multipliers. As the resulting equation will turn out to be Hermitian, again one needs only to consider the normalization of orbitals $\phi_i$ with multipliers $\varepsilon_i$, while the orthogonality will automatically be provided by the Hermiticity of the resulting Kohn-Sham equation:

$$\left(-\frac{\nabla^2}{2} + v_{\mathrm{eff}}(x)\right)\phi_i(x) = \phi_i(x)\,\varepsilon_i, \qquad v_{\mathrm{eff}} \stackrel{\mathrm{def}}{=} v + v_H + v_{\mathrm{XC}}, \tag{4.32}$$

where the Kohn-Sham exchange and correlation potential is defined as

$$v_{\mathrm{XC}}(x) \stackrel{\mathrm{def}}{=} \frac{\delta E_{\mathrm{XC}}}{\delta n(x)}. \tag{4.33}$$

Multiplying (4.26) by $\phi_i^*(x)$, integrating, summing over $i$ and considering (4.20, 4.21) yields immediately

$$E^0[v] = \sum_{i=1}^{N}\varepsilon_i, \qquad \varepsilon_1 \le \varepsilon_2 \le \dots, \tag{4.34}$$

where for the minimum of (4.15) in the considered case clearly the orbitals with the lowest $N$ eigenvalues $\varepsilon_i$ have to be taken. (In case of degeneracy $\varepsilon_N = \varepsilon_{N+1} = \dots$ this choice is not unique, see also Section 4.5.)
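
A small numerical illustration of the interaction-free relations may help. The sketch below assumes, purely for illustration, three spinless fermions in a one-dimensional harmonic potential $v(x) = x^2/2$ (atomic units), whose orbitals and eigenvalues $\varepsilon_k = k + 1/2$ are known in closed form. It builds the density from the occupied orbitals, evaluates the kinetic energy as $\frac{1}{2}\sum_i\int|\phi_i'|^2$ (equal to $-\frac{1}{2}\sum_i\langle\phi_i|\nabla^2|\phi_i\rangle$ after integration by parts), and compares kinetic plus potential energy with the sum of the lowest eigenvalues.

```python
import math


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def orbital(k, x):
    """Lowest three normalized harmonic-oscillator eigenfunctions (assumed model)."""
    g = math.pi ** -0.25 * math.exp(-0.5 * x * x)
    if k == 0:
        return g
    if k == 1:
        return math.sqrt(2.0) * x * g
    if k == 2:
        return (2.0 * x * x - 1.0) / math.sqrt(2.0) * g
    raise ValueError("only k = 0, 1, 2 are tabulated in this sketch")


def potential(x):
    return 0.5 * x * x


N = 3       # number of occupied orbitals (one spinless fermion each)
A = 10.0    # integration interval [-A, A]


def density(x):
    """n(x) = sum_i |phi_i(x)|^2 over the occupied orbitals."""
    return sum(orbital(k, x) ** 2 for k in range(N))


def kinetic_density(x, h=1e-5):
    """(1/2) sum_i |phi_i'(x)|^2, with derivatives by central differences."""
    total = 0.0
    for k in range(N):
        slope = (orbital(k, x + h) - orbital(k, x - h)) / (2.0 * h)
        total += 0.5 * slope * slope
    return total


particle_number = simpson(density, -A, A)
kinetic = simpson(kinetic_density, -A, A)
external = simpson(lambda x: density(x) * potential(x), -A, A)
eigenvalue_sum = sum(k + 0.5 for k in range(N))

# kinetic + external should reproduce the sum of the N lowest eigenvalues
print(particle_number, kinetic + external, eigenvalue_sum)
```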

Note however that, by definition, $T[n]$ in (4.27) continues to be the kinetic energy of an interaction-free ground state with density $n$ (and hence at best corresponding to an external potential $v^0[n]$, different from $v$). The Kohn-Sham $E_{\mathrm{XC}}$ contains the change in kinetic energy due to interaction and can be expressed through a coupling constant integral over the $w$-term as given in (2.67, 2.68).

A consideration analogous to that which led to (4.34) now yields

$$E[v] \le \sum_{i=1}^{N}\varepsilon_i - E_H[n] - \int dx\, n\, v_{\mathrm{XC}} + E_{\mathrm{XC}}[n], \qquad n(x) = \sum_{i=1}^{N}|\phi_i(x)|^2. \tag{4.35}$$
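
The intermediate steps behind (4.35) may be spelled out as follows. The sketch assumes that the occupied Kohn-Sham orbitals attain the minimum in (4.23), so that their kinetic energy equals $T[n]$, and it uses $\int dx\, n\, v_H = 2E_H[n]$, which holds for a Hartree energy of the bilinear form appearing in (4.17).

```latex
\begin{align*}
\sum_{i=1}^{N}\varepsilon_i
  &= \sum_{i=1}^{N}\Big\langle\phi_i\Big|-\tfrac{1}{2}\nabla^2 + v + v_H + v_{\mathrm{XC}}\Big|\phi_i\Big\rangle \\
  &= T[n] + \int dx\, n\,v + 2E_H[n] + \int dx\, n\,v_{\mathrm{XC}}, \\
T[n] + \int dx\, n\,v + E_H[n] + E_{\mathrm{XC}}[n]
  &= \sum_{i=1}^{N}\varepsilon_i - E_H[n] - \int dx\, n\,v_{\mathrm{XC}} + E_{\mathrm{XC}}[n].
\end{align*}
```

The left-hand side of the last line is the energy expression of (4.28) evaluated at the Kohn-Sham density, which bounds $E[v]$ from above.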

The $\le$ sign was needed here, because the right side can only be the minimum of (4.15) over $\mathcal{A}_N \cap \mathcal{A}^0_N$: $E_{\mathrm{XC}}[n]$ was only defined on $\mathcal{A}_N$, and the solution of (4.32) yields via (4.20) a density out of $\mathcal{A}^0_N$. There is another principal difference between the equations (4.26) and (4.32): (4.26) is a linear Schrödinger equation, and the eigenenergies $\varepsilon_i$ and eigenfunctions $\phi_i$ depend only on the external potential $v$. (4.32), on the other hand, is a nonlinear problem, because the effective potential $v_{\mathrm{eff}}$ depends on the Kohn-Sham orbitals $\phi_i$. Hence it must be solved iteratively (self-consistently) like, e.g., the Hartree-Fock problem. As one consequence, the same level-crossing problems as in Hartree-Fock theory may appear (cf. the text after (1.62)). Note that such a level crossing situation implies that the non-interacting reference ground state is degenerate; one more reason not to rely on non-degeneracy of the ground states. To cope with such situations we will need another generalization of the theory, which will be considered in Section 4.5. Yet, in many applications (with approximate functional $E_{\mathrm{XC}}$, of course), a self-consistent solution of (4.32) is found with the Kohn-Sham orbitals for the lowest $N$ levels $\varepsilon_i$ occupied. In that case, automatically $n \in \mathcal{A}^0_N$, since an external potential $v^0[n]$ equal to the self-consistent $v_{\mathrm{eff}}$ would, for an interaction-free fermion system, give just that ground state density. Hence, every $n$ obtained via (4.20) from a solution of the Kohn-Sham equations is in $\mathcal{A}^0_N$. Since $\mathcal{A}_N \subseteq \mathcal{A}^0_N$ has not been proved, possibly not every density $n \in \mathcal{A}_N$ can be obtained via (4.20) from a solution of the Kohn-Sham equations. Solving a variational problem by means of Euler’s equation needs a careful investigation of the existence of the functional derivatives, which cannot be provided at this stage since we do not even know the topology of the domains $\mathcal{A}_N$ and $\mathcal{A}^0_N$.
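
The structure of such a self-consistent iteration can be sketched with a deliberately minimal toy model. Everything in it is an assumption made for illustration and not part of the theory above: one spinless particle on two lattice sites with hopping `t`, and a hypothetical density-dependent on-site term `u * n_i` standing in for the density dependence of $v_H + v_{\mathrm{XC}}$. The loop repeats the steps "build effective potential, solve the linear one-particle problem, update the density" with simple linear mixing until the density reproduces itself.

```python
import math


def ground_state(a, b, t):
    """Lowest eigenvalue and site-1 occupation of the matrix [[a, -t], [-t, b]]."""
    d = 0.5 * (a - b)
    root = math.hypot(d, t)
    eps = 0.5 * (a + b) - root
    n1 = 0.5 * (1.0 - d / root)
    return eps, n1


def scf(v1, v2, t=1.0, u=2.0, mix=0.5, tol=1e-12, max_iter=500):
    """Iterate the density-dependent one-particle problem to self-consistency."""
    n1 = 0.5  # initial guess: particle shared equally between the sites
    for iteration in range(1, max_iter + 1):
        # effective on-site potentials built from the current density (toy model)
        a = v1 + u * n1
        b = v2 + u * (1.0 - n1)
        eps, n1_out = ground_state(a, b, t)
        if abs(n1_out - n1) < tol:
            return eps, n1_out, iteration
        # linear mixing of input and output densities stabilizes the iteration
        n1 = mix * n1_out + (1.0 - mix) * n1
    raise RuntimeError("no self-consistency reached within max_iter iterations")


eps, n1, iterations = scf(v1=-1.0, v2=0.0)
print(eps, n1, iterations)
```

Without mixing (`mix = 1.0`) the plain iteration may converge slowly or oscillate for larger `u`; the damping used here is one common remedy among several.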

4.3 The Link to the Hartree-Fock-Slater Approximation

The direct solution of the Hartree-Fock equations (1.55) for large systems ($N > 10^3$) is limited by the involved structure of the exchange potential operator (1.57). To overcome this limitation, albeit with approximate results (being aware that Hartree-Fock theory itself is an approximation, although giving definite (upper) bounds for the total energy), [Slater, 1951] proposed to estimate the exchange potential term in the spirit of Thomas and Fermi. While Slater used a Fermi surface average of the exchange potential for the homogeneous system, [Gaspar, 1954] proposed to approximate the exchange energy term of (1.54) by the Dirac expression (3.45), with the rules (4.31) yielding

$$v_X(\mathbf{r}) = -\frac{4}{3}\, C_X\, n^{1/3}(\mathbf{r}). \tag{4.36}$$

The potential expression originally used by Slater had a factor 2 instead of 4/3 in front.

Since both approaches have their justifications in particular contexts, and, as was soon emphasized by Slater, correlation modifies the effect of exchange, generally reducing it, a parameterization

$$v_{X\alpha}(\mathbf{r}) = -2\alpha\, C_X\, n^{1/3}(\mathbf{r}) \tag{4.37}$$

became very popular in the sixties and seventies and was used with great success in atomic, molecular and solid state calculations of the electronic state. A most popular variant of this $X\alpha$-approach was to determine $\alpha$ so that the virial equation $\langle H\rangle = -\langle T\rangle$ was fulfilled (cf. (2.78)).
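
The relation between the parameterization (4.37) and its two limiting cases can be stated in a few lines. In the sketch below the function name `v_x_alpha` is hypothetical; $\alpha = 2/3$ reproduces the potential (4.36) and $\alpha = 1$ gives the factor 2 originally used by Slater.

```python
import math

C_X = 0.75 * (3.0 / math.pi) ** (1.0 / 3.0)  # Dirac exchange constant, cf. (3.44)


def v_x_alpha(n, alpha):
    """Local X-alpha potential -2 * alpha * C_X * n^(1/3) at density n, cf. (4.37)."""
    return -2.0 * alpha * C_X * n ** (1.0 / 3.0)


density = 0.3  # arbitrary sample density in atomic units
gaspar = v_x_alpha(density, 2.0 / 3.0)   # equals -(4/3) C_X n^(1/3), cf. (4.36)
slater = v_x_alpha(density, 1.0)         # equals -2 C_X n^(1/3)
print(gaspar, slater, slater / gaspar)   # the ratio is 3/2 for any density
```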

The Hartree-Fock-Slater equation is obtained by replacing the exchange term of (1.55) with the local potential term using (4.37). Comparing it with the Kohn-Sham equation (4.32), one observes immediately that the former fits into the latter frame, if one understands (4.37) as an approximation to the Kohn-Sham exchange and correlation potential or, equivalently, if one understands $3\alpha/2$ times the Dirac expression (3.45) as an approximation to $E_{\mathrm{XC}}[n]$. With simple approximate expressions of that type for $E_{\mathrm{XC}}[n]$ (termed local density approximations), Hohenberg-Kohn-Sham theory is computationally just as simple as the Hartree-Fock-Slater approximation.

Note, however, the big conceptual difference between the two approaches: Hartree-Fock theory searches for the best determinantal approximation to the ground state, whereas Hohenberg-Kohn theory searches for the best approximation to the density of the interacting ground state, which itself cannot be a determinant. In particular, there is no virial theorem connecting $E[v[n]]$ with $T[n]$, since $T$ contains only part of the kinetic energy, part being contained in $E_{\mathrm{XC}}$. (It is not difficult to show that $\langle T\rangle - T[n] \ge 0$ [Levy, 1982, cf. also (4.45)].) This explains why the $\alpha$-value in (4.37) which gives the lowest total energy for an actual case was always found to be different from the value that satisfied the virial equation.

4.4 Constrained Search Density Functionals

Hohenberg-Kohn theory as described up to here, although alluring, leaves one uncomfortable with $v$-representability problems: How relevant is $\mathcal{A}_N \cap \mathcal{A}^0_N$? Both sets of search are unknown, but it is known that they are not convex. This last statement might pose the most severe problem (cf. next chapter).

A generalization which circumvents the mapping of ground state densities $n$ on potentials $v$ was independently considered by Levy and Lieb [Levy, 1982, Lieb, 1983]. Define instead of the Hohenberg-Kohn functional the Levy-Lieb functional

$$F_{\mathrm{LL}}[n] \stackrel{\mathrm{def}}{=} \inf\{\langle\Psi|T + W|\Psi\rangle \mid \Psi \mapsto n,\ \Psi \in \mathcal{W}_N\}. \tag{4.38}$$

Here, the infimum search is over all $N$-particle wavefunctions (not only ground states) yielding a given density $n(x)$. Since $E[v] = \inf\{\langle\Psi|H|\Psi\rangle \mid \Psi \in \mathcal{W}_N\} = \inf\{\langle\Psi|T + W|\Psi\rangle + \int dx\, v\, n[\Psi] \mid \Psi \in \mathcal{W}_N\}$, $F_{\mathrm{LL}}$ can trivially replace $F_{\mathrm{HK}}$ in (4.15).

Now the question arises, what characterizes a density $n$ coming from an $N$-particle wavefunction. Fortunately, this question has a simple and final answer [Gilbert, 1975, Harriman, 1980, Lieb, 1983]: Any non-negative density integrating to $N$ and such that $\int dx\, |\nabla n^{1/2}(\mathbf{r})|^2 < \infty$ (i.e. $\nabla n^{1/2} \in L^2$, possibly for each spin component) comes from a $\Psi \in \mathcal{W}_N$, in the fermion case even from a determinantal one. Lieb proved the statement in both directions, i.e., every $N$-particle wavefunction has a density with the asserted properties. Hence, defining the set of $N$-representable densities

$$\mathcal{J}_N \stackrel{\mathrm{def}}{=} \left\{n \;\middle|\; n(x) \ge 0,\ \nabla n^{1/2} \in L^2,\ \int dx\, n = N\right\}, \tag{4.39}$$

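we have

$$E[v] = \inf_{n\in\mathcal{J}_N}\left\{F_{\mathrm{LL}}[n] + \int dx\, nv\right\} \tag{4.40}$$

instead of (4.15).

The immediate gain is that $\mathcal{J}_N$ is explicitly known. It is even convex, since it is directly seen that $cn_1 + (1 - c)n_2 \in \mathcal{J}_N$, $0 \le c \le 1$, if $n_1, n_2 \in \mathcal{J}_N$. Moreover, the statement of Gilbert, Harriman and Lieb implies that

$$\mathcal{A}_N \subset \mathcal{J}_N, \qquad \mathcal{A}^0_N \subset \mathcal{J}_N. \tag{4.41}$$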

Equality of the sets is excluded here because it is known due to Lieb that $\mathcal{A}_N$ and $\mathcal{A}^0_N$ are not convex. A further formally important statement by Lieb is that in (4.38) there is always a minimizing $\Psi \in \mathcal{W}_N$, i.e.,

$$F_{\mathrm{LL}}[n] = \min\{\langle\Psi|T + W|\Psi\rangle \mid \Psi \mapsto n,\ \Psi \in \mathcal{W}_N\}, \qquad n \in \mathcal{J}_N. \tag{4.42}$$

The minimizing $\Psi$ need not be unique, and among the minimizing $\Psi$'s there need not be a ground state (for instance, if $n$ is not a ground state density). From (4.40) (and hence from (4.38)) it follows immediately, however, that, if $n$ is a ground state density, then a ground state minimizes (4.42):

$$F_{\rm LL}[n] = F_{\rm HK}[n] \quad \text{for } n \in \mathcal A_N. \tag{4.43}$$

Thus $F_{\rm LL}$ is just a continuation of $F_{\rm HK}$ from $\mathcal A_N$ onto a convex and explicitly given domain. Unfortunately, $F_{\rm LL}[n]$ itself is not convex on $\mathcal J_N$ [Lieb, 1983].
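The convexity of the domain $\mathcal J_N$ used above can be made explicit in a few lines. The sketch below is one possible argument, assuming $n_1, n_2 > 0$ pointwise (zeros of the densities need separate care) and using $|\nabla n^{1/2}|^2 = |\nabla n|^2/(4n)$; non-negativity and normalization of $n = c n_1 + (1-c) n_2$ are immediate, so only the gradient condition needs a step.

```latex
% Cauchy-Schwarz applied to the two-term sum, with n = c n_1 + (1-c) n_2:
\begin{align*}
|\nabla n|^2
  &= \Bigl| \sqrt{c\,n_1}\,\frac{\sqrt{c}\,\nabla n_1}{\sqrt{n_1}}
          + \sqrt{(1-c)\,n_2}\,\frac{\sqrt{1-c}\,\nabla n_2}{\sqrt{n_2}} \Bigr|^2 \\
  &\le \bigl( c\,n_1 + (1-c)\,n_2 \bigr)
       \Bigl( c\,\frac{|\nabla n_1|^2}{n_1} + (1-c)\,\frac{|\nabla n_2|^2}{n_2} \Bigr).
\end{align*}
% Dividing by 4n and integrating:
\[
\int dx\, |\nabla n^{1/2}|^2
  \;\le\; c \int dx\, |\nabla n_1^{1/2}|^2 + (1-c) \int dx\, |\nabla n_2^{1/2}|^2
  \;<\; \infty .
\]
```

Hence $\nabla n^{1/2} \in L^2$ whenever this holds for $n_1$ and $n_2$. Note that this concerns the convexity of the domain only, not that of the functional $F_{\rm LL}$ defined on it.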

Again, as in the original Hohenberg-Kohn version, everything said so far holds true for any reasonable interaction $w$, particularly also for $w \equiv 0$. So,

$$T_{\rm LL}[n] = \min\bigl\{\langle\Psi|\hat T|\Psi\rangle \bigm| \Psi\mapsto n,\ \Psi\in\mathcal W_N\bigr\}, \qquad n \in \mathcal J_N, \tag{4.44}$$

is another continuation of $T[n]$ from (4.18) onto $\mathcal J_N$. (As we now know that every $n \in \mathcal J_N$ comes from a determinantal state, the domain of definition of (4.23) is $\mathcal J_N$, too.) On this domain, although for every $n$ there is a determinantal $\Psi$ it comes from, nevertheless, since $n$ does not uniquely determine $\Psi$, the minimizing $\Psi$ for (4.44) need not be a determinant. However, if $n \in \mathcal A^0_N$ then there is a minimizing determinant $\Psi$; and in that case (4.44) coincides with both (4.18) and (4.23). Of course, for a general density, $T_{\rm LL}[n] \leq T[n]$, since the search of (4.23) is more restricted than that of (4.44). Now, the same decomposition (4.27) (further with $T$, not with $T_{\rm LL}$) can be made for $F_{\rm LL}$ yielding the Kohn-Sham equations. This time, $E_{\rm XC}$ is already defined on $\mathcal J_N$ because now all the other entries in (4.27) are defined on that domain. Hence, with $F_{\rm LL}$ we are now more comfortable with the Hohenberg-Kohn variational principle, but not yet with the Kohn-Sham equations, which further on yield densities $n \in \mathcal A^0_N$ only.

There is one simple but maybe important side product of (4.44). Given a ground state density $n$ for $\hat H$ (possibly with interaction), the kinetic energy in that ground state is $\langle T\rangle = \langle\Psi_0|\hat T|\Psi_0\rangle$, where the ground state wavefunction $\Psi_0$ minimizes $\langle H\rangle$ (and gives $n$). For that $n$, $T_{\rm LL}[n] = \langle\Psi|\hat T|\Psi\rangle$, where $\Psi$ minimizes $\langle T\rangle$ among all wavefunctions giving $n$. Consequently,

$$\langle\Psi_0|\hat T|\Psi_0\rangle \geq T_{\rm LL}[n[\Psi_0]], \tag{4.45}$$

where $n[\Psi_0]$ denotes the density of the ground state $\Psi_0$.

4.5 Ensemble State Density Functionals

Instead of considering quantities coming from $N$-particle pure states $\Psi$, the theory may be generalized to mixed states given by $N$-particle density matrices. Such a generalization was first introduced in order to treat non-zero temperatures with density functional theory [Mermin, 1965]. For zero temperature, a step in this direction was made by [Janak, 1978] with the introduction of fractional orbital occupation numbers, and later on linked to the general mixed state theory [Perdew and Zunger, 1981]. With the possibility of fractional orbital occupation numbers this generalization also increases the scope of applicability of the Kohn-Sham equations [Englisch and Englisch, 1984a, b].

An admissible $N$-particle density matrix, i.e. an ensemble state, has the form

$$\gamma_N(x_1 \ldots x_N; x'_1 \ldots x'_N) = \sum_{K=1}^{\infty} \Psi_K(x_1 \ldots x_N)\, g_{N;K}\, \Psi^*_K(x'_1 \ldots x'_N), \qquad 0 \leq g_{N;K}, \quad \sum_{K=1}^{\infty} g_{N;K} = 1 \tag{4.46}$$

(cf. e.g. [Parr and Yang, 1989, Section 2.2]). The map of density matrices onto densities $n$ is always linear (an advantage over wavefunctions), hence the density coming from the above $\gamma_N$ is (in accordance with (2.1–2.3))

$$n(x) = \sum_{K=1}^{\infty} g_{N;K}\, n_K(x), \tag{4.47}$$

where $n_K(x)$ comes from the pure state $\Psi_K$. Notice that $n_K \in \mathcal J_N$, $\mathcal J_N$ is convex, and (4.47) is obviously an affine combination with non-negative weights, i.e. a convex combination, hence $n \in \mathcal J_N$ too.

Admitting ensemble states increases the variational freedom and possibly enlarges the set of ground states, incorporating mixtures of degenerate pure ground states; for the ground state energy it does not imply any change. The ensemble-state expectation value of the $N$-particle Hamiltonian is

$$\langle H\rangle = {\rm tr}\, \hat H \gamma_N = \sum_K g_{N;K} \langle H\rangle_K, \tag{4.48}$$

where $\langle H\rangle_K$ is the pure-state expectation value for the state $\Psi_K$. The ground state energy is further on defined as

$$E[v] = \inf_{\gamma_N} \langle H[v]\rangle. \tag{4.49}$$

The expectation value (4.48) depends linearly on the mixing coefficients $g_{N;K}$, hence, as any linear function, if it has an extremum, it attains its extremum on the boundary of the domain of definition of those $g_{N;K}$, given by the last two relations (4.46). If one depicts the $g_{N;K}$ as the $K$-components of a real Euclidean vector, this domain forms a simplex with corners $g_{N;K} = \delta_{KK'}$, $K' = 1, 2, \ldots$. (A simplex is a polyhedron with the minimum number (dimension+1) of corners: a point, a line segment, a triangle, a tetrahedron, ....) The boundary of a simplex consists of simplices of lower dimension; on each of those the infimum of the linear function is again attained on its boundary. Hence it is attained at a corner, i.e. at a pure state. If there are several degenerate pure ground states, the infimum is attained by any state of the simplex spanned by those pure ground states, i.e. having them as its corners. (It is not difficult to realize that, if the function has no minimum but a finite infimum, then there must be a sequence of corners, that is, of pure states, whose function values converge to that infimum.)
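The simplex argument can be checked numerically on a small example. The sketch below assumes a hypothetical list of four pure-state expectation values (the first two degenerate) and samples random mixing weights; the numbers and helper names are illustrative only.

```python
import random

def ensemble_energy(weights, pure_energies):
    """Ensemble expectation value as in (4.48): linear in the mixing weights."""
    return sum(g * e for g, e in zip(weights, pure_energies))

def random_simplex_point(dim, rng):
    """Random weights with g_K >= 0 and sum_K g_K = 1."""
    raw = [rng.random() for _ in range(dim)]
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical pure-state expectation values <H>_K; the first two are degenerate.
pure_energies = [-1.0, -1.0, -0.3, 0.5]

rng = random.Random(0)
samples = [random_simplex_point(len(pure_energies), rng) for _ in range(1000)]

# No mixture lies below the lowest corner of the simplex ...
lowest_mixture = min(ensemble_energy(g, pure_energies) for g in samples)
lowest_corner = min(pure_energies)

# ... and any mixture of the degenerate lowest corners attains the same value.
degenerate_mix = ensemble_energy([0.25, 0.75, 0.0, 0.0], pure_energies)
```

Here `lowest_mixture` never drops below `lowest_corner`, while `degenerate_mix` equals it, mirroring the statement that mixing enlarges the set of ground states without lowering the ground state energy.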

Note the principal difference compared to the expansion of a pure state into a basis, where the expectation values are bilinear in the expansion coefficients, and none of the above considerations apply. The expansion of a pure state into a basis compares to a coherent superposition of waves, while a mixed state compares to an incoherent superposition; in (4.48) no interference terms between pure states appear.

There are physical reasons to include ensemble states into consideration even for the ground state, that is, for zero temperature. Consider for instance a boron atom. Its ground state can be characterized by the configuration $(1s)^2(2s)^2(2p)$. We have in mind the many-particle ground state and use the orbital configuration for its approximate characterization only. It has the orbital angular momentum quantum number (which is a good quantum number for the many-particle state of the interacting electrons) equal to $L = 1$. Taking spin into account, the ground state has a total angular momentum (again a good quantum number) of either $J = 1/2$ or $J = 3/2$. If, as in our present theory, spin-orbit coupling is neglected, these states are degenerate: the ground state is a sextet with quantum numbers $\hat L^2 = L(L+1) = 2$, $L_z = 0, \pm 1$, $\hat S^2 = 3/4$, $S_z = \pm 1/2$. (In reality, with spin-orbit splitting, the ground state has $J = 1/2$ and is a doublet.) In the particular case under consideration with only one electron in an unfilled shell, the spin does not influence the symmetry of the orbital part of the wave function, hence we can forget about it and consider the orbitally threefold degenerate state with $L = 1$. Its spatial electron density consists of a spherical part (of the atomic core) and a part of $L = 1$ symmetry, which can for instance be chosen $\sim x^2$ or $\sim y^2$ or $\sim z^2$, comprising three mutually orthogonal ground states. Any linear combination of these three states with real coefficients $\alpha, \beta, \gamma$, $\alpha^2 + \beta^2 + \gamma^2 = 1$, is again a ground state with the symmetry axis of the aspherical charge density pointing in the direction of the vector $(\alpha, \beta, \gamma)$. Likewise, with complex coefficients states can be built which are eigenstates of the angular momentum projection $\hat L_{(\alpha,\beta,\gamma)}$ in any given direction $(\alpha, \beta, \gamma)$. However, in real physics, even with spin-orbit coupling neglected, the atom is nearly always in a paramagnetic state with $\hat L^2 = 2$ and $\langle \hat L^2_{(\alpha,\beta,\gamma)}\rangle \neq 0$ but $\langle \hat L_{(\alpha,\beta,\gamma)}\rangle = 0$ for all projections onto all directions $(\alpha, \beta, \gamma)$. This state is

$$\hat\gamma = |x\rangle \tfrac{1}{3} \langle x| + |y\rangle \tfrac{1}{3} \langle y| + |z\rangle \tfrac{1}{3} \langle z| = |L_z{=}1\rangle \tfrac{1}{3} \langle L_z{=}1| + |L_z{=}0\rangle \tfrac{1}{3} \langle L_z{=}0| + |L_z{=}{-1}\rangle \tfrac{1}{3} \langle L_z{=}{-1}|, \tag{4.50}$$

where the anisotropic part of the wavefunction $|x\rangle$ behaves like $x/r$, that of $|L_z{=}1\rangle$ like $(x + iy)/r$ and so on. It is a simple exercise to prove that both lines of (4.50) are equal. This state has a spherical charge density. It is the paramagnetic ensemble state and is degenerate with the three polarized pure ground states. Note that it is an important characteristic of the paramagnetic state that it fluctuates isotropically. For instance, the pure state $|z\rangle = |L_z{=}0\rangle$ yields $\langle \hat L_{(\alpha,\beta,\gamma)}\rangle = 0$ for all projections onto all directions $(\alpha, \beta, \gamma)$, but $L_z$ does not fluctuate: $\langle \hat L_z^2\rangle = 0$. Hence it cannot be influenced by a magnetic field in $z$-direction.
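The "simple exercise" for (4.50) and the statements on fluctuations can be verified numerically. The sketch below works in the three-dimensional $L = 1$ space spanned by $|x\rangle, |y\rangle, |z\rangle$, with the standard representation $(L_a)_{bc} = -i\epsilon_{abc}$ (units with $\hbar = 1$); the helper names and the sample direction are assumptions made for illustration.

```python
import math

def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

# Angular momentum matrices for L = 1 in the Cartesian basis |x>, |y>, |z>.
L = [[[-1j * levi_civita(a, b, c) for c in range(3)] for b in range(3)]
     for a in range(3)]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def trace(p):
    return sum(p[i][i] for i in range(3))

def projector(vec, weight):
    """weight * |vec><vec| for a normalized vector."""
    return [[weight * vec[i] * vec[j].conjugate() for j in range(3)]
            for i in range(3)]

def add(*mats):
    return [[sum(m[i][j] for m in mats) for j in range(3)] for i in range(3)]

def component(direction):
    """Projection of L onto a unit vector (alpha, beta, gamma)."""
    return [[sum(direction[a] * L[a][i][j] for a in range(3)) for j in range(3)]
            for i in range(3)]

s = 1 / math.sqrt(2)
x, y, z = [1 + 0j, 0j, 0j], [0j, 1 + 0j, 0j], [0j, 0j, 1 + 0j]
lz_plus, lz_minus = [s + 0j, 1j * s, 0j], [s + 0j, -1j * s, 0j]

# The two lines of (4.50).
gamma_cartesian = add(projector(x, 1 / 3), projector(y, 1 / 3), projector(z, 1 / 3))
gamma_lz = add(projector(lz_plus, 1 / 3), projector(z, 1 / 3), projector(lz_minus, 1 / 3))

direction = (2 / 7, 3 / 7, 6 / 7)          # an arbitrary unit vector
L_n = component(direction)
mean = trace(matmul(gamma_cartesian, L_n))                    # <L_n> in the ensemble
fluct = trace(matmul(gamma_cartesian, matmul(L_n, L_n)))      # <L_n^2> in the ensemble

pure_z = projector(z, 1.0)
mean_pure = trace(matmul(pure_z, L_n))                        # <L_n> in |z>
lz2_pure = trace(matmul(pure_z, matmul(L[2], L[2])))          # <L_z^2> in |z>
```

In this sketch the two forms of $\hat\gamma$ agree entry by entry, the ensemble has vanishing mean and direction-independent non-zero fluctuation of the projection, and the pure state $|z\rangle$ has vanishing $\langle \hat L_z^2\rangle$.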

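Given a complete orthonormal set of orbitals $\psi_i$, $\langle\psi_i|\psi_j\rangle = \delta_{ij}$, the determinantal states (1.15) form a complete orthonormal set of $N$-particle states

$$\Phi_L(x_1 \ldots x_N) = \frac{1}{\sqrt{N!}} \det \|\psi_{l_i}(x_k)\|, \qquad \langle\Phi_{L'}|\Phi_L\rangle = \delta_{L'L}, \tag{4.51}$$

being labeled by an $N$-tuple of integers $L = (l_1 \leq \ldots \leq l_N)$ or likewise by a sequence of orbital occupation numbers $n^L_i$, $i = 1 \ldots \infty$, with $n^L_i = 1$ if $l_i \in L$ and $n^L_i = 0$ otherwise. Expanding $\Psi_K = \sum_L C^L_K \Phi_L$ transforms (4.46) into

$$\gamma_N = \sum_{LL'} \Phi_L\, \gamma_{N;LL'}\, \Phi^*_{L'}, \qquad \gamma_{N;LL'} = \sum_K C^L_K\, g_{N;K}\, C^{L'*}_K = \gamma^*_{N;L'L}, \qquad {\rm tr}\, \gamma_N = \sum_L \gamma_{N;LL} = 1. \tag{4.52}$$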

From this expression, using (2.1) and (2.3), and recalling that integration over one particle variable $x_i$ of a product of two determinantal states can be non-zero only if both determinants contain at least one common orbital, it is not difficult to find

$$n(x) = \sum_{ij} \psi_i(x)\, n_{N;ij}\, \psi^*_j(x), \qquad n_{N;ij} = \sum_{i(LL')j} n^L_i\, \gamma_{N;LL'}\, n^{L'}_j, \tag{4.53}$$

where the last sum runs over pairs $(LL')$ for which $n^L_k = n^{L'}_k$ except possibly for $k = i, j$. (Recall that the $n^L_k$ take on only two values: 0 or 1.) Clearly $n_{N;ij} = n^*_{N;ji}$. Hence, this matrix may be diagonalized, $\|n_{N;ij}\| \to {\rm diag}(n_{N;1}, n_{N;2}, \ldots)$, by a unitary transformation to new orbitals $\phi_i(x)$:

$$n(x) = \sum_i n_{N;i} |\phi_i(x)|^2, \qquad 0 \leq n_{N;i} \leq 1, \qquad \sum_i n_{N;i} = N. \tag{4.54}$$

The unitarity of the transformation from the $\psi_i$ to the $\phi_i$ transfers the orthonormality property from the set of the former to that of the latter. The constraints on the $n_{N;i}$ follow most easily from the constraints on the $g_{N;K}$ and from $\sum_L |C^L_K|^2 = 1$, if one thinks of using the diagonalizing orbitals $\phi_i$ to build the $\Phi_L$. The same analysis yields the single-particle density matrix

$$\gamma_1(x; x') = \sum_i \phi_i(x)\, n_{N;i}\, \phi^*_i(x') \tag{4.55}$$

with the same constraints (4.54) on the $n_{N;i}$. The orbitals $\phi_i$ of this general single-particle density matrix expression were named natural orbitals by Löwdin. (Note that, contrary to the case (2.8) of a pure determinantal state, here the $i$-sum may run over an infinite number of non-zero items; this may already happen for a correlated pure state. Moreover, for (4.55) generally $\gamma_1^2 \neq \gamma_1$.)

One now defines a density functional for $N$-particle density matrix states

$$F_{\rm DM}[n] \stackrel{\rm def}{=} \inf\bigl\{ {\rm tr}\, (\hat T + \hat W)\gamma_N \bigm| \gamma_N \mapsto n \bigr\}, \tag{4.56}$$

where the infimum search is over all ensemble states ($N$-particle density matrices) giving $n$ (again, the infimum is in fact a minimum). The corresponding interaction-free functional is, by means of the general single-particle density matrix (4.55) (cf. (2.32)),

$$T_{\rm DM}[n] \stackrel{\rm def}{=} \inf\bigl\{ {\rm tr}\, \hat T \gamma_N \bigm| \gamma_N \mapsto n \bigr\} = \min\Bigl\{ \sum_i n_{N;i} \int dx\, \phi^*_i \Bigl(-\frac{\nabla^2}{2}\Bigr) \phi_i \Bigm| \sum_i n_{N;i} |\phi_i|^2 = n,\ 0 \leq n_{N;i} \leq 1 \Bigr\}. \tag{4.57}$$


The minimum search is now over all sets of orthonormal orbitals $\phi_i$ and over all sets of real numbers $n_{N;i}$, together fulfilling the constraints indicated in the right part between the curly brackets. The same reasoning as previously leads to

$$E[v] = \inf_{n\in\mathcal J_N} \Bigl\{ F_{\rm DM}[n] + \int dx\, n v \Bigr\}. \tag{4.58}$$

Although obviously by definition $F_{\rm DM}[n] = F_{\rm LL}[n]$ for pure ground state densities $n$, in general only $F_{\rm DM}[n] \leq F_{\rm LL}[n]$ holds, a relation which is equally obvious by definition. Since the relation between $\gamma_N$ and $n$ is linear, $F_{\rm DM}[n]$ is convex, hence $F_{\rm DM}[n] \leq$ convex hull of $F_{\rm LL}[n]$ (cf. Section 6.3 below). The two continuations $F_{\rm LL}$ and $F_{\rm DM}$ of $F_{\rm HK}$ from $\mathcal A_N$ onto $\mathcal J_N$ are different; nevertheless, for both the Hohenberg-Kohn variational principle (the stationary solutions $n$ of which may be chosen $\in \mathcal A_N$) holds. (As distinct from the situation with pure states, an affine-linear combination, with non-negative weights, of two densities $n_1$ and $n_2$ coming from two degenerate ground states is always again a density coming from an ensemble ground state, obtained by mixing of the two original states. Cf. the discussion after (4.49).)

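Of course, now likewise $T_{\rm DM}[n] \leq T_{\rm LL}[n] \leq T[n]$ holds. Using a decomposition of $F_{\rm DM}$ analogous to (4.27), this time, however, with $T_{\rm DM}$ instead of $T$, leads to

$$E[v] = \inf\Bigl\{ \sum_i n_{N;i} \int dx\, \phi^*_i \Bigl(-\frac{\nabla^2}{2}\Bigr) \phi_i + E_{\rm H}[n] + E_{\rm XC}[n] + \int dx\, n v \Bigm| \sum_i n_{N;i} |\phi_i|^2 = n,\ 0 \leq n_{N;i} \leq 1,\ \sum_i n_{N;i} = N,\ (\phi_i|\phi_j) = \delta_{ij} \Bigr\}. \tag{4.59}$$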

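Varying $n$ through $\phi^*_i$, the usual Kohn-Sham equation (4.32) is obtained from (4.59). Now putting the resulting Kohn-Sham orbitals into the variational expression $\{\cdots\}$ of (4.59) for $E[v]$ and varying the orbital occupation numbers $n_{N;i}$ one has

$$\frac{\partial\{\cdots\}}{\partial n_{N;i}} = \int dx\, \phi^*_i \Bigl(-\frac{\nabla^2}{2}\Bigr) \phi_i + \int dx'\, \frac{\delta}{\delta n(x')} \Bigl\{ E_{\rm H}[n] + E_{\rm XC}[n] + \int dx\, n v \Bigr\} \frac{\partial n(x')}{\partial n_{N;i}} = \int dx\, \phi^*_i \Bigl(-\frac{\nabla^2}{2}\Bigr) \phi_i + \int dx\, \phi^*_i v_{\rm eff} \phi_i = \varepsilon_i. \tag{4.60}$$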


This result is called Janak's theorem. Of course, only occupation number variations $\sum_i \delta n_{N;i}\, \partial\{\cdots\}/\partial n_{N;i}$ with $\sum_i \delta n_{N;i} = 0$ are allowed due to the constraint $\sum_i n_{N;i} = N$ in (4.59). Pick a pair $ij$ and pick $\delta n_{N;i} = -\delta n_{N;j} = \delta n$. Then from (4.60), if $n_{N;i} = 0$ and $n_{N;j} > 0$, then $\varepsilon_i \geq \varepsilon_j$ ($\partial\{\cdots\}/\partial\delta n \geq 0$, because $\delta n < 0$ is not allowed in that case; cf. Fig. 3 on page 66). If $n_{N;i} < 1$ and $n_{N;j} = 1$, then the minimum condition demands again $\varepsilon_i \geq \varepsilon_j$, because $\delta n < 0$ (i.e. $\delta n_{N;j} > 0$) is again not allowed. If $0 < n_{N;i}, n_{N;j} < 1$, then $\delta n = 0$ is an inner point of the domain of variation for $\delta n$, and $\varepsilon_i = \varepsilon_j$ must hold true.

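This result [Englisch and Englisch, 1984a,b] generalizes the occupation rule of Section 4.2: If the Kohn-Sham energies are again ordered according to

$$\varepsilon_1 \leq \varepsilon_2 \leq \ldots, \tag{4.61}$$

then

$$n_{N;i} = 1 \ \text{for } \varepsilon_i < \varepsilon_N, \qquad 0 \leq n_{N;i} \leq 1 \ \text{for } \varepsilon_i = \varepsilon_N, \qquad n_{N;i} = 0 \ \text{for } \varepsilon_i > \varepsilon_N. \tag{4.62}$$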
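The occupation rule (4.62) is easy to state as a small routine. The sketch below is one admissible realization only: it spreads the particles left for the highest needed level evenly over all levels degenerate with it, whereas (4.62) itself allows any distribution between 0 and 1 on those levels. The function name, the tolerance and the sample levels are assumptions for illustration.

```python
def aufbau_occupations(levels, n_particles, tol=1e-12):
    """Occupation numbers obeying (4.62) for a list of orbital energies.

    Levels below the highest needed energy are filled, levels above it stay
    empty, and the remaining particles are shared evenly among the levels
    degenerate at that energy (one admissible choice among many).
    """
    order = sorted(range(len(levels)), key=lambda i: levels[i])
    occupations = [0.0] * len(levels)
    remaining = float(n_particles)
    position = 0
    while remaining > tol and position < len(order):
        # Collect the group of degenerate levels starting at `position`.
        energy = levels[order[position]]
        group = [i for i in order[position:] if abs(levels[i] - energy) <= tol]
        share = min(1.0, remaining / len(group))
        for i in group:
            occupations[i] = share
        remaining -= share * len(group)
        position += len(group)
    return occupations

# Hypothetical Kohn-Sham energies with a threefold degenerate level.
example_levels = [-2.0, -1.0, -1.0, -1.0, 0.5]
example_occupations = aufbau_occupations(example_levels, 3)
```

For three particles in these levels, the lowest level is filled and the remaining two particles occupy the threefold degenerate level with 2/3 each, which keeps the three crossing levels degenerate in the sense discussed in the text.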

This aufbau principle resolves problems with level crossing if self-consistency is balanced with partial occupation of crossing levels, keeping them degenerate. If in this way a self-consistent $v_{\rm eff}$ can be found, then there is again a $v^0[n]$ equal to that $v_{\rm eff}$ for which the obtained density is a non-interacting ground state density (but no longer necessarily a determinantal one), and hence the $\phi_i$, $n_{N;i}$ are minimizing (4.57) and (4.58).

There are further attempts to build up a density-matrix functional theory based on $\gamma_1$ [Zumbach and Maschke, 1985].

4.6 Dependence on Particle Number N

In a mixed state, the particle number $N$ need not be fixed. It need not even be fixed in a pure state if consideration is based on the Fock space of Section 1.5. All determinants (4.51) for all integer numbers $N = 1, 2, \ldots$ together with the vacuum state form a complete orthonormal basis of the Fock space $\mathcal F$,

$$|\Phi_L\rangle = |n^L_1 \ldots n^L_i \ldots\rangle \tag{4.63}$$

in the notation of (1.63). The density is now given by (2.23, 2.29) for a pure state. For a general mixed state

$$\hat\gamma = \sum_K |\Psi_K\rangle\, g_K\, \langle\Psi_K|, \qquad 0 \leq g_K, \qquad \sum_K g_K = 1, \tag{4.64}$$


where the $|\Psi_K\rangle$ are general Fock space vectors, the single-particle density matrix, for instance, is defined as

$$\gamma_1(x; x') = {\rm tr}\, \bigl(\hat\psi(x)\, \hat\gamma\, \hat\psi^\dagger(x')\bigr) = \sum_K g_K \langle\Psi_K|\hat\psi^\dagger(x')\hat\psi(x)|\Psi_K\rangle. \tag{4.65}$$

All relations (4.51) through (4.55), with the subscript $N$ omitted and the $L$-sums running over all basis states (4.63) of the Fock space, remain valid in the present more general case on the basis of the same reasoning as there. The only change is that $\sum_i n_i = N$ may now be any non-negative real number, the expectation value of the total particle number operator ($N = \langle\hat N\rangle = {\rm tr}\, \hat N \hat\gamma$).

The ground state energy (cf. (4.4)) is now defined as a functional of $v$ and a function of real non-negative $N$:

$$E[v, N] \stackrel{\rm def}{=} \inf\bigl\{ {\rm tr}\, \hat H[v] \hat\gamma \bigm| {\rm tr}\, \hat N \hat\gamma = N \bigr\}. \tag{4.66}$$

This definition modifies the definition of (4.4) in two steps. First, in (4.4) the search was over pure particle number eigenstates (eigenstates of $\hat N$), whereas the pure states $|\Psi_K\rangle$ of (4.64) are general Fock space states, which may be linear combinations $|\Psi_K\rangle = \sum_{N'} |\Psi_{N'}\rangle c_{N'}$, $\sum_{N'} |c_{N'}|^2 = 1$, of particle number eigenstates $\hat N |\Psi_{N'}\rangle = |\Psi_{N'}\rangle N'$ (cf. Section 1.5). Since $\hat H[v]$ and $\hat N$ commute, one has $\langle\Psi_K|\hat H[v]|\Psi_K\rangle = \sum_{N'} |c_{N'}|^2 \langle\Psi_{N'}|\hat H[v]|\Psi_{N'}\rangle$. This step already finds in (4.66) the minimum over affine-linear combinations, with non-negative weights, of pure-state minima of (4.4) for various $N'$: If we explicitly denote the $N$-dependence of (4.4) by $E_N[v]$, then we have in (4.66) a search over all $\sum_{N'} |c_{N'}|^2 E_{N'}[v]$, $\sum_{N'} |c_{N'}|^2 N' = N$, $\sum_{N'} |c_{N'}|^2 = 1$. The second step consists in considering mixtures (4.64) of pure states $|\Psi_K\rangle$, which, however, does not further lower the infimum of (4.66) because we know already from the last section that this infimum over mixed states is the same as over pure states. Hence,

$$E[v, N] = \min_{c_{N'}} \Bigl\{ \sum_{N'} |c_{N'}|^2 E_{N'}[v] \Bigm| \sum_{N'} |c_{N'}|^2 N' = N,\ \sum_{N'} |c_{N'}|^2 = 1 \Bigr\}, \tag{4.67}$$

and this means $E[v, N]$ is the convex hull over all $E_{N'}[v]$ with integer $N'$; in particular $E[v, N] \leq E_N[v]$ for integer $N$, and equality holds if $E_N[v]$ (with integer $N$ and fixed $v$) is convex (cf. Fig. 5).
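The minimization (4.67) can be carried out explicitly for a finite table of integer-$N$ energies. The sketch below assumes such a table is given as a list indexed by particle number; because the objective and both constraints are linear in the weights $|c_{N'}|^2$, it is enough to mix at most two particle numbers, which is what the routine searches over. The energy tables are made-up numbers, the first one imitating the pair-binding situation of Fig. 5b.

```python
def mixed_ground_energy(integer_energies, n):
    """Evaluate (4.67): lowest weighted average of integer-N energies.

    integer_energies[k] stands for E_k[v], k = 0, 1, 2, ...  Only mixtures of
    two particle numbers N1 <= n <= N2 are searched, which suffices for a
    linear objective with two linear constraints.
    """
    best = float("inf")
    for n1, e1 in enumerate(integer_energies):
        for n2, e2 in enumerate(integer_energies):
            if n1 <= n <= n2:
                if n1 == n2:
                    candidate = e1
                else:
                    weight2 = (n - n1) / (n2 - n1)   # plays the role of |c_{N2}|^2
                    candidate = (1 - weight2) * e1 + weight2 * e2
                best = min(best, candidate)
    return best

# Hypothetical energies: particles bind in pairs, so the energy drops by one
# unit only with every second particle (not convex in integer N).
pair_energies = [0.0, 0.0, -1.0, -1.0, -2.0, -2.0, -3.0]

# Hypothetical energies that are convex in integer N.
convex_energies = [0.0, -3.0, -5.0, -6.0]
```

With `pair_energies`, the mixed-state energy at odd particle numbers lies below the pure-state value (for example halfway between the neighboring even-$N$ energies), whereas with `convex_energies` the routine reproduces the table at every integer.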

Figure 5: Ground state energies $E_N[v]$ (• for integer $N$) and $E[v, N]$ (thick line for real $N$) as a function of $N$ at a fixed $v$: a) for non-interacting particles in the external potential $v$ with single-particle levels $\varepsilon_1, \varepsilon_2, \varepsilon_3$, b) for particles moving in constant zero external potential and binding in pairs: the energy lowers every other step by the negative binding energy of the next formed pair.

Given $v(x)$ for an interaction-free system with single-particle levels $\varepsilon_1 \leq \varepsilon_2 \leq \ldots$, the ground state energy for integer $N$ is evidently

$$E^0_N[v] = \sum_{i=1}^{N} \varepsilon_i, \tag{4.68}$$

which can directly be deduced by using (4.55) in (4.66). Varying $\phi^*_i$ yields the single-particle Schrödinger equation, and varying $n_i$ yields (4.62) and hence (4.68). Due to the monotone order of the $\varepsilon_i$, which has to be used in (4.68), trivially

$$2E^0_N[v] \leq E^0_{N-1}[v] + E^0_{N+1}[v]. \tag{4.69}$$


Hence, the variation of the $n_i$ for non-integer $N$ leads, according to (4.62), just to linear interpolation between the values (4.68) for neighboring integer $N$ (Fig. 5a).
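The linear interpolation and the inequality (4.69) can be illustrated directly. The sketch below fills a hypothetical set of single-particle levels from below, with the last level fractionally occupied for non-integer $N$; the level values and the function name are illustrative assumptions.

```python
def noninteracting_energy(levels, n):
    """E[v, N] for non-interacting particles: fill the lowest levels, the last
    one fractionally when N is not an integer."""
    energy = 0.0
    remaining = float(n)
    for eps in sorted(levels):
        occupation = min(1.0, remaining)
        if occupation <= 0.0:
            break
        energy += occupation * eps
        remaining -= occupation
    return energy

# Hypothetical single-particle energies (one twofold degenerate level).
levels = [-3.0, -2.0, -2.0, -0.5, 1.0]

energy_at_integers = [noninteracting_energy(levels, k) for k in range(6)]
energy_midway = noninteracting_energy(levels, 2.5)
```

In this example `energy_midway` is the arithmetic mean of the values at $N = 2$ and $N = 3$, and the integer values satisfy $2E^0_N \leq E^0_{N-1} + E^0_{N+1}$ throughout.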

Unfortunately, nothing is known up to now about conditions under which $E_N[v]$ is convex in $N$ for an interacting system, although empirically it is convex for electrons in atoms and molecules (which is also confirmed numerically). If $E_N[v]$ is convex in $N$, then the theory of the last section is simply extended by considering in (4.54–4.62) occupation numbers $n_i$ adding up to non-integer $N$, and densities $n(x)$ integrating to non-integer $N$.

Nevertheless, as seen from (4.67), also in the general case of interacting particles, $E[v, N]$ for fixed $v$ may be a (piecewise linear) convex function of $N$ even if $E_N[v]$ is not convex. (Cf. Fig. 5b. In Chapter 6 the fact will be exploited that $E[v]$ is a concave function of $v$ under completely general assumptions.) If $E[v, N]$ were a strictly convex function of $N$ for fixed $v$, then a chemical potential $\mu(N)$ would exist as a strictly monotone function of $N$, and

$$E[v, N] = \inf_{\hat\gamma} {\rm tr}\, \bigl( (\hat H[v] - \hat N \mu(N)) \hat\gamma \bigr). \tag{4.70}$$

If $E[v, N]$ exists and is convex in $N$ (cf. Section 6.1) then it is piecewise linear between integer $N$, $\mu(N) = \partial E[v, N]/\partial N$ is monotone and piecewise constant (and so is $N(\mu)$, the map inverse to $\mu(N)$), and it jumps at integer $N$. In the interaction-free case those jumps are from values $\varepsilon_{N-1}$ to values $\varepsilon_N$. ($N(\mu)$ jumps by one at $\mu = \varepsilon_i$, the jumps stacking on each other in cases of orbital degeneracy.) Note that the thermodynamic chemical potential is defined for the dependence of $E/V$ on $N/V$ in the thermodynamic limit $V \to \infty$ for the volume $V$ (where some appropriate assumption on $v$ is to be made, e.g. being assumed periodic), and most of the jumps dissolve in that limit. Also, in the thermodynamic limit, if $E_N[v]$ were not convex, an instability towards a phase-separated state with two different particle densities would appear, which in case of a short-range interaction $w$ would again have a thermodynamic energy corresponding to $E[v, N]$. For Coulomb interaction, the situation is more involved due to its long-range character, and the notion of a chemical potential applies unmodified to situations with local charge neutrality only. Phase separation in this context appears due to restrictions of quantum (coherent) mixtures of macroscopic states. To incorporate this case, $\hat\gamma$ in (4.70) comprises phase-separated thermodynamic mixtures.

Figure 6: Ionization potential $I$ and electron affinity $A$ for an $N$-electron species, shown on a plot of $E(N)$ against $N$ between $N - 1$ and $N + 1$. The slope of the $E(N)$ curve is $\mu(N)$. The negative of the slope of the dashed line is Mulliken's electronegativity, apparently yielding a good value for $-\mu(N)$.

Now, fix $v$ and consider $E(N)$ for an atom or an ion or a molecule or radical. Of course, in the interacting case, the Kohn-Sham orbital energies $\varepsilon_i$ are now functions of $N$ or of the occupation numbers $n_i$ of (4.62), respectively. Then $A = E(N) - E(N + 1)$ is the electron affinity, and from Janak's theorem (4.60) the in principle rigorous relation

$$-A = \int_0^1 dn_{\rm LUMO}\, \varepsilon_{\rm LUMO}(n_{\rm LUMO}) \tag{4.71}$$

follows, where LUMO is the lowest unoccupied Kohn-Sham level (Lowest Unoccupied Molecular Orbital) of the $N$-particle state, computed self-consistently for varying occupation number of that level with all the remaining occupation numbers fixed (level crossing assumed to be absent). Analogously one finds

$$-I = \int_0^1 dn_{\rm HOMO}\, \varepsilon_{\rm HOMO}(n_{\rm HOMO}) \tag{4.72}$$

for the ionization potential $I = E(N - 1) - E(N)$, where HOMO is now the highest occupied Kohn-Sham level of the $N$-particle state. Mulliken's electronegativity is

$$\chi_{\rm M} = \frac{1}{2}(I + A), \tag{4.73}$$

since for two species S and T the energy for an electron transition from S to T is $\Delta E = I_S - A_T$ whereas for an electron transition from T to S it is $\Delta E = I_T - A_S$. Both are equal for $I_S + A_S = I_T + A_T$. The whole situation for $I$, $A$ and $\chi_{\rm M}$ is sketched in Fig. 6.

As a reasonable approximation to (4.71, 4.72), Slater's concept of the transition state can be used:

$$-A \approx \varepsilon_{\rm LUMO}(n_{\rm LUMO} = 1/2), \qquad -I \approx \varepsilon_{\rm HOMO}(n_{\rm HOMO} = 1/2). \tag{4.74}$$

A detailed consideration of possible physical meanings of the Kohn-Sham orbital energies $\varepsilon_i$ may be found in [Perdew, 1985].

Strictly speaking, (4.74) should be more than an approximation, because $\partial E[v, N]/\partial N$ should quite generally be constant between two neighboring integer $N$, and by Janak's theorem it should be given by the Kohn-Sham orbital energy $\varepsilon_i$ of the partially occupied orbital, provided $E_N[v]$ is convex in the integer variable $N$. In the approximate versions of the theory in use this piecewise linearity of $E[v, N]$ is, however, never provided.
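Why the half-occupied level is a reasonable stand-in for the integrals (4.71, 4.72) can be seen on a model dependence of the orbital energy on its occupation. The sketch below assumes two purely hypothetical model functions, one linear and one with a small quadratic term; the parameter values are made up and carry no physical meaning.

```python
def integrate_unit_interval(f, steps=1000):
    """Composite midpoint rule for the integral of f over [0, 1]."""
    h = 1.0 / steps
    return h * sum(f((k + 0.5) * h) for k in range(steps))

def linear_level(n, eps0=-0.10, u=0.30):
    """Hypothetical orbital energy depending linearly on its occupation n."""
    return eps0 + u * n

def curved_level(n, eps0=-0.10, u=0.30, c=0.06):
    """Same model with an additional quadratic term c * n**2."""
    return eps0 + u * n + c * n * n

# Integral over the occupation, as in (4.71), versus the transition-state
# value at half occupation, as in (4.74).
integral_linear = integrate_unit_interval(linear_level)
slater_linear = linear_level(0.5)

integral_curved = integrate_unit_interval(curved_level)
slater_curved = curved_level(0.5)
```

For the linear model the two numbers coincide, since the midpoint value of a linear function equals its average; for the curved model they differ by $c/12$, so the quality of (4.74) in this picture is controlled by the curvature of $\varepsilon(n)$.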

4.7 Spin Polarization

With a few modifications, this whole text allows for a spin-dependence of the external potential $v(x) = v(r, s) = (v(r, +), v(r, -))$ (the latter notation for the most important spin-half case) with an interaction term

$$\int dx\, n v = \int d^3r\, \bigl( n(r, +) v(r, +) + n(r, -) v(r, -) \bigr). \tag{4.75}$$

If $v$ is spin-independent, $v(r, +) = v(r, -) = v(r)$, then this interaction reduces to

$$\int dx\, n v = \int d^3r\, n(r) v(r). \tag{4.76}$$

As declared at the very beginning after (1.7), our notation is always meant to comprise both cases. (Diamagnetic interaction, however, i.e. orbital motion coupling to an external magnetic field, is not covered by this framework.) We used this notation to avoid parallelism in the presentation of the material, because the theories of both cases go formally largely in parallel.

One main difference is that there is no spin-dependent analogue to the basic theorem by Hohenberg and Kohn. The spin dependence of an external potential, the external magnetic field, is not a unique function of the ground state spin density any more: for instance, for a finite system with discrete levels of total energy, Zeeman energies in a homogeneous external magnetic field shift the levels without (in our non-relativistic approximation (4.1) with spin-orbit interaction and orbital magnetism neglected) changing wave functions until levels with different total spin in field direction cross. They must cross because they have different slopes of the Zeeman contributions. At that field the ground state reconstructs. Hence, $n(x)$ is constant in certain intervals of homogeneous magnetic field. Fig. 7 shows the case of the beryllium atom. The discrete stationary states are eigenstates of the $z$-component of the total spin, $\Sigma_z$, and the interaction energy with a homogeneous $B$-field in $z$-direction is $-2\mu_{\rm Bohr}\Sigma_z B$. On the left of Fig. 7 the signature of the four-electron state is indicated (main orbital character of the correlated state). For small $B$, the ground state is $1s^2 2s^2$ with $\Sigma_z = 0$ and the first excited state is $1s^2 2s 2p$ with $\Sigma_z = 1$. The next $\Sigma_z = 1$ state $1s 2s^2 2p$ is much higher in energy and hence cannot become the ground state even in large $B$-fields. The lowest $\Sigma_z = 2$ state is $1s 2s 2p^2$, and it becomes the ground state for larger $B$. The whole picture is symmetric with respect to $B = 0$; only the right half is sketched in Fig. 7. On $B$-intervals of non-zero length the ground state, and hence also its spin density, does not change with changing field strength.
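The level-crossing picture can be reproduced with three straight lines. The sketch below assumes hypothetical zero-field energies for the three states named in the text and sets the Bohr magneton to one in arbitrary units; only the slopes $-2\mu_{\rm Bohr}\Sigma_z$ are taken from the text.

```python
MU_BOHR = 1.0   # Bohr magneton in arbitrary units (illustrative choice)

# (label, zero-field energy, Sigma_z); the energies are hypothetical numbers.
levels = [("1s2 2s2", 0.0, 0), ("1s2 2s2p", 1.0, 1), ("1s2s2p2", 4.0, 2)]

def zeeman_energy(zero_field_energy, sigma_z, b_field):
    """Level energy including the Zeeman term -2 mu_Bohr Sigma_z B."""
    return zero_field_energy - 2.0 * MU_BOHR * sigma_z * b_field

def ground_state(b_field):
    """Label of the lowest level at field strength B >= 0."""
    return min(levels, key=lambda lv: zeeman_energy(lv[1], lv[2], b_field))[0]

def crossing_field(lower, upper):
    """Field at which two levels with different Sigma_z become degenerate."""
    return (upper[1] - lower[1]) / (2.0 * MU_BOHR * (upper[2] - lower[2]))

b0 = crossing_field(levels[0], levels[1])   # Sigma_z = 0 and 1 lines cross
b1 = crossing_field(levels[1], levels[2])   # Sigma_z = 1 and 2 lines cross
```

Between two successive crossing fields `ground_state` returns the same label for every field value, which is the statement that the ground state, and with it the spin density, stays unchanged on whole intervals of $B$.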

With inhomogeneous fields (next section) the situation is more involved and not fully explored up to now. In view of this difficulty, von Barth and Hedin [von Barth and Hedin, 1972] modified the proof of the basic theorem by Hohenberg and Kohn and proved directly a one-to-one map of ground state spin densities on ground state wave functions for non-degenerate ground states in order to construct the analogue of (4.16). However, now we have at hand the functionals $F_{\rm LL}$ and $F_{\rm DM}$ (and Lieb's $F$ introduced in Section 6.2) which all are uniquely defined without recourse to the basic theorem by Hohenberg and Kohn. Hence, everything of this chapter except the passage from (4.10) to (4.14) of Section 4.1 holds true for the spin-dependent case. Implications of the basic theorem by Hohenberg and Kohn for the problem of existence of functional derivatives for $F[n]$ will be discussed in Chapter 6.

Figure 7: Energy change $\Delta E$ with field $B$ for the discrete levels of the Be atom as explained in the text. The lines are labeled $E_0$, $E_1 - 2\mu_{\rm Bohr}B$ and $E_3 - 4\mu_{\rm Bohr}B$, with crossing fields $B_0$ and $B_1$; the spin occupations of the $1s$, $2s$ and $2p$ orbitals are indicated on the left. The lines with positive slopes correspond to states with all spins reversed. The thick line marks the ground state energy.

Even if the external potential is not spin-dependent, which is always the case if no magnetic field is applied and the hyperfine interaction effects of nuclear spins can be neglected, it is rather the rule than the exception that the ground state is spin-polarized. This comes about since an odd number of spin-half particles unavoidably has a half-integer total spin, and since the role of spin in the Pauli principle allows for a spin-dependent exchange interaction even when the Hamiltonian does not contain the spin variable. The Hohenberg-Kohn theory, on the other hand, has both variants, one for $n(r, s)$ and $v(r, s)$, and one for $n(r) = n(r, +) + n(r, -)$ and $v(r)$. In the latter case, even if the ground state appears to be spin-polarized, $v$ is a unique function of the spatial density $n(r)$ alone. (Spontaneous spin polarization always implies degeneracy of the ground state, since the direction of the total angular momentum in space is arbitrary; it is the classical example of spontaneous symmetry breaking. As we have seen, however, this does not spoil the basic theorem.)

These considerations apply even to a non-interacting system. Let $w \equiv 0$ and $v = -7/r$ for seven electrons. Then the ground state is degenerate; it may have total spin 1/2 or 3/2, and various radial density distributions, because of the 8 degenerate $2s$ and $2p$ spin-orbitals only 5 are occupied. The total spin of the real nitrogen atom is, according to Hund's rules, 3/2 and is only orientationally degenerate. The O(4) symmetry of the interaction-free problem is reduced to SO(3) in the Hamiltonian when Coulomb repulsion of the electrons is introduced. This lifts the degeneracy of $2s$ with $2p$ states (releases conservation of the Pauli-Runge-Lenz vector) and thus fixes the radial density distribution. As we have three electrons in $2p$ states, spin-independence would still suggest a degeneracy of ground states with total spin 1/2 and 3/2. In reality, however, the latter state has a lower energy. This time, the symmetry is broken kinematically: by requirement of the Pauli principle both wavefunctions differ in the number of orbital node surfaces in the nine-dimensional orbital configuration space, although the orbital density $n(r)$ may be spherical in both cases, if each of the three spatial $2p$ orbitals is occupied with one electron in an $L = 0$ state ($L$ being the total orbital momentum). In the spin-3/2 state, however, the wavefunction for the relevant spinor component must be zero for $r_1 = r_2$, $r_1 = r_3$, and $r_2 = r_3$ due to antisymmetry, whereas in the spin-1/2 state it has fewer nodes. These nodes of the many-particle wavefunction, which do not cause additional nodes in its single-particle projections, do not increase the kinetic energy; they decrease, however, the Coulomb repulsion energy. Finally, a particularly chosen pure ground state will have a particular spin direction: the symmetry is further broken, this time spontaneously (without a symmetry breaking term in the Hamiltonian or in the constraints on wavefunctions).

At first glance, one could think that both the spin-3/2 state and the spin-1/2 state of the nitrogen atom considered above will have the same (spherically symmetric) density $n(r)$. But of course these densities differ in their radial dependence, although only by a tiny amount (cf. Fig. 8). They have different total energies and hence different exponential fall-offs. Hartree-Fock theory yields all these details, though approximately, via the spin-dependence of the exchange potential operator. Spin-independent Hohenberg-Kohn-Sham theory provides only a local spin-independent potential $v_{\rm XC}(r) = \delta E_{\rm XC}[n]/\delta n(r)$. The question arises, how does this theory know about the total spin of the ground state of the nitrogen atom being 1/2 or 3/2? The simple answer is, it does not. Being a theory for $n(r)$ and the ground state energy, if $E_{\rm XC}[n]$ were exactly known, it would supply the correct ground state energy and the correct ground state density (with the correct exponential fall-off) as the sum over spin-independent Kohn-Sham orbital densities, but it would not know anything about the spin. It should have become clear from this little excursion that in this variant $E_{\rm XC}[n(r)]$ must be an extremely tricky functional.

Figure 8: Radial electron densities $r^2 n(r)$ and spin densities $r^2 n(r, s)$ for the nitrogen atom, plotted against $r/a_{\rm B}$ from 0 to 5. The upper full line gives the total electron density for the spin 3/2 ground state, and the lower two lines give the spin up and down densities. For comparison, the dotted line very close to the upper full line gives the total density of the lowest spin 1/2 state. (Calculated by L. Steinbeck, unpublished.)

It should immediately be clear that the spin dependent variant of thetheory provides a much more promising situation, because now EXC[n(x)] istold about the spin of the state through n(x) comprising the spatial spin den-sity m(r) = n(r,+)− n(r,−), and it need no further trace tiny differencesof spatial density fall-offs. Recall that, since for a (possibly spin-polarized)ground state n(r) coming from a spin-independent v(r) the latter is uniquelydetermined by the former, and m(r) is determined by that v, e.g. via theN -particle Schrodinger equation, this m(r) is also determined by n(r) (al-though in cases of degeneracy not uniquely). These considerations aim atmaking clear what has not yet been duly stressed in the literature: EXC[n(r)]and EXC[n(x)] play completely different roles in their respective versions oftheory. Both cope with all ground states, spin-polarized or not (since de-generacy of the ground state need not be excluded), of all spin-independentpotentials, one knowing only the spatial particle density n(r) (and hence itmust be by far the more clever one) and one knowing additionally the spatialspin density m(r). The latter, less demanded by ground states to potentialsv(r), copes additionally with ground states to potentials v(x). The exactfunctionals would of course do equally well for potentials v(r), but clearly aguess for the less demanding case is more promising.

We mention in passing that even in the exact versions of both theories the Kohn-Sham orbital energies would come out spin-independent in the one case and spin-dependent in the other, which proves again that the immediate physical meaning of those entities is rather limited.

4.8 Non-Collinear Spin Configurations

We now consider the more general case of an arbitrary magnetic field $\mathbf B(\mathbf r)$, possibly not unidirectional and with $\mathbf r$-dependent magnitude, coupled to the electron spin only via the Zeeman term. The relativistic coupling to the orbital motion as well as the dipole-dipole interaction between the spins are still neglected. This case is covered by a $2\times 2$ spin matrix

$$v_{ss'}(\mathbf r) = v(\mathbf r)\,\delta_{ss'} - 2\mu_{\mathrm{Bohr}}\,\mathbf B(\mathbf r)\cdot\boldsymbol\sigma_{ss'} \tag{4.77}$$


for the external potential, where $\boldsymbol\sigma_{ss'}$ is the vector of the three components (1.10) of the spin operator (1/2 times the Pauli matrices). The potential energy due to this external potential is

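$$\sum_{ss'}\int d^3r\; n_{ss'}(\mathbf r)\,v_{s's}(\mathbf r) = \int d^3r\,\bigl(n(\mathbf r)\,v(\mathbf r) - \mathbf m(\mathbf r)\cdot\mathbf B(\mathbf r)\bigr) \tag{4.78}$$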

with the spin magnetization density $\mathbf m$ from (2.34). This case was first considered by [von Barth and Hedin, 1972]. Here we follow the recent analysis by [Eschrig and Pickett, 2001].
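The identity behind (4.78) can be checked pointwise with a few lines of Python. This is only a sketch under stated assumptions: since (2.34) is not reproduced in this section, the magnetization is taken here as $\mathbf m = 2\mu_{\mathrm{Bohr}}\,\mathrm{tr}[n\boldsymbol\sigma]$, and the density matrix, the potential value and the field are made-up numbers; all function names are illustrative.

```python
# Pointwise check of the integrand identity in (4.78) with plain 2x2 matrices.
# Assumption (not taken from the text): m = 2*mu_B*tr[n sigma], sigma = Pauli/2.

MU_B = 1.0  # Bohr magneton in arbitrary units (assumed value for the illustration)

# Spin operator components sigma = (1/2) * Pauli matrices, cf. (1.10)
SIGMA = [
    [[0, 0.5], [0.5, 0]],        # sigma_x
    [[0, -0.5j], [0.5j, 0]],     # sigma_y
    [[0.5, 0], [0, -0.5]],       # sigma_z
]

def trace_product(a, b):
    """Return sum_{s s'} a[s][s'] * b[s'][s], i.e. tr(a b)."""
    return sum(a[s][t] * b[t][s] for s in range(2) for t in range(2))

def potential_matrix(v, B):
    """v_{ss'} = v*delta_{ss'} - 2*mu_B * B . sigma_{ss'}, as in (4.77)."""
    return [[v * (s == t) - 2 * MU_B * sum(B[a] * SIGMA[a][s][t] for a in range(3))
             for t in range(2)] for s in range(2)]

def densities(n_mat):
    """Particle density n = tr(n_mat) and magnetization m = 2*mu_B*tr(n_mat sigma)."""
    n = (n_mat[0][0] + n_mat[1][1]).real
    m = [(2 * MU_B * trace_product(n_mat, SIGMA[a])).real for a in range(3)]
    return n, m

# A Hermitian, positive spin-density matrix at one point r (made-up numbers)
n_mat = [[0.7 + 0j, 0.1 - 0.2j], [0.1 + 0.2j, 0.3 + 0j]]
v, B = -1.3, [0.2, -0.4, 0.5]

lhs = trace_product(n_mat, potential_matrix(v, B)).real   # sum_{ss'} n_{ss'} v_{s's}
n, m = densities(n_mat)
rhs = n * v - sum(m[a] * B[a] for a in range(3))          # n v - m . B
```

Under the stated assumption on $\mathbf m$, `lhs` and `rhs` agree up to rounding, which is the integrand-level content of (4.78).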

Following the route of Section 4.1, let again $v_1 \neq v_2$ be two different potentials having the same ground state $\Psi_0$. Instead of (4.9), subtraction of the two Schrödinger equations from each other now yields

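$$\sum_i \sum_{s'_i} \Delta v_{s_i s'_i}(\mathbf r_i)\,\Psi_0(\mathbf r_1 s_1,\dots,\mathbf r_i s'_i,\dots,\mathbf r_N s_N) = \Psi_0(\mathbf r_1 s_1,\dots,\mathbf r_N s_N)\,\Delta E \tag{4.79}$$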

with $\Delta v = v_1 - v_2$, $\Delta E = E[v_1] - E[v_2]$. In this notation, $\Psi_0$ may be understood as a set of $2^N$ functions of $\mathbf r_1,\dots,\mathbf r_N$ for all combinations of spin indices $s_i = +,-$. (Due to the symmetry of $\Psi_0$, not all of those functions are independent and not all of them need be non-zero; however, at least one of them must be non-zero.) Contrary to (4.9), where the equations for those functions were decoupled, they are now coupled by the l.h.s. of (4.79), and the reasoning after (4.9) no longer applies.

In order to decouple the equations, perform an $\mathbf r$-dependent unitary spin rotation $Q_{ss'}(\mathbf r)$ which diagonalizes $\Delta v$:

$$\bigl[Q(\mathbf r)\,\Delta v(\mathbf r)\,Q^\dagger(\mathbf r)\bigr]_{ss'} = \Delta v_s(\mathbf r)\,\delta_{ss'}. \tag{4.80}$$

The ground state $\Psi_0$ is transformed into the new local spin variables according to

$$\Bigl[\prod_i Q_{s_i s'_i}(\mathbf r_i)\Bigr]\Psi_0(\mathbf r_1 s'_1,\dots,\mathbf r_N s'_N) = \Psi_0(\mathbf r_1 s_1,\dots,\mathbf r_N s_N), \tag{4.81}$$

and (4.79) now reads

$$\sum_i \Delta v_{s_i}(\mathbf r_i)\,\Psi_0(\mathbf r_1 s_1,\dots,\mathbf r_N s_N) = \Psi_0(\mathbf r_1 s_1,\dots,\mathbf r_N s_N)\,\Delta E. \tag{4.82}$$


For one component of $\Psi_0$ with $s_1 = s_2 = \dots = s_{N_+} = +$, $s_{N_+ + 1} = s_{N_+ + 2} = \dots = s_N = -$ (this order of the $s_i$ is without loss of generality due to the antisymmetry of $\Psi_0$), (4.82) reads

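$$\Bigl[\sum_{i=1}^{N_+}\Delta v_+(\mathbf r_i) + \sum_{i=N_+ + 1}^{N}\Delta v_-(\mathbf r_i)\Bigr]\Psi_0(\mathbf r_1 +,\dots,\mathbf r_N -) = \Psi_0(\mathbf r_1 +,\dots,\mathbf r_N -)\,\Delta E. \tag{4.83}$$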

Now, the reasoning which followed (4.9) applies. The dependence of both sides of (4.83) on $\mathbf r_1$ requires that $\Delta v_+(\mathbf r)$ is a constant a.e., and likewise, by virtue of the dependence on $\mathbf r_N$, $\Delta v_-(\mathbf r)$ must be (possibly another) constant a.e. The special cases $N_+ = 0$ or $N_+ = N$ need no separate treatment since then one of the functions $\Delta v_\pm(\mathbf r)$ is irrelevant. Hence,

$$\Delta v_+ = C_+, \qquad \Delta v_- = C_-. \tag{4.84}$$

Two separate cases must be considered.

Case A: impure spin states. Suppose that at least two components of $\Psi_0$ with different numbers $N_+$ and $N_- = N - N_+$ are nonzero. Then $N_+C_+ + (N - N_+)C_- = \Delta E$ must hold for two different numbers $N_+$, hence $C_+ = C_- = C$, which implies $\Delta v_+ = \Delta v_-$ and, from inverting (4.80), $\Delta v_{ss'} = C\delta_{ss'}$. This means $\Delta\mathbf B = 0$. Repeating the steps from (4.10) to (4.12) recovers the Hohenberg-Kohn result in the more general case:

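$$n_{1,ss'} = n_{2,ss'} \;\longrightarrow\; \begin{pmatrix} v_1(\mathbf r) - v_2(\mathbf r) \equiv C \\ \mathbf B_1(\mathbf r) - \mathbf B_2(\mathbf r) \equiv 0 \end{pmatrix}. \tag{4.85}$$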

The ground state spin density determines the scalar potential up to a constant and the applied magnetic field uniquely. This also implies a non-zero spin susceptibility.
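The step in Case A from two nonzero components to $C_+ = C_-$ can be written out explicitly. In the following short derivation, $N_+'$ is a label introduced here (it does not appear in the original text) for the second value of $N_+$:

```latex
\begin{aligned}
N_+ C_+ + (N - N_+)\,C_- &= \Delta E, \\
N_+' C_+ + (N - N_+')\,C_- &= \Delta E, \qquad N_+ \neq N_+', \\
\text{difference:}\quad (N_+ - N_+')\,(C_+ - C_-) &= 0
\;\Longrightarrow\; C_+ = C_- = C, \qquad \Delta E = N C .
\end{aligned}
```

With a single value of $N_+$ only the first line is available, which is why Case B leaves $C_+ - C_-$ undetermined.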

Case B: pure spin states. This case is more involved. Suppose that all non-zero components of $\Psi_0$ belong to the same numbers $N_+$ and $N_-$. Such a $\Psi_0$ may be considered as a pure spin state, an eigenstate of the operator $\Sigma_z = \sum_i \sigma_{z,s_i s'_i}$ with the eigenvalue $\Sigma_z = N_+ - N/2 = (N_+ - N_-)/2$. Then $C_+$ and $C_-$ need not be equal:

$$\Delta v = \begin{pmatrix} C_+ & 0 \\ 0 & C_- \end{pmatrix} = C\,1 - 2\mu_{\mathrm{Bohr}} B\,\sigma_z, \tag{4.86}$$

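where $C = (C_+ + C_-)/2$ and $-2\mu_{\mathrm{Bohr}}B = C_+ - C_-$ (recall that $\sigma_z$ is 1/2 times the Pauli matrix, so that the diagonal elements of (4.86) are $C \mp \mu_{\mathrm{Bohr}}B$). Back transforming according to the inverse of (4.80) gives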

$$\Delta v_{ss'}(\mathbf r) = C\delta_{ss'} - 2\mu_{\mathrm{Bohr}} B\,\bigl[Q^\dagger(\mathbf r)\,\sigma_z\,Q(\mathbf r)\bigr]_{ss'}. \tag{4.87}$$


Now the implication is

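$$\Psi_{01} = \Psi_{02} \;\longrightarrow\; \begin{pmatrix} v_1(\mathbf r) - v_2(\mathbf r) = C \\ \mathbf B_1(\mathbf r) - \mathbf B_2(\mathbf r) = B\,\mathbf e(\mathbf r) \end{pmatrix}, \qquad \mathbf e(\mathbf r) = 2\,\mathrm{tr}\,[\boldsymbol\sigma\,Q^\dagger(\mathbf r)\,\sigma_z\,Q(\mathbf r)], \tag{4.88}$$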

where the property $2\,\mathrm{tr}\,[\sigma_\alpha\sigma_\beta] = \delta_{\alpha\beta}$ of the $\sigma$-matrices (1.10) was used.

A ground state $\Psi_0$ which may be transformed into an eigenstate of the z-component of total spin by an $\mathbf r$-dependent spin rotation determines the external scalar potential up to a constant and the applied magnetic field up to a possibly non-unidirectional but constant in magnitude field contribution.
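The direction field $\mathbf e(\mathbf r)$ of (4.88) can be made concrete at a single point. The sketch below is an illustration under assumptions, not part of the original analysis: it parametrizes the local rotation $Q$ by polar angles `theta`, `phi` (a choice made here), evaluates $\mathbf e = 2\,\mathrm{tr}[\boldsymbol\sigma Q^\dagger\sigma_z Q]$ with hand-written 2x2 complex matrices, and compares with the unit vector belonging to those angles.

```python
import cmath
import math

# sigma = (1/2) * Pauli matrices, cf. (1.10)
SIGMA = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def spin_rotation(theta, phi):
    """Unitary Q whose adjoint has the spin-up/down spinors along (theta, phi) as columns."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    u = [[c, -cmath.exp(-1j * phi) * s],
         [cmath.exp(1j * phi) * s, c]]
    return dagger(u)

def direction(q):
    """e = 2 tr[sigma Q^dagger sigma_z Q], second relation in (4.88)."""
    rotated = matmul(matmul(dagger(q), SIGMA[2]), q)
    return [(2 * trace(matmul(SIGMA[a], rotated))).real for a in range(3)]

theta, phi = 0.9, 2.1   # arbitrary local spin direction at some point r
e = direction(spin_rotation(theta, phi))
expected = [math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta)]
```

For every choice of angles `e` is a unit vector, and for `theta = 0` (that is, $Q \equiv 1$) it reduces to the z-direction, in line with the statement that the undetermined field contribution has constant magnitude but possibly varying direction.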

A simple example of this highly non-trivial generalization of the basic Hohenberg-Kohn theorem is the Be atom as discussed in the last section. In this case $Q(\mathbf r) \equiv 1$, and the applied magnetic field is determined up to a unidirectional constant field.

The question arises what the general conditions are for a ground state which may be transformed into a pure spin state $\Psi_0$. A sufficient condition is the existence of an operator

$$S = \sum_{i=1}^{N}\sum_{t_i t'_i} Q^*_{t_i s_i}(\mathbf r_i)\,\sigma_{z,t_i t'_i}\,Q_{t'_i s'_i}(\mathbf r_i) \tag{4.89}$$

that commutes with the Hamiltonian $H = T + W + U$. Consider the particularly important case with $\mathbf B_1 \equiv 0$. This yields the behavior of a spontaneously (without applied $\mathbf B$-field) spin-polarized system when a magnetic field $\mathbf B_2(\mathbf r)$ is switched on. Since the interaction $W$ and, in the considered case, also $U_1$ are spin independent, $S$ will commute with $H_1$ if and only if it commutes with $T$, which means that the $i$th item of $S$ must commute with the $i$th item of $T$. This necessitates that $Q$ be $\mathbf r$-independent, so that $\Psi_0$ itself is an eigenstate of $\Sigma_z$ and hence is a collinear spin state. The second condition in (4.88) reduces to $\mathbf B_2 = B\mathbf e$ where $\mathbf e$ points in the spin direction; turning on a uniform magnetic field leaves the ground state invariant. Restated: in the subspace of collinear spontaneous magnetizations, the ground state determines an applied magnetic field only up to some codirectional uniform field. The static $q = 0$ spin susceptibility for a field in magnetization direction is zero. Apart from the considered sufficient condition for this case there may additionally at most be certain accidental cases. The considered situation is the most general systematic possibility which can appear with a spontaneous spin-polarized ground state. Implications on half-metallicity are discussed in [Eschrig and Pickett, 2001].


5 Legendre Transformation

The Legendre transformation as a change of variables of convex functions is a familiar concept in physics. It was introduced into density functional theory by Lieb as a then completely new aspect of the transfer from $v$-dependences to $n$-dependences [Lieb, 1983]. Legendre transforms, or conjugate functionals in the modern terminology, are very powerful and very general tools of convex analysis [Zeidler, 1986, volume III, chapter 51] or [Young, 1969, §45]. This chapter introduces a few important mathematical concepts needed for handling conjugate functionals and thus provides a basis to understand the next chapter.

The first section presents the basic idea of a Legendre transform on the most simple intuitive level, but general enough to show some features important in our context although usually not considered in classical physical applications. We are firmly convinced that working through the content of this first section is the minimum necessary for an understanding of what a Legendre transform really is. The subsequent sections introduce some notions of Banach space and of duality theory in convex functional analysis, which are necessary for a deeper understanding of Lieb's density functional theory. In order to establish the connection of this theory with the Kohn-Sham equations, the mathematical notion of the functional derivative is considered. The last section of this chapter generalizes the notion of a Lagrange multiplier to (a slightly specialized version of) Fenchel's duality, which might become useful in further developments and applications of density functional theory.

Three different approaches to studying this chapter are possible. Those who are not really interested in the mathematical concepts and want only to grasp the formal structure of Lieb's theory may carefully read the first section and cursorily scan through the rest of the chapter before passing to the next, more physical one. A more careful reading of the present chapter provides a more rigorous introduction to the next one. Finally, the chapter may be taken as a guide to those parts of convex functional analysis which are relevant in connection with the mathematics of density functional theory, and then e.g. the content and the bibliography of [Zeidler, 1986] may be used for further studies.


[Figure 9: graph of $l_\gamma$ and $f^*$ versus $x$; image not reproduced]

Figure 9: Supremum of affine-linear functions as explained in the text. The family is supposed to contain functions with unlimited positive slopes, all passing through the heavy dot. The supremum $f^*(x)$ (thick line) is equal to plus infinity for $x$-values right of the heavy dot.

For what follows, the reader is assumed to have an elementary understanding of normed linear spaces (for example, of the notion of a Hilbert space).

5.1 Elementary Introduction

Consider a family of affine-linear functions on the real line R

$$l_\gamma(x) = c_\gamma x - d_\gamma, \qquad \gamma \in \Gamma \neq \emptyset, \tag{5.1}$$

as shown in Fig. 9. $\Gamma$ is some index set, supposed not to be empty. For every $x \in \mathbb R$, define $f^*(x)$ as the supremum over the family $l_\gamma(x)$. In Fig. 9, $f^*(x)$ is shown as a thick line. Apparently, $f^*(x)$ may take on the value $+\infty$ at least on part of $\mathbb R$, if $c_\gamma$ is not bounded on $\Gamma$. It cannot take on the value $-\infty$, however, if $\Gamma$ is not empty. Hence, $f^*$ is finite from below:

$$f^*(x) \in \mathbb R \cup \{+\infty\}. \tag{5.2}$$

Moreover, $f^*$ is convex:

$$f^*(cx_1 + (1-c)x_2) \le c\,f^*(x_1) + (1-c)\,f^*(x_2) \quad\text{for every } c \in [0,1] \text{ and every pair } x_1, x_2 \in \mathbb R, \tag{5.3}$$

i.e., the graph of the function $f^*$ is never above a chord between two points of that graph. This follows immediately from the affine-linearity of the $l_\gamma$'s and from the simple property of a supremum of a sum not to exceed the sum of the respective independent suprema: $\sup_\gamma l_\gamma(cx_1 + (1-c)x_2) = \sup_\gamma\{c\,l_\gamma(x_1) + (1-c)\,l_\gamma(x_2)\} \le c\,\sup_\gamma l_\gamma(x_1) + (1-c)\,\sup_\gamma l_\gamma(x_2)$. Fig. 9 also shows that $f^*$ need not be continuous: it may jump to $+\infty$. Analyzing the way this jump appears in Fig. 9, one easily finds that $f^*$ takes on the lower value there. The function $f^*$ is lower semicontinuous:

$$f^*(\lim x_n) \le \liminf f^*(x_n) \quad\text{for every converging sequence } (x_n). \tag{5.4}$$

If $\lim x_n = x$, then $\lim_{n'} l_\gamma(x_{n'}) = l_\gamma(x)$ for every subsequence $(x_{n'})$ of the sequence $(x_n)$ because $l_\gamma$ is continuous. Now, $\sup_\gamma\{\lim_{n'} l_\gamma(x_{n'})\} \le \lim_{n'}(\sup_\gamma l_\gamma(x_{n'}))$, if the limits on both sides exist, is again a general property of the supremum, the proof of which is a simple exercise. (The limit on the right side exists only for properly chosen subsequences; this is the reason for considering subsequences here. Consider as an example the marked dot of Fig. 9. For any sequence $(x_n)$ converging to the $x$-value of that dot, subsequences $(f^*(x_{n'}))$ of the sequence $(f^*(x_n))$ converge only if all $x_{n'}$ are left of or equal to $\lim x_n$, the limit of $(f^*(x_{n'}))$ in this case being the ordinate value of the dot, or if all $x_{n'}$ are right of $\lim x_n$, the limit in this case being $+\infty$.) By definition, $\liminf(\sup_\gamma l_\gamma(x_n)) = c$ implies that there is a subsequence $(x_{n'})$ of $(x_n)$ for which $\lim(\sup_\gamma l_\gamma(x_{n'})) = c$. Taking this subsequence, all that together yields $f^*(\lim x_n) = \sup_\gamma\{l_\gamma(\lim x_n)\} = \sup_\gamma\{\lim_{n'} l_\gamma(x_{n'})\} \le \lim_{n'}(\sup_\gamma l_\gamma(x_{n'})) = \liminf f^*(x_n)$. (To be precise, what we have used is the definition of sequential lower semicontinuity, whereas lower semicontinuity is defined somewhat differently in general. We will, however, only meet situations where both notions coincide.)
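The convexity statement (5.3) for a supremum of affine-linear functions can be probed numerically. The following sketch makes its own assumptions: the family is finite and randomly generated (so $f^*$ stays finite everywhere, unlike the family of Fig. 9 with unbounded slopes), and the helper names are illustrative only.

```python
import random

def make_family(num_lines, seed=0):
    """Random finite family of (slope c_gamma, offset d_gamma) pairs, cf. (5.1)."""
    rng = random.Random(seed)
    return [(rng.uniform(-3, 3), rng.uniform(-2, 2)) for _ in range(num_lines)]

def f_star(x, family):
    """Pointwise supremum of l_gamma(x) = c_gamma*x - d_gamma over the family."""
    return max(c * x - d for c, d in family)

def convexity_violations(family, trials=1000, seed=1):
    """Count random triples (x1, x2, c) violating (5.3) beyond rounding error."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        x1, x2, c = rng.uniform(-5, 5), rng.uniform(-5, 5), rng.random()
        lhs = f_star(c * x1 + (1 - c) * x2, family)
        rhs = c * f_star(x1, family) + (1 - c) * f_star(x2, family)
        if lhs > rhs + 1e-12:
            bad += 1
    return bad

family = make_family(20)
violations = convexity_violations(family)
```

No violations are expected, whatever slopes and offsets are drawn, because the argument in the text uses only affine-linearity and the subadditivity of the supremum.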

As is seen from Fig. 9, not all functions $l_\gamma(x)$ are effective in defining $f^*(x)$. The supremum $f^*(x)$ does not change if the left lower parallel lines are omitted. Hence, if we define $f(c) = \inf\{d_\gamma \mid c_\gamma = c\}$ as the infimum of all negative ordinate sections $d_\gamma$ over functions $l_\gamma$ of the family with given slope $c_\gamma = c$ and put the infimum of the empty set equal to $+\infty$ (i.e., $f(c) = +\infty$ if the family $l_\gamma$ does not contain a function with slope $c$), then, for every slope $c$, $l_c(x) = cx - f(c)$ is the upper linear function in the family (5.1), and

$$f^*(x) = \sup_c\{cx - f(c)\}, \qquad f: \mathbb R \to \mathbb R \cup \{+\infty\}, \quad f \not\equiv +\infty. \tag{5.5}$$

The last assumption on $f$ not to be identically equal to $+\infty$ for all $c$ corresponds to the earlier presupposition of $\Gamma$ not to be empty. Note however, that the function $f(c)$, for which the above definition of $f^*(x)$ makes sense, is otherwise completely arbitrary and may be as wild as anybody can imagine; nevertheless

$$f^*(x) \in \mathbb R \cup \{+\infty\}, \quad f^* \not\equiv +\infty, \quad f^* \text{ is convex and lower semicontinuous.} \tag{5.6}$$

$f^*(x)$ as defined by (5.5) is called the Legendre transform or the conjugate function to $f(c)$. Directly from the definition of $f^*$ the generalized Young inequality

$$f^*(x) + f(c) \ge xc \tag{5.7}$$

follows. The classical Young inequality

$$xc \le |c|^p/p + |x|^q/q \quad\text{when } 1/p + 1/q = 1 \tag{5.8}$$

is obtained from $f(c) = |c|^p/p$, for which $f^*(x) = |x|^q/q$ is easily verified by determining the supremum (5.5) (which is a maximum in this case) from the zero of the first derivative with respect to $c$.
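The pair $f(c) = |c|^p/p$, $f^*(x) = |x|^q/q$ can also be confirmed by brute force, taking the supremum in (5.5) over a fine grid of $c$ values instead of solving for the zero of the derivative. This is a rough numerical sketch; the exponent $p = 3$, the grid bounds and the sample points are arbitrary choices made here.

```python
def legendre_on_grid(f, x, c_min=-5.0, c_max=5.0, steps=20000):
    """Approximate f*(x) = sup_c {c x - f(c)} by a maximum over a uniform c grid."""
    h = (c_max - c_min) / steps
    return max(x * (c_min + k * h) - f(c_min + k * h) for k in range(steps + 1))

p = 3.0
q = p / (p - 1.0)          # conjugate exponent, 1/p + 1/q = 1

def f(c):
    return abs(c) ** p / p

def f_star_exact(x):
    return abs(x) ** q / q

samples = [-2.0, -0.5, 0.0, 0.7, 1.5]
errors = [abs(legendre_on_grid(f, x) - f_star_exact(x)) for x in samples]
```

The grid bounds must contain the maximizing $c$ for every sample $x$ (here $|c| = |x|^{1/(p-1)} \le \sqrt 2$); otherwise the grid maximum underestimates the supremum.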

Given $x$, equality holds in (5.7) for those $c$ for which the affine-linear functions $l_c(x) = cx - f(c)$ pass through $(x, f^*(x))$, with

$$f^*(x') \ge f^*(x) + c(x' - x) \quad\text{for all } x' \in \mathbb R \tag{5.9}$$

being valid (cf. Fig. 9). Such a $c$ is called a subgradient of $f^*$ at $x$. If it is unique, it is obviously equal to the derivative $c = df^*/dx$. But it need not be unique. At the $x$-value of the heavy dot of Fig. 9 there are infinitely many subgradients to $f^*(x)$ of that figure: the slopes of all lines passing through that point and being otherwise below the graph of $f^*(x')$. Geometrically, a straight line touching the graph of $f^*(x')$ at $x$ and being otherwise below that graph is a tangent of support for $f^*$ at $x$. A subgradient is the slope of a tangent of support. The subdifferential $\partial f^*(x)$ is defined to be the set of all subgradients of $f^*$ at $x$. The subdifferential is the empty set, $\partial f^*(x) = \emptyset$, if no subgradient exists (e.g. if $f^*(x) = +\infty$ or if $f^*(x)$ were not convex). With those definitions we have

$$f^*(x) + f(c) = xc \iff c \in \partial f^*(x) \tag{5.10}$$

for all x, c ∈ R.

A simple relation between $f$ and $f^*$ follows directly from the definition (5.5):

$$f_1(c) \le f_2(c) \text{ for all } c \implies f_1^*(x) \ge f_2^*(x) \text{ for all } x. \tag{5.11}$$

It may become very useful for finding bounds.

Now we may consider the conjugate function $f^{**}(c) \overset{\mathrm{def}}{=} (f^*)^*(c)$ to $f^*(x)$:

$$f^{**}(c) = \sup_x\{xc - f^*(x)\} \le \sup_x\{f^*(x) + f(c) - f^*(x)\} = f(c). \tag{5.12}$$

In the second relation the inequality (5.7) was used. Due to the complete generality of (5.6) with respect to $f$, it is clear that $f^{**}$ has all properties (5.6); it is a lower semicontinuous convex minorant to $f$.

We will show that $f^{**}$ is the maximal lower semicontinuous convex minorant of $f$. To this end we need a completely natural but extremely important statement on convex functions:

For every point below the graph of a lower semicontinuous convex function there is a straight line separating the former from the latter, i.e. for which the graph of the convex function is entirely above the line, and the point is below.

This is just the simplest variant of the famous Hahn-Banach theorem, upon which, loosely speaking, half of twentieth century mathematics is based. To prepare for the general functional case, we give a formal proof: Consider the convex function $f(c)$, and pick $c_0$ and $a_0 < f(c_0)$. Pick $a$ so that $a_0 < a < f(c_0)$. For every $\alpha, \beta > 0$,

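$$\begin{aligned}
\beta a + \alpha a = (\alpha + \beta)\,a &< (\alpha + \beta)\,f(c_0) \\
&= (\alpha + \beta)\, f\!\left(\frac{\beta}{\alpha + \beta}(c_0 - \alpha) + \frac{\alpha}{\alpha + \beta}(c_0 + \beta)\right) \\
&\le (\alpha + \beta)\left(\frac{\beta}{\alpha + \beta}\,f(c_0 - \alpha) + \frac{\alpha}{\alpha + \beta}\,f(c_0 + \beta)\right) \\
&= \beta f(c_0 - \alpha) + \alpha f(c_0 + \beta),
\end{aligned}$$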

where except for elementary algebra only the definition of convexity was used. Division by $\alpha\beta > 0$ and rearrangement of terms yields

$$\frac{1}{\alpha}\left[-f(c_0 - \alpha) + a\right] < \frac{1}{\beta}\left[f(c_0 + \beta) - a\right].$$

Since α and β are completely independent,

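$$x_l \overset{\mathrm{def}}{=} \sup_\alpha\left[\frac{1}{\alpha}\bigl(-f(c_0 - \alpha) + a\bigr)\right] \le \inf_\beta\left[\frac{1}{\beta}\bigl(f(c_0 + \beta) - a\bigr)\right] \overset{\mathrm{def}}{=} x_r.$$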

From the lower semicontinuity of $f$ it follows that $\limsup_{\alpha\downarrow 0}[-f(c_0 - \alpha) + a] \le [-f(c_0) + a] < 0$. Hence $x_l < +\infty$; and $-\infty < x_r$ is obtained in the same manner. Geometrically, the above relation between $x_l$ and $x_r$ simply means that, from the point $(c_0, a)$ lying below the graph of the convex function $f(c)$, the slope of the tangent to the left part of this graph cannot exceed the slope of the tangent to the right. Now, there is $x$ with $x_l \le x \le x_r$ and $a_1$ with $a_0 < a_1 < a$ so that $f(c_0 - \alpha) > a_1 - x\alpha$, $f(c_0 + \beta) > a_1 + x\beta$ for all $\alpha, \beta > 0$. The graph of the affine-linear function $l(c) = a_1 + x(c - c_0)$ gives the wanted straight line ($f(c) > l(c)$ for all $c$). Note that $x$, the slope of the straight line, is always finite. It was necessary to take the point $(c_0, a)$ below $f(c_0)$ in order to get through the proof with $f(c)$ possibly taking on the value $+\infty$: the strict inequality was needed to exclude that both bounds $x_l$ and $x_r$ might be simultaneously $+\infty$ or $-\infty$.
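The construction of the separating line can be followed on a concrete case. The sketch below assumes the sample function $f(c) = c^2$ and the point $(c_0, a) = (0, -1)$, both chosen here for illustration, and approximates the supremum and infimum defining $x_l$ and $x_r$ by a grid over $\alpha, \beta \in (0, 10]$.

```python
def f(c):
    return c * c   # a convex, lower semicontinuous sample function (assumption)

c0, a = 0.0, -1.0                               # (c0, a) lies below the graph: a < f(c0) = 0
steps = [k / 1000 for k in range(1, 10001)]     # grid for alpha and beta in (0, 10]

# Slope bounds from the left and from the right, as in the definitions of x_l and x_r
x_l = max((a - f(c0 - alpha)) / alpha for alpha in steps)
x_r = min((f(c0 + beta) - a) / beta for beta in steps)

x = 0.5 * (x_l + x_r)      # any slope between x_l and x_r would do
a1 = a - 0.1               # a level slightly below a, playing the role of a_1

def line(c):
    """The separating affine-linear function l(c) = a1 + x*(c - c0)."""
    return a1 + x * (c - c0)

# Smallest vertical distance between graph and line on a test grid around c0
gap = min(f(c0 + k / 100) - line(c0 + k / 100) for k in range(-1000, 1001))
```

For this example the bounds come out as $x_l = -2$ and $x_r = 2$ (the slopes of the two tangents from the point $(0, -1)$ to the parabola), and the line stays strictly below the graph.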

Coming to our main goal, the property of $f^{**}$, suppose that $f$ be lower semicontinuous and convex (and $-\infty < f(c) \le +\infty$). Pick $c_0$, and pick $a_0 < f(c_0)$. For $a_0 < a < f(c_0)$, there is $l(c) = a + x_0(c - c_0)$ with $l(c) < f(c)$ everywhere. Hence, for all $c$, $cx_0 - f(c) < c_0x_0 - a$. The supremum over $c$ of the left side is $f^*(x_0)$. Thus, $f^*(x_0) \le c_0x_0 - a < c_0x_0 - a_0$, or $a_0 < x_0c_0 - f^*(x_0) \le f^{**}(c_0)$. Since $a_0$ may be chosen arbitrarily close to $f(c_0)$, this proves $f(c_0) \le f^{**}(c_0)$. But we know already $f^{**}(c_0) \le f(c_0)$, hence, as $c_0$ was arbitrary, $f^{**}(c) = f(c)$.

[Figure 10: panel a) $f$ versus $c$ and $f^*$ versus $x$; panel b) $f$, $f^{**}$ versus $c$ and $f^*$ versus $x$; image not reproduced]

Figure 10: Examples of mutual Legendre transforms. a) $f(c) = 0$ for $c = 1$ and $f(c) = +\infty$ otherwise; $f^*(x) = x$. b) $f(c) = 0$ for $c = -1, 1$ and $f(c) = +\infty$ otherwise (dashed); $f^*(x) = |x|$; $f^{**}(c) = 0$ for $c \in [-1, 1]$ and $f^{**}(c) = +\infty$ otherwise (dotted).

Considering again a general function $f(c)$ (not necessarily convex), let $f_0(c) \le f(c)$ everywhere and $f_0$ lower semicontinuous and convex. According to (5.11), $f_0^*(x) \ge f^*(x)$ everywhere and hence $f_0^{**}(c) \le f^{**}(c)$ everywhere. But $f_0^{**}(c) = f_0(c)$, hence $f_0(c) \le f^{**}(c)$ everywhere: $f^{**}$ is the maximal lower semicontinuous convex minorant to $f$, i.e. the lower semicontinuous convex hull of $f$, and

$$f^{**} \equiv f \iff f \text{ is lower semicontinuous and convex.} \tag{5.13}$$

In this latter case $f$ and $f^*$ are called mutual Legendre transforms or mutual conjugate functions.
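That $f^{**}$ is the convex hull of a non-convex $f$ can be seen numerically by applying the grid version of (5.5) twice. The double-well function $f(c) = (c^2 - 1)^2$ and the grids below are assumptions made for this sketch; they are not taken from the text.

```python
def f(c):
    return (c * c - 1.0) ** 2     # double well: not convex between the two minima

C_GRID = [k / 100 for k in range(-200, 201)]   # c in [-2, 2]
X_GRID = [k / 20 for k in range(-600, 601)]    # x in [-30, 30], covers the slopes of f on C_GRID
F_VALS = [f(c) for c in C_GRID]

# f*(x) = sup_c {c x - f(c)}, cf. (5.5), with the sup restricted to C_GRID
F_STAR = [max(c * x - fc for c, fc in zip(C_GRID, F_VALS)) for x in X_GRID]

def f_star_star(c):
    """f**(c) = sup_x {x c - f*(x)}, cf. (5.12), with the sup restricted to X_GRID."""
    return max(x * c - fs for x, fs in zip(X_GRID, F_STAR))

inside = f_star_star(0.0)     # between the wells: f(0) = 1, but the convex hull is 0
outside = f_star_star(1.5)    # outside the wells f is convex, so f** should equal f
```

Between the minima at $c = \pm 1$ the biconjugate is flat at the common minimum value, while outside it reproduces $f$, as (5.13) and the preceding discussion suggest.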

Let $f(c)$ be strictly convex, twice differentiable at $c$ and $f''(c) > 0$. Then $\partial f(c) = \{f'(c)\}$, and from a relation conjugate to (5.10), $f^*(f'(c)) + f(c) = cf'(c)$. Differentiating this equation with respect to $c$ one finds $f^{*\prime}(f'(c))\,f''(c) + f'(c) = f'(c) + cf''(c)$, hence

$$f^{*\prime}(f'(c)) = c. \tag{5.14}$$

This means $\partial f^*(f'(c)) = \{c\}$, that is, $f^*(x)$ is differentiable at $x = f'(c)$ with the derivative equal to $c$. (Check (5.14) for $f(c) = |c|^p/p$.)

To compare to an elementary well-known example, consider the Lagrange function $L(v) = mv^2/2$ of a free particle of mass $m$ and with velocity $v$, and as its Legendre transform the corresponding Hamilton function $H(p) = \sup_v\{pv - L(v)\}$. The condition for the extremum is $\partial(pv - L)/\partial v = 0$, yielding $p = L'(v) = mv$ and $H(p) = p^2/2m$. The relation (5.14) takes in this case the form $H'(L'(v)) = H'(p) = p/m = v$. Another typical example is given by the internal energy $E(S, V)$ as a function of entropy $S$ and volume $V$ and the free energy $F(T, V)$ as a function of temperature $T$ and volume, both being thermodynamic potentials and $E$ and $-F$ being mutual Legendre transforms for fixed $V$: $-F(T, V) = \sup_S\{TS - E(S, V)\}$, $E(S, V) = \sup_T\{ST + F(T, V)\}$. Hence, $E$ is convex in $S$ (and in $V$) and $F$ is concave in $T$ (and convex in $V$), and the conditions for the extrema are the well-known thermodynamic relations $S = -\partial F/\partial T$ and $T = \partial E/\partial S$. In most textbooks of thermodynamics you may read $F = E - TS$. But this does not tell you much since it must be understood as $F(T, V) = E(S(T, V), V) - TS(T, V)$ with $S(T, V)$ implicitly given by $T = \partial E/\partial S$. Two simple instructive non-standard examples of mutual Legendre transforms are given in Fig. 10.
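The free-particle example lends itself to a direct numerical check of both $H(p) = p^2/2m$ and the relation (5.14): the velocity at which the supremum is attained is $H'(p) = p/m$, and it satisfies $p = L'(v)$. The mass, the momentum and the velocity grid in this sketch are arbitrary assumptions.

```python
MASS = 2.0   # particle mass in arbitrary units (assumed value)

def lagrangian(v):
    return 0.5 * MASS * v * v

def hamiltonian(p, v_max=10.0, steps=20000):
    """Return (H(p), maximizing v) for H(p) = sup_v {p v - L(v)} on a velocity grid."""
    h = 2 * v_max / steps
    best_v = max((-v_max + k * h for k in range(steps + 1)),
                 key=lambda v: p * v - lagrangian(v))
    return p * best_v - lagrangian(best_v), best_v

p = 3.0
H, v_star = hamiltonian(p)
# Expected from the text: H = p**2 / (2*MASS), and the maximizer obeys p = MASS * v_star
```

The maximizing velocity is only resolved to within the grid spacing, so the comparison with $p/m$ holds to that accuracy.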

The material of this section was presented in such a way that little has to be modified if the real line $\mathbb R$ is replaced by a functional space and functions are replaced by functionals.

5.2 Prelude on Topology

In order to generalize analysis to functions of (possibly uncountably) infinitely many variables, the notion of a limit has to be put onto a broader basis. A sequence has a limit if its elements get arbitrarily close to the limiting point as the subscript increases. This is alternatively expressed by saying that every neighborhood of the limiting point contains all but finitely many of the elements of the sequence. A neighborhood of a point is a set containing that point as an inner point; and it is precisely the notion of an open set which relates its points as inner points of the set. This way we are led to the concept of open sets as underlying the notion of a limit.


A set $X$ of which all open subsets are distinguished is called a topological space. In order that this definition meets the notion in use, the following properties of the family $\mathcal T$ of all open subsets of $X$ are to be added:

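1. $\mathcal T$ is closed under finite intersections, i.e., if $A, B \in \mathcal T$, then $A \cap B \in \mathcal T$.

2. $\mathcal T$ is closed under unions, i.e., if $A_\gamma \in \mathcal T$ for all $\gamma$ of some index set $\Gamma$, then $\cup_{\gamma\in\Gamma} A_\gamma \in \mathcal T$.

3. $\emptyset \in \mathcal T$ and $X \in \mathcal T$.

4. Any two different points $x, y \in X$ are contained in two disjoint open sets $A, B \in \mathcal T$: $x \in A$, $y \in B$, $A \cap B = \emptyset$ (Hausdorff).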

$\mathcal T$ is the topology of the topological space $\{X, \mathcal T\}$. The family of all complements $X \setminus A$ of open sets $A$ in $X$ forms the family of all closed sets of the topological space. A subfamily $\mathcal B \subseteq \mathcal T$ is a base of the topology $\mathcal T$ if any $T \in \mathcal T$ can be obtained as $T = \cup_\gamma B_\gamma$ with $B_\gamma \in \mathcal B$. (We included the Hausdorff property in our definition because we consider only topologies having this property; it guarantees that, if a limit exists, it is unique.)

Take, for example, the set $X$ of all sequences $(c_i)$, $i = 0, 1, 2, \dots$ of complex numbers for which $\sum_i |c_i|^2 = 1$ (quantum states as expanded into a complete basis) and consider the families $\mathcal B_1 = \{B_{n,(z_i,\varepsilon_i)} = \{(c_i) \mid |c_i - z_i| < \varepsilon_i \text{ for } i < n,\ c_i \text{ arbitrary for } i \ge n\}\}$ (all open cylinders perpendicular to finitely many directions), $\mathcal B_2 = \{B_{(z_i),\varepsilon} = \{(c_i) \mid \sum_i |c_i - z_i|^2 < \varepsilon^2\}\}$ (all open $\varepsilon$-balls), $\mathcal B_3 = \{B_{(z_i)} = \{(z_i)\}\}$ (all sets consisting of a single point). They form bases of topologies $\mathcal T_1$, $\mathcal T_2$, $\mathcal T_3$, and for short we denote the corresponding topological spaces by $X_1 = \{X, \mathcal T_1\}$, $X_2 = \{X, \mathcal T_2\}$, and $X_3 = \{X, \mathcal T_3\}$. A point in $X$ is $(c_i)$, a sequence of points in $X$ is $((c_i)_j) \equiv (c_{ij})$. Consider the sequences $C_1$ and $C_2$ of points given by $C_1: c_{ij} = \delta_{ij}$, $C_2: c_{0j} = (1 - 1/j)^{1/2}$, $c_{ij} = 1/j$ for $i = 1, \dots, j$, $c_{ij} = 0$ for $i > j$. $C_1$ converges to $(0)$ in $X_1$, because any $B_{n,(0,\varepsilon_i)}$ contains all $(c_i)_j$ for $j \ge n$. It does not converge to $(0)$ nor to any other point in $X_2$, because $\sum_i (c_{ij} - c_{ik})^2 = 2$ for $j \neq k$. If any point of the sequence $C_1$ is in some $B_{(z_i),1/2}$ of $X_2$, then all the remaining points are outside of this ball of radius 1/2. Each component $c_i$ of $C_1$ converges individually, because $c_{ij} = 0$ for all $j > i$ and hence $\lim_{j\to\infty} c_{ij} = 0$, and this means convergence in $X_1$. For every $j$ there is, however, a component $c_i$ (with $i = j$) in $C_1$ which is away by one from the limiting value zero: no uniform convergence of all components of $(c_i)_j$ (for all $i$ simultaneously), i.e., no convergence in $X_2$. $C_2$ converges to $(\delta_{i0})$ in both $X_1$ and $X_2$, as is easily seen. None of the two sequences converges in $X_3$, because if a sequence of points converges to some $(z_i)$ in $X_3$, then there must be some $j_0$ so that all $(c_i)_j \in B_{(z_i)}$ for $j > j_0$. But this means that all $c_{ij} = z_i$ for $j > j_0$. (Convergence in a finite number of steps, so to say.)
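The different behavior of the two sequences can be made tangible with truncated vectors. In this sketch the infinite sequences are cut off after `M` components (an approximation introduced here) and only the quantities quoted in the text are evaluated: the mutual squared distance of points of $C_1$, the squared distance of points of $C_2$ from $(\delta_{i0})$, and one fixed component of $C_1$.

```python
import math

M = 2000   # number of components kept (truncation; an assumption of this sketch)

def c1(j):
    """Point j of the sequence C1: c_ij = delta_ij."""
    return [1.0 if i == j else 0.0 for i in range(M)]

def c2(j):
    """Point j of C2: c_0j = sqrt(1 - 1/j), c_ij = 1/j for 1 <= i <= j, else 0."""
    return [math.sqrt(1 - 1 / j) if i == 0 else (1 / j if i <= j else 0.0)
            for i in range(M)]

def dist2(a, b):
    """Squared distance sum_i |a_i - b_i|^2 that defines the epsilon-balls of X2."""
    return sum(abs(x - y) ** 2 for x, y in zip(a, b))

limit = [1.0 if i == 0 else 0.0 for i in range(M)]    # the point (delta_i0)

c1_gaps = [dist2(c1(j), c1(k)) for j, k in [(1, 2), (5, 50), (100, 1000)]]
c2_gaps = [dist2(c2(j), limit) for j in (10, 100, 1000)]
third_component = [c1(j)[3] for j in (1, 3, 4, 1000)]  # component i = 3 along C1
```

The entries of `c1_gaps` all equal 2, so no subsequence of $C_1$ can be Cauchy in $X_2$, whereas every fixed component, such as `third_component`, is eventually zero; the entries of `c2_gaps` shrink roughly like $1/j$.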

We have seen that convergence of a sequence is a weak statement in $X_1$, a stronger statement in $X_2$ and the strongest statement in $X_3$. As is easily seen from the above definition of a base of a topology, $\mathcal T_1 \subset \mathcal T_2 \subset \mathcal T_3$, or, in other words, the family of open sets of $X_1$ is a subset of the family of open sets of $X_2$, and that one is a subset of the family of open sets of $X_3$ (the latter comprising all sets of $X$ because every set is a union of points). With respect to their strength, topologies $\mathcal T$ are (partially) ordered by the subset relation. In $X_3$, every subset of $X$ is an open set (and therefore every subset is closed, too). Hence this is the strongest of all possible topologies. (It is called the discrete topology because, in a sense not discussed here, all points are disconnected from each other.) If we, on the other hand, denote by $\mathcal C_i$ the set of all converging sequences in $(X, \mathcal T_i)$, then $\mathcal C_1 \supset \mathcal C_2 \supset \mathcal C_3$. (Note the reversed order.)

One subtlety of an important issue considered in the next paragraphs requires a generalization of sequences. A directed system is an index set $\Gamma$, partially ordered (by $\prec$) in such a way that for every two $\gamma', \gamma'' \in \Gamma$ there is a common successor $\gamma \in \Gamma$, $\gamma' \prec \gamma$, $\gamma'' \prec \gamma$. A net or generalized sequence (or Moore-Smith sequence) is a set $(x_\gamma)_{\gamma\in\Gamma}$ indexed by a directed system $\Gamma$. A section of a net $(x_\gamma)$ is $(x_\gamma)_{\gamma_0 \prec \gamma}$ for some $\gamma_0$, i.e. that part of the net with subscripts following $\gamma_0$. A net $(x_{\gamma'})_{\gamma'\in\Gamma'}$ is a subnet of $(x_\gamma)_{\gamma\in\Gamma}$ if every section of the latter contains all but finitely many points of the former. As for ordinary sequences (being special nets), a net $(x_\gamma)$ converges to a limit $x$ in a topological space if every open set containing $x$ also contains all but finitely many of the $x_\gamma$. Unlike an ordinary sequence, a net may consist of uncountably many points, and that is why it is introduced. Nets are needed whenever the topology of the space $X$ at point $x$ cannot locally be characterized by a countable set of neighborhoods.

One important concept of topology is compactness: A set $C$ of a topological space $\{X, \mathcal T\}$ is compact if every open cover of $C$, i.e. $C \subseteq \cup_{U\in\mathcal U} U$, $\mathcal U \subseteq \mathcal T$, contains a finite subcover: $C \subseteq \cup_{i=1}^{n} U_i$, $U_i \in \mathcal U$. This is important because of the famous Bolzano-Weierstrass theorem: A set $C$ of a topological space is compact if and only if every net in $C$ has a convergent subnet. (The limit of a subnet is a cluster point of the net: every neighborhood of it contains infinitely many points of the net; however, unlike the case of the limit of the net itself, another infinite number of points may still be outside.) In $\mathbb R^N$ with its ordinary topology the closed unit ball and every closed bounded set is compact; conversely, every compact set is closed and bounded. It is a major difference for analysis that in infinite-dimensional normed spaces the closed unit ball is not compact. In the above examples, the sequence $C_1$, though bounded by the closed unit ball of $X_2$, does not have a subsequence converging in $X_2$.

In the following we will only meet two types of topological spaces: metric spaces and locally convex spaces. In a metric space a non-negative distance of points is defined for all pairs of points, which is non-zero if the points are different, and which obeys the ordinary triangle inequality. A base of its topology is the family of all $\varepsilon$-balls centered at all points. In a metric space, nets are unnecessary; they may be replaced by sequences in all statements. In a linear metric space the distance $d(x, y)$ is often defined via a norm:

$$d(x, y) = \|x - y\|, \tag{5.15}$$

where a norm is characterized as usual by the properties

$$\|\alpha x\| = |\alpha|\,\|x\|, \qquad \|x + y\| \le \|x\| + \|y\|, \qquad \|x\| = 0 \text{ if and only if } x = 0. \tag{5.16}$$

If the last property is abandoned, one speaks of a seminorm.

Unfortunately, normed linear spaces are often too narrow a frame in functional analysis. A locally convex space is by definition a linear topological space whose topology is given by a family of seminorms $\{p_j\}_{j\in J}$ such that

$$x = 0 \text{ if and only if } p_j(x) = 0 \text{ for all } j \in J. \tag{5.17}$$

(This property, and likewise the last property (5.16) of a norm, guarantees that the so defined topology is Hausdorff.) A base of this topology is

$$B_{n,(j_k),z,\varepsilon} = \{x \mid p_{j_k}(x - z) < \varepsilon,\ k = 1, \dots, n\}. \tag{5.18}$$

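The sets $B_{n,(j_k),z,\varepsilon}$ are absolutely convex, i.e. $x, y \in B_{n,(j_k),z,\varepsilon}$ and $\alpha \in [0, 1]$ implies $\alpha x + (1 - \alpha)y \in B_{n,(j_k),z,\varepsilon}$, and $x \in B_{n,(j_k),z,\varepsilon}$ and $|\alpha| \le 1$ implies $\alpha x + (1 - \alpha)z \in B_{n,(j_k),z,\varepsilon}$. This property gives this type of spaces its name.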

In the special case where the family of seminorms consists of a single norm, we are again left with a metric linear space whose metric is given by a norm. $X_1$ of the above given examples of topological spaces is a locally convex space whose topology was not given via a metric. Nevertheless it is metrizable by

$$d((c_i)_j, (c_i)_k) = \sum_{i=1}^{\infty} 2^{-i}\left[\frac{|c_{ij} - c_{ik}|}{1 + |c_{ij} - c_{ik}|}\right]. \tag{5.19}$$

This metric generates the same topology as the family of seminorms $p_l((c_i)_j, (c_i)_k) = |c_{lj} - c_{lk}|$ from which it is derived, but this metric cannot be derived from a single norm. However, countability of the family of seminorms was essential in this construction of the metric.⁶ In the general case of a locally convex space, the topology of which is not metrizable, nets are needed to exhaust all possible limiting processes.
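A small computation shows how the metric (5.19) encodes componentwise convergence. The sketch below truncates the sum after `M` terms and, as an assumption made here so that the component $i = 0$ is included, lets the index run over $i = 0, \dots, M-1$; it compares the distance of the points $c_{ij} = \delta_{ij}$ from the zero sequence in this metric with their ordinary Euclidean distance.

```python
M = 60   # components kept; neglected weights are below 2**-60 (truncation assumption)

def d_frechet(a, b):
    """Truncated metric of the type (5.19): sum_i 2^-i |a_i - b_i| / (1 + |a_i - b_i|)."""
    return sum(2.0 ** (-i) * abs(x - y) / (1 + abs(x - y))
               for i, (x, y) in enumerate(zip(a, b)))

def c1(j):
    """Point j of the sequence C1: c_ij = delta_ij."""
    return [1.0 if i == j else 0.0 for i in range(M)]

zero = [0.0] * M
indices = (1, 5, 20, 50)
metric_dists = [d_frechet(c1(j), zero) for j in indices]
l2_dists = [sum((x - y) ** 2 for x, y in zip(c1(j), zero)) ** 0.5 for j in indices]
```

The metric distances fall off like $2^{-(j+1)}$, while the Euclidean distances stay equal to one; this mirrors the statement that $C_1$ converges componentwise but not uniformly in all components.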

Another basic concept for sequences of real numbers one would like to maintain in a general situation is Cauchy's convergence criterion. In a metric space, a sequence $(x_i)$ is called Cauchy if for every $\varepsilon > 0$ there is an $i_\varepsilon$ so that $d(x_i, x_j) < \varepsilon$ for all $i, j > i_\varepsilon$. This is expressed by writing

$$\lim_{i,j\to\infty} d(x_i, x_j) = 0. \tag{5.20}$$

A metric space is called complete if every Cauchy sequence converges to a point of the space. (Every metric space may be completed by adding all Cauchy sequences not yet converging in it as additional points; the real line is obtained by completing the rational line this way.) A normed linear space complete in the metric of the norm is called a Banach space.
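The remark on completing the rational line can be illustrated with exact rational arithmetic. The Newton iteration for $\sqrt 2$ used below is a standard example chosen here, not one taken from the text: all iterates are rational, successive distances shrink rapidly, yet no iterate squares to 2, so the sequence has no limit within the rationals.

```python
from fractions import Fraction

def newton_sqrt2(steps):
    """Rational Newton iterates x -> (x + 2/x)/2 for x*x = 2, starting from 1."""
    x = Fraction(1)
    out = [x]
    for _ in range(steps):
        x = (x + 2 / x) / 2
        out.append(x)
    return out

xs = newton_sqrt2(6)
gaps = [abs(xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]   # distances of successive iterates
residuals = [abs(x * x - 2) for x in xs]                      # exact, never zero for a rational x
```

The first iterates are 1, 3/2, 17/12, 577/408; the residuals become tiny but stay strictly positive, which is exactly the gap that the completion fills with the real number $\sqrt 2$.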

In a locally convex space, a net $(x_\gamma)$ is accordingly called Cauchy if for every $\varepsilon > 0$ and every seminorm $p_j$ there is a section cut by $\gamma_{\varepsilon,j}$ so that $p_j(x_{\gamma'} - x_{\gamma''}) < \varepsilon$ in the section, i.e. for all $\gamma', \gamma'' \succ \gamma_{\varepsilon,j}$. A locally convex space is called complete if every Cauchy net converges. (It is, less demandingly, sequentially complete if every Cauchy sequence converges.)

The basic maps of topological spaces are continuous functions. Recall that a function $f: X \to Y$ is continuous if the preimage $f^{-1}(A)$ of every open set $A \subseteq Y$ is an open set of $X$. This is equivalent to saying that for every net $(x_\gamma)$ converging to $x$ in $X$ the net $(f(x_\gamma))$ converges to $f(x)$ in $Y$. Indeed, suppose that $(f(x_\gamma))$ does not converge to $f(x)$. This means that infinitely many points $f(x_\gamma)$ of the net are outside of some open neighborhood $U$ of $f(x)$. Hence, infinitely many points $x_\gamma$ are outside of $f^{-1}(U) \ni x$, implying that either $f^{-1}(U)$ is not open or $(x_\gamma)$ does not converge to $x$. Important examples of continuous functions from a linear topological space into the real line $\mathbb R$ with its ordinary topology given by open intervals of real numbers are the norm $x \mapsto \|x\|$ and the seminorm $p_j(x)$. In fact the topologies derived from a norm or a family of seminorms as above may alternatively be defined as the weakest topologies in which the norm or all seminorms, respectively, are continuous.

⁶ Complete metric spaces of that type are called Fréchet spaces. They play a fundamental role in the theory of generalized functions as well as in the modern theory of analytic complex functions. The relation between the considered topological linear spaces is: locally convex spaces ⊃ Fréchet spaces ⊃ Banach spaces ⊃ Hilbert spaces ⊃ finite-dimensional Euclidean spaces.

(A bijective map $f : X \to Y$ is called a homeomorphism if both $f$ and $f^{-1}$ are continuous functions. If it exists, then both spaces $X$ and $Y$ are equivalent with respect to their topologies. Hence continuous functions introduce into the class of topological spaces the structure of an algebraic category of objects and morphisms; see e.g. [Lang, 1965, §7].)

5.3 Prelude on Lebesgue Integral

In linear functional spaces a norm is usually defined by some integral over a real-valued function. In order to obtain a complete metric space with points closest to the ordinary notion of function, a broader notion of integral than that of Riemann is necessary. We will, however, not consider anything close to the most general case, as we need only the Lebesgue integral over real- or complex-valued functions on the real $N$-dimensional Euclidean space $\mathbb{R}^N$. This starts with non-negative real-valued functions $f(x)$, where the contribution of function values above some positive number $y$ to the integral is estimated from below by the value $y$ times the measure of the domain on which $f(x) > y$, and this measure has to be defined.

The simplest measurable subset of $\mathbb{R}^N$ is an open brick $A$ (open in the topology of the Euclidean metric, hence $N$-dimensional). Its measure $\mu(A)$ is naturally equal to the product of the edge lengths of $A$. The next step is the measure of countable unions of disjoint open bricks, defined as

$$\mu\Bigl(\bigcup_{i=1}^{\infty} A_i\Bigr) \overset{\text{def}}{=} \sum_{i=1}^{\infty} \mu(A_i), \quad A_i \cap A_j = \emptyset \ \text{for}\ i \neq j, \tag{5.21}$$

where the sum may be $+\infty$. We denote the family of countable unions of disjoint open bricks by $\mathcal{A}$.

This suffices to define the Riemann integral. The Lebesgue measure is, however, defined for a much broader family of sets called the Borel sets, defined as the smallest family $\mathcal{B}$ of subsets of $\mathbb{R}^N$ with the properties

1. $\mathcal{B}$ contains all open bricks of $\mathbb{R}^N$,

2. $\mathcal{B}$ is closed under countable unions,

3. $\mathcal{B}$ is closed under complements in $\mathbb{R}^N$.

Hence $\mathcal{B}$ contains countable unions of complements of countable unions of complements of ... of bricks. This comprises practically all conceivable subsets of $\mathbb{R}^N$. Although the existence of non-Borel subsets of $\mathbb{R}^N$ can be proved, there is hardly any idea of what they actually look like. For every $B \in \mathcal{B}$,

$$\mu(B) \overset{\text{def}}{=} \inf_{B \subseteq A \in \mathcal{A}} \mu(A). \tag{5.22}$$

Here, an essential point is that the infimum is not merely over finite unions of bricks as intuition would suggest, but over countable unions. This Lebesgue measure has all the defining properties of a general regular measure on $\mathcal{B}$:

1. $\mu(\emptyset) = 0$,

2. $\mu$ is $\sigma$-additive, i.e. for a countable family $\{B_i\}_{i=1}^{\infty} \subseteq \mathcal{B}$ of mutually disjoint sets, $\mu(\cup B_i) = \sum \mu(B_i)$,

3. $\mu(B) = \inf\{\mu(O) \mid B \subseteq O,\ O \text{ open}\}$,

4. $\mu(B) = \sup\{\mu(C) \mid C \subseteq B,\ C \text{ compact}\}$.

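A real-valued function $f$ is called measurable if for every open interval $I = (a, b)$ the set $f^{-1}(I)$ is measurable (i.e. $f^{-1}(I) \in \mathcal{B}$). In this case, $f_+ = \max\{f, 0\}$ and $f_- = \max\{-f, 0\}$ are also measurable. Because of the third property of $\mathcal{B}$, for every closed or semiclosed interval $I'$, $f^{-1}(I')$ is measurable too. Let $n \ge 0$ be an integer; then

$$\int d^N x\, f_+(x) \overset{\text{def}}{=} \lim_{n\to\infty} \left\{ \sum_{m=0}^{\infty} \frac{m}{2^n}\, \mu\!\left( f_+^{-1}\!\left( \left[ \frac{m}{2^n}, \frac{m+1}{2^n} \right) \right) \right) \right\},$$

$$\int d^N x\, f(x) \overset{\text{def}}{=} \int d^N x\, f_+(x) - \int d^N x\, f_-(x). \tag{5.23}$$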

Because the expression in braces in the first line is non-negative and, as is easily seen, non-decreasing as a function of $n$, the limit exists (it may be $+\infty$). The definition in the second line is only valid if at least one of the two terms on the right side is finite. For a complex-valued function the integral is defined by separate integration over the real and imaginary parts.

The meaning of this definition is to divide the range of $f$ into intervals of width $1/2^n$ and to look for the sets in the domain of $f$ on which the function values fall into a given interval. The contribution of each such set to the integral is then estimated from below with the help of the Lebesgue measure of the set ($m/2^n$ estimates the function value from below on the set). Call this estimate (the expression in braces) $I_n$. As was already stated, $I_{n+1} \ge I_n$. If $I_n$ as a function of $n$ is not bounded, then its limit is $+\infty$, and the integral is defined to be $+\infty$. If $I_n$ is bounded, then, as a monotone sequence, it has a finite limit $I$.

Consider a domain $\Omega \subset \mathbb{R}^N$ of finite measure $\mu(\Omega)$, and replace the measure in the expression in braces of (5.23) by $\mu(\Omega \cap f_+^{-1}([m/2^n, (m+1)/2^n)))$. Call the expression in braces now $I_n(\Omega)$; it estimates the integral over the domain $\Omega$ from below. Replacing the prefactor $m/2^n$ by $(m+1)/2^n$ now estimates the function value from above. Call this expression $J_n(\Omega)$. Now, $J_{n+1}(\Omega) \le J_n(\Omega)$ by considering one step of refinement. It is easily seen that $I_n(\Omega) \le J_n(\Omega) \le I_n(\Omega) + \mu(\Omega)/2^n$, thus both estimates have the same limit, equal to $\int_\Omega d^N x\, f_+(x)$. For every increasing sequence $(\Omega_i)$, $\Omega_i \subseteq \Omega_{i+1}$, $\cup_i \Omega_i = \mathbb{R}^N$, the limit of $\int_{\Omega_i} d^N x\, f_+(x)$ is equal to the first line of (5.23), as is easily seen. This justifies the definition (5.23).
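The two estimates $I_n(\Omega)$ and $J_n(\Omega)$ can be evaluated explicitly in a simple case. The sketch below assumes $N = 1$, $\Omega = [0, 1]$ and $f(x) = x^2$, for which the measure of the set where $m/2^n \le f(x) < (m+1)/2^n$ is $\sqrt{(m+1)/2^n} - \sqrt{m/2^n}$; the function name `lower_upper_sums` is hypothetical and the example is not taken from the text.

```python
import math

def lower_upper_sums(n):
    """Return (I_n, J_n) for f(x) = x**2 on the domain [0, 1]."""
    step = 1.0 / 2 ** n
    lower = 0.0
    upper = 0.0
    for m in range(2 ** n):
        # Lebesgue measure of {x in [0, 1] : m*step <= x**2 < (m+1)*step}
        measure = math.sqrt((m + 1) * step) - math.sqrt(m * step)
        lower += m * step * measure          # prefactor m / 2**n
        upper += (m + 1) * step * measure    # prefactor (m + 1) / 2**n
    return lower, upper

sums = [lower_upper_sums(n) for n in range(11)]
exact = 1.0 / 3.0   # value of the integral of x**2 over [0, 1]
```

In this instance the lower estimates increase, the upper estimates decrease, and their difference equals $\mu(\Omega)/2^n = 2^{-n}$, so both close in on the value $1/3$.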

This Lebesgue integral has all the ordinary properties of an integral and coincides with the Riemann integral for all functions for which the latter is defined. The value of the integral remains unchanged if the function value $f(x)$ is arbitrarily changed for $x$-points forming a set of zero Lebesgue measure in $\mathbb{R}^N$. (Recall that the Lebesgue measure in $\mathbb{R}^N$ depends by definition on $N$: e.g. the measure of the set of all points on the $x_1$-axis of $\mathbb{R}^2$ is zero, whereas the measure in $\mathbb{R}^1$ of this set as a subset of $\mathbb{R}^1$ is infinite; the first is an area with zero extension in one direction, whereas the second is an infinite interval length.) While for the Riemann integral only changes of the function values at nowhere dense points are harmless, there are plenty of dense sets of Lebesgue measure zero, e.g. the set of all rational points of $\mathbb{R}^1 = \mathbb{R}$. There are even plenty of uncountable sets of Lebesgue measure zero, e.g. the well-known Cantor set. This only indicates how far one has to depart from intuition if one wants to treat all Cauchy sequences of functions as functions (see next section).
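That the rational points form a set of measure zero follows from covering the $i$-th rational of an enumeration by an interval of length $\varepsilon/2^i$. The following sketch carries this out for finitely many rationals of $[0, 1]$ in exact arithmetic; the enumeration order and the names `rationals_in_unit_interval` and `total_cover_length` are assumptions made for illustration.

```python
from fractions import Fraction

def rationals_in_unit_interval(max_denominator):
    """List the distinct rationals p/q in [0, 1] with q <= max_denominator."""
    seen = set()
    ordered = []
    for q in range(1, max_denominator + 1):
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                ordered.append(r)
    return ordered

def total_cover_length(points, eps):
    """Total length of intervals of length eps / 2**i around the i-th point."""
    return sum(eps / 2 ** i for i in range(1, len(points) + 1))

points = rationals_in_unit_interval(10)
eps = Fraction(1, 1000)
cover = total_cover_length(points, eps)
```

However many rationals are enumerated, the total length of the cover stays below `eps`, which may be chosen arbitrarily small.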

A statement is said to hold true almost everywhere (a.e.) if it holds true for all points but a set of zero measure.


5.4 Banach Space

As was already stated in Section 5.2, a Banach space is defined as a normed linear space complete in the metric of the norm. Besides this definition, there is one important property which is decisive for a normed linear space to be Banach.

Let $X$ be a normed linear space. Then the following two statements are equivalent:

• $X$ is a Banach space.

• $x_n \in X$ and $\sum_{n=0}^{\infty} \|x_n\| < \infty \implies \sum_{n=0}^{\infty} x_n = s \in X$.

The sequence $(x_n)$ of the second statement is called absolutely summable. To prove this theorem, first let $X$ be Banach, and let $\sum \|x_n\| < \infty$. Denote $\sum_{n=0}^{m} x_n$ by $s_m$. We must prove $\lim s_m = s \in X$. Because of the convergence of the series of norms, there is a $p(\varepsilon)$ for every $\varepsilon > 0$ so that for every $m \ge p(\varepsilon)$ and for every $k$, $\sum_{n=m+1}^{m+k} \|x_n\| \le \varepsilon$ and hence $\|\sum_{n=m+1}^{m+k} x_n\| \le \varepsilon$. This latter relation means $\|s_{m+k} - s_m\| \le \varepsilon$, that is, $(s_m)$ is Cauchy, and hence it converges to some $s \in X$. Now let the second statement of the theorem be true. This time we have to prove that every Cauchy sequence converges in $X$. Let $(x_n)$ be Cauchy; this means that for every integer $k \ge 0$ there is a $p(k)$ so that for $m, n \ge p(k)$, $\|x_m - x_n\| \le 1/2^k$. We choose $p(k+1) > p(k)$ for every $k$ and consider the series $x_{p(0)} + \sum_{k=0}^{\infty} (x_{p(k+1)} - x_{p(k)})$, whose partial sums are the members $x_{p(k)}$ of a subsequence. It converges in $X$ by our assumption, because the corresponding series of norms converges. Its limit equals the limit of the Cauchy sequence $(x_n)$, which hence exists in $X$. This proves that $X$ is Banach.

The simplest Banach space is $\mathbb{R}$ with the norm $\|x\| = |x|$ given by the absolute value. If we were to confine ourselves to this trivial case only, we would have been done with Section 5.1. A simple generalization would be $\mathbb{R}^N$. The goal of this whole chapter is to transfer all the content of Section 5.1 to incomparably richer structures. It may literally be transferred to any Banach space one can invent; we will, however, only consider $L^p$ spaces, the points of which may be characterized by Lebesgue measurable functions.

Let $p \ge 1$ and let $L^p(\mathbb{R}^N)$ be the set of Lebesgue measurable real- or complex-valued functions $f : \mathbb{R}^N \to \mathbb{R}$ or $\mathbb{C}$ for which

$$\int d^N x\, |f(x)|^p < \infty. \tag{5.24}$$


Gather all functions coinciding almost everywhere on $\mathbb{R}^N$ with a given $f \in L^p(\mathbb{R}^N)$ into one equivalence class. (If $g(x) = f(x)$ a.e. and $h(x) = f(x)$ a.e., then $g(x) = h(x)$ a.e. since the union of two sets of zero measure has zero measure; hence $g(x) = f(x)$ a.e. is indeed an equivalence relation.) The obtained equivalence class is again denoted by $f$ and called a $p$-summable function (although it is in fact a class of functions). Obviously those equivalence classes are compatible with addition of functions (compatible means that, if we take any two functions out of two given equivalence classes and add them, the result will always be in one and the same equivalence class; this is again true because the union of two sets of zero measure has zero measure). The equivalence classes are also compatible with multiplication of a function by a real or complex number. Furthermore, the condition (5.24) is stable under linear combinations. Hence, $p$-summable functions form a linear space over the scalar field of real or complex numbers. This space, equipped with the norm

$$\|f\|_p \overset{\text{def}}{=} \left( \int d^N x\, |f(x)|^p \right)^{1/p}, \quad 1 \le p < \infty, \tag{5.25}$$

is a Banach space, the (real or complex) Lebesgue space $L^p(\mathbb{R}^N)$, or in short $L^p$, of $p$-summable functions.

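To be indeed a norm, (5.25) must have the three properties (5.16), the first of which is trivially fulfilled for every scalar (i.e. constant on $\mathbb{R}^N$) $\alpha$. Starting with Young's inequality (5.8), one has for $f \in L^p$, $g \in L^q$, $1/p + 1/q = 1$ (and hence $p, q > 1$) with $\|f\|_p, \|g\|_q \ne 0$

$$\int d^N x \left| \frac{f(x)}{\|f\|_p} \frac{g(x)}{\|g\|_q} \right| \le \int d^N x \left( \frac{|f(x)|^p}{p\, \|f\|_p^p} + \frac{|g(x)|^q}{q\, \|g\|_q^q} \right) = \frac{1}{p} + \frac{1}{q} = 1,$$

hence

$$\int d^N x\, |f(x) g(x)| \le \|f\|_p \|g\|_q \quad \text{for } 1/p + 1/q = 1. \tag{5.26}$$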


This is Hölder's inequality. If at least one of the norms on the right side of the inequality is zero, then $f(x)g(x) = 0$ a.e., hence the inequality is valid also in this case. Next, because of $|f + g| \le |f| + |g|$, for $f, g \in L^p$,

$$\int d^N x\, |f(x) + g(x)|^p \le \int d^N x\, |f(x)| \cdot |f(x) + g(x)|^{p-1} + \int d^N x\, |g(x)| \cdot |f(x) + g(x)|^{p-1}.$$

Applying Hölder's inequality to both right-hand integrals yields

$$\|f + g\|_p^p \le \|f\|_p\, \bigl\| |f + g|^{p-1} \bigr\|_q + \|g\|_p\, \bigl\| |f + g|^{p-1} \bigr\|_q.$$

Since $(p - 1)q = p$, the common right-hand factor is $\bigl\| |f + g|^{p-1} \bigr\|_q = (\int d^N x\, |f + g|^{(p-1)q})^{1/q} = (\int d^N x\, |f + g|^p)^{(1/p)(p/q)} = \|f + g\|_p^{p/q}$ (wherefore it exists on grounds of our presupposition on $f$ and $g$). Considering finally $p - p/q = 1$, the norm property

$$\|f + g\|_p \le \|f\|_p + \|g\|_p \tag{5.27}$$

is obtained, which in the case of Lebesgue space is called Minkowski's inequality. It also holds true for $p = 1$, which is immediately seen from the definition of the $L^1$-norm. The last property (5.16) of a norm is guaranteed by the convention not to distinguish in $L^p$ between a function $f$ non-zero on a set of zero measure only and the function $f \equiv 0$.
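Both inequalities can be checked numerically in a discrete setting. The sketch below is an illustration only: it assumes finite sequences with sums in place of the Lebesgue integral (counting measure on a finite index set), and the helper `p_norm` as well as the random test data are hypothetical choices.

```python
import random

def p_norm(values, p):
    """Discrete analogue of (5.25): (sum |v|**p) ** (1/p)."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)

random.seed(0)
f = [random.uniform(-1.0, 1.0) for _ in range(200)]
g = [random.uniform(-1.0, 1.0) for _ in range(200)]

p = 3.0
q = p / (p - 1.0)   # conjugate exponent, 1/p + 1/q = 1

# Hoelder's inequality (5.26)
holder_lhs = sum(abs(a * b) for a, b in zip(f, g))
holder_rhs = p_norm(f, p) * p_norm(g, q)

# Minkowski's inequality (5.27)
minkowski_lhs = p_norm([a + b for a, b in zip(f, g)], p)
minkowski_rhs = p_norm(f, p) + p_norm(g, p)
```

Other exponents $p > 1$ or other data may be substituted; the left sides never exceed the right sides.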

Finally, the proof of completeness of $L^p$, i.e.

$$f_i \in L^p \ \text{and} \ \lim_{i,j\to\infty} \|f_i - f_j\|_p = 0 \implies \lim_{i\to\infty} f_i = f \in L^p, \tag{5.28}$$

may be found in standard textbooks on functional analysis (Riesz-Fischer theorem).

Let $f$ be an equivalence class (in the above sense) of Lebesgue measurable functions on $\mathbb{R}^N$ such that $f(x)$ is a.e. bounded. This property is apparently stable under linear combinations, and hence these classes form a linear space. Consider

$$\|f\|_\infty \overset{\text{def}}{=} \operatorname{ess\,sup} |f(x)| \overset{\text{def}}{=} \inf\{ c \mid |f(x)| \le c \ \text{a.e.} \}. \tag{5.29}$$

This is a norm by the properties of an absolute value and because a union of two sets of zero measure has zero measure. Due to $\sigma$-additivity of a measure, even a countably infinite union of sets of zero measure has zero measure, and hence the presupposition of (5.28) for $p = \infty$ means $\lim_{i,j\to\infty} |f_i(x) - f_j(x)| = 0$ for almost all $x$. This implies $\lim f_i(x) = f(x)$ for almost all $x$ and some $f(x)$, since $\mathbb{R}$ or $\mathbb{C}$, respectively, is complete. Choose a subsequence $f_{i'}$ of the Cauchy sequence $f_i$ so that $\|f_{i'} - f_{i'+1}\|_\infty < 2^{-i'}$. Then, $1 > \sum_{i'=1}^{\infty} \|f_{i'} - f_{i'+1}\|_\infty \ge \sum_{i'=1}^{\infty} |f_{i'}(x) - f_{i'+1}(x)| \ge |\sum_{i'=1}^{\infty} (f_{i'}(x) - f_{i'+1}(x))| = |f_{1'}(x) - f(x)|$ for almost all $x$, and hence $\|f\|_\infty < \|f_{1'}\|_\infty + 1$. We have shown that (5.28) extends to $p = \infty$, and the normed space so defined, denoted by $L^\infty(\mathbb{R}^N)$ or $L^\infty$ for short, is a Banach space. As is easily seen, even (5.26) extends to the case $p = 1$ and $q = \infty$.

Note that, as another special case, $L^2(\mathbb{R}^N)$ is a Hilbert space with the scalar product $(f|g) = \int d^N x\, f^*(x) g(x)$. (It contains the usual Hilbert space of quantum mechanics as a subspace.) For this case, (5.26) implies, according to $|(f|g)| = |\int d^N x\, f^*(x) g(x)| \le \int d^N x\, |f(x) g(x)| \le \|f\|_2 \|g\|_2$, the well-known Schwarz inequality

$$|(x|y)| \le \|x\|\, \|y\| \tag{5.30}$$

valid in every Hilbert space. (A Hilbert space is just a special case of a Banach space whose norm is given via a scalar product.)

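Let $p \le t \le p'$ and $\int d^N x\, |f|^p < \infty$, $\int d^N x\, |f|^{p'} < \infty$. Define the complementary sets $A = \{x \mid |f(x)| \le 1\}$ and $B = \{x \mid |f(x)| > 1\}$. Both sets are measurable, and, since $|f|^t \le |f|^p$ on $A$ and $|f|^t \le |f|^{p'}$ on $B$, $\int d^N x\, |f|^t \le \int_A d^N x\, |f|^p + \int_B d^N x\, |f|^{p'} < \infty$. Hence,

$$f \in L^p \ \text{and} \ f \in L^{p'} \implies f \in L^t \ \text{for} \ p \le t \le p'. \tag{5.31}$$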

Note that $L^p \cap L^{p'}$ is again a Banach space with respect to the norm

$$\|f\|_{p \ldots p'} \overset{\text{def}}{=} \sup_{p \le t \le p'} \|f\|_t. \tag{5.32}$$

The first and third properties (5.16) of a norm are evidently fulfilled for this norm. Furthermore, $\|f_1 + f_2\|_{p \ldots p'} = \sup_{p \le t \le p'} \|f_1 + f_2\|_t \le \sup_{p \le t \le p'} (\|f_1\|_t + \|f_2\|_t) \le \|f_1\|_{p \ldots p'} + \|f_2\|_{p \ldots p'}$. Finally, a sequence which is Cauchy in the norm $\|\cdot\|_{p \ldots p'}$ is evidently Cauchy in every norm $\|\cdot\|_t$ for $p \le t \le p'$. Hence it converges to an $f \in L^t$ for every $t$ with $p \le t \le p'$.

Next, consider the family of Banach spaces $L^p(T^N)$, $1 \le p \le \infty$, on the $N$-dimensional torus of finite total measure $L^N$, defined by periodic boundary conditions $x_i \equiv x_i + L$, $i = 1, \dots, N$, cf. (1.28). Let $f \in L^p(T^N)$, i.e., $\int_{T^N} d^N x\, |f|^p < \infty$, and let again $A = \{x \mid |f(x)| \le 1\}$ and $B = \{x \mid |f(x)| > 1\}$. This time, $\int_A d^N x\, |f|^t \le L^N < \infty$ for every $t \ge 1$, and $\int_B d^N x\, |f|^t \le \int_B d^N x\, |f|^p$ for every $1 \le t \le p$. Hence,

$$f \in L^p(T^N) \implies f \in L^t(T^N) \ \text{for all} \ 1 \le t \le p. \tag{5.33}$$


In particular, $f \in L^\infty(T^N) \implies f \in L^t(T^N)$ for all $t \ge 1$.

Finally, consider the linear hull of the spaces $L^q(\mathbb{R}^N)$ and $L^{q'}(\mathbb{R}^N)$, that is, the space of equivalence classes of Lebesgue measurable functions $f(x)$ which may be represented as $f = g + h$, $g \in L^q(\mathbb{R}^N)$, $h \in L^{q'}(\mathbb{R}^N)$. This space is denoted by $L^q(\mathbb{R}^N) + L^{q'}(\mathbb{R}^N)$. It is a Banach space with the norm

$$\|f\|_{(qq')} \overset{\text{def}}{=} \inf\{ \|g\|_q + \|h\|_{q'} \mid g + h = f \}. \tag{5.34}$$

The infimum is taken over all possible decompositions of $f$ into a sum $g + h$. Again, the first and the third property (5.16) are evident for this norm. Moreover, $\|f_1 + f_2\|_{(qq')} = \inf\{\|g_1 + g_2\|_q + \|h_1 + h_2\|_{q'} \mid g_1 + g_2 + h_1 + h_2 = f_1 + f_2\} \le \inf\{\|g_1\|_q + \|g_2\|_q + \|h_1\|_{q'} + \|h_2\|_{q'} \mid g_1 + g_2 + h_1 + h_2 = f_1 + f_2\} \le \inf\{\|g_1\|_q + \|h_1\|_{q'} \mid g_1 + h_1 = f_1\} + \inf\{\|g_2\|_q + \|h_2\|_{q'} \mid g_2 + h_2 = f_2\} = \|f_1\|_{(qq')} + \|f_2\|_{(qq')}$. For every absolutely summable series $\sum \|f_n\|_{(qq')} < \infty$, by definition of the norm (5.34) there are $g_n, h_n$ with $g_n + h_n = f_n$ so that $\|g_n\|_q < 2\|f_n\|_{(qq')}$, $\|h_n\|_{q'} < 2\|f_n\|_{(qq')}$ and hence $\sum \|g_n\|_q < \infty$, $\sum \|h_n\|_{q'} < \infty$. Since $L^q$ and $L^{q'}$ are Banach, it follows that $\sum g_n = g \in L^q(\mathbb{R}^N)$, $\sum h_n = h \in L^{q'}(\mathbb{R}^N)$ and therefore $\sum f_n = \sum (g_n + h_n) = g + h \in L^q(\mathbb{R}^N) + L^{q'}(\mathbb{R}^N)$. Thus we have shown that every absolutely summable series converges in that space, and hence $L^q(\mathbb{R}^N) + L^{q'}(\mathbb{R}^N)$ is a Banach space.

Let $X$ and $Y$ be two normed linear spaces, and consider bounded linear operators $A : X \to Y$ defined on $X$ and having values in $Y$ so that

$$\|A\| \overset{\text{def}}{=} \sup_{x \in X,\ x \ne 0} \frac{\|Ax\|_Y}{\|x\|_X} < \infty. \tag{5.35}$$

It is easily seen that $A$ is continuous at $x = 0$ and hence, due to the linearity of $X$, $Y$ and $A$, on all of $X$. Conversely, continuity of the linear operator means that the $x$-points for which $\|Ax\|_Y < \varepsilon$ must form an open neighborhood of $x = 0$. That is, there must be a $\delta(\varepsilon) > 0$ so that for all $x$ for which $\|x\|_X < \delta(\varepsilon)$ holds, $\|Ax\|_Y < \varepsilon$, hence $\|A\| \le \varepsilon/\delta(\varepsilon) < \infty$. Therefore bounded linear operator and continuous linear operator from a normed linear space into a normed linear space mean the same thing. Linear combinations of bounded linear operators are again bounded linear operators, and it is not hard to see that (5.35) has all properties of a norm. Thus, all bounded linear operators from a normed linear space $X$ into a normed linear space $Y$ form again a normed linear space, which is denoted by $L(X, Y)$, and which is a Banach space if $Y$ is a Banach space. (Note that it was not presupposed that $X$ is a Banach space. The comparatively simple proof of this statement can again be found in textbooks or treated as an exercise.)
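In finite dimensions the supremum in (5.35) can be evaluated in closed form for some norms. The sketch below assumes $X = \mathbb{R}^3$ and $Y = \mathbb{R}^2$, both with the 1-norm, for which the operator norm of a matrix is known to be its largest absolute column sum; the matrix `A` and the helper names are hypothetical.

```python
import random

def norm1(v):
    """1-norm of a finite vector."""
    return sum(abs(c) for c in v)

def apply(matrix, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * c for a, c in zip(row, v)) for row in matrix]

A = [[1.0, -2.0, 0.5],
     [0.0, 3.0, -1.0]]

# For the 1-norm on both spaces, sup ||Ax|| / ||x|| is the maximal column sum.
column_sums = [sum(abs(row[j]) for row in A) for j in range(3)]
op_norm = max(column_sums)

# Random sampling never exceeds this value ...
random.seed(1)
ratios = []
for _ in range(1000):
    x = [random.uniform(-1.0, 1.0) for _ in range(3)]
    ratios.append(norm1(apply(A, x)) / norm1(x))

# ... and the unit vector of the heaviest column attains it.
attained = norm1(apply(A, [0.0, 1.0, 0.0]))
```

Here the supremum is attained; in infinite-dimensional spaces it need not be.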


5.5 Dual Space

A bounded linear functional is a bounded linear operator from a normed linear space $X$ to the real line or complex plane, respectively, depending on whether $X$ is real or complex. We will only consider the real case; the complex case is a straightforward generalization. The (topological) dual to a normed space $X$ is the Banach space

$$X^* \overset{\text{def}}{=} L(X, \mathbb{R}). \tag{5.36}$$

(It is always a Banach space because $\mathbb{R}$ is a Banach space.)

Let $1 \le p \le \infty$ and $1/p + 1/q = 1$. Let $g(x) \in L^q$. From Hölder's inequality (5.26) it is clear that

$$\hat g f = (g|f) \overset{\text{def}}{=} \int d^N x\, g(x) f(x) \quad \text{for every } f \in L^p \tag{5.37}$$

defines a bounded linear functional $\hat g$ on $L^p$ whose norm $\|\hat g\| \le \|g\|_q$ since $|\hat g f| / \|f\|_p \le \|g\|_q$. Since $f(x) = \operatorname{sign}(g(x))\, |g(x)|^{q-1} \in L^p$ with $\bigl\| |g|^{q-1} \bigr\|_p = \|g\|_q^{q/p}$, equality holds for this $f$ in Hölder's inequality, hence $\|\hat g\| = \|g\|_q$.

It is rather tedious work to show that for $1 \le p < \infty$ (but not for $p = \infty$) every bounded linear functional $\hat g$ may be defined by some $g \in L^q$; hence, by identifying $\hat g$ with $g(x)$ and $\|\hat g\|$ with $\|g\|_q$, we have

$$L^{p*} = L^q, \quad 1 \le p < \infty, \quad 1/p + 1/q = 1, \qquad L^{\infty *} \supset L^1. \tag{5.38}$$
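The statement that the function $f = \operatorname{sign}(g)\,|g|^{q-1}$ turns Hölder's inequality into an equality can be verified in a discrete setting. The sketch assumes finite sequences with sums in place of integrals; the data `g`, the exponent `q = 1.5` and the helper `p_norm` are hypothetical choices.

```python
import math

def p_norm(values, p):
    """Discrete analogue of (5.25)."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)

g = [0.5, -1.5, 2.0, -0.25]
q = 1.5
p = q / (q - 1.0)   # conjugate exponent, here p = 3

# f = sign(g) |g|**(q-1), the function named in the text
f = [math.copysign(abs(v) ** (q - 1.0), v) for v in g]

pairing = sum(a * b for a, b in zip(g, f))   # discrete (g|f)
bound = p_norm(g, q) * p_norm(f, p)          # right side of (5.26)
```

Within rounding errors `pairing` equals `bound`, so the norm of the functional is not smaller than the $q$-norm of `g` in this instance.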

(For $1 < q \le \infty$ this implies, by the way, again that $L^q$ is complete.) From (5.38) it follows immediately that

$$L^{p**} = L^p, \quad 1 < p < \infty, \qquad L^{1**} \supset L^1, \qquad L^{\infty **} \supset L^\infty. \tag{5.39}$$

Moreover, for $1 \le p, p' < \infty$, $(L^p \cap L^{p'})^* = L^q + L^{q'}$.

A Banach space $X$ is said to be reflexive if $X^{**} = X$. (By considering the bilinearity of $(g|f)$, allowing for an interpretation as a linear functional $g$ on $X$ or as a linear functional $f$ on $X^*$, $X^{**} \supseteq X$ is true for every normed linear space, where $X^{**}$ is a Banach space even if $X$ is not; hence, forming the second dual is a way to complete $X$.)

From the definition of a seminorm by the first two properties (5.16) it is seen that any bounded linear functional $y$ defines a seminorm on $X$ by $p_y(x) = |yx| = |(y|x)|$. If $y$ runs over all elements of the dual space $X^*$, then (5.17) is fulfilled; hence the dual can be used in a given Banach space $X$ to define a weak topology with the base

$$B_{n,(y_k),z,\varepsilon} = \{ x \mid |(y_k | x - z)| < \varepsilon,\ k = 1, \dots, n \} \quad \text{for every } n,\ y_k \in X^*,\ z \in X,\ \varepsilon > 0. \tag{5.40}$$

Except for finite-dimensional spaces, this weak topology is strictly weaker than the norm topology. $X$ equipped with this topology is in general merely a locally convex space rather than a Banach space. (The Banach property always refers to a corresponding norm topology.)

Geometrically, a point $y \in X^*$ defines via $(y|x) = 0$ a norm-closed hyperplane in $X$ and via $d_y(x) = (y|x)/\|y\|$ a signed distance of $x$ from that hyperplane. While norm convergence of a sequence $(x_i)$ means uniform convergence of all those distances from all possible hyperplanes, weak convergence only means independent convergence of those distances. (Cf. the examples of Section 5.2.)
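The difference between independent and uniform convergence of the distances can be illustrated with the unit vectors of a sequence space. The sketch is a finite truncation and rests on assumptions not made in the text: vectors are cut off at `DIM` coordinates, the Euclidean norm is used, and a single square-summable `y` stands in for an arbitrary fixed functional.

```python
import math

DIM = 2000   # truncation length (assumption: stands in for infinitely many coordinates)

def unit_vector(n):
    return [1.0 if k == n else 0.0 for k in range(DIM)]

def pairing(y, x):
    return sum(a * b for a, b in zip(y, x))

def norm2(x):
    return math.sqrt(sum(c * c for c in x))

y = [1.0 / (k + 1) for k in range(DIM)]   # one fixed square-summable functional
indices = [0, 9, 99, 999]

# For the fixed y the pairings (y|e_n) = y_n tend to zero ...
weak_values = [pairing(y, unit_vector(n)) for n in indices]

# ... but the norm distance of e_n from zero stays equal to 1,
norm_values = [norm2(unit_vector(n)) for n in indices]

# because for every n some functional (here e_n itself) still gives 1.
worst_case = [pairing(unit_vector(n), unit_vector(n)) for n in indices]
```

In this truncated picture the sequence of unit vectors behaves like a weakly convergent sequence that does not converge in norm.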

Analogously, the points of $X^{**}$ can be used to introduce the weak topology in $X^*$. If, however, $X$ is not reflexive, the points of $X \subset X^{**}$ define an even weaker topology in $X^*$, called the weak$^*$ topology, and for this the Banach-Alaoglu theorem says: The closed unit ball of $X^*$ is weak$^*$ compact. This implies that for a reflexive Banach space (for which the weak and weak$^*$ topologies coincide) the closed unit ball is weakly compact, or, in other terms, every bounded net in a reflexive Banach space has a weakly converging subnet.

The main theorem for extremal problems states that for a real functional $F$ on a topological space $X$ the minimum problem

$$\min_{x \in A \subseteq X} F[x] = \alpha \tag{5.41}$$

has a solution $x_0 \in A$, $F[x_0] = \alpha$, if $F$ is lower semicontinuous and $A$ is nonempty and compact. We have learned that different topologies may be considered in $X$. The weaker the topology, the stronger the semicontinuity condition, but the weaker the compactness condition. If $X$ is a reflexive Banach space, these conditions are often realized in the weak topology (without reflexivity, in the weak$^*$ topology). The proof of the theorem is simple: consider the infimum of the functional on $A$, pick a corresponding net, and use compactness to select a convergent subnet.

Just for comparison we come back to the trivial case $X = \mathbb{R}$ at the end of this section. A linear functional in this case is a linear function $y = cx$ for all $x \in \mathbb{R}$ with some $c \in \mathbb{R}$. With the ordinary norm $\|x\| = |x|$ in $\mathbb{R}$, the norm (5.35) of $y$ is $\|y\| = |c|$. Since $|c| < \infty$, the linear function is always bounded (in the sense of a linear operator, i.e. of bounded norm), and it is of course continuous. (Non-continuous linear functions have to do with an infinite number of dimensions of space: the slope of the function may increase unboundedly as one runs through the dimensions.) Hence, $\mathbb{R}^* = \mathbb{R} = \mathbb{R}^{**}$. Weak and norm topologies coincide in $\mathbb{R}$. (The same holds true for $\mathbb{R}^N$.)

5.6 Conjugate Functionals

In this section we consider a dual pair $(X, X^*)$ of reflexive real Banach spaces, $x \in X$, $y \in X^*$. A functional on $X$ is a function $F : X \to \mathbb{R}$; in analogy to Section 5.1 we slightly generalize this notion by allowing $F : X \to \mathbb{R} \cup \{+\infty\}$. The conjugate functional (or Legendre transform) $F^*$ to $F$ is defined as

$$F^*[y] \overset{\text{def}}{=} \sup_{x \in X} \{ (y|x) - F[x] \} \quad \text{for all } y \in X^*. \tag{5.42}$$

Now we are ready to transfer the content of Section 5.1 to this case.

A functional $F$ is convex if

$$F[c x_1 + (1 - c) x_2] \le c F[x_1] + (1 - c) F[x_2] \quad \text{for every } c \in [0, 1] \text{ and every } x_1, x_2 \in X. \tag{5.43}$$

Lower semicontinuity is also defined exactly as in Section 5.1, this time using the norm topology for convergence of sequences of points of $X$. Weak lower semicontinuity is defined using the weak topology in $X$ and, where necessary, replacing sequences by nets. In general there are fewer weakly lower semicontinuous functionals than (norm) lower semicontinuous functionals: the latter comprise the former but not vice versa. However, every convex norm lower semicontinuous functional on a Banach space is weakly lower semicontinuous.

A subgradient of $F$ at $x$ is a bounded linear functional $y$ so that

$$F[x'] \ge F[x] + (y|x' - x) \quad \text{for all } x' \in X \tag{5.44}$$

(cf. (5.9)). Geometrically, the graph (in $X \times \mathbb{R}$ with points $(x', F[x'])$) of the affine-linear functional $H_{x,y}[x'] = F[x] + (y|x' - x)$, with $x, y$ fixed in accordance with (5.44), is a hyperplane of support for $F$ at $x$, a hyperplane touching the graph of $F[x']$ at $(x, F[x])$ and being nowhere above this graph. The subdifferential $\partial F[x]$ is defined again to be the set of all subgradients $y$ of $F$ at $x$.

Now, for every $F : X \to \mathbb{R} \cup \{+\infty\}$,

$$F^*[y] \in \mathbb{R} \cup \{+\infty\}, \quad F^* \not\equiv +\infty, \quad F^* \text{ is convex and lower semicontinuous}, \tag{5.45}$$

$$F^*[y] + F[x] \ge (y|x), \tag{5.46}$$

$$F^*[y] + F[x] = (y|x) \iff x \in \partial F^*[y], \tag{5.47}$$

$$F_1[x] \le F_2[x] \text{ for all } x \implies F_1^*[y] \ge F_2^*[y] \text{ for all } y. \tag{5.48}$$

The relevant consequence of the Hahn-Banach theorem is:

For every point (in $X \times \mathbb{R}$) below the graph of a lower semicontinuous convex functional there is a norm-closed hyperplane separating the former from the latter, i.e. for which the graph of the convex functional is entirely above the hyperplane, and the point is below.

(The proof consists of transfinite induction through linearly independent $y_\gamma \in X^*$ of the step considered in Section 5.1.) From this statement, by the same reasoning as in Section 5.1, $F^{**} = (F^*)^*$ is the lower semicontinuous convex hull of $F$, and

$$F^{**} \equiv F \iff F \text{ is lower semicontinuous and convex}. \tag{5.49}$$

In this case, $F$ and $F^*$ form a dual pair of lower semicontinuous convex mutually conjugate functionals or of Legendre transforms.
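The definition (5.42) and the statement that $F^{**}$ is the convex hull of $F$ can be tried out numerically in the trivial case $X = \mathbb{R}$. The sketch replaces the supremum by a maximum over a finite grid, which is an assumption of the illustration; the function name `conjugate`, the grid and the two test functions are hypothetical choices.

```python
def conjugate(values, points, slopes):
    """Discrete version of (5.42): max over grid points x of slope*x - F[x]."""
    return [max(s * x - v for x, v in zip(points, values)) for s in slopes]

grid = [i / 100.0 for i in range(-300, 301)]   # serves as x-grid and as y-grid

# Convex case: F(x) = x**2/2 is its own conjugate, F*(y) = y**2/2.
quadratic = [0.5 * x * x for x in grid]
quadratic_star = conjugate(quadratic, grid, grid)

# Non-convex case: the double well (x**2 - 1)**2 and its double conjugate.
double_well = [(x * x - 1.0) ** 2 for x in grid]
well_star = conjugate(double_well, grid, grid)
well_star_star = conjugate(well_star, grid, grid)

mid = grid.index(0.0)   # grid position of x = 0
```

On this grid the double conjugate of the double well lies nowhere above the original function and vanishes between the two minima, where the original has a hump of height 1 at $x = 0$.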

5.7 The Functional Derivative

There are generalizations of partial derivatives and of a gradient to the case of a functional $F$ on a normed linear space $X$. If there exists a bounded linear functional $F'[x_0] \in X^*$, $x_0 \in X$, so that

$$\lim_{\alpha \to 0} \frac{F[x_0 + \alpha x] - F[x_0]}{\alpha} = (F'[x_0]|x) \quad \text{for all } x \in X \tag{5.50}$$

with scalar (real or complex) $\alpha$, then $F'[x_0]$ is called the G-derivative (or Gâteaux derivative, directional derivative) of $F$ at $x_0$. For every fixed non-zero $x \in X$ the number $(F'[x_0]|x)/\|x\|$ generalizes a partial derivative at $x_0$ in the direction of $x$. (As a further generalization, the G-derivative is defined by the same expression (5.50) even for functionals on a locally convex space $X$ without a norm.)

If $F$ is a real functional on the real $L^p(\mathbb{R}^N) = L^p$ and $f_0 \in L^p$, then $F'[f_0]$, if it exists, is an element of $L^q$, i.e. $F'[f_0] = g \in L^q$ and, for every $f \in L^p$, $(F'[f_0]|f) = \int d^N x\, g(x) f(x)$. (Note that, in this particular case, our notation replaces $x \in X$ and $y \in X^*$ of our general considerations by $f \in L^p$ and $g \in L^q$, respectively, while $x$ means a point of $\mathbb{R}^N$.) This situation is often expressed by writing

$$\frac{\delta F[f]}{\delta f(x)} = g(x). \tag{5.51}$$

This notation implies without mention that both sides are functionals of $f_0 \in L^p$, the point where the derivative is taken.

If $X$ is a normed linear space and a bounded linear functional $F'[x_0] \in X^*$, $x_0 \in X$, exists so that

$$F[x_0 + x] - F[x_0] = (F'[x_0]|x) + o(\|x\|) \quad \text{as } x \to 0 \tag{5.52}$$

for all $x \in X$, then $F'[x_0]$ is called the F-derivative (or Fréchet derivative, gradient) of $F$ at $x_0$. (The notation $o(\|x\|)$ means $\lim_{\|x\| \to 0} o(\|x\|)/\|x\| = 0$.) $F'[x_0]$ generalizes the gradient of $F$ at $x_0$.

Note that while the limit in (5.50) was taken for a fixed $x$ setting a direction in the functional space, (5.52) is demanded uniformly for all $x$. F-derivative and G-derivative coincide at $x_0$ in a normed space if $F'$ as a function on $X$ with values in $X^*$ is (norm) continuous at $x_0$. This is exactly the same situation as is already met for functions of a finite number of variables, as for instance functions on $\mathbb{R}^2$. For example,

$$F(u, v) = \begin{cases} u^3 v / (u^4 + v^2) & \text{if } (u, v) \ne (0, 0), \\ 0 & \text{if } (u, v) = (0, 0) \end{cases} \tag{5.53}$$

has at $(u, v) = (0, 0)$ both partial derivatives and every directional derivative in the $u,v$-plane equal to zero, but these derivatives are not continuous there: $\partial F/\partial v$ is singular at $(u, v) = (0, 0)$ as a function of $u$. Nevertheless $F$ has a non-zero slope on the curve $v = u^2$, hence there is no gradient at $(0, 0)$ [Kolmogorov and Fomin, 1970, chap. X, §1].
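The behaviour of the example (5.53) can be checked numerically. The sketch evaluates difference quotients along straight lines through the origin and along the parabola $v = u^2$; the sampled directions, the step sizes and the helper names are assumptions of the illustration.

```python
import math

def F(u, v):
    """The function (5.53)."""
    if u == 0.0 and v == 0.0:
        return 0.0
    return u ** 3 * v / (u ** 4 + v ** 2)

def directional_quotient(a, b, alpha):
    """Difference quotient of (5.50) at the origin in direction (a, b)."""
    return (F(alpha * a, alpha * b) - F(0.0, 0.0)) / alpha

def parabola_quotient(u):
    """|F(x) - F(0)| / ||x|| for the point x = (u, u**2) on the parabola."""
    return abs(F(u, u * u)) / math.hypot(u, u * u)

directions = [(math.cos(t), math.sin(t)) for t in (0.0, 0.3, 1.0, 1.5708, 2.5)]

# Along every fixed direction the quotient is already tiny for a small step ...
straight = [directional_quotient(a, b, 1e-6) for a, b in directions]

# ... whereas along v = u**2 the remainder divided by ||x|| stays near 1/2.
curved = [parabola_quotient(10.0 ** (-k)) for k in range(1, 7)]
```

With a vanishing G-derivative, (5.52) would require the second quotient to tend to zero as well, which it does not.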

A natural generalization replaces $F : X \to \mathbb{R}$ by a mapping $F : X \to Y$ from a normed linear space $X$ into a normed linear space $Y$ and the bounded linear functional $F'[x_0]$ by a bounded linear operator $F'[x_0] : X \to Y$. The operator $F'[x_0] \in L(X, Y)$ is a G-derivative if

$$\lim_{\alpha \to 0} \frac{F[x_0 + \alpha x] - F[x_0]}{\alpha} = F'[x_0] x \quad \text{for all } x \in X. \tag{5.54}$$


The limit is understood here in the norm topology of the space $Y$. It is an F-derivative if

$$F[x_0 + x] - F[x_0] = F'[x_0] x + o(\|x\|) \quad \text{as } x \to 0 \tag{5.55}$$

for all $x \in X$.

This generalization allows for the formulation of a general chain rule: Let $G : X \to Y$ and $H : Y \to Z$. Let $F = H \circ G$ be the composite map $F : X \to Z : F[x] = H[G[x]]$. Then

$$F'[x_0] = H'[G[x_0]]\, G'[x_0] \tag{5.56}$$

is an F-derivative if the right side expressions exist as F-derivatives. In this equality, $F'[x_0]$ is a linear operator from $X$ into $Z$, $G'[x_0]$ is a linear operator from $X$ into $Y$, and $H'[G[x_0]]$ is a linear operator from $Y$ into $Z$. The last line of (4.31) is to be understood in this sense for a situation where the linear operators of (5.56) are integral operators.
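In finite dimensions the operators in (5.56) are Jacobian matrices and the chain rule is a matrix product. The sketch below assumes $X = Y = \mathbb{R}^2$ and $Z = \mathbb{R}$ with two hypothetical maps `G` and `H`, and compares the product of their derivatives with a central difference quotient of the composite map.

```python
import math

def G(x):
    x1, x2 = x
    return (x1 * x2, x1 + x2 ** 2)

def H(y):
    y1, y2 = y
    return y1 ** 2 + math.sin(y2)

def G_prime(x):
    """Jacobian of G, a linear operator from R^2 into R^2."""
    x1, x2 = x
    return [[x2, x1], [1.0, 2.0 * x2]]

def H_prime(y):
    """Derivative of H, a linear functional on R^2 (row vector)."""
    y1, y2 = y
    return [2.0 * y1, math.cos(y2)]

def chain_rule(x):
    """Right side of (5.56): H'[G[x]] applied after G'[x]."""
    row = H_prime(G(x))
    jac = G_prime(x)
    return [sum(row[i] * jac[i][j] for i in range(2)) for j in range(2)]

def finite_difference(x, h=1e-6):
    """Central difference approximation of the derivative of F = H o G."""
    grad = []
    for j in range(2):
        xp = list(x)
        xm = list(x)
        xp[j] += h
        xm[j] -= h
        grad.append((H(G(xp)) - H(G(xm))) / (2.0 * h))
    return grad

x0 = (0.7, -1.2)
analytic = chain_rule(x0)
numeric = finite_difference(x0)
```

Both results agree up to the discretization error of the difference quotient.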

For a map $F : X \to Y$, the derivative is a map $F' : X \to L(X, Y)$, i.e., for every $x_0 \in X$, $F'[x_0] \in L(X, Y)$. Taking again the derivative of $F'$, a map $F'' : X \to L(X, L(X, Y))$ is obtained. (Compare the tensor structure of higher derivatives in $\mathbb{R}^N$.) For $x_0 \in X$, the image of the map $F''$ is a bounded linear operator $F''[x_0]$ on $X$, so that for every $x_1 \in X$ the value $F''[x_0] x_1$ is in $L(X, Y)$, i.e., it is again a bounded linear operator on $X$: For every $x_1, x_2 \in X$, the value $F''[x_0] x_1 x_2 \in Y$ depends linearly on both $x_1$ and $x_2$. For a functional $F[x]$, $F''[x_0]$ is a bilinear functional, and in the scalar product notation of linear functionals we write this as $((F''[x_0]|x_1)|x_2) \in \mathbb{R}$, in the dual case as $(y_2|(y_1|F^{*\prime\prime}[y_0]))$, $y_i \in X^*$. Continuing this process, e.g. for a functional $F : X \to \mathbb{R}$ and its F-derivatives, the Taylor expansion

$$F[x_0 + x] = F[x_0] + \sum_{k=1}^{n} \frac{1}{k!} (\cdots((F^{(k)}[x_0]|x)|x)\cdots|x) + o(\|x\|^n) \tag{5.57}$$

holds if the F-derivatives exist. (For $X = L^p$, the $k$-th F-derivative is an element of $L^q \times L^q \times \cdots \times L^q$ ($k$ factors). On the basis of Fubini's theorem on multiple integrals, this is given by a (symmetric) function $g(x_1 \dots x_k)$ so that

$$(\cdots(F^{(k)}[f_0]|f)\cdots|f) = \int d^N x_1 \cdots d^N x_k\, g(x_1 \dots x_k) f(x_1) \cdots f(x_k),$$

where $g$ depends on $f_0$.)

As an application let $F^*[y_0]$ be strictly convex, twice differentiable at $y_0 \in X^*$, and $(y_2|(y_1|F^{*\prime\prime}[y_0])) \ne 0$ if $y_1, y_2 \ne 0$. Then, $\partial F^*[y_0] = \{F^{*\prime}[y_0]\}$, and (5.47) reads $F^*[y_0] + F[F^{*\prime}[y_0]] = (y_0|F^{*\prime}[y_0])$. Note that $F^{*\prime}[y_0]$ is a linear functional on $X^*$ and hence an element of $X$. A further differentiation using the chain rule yields $F^{*\prime}[y_0] + (F'[F^{*\prime}[y_0]]|F^{*\prime\prime}[y_0]) = F^{*\prime}[y_0] + (y_0|F^{*\prime\prime}[y_0])$, hence $(F'[F^{*\prime}[y_0]] - y_0|F^{*\prime\prime}[y_0]) = 0$. Because $F^{*\prime}[y_0]$ is an element of $X$, $F'[F^{*\prime}[y_0]]$ is defined and is a linear functional on $X$ and hence an element of $X^*$. With our assumption on $F^{*\prime\prime}$ it follows that

$$F'[F^{*\prime}[y_0]] = y_0. \tag{5.58}$$

This result generalizes (5.14).
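The relation (5.58) can be checked in the trivial case $X = \mathbb{R}$ with an explicitly known conjugate pair. The sketch assumes $F(x) = x^4/4$, whose conjugate is $F^*(y) = \tfrac{3}{4}|y|^{4/3}$; this pair and the sample points (all non-zero, so that the second derivative of $F^*$ exists and does not vanish) are choices made for illustration.

```python
import math

def F(x):
    return 0.25 * x ** 4

def F_prime(x):
    return x ** 3

def F_star(y):
    return 0.75 * abs(y) ** (4.0 / 3.0)

def F_star_prime(y):
    # derivative of F*, i.e. the real cube root of y
    return math.copysign(abs(y) ** (1.0 / 3.0), y)

samples = [-2.0, -0.5, 0.3, 1.0, 4.0]

# (5.58): applying F' after F*' returns y.
round_trip = [F_prime(F_star_prime(y)) for y in samples]

# (5.47) with x = F*'[y]: F*[y] + F[x] - (y|x) vanishes.
legendre_gap = [F_star(y) + F(F_star_prime(y)) - y * F_star_prime(y) for y in samples]
```

In this one-dimensional instance the two derivatives are simply mutually inverse functions.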

5.8 Lagrange Multipliers

Again we do not consider the most general case.

Let $F : X \to \mathbb{R} \cup \{+\infty\}$, and let $G : X \to Y$ be a continuous linear operator from the Banach space $X$ onto the Banach space $Y$ (that is, $G$ is supposed to be surjective). Then, $Gx = y_0$ defines a closed hypersurface in $X$ (with dimension $\dim X - \dim Y$; $Y = \mathbb{R}$ would be a simple case). Consider the minimum problem

$$\min\{ F[x] \mid x \in X,\ Gx = y_0 \} = \alpha. \tag{5.59}$$

In this case one says that a bound minimum of $F[x]$ is searched for, where $x$ is bound to the hypersurface defined by the side condition $Gx = y_0$. (This is in contrast to a free minimum on all of $X$.)

Let F[x0] = α, Gx0 = y0, and suppose that F′[x0] exists as an F-derivative. If Gx1 = 0, then G(x0 + x1) = y0 and hence x1 is an admissible variation by the side condition. Therefore the minimum condition implies (F′[x0]|x1) = 0. If on the other hand Gx2 ≠ 0, then (F′[x0]|x2) is not restricted by the minimum condition because x2 is not admissible for the minimum search. Hence, the null space of F′[x0] (i.e. the subspace of X which is nullified by F′[x0]) contains the null space of G, and consequently there exists a linear functional Λ : Y → R, so that (Λ|G|x) = (F′[x0]|x) for all x ∈ X. (If there existed an x1 ∈ X with (F′[x0]|x1) ≠ 0 but Gx1 = 0, then the above Λ could not exist.) Thus we arrive at

$$F'[x_0] - \Lambda\, G = 0 \tag{5.60}$$


as a necessary condition for (5.59). Every solution of (5.59) is thus a solution of

$$F[x] - (\Lambda|G|x) \;\Rightarrow\; \text{free stationary for all } x \in X. \tag{5.61}$$
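The necessary condition (5.60) can be made concrete in finite dimensions. The sketch below assumes X = R³, Y = R, the linear map Gx = g·x, and the hypothetical functional F(x) = ½|x − a|²; none of these choices is taken from the text. In this case Λ acts as multiplication by a single number, called `lam` here, and the bound minimum is known in closed form.

```python
# Hypothetical finite-dimensional illustration of F'[x0] - Lambda G = 0, eq. (5.60).

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def F(x, a):
    # assumed test functional F(x) = 0.5 * |x - a|^2
    return 0.5 * sum((xi - ai) ** 2 for xi, ai in zip(x, a))

def constrained_minimum(a, g, y0):
    """Minimize F on the hyperplane g.x = y0; returns the minimizer and the multiplier."""
    lam = (y0 - dot(g, a)) / dot(g, g)
    x0 = [ai + lam * gi for ai, gi in zip(a, g)]
    return x0, lam

a = [1.0, 2.0, 3.0]
g = [1.0, 1.0, 1.0]
y0 = 3.0

x0, lam = constrained_minimum(a, g, y0)
grad_F = [xi - ai for xi, ai in zip(x0, a)]               # F'[x0]
residual = [gf - lam * gi for gf, gi in zip(grad_F, g)]   # F'[x0] - lam * g

# An admissible variation x1 (g.x1 = 0) must not lower F.
x1 = [1.0, -1.0, 0.0]
shifted = [xi + 0.1 * di for xi, di in zip(x0, x1)]
print(x0, lam, residual, F(shifted, a) - F(x0, a))
```

The gradient F′[x0] is parallel to g, so it vanishes on every admissible variation, which is exactly the null-space argument given above.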

Λ of course depends on y0 and on F and is called a Lagrange multiplier. The commonly known case is X = Lp, Y = R, G ∈ Lq and $Gf = \int d^N x\, g(x) f(x)$. In this case, Λ : R → R : r ↦ λr and $(\Lambda|G|f) = \lambda \int d^N x\, g(x) f(x)$, and the Lagrange multiplier is a number λ.

As a generalization, now let F : X → R ∪ {+∞} be arbitrary, H : Y → R ∪ {+∞} convex and lower semicontinuous, G : X → Y a continuous linear operator, and x ∈ X, y ∈ Y∗. Consider

$$L[x, y] = F[x] + (y|G|x) - H^*[y]. \tag{5.62}$$

From the assumptions, the doubly conjugate H∗∗ = H, $\sup_{y\in Y^*}\{(y|G|x) - H^*[y]\} = H^{**}[Gx] = H[Gx]$, and hence the inf-sup problem

$$\inf_{x\in X}\ \sup_{y\in Y^*} L[x, y] = \alpha \tag{5.63}$$

is equivalent to the inf problem

$$\inf_{x\in X}\{F[x] + H[Gx]\} = \alpha. \tag{5.64}$$

Obviously, $\inf_x L[x, y_0] \le L[x_0, y_0] \le \sup_y L[x_0, y]$ for all x0, y0, hence $\sup_y \inf_x L[x, y] \le \inf_x \sup_y L[x, y]$. Generally, from the properties of infima and suprema, (5.63) implies

$$\sup_{y\in Y^*}\ \inf_{x\in X} L[x, y] = \beta \le \alpha. \tag{5.65}$$

With (y|G|x) = (G†y|x), where G† is the operator adjoint to G, it follows that $\inf_{x\in X}\{F[x] + (G^\dagger y|x)\} = -\sup_{x\in X}\{(-G^\dagger y|x) - F[x]\} = -F^*[-G^\dagger y]$. The sup-inf problem (5.65) is thus equivalent to $\sup_{y\in Y^*}\{-F^*[-G^\dagger y] - H^*[y]\} = \beta$ and hence to

$$\inf_{y\in Y^*}\{F^*[-G^\dagger y] + H^*[y]\} = -\beta. \tag{5.66}$$

Finally, (5.65) may be written as

$$\inf_{y\in Y^*}\ \sup_{x\in X}\{-L[x, y]\} = -\beta. \tag{5.67}$$


L[x, y] is called a general Lagrange function, which mediates a duality between (5.64) and (5.66), called Fenchel’s duality. Comparing (5.62) with (5.61) one says that in the problem (5.63), equivalent to (5.64), y plays the role of an abstract Lagrange multiplier, and that in the problem (5.67), equivalent to (5.66), x plays such a role.
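Fenchel’s duality between (5.64) and (5.66) can be tried out on a one-dimensional toy problem. The sketch below assumes X = Y = R, G the multiplication by a number g (so that G† = G), and the hypothetical choices F(x) = ½(x − a)² and H(y) = ½y², whose conjugates F∗(p) = pa + ½p² and H∗(y) = ½y² are entered by hand. Both infima are computed independently by ternary search; for this smooth convex pair there is no duality gap, so α = β.

```python
# Hypothetical one-dimensional illustration of Fenchel's duality, eqs. (5.64) and (5.66).

def min_value(f, lo, hi, iters=200):
    """Minimum of a convex function f on [lo, hi] by ternary search."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) > f(m2):
            lo = m1
        else:
            hi = m2
    return f(0.5 * (lo + hi))

a, g = 2.0, 1.5   # arbitrary illustrative parameters

def F(x):
    return 0.5 * (x - a) ** 2

def H(y):
    return 0.5 * y ** 2

def F_star(p):
    # conjugate of F, entered analytically
    return p * a + 0.5 * p ** 2

def H_star(y):
    # conjugate of H, entered analytically
    return 0.5 * y ** 2

alpha = min_value(lambda x: F(x) + H(g * x), -10.0, 10.0)                  # (5.64)
beta = -min_value(lambda y: F_star(-g * y) + H_star(y), -10.0, 10.0)       # (5.66)
print(alpha, beta)
```

The common value is ½a²g²/(1 + g²), which follows from elementary calculus for this particular pair.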

The introduction of the Lagrange function (5.62) and of the corresponding inf-sup problems leads naturally to the notion of a saddle point of a functional. Let L[x, y] : A × B → R, and

$$\max_{y\in B} L[x_0, y] = L[x_0, y_0] = \min_{x\in A} L[x, y_0], \quad (x_0, y_0) \in A \times B. \tag{5.68}$$

Then (x0, y0) is a saddle point of L with respect to A × B. In this case, α = β in the above relations, and the suprema and infima are maxima and minima, respectively. Rather general sufficient conditions for the existence of a saddle point are:

1. X and Y are reflexive Banach spaces, A ⊆ X, B ⊆ Y, A, B are convex, closed, and non-empty.

2. L[x, y] is convex and lower semicontinuous on A for all fixed y ∈ B, and −L[x, y] is convex and lower semicontinuous on B for all fixed x ∈ A.

3. A is bounded in X, or there exists y′ ∈ B so that L[x, y′] → +∞ as ‖x‖ → +∞. B is bounded in Y, or there exists x′ ∈ A so that −L[x′, y] → +∞ as ‖y‖ → +∞.

For more details see e.g. [Zeidler, 1986, volume III].


6 Density Functional Theory by Lieb

The central idea of this chapter is to treat the Hohenberg-Kohn functional F[n] as the Legendre transform to E[v] (in a certain strict sense presented in the subsequent text).

One complication of this approach lies in the fact that the functional spaces of admitted densities n and admitted potentials v, which naturally appear from physical reasoning, are not reflexive. This comes about from the infinity of the physical space R³: in approaching the infimum of energy of an N-particle problem, some of the particles may disappear at infinity. For practically all problems of physical relevance, this difficulty may be separated by first considering a finite space (preferably with periodic boundary conditions as introduced in Section 2.7). To obtain the required answers, it suffices in most cases to consider a large enough finite periodic volume. Only in special problems must the limit with the volume tending to infinity really be considered afterwards.

The first section of this chapter reformulates the problem of the ground state energy in this specified context and establishes the important convexity properties of the ground state energy. Since dependences on particle numbers are a most important issue of density functional theory, the particle number is treated as a continuous variable from the very beginning. The second section then develops the density functional theory straightforwardly with the use of the machinery of Legendre transforms. A universal density functional H[n], which is most closely related to both Lieb’s density functional and approximate functionals in practical use, is obtained for the Hohenberg-Kohn variational principle.

The more technical details and mathematical subtleties are put into the third section, which may be skipped in a first reading. One artificial feature of Lieb’s functional F is its separate and independent dependence on the total particle number N and on the density n. The gauge invariance of Schrödinger’s equation with respect to an additive potential constant sets F to +∞ if the integral over the density n is not equal to N. Our use of Legendre transforms for both the N- and v-dependences of the ground state energy, together with the above mentioned gauge invariance, avoids this feature


from the very beginning and condenses the separate dependences of F on n and N into a single dependence on n of a modified functional H[n] which is the convex hull of all F[n,N]. This brings Lieb’s theory back to the frame understood in approximative implementations for numerical calculations (as e.g. the local-density approximation) where one uses functionals depending on N only via n.

The last section of this chapter gives a final answer to the question of existence of functional derivatives of density functionals and hence puts the Kohn-Sham equations on a rigorous basis. Finally it summarizes the level the Hohenberg-Kohn-Sham theory has reached as a reliable closed mathematical frame.

Throughout this chapter the general case of spin density functional theory is treated, with a possibly non-unidirectional magnetic field and a possibly non-collinear spin polarization.

6.1 The Ground State Energy

In Chapter 4 we dealt with three different definitions of the ground state energy of increasing generality: (4.4), (4.49), and (4.66). In Section 4.5 we saw that the consideration of ensemble states solves problems with level crossings and answers in general the question of occupying Kohn-Sham orbitals with the aufbau principle. In Section 4.6, mixed states with distinct particle numbers led to the convexity of E[v,N] in the real variable N, a very important issue for the following (and for any practical approach to density functional theory as will be discussed later). There are many more reasons to consider most general mixed states as candidates for ground states, some of which were discussed in Section 4.5.

Hence, we leave the particle number N undefined so far and start with the family of Hamiltonians of type (4.1) (with the general potential form (4.77)),

$$\hat H[v,M] = -\frac{1}{2}\sum_{i=1}^{M}\nabla_i^2 + \sum_{i=1}^{M} v_{s_i s_i'}(r_i) + \frac{1}{2}\sum_{i\neq j}^{M} w(|r_i - r_j|) \tag{6.1}$$

for all (admissible) external potentials v and all integer particle numbers M. In Section 4.1 we also discussed that, given a potential v, a ground state does not exist for every particle number M. This problem is connected with the infinite measure (volume) of the position space R³, which allows unbound particles to disappear at infinity and causes scattering states not


to be normalizable, but which has no practical relevance in our context. Therefore we replace the position space R³ by a torus T³ (box with periodic boundary conditions like in (2.82)) with volume |T³| sufficiently large so as not to change the considered results significantly. This makes the spectrum of all Hamiltonians discrete and guarantees that ground states exist and are normalizable in any case. In practical calculations for finite systems (atoms, molecules, clusters) the boundary conditions may simply be ignored as long as all particles are bound; in the case of extended systems they introduce a discrete mesh of k-vectors like in (1.29), and the results have to be converged with respect to the density of k-points. The very helpful formal implications for the structure of the theory will be discussed in Section 6.3.

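We admit ensemble states of the most general type (4.64):

$$\gamma = \sum_K |\Psi_K\rangle\, g_K\, \langle\Psi_K|, \quad 0 \le g_K, \quad \sum_K g_K = 1, \tag{6.2}$$

where the pure states $|\Psi_K\rangle$ may be expanded into a fixed orthonormal set of particle number eigenstates:

$$|\Psi_K\rangle = \sum_M |\Phi^M_K\rangle\, C^K_M, \quad \sum_M |C^K_M|^2 = 1, \quad \hat N|\Phi^M_K\rangle = |\Phi^M_K\rangle\, M. \tag{6.3}$$

The energy expectation value in the general ensemble state (6.2) is

$$\operatorname{tr}(\hat H\gamma) = \sum_{K,M} g_K |C^K_M|^2 \langle\Phi^M_K|\hat H[v,M]|\Phi^M_K\rangle, \tag{6.4}$$

where

$$0 \le p^K_M \overset{\text{def}}{=} g_K |C^K_M|^2, \quad \sum_{K,M} p^K_M = 1. \tag{6.5}$$

The expectation values for the particle number and particle density are

$$\operatorname{tr}(\hat N\gamma) = \sum_{K,M} p^K_M M, \tag{6.6}$$

and

$$\operatorname{tr}(\hat n\gamma) = \sum_{K,M} p^K_M\, n^M_{K,ss'}(r) = n_{ss'}(r), \tag{6.7}$$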


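where $n^M_{K,ss'}(r)$ is the spin density matrix of the state $|\Phi^M_K\rangle$. The interaction energy with the external potential

$$\operatorname{tr}(\hat U\gamma) = \sum_{ss'}\int d^3r\; v_{ss'}(r)\, n_{s's}(r) \overset{\text{def}}{=} (v|n) \tag{6.8}$$

is as previously a linear functional of the density n.

Now, the ground state energy as a functional of the external potential v and a function of the real particle number N can be defined as

$$E[v,N] \overset{\text{def}}{=} \inf_{\gamma}\Big\{ \operatorname{tr}(\hat H\gamma) \;\Big|\; \operatorname{tr}(\hat N\gamma) = N \Big\} = \inf_{p^K_M}\Big\{ \sum_{K,M} p^K_M \langle\Phi^M_K|\hat H[v,M]|\Phi^M_K\rangle \;\Big|\; \sum_{K,M} p^K_M M = N \Big\}. \tag{6.9}$$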

The M-sums run over all M = 0, 1, 2, . . . and for each M the K-sums run over arbitrarily many orthonormal M-particle states. (For the sake of completeness we include one abstract state |Φ0⟩ (vacuum) with ⟨Φ0|Ĥ[v, 0]|Φ0⟩ = 0 for all v.)

Caution is needed here since, given v and N, (6.9) need not exist even if $\langle\Phi^M_K|\hat H[v,M]|\Phi^M_K\rangle$ is bounded below for every fixed M. A simple counterexample is $\langle\Phi^M_0|\hat H[v,M]|\Phi^M_0\rangle = -\varepsilon M^2$ and $p^0_0 = 1 - N/M_1$, $p^0_{M_1} = N/M_1$, $p^K_M = 0$ else. Clearly the sum in (6.9) diverges for M1 → ∞. This cannot happen for w ≡ 0 since in this case $\langle\Phi^M_K|\hat H[v,M]|\Phi^M_K\rangle \ge -\varepsilon_1 M$ where ε1 is the lowest orbital energy. (Cf. Fig.5a. Recall that $\langle\Phi^M_0|\hat H[v,M]|\Phi^M_0\rangle$ is convex as a function of M in this case.)

In the whole of this chapter the particle-particle interaction is presupposed repulsive, w ≥ 0. Then, with v fixed, (6.9) is bounded below by the convex function E0[v,N] and the infimum exists.
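The counterexample above is easy to reproduce numerically. The following sketch assumes a hypothetical value ε = 0.1 and compares the quadratic level energies −εM² with level energies −εM that are bounded below linearly in M; the function names are illustrative only. The two-state mixture of the vacuum and an M1-particle state has average particle number N in both cases.

```python
# Hypothetical illustration: the infimum (6.9) may fail to exist when the
# level energies decrease faster than linearly in the particle number M.

def mixture_energy(level_energy, N, M1):
    """Energy of the mixture p_0 = 1 - N/M1, p_M1 = N/M1 (requires N <= M1)."""
    p = N / M1
    return (1.0 - p) * level_energy(0) + p * level_energy(M1)

eps = 0.1   # arbitrary illustrative energy scale

def quadratic(M):
    return -eps * M ** 2     # the counterexample of the text

def linear(M):
    return -eps * M          # a linear lower bound, as in the case w = 0

N = 1.5
unbounded = [mixture_energy(quadratic, N, M1) for M1 in (10, 100, 1000)]
bounded = [mixture_energy(linear, N, M1) for M1 in (10, 100, 1000)]
print(unbounded)   # decreases without bound, like -eps * N * M1
print(bounded)     # stays at -eps * N
```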

Now, fix v and pick two particle numbers N1 and N2 (not necessarily integers). There exist sequences $\gamma^i_j$, i = 1, 2; j = 1, 2, . . . with $\operatorname{tr}(\hat N\gamma^i_j) = N_i$ and $\lim_j \operatorname{tr}(\hat H\gamma^i_j) = E[v,N_i]$. Take the sequence $\gamma_j = c\gamma^1_j + (1-c)\gamma^2_j$, 0 ≤ c ≤ 1. Obviously $\operatorname{tr}(\hat N\gamma_j) = cN_1 + (1-c)N_2$ and $\lim_j \operatorname{tr}(\hat H\gamma_j) = cE[v,N_1] + (1-c)E[v,N_2]$. On the other hand, by definition $E[v, cN_1 + (1-c)N_2] \le \lim_j \operatorname{tr}(\hat H\gamma_j)$. This proves the convexity of E[v,N] in N for fixed v:

$$E[v, cN_1 + (1-c)N_2] \le cE[v,N_1] + (1-c)E[v,N_2], \quad 0 \le c \le 1. \tag{6.10}$$

(By definition (6.9) and the text above, N ≥ 0 with E[v, 0] = 0 for all v. One may, however, formally define E[v,N] = +∞ for all N < 0 and all v,


which makes E[v,N] defined for all N, convex and lower semicontinuous.) As discussed in Section 4.6, for integer N the energy (6.9) may be below that of (4.49) for the search restricted to particle number eigenstates with M = N (cf. Fig.5b). However, later it will become clear that if one uses one universal density functional H[n] for all densities n integrating up to arbitrary particle numbers N (as one always does in practical implementations) one produces (6.9) instead of (4.49).
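The difference between the ensemble energy (6.9) and the energy restricted to particle number eigenstates can be illustrated with a few made-up numbers. The sketch below assumes a hypothetical list `E_pure` of lowest M-particle energies for M = 0, ..., 3 that is deliberately not convex in M. It relies on the linear programming fact that a minimum under the two linear side conditions (normalization and average particle number) is attained by a mixture of at most two levels.

```python
# Hypothetical illustration of (6.9) and (6.10): ensemble mixing replaces the
# integer-M energies by their lower convex envelope.

def ensemble_energy(E, N):
    """Minimum of sum_M p_M E[M] subject to sum_M p_M = 1, sum_M p_M M = N, p_M >= 0."""
    best = float("inf")
    for M1 in range(len(E)):
        for M2 in range(M1, len(E)):
            if not (M1 <= N <= M2):
                continue
            if M1 == M2:
                value = E[M1]
            else:
                c = (M2 - N) / (M2 - M1)          # weight of the M1-particle state
                value = c * E[M1] + (1.0 - c) * E[M2]
            best = min(best, value)
    return best

E_pure = [0.0, -1.0, -1.2, -2.0]   # assumed energies, non-convex at M = 2

print(ensemble_energy(E_pure, 2), E_pure[2])   # the ensemble value lies below the pure one
print(ensemble_energy(E_pure, 1.5))
```

At N = 2 the equal-weight mixture of the one-particle and three-particle states gives a lower energy than the two-particle state itself, which is the situation referred to above for integer N.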

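Next, we fix N and find

$$\begin{aligned}
E[cv_1 + (1-c)v_2, N] &= \inf_{\gamma}\Big\{ c\,\operatorname{tr}(\hat H_{v_1}\gamma) + (1-c)\operatorname{tr}(\hat H_{v_2}\gamma) \;\Big|\; \operatorname{tr}(\hat N\gamma) = N \Big\} \\
&\ge c\,\inf_{\gamma}\Big\{ \operatorname{tr}(\hat H_{v_1}\gamma) \;\Big|\; \operatorname{tr}(\hat N\gamma) = N \Big\} + (1-c)\inf_{\gamma}\Big\{ \operatorname{tr}(\hat H_{v_2}\gamma) \;\Big|\; \operatorname{tr}(\hat N\gamma) = N \Big\} \\
&= cE[v_1,N] + (1-c)E[v_2,N], \quad 0 \le c \le 1.
\end{aligned} \tag{6.11}$$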

In the first equality (4.2) was used, and then the simple fact that the infimum of a sum cannot be lower than the sum of the corresponding independent infima.

In summary:

The ground state energy E[v,N] is a convex function of N for fixed v and a concave functional of v for fixed N.

These simple convexity properties of E[v,N] together with the gauge property (4.6) form the deep logical foundation of density functional theory.
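The concavity in v and the gauge property can be checked on the smallest conceivable model. The sketch below is an assumption made for illustration, not a model used in the text: one spinless particle hopping with amplitude t between two lattice sites with on-site potentials (v1, v2), for which the ground state energy is the lowest eigenvalue of a 2×2 matrix and is available in closed form.

```python
# Hypothetical two-site, one-particle illustration of the concavity of E in v (6.11)
# and of the gauge property E[v + c, N] = E[v, N] + c N (here N = 1).
import itertools
import math

def ground_energy(v, t=1.0):
    """Lowest eigenvalue of [[v1, -t], [-t, v2]]."""
    v1, v2 = v
    return 0.5 * (v1 + v2) - math.hypot(0.5 * (v1 - v2), t)

def concavity_violations(potentials, weights, tol=1e-12):
    """Count sampled triples (va, vb, c) that violate the concavity inequality."""
    count = 0
    for va, vb in itertools.product(potentials, repeat=2):
        for c in weights:
            mix = tuple(c * x + (1.0 - c) * y for x, y in zip(va, vb))
            if ground_energy(mix) < c * ground_energy(va) + (1.0 - c) * ground_energy(vb) - tol:
                count += 1
    return count

grid = [-2.0, -0.5, 0.0, 1.0, 3.0]
potentials = list(itertools.product(grid, repeat=2))
violations = concavity_violations(potentials, [0.0, 0.25, 0.5, 0.75, 1.0])

shift = 2.0
gauge_defect = ground_energy((0.3 + shift, -0.7 + shift)) - ground_energy((0.3, -0.7)) - shift
print(violations, gauge_defect)
```

A sampled check of this kind cannot prove concavity, of course; it merely shows the inequality at work on a concrete function of v.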

6.2 The Hohenberg-Kohn Variational Principle

Starting with the convexity of E[v,N] in N, a Legendre transform G[v, µ] may be defined with the pair of transformations

$$G[v,\mu] = \sup_N\{\mu N - E[v,N]\}, \tag{6.12}$$

$$E[v,N] = \sup_\mu\{N\mu - G[v,\mu]\}. \tag{6.13}$$

Because of the gauge property (4.6) the expression in braces of (6.12) is −E[v − µ,N], and hence

$$G[v,\mu] = G[v-\mu, 0] \overset{\text{def}}{=} G[v-\mu] \tag{6.14}$$


has only one functional dependence. The above duality relations simplify to

$$G[v] = -\inf_N E[v,N], \qquad E[v,N] = \sup_\mu\{N\mu - G[v-\mu]\}. \tag{6.15}$$

G is convex in v as shown by the following chain of relations (c ∈ [0, 1]):

$$\begin{aligned}
G[cv_1 + (1-c)v_2] &= -\inf_N E[cv_1 + (1-c)v_2, N] \\
&\le -\inf_N\{cE[v_1,N] + (1-c)E[v_2,N]\} \\
&\le -c\,\inf_{N'} E[v_1,N'] - (1-c)\inf_{N''} E[v_2,N''] \\
&= cG[v_1] + (1-c)G[v_2].
\end{aligned} \tag{6.16}$$

From the first to the second line the convexity of −E[v,N] in v was used, and then again the fact that the negative of an infimum of a sum cannot exceed the negative of the sum of the independent infima of the items.

Because G[v] is convex, it can again be Legendre transformed back and forth. This time it is a functional Legendre transformation for which functional spaces have to be specified. We postpone this specification and formally introduce −n as a dual variable to v in order to arrive at common notations:

$$\tilde H[-n] = \sup_v\{(-n|v) - G[v]\} \overset{\text{def}}{=} H[n], \tag{6.17}$$

$$G[v] = \sup_n\{(v|-n) - \tilde H[-n]\} = -\inf_n\{H[n] + (v|n)\}. \tag{6.18}$$

(v|n) is a real scalar product since v and n are Hermitian spin matrices, hence (−n|v) = (v| − n) = −(v|n).

Inserting the first relation (6.15) into (6.17) yields

$$H[n] = \sup_v\Big\{-(n|v) + \inf_N E[v,N]\Big\} \le \inf_N \sup_v\{E[v,N] - (n|v)\} = \inf_N F[n,N], \tag{6.19}$$

where the general rule sup inf ≤ inf sup (see Section 5.8) was applied, and

$$F[n,N] \overset{\text{def}}{=} \sup_v\{E[v,N] - (n|v)\} \tag{6.20}$$


was first introduced by Lieb [Lieb, 1983] as a density functional. (Lieb related it to EN[v] as defined by (4.4) with the obvious relation EN[v] ≥ E[v,N]. The corresponding Lieb functional $F_N[n] \overset{\text{def}}{=} \sup_v\{E_N[v] - (n|v)\}$ will shortly be considered below.)

Next we insert (6.18) into the right equation (6.15) and obtain

$$E[v,N] = \sup_\mu\Big\{N\mu + \inf_n\{H[n] + (v-\mu|n)\}\Big\} \le \inf_n\Big\{H[n] + (v|n) + \sup_\mu\,[N - (1|n)]\,\mu\Big\}. \tag{6.21}$$

Again the rule sup inf ≤ inf sup was applied, and, since µ can be treated as a potential function constant in space, (µ|n) = (1|n)µ. It will be shown in the next section that in both relations (6.19) and (6.21) the inequalities ≤ may be sharpened into equalities. Since the expression under the last supremum is linear in µ, the supremum is either +∞, if (1|n) ≠ N, or zero, if (1|n) = N. Taking then the infimum over all n means just selecting the last case, that is,

$$E[v,N] = \inf_n\{H[n] + (v|n) \mid (1|n) = N\}. \tag{6.22}$$

Evidently, this is the variational principle by Hohenberg and Kohn, and H[n] is the celebrated density functional in its logically most satisfying context.
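The back-and-forth Legendre transformation behind (6.20) and (6.22) can be followed numerically on a toy model. The sketch below assumes one spinless particle on two lattice sites with hopping amplitude `T_HOP` (a hypothetical model, restricted to the one-particle sector). With the potential gauge v = (d/2, −d/2) and a density (n1, 1 − n1), the density functional is obtained as a supremum over d, and the ground state energy is then recovered as a minimum over n1. For this model the supremum can also be evaluated by hand, giving −2t√(n1 n2).

```python
# Hypothetical two-site illustration of F[n] = sup_v {E[v] - (n|v)}, cf. (6.20),
# and of E[v] = min_n {F[n] + (v|n)} on densities with (1|n) = 1, cf. (6.22).
import math

T_HOP = 1.0   # assumed hopping amplitude

def energy(d):
    """Ground state energy for the potential v = (d/2, -d/2)."""
    return -math.hypot(0.5 * d, T_HOP)

def maximize(f, lo, hi, iters=100):
    """Maximum of a concave function on [lo, hi] by ternary search."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return f(0.5 * (lo + hi))

def density_functional(n1, d_max=200.0):
    """sup over d of E - (n|v), with (n|v) = (n1 - n2) d / 2 and n2 = 1 - n1."""
    m = 2.0 * n1 - 1.0
    return maximize(lambda d: energy(d) - 0.5 * m * d, -d_max, d_max)

def energy_from_functional(d):
    """min over n1 in [0, 1] of F[n] + (v|n); convex in n1, so ternary search applies."""
    return -maximize(lambda n1: -(density_functional(n1) + 0.5 * (2.0 * n1 - 1.0) * d), 0.0, 1.0)

print(density_functional(0.8), -2.0 * T_HOP * math.sqrt(0.8 * 0.2))
print(energy_from_functional(1.0), energy(1.0))
```

The truncation of the potential range to |d| ≤ `d_max` is a numerical convenience; it does not affect the densities sampled here.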

As a Legendre transform, H[n] is automatically convex and lower semicontinuous (in the norm and even in the weak topology of the functional space yet to be specified, cf. Section 5.6) and has a non-empty subdifferential ∂H[n] at every n where H[n] is finite. We shall see that this subdifferential is governed by the basic theorem by Hohenberg and Kohn relating v to n.

6.3 The Functionals F, G, and H

The functional dependence of G[v] on v consists of its dependence on v mod (µ) and of the dependence on the constant µ resembling a chemical potential. Given v mod (µ), the dependence of G on µ can easily be analyzed. The situation is illustrated in Fig.11. On the left panel E[v,N] is plotted against N for fixed v. It is piecewise linear between integer N, E[v, 0] = 0, and E[v,N] = +∞ for N < 0. From (N = 0, E = 0) it starts out with (negative) slope µ0. For 0 ≤ N ≤ 1, µ0N − E[v,N] = 0, and µ0N − E[v,N] ≤ 0 elsewhere, hence $G[v - \mu_0] = \sup_N\{\mu_0 N - E[v,N]\} = 0$.


In general, for any slope µ, the ordinate section −G of the tangent of support l(N) = µN − G defines G[v − µ]. (For any slope µ < µ0 this tangent of support passes through the origin, whence G[v − µ] = 0 for µ < µ0.) On the right panel of Fig.11 −G is plotted against µ for the sake of simpler comparison with the left panel. G itself is obviously convex (and piecewise linear), and it is not difficult to realize

$$G[v-\mu] = \mu N(\mu) - \sum_{N=0}^{N(\mu)-1}\mu_N, \qquad \mu_{N(\mu)-1} \le \mu \le \mu_{N(\mu)} \tag{6.23}$$

with

$$\mu_N = E[v,N+1] - E[v,N] = -I_{N+1} = -A_N, \tag{6.24}$$

where IN is the ionization potential of the N-particle state and AN is its particle affinity.
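The formula (6.23) can be compared directly with the defining supremum (6.12). The sketch below assumes a hypothetical, convex list `E` of ground state energies for N = 0, ..., 4 and truncates the particle number at N = 4, so it is only meaningful for µ not exceeding the last slope. Since E[v,N] is piecewise linear between integers, the supremum over real N is attained at an integer.

```python
# Hypothetical illustration of G[v - mu] = sup_N {mu N - E[v, N]} versus formula (6.23).

E = [0.0, -1.0, -1.6, -1.9, -2.0]                    # assumed convex energies E[v, N]
MU = [E[N + 1] - E[N] for N in range(len(E) - 1)]    # slopes mu_N of (6.24)

def G_sup(mu):
    """Direct evaluation of the supremum over integer N."""
    return max(mu * N - E[N] for N in range(len(E)))

def N_of_mu(mu):
    """Integer N(mu) with mu_{N(mu)-1} <= mu <= mu_{N(mu)}; N(mu) = 0 for mu < mu_0."""
    N = 0
    while N < len(MU) and MU[N] < mu:
        N += 1
    return N

def G_formula(mu):
    """Right-hand side of (6.23)."""
    N = N_of_mu(mu)
    return mu * N - sum(MU[:N])

for mu in (-2.0, -0.8, -0.45, -0.2):
    print(mu, N_of_mu(mu), G_sup(mu), G_formula(mu))
```

For µ below the first slope µ0 both expressions give zero, in agreement with the remark on tangents of support passing through the origin.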

Next consider the relation

$$H[n] = \sup_v \inf_N\{E[v,N] - (n|v)\} \le \inf_N \sup_v\{E[v,N] - (n|v)\}$$

contained in (6.19). The question arises: given n, does E[v,N] − (n|v) as a function of v and N have a saddle point? If yes, then the ≤ sign in the above relation can be sharpened into equality (cf. Section 5.8). The answer to the question is yes. Given v (and n), for every potential constant c, E[v − c,N] = E[v,N] − cN. Since E[v,N] is convex in N, for Nn = (n|1) one finds a cn so that $E[v - c_n, N_n] = E[v,N_n] - (n|1)c_n = \min_N\{E[v,N] - Nc_n\} = \min_N E[v - c_n, N]$. (This cn just fixes the potential zero at the required value of the chemical potential for N = Nn: it is given by the slope of a tangent of support to E[v,N] at N = Nn, cf. Fig.11.) Hence, for that v, $\inf_N\{E[v - c_n, N] - (n|v - c_n)\} = E[v - c_n, N_n] - (n|v - c_n) = E[v,N_n] - (n|v)$ and, since v was chosen arbitrarily at the beginning of this consideration, H[n] ≥ E[v,Nn] − (n|v) for every v, since H is the supremum over all v of the latter v-dependent infimum. The important point is that the only other v-dependent quantity, cn, dropped out of the last estimate, and one has $H[n] \ge \sup_v\{E[v,N_n] - (n|v)\}$. Since H[n] is not less than this supremum for N = Nn, it is a fortiori not less than the infimum of that supremum over all N: $H[n] \ge \inf_N \sup_v\{E[v,N] - (n|v)\}$. Thus (6.19) also holds with the reversed inequality sign, and hence the indeed universal density functional is

$$H[n] = \inf_N \sup_v\{E[v,N] - (n|v)\} = \inf_N F[n,N]. \tag{6.25}$$


[Figure 11, graphic not reproduced. Left panel: E plotted against N, with the lines µ0N and µN − G and the abscissa value N(µ) marked. Right panel: −G plotted against µ, with µ0, µ1, µ2 marked.]

Figure 11: General situation with E[v,N] and G[v − µ] for v(x) mod (µ) fixed, as explained in the text. The right panel shows the µ-dependence of −G[v − µ] according to (6.14) and (6.12).

It is the infimum over N of the density functional F[n,N] by Lieb, which is defined by (6.20) and depends independently on n and N, that is, without the connection N = (n|1).

The next question is, what is F in the case N ≠ (n|1) or for n(x) negative on a region of non-zero measure (admitted since n varies in a whole functional space, the dual to the space of all potentials). Using the definition (6.20) of F, it is not difficult to show that F[n,N] = +∞ there. To see this, first consider a density n(x) which is negative on a set Ω ⊂ T³ of non-zero measure (for the sake of simplicity assumed spin diagonal; the general case reduces to it by a local diagonalization). Take a potential v equal to some positive constant c on Ω and equal to zero on T³ \ Ω. For this potential v ≥ 0, obviously E[v,N] ≥ 0 (E[v,N] → 0 for |T³| → ∞) for every c > 0. On the other hand, $-(n|v) = -\int dx\, n v \to +\infty$ for c → +∞ (recall that n and v are independent in the braces of (6.20)), hence,

$$F[n,N] = +\infty \quad \text{if not } n(x) \ge 0 \text{ a.e.} \tag{6.26}$$


Next consider N ≠ (n|1) = Nn. Let this time c denote a constant potential equal to the real number c on the whole torus T³: E[c,N] = E[0,N] + Nc and (n|v) = Nn c. If Nn < N, then the expression in braces of (6.20) tends to +∞ for c → +∞, in the opposite case Nn > N for c → −∞, hence,

$$F[n,N] = +\infty \quad \text{if not } (n|1) = N. \tag{6.27}$$

This is just a generalization of the situation sketched in Fig.10a. A density change δn(x) on the hypersurface $\int dx\, n = N$ in the functional space of densities is characterized by the subspace $\int dx\, \delta n = 0$. Then, for v = c, $\int dx\, \delta n\, c = 0$, i.e., constant potentials v = c are ‘perpendicular’ to that subspace (they form the annihilator of that subspace), and E[v,N] depends linearly on v = c. Hence, ‘perpendicular’ to the hypersurface $\int dx\, n = N$, F[n,N] behaves singularly like f(c) of Fig.10a.

The property (6.27) immediately implies

$$H[n] = \inf_N F[n,N] = F[n,(n|1)], \tag{6.28}$$

and the inverse Legendre transformation to (6.20) (based on the concavity of E[v,N] in v for fixed N) is

$$E[v,N] = \inf_n\{F[n,N] + (v|n)\} = \inf_{n_N}\{F[n_N,N] + (v|n_N)\} \tag{6.29}$$

with (nN|1) = N. Comparison with (6.28) and (6.22) also proves that the inequality of (6.21) is sharpened into the equality (6.22). Because of (6.28), the property (6.26) is retained for H.

The relation between F[n,N] and H[n] is depicted in Fig.12: H[n] is the convex hull of all F[n,N] for all N. As already mentioned, Lieb defined his functional (6.20) with E[v] from (4.4) for integer N instead of the general E[v,N] from (6.9) for real N. We call the former EN[v] here to indicate also its N-dependence, and we call the corresponding functional by Lieb FN[n]. Since generally E[v,N] ≤ EN[v], it follows that F[n,N] ≤ FN[n]. This is also depicted in Fig.12 for an integer value N3. H[n] by nature relates to F[n,N]. One should realize that any explicit dependence of the density functional on the particle number N is a very non-local matter (dependence on $\int dx\, n(x)$) which has never been fully anticipated in approximations to the density functional (except for the kinetic energy part).

[Figure 12, graphic not reproduced. Curves F[n,N1], F[n,N2], FN3[n], F[n,N4] and their hull H, plotted against ∫dx n(x), with N1, N2, N3, N4 marked on the abscissa.]

Figure 12: H[n] and F[n,Ni] for several Ni on a path in n-space, ‘perpendicular to ∫dx n(x) = const.’ The functional F[n,Ni] is equal to +∞ except for ∫dx n(x) = Ni. H is the convex hull. At N3, the function FN3[n] for a case FN3[n] > F[n,N3] is sketched.

It remains to specify the functional spaces for n and v which should be topologically dual to each other (so that the scalar product (n|v) is always finite). In order to define the functional space X for admissible density functions, following Lieb, we first require that the kinetic energy be finite and investigate the consequences on the density of a pure state (ensemble states do not make any difference in this context)

$$n(r) = N\sum_s \int dx_2 \ldots dx_N\; \Psi^*(rs, x_2 \ldots x_N)\,\Psi(rs, x_2 \ldots x_N) \overset{\text{def}}{=} N\langle\Psi|\Psi\rangle'(r). \tag{6.30}$$

For every fixed r, this expression has all properties of a scalar product, denoted by ⟨·|·⟩′. The gradient of the density now reads ∇n(r) = N⟨∇Ψ|Ψ⟩′ + N⟨Ψ|∇Ψ⟩′ = 2N Re ⟨∇Ψ|Ψ⟩′, and its square is

$$[\nabla n]^2 \le 4N^2|\langle\nabla\Psi|\Psi\rangle'|^2 \le 4N^2\langle\Psi|\Psi\rangle'\langle\nabla\Psi|\nabla\Psi\rangle' = 4Nn(r)\langle\nabla\Psi|\nabla\Psi\rangle'. \tag{6.31}$$

The first inequality just estimates the square of the real part by the square of the absolute value, and the second inequality is an application


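of Schwarz’ inequality (5.30). The kinetic energy, on the other hand, is $\langle\hat T\rangle = (N/2)\int d^3r\,\langle\nabla\Psi|\nabla\Psi\rangle'$, and hence, with $\nabla n^{1/2} = \nabla n / 2n^{1/2}$,

$$\frac{1}{2}\int d^3r\,[\nabla n^{1/2}]^2 = \int d^3r\,\frac{[\nabla n]^2}{8n} \le \langle\hat T\rangle. \tag{6.32}$$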

Therefore, $\langle\hat T\rangle < \infty$ implies $\nabla n^{1/2} \in L^2$. (Of course, that does not yet guarantee the converse: that $\langle\hat T\rangle$ is finite for $\nabla n^{1/2} \in L^2$.)
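The estimate (6.32) can be watched at work in a deliberately simplified setting. The sketch below assumes a single particle on a one-dimensional ring instead of N particles on T³, a hypothetical density n(x) = (1 + a cos x)/2π, and wavefunctions ψ = √n e^{ikx} with integer k; derivatives are approximated by periodic central differences. The left-hand side of (6.32) depends on the density only, whereas the kinetic energy grows by about k²/2 when the phase factor is switched on.

```python
# Hypothetical 1D, one-particle illustration of the bound (6.32):
# (1/2) * integral of (d sqrt(n)/dx)^2 <= <T>, with equality for real wavefunctions.
import cmath
import math

def kinetic_energies(a, m, length=2.0 * math.pi, npts=400):
    """Return (density-only bound, kinetic energy) for psi = sqrt(n) * exp(i k x)."""
    h = length / npts
    k = 2.0 * math.pi * m / length
    xs = [i * h for i in range(npts)]
    sqrt_n = [math.sqrt((1.0 + a * math.cos(2.0 * math.pi * x / length)) / length) for x in xs]
    psi = [s * cmath.exp(1j * k * x) for s, x in zip(sqrt_n, xs)]

    def ddx(f, i):
        # periodic central difference; f[i - 1] wraps around for i = 0
        return (f[(i + 1) % npts] - f[i - 1]) / (2.0 * h)

    bound = 0.5 * h * sum(ddx(sqrt_n, i) ** 2 for i in range(npts))
    kinetic = 0.5 * h * sum(abs(ddx(psi, i)) ** 2 for i in range(npts))
    return bound, kinetic

bound0, kinetic0 = kinetic_energies(a=0.5, m=0)   # real wavefunction
bound1, kinetic1 = kinetic_energies(a=0.5, m=1)   # same density, additional phase
print(bound0, kinetic0)
print(bound1, kinetic1, kinetic1 - bound1)
```

Both wavefunctions have the same density, which shows that a finite value of the density-only expression does not by itself control the kinetic energy of a given state.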

There is a general inequality by Sobolev (see [Lieb, 1983]), in three dimensions estimating the L²-norm of ∇f by the L⁶-norm of f:

$$\|\nabla f\|_2^2 \ge 3\left(\frac{\pi}{2}\right)^{4/3}\|f\|_6^2. \tag{6.33}$$

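Applying this to $f = n^{1/2}$ and considering $\|n^{1/2}\|_6^2 = \|n\|_3$ and (6.32) yields

$$\frac{3}{2}\left(\frac{\pi}{2}\right)^{4/3}\|n\|_3 \le \langle\hat T\rangle < \infty \;\Longrightarrow\; n \in L^3. \tag{6.34}$$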

Moreover, of course $N = \int d^3r\, n \Rightarrow n \in L^1$.

Therefore, Lieb considered

$$X = L^3(\mathbb{R}^3) \cap L^1(\mathbb{R}^3), \qquad X^* = L^{3/2}(\mathbb{R}^3) + L^\infty(\mathbb{R}^3). \tag{6.35}$$

In this case, X∗ includes Coulomb potentials as seen in complete analogy to Section 3.1, with use of the decomposition (3.7). Also, by the same argument as in Section 3.1, only with the $L^{5/3}$ replaced by the $L^{3/2}$, the Coulomb pair interaction is finite for n ∈ X. Hence the total Coulomb energy is finite for n ∈ X. If now X comprises all ground states of Coulomb systems, then it follows from the virial theorem (2.81) for Coulomb systems that E[v,N] is finite on X∗ in that case. Using the concavity of E[v,N] in v and the estimate of $\langle\hat T\rangle$ from (6.34), it has been shown [Lieb, 1983] for any interaction w that E[v,N] (with N ≥ 0) is finite and hence continuous on X∗, if only E[v0] is finite for some v0 ∈ X∗.

From (4.39, 4.41), obviously

$$X \supset J_N \supset A_N. \tag{6.36}$$

In the spin-dependent case, each matrix element of $n_{ss'}(r)$ and $v_{ss'}(r)$ must be in X and X∗, respectively.

Unfortunately, although X∗∗ ⊃ X, since L¹(R³) is not reflexive, X∗∗ ≠ X from (6.35), and X and X∗ would not be mutually dual. This is not


a problem any more when a torus T³ of finite volume |T³| is considered instead of the R³. In this case,

$$X = L^3(T^3), \qquad X^* = L^{3/2}(T^3), \tag{6.37}$$

and we need not further require n ∈ L¹, since we know from (5.33) that L³(T³) is a subspace of L¹(T³). (We need not further be concerned with the behavior of functions for r → ∞.) In this case, X and X∗ are reflexive, and (6.22, 6.25) hold on grounds of the general theory. One further advantage is that this approach also covers the cases of extended systems.

Of course, replacing the r-space R³ by the torus T³ of finite volume with periodic boundary conditions also modifies the sets $A_N$, $A^0_N$ and $J_N$ (as well as $V_N$). Here we consider quantum mechanics on a torus T³, hence X and X∗ denote the reflexive Banach spaces (6.37).

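For $v \in L^{3/2}(T^3) \subseteq V_N$ we have

$$\begin{aligned}
E_N[v] &= \min_{n\in A_N}\{F_{\mathrm{HK}}[n] + (n|v)\} \\
&= \min_{n\in J_N}\{F_{\mathrm{LL}}[n] + (n|v)\} \\
&= \min_{n\in J_N}\{F_{\mathrm{DM}}[n] + (n|v)\} \\
&= \min_{n\in J_N}\{F_N[n] + (n|v)\},
\end{aligned} \tag{6.38}$$

$$\begin{aligned}
E[v,N] &= \inf_{n\in X}\{F[n,N] + (n|v)\} \\
&= \inf_{n\in X}\{H[n] + (n|v) \mid (n|1) = N\}.
\end{aligned} \tag{6.39}$$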

Since on a torus the N-particle ground state always exists (and therefore $L^{3/2}(T^3) \subseteq V_N$), the minimizing density also exists in (6.38). Yet, for the general ensemble state the existence of a minimizing γ cannot be guaranteed in (6.9), although, as regards physics, it will fail to exist for very exotic v and w only. So far we have therefore retained the inf sign in (6.39). However, even if there is no limiting γ for a sequence γj which leads to the infimum of (6.9), the corresponding sequence of densities nj may still converge, yielding a minimum in (6.39). This consideration is continued in the next section. In view of (6.36), the above relations imply

$$n \in A_N \;\Longrightarrow\; F_{\mathrm{HK}}[n] = F_{\mathrm{LL}}[n] = F_{\mathrm{DM}}[n] = F_N[n] \ge F[n,N] = H[n]. \tag{6.40}$$


For n ∈ JN we found FLL ≥ FDM in Section 4.5. As a conjugate functional, FN is convex, whereas FLL is not. Moreover, one can show that FDM = FN on JN, and both are equal to the convex hull of FLL on JN [Lieb, 1983]. In summary,

$$n \in J_N \;\Longrightarrow\; F_{\mathrm{LL}}[n] \ge F_{\mathrm{DM}}[n] = F_N[n] \ge F[n,N] = H[n]. \tag{6.41}$$

Only FN[n], F[n,N], and H[n] are defined on a whole linear space X.

This completes the rigorous density functional theory for the ground state. The version outlined here is particularly appropriate to shed light on the general character of the Hohenberg-Kohn variational principle: apart from the concavity of E[v,N] in v and its convexity in N (and its simple gauge property), the relations (6.22, 6.25) by themselves do not contain any particular physics. This type of theory gains physical content only via constructive expressions for H[n] (or one of the alternative density functionals).

6.4 The Kohn-Sham Equation

A crucial point for deriving Kohn-Sham equations is the existence of the employed functional derivatives. Moreover, the point of solution of the differential equation (as a point in the functional space) must be an inner point of the range of differentiability of the considered quantities (cf. Fig.3a). Formally, Euler’s equation for the problem (6.22) is

$$-\frac{\delta H}{\delta n} = v - \mu, \tag{6.42}$$

where µ is the Lagrange multiplier of the side condition. As a Legendre transform, H[n] has a non-empty subdifferential for all n for which it is finite and for which the supremum (6.17) is a maximum. This follows immediately from the general property (5.47) of Legendre transforms. It is naturally conjectured that, due to the compactness of the basic space T³, all infima and suprema of this chapter are minima and maxima whenever they are finite.

Indeed, $\|n\|_3 \le (2/3)(2/\pi)^{4/3}\langle\hat T\rangle$ from (6.34), and hence, for a given v ∈ X∗, the search in (6.39) may be restricted to ‖n‖₃ ≤ C for a sufficiently large positive C[v] taken from an upper estimate of the kinetic energy for that case. The ball {n | ‖n‖₃ ≤ C} of radius C is weakly compact in X by the Banach-Alaoglu theorem and by the reflexivity of X, and hence the main theorem for extremal problems, reported at the end of Section 5.5, guarantees the existence of a solution n0(x), $E[v] = H[n_0] + \int dx\, n_0 v$, of the


last variational problem of (6.39). Furthermore, n0 ∈ JN. Apart from exotic cases, there is even n0 ∈ AN by the very definition of AN, and hence the infima in (6.39) might be replaced by minima for every $v \in X^* = L^{3/2}(T^3)$.

Since H is finite on JN of (4.39), it has a non-empty subdifferential for n ∈ JN (even if n is not an inner point of JN in the topology of X; such densities, for instance having nodes so that indefinite density functions exist in every neighborhood of n, have been discussed in the literature). Moreover, a convex function on a normed space has a derivative (G-derivative) if and only if the subdifferential consists of a unique element of the dual space. (For finite dimensions this is easily realized.) Hence,

if v − µ ∈ X∗ is uniquely defined in the theory for some given n ∈ X for which H[n] is finite, then H[n] has there a functional derivative equal to µ − v.

This is the place where the basic theorem by Hohenberg and Kohn (Section 4.1) plays its important role in the theory. (The proof given there goes through with ensemble states of degenerate ground states; certain problems with external magnetic fields as discussed in Sections 4.7 and 4.8 remain.) At least for scalar external potentials, v is uniquely defined by n ∈ JN up to a constant in space, and this constant, µ, is uniquely defined by the particle number N except possibly for integer values of N, where it can make a finite jump (cf. (6.23), (6.24) and Fig.11). At least for the spin-independent theory and for impure spin states,

the functional derivative $\delta H/\delta n \in X^*$ exists and is equal to $\mu - v$ for non-integer $N = (n|1)$. For integer $N$ it may jump by a finite value, constant in $\mathbf r$-space.

This establishes the rigorous basis for the Kohn-Sham theory in the form formally developed in Section 4.5. Define a kinetic energy functional for non-interacting reference systems as

$$K[n] \stackrel{\mathrm{def}}{=} \begin{cases} T_{\mathrm{DM}}[n] & \text{for } n \in X,\ n(x) \ge 0 \\ +\infty & \text{elsewhere} \end{cases} \tag{6.43}$$

(note that by definition of $X$, $T_{\mathrm{DM}}$ of (4.57) is well determined for $n(x) \ge 0$), and define $E_{\mathrm{XC}}$ by

$$H[n] = K[n] + E_{\mathrm H}[n] + E_{\mathrm{XC}}[n], \qquad n \in X = L^3(T^3). \tag{6.44}$$

Now, $E_{\mathrm{XC}}$ can also be defined on the whole $X$: it is defined for $n(x) \ge 0$ by the above relation, and it can be anything elsewhere, where both $H$ and $K$ are $+\infty$. (It is left to the reader's choice to understand the letters $H$ and $K$ as a tribute to Hohenberg and Kohn.) Substituting this representation of $H[n]$ into (6.22) and combining the density variation again with the orbital variation of (4.57) into one single step leads to the Kohn-Sham equation exactly in the way as previously derived at the end of Section 4.5, that is, to

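$$\left(-\frac{\nabla^2}{2} + v_{\mathrm{eff}}\right)\phi_i = \phi_i\,\varepsilon_i, \qquad v_{\mathrm{eff}} \stackrel{\mathrm{def}}{=} v + v_{\mathrm H} + v_{\mathrm{XC}}, \tag{6.45}$$

$$v_{\mathrm{XC},ss'}(\mathbf r) \stackrel{\mathrm{def}}{=} \frac{\delta E_{\mathrm{XC}}[n]}{\delta n_{s's}(\mathbf r)}, \tag{6.46}$$

$$\varepsilon_1 \le \varepsilon_2 \le \cdots, \tag{6.47}$$

$$n_i = 1 \ \text{for}\ \varepsilon_i < \varepsilon_N, \qquad 0 \le n_i \le 1 \ \text{for}\ \varepsilon_i = \varepsilon_N, \qquad n_i = 0 \ \text{for}\ \varepsilon_i > \varepsilon_N, \tag{6.48}$$

$$n_{ss'}(\mathbf r) = \sum_i \phi_i(\mathbf r s)\, n_i\, \phi_i^*(\mathbf r s'), \qquad N = \sum_s \int d^3r\, n_{ss}(\mathbf r), \tag{6.49}$$

$$E[v,N] = \sum_i n_i \varepsilon_i - E_{\mathrm H}[n] - \sum_{ss'} \int d^3r\, n_{ss'}\, v_{\mathrm{XC},s's} + E_{\mathrm{XC}}[n]. \tag{6.50}$$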

Also with respect to these equations no representability problems are left.

It is not difficult to convince oneself that $K[n]$ as defined in (6.43) coincides with $H[n]$ as defined in (6.25) for $w \equiv 0$. Both have jumps of the derivative by finite constant values in $\mathbf r$-space at integer values of $N$. In the interaction-free case $w \equiv 0$ those jumps (from one HOMO to the next as $N$ increases) are exactly modeled in the construction (4.57). It is naturally expected that the heights of those jumps depend on the interaction $w$; hence, from (6.44), $v_{\mathrm{XC}} = \delta E_{\mathrm{XC}}[n]/\delta n$ should also jump by a constant spin matrix, independent of $\mathbf r$, if $N$ runs through an integer [Perdew and Levy, 1983, Sham and Schluter, 1983, Lannoo et al., 1985]. (Pure spin states may lead to spin-dependent jumps.)
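The occupation rule (6.47), (6.48) can be illustrated with a minimal sketch. The function name `aufbau_occupations` and the even distribution of the remaining charge over a degenerate highest level are assumptions made for this illustration only; (6.48) merely demands $0 \le n_i \le 1$ at $\varepsilon_i = \varepsilon_N$.

```python
def aufbau_occupations(levels, n_electrons, tol=1e-10):
    """Occupation numbers n_i following (6.47)-(6.48).

    levels      : Kohn-Sham levels eps_i (any order; they are sorted here)
    n_electrons : particle number N, may be non-integer
    Levels below eps_N get n_i = 1, levels above get n_i = 0. The charge left
    for the (possibly degenerate) level eps_N is spread evenly over it, which
    is an arbitrary choice of this sketch.
    """
    eps = sorted(levels)
    if not 0 <= n_electrons <= len(eps):
        raise ValueError("N must lie between 0 and the number of orbitals")
    occ = [0.0] * len(eps)
    remaining = float(n_electrons)
    i = 0
    while i < len(eps) and remaining > tol:
        # find the end j of the degenerate group starting at index i
        j = i
        while j < len(eps) and abs(eps[j] - eps[i]) <= tol:
            j += 1
        share = min(1.0, remaining / (j - i))
        for k in range(i, j):
            occ[k] = share
        remaining -= share * (j - i)
        i = j
    return eps, occ


if __name__ == "__main__":
    # hypothetical level scheme with a doubly degenerate second level
    print(aufbau_occupations([-1.0, -0.5, -0.5, 0.2], 2))
```

For non-integer $N$ the same routine yields a fractional occupation of the highest level, which is the situation in which the derivative of $H[n]$ is uniquely defined.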

Orbital variation expressions of the general type (4.57) are a convenient tool to model non-local functional dependences on the density $n$. Put

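$$H[n] = K[n] + L[n], \tag{6.51}$$

$$K[n] = \min_{\phi_i, n_i} \left\{ k[\phi_i, n_i] \;\middle|\; \sum_i \phi_i n_i \phi_i^* = n,\ 0 \le n_i \le 1 \right\}, \tag{6.52}$$

$$L[n] = \int d^3r\, n(\mathbf r)\, l\big(n_{ss'}(\mathbf r), \nabla n_{ss'}(\mathbf r), \ldots\big) \tag{6.53}$$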


with a suitably chosen orbital functional $k$ and a suitable function $l(n, \ldots)$ one derives Kohn-Sham type equations

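$$\left(\hat k + v + v_L\right)\phi_i = \phi_i\,\varepsilon_i \tag{6.54}$$

with an orbital operator

$$\hat k: \quad \frac{\delta k}{\delta \phi_i^*} = \hat k\,\phi_i\, n_i, \tag{6.55}$$

and a local Kohn-Sham potential

$$v_L = \frac{\delta L}{\delta n} \tag{6.56}$$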

as well as the aufbau principle (6.47) to (6.49).

The restriction to the torus $T^3$ of finite volume brought it about that Eqs. (6.25, 6.43, 4.57, 6.44–6.50) provide a mathematically well defined closed theory. There is no $N$-representability problem of potentials (unknown set $V_N$) and no $v$-representability problem of densities (unknown set $A_N$) any more, because those unknown sets do not figure any more in the theory, and there is no problem of the existence of functional derivatives with respect to densities left. There are also no restrictions of applicability of the Kohn-Sham equations to be considered.

If one is interested in $N$-particle eigenstates as ground states and in their energies $E_N[v]$ instead of density matrix ground states with energies $E[v,N]$, the former states may not be found and their energies may only be estimated from below, if $E_N[v]$ is not convex as a function of $N$. (Note that the reason for a failure of convexity is always interaction: $E^0_N[v]$ is convex for every potential $v$.) On the other hand, $E[v,N]$ is always concave in $v$ and convex in $N$ by its very definition (4.66) as a thermodynamic quantity (grand canonical energy). For fixed $v$, it is the convex hull of $E_N[v]$.
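The statement that $E[v,N]$ is the convex hull of $E_N[v]$ can be made concrete with a small sketch. The energies used in the example are hypothetical numbers chosen so that $E_N$ fails to be convex at $N = 2$; the function names are likewise assumptions of this illustration.

```python
def lower_convex_hull(points):
    """Lower convex hull of (N, E_N) points by a monotone-chain scan."""
    hull = []
    for p in sorted(points):
        # drop the last hull point while it lies on or above the chord
        # from hull[-2] to the new point p
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def ensemble_energy(energies, n):
    """E[v, N] at real N from a table {N: E_N} of integer-N energies."""
    hull = lower_convex_hull(list(energies.items()))
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x1 <= n <= x2:
            return y1 + (y2 - y1) * (n - x1) / (x2 - x1)
    raise ValueError("N outside the tabulated range")


if __name__ == "__main__":
    e_n = {0: 0.0, 1: -1.0, 2: -1.2, 3: -2.4}   # hypothetical, non-convex at N = 2
    print(ensemble_energy(e_n, 2), "<=", e_n[2])
```

In this example the hull value at $N = 2$ lies below $E_2$, which is the sense in which the eigenstate energy is only estimated from below.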


7 Approximative Variants

The density functional theory presented in Chapters 4 and 6 is a rigorous theory on the safe basis of many-body quantum theory. It is, however, not explicit, so that practical computations cannot directly be based upon it. On the other hand, Thomas-Fermi theory, although originally very naïve, could be understood as a crude approximation to density functional theory in the above sense, with the advantage of being explicit. The use of the Kohn-Sham trick in handling the kinetic energy and treating the exchange and correlation energy functional in the spirit of Thomas and Fermi leads to the explicit local density approximation, which is the basis or at least the starting point of nearly all explicit approximative variants of density functional theory in use. The aim of this last chapter of Part I is to link the material presented in the previous ones to practical applications.

So far, and very likely also in the future, we do not have systematic access to the rigorously defined density functional $H[n]$ the theory is based upon. Hence we have to model it and to probe the models by comparison to phenomenology. This situation is not different in principle from other many-particle approaches where either models of sufficiently simple Hamiltonians are used (in quantum field theory) or the wave function is modeled (for instance in Hartree-Fock or Gutzwiller approaches). It is important to realize that all numerical density functional results provided so far are model results, although many of them, like for instance the local density approximation, do not contain any adjustable parameter, whence they are often termed ab-initio.

The basis of the local density approximation for the exchange and correlation energy functional is the theory of the homogeneous electron liquid. This is a most important model system, which of course does not exist in nature, but which can nowadays theoretically be treated with extremely high precision. The results and parameterizations of that theory are reviewed in the first section. Subsequently, the local density approximation is introduced, and an explanation is given of how and why it works, and in which respect it does not work.

All practical density functional approaches from Thomas-Fermi up to the most sophisticated current ones may be classified with respect to the manner of splitting the basic density functional $H[n]$ according to (6.44) or more generally according to (6.51)–(6.53) into a non-local part expressed through an orbital variation expression and a part subject to some local density approximation. Except for the Hartree term, the non-local part is usually only implicitly given by an orbital variation expression. This classification is carried through in Section 7.3 of the present chapter.

As two rather successful examples related to that classification, in the last two sections of this chapter we consider the self-interaction correction to the local density approximation and the LDA+U approach. For other approaches beyond the original local density approximation, particularly those based on gradient expansions, we refer the reader to the existing literature. (See, for example, [Dreizler and Gross, 1990].)

7.1 The Homogeneous Electron Liquid

The homogeneous electron liquid is an important reference model system, which, however, does not exist in nature. Nevertheless, due to sophisticated theoretical approaches as well as computer simulations, many of its properties are known to an incomparably high precision. In the literature it is sometimes called the homogeneous electron gas; however, we reserve this name for the interaction-free case (1.41–1.45).

The homogeneous electron liquid is a variant of the Coulomb system (2.69), only with the point charges of the nuclei replaced by a homogeneous, i.e. constant in space, positive charge density

$$n_+ = \left[\frac{4\pi r_s^3}{3}\right]^{-1}, \tag{7.1}$$

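where, as in (3.30), the density parameter $r_s$ is the radius of a sphere containing one charge quantum. Hence its Hamiltonian is (formally)

$$\hat H_{\mathrm{hom}} = -\frac12 \sum_{i=1}^{N} \nabla_i^2 - \sum_{i=1}^{N} \int d^3r\, \frac{n_+}{|\mathbf r_i - \mathbf r|} + \frac12 \sum_{i \ne j}^{N} \frac{1}{|\mathbf r_i - \mathbf r_j|} + \frac12 \int d^3r\, d^3r'\, \frac{n_+^2}{|\mathbf r - \mathbf r'|}. \tag{7.2}$$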

This Hamiltonian is formal because the integrals figuring in it diverge. As in Section 2.7, it has to be replaced by

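$$\hat H_{\alpha L,\mathrm{hom}} = -\frac12 \sum_{i=1}^{N} \nabla_i^2 - \sum_{i=1}^{N} \int_{T^3} d^3r\, n_+\, v_{\alpha L}(\mathbf r_i - \mathbf r) + \frac12 \sum_{i \ne j}^{N} v_{\alpha L}(\mathbf r_i - \mathbf r_j) + \frac12 \int_{T^3} d^3r\, d^3r'\, n_+^2\, v_{\alpha L}(\mathbf r - \mathbf r') \tag{7.3}$$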

with

$$N = n_+ |T^3| = \left[\frac{4\pi r_s^3}{3}\right]^{-1} L^3. \tag{7.4}$$

Afterwards,

$$\varepsilon(r_s) = (E/N)(r_s) = \lim_{L \to \infty} \frac{1}{N} \lim_{\alpha \downarrow 0} E_{\alpha L}(r_s) \tag{7.5}$$

is to be considered as the ground state energy per electron in dependence on the density parameter $r_s$.

The external potential $v(\mathbf r_i) = -\int_{T^3} d^3r\, n_+\, v_{\alpha L}(\mathbf r_i - \mathbf r)$, produced by $n_+$ and contained in the second term of (7.3), is the Yukawa interaction energy of the electron at $\mathbf r_i$ with a homogeneous charge distribution on the torus, visualized as a periodically repeated cube, and hence is independent of $\mathbf r_i$. It is a potential constant, and $N$ times this constant is exactly canceled by the sum of the spatial average over the electron-electron interaction (third term of (7.3) averaged over all positions $\mathbf r_i$ and $\mathbf r_j$) and the likewise constant last item of (7.3). This cancellation prevails in the $\alpha$-limit (where the individual terms tend to infinity), and hence the Hamiltonian in this limit, but still at finite $L$, may be given in momentum representation (1.34) as

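$$\langle q_1 \ldots q_N | \hat H | q'_1 \ldots q'_N \rangle = \frac12 \sum_i k_i^2 \prod_j \delta_{q_j q'_j} + \frac12 \sum_{i \ne j} \frac{\delta_{\mathbf k_{\bar\imath} + \mathbf k_{\bar\jmath},\, \mathbf k'_i + \mathbf k'_j}}{L^3} \left[ \delta_{s_{\bar\imath} s'_i}\, \delta_{s_{\bar\jmath} s'_j}\, w_{|\mathbf k_{\bar\imath} - \mathbf k'_i|} - \delta_{s_{\bar\imath} s'_j}\, \delta_{s_{\bar\jmath} s'_i}\, w_{|\mathbf k_{\bar\imath} - \mathbf k'_j|} \right] (-1)^P \prod_{l (\ne i,j)} \delta_{q_l q'_l} \tag{7.6}$$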

with

$$w_{|\mathbf k| = 0} = 0, \qquad w_{|\mathbf k| \ne 0} = \frac{4\pi}{L^3 |\mathbf k|^2}, \tag{7.7}$$


and the k-vectors given by (1.29).

Note that, deviating from our general form (4.1), the Hamiltonian (7.3) (and likewise (2.86)) contains a constant term, which is finite for non-zero $\alpha$ only. This term together with a constant external potential is just combined with the long-range part of the electron-electron interaction to modify the latter in the way given in (7.7). After this little trick, $\alpha$ may be let go to zero, and with (7.6) we are back in our frame of theory, with a Coulomb potential screened by the strict charge neutrality of the torus $T^3$. For an $L$ large compared to the diameter of an exchange and correlation hole, this latter screening concerns only the Hartree energy and leaves the exchange and correlation energy unaltered. This is the general way density functional theory applies to extended Coulomb systems. The $L$-limit then is always understood to be taken at the very end, because otherwise all density functionals would be infinite and moreover the constancy constraint for the particle number could not be formulated any more.

Spin-polarized ground states may be considered at any value of the density parameter $r_s$, if necessary by adding a spin-polarizing spatially constant potential term to (7.6).

The simplest case is of course again the interaction-free homogeneous electron gas. Its ground state density is constant and equal to $n_+$ at every $r_s$-value. We relate the electron density $n$, its spin components $n(+)$ and $n(-)$, and the degree of spin polarization $\zeta$ of a collinear spin situation (spin-density matrix diagonal with respect to a global $z$-direction, cf. (2.33–2.37)) according to

$$n = n(+) + n(-), \qquad \zeta = \frac{n(+) - n(-)}{n}, \tag{7.8}$$

$$n(+) = \frac{n}{2}(1 + \zeta), \qquad n(-) = \frac{n}{2}(1 - \zeta) \tag{7.9}$$

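and have for the kinetic energy in Hartree units (cf. (1.45), (1.43) and (7.1))

$$t(n, \zeta) = \frac{T(n, \zeta)}{N} = \frac{3}{10} \left(\frac{9\pi}{4}\right)^{2/3} \frac{(1 + \zeta)^{5/3} + (1 - \zeta)^{5/3}}{2\, r_s(n)^2} = \frac{1.1049}{r_s(n)^2}\, \frac12 \left[ (1 + \zeta)^{5/3} + (1 - \zeta)^{5/3} \right]. \tag{7.10}$$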

The minimum is at $\zeta = 0$ for every $r_s$, that is, without an external magnetic field the homogeneous electron gas is not spin-polarized.
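Relations (7.1) and (7.10) are easily evaluated numerically. The following minimal sketch (function names are assumptions of this illustration; atomic units throughout) converts a density into $r_s$ and evaluates $t(n, \zeta)$, so that both the prefactor 1.1049 and the position of the minimum at $\zeta = 0$ can be checked.

```python
import math


def r_s_from_density(n):
    """Density parameter r_s from (7.1), n = [4*pi*r_s**3 / 3]**(-1)."""
    return (3.0 / (4.0 * math.pi * n)) ** (1.0 / 3.0)


def kinetic_energy_per_electron(r_s, zeta=0.0):
    """t(n, zeta) of (7.10) in Hartree for the interaction-free electron gas."""
    prefactor = 0.3 * (9.0 * math.pi / 4.0) ** (2.0 / 3.0)   # approx. 1.1049
    spin = 0.5 * ((1.0 + zeta) ** (5.0 / 3.0) + (1.0 - zeta) ** (5.0 / 3.0))
    return prefactor * spin / r_s ** 2


if __name__ == "__main__":
    r_s = r_s_from_density(3.0 / (4.0 * math.pi))   # density with r_s = 1
    for zeta in (0.0, 0.5, 1.0):
        print(zeta, kinetic_energy_per_electron(r_s, zeta))
```

Full polarization raises the kinetic energy per electron by the factor $2^{2/3}$ relative to the paramagnetic case, as follows directly from (7.10).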


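In lowest order many-body perturbation theory the exchange energy of the homogeneous electron liquid is obtained as

$$\varepsilon_{X,1}(n, \zeta) = -\frac{3}{4\pi} \left(\frac{9\pi}{4}\right)^{1/3} \frac{(1 + \zeta)^{4/3} + (1 - \zeta)^{4/3}}{2\, r_s(n)} = -\frac{0.4582}{r_s(n)}\, \frac12 \left[ (1 + \zeta)^{4/3} + (1 - \zeta)^{4/3} \right]. \tag{7.11}$$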

This contribution dominates in the polarization dependence; therefore, following a suggestion of von Barth and Hedin [von Barth and Hedin, 1972], the polarization dependence of the exchange and correlation energy $E_{\mathrm{XC}}$ of (2.65) of the homogeneous electron liquid is generally interpolated between the paramagnetic ($\zeta = 0$) and the saturated ferromagnetic ($\zeta = 1$) cases according to

εXC(n, ζ) = εXC(n, 0) + [εXC(n, 1)− εXC(n, 0)]f(ζ), (7.12)

where

$$f(\zeta) = \frac{(1 + \zeta)^{4/3} + (1 - \zeta)^{4/3} - 2}{2^{4/3} - 2} \tag{7.13}$$

is the von Barth-Hedin interpolation function. From (7.8), $-1 \le \zeta \le 1$. The interpolation function $f(\zeta)$ is shown in Fig. 13; it is symmetric in $\zeta$ and so is $\varepsilon_{\mathrm{XC}}$:

f(−ζ) = f(ζ), εXC(n,−ζ) = εXC(n, ζ). (7.14)

The exchange and correlation energy does of course not depend on whether the spin polarization is up or down.
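A short sketch of the interpolation (7.12), (7.13) follows; the function names are assumptions of this illustration. For the exchange energy (7.11) alone the interpolation is not an approximation but an identity, which gives a convenient check.

```python
import math


def von_barth_hedin_f(zeta):
    """Interpolation function f(zeta) of (7.13), valid for -1 <= zeta <= 1."""
    num = (1.0 + zeta) ** (4.0 / 3.0) + (1.0 - zeta) ** (4.0 / 3.0) - 2.0
    return num / (2.0 ** (4.0 / 3.0) - 2.0)


def exchange_energy(r_s, zeta):
    """Exchange energy per electron (7.11) in Hartree."""
    prefactor = 3.0 / (4.0 * math.pi) * (9.0 * math.pi / 4.0) ** (1.0 / 3.0)
    spin = 0.5 * ((1.0 + zeta) ** (4.0 / 3.0) + (1.0 - zeta) ** (4.0 / 3.0))
    return -prefactor * spin / r_s


def interpolate(eps_para, eps_ferro, zeta):
    """Spin interpolation (7.12) between the zeta = 0 and zeta = 1 values."""
    return eps_para + (eps_ferro - eps_para) * von_barth_hedin_f(zeta)


if __name__ == "__main__":
    r_s, zeta = 2.0, 0.6
    direct = exchange_energy(r_s, zeta)
    via_f = interpolate(exchange_energy(r_s, 0.0), exchange_energy(r_s, 1.0), zeta)
    print(direct, via_f)
```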

The functions $\varepsilon_{\mathrm{XC}}(n, 0)$ and $\varepsilon_{\mathrm{XC}}(n, 1)$ for the homogeneous ground state are obtained from numerical total energy calculations by taking the difference of the total energy and the kinetic energy of the interaction-free case (7.10) at the same values $n$ and $\zeta$. The presently most accurate results distinguish two regions of $r_s$: For small values of $r_s$ (high density), many-body perturbation theory with partial summations of diagrams yields in the paramagnetic case

$$\varepsilon_{\mathrm{XC}}(n, 0) = -\frac{0.4582}{r_s} + 0.03109 \ln r_s + B + C r_s \ln r_s + D r_s + \cdots \tag{7.15}$$

Figure 13: Spin polarization interpolation function $f(\zeta)$ of (7.13). [Plot of $f(\zeta)$ versus $\zeta$ for $-1 \le \zeta \le 1$; vertical axis from 0 to 1.]

The first term was already given in (7.11), the second term, whose coefficient is $(1 - \ln 2)/\pi^2$, was first obtained in [Macke, 1950], and numerical values of the next terms may be found in [Carr and Maradudin, 1964]. The logarithmic terms arise from partial summations of diagrams diverging due to the long-range part of the Coulomb interaction. The $\zeta$-dependence of the so-called random-phase contributions dominating in (7.15) is given by the scaling relation [Hedin, 1965, Misawa, 1965]

$$\varepsilon^{\mathrm{RPA}}_{\mathrm{XC}}(n, 1) = \frac12\, \varepsilon^{\mathrm{RPA}}_{\mathrm{XC}}(2^4 n, 0), \tag{7.16}$$

which is consistent with (7.11). For the total energy of the homogeneous electron liquid at intermediate and large values of $r_s$, numerical quantum Monte Carlo results exist [Ceperly and Alder, 1980].
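The consistency of the scaling relation (7.16) with the exchange energy (7.11) can be verified in a few lines. By (7.1), multiplying the density by $2^4$ divides $r_s$ by $2^{4/3}$; the rounded coefficient 0.4582 is the one quoted in (7.11).

```latex
\begin{align*}
n \to 2^4 n \quad &\Longrightarrow \quad r_s \to 2^{-4/3}\, r_s , \\
\tfrac12\, \varepsilon_{X,1}(2^4 n, 0)
  &= \tfrac12 \left( -\frac{0.4582}{2^{-4/3}\, r_s} \right)
   = -\frac{0.4582}{r_s}\, 2^{1/3} , \\
\varepsilon_{X,1}(n, 1)
  &= -\frac{0.4582}{r_s}\, \tfrac12 \left[ 2^{4/3} + 0 \right]
   = -\frac{0.4582}{r_s}\, 2^{1/3}
   \approx -\frac{0.5773}{r_s} .
\end{align*}
```

Both sides agree, and the resulting coefficient 0.5773 is the one that appears as the leading term of the ferromagnetic parameterization.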

Perdew and Zunger [Perdew and Zunger, 1981] proposed an interpolation of those numerical results with Padé approximants in $r_s^{1/2}$ for $r_s > 1$, and recommended using the high-density many-body perturbation expansion for $0 < r_s \le 1$ with coefficients $C$ and $D$ of (7.15) fixed in such a way that $\varepsilon_{\mathrm{XC}}(n, \zeta)$ and its first derivative with respect to $n$ are continuous at $r_s = 1$. The result is

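$$\varepsilon_{\mathrm{XC}}(n, 0) = -\frac{0.4582}{r_s} + 0.03109 \ln r_s - 0.0480 + 0.0020\, r_s \ln r_s - 0.0116\, r_s, \tag{7.17}$$

$$\varepsilon_{\mathrm{XC}}(n, 1) = -\frac{0.5773}{r_s} + 0.01555 \ln r_s - 0.0269 + 0.0007\, r_s \ln r_s - 0.0048\, r_s \tag{7.18}$$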


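for $0 < r_s \le 1$, and

$$\varepsilon_{\mathrm{XC}}(n, 0) = -\frac{0.4582}{r_s} - \frac{0.1423}{1 + 1.0529 \sqrt{r_s} + 0.3334\, r_s}, \tag{7.19}$$

$$\varepsilon_{\mathrm{XC}}(n, 1) = -\frac{0.5773}{r_s} - \frac{0.0843}{1 + 1.3981 \sqrt{r_s} + 0.2611\, r_s} \tag{7.20}$$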

for $1 < r_s$. There are alternative parameterizations essentially reproducing the same results.⁷
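The parameterization (7.17)–(7.20) together with the interpolation (7.12), (7.13) can be coded directly. The sketch below uses the rounded coefficients exactly as printed above; the function names are assumptions of this illustration, and energies are in Hartree.

```python
import math


def eps_xc_para(r_s):
    """eps_XC(n, 0): (7.17) for r_s <= 1 and (7.19) for r_s > 1."""
    if r_s <= 1.0:
        ln = math.log(r_s)
        return (-0.4582 / r_s + 0.03109 * ln - 0.0480
                + 0.0020 * r_s * ln - 0.0116 * r_s)
    return -0.4582 / r_s - 0.1423 / (1.0 + 1.0529 * math.sqrt(r_s) + 0.3334 * r_s)


def eps_xc_ferro(r_s):
    """eps_XC(n, 1): (7.18) for r_s <= 1 and (7.20) for r_s > 1."""
    if r_s <= 1.0:
        ln = math.log(r_s)
        return (-0.5773 / r_s + 0.01555 * ln - 0.0269
                + 0.0007 * r_s * ln - 0.0048 * r_s)
    return -0.5773 / r_s - 0.0843 / (1.0 + 1.3981 * math.sqrt(r_s) + 0.2611 * r_s)


def eps_xc(r_s, zeta):
    """Spin-interpolated eps_XC(n, zeta) according to (7.12) and (7.13)."""
    f = ((1.0 + zeta) ** (4.0 / 3.0) + (1.0 - zeta) ** (4.0 / 3.0) - 2.0) \
        / (2.0 ** (4.0 / 3.0) - 2.0)
    para = eps_xc_para(r_s)
    return para + (eps_xc_ferro(r_s) - para) * f


if __name__ == "__main__":
    # the two branches should match at r_s = 1 up to the rounding of the coefficients
    for fn in (eps_xc_para, eps_xc_ferro):
        print(fn.__name__, fn(1.0), fn(1.0 + 1e-9))
```

Because the printed coefficients are rounded to four digits, the two branches agree at $r_s = 1$ only to about that accuracy.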

7.2 The Local Density Approximation

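The spin density of an inhomogeneous system in a collinear situation

$$n(x) = \begin{pmatrix} n(\mathbf r, +) \\ n(\mathbf r, -) \end{pmatrix} = n(\mathbf r) \begin{pmatrix} (1 + \zeta(\mathbf r))/2 \\ (1 - \zeta(\mathbf r))/2 \end{pmatrix} \tag{7.21}$$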

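may be expressed by the two functions $n(\mathbf r)$ and $\zeta(\mathbf r)$. Since $n$ is the particle number per unit volume, and $\varepsilon_{\mathrm{XC}}(n, \zeta)$ is the exchange and correlation energy per particle of the homogeneous electron liquid, $n \varepsilon_{\mathrm{XC}}(n, \zeta)$ is the energy per unit volume. Hence, in the spirit of the Thomas-Fermi theory,⁸

$$E_{\mathrm{XC}}[n] \approx E^{\mathrm{LDA}}_{\mathrm{XC}}[n] \stackrel{\mathrm{def}}{=} \int d^3r\, n(\mathbf r)\, \varepsilon_{\mathrm{XC}}(n(\mathbf r), \zeta(\mathbf r)) \tag{7.22}$$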

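may be taken as an approximation for the exchange and correlation energy functional of an inhomogeneous system, defined by (6.44). The Kohn-Sham exchange and correlation potential (6.46) is then approximated by

$$\begin{aligned} v_{\mathrm{XC}}(\mathbf r, \pm) &\approx \frac{\partial}{\partial n(\mathbf r, \pm)}\, n(\mathbf r)\, \varepsilon_{\mathrm{XC}}(n(\mathbf r), \zeta(\mathbf r)) \\ &= \varepsilon_{\mathrm{XC}}(n(\mathbf r), \zeta(\mathbf r)) + n(\mathbf r) \frac{\partial \varepsilon_{\mathrm{XC}}}{\partial n}(n(\mathbf r), \zeta(\mathbf r)) \pm (1 \mp \zeta(\mathbf r)) \frac{\partial \varepsilon_{\mathrm{XC}}}{\partial \zeta}(n(\mathbf r), \zeta(\mathbf r)) \\ &\stackrel{\mathrm{def}}{=} v^{\mathrm{LDA}}_{\mathrm{XC}\pm}(n(\mathbf r), \zeta(\mathbf r)). \end{aligned} \tag{7.23}$$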

⁷ The presently most precise fit is that of J. P. Perdew and Y. Wang, Phys. Rev. B 45, 13244–13249 (1992).

⁸ For gradient corrections see J. P. Perdew, K. Burke and M. Ernzerhof, Phys. Rev. Lett. 77, 3865–3868 (1996); Errata ibid. 78, 1396 (1997).


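Here, we used the rule

$$\frac{\partial}{\partial n(\pm)} = \frac{\partial n}{\partial n(\pm)} \frac{\partial}{\partial n} + \frac{\partial \zeta}{\partial n(\pm)} \frac{\partial}{\partial \zeta} = \frac{\partial}{\partial n} \pm \frac{(1 \mp \zeta)}{n} \frac{\partial}{\partial \zeta}, \tag{7.24}$$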

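which immediately follows from (7.8).

With use of the expressions (7.12, 7.17–7.20), $v^{\mathrm{LDA}}_{\mathrm{XC}\pm}$ is continuous for all $n > 0$ and all $\zeta$, and

$$v^{\mathrm{LDA}}_{\mathrm{XC}\pm}(n(\mathbf r), \zeta(\mathbf r)) = v^{\mathrm{LDA}}_{\mathrm{XC}\mp}(n(\mathbf r), -\zeta(\mathbf r)) \tag{7.25}$$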

as it should be. In a non-collinear spin situation these formulas may be used after a local spin rotation which diagonalizes $n_{ss'}(\mathbf r)$ so that the local $z$-axis for the spin points in the direction of $\mathbf e_m$ of (2.37). The potential (7.25) may then be rotated back to global spin variables.
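The prescription (7.23) can be tried out numerically for any given $\varepsilon_{\mathrm{XC}}(n, \zeta)$. The sketch below is an illustration under stated assumptions: it uses central differences for the partial derivatives (a choice made here, not part of the theory), and it is checked against the exchange-only case (7.11), for which (7.23) can be evaluated in closed form as $v_{X\pm} = -(6 n(\pm)/\pi)^{1/3}$. All function names are hypothetical.

```python
import math


def eps_x(n, zeta):
    """Exchange energy per particle (7.11), expressed through n via (7.1)."""
    spin = 0.5 * ((1.0 + zeta) ** (4.0 / 3.0) + (1.0 - zeta) ** (4.0 / 3.0))
    return -0.75 * (3.0 * n / math.pi) ** (1.0 / 3.0) * spin


def v_lda(eps, n, zeta, sign, h=1e-6):
    """Local potential (7.23) for spin sign = +1 or -1.

    eps is any function eps(n, zeta); the derivatives are approximated by
    central differences, so |zeta| must stay below 1 - h.
    """
    d_n = (eps(n * (1.0 + h), zeta) - eps(n * (1.0 - h), zeta)) / (2.0 * n * h)
    d_zeta = (eps(n, zeta + h) - eps(n, zeta - h)) / (2.0 * h)
    # +(1 - zeta) d/dzeta for spin up, -(1 + zeta) d/dzeta for spin down
    return eps(n, zeta) + n * d_n + sign * (1.0 - sign * zeta) * d_zeta


if __name__ == "__main__":
    n, zeta = 0.05, 0.3
    n_up = 0.5 * n * (1.0 + zeta)
    print(v_lda(eps_x, n, zeta, +1), -(6.0 * n_up / math.pi) ** (1.0 / 3.0))
```

The same routine also reproduces the symmetry (7.25) when the sign of $\zeta$ and the spin index are flipped together.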

The main reasons why this scheme works surprisingly well even for strongly inhomogeneous systems such as atoms, molecules and solids can be analyzed in the following manner [Gunnarsson et al., 1979]: According to (2.67), the exchange and correlation energy (as a physical number, not a density functional) of any ground state of a Coulomb system may be expressed via

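$$E_{\mathrm{XC}} = \int d^3r\, n(\mathbf r)\, \epsilon_{\mathrm{XC}}(\mathbf r), \tag{7.26}$$

$$\epsilon_{\mathrm{XC}}(\mathbf r) = \frac12 \int d^3r'\, \frac{h_{\mathrm{KS}}(\mathbf r', \mathbf r)}{|\mathbf r' - \mathbf r|} \tag{7.27}$$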

in terms of the Kohn-Sham exchange and correlation hole $h_{\mathrm{KS}}(\mathbf r', \mathbf r)$ seen by an electron at $\mathbf r$ and distributed over the $\mathbf r'$-space. With (2.68), these are exact relations. Furthermore, since the general sum rule (2.51) holds independent of the coupling constant $\lambda$, from (2.68) it also follows that

$$\int d^3r'\, h_{\mathrm{KS}}(\mathbf r', \mathbf r) = -1. \tag{7.28}$$

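All this is in particular also valid for the homogeneous electron liquid, where $\epsilon_{\mathrm{XC}}(\mathbf r)$ is constant in $\mathbf r$ and equal to $\varepsilon_{\mathrm{XC}}(n, \zeta)$, and $h_{\mathrm{KS}}(\mathbf r', \mathbf r)$ depends isotropically on the vector $\mathbf r' - \mathbf r$: $h_{\mathrm{KS}}(\mathbf r', \mathbf r) = h_{\mathrm{hom}}(|\mathbf r' - \mathbf r|; n, \zeta)$ with $4\pi \int_0^\infty dy\, y^2\, h_{\mathrm{hom}}(y; n, \zeta) = -1$.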

In the general inhomogeneous case we expand the angular dependence on this vector according to

$$h_{\mathrm{KS}}(\mathbf r', \mathbf r) \stackrel{\mathrm{def}}{=} h(\mathbf r' - \mathbf r, \mathbf r) = h(\mathbf y, \mathbf r), \tag{7.29}$$


$$h(\mathbf y, \mathbf r) = \sqrt{4\pi} \sum_L h_L(y, \mathbf r)\, Y_L(\hat{\mathbf y}) \tag{7.30}$$

with spherical harmonics YL and L = (l,m), and have the exact relations

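$$\epsilon_{\mathrm{XC}}(\mathbf r) = \frac12 \int_0^\infty dy\, 4\pi y^2\, \frac{h_0(y, \mathbf r)}{y} = 2\pi \int_0^\infty dy\, y\, h_0(y, \mathbf r), \tag{7.31}$$

$$4\pi \int_0^\infty dy\, y^2\, h_0(y, \mathbf r) = -1, \tag{7.32}$$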

which are not influenced by the anisotropy of $h(\mathbf y, \mathbf r)$ as a function of $\mathbf y$. Because of the isotropy of the Coulomb interaction $1/|\mathbf r' - \mathbf r|$, the exchange and correlation energy depends on the isotropic part of the exchange and correlation hole only.

The local density approximation now means

ǫXC(r) ≈ εXC(n(r), ζ(r)), (7.33)

which is equivalent to putting

$$h_0(y, \mathbf r) \approx h^{\mathrm{LDA}}_0(y, \mathbf r) \stackrel{\mathrm{def}}{=} h_{\mathrm{hom}}(y; n(\mathbf r), \zeta(\mathbf r)). \tag{7.34}$$

The shape of the complete exchange and correlation hole $h(\mathbf y, \mathbf r)$ of an atom or molecule or solid has usually very little in common with $h_{\mathrm{hom}}(y; n(\mathbf r), \zeta(\mathbf r))$; however, its isotropic part $h_0(y, \mathbf r)$ may still be very close to (7.34). This closeness gets further support from the sum rule, and for these reasons the local density approximation works.

One defect, however, which definitely cannot be remedied within the frame of the local density approximation, is connected with the dependence on particle number. As we know, for a given $\mathbf r$, the potential $v_{\mathrm{XC},ss'}(\mathbf r)$ is a functional of the functions $n_{ss'}(\mathbf r')$, or likewise of $n(\mathbf r')$, $\zeta(\mathbf r')$ and the spin polarization direction $\mathbf e_m(\mathbf r')$. Let us write for a moment $v_{\mathrm{XC}}[n, \zeta, \mathbf e_m]$ in order to make this functional dependence explicit. We know further that this potential $v_{\mathrm{XC}}$ may change discontinuously if $n(\mathbf r')$ is changed in such a way that $N = \int d^3r'\, n(\mathbf r')$ moves through an integer. The local density approximation replaces this functional dependence by a local dependence according to (7.23), with possibly an additional spin rotation:

$$v_{\mathrm{XC}}[n, \zeta, \mathbf e_m](\mathbf r) \approx v^{\mathrm{LDA}}_{\mathrm{XC}}(n(\mathbf r), \zeta(\mathbf r), \mathbf e_m(\mathbf r)). \tag{7.35}$$

The numbers $n(\mathbf r)$, $\zeta(\mathbf r)$, $\mathbf e_m(\mathbf r)$ at position $\mathbf r$ do not contain any more the information on what $N$ is, and hence, at a given $\mathbf r$, $v^{\mathrm{LDA}}_{\mathrm{XC}}$ does not depend any more on $N$; it depends on $n_{ss'}(\mathbf r)$ at that considered value of $\mathbf r$ only (and that dependence as given by (7.23) is continuous everywhere).

In order not to overload the analysis, in the rest of this chapter only the collinear spin case with $\mathbf e_m$ pointing in the global $z$-direction is considered. Apart from the additional local spin rotations, non-collinearity introduces no new aspects into the following considerations.

Let $N$ be an integer, and let $n(x)$ be a density with $\int dx\, n(x) = N$. Let further $\delta n(x)$ be an increment so that $\int dx\, \delta n(x) = \delta N \to 0^+$. Then,

vXC[n+ δn] = vXC[n− δn] + ∆XC[n], (7.36)

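and it was shown in Section 6.4 that $\Delta_{\mathrm{XC}}$, if non-zero, is a constant in space (equal to $\Delta(v^H_0 - v^K_0)$ in the notation of Section 6.4), i.e. independent of $\mathbf r$, the argument of $v_{\mathrm{XC}}(\mathbf r)$. Analogously,

$$\left. \frac{\delta T[n]}{\delta n(x)} \right|_{n + \delta n} = \left. \frac{\delta T[n]}{\delta n(x)} \right|_{n - \delta n} + \Delta_T[n] \tag{7.37}$$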

(where $\Delta_T$ is equal to $\Delta v^K_0$ of Section 6.4). This latter discontinuity, however, is automatically implemented in the Kohn-Sham equations via the replacement of $\delta T[n]/\delta n(x)$ by (4.30, 4.20) and is contained in the differences of subsequent Kohn-Sham levels $\varepsilon_i$ of (6.47) computed from the same potential $v_{\mathrm{eff}}$ via (6.45). Since $\Delta_{\mathrm{XC}}$ is a constant, if it is non-zero, it results just in a shift of the Kohn-Sham levels by that value $\Delta_{\mathrm{XC}}$, too, but this shift must be inserted into the Kohn-Sham equations as a potential shift of $v_{\mathrm{XC}}$.

Now, fixing the external potential $v(x)$ and solving hypothetically the exact Kohn-Sham equations for varying $N$ yields ground state densities $n[v, N]$. According to Section 4.6, the first excitation energy of the system, the excitation gap $\Delta$, is $I - A$, and in an exact theory this is strictly expressed by the difference between the lowest unoccupied and the highest occupied Kohn-Sham levels. However, the former is to be calculated for $N + 0$ and the latter for $N - 0$. According to our discussion, the former must be calculated with the potential constant $\Delta_{\mathrm{XC}}$ included (a positive $\Delta_{\mathrm{XC}}$ enhances the $\varepsilon_{\mathrm{HOMO}}$ and hence reduces $A$), so that

∆ = I(N − 0)− A(N + 0) = ∆T [n[v,N ]] + ∆XC[n[v,N ]]. (7.38)

The constant $\Delta_{\mathrm{XC}}$ appears as a gap correction to the LDA excitation gap. The local density approximation, even with the use of (4.71, 4.72) (which only in an exact theory should be equal to (4.74)), introduces two gap errors:

$$\Delta^{\mathrm{LDA}}_{\mathrm{XC}} = 0, \qquad \Delta^{\mathrm{LDA}}_T = \Delta_T[n^{\mathrm{LDA}}]. \tag{7.39}$$


The latter error is caused by the LDA error of the ground state density. It is still under debate which of the errors is larger in which situation.

7.3 Generations of Kohn-Sham Type Equations

The expression (7.22) is the prototype of a local density approximation (6.53), which we, in view of Section 4.7, always understand as the local spin-density approximation.

The most consistent, or, as one could say, zeroth generation of such an approximation would try an expression

$$H[n] \approx H^{(0)}[n] = \int d^3r\, n(\mathbf r)\, h(n(\mathbf r), \zeta(\mathbf r)), \tag{7.40}$$

for the density functional $H$ itself, which, given a suitable function $h$ for the homogeneous case, would replace the usual Kohn-Sham equations by a zeroth generation of equations reading (cf. (6.42) and the text thereafter)

$$\mu - v(\mathbf r, \pm) = \frac{\delta H^{(0)}}{\delta n(\mathbf r, \pm)} = h + n \frac{\partial h}{\partial n} \pm (1 \mp \zeta) \frac{\partial h}{\partial \zeta}. \tag{7.41}$$

This would be a set of two transcendent (i.e. non-linear) equations connecting $n(\mathbf r)$ and $\zeta(\mathbf r)$ locally with the two components $v(\mathbf r, \pm)$ of the external potential. For a Coulomb system, however, such a scheme would completely fail because of the long-range character of the Coulomb interaction. (It could be tried for systems with short-range interactions.)

For a Coulomb system, the simplest non-local term to be singled out of the density functional $H[n]$ is the classical Coulomb energy: the Hartree term containing the self-interaction. (It is written as a non-local density functional here, but it can likewise be written as an orbital expression and hence it fits into the general scheme (6.51)–(6.53); $E_{\mathrm H}[n]$ behaves indifferently between (6.52) and (6.53).) Hence, the first generation of approximation for the density functional would read

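$$\begin{aligned} H[n] \approx H^{(1)}[n] = {} & \frac12 \int d^3r\, d^3r'\, n(\mathbf r')\, w(|\mathbf r' - \mathbf r|)\, n(\mathbf r) \\ & + \int d^3r\, n(\mathbf r) \big( t(n(\mathbf r), \zeta(\mathbf r)) + \varepsilon_{\mathrm{XC}}(n(\mathbf r), \zeta(\mathbf r)) \big). \end{aligned} \tag{7.42}$$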

The remaining LDA integral (second line) contains the kinetic and the exchange and correlation energy functionals. This generation replaces the Kohn-Sham equations by the Thomas-Fermi equations. See (3.3), where the non-polarized case $\zeta = 0$ was considered and the XC-term was neglected, but an exchange-only term was introduced in (3.46). Due to the non-local Hartree term of the density functional, the spin density is now connected with the external potential via an integral equation of the type (3.47):

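$$\frac53\, C_F\, n^{2/3}(\mathbf r)\, (1 \pm \zeta(\mathbf r))^{2/3} = \big[ \mu - v(\mathbf r, \pm) - v_{\mathrm H}(\mathbf r) - v_{\mathrm{XC}}(\mathbf r, \pm) \big]_+, \tag{7.43}$$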

where $n(\mathbf r)$ enters the integral expression (3.13) of the Hartree potential $v_{\mathrm H}(\mathbf r)$. Although this generation accounts for the long-range Coulomb effects, it suffers from many deficiencies due to the neglect of non-localities of the kinetic energy functional. These deficiencies were discussed in Chapter 3.
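For given right-hand sides, (7.43) is solved for $n$ and $\zeta$ point by point. The following sketch shows only this local step; since $v_{\mathrm H}$ and $v_{\mathrm{XC}}$ themselves depend on the density, a real calculation would have to iterate it to self-consistency. The value $C_F = (3/10)(3\pi^2)^{2/3}$ is assumed here (it reproduces the prefactor 1.1049 of (7.10) when combined with (7.1)); the function and variable names are hypothetical.

```python
import math

# Thomas-Fermi constant in Hartree units; assumed value, consistent with (7.10)
C_F = 0.3 * (3.0 * math.pi ** 2) ** (2.0 / 3.0)


def local_spin_densities(w_plus, w_minus):
    """Solve (7.43) at one point.

    w_plus, w_minus : values of mu - v(r, +/-) - v_H(r) - v_XC(r, +/-)
    Returns (n, zeta). Negative arguments are cut off as in [.]_+.
    """
    def n_times_one_pm_zeta(w):
        w = max(w, 0.0)
        return (w / (5.0 / 3.0 * C_F)) ** 1.5

    a = n_times_one_pm_zeta(w_plus)    # n * (1 + zeta)
    b = n_times_one_pm_zeta(w_minus)   # n * (1 - zeta)
    n = 0.5 * (a + b)
    zeta = (a - b) / (a + b) if n > 0.0 else 0.0
    return n, zeta


if __name__ == "__main__":
    n, zeta = 0.02, 0.4
    w_p = 5.0 / 3.0 * C_F * (n * (1.0 + zeta)) ** (2.0 / 3.0)
    w_m = 5.0 / 3.0 * C_F * (n * (1.0 - zeta)) ** (2.0 / 3.0)
    print(local_spin_densities(w_p, w_m))   # should give back (n, zeta)
```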

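The next, the second generation consequently replaces the kinetic energy functional by the non-local implicit variational expression (4.57), but retains the local approximation for the exchange and correlation energy functional:

$$\begin{aligned} H[n] \approx H^{(2)}[n] = {} & T_{\mathrm{DM}}[n] + \frac12 \int d^3r\, d^3r'\, n(\mathbf r')\, w(|\mathbf r' - \mathbf r|)\, n(\mathbf r) \\ & + \int d^3r\, n(\mathbf r)\, \varepsilon_{\mathrm{XC}}(n(\mathbf r), \zeta(\mathbf r)). \end{aligned} \tag{7.44}$$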

Introducing an orbital variational expression like (4.57) as a density functional, as finally pointed out by [Levy, 1982], is the proper way leading to the ordinary Kohn-Sham equations (6.45) having the form of effective single-particle Schrödinger equations. This second generation implies the local density approximation for exchange and correlation as discussed in the previous section.

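As a next step, to avoid self-interaction problems, one could combine the Hartree energy with the exchange energy into an ensemble state Hartree-Fock type functional in accord with (6.52):

$$\begin{aligned} K_{\mathrm{HF}}[n] \stackrel{\mathrm{def}}{=} \min \Bigg\{ & \sum_i n_i \int dx\, \phi_i^* \left( -\frac{\nabla^2}{2} \right) \phi_i + \frac12 \sum_{ij} n_i n_j \int dx\, dx'\, \phi_i^*(x)\, \phi_j^*(x')\, w(|\mathbf r' - \mathbf r|) \\ & \times \big( \phi_j(x') \phi_i(x) - \phi_i(x') \phi_j(x) \big) \;\Bigg|\; \sum_i \phi_i n_i \phi_i^* = n,\ 0 \le n_i \le 1 \Bigg\}. \end{aligned} \tag{7.45}$$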


Of course, this corresponds to the ordinary Hartree-Fock expression only if the density matrix comes from a pure-state single determinant or from an open shell pure state composed by linearly combining a finite number of determinants. The main advantage of this expression is that, contrary to the Hartree functional $E_{\mathrm H}[n]$ (first expression of (7.42)), it is exactly self-interaction free. $K_{\mathrm{HF}}[n]$ as defined above is a density functional. Hence, the difference between $H[n]$ and this expression is again a density functional, which we call a correlation energy functional and for which we introduce an LDA expression by writing

$$H[n] \approx H^{(3)}[n] = K_{\mathrm{HF}}[n] + \int d^3r\, n(\mathbf r)\, \varepsilon_{\mathrm C}(n(\mathbf r), \zeta(\mathbf r)). \tag{7.46}$$

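Given a suitable function $\varepsilon_{\mathrm C}$ for the homogeneous case, the variation analogous to (4.26) and (4.60) would lead to Hartree-Fock like equations

$$\left( -\frac{\nabla^2}{2} + \hat v_{\mathrm{eff}}(x) \right) \phi_i(x) = \phi_i(x)\, \varepsilon_i, \qquad \hat v_{\mathrm{eff}} \stackrel{\mathrm{def}}{=} v + v_{\mathrm H} + \hat v_{\mathrm X} + v_{\mathrm C} \tag{7.47}$$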

with the Hartree potential vH defined as previously, and

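$$(\hat v_{\mathrm X}\, \phi_k)(x) = -\sum_j n_{N;j} \int dx'\, \phi_j^*(x')\, w(|\mathbf r' - \mathbf r|)\, \phi_k(x')\, \phi_j(x), \tag{7.48}$$

$$v_{\mathrm C}(\mathbf r, \pm) = \varepsilon_{\mathrm C} + n \frac{\partial \varepsilon_{\mathrm C}}{\partial n} \pm (1 \mp \zeta) \frac{\partial \varepsilon_{\mathrm C}}{\partial \zeta}. \tag{7.49}$$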

The density itself would be given as in (6.47–6.49).

If $\varepsilon_{\mathrm C}$ is neglected, this generation reduces to the ordinary Hartree-Fock theory, in the same manner as, if $\varepsilon_{\mathrm{XC}}$ is neglected, the first generation (7.42, 7.43) reduces to the ordinary Thomas-Fermi theory. Note, however, that with the correlation term preserved, the right-hand side of (7.46) does not yield the splitting into Hartree-Fock and correlation energies commonly used in connection with the ordinary Hartree-Fock theory in the literature, where the Hartree-Fock part is defined as the total energy of the Hartree-Fock solution and the correlation part as the difference to the exact total energy. This must be kept in mind when comparing numerical results. The point is that the orbitals involved are different in both approaches. Likewise, the first term of (7.45) does not yield the $T$-functional (4.57), since the minimizing orbitals are different. The splitting into energy contributions in density functional theory is always different from the corresponding splitting in other approaches; only the total energy itself may be compared.


Another comment concerns the appearance of an exchange potential operator $\hat v_{\mathrm X}$ in equation (7.47). This potential is non-local (integral operator) but not orbital dependent. (The integral kernel is the same for all orbitals.) Hence there is no problem with choosing the orbital solutions of (7.47) orthonormal to each other. Although the exact theory of Section 6.4 resulted in a local effective potential (cf. (6.45), $v_{\mathrm{eff}} \in X^*$), for the different approach $H[n] = K_{\mathrm{HF}}[n] + E_{\mathrm C}[n]$ with a potentially also exact $E_{\mathrm C}[n]$, the orbital variation (7.45) introduces a non-local potential operator in the same manner as the orbital variation (4.57) earlier introduced a non-local kinetic energy operator.

Of course, there must be a drawback to this third generation of Kohn-Sham type theory, otherwise it would be much more widely used. The main drawback comes from the strong cancellation between exchange and correlation energy contributions at short distances. (Exchange and Coulomb correlations both keep particles apart from each other, hence exchange partly replaces classical correlation.) As can be seen from (7.17) and (7.11), $\varepsilon_{\mathrm{XC}}$ is generally smaller than $\varepsilon_{\mathrm X}$ (for $r_s \approx 5$ by a factor of two), and, less obviously but more importantly, $v_{\mathrm{XC}}[n]$ depends on $n$ in a more local manner than does $v_{\mathrm C}[n]$ (cf. the discussion in [Hedin and Lundqvist, 1971]). This can be understood on the basis of the discussion of the pair correlation function $g(r)$ in Section 2.5, stating that $g(r)$ resulting from exchange and correlation is more short-range than that of exchange or correlation only.

Self-interaction corrected density functionals and the so-called LDA+U approach (the latter if properly introduced) are within this frame in between $H^{(2)}[n]$ and $H^{(3)}[n]$. Another approach related to the content of this section is the use of the so-called optimized effective potential method within density functional theory, cf. [Grabo and Gross, 1995].

7.4 The Self-Interaction Correction

A particular situation where the LDA of Section 7.2 fails is a situation with spatially well separated particles.

Recall that for a ground state density $n(x)$ corresponding to an external potential $v(x)$ the total ground state energy is given by

$$E = T[n] + E_{\mathrm H}[n] + E_{\mathrm{XC}}[n] + \int dx\, n(x)\, v(x). \tag{7.50}$$

In Section 2.5 we already discussed situations where $E_{\mathrm{XC}}$ is of a purely formal nature. Consider a single hydrogen atom and assume the spin of the electron in upward direction. Then

n(r,+) = |φ(r)|2, n(r,−) = 0. (7.51)

The exact relation

$$E_{XC}[n = |\phi|^2, \zeta = 1] = -E_H[|\phi|^2] = -\frac{1}{2} \int d^3r\, d^3r'\, \frac{|\phi(r)|^2\, |\phi(r')|^2}{|r - r'|} \tag{7.52}$$

is obtained from a Kohn-Sham exchange and correlation hole

$$h_{KS}(r', r) = -|\phi(r')|^2, \tag{7.53}$$

which is independent of the particle position $r$. The LDA exchange and correlation hole, on the other hand, is equal to

$$h^{LDA}_{KS}(r', r) = h_{hom}(|r' - r|,\, n = |\phi|^2,\, \zeta = 1), \tag{7.54}$$

which leads to

$$E^{LDA}_{XC}[n = |\phi|^2, \zeta = 1] \neq -E_H[|\phi|^2] \tag{7.55}$$

and causes an error of the LDA of about 5 percent of the hydrogen energy.

If two hydrogen atoms are at sufficiently separated positions $R_1$ and $R_2$ so that their electron clouds do not overlap, then the Hartree energy consists of three contributions: the physically real Coulomb interaction of the electron clouds around atom 1 and 2, respectively, which screens the proton potentials, and the two self-interaction energies of these clouds. The latter again must be canceled by $E_{XC}$, which is not provided by the LDA.
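The size of the self-interaction mismatch for the hydrogen atom can be illustrated numerically. The following is a minimal sketch, not taken from the text: it integrates the 1s density on a radial midpoint grid to obtain the Hartree self-energy $E_H[|\phi|^2]$ and, as an assumption made here for simplicity, only the exchange part of the local spin density approximation (fully polarized, $\zeta = 1$). The correlation part, which would require one of the parameterizations of Section 7.2, is omitted, so the mismatch shown is larger than the LDA error quoted above. The function name and the grid parameters are hypothetical.

```python
import math

def hydrogen_self_interaction(n_points=20000, r_max=30.0):
    """Return (E_H, E_X_LSD) in Hartree for the hydrogen 1s density.

    E_H is the Hartree self-energy of |phi_1s|^2; E_X_LSD is the local
    spin density exchange energy for a fully spin-polarized density.
    Correlation is deliberately left out (assumption of this sketch).
    """
    h = r_max / n_points
    e_hartree = 0.0
    integral_n43 = 0.0
    charge_inside = 0.0
    for k in range(n_points):
        r = (k + 0.5) * h                    # midpoint grid avoids r = 0
        n = math.exp(-2.0 * r) / math.pi     # |phi_1s(r)|^2 in atomic units
        q = 4.0 * math.pi * r * r * n * h    # charge contained in this shell
        # interaction of this shell with all inner shells, plus the
        # self-energy q^2 / (2 r) of a thin spherical shell
        e_hartree += q * charge_inside / r + 0.5 * q * q / r
        charge_inside += q
        integral_n43 += 4.0 * math.pi * r * r * n ** (4.0 / 3.0) * h
    c_x = 0.75 * (3.0 / math.pi) ** (1.0 / 3.0)
    # spin scaling for zeta = 1: E_X[n, zeta=1] = -2^(1/3) C_X int n^(4/3)
    e_x_lsd = -(2.0 ** (1.0 / 3.0)) * c_x * integral_n43
    return e_hartree, e_x_lsd

if __name__ == "__main__":
    e_h, e_x = hydrogen_self_interaction()
    print(f"E_H      = {e_h:.4f} Hartree")
    print(f"E_X(LSD) = {e_x:.4f} Hartree")
    print(f"uncancelled self-interaction (exchange only) = {e_h + e_x:.4f} Hartree")
```

The exact functional would give $E_{XC} = -E_H$, so the last printed number would vanish; within this exchange-only sketch it does not.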

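To remedy this defect at least approximately, one replaces the orbital variation expression (6.52) for the Hartree energy functional by the self-interaction corrected (SIC) functional [Perdew and Zunger, 1981], see also [Svane, 1995]

$$K^{SIC}[n] \stackrel{\mathrm{def}}{=} \min_{\{\phi_i, n_i\}} \Big\{ \sum_i n_i \int dx\, \phi_i^* \Big( -\frac{\nabla^2}{2} \Big) \phi_i + \sum_{i \neq j} \frac{n_i n_j}{2} \int dx\, dx'\, |\phi_i(x)|^2\, w(|r - r'|)\, |\phi_j(x')|^2 - \sum_i n_i\, E^{LDA}_{XC}[n = |\phi_i|^2, \zeta = 1] \;\Big|\; \sum_i \phi_i n_i \phi_i^* = n,\ 0 \le n_i \le 1 \Big\}. \tag{7.56}$$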


The basic density functional $H[n]$ is then approximated by

$$H[n] \approx H^{SIC}[n] \stackrel{\mathrm{def}}{=} K^{SIC}[n] + E^{LDA}_{XC}[n]. \tag{7.57}$$

In the cases considered above this expression just extracts the self-interaction part from the last contribution and puts it together with the self-interaction corrected Hartree term into the orbital variation term $K^{SIC}[n]$. If on the other hand the Kohn-Sham orbitals of an $N$-particle system are extended over a volume $\sim N$, one has $|\phi_i|^2 \sim 1/N$, and, since both $E_H$ and $E_{XC}$ depend on the density in higher than first order, the sum over the $N$ terms in braces of (7.56) tends to zero for $N \to \infty$. For extended states of an extended system the self-interaction correction vanishes.

Use of this density functional in the Hohenberg-Kohn-Sham variation leads to a Kohn-Sham like equation

$$\Big[ -\frac{\nabla^2}{2} + v(x) + v_H(r) + v^{SIC}_{XC,i}(x) \Big] \phi_i(x) = \phi_i(x)\, \varepsilon_i, \tag{7.58}$$

where $v_H$ is the Hartree potential as previously including the self-interaction (but therefore orbital independent). In contrast to the ordinary Kohn-Sham equation (6.45) it contains an orbital-dependent potential term in the exchange and correlation potential,

vSICXC,i(x) = vLDA

XC [n, ζ ](x)−

−nN ;i

(∫

d3r′|φi(r′)|2|r − r′| − v

LDAXC [|φi(x)|2, 1](x)

)

. (7.59)

For a discussion of the consequences of the non-orthogonality of the orbitals resulting from (7.59), which is usually small and was neglected in (7.58), and for further discussion see [Perdew and Zunger, 1981].

The self-interaction correction leads to substantial improvements in many applications. Obviously, (7.57) is to be placed somewhere between the second and the third generation in the context of the last section. This scheme would be consistent if the solutions $\phi_i(x)$ of (7.59) came out uniquely and orthogonal to each other. This self-interaction correction cannot, of course, remedy the defect discussed at the end of Section 7.2, because that defect is present in delocalized situations, where self-interaction corrections disappear. Gap corrections need another essentially nonlocal contribution to $E_{XC}[n]$ to be explicitly introduced, which perhaps might be found along similar lines, such as, for example, a screened exchange term. (Cf. [Seidl et al., 1996].)


7.5 The LDA+U Approach

The models for the density functional considered so far were all free of adjustable parameters. This is an important feature which bears the potential of broad applicability and predictive power. Nevertheless, there are areas of very successful application in the sense of producing results in good agreement with phenomenology (of course provided the numerics is accurate enough for such issues; this is unfortunately a weak point of quite a number of publications even today), and there are cases of failure. To the latter belong so-called correlation insulators or Mott insulators, solids which would be metals in a mean-field treatment of the crystal potential but which develop an excitation gap for particle-hole excitations due to strong local electron-electron correlation. This case, in particular in the strong correlation limit, is treated with some success by the LDA+U approach, which seemingly has more arbitrariness in modeling and is at least up to now often not free of adjustable parameters. Yet it fits perfectly into the scheme considered above, and hence it is briefly presented here. For recent surveys see [Anisimov et al., 1997, Eschrig et al., 2003a]. As was stated earlier, we are considering the spin density theory only in this chapter. What we call LDA here is in the literature often called LSDA to make the difference to the spin-independent case explicit.

The first step is to specify local strongly correlated $l_c$-shells (typically d or f-shells of orbitals $|Rm\sigma)$ centered at lattice sites $R$ and chosen as angular momentum eigenstates with azimuth quantum numbers $m = -l_c, \ldots, l_c$ and spin quantum number $\sigma = +, -$ with respect to a chosen and possibly site dependent spin quantization axis, while the independent choice of the orbital quantization axis is irrelevant as the so-called rotational invariant treatment of LDA+U will be presented here). It is important to distinguish these correlated model orbitals from the Kohn-Sham orbitals. At center $R$, the screened on-site Coulomb matrix elements of the correlated orbitals are

$$(m_1 m_2|w|m_3 m_4), \qquad w \approx w(|r - r'|), \qquad \sigma_1 = \sigma_3,\ \sigma_2 = \sigma_4, \tag{7.60}$$

where the screening depends on the orbital occupation but is assumed (as a reasonable approximation) rotational invariant. As a consequence, the SO3 transformation properties of the matrix elements are

$$(m_1 m_2|w|m_3 m_4) = \sum_{m_1' m_2' m_3' m_4'} U^\dagger_{m_1 m_1'}(O)\, U^\dagger_{m_2 m_2'}(O)\, (m_1' m_2'|w|m_3' m_4')\, U_{m_3' m_3}(O)\, U_{m_4' m_4}(O), \tag{7.61}$$


where $O$ is any rotation of the r-space and the $U$-matrices yield the relevant SO3 representation:

$$U^\dagger(O)\, U(O) = 1 = U(O)\, U^\dagger(O), \qquad \int dO\, U_{m_1 m_2}(O)\, U^\dagger_{m_3 m_4}(O) = \frac{1}{2l+1}\, \delta_{m_1 m_4}\, \delta_{m_2 m_3}. \tag{7.62}$$

In the last orthogonality relation, $dO$ is Haar's invariant measure in the parameter space of the SO3, $\int dO = 1$.

From these fundamental representation properties the important sum rules for the matrix elements follow: Use unitarity of $U(O)$ and integrate over $dO$ (which leaves the l.h.s. unchanged as it is independent of $O$ and $\int dO = 1$) to obtain

$$\begin{aligned}
\sum_{m_1} (m_1 m_2|w|m_1 m_4) &= \sum_{m_1} \int dO \sum_{m_1' m_2' m_3' m_4'} U^\dagger_{m_1 m_1'}(O)\, U^\dagger_{m_2 m_2'}(O)\, (m_1' m_2'|w|m_3' m_4')\, U_{m_3' m_1}(O)\, U_{m_4' m_4}(O) \\
&= \int dO \sum_{m_1' m_2' m_3' m_4'} \delta_{m_1' m_3'}\, U^\dagger_{m_2 m_2'}(O)\, (m_1' m_2'|w|m_3' m_4')\, U_{m_4' m_4}(O) \\
&= \frac{1}{2l+1} \sum_{m_1' m_2' m_4'} (m_1' m_2'|w|m_1' m_4')\, \delta_{m_2 m_4}\, \delta_{m_2' m_4'} \\
&= \frac{\delta_{m_2 m_4}}{2l+1} \sum_{m_1' m_2'} (m_1' m_2'|w|m_1' m_2') = \delta_{m_2 m_4}\, (2l+1)\, U.
\end{aligned} \tag{7.63}$$

The last equation is the definition of the Coulomb integral $U$ (not to be confused with the transformation $U(O)$). In the same manner,

$$\sum_{m_1} (m_1 m_2|w|m_3 m_1) = \frac{\delta_{m_2 m_3}}{2l+1} \sum_{m_1' m_2'} (m_1' m_2'|w|m_2' m_1') = \delta_{m_2 m_3}\, (U + 2lJ) \tag{7.64}$$

is obtained which additionally defines the exchange integral $J$. The first result (7.63) is intuitively obvious: after summation over $m_1$ and integration over $r$ in the matrix element, no angular dependence is left except the orthogonality $(m_2|m_4) = \delta_{m_2 m_4}$ since no direction is any more distinguished. The second result (7.64) is less obvious but nevertheless true.


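Expansion of the interaction function into spherical harmonics,

$$w(|r_1 - r_2|) = w\big( (r_1^2 + r_2^2 - 2 r_1 r_2 \cos\theta)^{1/2} \big) = \sum_{l=0}^{\infty} w_l(r_1, r_2)\, P_l(\cos\theta) = \sum_{l=0}^{\infty} w_l(r_1, r_2)\, \frac{4\pi}{2l+1} \sum_{m=-l}^{l} Y_{lm}(r_1)\, Y^*_{lm}(r_2) \tag{7.65}$$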

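leads to Slater's analysis

$$\begin{aligned}
(m_1 m_2|w|m_3 m_4) &= \sum_{l=0}^{2 l_c} F_l\, a_l(m_1 m_2 m_3 m_4), \\
F_l &= \iint_0^\infty dr_1\, dr_2\, \big( r_1 R_c(r_1) \big)^2 \big( r_2 R_c(r_2) \big)^2\, w_l(r_1, r_2) \\
&\approx \iint_0^\infty dr_1\, dr_2\, \big( r_1 R_c(r_1) \big)^2 \big( r_2 R_c(r_2) \big)^2\, \frac{r_<^l}{r_>^{l+1}} \quad \text{for } l > 0, \\
a_l(m_1 m_2 m_3 m_4) &= \frac{4\pi}{2l+1} \sum_{m=-l}^{l} (Y_{l_c m_1}|Y_{lm}|Y_{l_c m_3})\, (Y_{l_c m_4}|Y_{lm}|Y_{l_c m_2})^*.
\end{aligned} \tag{7.66}$$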

Here, $R_c$ is the radial part of the correlated orbitals, and the second line for $F_l$ holds for the unscreened Coulomb interaction, which for $l > 0$ is a reasonable approximation since intraatomic screening is effective only for the s-component of the interaction.

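Now, from $\sum_m Y_{lm}(r)\, Y^*_{lm}(r) = P_l(1)(2l+1)/4\pi$ and

$$\sum_{m_1} a_l(m_1 m_2 m_1 m_2) = \frac{4\pi}{2l+1} \Big[ \sum_{m_1} (Y_{l_c m_1}|Y_{l0}|Y_{l_c m_1}) \Big] (Y_{l_c m_2}|Y_{l0}|Y_{l_c m_2})^* = \sqrt{4\pi}\, \frac{2 l_c + 1}{2l+1}\, \delta_{l0}\, (Y_{l_c m_2}|Y_{l0}|Y_{l_c m_2})^* = (2 l_c + 1)\, \delta_{l0}$$

it follows immediately that

$$U = F_0. \tag{7.67}$$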


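Furthermore,

$$\begin{aligned}
\sum_{m_1 m_2} a_l(m_1 m_2 m_2 m_1) &= \frac{4\pi}{2l+1} \sum_{m_1 m_2 m} (Y_{l_c m_1}|Y_{lm}|Y_{l_c m_2})\, (Y_{l_c m_2}|Y^*_{lm}|Y_{l_c m_1}) \\
&= \frac{4\pi}{2l+1} \iint d\Omega_1\, d\Omega_2\, \Big( \sum_{m_1} Y_{l_c m_1}(r_2)\, Y^*_{l_c m_1}(r_1) \Big) \Big( \sum_{m_2} Y_{l_c m_2}(r_1)\, Y^*_{l_c m_2}(r_2) \Big) \Big( \sum_m Y_{lm}(r_1)\, Y^*_{lm}(r_2) \Big) \\
&= \frac{(2 l_c + 1)^2}{(4\pi)^2} \iint d\Omega_1\, d\Omega_2\, \big[ P_{l_c}(\cos\theta_{12}) \big]^2 P_l(\cos\theta_{12}) \\
&= \frac{(2 l_c + 1)^2}{4\pi} \int d\Omega\, \big[ P_{l_c}(\cos\theta) \big]^2 P_l(\cos\theta) \\
&= (2 l_c + 1)^2 \begin{pmatrix} l_c & l & l_c \\ 0 & 0 & 0 \end{pmatrix}^2
\end{aligned}$$

and hence

$$\sum_{m_1 m_2} (m_1 m_2|w|m_2 m_1) = (2 l_c + 1)^2 \sum_{l=0}^{2 l_c} F_l \begin{pmatrix} l_c & l & l_c \\ 0 & 0 & 0 \end{pmatrix}^2 = (2 l_c + 1)(U + 2 l_c J). \tag{7.68}$$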

Eqs. (7.67) and (7.68) relate the Coulomb and exchange integrals $U$ and $J$ to Slater's (screened) integrals $F_l$. $U$ and $J$ are often treated as parameters; a correct approach would try to calculate independently the $F_l$s.
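As an illustration of how (7.67) and (7.68) turn a set of Slater integrals into $U$ and $J$, the following sketch evaluates the squared 3j-symbols exactly from the Legendre integral used in the derivation above, $\begin{pmatrix} l_c & l & l_c \\ 0 & 0 & 0 \end{pmatrix}^2 = \frac{1}{2} \int_{-1}^{1} dx\, [P_{l_c}(x)]^2 P_l(x)$. The function names and the numerical $F_l$ values in the usage part are hypothetical and chosen only for illustration; they are not taken from the text.

```python
from fractions import Fraction

def legendre(l):
    """Exact coefficients (lowest power first) of the Legendre polynomial P_l."""
    previous, current = [Fraction(1)], [Fraction(0), Fraction(1)]
    if l == 0:
        return previous
    for k in range(1, l):
        # Bonnet recurrence: (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}
        nxt = [Fraction(0)] + [Fraction(2 * k + 1, k + 1) * c for c in current]
        for i, c in enumerate(previous):
            nxt[i] -= Fraction(k, k + 1) * c
        previous, current = current, nxt
    return current

def multiply(a, b):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def three_j_squared(lc, l):
    """Square of the 3j-symbol (lc l lc; 0 0 0) via (1/2) int P_lc^2 P_l dx."""
    p = legendre(lc)
    integrand = multiply(multiply(p, p), legendre(l))
    # the integral of x^i over [-1, 1] is 2/(i+1) for even i, zero for odd i
    integral = sum(c * Fraction(2, i + 1)
                   for i, c in enumerate(integrand) if i % 2 == 0)
    return integral / 2

def coulomb_and_exchange(lc, slater):
    """Return (U, J) from Slater integrals {l: F_l} using (7.67) and (7.68)."""
    u = slater[0]                                     # (7.67): U = F_0
    weighted = sum(f * three_j_squared(lc, l) for l, f in slater.items())
    j = ((2 * lc + 1) * weighted - u) / (2 * lc)      # solve (7.68) for J
    return u, j

if __name__ == "__main__":
    # hypothetical Slater integrals (in eV) for a d-shell, l_c = 2
    u, j = coulomb_and_exchange(2, {0: 8.0, 2: 7.0, 4: 7.0})
    print(u, j)
```

For a d-shell this reduces to the familiar relation $J = (F_2 + F_4)/14$, which the sketch reproduces without hard-coding it.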

Next, the occupation matrix $n$ of the correlated shell is to be determined with the Kohn-Sham orbitals $\phi_i$ and their occupation numbers $n_i$:

$$n_{mm'\sigma} \stackrel{\mathrm{def}}{=} \sum_i (Rm\sigma|\phi_i\rangle\, n_i\, \langle\phi_i|Rm'\sigma). \tag{7.69}$$

Here on the l.h.s. and in the following we drop the site vector $R$ in order not to overload the notation. At each site (and possibly for each spin direction independently) the occupation matrix may be $m$-diagonalized by introducing local orbital coordinates:

$$n_{mm'\sigma} = U^{(\sigma)}_{m\mu}\, n_{\mu\sigma}\, U^{(\sigma)*}_{m'\mu}. \tag{7.70}$$

Averages over a correlated shell,

$$n_\sigma = \frac{1}{2 l_c + 1} \sum_\mu n_{\mu\sigma}, \qquad n = \frac{1}{2} \big( n_+ + n_- \big), \tag{7.71}$$

are used later on.

The LDA+U density functional is obtained by taking $k[\phi_i, n_i]$ in (6.52) in the form

$$k = t + e^H + e^U \tag{7.72}$$

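with

$$t + e^H = \sum_i n_i \langle\phi_i|\hat t|\phi_i\rangle + \sum_{ij} \frac{n_i n_j}{2} \langle\phi_i \phi_j|\, \frac{1}{|r' - r|}\, |\phi_i \phi_j\rangle \tag{7.73}$$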

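as in the LDA. Essentially two versions of $e^U$ have been introduced [Czyzyk and Sawatzky, 1994]: the ‘around the mean field’ version (AMF),

$$\begin{aligned}
e^{U,AMF} &= \frac{1}{2} \sum_{R\sigma\mu\mu'} \Big\{ (\mu\sigma\, \mu'{-\sigma}|w|\mu\sigma\, \mu'{-\sigma})\, (n_{\mu\sigma} - n_\sigma)(n_{\mu'-\sigma} - n_{-\sigma}) \\
&\qquad + \big[ (\mu\sigma\, \mu'\sigma|w|\mu\sigma\, \mu'\sigma) - (\mu\sigma\, \mu'\sigma|w|\mu'\sigma\, \mu\sigma) \big] (n_{\mu\sigma} - n_\sigma)(n_{\mu'\sigma} - n_\sigma) \Big\} \\
&= \frac{1}{2} \sum_{R\sigma\mu\mu'} \Big\{ (\mu\sigma\, \mu'{-\sigma}|w|\mu\sigma\, \mu'{-\sigma})\, n_{\mu\sigma} n_{\mu'-\sigma} \\
&\qquad + \big[ (\mu\sigma\, \mu'\sigma|w|\mu\sigma\, \mu'\sigma) - (\mu\sigma\, \mu'\sigma|w|\mu'\sigma\, \mu\sigma) \big] n_{\mu\sigma} n_{\mu'\sigma} \Big\} \\
&\quad - \frac{1}{2} \sum_{R\sigma} \Big\{ U \big( N - n_\sigma \big) - J \big( N_\sigma - n_\sigma \big) \Big\} N_\sigma, \\
N_\sigma &= \sum_\mu n_{\mu\sigma} = (2l+1)\, n_\sigma,
\end{aligned} \tag{7.74}$$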

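and the ‘atomic limit’ version (AL),

$$\begin{aligned}
e^{U,AL} &= \frac{1}{2} \sum_{R\sigma\mu\mu'} \Big\{ (\mu\sigma\, \mu'{-\sigma}|w|\mu\sigma\, \mu'{-\sigma})\, n_{\mu\sigma} n_{\mu'-\sigma} \\
&\qquad + \big[ (\mu\sigma\, \mu'\sigma|w|\mu\sigma\, \mu'\sigma) - (\mu\sigma\, \mu'\sigma|w|\mu'\sigma\, \mu\sigma) \big] n_{\mu\sigma} n_{\mu'\sigma} \Big\} \\
&\quad - \frac{1}{2} \sum_R \Big\{ U N \big( N - 1 \big) - J \sum_\sigma N_\sigma \big( N_\sigma - 1 \big) \Big\} \\
&= e^{U,AMF} + \frac{1}{2} \sum_{R\sigma} (U - J)(1 - n_\sigma)\, N_\sigma.
\end{aligned} \tag{7.75}$$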

In (7.74), in the second equality the sum rules (7.63, 7.64) have been used for a better comparison of both versions. $N$ is the number of electrons occupying the whole correlated shell, $N_\sigma$ is that for one spin sort.


The first two lines of (7.75) and of the second form of (7.74) can be considered as the mean-field expectation value of a Hartree-Fock operator of type (1.74); the third line contains the ‘double counting correction’, it subtracts from the L-part (6.53), for which the LDA expressions of Section 7.2 are used, what is believed to be contained there already of the contributions to the first two lines. In (7.74) it is assumed that the LDA works already well if there is no orbital polarization in the correlated shell, that is, if $n_{\mu\sigma} = n_\sigma$. It is most easily seen from the first form of (7.74) that it vanishes in this case. Therefore the AMF version may also be called an orbital polarization functional because it corrects the LDA for orbital polarization of the correlated shell which is assumed to be better represented by the Hartree-Fock like expression. The AL version is even more ad hoc.

For the derivation of Kohn-Sham equations it is essential that in the spirit of (6.51–6.53) the variational quantities are as previously the Kohn-Sham orbitals $\phi_i$ and their occupation numbers $n_i$. The orbitals $|Rm\sigma)$ of the correlated shells and the corresponding values of $F_l$ or $U$ and $J$ are model quantities but not variational. (They are supposed to model $H[n]$ through (6.51–6.53) in the vicinity of certain densities $n$. In this sense, those model quantities may be different in different vicinities in the functional $n$-space, but they all are supposed to approximate locally in the functional $n$-space the same universal functional $H[n]$.) The $e^U$-functional now leads according to (6.55) to an orbital-dependent potential operator acting on $\phi_i$,

$$\frac{1}{n_i} \frac{\delta}{\delta\phi_i^*}\, e^U = \sum_{R\mu\sigma} \frac{\partial e^U}{\partial n_{\mu\sigma}}\, \frac{1}{n_i}\, \frac{\delta n_{\mu\sigma}}{\delta\phi_i^*} = \sum_{R\mu\sigma} |R\mu\sigma)\, \frac{\partial e^U}{\partial n_{\mu\sigma}}\, (R\mu\sigma|\phi_i\rangle. \tag{7.76}$$

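The $U$-potentials of the two considered versions are

$$\frac{\partial e^{U,AMF}}{\partial n_{\mu\sigma}} = \sum_{\mu'} \Big\{ (\mu\sigma\, \mu'{-\sigma}|w|\mu\sigma\, \mu'{-\sigma})\, (n_{\mu'-\sigma} - n_{-\sigma}) + \big[ (\mu\sigma\, \mu'\sigma|w|\mu\sigma\, \mu'\sigma) - (\mu\sigma\, \mu'\sigma|w|\mu'\sigma\, \mu\sigma) \big] (n_{\mu'\sigma} - n_\sigma) \Big\} \tag{7.77}$$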

and

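$$\frac{\partial e^{U,AL}}{\partial n_{\mu\sigma}} = \frac{\partial e^{U,AMF}}{\partial n_{\mu\sigma}} - (U - J) \Big( n_\sigma - \frac{1}{2} \Big). \tag{7.78}$$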

One characteristic feature of this latter $U$-potential is that in case of an isolated shell it moves the occupied states downward by $(U - J)/2$ and the unoccupied states upward by $(U - J)/2$ independent of the shell occupation. By way of contrast, the center of the AMF spin subshell potential split moves up with increasing subshell occupation.

There are other versions of LDA+U functionals possible. Finding the best LDA+U model for $H[n]$ and the best parameterization of those models is presently an active field of exploration. It should be noticed that the so-called LDA+DMFT approach (DMFT stands for dynamical mean field theory) is an approach to the self-energy of the single-particle Green function, and, though it uses similar ways of modeling, logically it belongs to another although related context.


Part II:

RELATIVISTIC THEORY

8 A Brief Introduction to Quantum Electrodynamics

Due to the zero (or at least up to now immeasurably small) rest mass of the photon, electrodynamics, the theory of time-dependent electromagnetic fields, is an intrinsically relativistic theory and, as is well known, had served as the prototype of relativistic physics. Quantum electrodynamics (QED) is the relativistic quantum theory of electrically charged particles, mutually interacting by exchange of photons. If this system is additionally subject to a (static) external field, an inhomogeneous situation is obtained to which density functional theory may apply. The standard situation is again that of electrons (and possibly appearing positrons) moving in the adiabatic field produced by nuclei in atoms, molecules or solids. Since we will focus on that situation, in our context the particle field will exclusively be the electron-positron field.

The magnitude of relativistic effects in atoms is easily estimated: the total energy of all electrons of an atom with nuclear charge $Z$ is of the order $Z^2$ (natural atomic units used). It is essentially the energy of the electrons in the 1s-orbital which is large compared to that of higher orbitals. In non-relativistic treatment a kinetic energy of the same order of magnitude is obtained from the virial theorem. The kinetic energy per electron is hence of the order of $Z$. Since the electron mass is equal to unity in our natural unit system, the average electron velocity is of the order $\sqrt{Z}$. The velocity of light in natural atomic units is $c \approx 137$. (It is equal to the inverse of the fine structure constant.) Hence the average relativistic mass correction is $\Delta m/m \sim (v/c)^2 \sim Z/137^2$. Since the average kinetic energy per electron was $\sim Z$, the average energy correction per electron is $\Delta\varepsilon \sim (Z/137)^2$ Hartree. This amounts to about 1 meV for hydrogen, about 1 eV for silver, say, and up to 10 eV for actinides.
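The order-of-magnitude estimate $\Delta\varepsilon \sim (Z/137)^2$ Hartree can be evaluated directly. The sketch below is only an illustration of that formula; the function name is hypothetical, and the conversion factor of about 27.2 eV per Hartree as well as the example nuclear charges (hydrogen, silver, uranium as a representative actinide) are inputs chosen here, not values taken from the text.

```python
HARTREE_IN_EV = 27.211386   # one Hartree expressed in eV
C_ATOMIC_UNITS = 137.036    # velocity of light in natural atomic units

def relativistic_correction_ev(z):
    """Estimate (Z/c)^2 Hartree, the average relativistic energy
    correction per electron, in eV for nuclear charge z."""
    return (z / C_ATOMIC_UNITS) ** 2 * HARTREE_IN_EV

if __name__ == "__main__":
    for name, z in (("hydrogen", 1), ("silver", 47), ("uranium", 92)):
        print(f"{name:9s} Z = {z:3d}: {relativistic_correction_ev(z):.3g} eV")
```

Since the estimate is a scaling argument, only the orders of magnitude (meV, eV, ten eV) are meaningful.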

To prepare for a relativistic density functional theory, in this chapter we browse through quantum electrodynamics, just to recall some basic ideas, notions, and relations necessary for our goal. We start with classical electrodynamics, introduce the covariant notation, the Lagrange formalism, relativistic mechanics, and finally introduce the quantized photon field, the Dirac field, and their interaction.


8.1 Classical Electrodynamics

Electrodynamics connects the dynamics of electric fields $E$ and magnetic fields $H$ with the dynamics of their sources and vortices: electric charges and currents. In an infinite space causality demands that outgoing waves of electromagnetic field are correlated with the motion of charges and with currents in the past only, and that incoming waves are correlated with the motion of charges and with currents in the future only.

The basic equations of classical electrodynamics are Maxwell's equations:

$$\epsilon_0 \nabla E = -en, \qquad \nabla \times H = \epsilon_0 \dot{E} - ej, \tag{8.1}$$

$$\nabla \times E = -\mu_0 \dot{H}, \qquad \nabla H = 0. \tag{8.2}$$

The equations in the first line identify the electrical charge density $-en$ ($n$ being the particle density of electrons with charge $-e$, minus that of positrons with charge $+e$) to be the source density and the electric current density $-ej$ ($j$ being the particle current of electrons minus that of positrons) together with the ‘displacement current’ $\epsilon_0 \dot{E}$ to be the vortex density of the electromagnetic field.

Note that although we have four field-creating components $(n, j)$, only three of them are independent, since the first two of Maxwell's equations provide charge conservation

$$\dot{n} + \nabla j = 0 \tag{8.3}$$

and hence determine only three independent components of the electromagnetic field. The remaining three field components are determined by the second group (8.2) of Maxwell's equations, of which again only three are independent because the right one ensures that both sides of the left one have zero divergence.

Besides the charge quantum $e$, the vacuum permittivity $\epsilon_0$ and the vacuum permeability $\mu_0$ appear as coefficients in Maxwell's equations. Causality adds boundary conditions for the fields to those equations, but we will not consider them here.

To fulfill the second group (8.2) of Maxwell's equations, electromagnetic potentials $(U, A)$ are introduced for the fields according to

$$\mu_0 H = \nabla \times A, \qquad E = -\dot{A} - \nabla U. \tag{8.4}$$

Since four potentials were introduced to fulfill three independent equations, there is still a free choice of one of the potential components. The Lorentz gauge

$$\frac{1}{c^2} \dot{U} + \nabla A = 0, \qquad \frac{1}{c^2} = \epsilon_0 \mu_0 \tag{8.5}$$

is the most symmetric of all choices, transforming the first group (8.1) of Maxwell's equations into

$$\Box U = -\frac{en}{\epsilon_0}, \qquad \Box A = -\mu_0 e j, \qquad \Box = \frac{1}{c^2} \frac{\partial^2}{\partial t^2} - \nabla^2. \tag{8.6}$$

The d'Alembert operator $\Box$ uncovers the full symmetry of the Maxwell theory.

The connection between the electrodynamic potential and the mechanical Hartree potential is $v_H = -eU$, and that between the electrodynamic charge and the electrostatic coupling constant is $\lambda = e^2/4\pi\epsilon_0$. (In electrostatics one often defines units by putting $\epsilon_0 = 1/4\pi$; this was done in Part I.) Poisson's equation (3.14) is now the static version of the first equation (8.6).

8.2 Lorentz Covariance

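D'Alembert's operator suggests Minkowski's geometry of space-time:

$$x^\mu = (ct, x, y, z), \qquad x_\mu = g_{\mu\nu} x^\nu = (ct, -x, -y, -z), \tag{8.7}$$

$$g_{\mu\nu} = \begin{pmatrix} 1 & & & 0 \\ & -1 & & \\ & & -1 & \\ 0 & & & -1 \end{pmatrix} = g^{\mu\nu}, \tag{8.8}$$

$$\partial_\mu = \frac{\partial}{\partial x^\mu}, \qquad \partial^\mu = \frac{\partial}{\partial x_\mu}, \qquad \partial_\mu \partial^\mu = \Box, \tag{8.9}$$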

where summation over repeated indices, one upper and one lower, (tensor contraction) is understood. Besides the fundamental or (pseudo-)metric tensor $g_{\mu\nu}$, which determines the indefinite metric (or pseudo-metric) of the Minkowski space, there is a fundamental four-form (Levi-Civita pseudo-tensor)

ǫµνρσ = ǫµνρσ, ǫP(0123) = (−1)P , ǫµνρσ = 0 otherwise, (8.10)


where $P$ means permutation. It determines the invariant measure (volume) of a parallelepiped spanned by any four four-vectors $(dx^\mu)_i$, $i = 1, \ldots, 4$ or $(dx_\mu)_i$, $i = 1, \ldots, 4$ to be

$$dV = \epsilon_{\mu\nu\rho\sigma} (dx^\mu)_1 (dx^\nu)_2 (dx^\rho)_3 (dx^\sigma)_4 = \epsilon^{\mu\nu\rho\sigma} (dx_\mu)_1 (dx_\nu)_2 (dx_\rho)_3 (dx_\sigma)_4 \tag{8.11}$$

in that affine Minkowski space. (If the $dx^\mu$ are differentials of coordinate transformation functions, the above expression for $dV$ produces automatically the corresponding Jacobian.) Every totally antisymmetric rank-4 tensor is proportional to that fundamental four-form. Every totally antisymmetric rank-$n$ tensor ($n$-form, $n \le 4$) can be expressed as the contraction of a totally antisymmetric rank-$(4-n)$ tensor with the fundamental four-form.

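In this language the four-current density (free of four-sources because of (8.3)) becomes

$$j^\mu = (cn, j_x, j_y, j_z), \qquad \partial_\mu j^\mu = 0, \tag{8.12}$$

and the four-field tensor is

$$F^{\mu\nu} = \begin{pmatrix} 0 & E_x/c & E_y/c & E_z/c \\ -E_x/c & 0 & \mu_0 H_z & -\mu_0 H_y \\ -E_y/c & -\mu_0 H_z & 0 & \mu_0 H_x \\ -E_z/c & \mu_0 H_y & -\mu_0 H_x & 0 \end{pmatrix}, \tag{8.13}$$

that is,

$$F^{0i} = E_i/c, \qquad F^{ik} = \mu_0\, \epsilon_{ikl} H_l, \tag{8.14}$$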

where as in the whole following text Greek super-(sub-)scripts refer to Minkowski's four-quantities and Latin ones to three-dimensional spatial tensors. The fundamental three-form $\epsilon_{ikl}$ is defined in complete analogy to (8.10). The antisymmetric four-field tensor is made up of an antisymmetric three-tensor being dual (via $\epsilon$) to the (pseudo-)three-vector $H$ of the magnetic field (axial vector), and framed by the three-vector $E$ of the electric field (polar vector).

Maxwell's equations are cast into

$$\partial_\nu F^{\mu\nu} = -\mu_0 e j^\mu, \qquad \epsilon_{\mu\nu\rho\sigma}\, \partial^\nu F^{\rho\sigma} = 0. \tag{8.15}$$

The four-potential (four-source free by the Lorentz gauge) is

$$A^\mu = (U/c, A_x, A_y, A_z), \qquad \partial_\mu A^\mu = 0, \tag{8.16}$$

$$F^{\mu\nu} = \partial^\nu A^\mu - \partial^\mu A^\nu. \tag{8.17}$$

Check the equivalence to (8.4).⁹

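The whole theory is now covariant under transformations

$$x'^\mu = Q^\mu{}_\nu\, x^\nu, \qquad Q^\mu{}_\nu\, g_{\mu\rho}\, Q^\rho{}_\sigma = g_{\nu\sigma}, \tag{8.18}$$

where the right relation is the isometry condition for the transformation $Q$ in Minkowski's indefinite metric. E.g. the special Lorentz transformation is

$$Q^\mu{}_\nu = \begin{pmatrix} \dfrac{1}{\sqrt{1 - v^2/c^2}} & \dfrac{-v/c}{\sqrt{1 - v^2/c^2}} & 0 & 0 \\ \dfrac{-v/c}{\sqrt{1 - v^2/c^2}} & \dfrac{1}{\sqrt{1 - v^2/c^2}} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{8.19}$$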

It connects two inertial systems, one moving with velocity $v$ in x-direction relative to the other. Another special case of (8.18) is an orthogonal transformation of the three-space (rotation by some Euler angles with coordinate origin fixed).
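The isometry condition in (8.18) can be verified numerically for both special cases just mentioned. The following is a small sketch under the conventions of (8.7), (8.8) and (8.19); the helper names and the choice of a rotation about the z-axis as the example of a spatial rotation are assumptions made for illustration.

```python
import math

# metric tensor g_{mu nu} = diag(1, -1, -1, -1), cf. (8.8)
G = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def boost_x(beta):
    """Special Lorentz transformation (8.19) with beta = v/c along x."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -gamma * beta, 0, 0],
            [-gamma * beta, gamma, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]

def rotation_z(angle):
    """Rotation of three-space about the z-axis, leaving time untouched."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]

def isometry_defect(q):
    """Largest deviation of Q^mu_nu g_{mu rho} Q^rho_sigma from g_{nu sigma}."""
    worst = 0.0
    for nu in range(4):
        for sigma in range(4):
            value = sum(q[mu][nu] * G[mu][rho] * q[rho][sigma]
                        for mu in range(4) for rho in range(4))
            worst = max(worst, abs(value - G[nu][sigma]))
    return worst

if __name__ == "__main__":
    print(isometry_defect(boost_x(0.6)))
    print(isometry_defect(rotation_z(0.3)))
```

Both defects vanish up to rounding, whereas a generic matrix, for instance a pure rescaling of one coordinate, violates the condition.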

8.3 Lagrange Formalism

A dynamical system in physics is characterized by a Lagrange function $L$ of its dynamical degrees of freedom $f(t)$, so that the dynamics minimizes the action $S$ with $f(t_1)$ and $f(t_2)$ fixed:

$$\delta S = \delta \int_{t_1}^{t_2} dt\, L = 0. \tag{8.20}$$

⁹ The topologically interested reader may note that Maxwell's theory is a case of de Rham's cohomology: If $\omega$ is a differentiable form ($n$-form), then the exterior differential $d\omega$ generalizes the rot-operation of three dimensions and creates an $(n+1)$-form (a coboundary to $\omega$). $d^2 = 0$ generalizes rot rot = 0. The generalized integral theorem reads $\int_V d\omega = \int_{\partial V} \omega$, where $\partial V$ is the $n$-dimensional oriented closed boundary ($\partial^2 = 0$, i.e., the closed boundary itself has no boundary) of the $(n+1)$-dimensional manifold $V$. In this frame, the four-potential $A$ is a one-form and the four-field $F = dA$ is its coboundary. The second group of Maxwell's equations, $dF = 0$, is a simple case of the general rule $d^2 = 0$. More generally, a closed manifold with vanishing boundary is a cycle, and a form with vanishing coboundary is a cocycle. The second group of Maxwell's equations states that $F$ is a cocycle, and hence, in a manifold which is homotopically contractible into a point, it may be given as the coboundary of a potential. (Cf. [Schwarz, 1994].)


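This variation yields in the well known way equations of motion

$$\frac{d}{dt} \frac{\partial L}{\partial \dot{f}} = \frac{\partial L}{\partial f}. \tag{8.21}$$

(Provided the functional $S[f]$ has a unique functional derivative $S' = \partial L/\partial f - (d/dt)(\partial L/\partial \dot{f})$.)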

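If the dynamical degrees of freedom are fields $f(r)$ in three-space, the action is expressed in terms of a Lagrange density $\mathcal{L}$ according to

$$S = \int_{t_1}^{t_2} dt \int d^3r\, \mathcal{L}, \tag{8.22}$$

and the equations of motion adopt the form

$$\partial_\mu \frac{\partial \mathcal{L}}{\partial(\partial_\mu f)} = \frac{\partial \mathcal{L}}{\partial f}. \tag{8.23}$$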

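(Cf. e.g. [Itzykson and Zuber, 1980].)

The Lagrange density of a free electromagnetic field, i.e. a field in the absence of sources $j^\mu$, is, up to a constant factor $\alpha$ to be determined at the end of this section,

$$\begin{aligned}
\mathcal{L} &= \alpha F^{\mu\nu} F_{\mu\nu} \\
&= 2\alpha \big[ (\partial^\nu A^\mu)(\partial_\nu A_\mu) - (\partial^\mu A^\nu)(\partial_\nu A_\mu) \big] \\
&= 2\alpha (\partial^\nu A^\mu)(\partial_\nu A_\mu) - 2\alpha\, \partial^\mu (A^\nu \partial_\nu A_\mu) \\
&= \mathcal{L}' + \text{four-divergence}.
\end{aligned} \tag{8.24}$$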

The parentheses limit the action of the differential operators in these expressions. The third expression for $\mathcal{L}$ is equal to the second because of the Lorentz gauge of the four-potential. An additive four-divergence term of $\mathcal{L}$ consists of a sum of a three-divergence and a time-derivative. The three-divergence does not contribute to $L$ and to the equations of motion, respectively, if the fields vanish at infinity in space (or if a finite space without boundary is considered, as, e.g., a three-dimensional torus). An additive time-derivative to $L$ does also not contribute to the equations of motion, because it adds only a constant to the action $S$. From (8.24) we find

$$\partial_\nu \frac{\partial \mathcal{L}'}{\partial(\partial_\nu A_\mu)} = 4\alpha\, \partial_\nu \partial^\nu A^\mu = 4\alpha\, \Box A^\mu, \tag{8.25}$$

and

$$\frac{\partial \mathcal{L}'}{\partial A_\mu} = 0, \tag{8.26}$$

since the Lagrange density does not depend on the four-potential itself but only on its derivatives. Hence, the field equations of motion take the form

$$\Box A^\mu = 0, \tag{8.27}$$

thus justifying (8.24). Recall that we have essentially used the Lorentz gauge of the four-potential in deriving this result.

The covariant formalism uncovers the full symmetry of a dynamical system. A particular physical state of the system will in general of course not have this symmetry. The only requirement is that the state obtained by a symmetry transformation from a given one is again a physical state of that same system. Particular problems are usually most effectively solved using coordinates adapted to the symmetry of the state of interest only.

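The transition from the Lagrange form of the equations of motion to the Hamilton form is provided by a Legendre transformation, which destroys the symmetry of space-time because it distinguishes time (and it distinguishes energy against momentum). Hence the Hamilton formalism is not covariant. The canonical momentum density of the field is¹⁰

$$\Pi^\mu = \frac{\partial \mathcal{L}}{\partial(\partial_0 A_\mu)} = 4\alpha (\partial^0 A^\mu - \partial^\mu A^0) = 4\alpha F^{\mu 0}. \tag{8.28}$$

The Hamilton density is the Legendre transform of the Lagrange density according to $\mathcal{H} = \sup_{(\partial_0 A_\mu)} (\Pi^\mu \partial_0 A_\mu - \mathcal{L})$ (where the supremum condition leads to (8.28)). We find

$$\begin{aligned}
\mathcal{H} &= 4\alpha \Big( F^{\mu 0} \partial_0 A_\mu - \frac{1}{4} F^{\mu\nu} F_{\mu\nu} \Big) \\
&= 4\alpha \Big( F^{\mu 0} F_{\mu 0} - \frac{1}{4} F^{\mu\nu} F_{\mu\nu} \Big) + 4\alpha F^{i0} \partial_i A_0 \\
&= \mathcal{H}' + 4\alpha\, \partial_i (F^{i0} A_0).
\end{aligned} \tag{8.29}$$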

Both sides of the last equality are equal up to a term containing $\partial_i F^{i0} = \partial_\mu F^{\mu 0}$ as a factor, which is zero in the absence of field sources $j^0$ by (8.15) (or likewise by (8.27) under the Lorentz gauge). The three-divergence term does not contribute to the integral of the Hamilton density over the space (for fields vanishing at infinity or for space without boundary). The final result is

¹⁰ Note $\Pi^0 \equiv 0$; only three components of momentum correspond to the three independent components of $A^\mu$.

$$\mathcal{H}' = \frac{1}{2} \big( \epsilon_0 E^2 + \mu_0 H^2 \big), \tag{8.30}$$

if we put $4\alpha = -1/\mu_0$. This reproduces the well known expression for the energy density of an electromagnetic field.
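The step from (8.29) to (8.30) can be checked numerically: one builds $F^{\mu\nu}$ from (8.13), lowers the indices with the diagonal metric (8.8), and evaluates $\mathcal{H}' = 4\alpha (F^{\mu 0} F_{\mu 0} - \frac{1}{4} F^{\mu\nu} F_{\mu\nu})$ with $4\alpha = -1/\mu_0$. The sketch below does this in SI units; the function names and the sample field values are hypothetical, and the numerical values of $\epsilon_0$ and $\mu_0$ are standard constants supplied here rather than taken from the text.

```python
import math

EPS0 = 8.8541878128e-12       # vacuum permittivity in SI units
MU0 = 1.25663706212e-6        # vacuum permeability in SI units
C = 1.0 / math.sqrt(EPS0 * MU0)
METRIC = (1.0, -1.0, -1.0, -1.0)   # diagonal of g_{mu nu}, cf. (8.8)

def field_tensor(e, h):
    """Contravariant four-field tensor F^{mu nu} of (8.13)."""
    ex, ey, ez = (x / C for x in e)
    bx, by, bz = (MU0 * x for x in h)
    return [[0.0, ex, ey, ez],
            [-ex, 0.0, bz, -by],
            [-ey, -bz, 0.0, bx],
            [-ez, by, -bx, 0.0]]

def hamilton_density(e, h):
    """H' of (8.29) with the choice 4 alpha = -1/mu_0."""
    f_up = field_tensor(e, h)
    # lowering both indices with a diagonal metric multiplies by g_mm g_nn
    f_dn = [[METRIC[m] * METRIC[n] * f_up[m][n] for n in range(4)]
            for m in range(4)]
    four_alpha = -1.0 / MU0
    time_part = sum(f_up[m][0] * f_dn[m][0] for m in range(4))
    invariant = sum(f_up[m][n] * f_dn[m][n]
                    for m in range(4) for n in range(4))
    return four_alpha * (time_part - 0.25 * invariant)

def energy_density(e, h):
    """Right-hand side of (8.30)."""
    return 0.5 * (EPS0 * sum(x * x for x in e) + MU0 * sum(x * x for x in h))

if __name__ == "__main__":
    e_field = (1.0e3, -2.0e3, 5.0e2)   # V/m, arbitrary sample values
    h_field = (3.0, 1.0, -2.0)         # A/m, arbitrary sample values
    print(hamilton_density(e_field, h_field), energy_density(e_field, h_field))
```

The two printed numbers agree to rounding accuracy, which confirms the sign and magnitude of the constant $\alpha$.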

8.4 Relativistic Kinematics

It was Einstein's essential step on top of the achievements of Lorentz and Poincaré to realize that just kinematics, i.e. the role of space and time in physics, had to be refined in order to bring electrodynamics and mechanics together. In a natural way, this led Minkowski to his introduction of the four-tensor formalism.

The invariant line element of Minkowski's geometry is (with $dr = v\,dt$)

$$ds^2 = dx^\mu dx_\mu = c^2 dt^2 - dr^2 = c^2 dt^2 (1 - v^2/c^2). \tag{8.31}$$

In the rest frame of that line element one has

$$v = 0: \quad ds = c\, dt: \quad ds/c = \text{‘proper time interval’}. \tag{8.32}$$

This suggests the introduction of a four-velocity $u^\mu = dx^\mu/(\text{proper time interval})$:

$$u^\mu = c\, dx^\mu/ds = (c, v)/\sqrt{1 - v^2/c^2}, \qquad u^\mu u_\mu = c^2, \tag{8.33}$$

whose spatial components correspond to the ordinary velocity.

Energy and momentum conservation follow from invariance under time and spatial translations, respectively. Energy and momentum must hence together form a four-vector. It is given the dimension of a momentum by replacing the energy $E$ by $E/c$ and is called the four-momentum. For $v/c \ll 1$, its spatial components must be $p = mv$, hence

$$p^\mu = m u^\mu = (E/c, p), \qquad E^2/c^2 - p^2 = p^\mu p_\mu = m^2 u^\mu u_\mu = m^2 c^2, \tag{8.34}$$

where $m$ is the rest mass of the particle under consideration, i.e. the mass in the limit $v/c \ll 1$. The last chain of relations (8.34) yields

$$E = c \sqrt{m^2 c^2 + p^2} \tag{8.35}$$

as the relativistic dispersion relation between energy and momentum of a free particle.
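The relations (8.33) to (8.35) are easy to confirm with numbers. The sketch below, with hypothetical function names and units in which $c = 1$ by default, builds the four-velocity for a given three-velocity, checks its invariant square, and compares the energy from the time component of $p^\mu = m u^\mu$ with the dispersion relation (8.35).

```python
import math

def four_velocity(v, c=1.0):
    """Four-velocity (8.33) for the three-velocity v (a sequence of 3 numbers)."""
    speed_squared = sum(x * x for x in v)
    gamma = 1.0 / math.sqrt(1.0 - speed_squared / (c * c))
    return [gamma * c] + [gamma * x for x in v]

def minkowski_square(a):
    """Invariant square a^mu a_mu in the metric (+, -, -, -)."""
    return a[0] ** 2 - sum(x * x for x in a[1:])

def energy_from_momentum(p, m, c=1.0):
    """Dispersion relation (8.35): E = c sqrt(m^2 c^2 + p^2)."""
    return c * math.sqrt(m * m * c * c + sum(x * x for x in p))

if __name__ == "__main__":
    mass = 2.0
    u = four_velocity([0.3, 0.4, 0.0])
    momentum = [mass * x for x in u[1:]]   # spatial part of p^mu = m u^mu
    print(minkowski_square(u))             # equals c^2
    print(mass * u[0], energy_from_momentum(momentum, mass))   # both give E/c
```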

This may also be obtained from the action

$$S = -\int ds\, mc = -\int dt\, mc^2 \sqrt{1 - v^2/c^2}, \qquad L = -mc^2 \sqrt{1 - \dot{r}^2/c^2}, \tag{8.36}$$

since $\int ds$ is (up to constant factors defining units of physical quantities) the only invariant action which can be formed for a free particle, represented by a world line with line element $ds$. With $p = \partial L/\partial \dot{r}$ and $E = H = p \cdot \dot{r} - L$ we arrive again at (8.35).

8.5 Relativistic Mechanics

For a particle with charge $-e$, moving in an external (given) electromagnetic field, the most simple invariant action which can be formed is

$$S = \int (-mc\, ds + e A_\mu dx^\mu) = \int dt\, \big( -mc^2 \sqrt{1 - v^2/c^2} + eU - eA \cdot v \big) = \int dt\, L. \tag{8.37}$$

Constant factors again define units. A Legendre transformation yields the canonical momentum

$$p = \frac{\partial L}{\partial v} = \frac{mv}{\sqrt{1 - v^2/c^2}} - eA \tag{8.38}$$

and the Hamilton function

$$H = p \cdot v - L = c \sqrt{m^2 c^2 + (p + eA)^2} - eU, \tag{8.39}$$

which describes the motion of a charged particle with charge $-e$ in given external electric and magnetic fields, characterized by their potentials $U$ and $A$.

As a next step one would consider charged particles and fields as a closed dynamical system. This type of theory, however, has to fight against a severe defect, because the field created by a single point-like charged particle contains infinite field energy. The electric field strength is proportional to $r^{-2}$, and the energy density (8.30) of the field is hence proportional to $r^{-4}$ which integrates to infinity at small $r$. Note that relativity excludes the possibility of a rigid body: if one point of that body is set to motion, the other points can follow only with retardation limited by the light velocity $c$, thus causing deformation and hence change of internal energy of the body, i.e. of its rest mass. Hence, any classical elementary particle with a single definite rest mass and with kinetic energy, solely characterized by one velocity, must be point-like. On the other hand, the field energy of the electric field $E = -e/(4\pi\epsilon_0 r^2)$ of an electron, obtained by integration over (8.30) from radius $r$ to infinity, is of the order $e^2/(4\pi\epsilon_0 r)$ (the integration gives $e^2/(8\pi\epsilon_0 r)$) and exceeds the rest energy $mc^2$ for $r$-values smaller than about the classical electron radius $r_c = e^2/(4\pi\epsilon_0 mc^2)$. For lengths close to and below $r_c$ classical electrodynamics loses its physical meaning.

8.6 The Principles of Relativistic Quantum Theory

The superposition principle of quantum physics requires the representation of quantum states by Hilbert vectors $|\Psi\rangle$, so that linear superpositions of states are again states. Besides energy conservation, in relativistic quantum physics there is no separate conservation of rest mass. An eigenstate of an observable not commuting with the Hamiltonian contains components referring to arbitrarily large energies. From experiment we know that this opens the possibility of particle-antiparticle pair creation out of the vacuum. Hence, a consistent relativistic quantum theory cannot be formulated for a fixed particle number; it definitely needs a field quantization (e.g. Fock space) representation.

For the same reason, the particle coordinate cannot be an observable in relativistic quantum theory: because of the uncertainty principle, a position uncertainty of a particle smaller than $\Delta x$ causes a momentum uncertainty larger than $\hbar/\Delta x$ and thus an energy uncertainty larger than $\hbar c/\Delta x$. If this energy uncertainty exceeds twice the rest energy $mc^2$, then pair creation sets in, and the individuality of that particle cannot be maintained. Hence, the position uncertainty of a particle is bounded below by $\bar\lambda = \hbar/mc$, the Compton wavelength of that particle. (Note that the classical electron radius mentioned in the last section is smaller than its Compton wavelength by a factor $\alpha = e^2/(4\pi\epsilon_0\hbar c) \approx 1/137$, the fine structure constant. Hence the defect of the classical theory is completely masked by quantum theory.) Unlike the particle coordinates, momenta are observable quantities, as there is no general physical principle setting a lower limit to momentum uncertainty.
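To get a feeling for the length scales involved, the following short sketch evaluates the classical electron radius, the (reduced) Compton wavelength and their ratio. The numerical constants are CODATA 2018 values in SI units inserted by us; they are not taken from the text.

```python
import math

# CODATA 2018 values in SI units (inserted here as an assumption of this sketch).
E_CHARGE = 1.602176634e-19    # elementary charge in C
HBAR = 1.054571817e-34        # reduced Planck constant in J s
C_LIGHT = 299792458.0         # speed of light in m/s
EPS0 = 8.8541878128e-12       # vacuum permittivity in F/m
M_ELECTRON = 9.1093837015e-31 # electron mass in kg

# Classical electron radius r_c = e^2 / (4 pi eps0 m c^2), roughly 2.8e-15 m.
r_c = E_CHARGE**2 / (4 * math.pi * EPS0 * M_ELECTRON * C_LIGHT**2)

# Reduced Compton wavelength hbar / (m c), roughly 3.9e-13 m.
lambda_bar = HBAR / (M_ELECTRON * C_LIGHT)

# Fine structure constant alpha = e^2 / (4 pi eps0 hbar c).
alpha = E_CHARGE**2 / (4 * math.pi * EPS0 * HBAR * C_LIGHT)

ratio = r_c / lambda_bar      # equals alpha by construction
```

The ratio `r_c / lambda_bar` reproduces $\alpha$ identically, which is just the statement made in the parenthesis above.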

Since the position eigenstate $|\boldsymbol r\rangle$ has no physical meaning (not even as an improper eigenstate of the continuous spectrum), there is no wavefunction $\Psi(\boldsymbol r) = \langle\boldsymbol r|\Psi\rangle$ figuring in quantum field theory. Nevertheless, local field operators like $\hat\psi(\boldsymbol r)$, acting in the representation space on Hilbert states, are used as dynamical variables, in the same manner as creation and annihilation operators are used in the occupation number representation of non-relativistic theory: observables are expressed in terms of those dynamical variables. Particularly, local expectation values may be determined with their use.

In order to describe physical dynamics, quantum states must be considered at a given instant of time t. Stationary states are characterized (in Schrödinger picture; $\hbar$ is put to unity again from here on throughout) by

$$ i\partial_0|\Psi\rangle = |\Psi\rangle E/c \quad\text{or}\quad |\Psi(t)\rangle = |\Psi(0)\rangle\, e^{-iEt}. \tag{8.40} $$

For the physical vacuum state, denoted by $|\rangle$,

$$ \partial_0|\rangle = 0 \tag{8.41} $$

holds, and, in contrast to non-relativistic physics, energies E are non-negative in relativistic physics: a negative energy E of a state would imply a negative rest mass of the system, for the existence of which nature provides no evidence.

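If, at t = 0, $\hat c^\dagger$ creates stationary matter with E > 0 out of the vacuum, and $\hat c$ annihilates it,

$$ |\Psi(0)\rangle = \hat c^\dagger|\rangle, \qquad |\rangle = \hat c|\Psi(0)\rangle, \tag{8.42} $$

then time evolution is either described in Schrödinger picture by time-dependent states and time-independent operators or alternatively in Heisenberg picture by time-independent states and time-dependent operators, both types of quantities coinciding at t = 0. From (8.40, 8.42) one has

$$ e^{iEt} = \langle\Psi(t)|\Psi(0)\rangle = \bigl(\langle\Psi(0)|e^{iEt}\bigr)\bigl(\hat c^\dagger|\rangle\bigr) = \langle\Psi(0)|\bigl(e^{iEt}\hat c^\dagger\bigr)|\rangle = \langle\Psi|\hat c^\dagger(t)|\rangle, \tag{8.43} $$

and

$$ e^{-iEt} = \langle\Psi(0)|\Psi(t)\rangle = \bigl(\langle|\hat c\bigr)\bigl(e^{-iEt}|\Psi(0)\rangle\bigr) = \langle|\bigl(\hat c\,e^{-iEt}\bigr)|\Psi(0)\rangle = \langle|\hat c(t)|\Psi\rangle. \tag{8.44} $$

Hence, in Heisenberg picture creation and annihilation operators, respectively, of stationary matter have definite time-dependences

$$ \hat c^\dagger(t) = \hat c^\dagger e^{iEt}, \qquad \hat c(t) = \hat c\, e^{-iEt}, \qquad E > 0. \tag{8.45} $$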
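The time dependence (8.45) may be illustrated on the smallest possible model: a two-dimensional space spanned by the vacuum and one stationary state of energy E. The sketch below is our own toy construction (matrix representation, names and sample numbers are assumptions); it evaluates $\hat c(t) = e^{i\hat Ht}\,\hat c\,e^{-i\hat Ht}$ directly.

```python
import cmath

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Basis: index 0 = vacuum |>, index 1 = c^dagger |> with energy E (hbar = 1).
E, t = 1.7, 0.9                      # arbitrary sample values
c_op = [[0, 1], [0, 0]]              # annihilator: maps state 1 onto the vacuum

# H = diag(0, E), so exp(-iHt) is diagonal as well.
u = [[1, 0], [0, cmath.exp(-1j * E * t)]]
u_dagger = [[1, 0], [0, cmath.exp(1j * E * t)]]

# Heisenberg-picture operator c(t) = exp(iHt) c exp(-iHt).
c_t = matmul(matmul(u_dagger, c_op), u)
expected_phase = cmath.exp(-1j * E * t)
```

The only non-zero element of `c_t` is the original one multiplied by $e^{-iEt}$, in accordance with (8.45).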


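Field operators are generally composed of linear combinations of both creation and annihilation operators. In Heisenberg picture they are functions of position and of time, which generally allows for a covariant way of writing.

As an example the field operator of the free photon field

$$ \hat A^\mu(x^\rho) = \sum_{\boldsymbol k\nu}\sqrt{\frac{\mu_0}{2k^0}}\,\Bigl[\hat c^\nu_{\boldsymbol k}\,u^\mu_\nu\, e^{-ikx} + \hat c^{\nu\dagger}_{\boldsymbol k}\,u^\mu_\nu\, e^{ikx}\Bigr], \tag{8.46} $$

$$ [\hat c^\mu_{\boldsymbol k}, \hat c^\nu_{\boldsymbol k'}] = 0 = [\hat c^{\mu\dagger}_{\boldsymbol k}, \hat c^{\nu\dagger}_{\boldsymbol k'}], \qquad [\hat c^\mu_{\boldsymbol k}, \hat c^{\nu\dagger}_{\boldsymbol k'}] = -g^{\mu\nu}\delta_{\boldsymbol k\boldsymbol k'}, \qquad u^\mu_\sigma\, g^{\sigma\tau} u^\nu_\tau = g^{\mu\nu} \tag{8.47} $$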

is considered. Here, $kx$ in the exponents abbreviates $k_\mu x^\mu$, and, with $k^0 = |\boldsymbol k|$, (8.46) obeys the field equation $\Box\hat A^\mu = 0$ for the free photon field (cf. (8.27)), as it should. The $u^\mu_\nu$, ν = 0, 1, 2, 3, are constant polarization vectors; completeness demands that they form an isometric transformation matrix in Minkowski's metric (last relation (8.47)). However, in order to have covariant canonical bosonic commutation relations as in (8.47), the relation for μ = ν = 0 had to be given the wrong sign, leading to the creation of Fock-space states with negative norm (see footnote 11). Covariance is a demand of causality: otherwise the field components would not commute outside of the light cone (i.e., $\hat A^\mu(x^\rho)$ with $\hat A^\nu(x'^\rho)$ for $(x^\rho - x'^\rho)(x_\rho - x'_\rho) < 0$, cf. [Itzykson and Zuber, 1980]). On the other hand, (8.46) does not yet correspond to classical electrodynamics, because $\Box A^\mu = 0$ was the correct classical field equation only under the Lorentz gauge $\partial_\mu A^\mu = 0$. This gauge, as applied to the field operator (8.46), again conflicts with (8.47) and hence with causality. Therefore, instead of posing the Lorentz gauge as a condition on the field operator, one must consider it as a condition on physical states $|\Psi\rangle$: only states with $\partial_\mu\hat A^\mu|\Psi\rangle = 0$ are to be considered physical. Physical states then have a positive norm and are characterized by occupation of transverse photon modes only. (For details cf. again e.g. [Itzykson and Zuber, 1980].)

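Finally, the prefactor in (8.46) was chosen in such a way that the Hamiltonian in correspondence with (8.30) is

$$ \hat H = \int d^3r\,\frac{1}{2}\bigl(\epsilon_0\hat{\boldsymbol E}^2 + \mu_0\hat{\boldsymbol H}^2\bigr) = \sum_{\boldsymbol k} -c|\boldsymbol k|\;\hat c^{\rho\dagger}_{\boldsymbol k}\,u^\mu_\rho\, g_{\mu\nu}\, u^\nu_\sigma\,\hat c^\sigma_{\boldsymbol k} + \text{const.}, \tag{8.48} $$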

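Footnote 11: With the natural definition of the vacuum, $\langle|\rangle = 1$, $\hat c^{0\dagger}_{\boldsymbol k}|\rangle = |\boldsymbol k0\rangle$, $\hat c^0_{\boldsymbol k}|\rangle = 0$, one finds from (8.47) for μ = ν = 0 that $\langle\boldsymbol k0|\boldsymbol k0\rangle = \langle|\hat c^0_{\boldsymbol k}\hat c^{0\dagger}_{\boldsymbol k}|\rangle = -\langle|\rangle + \langle|\hat c^{0\dagger}_{\boldsymbol k}\hat c^0_{\boldsymbol k}|\rangle = -1$, i.e., a time-polarized photon would have a negative norm.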


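where $\hat{\boldsymbol E}$ and $\hat{\boldsymbol H}$ follow from $\hat A^\mu$ as in the classical case, and integration is over a normalization volume, which we tacitly put to unity. The (infinite) additive constant is put to zero by a simple renormalization, consisting of rearranging products of field operators to normal order, so that creators always precede annihilators from left to right. Normal order is denoted by putting the corresponding expressions between colons:

$$ \hat H = \int d^3r\,\frac{1}{2}:\!\bigl(\epsilon_0\hat{\boldsymbol E}^2 + \mu_0\hat{\boldsymbol H}^2\bigr)\!: \;= \sum_{\boldsymbol k} -c|\boldsymbol k|\;\hat c^{\rho\dagger}_{\boldsymbol k}\,u^\mu_\rho\, g_{\mu\nu}\, u^\nu_\sigma\,\hat c^\sigma_{\boldsymbol k}. \tag{8.49} $$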

Due to the wrong sign of the commutation relations for time-polarized photons, this Hamiltonian is not positive definite in the total Hilbert space; it is, however, definite in the physical sector of states containing only transverse photons. If the $\hat c^{(t)\dagger}_{\boldsymbol k} = \hat c^{\rho\dagger}_{\boldsymbol k}u^t_\rho$, t = 1, 2, create transverse photons, the projection of the Hamiltonian on the physical sector of the Hilbert space is $\sum_{\boldsymbol k}\sum_{t=1,2} c|\boldsymbol k|\,\hat c^{(t)\dagger}_{\boldsymbol k}\hat c^{(t)}_{\boldsymbol k}$.

Since photons are bosons, modes may be occupied in macroscopic number, forming a classical (Bose condensed) field, which may be embossed into the 'quantum vacuum' by renormalization. Time-independent electromagnetic fields are always classical, Bose condensed. Therefore their nonzero longitudinal components do not pose problems with the Hilbert space norm of the quantized (transverse) part of the field.

8.7 The Dirac Field

For the energy of a free electron, the dispersion relation

$$ E^2 = p^2c^2 + m^2c^4 \tag{8.50} $$

holds. Putting $i\partial^\mu = (E/c, \boldsymbol p)$, one finds the Klein-Gordon equation

$$ (\partial_\mu\partial^\mu + m^2c^2)\,\psi(x^\nu) = 0. \tag{8.51} $$

Containing a second time-derivative and no spin operator, it does on the one hand not really look close to the Schrödinger equation, which should be its non-relativistic limit, and describes on the other hand a scalar (bosonic) field and not a spinor (fermionic) field. The Dirac equation instead is a four-component equation

$$ i\partial_0\psi = (-i\boldsymbol\alpha\cdot\nabla + \beta mc)\,\psi = \hat H_D\,\psi, \qquad \psi = \begin{pmatrix}\psi_1\\ \psi_2\\ \psi_3\\ \psi_4\end{pmatrix} \tag{8.52} $$


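with 4 × 4 matrices $\alpha_i$ and β such that each component of a ψ obeying this Dirac equation is a solution of the Klein-Gordon equation. From (8.52) it follows that $0 = (i\partial_0 + \hat H_D)(i\partial_0 - \hat H_D)\psi = -(\partial_0^2 + \hat H_D^2)\psi$. Hence we demand

$$ -\partial_0\partial_0\psi = (-i\boldsymbol\alpha\cdot\nabla + \beta mc)^2\psi \overset{!}{=} (-\nabla^2 + m^2c^2)\,\psi. \tag{8.53} $$

The last relation requires the algebraic properties

$$ [\alpha_i, \alpha_k]_+ = 2\delta_{ik}, \qquad [\alpha_i, \beta]_+ = 0, \qquad \beta^2 = 1, \tag{8.54} $$

where 1 means a 4 × 4 unit matrix, and $\delta_{ik}$ as well is for every index pair ik either a 4 × 4 unit matrix or a zero matrix. These algebraic relations may be cast into a covariant form

$$ \gamma^0 = \beta, \qquad \gamma^i = \beta\alpha_i, \qquad [\gamma^\mu, \gamma^\nu]_+ = 2g^{\mu\nu}, \tag{8.55} $$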

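having a simple representation as

$$ \gamma^0 = \begin{pmatrix} 1 & 0\\ 0 & -1\end{pmatrix}, \qquad \gamma^i = \begin{pmatrix} 0 & \sigma_i\\ -\sigma_i & 0\end{pmatrix}, \tag{8.56} $$

or

$$ \beta = \begin{pmatrix} 1 & 0\\ 0 & -1\end{pmatrix}, \qquad \alpha_i = \begin{pmatrix} 0 & \sigma_i\\ \sigma_i & 0\end{pmatrix}, \tag{8.57} $$

where 1 means a 2 × 2 unit matrix, and $\sigma_i$ are the well known Pauli matrices (twice the matrices (1.10)).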
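That the representation (8.56, 8.57) indeed satisfies the algebra (8.54, 8.55) is a matter of a few matrix multiplications. The following sketch carries them out with plain nested lists; all helper names are ours, and the metric is taken as diag(1, −1, −1, −1).

```python
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [                      # Pauli matrices sigma_1, sigma_2, sigma_3
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]

def scale(s, m):
    return [[s * x for x in row] for row in m]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I4 = block(I2, Z2, Z2, I2)
Z4 = scale(0, I4)
BETA = block(I2, Z2, Z2, scale(-1, I2))               # (8.57)
ALPHA = [block(Z2, s, s, Z2) for s in SIGMA]          # (8.57)
GAMMA = [BETA] + [matmul(BETA, a) for a in ALPHA]     # (8.55)
METRIC = [1, -1, -1, -1]                              # diagonal of g^{mu nu}

def dirac_algebra_holds():
    """Check (8.54) and the anticommutators of (8.55)."""
    ok = close(matmul(BETA, BETA), I4)
    for i in range(3):
        ok = ok and close(anticommutator(ALPHA[i], BETA), Z4)
        for k in range(3):
            ok = ok and close(anticommutator(ALPHA[i], ALPHA[k]),
                              scale(2 if i == k else 0, I4))
    for mu in range(4):
        for nu in range(4):
            ok = ok and close(anticommutator(GAMMA[mu], GAMMA[nu]),
                              scale(2 * METRIC[mu] if mu == nu else 0, I4))
    return ok
```

Calling `dirac_algebra_holds()` returns `True`; the matrices `GAMMA[1..3]` built from $\beta\alpha_i$ coincide with the block form given in (8.56).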

In covariant writing the Dirac equation is now

$$ (i\gamma^\mu\partial_\mu - mc)\,\psi = 0. \tag{8.58} $$

As charge conservation is implemented in quantum theory as a symmetry with respect to some inner conjugation, we want to consider the Hermitian conjugate of this equation. Observe in this respect that $\alpha_i$ and $\beta = \gamma^0$ are Hermitian matrices, while $\gamma^{i\dagger} = \gamma^0\gamma^i\gamma^0$. Hermitian conjugation of the equation (8.58) thus leads to $\psi^\dagger\gamma^0(-i\overleftarrow{\partial}_\mu\gamma^\mu - mc)\gamma^0 = 0$ (since we had to reverse the order of matrix multiplications, differentiation must now operate to the left). The result is usually written as

$$ \bar\psi\,(i\overleftarrow{\partial}_\mu\gamma^\mu + mc) = 0, \qquad \bar\psi \overset{\text{def}}{=} \psi^\dagger\gamma^0, \qquad (\bar\gamma^\mu = \gamma^\mu). \tag{8.59} $$


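Adding (8.58), multiplied by $\bar\psi$ from the left, and (8.59), multiplied by ψ from the right, yields

$$ \bar\psi\,(\overleftarrow{\partial}_\mu\gamma^\mu + \gamma^\mu\partial_\mu)\,\psi = \partial_\mu(\bar\psi\gamma^\mu\psi) = 0. \tag{8.60} $$

Hence, the conserved four-current with density

$$ -ej^\mu = -ec\,\bar\psi\gamma^\mu\psi \tag{8.61} $$

appears as a proposed candidate for the electric four-current.

For physical interpretation of the field ψ itself, less compressed expressions are needed. If one composes the four-component field ψ as a bispinor of two two-component spinors φ and χ, the Dirac equation acquires the form

$$ \psi = \begin{pmatrix}\phi\\ \chi\end{pmatrix}, \qquad \begin{aligned} i\dot\phi &= mc^2\phi - ic\,\boldsymbol\sigma\cdot\nabla\chi,\\ i\dot\chi &= -mc^2\chi - ic\,\boldsymbol\sigma\cdot\nabla\phi. \end{aligned} \tag{8.62} $$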

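This shows that the Dirac equation has two types of solutions: if φ is large, then the time-derivative corresponds to a positive energy $\sim mc^2$, and if χ is large, then the time-derivative corresponds to a negative energy $\sim -mc^2$. Hence, in the field operator of the free Dirac field

$$ \hat\psi(x^\mu) = \sum_{\boldsymbol k\sigma}\frac{1}{\sqrt{2k^0}}\Bigl[\hat a_{\boldsymbol k\sigma}\,u_{\boldsymbol k\sigma}\,e^{-ikx} + \hat b^\dagger_{\boldsymbol k\sigma}\,v_{\boldsymbol k\sigma}\,e^{ikx}\Bigr], \qquad k^0 = \sqrt{\boldsymbol k^2 + m^2c^2} \tag{8.63} $$

the amplitude bispinor $u_{\boldsymbol k\sigma}$ must be taken from a solution of the Dirac equation with large φ, and $v_{\boldsymbol k\sigma}$ must be taken from a solution with large χ. For both cases there are two independent orthogonal solutions obeying

$$ \sum_\sigma^{1,2} u_{\boldsymbol k\sigma}\bar u_{\boldsymbol k\sigma} = \gamma^\mu k_\mu + mc, \qquad \sum_\sigma^{1,2} v_{\boldsymbol k\sigma}\bar v_{\boldsymbol k\sigma} = -\gamma^\mu k_\mu + mc. \tag{8.64} $$

Since these conditions have a covariant form, no conflict between field quantization and causality appears.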

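To obtain (8.64), first observe that the Dirac equation (8.58) for $\hat\psi$ requires

$$ (\gamma^\mu k_\mu - mc)\,u_{\boldsymbol k\sigma} = 0, \qquad (\gamma^\mu k_\mu + mc)\,v_{\boldsymbol k\sigma} = 0, \tag{8.65} $$

where $k^0$ was already chosen so that $(\gamma^\mu k_\mu)^2 - (mc)^2 = 0$. Hence we may put

$$ u_{\boldsymbol k\sigma} \sim (\gamma^\mu k_\mu + mc)\,e_{\boldsymbol k\sigma}, \qquad v_{\boldsymbol k\sigma} \sim (\gamma^\mu k_\mu - mc)\,f_{\boldsymbol k\sigma}, \tag{8.66} $$

where the $e_{\boldsymbol k\sigma}$'s have zeros in the last two components and the $f_{\boldsymbol k\sigma}$'s have zeros in the first two components, whence

$$ \sum_\sigma^{1,2} e_{\boldsymbol k\sigma}\bar e_{\boldsymbol k\sigma} \sim (1 + \gamma^0)/2, \qquad \sum_\sigma^{1,2} f_{\boldsymbol k\sigma}\bar f_{\boldsymbol k\sigma} \sim (1 - \gamma^0)/2. \tag{8.67} $$

Further observe, again because of $(\gamma^\mu k_\mu)^2 - (mc)^2 = 0$, that $(\pm\gamma^\mu k_\mu + mc)(1 \pm \gamma^0)(\pm\gamma^\mu k_\mu + mc) = (\pm\gamma^\mu k_\mu + mc)[(\pm\gamma^\mu k_\mu - mc)(1 \mp \gamma^0) + 2mc + 2k^0] = (\pm\gamma^\mu k_\mu + mc)\,2(mc + k^0)$. Hence, if we normalize $e_{\boldsymbol k\sigma}, f_{\boldsymbol k\sigma} \sim 1/\sqrt{mc + k^0}$, we arrive at (8.64). Along the same line scalar products $\bar u_{\boldsymbol k\sigma}u_{\boldsymbol k\sigma} = \bar v_{\boldsymbol k\sigma}v_{\boldsymbol k\sigma} = mc$, $\bar u_{\boldsymbol k\sigma}v_{\boldsymbol k'\sigma'} = 0$, $\bar u_{\boldsymbol k\sigma}\boldsymbol\gamma u_{\boldsymbol k\sigma} = 2\boldsymbol k$, and so on, are obtained.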
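The construction (8.65) to (8.67) can be followed numerically for the electron amplitudes $u_{\boldsymbol k\sigma}$. The sketch below is an illustration under our own assumptions: the representation (8.56), unit basis vectors $e_{\boldsymbol k\sigma}$ with normalization $1/\sqrt{mc + k^0}$, an arbitrary sample momentum, and helper names chosen by us. It builds the two spinors, sums $u\bar u$, and compares with $\gamma^\mu k_\mu + mc$.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(s, m):
    return [[s * x for x in row] for row in m]

def dagger(m):
    return [[complex(m[i][j]).conjugate() for i in range(len(m))]
            for j in range(len(m[0]))]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
G0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
G1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]]
G2 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]]
G3 = [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]]

def k_slash(k0, k):
    """gamma^mu k_mu = gamma^0 k^0 - gamma . k for metric (+,-,-,-)."""
    m = scale(k0, G0)
    for comp, g in zip(k, (G1, G2, G3)):
        m = add(m, scale(-comp, g))
    return m

def spin_sum_u(k, mc):
    """Return (sum over sigma of u ubar, gamma.k + mc, list of spinors u)."""
    k0 = math.sqrt(sum(x * x for x in k) + mc * mc)
    projector = add(k_slash(k0, k), scale(mc, I4))
    norm = 1.0 / math.sqrt(mc + k0)
    total, spinors = scale(0, I4), []
    for sigma in range(2):
        e = [[1.0 if row == sigma else 0.0] for row in range(4)]  # upper components only
        u = scale(norm, matmul(projector, e))                     # (8.66)
        u_bar = matmul(dagger(u), G0)                             # ubar = u^dagger gamma^0
        total = add(total, matmul(u, u_bar))
        spinors.append(u)
    return total, projector, spinors

def dirac_residual(k, mc, u):
    """Largest component of (gamma.k - mc) u, which should vanish by (8.65)."""
    k0 = math.sqrt(sum(x * x for x in k) + mc * mc)
    r = matmul(add(k_slash(k0, k), scale(-mc, I4)), u)
    return max(abs(row[0]) for row in r)
```

For instance, `spin_sum_u((0.3, -0.4, 1.2), 1.0)` returns a spin sum equal to the projector $\gamma^\mu k_\mu + mc$ within rounding error, and `dirac_residual` is of the order of machine precision for both spinors.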

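In order to have a causal anticommutation rule for $\hat\psi$ and $\hat{\bar\psi}$, one must require

$$ [\hat a_{\boldsymbol k\sigma}, \hat a_{\boldsymbol k'\sigma'}]_+ = 0 = [\hat b_{\boldsymbol k\sigma}, \hat b_{\boldsymbol k'\sigma'}]_+, \qquad [\hat a_{\boldsymbol k\sigma}, \hat b_{\boldsymbol k'\sigma'}]_+ = 0 = [\hat a_{\boldsymbol k\sigma}, \hat b^\dagger_{\boldsymbol k'\sigma'}]_+, $$

$$ [\hat a_{\boldsymbol k\sigma}, \hat a^\dagger_{\boldsymbol k'\sigma'}]_+ = \delta_{\sigma\sigma'}\delta_{\boldsymbol k\boldsymbol k'} = [\hat b_{\boldsymbol k\sigma}, \hat b^\dagger_{\boldsymbol k'\sigma'}]_+. \tag{8.68} $$

(With commutation rules, one would not be able to obtain causal fields $\hat\psi$ and $\hat{\bar\psi}$, commuting outside of the light cone; this is how Pauli's spin-statistics theorem is obtained, which states that half-integer spin particles must be fermions, and integer spin particles must be bosons; e.g. [Itzykson and Zuber, 1980].) Since the electric charge density is, according to (8.61, 8.62), equal to $:\!-e\hat\phi^\dagger\hat\phi + e\hat\chi\hat\chi^\dagger\!:$, it is clear that $\hat a^\dagger_{\boldsymbol k\sigma}$ creates an electron and $\hat b^\dagger_{\boldsymbol k\sigma}$ creates a positron.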

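The Hamiltonian of the free Dirac field is obtained as

$$ \hat H = \int d^3r\; c:\!\hat{\bar\psi}\,(-i\boldsymbol\gamma\cdot\nabla + mc)\,\hat\psi\!: \;= \sum_{\boldsymbol k\sigma} c\sqrt{\boldsymbol k^2 + m^2c^2}\;\bigl[\hat a^\dagger_{\boldsymbol k\sigma}\hat a_{\boldsymbol k\sigma} + \hat b^\dagger_{\boldsymbol k\sigma}\hat b_{\boldsymbol k\sigma}\bigr]. \tag{8.69} $$

Here, in the normal ordering process fermionic sign changes are understood.

As in Section 8.5, electromagnetic coupling (with electron charge −e) is obtained by adding a term $ej^\mu A_\mu$ to the Lagrangian density. (In Section 8.5 $j^\mu$ was given by the particle position and velocity.) Since in the Dirac theory

$$ \hat L_{\text{int}} = e:\!\hat j^\mu\hat A_\mu\!: \;= ec:\!\hat{\bar\psi}\gamma^\mu\hat\psi\,\hat A_\mu\!: \tag{8.70} $$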


does not contain time-derivatives of the fields, it does not modify the canonical field momenta; hence the same interaction term with reversed sign enters the Hamiltonian density:

$$ \hat H_{\text{int}} = -e:\!\hat j^\mu\hat A_\mu\!:. \tag{8.71} $$

This theory, taken literally, again has a defect: it can be shown to renormalize all observable electric charges to zero, hence it must be used with phenomenological charge renormalization to the observed value. The defect can be cured by replacing the electromagnetic theory by electro-weak theory with a not too large number of coupled fermions [Berestetskii, 1976], but this well confirmed theory again . . .


9 Current Density Functional Theory

The basic concern of density functional theory is the ground state energy of an inhomogeneous interacting many-particle system as a function of the external potential acting on its density (and possibly as a function of the particle number, allowing for the consideration of ionization energies, affinities, and excitation gaps). This ground state energy function is replaced by its Legendre transform as a function of the particle density, and the Legendre back transformation leads via insertion of an interaction-free reference system to an effective one-particle equation whose solution finds the ground state energy and density for any given external potential. As was seen in Part I, this theory provides a reformulation rather than a solution of the many-body problem for the ground state, because it does not explicitly provide the Legendre transform H[n] to E[v, N] (cf. (6.22 and 6.25)). The original many-body problem is now contained in H[n], which can only be guessed, although not without success.

The most radical relativistic generalization of this theory, which is considered in the present chapter, does the same for the ground state energy of an inhomogeneous interacting quantum field of matter as a function of the external field acting on its particle current density (and possibly as a function of the conserved charges of the field of matter). It again provides an effective single-particle equation, the Kohn-Sham-Dirac equation, for obtaining the ground state energy and current density from the Legendre transform of the ground state energy functional. This is now a current density functional which potentially contains the whole information on ground states of interacting quantum fields (of a given type), and which, of course, is all the more unknown. One can only apply guesswork to account for part of the relativistic effects of the problem.

In the first section, the ground state of an inhomogeneous quantum field and its current density are defined and the necessary longitudinal fields of particle interaction are introduced as condensed mean fields. The Kohn-Sham-Dirac equation in its most general form is derived in the subsequent section, and in the third section the four-current is decomposed into its orbital and spin parts in order to achieve closer resemblance to the non-relativistic theory. With respect to the exchange and correlation energy functional, the shape of the relativistic theory also provides useful suggestions for non-relativistic functionals. Some related approximative functionals are presented in section four. The most important relativistic term is of course that of the kinetic energy, yielding the Dirac form of the Kohn-Sham equation. Any implementation of this theory must take care of a proper projection of the Dirac Hamiltonian onto the sector of electron states only. Otherwise the Hamiltonian in Schrödinger representation would not be bounded below (as the field-theoretic Hamiltonian is due to normal order). A very accurate numerical implementation together with a survey of relativistic density functional applications is presented in [Eschrig et al., 2003b].

To be specific, the material of this chapter is presented for the case of the electron-positron-photon field, the only relevant known case in nature of a field subject to the action of an inhomogeneous external potential. In principle it applies, however, to any field theory, and even to cases which are inhomogeneous due to spontaneous symmetry breaking only, as e.g. atomic nuclei constituting droplets of nuclear matter in homogeneous surroundings. (The truly homogeneous ground state is of course also included in density functional theory; however, this theory does not provide any result for that case. The independently acquired knowledge on that case is on the contrary used to guess the density functional.)

9.1 QED Ground State in a Static External Field

We consider a quantum electrodynamical system in a static external classical (Bose condensed) field $F^{\mu\nu}$, given by the four-potential

$$ A^\mu(x^\sigma):\quad \partial_0F^{\mu\nu} = \partial_0(\partial^\nu A^\mu - \partial^\mu A^\nu) = 0. \tag{9.1} $$

Of course, the static condition refers to a certain distinguished reference frame. We further assume spatial periodicity in a large periodic spatial volume $|T^3|$ with respect to that reference frame, and refer all integrated quantities to that volume $|T^3|$ (toroidal space; note that in QED this step has another completely independent advantage: it prevents an 'infrared catastrophe' of creation of long wavelength photons in unlimited number). Of course, a thorough treatment needs the same caution as described in Section 2.7.

The Hamiltonian in question is

$$ \hat H_A = \int d^3r\,\bigl(\hat{\mathcal H} - e\hat j^\mu A_\mu\bigr), \tag{9.2} $$

where

$$ \hat j^\mu = c:\!\hat{\bar\psi}\gamma^\mu\hat\psi\!:, \tag{9.3} $$

is the four-current density operator of the electron-positron field, and (cf. (8.69, 8.71, and 8.49))

$$ \hat{\mathcal H} = c:\!\hat{\bar\psi}\,(-i\boldsymbol\gamma\cdot\nabla + mc)\,\hat\psi\!: \;-\; e\hat j^\mu\hat A_\mu + \frac{1}{2}:\!\bigl(\epsilon_0\hat{\boldsymbol E}^2 + \mu_0\hat{\boldsymbol H}^2\bigr)\!:. \tag{9.4} $$

(The colons for normal order of $\hat j^\mu$ were already included in the definition of the four-current density operator and need not be repeated in the second term of $\hat{\mathcal H}$; since $\hat\psi$ and $\hat A^\mu$ commute, no reordering of the product is needed.) The field operator $\hat A^\mu$ describes the field created in the quantum electrodynamical system and is to be distinguished from the external (c-number) field $A^\mu$ in (9.1, 9.2). Furthermore, (cf. (8.4, 8.16))

$$ \hat{\boldsymbol E} = -c\,(\partial_0\hat{\boldsymbol A} + \nabla\hat A^0), \qquad \mu_0\hat{\boldsymbol H} = \nabla\times\hat{\boldsymbol A}. \tag{9.5} $$

Note that we did not include the field energy of the external field $A^\mu$ in the Hamiltonian. Therefore, the eigenvalues of that Hamiltonian are allowed to be negative (for negative $-eA^0$). For the following it is furthermore crucial that normal order in (9.4) refers to the creation and annihilation operators of the (renormalized) asymptotic fields of the homogeneous system, i.e. with zero external four-potential $A^\mu$, so that this intrinsic Hamiltonian density is independent of the external four-potential.

In Heisenberg picture this Hamiltonian leads to operator field equations of motion

$$ \bigl[i\gamma^\mu(\partial_\mu - ieA_\mu - ie\hat A_\mu) - mc\bigr]\hat\psi = 0, \tag{9.6} $$

$$ \partial_\nu\hat F^{\mu\nu} = -\mu_0e\,\hat j^\mu. \tag{9.7} $$

They determine the functional dependence of the Heisenberg field operators on $x^\sigma$ in accordance with boundary conditions to be given additionally. Particularly, equal-time canonical (anti-)commutation relations between canonical conjugate field variables must be required as initial conditions for the equations (9.6, 9.7).

In quantum field theory one mostly considers scattering situations (in an infinite spatial volume V). One considers given incoming (asymptotic) fields at far past (initial conditions) and adds retarded (as required by causality) contributions appearing from interaction, or alternatively one considers given outgoing fields at far future and adds advanced contributions absorbed by interaction. The connection between given incoming and given outgoing fields is then determined by the scattering matrix S. The quantum state $|\Psi\rangle$ is fixed in this Heisenberg picture. The total electric charge $-eN = -(e/c)\int d^3r\,\langle\Psi|\hat j^0|\Psi\rangle$ of this state is also fixed due to charge conservation $\partial_\mu\hat j^\mu = 0$. N is the excess number of electrons against positrons. The number of electron-positron pairs as well as the number of photons, however, may change in time.

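By way of contrast, here we are facing another situation, where the expectation values of all observables are stationary, i.e.,

$$ \langle\Psi|\hat j^\mu|\Psi\rangle = J^\mu \tag{9.8} $$

and

$$ \langle\Psi|\hat F^{\mu\nu}|\Psi\rangle = f^{\mu\nu} \tag{9.9} $$

are time-independent. We fix the total charge

$$ Q = -\frac{e}{c}\int d^3r\,J^0(\boldsymbol r) \tag{9.10} $$

in the system and consider the ground state of the quantum field as that state minimizing

$$ E[A,Q] \overset{\text{def}}{=} \min_\Psi\Bigl\{\langle\Psi|\hat H_A|\Psi\rangle \Bigm| -\frac{e}{c}\int d^3r\,\langle\Psi|\hat j^0|\Psi\rangle = Q\Bigr\}. \tag{9.11} $$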

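(In a spatial torus we can always assume that this ground state exists provided $\hat H_A$ is bounded below on the sector of states with fixed Q.) Because of (9.7) we have, however, $-\mu_0eJ^\mu = \langle\Psi|\partial_\nu\hat F^{\mu\nu}|\Psi\rangle = \langle\Psi|\partial_\nu\partial^\nu\hat A^\mu - \partial_\nu\partial^\mu\hat A^\nu|\Psi\rangle$, and hence, whenever $J^\mu$ is non-zero, we face a situation with anomalous means $\langle\Psi|\hat A^\mu|\Psi\rangle \neq 0$, i.e. $\langle\Psi|\hat c^\mu_{\boldsymbol k}|\Psi\rangle \neq 0 \neq \langle\Psi|\hat c^{\mu\dagger}_{\boldsymbol k}|\Psi\rangle$, indicative of the presence of a condensate in the state $|\Psi\rangle$: a stationary current (e.g. a charge as its time component) produces a stationary (condensed) electromagnetic field.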

The standard renormalization procedure [Itzykson and Zuber, 1980, §§ 11.4.2 and 12.5.3] in such a situation is to subtract $|\Psi\rangle$-dependent mean values from the quantum fields $\hat A^\mu$ and $\hat j^\mu$, and to pass over to a description with reduced fields $\hat a^\mu$ and $\hat J^\mu$ defined by

$$ \hat A^\mu = a^\mu + \hat a^\mu, \qquad \langle\Psi|\hat a^\mu|\Psi\rangle = \langle\hat a^\mu\rangle = 0, \tag{9.12} $$

$$ \hat j^\mu = J^\mu + \hat J^\mu, \qquad \langle\Psi|\hat J^\mu|\Psi\rangle = \langle\hat J^\mu\rangle = 0. \tag{9.13} $$

Separating the mean value from the field equation (9.7) yields

$$ \partial_\nu f^{\mu\nu} = -\mu_0eJ^\mu, \qquad \partial_\nu\hat f^{\mu\nu} = -\mu_0e\hat J^\mu. \tag{9.14} $$

Recall that $A^\mu$, $a^\mu$, $J^\mu$ are time-independent, and hence $\nabla\cdot\boldsymbol J = 0$. The former state $|\Psi\rangle$ is now to be understood as belonging to the vacuum sector with respect to the fields $\hat J^\mu$, $\hat a^\mu$, i.e., to the Fock space above the vacuum of those fields, which does not allow for anomalous means of those fields.

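The next step consists in transforming the Hamiltonian to the new variables. First observe that the static electric field $\boldsymbol e = -c\nabla a^0$ appears out of the time component of the original four-potential operator, and hence, because of the indefinite metric of the corresponding Hilbert space (cf. footnote 11), $\langle:\!\hat{\boldsymbol E}^2\!:\rangle = -\boldsymbol e^2 + \langle:\!\hat{\boldsymbol e}^2\!:\rangle$, while $\langle:\!\hat{\boldsymbol H}^2\!:\rangle = \boldsymbol h^2 + \langle:\!\hat{\boldsymbol h}^2\!:\rangle$. Moreover,

$$ -\epsilon_0\boldsymbol e^2 = \epsilon_0c\,\boldsymbol e\cdot\nabla a^0 \doteq -\epsilon_0c\,a^0\,\nabla\cdot\boldsymbol e = e\,a^0J^0, \tag{9.15} $$

$$ \mu_0\boldsymbol h^2 = (\nabla\times\boldsymbol a)\cdot\boldsymbol h \doteq \boldsymbol a\cdot(\nabla\times\boldsymbol h) = -e\,\boldsymbol a\cdot\boldsymbol J. \tag{9.16} $$

The sign $\doteq$ means that the expressions are equivalent under the spatial integral (after integration by parts).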

Terms linear in the new field operators are omitted in the transformed Hamiltonian, since they have zero expectation values in the vacuum sector of the reduced fields. The effective Hamiltonian of the inhomogeneous quantum field system is now obtained as

$$ \hat H^{\text{eff}}_{A,Q} = \int d^3r\,\bigl(\hat{\mathcal H}^{\text{eff}}_J - eJ^\mu A_\mu\bigr), \qquad -\frac{e}{c}\int d^3r\,J^0(\boldsymbol r) = Q, \tag{9.17} $$

$$ \hat{\mathcal H}^{\text{eff}}_J = c:\!\hat{\bar\psi}\,(-i\boldsymbol\gamma\cdot\nabla + mc)\,\hat\psi\!: \;+\; \frac{1}{2}:\!\bigl(\epsilon_0\hat{\boldsymbol e}^2 + \mu_0\hat{\boldsymbol h}^2\bigr)\!: \;-\; e\hat J^\mu\hat a_\mu - \frac{1}{2}eJ^\mu a_\mu, \tag{9.18} $$

where the last term collects the mean values of the two previous ones. It represents the mean-field interaction in the considered state. The effective Hamiltonian is labeled J since its definition (the definition of the figuring field operators) depends on the expectation value J of the four-current, which also determines the expectation value $a^\mu$ of the intrinsic four-potential via the first equation (9.14) together with suitable boundary conditions. The normal order is the same as is in effect in (9.4).

Although many questions regarding the safe content of a quantum field theory with realistic interacting quantum fields have not yet got a final answer, one may expect that for every $J^\mu$ and every state $|\Psi\rangle$ in the vacuum sector of the reduced fields there exists a state $|\Psi'\rangle$ figuring in (9.11) and vice versa. The expectation values of both Hamiltonians in mutually corresponding states coincide. Hence, the ground state energy is now obtained as

$$ E[A,Q] = \min_\Psi\Bigl\{\langle\Psi|\hat H^{\text{eff}}_{A,Q}|\Psi\rangle \Bigm| \langle\Psi|\hat J^\mu|\Psi\rangle = 0\Bigr\}. \tag{9.19} $$

In the operator part of the Hamiltonian $\hat H^{\text{eff}}_{A,Q}$, correlation and all vacuum polarization processes are retained. The corresponding states may again be restricted to those obeying $\partial_\mu\hat a^\mu|\Psi\rangle = 0$ and containing transversal photons of the field $\hat a^\mu$ only. Note that for reasons which soon become clear, in the first term of (9.18) the full current $\hat j^\mu = J^\mu + \hat J^\mu$ has been retained.

The Hamiltonian (9.17) is manifestly gauge invariant with respect to transformations

$$ \boldsymbol a/2 + \boldsymbol A \longrightarrow \boldsymbol a/2 + \boldsymbol A + \nabla\chi, \qquad \hat\psi \longrightarrow \hat\psi\exp(-ie\chi) \tag{9.20} $$

with an arbitrary c-number function $\chi(\boldsymbol r)$. Furthermore, an additive potential constant to $A^0$ will not affect the ground state $|\Psi_{AQ}\rangle$ of (9.11) and only add a constant to its energy:

$$ A^0 \longrightarrow A^0 + w:\qquad E[A,Q] \longrightarrow E[A,Q] + cQw \tag{9.21} $$

cf. (4.6). This has the same consequences as in the non-relativistic case. Note that the projection of the gauge of $\hat{\boldsymbol a}$ onto the physical sector of photon states (cf. the text after (8.47)) is fixed by the second relation (9.12) to be transversal.

The content of this section may appear rather esoteric to some readers; however, accepting the existence and objectivity of the functional E[A, Q] of (9.19) is enough for all that follows. The preceding text tried to connect the formal theory of the following sections to QED.


9.2 Current Density Functionals and Kohn-Sham-Dirac Equation

The non-relativistic limit of E[A, Q] from (9.11) or equivalently from (9.19) is E[v, N], where $v = -ecA^0$ and $N = -Q/e$. One therefore may proceed in analogy to Section 6.1.

First, fix A and take two values of charge $Q_i$, i = 1, 2, to be integer multiples of the charge quantum e. Let $|\Psi_{AQ_i}\rangle$ be the corresponding ground states according to (9.11) and let $0 < \alpha < 1$, $|\Psi_\alpha\rangle = \sqrt{\alpha}\,|\Psi_{AQ_1}\rangle + \sqrt{1-\alpha}\,|\Psi_{AQ_2}\rangle$. Note that the $|\Psi_{AQ_i}\rangle$ are mutually orthogonal (because of an integer difference of charge quanta) eigenstates of $\hat H_A$ and $\hat Q$, and that $\langle\Psi_\alpha|\hat Q|\Psi_\alpha\rangle = \alpha Q_1 + (1-\alpha)Q_2$. Hence, $E[A, \alpha Q_1 + (1-\alpha)Q_2] \le \langle\Psi_\alpha|\hat H_A|\Psi_\alpha\rangle = \alpha\langle\Psi_{AQ_1}|\hat H_A|\Psi_{AQ_1}\rangle + (1-\alpha)\langle\Psi_{AQ_2}|\hat H_A|\Psi_{AQ_2}\rangle = \alpha E[A,Q_1] + (1-\alpha)E[A,Q_2]$. As in the non-relativistic case, states of an expectation value Q of charge which is not an integer multiple of the charge quantum have energies interpolating linearly between eigenenergies corresponding to integer numbers of charge quanta, because $\hat Q$ commutes with $\hat H_A$ and hence eigenstates of the one operator can always be chosen to be also eigenstates of the other. Altogether we find that, for fixed A, E[A, Q] is always a convex function of Q as, for fixed v, E[v, N] was always a convex function of N (as defined with respect to the Fock space and comprising thermodynamic mixtures of phases in situations of phase separation).

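Next, since $\hat H_A$ of (9.2) has an affine-linear dependence on the external four-potential $A^\mu$, E[A, Q] for fixed Q is a concave functional of $A^\mu$: Fix Q, pick $A_1$, $A_2$ for which the $\hat H_{A_i}$ are bounded below and let $0 < \alpha < 1$. Then, obviously $\hat H_{\alpha A_1 + (1-\alpha)A_2}$ is also bounded below, and

$$ \begin{aligned} E[\alpha A_1 + (1-\alpha)A_2, Q] &= \min_\Psi\Bigl\{\alpha\langle\Psi|\hat H_{A_1}|\Psi\rangle + (1-\alpha)\langle\Psi|\hat H_{A_2}|\Psi\rangle \Bigm| \langle\Psi|\hat Q|\Psi\rangle = Q\Bigr\}\\ &\ge \alpha\min_{\Psi'}\Bigl\{\langle\Psi'|\hat H_{A_1}|\Psi'\rangle \Bigm| \langle\Psi'|\hat Q|\Psi'\rangle = Q\Bigr\} + (1-\alpha)\min_{\Psi''}\Bigl\{\langle\Psi''|\hat H_{A_2}|\Psi''\rangle \Bigm| \langle\Psi''|\hat Q|\Psi''\rangle = Q\Bigr\}\\ &= \alpha E[A_1,Q] + (1-\alpha)E[A_2,Q]. \end{aligned} $$

This result comes again from the simple fact that the minimum of a sum cannot be lower than the sum of the independent minima.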
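The concavity argument does not depend on any detail of the Hamiltonian; it only uses that the energy expectation value is affine-linear in the potential. A deliberately crude toy model makes this visible: a finite list of "states", each contributing an energy $h + a\,j$ for a scalar "potential" a. The model, the numbers and the names below are purely illustrative assumptions of ours.

```python
def ground_energy(a, states):
    """E[a] = min over states of (h + a * j); each state is a pair (h, j)."""
    return min(h + a * j for h, j in states)

# Hypothetical (intrinsic energy, coupling) pairs of the toy model.
STATES = [(0.0, 0.0), (1.0, -2.0), (2.5, -5.0), (0.5, 1.5)]

def is_concave_on_grid(states, a_values, alphas, tol=1e-12):
    """Check E[alpha a1 + (1-alpha) a2] >= alpha E[a1] + (1-alpha) E[a2]."""
    for a1 in a_values:
        for a2 in a_values:
            for alpha in alphas:
                mixed = ground_energy(alpha * a1 + (1 - alpha) * a2, states)
                bound = (alpha * ground_energy(a1, states)
                         + (1 - alpha) * ground_energy(a2, states))
                if mixed < bound - tol:
                    return False
    return True

A_VALUES = [0.25 * i for i in range(-12, 13)]
ALPHAS = [0.1 * i for i in range(1, 10)]
```

`is_concave_on_grid(STATES, A_VALUES, ALPHAS)` evaluates to `True` for any choice of the state list, since a pointwise minimum of affine functions is concave.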

Considering further the gauge property (9.21), we may finally define the current density functional (in the sector of electron excess Q ≤ 0) along the same lines as H[n] was introduced in Section 6.2: The convexity of E[A, Q] with respect to Q gives rise to a pair of mutual Legendre transforms

$$ G[A,\zeta] = \sup_Q\bigl\{\zeta Q - E[A,Q]\bigr\}, \qquad E[A,Q] = \sup_\zeta\bigl\{Q\zeta - G[A,\zeta]\bigr\}, $$

where ζ has obviously the meaning of an electrochemical potential. The gauge property (9.21) implies $E[A,Q] - \zeta Q = E[A^\mu - \delta^\mu_0(\zeta/c), Q]$ and hence allows for a writing $G[A,\zeta] = G[A^\mu - \delta^\mu_0(\zeta/c), 0] = G[A^\mu - \delta^\mu_0(\zeta/c)]$ with

$$ G[A] = -\inf_Q E[A,Q], \qquad E[A,Q] = \sup_\zeta\bigl\{Q\zeta - G[A^\mu - \delta^\mu_0(\zeta/c)]\bigr\}. \tag{9.22} $$

Like −E, G is convex in A, hence another pair of mutual Legendre transforms is

$$ H[J] = \sup_A\bigl\{(J|A) - G[A]\bigr\}, \qquad G[A] = \sup_J\bigl\{(A|J) - H[J]\bigr\}, \tag{9.23} $$

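where the notation

$$ (J|A) = e\int d^3r\,J^\mu(\boldsymbol r)A_\mu(\boldsymbol r) \tag{9.24} $$

is introduced. Substituting the second equation (9.23) into the second equation (9.22) yields

$$ \begin{aligned} E[A,Q] &= \sup_\zeta\Bigl\{Q\zeta - \sup_J\bigl\{(A^\mu - \delta^\mu_0(\zeta/c)\,|\,J) - H[J]\bigr\}\Bigr\}\\ &= \sup_\zeta\Bigl\{Q\zeta + \inf_J\bigl\{H[J] - (A^\mu - \delta^\mu_0(\zeta/c)\,|\,J)\bigr\}\Bigr\}\\ &\le \inf_J\Bigl\{H[J] - (A|J) + \sup_\zeta\bigl(Q + (\delta^\mu_0(1/c)\,|\,J)\bigr)\zeta\Bigr\}. \end{aligned} $$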

Substitution of the first relation (9.22) into the first relation (9.23) yields

$$ H[J] \overset{\text{def}}{=} \inf_{Q\le 0}\,\sup_A\Bigl\{E[A,Q] + e\int d^3r\,J^\mu A_\mu\Bigr\} = \inf_{Q\le 0} F[J,Q] \tag{9.25} $$

where the inversion of sup inf into inf sup may be justified in a manner analogous to the non-relativistic case. Also the definition of F and the sharpening of the inequality in the above expression for E[A, Q] into an equality is obtained analogously. The Hohenberg-Kohn variational principle now reads

$$ E[A,Q] = \inf_J\Bigl\{H[J] - e\int d^3r\,J^\mu A_\mu \Bigm| J \in X,\; -\frac{e}{c}\int d^3r\,J^0 = Q\Bigr\}. \tag{9.26} $$

Of course, H[J] can be known only to that degree of approximation to which the problem (9.11) can be treated. The functional space X for the variation of J and its dual X*, to which A has to belong, must still be defined.
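The mutual Legendre transforms used above can be made concrete on a finite grid. The sketch below takes a toy convex function E(Q) = Q² on a few integer charges (our own assumption, with no physical meaning), forms G(ζ) = sup_Q {ζQ − E(Q)} on a grid of ζ values, and transforms back. The function name `legendre` and the grids are hypothetical choices made for this illustration.

```python
def legendre(values, slopes):
    """Discrete Legendre transform.

    values: dict mapping a point x to f(x);
    returns a dict mapping each slope s to max over x of (s * x - f(x)).
    """
    return {s: max(s * x - fx for x, fx in values.items()) for s in slopes}

# Toy convex 'energy' on integer 'charges' 0..4.
energy = {q: float(q * q) for q in range(5)}

# Grid of conjugate variables zeta; it contains a slope between every pair
# of neighbouring points, so the back transformation can be exact.
zetas = [0.5 * i for i in range(-20, 21)]

g = legendre(energy, zetas)                 # G(zeta) = sup_Q { zeta Q - E(Q) }
energy_back = legendre(g, list(energy))     # E(Q) = sup_zeta { Q zeta - G(zeta) }
```

Because the toy function is convex, `energy_back` coincides with `energy` at every grid point; for a non-convex input the back transformation would return the convex hull instead, which is the finite-dimensional analogue of the role convexity plays in (9.22) and (9.23).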

In the non-relativistic theory, the kinetic energy is $T_{nr} = \langle\Phi|-\sum\nabla_i^2/2|\Phi\rangle$, and $T < +\infty$ implies $n \in L^3(T^3)$ for the density n via Sobolev's inequality (6.33). The relativistic kinetic energy is (in the Schrödinger representation for Φ) $T_r = \langle\Phi|\sum ic\boldsymbol\alpha\cdot\nabla_i|\Phi\rangle$, and one is tempted to suppose $J^\mu \in L^{3/2}(T^3)$. If for instance a Dirac bispinor orbital behaves like $\phi \sim r^{-s}$, then $J^\mu \in L^{3/2}(T^3)$ implies s < 1, for which indeed $T_r < +\infty$ (see for instance [Landau and Lifshitz, 1982, Chapter IV]). Hence $A_\mu \in L^3(T^3)$ must be demanded, as $L^3(T^3)$ is the dual of $L^{3/2}(T^3)$. For a potential $A^0 \sim r^{-s}$ this implies again s < 1 and hence excludes Coulomb potentials, although it permits treating them as a limiting case.
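The exponent counting behind these statements is elementary: in three dimensions $r^{-\kappa}$ is p-integrable near the origin if and only if $\kappa p < 3$. A minimal sketch of this bookkeeping follows; the helper name and the use of exact fractions are our own choices, and only the behaviour at r → 0 is considered.

```python
from fractions import Fraction

def locally_in_lp(decay_exponent, p, dimension=3):
    """True if r**(-decay_exponent) is p-integrable near r = 0.

    The radial integral of r**(-decay_exponent * p) * r**(dimension - 1)
    converges at the origin iff decay_exponent * p < dimension.
    """
    return decay_exponent * p < dimension

def orbital_current_admissible(s):
    """Orbital ~ r**(-s) gives a current ~ r**(-2s); test membership in L^{3/2}."""
    return locally_in_lp(2 * s, Fraction(3, 2))

def potential_admissible(s):
    """Potential ~ r**(-s); test membership in L^3."""
    return locally_in_lp(s, 3)
```

Both `orbital_current_admissible(s)` and `potential_admissible(s)` are true exactly for s < 1, so that the Coulomb case s = 1 is the excluded borderline.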

In a relativistic theory Coulomb potentials are a touchy case anyway. First of all, $\hat H_A$ would not be bounded below for $A^0 \sim r^{-s}$ with s > 1: an unlimited number of electrons (or positrons, depending on the sign of $A^0$) would plunge into the potential center, as their mutual Coulomb repulsion could not compensate the attraction for small enough r. For s = 1, in the case $cA^0 = Z/r$, Z > c a.u., a single electron would again slip down the potential, thereby reducing Z effectively by one charge unit, and this process would continue until $Z_{\text{eff}} < c$ a.u. The formal workaround with $A_\mu \in L^3(T^3)$ is to cut off Coulomb potentials at a nuclear radius $r_0 > 0$, in the course of which for Z ≤ 137 the limit $r_0 \to 0$ may be taken. In truth QED is inherently inconsistent anyway unless the Coulomb interaction is cut off at very small radii by the asymptotic freedom of electroweak theory [Berestetskii, 1976]. (For the stability range of a slight caricature of the Hamiltonian $\hat H_A$ see Refs. [Lieb and Yau, 1988, Lieb et al., 1996].)
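The critical value Z ≈ c a.u. ≈ 137 can be made tangible with the well known textbook result for the lowest bound state of a single Dirac electron in the field of a point nucleus, $E_{1s} = mc^2\sqrt{1 - (Z/c)^2}$ in atomic units. This formula is not derived in the present text; it is quoted here as an assumption of the sketch, as is the numerical value of c in atomic units.

```python
import math

C_AU = 137.035999  # speed of light in atomic units (inverse fine structure constant)

def dirac_1s_energy(z, c=C_AU):
    """Total 1s energy (rest energy included, m = 1) for a point nucleus of charge z.

    Returns None when the square root becomes imaginary, i.e. for z > c,
    where the point-nucleus problem has no well-defined ground state.
    """
    radicand = 1.0 - (z / c) ** 2
    if radicand < 0.0:
        return None
    return c * c * math.sqrt(radicand)

def binding_energy(z, c=C_AU):
    """Energy relative to the rest energy m c^2; None beyond the critical charge."""
    total = dirac_1s_energy(z, c)
    return None if total is None else total - c * c
```

For Z = 1 `binding_energy` is close to the non-relativistic value of −1/2 Hartree, for Z = 137 it is still real, and for Z = 138 the function returns `None`, which mirrors the restriction Z ≤ 137 for the limit $r_0 \to 0$.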

A practically tractable case is an interaction-free electron-positron system in the external field, described by

$$H_0 = c :\!\hat{\bar\psi}\,(-i\boldsymbol{\gamma}\cdot\nabla + m_0 c)\,\hat\psi\!: . \tag{9.27}$$

(The difference of masses $m - m_0$ amounts to mass renormalization caused by interaction, which reduces in the non-relativistic case to the exclusion of self-interaction.) We again denote the counterpart of the functional $F[J,Q]$ for this case by $T[J,Q]$, and find for $Q = -eN$ the corresponding $E_0[A,Q]$ as the sum of the lowest $N$ eigenvalues of the Dirac equation

$$(-ic\,\boldsymbol{\alpha}\cdot\nabla + \beta m_0 c^2 - ec\,\beta\gamma^\mu A_\mu)\,\psi_k = \psi_k\varepsilon_k \tag{9.28}$$

corresponding to solutions with large components in the electron sector (first two components of the bispinor in the usual representation (8.56) of the $\gamma$-matrices). The corresponding four-current density is

$$J^\mu = c\sum_k n_k\, \bar\psi_k\gamma^\mu\psi_k, \qquad \langle\psi_k|\psi_{k'}\rangle = \delta_{kk'}, \qquad Q = -e\sum_k n_k. \tag{9.29}$$

If $N$ is not an integer, partial occupation of the HOMO is understood.
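As a small illustration of the occupation rule just stated, the following Python sketch fills the lowest orbitals and gives the HOMO the fractional remainder, then sums the occupied eigenvalues to obtain the interaction-free energy. It assumes that the caller passes only electron-sector eigenvalues, and the function names are hypothetical, chosen here for illustration.

```python
import math


def occupation_numbers(n_electrons, n_orbitals):
    """Occupations n_k: 1 for the lowest orbitals, a fraction for the HOMO."""
    if not 0 <= n_electrons <= n_orbitals:
        raise ValueError("need 0 <= N <= number of orbitals")
    full = math.floor(n_electrons)          # number of fully occupied orbitals
    occ = [1.0] * full
    if full < n_orbitals:
        occ.append(n_electrons - full)      # partial occupation of the HOMO
    occ += [0.0] * (n_orbitals - len(occ))  # remaining orbitals stay empty
    return occ


def noninteracting_energy(eigenvalues, n_electrons):
    """Sum of the lowest eigenvalues weighted by the occupations n_k."""
    eps = sorted(eigenvalues)               # assumed: electron sector only
    occ = occupation_numbers(n_electrons, len(eps))
    return sum(n * e for n, e in zip(occ, eps))


if __name__ == "__main__":
    print(occupation_numbers(2.5, 4))
    print(noninteracting_energy([3.0, 1.0, 2.0], 2))
```

The total charge then follows as $Q = -e\sum_k n_k$, with the occupations summing to $N$ by construction.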

For a non-interacting particle field in a static external potential no pair processes take place, and therefore wavefunctions may be used. Hence, the variational result (9.28, 9.29) is obtained in complete analogy to the non-relativistic case as described in Section 4.2, with the relation to the general $T$-functional as discussed in Section 6.4. Moreover, in this case no coherent, Bose condensed internal fields appear, and hence no separate treatment of the coherent and fluctuating parts of the fields, as introduced in the renormalization procedure of the last section, is needed. In order to directly compare with the interacting case, the full current $j^\mu$ via the full field operator $\hat\psi$ had been retained in the first term of (9.18).

In the general case of interacting electrons, (9.29) is again used as a parameterization of the variational four-current $J^\mu$ in terms of Kohn-Sham-Dirac spinor orbitals $\psi_k$ and orbital occupation numbers $n_k$.

In essentially all practical cases the normal order of (9.3, 9.4) demands the Dirac spinor orbitals $\psi_k$ to be electron orbitals, which according to the remark after (9.5) are orbitals that develop continuously from the free-electron continuum when the external potential $A_\mu$ is continuously switched on. It is this restriction which makes the Kohn-Sham-Dirac Hamiltonian bounded below.

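The four-current density functional $H[J]$ is again split into an orbital variation part $K$ and a density integral $L$:

$$\begin{aligned}
H[J] &= K[J] + L[J],\\
K[J] &= \min_{\psi_k,\,n_k}\left\{ k[\psi_k, n_k] \;\middle|\; J^\mu = c\sum_k n_k\,\bar\psi_k\gamma^\mu\psi_k,\ (\psi_k|\psi_{k'}) = \delta_{kk'} \right\},\\
L[J] &= \int d^3r\, J^0(\boldsymbol r)\, l\big(J^\mu(\boldsymbol r), \nabla J^\mu(\boldsymbol r), \ldots\big).
\end{aligned} \tag{9.30}$$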


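Substitution into (9.26) yields the Kohn-Sham-Dirac variational problem

$$E[A,Q] = \min_{\psi_k,\,n_k}\left\{ k[\psi_k, n_k] + L\Big[c\sum_k n_k\,\bar\psi_k\gamma^\mu\psi_k\Big] - ec\sum_k n_k\,(\psi_k|\gamma^0\gamma^\mu A_\mu|\psi_k) \;\middle|\; (\psi_k|\psi_{k'}) = \delta_{kk'},\ -e\sum_k n_k = Q \right\} \tag{9.31}$$

which would be equivalent to (9.26) if the local ansatz for $L[J]$ did not already introduce an approximation.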

An alternative, still potentially exact splitting of $H[J]$ is

$$H[J] = K[J] - \frac{e}{2}\int d^3r\, J^\mu a_\mu + E_{\rm XC}[J], \tag{9.32}$$

where

$$k[\psi_k, n_k] = \sum_k n_k\,\langle\psi_k|\beta(-ic\,\boldsymbol{\gamma}\cdot\nabla + c^2)|\psi_k\rangle \tag{9.33}$$

and $a_\mu$ is linearly related to $J^\mu$ by the first equation (9.14). All terms of (9.32) except $E_{\rm XC}$ have thus been defined previously, so that (9.32) defines $E_{\rm XC}$. The variation of (9.26) with the general ansatz (9.29) for the four-current density leads to the most general Kohn-Sham-Dirac equation [Eschrig et al., 1985], cf. also [Dreizler and Gross, 1990]

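$$\left[-ic\,\boldsymbol{\alpha}\cdot\nabla + \beta m_0 c^2 - ec\,\beta\gamma^\mu\big(A_\mu + a_\mu + a^{\rm XC}_\mu\big)\right]\psi_k = \psi_k\varepsilon_k, \tag{9.34}$$

with the Kohn-Sham exchange and correlation four-potential

$$-e\,a^{\rm XC}_\mu \stackrel{\rm def}{=} \frac{\delta E_{\rm XC}[J]}{\delta J^\mu}. \tag{9.35}$$

Given a functional $E_{\rm XC}[J]$, this Kohn-Sham-Dirac equation in connection with the ansatz (9.29) is to be solved self-consistently to find the ground state energy and the ground state four-current density.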

Recall that, if quantum electrodynamics were taken literally, the amount of mass renormalization $m - m_0$ would be infinite due to an infinite self-interaction of an electron via its own electromagnetic field. This basic field-theoretic defect prevents an ab-initio treatment of mass renormalization. For lack of anything better one replaces $m_0$ by the phenomenological electron mass $m$ and simultaneously excludes the static self-interaction contributions from $E_{\rm XC}[J]$, as in the non-relativistic theory.


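The gauge invariance properties of this Kohn-Sham-Dirac equation are

$$\boldsymbol A + \boldsymbol a + \boldsymbol a^{\rm XC} \longrightarrow \boldsymbol A + \boldsymbol a + \boldsymbol a^{\rm XC} + \nabla\chi, \qquad \psi_k \longrightarrow \psi_k \exp(-ie\chi) \tag{9.36}$$

and

$$A_0 + a_0 + a^{\rm XC}_0 \longrightarrow A_0 + a_0 + a^{\rm XC}_0 + w, \qquad \varepsilon_k \longrightarrow \varepsilon_k - ecw \tag{9.37}$$

with an arbitrary function $\chi(\boldsymbol r)$ and an arbitrary constant $w$. (The latter gauge transformation fits into the former frame by writing $\psi_k(t) \longrightarrow \psi_k(t)\exp(iecwt)$.)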

9.3 The Gordon Decomposition and Spin Density

The four-current density (9.29) may be decomposed in a manner allowing for a physical interpretation [Gordon, 1928]. This decomposition was introduced in the first relativistic formulation of density functional theory by [Rajagopal and Callaway, 1973]. It starts with use of certain algebraic properties of the $\gamma$-matrices:

$$[\gamma^\mu, \gamma^\nu]_+ = 2g^{\mu\nu}, \qquad [\gamma^\mu, \gamma^\nu]_- = -2i\sigma^{\mu\nu}, \tag{9.38}$$

where the anticommutator is proportional to the unit matrix (as previously, for every pair $\mu\nu$ of tensor indices, the tensor elements $g^{\mu\nu}$ and $\sigma^{\mu\nu}$ are $4\times 4$-matrices acting in the bispinor space), and the commutator defines an antisymmetric form, the elements of which are

$$\sigma^{0k} = i\alpha_k, \qquad \sigma^{jk} = \epsilon_{jkl}\Sigma_l, \qquad \Sigma_l = \begin{pmatrix} \sigma_l & 0 \\ 0 & \sigma_l \end{pmatrix}. \tag{9.39}$$

The $\sigma_l$ are Pauli's $2\times 2$ spin matrices as before.

If we now write the Dirac equation (for the sake of simplicity we take (9.28), because we need only its general structure) in its covariant form, replacing the energy term by the time-derivative,

$$[\gamma^\nu(i\partial_\nu + eA_\nu) - mc]\,\psi = 0 = \bar\psi\,[(-i\overleftarrow{\partial}_\nu + eA_\nu)\gamma^\nu - mc], \tag{9.40}$$

we find

$$\gamma^\mu\psi = \frac{1}{mc}\gamma^\mu\gamma^\nu(i\partial_\nu + eA_\nu)\psi = \frac{1}{mc}(g^{\mu\nu} - i\sigma^{\mu\nu})(i\partial_\nu + eA_\nu)\psi, \tag{9.41}$$

$$\bar\psi\gamma^\mu = \frac{1}{mc}\bar\psi\,(-i\overleftarrow{\partial}_\nu + eA_\nu)(g^{\nu\mu} - i\sigma^{\nu\mu}). \tag{9.42}$$
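The algebraic relations (9.38) and (9.39) used in this step can be verified numerically. The following pure-Python sketch assumes the standard Dirac representation ($\gamma^0 = \beta$, $\gamma^k = \beta\alpha_k$, metric signature $(+,-,-,-)$), which is taken here to coincide with the "usual representation" referred to in the text; all helper names are hypothetical.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b, sign=1):
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def block(a, b, c, d):
    # Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]].
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
I4 = block(I2, Z2, Z2, I2)
Z4 = scale(0, I4)
BETA = block(I2, Z2, Z2, scale(-1, I2))
ALPHA = [block(Z2, s, s, Z2) for s in PAULI]
SIGMA = [block(s, Z2, Z2, s) for s in PAULI]
GAMMA = [BETA] + [matmul(BETA, a) for a in ALPHA]  # assumed Dirac representation
METRIC = [1, -1, -1, -1]

def sigma_munu(mu, nu):
    # sigma^{mu nu} = (i/2) [gamma^mu, gamma^nu], equivalent to (9.38).
    comm = add(matmul(GAMMA[mu], GAMMA[nu]), matmul(GAMMA[nu], GAMMA[mu]), sign=-1)
    return scale(0.5j, comm)

def levi_civita(j, k, l):
    # Totally antisymmetric symbol for indices taken from {0, 1, 2}.
    return (j - k) * (k - l) * (l - j) // 2

def check_anticommutators():
    for mu in range(4):
        for nu in range(4):
            anti = add(matmul(GAMMA[mu], GAMMA[nu]), matmul(GAMMA[nu], GAMMA[mu]))
            g = METRIC[mu] if mu == nu else 0
            if not close(anti, scale(2 * g, I4)):
                return False
    return True

def check_sigma_elements():
    for k in range(3):
        if not close(sigma_munu(0, k + 1), scale(1j, ALPHA[k])):
            return False
    for j in range(3):
        for k in range(3):
            expected = Z4
            for l in range(3):
                expected = add(expected, scale(levi_civita(j, k, l), SIGMA[l]))
            if not close(sigma_munu(j + 1, k + 1), expected):
                return False
    return True

if __name__ == "__main__":
    print(check_anticommutators(), check_sigma_elements())
```

The same matrices also give $\gamma^\mu\gamma^\nu = g^{\mu\nu} - i\sigma^{\mu\nu}$, which is the identity inserted in (9.41) and (9.42).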

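Adding both expressions, multiplied by $\bar\psi$ and $\psi$, respectively, yields

$$\bar\psi\gamma^\mu\psi = \frac{1}{2mc}\bar\psi\,(-i\overleftarrow{\partial}^\mu + i\partial^\mu + 2eA^\mu)\,\psi + \frac{1}{2mc}\partial_\nu\big(\bar\psi\sigma^{\mu\nu}\psi\big). \tag{9.43}$$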

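For the three-current density, this result may be written as

$$\boldsymbol J = \boldsymbol I + \frac{1}{m}\nabla\times\boldsymbol S + \frac{\partial\boldsymbol G}{\partial t}, \tag{9.44}$$

with the orbital current density

$$\boldsymbol I = \frac{1}{2m}\bar\psi\,(-i\nabla + i\overleftarrow{\nabla} + 2e\boldsymbol A)\,\psi, \tag{9.45}$$

the spin current density, derived from the spin density

$$\boldsymbol S = \frac{1}{2}\bar\psi\,\boldsymbol\Sigma\,\psi, \tag{9.46}$$

and a relativistic correction term, derived from

$$\boldsymbol G = -\frac{i}{2mc}\bar\psi\,\boldsymbol\alpha\,\psi. \tag{9.47}$$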

In the stationary situation considered here this last term vanishes and we are left with the sum of orbital and spin current densities.

The total stationary current density must have zero divergence due to charge conservation. Since the divergence of the spin current density vanishes by its very structure as a curl, the orbital current density must also be divergence free:

$$\nabla\cdot\boldsymbol I = 0. \tag{9.48}$$

Since further $\boldsymbol I$ either vanishes at infinity or is periodic, it may likewise be expressed as a curl of some vector field $\boldsymbol L$, to be visualized as an ‘angular momentum density’:

$$\boldsymbol I = \frac{1}{2m}\nabla\times\boldsymbol L. \tag{9.49}$$

(Recall that an orbital angular momentum density cannot really figure in quantum mechanics because position and momentum cannot have sharp values at the same time. Accordingly, (9.49) defines $\boldsymbol L$ only up to an arbitrary additive gradient term.)


The total electric current density may then be expressed as

$$-e\boldsymbol J = \nabla\times\boldsymbol M = -\frac{e}{2m}\nabla\times(\boldsymbol L + 2\boldsymbol S). \tag{9.50}$$

$\boldsymbol M$ has the dimension of a ‘magnetization density’, related in a non-renormalized way to the angular momenta by the Bohr magneton. (Note that except for the non-relativistic case, the decomposition of the current given above is formal, which is also indicated by the appearance of ideal gyromagnetic factors. Moreover, what was said above on the ‘angular momentum density’ refers likewise to the orbital part of the ‘magnetization density’.)

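The four-current density is now given by $J^\mu = (nc, -\nabla\times\boldsymbol M/e)$, and hence the functional $E_{\rm XC}[J]$ may be rewritten as a functional $E_{\rm XC}[n, \boldsymbol M]$ with the mechanical exchange and correlation potential acting on an electron

$$v^{\rm XC} = -e\,u^{\rm XC} = -ec\,a^{\rm XC}_0 = \frac{\delta E_{\rm XC}}{\delta n} \tag{9.51}$$

and, expressing formally the curl with the help of (9.50) as a functional derivative of $\boldsymbol J$ with respect to $\boldsymbol M$, with a magnetic exchange and correlation field$^{12}$

$$\mu_0\boldsymbol h^{\rm XC} = \nabla\times\boldsymbol a^{\rm XC} = \int d^3r'\,\frac{\delta\boldsymbol J(\boldsymbol r')}{\delta\boldsymbol M}\,\frac{\delta E_{\rm XC}}{\delta\boldsymbol J(\boldsymbol r')} = \frac{\delta E_{\rm XC}}{\delta\boldsymbol M}. \tag{9.52}$$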

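The interaction terms $-eJ^\mu A_\mu$ are cast into

$$-eJ^0A_0 = nV \tag{9.53}$$

and

$$-eJ^kA_k = e\boldsymbol J\cdot\boldsymbol A = -\boldsymbol A\cdot\nabla\times\boldsymbol M \;\hat{=}\; -\boldsymbol M\cdot\nabla\times\boldsymbol A = -\mu_0\boldsymbol H\cdot\boldsymbol M. \tag{9.54}$$

The sign $\hat{=}$ again means equivalence under the spatial integral after integration by parts. $V$ and $\boldsymbol H$ are the mechanical potential and the magnetic field corresponding to $A_\mu$.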

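The Kohn-Sham-Dirac equation (9.34) was obtained by varying (9.26) with respect to $\bar\psi_k$. From (9.50) and (9.46) we find

$$\frac{\delta\boldsymbol M(\boldsymbol r)}{\delta\bar\psi_k(\boldsymbol r')} = -\frac{e}{2m}\left(\frac{\delta\boldsymbol L(\boldsymbol r)}{\delta\bar\psi_k(\boldsymbol r')} + \delta(\boldsymbol r' - \boldsymbol r)\,\boldsymbol\Sigma\,\psi_k(\boldsymbol r)\right). \tag{9.55}$$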

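$^{12}$ Writing (9.50) formally as $-e\boldsymbol J(\boldsymbol r') = \int d^3r\,\delta(\boldsymbol r' - \boldsymbol r)\,\nabla\times\boldsymbol M(\boldsymbol r)$, one immediately finds $-e\,\delta\boldsymbol J(\boldsymbol r')/\delta\boldsymbol M(\boldsymbol r) = \delta(\boldsymbol r' - \boldsymbol r)\,\nabla\times$.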


This allows for an alternative way of writing down the general Kohn-Sham-Dirac equation:

$$\left[-ic\,\boldsymbol\alpha\cdot\nabla + \beta mc^2 + V(\boldsymbol r) + v(\boldsymbol r) + v^{\rm XC}(\boldsymbol r)\right]\psi_k(\boldsymbol r) - \mu_0\beta\int d^3r'\,\big(\boldsymbol H(\boldsymbol r') + \boldsymbol h(\boldsymbol r') + \boldsymbol h^{\rm XC}(\boldsymbol r')\big)\cdot\frac{\delta\boldsymbol M(\boldsymbol r')}{\delta\bar\psi_k(\boldsymbol r)} = \psi_k(\boldsymbol r)\,\varepsilon_k. \tag{9.56}$$

At the price of replacing the four-current, local in terms of the Kohn-Sham orbitals $\psi_k$, by a ‘magnetization density’ $\boldsymbol M$, non-locally depending on the $\psi_k$ and whose orbital part $\boldsymbol L$ is subject to another gauge (undetermined gradient term), the vector potential with its unpleasant far-ranging character has been eliminated. Note that for a homogeneous electron liquid in a homogeneous magnetic field the orbital current density is zero except close to the boundary of the volume, while the vector potential $\boldsymbol A(\boldsymbol r)$ increases linearly with increasing $r$.

The crucial problem remaining to be solved is to find a suitable expression for $\delta\boldsymbol L(\boldsymbol r)/\delta\bar\psi_k(\boldsymbol r')$. Whether a quasi-local functional expression $\boldsymbol L[\psi_k]$ can be found or not obviously depends on the choice of the undetermined gradient term contained in $\boldsymbol L$, since in the just mentioned example of a homogeneous situation this is the only term of $\boldsymbol L$ present at finite $r$.

9.4 Approximative Variants

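If we completely neglect the orbital current $\boldsymbol I$, then we find from (9.55)

$$\frac{\delta\boldsymbol M(\boldsymbol r)}{\delta\bar\psi_k(\boldsymbol r')} = -\frac{e}{2m}\,\boldsymbol\Sigma\,\psi_k\,\delta(\boldsymbol r - \boldsymbol r') \quad\text{for}\quad \boldsymbol I = 0. \tag{9.57}$$

In this case, the Kohn-Sham-Dirac equation acquires the simple form

$$\left[-ic\,\boldsymbol\alpha\cdot\nabla + \beta mc^2 + V + v + v^{\rm XC} + \frac{e\mu_0}{2m}\,\beta\boldsymbol\Sigma\cdot(\boldsymbol H + \boldsymbol h + \boldsymbol h^{\rm XC})\right]\psi_k = \psi_k\varepsilon_k, \tag{9.58}$$

where the magnetic field couples to the spin only. Since in many applications (though by no means always) the influence of the orbital current is small, this form is widely used in computations.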

Without the spin-dependent $\boldsymbol\Sigma$-term, the Kohn-Sham-Dirac equation was investigated by [Rajagopal, 1978] and by [MacDonald and Vosko, 1979]. The $\boldsymbol\Sigma$-term with the exchange and correlation field $\boldsymbol h^{\rm XC}$, first introduced by [Rajagopal and Callaway, 1973], has been essential already in the non-relativistic theory, where the exchange and correlation field appears as the difference of the Kohn-Sham potential for spin up and down electrons, respectively (cf. Section 4.7 as well as Sections 7.1 and 7.2).

In the non-relativistic LDA for collinear spin structures, the exchange and correlation field is given by

$$\frac{e\mu_0}{m}\,h^{\rm XC,LDA}(n(\boldsymbol r), \zeta(\boldsymbol r)) = v^{\rm LDA}_{\rm XC+}(n(\boldsymbol r), \zeta(\boldsymbol r)) - v^{\rm LDA}_{\rm XC-}(n(\boldsymbol r), \zeta(\boldsymbol r)) \tag{9.59}$$

with $v^{\rm LDA}_{\rm XC\pm}$ obtained from $n(\boldsymbol r)$ and $\Sigma(\boldsymbol r) = n(\boldsymbol r)\zeta(\boldsymbol r)$ via (7.23). Note that, by the very way it enters the theory via (9.52), the exchange and correlation field has to be divergence-free, which is not guaranteed by the LDA. The consequences of this defect of the spin-dependent LDA have not yet been investigated.

In the non-relativistic limit, spin and orbital currents decouple, so that the exchange and correlation energy functional may separately depend on both. This situation has been analyzed in [Vignale and Rasolt, 1987, Vignale and Rasolt, 1988]. Those authors introduce the ‘paramagnetic current density’ (of nonrelativistic spin-orbitals $\psi(x)$)

$$\boldsymbol I_p(x) \stackrel{\rm def}{=} \frac{1}{2m}\,\psi^*(x)\,(-i\nabla + i\overleftarrow{\nabla})\,\psi(x) \tag{9.60}$$

instead of the physical orbital current density (9.45). The current density $\boldsymbol I_p$ is not gauge invariant. Using gauge arguments, it has been shown in the above cited papers that the exchange and correlation energy functional can depend on $\boldsymbol I_p$ only via the ‘vorticity’

$$\boldsymbol\nu(x) \stackrel{\rm def}{=} \nabla\times\left(\frac{\boldsymbol I_p(x)}{n(x)}\right). \tag{9.61}$$

Up to second order in $\boldsymbol I_p$, a local current density approximation

$$E_{\rm XC}[n, \boldsymbol\nu] = E_{\rm XC}[n, 0] + \int dx\,\left(\frac{9\pi}{4}\right)^{1/3}\frac{1}{24\pi^2 r_s}\left(\frac{\chi_L}{\chi^0_L} - 1\right)|\boldsymbol\nu(x)|^2 \tag{9.62}$$

has been found, where $r_s = r_s(x)$ and

$$\left(\frac{\chi_L}{\chi^0_L} - 1\right) = 0.02764\, r_s\ln r_s + 0.01407\, r_s + O(r_s^2\ln r_s) \tag{9.63}$$

is the ratio of the diamagnetic susceptibilities for the interacting and non-interacting homogeneous electron systems minus one (in the high-density limit). The use of the paramagnetic current density also slightly modifies the Kohn-Sham equations. (Cf. also [Trickey, 1990, p. 235–254].)
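To get a feeling for the size of this correction, the high-density expansion (9.63) can be evaluated directly. The sketch below is illustrative only: it assumes atomic units, reads the prefactor of $|\boldsymbol\nu|^2$ in (9.62) as $k_F/24\pi^2$ with $k_F = (9\pi/4)^{1/3}/r_s$, and drops the $O(r_s^2\ln r_s)$ remainder, so it should not be trusted outside small $r_s$. The function names are hypothetical.

```python
import math


def fermi_wavevector(rs):
    """k_F = (9*pi/4)**(1/3) / r_s in atomic units (assumed convention)."""
    return (9.0 * math.pi / 4.0) ** (1.0 / 3.0) / rs


def susceptibility_ratio_minus_one(rs):
    """Leading terms of chi_L/chi_L^0 - 1 from (9.63), high-density limit."""
    if rs <= 0:
        raise ValueError("r_s must be positive")
    return 0.02764 * rs * math.log(rs) + 0.01407 * rs


def vorticity_coefficient(rs):
    """Local coefficient multiplying |nu|^2 in (9.62), as read above."""
    return fermi_wavevector(rs) / (24.0 * math.pi ** 2) * susceptibility_ratio_minus_one(rs)


if __name__ == "__main__":
    for rs in (0.1, 0.5, 1.0, 2.0):
        print(rs, susceptibility_ratio_minus_one(rs), vorticity_coefficient(rs))
```

Within this truncated expansion the logarithmic term makes the correction change sign at small $r_s$, which is one reason to treat values away from the high-density limit with caution.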

Later on [Skudlarski and Vignale, 1993], the exchange and correlation energy $E_{\rm XC}(n, \zeta, H)$ of a homogeneous electron liquid in an arbitrarily strong homogeneous magnetic field $H$ was found numerically. In that work, the degree of spin polarization $\zeta$ was varied independently via a varying Zeeman coupling constant $g$ between spin and field. A more general local current density approximation

$$E_{\rm XC}[n, \zeta, \boldsymbol\nu] = \int d^3r\, n(\boldsymbol r)\, E_{\rm XC}\big(n(\boldsymbol r), \zeta(\boldsymbol r), H(\boldsymbol r) = -mc|\boldsymbol\nu(\boldsymbol r)|/e\mu_0\big) \tag{9.64}$$

has been suggested. Experience with those approximations has yet to be gathered.

Alternatively, a non-relativistic current density functional approach using the physical orbital current density instead of the paramagnetic one has been proposed in [Diener, 1991]. However, explicit expressions have not yet been tried.

A largely unsolved problem regarding magnetically polarized ground states is the orbital polarization contribution of tightly bound d- and f-electrons to the exchange and correlation energy functional. (For nearly free electrons in a solid at moderate field strength orbital polarization is quenched by kinetic energy.) Take some complete set $\phi_{nlm}$ of atomic-like orbitals and use the ansatz

$$\boldsymbol L(\boldsymbol r) \approx \sum_k^{\rm occ.}\int d^3r'\,\psi_k^*(\boldsymbol r')\sum_{nlm}\Big[\delta(\boldsymbol r - \boldsymbol r')\,\boldsymbol L' - \nabla\lambda_{nlm}(\boldsymbol r, \boldsymbol r')\Big]\phi_{nlm}(\boldsymbol r')\,(\phi_{nlm}|\psi_k), \tag{9.65}$$

where $\boldsymbol L'$ is the non-relativistic orbital angular momentum operator, so that the spatial integral over $\boldsymbol L(\boldsymbol r)$ gives the total non-relativistic orbital angular momentum for any choice of the $\lambda_{nlm}$. With this ansatz one would find expressions of the type

$$\int d^3r\,\boldsymbol h(\boldsymbol r)\cdot\frac{\delta\boldsymbol L(\boldsymbol r)}{\delta\bar\psi_k(\boldsymbol r')} = f_{nlm}(\boldsymbol r')\,(\phi_{nlm}|\psi_k), \tag{9.66}$$

where one could try to adjust the $f_{nlm}$ phenomenologically to standard situations. This would loosely give a justification of the orbital polarization corrections first introduced in [Brooks, 1985]. Note, however, that (9.65) is an approximation because it does not locally fulfill $\nabla\times\boldsymbol L = 2m\boldsymbol I$.


Bibliography

[Ando, 1963] T. Ando, “Properties of Fermion Density Matrices,” Rev. Mod. Phys. 35, 690–702 (1963).

[Anisimov et al., 1997] V. I. Anisimov, F. Aryasetiawan, and A. I. Lichtenstein, “First-principles calculations of the electronic structure and spectra of strongly correlated systems: the LDA+U method,” J. Phys.: Condens. Matter 9, 767–808 (1997).

[Berestetskii, 1976] V. B. Berestetskii, “Zero Mass and Asymptotic Freedom,” Uspekhi Fizicheskich Nauk (in Russian) 120, 439–454 (1976).

[Berezin, 1965] F. A. Berezin, The Method of Second Quantization (Academic Press, New York, 1965).

[Born and Huang, 1968] M. Born and K. Huang, Dynamical Theory of Crystal Lattices (Clarendon Press, Oxford, 1968).

[Brooks, 1985] M. S. S. Brooks, “Calculated Ground State Properties of Light Actinide Metals and their Compounds,” Physica 130B, 6–12 (1985).

[Carr and Maradudin, 1964] W. J. Carr, Jr. and A. A. Maradudin, “Ground-state Energy of a High-Density Electron Gas,” Phys. Rev. 133, A371–A374 (1964).

[Ceperly and Alder, 1980] D. M. Ceperley and B. J. Alder, “Ground State of the Electron Gas by a Stochastic Method,” Phys. Rev. Lett. 45, 566–569 (1980).

[Coleman, 1963] A. J. Coleman, “Structure of Fermion Density Matrices,” Rev. Mod. Phys. 35, 668–689 (1963).

[Cook, 1953] J. M. Cook, “The Mathematics of Second Quantisation,” Trans. Am. Math. Soc. 74, 224–245 (1953).

[Czyzyk and Sawatzky, 1994] M. T. Czyzyk and G. A. Sawatzky, “Local-density functional and on-site correlations: The electronic structure of La2CuO4 and LaCuO3,” Phys. Rev. B49, 14211–14228 (1994).

[Dawydow, 1987] A. S. Dawydow, Quantenmechanik, 7th ed. (VEB Deutscher Verlag der Wissenschaften, Berlin, 1987).


[Diener, 1991] G. Diener, “Current-Density-Functional Theory for a Non-Relativistic Electron Gas in a Strong Magnetic Field,” J. Phys.: Condens. Matter 3, 9417–9428 (1991).

[Dirac, 1930] P. A. M. Dirac, “Note on Exchange Phenomena in the Thomas Atom,” Proc. Camb. Phil. Soc. 26, 376–385 (1930).

[Dirac, 1958] P. A. M. Dirac, Principles of Quantum Mechanics, 4th ed. (Clarendon Press, Oxford, 1958).

[Dreizler and da Providencia, 1985] Density Functional Methods in Physics, Proceedings of a NATO ASI Held in Alcabideche, Portugal, September 1983, edited by R. M. Dreizler and J. da Providencia (Plenum Press, New York, 1985).

[Dreizler and Gross, 1990] R. M. Dreizler and E. K. U. Gross, Density Functional Theory (Springer-Verlag, Berlin, Heidelberg, New York, 1990).

[Dyson and Lenard, 1967] F. J. Dyson and A. Lenard, “Stability of Matter. I.,” J. Math. Phys. 8, 423–434 (1967).

[Englisch and Englisch, 1984a] H. Englisch and R. Englisch, “Exact Density Functionals for Ground-State Energies. I General Results,” phys. stat. sol. (b) 123, 711–721 (1984).

[Englisch and Englisch, 1984b] H. Englisch and R. Englisch, “Exact Density Functionals for Ground-State Energies. II Details and Remarks,” phys. stat. sol. (b) 124, 373–379 (1984).

[Erdahl and Smith, 1987] Density Matrices and Density Functionals, Proceedings of the Symposium in Honor of John Coleman Held in Kingston, Ontario, August 1985, edited by R. Erdahl and V. H. Smith, Jr. (Reidel Publishing Company, Dordrecht, 1987).

[Eschrig et al., 2003a] H. Eschrig, K. Koepernik, and I. Chaplygin, “Density Functional Application to Strongly Correlated Electron Systems,” J. Solid St. Chem. (Special Issue 2003).

[Eschrig and Pickett, 2001] H. Eschrig and W. Pickett, “Density functional theory of magnetic systems revisited,” Solid St. Commun. 118, 123–127 (2001).


[Eschrig et al., 2003b] H. Eschrig, M. Richter, and I. Opahle, in Relativistic Electronic Structure Theory, Part II: Applications, edited by P. Schwerdtfeger (Elsevier, Amsterdam, 2003), Chap. 12, Relativistic Solid State Calculations.

[Eschrig et al., 1985] H. Eschrig, G. Seifert, and P. Ziesche, “Current Density Functional Theory of Quantum Electrodynamics,” Solid St. Commun. 56, 777–780 (1985).

[Ewald, 1921] P. P. Ewald, “Die Berechnung Optischer und Elektrostatischer Gitterpotentiale,” Ann. Physik (Leipzig) 64, 253–287 (1921).

[Fermi, 1927] E. Fermi, “Un metodo statistico per la determinazione di alcune priorieta dell’atome,” Rend. Accad. Naz. Lincei 6, 602–607 (1927).

[Feynman, 1939] R. P. Feynman, “Forces in Molecules,” Phys. Rev. 56, 340–343 (1939).

[Fuchs, 1935] K. Fuchs, “A Quantum Mechanical Investigation of the Cohesive Forces of Metallic Copper,” Proc. Roy. Soc. A151, 585–602 (1935).

[Gaspar, 1954] R. Gaspar, “Über eine Approximation des Hartree-Fockschen Potentials durch eine universelle Potentialfunktion,” Acta Phys. Acad. Sci. Hung. 3, 263–286 (1954).

[Gell-Mann and Low, 1951] M. Gell-Mann and F. Low, “Bound States in Quantum Field Theory,” Phys. Rev. 84, 350–353 (1951).

[Gilbert, 1975] T. L. Gilbert, “Hohenberg-Kohn Theorem for Nonlocal External Potentials,” Phys. Rev. B12, 2111–2120 (1975).

[Gordon, 1928] W. Gordon, “Der Strom der Diracschen Elektronentheorie,” Z. Physik 50, 630–632 (1928).

[Grabo and Gross, 1995] T. Grabo and E. K. U. Gross, “Density-Functional Theory Using an Optimized Exchange-Correlation Potential,” Chem. Phys. Lett. 240, 141–150 (1995).

[Gross and Dreizler, 1995] Density Functional Theory, Proceedings of a NATO ASI Held in Il Ciocco, Italy, August 1993, edited by E. K. U. Gross and R. M. Dreizler (Plenum Press, New York, 1995).


[Gunnarsson et al., 1979] O. Gunnarsson, M. Jonson, and B. I. Lundqvist, “Description of Exchange and Correlation Effects in Inhomogeneous Electron Systems,” Phys. Rev. B20, 3136–3164 (1979).

[Gunnarsson and Lundqvist, 1976] O. Gunnarsson and B. I. Lundqvist, “Exchange and Correlation in Atoms, Molecules and Solids by the Spin-Density Formalism,” Phys. Rev. B13, 4274–4298 (1976).

[Harriman, 1980] J. E. Harriman, “Orthonormal Orbitals for the Representation of an Arbitrary Density,” Phys. Rev. A24, 680–682 (1980).

[Hedin, 1965] L. Hedin, “New Method for Calculating the One-Particle Green’s Function with Application to the Electron-Gas Problem,” Phys. Rev. 139, A796–A823 (1965).

[Hedin and Lundqvist, 1971] L. Hedin and B. I. Lundqvist, “Explicit Local Exchange and Correlation Potentials,” J. Phys. C: Solid St. Phys. 4, 2064–2083 (1971).

[Hellmann, 1937] H. Hellmann, Einführung in die Quantenchemie (Deuticke, Leipzig, 1937).

[Hoffmann-Ostenhof et al., 1980] M. Hoffmann-Ostenhof, T. Hoffmann-Ostenhof, R. Ahlrichs, and J. Morgan, “On the Exponential Falloff of Wave Functions and Electron Densities,” in Mathematical Problems in Theoretical Physics, Proceedings of the International Conference on Mathematical Physics held in Lausanne, Switzerland, August 20–25, 1979, Vol. 116 of Springer Lecture Notes in Physics, edited by K. Osterwalder (Springer-Verlag, Berlin, Heidelberg, New York, 1980), pp. 62–67.

[Hohenberg and Kohn, 1964] P. Hohenberg and W. Kohn, “Inhomogeneous Electron Gas,” Phys. Rev. 136, B864–B871 (1964).

[Itzykson and Zuber, 1980] C. Itzykson and J.-B. Zuber, Quantum Field Theory (McGraw-Hill Book Company, New York, 1980).

[Janak, 1978] J. F. Janak, “Proof that ∂E/∂ni = εi in Density Functional Theory,” Phys. Rev. B18, 7165–7168 (1978).

[Kirshnits, 1957] D. A. Kirshnits, “Quantum Corrections to the Thomas-Fermi Equation,” JETP 5, 64–71 (1957).


[Kohn, 1985] W. Kohn, “Density Functional Theory: Fundamentals and Applications,” in Highlights of Condensed Matter Theory, edited by F. Bassani, F. Fumi, and M. P. Tosi (North-Holland, Amsterdam, 1985), pp. 1–15.

[Kohn and Sham, 1965] W. Kohn and L. J. Sham, “Self-Consistent Equations Including Exchange and Correlation Effects,” Phys. Rev. 140, A1133–A1138 (1965).

[Kolmogorov and Fomin, 1970] A. Kolmogorov and S. Fomin, Introductory Real Analysis (Prentice Hall, Englewood Cliffs, NJ, 1970).

[Koopmans, 1934] T. Koopmans, “Über die Zuordnung von Wellenfunktionen und Eigenwerten zu den einzelnen Elektronen eines Atoms,” Physica 1, 104–113 (1934).

[Landau and Lifshitz, 1977] L. D. Landau and E. M. Lifshitz, Quantum Mechanics (Non-Relativistic Theory) (Pergamon Press, Oxford, 1977).

[Landau and Lifshitz, 1982] L. D. Landau and E. M. Lifshitz, Quantum Electrodynamics (Pergamon Press, Oxford, 1982).

[Lang, 1965] S. Lang, Algebra (Addison-Wesley, Reading, Mass., 1965).

[Lannoo et al., 1985] M. Lannoo, M. Schlüter, and L. J. Sham, “Calculation of the Kohn Sham Potential and its Discontinuity for a Model Semiconductor,” Phys. Rev. B32, 3890–3899 (1985).

[Lenard and Dyson, 1968] A. Lenard and F. J. Dyson, “Stability of Matter. II.,” J. Math. Phys. 9, 698–711 (1968).

[Levy, 1982] M. Levy, “Electron Densities in Search of Hamiltonians,” Phys. Rev. A26, 1200–1208 (1982).

[Lieb, 1976] E. H. Lieb, “The Stability of Matter,” Rev. Mod. Phys. 48, 553–569 (1976).

[Lieb, 1981] E. H. Lieb, “Thomas-Fermi and Related Theories of Atoms and Molecules,” Rev. Mod. Phys. 53, 603–641 (1981), erratum, Rev. Mod. Phys. 54, 311 (1982).

[Lieb, 1983] E. H. Lieb, “Density Functionals for Coulomb Systems,” Int. J. Quant. Chem. XXIV, 243–277 (1983).


[Lieb et al., 1996] E. H. Lieb, M. Loss, and H. Siedentop, “Stability of Relativistic Matter via Thomas-Fermi Theory,” Helv. Phys. Acta 69, 974–984 (1996).

[Lieb and Simon, 1977] E. H. Lieb and B. Simon, “The Thomas-Fermi Theory of Atoms, Molecules and Solids,” Adv. in Math. 23, 22–116 (1977).

[Lieb and Thirring, 1975] E. H. Lieb and W. E. Thirring, “Bound for the Kinetic Energy of Fermions Which Proves the Stability of Matter,” Phys. Rev. Lett. 35, 687–689 (1975).

[Lieb and Yau, 1988] E. H. Lieb and H.-T. Yau, “The Stability and Instability of Relativistic Matter,” Commun. Math. Phys. 118, 177–213 (1988).

[MacDonald and Vosko, 1979] A. H. MacDonald and S. H. Vosko, “A Relativistic Density Functional Formalism,” J. Phys. C 12, 2977–2990 (1979).

[Macke, 1950] W. Macke, “Über die Wechselwirkungen im Fermi-Gas,” Z. Naturforschung 5a, 192–208 (1950).

[Marc and McMillan, 1985] G. Marc and W. G. McMillan, “The Virial Theorem,” Adv. Chem. Phys. 58, 209–361 (1985).

[McWeeny, 1960] R. McWeeny, “Some Recent Advances in Density Matrix Theory,” Rev. Mod. Phys. 32, 335–369 (1960).

[Mermin, 1965] N. D. Mermin, “Thermal Properties of the Inhomogeneous Electron Gas,” Phys. Rev. 137, A1441–A1443 (1965).

[Misawa, 1965] S. Misawa, “Ferromagnetism of an Electron Gas,” Phys. Rev. 140, A1645–A1648 (1965).

[Parr and Yang, 1989] R. G. Parr and W. Yang, Density-Functional Theory of Atoms and Molecules (Oxford University Press, Oxford, 1989).

[Pauli, 1933] W. Pauli, “Die allgemeinen Prinzipien der Wellenmechanik,” in Handb. d. Phys., 2. ed., edited by H. Geiger and K. Scheel (Springer-Verlag, Berlin, 1933), Vol. 24, part 1, pp. 83–272.

[Perdew, 1985] J. Perdew, “What do the Kohn-Sham Orbital Energies Mean? How do Atoms Dissociate?,” in Density Functional Methods in Physics, Proceedings of a NATO ASI Held in Alcabideche, Portugal, September 1983, edited by R. M. Dreizler and J. da Providencia (Plenum Press, New York, 1985), pp. 265–308.

[Perdew and Levy, 1983] J. P. Perdew and M. Levy, “Physical Content of the Exact Kohn-Sham Orbital Energies: Band Gaps and Derivative Discontinuities,” Phys. Rev. Lett. 51, 1884–1887 (1983).

[Perdew and Zunger, 1981] J. P. Perdew and A. Zunger, “Self-Interaction Correction to Density-Functional Approximations for Many-Electron Systems,” Phys. Rev. B23, 5048–5079 (1981).

[Rajagopal, 1978] A. K. Rajagopal, “Inhomogeneous Relativistic Electron Gas,” J. Phys. C 11, L943–L948 (1978).

[Rajagopal and Callaway, 1973] A. K. Rajagopal and J. Callaway, “Inhomogeneous Electron Gas,” Phys. Rev. B7, 1912–1919 (1973).

[Reed and Simon, 1973] M. Reed and B. Simon, Methods of Modern Mathematical Physics. Vol. I: Functional Analysis (Academic Press, New York, 1973).

[Sakurai, 1985] J. J. Sakurai, Modern Quantum Mechanics (Benjamin, Menlo Park, 1985).

[Schwarz, 1994] A. S. Schwarz, Topology for Physicists (Springer-Verlag, Berlin, 1994).

[Seidl et al., 1996] A. Seidl et al., “Generalized Kohn-Sham schemes and the band-gap problem,” Phys. Rev. B53, 3764–3774 (1996).

[Sewell, 1986] G. L. Sewell, Quantum Theory of Collective Phenomena, Monographs on the Physics and Chemistry of Materials (Oxford University Press, Oxford, 1986).

[Sham and Schluter, 1983] L. J. Sham and M. Schlüter, “Density-Functional Theory of the Energy Gap,” Phys. Rev. Lett. 51, 1888–1891 (1983).

[Skudlarski and Vignale, 1993] P. Skudlarski and G. Vignale, “Exchange-Correlation Energy of a Three-Dimensional Electron Gas in a Magnetic Field,” Phys. Rev. B48, 8547–8559 (1993).


[Slater, 1951] J. C. Slater, “A Simplification of the Hartree-Fock Method,” Phys. Rev. 81, 385–390 (1951).

[Slater, 1960] J. C. Slater, Quantum Theory of Atomic Structure (McGraw-Hill Book Company, New York, 1960), Vol. 1.

[Svane, 1995] A. Svane, “Comment on Self-Interaction-Corrected Density-Functional Formalism,” Phys. Rev. B51, 7924–7926 (1995).

[Teller, 1962] E. Teller, “On the Stability of Molecules in the Thomas-Fermi Theory,” Rev. Mod. Phys. 34, 627–631 (1962).

[Thomas, 1927] L. H. Thomas, “The Calculation of Atomic Fields,” Proc. Camb. Philos. Soc. 23, 542–548 (1927).

[Trickey, 1990] Density Functional Theory of Many-Fermion Systems, Vol. 21 of Advances in Quantum Chemistry, edited by S. B. Trickey (Academic Press, Inc., San Diego, 1990).

[Vignale and Rasolt, 1987] G. Vignale and M. Rasolt, “Density-Functional Theory in Strong Magnetic Fields,” Phys. Rev. Lett. 59, 2360–2363 (1987).

[Vignale and Rasolt, 1988] G. Vignale and M. Rasolt, “Current- and Spin-Density-Functional Theory for Inhomogeneous Electron Systems in Strong Magnetic Fields,” Phys. Rev. B37, 10685–10696 (1988).

[von Barth and Hedin, 1972] U. von Barth and L. Hedin, “A Local Exchange-Correlation Potential for the Spin Polarized Case: I,” J. Phys. C 5, 1629–1642 (1972).

[von Neumann, 1955] J. von Neumann, Mathematical Foundation of Quantum Mechanics (Princeton University Press, Princeton NJ, 1955).

[von Weizsacker, 1935] C. F. von Weizsäcker, “Zur Theorie der Kernmassen,” Z. Physik 96, 431–458 (1935).

[Wilson, 1962] E. B. Wilson, Jr., “Four-Dimensional Electron Density Function,” J. Chem. Phys. 36, 2232–2233 (1962).

[Young, 1969] L. C. Young, Lectures on the Calculus of Variations and Optimal Control Theory (W. B. Saunders Company, Philadelphia, London, Toronto, 1969).


[Zeidler, 1986] E. Zeidler, Nonlinear Functional Analysis and its Applications, vol. I-III (Springer-Verlag, New York, 1986).

[Zumbach and Maschke, 1985] G. Zumbach and K. Maschke, “Density-Matrix Functional Theory for the N-Particle Ground State,” J. Chem. Phys. 82, 5604–5607 (1985).


Index

absolutely summable, 124
action, 183, 187
adiabatic approximation, 57
adiabatic forces, 56
adiabaticity assumption, 53
affine-linear, 77, 110
almost everywhere (a.e.), 63, 123
angular momentum density, 208
anticommutator, 32
aufbau principle, 95

Banach space, 120
  reflexive, 129
Banach-Alaoglu theorem, 130
base of topology, 117
bispinor, 193
Bohr radius, 17
Bolzano-Weierstrass theorem, 118
Borel set, 121
Bose condensed field, 191
bound minimum, 135
bounded linear functional, 129
bounded linear operator, 128

canonical momentum density, 185
Cauchy net, 120
Cauchy sequence, 120
chain rule for functional derivatives, 134
charge conservation, 180
chemical potential, 70, 98
classical electron radius, 188
closed set, 117
closed-shell Hartree-Fock method, 25
commutator, 31
compact, 118
complete set, 29
  of N-particle states, 24
  of spin-orbitals, 15
complete space, 120
Compton wavelength, 188
conjugate function, 112
conjugate functional, 131
continuous, 120
continuous linear operator, 128
convex, 111, 131
convex hull, 115, 132
correlation insulator, 171
coupling constant, 17, 52
creation and annihilation operators, 32
  bosonic, 30
  fermionic, 31
current density functional H[J], 202
current density, paramagnetic, 211

d’Alembert operator, 181
de Rham’s cohomology, 183
density functional
  universal, 145
density functional by Hohenberg and Kohn, 79
  of kinetic energy, 81
density functional by Levy and Lieb, 88
density functional for N-particle density matrices, 93
density matrix, 90
  reduced, 35–47
density operator, 33, 41
density parameter rs, 70
Dirac equation, 191–193
distance, 119
dual pair, 131, 132

electromagnetic field, 180
electromagnetic potentials, 180
electron affinity, 99
electron liquid, homogeneous, 69, 156–161
electronegativity, 100
ensemble state, 90
essential supremum (ess sup), 63
exchange and correlation energy, 48, 161, 212
exchange and correlation field, 209
exchange and correlation hole, 48–51
exchange energy, 26
  of the homogeneous electron liquid, 159
exchange hole, 40, 49

exchange potential operator, 27
exchange term, 20

F-derivative, 133
Fenchel’s duality, 137
Fermi energy, 70
Fermi radius, 23
fermion gas, interaction-free, 38
  homogeneous, 23, 37, 40, 49
field operators, 33
fine structure constant, 179, 188
finite from below, 111
Fock operator, 27, 29
Fock space, 30
four-current density, 182, 193, 205
four-field tensor, 182
four-momentum, 186
four-potential, 182
four-velocity, 186
fractional orbital occupation numbers, 90
free Dirac field, 193
free minimum, 135
free photon field, 190
functional derivative, 85, 132–135
fundamental form, 181
fundamental tensor, 181

G-derivative, 132
gap correction, 164
gauge invariance, 78, 201, 207
generalized sequence, 118
gradient expansion, 75
ground state, 78
  degenerate, 80
  determinantal, 82
  ensemble, 90
  non-degenerate, 80
  of the quantum field, 199
  spin-polarized, 102
ground state energy, 77, 139, 201
  ensemble, 90, 141

Hahn-Banach theorem, 113, 132
Hamiltonian, 13
  in Heisenberg representation, 24
  effective, of the inhomogeneous quantum field system, 200
  field quantized, 33
  for Coulomb systems, 55
    total, 56
  in momentum representation, 19
  in natural units, 17
  in occupation number representation, 32
  in Schrödinger representation, 16
  of QED, 197
  of the free Dirac field, 194
  of the free photon field, 190
  of the homogeneous electron liquid, 156
  on the torus, 58
Hamiltonian density, 185
  of electromagnetic coupling, 195
hard potential barrier, 79
Hartree energy, 26, 48
Hartree potential, 27
Hartree unit, 17
Hartree-Fock energy, 26, 28
Hartree-Fock equations, 27
Hartree-Fock orbitals, 28
Hausdorff property, 117
Heisenberg picture, 189
Heitler-London ansatz, 51
Hellmann-Feynman theorem, 53
Hilbert space, 13, 127
  N-particle, bosonic, 16
  N-particle, fermionic, 15
Hölder’s inequality, 126
Hohenberg-Kohn variational principle, 80, 144, 204
hole state, 34
HOMO, 28, 100
hyperplane, 130
hyperplane of support, 131

impure spin states, 107
inf-sup problem, 136
inner point, 116
ionization potential, 100

Janak’s theorem, 95, 99
jump of the Kohn-Sham potential, 153

kinetic energy, 43, 48

kinetic energy,
  exchange and correlation contribution, 53
kinetic energy functional, 62
Klein-Gordon equation, 191
Kohn-Sham exchange and correlation energy, 53
Kohn-Sham exchange and correlation hole, 54, 162
Kohn-Sham exchange and correlation potential, 85, 161, 206
Kohn-Sham orbital, 86
Kohn-Sham orbital energy, 100, 105
Kohn-Sham-Dirac equation, 210
  general, 206, 210
Kohn-Sham-Dirac spinor orbitals, 205
Kohn-Sham-Dirac variational problem, 206
Koopmans’ theorem, 28

Lagrange density, 184
Lagrange function, 137
Lagrange multiplier, 135–137
  abstract, 137
LDA+U density functional, 175
Lebesgue integral, 123
Lebesgue measure, 121
Lebesgue space L^p, 125–128
Legendre transform, 112, 131
limes, 116
  of a net, 118
local current density approximation, 211, 212
local density approximation (LDA), 87, 161–165
locally convex space, 119
Lorentz gauge, 181, 190
Lorentz transformation, 183
lower semicontinuous, 111, 131
L^p-spaces, 63
LUMO, 28, 99
L(X, Y), 128

magnetization density, 209
main theorem for extremal problems, 130
Maxwell’s equations, 180, 182
mean field, 28
measurable function, 122
measurable set, 122
metric space, 119
minimum problem, 130
Minkowski’s geometry, 181
Minkowski’s inequality, 126
molecular field, 28, 29
Mott insulator, 171

n-representable potential, 79
N-representable density, 88
natural orbitals, 93
natural units, 17
neighborhood, 116
net, 118
norm, 119
normal order, 191, 198

occupation matrix, 174
open set, 116
open-shell Hartree-Fock method, 25
orbital current density, 208
orbital relaxation energy, 28

p-summable function, 125
pair correlation function, 40
pair density, 38
paramagnetic state, 92
particle density, 23, 35
particle number operator, 31
Pauli-Hellmann-Feynman theorem, 53
Perdew-Zunger interpolation of ε_XC, 160
periodic boundary conditions, 18
phase separation, 98
Poisson’s equation, 66
potential gauge constant, 78
pure spin states, 107
pure state, 35
pure-state v-representable density, 79

quantum electrodynamics (QED), 179

regular measure, 122
relativistic dispersion relation, 187
relativistic effects
  magnitude, 179

renormalization, 34, 191, 199, 204
rest mass, 186
Riesz-Fisher theorem, 126

saddle point, 137
scaled nuclear charges, 55
Schrödinger picture, 189
Schwarz’ inequality, 127
self-consistent field, 29
self-interaction, 27, 46, 51, 205, 206
seminorm, 119
side condition, 135
σ-additive, 122
simplex, 91
Slater determinant, 15
Sobolev’s inequality, 149
spin current density, 208
spin density, 44
spin magnetization density, 44
spin operator, 14
spin polarization, 100–105
  degree of, 44, 158
spin-density matrix, 36, 44
spin-orbital, 15
spinor, 14
subdifferential, 113, 131
subgradient, 112, 131

tangent of support, 113
Taylor expansion of a functional, 134
Teller’s no-binding theorem, 72
thermodynamic limit, 33, 57–60, 98
Thomas-Fermi energy, 64
  of the atom, 68
Thomas-Fermi equation, 65, 166
Thomas-Fermi functional, 62
Thomas-Fermi screening length, 70
Thomas-Fermi-Dirac theory, 73
Thomas-Fermi-λWeizsäcker theory, 74
topological space, 117
topology, 117
torus, 18, 57, 140, 150
total charge, 199
trace, 36
transition state, 100

U-potential, 176

vacuum permeability µ_0, 180
vacuum permittivity ε_0, 180
virial theorem, 56
  of Thomas-Fermi theory, 72
von Barth-Hedin interpolation function f(ζ), 159
vorticity, 211

wavenumber space, 18
weak lower semicontinuous, 131
weak topology, 130
weak∗ topology, 130

Xα-approach, 87

Young’s inequality, 112