

Axiomatic and constructive quantum field theory

Thesis for the

master’s in Mathematical physics

Sohail Sheikh

Student number: 0481289

Thesis advisor : prof.dr. R.H. Dijkgraaf

Second reader : dr. H.B. Posthuma

Korteweg-de Vries Institute (KdVI)

Universiteit van Amsterdam (UvA)

FNWI

August 2013

Abstract

We investigate the mathematical structure of quantum field theory. For this purpose, we first have to develop the mathematical framework for a relativistic quantum theory in terms of a Hilbert space on which there is defined a unitary representation of the universal covering group of the restricted Poincaré group. We then discuss how quantum fields can be used in physics to compute physical quantities for scattering processes, such as scattering amplitudes. After this discussion about the use of quantum fields in physics, we analyze two different axiom systems for mathematically rigorous quantum field theory, namely the Wightman axiom system and the Haag-Kastler axiom system. Finally, we look at some results in constructive quantum field theory (CQFT). In CQFT the goal is to construct concrete non-trivial examples of models that satisfy the Wightman axioms or Haag-Kastler axioms.


Contents

1 Introduction
  1.1 Overview
  1.2 Conventions and notation
  1.3 Acknowledgements

2 Special relativity and quantum theory
  2.1 Special relativity
    2.1.1 Minkowski spacetime
    2.1.2 The Lorentz group, causal structure and the Poincaré group
  2.2 Quantum theory
    2.2.1 States and observables
    2.2.2 The general framework of quantum theory
    2.2.3 Symmetries in quantum theory
    2.2.4 Poincaré invariance and one-particle states
    2.2.5 Many-particle states and Fock space

3 The physics of quantum fields
  3.1 The interaction picture and scattering theory
  3.2 The use of free quantum fields in scattering theory
  3.3 Calculation of the S-matrix using perturbation theory
  3.4 Obtaining V from a Lagrangian
  3.5 Some remarks on the physics of quantum fields

4 The mathematics of quantum fields
  4.1 The Wightman formulation of quantum field theory
    4.1.1 Mathematical preliminaries: Distributions and operator-valued distributions
    4.1.2 The Wightman axioms
    4.1.3 Wightman functions
    4.1.4 Important theorems
    4.1.5 Example: The free hermitean scalar field
    4.1.6 Haag-Ruelle scattering theory
  4.2 The Haag-Kastler formulation of quantum field theory
    4.2.1 The algebraic approach to quantum theory
    4.2.2 The Haag-Kastler axioms
    4.2.3 Vacuum states in the Haag-Kastler framework

5 Constructive quantum field theory
  5.1 The Hamiltonian approach
    5.1.1 The (λφ^4)_2-model as a Haag-Kastler model
    5.1.2 The physical vacuum for the (λφ^4)_2-model
    5.1.3 The P(φ)_2-model and verification of some of the Wightman axioms
    5.1.4 Similar methods for other models
  5.2 The Euclidean approach
    5.2.1 Euclidean fields and probability theory
    5.2.2 An alternative method: The Osterwalder-Schrader theory
    5.2.3 The P(φ)_2-model as a Wightman model

A Hilbert space theory
  A.1 Direct sums and integrals of Hilbert spaces
  A.2 Self-adjoint operators and the spectral theorem

B Examples of free fields
  B.1 The (0, 0)-field (or scalar field)
  B.2 The (1/2, 1/2)-field (or vector field)
  B.3 The (1/2, 0)-field and the (0, 1/2)-field
  B.4 The (1/2, 0) ⊕ (0, 1/2)-field (or Dirac field)

References

Popular summary (English)

Popular summary (Dutch)


1 Introduction

Quantum field theory (QFT) is the physical theory that emerged when physicists tried to construct a quantum theory that would be compatible with special relativity. Although this theory is used to describe high-energy sub-atomic particles, the fundamental objects of the theory are fields. The theoretical predictions that can be made with QFT have been tested several times against experimental results, and these predictions turned out to be highly accurate. For this reason, QFT should be viewed as a very important physical theory.

When I first came to study QFT, there were some aspects of the theory that puzzled me. For example, I was very used to the systematic approach of non-relativistic quantum mechanics, where for each system one can specify very precisely in what Hilbert space the physical states live. In fact, in non-relativistic quantum mechanics the choice of Hilbert space is completely determined by the number of degrees of freedom, by virtue of the Stone-Von Neumann theorem. However, in QFT the only systems for which the Hilbert spaces are specified are the free systems, i.e. the systems that describe particles that do not interact with each other. I found it very confusing that in QFT one describes a physical system quantum mechanically without ever mentioning the Hilbert space. After all, in non-relativistic quantum mechanics the starting point for any quantum mechanical description was the Hilbert space. Another thing that I found very difficult about QFT is the fact that it is built around perturbation theory. In non-relativistic quantum mechanics one often starts with an exact theory for describing a particular system and then uses perturbation theory to approximate the solution for the system. In QFT the starting point is already a perturbative expression, such as the Dyson series, without the mention of what exact mathematical expression we are approximating.

In my first meeting with professor Dijkgraaf I explained to him that I had some difficulties in understanding QFT and that I would like to work on a master’s thesis that would take these difficulties away. It was in this context that we came up with the idea to examine the mathematical structure of QFT in terms of the Wightman framework and the Haag-Kastler framework. Professor Dijkgraaf emphasized that he wanted me to motivate these mathematical frameworks by using arguments from physics, and that he also wanted to see some results from constructive quantum field theory (CQFT). Besides these two demands, I was completely free to organize the thesis according to my own taste.

This master’s thesis should be readable to anyone with a healthy knowledge of real and complex analysis, measure and integration theory, functional analysis, operator algebras, Lie groups and Lie algebras. Since I have followed several courses¹ in these fields during my study, I did not bother explaining anything concerning these topics. For instance, I do not give the definition of C∗-algebras and Von Neumann algebras, and I do not explain the Gelfand-Naimark-Segal construction for general C∗-algebras or the Gelfand-Naimark theorem for abelian C∗-algebras. On the other hand, since I was not familiar with the theory of distributions and operator-valued distributions, I did write a subsection on these subjects. Furthermore, since the theory of unbounded operators on a Hilbert space is not part of any course that one can follow at the University of Amsterdam, I also included an appendix on unbounded (self-adjoint) operators. Strictly speaking, no knowledge of physics is required for reading this thesis, although the material would be more natural if one already has some background knowledge in quantum physics.

1.1 Overview

This thesis consists of four chapters, excluding the present introduction. These chapters consist of several sections, some of which are further decomposed into subsections. The content of each of the four chapters can be briefly summarized as follows.

¹ In the national mastermath programme there were two excellent courses given by M. Müger and N.P. Landsman on C∗-algebras and operator algebras, respectively. I am very grateful to these two teachers from the Radboud Universiteit Nijmegen, not only for giving these courses, but also for the fact that they helped me very much in finding good literature for this thesis.


In chapter 2 we will consider the two main ingredients of quantum field theory, namely special relativity and quantum theory. The first section of this chapter, on special relativity, commences with the introduction of certain physical concepts such as inertial observers and the invariance of the speed of light. These physical concepts will then be used to motivate the structure of the mathematical model that will be used for the description of spacetime. In the remainder of the section we will investigate the properties of this mathematical model, including a detailed discussion of the Poincaré group and its universal covering. In the second section, on quantum theory, we define the notion of a state and an observable from the physical point of view. We will then explain that in quantum theory these states and observables are represented mathematically in terms of Hilbert space objects, and we will consider time evolution in both the Heisenberg picture and Schrödinger picture. Because any quantum theory that is consistent with special relativity should be invariant under Poincaré transformations, the next logical step is to study how symmetries are implemented in a quantum theory. The important result in this context is Wigner’s theorem, which states that, in a quantum system without superselection rules, any symmetry can be represented either by a unitary operator or by an anti-unitary operator. Wigner’s theorem can then be applied to Poincaré transformations, which will eventually lead to the conclusion that the Hilbert space of any relativistic quantum system must contain a unitary representation of the universal covering group of the restricted Poincaré group. In case this representation is irreducible, the corresponding Hilbert space is interpreted as the pure state space of a single-particle system. The concepts of mass and spin of a particle arise very naturally in the study of these irreducible representations. In the last part of the section we will construct the Hilbert spaces for systems consisting of more than one particle. These spaces are tensor products of single particle Hilbert spaces, but in general not all states in these tensor products are physically realizable. Finally, we will consider spaces in which the total number of particles is not constant.

In chapter 3 we will give a brief overview of the use of quantum fields in physics. The concept of a quantum field will be introduced as a computational tool in the perturbative calculations that are carried out in the quantitative description of experiments in which high-energy particles collide with each other. This point of view is adopted from Weinberg’s book [35]. Once the quantum fields are defined, we will sketch how they can be used to compute physically interesting quantities. These computations are done by using perturbation theory, which is made somewhat easier by the use of Feynman diagrams and the corresponding Feynman rules. However, we will not explain the precise content of the Feynman rules, nor will we consider the process of renormalization which is necessary to transform infinite quantities into finite ones. After considering methods of computation in quantum field theory, we will show how one can obtain a quantum field theory from a classical Lagrangian field theory, since this is the route that is followed in practice. Finally, we will try to motivate what aspects of the physical theory should be included in any mathematically rigorous treatment of quantum fields.

In chapter 4 we will discuss two different axiom systems for quantum field theory. In the first section of this chapter we will consider the Wightman axioms. These axioms will be motivated by the physical structure of quantum field theory as described in chapter 3. However, before we can formulate the Wightman axioms, we first need to study the theory of distributions and operator-valued distributions. After these mathematical concepts have been studied and after the axioms have been formulated, we will prove several properties that are shared by all Wightman theories. Among these properties are the spin-statistics theorem and the PCT-theorem, which are very important from the physical point of view. As an easy example of a Wightman theory we will consider the free hermitean scalar field. We will treat this example in some detail, because the results will also be necessary when we construct an interacting quantum field theory in the next chapter. To close our discussion of the Wightman theory, we will show that under certain conditions the Wightman theories allow an interpretation in terms of particles. These conditions define what is called the Haag-Ruelle scattering theory. In the second section of chapter 4 we will consider an alternative axiom system, namely the Haag-Kastler system. Again we will try to motivate the content of the axioms by using our knowledge from previous discussions. Because the Haag-Kastler systems are formulated in terms of abstract algebras rather than concrete Hilbert spaces, we will begin with a treatment of algebraic quantum theory. This treatment includes topics such as physical representations, superselection rules and symmetries. After we have considered algebraic quantum theory in some detail, we will formulate the Haag-Kastler axioms and we will consider some important results concerning Haag-Kastler systems.

In chapter 5 we will describe some of the early developments in the constructive quantum field theory programme, which started in the 1960s. In constructive quantum field theory (CQFT) the purpose is to construct concrete examples of non-trivial models that satisfy all axioms of one or both of the two axiom systems mentioned above. This is a very difficult task, and for that reason people started with the easiest possible models in 2- and 3-dimensional spacetime. We will consider some of the results that were obtained for these ‘easy’ models. Because the detailed proofs of these results are almost always very technical (and not very fun), we will only focus on the main arguments and constructions.

1.2 Conventions and notation

Physicists and mathematicians often use very different notations for the same mathematical object, so we had to make some choices. The most important conventions and notations that we will use are the following.
- We will use the Einstein summation convention: if in some equation a Greek letter appears once as a lower index and once as an upper index, then that index should be summed over all possible values of the index.
- Inner products ⟨·, ·⟩ : V × V → ℂ on a complex vector space will always be linear in the first argument and conjugate-linear in the second. This coincides with the convention used in the mathematics literature on linear algebra and functional analysis, but is opposite to the convention used in the physics literature on quantum theory. In particular, the bra-ket notation of Dirac will not be used here.
- The complex conjugate of a complex number z is denoted by z̄ (not by z∗) and the adjoint of an operator A is denoted by A∗ (not by A†).
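The first two conventions can be made concrete with a small numerical sketch. The function names `inner` and `contract` below are hypothetical and chosen only for illustration; the sketch assumes finite-dimensional vectors stored as Python lists and a Greek index running over the four values 0, 1, 2, 3.

```python
# Illustration of two conventions from this section (all names are hypothetical).

def inner(v, w):
    """Inner product on C^n: linear in the first argument, conjugate-linear in the second."""
    return sum(a * complex(b).conjugate() for a, b in zip(v, w))

def contract(upper, lower):
    """Einstein summation a^mu b_mu: sum over the repeated index mu = 0, 1, 2, 3."""
    return sum(upper[mu] * lower[mu] for mu in range(4))

v = [1 + 2j, 3j]
w = [2 - 1j, 1 + 1j]
c = 2 + 5j

scaled_v = [c * a for a in v]
scaled_w = [c * b for b in w]

# Linear in the first slot: <c v, w> = c <v, w>.
linear_first = abs(inner(scaled_v, w) - c * inner(v, w)) < 1e-12
# Conjugate-linear in the second slot: <v, c w> = conj(c) <v, w>.
conj_linear_second = abs(inner(v, scaled_w) - c.conjugate() * inner(v, w)) < 1e-12

a_upper = [1.0, 2.0, 3.0, 4.0]
b_lower = [5.0, 6.0, 7.0, 8.0]
contraction = contract(a_upper, b_lower)  # 1*5 + 2*6 + 3*7 + 4*8
```

In the physics convention the roles of the two slots of `inner` would be swapped, which is why the choice has to be fixed once and for all.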

1.3 Acknowledgements

I would like to thank professor Dijkgraaf for all his time and effort in supervising this thesis. I really appreciate that he always took the time to answer my questions, despite the fact that his agenda was always overfull during his tenure as president of the KNAW. Even when he became director of the Institute for Advanced Study (IAS) in Princeton, he still made time for me during the short periods that he was in the Netherlands. In this context I would also like to thank his personal assistant Ms. Corina de Boer, who always arranged my appointments with the professor and who brought me into contact with professor Dijkgraaf in the first place. I am also very grateful to the second reader of this thesis, doctor Posthuma, for all his time spent on reading the text and for all his feedback.

I would also like to thank my family for all their support during my study. My parents always inspire and motivate me in everything I do, and, as the youngest child, I have the privilege that I can always take an example from my brother and sister.


2 Special relativity and quantum theory

In this chapter we will describe the two main ingredients for quantum field theory, namely special relativity and quantum theory. At the end of this chapter we will combine the two theories to obtain the general framework for a relativistic quantum theory. This chapter is rather long and detailed, because it is only after we have developed a proper comprehension of relativistic quantum theory that we can introduce quantum fields.

2.1 Special relativity

In the absence of gravity, freely moving macroscopic objects (i.e. macroscopic objects on which no external agencies act) have constant relative velocities. This empirical fact allows us to define a special class of observers, namely those for which all freely moving macroscopic objects move with constant velocities. Such observers are called inertial observers. For convenience, we assume that all inertial observers construct an orthogonal three-dimensional coordinate system in precisely the same manner. For example, we might agree that the origin of this coordinate system is always the center of mass of the observer’s body and that the $x^1$-axis runs from left to right, the $x^2$-axis runs from back to front and the $x^3$-axis runs from down to up; note that this defines a right-handed coordinate system. Distances along any of these axes are measured by using light rays. We also assume that all inertial observers are equipped with the same kind of clock. Thus, we may in fact define an inertial observer to be a right-handed three-dimensional coordinate system moving through space with constant velocity relative to all freely moving macroscopic objects, together with a clock at the origin. Two such coordinate systems that coincide but carry clocks that do not have the same $t = 0$ moment are considered as different inertial observers. By using light rays, any two clocks that are at rest with respect to the same inertial observer can be synchronized in the familiar way. We may therefore imagine that there is a clock at every point of the coordinate system of any inertial observer and that all these clocks are synchronized. This allows a coordinatization of space and time by four numbers $(t, x^1, x^2, x^3)$ for any inertial observer.

It is an empirical fact that all inertial observers are physically equivalent in the sense that they all obtain the same outcome whenever they conduct the same experiment. This is called the principle of special relativity. In order to fully understand this principle, we first introduce some terminology concerning physical experiments; this terminology is borrowed from chapter 1 of [1]. The part of the physical world that is studied in a particular measurement is called the measured object under consideration and the measurements that can be carried out on the measured object are called physical quantities. In any measurement process, the measured object is interacting for some time with a measuring apparatus, after which the interaction stops and the measuring apparatus immediately indicates the measured value. In any measurement, the measured object is prepared during a preparation process. By definition, this preparation process ends at the moment when the interaction between the measured object and the measuring apparatus stops and when the measuring apparatus indicates the measured value. In what follows, we will denote measured objects by Greek letters α, β, . . ., and these Greek letters should be interpreted as a complete description of how the measured object is prepared, in terms of the space and time coordinates of all constituent parts of the measured object during the preparation process. We will often denote these spacetime coordinates symbolically as α(x). Physical quantities will be denoted by capital letters A, B, . . ., and these should be interpreted as a complete description of how to perform a certain measurement process, in terms of the space and time coordinates of all constituent parts of all measuring apparatuses during the measurement process; by definition, the measurement process ends at the moment that the measured value is produced, i.e. at the same moment that the preparation process for the measured object ends. As for measured objects, we will often write A(x) to symbolically denote the spacetime coordinates of all parts of the measuring apparatuses. The outcome of the measurement of a physical quantity A for a measured object α, i.e. the measured value, is denoted by M(α, A). Measured values are assumed to be Borel subsets of $\mathbb{R}^n$, where $n = n_A$ depends on the physical quantity A. The reason that measured values are Borel subsets of $\mathbb{R}^n$, rather than elements of $\mathbb{R}^n$, is that measurements always involve some errors.


Consider now some inertial observer O measuring the physical quantity A(y) for the measured object α(x), where x and y symbolically represent the spacetime coordinates of all parts of the measured object and of all parts of the measuring apparatuses, respectively. Note, in particular, that this implies that x and y are such that the end of the preparation process α(x) and the end of the measuring process A(y) coincide. We say that a second inertial observer O′ carries out a similar experiment as observer O if this second observer measures the physical quantity A(y′) for the measured object α(x′), where x′ and y′ are the coordinates with respect to O′ and these coordinates have the same numerical values as the coordinates x and y. Now consider the situation where N different inertial observers carry out similar experiments. If N(B) denotes the number of observers that find a measured value in the Borel subset $B \subset \mathbb{R}^n$, then it is an empirical fact that for any such B the fraction N(B)/N approaches some definite value as N becomes large enough. This suggests that the similar experiments carried out by the different inertial observers should be interpreted as repetitions of the same probabilistic experiment. This is the form of the principle of special relativity that we will need in the following. In most texts on special relativity this principle is not stated in probabilistic form but in deterministic form, since these texts are often concerned only with classical dynamics. The deterministic form is stated by the equation M(α(x), A(y)) = M(α(x′), A(y′)), i.e. the measured values are the same for all inertial observers carrying out a similar experiment. In other words, in deterministic form the principle of special relativity can be loosely stated as follows: inertial observers carrying out similar experiments will obtain the same measured value.
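The stabilization of the fraction N(B)/N can be illustrated with a toy simulation. This is only a sketch under strong assumptions that are not part of the text: each observer's outcome is modelled as a single real number drawn from a Gaussian distribution, B is taken to be an interval, and the function name `fraction_in_set` and all parameter values are hypothetical.

```python
import random

def fraction_in_set(num_observers, lower, upper, true_value=1.0, error=0.1, seed=0):
    """Return N(B)/N for B = [lower, upper] in a toy model.

    Assumption (not from the text): every observer's outcome is an independent
    Gaussian sample with mean `true_value` and standard deviation `error`.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_observers):
        measured = rng.gauss(true_value, error)
        if lower <= measured <= upper:
            hits += 1
    return hits / num_observers

# The fraction for increasing numbers of observers N.
fractions = {n: fraction_in_set(n, 0.9, 1.1) for n in (10, 1_000, 100_000)}
```

In this toy model the limiting value is the probability that a Gaussian sample lies within one standard deviation of its mean, roughly 0.68, and the entry of `fractions` for the largest N should lie close to it.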

When we apply (the deterministic form of) the principle of special relativity to experiments in classical electromagnetism, it follows that all inertial observers measure the same speed of light. This fact can be used to derive the coordinate transformations between different inertial frames, as explained in any introductory text on special relativity. Here we will merely state the results for later reference. We already mentioned above that an inertial observer can coordinatize the points of spacetime by four numbers $(x^0, x^1, x^2, x^3)$, where $x^0 = t$ represents the time coordinate² and the other components represent the spatial coordinates. Therefore, in any particular inertial frame O, spacetime can be identified with the four-dimensional vector space $\mathbb{R}^4$. If a second inertial observer O′ is at rest with respect to the observer O and is standing at the point in space with spatial coordinates $(a^1, a^2, a^3)$ with respect to the frame O and if O′ is oriented parallel to observer O, then any point in space with coordinates $(x^1, x^2, x^3)$ with respect to O has coordinates $((x')^1, (x')^2, (x')^3) = (x^1 - a^1, x^2 - a^2, x^3 - a^3)$ with respect to O′. Furthermore, if the time zero moment $(x')^0 = 0$ of the clock of observer O′ takes place at time $x^0 = a^0$ with respect to O, then the time coordinate of any point in spacetime with respect to O′ is $(x')^0 = x^0 - a^0$, where $x^0$ is the time coordinate of the point with respect to O. Thus, in this case the coordinates with respect to O′ are related to those with respect to O by

\[ (x')^\mu = x^\mu - a^\mu. \tag{2.1} \]

Such coordinate transformations are called spacetime translations. Now consider another observer O′′ that is standing at the same point in space as observer O with a clock that is synchronized with the clock of O, but suppose that the orientation of O′′ is obtained from the orientation of O by rotating counterclockwise (as seen from a point with positive $x^3$-coordinate) over an angle θ in the $x^1x^2$-plane. Then the coordinates with respect to O′′ of a point in spacetime are related to those with respect to O by

\[ ((x'')^0, (x'')^1, (x'')^2, (x'')^3) = (x^0,\ x^1\cos\theta + x^2\sin\theta,\ -x^1\sin\theta + x^2\cos\theta,\ x^3). \tag{2.2} \]

A very similar expression is obtained for rotations in the other two spatial planes. Rotations around arbitrary axes are more complicated, but we will come back to them later. Coordinate transformations of this form are called (spatial) rotations. Finally, consider yet another observer O′′′ that has the same spatial orientation as O but moves with velocity v in the positive $x^1$-direction, and suppose that at the unique moment where the two observers are at the same point in space their clocks are both at time zero, i.e. $x^0 = (x''')^0 = 0$ at that moment. Then their coordinates are related by

\[ ((x''')^0, (x''')^1, (x''')^2, (x''')^3) = (\gamma(v)(x^0 - v x^1),\ \gamma(v)(x^1 - v x^0),\ x^2,\ x^3), \tag{2.3} \]

where $\gamma(v) = (1 - v^2)^{-1/2}$ is the Lorentz factor. Similar expressions are obtained when the observer O′′′ moves along one of the other two spatial axes. For more general directions, the expression becomes more complicated, as we will discuss later. These coordinate transformations are called (Lorentz) boosts. It should be clear that the coordinate transformations between any two inertial frames can be obtained by a composition of translations, rotations and boosts.

² We will always use units in which the speed of light c is equal to 1. Otherwise it would have been more natural to define $x^0 = ct$.
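The boost (2.3) can be written as a 4×4 matrix acting on the coordinates, which makes two of its properties easy to check numerically: a boost with velocity −v undoes a boost with velocity v, and a light ray moving in the positive $x^1$-direction (with c = 1) is again a light ray in the boosted frame. The sketch below assumes plain Python lists for vectors and matrices; the function names `gamma`, `boost` and `apply` and the sample values are hypothetical.

```python
import math

def gamma(v):
    """Lorentz factor (1 - v^2)^(-1/2) in units with c = 1; requires |v| < 1."""
    return 1.0 / math.sqrt(1.0 - v * v)

def boost(v):
    """Matrix of the boost (2.3) with velocity v in the positive x^1-direction."""
    g = gamma(v)
    return [
        [g, -g * v, 0.0, 0.0],
        [-g * v, g, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def apply(matrix, x):
    """Matrix-vector product on four-component coordinates."""
    return [sum(matrix[i][j] * x[j] for j in range(4)) for i in range(4)]

v = 0.6
event = [2.0, 1.0, -3.0, 0.5]
boosted = apply(boost(v), event)
restored = apply(boost(-v), boosted)   # boosting back with -v recovers the event

# A point on a light ray through the origin: x^1 = x^0.
light_event = [5.0, 5.0, 0.0, 0.0]
boosted_light = apply(boost(v), light_event)   # still satisfies x^1 = x^0
```

The second check is the numerical counterpart of the statement that all inertial observers measure the same speed of light.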

In the situations above, where an inertial observer passes from the coordinates of his own frame to the coordinates of another inertial observer’s frame, in order to see how the other observer observes all objects in spacetime, we speak of a passive transformation. However, the observer can obtain the same result by keeping his own coordinates and by transforming all objects in spacetime. For example, observer O above could translate all objects in spacetime over the vector $-a^\mu$ to obtain the same point of view as observer O′. Such transformations are called active transformations. From the active viewpoint, the principle of special relativity may be stated as follows: if we move all objects in spacetime (along with all measuring apparatuses) according to a transformation that relates two inertial frames, then the outcome of any measurement remains unchanged. Note that, in either form, this principle makes it possible to ‘repeat’ an experiment at different times and places, but also at different velocities.

Our goal for the rest of this section is to formulate a mathematical theory of space and time that agrees with the data described above. In the first subsection we will give the mathematical definition of spacetime and in the second subsection we will describe the transformations between different inertial reference frames.

2.1.1 Minkowski spacetime

As discussed above, each inertial observer can identify spacetime with the vector space $\mathbb{R}^4$ and the coordinate transformations between different inertial frames are generated by translations, rotations and boosts. However, we would like to define spacetime mathematically in a manner that describes its intrinsic properties, independent of a choice of inertial frame. For example, since two inertial frames that are related by a spacetime translation have different origins for their coordinate frames, it is clear that spacetime should not be represented mathematically by a vector space, but rather by an affine space. Furthermore, under any coordinate transformation between inertial frames the quantity

$$(\Delta x)\cdot(\Delta x) = \left[(\Delta x)^0\right]^2 - \sum_{j=1}^{3}\left[(\Delta x)^j\right]^2 \tag{2.4}$$

is left invariant, so this quantity should play an important role in the mathematical definition of spacetime; here $\Delta x$ denotes the difference between two points in spacetime. In fact, we can identify the quantity in (2.4) as some kind of metric (analogous to the Euclidean metric describing distances in Euclidean space) that acts on differences of two spacetime points. Thus, spacetime should be represented by a four-dimensional affine space together with some kind of metric acting on difference vectors, and the coordinate transformations between two inertial frames are then represented by transformations that preserve the metric. Before defining the precise mathematical model for spacetime (which will be called Minkowski spacetime, or simply Minkowski space), we will first recall the definition of an affine space and of symmetric nondegenerate bilinear forms.

Definition 2.1 An affine space is a triple $(A, V, \ell)$ consisting of a set $A$, a vector space $V$ and a map $\ell : V \times A \to A$ such that
(1) $\ell(0, a) = a$ for all $a \in A$;
(2) $\ell(v, \ell(w, a)) = \ell(v + w, a)$ for all $v, w \in V$ and $a \in A$;
(3) for each $a \in A$ the map $\ell_a : V \to A$ defined by $\ell_a(v) = \ell(v, a)$ is a bijection.
The dimension of the affine space is defined to be the dimension of the vector space $V$.

Note that (3) implies that $A$ and $V$ are in bijection, so that they can be identified as sets once a point $a \in A$ has been chosen. Instead of $\ell(v, a)$ we also write $v + a$. If $a_1, a_2 \in A$ then, according to condition (3) in the definition, there exists a unique $v \in V$ such that $a_2 = \ell_{a_1}(v) = \ell(v, a_1) = v + a_1$, which we rewrite as $a_2 - a_1 = v$. In this sense, we can subtract points in $A$ to obtain elements in $V$, so $V$ can be interpreted as the set of differences between points in $A$.

Recall from (multi-)linear algebra that a multilinear map $T : (V^*)^{\times k} \times V^{\times l} \to \mathbb{R}$ on a vector space $V$ is called a $(k, l)$-tensor on $V$. If $T : V \times V \to \mathbb{R}$ is a $(0, 2)$-tensor on a real vector space $V$, then we say that $T$ is symmetric if $T(v, w) = T(w, v)$ for all $v, w \in V$ and antisymmetric if $T(v, w) = -T(w, v)$ for all $v, w \in V$. If $T$ is either symmetric or antisymmetric, then $T$ is called nondegenerate if $T(v, w) = 0$ for all $w \in V$ implies that $v = 0$. The following theorem on nondegenerate symmetric bilinear forms, which we will state without proof, will be very useful for our purposes.

Theorem 2.2 Let $V$ be an $n$-dimensional real vector space on which a nondegenerate symmetric $(0, 2)$-tensor $T : V \times V \to \mathbb{R}$ is defined. Then there exists a basis $\{e_j\}_{j=1}^{n}$ of $V$ such that $T(e_i, e_j) = 0$ if $i \neq j$ and $T(e_i, e_i) = \pm 1$ for $i = 1, \ldots, n$. Moreover, the number of basis vectors $e_i$ for which $T(e_i, e_i) = -1$ is the same for any such basis (this number is called the index of $T$).

Such a basis as described in the theorem is called an orthonormal basis with respect to $T$. Now that we have considered affine spaces and symmetric nondegenerate bilinear forms, we can define our mathematical model for spacetime as follows.

Definition 2.3 Let $(M, V, \ell)$ be a 4-dimensional real affine space equipped with a nondegenerate symmetric $(0, 2)$-tensor $\eta : V \times V \to \mathbb{R}$ of index 3. Then $\eta$ is called a Minkowski metric on $M$, and $(M, V, \ell, \eta)$ is called Minkowski spacetime.

The action of $\eta$ on two vectors in $V$, $\eta(v, w) = v \cdot w$, is called the inner product (or scalar/dot product) of $v$ and $w$. Two vectors $v, w \in V$ for which $v \cdot w = 0$ are called orthogonal. The norm of a vector $v \in V$ is $v \cdot v$ (not the square root of this quantity), and we say that a vector $v \in V$ is timelike if its norm is positive, lightlike or null if its norm is zero and spacelike if its norm is negative.

Definition 2.4 We define the light cone $C_x$ at a point $x \in M$ in Minkowski spacetime by

$$C_x = \{y \in M : \eta(y - x, y - x) = 0\}.$$

If $y \in M$ with $\eta(y - x, y - x) > 0$, we say that $y$ lies inside the light cone at $x$; if $\eta(y - x, y - x) < 0$, we say that $y$ lies outside the light cone at $x$.

Note that a point $y \in M$ lies on the light cone at $x$ if and only if $y - x$ is a null vector, whereas $y \in M$ lies inside (or outside) the light cone at $x$ if and only if $y - x$ is a timelike (or spacelike) vector.
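The classification of vectors and the position of a point relative to a light cone can be sketched in Python as follows. The sketch assumes that an origin and an orthonormal basis with $\eta(e_0, e_0) = +1$ have been chosen, so that points and vectors are represented by four coordinates. The function names are hypothetical, and the exact comparisons with zero are only reliable for exact inputs such as integers.

```python
def eta(v, w):
    """Minkowski inner product in an orthonormal basis with signature (+, -, -, -)."""
    return v[0] * w[0] - v[1] * w[1] - v[2] * w[2] - v[3] * w[3]

def classify(v):
    """Classify a vector by the sign of its norm v . v (no square root is taken)."""
    norm = eta(v, v)
    if norm > 0:
        return "timelike"
    if norm < 0:
        return "spacelike"
    return "null"

def position_relative_to_cone(x, y):
    """Locate the point y relative to the light cone C_x via the difference y - x."""
    difference = [yi - xi for xi, yi in zip(x, y)]
    labels = {"timelike": "inside", "null": "on", "spacelike": "outside"}
    return labels[classify(difference)]
```

For example, `classify((2, 1, 0, 0))` gives `"timelike"` because the norm is $4 - 1 = 3 > 0$.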

We will denote Minkowski spacetime simply by $M$ instead of $(M, V, \ell, \eta)$. Furthermore, with abuse of notation, more often than not we will denote the vector space $V$ also by $M$. In fact, from now on $M$ will always denote the vector space $V$ in the triple $(M, V, \ell)$; whenever we are considering the affine space $M$, we will state this explicitly. By theorem 2.2, there exists a basis $\{e_\mu\}_{\mu=0}^{3}$ of $M$ with $\eta(e_0, e_0) = 1$, $\eta(e_i, e_i) = -1$ for $i = 1, 2, 3$ and $\eta(e_\mu, e_\nu) = 0$ if $\mu \neq \nu$. We will always work in an orthonormal basis from now on, and we will always label the orthonormal basis vectors such that $\eta(e_0, e_0) = +1$.

If $\{e^\mu\}_{\mu=0}^{3}$ denotes the dual basis of $\{e_\mu\}$, i.e. $\{e^\mu\}$ is a basis of the dual vector space $M^*$ of $M$ such that $e^\mu(e_\nu) = \delta^\mu_\nu$, then we can write $\eta = \eta_{\mu\nu} e^\mu \otimes e^\nu$, where $\eta_{\mu\nu} = \eta(e_\mu, e_\nu)$. Of course, in our particular choice of basis we have $\eta_{00} = -\eta_{11} = -\eta_{22} = -\eta_{33} = 1$ and all other coefficients are 0. The metric $\eta : M \times M \to \mathbb{R}$ defines a map

$$\eta : M \to M^*, \qquad v \mapsto \eta(e_\mu, v)\, e^\mu.$$

In physics it is more convenient to write $\eta(v) = v_\mu e^\mu$ rather than $\eta(v) = [\eta(v)]_\mu e^\mu$, and for this reason physicists often refer to the map $v \mapsto \eta(v)$ as 'lowering the indices of $v$' and in their notation this map is simply $v^\mu \mapsto v_\mu = \eta_{\mu\nu} v^\nu$. Because $\eta(e_0) = e^0$ and $\eta(e_j) = -e^j$ for $j = 1, 2, 3$, it is clear that $\eta$ has an inverse $\eta^{-1} : M^* \to M$ and this inverse can be considered as a map that 'raises indices': $f_\mu \mapsto f^\mu$. This map, in turn, defines a $(2, 0)$-tensor $\eta^{-1} : M^* \times M^* \to \mathbb{R}$ on $M$ given by $\eta^{-1}(f, g) = g(\eta^{-1}(f)) = f(\eta^{-1}(g))$ for $f, g \in M^*$. In components we have $\eta^{-1}(f, g) = g_\mu f^\mu = f_\mu g^\mu$. We call $\eta^{-1}$ the inverse Minkowski metric and its nonzero components $\eta^{\mu\nu} = \eta^{-1}(e^\mu, e^\nu)$ are $\eta^{00} = -\eta^{11} = -\eta^{22} = -\eta^{33} = 1$. Note that for $f \in M^*$ we can now write the raised components $f^\mu$ in terms of $\eta^{\mu\nu}$ as $f^\mu = \eta^{\mu\nu} f_\nu$. Also, for $f, g \in M^*$ we have $\eta^{\mu\nu} f_\mu g_\nu = f_\mu g^\mu = \eta_{\mu\nu} f^\mu g^\nu$. For this reason we will often write, with abuse of notation, $f \cdot g$ instead of $\eta^{-1}(f, g)$. Similarly, because for $f \in M^*$ and $v \in M$ we have that $f(v) = f_\mu v^\mu = \eta_{\mu\nu} f^\nu v^\mu$, we will also write $f \cdot v$ instead of $f(v)$. In other words, because we can move vectors back and forth between $M$ and $M^*$, we will not make a clear distinction between $M$ and $M^*$ and we write all scalars $\eta(v, w)$, $\eta^{-1}(f, g)$ and $f(v)$ as a dot product.
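In an orthonormal basis, lowering and raising indices only flips the sign of the spatial components. The following Python sketch makes this explicit; the function names and the sample vectors are hypothetical, and components are stored as plain lists indexed by $\mu = 0, 1, 2, 3$.

```python
ETA_DIAG = (1, -1, -1, -1)  # diagonal entries of [eta], which equal those of its inverse

def lower(v):
    """Lower an index: v^mu -> v_mu = eta_{mu nu} v^nu."""
    return [ETA_DIAG[m] * v[m] for m in range(4)]

def raise_index(f):
    """Raise an index: f_mu -> f^mu = eta^{mu nu} f_nu."""
    return [ETA_DIAG[m] * f[m] for m in range(4)]

def pairing(f, v):
    """Evaluate a covector f (lower components) on a vector v (upper components)."""
    return sum(f[m] * v[m] for m in range(4))

v = [3, 1, 2, 0]
w = [1, 4, 0, 2]
v_dot_w = pairing(lower(v), w)  # eta(v, w) = v_mu w^mu = 3*1 - 1*4 - 2*0 - 0*2
```

Raising after lowering returns the original components, which reflects that the two maps are inverse to each other.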

2.1.2 The Lorentz group, causal structure and the Poincaré group

As stated above, the coordinate transformations between different inertial frames are transformations that leave the Minkowski metric invariant. Therefore, we will now study such transformations.

Definition 2.5 A linear map $L : M \to M$ is said to be an orthogonal transformation of $M$ if $\eta(Lv, Lw) = \eta(v, w)$ for all $v, w \in M$. An orthogonal transformation is also called a (general) Lorentz transformation.

In this subsection we will investigate the properties of these Lorentz transformations. We will also find that the Lorentz transformations form a Lie group, and we will study some important properties of this Lie group.

Algebraic properties of Lorentz transformations and causal structure of $M$

The following discussion is largely based on the first chapter of [26]. If $L : M \to M$ is an orthogonal transformation and $Lv = 0$ for some $v \in M$, then for all $w \in M$ we have $\eta(v, w) = \eta(Lv, Lw) = \eta(0, Lw) = 0$, so that $v$ must be the zero vector in $M$ since $\eta$ is nondegenerate. This means that $L$ is injective and hence, because $M$ is finite-dimensional, we can conclude that orthogonal transformations are linear automorphisms of $M$. In particular, an orthogonal transformation $L$ is invertible and we have $\eta(L^{-1}v, L^{-1}w) = \eta(LL^{-1}v, LL^{-1}w) = \eta(v, w)$, so that its inverse is also an orthogonal transformation. We have the following characterization of orthogonal transformations:

Lemma 2.6 Let $L : M \to M$ be a linear map in Minkowski space. Then the following statements are equivalent:
(a) $L$ is an orthogonal transformation, i.e. $\eta(Lv, Lw) = \eta(v, w)$ for all $v, w \in M$.
(b) $\eta(Lv, Lv) = \eta(v, v)$ for all $v \in M$.
(c) $L$ carries an orthonormal basis of $M$ onto another orthonormal basis of $M$.

Proof
That (a) implies (b) is trivial. The opposite implication follows from the identity

$$\begin{aligned}
4\eta(v, w) &= \eta(v + w, v + w) - \eta(v - w, v - w) \\
&= \eta(L(v + w), L(v + w)) - \eta(L(v - w), L(v - w)) \\
&= \eta(Lv + Lw, Lv + Lw) - \eta(Lv - Lw, Lv - Lw) \\
&= 4\eta(Lv, Lw).
\end{aligned}$$

To see that (a) implies (c), let $\{e_\mu\}$ be an orthonormal basis of $M$. Because $L$ is an automorphism of $M$, $\{Le_\mu\}$ is also a basis of $M$, and from $\eta(Le_\mu, Le_\nu) = \eta(e_\mu, e_\nu)$ it follows that $\{Le_\mu\}$ is an orthonormal basis of $M$. Finally, to prove that (c) implies (b), let $\{e_\mu\}$ be an orthonormal basis of $M$. By assumption, $\{Le_\mu\}$ is also an orthonormal basis of $M$, so that³ $\eta(Le_\mu, Le_\nu) = \eta(e_\mu, e_\nu)$. This immediately implies that $\eta(Lv, Lv) = \eta(v, v)$ for all $v \in M$.

If $L : M \to M$ is a linear map and $\{e_\mu\}$ is an orthonormal basis of $M$, we define $L^\mu{}_\nu$ for $\mu, \nu \in \{0, 1, 2, 3\}$ by $L^\mu{}_\nu = (Le_\nu)^\mu$. Then $Le_\nu = (Le_\nu)^\mu e_\mu = L^\mu{}_\nu e_\mu$, so for all $v \in M$ we have $Lv = v^\nu Le_\nu = v^\nu L^\mu{}_\nu e_\mu$, and hence the components of $Lv$ can be expressed in terms of the $L^\mu{}_\nu$ as $(Lv)^\mu = L^\mu{}_\nu v^\nu$. In case $L$ is an orthogonal transformation, the constants $L^\mu{}_\nu$ satisfy a special property, namely,

$$\eta_{\mu\nu} = \eta(Le_\mu, Le_\nu) = \eta(L^\rho{}_\mu e_\rho, L^\sigma{}_\nu e_\sigma) = L^\rho{}_\mu L^\sigma{}_\nu\, \eta(e_\rho, e_\sigma) = L^\rho{}_\mu L^\sigma{}_\nu\, \eta_{\rho\sigma} \tag{2.5}$$

or, equivalently,

$$\eta^{\mu\nu} = L^\mu{}_\rho L^\nu{}_\sigma\, \eta^{\rho\sigma}. \tag{2.6}$$

Conversely, it is also true that if $L : M \to M$ is a linear map such that the constants $L^\mu{}_\nu = (Le_\nu)^\mu$ (with $\{e_\mu\}$ an orthonormal basis of $M$) satisfy (2.5), then $L : M \to M$ is an orthogonal transformation. Thus, the identity above gives a characterization of orthogonal transformations on Minkowski space $M$ in terms of the constants $L^\mu{}_\nu = (Le_\nu)^\mu$. Of course, the $L^\mu{}_\nu$ are nothing else than the matrix coefficients of $L$ with respect to the orthonormal basis $\{e_\mu\}$, where $\mu$ denotes the row index of the matrix $[L]$ of $L$ and $\nu$ denotes the column index. When we define the matrix $[\eta]$ by $[\eta]_{\mu\nu} = \eta_{\mu\nu}$, we can write (2.5) in matrix form as

$$[L]^T [\eta] [L] = [\eta].$$

From this matrix identity, the following proposition follows immediately by taking the determinant on both sides.

Proposition 2.7 Let $L : M \to M$ be an orthogonal transformation and let $[L]$ be its matrix with respect to some orthonormal basis $\{e_\mu\}$ of $M$. Then $\det([L]) = \pm 1$.

An orthogonal transformation for which the determinant of its matrix is $+1$ (or $-1$) is called proper (or improper).
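The matrix identity $[L]^T[\eta][L] = [\eta]$ and the resulting determinant $\pm 1$ are easy to check numerically. The sketch below uses plain nested lists; the helper names and the three sample matrices (a boost with $v = 0.6$, a spatial reflection, and a shear that is not orthogonal) are chosen here for illustration.

```python
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def det(a):
    """Determinant by Laplace expansion along the first row (adequate for 4 x 4)."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))

def is_lorentz(m, tol=1e-12):
    """Check the identity [L]^T [eta] [L] = [eta] entrywise, up to a tolerance."""
    lhs = matmul(matmul(transpose(m), ETA), m)
    return all(abs(lhs[i][j] - ETA[i][j]) <= tol for i in range(4) for j in range(4))

boost = [[1.25, -0.75, 0, 0], [-0.75, 1.25, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]  # v = 0.6
reflection = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
shear = [[1, 1, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]  # not orthogonal
```

Here `boost` is proper, `reflection` is improper, and `shear` fails the test even though its determinant is 1, which shows that the determinant condition alone does not characterize orthogonal transformations.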

Because (1) the composition of two orthogonal transformations is again an orthogonal transformation, (2) the composition of linear maps on $M$ is associative, (3) the identity map is orthogonal and (4) the inverse⁴ of an orthogonal transformation is again an orthogonal transformation, it follows that the orthogonal transformations on $M$ form a group under the composition of maps. This group is called the Lorentz group and is denoted by $\mathcal{L}$. Because the determinant is multiplicative, the set of proper elements in $\mathcal{L}$ forms a subgroup of $\mathcal{L}$. It is called the proper Lorentz group and is denoted by $\mathcal{L}_+$. In practice, we will often identify the Lorentz group $\mathcal{L}$ with the $4 \times 4$-matrices $\Lambda$ satisfying $\Lambda^T[\eta]\Lambda = [\eta]$; the proper Lorentz group $\mathcal{L}_+$ is then identified with the set of those elements $\Lambda \in \mathcal{L}$ that also satisfy $\det(\Lambda) = 1$. Apart from the determinant being either $+1$ or $-1$, the elements of $\mathcal{L}$ have another important property:

Proposition 2.8 Let $L : M \to M$ be an element of $\mathcal{L}$. Then either $L^0{}_0 \geq 1$ or $L^0{}_0 \leq -1$.

Proof
Substitution of $\mu = \nu = 0$ in (2.5) gives $(L^0{}_0)^2 - \sum_{k=1}^{3} (L^k{}_0)^2 = 1$, or $(L^0{}_0)^2 = 1 + \sum_{k=1}^{3} (L^k{}_0)^2 \geq 1$, so that $|L^0{}_0| \geq 1$.

³ Here we use our convention that orthonormal bases are always labeled such that $\eta(e_0, e_0) = +1$.
⁴ Note that it follows from the identity $[L]^T[\eta][L] = [\eta]$ that the inverse matrix $[L]^{-1} = [L^{-1}]$ of $[L]$ can be expressed as $[L]^{-1} = [\eta][L]^T[\eta]$.


An element $L \in \mathcal{L}$ is called orthochronous (or nonorthochronous) if $L^0{}_0 \geq 1$ (or $L^0{}_0 \leq -1$). To understand the properties of orthochronous elements of $\mathcal{L}$, we first need to define the notion of a past and future. For this, we need the following theorem.

Theorem 2.9 Suppose that $v \in M$ is timelike and $w \in M$ is either timelike or else a nonzero null vector. Let $\{e_\mu\}$ be an orthonormal basis for $M$ and write $v = v^\mu e_\mu$ and $w = w^\mu e_\mu$. Then either
(a) $v^0 w^0 > 0$, in which case $\eta(v, w) > 0$, or
(b) $v^0 w^0 < 0$, in which case $\eta(v, w) < 0$.
In particular, $v^0 w^0 \neq 0$ and $\eta(v, w) \neq 0$.

Proof
By assumption, $(v^0)^2 - (v^1)^2 - (v^2)^2 - (v^3)^2 = \eta(v, v) > 0$ and $(w^0)^2 - (w^1)^2 - (w^2)^2 - (w^3)^2 = \eta(w, w) \geq 0$, so

$$(v^0 w^0)^2 = (v^0)^2 (w^0)^2 > \left((v^1)^2 + (v^2)^2 + (v^3)^2\right)\left((w^1)^2 + (w^2)^2 + (w^3)^2\right) \geq (v^1 w^1 + v^2 w^2 + v^3 w^3)^2,$$

where we have used the Cauchy-Schwarz inequality for $\mathbb{R}^3$. Thus, we find that

$$\left|v^0 w^0\right| > \left|v^1 w^1 + v^2 w^2 + v^3 w^3\right|,$$

so in particular $v^0 w^0 \neq 0$ and, moreover, $\eta(v, w) \neq 0$. Suppose that $v^0 w^0 > 0$. Then $v^0 w^0 = \left|v^0 w^0\right| > \left|v^1 w^1 + v^2 w^2 + v^3 w^3\right| \geq v^1 w^1 + v^2 w^2 + v^3 w^3$ and so $v^0 w^0 - v^1 w^1 - v^2 w^2 - v^3 w^3 > 0$, i.e. $\eta(v, w) > 0$. On the other hand, if $v^0 w^0 < 0$, then $\eta(v, -w) > 0$, so $\eta(v, w) < 0$.

Corollary 2.10 If a nonzero vector in $M$ is orthogonal to a timelike vector, then it must be spacelike.

We denote by $\tau$ the collection of all timelike vectors in $M$ and define a relation $\sim$ on $\tau$ by $v \sim w$ if and only if $\eta(v, w) > 0$ (so that, by theorem 2.9, $v^0$ and $w^0$ have the same sign in any orthonormal basis). The relation $\sim$ on $\tau$ is an equivalence relation with exactly two equivalence classes. We denote the two equivalence classes of $\sim$ on $\tau$ (in an arbitrary way) by $\tau^+$ and $\tau^-$ and refer to the elements of $\tau^+$ as future-directed timelike vectors, whereas we refer to the elements of $\tau^-$ as past-directed. Then, given some orthonormal basis $\{e_\mu\}$, we have that either $v^0 < 0$ for all $v \in \tau^+$ (and $v^0 > 0$ for all $v \in \tau^-$) or else that $v^0 > 0$ for all $v \in \tau^+$ (and $v^0 < 0$ for all $v \in \tau^-$). This follows immediately by considering the equivalence class of the timelike vector $e_0$. Clearly, both $\tau^+$ and $\tau^-$ are cones, which means that if $v, w \in \tau^\pm$ and $r > 0$, then $rv \in \tau^\pm$ and $v + w \in \tau^\pm$.

Now that we have obtained the notion of past- and future-directed timelike vectors, we will define past- and future-directed null vectors. For this we need the following lemma.

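Lemma 2.11 If $n \in M$ is a nonzero null vector, then $n \cdot v$ has the same sign for all $v \in \tau^+$.

Proof
By theorem 2.9 we have $n \cdot v \neq 0$ for all $v \in \tau^+$, so it suffices to exclude opposite signs. Suppose that $v_1, v_2 \in \tau^+$ with $n \cdot v_1 < 0$ and $n \cdot v_2 > 0$. Define $v_1' := \frac{n \cdot v_2}{|n \cdot v_1|} v_1$. Because $n \cdot v_2 > 0$, it follows from the fact that $\tau^+$ is a cone that $v_1' \in \tau^+$, and we have $n \cdot v_1' = \frac{n \cdot v_2}{|n \cdot v_1|}\, n \cdot v_1 = \frac{n \cdot v_1}{|n \cdot v_1|}\, n \cdot v_2 = -n \cdot v_2$. Thus $0 = n \cdot v_1' + n \cdot v_2 = n \cdot (v_1' + v_2)$. But again using the fact that $\tau^+$ is a cone, $v_1' + v_2 \in \tau^+$; in particular $v_1' + v_2$ is timelike. Since $n$ is nonzero and null this contradicts corollary 2.10.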

From this lemma it follows that the following definition makes sense.

Definition 2.12 Let $n \in M$ be a nonzero null vector. Then $n$ is called future-directed if $n \cdot v > 0$ for all $v \in \tau^+$ and past-directed if $n \cdot v < 0$ for all $v \in \tau^+$.

Proposition 2.13 Let $n_1, n_2 \in M$ be two nonzero null vectors. Then $n_1$ and $n_2$ have the same time orientation (i.e. future-directed or past-directed) if and only if $(n_1)^0$ has the same sign as $(n_2)^0$ relative to any orthonormal basis for $M$.

Proof
Suppose that $(n_1)^0$ and $(n_2)^0$ have the same sign with respect to any orthonormal basis. Choose an arbitrary orthonormal basis $\{e_\mu\}$ and let $\lambda \in \{-1, 1\}$ be such that $v := \lambda e_0 \in \tau^+$. Then the two inner products $v \cdot n_1 = \lambda (n_1)^0$ and $v \cdot n_2 = \lambda (n_2)^0$ have the same sign by assumption. By the previous lemma, this is not only true for $v$, but for all vectors in $\tau^+$. Thus, $n_1$ and $n_2$ have the same time orientation. For the converse statement, assume that $n_1$ and $n_2$ have the same time orientation and let $\{e_\mu\}$ be an orthonormal basis. Again we let $\lambda \in \{-1, 1\}$ be such that $\lambda e_0 \in \tau^+$. Because $n_1$ and $n_2$ have the same time orientation, $\lambda e_0 \cdot n_1$ and $\lambda e_0 \cdot n_2$ have the same sign, which implies that $(n_1)^0$ and $(n_2)^0$ have the same sign.

Definition 2.14 In the affine space $M$ we define the future light cone at a point $x \in M$ by

$$C_x^+ = \{y \in C_x : y - x \text{ is future-directed}\}$$

and the past light cone by

$$C_x^- = \{y \in C_x : y - x \text{ is past-directed}\}.$$

Now that we have introduced past- and future-directed timelike and null vectors, we can interpret the orthochronous elements of $\mathcal{L}$. This interpretation is given in the following theorem.

Theorem 2.15 Let $L \in \mathcal{L}$ and let $\{e_\mu\}$ be an orthonormal basis for $M$. Then the following are equivalent:
(a) $L$ is orthochronous.
(b) $L$ preserves the time orientation of all nonzero null vectors.
(c) $L$ preserves the time orientation of all timelike vectors.

Before we can prove the theorem, we first need a little fact. Let $L \in \mathcal{L}$. Substituting $\mu = \nu = 0$ in (2.6) gives $(L^0{}_0)^2 - \sum_{k=1}^{3} (L^0{}_k)^2 = 1$ and so $(L^0{}_0)^2 > \sum_{k=1}^{3} (L^0{}_k)^2$. Now let $v = v^\mu e_\mu \in M$ be either timelike or else null and nonzero, so $(v^0)^2 \geq \sum_{k=1}^{3} (v^k)^2$ (note that $v^0 \neq 0$, since otherwise $v = 0$). Using these two inequalities and the Cauchy-Schwarz inequality for $\mathbb{R}^3$ (and the fact that $v^0 \neq 0$), we get

$$\left(\sum_{k=1}^{3} L^0{}_k v^k\right)^2 \leq \left(\sum_{k=1}^{3} (L^0{}_k)^2\right)\left(\sum_{k=1}^{3} (v^k)^2\right) < (L^0{}_0)^2 (v^0)^2 = (L^0{}_0 v^0)^2.$$

We may rewrite this as

$$\begin{aligned}
0 &> \left(\sum_{k=1}^{3} L^0{}_k v^k\right)^2 - (L^0{}_0 v^0)^2 = \left(\sum_{k=1}^{3} L^0{}_k v^k - L^0{}_0 v^0\right)\left(\sum_{k=1}^{3} L^0{}_k v^k + L^0{}_0 v^0\right) \\
&= -\left[\left(\sum_{k=0}^{3} L^0{}_k e_k\right) \cdot v\right] L^0{}_\mu v^\mu = -(w \cdot v)(Lv)^0,
\end{aligned}$$

where we have defined the timelike vector $w = \sum_{k=0}^{3} L^0{}_k e_k$. Thus $(w \cdot v)(Lv)^0 > 0$, so we conclude that $w \cdot v$ and $(Lv)^0$ have the same sign. To summarize, if $v \in M$ is either timelike or else null and nonzero and if $L \in \mathcal{L}$, then $(Lv)^0$ and $w \cdot v$ (with $w$ the timelike vector defined above) have the same sign. We will use this fact to prove the theorem.

Proof
Let $v \in M$ be again a timelike or nonzero null vector and let $w$ be the timelike vector defined above.

Assume $L^0{}_0 \geq 1$ ($L$ orthochronous). We separate two cases. In case $v^0 > 0$ we have $w^0 v^0 = L^0{}_0 v^0 > 0$, so by theorem 2.9 we have $v \cdot w > 0$. Thus $(Lv)^0 > 0$, by the discussion above. In case $v^0 < 0$ we have $w^0 v^0 = L^0{}_0 v^0 < 0$, so by theorem 2.9 we have $v \cdot w < 0$. Thus $(Lv)^0 < 0$, by the discussion above. So we conclude that if $L^0{}_0 \geq 1$, then $v^0$ and $(Lv)^0$ always have the same sign, i.e. we have proved that (a) implies (b) and that (a) implies (c).

Assume $L^0{}_0 \leq -1$ ($L$ nonorthochronous). We separate two cases. In case $v^0 > 0$ we have $w^0 v^0 = L^0{}_0 v^0 < 0$, so by theorem 2.9 we have $v \cdot w < 0$. Thus $(Lv)^0 < 0$, by the discussion above. In case $v^0 < 0$ we have $w^0 v^0 = L^0{}_0 v^0 > 0$, so by theorem 2.9 we have $v \cdot w > 0$. Thus $(Lv)^0 > 0$, by the discussion above. So we conclude that if $L^0{}_0 \leq -1$, then $v^0$ and $(Lv)^0$ always have opposite signs, i.e. we have proved that (b) implies (a) and that (c) implies (a).

Corollary 2.16 If $L \in \mathcal{L}$ is nonorthochronous, it reverses the time orientation of all timelike and nonzero null vectors.

It is now clear how to interpret the orthochronous elements of $\mathcal{L}$: they are precisely those elements of $\mathcal{L}$ that preserve the causal structure of Minkowski space. Now suppose that $L_1, L_2 \in \mathcal{L}$ are both orthochronous. If $v \in M$ is a timelike or nonzero null vector that is future-directed (or past-directed), then by theorem 2.15 the vector $w = L_1 v$ is also a future-directed (or past-directed) timelike or nonzero null vector in $M$, and by the same argument, so is $L_2 w = L_2 L_1 v$. Thus, the element $L_2 L_1 \in \mathcal{L}$ preserves the time orientation of all timelike and all nonzero null vectors. Using theorem 2.15 again, we conclude that $L_2 L_1 \in \mathcal{L}$ is orthochronous. Furthermore, it is clear that the identity map $I : M \to M$ is orthochronous since $I^0{}_0 = 1$. Finally, if $L \in \mathcal{L}$ is orthochronous, and $L^{-1} \in \mathcal{L}$ would be nonorthochronous, then $I = L^{-1}L$ would reverse the time orientation of all timelike and all nonzero null vectors. This shows that $L^{-1}$ must in fact be orthochronous whenever $L$ is orthochronous. Thus, the set of orthochronous elements of $\mathcal{L}$ forms a subgroup $\mathcal{L}^\uparrow$ of $\mathcal{L}$ and is called the orthochronous Lorentz group. The intersection $\mathcal{L}_+^\uparrow := \mathcal{L}_+ \cap \mathcal{L}^\uparrow$ of the two subgroups $\mathcal{L}_+$ and $\mathcal{L}^\uparrow$ of $\mathcal{L}$ is again a subgroup of $\mathcal{L}$ and is called the restricted Lorentz group. We also define the subsets $\mathcal{L}_-$ and $\mathcal{L}^\downarrow$, consisting of the Lorentz transformations $L$ with $\det(L) = -1$ and $L^0{}_0 \leq -1$, respectively. Of course these subsets cannot be subgroups of $\mathcal{L}$ since they do not contain the identity element.

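The Lie group structure of the Lorentz group

The Lorentz group $\mathcal{L}$ can be viewed as a subgroup of the matrix Lie group $GL(4, \mathbb{R})$, and it can in fact be shown that $\mathcal{L}$ is itself a six-dimensional matrix Lie group. It has four connected components, namely $\mathcal{L}_+^\uparrow$, $\mathcal{L}_+^\downarrow := \mathcal{L}_+ \cap \mathcal{L}^\downarrow$, $\mathcal{L}_-^\uparrow := \mathcal{L}_- \cap \mathcal{L}^\uparrow$ and $\mathcal{L}_-^\downarrow := \mathcal{L}_- \cap \mathcal{L}^\downarrow$. Typical elements of $\mathcal{L}_-^\uparrow$, $\mathcal{L}_-^\downarrow$ and $\mathcal{L}_+^\downarrow$ are the space-inversion $I_s$, time-reversal $I_t$ and spacetime-inversion $I_{st} = I_s I_t$, respectively, where $I_s$ is defined by $I_s(x^0, x^1, x^2, x^3) = (x^0, -x^1, -x^2, -x^3)$ and $I_t$ is defined by $I_t(x^0, x^1, x^2, x^3) = (-x^0, x^1, x^2, x^3)$. The transformation $I_s$ defines a bijection $\mathcal{L}_-^\uparrow \to \mathcal{L}_+^\uparrow$ by $L \mapsto I_s L$. Similarly, $I_t$ defines a bijection $\mathcal{L}_-^\downarrow \to \mathcal{L}_+^\uparrow$ and $I_{st}$ defines a bijection $\mathcal{L}_+^\downarrow \to \mathcal{L}_+^\uparrow$. In other words, the orthochronous Lorentz group $\mathcal{L}^\uparrow = \mathcal{L}_+^\uparrow \cup \mathcal{L}_-^\uparrow$ is generated by $\mathcal{L}_+^\uparrow \cup \{I_s\}$ and the proper Lorentz group $\mathcal{L}_+ = \mathcal{L}_+^\uparrow \cup \mathcal{L}_+^\downarrow$ is generated by $\mathcal{L}_+^\uparrow \cup \{I_{st}\}$. It also follows that the subgroup $\mathcal{L}_0 := \mathcal{L}_+^\uparrow \cup \mathcal{L}_-^\downarrow$, called the orthochorous Lorentz group, is generated by $\mathcal{L}_+^\uparrow \cup \{I_t\}$.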
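The four components can be told apart by two signs: the sign of the determinant and the sign of the 00-entry. The following Python sketch labels a given Lorentz matrix accordingly. The labels `"+up"`, `"+down"`, `"-up"` and `"-down"` and all function names are conventions invented for this sketch, and the input is assumed to satisfy the Lorentz condition already.

```python
def matmul(a, b):
    """Product of two 4 x 4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def det(a):
    """Determinant by Laplace expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))

def diagonal(*entries):
    return [[entries[i] if i == j else 0 for j in range(4)] for i in range(4)]

def component(m):
    """Label the component of a Lorentz matrix m by sign(det m) and the sign of its 00-entry.

    Assumes m satisfies the Lorentz condition, so det m = +1 or -1 and |m[0][0]| >= 1.
    """
    sign = "+" if det(m) > 0 else "-"
    arrow = "up" if m[0][0] >= 1 else "down"
    return sign + arrow

IDENTITY = diagonal(1, 1, 1, 1)
I_S = diagonal(1, -1, -1, -1)   # space inversion
I_T = diagonal(-1, 1, 1, 1)     # time reversal
I_ST = matmul(I_S, I_T)         # spacetime inversion
```

Multiplying each of `I_S`, `I_T` and `I_ST` by itself lands in the component labelled `"+up"`, in line with the bijections onto the restricted Lorentz group described above.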

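The Lie algebra $\mathfrak{l}$ of the Lorentz group $\mathcal{L}$ is by definition the set of all transformations $X : M \to M$ such that $e^{tX} \in \mathcal{L}$ for all $t \in \mathbb{R}$, or, in terms of matrices, the set of all $4 \times 4$-matrices such that $[\eta] = (e^{tX})^T [\eta] e^{tX}$, which is equivalent to $(e^{tX})^{-1} = [\eta]^{-1} (e^{tX})^T [\eta] = [\eta] (e^{tX})^T [\eta]$. But $(e^{tX})^T = e^{tX^T}$ and $(e^{tX})^{-1} = e^{-tX}$, so a $4 \times 4$-matrix $X$ is in $\mathfrak{l}$ if and only if

$$e^{-tX} = [\eta]\, e^{tX^T} [\eta] = e^{t[\eta]X^T[\eta]},$$

where we have used that for each $4 \times 4$-matrix $M$ and each $4 \times 4$-matrix $G$ satisfying $G^2 = I$ we have $e^{tGMG} = \sum_{k=0}^{\infty} \frac{t^k (GMG)^k}{k!} = \sum_{k=0}^{\infty} \frac{t^k G M^k G}{k!} = G\left(\sum_{k=0}^{\infty} \frac{t^k M^k}{k!}\right)G = G e^{tM} G$. Thus, $X \in \mathfrak{l}$ if and only if $[\eta]X^T[\eta] = -X$, or $X^T[\eta] + [\eta]X = 0$. We have thus found that

$$\mathfrak{l} = \{X \in M_4(\mathbb{R}) : X^T[\eta] + [\eta]X = 0\}.$$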
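The characterization of $\mathfrak{l}$ can be tested on a concrete generator. The Python sketch below checks the condition $X^T[\eta] + [\eta]X = 0$ for the symmetric matrix with ones in the $(0,3)$ and $(3,0)$ positions and then exponentiates it with a truncated power series. The helper names, the truncation at 40 terms and the parameter $t = 0.7$ are assumptions made for this illustration.

```python
import math

ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def in_lie_algebra(x, tol=1e-12):
    """Check X^T [eta] + [eta] X = 0 entrywise."""
    left, right = matmul(transpose(x), ETA), matmul(ETA, x)
    return all(abs(left[i][j] + right[i][j]) <= tol for i in range(4) for j in range(4))

def is_lorentz(m, tol=1e-12):
    """Check [L]^T [eta] [L] = [eta] entrywise."""
    lhs = matmul(matmul(transpose(m), ETA), m)
    return all(abs(lhs[i][j] - ETA[i][j]) <= tol for i in range(4) for j in range(4))

def expm(x, t, terms=40):
    """Truncated power series for exp(tX); adequate for small 4 x 4 examples."""
    result = [[float(i == j) for j in range(4)] for i in range(4)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term now holds t^k X^k / k!
        term = [[t * entry / k for entry in row] for row in matmul(term, x)]
        result = [[result[i][j] + term[i][j] for j in range(4)] for i in range(4)]
    return result

BOOST_GENERATOR = [[0, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0], [1, 0, 0, 0]]
group_element = expm(BOOST_GENERATOR, 0.7)
```

The resulting `group_element` satisfies the Lorentz condition up to rounding, and its 00-entry agrees with $\cosh(0.7)$, as expected for a boost.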


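Now let $\{e_{\mu\nu}\}$ be the standard basis of $M_4(\mathbb{R})$ (i.e. $(e_{\mu\nu})_{\rho\sigma} = 1$ if $(\rho, \sigma) = (\mu, \nu)$ and $(e_{\mu\nu})_{\rho\sigma} = 0$ for other values of $\rho$ and $\sigma$) and define for $j, k = 1, 2, 3$ the matrices

$$X_{jk} = e_{kj} - e_{jk}, \qquad X_{k0} = -X_{0k} = e_{k0} + e_{0k}.$$

The basis matrices $e_{\mu\nu}$ of $M_4(\mathbb{R})$ cannot be defined covariantly with the lower indices $\mu$ and $\nu$ acting as Lorentz indices, by which we mean that for $L \in \mathcal{L}$ the matrices $e'_{\rho\sigma} := L^\mu{}_\rho L^\nu{}_\sigma e_{\mu\nu}$ are not the same as the matrices $e_{\mu\nu}$ defined above⁵. However, if we define $X_{00} = 0$ then the matrices $X_{00}$, $X_{k0}$ and $X_{jk}$ can be given by the covariant expression $(X_{\mu\nu})^\alpha{}_\beta = \delta^\alpha_\mu \eta_{\nu\beta} - \delta^\alpha_\nu \eta_{\mu\beta}$, which is antisymmetric in $\mu$ and $\nu$. The matrices $X_{\mu\nu}$ satisfy the commutation relations

$$[X_{\mu\nu}, X_{\rho\sigma}] = \eta_{\mu\rho} X_{\sigma\nu} + \eta_{\nu\rho} X_{\mu\sigma} + \eta_{\nu\sigma} X_{\rho\mu} + \eta_{\mu\sigma} X_{\nu\rho}, \tag{2.7}$$

which easily follow from writing out $[X_{\mu\nu}, X_{\rho\sigma}]^\alpha{}_\beta$ by using the covariant expression for $(X_{\mu\nu})^\alpha{}_\beta$ above. In other words, we have $[X_{\mu\nu}, X_{\rho\sigma}] = 0$ whenever the sets $\{\mu, \nu\}$ and $\{\rho, \sigma\}$ are either equal or disjoint, and we have $[X_{\mu\nu}, X_{\nu\sigma}] = \eta_{\nu\nu} X_{\mu\sigma}$ (no summation over $\nu$) if $\mu \neq \nu$ and $\nu \neq \sigma$. The matrices $\{X_{\mu\nu}\}_{\mu<\nu}$ form a basis of $\mathfrak{l}$ and their commutation relations are obtained from (2.7) by replacing $X_{\kappa\lambda} \to -X_{\lambda\kappa}$ on the right-hand side whenever $\kappa > \lambda$. We mention furthermore that the elements of the form $e^{tX_{\mu\nu}}$ generate $\mathcal{L}_+^\uparrow$.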
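The commutation relations (2.7) can be verified by brute force over all index combinations. The Python sketch below builds the matrices from the covariant expression $(X_{\mu\nu})^\alpha{}_\beta = \delta^\alpha_\mu \eta_{\nu\beta} - \delta^\alpha_\nu \eta_{\mu\beta}$ and compares both sides of (2.7) in exact integer arithmetic. The function names are hypothetical.

```python
from itertools import product

ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def X(mu, nu):
    """Matrix with entries delta^a_mu eta_{nu b} - delta^a_nu eta_{mu b} (row a, column b)."""
    return [[int(a == mu) * ETA[nu][b] - int(a == nu) * ETA[mu][b] for b in range(4)]
            for a in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(4)] for i in range(4)]

def rhs(mu, nu, rho, sigma):
    """Right-hand side of (2.7) as a 4 x 4 matrix."""
    terms = [
        (ETA[mu][rho], X(sigma, nu)),
        (ETA[nu][rho], X(mu, sigma)),
        (ETA[nu][sigma], X(rho, mu)),
        (ETA[mu][sigma], X(nu, rho)),
    ]
    return [[sum(c * m[i][j] for c, m in terms) for j in range(4)] for i in range(4)]

relations_hold = all(
    commutator(X(m, n), X(r, s)) == rhs(m, n, r, s)
    for m, n, r, s in product(range(4), repeat=4)
)
```

The same construction reproduces the explicit matrices: `X(1, 2)` has the entry $+1$ in row 2, column 1 and $-1$ in row 1, column 2, i.e. it equals $e_{21} - e_{12}$.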

We will now focus on the restricted Lorentz group $\mathcal{L}_+^\uparrow$. The restricted Lorentz group is a connected six-dimensional Lie group that is not simply connected, i.e. not every closed path in this group can be contracted continuously to a point. According to the theory of Lie groups, there exists for each connected Lie group $G$ a simply-connected Lie group $\widetilde{G}$ together with a Lie group homomorphism⁶ $\Phi : \widetilde{G} \to G$ such that the associated Lie algebra homomorphism⁷ $\phi : \widetilde{\mathfrak{g}} \to \mathfrak{g}$ is a Lie algebra isomorphism. The Lie group $\widetilde{G}$ is called a universal covering group of $G$ and the homomorphism $\Phi$ is called the covering homomorphism. The universal covering group is unique in the following sense: if $(\widetilde{G}_1, \Phi_1)$ and $(\widetilde{G}_2, \Phi_2)$ are universal covers of $G$ then there exists a Lie group isomorphism $\Psi : \widetilde{G}_1 \to \widetilde{G}_2$ such that $\Phi_2 \circ \Psi = \Phi_1$. The universal covering group $\widetilde{\mathcal{L}_+^\uparrow}$ of $\mathcal{L}_+^\uparrow$ is $SL(2, \mathbb{C})$, the group of all $2 \times 2$ complex matrices with unit determinant. The covering homomorphism $\Phi : SL(2, \mathbb{C}) \to \mathcal{L}_+^\uparrow$ can be obtained as follows. Let $H(2, \mathbb{C})$ be the set of $2 \times 2$ complex Hermitian matrices and, given an orthonormal basis $\{e_\mu\}$ of $M$, define the $\mathbb{R}$-linear isomorphism $\psi : M \to H(2, \mathbb{C})$ by

$$\psi(x) = \sum_{\mu=0}^{3} x^\mu \sigma_\mu = \begin{pmatrix} x^0 + x^3 & x^1 - i x^2 \\ x^1 + i x^2 & x^0 - x^3 \end{pmatrix},$$

where $\sigma_0$ is the $2 \times 2$ identity matrix and the $\sigma_j$ with $j = 1, 2, 3$ are the Pauli matrices

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

Note that this definition of $\psi$ depends on the choice of the orthonormal basis $\{e_\mu\}$ of $M$; furthermore, once this choice of basis has been made (so that $\psi$ is determined), we cannot use the same formula to compute $\psi(x)$ with respect to another orthonormal basis because the $\sigma_\mu$ are not supposed to transform in any way. The important property of $\psi$ is that

$$\det(\psi(x)) = (x^0)^2 - \sum_{k=1}^{3} (x^k)^2 = x \cdot x.$$
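The determinant identity for $\psi$ can be checked directly. In the Python sketch below, the function names and the sample vector are hypothetical, and $2 \times 2$ matrices are stored as nested lists of (complex) numbers.

```python
def psi(x):
    """Hermitian matrix x^mu sigma_mu associated with the components x = (x0, x1, x2, x3)."""
    x0, x1, x2, x3 = x
    return [[x0 + x3, x1 - 1j * x2], [x1 + 1j * x2, x0 - x3]]

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def minkowski_norm(x):
    """x . x = (x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2."""
    return x[0] ** 2 - x[1] ** 2 - x[2] ** 2 - x[3] ** 2

x = (3, 1, 2, 1)
h = psi(x)
```

For this sample vector both `det2(h)` and `minkowski_norm(x)` evaluate to 3, and the off-diagonal entries of `h` are complex conjugates of each other, so `h` is Hermitian.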

⁵ The $e_{\mu\nu}$ are given by $(e_{\mu\nu})_{\alpha\beta} = \delta_{\mu\alpha}\delta_{\nu\beta}$, where $\delta_{\mu\nu}$ is the (non-covariant) Kronecker delta with lower indices. Note that it would in fact be possible to define a standard basis $e^\mu{}_\nu$ by using the covariant Kronecker deltas $\delta^\mu_\nu$.
⁶ This is a smooth map of Lie groups that is also a group homomorphism.
⁷ This is the unique map $\phi : \widetilde{\mathfrak{g}} \to \mathfrak{g}$ satisfying $\Phi(e^X) = e^{\phi(X)}$ for all $X \in \widetilde{\mathfrak{g}}$. The proof of the existence and uniqueness of such a map (and that such a map is a Lie algebra homomorphism, i.e. $\phi([X, Y]_{\widetilde{\mathfrak{g}}}) = [\phi(X), \phi(Y)]_{\mathfrak{g}}$ for all $X, Y \in \widetilde{\mathfrak{g}}$) can be found in [22], theorem 2.21.


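If $A \in SL(2, \mathbb{C})$ and $X \in H(2, \mathbb{C})$, then $(AXA^*)^* = (A^*)^* X^* A^* = AXA^*$, so $AXA^* \in H(2, \mathbb{C})$. Also, $\det(AXA^*) = \det(A)\det(X)\det(A^*) = \det(X)$, since for all $A \in SL(2, \mathbb{C})$ we have $\det(A) = \det(A^*) = 1$. Thus, each element $A \in SL(2, \mathbb{C})$ defines a map $\Psi_A : H(2, \mathbb{C}) \to H(2, \mathbb{C})$ by

$$\Psi_A(X) = AXA^*$$

that preserves the determinant. Under the correspondence $\psi : M \to H(2, \mathbb{C})$, this determinant preserving map on $H(2, \mathbb{C})$ corresponds to a norm preserving linear map $\Phi(A) := \psi^{-1} \circ \Psi_A \circ \psi : M \to M$ given by

$$\begin{aligned}
\Phi(A)x &= (\psi^{-1} \circ \Psi_A \circ \psi)(x^\mu e_\mu) = (\psi^{-1} \circ \Psi_A)\left(\sum_{\mu=0}^{3} x^\mu \sigma_\mu\right) = \psi^{-1}\left(A\left(\sum_{\mu=0}^{3} x^\mu \sigma_\mu\right)A^*\right) \\
&= \psi^{-1}\left(\sum_{\mu=0}^{3} x^\mu A\sigma_\mu A^*\right) = \frac{1}{2}\sum_{\nu=0}^{3} \mathrm{Tr}\left(\left(\sum_{\mu=0}^{3} x^\mu A\sigma_\mu A^*\right)\sigma_\nu\right) e_\nu \\
&= \frac{1}{2}\sum_{\nu=0}^{3}\sum_{\mu=0}^{3} x^\mu\, \mathrm{Tr}\left(A\sigma_\mu A^*\sigma_\nu\right) e_\nu,
\end{aligned}$$

where we have used that the inverse of the $\mathbb{R}$-linear isomorphism $\psi : M \to H(2, \mathbb{C})$ is given by $\psi^{-1} : X \mapsto \frac{1}{2}\sum_{\mu=0}^{3} \mathrm{Tr}(X\sigma_\mu) e_\mu$. Since $\Phi(A)$ is linear and norm preserving, it is an orthogonal transformation by lemma 2.6. Thus we obtain a map $\Phi : SL(2, \mathbb{C}) \to \mathcal{L}$, where

$$\Phi(A)^\mu{}_\nu = \frac{1}{2}\mathrm{Tr}(A\sigma_\nu A^*\sigma_\mu).$$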

Note that the different index placement on both sides reflects the fact that the map $\Phi$ is not defined in a covariant way. We can rewrite the equation $\Phi(A)x = (\psi^{-1} \circ \Psi_A \circ \psi)(x) = \psi^{-1}(A\psi(x)A^*)$ as

$$\psi(\Phi(A)x) = A\psi(x)A^*.$$

Using this equation, we find that for $A, B \in SL(2, \mathbb{C})$ we have for each $x \in M$

$$\psi(\Phi(AB)x) = AB\psi(x)B^*A^* = A\psi(\Phi(B)x)A^* = \psi(\Phi(A)\Phi(B)x).$$

Using the invertibility of $\psi$, we conclude that for each $x \in M$ we have $\Phi(AB)x = \Phi(A)\Phi(B)x$, and thus that $\Phi(AB) = \Phi(A)\Phi(B)$. So $\Phi : SL(2, \mathbb{C}) \to \mathcal{L}$ is a group homomorphism. It is also clear from the formula for $\Phi(A)^\mu{}_\nu$ that this map is smooth, so $\Phi$ is in fact a Lie group homomorphism. In particular, since $\Phi$ is continuous and $SL(2, \mathbb{C})$ is a connected (even simply connected) Lie group, the image $\Phi(SL(2, \mathbb{C})) \subset \mathcal{L}$ must be connected. Because the identity of $\mathcal{L}$ is contained in this image, the image must lie in the connected component of the identity, i.e. $\Phi(SL(2, \mathbb{C})) \subset \mathcal{L}_+^\uparrow$.
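The trace formula for $\Phi(A)^\mu{}_\nu$ lends itself to a direct numerical check of the homomorphism property. The following Python sketch computes $\Phi(A)$ for two sample matrices of determinant 1 and compares $\Phi(AB)$ with $\Phi(A)\Phi(B)$. The sample matrices `A` and `B`, the helper names and the tolerance are assumptions made for this illustration.

```python
SIGMA = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ETA = (1, -1, -1, -1)

def mul(a, b):
    """Product of two square matrices of equal size, given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def adjoint(a):
    """Conjugate transpose A*."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def phi(a):
    """Matrix with entries (1/2) Tr(A sigma_nu A* sigma_mu) in row mu, column nu."""
    a_star = adjoint(a)

    def entry(mu, nu):
        m = mul(mul(mul(a, SIGMA[nu]), a_star), SIGMA[mu])
        return 0.5 * (m[0][0] + m[1][1]).real  # the trace is real up to rounding

    return [[entry(mu, nu) for nu in range(4)] for mu in range(4)]

def close(p, q, tol=1e-9):
    return all(abs(p[i][j] - q[i][j]) <= tol for i in range(4) for j in range(4))

def preserves_eta(m, tol=1e-9):
    """Check sum_k m[k][i] * eta_k * m[k][j] = eta_{ij}, i.e. [L]^T [eta] [L] = [eta]."""
    for i in range(4):
        for j in range(4):
            total = sum(m[k][i] * ETA[k] * m[k][j] for k in range(4))
            if abs(total - (ETA[i] if i == j else 0)) > tol:
                return False
    return True

A = [[2, 1j], [0, 0.5]]     # det = 1
B = [[1, 1], [1j, 1 + 1j]]  # det = (1 + 1j) - 1j = 1
homomorphism_ok = close(phi(mul(A, B)), mul(phi(A), phi(B)))
```

Both `phi(A)` and `phi(B)` pass `preserves_eta`, and their 00-entries are at least 1, consistent with the image lying in the restricted Lorentz group.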

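The Lie group homomorphism $\Phi : SL(2, \mathbb{C}) \to \mathcal{L}_+^\uparrow$ induces a homomorphism $\phi : \mathfrak{sl}(2, \mathbb{C}) \to \mathfrak{l}$ of the associated Lie algebras. Note that because $\mathcal{L}_+^\uparrow$ is the connected component of the identity in $\mathcal{L}$, the Lie algebra associated to $\mathcal{L}_+^\uparrow$ coincides with the Lie algebra $\mathfrak{l}$ associated with $\mathcal{L}$. To see what $\phi$ is, we first need some information about $\mathfrak{sl}(2, \mathbb{C})$. The Lie algebra $\mathfrak{sl}(2, \mathbb{C})$ consists of all complex $2 \times 2$-matrices with zero trace. It can be viewed as a three-dimensional complex Lie algebra, but for our purposes it is more convenient to consider it as a six-dimensional real Lie algebra. A basis of $\mathfrak{sl}(2, \mathbb{C})$ is given by the six matrices $\{\frac{1}{2}\sigma_j, \frac{1}{2i}\sigma_j\}_{j=1,2,3}$, where the $\sigma_j$ denote the Pauli matrices. Note that the $\frac{1}{2i}\sigma_j$'s span the Lie algebra $\mathfrak{su}(2) \subset \mathfrak{sl}(2, \mathbb{C})$; they satisfy $[\frac{1}{2i}\sigma_j, \frac{1}{2i}\sigma_k] = \frac{1}{2i}\sigma_l$, where $(j, k, l)$ is a cyclic permutation of $(1, 2, 3)$. In terms of this basis for $\mathfrak{sl}(2, \mathbb{C})$, the Lie algebra homomorphism $\phi : \mathfrak{sl}(2, \mathbb{C}) \to \mathfrak{l}$ is given by

$$\phi\left(\tfrac{1}{2i}\sigma_j\right) = X_{kl} \quad \text{for } (j, k, l) \text{ a cyclic permutation of } (1, 2, 3), \qquad \phi\left(\tfrac{1}{2}\sigma_k\right) = X_{k0}.$$

For instance, $e^{\frac{t}{2}\sigma_3} = \mathrm{diag}(e^{t/2}, e^{-t/2})$ maps $x^0 \pm x^3$ to $e^{\pm t}(x^0 \pm x^3)$ and leaves $x^1$ and $x^2$ fixed, which is the one-parameter group generated by $e_{30} + e_{03} = X_{30}$. Since $\phi$ maps a basis of $\mathfrak{sl}(2, \mathbb{C})$ onto a basis of $\mathfrak{l}$, it is clear that $\phi : \mathfrak{sl}(2, \mathbb{C}) \to \mathfrak{l}$ is an isomorphism of Lie algebras, and by definition of $\phi$ we have

$$\Phi\left(e^{\frac{t}{2i}\sigma_j}\right) = e^{t\phi\left(\frac{1}{2i}\sigma_j\right)} = e^{tX_{kl}}, \qquad \Phi\left(e^{\frac{t}{2}\sigma_k}\right) = e^{t\phi\left(\frac{1}{2}\sigma_k\right)} = e^{tX_{k0}}, \tag{2.8}$$

where in the first expression $(j, k, l)$ is a cyclic permutation of $(1, 2, 3)$. Because the elements $e^{tX_{jk}}$ and $e^{tX_{k0}}$ on the right-hand sides generate $\mathcal{L}_+^\uparrow$, it follows immediately that $\Phi$ is surjective. Thus, $\Phi : SL(2, \mathbb{C}) \to \mathcal{L}_+^\uparrow$ satisfies all the right properties of the universal covering map. We note furthermore that the map $\Phi : SL(2, \mathbb{C}) \to \mathcal{L}_+^\uparrow$ is two-to-one: for each $L \in \mathcal{L}_+^\uparrow$ the inverse image $\Phi^{-1}(L)$ is a set of the form $\{A, -A\}$.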
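The two-to-one property can be illustrated by letting $A$ and $-A$ act on Hermitian matrices directly, via $\psi(\Phi(A)x) = A\psi(x)A^*$. The Python sketch below does this for the diagonal matrix $A = e^{\frac{t}{2}\sigma_3}$; the function names, the value $t = 0.8$ and the test vector are hypothetical choices for this illustration.

```python
import math

def psi(x):
    """Hermitian matrix x^mu sigma_mu."""
    x0, x1, x2, x3 = x
    return [[x0 + x3, x1 - 1j * x2], [x1 + 1j * x2, x0 - x3]]

def psi_inverse(h):
    """Read off (x0, x1, x2, x3) from a Hermitian 2 x 2 matrix h."""
    return ((h[0][0] + h[1][1]).real / 2, h[1][0].real, h[1][0].imag,
            (h[0][0] - h[1][1]).real / 2)

def mul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def adjoint(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def act(a, x):
    """Components of Phi(A)x, computed as psi^{-1}(A psi(x) A*)."""
    return psi_inverse(mul2(mul2(a, psi(x)), adjoint(a)))

t = 0.8
A = [[math.exp(t / 2), 0], [0, math.exp(-t / 2)]]  # exp(t sigma_3 / 2)
minus_A = [[-entry for entry in row] for row in A]

image_of_e0 = act(A, (1.0, 0.0, 0.0, 0.0))
x = (1.0, 2.0, -0.5, 0.3)
```

Here `image_of_e0` is $(\cosh t, 0, 0, \sinh t)$ up to rounding, i.e. a boost along the $x^3$-axis, and `act(A, x)` coincides with `act(minus_A, x)` because the two signs cancel in $A\psi(x)A^*$.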

The Poincaré group

So far we have given the definition of Minkowski spacetime as a 4-dimensional affine space and we have studied the group of transformations $L : M \to M$ with $\eta(Lv, Lw) = \eta(v, w)$, where $v$ and $w$ are elements in the vector space (and not the affine space) $M$. However, we are actually interested in transformations $P : M \to M$ of the affine space $M$ that satisfy $\eta(Px - Py, Px - Py) = \eta(x - y, x - y)$ for all $x$ and $y$ in the affine space. Such transformations are called Poincaré transformations. In order to formulate the general form of a Poincaré transformation, note that if we choose a fixed point $x_0 \in M$ in the affine space then we can write any $x$ in the affine space as $x = x_0 + (x - x_0)$, where $x - x_0$ lies in the vector space $M$. When we have agreed on such a point $x_0$, we can actually identify the affine space $M$ with the vector space $M$ by identifying a point $x$ in the affine space with the point $x - x_0$ in the vector space; note that the point $x_0$ in the affine space is then identified with the origin of the vector space. With this identification, a general Poincaré transformation can then be written as

$$P_{a,L}(x) = Lx + a,$$

with $L$ a Lorentz transformation and $a$ an element of the vector space $M$. If we take $L$ to be the identity map $1 \in \mathcal{L}$, we obtain the map $T_a(x) := P_{a,1}(x) = x + a$, which is a spacetime translation. If we take $a$ to be the zero vector, we obtain the map $P_{0,L}(x) = Lx$, which is a Lorentz transformation. A general Poincaré transformation can thus be written as the composition of a Lorentz transformation and a spacetime translation:

$$P_{a,L}(x) = Lx + a = (T_a \circ L)(x).$$

From now on we will always write $(T_a, L)$, or simply $(a, L)$, to denote the Poincaré transformation $P_{a,L}$. The composition of two Poincaré transformations $(a_1, L_1)$ and $(a_2, L_2)$ is again a Poincaré transformation and its action is given by

$$((a_1, L_1) \circ (a_2, L_2))x = (a_1, L_1)(L_2 x + a_2) = L_1(L_2 x + a_2) + a_1 = L_1 L_2 x + (L_1 a_2 + a_1),$$

so we have found the rule $(a_1, L_1) \circ (a_2, L_2) = (a_1 + L_1 a_2, L_1 L_2)$. In particular, we have for any Poincaré transformation $(a, L)$

$$(0, 1) \circ (a, L) = (a, L) \circ (0, 1) = (a, L),$$

where $1 \in \mathcal{L}$ is the identity map. Furthermore, for any Poincaré transformation $(a, L)$ we have

$$(a, L) \circ (-L^{-1}a, L^{-1}) = (-L^{-1}a, L^{-1}) \circ (a, L) = (0, 1).$$

This shows that the set of Poincaré transformations forms a group under composition of maps with multiplication given by $(a_1, L_1)(a_2, L_2) = (a_1 + L_1a_2, L_1L_2)$, unit element $(0, 1)$ and $(a, L)^{-1} = (-L^{-1}a, L^{-1})$. This group is called the Poincaré group and is denoted by $\mathcal{P}$. From the considerations above, the group $\mathcal{P}$ is a semi-direct product of the additive group $\mathbb{R}^4$ and the Lorentz group $\mathcal{L}$. We obtain the subgroups $\mathcal{P}_+^\uparrow$, $\mathcal{P}_+$ and $\mathcal{P}^\uparrow$ of $\mathcal{P}$ by demanding that the Lorentz transformation $L$ in $(a, L) \in \mathcal{P}$ lies in $\mathcal{L}_+^\uparrow$, $\mathcal{L}_+$ or $\mathcal{L}^\uparrow$, respectively. The subgroups $\mathcal{P}_+^\uparrow$, $\mathcal{P}_+$ and $\mathcal{P}^\uparrow$ are called the restricted Poincaré group, the proper Poincaré group and the orthochronous Poincaré group, respectively. In a similar way we also define the subsets $\mathcal{P}_+^\downarrow$, $\mathcal{P}_-^\uparrow$ and $\mathcal{P}_-^\downarrow$ by demanding that $L$ lies in $\mathcal{L}_+^\downarrow$, $\mathcal{L}_-^\uparrow$ or $\mathcal{L}_-^\downarrow$.
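The group law derived above can be checked numerically. The following is a minimal sketch, assuming a Poincaré transformation is stored as a pair of a 4-component list and a 4x4 matrix; the function names (`act`, `compose`, `inverse`, and so on) and the sample data are hypothetical and chosen only for illustration. The inverse of the Lorentz part uses the standard identity $L^{-1} = \eta L^T \eta$, with the metric signature $(+,-,-,-)$ taken as an assumption (the identity is insensitive to the overall sign of $\eta$).

```python
# A Poincare transformation is modelled as a pair (a, L):
# a is a list of 4 numbers, L is a 4x4 matrix stored as a list of rows.

ETA = [1, -1, -1, -1]  # diagonal of the Minkowski metric (signature is an assumption)

def mat_vec(L, x):
    """Matrix times vector."""
    return [sum(L[i][j] * x[j] for j in range(4)) for i in range(4)]

def mat_mul(L1, L2):
    """Matrix times matrix."""
    return [[sum(L1[i][k] * L2[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def act(p, x):
    """Apply (a, L) to x, i.e. x -> Lx + a."""
    a, L = p
    Lx = mat_vec(L, x)
    return [Lx[i] + a[i] for i in range(4)]

def compose(p1, p2):
    """Group law (a1, L1)(a2, L2) = (a1 + L1 a2, L1 L2)."""
    (a1, L1), (a2, L2) = p1, p2
    L1a2 = mat_vec(L1, a2)
    return ([a1[i] + L1a2[i] for i in range(4)], mat_mul(L1, L2))

def lorentz_inverse(L):
    """Inverse of a Lorentz matrix via L^{-1} = eta L^T eta."""
    return [[ETA[i] * L[j][i] * ETA[j] for j in range(4)] for i in range(4)]

def inverse(p):
    """Inverse (a, L)^{-1} = (-L^{-1} a, L^{-1})."""
    a, L = p
    L_inv = lorentz_inverse(L)
    return ([-c for c in mat_vec(L_inv, a)], L_inv)

IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

# Two exact (integer) Lorentz matrices: quarter-turn rotations
# in the 1-2 plane and in the 2-3 plane.
ROT_12 = [[1, 0, 0, 0], [0, 0, -1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]
ROT_23 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, -1], [0, 0, 1, 0]]

p1 = ([1, 2, 3, 4], ROT_12)
p2 = ([0, -1, 5, 2], ROT_23)
x = [7, 1, -2, 3]
```

With these definitions, `act(compose(p1, p2), x)` agrees with `act(p1, act(p2, x))`, and `compose(p1, inverse(p1))` returns the unit element `([0, 0, 0, 0], IDENTITY)`. Integer matrices are used so that the comparisons are exact.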

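The Poincaré group $\mathcal{P}$ is a ten-dimensional real Lie group with connected components $\mathcal{P}_+^\uparrow$, $\mathcal{P}_+^\downarrow$, $\mathcal{P}_-^\uparrow$ and $\mathcal{P}_-^\downarrow$. The Lie algebra $\mathfrak{p}$ of the Poincaré group contains the Lie algebra $\mathfrak{l}$ of the Lorentz group as a Lie subalgebra. The Lie algebra $\mathfrak{p}$ is spanned by the basis elements $\{X_{\mu\nu}\}_{\mu<\nu}$ of $\mathfrak{l}$ together with four elements $Y_\mu$ with $\mu = 0, 1, 2, 3$. The Lie bracket in $\mathfrak{p}$ is given in terms of these basis elements $X_{\mu\nu}$ and $Y_\mu$ by

\[ [X_{\mu\nu}, X_{\rho\sigma}] = \eta_{\mu\rho}X_{\sigma\nu} + \eta_{\nu\rho}X_{\mu\sigma} + \eta_{\nu\sigma}X_{\rho\mu} + \eta_{\mu\sigma}X_{\nu\rho} \]

\[ [X_{\mu\nu}, Y_\rho] = -(\eta_{\nu\rho}Y_\mu - \eta_{\mu\rho}Y_\nu) \]

\[ [Y_\mu, Y_\nu] = 0. \]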

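Because the subgroup $\mathcal{P}_+^\uparrow$ of $\mathcal{P}$ is connected, it has a universal covering group $\widetilde{\mathcal{P}}_+^\uparrow$. This universal covering group $\widetilde{\mathcal{P}}_+^\uparrow$ consists of all pairs $(a, A)$, where $a \in M$ and $A \in SL(2,\mathbb{C})$. The multiplication in $\widetilde{\mathcal{P}}_+^\uparrow$ is given by

\[ (a_1, A_1)(a_2, A_2) = (a_1 + \Phi(A_1)a_2, A_1A_2), \]

where $\Phi$ denotes the covering homomorphism $\Phi : SL(2,\mathbb{C}) \to \mathcal{L}_+^\uparrow$. The covering homomorphism $\Pi : \widetilde{\mathcal{P}}_+^\uparrow \to \mathcal{P}_+^\uparrow$ is given by

\[ \Pi((a, A)) = (a, \Phi(A)), \]

so that $\Pi((a, A))$ acts on any $x \in M$ as

\[ \Pi((a, A))x = \Phi(A)x + a = \frac{1}{2}\sum_{\mu,\nu} \mathrm{Tr}(A\sigma_\nu A^*\sigma_\mu)\,x^\mu e_\nu + a. \]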

Physical interpretation of the Poincaré transformations

At the beginning of this section on special relativity we gave the explicit form of some important coordinate transformations, namely spacetime translations, spatial rotations (around the $x^3$-axis) and Lorentz boosts (in the $x^1$-direction). We will now relate these coordinate transformations to the Poincaré transformations $(a, L)$. It is clear that the spacetime translation in (2.1) can be written as

\[ x' = (-a, 1)x. \]

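To rewrite the spatial rotation in (2.2), note that

\[ e^{tX_{12}} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos(t) & -\sin(t) & 0 \\ 0 & \sin(t) & \cos(t) & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \]

We can thus write (2.2) as

\[ x'' = \left(0, e^{-\theta X_{12}}\right)x. \]

More generally, if we define $\mathbf{X} = (X_{23}, X_{31}, X_{12})$, then a rotation of the coordinate axes of observer $O''$ over an angle $\theta$ around the unit vector $\hat{\boldsymbol{\theta}}$ (according to the right-hand rule) with respect to the axes of observer $O$ gives the transformation rule $x'' = \left(0, e^{-\theta\hat{\boldsymbol{\theta}}\cdot\mathbf{X}}\right)x = \left(0, e^{-\boldsymbol{\theta}\cdot\mathbf{X}}\right)x$, where $\boldsymbol{\theta} = \theta\hat{\boldsymbol{\theta}}$.

Finally, because

\[ e^{tX_{01}} = \begin{pmatrix} \cosh(t) & -\sinh(t) & 0 & 0 \\ -\sinh(t) & \cosh(t) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \]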

we can write (2.3) as

\[ x''' = \left(0, e^{\phi_v X_{01}}\right)x, \]

where $\sinh(\phi_v) = \gamma(v)v$ and $\cosh(\phi_v) = \gamma(v)$, i.e. $\phi_v = \frac{1}{2}\ln\left(\frac{1+v}{1-v}\right)$. We call $\phi_v$ the boost parameter corresponding to the speed $v$. More generally, if observer $O'''$ moves with velocity $\mathbf{v}$ with respect to $O$ and if we define $\mathbf{X} = (X_{01}, X_{02}, X_{03})$, then the coordinate transformation is given by $x''' = \left(0, e^{\phi_{|\mathbf{v}|}\hat{\mathbf{v}}\cdot\mathbf{X}}\right)x = \left(0, e^{\boldsymbol{\phi}_{\mathbf{v}}\cdot\mathbf{X}}\right)x$, where we have defined $\boldsymbol{\phi}_{\mathbf{v}} = \phi_{|\mathbf{v}|}\hat{\mathbf{v}}$.
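The relations between the speed $v$ and the boost parameter $\phi_v$ can be verified numerically. The sketch below works in units with $c = 1$ and assumes $|v| < 1$; the function names are hypothetical. The helper `add_velocities` is not taken from the text: it relies on the fact that $e^{sX_{01}}e^{tX_{01}} = e^{(s+t)X_{01}}$, so boost parameters of collinear boosts add, which reproduces the familiar relativistic velocity addition rule.

```python
import math

def gamma(v):
    """Lorentz factor for speed v (units with c = 1, |v| < 1)."""
    return 1.0 / math.sqrt(1.0 - v * v)

def boost_parameter(v):
    """phi_v = (1/2) ln((1 + v) / (1 - v))."""
    return 0.5 * math.log((1.0 + v) / (1.0 - v))

def add_velocities(u, v):
    """Compose two collinear boosts by adding their boost parameters."""
    return math.tanh(boost_parameter(u) + boost_parameter(v))

v = 0.6
phi = boost_parameter(v)
sinh_phi = math.sinh(phi)   # should match gamma(v) * v
cosh_phi = math.cosh(phi)   # should match gamma(v)
```

For $v = 0.6$ one finds $\phi_v = \ln 2$, $\sinh(\phi_v) = 0.75 = \gamma(v)v$ and $\cosh(\phi_v) = 1.25 = \gamma(v)$, up to floating-point rounding.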

Earlier we mentioned that the elements of the form $e^{tX_{\mu\nu}}$ generate the restricted Lorentz group $\mathcal{L}_+^\uparrow$, so the elements of the form $\left(a, e^{tX_{\mu\nu}}\right)$ generate the restricted Poincaré group $\mathcal{P}_+^\uparrow$. In other words, $\mathcal{P}_+^\uparrow$ is generated by translations, spatial rotations and Lorentz boosts. The restricted Poincaré group $\mathcal{P}_+^\uparrow$ is therefore precisely the group of transformations that relates different inertial frames. The physically important group is thus $\mathcal{P}_+^\uparrow$, rather than the entire Poincaré group $\mathcal{P}$.

2.2 Quantum theory

In this section we will describe quantum theory. In the first subsection we introduce the definitions of states and observables and some of their properties. In the second subsection we formulate the general mathematical structure of quantum theory. In the third subsection we discuss how symmetries are described in the mathematical framework of quantum theory. In the fourth subsection we will consider a particular form of symmetry, namely relativistic symmetry; this automatically leads to the theory of unitary representations of the universal covering group of the Poincaré group. The irreducible representations will lead naturally to a mathematical definition of a particle. Finally, in the fifth subsection we will describe the state spaces associated with many-particle states.

2.2.1 States and observables

As stated at the beginning of the previous section on special relativity⁸, if we repeat an experiment $N$ times, and if $N(\mathcal{B})$ denotes the number of times that we find a measured value in the Borel subset $\mathcal{B} \subset \mathbb{R}^n$, then it is an empirical fact that for any such $\mathcal{B}$ the fraction $N(\mathcal{B})/N$ approaches some definite value as $N$ becomes large enough. Therefore, we assume that for any measured object $\alpha$ and for any physical quantity $A$ there exists some theoretical probability

\[ P^A_\alpha(\mathcal{B}) := \lim_{N\to\infty} (N(\mathcal{B})/N) \]

that the measured value $M(\alpha, A)$ lies in the Borel set $\mathcal{B} \subset \mathbb{R}^{n_A}$. Note that if we write $\alpha(x)$ and $A(y)$ to symbolically denote the spacetime components of all parts of the measured object and of all parts of the measuring apparatuses, then according to the principle of special relativity these probabilities satisfy

\[ P^{A(y')}_{\alpha(x')}(\mathcal{B}) = P^{A(y)}_{\alpha(x)}(\mathcal{B}) \qquad (2.9) \]

where $x'$ and $y'$ denote spacetime components with respect to another inertial observer that are numerically equal to $x$ and $y$.

States and observables in the physical world

If for two measured objects $\alpha$ and $\beta$ we have that

\[ P^A_\alpha(\mathcal{B}) = P^A_\beta(\mathcal{B}) \]

for all physical quantities $A$ and all Borel sets $\mathcal{B}$, then the measured objects $\alpha$ and $\beta$ cannot be distinguished by any experiment. This defines an equivalence relation on the set of all measured objects, the corresponding equivalence classes of which are called states. With abuse of notation, we will denote the equivalence class (i.e. the state) of a measured object $\alpha$ also by $\alpha$, and for the probabilities we correspondingly write $P^A_\alpha(\mathcal{B})$, where $\alpha$ now denotes the state rather than a measured object. If for two physical quantities $A$ and $B$ we have that

\[ P^A_\alpha(\mathcal{B}) = P^B_\alpha(\mathcal{B}) \]

for all states $\alpha$ and all Borel sets $\mathcal{B}$, then the physical quantities $A$ and $B$ cannot be distinguished by any experiment. This defines an equivalence relation on the set of physical quantities, and the corresponding equivalence classes are called observables. We will use the same letter for a physical quantity as for its equivalence class (i.e. the observable) and we use the notation $P^A_\alpha(\mathcal{B})$, where $\alpha$ is a state and $A$ is an observable. It is clear from the definition of a state that a state $\alpha$ is completely characterized by the set of probabilities $\{P^A_\alpha(\mathcal{B})\}_{A,\mathcal{B}}$, where $A$ runs over all possible observables and $\mathcal{B}$ over all Borel sets.

⁸ Like our discussion at the beginning of the previous section, the present discussion is also largely inspired by the first chapter of [1].

Now we would like to define the notion of simultaneously measurable observables. For any observable $A$, we can define for each Borel-measurable function $f : \mathbb{R}^{n_A} \to \mathbb{R}^m$ an observable $f(A)$; namely, if some kind of measuring process results in a measured value in the Borel subset $\mathcal{B} \subset \mathbb{R}^{n_A}$ for the observable $A$, then the Borel subset $f(\mathcal{B}) \subset \mathbb{R}^m$ represents the measured value for the observable $f(A)$. We say that the observables $A_1, \ldots, A_m$ are simultaneously measurable in state $\alpha$ if there exists an observable $B$ and functions $f_j : \mathbb{R}^{n_B} \to \mathbb{R}^{n_{A_j}}$ such that for all $j = 1, \ldots, m$ the observables $A_j$ are indistinguishable from $f_j(B)$ in state $\alpha$, i.e. if $P^{A_j}_\alpha(\mathcal{B}) = P^{f_j(B)}_\alpha(\mathcal{B})$ for all Borel sets $\mathcal{B}$. Note that for such a set $\{A_1, \ldots, A_m\}$ of simultaneously measurable observables in state $\alpha$ we can now define the observable $g(A_1, \ldots, A_m)$ for state $\alpha$ for arbitrary measurable functions $g : \mathbb{R}^{n_{A_1} + \ldots + n_{A_m}} \to \mathbb{R}^k$ by $g(A_1, \ldots, A_m) = g(f_1(B), \ldots, f_m(B))$. In particular, we can define sums and products of such observables to obtain new observables.

States and observables in physical theories

In physical theories, the states $\alpha$ and observables $A$ defined above are represented by certain mathematical objects, which are also called states and observables, respectively. For each inertial observer there are bijective correspondences

\[ T_s : \alpha \mapsto T_s(\alpha) \quad\text{and}\quad T_o : A \mapsto T_o(A) \]

between the physical states and observables and these mathematical objects, and for each inertial observer these correspondences are defined in identical ways in terms of the coordinate system. It seems reasonable to suspect that the mathematical objects corresponding to the states and observables at one particular instant of time exhaust the entire set of mathematical objects necessary to describe the physical world. Thus, at each moment of time we can use the same set of mathematical objects. However, it is not true that two states or two observables that are related by a time translation will automatically be mapped to the same mathematical object. We will come back to this below, when we consider the time evolution of a system.

Although the correspondences $T_s$ and $T_o$ are completely determined for a given inertial observer, there might be mathematical objects in the theory, other than states and observables, that are not completely determined for a given inertial observer. For instance, in electrodynamics each observer can choose any particular gauge for the electromagnetic potentials.

For each pair $(T_s(\alpha), T_o(A))$ and for each Borel set $\mathcal{B}$, a physical theory should provide a number $P^{T_o(A)}_{T_s(\alpha)}(\mathcal{B})$ that represents the probability $P^A_\alpha(\mathcal{B})$; this is the minimum requirement that any physical theory should satisfy in order to be consistent with empirical data. There is also another consistency requirement for physical theories: the theory should be consistent with the probabilistic form of the principle of special relativity. What this means can be understood as follows. Suppose that a particular inertial observer $O$ is interested in the possible outcome of a measurement of the observable $A$ for some state $\alpha$.⁹ For this, the observer $O$ uses the correspondences $T_s$ and $T_o$ to obtain the mathematical objects $T_s(\alpha)$ and $T_o(A)$, and finds that the theory predicts the probabilities $P^{T_o(A)}_{T_s(\alpha)}(\mathcal{B})$. Now suppose that $O$ actually performs the corresponding measurement and suppose that a second observer $O'$ watches $O$ performing the measurement. Then $O'$ will conclude, from his or her point of view, that a measurement was made of the observable $A'$ for a state $\alpha'$. If $O'$ wants to know the probabilities for the possible outcomes of the measurement of $O$, he or she should consider the mathematical objects $T_s(\alpha')$ and $T_o(A')$. Note in particular that this procedure defines bijections

\[ F'_{O\to O'} : T_s(\alpha) \mapsto T_s(\alpha') \quad\text{and}\quad F_{O\to O'} : T_o(A) \mapsto T_o(A') \]

of the mathematical objects. The observer $O'$ now concludes that the theory predicts the probabilities $P^{T_o(A')}_{T_s(\alpha')}(\mathcal{B})$. The requirement that the theory must be consistent with the probabilistic form of the principle of special relativity can thus be stated as

\[ P^{T_o(A)}_{T_s(\alpha)}(\mathcal{B}) = P^{T_o(A')}_{T_s(\alpha')}(\mathcal{B}) = P^{F_{O\to O'}T_o(A)}_{F'_{O\to O'}T_s(\alpha)}(\mathcal{B}), \qquad (2.10) \]

where $\alpha'$ and $A'$ are the state $\alpha$ and the observable $A$ of observer $O$, as seen from observer $O'$.

⁹ It is important to emphasize at this point that the descriptions of $\alpha$ and $A$ are complete: there are no external forces that are not included in the descriptions of $\alpha$ and $A$.

Symmetries of theories and of systems

In discussing the principle of special relativity, we argued that there is a pair $(s', s)$ of bijections on the sets of mathematical objects representing states and observables, respectively. Such a pair of bijections is called a symmetry of the theory. However, in physics one often studies the symmetries of a particular physical system, which can only be found in a limited set of states and for which only a limited set of observables can be measured. The phrase 'symmetry of a physical system' actually refers to the symmetries of a subsystem with respect to the entire system. For example, if some inertial observer $O$ places a fixed point charge $Q$ at the origin of his/her coordinate system and he/she wants to study the motion of some test charge $q$ in the field of $Q$, it would be easier to make use of the spherical symmetry. By spherical symmetry we mean the following. Suppose for the moment that observer $O$ does not know that there is a charge $Q$ at the origin and suppose that he/she develops a theory that describes the subsystem consisting of the test charge $q$. This theory assigns mathematical objects to the states $\{\alpha_q\}$ and observables $\{A_q\}$ of the test charge $q$ in a similar manner as we discussed above. Now consider a second observer $O'$ whose coordinate axes are rotated with respect to the coordinate system of $O$, and suppose that this second observer uses the same physical theory as $O$ and uses the same correspondence to assign mathematical objects to the states and observables of $q$. Then spherical symmetry means that there is a relation like (2.10) between the mathematical objects of $O$ and $O'$, only this time we consider only the possible states and observables of the test charge $q$, rather than the set of all states and observables. In particular, there is also a pair $(s', s)$ of bijections of the mathematical objects as above. Generalization of these results to arbitrary subsystems motivates us to define a symmetry of a (sub)system to be a pair of bijections $(s', s)$ of the mathematical objects representing the states and observables of that particular system. However, in what follows we will only be concerned with symmetries of the theory.

In the following subsection we will explain precisely how states and observables are represented mathematically in quantum theory, and how these (mathematical representations of) states and observables can be paired to obtain probabilities.

2.2.2 The general framework of quantum theory

The content of this subsection can be found in many places; we have mainly used [34]. For some background information on self-adjoint operators on a Hilbert space one can consult appendix A. The mathematical description of a quantum system is characterized by a pair $(\mathcal{H}, \mathcal{A})$ consisting of a separable Hilbert space $\mathcal{H}$ and a set $\mathcal{A}$ of self-adjoint operators on $\mathcal{H}$, the elements of which are called observables. Until subsection 4.2.1 we will always assume that $\mathcal{A}$ is the set of all self-adjoint operators in $\mathcal{H}$; in subsection 4.2.1 we will discuss the more general case in which $\mathcal{A}$ does not contain all self-adjoint operators on the physical Hilbert space. The subset $\mathcal{A}_0 = \mathcal{A} \cap B(\mathcal{H})$ of $\mathcal{A}$ is called the set of bounded observables and $B(\mathcal{H})$ is called the algebra of bounded observables. The set of states $\mathcal{S}$ consists of all linear functionals $\rho : B(\mathcal{H}) \to \mathbb{C}$ of the form

\[ \rho(A) = \mathrm{Tr}(\hat\rho A), \]

where $\hat\rho \in B_1(\mathcal{H})$ is a positive trace-class operator with $\mathrm{Tr}(\hat\rho) = 1$. Such an operator is called a density operator and it is clear that any convex combination $\hat\rho := \lambda\hat\rho_1 + (1 - \lambda)\hat\rho_2$ (with $0 \le \lambda \le 1$) of two density operators $\hat\rho_1, \hat\rho_2$ is again a density operator and hence (using linearity of the trace) defines a state given by

\[ \rho(A) = \mathrm{Tr}(\hat\rho A) = \lambda\mathrm{Tr}(\hat\rho_1A) + (1 - \lambda)\mathrm{Tr}(\hat\rho_2A) = \lambda\rho_1(A) + (1 - \lambda)\rho_2(A), \]

where $\rho_i \in \mathcal{S}$ denotes the state corresponding to $\hat\rho_i$. In particular, this shows that $\mathcal{S}$ is a convex set (just read the equation from right to left and use the fact that $\hat\rho = \lambda\hat\rho_1 + (1 - \lambda)\hat\rho_2$ indeed defines a density operator). The extremal points of the convex set $\mathcal{S}$ are called pure states, and we denote the set of all pure states by $\mathcal{PS}$. The elements in $\mathcal{S} \setminus \mathcal{PS}$ are called mixed states. Any density operator is a countable sum of the form

\[ \hat\rho = \sum_i \lambda_iE_i, \qquad (2.11) \]

with $E_i$ one-dimensional projections on $\mathcal{H}$ and $\lambda_i \ge 0$ with $\sum_i \lambda_i = 1$. Conversely, any operator of the form (2.11) defines a density operator. As a special case of (2.11), when $\hat\rho$ is a one-dimensional projection onto the (one-dimensional) subspace $V_1$ of $\mathcal{H}$, it is easy to see, by choosing an orthonormal basis $\{e_i\}_i$ for $\mathcal{H}$ containing a unit vector $\Psi \in V_1$ (so $e_k = \Psi$ for some $k$), that for such a state we have that for all $A \in B(\mathcal{H})$

\[ \rho(A) = \mathrm{Tr}(\hat\rho A) = \sum_i \langle \underbrace{\hat\rho Ae_i}_{\in\, \mathbb{C}\Psi}, e_i\rangle = \langle A\Psi, \Psi\rangle. \qquad (2.12) \]

Such a state is also called a vector state and we often write such a vector state as $\rho_\Psi$. Combining (2.11) and (2.12), we see that each state $\rho \in \mathcal{S}$ is a countable convex combination of vector states and therefore we can write the action of $\rho$ on $A \in B(\mathcal{H})$ as

\[ \rho(A) = \sum_i \lambda_i\langle A\Psi_i, \Psi_i\rangle \qquad (2.13) \]

(with $0 \le \lambda_i \le 1$ and $\sum_i \lambda_i = 1$ as before) for some countable collection $\{\Psi_i\}_i$ of unit vectors in $\mathcal{H}$. An immediate consequence of (2.13) is that all pure states must be vector states. The converse of this statement is also true¹⁰ and therefore the set of pure states coincides exactly with the set of vector states, which in turn is in one-to-one correspondence with the set of unit vectors in $\mathcal{H}$ modulo phase factors. Thus the set $\mathcal{PS}$ of all pure states is in one-to-one correspondence with the set of all unit rays

\[ R(\Psi) = \{e^{i\theta}\Psi : \|\Psi\| = 1,\ 0 \le \theta < 2\pi\} \]

in $\mathcal{H}$. Note that $R(\Psi_1) = R(\Psi_2)$ if and only if $\Psi_2 = e^{i\theta}\Psi_1$ for some $\theta \in [0, 2\pi)$.
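The equivalence of the trace formula for a state and its decomposition (2.13) into vector states can be illustrated on a two-dimensional Hilbert space. The following is a minimal sketch in plain Python; all function names, the weights, the vectors and the observable are hypothetical choices made for illustration. The inner product is taken linear in its first argument, which is the convention implied by (2.12).

```python
import math

def inner(phi, psi):
    """<phi, psi>, linear in the first argument, conjugate-linear in the second."""
    return sum(p * q.conjugate() for p, q in zip(phi, psi))

def apply(A, psi):
    """Matrix A applied to the vector psi."""
    return [sum(A[i][j] * psi[j] for j in range(len(psi))) for i in range(len(A))]

def projection(psi):
    """Matrix of the one-dimensional projection E_psi: phi -> <phi, psi> psi."""
    n = len(psi)
    return [[psi[i] * psi[j].conjugate() for j in range(n)] for i in range(n)]

def trace_of_product(R, A):
    """Tr(R A) for square matrices R and A."""
    n = len(R)
    return sum(R[i][k] * A[k][i] for i in range(n) for k in range(n))

def density_operator(weights, vectors):
    """Sum_i lambda_i E_{psi_i}, as in (2.11)."""
    n = len(vectors[0])
    rho = [[0j] * n for _ in range(n)]
    for lam, psi in zip(weights, vectors):
        E = projection(psi)
        for i in range(n):
            for j in range(n):
                rho[i][j] += lam * E[i][j]
    return rho

# Illustrative data: two unit vectors, convex weights, a Hermitian observable.
weights = [0.25, 0.75]
vectors = [[1, 0], [1 / math.sqrt(2), 1j / math.sqrt(2)]]
A = [[1, 2 - 1j], [2 + 1j, -1]]

rho_hat = density_operator(weights, vectors)
via_trace = trace_of_product(rho_hat, A)                      # Tr(rho_hat A)
via_vector_states = sum(lam * inner(apply(A, psi), psi)       # right-hand side of (2.13)
                        for lam, psi in zip(weights, vectors))
```

Both `via_trace` and `via_vector_states` evaluate to the same real number (here equal to 1 up to rounding), and the trace of `rho_hat` is 1, as required of a density operator.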

Let $P(\mathbb{R})$ denote the set of probability measures on $\mathbb{R}$. We then define a map $\mathcal{A} \times \mathcal{S} \to P(\mathbb{R})$ as follows. If $A \in \mathcal{A}$ with spectral resolution $A = \int_{\mathbb{R}} \lambda\, dE_A(\lambda)$ and if $\rho \in \mathcal{S}$ is the state with density operator $\hat\rho$, we define a probability measure $\mu_{A,\rho}$ on $\mathbb{R}$ by

\[ \mu_{A,\rho}(B) = \rho(E_A(B)) = \mathrm{Tr}(\hat\rho E_A(B)) \]

for any Borel set $B \subset \mathbb{R}$. The quantity $\mu_{A,\rho}(B)$ then represents the probability that, when the quantum system is in the state $\rho$, the result of a measurement of the observable $A$ lies in the Borel set $B$. The expectation value $\langle A\rangle_\rho$ of the observable $A$ in the state $\rho$ of the system is then

\[ \langle A\rangle_\rho = \int_{\mathbb{R}} \lambda\, d\mu_{A,\rho}(\lambda) \]

and the variance of $A$ in this state is

\[ \sigma^2_\rho(A) = \langle (A - \langle A\rangle_\rho 1_{\mathcal{H}})^2\rangle_\rho = \langle A^2\rangle_\rho - \langle A\rangle_\rho^2. \]

In particular, if $\rho \in \mathcal{S}$ is a pure state given by $\rho(\cdot) = \langle \cdot\,\Psi, \Psi\rangle$, then $\mu_{A,\rho}(B) = \langle E_A(B)\Psi, \Psi\rangle$ for any observable $A$ and for all Borel sets $B \subset \mathbb{R}$, and

\[ \langle A\rangle_\rho = \int_{\mathbb{R}} \lambda\, d\langle E_A(\lambda)\Psi, \Psi\rangle = \langle A\Psi, \Psi\rangle, \]

where the last equality makes sense only when $\Psi$ lies in the domain of $A$. Also,

\[ \langle A^2\rangle_\rho = \langle A^2\Psi, \Psi\rangle = \langle A\Psi, A\Psi\rangle = \|A\Psi\|^2, \]

so $\sigma^2_\rho(A) = \|A\Psi\|^2 - \langle A\Psi, \Psi\rangle^2$. If $\mathbf{A} = \{A_1, \ldots, A_n\}$ is a finite set of observables that pairwise commute¹¹, we can define a projection-valued measure $E_{\mathbf{A}}$ on $\mathbb{R}^n$ by setting $E_{\mathbf{A}}(B_1 \times \cdots \times B_n) = E_{A_1}(B_1) \cdots E_{A_n}(B_n)$ for any Borel sets $B_1, \ldots, B_n \subset \mathbb{R}$, and this will then define $E_{\mathbf{A}}$ for all Borel subsets of $\mathbb{R}^n$. If the system is in state $\rho$, this will define a probability measure $\mu_{\mathbf{A},\rho}$ on $\mathbb{R}^n$ by $\mu_{\mathbf{A},\rho}(B) = \rho(E_{\mathbf{A}}(B)) = \mathrm{Tr}(\hat\rho E_{\mathbf{A}}(B))$ for any Borel set $B \subset \mathbb{R}^n$, and the quantity $\mu_{\mathbf{A},\rho}(B)$ is to be interpreted as the probability that the result $(a_1, \ldots, a_n)$ of a simultaneous measurement of the observables $A_1, \ldots, A_n$ belongs to the Borel set $B \subset \mathbb{R}^n$. When two observables do not commute, it is impossible to measure them simultaneously.

¹⁰ This will no longer be the case in the more general situation of subsection 4.2.1, where the algebra of observables is allowed to be a proper subalgebra of $B(\mathcal{H})$.

For each quantum system there is a special observable $H \in \mathcal{A}$, called the Hamiltonian operator, or simply the Hamiltonian of the quantum system. Given this Hamiltonian $H$ (with dense domain $D(H)$), we define a strongly-continuous one-parameter unitary group $U_H(t)$ on $\mathcal{H}$ by¹²

\[ U_H(t) = e^{-itH}. \]

For any vector $\Psi \in D(H)$ we then have $i\frac{d}{dt}U_H(t)\Psi = U_H(t)H\Psi = HU_H(t)\Psi$, where in the last equality we have assumed that each $U_H(t)$ leaves $D(H)$ invariant. In quantum theory we have the so-called Heisenberg picture and Schrödinger picture to describe the dynamics of a quantum system.

In the Heisenberg picture, the observables depend on time, whereas the states do not depend on time. Here we will only consider quantum systems for which the Hamiltonian does not depend explicitly on time. When we have a quantum system in state $\rho \in \mathcal{S}$ (with density operator $\hat\rho$) at time $t = 0$, then the time evolution $A(t)$ of an observable with $A(0) = A \in \mathcal{A}$ is given by

\[ A(t) = U_H(-t)AU_H(t) \in \mathcal{A} \]

and hence the probability that the result of a measurement at time $t$ of an observable $A$ lies in the Borel set $B \subset \mathbb{R}$ is

\[ \mu_{A(t),\rho}(B) = \mathrm{Tr}(\hat\rho E_{A(t)}(B)). \]

Now assume that $\Psi \in D(H)$ is such that $U_H(t')\Psi \in D(A)$ for all $t'$ in some neighborhood of $t \in \mathbb{R}$. Then

\[ \frac{d}{dt}A(t)\Psi = \left.\frac{d}{dt'}\right|_{t'=t} U_H(-t')AU_H(t)\Psi + \left.\frac{d}{dt'}\right|_{t'=t} U_H(-t)AU_H(t')\Psi \]

\[ = iHU_H(-t)AU_H(t)\Psi - iU_H(-t)AU_H(t)H\Psi \]

\[ = i(HA(t) - A(t)H)\Psi. \]

In this sense, we can say that $A(t)$ satisfies the Heisenberg equation of motion

\[ \frac{dA}{dt}(t) = i[H, A(t)]. \]

¹¹ We say that two self-adjoint elements $S$ and $T$ commute if $E_S(B_1)E_T(B_2) = E_T(B_2)E_S(B_1)$ for all Borel sets $B_1, B_2 \subset \mathbb{R}$.

¹² We will always use units in which $\hbar = 1$. Otherwise there would have been a factor $\hbar^{-1}$ in the exponent.


An observable $A \in \mathcal{A}$ is called a quantum integral of motion (or constant of motion) if $A(t) = A$ for all $t \in \mathbb{R}$. So an observable $A$ is a constant of motion if and only if it commutes with all $U_H(t)$, which happens precisely when it commutes with $H$. Hence constants of motion are observables that commute with the Hamiltonian.

In the Schrödinger picture, states depend on time and observables do not depend on time. When we have a quantum system which is in state $\rho$ at $t = 0$, then the time evolution of the state is described in terms of the time evolution of the density operator $\hat\rho$ as

\[ \hat\rho(t) = U_H(t)\hat\rho U_H(-t) \in B_1(\mathcal{H}), \]

and hence the probability that the result of a measurement at time $t$ of an observable $A$ lies in the Borel set $B \subset \mathbb{R}$ is

\[ \mu_{A,\rho(t)}(B) = \mathrm{Tr}(\hat\rho(t)E_A(B)). \]

For any $\Psi \in D(H)$ we then have

\[ \frac{d}{dt}\hat\rho(t)\Psi = \left.\frac{d}{dt'}\right|_{t'=t} U_H(t')\hat\rho U_H(-t)\Psi + \left.\frac{d}{dt'}\right|_{t'=t} U_H(t)\hat\rho U_H(-t')\Psi \]

\[ = -iHU_H(t)\hat\rho U_H(-t)\Psi + iU_H(t)\hat\rho U_H(-t)H\Psi \]

\[ = -i(H\hat\rho(t) - \hat\rho(t)H)\Psi. \]

In this sense, $\hat\rho(t)$ satisfies the Schrödinger equation of motion

\[ \frac{d\hat\rho}{dt}(t) = -i[H, \hat\rho(t)]. \]

A state $\rho \in \mathcal{S}$ is called stationary when $\hat\rho(t) = \hat\rho$ for all $t \in \mathbb{R}$. Thus a state $\rho$ is stationary if and only if $\hat\rho$ commutes with $H$. If $\rho$ is a pure state given by $\hat\rho = E_\Psi$, then for any $\Phi \in \mathcal{H}$ we have

\[ \hat\rho(t)\Phi = U_H(t)E_\Psi(U_H(-t)\Phi) = U_H(t)(\langle U_H(-t)\Phi, \Psi\rangle\Psi) = \langle\Phi, U_H(t)\Psi\rangle U_H(t)\Psi = E_{\Psi(t)}\Phi, \]

where $\Psi(t) = U_H(t)\Psi$. Now if $\Psi \in D(H)$, we get $\frac{d\Psi}{dt}(t) = \frac{d}{dt}U_H(t)\Psi = -iHU_H(t)\Psi = -iH\Psi(t)$, so then $\Psi(t)$ satisfies the Schrödinger equation

\[ i\frac{d\Psi}{dt}(t) = H\Psi(t). \]
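The Heisenberg and Schrödinger pictures assign the same expectation values, since $\mathrm{Tr}(\hat\rho\, U_H(-t)AU_H(t)) = \mathrm{Tr}(U_H(t)\hat\rho U_H(-t)\, A)$ by cyclicity of the trace. The sketch below checks this for a two-level system. It assumes, purely for illustration, a Hamiltonian that is diagonal in the chosen basis, so that $U_H(t)$ can be written down without a matrix exponential; the energies, the density matrix, the observables and all function names are hypothetical.

```python
import cmath

def matmul(A, B):
    """Product of two square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def U(t, energies):
    """U_H(t) = exp(-itH) for a Hamiltonian H = diag(energies), with hbar = 1."""
    n = len(energies)
    return [[cmath.exp(-1j * t * energies[i]) if i == j else 0j for j in range(n)]
            for i in range(n)]

def heisenberg(A, t, energies):
    """A(t) = U_H(-t) A U_H(t)."""
    return matmul(matmul(U(-t, energies), A), U(t, energies))

def schrodinger(rho_hat, t, energies):
    """rho_hat(t) = U_H(t) rho_hat U_H(-t)."""
    return matmul(matmul(U(t, energies), rho_hat), U(-t, energies))

# Illustrative two-level system.
energies = [0.5, 2.0]
rho_hat = [[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]]  # positive, trace one
A = [[0, 1], [1, 0]]                               # does not commute with H
D = [[3, 0], [0, -1]]                              # commutes with H
t = 1.3

expect_heisenberg = trace(matmul(rho_hat, heisenberg(A, t, energies)))
expect_schrodinger = trace(matmul(schrodinger(rho_hat, t, energies), A))
D_t = heisenberg(D, t, energies)
```

The two expectation values coincide up to rounding, and the diagonal observable `D`, which commutes with the Hamiltonian, is unchanged by the time evolution, in line with the characterization of constants of motion above.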

2.2.3 Symmetries in quantum theory

The global structure of this subsection is borrowed from section 2.4 of [1], but the proofs presented here are more detailed than in [1]. In quantum theory, a symmetry of a quantum system $(\mathcal{H}, \mathcal{A})$ is a pair $(s, s')$ of bijections $s : \mathcal{A}_0 \to \mathcal{A}_0$ and $s' : \mathcal{S} \to \mathcal{S}$ on the set of bounded observables and the set of states, respectively, satisfying

\[ (s'\rho)(sA) = \rho(A), \qquad s(f(A)) = f(s(A)) \]

for all $\rho \in \mathcal{S}$, $f \in C(\sigma(A))$ and $A \in \mathcal{A}_0$. Also, if $(s_1, s'_1)$ and $(s_2, s'_2)$ are two symmetries, it can be shown that $s_1 = s_2 \Leftrightarrow s'_1 = s'_2$; in other words, a symmetry $(s, s')$ is completely determined once we know either $s$ or $s'$. If $\rho = \sum_i \lambda_i\rho_i \in \mathcal{S}$ is a convex combination of states $\rho_i \in \mathcal{S}$, then for each $A \in \mathcal{A}_0$ we have

\[ (s'\rho)(sA) = \rho(A) = \sum_i \lambda_i\rho_i(A) = \sum_i \lambda_i(s'\rho_i)(sA). \]

Because $s$ is a bijection, this implies that for each $A \in \mathcal{A}_0$ we have

\[ (s'\rho)(A) = (s'\rho)(s(s^{-1}A)) = \sum_i \lambda_i(s'\rho_i)(s(s^{-1}A)) = \sum_i \lambda_i(s'\rho_i)(A), \]

so $s'\rho = \sum_i \lambda_is'\rho_i$, and hence $s'$ preserves the convex structure of $\mathcal{S}$. In particular, the set of extreme points is mapped bijectively onto itself, so $s'$ gives a bijection from the set of pure states onto itself. Because the set of pure states is in one-to-one correspondence to the set of unit rays in $\mathcal{H}$ (since we have assumed that $\mathcal{A}$ is the set of all self-adjoint operators on $\mathcal{H}$), this means that $(s, s')$ induces a bijection

\[ s : \{R(\Psi)\}_{\Psi\in\mathcal{H}} \to \{R(\Psi)\}_{\Psi\in\mathcal{H}} \]

from the set of unit rays of $\mathcal{H}$ onto itself.

If $\rho_1, \rho_2 \in \mathcal{PS}$ are the two pure states corresponding to the unit rays $R(\Psi_1)$ and $R(\Psi_2)$, respectively, then we say that the pure states $\rho_1$ and $\rho_2$ are orthogonal if $R(\Psi_1) \perp R(\Psi_2)$ (i.e. if the vectors in the unit ray $R(\Psi_1)$ are orthogonal to the vectors in the unit ray $R(\Psi_2)$), and we write $\rho_1 \perp \rho_2$. Orthogonal pure states can also be characterized as follows.

Lemma 2.17 Two pure states $\rho_1, \rho_2 \in \mathcal{PS}$ are orthogonal if and only if there exists a projection operator $E$ satisfying $\rho_1(E) = 1$ and $\rho_2(E) = 0$.

Proof
If such a projection operator exists, then $\|(1 - E)\Psi_1\|^2 = \rho_1(1 - E) = 0$ and $\|E\Psi_2\|^2 = \rho_2(E) = 0$, where $\Psi_1$ and $\Psi_2$ are unit vectors in the unit rays corresponding to the pure states $\rho_1$ and $\rho_2$, respectively. Thus $(1 - E)\Psi_1 = 0$ and $E\Psi_2 = 0$, so

\[ \langle\Psi_1, \Psi_2\rangle = \langle(1 - E)\Psi_1, \Psi_2\rangle + \langle E\Psi_1, \Psi_2\rangle = 0 + \langle\Psi_1, E\Psi_2\rangle = 0. \]

Now suppose that $\langle\Psi_1, \Psi_2\rangle = 0$. If $E$ is the one-dimensional projection onto $\mathbb{C}\Psi_1$ then $E\Psi_1 = \Psi_1$ and $E\Psi_2 = 0$, so $\rho_1(E) = \langle E\Psi_1, \Psi_1\rangle = 1$ and $\rho_2(E) = \langle E\Psi_2, \Psi_2\rangle = 0$.

Lemma 2.18 If $(s, s')$ is a symmetry and $\rho_1, \rho_2 \in \mathcal{PS}$ are pure states, then $\rho_1 \perp \rho_2$ if and only if $s'\rho_1 \perp s'\rho_2$. In other words, if $R_1$ and $R_2$ are the unit rays corresponding to $\rho_1$ and $\rho_2$, then $R_1 \perp R_2$ if and only if $s(R_1) \perp s(R_2)$.

Proof
Suppose that $\rho_1 \perp \rho_2$. Let $E \in \mathcal{A}_0$ be a projection with $\rho_1(E) = 1$ and $\rho_2(E) = 0$. Because $(sE)^2 = s(E^2) = sE$, $sE$ is also a projection and we have $(s'\rho_1)(sE) = \rho_1(E) = 1$ and $(s'\rho_2)(sE) = \rho_2(E) = 0$. Hence $s'\rho_1 \perp s'\rho_2$, by the previous lemma. Now suppose that $s'\rho_1 \perp s'\rho_2$. Because $(s^{-1}, (s')^{-1})$ is also a symmetry, the argument above gives that $(s')^{-1}s'\rho_1 \perp (s')^{-1}s'\rho_2$, i.e. $\rho_1 \perp \rho_2$.

Definition 2.19 If $\rho_1$ and $\rho_2$ are two pure states corresponding to the unit rays $R(\Psi_1)$ and $R(\Psi_2)$, respectively, then

\[ \langle\rho_1, \rho_2\rangle := |\langle\Psi_1, \Psi_2\rangle|^2 \]

is called the transition probability between the states $\rho_1$ and $\rho_2$.

Physically, this transition probability represents the probability that a system in state $\rho_1$ is found to be in state $\rho_2$ after measurement, or vice versa.
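The transition probability of Definition 2.19 depends only on the unit rays and not on the representing vectors. The sketch below illustrates this on a two-dimensional Hilbert space and also checks that a unitary matrix and an antiunitary map (componentwise complex conjugation in the chosen basis) leave it unchanged. The vectors, the matrix `U` and the function names are hypothetical examples, not objects taken from the text.

```python
import cmath
import math

def inner(phi, psi):
    """<phi, psi>, linear in the first argument."""
    return sum(p * q.conjugate() for p, q in zip(phi, psi))

def transition_probability(psi1, psi2):
    """|<psi1, psi2>|^2 for unit vectors psi1 and psi2."""
    return abs(inner(psi1, psi2)) ** 2

def apply(M, psi):
    """Matrix M applied to the vector psi."""
    return [sum(M[i][j] * psi[j] for j in range(len(psi))) for i in range(len(M))]

def conjugate(psi):
    """Componentwise complex conjugation: an antilinear, antiunitary map."""
    return [complex(c).conjugate() for c in psi]

def phase(theta, psi):
    """Another representative exp(i theta) psi of the same unit ray."""
    return [cmath.exp(1j * theta) * c for c in psi]

s = 1 / math.sqrt(2)
psi1 = [0.6, 0.8j]          # unit vector
psi2 = [s, s]               # unit vector
U = [[s, 1j * s],           # a unitary 2x2 matrix
     [1j * s, s]]

p = transition_probability(psi1, psi2)
p_phase = transition_probability(phase(0.7, psi1), phase(-2.1, psi2))
p_unitary = transition_probability(apply(U, psi1), apply(U, psi2))
p_antiunitary = transition_probability(conjugate(psi1), conjugate(psi2))
```

All four numbers agree up to rounding (for these vectors the common value is 0.5), which is the invariance that a symmetry is required to have on the level of unit rays.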

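Lemma 2.20 If $(s, s')$ is a symmetry, then for all pure states $\rho_1, \rho_2 \in \mathcal{PS}$ we have

\[ \langle s'\rho_1, s'\rho_2\rangle = \langle\rho_1, \rho_2\rangle. \]

As a consequence, if $R_1$ and $R_2$ are the unit rays corresponding to $\rho_1$ and $\rho_2$ respectively, then for arbitrary $\Psi_j \in R_j$ and $\Psi'_j \in s(R_j)$ we have

\[ |\langle\Psi'_1, \Psi'_2\rangle|^2 = |\langle\Psi_1, \Psi_2\rangle|^2. \]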


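Proof
Let $\Psi_1$ and $\Psi_2$ be two unit vectors in the unit rays corresponding to $\rho_1$ and $\rho_2$, respectively, and let $E_{\rho_1}$ be the one-dimensional projection onto $\mathbb{C}\Psi_1$. Then $sE_{\rho_1}$ is a projection, $(s'\rho_1)(sE_{\rho_1}) = \rho_1(E_{\rho_1}) = 1$ and for any $\rho \in \mathcal{PS}$ with $\rho \perp \rho_1$ we have $(s'\rho)(sE_{\rho_1}) = \rho(E_{\rho_1}) = 0$. This implies that

\[ sE_{\rho_1} = E_{s'\rho_1}. \]

We also have

\[ \langle\rho_1, \rho_2\rangle = |\langle\Psi_1, \Psi_2\rangle|^2 = |\langle\Psi_1, E_{\rho_1}\Psi_2\rangle|^2 = \|E_{\rho_1}\Psi_2\|^2 = \langle E_{\rho_1}\Psi_2, E_{\rho_1}\Psi_2\rangle = \langle E_{\rho_1}\Psi_2, \Psi_2\rangle = \rho_2(E_{\rho_1}), \]

and similarly, $\langle s'\rho_1, s'\rho_2\rangle = (s'\rho_2)(E_{s'\rho_1})$, so we get

\[ \langle s'\rho_1, s'\rho_2\rangle = (s'\rho_2)(E_{s'\rho_1}) = (s'\rho_2)(sE_{\rho_1}) = \rho_2(E_{\rho_1}) = \langle\rho_1, \rho_2\rangle. \]

This lemma has a nice implication. Let $\{\Psi_n\}_n$ be an orthonormal basis of $\mathcal{H}$, let $(s, s')$ be a symmetry and let $\{\Psi'_n\}_n$ be a set of vectors with $\Psi'_n \in s(R(\Psi_n))$. Then, according to the lemma, for arbitrary $m, n$ we have $|\langle\Psi'_m, \Psi'_n\rangle|^2 = |\langle\Psi_m, \Psi_n\rangle|^2 = \delta_{mn}$. By positive definiteness of the inner product, this implies that

\[ \langle\Psi'_m, \Psi'_n\rangle = \delta_{mn}. \]

So $\{\Psi'_n\}_n$ is an orthonormal set in $\mathcal{H}$. Now suppose that there would exist a unit vector $\Psi' \in \mathcal{H}$ with $\Psi' \perp \Psi'_n$ for all $n$. Then for each $\Psi \in s^{-1}(R(\Psi'))$ we have that $|\langle\Psi, \Psi_n\rangle|^2 = |\langle\Psi', \Psi'_n\rangle|^2 = 0$ for all $n$. Because $\{\Psi_n\}_n$ is an orthonormal basis of $\mathcal{H}$, this implies that $\Psi = 0$. This contradicts the fact that $\Psi$ is a unit vector. Hence, we conclude that no unit vector $\Psi' \in \mathcal{H}$ satisfying $\Psi' \perp \Psi'_n$ for all $n$ can exist. This shows that $\{\Psi'_n\}_n$ is also an orthonormal basis of $\mathcal{H}$.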

Lemma 2.21 Let $\Psi_1, \Psi_2 \in \mathcal{H}$ be unit vectors with $\Psi_1 \perp \Psi_2$ and let $R(\Psi_1)$ and $R(\Psi_2)$ be their corresponding unit rays. For $\lambda, \mu \in \mathbb{C}$ with $|\lambda|^2 + |\mu|^2 = 1$ we define the unit vector $\Psi_{\lambda,\mu} := \lambda\Psi_1 + \mu\Psi_2$ with corresponding unit ray $R(\Psi_{\lambda,\mu})$. If $(s, s')$ is a symmetry and $\Psi' \in s(R(\Psi_{c_1,c_2}))$ with $c_1c_2 \neq 0$, then there exist $\Psi'_1 \in s(R(\Psi_1))$ and $\Psi'_2 \in s(R(\Psi_2))$ satisfying

\[ \Psi' = c_1\Psi'_1 + c_2\Psi'_2. \]

Proof
If we define $\mathfrak{R} := \{R(\Psi_{\lambda,\mu}) : |\lambda|^2 + |\mu|^2 = 1\}$, then $\mathfrak{R}$ is precisely the set of all unit rays which are orthogonal to any ray which is orthogonal to both $R(\Psi_1)$ and $R(\Psi_2)$. By lemma 2.18, $s(\mathfrak{R}) := \{s(R) : R \in \mathfrak{R}\}$ is then precisely the set of all unit rays which are orthogonal to any ray which is orthogonal to both $s(R(\Psi_1))$ and $s(R(\Psi_2))$. But this implies that we can write $s(\mathfrak{R})$ as

\[ s(\mathfrak{R}) = \{R(\lambda'\Psi''_1 + \mu'\Psi''_2) : |\lambda'|^2 + |\mu'|^2 = 1\}, \]

where $\Psi''_1$ and $\Psi''_2$ are some fixed vectors in $s(R(\Psi_1))$ and $s(R(\Psi_2))$. Thus for each $R(\Psi_{\lambda,\mu}) \in \mathfrak{R}$ we can write $s(R(\Psi_{\lambda,\mu}))$ as $s(R(\Psi_{\lambda,\mu})) = R(\lambda'\Psi''_1 + \mu'\Psi''_2)$ for some $\lambda', \mu' \in \mathbb{C}$ with $|\lambda'|^2 + |\mu'|^2 = 1$.

Now let $\Psi' \in s(R(\Psi_{c_1,c_2}))$ with $c_1c_2 \neq 0$. Because $s(R(\Psi_{c_1,c_2})) = R(c'_1\Psi''_1 + c'_2\Psi''_2)$ for some $c'_1, c'_2 \in \mathbb{C}$ with $|c'_1|^2 + |c'_2|^2 = 1$, there exists a $\theta \in [0, 2\pi)$ such that

\[ \Psi' = e^{i\theta}(c'_1\Psi''_1 + c'_2\Psi''_2) = c''_1\Psi''_1 + c''_2\Psi''_2, \]

where $c''_j := e^{i\theta}c'_j$. Because $|c''_j|^2 = |\langle\Psi', \Psi''_j\rangle|^2 = |\langle\Psi_{c_1,c_2}, \Psi_j\rangle|^2 = |c_j|^2$, we can write $c''_j = c_je^{i\theta_j}$ for some $\theta_j \in [0, 2\pi)$. But then

\[ \Psi' = c_1e^{i\theta_1}\Psi''_1 + c_2e^{i\theta_2}\Psi''_2 = c_1\Psi'_1 + c_2\Psi'_2, \]

where $\Psi'_j := e^{i\theta_j}\Psi''_j \in s(R(\Psi_j))$.

We will now formulate and prove the main theorem concerning symmetries in quantum theory. This theorem, due to Wigner, states that we can represent any symmetry transformation of a quantum system by a unitary or antiunitary operator on the corresponding Hilbert space. The (constructive) proof of Wigner's theorem that is given below is a mixture of the proofs in [35] and [1].

Theorem 2.22 (Wigner’s theorem) Let (H,A) be a quantum system and let s be a bijectionof the set of unit rays onto itself that conserves transition probabilities. Then there exists a mapU : H → H that is either linear and unitary or antilinear and antiunitary satisfying

UΨ ∈ s(R(Ψ))

for all unit vectors Ψ ∈ H. Furthermore, such a map U is uniquely determined up to a phasefactor.

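Proof. We divide the proof into several steps.

• Step 1: Define $U$ on an orthonormal basis $\{\Psi_n\}_{n\geq 1}$ of $\mathcal{H}$.
Let $\{\Psi_n\}_{n\geq 1}$ be an orthonormal basis of $\mathcal{H}$ (here we use the assumption that Hilbert spaces in quantum mechanics are separable) and write $R_n := R(\Psi_n)$ for the corresponding unit rays. For $n \neq 1$ we define the unit vectors

$$\Phi_n = \frac{1}{\sqrt{2}}(\Psi_1 + \Psi_n).$$

We now choose an arbitrary vector in $s(R_1)$ and call it $U\Psi_1$. For arbitrary $\Phi'_n \in s(R(\Phi_n))$ we have $|\langle\Phi'_n, U\Psi_1\rangle|^2 = |\langle\Phi_n,\Psi_1\rangle|^2 = \frac{1}{2}$, so there exists a unique $U\Phi_n \in s(R(\Phi_n))$ such that $\langle U\Phi_n, U\Psi_1\rangle = \frac{1}{\sqrt{2}}$. For $n \neq 1$ we then define

$$U\Psi_n := \sqrt{2}\,U\Phi_n - U\Psi_1.$$

To see that $U\Psi_n \in s(R_n)$, observe that, according to lemma 2.21, there exist $\Psi'_1 \in s(R_1)$ and $\Psi'_n \in s(R_n)$ satisfying $U\Phi_n = \frac{1}{\sqrt{2}}(\Psi'_1 + \Psi'_n)$. Because $\Psi'_1, U\Psi_1 \in s(R_1)$, there exists a $\theta \in [0, 2\pi)$ such that $\Psi'_1 = e^{i\theta}U\Psi_1$. Then $U\Psi_n$ can be written as $U\Psi_n = \sqrt{2}\,U\Phi_n - U\Psi_1 = (e^{i\theta} - 1)U\Psi_1 + \Psi'_n$. Because $U\Psi_1 \perp \Psi'_n$, we find that $e^{i\theta} - 1 = \langle U\Psi_n, U\Psi_1\rangle = \sqrt{2}\langle U\Phi_n, U\Psi_1\rangle - \langle U\Psi_1, U\Psi_1\rangle = 0$. Hence we have $U\Psi_n = (e^{i\theta} - 1)U\Psi_1 + \Psi'_n = \Psi'_n$, so indeed $U\Psi_n \in s(R_n)$. We have now defined $U$ on the basis elements $\Psi_n$ and on the elements $\Phi_n = \frac{1}{\sqrt{2}}(\Psi_1 + \Psi_n)$. According to the discussion after lemma 2.18, $\{U\Psi_n\}_n$ forms an orthonormal basis of $\mathcal{H}$, and our definition of $U\Phi_n$ does not spoil the possibility to make $U$ an $\mathbb{R}$-linear operator, since

$$U\left(\frac{1}{\sqrt{2}}(\Psi_1 + \Psi_n)\right) = U\Phi_n = \frac{1}{\sqrt{2}}(U\Psi_1 + U\Psi_n).$$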
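The phase-fixing procedure of step 1 can be tried out numerically. The following is only a minimal sketch under stated assumptions: the ray map $s$ is modelled by a hypothetical unitary `V` on $\mathbb{C}^3$ whose output is known only up to a random phase (`ray_representative`), and the inner product is taken linear in the first argument, as in the proof. None of these names come from the text.

```python
import cmath
import math
import random


def inner(x, y):
    """Inner product, linear in the first and antilinear in the second slot."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def apply(matrix, x):
    """Matrix-vector product for a square matrix stored as nested lists."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in matrix]


# Hypothetical unitary on C^3 hidden behind the ray map s (assumption of this sketch).
C, S = math.cos(0.7), math.sin(0.7)
V = [
    [C, -S * cmath.exp(0.3j), 0j],
    [S * cmath.exp(-0.3j), C, 0j],
    [0j, 0j, cmath.exp(1.1j)],
]


def ray_representative(x, rng):
    """Return an arbitrary element of s(R(x)): V x multiplied by a random phase."""
    phase = cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
    return [phase * v for v in apply(V, x)]


def basis_vector(n, dim):
    return [1 + 0j if i == n else 0j for i in range(dim)]


def step_one(dim, rng):
    """Fix U on the basis vectors as in step 1; returns [U Psi_1, ..., U Psi_dim]."""
    images = [ray_representative(basis_vector(0, dim), rng)]  # arbitrary choice of U Psi_1
    for n in range(1, dim):
        phi = [(a + b) / math.sqrt(2)
               for a, b in zip(basis_vector(0, dim), basis_vector(n, dim))]
        phi_image = ray_representative(phi, rng)
        z = inner(phi_image, images[0])  # has modulus 1/sqrt(2)
        # Rotate the phase so that <U Phi_n, U Psi_1> = 1/sqrt(2) exactly.
        u_phi = [v * z.conjugate() / abs(z) for v in phi_image]
        # U Psi_n := sqrt(2) U Phi_n - U Psi_1
        images.append([math.sqrt(2) * a - b for a, b in zip(u_phi, images[0])])
    return images


images = step_one(3, random.Random(0))
# Phase relating the arbitrary choice U Psi_1 to V Psi_1.
common_phase = inner(images[0], apply(V, basis_vector(0, 3)))
```

In this toy setting every `images[n]` differs from `V` applied to the n-th basis vector by the same phase `common_phase`, which is the numerical counterpart of the statement that the single arbitrary choice of $U\Psi_1$ fixes all other $U\Psi_n$.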

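• Step 2: If $\Psi = \sum_n c_n\Psi_n$ is a unit vector with $c_1 \neq 0$ and if $\Psi' = \sum_n c'_n U\Psi_n \in s(R(\Psi))$, then either $c_n/c_1 = c'_n/c'_1$ for all $n$ or $c_n/c_1 = \overline{c'_n/c'_1}$ for all $n$.
Let

$$\Psi = \sum_n c_n\Psi_n \in \mathcal{H}$$

be a unit vector with $c_1 \neq 0$. If $\Psi' \in s(R(\Psi))$, we may write it as

$$\Psi' = \sum_n c'_n U\Psi_n.$$

Note that for all $n$ we have $|c_n|^2 = |\langle\Psi,\Psi_n\rangle|^2 = |\langle\Psi', U\Psi_n\rangle|^2 = |c'_n|^2$. Also, $|c_1 + c_n|^2 = 2|\langle\Psi,\Phi_n\rangle|^2 = 2|\langle\Psi', U\Phi_n\rangle|^2 = |c'_1 + c'_n|^2$. For arbitrary complex numbers $a, b \in \mathbb{C}$ with $a \neq 0$ we have $|a + b|^2 = |a|^2 + |b|^2 + 2|a|^2\,\mathrm{Re}\left(\frac{b}{a}\right)$, so the equality $|c_1 + c_n|^2 = |c'_1 + c'_n|^2$ implies that

$$|c_1|^2 + |c_n|^2 + 2|c_1|^2\,\mathrm{Re}\left(\frac{c_n}{c_1}\right) = |c'_1|^2 + |c'_n|^2 + 2|c'_1|^2\,\mathrm{Re}\left(\frac{c'_n}{c'_1}\right).$$

Together with $|c_n|^2 = |c'_n|^2$ this implies that

$$\mathrm{Re}\left(\frac{c_n}{c_1}\right) = \mathrm{Re}\left(\frac{c'_n}{c'_1}\right).$$

Also,

$$\left[\mathrm{Im}\left(\frac{c_n}{c_1}\right)\right]^2 = \left|\frac{c_n}{c_1}\right|^2 - \left[\mathrm{Re}\left(\frac{c_n}{c_1}\right)\right]^2 = \left|\frac{c'_n}{c'_1}\right|^2 - \left[\mathrm{Re}\left(\frac{c'_n}{c'_1}\right)\right]^2 = \left[\mathrm{Im}\left(\frac{c'_n}{c'_1}\right)\right]^2.$$

Hence,

$$\mathrm{Im}\left(\frac{c_n}{c_1}\right) = \pm\,\mathrm{Im}\left(\frac{c'_n}{c'_1}\right).$$

Thus we conclude that we either have

$$c_n/c_1 = c'_n/c'_1 \tag{2.14}$$

or

$$c_n/c_1 = \overline{c'_n/c'_1}. \tag{2.15}$$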

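We will now show that the same choice between (2.14) and (2.15) must be made for all $n$. Suppose that for some $k$ we have $c_k/c_1 = c'_k/c'_1$ and that for some $l$ we have $c_l/c_1 = \overline{c'_l/c'_1}$, and that both ratios are not real. Note that this requires that $k, l > 1$ and $k \neq l$. Let

$$\Upsilon := \frac{1}{\sqrt{3}}(\Psi_1 + \Psi_k + \Psi_l)$$

and let $\Upsilon'$ be an arbitrary vector in $s(R(\Upsilon))$. When we write $\Upsilon = d_1\Psi_1 + d_k\Psi_k + d_l\Psi_l$ and $\Upsilon' = d'_1 U\Psi_1 + d'_k U\Psi_k + d'_l U\Psi_l$, we have $\frac{d_k}{d_1}, \frac{d_l}{d_1} \in \mathbb{R}$, so we must have $\frac{d_k}{d_1} = \frac{d'_k}{d'_1}$ and $\frac{d_l}{d_1} = \frac{d'_l}{d'_1}$, so we can write $\Upsilon' = \alpha(d_1 U\Psi_1 + d_k U\Psi_k + d_l U\Psi_l) = \frac{\alpha}{\sqrt{3}}(U\Psi_1 + U\Psi_k + U\Psi_l)$ for some $\alpha \in \mathbb{C}$ with $|\alpha| = 1$. It then follows from conservation of transition probability that

$$\left|1 + \frac{c'_k}{c'_1} + \frac{c'_l}{c'_1}\right|^2 = \frac{3}{|c'_1|^2}|\langle\Psi',\Upsilon'\rangle|^2 = \frac{3}{|c_1|^2}|\langle\Psi,\Upsilon\rangle|^2 = \left|1 + \frac{c_k}{c_1} + \frac{c_l}{c_1}\right|^2.$$

By assumption, we have $c_k/c_1 = c'_k/c'_1$ and $c_l/c_1 = \overline{c'_l/c'_1}$, so

$$\left|1 + c_k/c_1 + \overline{c_l/c_1}\right|^2 = \left|1 + c_k/c_1 + c_l/c_1\right|^2.$$

For arbitrary complex numbers $a, b \in \mathbb{C}$ the equality $|a + \overline{b}|^2 = |a + b|^2$ implies that $\mathrm{Re}(ab) = \mathrm{Re}(a\overline{b})$, which is equivalent to $\mathrm{Re}(a)\mathrm{Re}(b) - \mathrm{Im}(a)\mathrm{Im}(b) = \mathrm{Re}(a)\mathrm{Re}(b) + \mathrm{Im}(a)\mathrm{Im}(b)$, or $\mathrm{Im}(a)\mathrm{Im}(b) = 0$. When we apply this to $a = 1 + c_k/c_1$ and $b = c_l/c_1$, we find that $\mathrm{Im}(1 + c_k/c_1)\,\mathrm{Im}(c_l/c_1) = 0$. But $\mathrm{Im}(1 + c_k/c_1) = \mathrm{Im}(c_k/c_1)$, so

$$\mathrm{Im}(c_k/c_1)\,\mathrm{Im}(c_l/c_1) = 0.$$

This implies that at least one of the two ratios $c_k/c_1$ and $c_l/c_1$ is real, which contradicts our assumption that both ratios are not real. We thus conclude that if $\Psi = \sum_n c_n\Psi_n$ is a unit vector with $c_1 \neq 0$ and if $\Psi' = \sum_n c'_n U\Psi_n \in s(R(\Psi))$, then either $c_n/c_1 = c'_n/c'_1$ for all $n$ or else $c_n/c_1 = \overline{c'_n/c'_1}$ for all $n$.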
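The elementary fact used at the end of step 2 can be summarized in the identity $|a + b|^2 - |a + \overline{b}|^2 = 4\,\mathrm{Im}(a)\,\mathrm{Im}(b)$, so the two moduli agree exactly when $\mathrm{Im}(a)\mathrm{Im}(b) = 0$. The short sketch below checks this identity on random inputs; the function name `modulus_gap` is purely illustrative.

```python
import random


def modulus_gap(a, b):
    """Return |a + b|^2 - |a + conj(b)|^2, which should equal 4 Im(a) Im(b)."""
    return abs(a + b) ** 2 - abs(a + b.conjugate()) ** 2


rng = random.Random(1)
samples = [
    (complex(rng.uniform(-1, 1), rng.uniform(-1, 1)),
     complex(rng.uniform(-1, 1), rng.uniform(-1, 1)))
    for _ in range(5)
]
# Largest deviation from the claimed identity over the random samples.
max_error = max(abs(modulus_gap(a, b) - 4 * a.imag * b.imag) for a, b in samples)
```

With $a = 1 + c_k/c_1$ and $b = c_l/c_1$ this is exactly the step that forces one of the two ratios to be real.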

• Step 3: Define $U$ on unit vectors $\Psi = \sum_n c_n\Psi_n$ with $c_1 \neq 0$ and for which it is not true that $c_n/c_1 \in \mathbb{R}$ for all $n$.
If $c_n/c_1 = c'_n/c'_1$ for all $n$ and if it is not true that $c_n/c_1 \in \mathbb{R}$ for all $n$, then we define $U\Psi$ to be the unique vector in $s(R(\Psi))$ for which $c'_1 = c_1$. In that case we get $c_n = (c_n/c_1)c_1 = (c'_n/c'_1)c'_1 = c'_n$ for all $n$.

If $c_n/c_1 = \overline{c'_n/c'_1}$ for all $n$ and if it is not true that $c_n/c_1 \in \mathbb{R}$ for all $n$, then we define $U\Psi$ to be the unique vector in $s(R(\Psi))$ for which $c'_1 = \overline{c_1}$. In that case we get $\overline{c_n} = \overline{(c_n/c_1)}\,\overline{c_1} = (c'_n/c'_1)c'_1 = c'_n$ for all $n$.

If $c_n/c_1 \in \mathbb{R}$ for all $n$, we will not define $U\Psi$ yet.

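• Step 4: Define $U$ on unit vectors $\Psi = \sum_n c_n\Psi_n$ with $c_1 = 0$ and for which it is not true that $c_n \in \mathbb{R}$ for all $n \geq 2$.
Now suppose that

$$\Psi = \sum_n c_n\Psi_n \in \mathcal{H}$$

is a unit vector with $c_1 = 0$. We then define

$$\widetilde{\Psi} := \frac{1}{\sqrt{2}}(\Psi_1 + \Psi) = \sum_n \widetilde{c}_n\Psi_n,$$

where $\widetilde{c}_1 = \frac{1}{\sqrt{2}}$ and $\widetilde{c}_n = \frac{c_n}{\sqrt{2}}$ for $n \neq 1$. If it is not true that $c_n \in \mathbb{R}$ for all $n \geq 2$ then it is also not true that $\widetilde{c}_n/\widetilde{c}_1 \in \mathbb{R}$ for all $n$, so the procedure in step 3 defines $U\widetilde{\Psi} = \sum_n \widetilde{c}'_n U\Psi_n \in s(R(\widetilde{\Psi}))$ with either $\widetilde{c}'_n = \widetilde{c}_n$ for all $n$ or else $\widetilde{c}'_n = \overline{\widetilde{c}_n}$ for all $n$. We then define

$$U\Psi := \sqrt{2}\,U\widetilde{\Psi} - U\Psi_1 = \sum_n c'_n U\Psi_n,$$

where $c'_1 = 0$ and $c'_n = \sqrt{2}\,\widetilde{c}'_n$ for $n \neq 1$. Thus, we either have $c'_n = c_n$ for all $n$ or else $c'_n = \overline{c_n}$ for all $n$.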

We have now defined $U\Psi$ for all unit vectors $\Psi = \sum_n c_n\Psi_n$ for which not all coefficients have the same phase (since we assumed that it is not true that $c_n/c_1 \in \mathbb{R}$ for all $n$), and for such unit vectors we either have

$$U\Psi = U\left(\sum_n c_n\Psi_n\right) = \sum_n c_n U\Psi_n \tag{2.16}$$

or

$$U\Psi = U\left(\sum_n c_n\Psi_n\right) = \sum_n \overline{c_n}\,U\Psi_n. \tag{2.17}$$

Furthermore, for such unit vectors our choice between (2.16) and (2.17) is the only choice that is possible, since for such unit vectors we either have $\sum_n c_n U\Psi_n \in s(R(\Psi))$ or $\sum_n \overline{c_n}\,U\Psi_n \in s(R(\Psi))$, but not both. If $\Psi = \sum_n c_n\Psi_n$ is a unit vector for which all coefficients have the same phase, then also all coefficients of an arbitrary vector $\Psi' = \sum_n c'_n U\Psi_n$ in $s(R(\Psi))$ have the same phase. This means that for such $\Psi$ we are free to choose whether we want $U\Psi$ to satisfy (2.16) or (2.17), since both $\sum_n c_n U\Psi_n$ and $\sum_n \overline{c_n}\,U\Psi_n$ are in $s(R(\Psi))$. As stated before, we will not make this choice yet.

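• Step 5: The choice between (2.16) and (2.17) must be the same for all unit vectors $\Psi$ for which we have defined $U\Psi$ (i.e. the unit vectors where not all coefficients have the same phase).
Let $\Upsilon_1 = \sum_n a_n\Psi_n$ and $\Upsilon_2 = \sum_n b_n\Psi_n$ be two unit vectors for which $U\Upsilon_1$ and $U\Upsilon_2$ are already defined by steps 3 and 4 above, and such that $U\Upsilon_1$ satisfies equation (2.16) and $U\Upsilon_2$ satisfies equation (2.17); in particular, this implies that $U\Upsilon_1 \neq U\Upsilon_2$. Conservation of transition probability gives

$$\left|\sum_n \overline{b_n}a_n\right|^2 = \left|\left\langle\sum_k a_k\Psi_k, \sum_l b_l\Psi_l\right\rangle\right|^2 = |\langle\Upsilon_1,\Upsilon_2\rangle|^2 = |\langle U\Upsilon_1, U\Upsilon_2\rangle|^2 = \left|\left\langle\sum_k a_k U\Psi_k, \sum_l \overline{b_l}\,U\Psi_l\right\rangle\right|^2 = \left|\sum_n b_n a_n\right|^2.$$

Using this equality, we find that

$$\begin{aligned}
\sum_{k,l}\left[\mathrm{Re}(\overline{b_k}b_l)\mathrm{Re}(\overline{a_k}a_l) + \mathrm{Im}(\overline{b_k}b_l)\mathrm{Im}(\overline{a_k}a_l)\right]
&= \sum_{k,l}\left[\mathrm{Re}(b_k\overline{b_l})\mathrm{Re}(\overline{a_k}a_l) - \mathrm{Im}(b_k\overline{b_l})\mathrm{Im}(\overline{a_k}a_l)\right] \\
&= \sum_{k,l}\mathrm{Re}\left[(b_k\overline{b_l})(\overline{a_k}a_l)\right] = \mathrm{Re}\left[\left(\sum_k b_k\overline{a_k}\right)\left(\sum_l \overline{b_l}a_l\right)\right] \\
&= \mathrm{Re}\left|\sum_n \overline{b_n}a_n\right|^2 = \left|\sum_n \overline{b_n}a_n\right|^2 = \left|\sum_n b_n a_n\right|^2 = \mathrm{Re}\left|\sum_n b_n a_n\right|^2 \\
&= \mathrm{Re}\left[\left(\sum_k \overline{b_k}\,\overline{a_k}\right)\left(\sum_l b_l a_l\right)\right] = \sum_{k,l}\mathrm{Re}\left[(\overline{b_k}b_l)(\overline{a_k}a_l)\right] \\
&= \sum_{k,l}\left[\mathrm{Re}(\overline{b_k}b_l)\mathrm{Re}(\overline{a_k}a_l) - \mathrm{Im}(\overline{b_k}b_l)\mathrm{Im}(\overline{a_k}a_l)\right].
\end{aligned}$$

Thus, we find that $U\Upsilon_1$ and $U\Upsilon_2$ satisfy (2.16) and (2.17), respectively, if and only if

$$\sum_{k,l}\mathrm{Im}(\overline{b_k}b_l)\,\mathrm{Im}(\overline{a_k}a_l) = 0. \tag{2.18}$$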
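The chain of equalities leading to (2.18) amounts to the identity $|\sum_n \overline{b_n}a_n|^2 - |\sum_n b_n a_n|^2 = 2\sum_{k,l}\mathrm{Im}(\overline{b_k}b_l)\,\mathrm{Im}(\overline{a_k}a_l)$, which holds for arbitrary coefficient lists. A small sketch that checks it numerically (function names are illustrative, not from the text):

```python
def mixed_overlap(a, b):
    """|sum_n conj(b_n) a_n|^2, the transition probability before applying U."""
    return abs(sum(x * y.conjugate() for x, y in zip(a, b))) ** 2


def plain_overlap(a, b):
    """|sum_n b_n a_n|^2, the transition probability if (2.16) and (2.17) are mixed."""
    return abs(sum(x * y for x, y in zip(a, b))) ** 2


def imaginary_sum(a, b):
    """Left-hand side of (2.18): sum over k, l of Im(conj(b_k) b_l) Im(conj(a_k) a_l)."""
    return sum(
        (b[k].conjugate() * b[l]).imag * (a[k].conjugate() * a[l]).imag
        for k in range(len(a))
        for l in range(len(a))
    )


a = [0.5 + 0.5j, 0.5 - 0.5j]
b = [0.6 + 0j, 0.8j]
difference = mixed_overlap(a, b) - plain_overlap(a, b)
```

Here `difference` agrees with `2 * imaginary_sum(a, b)` up to rounding, so the two transition probabilities coincide precisely when (2.18) holds.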

We now separate two cases.

Case 1. If $\Upsilon_1$ and $\Upsilon_2$ lie in the same unit ray, then there exists a $\theta \in (0, 2\pi)$ such that for all $n$ we have $b_n = a_n e^{i\theta}$. But then $\overline{b_k}b_l = \overline{a_k}e^{-i\theta}a_l e^{i\theta} = \overline{a_k}a_l$ for all $k, l$, so equation (2.18) gives $\sum_{k,l}[\mathrm{Im}(\overline{b_k}b_l)]^2 = 0 = \sum_{k,l}[\mathrm{Im}(\overline{a_k}a_l)]^2$, which implies that $\overline{b_k}b_l, \overline{a_k}a_l \in \mathbb{R}$ for all $k, l$. This in turn implies that all $a_n$'s have the same phase and all $b_n$'s have the same phase. But this contradicts our assumption that $U\Upsilon_1$ and $U\Upsilon_2$ were already defined by our procedure. Hence, if $\Upsilon_1$ and $\Upsilon_2$ are in the same unit ray then the same choice between (2.16) and (2.17) must be made for $U\Upsilon_1$ and $U\Upsilon_2$.

Case 2. Now suppose that $\Upsilon_1$ and $\Upsilon_2$ are not in the same unit ray. Because we have assumed that $U\Upsilon_1$ and $U\Upsilon_2$ are defined, not all $\overline{a_k}a_l$ and not all $\overline{b_k}b_l$ are real (since then all $a_n$ would have the same phase, as well as all $b_n$). We again separate two cases.

Case 2a. If there exists a pair $(i, j)$ such that both $\overline{a_i}a_j$ and $\overline{b_i}b_j$ are not real, then define a unit vector $\Upsilon = \sum_n c_n\Psi_n = \frac{1}{\sqrt{2}}(e^{i\lambda}\Psi_i + e^{i\mu}\Psi_j)$ with $0 < \lambda < \mu < 2\pi$ and $\mu - \lambda \neq \pi$ (so that $\overline{c_i}c_j$ is not real). Then $\sum_{k,l}\mathrm{Im}(\overline{c_k}c_l)\mathrm{Im}(\overline{a_k}a_l) \neq 0$ and $\sum_{k,l}\mathrm{Im}(\overline{c_k}c_l)\mathrm{Im}(\overline{b_k}b_l) \neq 0$, so for $U\Upsilon$ and $U\Upsilon_1$ the same choice between (2.16) and (2.17) must be made, and for $U\Upsilon$ and $U\Upsilon_2$ the same choice between (2.16) and (2.17) must be made. Hence the same choice must be made for $U\Upsilon_1$ and $U\Upsilon_2$.

Case 2b. Now suppose that there is no such pair $(i, j)$. Then we choose a pair $(i, j)$ for which $\overline{a_i}a_j$ is not real and we choose a different pair $(m, n)$ (possibly with $\{m, n\} \cap \{i, j\} \neq \emptyset$) for which $\overline{b_m}b_n$ is not real. Now take a unit vector $\Upsilon = \sum_n c_n\Psi_n = \sum_{k\in\{i,j,m,n\}} c_k\Psi_k$ such that all (three or four) coefficients $c_i, c_j, c_m, c_n$ have different phases. Then $\sum_{k,l}\mathrm{Im}(\overline{c_k}c_l)\mathrm{Im}(\overline{a_k}a_l) \neq 0$ and $\sum_{k,l}\mathrm{Im}(\overline{c_k}c_l)\mathrm{Im}(\overline{b_k}b_l) \neq 0$, so again we conclude that the same choice between (2.16) and (2.17) must be made for $U\Upsilon_1$ and $U\Upsilon_2$.

Thus for all unit vectors $\Psi$ for which we have already defined $U\Psi$, the same choice between (2.16) and (2.17) must be made.

• Step 6: Define $U$ for those unit vectors $\Psi$ for which $U\Psi$ was not yet defined by the previous steps.
As stated before, for those unit vectors $\Psi = \sum_n c_n\Psi_n$ for which we have not yet defined $U\Psi$, we have that both $\sum_n c_n U\Psi_n$ and $\sum_n \overline{c_n}\,U\Psi_n$ are in the unit ray $s(R(\Psi))$. We will now define $U\Psi$ for these vectors as follows. If $U$ satisfies (2.16) for all unit vectors for which $U$ is defined, then we define $U\Psi := \sum_n c_n U\Psi_n$. If $U$ satisfies (2.17) for all unit vectors for which $U$ is defined, then we define $U\Psi := \sum_n \overline{c_n}\,U\Psi_n$.

• Step 7: Define $U$ for $\Psi \in \mathcal{H}$ with $\|\Psi\| \neq 1$.
We have now defined $U\Psi$ for all unit vectors $\Psi$ in $\mathcal{H}$ and $U$ either satisfies (2.16) for all unit vectors, or else $U$ satisfies (2.17) for all unit vectors. In both cases we can extend $U$ to a map $U : \mathcal{H} \to \mathcal{H}$ by defining $U\Psi := \|\Psi\|\,U\left(\frac{\Psi}{\|\Psi\|}\right)$ for all nonzero $\Psi \in \mathcal{H}$ (and $U0 := 0$). In the first case, $U$ becomes a linear map on $\mathcal{H}$ and for arbitrary $\Psi = \sum_n a_n\Psi_n$ and $\Phi = \sum_n b_n\Psi_n$ in $\mathcal{H}$ we have

$$\langle U\Psi, U\Phi\rangle = \sum_{k,l} a_k\overline{b_l}\langle U\Psi_k, U\Psi_l\rangle = \sum_n a_n\overline{b_n} = \sum_{k,l} a_k\overline{b_l}\langle\Psi_k,\Psi_l\rangle = \langle\Psi,\Phi\rangle,$$

so $U$ is unitary. In the second case, $U$ becomes an antilinear map on $\mathcal{H}$ and for arbitrary $\Psi = \sum_n a_n\Psi_n$ and $\Phi = \sum_n b_n\Psi_n$ in $\mathcal{H}$ we have

$$\langle U\Psi, U\Phi\rangle = \sum_{k,l} \overline{a_k}b_l\langle U\Psi_k, U\Psi_l\rangle = \sum_n \overline{a_n}b_n = \sum_{k,l} \overline{a_k}b_l\langle\Psi_k,\Psi_l\rangle = \langle\Phi,\Psi\rangle,$$

so $U$ is antiunitary.
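The antiunitary case can be made concrete with the simplest kind of example: complex conjugation of the coefficients followed by a unitary matrix. The sketch below assumes an arbitrary illustrative $2\times 2$ unitary `V` and the inner product convention used above (linear in the first slot); it is not taken from the text.

```python
import cmath
import math


def inner(x, y):
    """Inner product, linear in the first and antilinear in the second slot."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


# Illustrative 2x2 unitary (an assumption of this sketch).
C, S = math.cos(0.4), math.sin(0.4)
V = [[C, -S * cmath.exp(0.9j)], [S * cmath.exp(-0.9j), C]]


def antiunitary(x):
    """Conjugate the coefficients of x, then apply the unitary V."""
    conjugated = [complex(c).conjugate() for c in x]
    return [sum(V[i][j] * conjugated[j] for j in range(2)) for i in range(2)]


psi = [0.6 + 0j, 0.8j]
phi = [(1 + 1j) / 2, (1 - 1j) / 2]
lhs = inner(antiunitary(psi), antiunitary(phi))  # <U psi, U phi>
rhs = inner(phi, psi)                            # <phi, psi>
```

For this map `lhs` and `rhs` coincide up to rounding, and multiplying `psi` by a scalar $c$ multiplies `antiunitary(psi)` by $\overline{c}$, in line with the second case of step 7.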

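• Step 8: Uniqueness of $U$ up to a phase factor.
Now suppose that $U' : \mathcal{H} \to \mathcal{H}$ is another (anti)linear (anti)unitary map satisfying $U'\Psi \in s(R(\Psi))$ for all unit vectors $\Psi \in \mathcal{H}$. Choose an arbitrary unit vector $\Psi_0 \in \mathcal{H}$. Because $U\Psi_0, U'\Psi_0 \in s(R(\Psi_0))$, there exists a $\lambda_0 \in [0, 2\pi)$ such that $U'\Psi_0 = e^{i\lambda_0}U\Psi_0$.

Let $R$ be a unit ray in $\mathcal{H}$ that is not orthogonal to the unit ray $R(\Psi_0)$, and let $\Psi \in R$ be the unique unit vector with $\langle\Psi,\Psi_0\rangle \in \mathbb{R}_{>0}$. Since $U\Psi$ and $U'\Psi$ lie in the same unit ray, we can write $U'\Psi = e^{i\lambda}U\Psi$ for some $\lambda \in [0, 2\pi)$. But, because $U$ and $U'$ preserve real inner products,

$$\langle U\Psi, U\Psi_0\rangle = \langle\Psi,\Psi_0\rangle = \langle U'\Psi, U'\Psi_0\rangle = e^{i(\lambda-\lambda_0)}\langle U\Psi, U\Psi_0\rangle,$$

so $\lambda = \lambda_0$ and thus $U'\Psi = e^{i\lambda_0}U\Psi$. Using the fact that $U$ and $U'$ are both (anti)linear, we find that this holds not only for $\Psi$, but for all vectors in $\mathbb{C}\Psi$. We have thus found that $U'\Upsilon = e^{i\lambda_0}U\Upsilon$ for all $\Upsilon \in \mathcal{H}$ with $\langle\Upsilon,\Psi_0\rangle \neq 0$.

Now let $\Psi \in \mathcal{H}$ with $\langle\Psi,\Psi_0\rangle = 0$. Define $\Phi = \Psi + \Psi_0$. Because $\Phi$ is not orthogonal to $\Psi_0$, it follows from the discussion above that $U'\Phi = e^{i\lambda_0}U\Phi$. Because $U$ and $U'$ are $\mathbb{R}$-linear, we have that $U\Phi = U\Psi + U\Psi_0$ and $U'\Phi = U'\Psi + U'\Psi_0$. This gives us

$$U'\Psi = (U'\Phi - U'\Psi_0) = e^{i\lambda_0}(U\Phi - U\Psi_0) = e^{i\lambda_0}U\Psi.$$

We thus conclude that $U' = e^{i\lambda_0}U$.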

In case $U$ is unitary, we have for all $\Psi \in \mathcal{H}$ and any observable $A$

$$\langle A\Psi,\Psi\rangle = \rho_\Psi(A) = (s'\rho_\Psi)(sA) = \langle (sA)U\Psi, U\Psi\rangle = \langle U^*(sA)U\Psi,\Psi\rangle,$$

which implies that $A = U^*(sA)U$, or

$$sA = UAU^{-1}, \tag{2.19}$$

where we have used that unitarity of a linear map $U : \mathcal{H} \to \mathcal{H}$ is equivalent to $U^* = U^{-1}$. Note that the expression for $sA$ in (2.19) does not depend on the arbitrary phase factor in $U$, and the bijection $s : \mathcal{A}_0 \to \mathcal{A}_0$ so defined is $\mathbb{R}$-linear and satisfies $s(A^2) = (sA)^2$. We may as well extend (2.19) to a bijection $s : B(\mathcal{H}) \to B(\mathcal{H})$, and this bijection is in fact an automorphism of the $C^*$-algebra $B(\mathcal{H})$ (we will discuss this in more detail in subsection 4.2.1). Now consider the case where $U$ is anti-unitary. Because the definition of the adjoint $U^*$ of an anti-linear map is given by the condition $\langle U\Psi_1,\Psi_2\rangle = \overline{\langle\Psi_1, U^*\Psi_2\rangle}$, we now obtain the equality

$$\langle A\Psi,\Psi\rangle = \rho_\Psi(A) = (s'\rho_\Psi)(sA) = \langle (sA)U\Psi, U\Psi\rangle = \overline{\langle U^*(sA)U\Psi,\Psi\rangle} = \langle\Psi, U^*(sA)U\Psi\rangle$$

for all $\Psi \in \mathcal{H}$ and any observable $A$. This implies that $A^* = U^*(sA)U$, or

$$sA = UA^*U^{-1}, \tag{2.20}$$

where we have used that anti-unitarity of an anti-linear map $U : \mathcal{H} \to \mathcal{H}$ is equivalent to $U^* = U^{-1}$. Again this expression does not depend on the arbitrary phase factor in $U$ and the map $s : \mathcal{A}_0 \to \mathcal{A}_0$ in (2.20) is $\mathbb{R}$-linear and satisfies $s(A^2) = (sA)^2$. When we extend $s$ in (2.20) to a map $s : B(\mathcal{H}) \to B(\mathcal{H})$, we obtain an anti-automorphism of the $C^*$-algebra $B(\mathcal{H})$ (i.e. a $*$-preserving vector space isomorphism satisfying $s(AB) = s(B)s(A)$). We will come back to this in subsection 4.2.1.

It is clear that the set of all symmetries of a quantum system forms a group under the composition of (bijective) maps; it is called the symmetry group of the system. Now suppose that $(s_1, s'_1)$ and $(s_2, s'_2)$ are two symmetries of a quantum system with corresponding unit ray transformations $s_1$ and $s_2$. Then the composition $(s_2 \circ s_1, s'_2 \circ s'_1)$ is also a symmetry of the system with corresponding unit ray transformation $s_2 \circ s_1$. By Wigner's theorem there exist operators $U_1, U_2, U_{21} : \mathcal{H} \to \mathcal{H}$, each of which is either linear and unitary or antilinear and antiunitary, and such that $U_1\Psi \in s_1(R(\Psi))$, $U_2\Psi \in s_2(R(\Psi))$, $U_{21}\Psi \in s_2 \circ s_1(R(\Psi))$ for all unit vectors $\Psi \in \mathcal{H}$. Hence we must have $U_2 U_1 = \lambda(s_1, s_2)U_{21}$, where $\lambda(s_1, s_2)$ is some complex number with $|\lambda(s_1, s_2)| = 1$. In other words, we may conclude that if $G$ is (a subgroup of) the symmetry group of the quantum system and if for each $g \in G$ we have chosen an operator $U(g)$ as in Wigner's theorem, then for all $g_1, g_2 \in G$ we have

$$U(g_1)U(g_2) = \lambda(g_1, g_2)U(g_1 g_2),$$

where $\lambda : G \times G \to \mathbb{C}$ with $|\lambda(g_1, g_2)| = 1$ for all $g_1, g_2 \in G$. We say that $U : G \to B(\mathcal{H})$ is a ray representation of $G$: it is a representation of $G$ up to a phase factor. Because $B(\mathcal{H})$ is an associative algebra, we must have $(U(g_1)U(g_2))U(g_3) = U(g_1)(U(g_2)U(g_3))$ for all $g_1, g_2, g_3 \in G$, which implies that $\lambda$ must satisfy

$$\lambda(g_1, g_2)\lambda(g_1 g_2, g_3) = \lambda(g_1, g_2 g_3)\lambda(g_2, g_3);$$

a function $\lambda : G \times G \to \mathbb{C}$ with $|\lambda(g, h)| = 1$ for all $g, h \in G$ satisfying this equation is called a 2-cocycle of $G$. Because the operators $U(g)$ are only determined up to a phase factor, we can redefine them by letting $U'(g) := \mu(g)U(g)$, where $\mu : G \to \mathbb{C}$ with $|\mu(g)| = 1$ for all $g \in G$. By considering all such functions $\mu$, we obtain all possible ray representations $U : G \to B(\mathcal{H})$ that induce the same transformations of unit rays. A natural question to ask is whether we can choose $\mu$ such that $\lambda(g_1, g_2) = 1$ for all $g_1, g_2 \in G$; in that case we get $U(g_1)U(g_2) = U(g_1 g_2)$ for all $g_1, g_2 \in G$, so $U$ becomes an ordinary representation of $G$ instead of a ray representation. In the next subsection we will answer this question for the case where $G$ is the restricted Poincaré group $\mathcal{P}_+^\uparrow$.
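A small finite example may help to make the notion of a ray representation and its 2-cocycle concrete. The sketch below is an illustration chosen here, not an example from the text: it assigns the Pauli matrices to the non-trivial elements of the Klein four-group $\{e, x, y, z\}$, computes the phases $\lambda(g, h)$ from $U(g)U(h) = \lambda(g, h)U(gh)$, and checks the cocycle equation for all triples.

```python
I2 = ((1, 0), (0, 1))
SX = ((0, 1), (1, 0))
SY = ((0, -1j), (1j, 0))
SZ = ((1, 0), (0, -1))
U = {"e": I2, "x": SX, "y": SY, "z": SZ}
ELEMENTS = ("e", "x", "y", "z")


def group_product(g, h):
    """Multiplication in the Klein four-group {e, x, y, z}."""
    if g == "e":
        return h
    if h == "e":
        return g
    if g == h:
        return "e"
    return ({"x", "y", "z"} - {g, h}).pop()


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def cocycle(g, h):
    """Phase lambda(g, h) defined by U(g) U(h) = lambda(g, h) U(gh)."""
    product, target = matmul(U[g], U[h]), U[group_product(g, h)]
    i, j = next((i, j) for i in range(2) for j in range(2) if target[i][j] != 0)
    lam = product[i][j] / target[i][j]
    # Sanity check: the whole matrix product is lam times the target matrix.
    assert all(abs(product[r][c] - lam * target[r][c]) < 1e-12
               for r in range(2) for c in range(2))
    return lam


def is_two_cocycle():
    """Check lambda(g1,g2) lambda(g1 g2,g3) = lambda(g1,g2 g3) lambda(g2,g3)."""
    return all(
        abs(cocycle(g1, g2) * cocycle(group_product(g1, g2), g3)
            - cocycle(g1, group_product(g2, g3)) * cocycle(g2, g3)) < 1e-12
        for g1 in ELEMENTS for g2 in ELEMENTS for g3 in ELEMENTS
    )
```

In this example `cocycle("x", "y")` and `cocycle("y", "x")` differ by a sign although the group is abelian. A redefinition $U'(g) = \mu(g)U(g)$ multiplies both phases by the same factor $\mu(x)\mu(y)/\mu(xy)$, so this particular ray representation cannot be turned into an ordinary representation by a choice of $\mu$.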

2.2.4 Poincaré invariance and one-particle states

In relativistic quantum theory it is believed that the laws of nature are the same for two observers whose reference frames are related by a restricted Poincaré transformation, so every restricted Poincaré transformation on spacetime must give rise to a symmetry of the quantum system. For notational simplicity we will identify a restricted Poincaré transformation $g \in \mathcal{P}_+^\uparrow$ with its associated symmetry by writing $g$ instead of $s(g)$ for the corresponding ray transformation. We can then say that in relativistic quantum theory $\mathcal{P}_+^\uparrow$ must be a subgroup of the symmetry group of the theory. By the remarks in the previous subsection, this means that the action of $\mathcal{P}_+^\uparrow$ on a quantum system is given by a ray representation $U : \mathcal{P}_+^\uparrow \to B(\mathcal{H})$ of $\mathcal{P}_+^\uparrow$ on $\mathcal{H}$. In a sufficiently small neighborhood $N \subset \mathcal{P}_+^\uparrow$ of the identity $1 \in \mathcal{P}_+^\uparrow$, each element $g \in N$ can be written as $g = h^2$ with $h \in \mathcal{P}_+^\uparrow$. But then $U(g) = \lambda(h, h)^{-1}U(h)U(h)$, which is linear and unitary, regardless of whether $U(h)$ is linear and unitary or antilinear and antiunitary. Hence for all $g \in N$ the operator $U(g)$ is linear and unitary. Because each element $g \in \mathcal{P}_+^\uparrow$ can be written as a finite product of elements in $N$ (since $\mathcal{P}_+^\uparrow$ is a connected Lie group), this in turn implies that $U(g)$ is linear and unitary for all $g \in \mathcal{P}_+^\uparrow$. In other words, the action of $\mathcal{P}_+^\uparrow$ on a quantum system is given by a unitary ray representation $U : \mathcal{P}_+^\uparrow \to B(\mathcal{H})$ of $\mathcal{P}_+^\uparrow$ on $\mathcal{H}$.

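Now suppose that $U_{\mathrm{ray}} : \mathcal{P}_+^\uparrow \to B(\mathcal{H})$ is a unitary ray representation of $\mathcal{P}_+^\uparrow$. If $\Phi : \widetilde{\mathcal{P}}_+^\uparrow \to \mathcal{P}_+^\uparrow$ is the covering map, then $\widetilde{U}_{\mathrm{ray}} := U_{\mathrm{ray}} \circ \Phi : \widetilde{\mathcal{P}}_+^\uparrow \to B(\mathcal{H})$ is a unitary ray representation of $\widetilde{\mathcal{P}}_+^\uparrow$. For $\widetilde{U}_{\mathrm{ray}}$ the following theorem applies.

Theorem 2.23 Any unitary ray representation of $\widetilde{\mathcal{P}}_+^\uparrow$ can, by a suitable choice of phase factors, be made into a unitary representation of $\widetilde{\mathcal{P}}_+^\uparrow$.$^{13}$

Thus there exists a function $\mu : \widetilde{\mathcal{P}}_+^\uparrow \to \mathbb{C}$ with $|\mu(\widetilde{g})| = 1$ for all $\widetilde{g} \in \widetilde{\mathcal{P}}_+^\uparrow$ such that $\widetilde{g} \mapsto \mu(\widetilde{g})\widetilde{U}_{\mathrm{ray}}(\widetilde{g})$ is a unitary representation of $\widetilde{\mathcal{P}}_+^\uparrow$; we denote this unitary representation of $\widetilde{\mathcal{P}}_+^\uparrow$ by $U$. The ray representation $U_{\mathrm{ray}}$ of $\mathcal{P}_+^\uparrow$ can thus be described by the unitary representation $U : \widetilde{\mathcal{P}}_+^\uparrow \to B(\mathcal{H})$ of the universal covering group, since $R(U_{\mathrm{ray}}(g)\Psi) = R(U(\widetilde{g})\Psi)$, where $\widetilde{g}$ is one of the two elements of the set $\Phi^{-1}(g)$.

Conversely, each unitary representation of $\widetilde{\mathcal{P}}_+^\uparrow$ also gives rise to a ray representation of $\mathcal{P}_+^\uparrow$. To see this, suppose that $U$ is a unitary representation of $\widetilde{\mathcal{P}}_+^\uparrow$. Because $U(1) = 1_{\mathcal{H}}$ and because $U(-1)U(-1) = U(1) = 1_{\mathcal{H}}$, we must have $U(-1) = \pm 1_{\mathcal{H}}$ whenever $U(-1)$ is a scalar multiple of the identity (since $-1$ is central in $\widetilde{\mathcal{P}}_+^\uparrow$, Schur's lemma guarantees this in particular when $U$ is irreducible). Then for each $\widetilde{g} \in \widetilde{\mathcal{P}}_+^\uparrow$ we have $U(-\widetilde{g}) = U(-1)U(\widetilde{g}) = \pm U(\widetilde{g})$, so $R(U(-\widetilde{g})\Psi) = R(U(\widetilde{g})\Psi)$. This shows that $U$ gives rise to a unitary ray representation $U_{\mathrm{ray}}$ of $\mathcal{P}_+^\uparrow$ by choosing $U_{\mathrm{ray}}(g) \in \{U(\pm\widetilde{g})\}$ for all $g \in \mathcal{P}_+^\uparrow$, where $\widetilde{g}$ is one of the two elements of $\Phi^{-1}(g)$. We can thus conclude that for each unitary ray representation of $\mathcal{P}_+^\uparrow$ there is a unitary representation of $\widetilde{\mathcal{P}}_+^\uparrow$ that gives rise to the same transformation of unit rays of $\mathcal{H}$, and that for each unitary representation of $\widetilde{\mathcal{P}}_+^\uparrow$ there exists a unitary ray representation of $\mathcal{P}_+^\uparrow$ that gives rise to the same transformation of unit rays of $\mathcal{H}$.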
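The sign ambiguity $U(-1) = \pm 1_{\mathcal{H}}$ is already visible in $SL(2,\mathbb{C})$ itself. The sketch below uses the common convention in which $A$ acts on Hermitian matrices by $X \mapsto AXA^*$; whether this coincides in every detail (for instance the sense of rotation) with the covering map $\Phi$ of section 2.1.2 is an assumption made only for this illustration. It shows that $e^{\frac{t}{2}i\sigma_3}$ at $t = 2\pi$ equals $-1$ while the induced rotation is the identity, and that $A$ and $-A$ always induce the same rotation.

```python
import cmath
import math

SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return tuple(
        tuple(complex(a[j][i]).conjugate() for j in range(2)) for i in range(2)
    )


def rotation_element(t):
    """The SL(2,C) element exp(i t sigma_3 / 2)."""
    return ((cmath.exp(0.5j * t), 0), (0, cmath.exp(-0.5j * t)))


def covering_map(a):
    """3x3 matrix R with a sigma_j a^* = sum_i R[i][j] sigma_i (assumed convention)."""
    rows = []
    for i in range(3):
        row = []
        for j in range(3):
            m = matmul(SIGMA[i], matmul(a, matmul(SIGMA[j], dagger(a))))
            row.append(((m[0][0] + m[1][1]) / 2).real)  # (1/2) tr(sigma_i a sigma_j a^*)
        rows.append(row)
    return rows


full_turn = rotation_element(2 * math.pi)   # equals -1 in SL(2,C)
full_turn_rotation = covering_map(full_turn)  # equals the identity rotation
```

A representation that sends `full_turn` to $-1_{\mathcal{H}}$ therefore only descends to a ray representation of the rotation group, which is the situation described in the paragraph above.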

Classification of the irreducible representations of $\widetilde{\mathcal{P}}_+^\uparrow$

We will now study the irreducible unitary representations of $\widetilde{\mathcal{P}}_+^\uparrow$ in some detail. There are several texts where one can find a discussion about these representations, but each of these texts misses some of the information that can be found in one of the other texts. Here we have tried to include as much (relevant) information as possible, based on [1], [3], [2], [8], [20], [29], [32] and [35]. Let $U$ be an irreducible unitary representation of $\widetilde{\mathcal{P}}_+^\uparrow$ in the Hilbert space $\mathcal{H}$. We will always assume that such representations are continuous with respect to the weak operator topology on $B(\mathcal{H})$. So to each element $(a, A) \in \widetilde{\mathcal{P}}_+^\uparrow$ there corresponds a unitary operator $U(a, A)$ on $\mathcal{H}$. Before we proceed we have to choose some convention concerning the physical interpretation of the elements of $SL(2,\mathbb{C})$. In our discussion we will assume that we have chosen a fixed basis $\{e_\mu\}$ of $\mathbb{M}$. We then let $\Phi : SL(2,\mathbb{C}) \to \mathcal{L}_+^\uparrow$ be precisely the (basis-dependent) covering map that was defined in section 2.1.2. So for instance, the element $e^{\frac{t}{2}i\sigma_j}$ corresponds to a spatial rotation around the $x^j$-axis (over an angle $t$) in the chosen basis.

We will now study the representations in several steps.

• Step 1: Decomposition according to the translation subgroup
The abelian subgroup $\{(a, 1)\}_{a\in\mathbb{M}} \subset \widetilde{\mathcal{P}}_+^\uparrow$ of translations gives rise to a continuous 4-parameter unitary group $\{U(a, 1)\}$ of commuting unitary operators on $\mathcal{H}$. The four parameters correspond to the decomposition of the vectors $a \in \mathbb{M}$ with respect to our chosen orthonormal basis $\{e_\mu\}_{\mu=0}^3$ of $\mathbb{M}$. According to a generalization of Stone's theorem on strongly continuous 1-parameter unitary groups in a Hilbert space (see also section X.5 of [5]), called the SNAG theorem (for Stone, Naimark, Ambrose and Godement), there exist four pairwise commuting self-adjoint operators $P^\mu$ ($\mu = 0, 1, 2, 3$) defined on a common dense domain $D_P \subset \mathcal{H}$ such that

$$U(a, 1) = e^{ia\cdot P}$$

for all $a \in \mathbb{M}$, where we have used the notation $P := (P^0, P^1, P^2, P^3)$. The operator $P^\mu$ is called the generator of translations in the $x^\mu$-direction. Under an $SL(2,\mathbb{C})$ transformation these operators transform according to

$$U(0, A)P^\mu U(0, A)^{-1} = \Phi(A)_\nu{}^\mu P^\nu,$$

which follows from

$$U(0, A)e^{ia_\mu P^\mu}U(0, A)^{-1} = U(\Phi(A)a, 1) = e^{ia^\mu\Phi(A)^\nu{}_\mu P_\nu},$$

which implies that $U(0, A)P_\mu U(0, A)^{-1} = \Phi(A)^\nu{}_\mu P_\nu$. Let $E_P$ denote the joint spectral measure of the operators $P^\mu$ as defined in appendix A.2. Then the operators $P^\mu$ and $U(a, 1)$ on $\mathcal{H}$ can be written as

$$P^\mu = \int_{\mathbb{M}} p^\mu\,dE_P(p) \quad\text{and}\quad U(a, 1) = \int_{\mathbb{M}} e^{ia\cdot p}\,dE_P(p).$$

Here we write $\mathbb{M}$ (Minkowski space) instead of $\mathbb{R}^4$, because we need the Minkowski metric $\eta$ on this space. As stated in appendix A.2 we can represent $\mathcal{H}$ as a direct integral

$$\int_{\mathbb{M}}^{\oplus} \mathcal{H}(p)\,d\mu(p)$$

of Hilbert spaces $\mathcal{H}(p)$, corresponding to the operators $P^\mu$. If we denote the decomposition of an element $\Psi \in \mathcal{H}$ with respect to this direct integral decomposition as a function $\Psi(p)$, where $\Psi(p) \in \mathcal{H}(p)$ for all $p \in \mathbb{M}$ and

$$\int_{\mathbb{M}} \|\Psi(p)\|^2_{\mathcal{H}(p)}\,d\mu(p) < \infty,$$

then for each $\Psi \in D_P$ we have $(P^\mu\Psi)(p) = p^\mu\Psi(p)$ and for each $\Psi \in \mathcal{H}$ we have $(U(a, 1)\Psi)(p) = e^{ia\cdot p}\Psi(p)$. Of course one is always free to change the value of the functions $\Psi(p)$ on a set of $\mu$-measure zero, but we will simply ignore this fact in the following. In other words, we always pretend that we have chosen some particular representative in the equivalence class of functions $\Psi(p)$ corresponding to some $\Psi \in \mathcal{H}$.

$^{13}$ This theorem is not true when we replace $\widetilde{\mathcal{P}}_+^\uparrow$ by $\mathcal{P}_+^\uparrow$.

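• Step 2: Lorentz generators
According to Stone's theorem, there exist self-adjoint operators $\{M^j\}_{j=1}^3$ and $\{N^j\}_{j=1}^3$ on $\mathcal{H}$ such that the one-parameter unitary groups $\{U(0, e^{\frac{t}{2}i\sigma_j})\}_{j=1}^3$ and $\{U(0, e^{\frac{t}{2}\sigma_j})\}_{j=1}^3$ can be written as $\{e^{itM^j}\}_{j=1}^3$ and $\{e^{itN^j}\}_{j=1}^3$, respectively. For obvious reasons we will call the operator $M^j$ the generator of a rotation around the $x^j$-axis and the operator $N^j$ the generator of a Lorentz boost in the $x^j$-direction. Now define a set of operators $\{M^{\mu\nu}\}_{\mu,\nu=0}^3$ by

$$\begin{aligned}
M^{\mu\nu} &= -M^{\nu\mu} \\
M^{jk} &= M^l \\
M^{0j} &= N^j,
\end{aligned}$$

where in the second line $(j, k, l)$ is a cyclic permutation of $(1, 2, 3)$. The operators $M^{\mu\nu}$ satisfy

$$U(0, A)M^{\mu\nu}U(0, A)^{-1} = \Phi(A)_\rho{}^\mu\,\Phi(A)_\sigma{}^\nu M^{\rho\sigma}.$$

It follows from the definition of $M^{\mu\nu}$ above that the operators $iM^{\mu\nu}$ and $iP^\mu$ satisfy the same commutation relations as the Lie algebra basis elements $X^{\mu\nu}$ and $Y^\mu$, so

$$\begin{aligned}
[M^{\mu\nu}, M^{\rho\sigma}] &= -i(\eta^{\mu\rho}M^{\sigma\nu} + \eta^{\nu\rho}M^{\mu\sigma} + \eta^{\nu\sigma}M^{\rho\mu} + \eta^{\mu\sigma}M^{\nu\rho}) \\
[M^{\mu\nu}, P^\rho] &= i(\eta^{\nu\rho}P^\mu - \eta^{\mu\rho}P^\nu) \\
[P^\mu, P^\nu] &= 0.
\end{aligned}$$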


An immediate consequence of these relations is that the operator $P^2 := P_\mu P^\mu$ commutes with all generators $P^\mu$ and $M^{\mu\nu}$. Because $U$ is an irreducible representation of $\widetilde{\mathcal{P}}_+^\uparrow$ on $\mathcal{H}$, it follows from Schur's lemma that $P^2$ is a scalar multiple of the identity operator, i.e. $P^2 = c_1 1_{\mathcal{H}}$. In particular, this implies that the measure $\mu$ is supported in a subset of $\mathbb{M}$ of which all elements $p$ have the same value of $p \cdot p$, but we will come back to this, in more detail, in step 5. From the generators $P^\mu$ and $M^{\mu\nu}$ we can construct four new operators

$$W_\mu := -\frac{1}{2}\varepsilon_{\mu\nu\rho\sigma}M^{\nu\rho}P^\sigma,$$

called Pauli-Lubanski operators. Here $\varepsilon_{\mu\nu\rho\sigma}$ is the completely antisymmetric tensor, normalized such that $\varepsilon^{0123} = -\varepsilon_{0123} = 1$; in some textbooks (for example in [2]) the sign of this antisymmetric tensor is reversed, i.e. $\varepsilon^{0123} = -\varepsilon_{0123} = -1$, and in those texts there is no minus sign in the definition of $W_\mu$. The Pauli-Lubanski operators satisfy

$$P_\mu W^\mu = 0.$$

The Pauli-Lubanski operators all commute with the operators $P^\mu$, i.e. $[W^\mu, P^\nu] = 0$, so the action of $W^\mu$ on a vector $\Psi \in \mathcal{H}$ may be described by $(W^\mu\Psi)(p) = W^\mu(p)\Psi(p)$ for some set of operators $\{W^\mu(p) : \mathcal{H}(p) \to \mathcal{H}(p)\}_{p\in\mathbb{M}}$. Also, the operator $W^2 = W_\mu W^\mu$ commutes with all generators $P^\mu$ and $M^{\mu\nu}$, so it must be a scalar multiple of the identity operator, i.e. $W^2 = c_2 1_{\mathcal{H}}$.
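The claim in step 2 that $P^2$ commutes with the generators $M^{\mu\nu}$ can be verified by a short formal computation from the commutation relations listed in that step. Domain questions for the unbounded operators are ignored here, which is an assumption of this sketch rather than something addressed in the text.

```latex
\begin{align*}
[M^{\mu\nu}, P_\rho P^\rho]
  &= [M^{\mu\nu}, P_\rho]\,P^\rho + P_\rho\,[M^{\mu\nu}, P^\rho] \\
  &= i\left(\delta^\nu_\rho P^\mu - \delta^\mu_\rho P^\nu\right)P^\rho
     + i\,P_\rho\left(\eta^{\nu\rho}P^\mu - \eta^{\mu\rho}P^\nu\right) \\
  &= i\left(P^\mu P^\nu - P^\nu P^\mu\right) + i\left(P^\nu P^\mu - P^\mu P^\nu\right) \\
  &= 0 .
\end{align*}
```

The second line uses $[M^{\mu\nu}, P_\rho] = \eta_{\rho\lambda}[M^{\mu\nu}, P^\lambda]$, and each bracket in the third line already vanishes separately because $[P^\mu, P^\nu] = 0$. That $P^2$ commutes with every $P^\mu$ follows directly from the same relation.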

• Step 3: The action of $SL(2,\mathbb{C})$ on the spectral measures
Because $U(0, A)U(a, 1)U(0, A)^* = U((0, A)(a, 1)(0, A^{-1})) = U(\Phi(A)a, 1)$ for all $A \in SL(2,\mathbb{C})$, the spectral measures (defined in step 1) satisfy

$$U(0, A)E_P(\Delta)U(0, A)^* = E_P(\Phi(A)\Delta) \tag{2.21}$$

for all Borel sets $\Delta \subset \mathbb{M}$. Because $U(0, A)^*$ is unitary, we have $U(0, A)^*\mathcal{H} = \mathcal{H}$, so (2.21) implies that

$$U(0, A)E_P(\Delta)\mathcal{H} = E_P(\Phi(A)\Delta)\mathcal{H}.$$

If we use the notation $\mathcal{H}_\Delta := E_P(\Delta)\mathcal{H}$, then this can be rewritten as

$$U(0, A)\mathcal{H}_\Delta = \mathcal{H}_{\Phi(A)\Delta}. \tag{2.22}$$

In particular, this implies that we are free to identify the spaces $\mathcal{H}(p_1)$ and $\mathcal{H}(p_2)$ with each other if $p_1$ and $p_2$ are related by a restricted Lorentz transformation.

• Step 4: The support of the measure $\mu$ is an $\mathcal{L}_+^\uparrow$-invariant subset of $\mathbb{M}$
Because $U(0, A)$ is unitary, we have for each Borel set $\Delta \subset \mathbb{M}$ and for all vectors $\Psi_1, \Psi_2 \in \mathcal{H}_\Delta$

$$\begin{aligned}
\int_\Delta \langle\Psi_1(p),\Psi_2(p)\rangle_{\mathcal{H}(p)}\,d\mu(p) &= \langle\Psi_1,\Psi_2\rangle \\
&= \langle U(0, A)\Psi_1, U(0, A)\Psi_2\rangle \\
&= \int_{\Phi(A)\Delta} \langle (U(0, A)\Psi_1)(p), (U(0, A)\Psi_2)(p)\rangle_{\mathcal{H}(p)}\,d\mu(p),
\end{aligned}$$

where $\langle\cdot,\cdot\rangle_{\mathcal{H}(p)}$ denotes the inner product on $\mathcal{H}(p)$. Now let $q \in \mathbb{M}$ be a point outside the support of the measure $\mu$, which means that there exists an open neighborhood $V_q$ of $q$ with $\mu(V_q) = 0$. For all $\Psi_1, \Psi_2 \in \mathcal{H}_{V_q}$ we then have

$$0 = \int_{V_q} \langle\Psi_1(p),\Psi_2(p)\rangle_{\mathcal{H}(p)}\,d\mu(p) = \int_{\Phi(A)V_q} \langle (U(0, A)\Psi_1)(p), (U(0, A)\Psi_2)(p)\rangle_{\mathcal{H}(p)}\,d\mu(p).$$

In particular, when we choose $\Psi_1$ and $\Psi_2$ such that the integrand on the right-hand side is strictly positive on $\Phi(A)V_q$, we find that $\mu(\Phi(A)V_q) = 0$. But $\Phi(A)V_q$ is an open neighborhood of the point $\Phi(A)q$, so the point $\Phi(A)q$ also lies outside the support of $\mu$. Because this is true for all $A \in SL(2,\mathbb{C})$ and because $\Phi(SL(2,\mathbb{C})) = \mathcal{L}_+^\uparrow$, it follows that all elements of the form $Lq$ with $L \in \mathcal{L}_+^\uparrow$ are outside the support of $\mu$. This shows that the support of $\mu$ must in fact be invariant under $\mathcal{L}_+^\uparrow$.

• Step 5: Orbits and their relation to irreducible representations
For $p \in \mathbb{M}$ we call the set $\{Lp\}_{L\in\mathcal{L}_+^\uparrow}$ the orbit of the point $p$. We can define an equivalence relation on $\mathbb{M}$ by defining two points in $\mathbb{M}$ to be equivalent if and only if they have the same orbit; in particular, this gives rise to a partition of $\mathbb{M}$ into disjoint orbits. It is clear that $\{0\}$ is an orbit. We will now characterize all other orbits, so in the following discussion the orbits are assumed to be different from $\{0\}$. Note that the elements of one orbit all have the same value of $p^2 = p \cdot p$. When this value is nonnegative, we can write it as $p^2 = m^2$ with $m \geq 0$. In case this value is negative, we can write it as $p^2 = (im)^2$ with $m > 0$. So to each orbit we assign a number $m$ (or $im$) in this manner. In the nonnegative case, it follows from our results of section 2.1.2 that for all elements $p$ in the same orbit the component $p^0$ has the same sign (note that $p^0 \neq 0$ because we have excluded the orbit $\{0\}$ from our discussion), so to orbits with $p^2 \geq 0$ we can also assign a sign $\epsilon \in \{+, -\}$. The label $(m, \epsilon)$ completely characterizes the orbits with $p^2 \geq 0$. For an orbit for which $p^2 < 0$, the label $im$ completely characterizes the orbit. Thus $\mathbb{M}$ is partitioned into the following orbits (where $m \geq 0$ for $O_m^\pm$ and $m > 0$ for $O_{im}$):

$$\begin{aligned}
O_m^+ &= \{p \in \mathbb{M} : p^2 = m^2,\ p^0 > 0\}; \\
O_m^- &= \{p \in \mathbb{M} : p^2 = m^2,\ p^0 < 0\}; \\
O_{im} &= \{p \in \mathbb{M} : p^2 = -m^2\}; \\
&\{0\}.
\end{aligned}$$
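The partition into orbits can be phrased as a small classification routine. The sketch below assumes the metric signature $(+,-,-,-)$, which matches $p^2 = m^2 \geq 0$ for timelike and lightlike $p$ in the text; the function names and the string labels are illustrative only.

```python
import math


def minkowski_square(p):
    """p . p for p = (p0, p1, p2, p3) with signature (+, -, -, -) (assumed)."""
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2


def orbit_label(p, tol=1e-12):
    """Return (kind, m), where kind is '+', '-', 'im' or 'zero'."""
    if all(abs(x) <= tol for x in p):
        return ("zero", 0.0)                 # the orbit {0}
    p2 = minkowski_square(p)
    if p2 < -tol:
        return ("im", math.sqrt(-p2))        # spacelike: p^2 = -m^2 with m > 0
    mass = math.sqrt(max(p2, 0.0))           # timelike or lightlike: p^2 = m^2
    return ("+" if p[0] > 0 else "-", mass)  # sign of p0 is Lorentz invariant here
```

For example, `orbit_label((5, 3, 0, 4))` gives the label of the forward light cone $O_0^+$, while `orbit_label((0, 1, 0, 0))` falls in a spacelike orbit $O_{im}$ with $m = 1$.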

Because the support of the measure $\mu$ is $\mathcal{L}_+^\uparrow$-invariant, the support of $\mu$ is the union of complete orbits in $\mathbb{M}$. If it contains two or more orbits, then we can always construct an open subset $W$ that is invariant under $\mathcal{L}_+^\uparrow$ and is such that it contains at least one orbit in the support of $\mu$ and such that it excludes at least one orbit in the support of $\mu$. Using the results that we derived earlier, we then find that

$$U(0, A)\mathcal{H}_W = \mathcal{H}_{\Phi(A)W} = \mathcal{H}_W$$

and

$$U(a, 1)\mathcal{H}_W = \left(\int_{\mathbb{M}} e^{ia\cdot p}\,dE_P(p)\right)E_P(W)\mathcal{H} = \int_W e^{ia\cdot p}\,dE_P(p)\,\mathcal{H} \subset \mathcal{H}_W,$$

so $\mathcal{H}_W$ is an invariant subspace of $\mathcal{H}$, contradicting the irreducibility of $U$. We thus conclude that the support of $\mu$ consists of exactly one orbit and the operator $P^2 = P_\mu P^\mu = c_1 1_{\mathcal{H}}$ that we defined in step 2 is given by

$$P^2 = m^2 1_{\mathcal{H}}.$$

Therefore each irreducible unitary representation of $\widetilde{\mathcal{P}}_+^\uparrow$ is (partly) characterized by the labels of the corresponding orbit. In particular, $\mathcal{H}$ can be represented as a direct integral

$$\int_O^{\oplus} \mathcal{H}(p)\,d\mu(p),$$

where $O$ denotes the corresponding orbit of the irreducible representation $U$. As we shall see later, the only orbits of physical relevance are the orbits $O_m^+$ (with $m \geq 0$) and $\{0\}$, so from now on we will only consider these orbits.

• Step 6: Representations corresponding to the orbit $\{0\}$
In this case the irreducible representations are either one-dimensional (the trivial representation) or infinite-dimensional. In physics, only the trivial representation will be relevant:

$$U(a, A)\Psi = \Psi$$

for all $\Psi \in \mathcal{H}$, where $\mathcal{H}$ is one-dimensional.

• Step 7: Representations corresponding to the orbits $O_m^+$ with $m \geq 0$
We now fix some orbit $O_m^+$ and derive some general properties of the corresponding irreducible representations. The Hilbert space $\mathcal{H}$ can be decomposed according to

$$\mathcal{H} = \int_{O_m^+}^{\oplus} \mathcal{H}(p)\,d\mu_m(p).$$

It can be shown that the requirements that the support of the measure $\mu_m$ must equal $O_m^+$ and that $\mu_m$ must be $\mathcal{L}_+^\uparrow$-invariant determine $\mu_m$ uniquely up to a positive factor. We will choose this factor in such a way that the measure is given (on $O_m^+$) by

$$d\mu_m(p) = \frac{d^3\mathbf{p}}{2p^0}, \tag{2.23}$$

where we write $p = (p^0, \mathbf{p})$ for all $p \in O_m^+$, which of course implies that $p^0 = \sqrt{m^2 + \mathbf{p}^2}$. A nice property of this normalization of $\mu_m$ is that for each function $f : \mathbb{M} \to \mathbb{R}$ we have

$$\int_{\mathbb{M}} \delta(p^2 - m^2)\,\theta(p^0)f(p)\,d^4p = \int_{O_m^+} f\left(\sqrt{m^2 + \mathbf{p}^2}, \mathbf{p}\right)d\mu_m(p),$$

where $\theta : \mathbb{R} \to \{0, 1\}$ denotes the step function. Note that the elements in our Hilbert space can now be represented as $\left(\bigcup_{p\in O_m^+}\mathcal{H}(p)\right)$-valued functions $\Psi$ on $O_m^+$ with $\Psi(p) \in \mathcal{H}(p)$ for all $p \in O_m^+$ and

$$\int_{O_m^+} \|\Psi(p)\|^2_{\mathcal{H}(p)}\,d\mu_m(p) < \infty.$$
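The Lorentz invariance of the measure (2.23) can be illustrated for a boost along the $z$-axis: the transverse components of $\mathbf{p}$ are unchanged, and the Jacobian $\partial p'_z/\partial p_z$ equals $p'^0/p^0$, so that $d^3\mathbf{p}'/p'^0 = d^3\mathbf{p}/p^0$. The sketch below checks this with a finite difference; the rapidity parametrization of the boost and all function names are assumptions of the illustration.

```python
import math


def energy(mass, px, py, pz):
    """p0 on the mass shell O_m^+."""
    return math.sqrt(mass ** 2 + px ** 2 + py ** 2 + pz ** 2)


def boosted_pz(mass, px, py, pz, rapidity):
    """z-component of the momentum after a boost along z (px, py are unchanged)."""
    return pz * math.cosh(rapidity) + energy(mass, px, py, pz) * math.sinh(rapidity)


def boosted_energy(mass, px, py, pz, rapidity):
    """p0 after the same boost."""
    return energy(mass, px, py, pz) * math.cosh(rapidity) + pz * math.sinh(rapidity)


def jacobian(mass, px, py, pz, rapidity, h=1e-6):
    """Central-difference estimate of d(pz')/d(pz), the determinant of p -> p'."""
    upper = boosted_pz(mass, px, py, pz + h, rapidity)
    lower = boosted_pz(mass, px, py, pz - h, rapidity)
    return (upper - lower) / (2 * h)


args = (1.0, 0.3, -0.2, 0.5, 0.8)  # mass, px, py, pz, rapidity
ratio = boosted_energy(*args) / energy(*args[:4])
```

Up to the finite-difference error, `jacobian(*args)` agrees with `ratio`, which is the statement that the factor $1/p^0$ in (2.23) compensates the change of volume under the boost. Invariance under rotations is immediate, since they preserve both $d^3\mathbf{p}$ and $p^0$.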

Because $U(0, A)\mathcal{H}_\Delta = \mathcal{H}_{\Phi(A)\Delta}$ for all Borel sets $\Delta$ and because $\Phi(A) : \mathbb{M} \to \mathbb{M}$ is bijective, we can now define for each $p \in O_m^+$ in the orbit the vector space isomorphisms (but not Hilbert space isomorphisms) $U_{p\to\Phi(A)p}(0, A) : \mathcal{H}(p) \to \mathcal{H}(\Phi(A)p)$ such that

$$U_{p\to\Phi(A)p}(0, A)(\Psi(p)) := (U(0, A)\Psi)(\Phi(A)p) \tag{2.24}$$

for all $p \in O_m^+$. Sometimes we will simply write $U_p(0, A)$ instead of $U_{p\to\Phi(A)p}(0, A)$ to save some space whenever the equations get too long. Because $U(0, A)$ is unitary, these mappings are indeed isomorphisms of vector spaces for all $p \in O_m^+$. They are not isomorphisms of Hilbert spaces (i.e. inner product preserving) because of the $p^0$ in the denominator of equation (2.23); however, the map $\sqrt{\frac{p^0}{(\Phi(A)p)^0}}\,U_{p\to\Phi(A)p}(0, A) : \mathcal{H}(p) \to \mathcal{H}(\Phi(A)p)$ is in fact a Hilbert space isomorphism. This follows from the fact that

$$\|\Psi(p)\|^2_{\mathcal{H}(p)}\frac{d^3\mathbf{p}}{2p^0} = \|U_{p\to\Phi(A)p}(0, A)\Psi(p)\|^2_{\mathcal{H}(\Phi(A)p)}\frac{d^3\mathbf{p}}{2(\Phi(A)p)^0}$$

(which follows from the unitarity of $U(0, A)$), which in turn implies that

$$\|\Psi(p)\|^2_{\mathcal{H}(p)} = \frac{p^0}{(\Phi(A)p)^0}\|U_{p\to\Phi(A)p}(0, A)\Psi(p)\|^2_{\mathcal{H}(\Phi(A)p)} = \left\|\sqrt{\frac{p^0}{(\Phi(A)p)^0}}\,U_{p\to\Phi(A)p}(0, A)\Psi(p)\right\|^2_{\mathcal{H}(\Phi(A)p)}.$$

We will use the Hilbert space isomorphisms $\sqrt{\frac{p^0}{(\Phi(A)p)^0}}\,U_{p\to\Phi(A)p}(0, A)$ later when we will define orthonormal bases on the spaces $\mathcal{H}(p)$.

Note that the (vector space) isomorphisms $U_{p\to\Phi(A)p}(0, A)$ satisfy

$$U_{\Phi(AB)^{-1}p}(0, AB)(\Psi(\Phi(AB)^{-1}p)) = U_{\Phi(A)^{-1}p}(0, A)\,U_{\Phi(AB)^{-1}p}(0, B)(\Psi(\Phi(AB)^{-1}p)).$$


In particular, if $\Phi(A), \Phi(B) \in \mathcal{L}_+^\uparrow$ are such that $\Phi(A)p = \Phi(B)p = p$, then this becomes

$$U_{p\to p}(0, AB)(\Psi(p)) = U_{p\to p}(0, A)\,U_{p\to p}(0, B)(\Psi(p)). \qquad (2.25)$$

Now fix an element $k \in O_m^+$ in the orbit and choose$^{14}$ an orthonormal basis $\{e_\sigma(k)\}_\sigma$ of $H(k)$. We will now use the operators $U_{p\to\Phi(A)p}(0, A)$ to define an orthonormal basis for the other points of $O_m^+$. First we fix for each $p \in O_m^+$ an element $M_p \in SL(2,\mathbb{C})$ such that $p = \Phi(M_p)k$. Then for each $p \in O_m^+$ we define an orthonormal basis $\{e_\sigma(p)\}_\sigma$ of $H(p)$ by

$$e_\sigma(p) := \sqrt{\frac{k^0}{p^0}}\,U_{k\to p}(0, M_p)\,e_\sigma(k).$$

Here we use that $\sqrt{\frac{k^0}{p^0}}\,U_{k\to p}(0, M_p)$ is a Hilbert space isomorphism, as shown above. With this basis, we can write each $\Psi(p) \in H(p)$ as $\Psi(p) = \sum_\sigma \Psi(p, \sigma)e_\sigma(p)$, and we can identify a vector $\Psi \in H$ with the function $\Psi(p, \sigma)$. We will see in a moment that the spaces $H(p)$ are finite-dimensional in all physically relevant cases, so the index $\sigma$ takes on a finite number of values. The Hilbert space $H$ can thus be realized as a finite direct sum $\bigoplus_\sigma L^2(O_m^+, \mu_m)$ of copies of $L^2(O_m^+, \mu_m)$. The inner product is thus given by

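$$\begin{aligned}
\langle \Psi_1(p, \sigma), \Psi_2(p, \sigma)\rangle &= \sum_\sigma \int_{O_m^+} \Psi_1(p, \sigma)\,\overline{\Psi_2(p, \sigma)}\, d\mu_m(p) \\
&= \sum_\sigma \int_{\mathbb{R}^3} \Psi_1((p^0, \mathbf{p}), \sigma)\,\overline{\Psi_2((p^0, \mathbf{p}), \sigma)}\, \frac{d^3p}{2p^0},
\end{aligned}$$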

where in the last line $p^0 = \sqrt{m^2 + \mathbf{p}^2}$. Of course we could just as well have defined functions $\Psi(\mathbf{p}, \sigma)$ with $\mathbf{p}$ in $\mathbb{R}^3$ (or $\mathbb{R}^3 \setminus \{0\}$ for massless particles) and realized $H$ as $\bigoplus_\sigma L^2(\mathbb{R}^3, \frac{d^3p}{2p^0})$, and we will in fact do this later. However, for the moment we will stick with $\Psi(p, \sigma)$ for notational convenience.

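Now that we have defined these bases of $H(p)$ (given some basis of $H(k)$), we can express the action of $U_{p\to\Phi(A)p}(0, A)$ for any $A \in SL(2,\mathbb{C})$ as

$$\begin{aligned}
U_{p\to\Phi(A)p}(0, A)e_\sigma(p) &= \sqrt{\frac{k^0}{p^0}}\,U_{p\to\Phi(A)p}(0, A)\,U_{k\to p}(0, M_p)e_\sigma(k) \\
&= \sqrt{\frac{k^0}{p^0}}\,U_{k\to\Phi(A)p}(0, AM_p)e_\sigma(k) \\
&= \sqrt{\frac{k^0}{p^0}}\,U_{k\to\Phi(A)p}(0, M_{\Phi(A)p}M^{-1}_{\Phi(A)p}AM_p)e_\sigma(k) \\
&= \sqrt{\frac{k^0}{p^0}}\,U_{k\to\Phi(A)p}(0, M_{\Phi(A)p})\,U_{k\to k}(0, M^{-1}_{\Phi(A)p}AM_p)e_\sigma(k), \qquad (2.26)
\end{aligned}$$

where in the last step we used that

$$\Phi(M^{-1}_{\Phi(A)p}AM_p)k = \Phi(M^{-1}_{\Phi(A)p})\Phi(A)\Phi(M_p)k = \Phi(M^{-1}_{\Phi(A)p})\Phi(A)p = k.$$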

To understand (2.26) we introduce some terminology. For any point $p \in O_m^+$ in the orbit we define a subgroup $G_p \subset SL(2,\mathbb{C})$ by

$$G_p := \{A \in SL(2,\mathbb{C}) : \Phi(A)p = p\}.$$

$^{14}$ In steps 7a and 7b we will show how to choose such a basis for the cases $m > 0$ and $m = 0$ separately.


Clearly, if $p'$ is another point in the orbit $O_m^+$ and $A \in SL(2,\mathbb{C})$ is such that $\Phi(A)p = p'$, then the groups $G_p$ and $G_{p'}$ are isomorphic and the isomorphism from $G_p$ to $G_{p'}$ is given by $B \mapsto ABA^{-1}$. The isomorphic subgroups $\{G_p\}_{p \in O_m^+}$ are called the little group of the orbit $O_m^+$.
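The claim that $M^{-1}_{\Phi(A)p}AM_p$ stabilizes $k$ can be checked numerically. The following is a minimal sketch for $m > 0$ and $k = (m, 0, 0, 0)$, where the stabilizer consists of unitary matrices. It assumes the convention $\psi(\Phi(A)x) = A\psi(x)A^*$ with $\psi(x)$ as in the massless step below, and it takes the positive hermitian square root of $\psi(p)/m$ as one possible choice of $M_p$. The function names and this choice of $M_p$ are illustrative assumptions, not part of the text.

```python
import math

def mul(a, b):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose A*."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def inverse_sl2(a):
    """Inverse of a 2x2 matrix with determinant 1."""
    return [[a[1][1], -a[0][1]], [-a[1][0], a[0][0]]]

def psi(x):
    """Hermitian matrix psi(x) of a four-vector x = (x0, x1, x2, x3)."""
    x0, x1, x2, x3 = x
    return [[complex(x0 + x3), complex(x1, -x2)],
            [complex(x1, x2), complex(x0 - x3)]]

def psi_inverse(h):
    """Four-vector that corresponds to a hermitian 2x2 matrix h."""
    return ((h[0][0].real + h[1][1].real) / 2, h[1][0].real, h[1][0].imag,
            (h[0][0].real - h[1][1].real) / 2)

def act(a, x):
    """Phi(A)x, defined through psi(Phi(A)x) = A psi(x) A*."""
    return psi_inverse(mul(mul(a, psi(x)), dagger(a)))

def standard_boost(p, m):
    """Assumed choice of M_p: the positive square root of psi(p)/m.

    For a positive 2x2 matrix P with det P = 1 the square root is
    (P + 1) / sqrt(tr P + 2), and tr(psi(p)/m) = 2 p0 / m.
    """
    h = psi(p)
    norm = math.sqrt(2 * p[0] / m + 2)
    return [[(h[i][j] / m + (1 if i == j else 0)) / norm for j in range(2)]
            for i in range(2)]

def little_group_element(a, p, m):
    """M_{Phi(A)p}^{-1} A M_p for the choice of M_p made above."""
    q = act(a, p)
    return mul(mul(inverse_sl2(standard_boost(q, m)), a), standard_boost(p, m))

def unitarity_defect(w):
    """Largest entry of |W W* - 1|; zero exactly when W is unitary."""
    u = mul(w, dagger(w))
    return max(abs(u[i][j] - (1 if i == j else 0))
               for i in range(2) for j in range(2))
```

For any $A \in SL(2,\mathbb{C})$ and any $p$ on the mass shell, `unitarity_defect(little_group_element(a, p, m))` should vanish up to rounding, and `act` applied to the resulting matrix should leave $(m, 0, 0, 0)$ fixed.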

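In this terminology, the transformation $M^{-1}_{\Phi(A)p}AM_p \in SL(2,\mathbb{C})$ in (2.26) is an element of the little group $G_k$. In fact, it follows from (2.25) that $U$ induces a unitary representation of $G_k$ on the Hilbert space $H(k)$ by $A \mapsto U_{k\to k}(0, A)$ for $A \in G_k$. For $A \in G_k$ we write $[U_{k\to k}(0, A)]_{\sigma,\sigma'}$ for the matrix components of the unitary operator $U_{k\to k}(0, A)$ on $H(k)$ with respect to the orthonormal basis $\{e_\sigma(k)\}$. With this notation we can write (2.26) as

$$\begin{aligned}
U_{p\to\Phi(A)p}(0, A)e_\sigma(p) &= \sqrt{\frac{k^0}{p^0}}\,U_{k\to\Phi(A)p}(0, M_{\Phi(A)p})\left(\sum_{\sigma'} [U_{k\to k}(0, M^{-1}_{\Phi(A)p}AM_p)]_{\sigma',\sigma}\,e_{\sigma'}(k)\right) \\
&= \sqrt{\frac{k^0}{p^0}} \sum_{\sigma'} [U_{k\to k}(0, M^{-1}_{\Phi(A)p}AM_p)]_{\sigma',\sigma}\, \underbrace{U_{k\to\Phi(A)p}(0, M_{\Phi(A)p})e_{\sigma'}(k)}_{= \sqrt{\frac{(\Phi(A)p)^0}{k^0}}\, e_{\sigma'}(\Phi(A)p)} \\
&= \sqrt{\frac{(\Phi(A)p)^0}{p^0}} \sum_{\sigma'} [U_{k\to k}(0, M^{-1}_{\Phi(A)p}AM_p)]_{\sigma',\sigma}\,e_{\sigma'}(\Phi(A)p). \qquad (2.27)
\end{aligned}$$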

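Using (2.24) with $\Phi(A)^{-1}p$ instead of $p$, we then find that

$$\begin{aligned}
(U(0, A)\Psi)(p) &= U_{\Phi(A)^{-1}p\to p}(0, A)(\Psi(\Phi(A)^{-1}p)) \\
&= \sum_\sigma \Psi(\Phi(A)^{-1}p, \sigma)\,U_{\Phi(A)^{-1}p\to p}(0, A)e_\sigma(\Phi(A)^{-1}p) \\
&= \sum_\sigma \Psi(\Phi(A)^{-1}p, \sigma)\sqrt{\frac{p^0}{(\Phi(A)^{-1}p)^0}} \sum_{\sigma'} [U_{k\to k}(0, M_p^{-1}AM_{\Phi(A)^{-1}p})]_{\sigma',\sigma}\,e_{\sigma'}(p) \\
&= \sum_{\sigma'} \left(\sqrt{\frac{p^0}{(\Phi(A)^{-1}p)^0}} \sum_\sigma [U_{k\to k}(0, M_p^{-1}AM_{\Phi(A)^{-1}p})]_{\sigma',\sigma}\,\Psi(\Phi(A)^{-1}p, \sigma)\right) e_{\sigma'}(p).
\end{aligned}$$

Because we also have $(U(0, A)\Psi)(p) = \sum_\sigma (U(0, A)\Psi)(p, \sigma)e_\sigma(p)$, we thus conclude that

$$(U(0, A)\Psi)(p, \sigma) = \sqrt{\frac{p^0}{(\Phi(A)^{-1}p)^0}} \sum_{\sigma'} [U_{k\to k}(0, M_p^{-1}AM_{\Phi(A)^{-1}p})]_{\sigma,\sigma'}\,\Psi(\Phi(A)^{-1}p, \sigma').$$

This shows that the action of $U(0, A)$ on $H$ is completely determined once we know the action of $U_{k\to k}(0, A)$ on $H(k)$ for all little group elements $A \in G_k$. In other words, we have reduced the problem of finding irreducible unitary representations of $\mathcal{P}_+^\uparrow$ to the problem of finding irreducible unitary representations of $G_k$ on the Hilbert space $H(k)$. We will now briefly discuss these representations of $G_k$. We separate two cases: $m > 0$ and $m = 0$.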

• Step 7a: m > 0
If $m > 0$, then the little group $G_k$ is $SU(2)$, the double cover of the rotation group $SO(3)$. This can be seen easily by choosing $k = (m, 0, 0, 0)$. The only restricted Lorentz transformations that leave $(m, 0, 0, 0)$ invariant are rotations, i.e. elements of $SO(3)$. (Note that these are the only restricted Lorentz transformations that leave the zeroth component of a vector invariant, so there cannot be any other restricted Lorentz transformations that leave $(m, 0, 0, 0)$ invariant.) Therefore, the little group in this case is $G_k = \Phi^{-1}(SO(3)) = SU(2)$. This can also be seen directly. The image of $k$ under the map $\psi : \mathbb{M} \to H(2,\mathbb{C})$ is $\psi(k) = m1_{\mathbb{C}^2}$, so an element $A \in SL(2,\mathbb{C})$ is in $G_k$ if and only if $m1_{\mathbb{C}^2} = A(m1_{\mathbb{C}^2})A^* = mAA^*$, or $A^{-1} = A^*$. This shows that $A \in SL(2,\mathbb{C})$ must be unitary, i.e. $A \in SU(2)$ and hence $G_k = SU(2)$.

The irreducible representations of $SU(2)$ are finite-dimensional and are labelled by the parameter $s \in \{0, \frac{1}{2}, 1, \frac{3}{2}, \ldots\}$, where $2s + 1$ is the dimension of the representation. Because $SU(2)$ is a simply-connected Lie group, all these representations can be characterized by the irreducible $(2s + 1)$-dimensional representations $D^{(s)} : \mathfrak{su}(2) \to \mathrm{End}(V_{2s+1})$ of its Lie algebra $\mathfrak{su}(2)$. If we choose a basis of the $(2s + 1)$-dimensional vector space $V_{2s+1}$ in which $D^{(s)}(\frac{1}{2i}\sigma_3)$ is diagonal, we can specify $D^{(s)}$ by

$$\begin{aligned}
\left[D^{(s)}\left(\tfrac{1}{2i}\sigma_3\right)\right]_{\sigma'\sigma} &= -i\sigma\,\delta_{\sigma',\sigma} \\
\left[D^{(s)}\left(\tfrac{1}{2i}\sigma_1\right)\right]_{\sigma'\sigma} &= -\tfrac{i}{2}\left(\delta_{\sigma',\sigma+1}\sqrt{(s - \sigma)(s + \sigma + 1)} + \delta_{\sigma',\sigma-1}\sqrt{(s + \sigma)(s - \sigma + 1)}\right) \\
\left[D^{(s)}\left(\tfrac{1}{2i}\sigma_2\right)\right]_{\sigma'\sigma} &= -\tfrac{1}{2}\left(\delta_{\sigma',\sigma+1}\sqrt{(s - \sigma)(s + \sigma + 1)} - \delta_{\sigma',\sigma-1}\sqrt{(s + \sigma)(s - \sigma + 1)}\right)
\end{aligned}$$

where the row index $\sigma'$ and the column index $\sigma$ run from $s$ to $-s$; for $s = \frac{1}{2}$ these formulas reproduce the matrices $\frac{1}{2i}\sigma_j$ themselves. We will denote the representation of $SU(2)$ corresponding to the representation $D^{(s)}$ of $\mathfrak{su}(2)$ by $\mathcal{D}^{(s)}$. We thus conclude that the Hilbert space $H(k)$ is equal to $\mathbb{C}^{2s+1}$ for some $s \in \frac{1}{2}\mathbb{Z}_{\geq 0}$ and that $U_{k\to k}(0, A) = \mathcal{D}^{(s)}(A)$ for $A \in SU(2)$. Note that we have implicitly chosen the orthonormal basis $\{e_\sigma(k)\}_\sigma$ on $H(k) = \mathbb{C}^{2s+1}$ to be a set of eigenvectors of $D^{(s)}(\frac{1}{2i}\sigma_3)$, and (as described above) this also defines orthonormal bases $\{e_\sigma(p)\}_\sigma$ for $H(p) = \mathbb{C}^{2s+1}$ at all other points $p \in O_m^+$. The representation $U$ is now given by

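$$U_{p\to\Phi(A)p}(0, A)e_\sigma(p) = \sqrt{\frac{(\Phi(A)p)^0}{p^0}} \sum_{\sigma'} [\mathcal{D}^{(s)}(M^{-1}_{\Phi(A)p}AM_p)]_{\sigma'\sigma}\,e_{\sigma'}(\Phi(A)p).$$

In terms of the functions $\Psi(p, \sigma)$ this reads

$$(U(0, A)\Psi)(p, \sigma) = \sqrt{\frac{p^0}{(\Phi(A)^{-1}p)^0}} \sum_{\sigma'} [\mathcal{D}^{(s)}(M_p^{-1}AM_{\Phi(A)^{-1}p})]_{\sigma,\sigma'}\,\Psi(\Phi(A)^{-1}p, \sigma').$$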

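On $H(k) \simeq \mathbb{C}^{2s+1}$, we define the hermitian operators

$$S^{(s),j}(k) = iD^{(s)}\left(\frac{1}{2i}\sigma_j\right) \qquad (2.28)$$

for $j = 1, 2, 3$, which we will call the spin operators at the point $k$. They satisfy

$$[\mathbf{S}^{(s)}(k)]^2 = \sum_{j=1}^{3} \left[S^{(s),j}(k)\right]^2 = s(s + 1)1_{H(k)}$$

and

$$[S^{(s),a}(k), S^{(s),b}(k)] = i\epsilon_{abc}S^{(s),c}(k). \qquad (2.29)$$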
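The relations (2.29) and the value $s(s+1)$ of $[\mathbf{S}^{(s)}(k)]^2$ can be verified numerically for small spins. The sketch below builds the spin matrices from the standard ladder operators in the basis ordered $\sigma = s, \ldots, -s$. The normalization $S^1 = \frac{1}{2}(S^+ + S^-)$, $S^2 = \frac{1}{2i}(S^+ - S^-)$ and all function names are assumptions made for this illustration.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(ca, a, cb, b):
    """Entrywise linear combination ca*a + cb*b."""
    n = len(a)
    return [[ca * a[i][j] + cb * b[i][j] for j in range(n)] for i in range(n)]

def max_abs(a):
    """Largest absolute value of an entry."""
    return max(abs(x) for row in a for x in row)

def spin_matrices(two_s):
    """Return (S1, S2, S3) for spin s = two_s / 2, basis sigma = s, ..., -s."""
    s = two_s / 2
    dim = two_s + 1
    sigma = [s - i for i in range(dim)]
    # The raising operator maps e_sigma to sqrt((s-sigma)(s+sigma+1)) e_{sigma+1}.
    raising = [[0j] * dim for _ in range(dim)]
    for col in range(1, dim):
        raising[col - 1][col] = complex(
            math.sqrt((s - sigma[col]) * (s + sigma[col] + 1)))
    lowering = [[raising[j][i].conjugate() for j in range(dim)]
                for i in range(dim)]
    s1 = combine(0.5, raising, 0.5, lowering)
    s2 = combine(-0.5j, raising, 0.5j, lowering)
    s3 = [[complex(sigma[i]) if i == j else 0j for j in range(dim)]
          for i in range(dim)]
    return s1, s2, s3

def spin_algebra_defect(two_s):
    """Largest violation of [Sa, Sb] = i eps_abc Sc and of S^2 = s(s+1)."""
    s = two_s / 2
    dim = two_s + 1
    spins = spin_matrices(two_s)
    defect = 0.0
    for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        comm = combine(1, mat_mul(spins[a], spins[b]),
                       -1, mat_mul(spins[b], spins[a]))
        defect = max(defect, max_abs(combine(1, comm, -1j, spins[c])))
    casimir = [[0j] * dim for _ in range(dim)]
    for matrix in spins:
        casimir = combine(1, casimir, 1, mat_mul(matrix, matrix))
    scalar = [[s * (s + 1) if i == j else 0 for j in range(dim)]
              for i in range(dim)]
    return max(defect, max_abs(combine(1, casimir, -1, scalar)))
```

Calling `spin_algebra_defect(two_s)` for `two_s = 1, 2, 3, ...` should return a value at the level of rounding error.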

The three generators $M^j$ commute with $P^0$ and the operators $P^j$ are zero operators on $H(k)$ so they commute trivially with the $M^j$, so the $M^j$ leave the space $H(k)$ invariant. We can therefore define the operators $M^j(k) : H(k) \to H(k)$ in the obvious way. We now find

$$e^{itS^{(s),j}(k)} = e^{-D^{(s)}\left(\frac{t}{2i}\sigma_j\right)} = U_{k\to k}\left(0, e^{-\frac{t}{2i}\sigma_j}\right) = e^{itM^j(k)}.$$

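So $S^{(s),j}(k) = M^j(k)$. We will now use this to show that $S^{(s),j}(k)$ is proportional to $W^j(k)$. Because $P_\mu W^\mu = 0$, it follows that $W^0(k) = 0$. For the other components of $W(k)$ we find that

$$\begin{aligned}
W^i(k) &= \left(-\tfrac{1}{2}\epsilon^{i\nu\rho\sigma}M_{\nu\rho}P_\sigma\right)(k) = \left(-\tfrac{1}{2}\epsilon^{i\nu\rho\sigma}M_{\nu\rho}k_\sigma\right)(k) = -\tfrac{1}{2}m\,\epsilon^{ijl0}M_{jl}(k) \\
&= \tfrac{1}{2}m\,\epsilon^{0ijl}M_{jl}(k) = \frac{m}{2}\sum_{j,l}\epsilon^{0ijl}M^{jl}(k) = mM^i(k) \\
&= mS^{(s),i}(k).
\end{aligned}$$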


In other words,

$$\mathbf{S}^{(s)}(k) = \frac{1}{m}\mathbf{W}(k). \qquad (2.30)$$

In particular, this implies that $W(k)^2 = -[\mathbf{W}(k)]^2 = -m^2[\mathbf{S}^{(s)}(k)]^2 = -m^2s(s + 1)1_{H(k)}$ and therefore that

$$W^2 = -m^2s(s + 1)1_H,$$

since we already knew that $W^2$ is a scalar multiple of $1_H$. We now define the spin operator $S^{(s),j}$ on $H$ by

$$S^{(s),j} = \frac{1}{m}\left(W^j - \frac{W^0P^j}{m + P^0}\right). \qquad (2.31)$$

It is clear that, since this operator is constructed from the $P^\mu$ and $W^\mu$, it leaves all spaces $H(p)$ invariant, and therefore we can define spin operators $S^{(s),j}(p)$ at any point $p \in O_m^+$. The reason for the definition (2.31) is that the three operator components of $\mathbf{S}^{(s)}$ form a (pseudo-)vector, i.e. $[M^a, S^{(s),b}] = i\epsilon_{abc}S^{(s),c}$. Also, at the point $k \in O_m^+$ this definition reproduces (2.30) and at each point $p$ in the orbit the commutation relations (2.29) hold. It can in fact be shown that this is the unique operator that is a linear combination of the $W^\mu$ with coefficients that are functions of $P^\mu$ and that satisfies all these properties, see section 7.2C of [2]. The action of $S^{(s),3}$ on a function $\Psi(p, \sigma)$ is given by

$$(S^{(s),3}\Psi)(p, \sigma) = \sigma\Psi(p, \sigma),$$

but we will not prove this.

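• Step 7b: m = 0
For $m = 0$ we choose $k$ to be the vector $k = (1, 0, 0, 1)$. Under the map $\psi : \mathbb{M} \to H(2,\mathbb{C})$ as defined in section 2.1.2 the vector $k$ corresponds to the hermitian matrix

$$\psi(k) = \begin{pmatrix} k^0 + k^3 & k^1 - ik^2 \\ k^1 + ik^2 & k^0 - k^3 \end{pmatrix} = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}.$$

For an arbitrary $2 \times 2$-matrix $A$ with components $A_{ij}$ ($i, j = 1, 2$) the condition $A\psi(k)A^* = \psi(k)$ implies that $|A_{11}|^2 = 1$ and $A_{21} = 0$. If $A$ is also in $SL(2,\mathbb{C})$ then we must have $1 = \det(A) = A_{11}A_{22} - A_{12}A_{21} = A_{11}A_{22}$, which implies that $A_{22} = A_{11}^{-1} = \overline{A_{11}}$. Thus, $A \in SL(2,\mathbb{C})$ is in the little group $G_k$ of $k$ if and only if it is of the form

$$A_{\alpha,z} = \begin{pmatrix} e^{i\alpha} & z \\ 0 & e^{-i\alpha} \end{pmatrix}$$

with $\alpha \in \mathbb{R}$ and $z = z_1 + iz_2 \in \mathbb{C}$. If $\alpha = 0$ we can obtain $A_{0,z}$ by

$$A_{0,z} = e^{z_1\left(\frac{1}{2}\sigma_1 - \frac{1}{2i}\sigma_2\right) + z_2\left(-\frac{1}{2i}\sigma_1 - \frac{1}{2}\sigma_2\right)} \qquad (2.32)$$

and if $\alpha \neq 0$ and $\alpha \neq \pi$, we can obtain $A_{\alpha,z}$ by

$$A_{\alpha,z} = e^{-2\alpha\frac{1}{2i}\sigma_3 + \frac{\alpha}{\sin\alpha}\left[z_1\left(\frac{1}{2}\sigma_1 - \frac{1}{2i}\sigma_2\right) + z_2\left(-\frac{1}{2i}\sigma_1 - \frac{1}{2}\sigma_2\right)\right]}. \qquad (2.33)$$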

Here we used that for $c = (c_1, c_2, c_3) \in \mathbb{C}^3$ we have that

$$e^{c\cdot\sigma} = \cosh\left(\sqrt{c_1^2 + c_2^2 + c_3^2}\right)1_{\mathbb{C}^2} + \frac{\sinh\left(\sqrt{c_1^2 + c_2^2 + c_3^2}\right)}{\sqrt{c_1^2 + c_2^2 + c_3^2}}\,c\cdot\sigma \qquad (2.34)$$

if $c_1^2 + c_2^2 + c_3^2 \neq 0$ and

$$e^{c\cdot\sigma} = 1_{\mathbb{C}^2} + c\cdot\sigma \qquad (2.35)$$

if $c_1^2 + c_2^2 + c_3^2 = 0$. In the case $\alpha = 0$ we chose $c = (\frac{z}{2}, \frac{iz}{2}, 0)$ and applied (2.35); in the case where $\alpha \neq 0$ and $\alpha \neq \pi$ we chose $c = (\frac{z\alpha}{2\sin\alpha}, \frac{iz\alpha}{2\sin\alpha}, i\alpha)$ and applied (2.34). Note that the elements $A_{\alpha,z}$ of $G_k$ satisfy the algebraic properties

$$\begin{aligned}
A_{\alpha_1,0}A_{\alpha_2,0} &= A_{\alpha_1+\alpha_2,0} \\
A_{0,z_1}A_{0,z_2} &= A_{0,z_1+z_2} \\
A_{\alpha,0}A_{0,z}A_{-\alpha,0} &= A_{0,ze^{2i\alpha}}.
\end{aligned}$$
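Both the exponential formulas (2.34) and (2.35) and the algebraic properties above lend themselves to a quick numerical check. The following sketch compares the closed form of $e^{c\cdot\sigma}$ with a truncated power series and with the matrix $A_{\alpha,z}$ for the vectors $c$ chosen in the text. The function names, the truncation at 40 terms and the tolerance used to detect $c_1^2 + c_2^2 + c_3^2 = 0$ are assumptions of this illustration.

```python
import cmath

def mul(a, b):
    """Product of two 2x2 complex matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def little_group_matrix(alpha, z):
    """The matrix A_{alpha,z} = [[e^{i alpha}, z], [0, e^{-i alpha}]]."""
    return [[cmath.exp(1j * alpha), complex(z)],
            [0j, cmath.exp(-1j * alpha)]]

def c_dot_sigma(c):
    """c1 sigma_1 + c2 sigma_2 + c3 sigma_3 for complex c."""
    c1, c2, c3 = c
    return [[c3, c1 - 1j * c2], [c1 + 1j * c2, -c3]]

def exp_closed_form(c):
    """exp(c . sigma) from (2.34), or from (2.35) when c.c vanishes."""
    matrix = c_dot_sigma(c)
    square = sum(x * x for x in c)
    if abs(square) < 1e-14:
        cosh_term, ratio = 1.0, 1.0
    else:
        root = cmath.sqrt(square)
        cosh_term, ratio = cmath.cosh(root), cmath.sinh(root) / root
    return [[cosh_term * (1 if i == j else 0) + ratio * matrix[i][j]
             for j in range(2)] for i in range(2)]

def exp_series(matrix, terms=40):
    """Truncated power series of the matrix exponential (cross-check)."""
    result = [[1 + 0j, 0j], [0j, 1 + 0j]]
    term = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for n in range(1, terms):
        term = [[x / n for x in row] for row in mul(term, matrix)]
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result

def generator_vector(alpha, z):
    """The vector c used in the text (alpha = 0, or alpha not a multiple of pi)."""
    if alpha == 0:
        return (z / 2, 1j * z / 2, 0j)
    scale = alpha / (2 * cmath.sin(alpha))
    return (z * scale, 1j * z * scale, 1j * alpha)

def distance(a, b):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))
```

For example, `distance(exp_closed_form(generator_vector(alpha, z)), little_group_matrix(alpha, z))` should be at rounding level, and the conjugation rule can be tested by multiplying three `little_group_matrix` values with `mul`.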

In order to understand this group, we need to recall the definition of the group $E^+(2)$, the proper Euclidean group in two dimensions. The group $E^+(2)$ acts on the plane $\mathbb{R}^2$ and is generated by the translations $T(\vec{v})$ over a vector $\vec{v} \in \mathbb{R}^2$ and the rotations $R(\theta)$ over an angle $\theta \in [0, 2\pi)$. These generators satisfy

$$\begin{aligned}
R(\theta_1)R(\theta_2) &= R(\theta_1 + \theta_2) \\
T(\vec{v}_1)T(\vec{v}_2) &= T(\vec{v}_1 + \vec{v}_2) \\
R(\theta)T(\vec{v})R(-\theta) &= T(R(\theta)\vec{v}).
\end{aligned}$$

Comparing these two groups, we observe that $G_k$ is the double cover $\widetilde{E^+(2)}$ of $E^+(2)$. The elements $A_{\frac{1}{2}\alpha,0}$ and $A_{0,z}$ satisfy the same algebraic properties as $R(\theta)$ and $T(\vec{v})$, respectively, only the range of $\alpha$ runs from $0$ to $4\pi$, while the range of $\theta$ runs from $0$ to $2\pi$. The only finite-dimensional irreducible unitary representations of $G_k$ are one-dimensional, and are given by

$$\mathcal{D}^{(\sigma)}(A_{\alpha,z}) = e^{2i\sigma\alpha}1_{H(k)}$$

for $\sigma \in \frac{1}{2}\mathbb{Z}$. Here $H(k) \simeq \mathbb{C}$ is one-dimensional. All other representations are infinite-dimensional, but they turn out to be physically irrelevant. Thus, $U$ is given by

$$U_{p\to\Phi(A)p}(0, A)e_\sigma(p) = \sqrt{\frac{(\Phi(A)p)^0}{p^0}}\,e^{2i\sigma\alpha(M^{-1}_{\Phi(A)p}AM_p)}\,e_\sigma(\Phi(A)p), \qquad (2.36)$$

where the index $\sigma$ can only take on one value, since the representation of $G_k$ is one-dimensional, and $\alpha(M)$ denotes the angle $\alpha$ in $M = A_{\alpha,z}$ for $M \in G_k$. In terms of the functions $\Psi(p, \sigma)$ this reads

$$(U(0, A)\Psi)(p, \sigma) = \sqrt{\frac{p^0}{(\Phi(A)^{-1}p)^0}}\,e^{2i\sigma\alpha(M_p^{-1}AM_{\Phi(A)^{-1}p})}\,\Psi(\Phi(A)^{-1}p, \sigma).$$

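It follows from (2.32) and (2.33) that the Lie algebra of $G_k \simeq \widetilde{E^+(2)}$ is spanned by the elements

$$\begin{aligned}
R &:= \frac{1}{2i}\sigma_3, \\
T_1 &:= -\frac{1}{2i}\sigma_2 + \frac{1}{2}\sigma_1, \\
T_2 &:= -\frac{1}{2i}\sigma_1 - \frac{1}{2}\sigma_2.
\end{aligned}$$

The Lie algebra representation $D^{(\sigma)}$ induced by $\mathcal{D}^{(\sigma)}$ maps the basis element $R$ to $-i\sigma 1_{H(k)}$, because

$$e^{-2\alpha D^{(\sigma)}\left(\frac{1}{2i}\sigma_3\right)} = \mathcal{D}^{(\sigma)}\left(e^{-2\alpha\frac{1}{2i}\sigma_3}\right) = \mathcal{D}^{(\sigma)}(A_{\alpha,0}) = e^{2i\sigma\alpha}.$$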

A similar calculation shows that the other two basis elements of the Lie algebra are mapped to $0$ by $D^{(\sigma)}$. On the space $H(k)$ we define the operator

$$\lambda(k) := iD^{(\sigma)}\left(\frac{1}{2i}\sigma_3\right) = \sigma 1_{H(k)},$$

which we will call the helicity operator at the point $k$. Because $[M^3, P^\mu] = [M^{12}, P^\mu] = i(\eta^{2\mu}P^1 - \eta^{1\mu}P^2)$ and because $P^1$ and $P^2$ are the zero operators on $H(k)$, we find that $M^3$ leaves the space $H(k)$ invariant, so we can define an operator $M^3(k) : H(k) \to H(k)$ in the obvious way. By the same reasoning as for $m > 0$ we then find that $\lambda(k) = M^3(k)$. The Pauli-Lubanski operator at the point $k$ is $W^\mu(k) = -\frac{1}{2}\epsilon^{\mu\nu\rho\sigma}M_{\nu\rho}k_\sigma = -\frac{1}{2}(\epsilon^{\mu\nu\rho 0} - \epsilon^{\mu\nu\rho 3})M_{\nu\rho}$, where we have used that $k_3 = -k^3 = -1$. Writing out these expressions gives$^{15}$

$$\begin{aligned}
W^0(k) &= M^3(k) = \lambda(k) \\
W^1(k) &= (M^1 + N^2)(k) = 0 \\
W^2(k) &= (M^2 - N^1)(k) = 0 \\
W^3(k) &= M^3(k) = \lambda(k),
\end{aligned}$$

so $W^\mu(k) = k^\mu\lambda(k) = \sigma k^\mu 1_{H(k)}$. We have thus found that $W^\mu(k)$ is proportional to $k^\mu$ and, in particular, that $[W(k)]^2 = W_\mu(k)W^\mu(k) = 0$. Because $W^2 = W_\mu W^\mu$ is a scalar multiple of the identity operator on $H$, this gives

$$W^2 = W_\mu W^\mu = 0.$$

Because we always have $P_\mu W^\mu = 0$ and because $P^2 = m^2 1_H = 0$, we conclude that $W^\mu$ must be proportional to $P^\mu$. Since $W^\mu(k) = \sigma k^\mu 1_{H(k)}$, the proportionality constant is $\sigma$ and we obtain

$$W^\mu = \sigma P^\mu.$$

On the Hilbert space $H$ we now define the helicity operator by

$$\lambda = \frac{W^0}{P^0} = \frac{\mathbf{M}\cdot\mathbf{P}}{|\mathbf{P}|}.$$

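We will now briefly discuss the image of $G_k \subset SL(2,\mathbb{C})$ under the covering map $\Phi : SL(2,\mathbb{C}) \to \mathcal{L}_+^\uparrow$, because this is not found in any of the literature that we have used. The image of an arbitrary element $A_{\alpha,z}$ is easily obtained from $\psi(\Phi(A_{\alpha,z})x) = A_{\alpha,z}\psi(x)A^*_{\alpha,z}$, or, writing $x' := \Phi(A_{\alpha,z})x$,

$$\begin{pmatrix} x'^0 + x'^3 & x'^1 - ix'^2 \\ x'^1 + ix'^2 & x'^0 - x'^3 \end{pmatrix} = \begin{pmatrix} (1 + |z|^2)x^0 + 2\mathrm{Re}[\bar{z}e^{i\alpha}(x^1 - ix^2)] + (1 - |z|^2)x^3 & e^{2i\alpha}(x^1 - ix^2) + ze^{i\alpha}(x^0 - x^3) \\ e^{-2i\alpha}(x^1 + ix^2) + \bar{z}e^{-i\alpha}(x^0 - x^3) & x^0 - x^3 \end{pmatrix}.$$

After some straightforward computations this gives

$$\Phi(A_{\alpha,z}) = \begin{pmatrix}
1 + \frac{z_1^2 + z_2^2}{2} & z_1\cos\alpha + z_2\sin\alpha & z_1\sin\alpha - z_2\cos\alpha & -\frac{z_1^2 + z_2^2}{2} \\
z_1\cos\alpha - z_2\sin\alpha & \cos 2\alpha & \sin 2\alpha & -z_1\cos\alpha + z_2\sin\alpha \\
-z_1\sin\alpha - z_2\cos\alpha & -\sin 2\alpha & \cos 2\alpha & z_1\sin\alpha + z_2\cos\alpha \\
\frac{z_1^2 + z_2^2}{2} & z_1\cos\alpha + z_2\sin\alpha & z_1\sin\alpha - z_2\cos\alpha & 1 - \frac{z_1^2 + z_2^2}{2}
\end{pmatrix}$$

$$= R_3(-\alpha)\begin{pmatrix}
1 + \frac{z_1^2 + z_2^2}{2} & z_1 & -z_2 & -\frac{z_1^2 + z_2^2}{2} \\
z_1 & 1 & 0 & -z_1 \\
-z_2 & 0 & 1 & z_2 \\
\frac{z_1^2 + z_2^2}{2} & z_1 & -z_2 & 1 - \frac{z_1^2 + z_2^2}{2}
\end{pmatrix}R_3(-\alpha),$$

where $R_3(\alpha)$ denotes a rotation around the $x^3$-axis over an angle $\alpha$ (counterclockwise, as seen from a point with positive $x^3$-coordinate). In particular, this computation shows that $A_{\alpha,0}$ is mapped onto the rotation $R_3(-2\alpha)$. Note that it was to be expected that $\{R_3(\alpha)\}_{\alpha \in [0, 2\pi)}$ would be a subgroup of $\Phi(G_k)$ because $k = (1, 0, 0, 1)$ is invariant under rotations around the $x^3$-axis.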
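The explicit matrix for $\Phi(A_{\alpha,z})$ can be cross-checked numerically by computing $\Phi(A)$ column by column from $\psi(\Phi(A)x) = A\psi(x)A^*$ and comparing it with the closed form above. This is only a sketch; the function names are illustrative assumptions.

```python
import cmath
import math

def mul(a, b):
    """Product of two 2x2 complex matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose A*."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def psi(x):
    """Hermitian matrix psi(x) of a four-vector x."""
    x0, x1, x2, x3 = x
    return [[complex(x0 + x3), complex(x1, -x2)],
            [complex(x1, x2), complex(x0 - x3)]]

def psi_inverse(h):
    """Four-vector that corresponds to a hermitian 2x2 matrix h."""
    return ((h[0][0].real + h[1][1].real) / 2, h[1][0].real, h[1][0].imag,
            (h[0][0].real - h[1][1].real) / 2)

def phi(a):
    """4x4 real matrix of Phi(A); column mu is the image of the basis vector e_mu."""
    columns = []
    for mu in range(4):
        e = [0.0] * 4
        e[mu] = 1.0
        columns.append(psi_inverse(mul(mul(a, psi(e)), dagger(a))))
    return [[columns[col][row] for col in range(4)] for row in range(4)]

def little_group_matrix(alpha, z):
    """The matrix A_{alpha,z}."""
    return [[cmath.exp(1j * alpha), complex(z)],
            [0j, cmath.exp(-1j * alpha)]]

def phi_closed_form(alpha, z1, z2):
    """The 4x4 matrix for Phi(A_{alpha,z}) as written in the text."""
    c = (z1 * z1 + z2 * z2) / 2
    ca, sa = math.cos(alpha), math.sin(alpha)
    c2, s2 = math.cos(2 * alpha), math.sin(2 * alpha)
    return [
        [1 + c, z1 * ca + z2 * sa, z1 * sa - z2 * ca, -c],
        [z1 * ca - z2 * sa, c2, s2, -z1 * ca + z2 * sa],
        [-z1 * sa - z2 * ca, -s2, c2, z1 * sa + z2 * ca],
        [c, z1 * ca + z2 * sa, z1 * sa - z2 * ca, 1 - c],
    ]

def distance(a, b):
    """Largest entrywise difference between two 4x4 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))
```

Setting $z = 0$ in `phi_closed_form` gives the rotation block with entries $\cos 2\alpha$ and $\pm\sin 2\alpha$, in line with the statement that $A_{\alpha,0}$ is mapped onto $R_3(-2\alpha)$.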

In the book [35] of Weinberg the group $\Phi(G_k) \subset \mathcal{L}_+^\uparrow$ is computed directly as follows$^{16}$. Let $W \in \mathcal{L}_+^\uparrow$ be such that $Wk = k$. Then for the vector $t = (1, 0, 0, 0)$ we have $1 = t\cdot k = (Wt)\cdot(Wk) = (Wt)\cdot k = (Wt)^0 - (Wt)^3$, so we can write $Wt$ as $Wt = (1 + c_t, a_t, b_t, c_t)$ with $a_t, b_t, c_t \in \mathbb{R}$. Also $1 = t\cdot t = (Wt)\cdot(Wt) = (1 + c_t)^2 - a_t^2 - b_t^2 - c_t^2$, from which it follows that $c_t$ can be expressed in terms of $a_t$ and $b_t$ as $c_t = c_t(a_t, b_t) = (a_t^2 + b_t^2)/2$. Now define the restricted Lorentz transformations $S_{a,b}$ by

$$S_{a,b} = \begin{pmatrix}
1 + c(a, b) & a & b & -c(a, b) \\
a & 1 & 0 & -a \\
b & 0 & 1 & -b \\
c(a, b) & a & b & 1 - c(a, b)
\end{pmatrix} \in \mathcal{L}_+^\uparrow,$$

where $c(a, b) := (a^2 + b^2)/2$. Using that $c(a + a', b + b') = c(a, b) + c(a', b') + aa' + bb'$, it follows easily that $S_{a,b}S_{a',b'} = S_{a+a',b+b'}$, so the $S_{a,b}$ form an abelian subgroup of $\mathcal{L}_+^\uparrow$. Note that $S_{a_t,b_t}t = (1 + c_t, a_t, b_t, c_t) = Wt$ and thus that $S^{-1}_{a_t,b_t}Wt = t$, which shows that

$$S^{-1}_{a_t,b_t}W \in \Phi(G_t) = SO(3).$$

$^{15}$ Here we use that $e^{it(M^1 + N^2)(k)} = U_{k\to k}(0, e^{-tT_2}) = e^{-tD^{(\sigma)}(T_2)} = 1_{H(k)}$ and $e^{it(M^2 - N^1)(k)} = U_{k\to k}(0, e^{-tT_1}) = e^{-tD^{(\sigma)}(T_1)} = 1_{H(k)}$.

$^{16}$ Weinberg does not use the universal covering group at all. As a consequence, he needs to consider double-valued representations, i.e. representations up to a sign, of the little groups $SO(3)$ and $E^+(2)$.

Because $S_{a,b}k = k$ for all $a, b \in \mathbb{R}$, we also have $S^{-1}_{a_t,b_t}Wk = k$, so $S^{-1}_{a_t,b_t}W \in SO(3)$ must be a rotation that leaves the $x^3$-component invariant, i.e. it must be a rotation $R_3(\theta)$ around the 3-axis. We thus conclude that $W = S_{a_t,b_t}R_3(\theta)$. A general element in $\mathcal{L}_+^\uparrow$ that leaves $k$ invariant is thus of the form $W_{a,b,\theta} = S_{a,b}R_3(\theta)$, and these elements do indeed form a group. As we have already seen above, the elements $W_{a,b,0}$ form an abelian subgroup of $\mathcal{L}_+^\uparrow$ with multiplication given by

$$W_{a,b,0}W_{a',b',0} = W_{a+a',b+b',0}.$$

The elements $W_{0,0,\theta}$ also form an abelian subgroup with multiplication given by

$$W_{0,0,\theta}W_{0,0,\theta'} = W_{0,0,\theta+\theta'}.$$

Furthermore, we also have

$$W_{0,0,\theta}W_{a,b,0}W_{0,0,-\theta} = W_{a\cos\theta + b\sin\theta,\, -a\sin\theta + b\cos\theta,\, 0}.$$

Thus the group formed by the elements $W_{a,b,\theta}$ is isomorphic to the two-dimensional Euclidean group $E^+(2)$, i.e. the group of translations and rotations in the plane; the isomorphism is given by identifying $W_{a,b,0}$ with a translation in the plane over the vector $(a, b)$ and identifying $W_{0,0,\theta}$ with a rotation in the plane over an angle $\theta$.
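The multiplication rules for $S_{a,b}$ and $R_3(\theta)$ can be checked with a few lines of matrix arithmetic. In the sketch below the sign convention for the rotation matrix is an assumption: it is chosen so that the conjugation formula holds in the form stated above, and with the opposite convention $\theta$ has to be replaced by $-\theta$. The function names are illustrative.

```python
import math

def mat_mul(a, b):
    """Product of two 4x4 real matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply(matrix, vector):
    """Matrix acting on a four-vector."""
    return [sum(matrix[i][j] * vector[j] for j in range(4)) for i in range(4)]

def s_matrix(a, b):
    """The Lorentz transformation S_{a,b} with c(a, b) = (a^2 + b^2) / 2."""
    c = (a * a + b * b) / 2
    return [[1 + c, a, b, -c],
            [a, 1.0, 0.0, -a],
            [b, 0.0, 1.0, -b],
            [c, a, b, 1 - c]]

def rotation3(theta):
    """Rotation around the 3-axis (sign convention assumed, see the lead-in)."""
    cos, sin = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, cos, sin, 0.0],
            [0.0, -sin, cos, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def distance(a, b):
    """Largest entrywise difference between two 4x4 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

def minkowski(x, y):
    """Minkowski inner product with signature (+, -, -, -)."""
    return x[0] * y[0] - x[1] * y[1] - x[2] * y[2] - x[3] * y[3]
```

With these helpers one can confirm that `s_matrix(a, b)` fixes $k = (1, 0, 0, 1)$, preserves `minkowski`, and that conjugation by `rotation3(theta)` rotates the pair $(a, b)$.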

Physical interpretation of the irreducible representations

We have now fully classified those irreducible unitary representations of $\mathcal{P}_+^\uparrow$ that are relevant in physics. Quantum systems in which the pure state vectors $\Psi \in H$ transform as such representations of $\mathcal{P}_+^\uparrow$ are interpreted as one-particle states. The label $m$ is interpreted as the mass of the particle and the operators $P^\mu$ are interpreted as the four-momentum operators corresponding to the one-particle system (this last fact is made more rigorous in [1], section 3.6). We found that the one-particle states are $\bigcup_{p \in O_m^+} H(p)$-valued functions of the four-momentum $p$ with $\Psi(p) \in H(p)$, but since $p^0 = \sqrt{m^2 + \mathbf{p}^2}$, we will from now on write the one-particle states as functions $\Psi(\mathbf{p})$ of the three-momentum $\mathbf{p}$. The generators $M^j$ of rotations around the $x^j$-axis are interpreted as the $x^j$-component of the angular momentum of the particle.

If $m > 0$ and if the representation of $SU(2)$ is $(2s + 1)$-dimensional, then the state vectors of the particle are functions $\Psi(\mathbf{p}, \sigma)$, where $\mathbf{p} \in \mathbb{R}^3$ and $\sigma \in \{-s, \ldots, s\}$. The label $s \in \frac{1}{2}\mathbb{Z}_{\geq 0}$ is called the spin of the particle and because

$$(S^{(s),3}\Psi)(\mathbf{p}, \sigma) = \sigma\Psi(\mathbf{p}, \sigma),$$

the label $\sigma \in \{-s, \ldots, s\}$ denotes the spin component along the $x^3$-direction. The operators $\mathbf{S}^{(s)}$ contribute to the total angular momentum $\mathbf{M}$ of the particle and for a particle at rest we have in fact $\mathbf{M} = \mathbf{S}^{(s)}$. For a massive particle with spin $s$ the probability of finding the particle with three-momentum $\mathbf{p}$ in some Borel set $B \subset \mathbb{R}^3$ and spin $x^3$-component $\sigma$ is

$$\int_B |\Psi(\mathbf{p}, \sigma)|^2\,\frac{d^3p}{2\omega_{\mathbf{p}}},$$

where we defined

$$\omega_{\mathbf{p}} := \sqrt{m^2 + \mathbf{p}^2}.$$

For this reason, we may interpret

$$\psi(\mathbf{p}, \sigma) := \frac{1}{\sqrt{2\omega_{\mathbf{p}}}}\Psi(\mathbf{p}, \sigma)$$

as the momentum-spin wave function of the particle. This momentum-spin wave function $\psi$ is square-integrable with respect to $d^3p$, not with respect to $\frac{d^3p}{2\omega_{\mathbf{p}}}$. We will denote the Hilbert space of such momentum-spin wave functions by $\mathfrak{H}$, instead of $H$. Thus, we can describe one-particle states either by elements $\Psi$ in $H$ or by elements $\psi$ in $\mathfrak{H}$, both descriptions being unitarily equivalent (and hence physically equivalent) to each other. The map $J : \Psi \mapsto \frac{1}{\sqrt{2\omega_{\mathbf{p}}}}\Psi = \psi$, which relates two physically equivalent elements with each other, provides the unitary map from $H$ onto $\mathfrak{H}$. This map should be interpreted as some kind of change of variables on the space of functions. The representation $U$ of $\mathcal{P}_+^\uparrow$ on $H$ can easily be made into a representation $u$ of $\mathcal{P}_+^\uparrow$ on $\mathfrak{H}$ by using the map $J$:

$$[u(a, A)\psi](\mathbf{p}, \sigma) = [JU(a, A)J^{-1}\psi](\mathbf{p}, \sigma) = \frac{1}{\sqrt{2\omega_{\mathbf{p}}}}\big[U(a, A)\underbrace{(\sqrt{2\omega_{\mathbf{p}}}\,\psi)}_{\in H}\big](\mathbf{p}, \sigma).$$

From now on we will write both representations $U$ and $u$ as $U$; it will always be clear from the context which one is meant. In the following chapters we will need to switch often between both Hilbert spaces $H$ and $\mathfrak{H}$ to describe one-particle states, but we will always make a very explicit distinction between the two descriptions. The Fourier transform of the momentum-spin wave function can be interpreted as the position (and spin) wave function. The position operators $X^j$ act on the momentum space wave function as the operators $X^j = i\frac{\partial}{\partial p_j}$. Therefore, the action of $X^j$ on $\Psi(\mathbf{p}, \sigma)$ is given by the operator

$$X^j = \sqrt{2p^0}\; i\frac{\partial}{\partial p_j}\,\frac{1}{\sqrt{2p^0}}.$$

If $m = 0$ then the Hilbert spaces $H(p)$ are all one-dimensional and therefore the state vectors are functions $\Psi(\mathbf{p})$ of the three-momentum. The action of the helicity operator $\lambda$ on $\Psi$ is just a multiplication by $\sigma$, i.e. $\lambda\Psi = \sigma\Psi$; here $\sigma$ denotes the label that occurs in the classification of the one-dimensional representations of (the double cover of) $E^+(2)$. The absolute value $|\sigma|$ of the label $\sigma$ is called the spin of the particle and $\sigma$ itself is called the helicity of the particle. Because for the helicity we have

$$\sigma 1_H = \frac{\mathbf{M}\cdot\mathbf{P}}{|\mathbf{P}|},$$

the helicity measures the angular momentum of the particle in the direction of its three-momentum $\mathbf{p}$. Unlike the spin components $\sigma$ for massive particles, the quantity $\sigma$ for massless particles is fixed for a given particle type. For example, if neutrinos are massless (which might not be the case), then they would have $\sigma = -\frac{1}{2}$ and anti-neutrinos would have $\sigma = \frac{1}{2}$. However, as we will see after equation (2.39) below, there are cases where the particles with helicities $\sigma$ and $-\sigma$ should be identified. In that case $H$ is a direct sum of two irreducible representations, corresponding to helicity $\sigma$ and $-\sigma$, and the state vectors are described by functions $\Psi(\mathbf{p}, \sigma)$ with $\sigma$ taking on the two values $\pm\sigma$. From now on we will always use the notation $\Psi(\mathbf{p}, \sigma)$ to denote the state vector of a massless particle, even if $\sigma$ can only take on one value. As for massive particles, we also define the Hilbert space $\mathfrak{H}$ of momentum-spin wave functions $\psi(\mathbf{p}, \sigma)$ for massless particles. However, for massless particles it is not possible to give a satisfactory definition of position operators.

Space inversion and time reversal

Some quantum systems are not only invariant under $\mathcal{P}_+^\uparrow$, but also under the action of a space inversion $I_s : (x^0, \mathbf{x}) \mapsto (x^0, -\mathbf{x})$ or time reversal $I_t : (x^0, \mathbf{x}) \mapsto (-x^0, \mathbf{x})$ or the combination $I_sI_t : x \mapsto -x$. By Wigner's theorem, these transformations can then be represented by unitary or antiunitary operators $P$, $T$ and $PT$ on the Hilbert space $H$ corresponding to the quantum system. It can be shown (see section 2.6 of [35]) that in order to avoid the existence of negative-energy states, we must choose $P$ to be linear and unitary and we must choose $T$ to be antilinear and antiunitary. Without any derivation, we will now simply give the action of $P$ and $T$ on states transforming irreducibly under $\mathcal{P}_+^\uparrow$, i.e. on states that represent one-particle states.

For massive particles the action of $P$ is given by

$$Pe_\sigma(p) = \xi e_\sigma(I_sp), \qquad (2.37)$$

where $\xi$ is a phase factor that only depends on the species of particle; it is called the intrinsic parity of the particle. The action of $T$ is given by

$$Te_\sigma(p) = \zeta\cdot(-1)^{s-\sigma}e_{-\sigma}(I_sp), \qquad (2.38)$$

where $\zeta$ is a phase factor that only depends on the species of particle and $s$ is the spin of the particle. In contrast to the intrinsic parity $\xi$, the phase factor $\zeta$ has no physical significance because when we redefine $e_\sigma(p)$ by $\zeta^{1/2}e_\sigma(p)$, the factor $\zeta$ cancels out in equation (2.38). This trick does not work for the intrinsic parity $\xi$ because $P$ is linear, rather than antilinear.

For massless particles the action of $P$ is given by

$$Pe_\sigma(p) = \xi_\sigma\cdot e^{i\pi\epsilon(p)\sigma}e_{-\sigma}(I_sp), \qquad (2.39)$$

where $\xi_\sigma$ is a phase factor and $\epsilon(p) \in \{-1, 1\}$ is the sign of the $x^2$-component of $\mathbf{p}$. Thus, if a theory is invariant under space inversions then massless particles in this theory with some $\sigma$ should be identified with those particles obtained by substituting $\sigma \to -\sigma$. This happens for instance in quantum electrodynamics, where the massless particles with $\sigma = 1$ and $\sigma = -1$ are identified and are both referred to as photons. The action of $T$ is given by

$$Te_\sigma(p) = \zeta_\sigma\cdot e^{i\pi\epsilon(p)\sigma}e_\sigma(I_sp), \qquad (2.40)$$

where $\zeta_\sigma$ is a phase factor and $\epsilon(p)$ is as above.

2.2.5 Many-particle states and Fock space

Most of the material covered in this subsection can be found in one of the texts [1], [2] or [8]. Suppose that we have a system consisting of $n$ non-interacting distinguishable particles; here distinguishable means that all particles are of a different type. If the individual particles are described by one-particle states in Hilbert spaces$^{17}$ $\mathcal{H}_1, \ldots, \mathcal{H}_n$, then the total system is described by the Hilbert space $\mathcal{H}_1 \otimes \ldots \otimes \mathcal{H}_n$, and the algebra of observables of the total system is the tensor product of the algebras of observables on the one-particle states. If the system consists of $n$ non-interacting particles of the same type then the Hilbert space of the system is still $\mathcal{H}^{\otimes n}$, with $\mathcal{H}$ the Hilbert space for a single particle of the given type, but not all unit rays in this Hilbert space represent physically realizable states. The reason is that in quantum mechanics two particles of the same type cannot be distinguished when they form a single system; there is no way of keeping track of which particle is which. Mathematically, this means the following.

$^{17}$ Here and in the rest of this section all the one-particle Hilbert spaces $\mathcal{H}$ either represent the spaces of state functions $\Psi(\mathbf{p}, \sigma)$ or else they all represent the momentum-spin wave functions $\psi(\mathbf{p}, \sigma)$, but one should be consistent in this choice. Thus, we either have $\mathcal{H} = H$ in all definitions, or else $\mathcal{H} = \mathfrak{H}$ in all definitions.


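If $S_n$ denotes the symmetric group on $n$ objects, we define for each $\sigma \in S_n$ a unitary operator $R^{(n)}(\sigma) : \mathcal{H}^{\otimes n} \to \mathcal{H}^{\otimes n}$ by

$$R^{(n)}(\sigma)(h_1 \otimes \ldots \otimes h_n) = h_{\sigma(1)} \otimes \ldots \otimes h_{\sigma(n)}.$$

Note that this defines a unitary representation of $S_n$ in $\mathcal{H}^{\otimes n}$. The statement that the particles are truly indistinguishable is then equivalent to saying that for any physically realizable pure state $h \in \mathcal{H}^{\otimes n}$ and for any $\sigma \in S_n$ we must have$^{18}$

$$R^{(n)}(\sigma)h = \lambda(\sigma, n, h)h \qquad (2.41)$$

with $\lambda(\sigma, n, h)$ a complex number of absolute value 1. Note that it follows from the linearity of $R^{(n)}$ that for any nonzero complex number $c$ we have $R^{(n)}(\sigma)(ch) = \lambda(\sigma, n, h)ch$, so that $\lambda$ satisfies $\lambda(\sigma, n, ch) = \lambda(\sigma, n, h)$; this holds in particular whenever $|c| = 1$ and therefore $\lambda(\sigma, n, h)$ is independent of the choice of unit vector in the unit ray $R(h)$. For $\sigma_1, \sigma_2 \in S_n$ and for any physically realizable state $h \in \mathcal{H}^{\otimes n}$ we have that

$$\lambda(\sigma_1\sigma_2, n, h)h = R^{(n)}(\sigma_1\sigma_2)h = R^{(n)}(\sigma_1)R^{(n)}(\sigma_2)h = \lambda(\sigma_1, n, h)\lambda(\sigma_2, n, h)h,$$

so for any physically realizable state $h \in \mathcal{H}^{\otimes n}$ the map $\lambda(\cdot, n, h) : S_n \to U(1)$ defines a 1-dimensional representation of $S_n$ on the space $\mathbb{C}h$. But the only two 1-dimensional representations of $S_n$ are the completely symmetric one, $\lambda_S(\sigma) = 1$ for all $\sigma \in S_n$, and the completely antisymmetric one, $\lambda_A(\sigma) = \epsilon(\sigma)$ for all $\sigma \in S_n$, where $\epsilon : S_n \to \{-1, 1\}$ denotes the sign of the permutation. Thus, for any physically realizable state $h \in \mathcal{H}^{\otimes n}$ we either have $\lambda(\sigma, n, h) = \lambda_S(\sigma)$ or $\lambda(\sigma, n, h) = \lambda_A(\sigma)$. In the first case we say that $h$ is a completely symmetric state and in the second case we say that $h$ is a completely antisymmetric state. The set of symmetric state vectors forms a linear subspace of $\mathcal{H}^{\otimes n}$ which we denote by $\mathcal{F}^n_+(\mathcal{H})$ and the set of antisymmetric state vectors forms a linear subspace of $\mathcal{H}^{\otimes n}$ which we denote by $\mathcal{F}^n_-(\mathcal{H})$. The orthogonal projections $P_n^+ : \mathcal{H}^{\otimes n} \to \mathcal{F}^n_+(\mathcal{H})$ and $P_n^- : \mathcal{H}^{\otimes n} \to \mathcal{F}^n_-(\mathcal{H})$ onto $\mathcal{F}^n_\pm(\mathcal{H})$ are given by

$$P_n^+ = \frac{1}{n!}\sum_{\sigma \in S_n} R^{(n)}(\sigma), \qquad P_n^- = \frac{1}{n!}\sum_{\sigma \in S_n} \epsilon(\sigma)R^{(n)}(\sigma).$$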

It is obvious that a superposition of a state in $\mathcal{F}^n_+(\mathcal{H})$ with a state in $\mathcal{F}^n_-(\mathcal{H})$ does not satisfy (2.41) and is therefore not physically realizable. It turns out that for any given particle type occurring in nature, the state vectors of an $n$-particle system of particles of that type are either always symmetric or else always antisymmetric. Furthermore, the choice between the symmetric and antisymmetric case is the same for each $n$, so for any given particle type we either have that the space of physically realizable states of $n$ particles is $\mathcal{F}^n_+(\mathcal{H})$ and that $\lambda(\sigma, n, h) = \lambda_S(\sigma)$ for all $h \in \mathcal{F}^n_+(\mathcal{H})$ or else we have that the space of physically realizable states of $n$ particles is $\mathcal{F}^n_-(\mathcal{H})$ and that $\lambda(\sigma, n, h) = \lambda_A(\sigma)$ for all $h \in \mathcal{F}^n_-(\mathcal{H})$. In the first case we say that the given particle is a boson and in the second case we say that it is a fermion.
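The projections $P_n^\pm$ are easy to experiment with when the one-particle space is finite-dimensional. The sketch below represents a tensor in $(\mathbb{C}^d)^{\otimes n}$ as a dictionary from index tuples to coefficients and uses exact rational arithmetic. The finite-dimensional setting, the data layout and the function names are assumptions made for illustration only.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial

def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def permute(tensor, perm):
    """R(sigma): the basis tensor e_{j_1} x ... x e_{j_n} goes to
    e_{j_sigma(1)} x ... x e_{j_sigma(n)}, extended linearly."""
    out = {}
    for idx, coeff in tensor.items():
        new = tuple(idx[perm[k]] for k in range(len(perm)))
        out[new] = out.get(new, 0) + coeff
    return out

def project(tensor, n, symmetric=True):
    """P_n^+ (symmetric=True) or P_n^- (symmetric=False) applied to a tensor."""
    out = {}
    for perm in permutations(range(n)):
        weight = Fraction(1 if symmetric else sign(perm), factorial(n))
        for idx, coeff in permute(tensor, perm).items():
            out[idx] = out.get(idx, 0) + weight * coeff
    # Drop vanishing coefficients so that equal tensors compare equal.
    return {idx: coeff for idx, coeff in out.items() if coeff != 0}

def product_state(vectors):
    """The tensor h_1 x ... x h_n for vectors given as lists of numbers."""
    out = {}
    for idx in product(*(range(len(v)) for v in vectors)):
        coeff = Fraction(1)
        for position, i in enumerate(idx):
            coeff *= vectors[position][i]
        if coeff != 0:
            out[idx] = coeff
    return out
```

Applying `project` twice gives the same result as applying it once, and the symmetric projection followed by the antisymmetric one yields the empty dictionary, which reflects that the two subspaces are orthogonal. For $n = 2$ the two projections add up to the identity, as remarked in the footnote on parastatistics.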

$^{18}$ There are more general possibilities here than a phase factor $\lambda$, the more general condition being that $R^{(n)}(\sigma)h$ is physically indistinguishable from $h$. Instead of by unit rays, physical states are then mathematically described by the (higher dimensional) generalized unit rays, see also [24]. Below we will see that $\lambda$ defines a 1-dimensional representation of the permutation group. In the more general case one then also considers higher dimensional representations of the permutation group $S_n$ induced by $R^{(n)}$ on $\mathcal{H}^{\otimes n}$. This more general theory is also referred to as parastatistics and is of an entirely different nature than the generalizations that one encounters in 3-dimensional spacetime (i.e. braid statistics, anyons, etc.). Note that in the case of two particles ($n = 2$) there is actually no generalization because $\mathcal{H}^{\otimes 2}$ decomposes precisely into a direct sum of the two subspaces corresponding to the two 1-dimensional representations of $S_2$. This is related to the fact that we can write $v \otimes w$ as $\frac{1}{2}(v \otimes w + w \otimes v) + \frac{1}{2}(v \otimes w - w \otimes v)$.


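Because we often have to consider systems of identical particles in which the number of particles may change, we introduce the Hilbert spaces

$$\mathcal{F}_\pm(\mathcal{H}) = \bigoplus_{n=0}^{\infty} \mathcal{F}^n_\pm(\mathcal{H}),$$

where $\mathcal{F}^0_\pm(\mathcal{H}) \simeq \mathbb{C}$ represents the vacuum state, i.e. the state with no particles. We will choose a unit vector $\Omega \in \mathcal{F}^0_\pm(\mathcal{H})$ and call it the Fock vacuum. For each one-particle state vector $h \in \mathcal{H}$ we define a (densely defined) operator $A^*_\pm(h) : \mathcal{F}_\pm(\mathcal{H}) \to \mathcal{F}_\pm(\mathcal{H})$ by

$$A^*_\pm(h)\Omega = h \qquad (2.42)$$

$$A^*_\pm(h)P_n^\pm(h_1 \otimes \ldots \otimes h_n) = \sqrt{n + 1}\,P_{n+1}^\pm(h \otimes h_1 \otimes \ldots \otimes h_n). \qquad (2.43)$$

This operator maps $\mathcal{F}^n_\pm(\mathcal{H})$ into $\mathcal{F}^{n+1}_\pm(\mathcal{H})$ by 'creating' an extra particle with state vector $h$. For this reason $A^*_\pm(h)$ is called a creation operator. Note that it is defined on the dense subspace

$$D_\pm = \bigcup_{n=0}^{\infty} \bigoplus_{j=0}^{n} \mathcal{F}^j_\pm(\mathcal{H})$$

of $\mathcal{F}_\pm(\mathcal{H})$, and that it leaves this subspace invariant. Furthermore, it can be shown that the operator $A^*_\pm(h)$ is closable. Finally, note that the mapping $h \mapsto A^*_\pm(h)$ is linear. For vectors $h_1, \ldots, h_n \in \mathcal{H}$ it follows easily that

$$A^*_\pm(h_1) \ldots A^*_\pm(h_n)\Omega = \sqrt{n!}\,P_n^\pm(h_1 \otimes \ldots \otimes h_n).$$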
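The last identity can be illustrated on a finite-dimensional one-particle space. The following sketch implements the creation operator of (2.42) and (2.43) on vectors with a fixed particle number, stored as dictionaries from index tuples to coefficients, and compares $A^*_\pm(h_1)A^*_\pm(h_2)A^*_\pm(h_3)\Omega$ with $\sqrt{3!}\,P_3^\pm(h_1 \otimes h_2 \otimes h_3)$. The finite dimension, the data layout and all function names are assumptions of this illustration.

```python
import math
from itertools import permutations

def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def project(tensor, n, eps):
    """P_n^+ for eps = +1 (bosons) and P_n^- for eps = -1 (fermions)."""
    out = {}
    for perm in permutations(range(n)):
        weight = (sign(perm) if eps < 0 else 1) / math.factorial(n)
        for idx, coeff in tensor.items():
            new = tuple(idx[perm[k]] for k in range(n))
            out[new] = out.get(new, 0) + weight * coeff
    return out

def tensor_product(h, tensor):
    """h x T for a one-particle vector h and an n-particle tensor T."""
    return {(i,) + idx: h_i * coeff
            for i, h_i in enumerate(h) for idx, coeff in tensor.items()}

def create(h, tensor, n, eps):
    """A*(h) on an n-particle vector: sqrt(n+1) P_{n+1}(h x T)."""
    projected = project(tensor_product(h, tensor), n + 1, eps)
    return {idx: math.sqrt(n + 1) * coeff for idx, coeff in projected.items()}

def product_state(vectors):
    """The tensor h_1 x ... x h_n."""
    tensor = {(): 1.0}
    for h in reversed(vectors):
        tensor = tensor_product(h, tensor)
    return tensor

def distance(t1, t2):
    """Largest coefficient difference between two tensors."""
    keys = set(t1) | set(t2)
    return max((abs(t1.get(k, 0) - t2.get(k, 0)) for k in keys), default=0.0)

# The Fock vacuum: the zero-particle vector with coefficient 1.
VACUUM = {(): 1.0}
```

A typical use is `create(h1, create(h2, VACUUM, 0, eps), 1, eps)`, which by the identity above should agree with $\sqrt{2}\,P_2^\pm(h_1 \otimes h_2)$; exchanging `h1` and `h2` multiplies the result by `eps`.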

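The inner product on $D_\pm$ can be expressed as

$$\begin{aligned}
\langle P_n^\pm(h_1 \otimes \ldots \otimes h_n), P_m^\pm(g_1 \otimes \ldots \otimes g_m)\rangle_{D_\pm} &= \frac{1}{n!\,m!}\sum_{\sigma \in S_n, \sigma' \in S_m} \epsilon_\pm(\sigma)\epsilon_\pm(\sigma')\,\langle h_{\sigma(1)} \otimes \ldots \otimes h_{\sigma(n)}, g_{\sigma'(1)} \otimes \ldots \otimes g_{\sigma'(m)}\rangle \\
&= \frac{\delta_{nm}}{(n!)^2}\sum_{\sigma,\sigma' \in S_n} \epsilon_\pm(\sigma\sigma')\,\langle h_{\sigma(1)}, g_{\sigma'(1)}\rangle \ldots \langle h_{\sigma(n)}, g_{\sigma'(n)}\rangle \\
&= \frac{\delta_{nm}}{(n!)^2}\sum_{\sigma,\sigma' \in S_n} \underbrace{\epsilon_\pm(\sigma\sigma')}_{=\epsilon_\pm(\sigma'\sigma^{-1})}\,\langle h_1, g_{\sigma'\sigma^{-1}(1)}\rangle \ldots \langle h_n, g_{\sigma'\sigma^{-1}(n)}\rangle \\
&= \frac{\delta_{mn}}{n!}\sum_{\sigma \in S_n} \epsilon_\pm(\sigma)\,\langle h_1, g_{\sigma(1)}\rangle \ldots \langle h_n, g_{\sigma(n)}\rangle,
\end{aligned}$$

where $\epsilon_+(\sigma) = 1$ and $\epsilon_-(\sigma) = \epsilon(\sigma)$ for all $\sigma \in S_n$. The action of the adjoint operator $A_\pm(h) : \mathcal{F}_\pm(\mathcal{H}) \to \mathcal{F}_\pm(\mathcal{H})$ of $A^*_\pm(h)$ on the subspace $D_\pm$ is given by

$$A_\pm(h)\Omega = 0 \qquad (2.44)$$

$$A_\pm(h)P_n^\pm(h_1 \otimes \ldots \otimes h_n) = \frac{1}{\sqrt{n}}\sum_{j=1}^{n} (\pm 1)^{j-1}\langle h_j, h\rangle\,P_{n-1}^\pm(h_1 \otimes \ldots \otimes h_{j-1} \otimes h_{j+1} \otimes \ldots \otimes h_n). \qquad (2.45)$$

Because this operator maps $\mathcal{F}^n_\pm(\mathcal{H})$ into $\mathcal{F}^{n-1}_\pm(\mathcal{H})$, it is called an annihilation operator. Because $A_\pm(h)$ is the adjoint of a densely defined operator, it is a closed operator. Unlike $h \mapsto A^*_\pm(h)$, the mapping $h \mapsto A_\pm(h)$ is antilinear. Note furthermore that $A_\pm(h)$ is the restriction to the space $\mathcal{F}_\pm(\mathcal{H})$ of the operator $B(h) : \bigoplus_{n=0}^{\infty} \mathcal{H}^{\otimes n} \to \bigoplus_{n=0}^{\infty} \mathcal{H}^{\otimes n}$, defined by

$$B(h)(h_1 \otimes \ldots \otimes h_n) = \sqrt{n}\,\langle h_1, h\rangle\,h_2 \otimes \ldots \otimes h_n.$$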


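Indeed, when we apply $B(h)$ to the vector $P^\pm_n(h_1\otimes\ldots\otimes h_n)$, we get precisely the right-hand side of (2.45). The creation and annihilation operators satisfy the following relations:

\[ [A^*_\pm(h_1),A^*_\pm(h_2)]_\mp = [A_\pm(h_1),A_\pm(h_2)]_\mp = 0 \]

\[ [A_\pm(h_1),A^*_\pm(h_2)]_\mp = \langle h_2,h_1\rangle\,1_{\mathcal{F}_\pm(H)}, \]

where $[X,Y]_\pm = XY\pm YX$.

Finally, we note that when we have an operator $L$ on the one-particle Hilbert space $H$, we can define an operator $\Gamma_\pm(L)$ on $D_\pm$ by

\[ \Gamma_\pm(L)P^\pm_n(h_1\otimes\ldots\otimes h_n) = P^\pm_n(Lh_1\otimes\ldots\otimes Lh_n). \]

Here, by definition, we set $\Gamma_\pm(L)\Omega = \Omega$. The operator $\Gamma_\pm(L)$ thus leaves all $\mathcal{F}^n_\pm(H)$ invariant. Note that if $L$ is invertible, then

\[
\begin{aligned}
\Gamma_\pm(L)A^*_\pm(h)\Gamma_\pm(L)^{-1}P^\pm_n(h_1\otimes\ldots\otimes h_n)
 &= \Gamma_\pm(L)A^*_\pm(h)P^\pm_n(L^{-1}h_1\otimes\ldots\otimes L^{-1}h_n)\\
 &= \sqrt{n+1}\,\Gamma_\pm(L)P^\pm_{n+1}(h\otimes L^{-1}h_1\otimes\ldots\otimes L^{-1}h_n)\\
 &= \sqrt{n+1}\,P^\pm_{n+1}(Lh\otimes h_1\otimes\ldots\otimes h_n)\\
 &= A^*_\pm(Lh)P^\pm_n(h_1\otimes\ldots\otimes h_n),
\end{aligned}
\]

so if $L$ is invertible then

\[ \Gamma_\pm(L)A^*_\pm(h)\Gamma_\pm(L)^{-1} = A^*_\pm(Lh) \tag{2.46} \]

on $D_\pm$. In case $L$ is also unitary, so that $L^{-1} = L^*$, we have

\[
\begin{aligned}
\Gamma_\pm(L)A_\pm(h)\Gamma_\pm(L)^{-1}P^\pm_n(h_1\otimes\ldots\otimes h_n)
 &= \Gamma_\pm(L)A_\pm(h)P^\pm_n(L^{-1}h_1\otimes\ldots\otimes L^{-1}h_n)\\
 &= \frac{1}{\sqrt{n}}\,\Gamma_\pm(L)\sum_{j=1}^{n}(\pm 1)^{j-1}\langle L^{-1}h_j,h\rangle\,
    P^\pm_{n-1}(L^{-1}h_1\otimes\ldots\otimes L^{-1}h_{j-1}\otimes L^{-1}h_{j+1}\otimes\ldots\otimes L^{-1}h_n)\\
 &= \frac{1}{\sqrt{n}}\sum_{j=1}^{n}(\pm 1)^{j-1}\langle h_j,Lh\rangle\,
    P^\pm_{n-1}(h_1\otimes\ldots\otimes h_{j-1}\otimes h_{j+1}\otimes\ldots\otimes h_n)\\
 &= A_\pm(Lh)P^\pm_n(h_1\otimes\ldots\otimes h_n),
\end{aligned}
\]

so if $L$ is unitary then

\[ \Gamma_\pm(L)A_\pm(h)\Gamma_\pm(L)^{-1} = A_\pm(Lh) \tag{2.47} \]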

on $D_\pm$. If $L$ and $K$ are operators on $H$ then $\Gamma_\pm(LK) = \Gamma_\pm(L)\Gamma_\pm(K)$, and we also have $\Gamma_\pm(1_H) = 1_{\mathcal{F}_\pm(H)}$. Thus, if $\varphi: G\to B(H)$ is a (unitary) representation of a group $G$ on $H$, then $\Gamma_\pm(\varphi) := \Gamma_\pm\circ\varphi$ defines a (unitary) representation of $G$ on $\mathcal{F}_\pm(H)$. In particular, this holds for the unitary representation$^{19}$ $U$ of $\tilde{P}_+^\uparrow$ on $H$, so we get a unitary representation $\Gamma_\pm(U)$ on $\mathcal{F}_\pm(H)$. This representation is clearly reducible and, up to scalar multiples, the only vector in $\mathcal{F}_\pm(H)$ which is invariant under $\Gamma_\pm(U)$ is the vacuum vector $\Omega$. Using (2.46) and (2.47) we also find that for any $(a,A)\in\tilde{P}_+^\uparrow$ we have

\[ \Gamma_\pm(U(a,A))A^*_\pm(h)\Gamma_\pm(U(a,A))^{-1} = A^*_\pm(U(a,A)h) \]

\[ \Gamma_\pm(U(a,A))A_\pm(h)\Gamma_\pm(U(a,A))^{-1} = A_\pm(U(a,A)h). \]

These are the transformation properties of the creation and annihilation operators under the group $\tilde{P}_+^\uparrow$ and we will need them several times in the following chapters. Of course, if there are also parity and time reversal operators $P$ and $T$ defined on the one-particle space $H$, then we also obtain operators $\Gamma_\pm(P)$ and $\Gamma_\pm(T)$ on $\mathcal{F}_\pm(H)$.
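The multiplicativity $\Gamma_\pm(LK) = \Gamma_\pm(L)\Gamma_\pm(K)$ used above can be checked directly on the spanning vectors of $D_\pm$. The following is a short sketch of that check, using only the definition of $\Gamma_\pm$; the group elements $g_1, g_2$ in the last line are illustrative names.

```latex
% Check of \Gamma(L)\Gamma(K) = \Gamma(LK) on vectors P_n(h_1 \otimes ... \otimes h_n).
\begin{align*}
\Gamma_\pm(L)\Gamma_\pm(K)\,P^\pm_n(h_1\otimes\ldots\otimes h_n)
  &= \Gamma_\pm(L)\,P^\pm_n(Kh_1\otimes\ldots\otimes Kh_n)\\
  &= P^\pm_n(LKh_1\otimes\ldots\otimes LKh_n)\\
  &= \Gamma_\pm(LK)\,P^\pm_n(h_1\otimes\ldots\otimes h_n).
\end{align*}
% Consequence for a representation \varphi of a group G (g_1, g_2 in G):
\begin{align*}
\Gamma_\pm(\varphi(g_1))\,\Gamma_\pm(\varphi(g_2))
  = \Gamma_\pm(\varphi(g_1)\varphi(g_2))
  = \Gamma_\pm(\varphi(g_1g_2)).
\end{align*}
```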

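19 Recall from section 2.2.4 that we use the same notation (namely $U$) to denote the representations of $\tilde{P}_+^\uparrow$ on both versions of the one-particle space $H$, i.e. on the space of state functions and on the space of momentum-spin wave functions.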


In case we have a system that contains different types $\{\tau\}_{\tau\in T}$ of particles that are not interacting, we proceed as follows. Let $H^{[\tau]}$ denote the Hilbert space of one-particle states for the particle type $\tau$. So the vectors in this Hilbert space transform according to the irreducible unitary representation of $\tilde{P}_+^\uparrow$ with mass $m_\tau$ and spin $s_\tau$ (or helicity $\sigma_\tau$), as described in the previous section. We then partition the set $T$ of all particle types into two disjoint subsets $T_B$ and $T_F$, consisting of all particle types that are bosons or fermions, respectively. Then the Hilbert spaces $H^B_1$ and $H^F_1$ of boson, respectively fermion, one-particle state vectors are defined by

\[ H^B_1 := \bigoplus_{\tau\in T_B}H^{[\tau]},\qquad H^F_1 := \bigoplus_{\tau\in T_F}H^{[\tau]}. \]

If we write $T_B = \{\tau_{B,1},\ldots,\tau_{B,k_B}\}$ and $T_F = \{\tau_{F,1},\ldots,\tau_{F,k_F}\}$, then an arbitrary vector $h$ in $H^{B/F}_1$ can be written as a sum

\[ h(\tau_{B/F,1})\oplus\cdots\oplus h(\tau_{B/F,k_{B/F}}) \]

with $h(\tau_{B/F,j})\in H^{[\tau_{B/F,j}]}$. Because each $h(\tau_{B/F,j})$ is in turn a function of $\mathbf{p}$ and $\sigma$, we can write $h\in H^{B/F}_1$ as a function$^{20}$ $h(\tau,\mathbf{p},\sigma)$, where the set of possible values of $\sigma$ depends on $\tau$. The inner product of two such functions $h$ and $g$ is then given by

\[
\begin{aligned}
\langle h,g\rangle_{H^{B/F}_1}
 &= \sum_{\tau\in T_{B/F}}\langle h(\tau),g(\tau)\rangle_{H^{[\tau]}}\\
 &= \sum_{\tau\in T_{B/F}}\sum_{\sigma\in I_\tau}\int_{\mathbb{R}^3}h(\tau,\mathbf{p},\sigma)\,\overline{g(\tau,\mathbf{p},\sigma)}\,d\lambda(\mathbf{p})
\end{aligned}
\tag{2.48}
\]

where $I_\tau = \{-s_\tau,\ldots,s_\tau\}$ if $\tau$ is a massive particle with spin $s_\tau$, $I_\tau = \{\sigma_\tau\}$ if $\tau$ is a massless particle with helicity $\sigma_\tau$ and $I_\tau = \{-\sigma_\tau,\sigma_\tau\}$ if $\tau$ is a massless particle with possible helicities $\pm\sigma_\tau$ (see also the discussion above about parity); the volume element $d\lambda(\mathbf{p})$ is either $d^3p$ or $\frac{d^3p}{2\omega_{\mathbf{p}}}$, depending on whether $h$ and $g$ are momentum-spin wave functions or state functions, respectively. The $n$-fold tensor product $\bigl(H^{B/F}_1\bigr)^{\otimes n}$ can be identified with the closed linear span of all product functions

\[ h_1\otimes\ldots\otimes h_n \simeq h_1(\tau_1,\mathbf{p}_1,\sigma_1)h_2(\tau_2,\mathbf{p}_2,\sigma_2)\ldots h_n(\tau_n,\mathbf{p}_n,\sigma_n) \]

with all $h_j\in H^{B/F}_1$. We can then construct the $n$-fold symmetrized (respectively antisymmetrized) tensor products $\mathcal{F}^n_\pm(H^{B/F}_1)$ by using the projection operators $P^\pm_n$: the space $\mathcal{F}^n_\pm(H^{B/F}_1)$ is the closed linear span of all functions of the form

\[ P^\pm_n(h_1\otimes\ldots\otimes h_n) = \frac{1}{n!}\sum_{\rho\in S_n}\varepsilon_\pm(\rho)\,h_{\rho(1)}(\tau_1,\mathbf{p}_1,\sigma_1)h_{\rho(2)}(\tau_2,\mathbf{p}_2,\sigma_2)\ldots h_{\rho(n)}(\tau_n,\mathbf{p}_n,\sigma_n). \]

We then take the direct sum of all these spaces to obtain the Fock spaces $\mathcal{F}_\pm(H^{B/F}_1)$. On these spaces $\mathcal{F}_\pm(H^{B/F}_1)$ we can define, as in (2.42)-(2.45), the creation and annihilation operators $A^*_\pm(h)$ and $A_\pm(h)$ for vectors $h\in H^{B/F}_1$. Because we can write each such $h\in H^{B/F}_1$ as a direct sum of vectors $h(\tau)\in H^{[\tau]}$ with all $\tau\in T_{B/F}$, it is useful to introduce a special notation for creation and annihilation operators corresponding to a single particle species. We will write $A^*(\tau,h)$ and $A(\tau,h)$ to denote the creation and annihilation operators corresponding to a vector $h = h(\mathbf{p},\sigma)\in H^{[\tau]}$. In other words, $A^{(*)}(\tau,h) = A^{(*)}_\pm(g)$, where $g\in H^{B/F}_1$ is the function $g(\tau',\mathbf{p},\sigma) = \delta_{\tau\tau'}h(\mathbf{p},\sigma)$, i.e. the direct sum whose $\tau$-component is $h$ and whose other components vanish. Note that we suppress the subindex $\pm$ in $A^{(*)}(\tau,\cdot)$, because the choice between $+$ and $-$ follows from $\tau$.

The Fock space corresponding to the entire system of particles $T = T_B\cup T_F$ is the tensor product

\[ H_{\mathrm{Fock}} := \mathcal{F}_+(H^B_1)\otimes\mathcal{F}_-(H^F_1) \tag{2.49} \]

20 These functions $h(\tau,\mathbf{p},\sigma)$ can be either state functions $\Psi$ or else momentum-spin wave functions $\psi$, each taken in the corresponding version of the space $H^{B/F}_1$.


of the boson Fock space $\mathcal{F}_+(H^B_1)$ and the fermion Fock space $\mathcal{F}_-(H^F_1)$. If $\Omega_B$ and $\Omega_F$ denote the vacuum vectors of $\mathcal{F}_+(H^B_1)$ and $\mathcal{F}_-(H^F_1)$, then the vector $\Omega_{\mathrm{Fock}} := \Omega_B\otimes\Omega_F$ is called the vacuum vector of $H_{\mathrm{Fock}}$. When $\tau$ is a boson, respectively fermion, we can let the operators $A^{(*)}(\tau,h)$ act on the entire space $H_{\mathrm{Fock}}$ by taking the tensor product $A^{(*)}(\tau,h)\otimes 1_{\mathcal{F}_-(H^F_1)}$, respectively $1_{\mathcal{F}_+(H^B_1)}\otimes A^{(*)}(\tau,h)$. If $\tau$ and $\tau'$ are both bosons or both fermions, so either $\tau,\tau'\in T_B$ or $\tau,\tau'\in T_F$, then these operators satisfy the relations

\[ [A^*(\tau,h_1),A^*(\tau',h_2)]_\mp = [A(\tau,h_1),A(\tau',h_2)]_\mp = 0 \]

\[ [A(\tau,h_1),A^*(\tau',h_2)]_\mp = \delta_{\tau,\tau'}\langle h_2,h_1\rangle\,1_{H_{\mathrm{Fock}}}, \]

where the upper sign corresponds to the boson case and the lower sign to the fermion case. Note that interchanging two operators corresponding to different fermion types costs a minus sign. If one of the particles $\tau$ and $\tau'$ is a boson and the other is a fermion, then their creation and annihilation operators commute with each other. We finally note that the unitary representations $\{U_\tau\}_{\tau\in T}$ of $\tilde{P}_+^\uparrow$ on the one-particle spaces $H^{[\tau]}$ define unitary representations $U_B := \bigoplus_{\tau\in T_B}U_\tau$ and $U_F := \bigoplus_{\tau\in T_F}U_\tau$ of $\tilde{P}_+^\uparrow$ on $H^B_1$ and $H^F_1$, respectively. These can then be used to define a unitary representation $U_{\mathrm{Fock}} := \Gamma_+(U_B)\otimes\Gamma_-(U_F)$ on $H_{\mathrm{Fock}}$. If all one-particle spaces $H^{[\tau]}$ also carry parity and time reversal operators, then a similar construction gives us parity and time reversal operators on $H_{\mathrm{Fock}}$.

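As emphasized above, all definitions in this section hold both for the Hilbert space of one-particle state functions $\Psi(\mathbf{p},\sigma)$ and for the Hilbert space of one-particle momentum-spin wave functions $\psi(\mathbf{p},\sigma)$. In the first case (i.e. when $H$, $H^{[\tau]}$, etc. are spaces of state functions) we obtain the corresponding Fock spaces $\mathcal{F}_\pm(H)$ and $H_{\mathrm{Fock}}$, and in this case we will write the creation and annihilation operators with a capital letter $A$,

\[ A^{(*)}_\pm(\Psi),\quad A^{(*)}(\tau,\Psi),\quad\text{etc.,} \]

whereas in the second case (i.e. when $H$, $H^{[\tau]}$, etc. are spaces of momentum-spin wave functions) we obtain the corresponding Fock spaces $\mathcal{F}_\pm(H)$ and $H_{\mathrm{Fock}}$, and in this case we will always write the creation and annihilation operators with a small letter $a$,

\[ a^{(*)}_\pm(\psi),\quad a^{(*)}(\tau,\psi),\quad\text{etc.} \]

In the previous section we defined the unitary map $J$ from the space of state functions to the space of momentum-spin wave functions that relates the two Hilbert spaces. This map naturally extends to a family of maps $\Gamma^n(J)$ between the corresponding $n$-fold tensor products by defining $\Gamma^n(J)(\Psi_1\otimes\ldots\otimes\Psi_n) = (J\Psi_1)\otimes\ldots\otimes(J\Psi_n)$. For each $n$, the restriction of $\Gamma^n(J)$ to the (anti)symmetric subspace $\mathcal{F}^n_\pm$ then defines a unitary map $\Gamma^n_\pm(J)$ between the two versions of $\mathcal{F}^n_\pm$. Hence, there is a unitary map $\Gamma_\pm(J)$ from the Fock space of state functions to the Fock space of momentum-spin wave functions that relates the two Fock spaces. For $\Upsilon$ in the Fock space of state functions, the element $\Gamma_\pm(J)\Upsilon$ is physically equivalent to $\Upsilon$ in a similar sense that, for a state function $\Psi$, the element $J\Psi$ is physically equivalent to $\Psi$. Using $\Gamma_\pm(J)$, we can express the relation between $A^{(*)}$ and $a^{(*)}$ as

\[ A^{(*)}_\pm(\Psi) = \Gamma_\pm(J)^{-1}\,a^{(*)}_\pm(J\Psi)\,\Gamma_\pm(J) \]

\[ a^{(*)}_\pm(\psi) = \Gamma_\pm(J)\,A^{(*)}_\pm(J^{-1}\psi)\,\Gamma_\pm(J)^{-1} \]

Thus $A^{(*)}_\pm(\Psi)$ and $a^{(*)}_\pm(J\Psi)$ represent the same physics in the sense that if $\Upsilon$ is a vector in the Fock space of state functions and if $\upsilon = \Gamma_\pm(J)\Upsilon$ (which is physically equivalent to $\Upsilon$), then $A^{(*)}_\pm(\Psi)\Upsilon$ is physically equivalent to $a^{(*)}_\pm(J\Psi)\upsilon$.

Especially in the next chapter, on the physics of quantum fields, the description in terms of momentum-spin wave functions will be most convenient. This coincides with the notation in most of the physics literature, where small letters $a$ are used to denote creation and annihilation operators.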


3 The physics of quantum fields

In this chapter we will give a brief overview of the use of quantum fields in physics in the so-called canonical formalism (as opposed to the path-integral formalism). We will do this by following the main arguments that are stated in chapters 3, 4 and 5 of Weinberg's book [35]. In contrast to the previous chapters (and the following ones) this chapter will not be mathematically rigorous. Also, we will not provide any derivations of the results in this chapter since they can all be found in [35]. The purpose of this chapter is merely to give some physical background on the mathematical constructions in the next chapter and to motivate the content of the axioms of the two mathematical frameworks that will be discussed in the next chapter.

3.1 The interaction picture and scattering theory

For any quantum system with Hamiltonian $H$ the time-evolution operator is given by $U_H(t) = e^{-itH}$, see also subsection 2.2.2. Now assume that this Hamiltonian can be written as $H = H_0 + V$, where $H_0$ is a Hamiltonian which corresponds to an easy and well-understood quantum system and $V$ is some extra term which makes the system more complicated. We thus assume that we already know the evolution operator $U_{H_0}(t) = e^{-itH_0}$, and we want to express $U_H(t)$ in terms of $U_{H_0}(t)$ and $V$. For this purpose we introduce the so-called interaction picture, which lies in between the Heisenberg picture and the Schrödinger picture.

The interaction picture

In the interaction picture an observable $A$ evolves according to

\[ A_I(t) = U_{H_0}(-t)\,A\,U_{H_0}(t) \]

and a state vector $\Psi$ evolves according to

\[ \Psi_I(t) = \Omega(t)^*\Psi, \]

where $\Omega(t) := U_H(-t)U_{H_0}(t)$ and hence $\Omega(t)^* = U_{H_0}(-t)U_H(t)$. It is easy to see that the interaction picture is physically equivalent to the Heisenberg picture (which is in turn physically equivalent to the Schrödinger picture), since

\[
\begin{aligned}
\langle A_I(t)\Psi_I(t),\Psi_I(t)\rangle
 &= \langle U_{H_0}(-t)AU_{H_0}(t)U_{H_0}(-t)U_H(t)\Psi,\,U_{H_0}(-t)U_H(t)\Psi\rangle\\
 &= \langle U_{H_0}(-t)AU_H(t)\Psi,\,U_{H_0}(-t)U_H(t)\Psi\rangle
  = \langle AU_H(t)\Psi,\,U_H(t)\Psi\rangle\\
 &= \langle U_H(-t)AU_H(t)\Psi,\Psi\rangle.
\end{aligned}
\]

Since we are interested in $U_H(t)$ and because $U_H(t) = U_{H_0}(t)\Omega(t)^*$, we must thus find a way to calculate $\Omega(t)^*$. It follows directly from the expression $\Omega(t)^* = U_{H_0}(-t)U_H(t)$ that $\Omega(t)^*$ satisfies $\frac{d\Omega(t)^*}{dt} = \frac{1}{i}V_I(t)\Omega(t)^*$, where $V_I(t)$ denotes the interaction picture evolution of $V$. This allows us to write

\[ \Omega(t)^* = \Omega(0)^* + \int_0^t\frac{d\Omega(\tau_1)^*}{d\tau_1}\,d\tau_1 = 1 + \frac{1}{i}\int_0^t V_I(\tau_1)\Omega(\tau_1)^*\,d\tau_1. \]
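The differential equation for $\Omega(t)^*$ that enters this integral equation can be verified in two lines. The sketch below is formal in the same sense as the rest of this chapter (domain questions for the unbounded operators $H_0$ and $H$ are ignored) and uses only $\Omega(t)^* = e^{itH_0}e^{-itH}$ and $V = H - H_0$.

```latex
% Formal derivation of d\Omega(t)^*/dt = (1/i) V_I(t) \Omega(t)^*.
\begin{align*}
\frac{d\Omega(t)^*}{dt}
  &= \frac{d}{dt}\Bigl(e^{itH_0}e^{-itH}\Bigr)
   = e^{itH_0}\,(iH_0 - iH)\,e^{-itH}
   = \frac{1}{i}\,e^{itH_0}\,V\,e^{-itH}\\
  &= \frac{1}{i}\,
     \underbrace{e^{itH_0}\,V\,e^{-itH_0}}_{=\,V_I(t)}\;
     \underbrace{e^{itH_0}\,e^{-itH}}_{=\,\Omega(t)^*} .
\end{align*}
```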

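We can then insert this expression for $\Omega(t)^*$ into the right-hand side of this same equation and repeat this procedure over and over again, so that finally we obtain

\[
\begin{aligned}
\Omega(t)^* &\sim 1 + \sum_{n=1}^{\infty}\frac{1}{i^n}\int_0^t\int_0^{\tau_n}\ldots\int_0^{\tau_2}V_I(\tau_n)V_I(\tau_{n-1})\ldots V_I(\tau_1)\,d\tau_1\ldots d\tau_{n-1}\,d\tau_n\\
 &= 1 + \sum_{n=1}^{\infty}\frac{1}{i^n}\int_{t>\tau_n>\tau_{n-1}>\ldots>\tau_1>0}V_I(\tau_n)V_I(\tau_{n-1})\ldots V_I(\tau_1)\,d\tau_1\ldots d\tau_{n-1}\,d\tau_n\\
 &= 1 + \sum_{n=1}^{\infty}\frac{1}{i^n\,n!}\int_0^t\ldots\int_0^t T\{V_I(\tau_n)V_I(\tau_{n-1})\ldots V_I(\tau_1)\}\,d\tau_1\ldots d\tau_{n-1}\,d\tau_n,
\end{aligned}
\tag{3.1}
\]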


where $T$ denotes the time-ordered product (i.e. the operators are ordered in such a way that their time variables decrease from left to right, so that the operator with the latest time stands leftmost). This is the expression for $\Omega(t)^*$ that we needed. We will now apply these results to the case of a scattering experiment.
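As a check on the factor $1/n!$ in (3.1), the second-order term can be worked out explicitly. The sketch assumes $t > 0$ and splits the square $[0,t]^2$ into the two regions $\tau_2 > \tau_1$ and $\tau_1 > \tau_2$; the diagonal has measure zero.

```latex
% Second-order term: the time-ordered integral over the square equals
% 2! times the integral over the ordered region t > \tau_2 > \tau_1 > 0.
\begin{align*}
\int_0^t\!\!\int_0^t T\{V_I(\tau_2)V_I(\tau_1)\}\,d\tau_1\,d\tau_2
  &= \int_{t>\tau_2>\tau_1>0} V_I(\tau_2)V_I(\tau_1)\,d\tau_1\,d\tau_2
   + \int_{t>\tau_1>\tau_2>0} V_I(\tau_1)V_I(\tau_2)\,d\tau_1\,d\tau_2\\
  &= 2\int_{t>\tau_2>\tau_1>0} V_I(\tau_2)V_I(\tau_1)\,d\tau_1\,d\tau_2 ,
\end{align*}
% where the second integral was turned into the first by renaming
% \tau_1 <-> \tau_2.  For general n the cube [0,t]^n splits into n!
% such regions, one for each ordering of the time variables.
```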

Scattering experiments

In a typical scattering experiment particles approach each other from very large mutual distances. They will then interact with each other in some small region in space, and finally a collection of particles (not necessarily the same ones) comes out of this region and moves apart to large mutual distances. At the beginning and at the end of such an experiment the particles are so far apart that they do not interact with each other. Therefore, if $\mathcal{H}$ denotes the Hilbert space corresponding to the scattering experiment and $\Psi\in\mathcal{H}$ is a pure state vector (in the Heisenberg picture), the transformed state vectors $e^{-iHt}\Psi$ must in some sense 'look like' state vectors in a free particle Fock space $H_{\mathrm{Fock}}$ when $t\to\pm\infty$; here $H_{\mathrm{Fock}}$ may be either of the two Fock spaces (of state functions or of momentum-spin wave functions) constructed in section 2.2.5. Mathematically, this means that we have two linear isometric embeddings $\Omega^{\mathrm{in}},\Omega^{\mathrm{out}}: H_{\mathrm{Fock}}\to\mathcal{H}$ from the free particle Fock space into $\mathcal{H}$. We then define $\mathcal{H}^{\mathrm{in}},\mathcal{H}^{\mathrm{out}}\subset\mathcal{H}$ by $\mathcal{H}^{\mathrm{in}} = \Omega^{\mathrm{in}}H_{\mathrm{Fock}}$ and $\mathcal{H}^{\mathrm{out}} = \Omega^{\mathrm{out}}H_{\mathrm{Fock}}$; these are called the spaces of asymptotic states of incoming and outgoing particles, respectively.

Physically, the maps $\Omega^{\mathrm{in}}$ and $\Omega^{\mathrm{out}}$ should be interpreted as follows. If $h\in H_{\mathrm{Fock}}$ is a state corresponding to a collection of non-interacting particles with some specified momenta and spin components along the third coordinate axis, then $\Omega^{\mathrm{in}}h\in\mathcal{H}$ represents the state vector in a scattering experiment where the state of the incoming particles (when they are not yet interacting) is given by $h$. The physical interpretation of $\Omega^{\mathrm{out}}$ is analogous.

The linear operator $S = (\Omega^{\mathrm{out}})^*\Omega^{\mathrm{in}}: H_{\mathrm{Fock}}\to H_{\mathrm{Fock}}$ is called the scattering operator. In physics it is often assumed that $\mathcal{H}^{\mathrm{in}} = \mathcal{H}^{\mathrm{out}}$ (so-called asymptotic completeness), which is equivalent to the requirement that $S$ is unitary. The scattering matrix, or S-matrix, is defined by

\[ S_{\beta\alpha} := \langle Sh_\alpha,h_\beta\rangle_{H_{\mathrm{Fock}}} = \langle\Omega^{\mathrm{in}}h_\alpha,\Omega^{\mathrm{out}}h_\beta\rangle_{\mathcal{H}}, \]

where $h_\alpha,h_\beta\in H_{\mathrm{Fock}}$ are free-particle states. Note that it represents the transition amplitude for the transition $\Omega^{\mathrm{in}}h_\alpha\to\Omega^{\mathrm{out}}h_\beta$. Now recall the definition of the unitary representation $U_{\mathrm{Fock}}$ of $\tilde{P}_+^\uparrow$ on $H_{\mathrm{Fock}}$ as given in subsection 2.2.5. The representation $U_{\mathrm{Fock}}$ induces two unitary representations $U^{\mathrm{in}} := \Omega^{\mathrm{in}}U_{\mathrm{Fock}}(\Omega^{\mathrm{in}})^*$ and $U^{\mathrm{out}} := \Omega^{\mathrm{out}}U_{\mathrm{Fock}}(\Omega^{\mathrm{out}})^*$ of $\tilde{P}_+^\uparrow$ on $\mathcal{H}^{\mathrm{in}} = \mathcal{H}^{\mathrm{out}}$. The theory will be Poincaré invariant if $U^{\mathrm{in}} = U^{\mathrm{out}}$, which is equivalent to $SU_{\mathrm{Fock}} = U_{\mathrm{Fock}}S$.
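One direction of this last equivalence can be made explicit. The sketch below assumes that $\Omega^{\mathrm{in}}$ and $\Omega^{\mathrm{out}}$ are isometries, so that $(\Omega^{\mathrm{in}})^*\Omega^{\mathrm{in}} = (\Omega^{\mathrm{out}})^*\Omega^{\mathrm{out}} = 1$ on the Fock space, and that $U^{\mathrm{in}} = U^{\mathrm{out}}$; the argument of the representations (an element of the covering group) is suppressed.

```latex
% U^{in} = U^{out}  implies  S U_Fock = U_Fock S.
\begin{align*}
U^{\mathrm{in}}\,\Omega^{\mathrm{in}}
  &= \Omega^{\mathrm{in}}U_{\mathrm{Fock}}(\Omega^{\mathrm{in}})^*\Omega^{\mathrm{in}}
   = \Omega^{\mathrm{in}}U_{\mathrm{Fock}},\\
(\Omega^{\mathrm{out}})^*\,U^{\mathrm{out}}
  &= (\Omega^{\mathrm{out}})^*\Omega^{\mathrm{out}}U_{\mathrm{Fock}}(\Omega^{\mathrm{out}})^*
   = U_{\mathrm{Fock}}(\Omega^{\mathrm{out}})^*,\\
S\,U_{\mathrm{Fock}}
  &= (\Omega^{\mathrm{out}})^*\,\Omega^{\mathrm{in}}U_{\mathrm{Fock}}
   = (\Omega^{\mathrm{out}})^*\,U^{\mathrm{in}}\,\Omega^{\mathrm{in}}
   = (\Omega^{\mathrm{out}})^*\,U^{\mathrm{out}}\,\Omega^{\mathrm{in}}
   = U_{\mathrm{Fock}}\,(\Omega^{\mathrm{out}})^*\Omega^{\mathrm{in}}
   = U_{\mathrm{Fock}}\,S .
\end{align*}
```

The converse direction uses in addition that $S$ is unitary, so that $\Omega^{\mathrm{in}} = \Omega^{\mathrm{out}}S$.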

In physics textbooks, it is often assumed that the free particle states and the asymptotic states both live in the same Hilbert space, namely the Fock space of many-particle momentum-spin wave functions, so in the rest of this chapter we will take $H_{\mathrm{Fock}}$ to be this Fock space (rather than the Fock space of state functions). Also, the Hamiltonian $H$ of the system is assumed to be a sum $H = H_0 + V$ of a free-particle Hamiltonian $H_0$ and an interaction term $V$. Then the operator

\[ \Omega(t) = U_H(-t)U_{H_0}(t) = e^{iHt}e^{-iH_0t} \]

is defined, and $\Omega(\mp\infty) = \lim_{t\to\mp\infty}\Omega(t)$ correspond to $\Omega^{\mathrm{in}}$ and $\Omega^{\mathrm{out}}$, respectively, so that

\[ S = \lim_{t\to\infty}\ \lim_{t_0\to-\infty}\Omega(t)^*\Omega(t_0). \]

A similar calculation as the one which led to equation (3.1) gives that the operator $\Omega(t)^*\Omega(t_0)$ can be written as

\[ \Omega(t)^*\Omega(t_0) = 1 + \sum_{n=1}^{\infty}\frac{1}{i^n\,n!}\int_{t_0}^{t}\ldots\int_{t_0}^{t}T\{V_I(\tau_n)V_I(\tau_{n-1})\ldots V_I(\tau_1)\}\,d\tau_1\ldots d\tau_{n-1}\,d\tau_n. \]

The S-operator can then be written as

\[ S = 1 + \sum_{n=1}^{\infty}\frac{(-i)^n}{n!}\int_{-\infty}^{\infty}\ldots\int_{-\infty}^{\infty}dt_1\ldots dt_n\,T\{V_I(t_1)\ldots V_I(t_n)\}. \]
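To see what this series gives in the simplest case, one can evaluate the first-order term between free-particle states. The following formal sketch assumes that $h_\alpha$ and $h_\beta$ are (improper) eigenvectors of $H_0$ with eigenvalues $E_\alpha$ and $E_\beta$; these symbols are introduced here for illustration only. It also uses the convention of this text that the inner product is linear in its first argument.

```latex
% First-order (Born) term of the S-matrix, formal computation.
% Assumption: H_0 h_\alpha = E_\alpha h_\alpha and H_0 h_\beta = E_\beta h_\beta.
\begin{align*}
\langle V_I(t)h_\alpha,h_\beta\rangle
  &= \langle e^{itH_0}\,V\,e^{-itH_0}h_\alpha,\,h_\beta\rangle
   = e^{i(E_\beta-E_\alpha)t}\,\langle Vh_\alpha,h_\beta\rangle,\\
S_{\beta\alpha}
  &\approx \langle h_\alpha,h_\beta\rangle
     - i\int_{-\infty}^{\infty}\langle V_I(t)h_\alpha,h_\beta\rangle\,dt
   = \langle h_\alpha,h_\beta\rangle
     - 2\pi i\,\delta(E_\beta-E_\alpha)\,\langle Vh_\alpha,h_\beta\rangle .
\end{align*}
```

The delta function expresses energy conservation between the incoming and outgoing free-particle states.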


As a first step to guarantee that the S-matrix will be Lorentz-invariant, it is assumed that $V_I(t)$ is of the form

\[ V_I(t) = \int_{\mathbb{R}^3}\mathscr{H}(t,\mathbf{x})\,d^3x, \tag{3.2} \]

with $\mathscr{H}(x)$ a scalar in the sense that

\[ U_{\mathrm{Fock}}(a,A)\,\mathscr{H}(x)\,U_{\mathrm{Fock}}(a,A)^{-1} = \mathscr{H}(\Phi(A)x + a), \tag{3.3} \]

where $\Phi:\tilde{P}_+^\uparrow\to P_+^\uparrow$ denotes the covering map, as usual$^{21}$. The S-operator may then be written as

\[ S = 1 + \sum_{n=1}^{\infty}\frac{(-i)^n}{n!}\int_M\ldots\int_M d^4x_1\ldots d^4x_n\,T\{\mathscr{H}(x_1)\ldots\mathscr{H}(x_n)\}. \tag{3.4} \]

The time ordering of two points in spacetime is only Lorentz-invariant when the two points are not spacelike separated, so to obtain Lorentz invariance we must have that $\mathscr{H}(x_1)\mathscr{H}(x_2) = \mathscr{H}(x_2)\mathscr{H}(x_1)$ whenever $x_1$ and $x_2$ are spacelike separated. Thus, $\mathscr{H}(x)$ satisfies

\[ [\mathscr{H}(x_1),\mathscr{H}(x_2)] = 0 \tag{3.5} \]

when $(x_1 - x_2)^2 < 0$. Actually, to avoid singularities when $x_1 = x_2$, it is assumed that this holds even when $(x_1 - x_2)^2\leq 0$.
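The statement that time ordering is frame-dependent only for spacelike separation can be illustrated with a boost along one axis. The sketch assumes units with $c = 1$ and writes the separation of the two points as $(\Delta t,\Delta x,0,0)$ with $\Delta t > 0$ and $\Delta x > 0$; the symbols $\Delta t$, $\Delta x$, $v$ and $\gamma$ are introduced here for illustration.

```latex
% Time difference of two events after a boost with velocity v along the x-axis.
\begin{align*}
\Delta t' &= \gamma\,(\Delta t - v\,\Delta x),
  \qquad \gamma = (1-v^2)^{-1/2},\quad |v| < 1 .\\[4pt]
\text{Spacelike } (0 < \Delta t < \Delta x):&\quad
  \Delta t' < 0 \ \text{ for every } v \text{ with } \tfrac{\Delta t}{\Delta x} < v < 1,\\
\text{Timelike or lightlike } (\Delta t \geq \Delta x > 0):&\quad
  \Delta t - v\,\Delta x \geq \Delta t - |v|\,\Delta x > 0 \ \text{ for all } |v| < 1 .
\end{align*}
```

So for spacelike separation there are admissible frames in which the order of the two times is reversed, whereas otherwise the order is the same in every frame.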

In physics it is assumed that different experiments that are carried out at large spatial distances from each other cannot influence one another. This is called the cluster decomposition principle. In terms of the S-matrix, this principle implies that for multiparticle scattering processes $\alpha_1\to\beta_1,\ldots,\alpha_N\to\beta_N$ that are carried out at $N$ different laboratories that are far apart, the S-matrix element of the composite experiment will factorize:

\[ S_{\beta_1+\ldots+\beta_N,\ \alpha_1+\ldots+\alpha_N}\to S_{\beta_1\alpha_1}\cdot\ldots\cdot S_{\beta_N\alpha_N}. \]

To see what kind of Hamiltonians will give rise to an S-matrix that satisfies this property, we need to consider the creation and annihilation operators $a^{(*)}(\tau,\psi)$ on the space $H_{\mathrm{Fock}}$ defined in subsection 2.2.5. In the notation where we write a distribution $F$ as a function $F(x)$ in the sense that $F(f) = \int F(x)f(x)\,dx$, we can write the creation and annihilation operators $a^{(*)}(\tau,\cdot)$ as (operator-valued) functions $a^{(*)}(\tau,\mathbf{p},\sigma)$ in the sense that$^{22}$

\[ a^*(\tau,\psi) = \sum_{\sigma\in I_\tau}\int_{\mathbb{R}^3}d^3p\,\psi(\tau,\mathbf{p},\sigma)\,a^*(\tau,\mathbf{p},\sigma) \]

\[ a(\tau,\psi) = \sum_{\sigma\in I_\tau}\int_{\mathbb{R}^3}d^3p\,\overline{\psi(\tau,\mathbf{p},\sigma)}\,a(\tau,\mathbf{p},\sigma), \]

where we used that $a(\tau,\psi)$ depends conjugate-linearly on $\psi$. It is useful to introduce the so-called definite momentum-spin wave functions $\psi_{\tau',\mathbf{p}',\sigma'}$. They are defined by

\[ \psi_{\tau',\mathbf{p}',\sigma'}(\tau,\mathbf{p},\sigma) = \delta_{\tau\tau'}\delta_{\sigma\sigma'}\delta(\mathbf{p}-\mathbf{p}'). \]

The corresponding definite momentum-spin state functions $\Psi_{\tau',\mathbf{p}',\sigma'}(\tau,\mathbf{p},\sigma)$ are then given by

\[ \Psi_{\tau',\mathbf{p}',\sigma'}(\tau,\mathbf{p},\sigma) = \sqrt{2\omega_{\mathbf{p}'}}\,\delta_{\tau\tau'}\delta_{\sigma\sigma'}\delta(\mathbf{p}-\mathbf{p}'), \]

21 In some (gauge) theories both (3.2) and (3.3) are only approximately true, but this will not affect the Lorentz invariance. In such theories the interaction is obtained from a Lorentz invariant Lagrangian, which will guarantee Lorentz invariance in a more general way than equations (3.2) and (3.3). We will come back to Lagrangians later.

22 Here and in the rest of this chapter the index set $I_\tau$ is defined as in (2.48).


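but they will not be very useful. The "inner product" of two definite-momentum wave functions is given by

\[
\begin{aligned}
\sum_{\tau}\sum_{\sigma\in I_\tau}\int_{\mathbb{R}^3}\psi_{\tau',\mathbf{p}',\sigma'}(\tau,\mathbf{p},\sigma)\,\overline{\psi_{\tau'',\mathbf{p}'',\sigma''}(\tau,\mathbf{p},\sigma)}\,d^3p
 &= \delta_{\tau'\tau''}\delta_{\sigma'\sigma''}\int_{\mathbb{R}^3}d^3p\,\delta(\mathbf{p}-\mathbf{p}')\delta(\mathbf{p}-\mathbf{p}'')\\
 &= \delta_{\tau'\tau''}\delta_{\sigma'\sigma''}\delta(\mathbf{p}'-\mathbf{p}'').
\end{aligned}
\]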
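For comparison, the same computation can be carried out for the definite momentum-spin state functions. The sketch below assumes that the inner product for state functions is the one in (2.48) with volume element $\frac{d^3p}{2\omega_{\mathbf{p}}}$, where $\omega_{\mathbf{p}}$ is the energy belonging to the mass of the species in question.

```latex
% "Inner product" of two definite momentum-spin state functions,
% assuming the measure d^3p / (2\omega_p) for state functions.
\begin{align*}
\sum_{\tau}\sum_{\sigma\in I_\tau}\int_{\mathbb{R}^3}
  \Psi_{\tau',\mathbf{p}',\sigma'}(\tau,\mathbf{p},\sigma)\,
  \overline{\Psi_{\tau'',\mathbf{p}'',\sigma''}(\tau,\mathbf{p},\sigma)}\,
  \frac{d^3p}{2\omega_{\mathbf{p}}}
 &= \delta_{\tau'\tau''}\delta_{\sigma'\sigma''}
    \int_{\mathbb{R}^3}\frac{d^3p}{2\omega_{\mathbf{p}}}\,
    \sqrt{2\omega_{\mathbf{p}'}}\,\sqrt{2\omega_{\mathbf{p}''}}\;
    \delta(\mathbf{p}-\mathbf{p}')\,\delta(\mathbf{p}-\mathbf{p}'')\\
 &= \delta_{\tau'\tau''}\delta_{\sigma'\sigma''}\,\delta(\mathbf{p}'-\mathbf{p}'') .
\end{align*}
```

The factor $\sqrt{2\omega_{\mathbf{p}'}}$ in the definition of $\Psi_{\tau',\mathbf{p}',\sigma'}$ is thus exactly what compensates the weight $1/(2\omega_{\mathbf{p}})$ in the measure.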

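With the use of these definite momentum-spin wave functions, we can define the operators $a^{(*)}(\tau,\mathbf{p},\sigma)$ formally as $a^{(*)}(\psi_{\tau,\mathbf{p},\sigma})$, since

\[
\begin{aligned}
a^{(*)}(\tau',\mathbf{p}',\sigma')
 &= \sum_{\tau}\sum_{\sigma\in I_\tau}\int_{\mathbb{R}^3}d^3p\,\delta_{\tau\tau'}\delta_{\sigma\sigma'}\delta(\mathbf{p}-\mathbf{p}')\,a^{(*)}(\tau,\mathbf{p},\sigma)\\
 &= \sum_{\tau}\sum_{\sigma\in I_\tau}\int_{\mathbb{R}^3}d^3p\,\psi_{\tau',\mathbf{p}',\sigma'}(\tau,\mathbf{p},\sigma)\,a^{(*)}(\tau,\mathbf{p},\sigma)\\
 &= a^{(*)}(\psi_{\tau',\mathbf{p}',\sigma'}).
\end{aligned}
\]

Since we are not being mathematically rigorous in this chapter, we can thus describe the action of these operators as

\[ a^*(\tau,\mathbf{p},\sigma)P^\pm_n(\psi_1\otimes\ldots\otimes\psi_n) = \sqrt{n+1}\,P^\pm_{n+1}(\psi_{\tau,\mathbf{p},\sigma}\otimes\psi_1\otimes\ldots\otimes\psi_n) \]

\[
\begin{aligned}
a(\tau,\mathbf{p},\sigma)P^\pm_n(\psi_1\otimes\ldots\otimes\psi_n)
 &= \frac{1}{\sqrt{n}}\sum_{j=1}^{n}(\pm 1)^{j-1}\langle\psi_j,\psi_{\tau,\mathbf{p},\sigma}\rangle\,P^\pm_{n-1}(\psi_1\otimes\ldots\otimes\psi_{j-1}\otimes\psi_{j+1}\otimes\ldots\otimes\psi_n)\\
 &= \frac{1}{\sqrt{n}}\sum_{j=1}^{n}(\pm 1)^{j-1}\psi_j(\tau,\mathbf{p},\sigma)\,P^\pm_{n-1}(\psi_1\otimes\ldots\otimes\psi_{j-1}\otimes\psi_{j+1}\otimes\ldots\otimes\psi_n).
\end{aligned}
\]

The operator $a(\tau,\mathbf{p},\sigma)$ is the restriction to the subspace $\mathcal{F}_\pm(H)$ of the operator $b(\tau,\mathbf{p},\sigma)$, defined on $\bigoplus_{n=0}^{\infty}H^{\otimes n}$ by

\[ b(\tau,\mathbf{p},\sigma)(\psi_1\otimes\ldots\otimes\psi_n) = \sqrt{n}\,\langle\psi_1,\psi_{\tau,\mathbf{p},\sigma}\rangle\,\psi_2\otimes\ldots\otimes\psi_n = \sqrt{n}\,\psi_1(\tau,\mathbf{p},\sigma)\,(\psi_2\otimes\ldots\otimes\psi_n). \]

For an arbitrary function $\psi^{(n)}\in H^{\otimes n}$ this means that

\[ [b(\tau,\mathbf{p},\sigma)\psi^{(n)}](\tau_1,\mathbf{p}_1,\sigma_1,\ldots,\tau_{n-1},\mathbf{p}_{n-1},\sigma_{n-1}) = \sqrt{n}\,\psi^{(n)}(\tau,\mathbf{p},\sigma,\tau_1,\mathbf{p}_1,\sigma_1,\ldots,\tau_{n-1},\mathbf{p}_{n-1},\sigma_{n-1}). \]

If $\tau$ is a massive particle of spin $s_\tau$, then the operators $a^{(*)}(\tau,\mathbf{p},\sigma)$ transform under a transformation $(b,A)\in\tilde{P}_+^\uparrow$ as$^{23}$

\[ U_{\mathrm{Fock}}(b,A)\,a_\tau(\mathbf{p},\sigma)\,U_{\mathrm{Fock}}(b,A)^{-1} = e^{-ib\cdot\Phi(A)p}\sqrt{\frac{(\Phi(A)p)^0}{p^0}}\sum_{\sigma'=-s_\tau}^{s_\tau}\bigl[D^{(s_\tau)}(M^{-1}_{\Phi(A)p}AM_p)^{-1}\bigr]_{\sigma\sigma'}\,a_\tau(\boldsymbol{\Phi(A)p},\sigma'), \tag{3.6} \]

\[ U_{\mathrm{Fock}}(b,A)\,a^*_\tau(\mathbf{p},\sigma)\,U_{\mathrm{Fock}}(b,A)^{-1} = e^{ib\cdot\Phi(A)p}\sqrt{\frac{(\Phi(A)p)^0}{p^0}}\sum_{\sigma'=-s_\tau}^{s_\tau}\overline{\bigl[D^{(s_\tau)}(M^{-1}_{\Phi(A)p}AM_p)^{-1}\bigr]_{\sigma\sigma'}}\,a^*_\tau(\boldsymbol{\Phi(A)p},\sigma'), \tag{3.7} \]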

23 In order to save space, we will from now on always write $a^{(*)}_\tau(\mathbf{p},\sigma)$ instead of $a^{(*)}(\tau,\mathbf{p},\sigma)$.


where $\boldsymbol{\Phi(A)p}$ denotes the spatial part of $\Phi(A)p$. If $\tau$ is a massless particle with (possible) helicity $\sigma_\tau$, these transformations become

\[ U_{\mathrm{Fock}}(b,A)\,a_\tau(\mathbf{p},\sigma_\tau)\,U_{\mathrm{Fock}}(b,A)^{-1} = e^{-ib\cdot\Phi(A)p}\sqrt{\frac{(\Phi(A)p)^0}{p^0}}\;e^{-i\sigma_\tau\alpha(M^{-1}_{\Phi(A)p}AM_p)}\,a_\tau(\boldsymbol{\Phi(A)p},\sigma_\tau), \tag{3.8} \]

\[ U_{\mathrm{Fock}}(b,A)\,a^*_\tau(\mathbf{p},\sigma_\tau)\,U_{\mathrm{Fock}}(b,A)^{-1} = e^{ib\cdot\Phi(A)p}\sqrt{\frac{(\Phi(A)p)^0}{p^0}}\;e^{i\sigma_\tau\alpha(M^{-1}_{\Phi(A)p}AM_p)}\,a^*_\tau(\boldsymbol{\Phi(A)p},\sigma_\tau), \tag{3.9} \]

where $\alpha(M)$ is defined in the same manner as in (2.36). The annihilation and creation operators, for both massive and massless particles, satisfy the (anti)commutation relations

\[ [a_\tau(\mathbf{p}',\sigma'),a^*_{\tau'}(\mathbf{p},\sigma)]_\pm = \delta_{\tau\tau'}\delta_{\sigma'\sigma}\delta(\mathbf{p}'-\mathbf{p}), \tag{3.10} \]

\[ [a^*_\tau(\mathbf{p}',\sigma'),a^*_{\tau'}(\mathbf{p},\sigma)]_\pm = 0, \tag{3.11} \]

\[ [a_\tau(\mathbf{p}',\sigma'),a_{\tau'}(\mathbf{p},\sigma)]_\pm = 0, \tag{3.12} \]

where the minus sign holds whenever $\tau$ and $\tau'$ are both bosons and the plus sign holds whenever $\tau$ and $\tau'$ are both fermions. When one of the particles $\tau$ or $\tau'$ is a boson and the other is a fermion, then all creation and annihilation operators commute. Instead of $a_\tau(\mathbf{p},\sigma)$ we will often simply write $a(q)$. Each operator $A$ can then be represented in the form

\[ A\sim\sum_{N=0}^{\infty}\sum_{M=0}^{\infty}\int dq'_1\ldots dq'_N\,dq_1\ldots dq_M\,A_{NM}(q',q)\,a^*(q'_1)\ldots a^*(q'_N)\,a(q_M)\ldots a(q_1) \]

in the sense that the matrix elements $\langle A\Psi_1,\Psi_2\rangle$ agree with those of the right-hand side for all state vectors $\Psi_1,\Psi_2$. Here the $\{A_{NM}(q',q)\}_{N,M}$ are complex-valued functions in the variables $q'_1,\ldots,q'_N,q_1,\ldots,q_M$. In particular, we can write the Hamiltonian as

\[ H = \sum_{N=0}^{\infty}\sum_{M=0}^{\infty}\int dq'_1\ldots dq'_N\,dq_1\ldots dq_M\,H_{NM}(q',q)\,a^*(q'_1)\ldots a^*(q'_N)\,a(q_M)\ldots a(q_1). \tag{3.13} \]

It can be shown that the cluster decomposition principle will be satisfied if the coefficients $H_{NM}(q',q)$ are of the form

\[ H_{NM}(q',q) = \delta\Bigl(\sum_{i=1}^{N}\mathbf{p}'_i-\sum_{j=1}^{M}\mathbf{p}_j\Bigr)\times\bigl(\text{a function of } q', q \text{ that contains no further delta functions}\bigr). \]

Because the free particle Hamiltonian is always of the form

\[ H_0 = \int dq\,E(q)\,a^*(q)a(q), \]

with $E(q) = E(\tau,\mathbf{p},\sigma) = \sqrt{m_\tau^2+\mathbf{p}^2}$, it follows that the interaction $V = H - H_0$ must be of the form

\[ V = \sum_{N=0}^{\infty}\sum_{M=0}^{\infty}\int dq'_1\ldots dq'_N\,dq_1\ldots dq_M\,V_{NM}(q',q)\,a^*(q'_1)\ldots a^*(q'_N)\,a(q_M)\ldots a(q_1) \tag{3.14} \]

with $V_{NM}$ equal to $\delta\bigl(\sum_{i=1}^{N}\mathbf{p}'_i-\sum_{j=1}^{M}\mathbf{p}_j\bigr)$ times a function of $q',q$ that contains no further delta functions. Note furthermore that $V$ will be hermitian if and only if the coefficients satisfy $V_{NM}(q'_1,\ldots,q'_N,q_1,\ldots,q_M) = \overline{V_{MN}(q_1,\ldots,q_M,q'_1,\ldots,q'_N)}$.


3.2 The use of free quantum fields in scattering theory

To summarize the results of the previous section, for Lorentz invariance the operator $V$ must be of the form (3.2) with $\mathscr{H}(x)$ satisfying (3.3) and (3.5), and for the S-matrix to satisfy the cluster decomposition principle, $V$ must be of the form (3.14) with coefficients $V_{NM}(q',q)$ as described above. To satisfy both of these conditions, we will (at the end of this section) construct a scalar density$^{24}$

\[ \mathscr{H}(x) = \sum_{N,M}\ \sum_{j_1,\ldots,j_N}\ \sum_{k_1,\ldots,k_M}g_{j_1\ldots j_N,k_1\ldots k_M}\,(\phi^{\tau_1})^-_{j_1}(x)\ldots(\phi^{\tau_N})^-_{j_N}(x)\,(\phi^{\tau'_1})^+_{k_1}(x)\ldots(\phi^{\tau'_M})^+_{k_M}(x) \]

out of annihilation fields $\{(\phi^\tau)^+_j(x)\}_{j,\tau}$ and creation fields $\{(\phi^\tau)^-_j(x)\}_{j,\tau}$:

\[ (\phi^\tau)^+_j(x) = \sum_{\sigma\in I_\tau}\int d^3p\,u_j(x;\mathbf{p},\sigma,\tau)\,a_\tau(\mathbf{p},\sigma), \]

\[ (\phi^\tau)^-_j(x) = \sum_{\sigma\in I_\tau}\int d^3p\,v_j(x;\mathbf{p},\sigma,\tau)\,a^*_\tau(\mathbf{p},\sigma). \]

Here the coefficients $u_j$ and $v_j$ are chosen so that under a transformation $(b,M)\in\tilde{P}_+^\uparrow$ the fields transform as

\[ U_{\mathrm{Fock}}(b,M)\,(\phi^\tau)^\pm_j(x)\,U_{\mathrm{Fock}}(b,M)^{-1} = \sum_{k}D(M^{-1})_{jk}\,(\phi^\tau)^\pm_k(\Phi(M)x+b), \tag{3.15} \]

with $\{D(M^{-1})_{jk}\}_{j,k}$ some collection of numbers depending on $M^{-1}$. When we take $M = 1$ and compare this expression with (3.6), (3.7), (3.8) and (3.9), we find that the coefficients are of the form

\[ u_j(x;\mathbf{p},\sigma,\tau) = (2\pi)^{-3/2}u_j(\mathbf{p},\sigma,\tau)e^{-ip\cdot x} \]

\[ v_j(x;\mathbf{p},\sigma,\tau) = (2\pi)^{-3/2}v_j(\mathbf{p},\sigma,\tau)e^{ip\cdot x}, \]

where $p^0 = \sqrt{m_\tau^2+\mathbf{p}^2}$. Furthermore, by applying two successive transformations as in (3.15), it can easily be seen that the matrices $D(M)$ form a representation of $SL(2,\mathbb{C})$. Therefore, before we can construct such fields we first need to consider the representations of $SL(2,\mathbb{C})$.
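The $x$-dependence of the coefficients quoted above can be obtained as follows for the annihilation field; the creation field is analogous. The sketch assumes that $D(1)$ is the identity matrix and that for $A = 1$ the right-hand sides of (3.6) and (3.8) reduce to $e^{-ib\cdot p}a_\tau(\mathbf{p},\sigma)$.

```latex
% Pure translations (b, 1): compare the two ways of transforming the field.
\begin{align*}
U_{\mathrm{Fock}}(b,1)\,(\phi^\tau)^+_j(x)\,U_{\mathrm{Fock}}(b,1)^{-1}
  &= \sum_{\sigma\in I_\tau}\int d^3p\;
     u_j(x;\mathbf{p},\sigma,\tau)\,e^{-ib\cdot p}\,a_\tau(\mathbf{p},\sigma)
     && \text{by (3.6) or (3.8)},\\
(\phi^\tau)^+_j(x+b)
  &= \sum_{\sigma\in I_\tau}\int d^3p\;
     u_j(x+b;\mathbf{p},\sigma,\tau)\,a_\tau(\mathbf{p},\sigma)
     && \text{by (3.15)}.
\end{align*}
% Equating the two expressions for all b gives
\begin{align*}
u_j(x+b;\mathbf{p},\sigma,\tau) = e^{-ib\cdot p}\,u_j(x;\mathbf{p},\sigma,\tau)
\quad\Longrightarrow\quad
u_j(x;\mathbf{p},\sigma,\tau) = e^{-ip\cdot x}\,u_j(0;\mathbf{p},\sigma,\tau).
\end{align*}
```

The prefactor $(2\pi)^{-3/2}$ is then a normalization convention, corresponding to writing $u_j(0;\mathbf{p},\sigma,\tau) = (2\pi)^{-3/2}u_j(\mathbf{p},\sigma,\tau)$.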

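Representations of $SL(2,\mathbb{C})$

Every finite-dimensional representation of $SL(2,\mathbb{C})$ can be written as a direct sum of irreducible representations and because $SL(2,\mathbb{C})$ is simply-connected, any representation $D$ of $SL(2,\mathbb{C})$ is completely determined by the corresponding representation $\varphi_D$ of its Lie algebra $\mathfrak{sl}(2,\mathbb{C})$. Recall that the six matrices $\{\frac{1}{2}\sigma_j,\frac{1}{2i}\sigma_j\}_{j=1,2,3}$ form a basis (over $\mathbb{R}$) for the Lie algebra $\mathfrak{sl}(2,\mathbb{C})$ of $SL(2,\mathbb{C})$. If $\varphi:\mathfrak{sl}(2,\mathbb{C})\to\mathfrak{gl}(V)$ is any representation of $\mathfrak{sl}(2,\mathbb{C})$ in some complex vector space $V$, then we define on $V$ the six linear maps $\mathcal{J}_k := i\varphi(\frac{1}{2i}\sigma_k)$ and $\mathcal{K}_k := i\varphi(\frac{1}{2}\sigma_k)$, $k = 1,2,3$. These linear maps satisfy the commutation relations

\[ [\mathcal{J}_i,\mathcal{J}_j] = i\varepsilon_{ijk}\mathcal{J}_k, \]

\[ [\mathcal{J}_i,\mathcal{K}_j] = i\varepsilon_{ijk}\mathcal{K}_k, \]

\[ [\mathcal{K}_i,\mathcal{K}_j] = -i\varepsilon_{ijk}\mathcal{J}_k. \]

We now define another six linear maps

\[ \mathcal{A}_k = \frac{1}{2}(\mathcal{J}_k+i\mathcal{K}_k), \]

\[ \mathcal{B}_k = \frac{1}{2}(\mathcal{J}_k-i\mathcal{K}_k), \]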

24Here τ denotes the particle species, as usual.


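which satisfy the commutation relations

\[ [\mathcal{A}_i,\mathcal{A}_j] = i\varepsilon_{ijk}\mathcal{A}_k, \]

\[ [\mathcal{B}_i,\mathcal{B}_j] = i\varepsilon_{ijk}\mathcal{B}_k, \]

\[ [\mathcal{A}_i,\mathcal{B}_j] = 0. \]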
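These relations follow from those for the generators of rotations and boosts by direct expansion. The sketch below writes the two sets of generators as $\mathcal{J}_k$ and $\mathcal{K}_k$ and the new maps as $\mathcal{A}_k = \frac{1}{2}(\mathcal{J}_k+i\mathcal{K}_k)$, $\mathcal{B}_k = \frac{1}{2}(\mathcal{J}_k-i\mathcal{K}_k)$, and uses $[\mathcal{K}_i,\mathcal{J}_j] = -[\mathcal{J}_j,\mathcal{K}_i] = i\varepsilon_{ijk}\mathcal{K}_k$.

```latex
% Verification of [A_i, A_j] = i eps_{ijk} A_k and [A_i, B_j] = 0.
\begin{align*}
[\mathcal{A}_i,\mathcal{A}_j]
 &= \tfrac14\bigl([\mathcal{J}_i,\mathcal{J}_j] + i[\mathcal{J}_i,\mathcal{K}_j]
    + i[\mathcal{K}_i,\mathcal{J}_j] - [\mathcal{K}_i,\mathcal{K}_j]\bigr)\\
 &= \tfrac14\bigl(i\varepsilon_{ijk}\mathcal{J}_k - \varepsilon_{ijk}\mathcal{K}_k
    - \varepsilon_{ijk}\mathcal{K}_k + i\varepsilon_{ijk}\mathcal{J}_k\bigr)
  = i\varepsilon_{ijk}\,\tfrac12\bigl(\mathcal{J}_k + i\mathcal{K}_k\bigr)
  = i\varepsilon_{ijk}\mathcal{A}_k,\\[4pt]
[\mathcal{A}_i,\mathcal{B}_j]
 &= \tfrac14\bigl([\mathcal{J}_i,\mathcal{J}_j] - i[\mathcal{J}_i,\mathcal{K}_j]
    + i[\mathcal{K}_i,\mathcal{J}_j] + [\mathcal{K}_i,\mathcal{K}_j]\bigr)\\
 &= \tfrac14\bigl(i\varepsilon_{ijk}\mathcal{J}_k + \varepsilon_{ijk}\mathcal{K}_k
    - \varepsilon_{ijk}\mathcal{K}_k - i\varepsilon_{ijk}\mathcal{J}_k\bigr) = 0 .
\end{align*}
```

The relation for $[\mathcal{B}_i,\mathcal{B}_j]$ follows from the first computation by replacing $i\mathcal{K}$ with $-i\mathcal{K}$ throughout.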

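From these relations it follows that each irreducible $V$ can be written as a tensor product

\[ V = V_{\mathcal{A}}\otimes V_{\mathcal{B}} \]

and that the action of $\mathcal{A}_k$ and $\mathcal{B}_k$ on $V$ is given by

\[ \mathcal{A}_k.(v_{\mathcal{A}}\otimes v_{\mathcal{B}}) = (S^{(A)}_k v_{\mathcal{A}})\otimes v_{\mathcal{B}} \]

\[ \mathcal{B}_k.(v_{\mathcal{A}}\otimes v_{\mathcal{B}}) = v_{\mathcal{A}}\otimes(S^{(B)}_k v_{\mathcal{B}}), \]

where $v_{\mathcal{A}}\in V_{\mathcal{A}}$, $v_{\mathcal{B}}\in V_{\mathcal{B}}$, $A,B\in\frac{1}{2}\mathbb{Z}_{\geq 0}$ and $S^{(A)}_k$, $S^{(B)}_k$ are the spin operators corresponding to spin $A$ and $B$, respectively; see also equation (2.28) for the definition of spin operators. Note that, in particular, the dimensions of the spaces $V_{\mathcal{A}}$ and $V_{\mathcal{B}}$ are $2A+1$ and $2B+1$, respectively, and the dimension of $V$ is $(2A+1)(2B+1)$. We will label these representations as $\varphi^{(A,B)}$ and we will write $V^{(A,B)}$, $V^{(A)}_{\mathcal{A}}$ and $V^{(B)}_{\mathcal{B}}$ instead of $V$, $V_{\mathcal{A}}$ and $V_{\mathcal{B}}$. These representations of $\mathfrak{sl}(2,\mathbb{C})$ give rise to irreducible representations $\mathscr{D}^{(A,B)}$ of $SL(2,\mathbb{C})$, which we will call the $(A,B)$-representation of $SL(2,\mathbb{C})$. This representation is given by

\[ \mathscr{D}^{(A,B)}(M)(v_{\mathcal{A}}\otimes v_{\mathcal{B}}) = \bigl(D^{(A)}(M)v_{\mathcal{A}}\bigr)\otimes\bigl(D^{(B)}(M)v_{\mathcal{B}}\bigr), \]

where the $D^{(\cdot)}$ denote the $SU(2)$ representations discussed in subsection 2.2.4. Note that the $(A,B)$-representations are not unitary, due to the fact that $SL(2,\mathbb{C})$ is non-compact. However, the compact subgroup $SU(2)\subset SL(2,\mathbb{C})$ is represented unitarily, with generators

\[ \mathcal{J}_k = \mathcal{A}_k+\mathcal{B}_k. \]

It follows that under an $SU(2)$ transformation a vector $v\in V^{(A,B)} = V^{(A)}_{\mathcal{A}}\otimes V^{(B)}_{\mathcal{B}}$ transforms as the direct sum of spin $j$ objects, with $j = A+B, A+B-1,\ldots,|A-B|$. Finally, we mention that in case any (not necessarily irreducible) representation can be extended to a representation that includes space inversion, there is an operator $\beta$ that satisfies $\beta\mathcal{J}_k\beta^{-1} = \mathcal{J}_k$ and $\beta\mathcal{K}_k\beta^{-1} = -\mathcal{K}_k$, and hence $\beta\mathcal{A}_k\beta^{-1} = \mathcal{B}_k$ and $\beta\mathcal{B}_k\beta^{-1} = \mathcal{A}_k$. Clearly, such an operator $\beta$ only makes sense in the $(A,A)$-representation and the $(A,B)\oplus(B,A)$-representation.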
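The spin content $j = A+B, A+B-1,\ldots,|A-B|$ of the $(A,B)$-representation under $SU(2)$ can be checked against the dimension $(2A+1)(2B+1)$. The following small worked examples are a consistency check only; the cases listed are chosen here for illustration.

```latex
% Dimension count for the SU(2) content of the (A,B)-representation.
\begin{align*}
\sum_{j=|A-B|}^{A+B}(2j+1) &= (2A+1)(2B+1),\\[4pt]
(\tfrac12,0):\quad & j = \tfrac12, & 2 &= 2\cdot 1,\\
(\tfrac12,\tfrac12):\quad & j = 1,\,0, & 3+1 &= 2\cdot 2,\\
(1,\tfrac12):\quad & j = \tfrac32,\,\tfrac12, & 4+2 &= 3\cdot 2 .
\end{align*}
```

In the second case, for example, the four components split under rotations into a spin-1 triplet and a spin-0 singlet.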

Construction of general free fields

Now that we have found the representations of $SL(2,\mathbb{C})$, we can write the annihilation and creation fields corresponding to the $(A,B)$-representation as$^{25}$

\[ (\phi^\tau_{A,B})^+_{ab}(x) = (2\pi)^{-3/2}\sum_{\sigma\in I_\tau}\int d^3p\,(u_{A,B})_{ab}(\mathbf{p},\sigma,\tau)\,e^{-ip\cdot x}\,a_\tau(\mathbf{p},\sigma), \tag{3.16} \]

\[ (\phi^\tau_{A,B})^-_{ab}(x) = (2\pi)^{-3/2}\sum_{\sigma\in I_\tau}\int d^3p\,(v_{A,B})_{ab}(\mathbf{p},\sigma,\tau)\,e^{ip\cdot x}\,a^*_\tau(\mathbf{p},\sigma), \tag{3.17} \]

where $a = -A,-A+1,\ldots,A$ and $b = -B,-B+1,\ldots,B$. We will sometimes suppress the capital letters $A$ and $B$ in the coefficients $u$ and $v$ to keep the equations from becoming too wide. The annihilation and creation fields transform according to

\[ U_{\mathrm{Fock}}(b,M)\,(\phi^\tau_{A,B})^\pm_{ab}(x)\,U_{\mathrm{Fock}}(b,M)^{-1} = \sum_{a',b'}\bigl[\mathscr{D}^{(A,B)}(M^{-1})\bigr]_{ab,a'b'}\,(\phi^\tau_{A,B})^\pm_{a'b'}(\Phi(M)x+b). \tag{3.18} \]

25 At this point we do not make any statements about the existence of such fields, i.e. about the question whether (for any particle species $\tau$) components $(u_{A,B})_{ab}(\mathbf{p},\sigma,\tau)$ and $(v_{A,B})_{ab}(\mathbf{p},\sigma,\tau)$ can be found that transform properly. Later we will answer this question for massive particles and massless particles separately.


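The (anti)commutation relations (3.10), (3.11) and (3.12) imply that the fields satisfy the (anti)commutation relations

\[ [(\phi^\tau_{A,B})^+_{ab}(x),(\phi^{\tau'}_{A',B'})^+_{a'b'}(y)]_\pm = 0, \]

\[ [(\phi^\tau_{A,B})^-_{ab}(x),(\phi^{\tau'}_{A',B'})^-_{a'b'}(y)]_\pm = 0, \]

\[ [(\phi^\tau_{A,B})^+_{ab}(x),(\phi^{\tau'}_{A',B'})^-_{a'b'}(y)]_\pm = \frac{\delta_{\tau\tau'}}{(2\pi)^3}\sum_{\sigma\in I_\tau}\int d^3p\,(u_{A,B})_{ab}(\mathbf{p},\sigma,\tau)\,(v_{A',B'})_{a'b'}(\mathbf{p},\sigma,\tau')\,e^{-ip\cdot(x-y)}, \]

where the plus sign corresponds to the case where $\tau$ and $\tau'$ are both fermions and the minus sign to the case where at least one of the particles $\tau$ and $\tau'$ is a boson. In order to obtain quantities that either commute or anticommute at space-like distances (for reasons that will become clear when we will come back to equation (3.5)), we will construct linear combinations

\[
\begin{aligned}
(\phi^\tau_{A,B})_{ab}(x) &:= \kappa\,(\phi^\tau_{A,B})^+_{ab}(x)+\lambda\,(\phi^\tau_{A,B})^-_{ab}(x)\\
 &= (2\pi)^{-3/2}\sum_{\sigma\in I_\tau}\int d^3p\,\bigl[\kappa\,e^{-ip\cdot x}u_{ab}(\mathbf{p},\sigma)\,a_\tau(\mathbf{p},\sigma)+\lambda\,e^{ip\cdot x}v_{ab}(\mathbf{p},\sigma)\,a^*_\tau(\mathbf{p},\sigma)\bigr],
\end{aligned}
\tag{3.19}
\]

where $\kappa$ and $\lambda$ (depending on $\tau$ and on the pair $(A,B)$, but not on the components $a$ and $b$) are chosen so that if $x-y$ is spacelike, we have

\[ [(\phi^\tau_{A,B})_{ab}(x),(\phi^\tau_{A',B'})_{a'b'}(y)]_\pm = 0 \tag{3.20} \]

\[ [(\phi^\tau_{A,B})_{ab}(x),(\phi^\tau_{A',B'})^*_{a'b'}(y)]_\pm = 0 \tag{3.21} \]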

for all pairs (A,B) and (A′, B′) for which these fields exist (again, we will come back to the existencelater). Note that instead of using the annihilation and creation fields (φτA,B)±(x)A,B for eachparticle τ occuring in the scattering experiment, we will from now on use the fields (φτA,B)(x)A,Band their adjoints (φτA,B)∗(x)A,B. The correct values of κ and λ wil be given later, but first wehave to consider another problem. It might happen that particles that are created and annihilatedby these fields carry conserved quantum numbers, such as electric charge, in which case we mustbe sure that H (x) commutes with the operator that corresponds to the conserved quantity. IfQ is an operator corresponding to some conserved quantum number (for example electric charge)and if q(τ) is the value of the conserved quantum number for particles of type τ , then we have thecommutation relations

[Q, aτ (p, σ)] = −q(τ)aτ (p, σ)

[Q, a∗τ (p, σ)] = q(τ)a∗τ (p, σ),

which show that the commutation relations between $Q$ and the field (3.19) are inconvenient: the annihilation part and the creation part of (3.19) pick up opposite signs, so there is no easy way to construct an interaction $\mathcal{H}(x)$ out of fields of the form (3.19) that also commutes with $Q$. To solve this problem, we postulate that for each particle type $\tau$ there exists another particle type $\tau^{C}$, called the antiparticle of $\tau$, which has the same mass but carries opposite values for all conserved quantum numbers. It can also happen that $\tau^{C} = \tau$, in which case the antiparticle and the particle are identical and there is no problem in using the field (3.19). In case $\tau^{C} \neq \tau$, we define the annihilation and creation fields $(\phi^{\tau^{C}}_{A,B})^{+}_{ab}(x)$ and $(\phi^{\tau^{C}}_{A,B})^{-}_{ab}(x)$ to be the same as those in (3.16) and (3.17) but with every $\tau$ replaced with $\tau^{C}$. In this case we replace equation (3.19) with

\[
\begin{aligned}
(\phi^{\tau}_{A,B})_{ab}(x) &:= \kappa\,(\phi^{\tau}_{A,B})^{+}_{ab}(x) + \lambda\,(\phi^{\tau^{C}}_{A,B})^{-}_{ab}(x) \\
&= (2\pi)^{-3/2}\sum_{\sigma\in I_{\tau}}\int d^{3}p\,\Big[\kappa\,e^{-ip\cdot x}u_{ab}(\mathbf{p},\sigma)a_{\tau}(\mathbf{p},\sigma) + \lambda\,e^{ip\cdot x}v_{ab}(\mathbf{p},\sigma)a^{*}_{\tau^{C}}(\mathbf{p},\sigma)\Big],
\end{aligned}
\tag{3.22}
\]

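with $\kappa$ and $\lambda$ still such that (3.20) and (3.21) are satisfied. If $Q$ and $q(\tau)$ are as above, then we now have the simple commutation relations

\[
[Q, (\phi^{\tau}_{A,B})_{ab}(x)] = -q(\tau)\,(\phi^{\tau}_{A,B})_{ab}(x), \tag{3.23}
\]

\[
[Q, (\phi^{\tau}_{A,B})^{*}_{ab}(x)] = q(\tau)\,(\phi^{\tau}_{A,B})^{*}_{ab}(x). \tag{3.24}
\]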
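The step from (3.22) to (3.23) can be spelled out as a short calculation. It uses only the relations $[Q,a_{\tau}] = -q(\tau)a_{\tau}$ and $[Q,a^{*}_{\tau^{C}}] = q(\tau^{C})a^{*}_{\tau^{C}}$ together with $q(\tau^{C}) = -q(\tau)$, all stated above; the indices $A,B,a,b$ are suppressed:

\[
\begin{aligned}
[Q,\phi^{\tau}(x)] &= \kappa\,[Q,(\phi^{\tau})^{+}(x)] + \lambda\,[Q,(\phi^{\tau^{C}})^{-}(x)] \\
&= -q(\tau)\,\kappa\,(\phi^{\tau})^{+}(x) + q(\tau^{C})\,\lambda\,(\phi^{\tau^{C}})^{-}(x) \\
&= -q(\tau)\,\big[\kappa\,(\phi^{\tau})^{+}(x) + \lambda\,(\phi^{\tau^{C}})^{-}(x)\big] = -q(\tau)\,\phi^{\tau}(x).
\end{aligned}
\]

Relation (3.24) then follows by taking adjoints, under the additional assumption (usual for a conserved charge, but not stated explicitly here) that $Q$ is self-adjoint and $q(\tau)$ is real.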


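Note that these fields thus have the nice property that

\[
\begin{aligned}
[Q, (\phi^{\tau}_{A,B})(x)(\phi^{\tau}_{A,B})^{*}(x)] &= [Q, (\phi^{\tau}_{A,B})(x)]\,(\phi^{\tau}_{A,B})^{*}(x) + (\phi^{\tau}_{A,B})(x)\,[Q, (\phi^{\tau}_{A,B})^{*}(x)] \\
&= -q(\tau)\,(\phi^{\tau}_{A,B})(x)(\phi^{\tau}_{A,B})^{*}(x) + q(\tau)\,(\phi^{\tau}_{A,B})(x)(\phi^{\tau}_{A,B})^{*}(x) \\
&= 0,
\end{aligned}
\]

where we have suppressed the indices $a$ and $b$. Later we will use a generalization of this property to ensure that the interaction commutes with $Q$.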

To summarize our results so far, for each particle type $\tau$ occurring in the scattering experiment we construct (for certain pairs $(A,B)$ depending on the particle species $\tau$) fields as in (3.19) if $\tau^{C} = \tau$ and fields as in (3.22) if $\tau^{C} \neq \tau$. However, we will not construct fields $(\phi^{\tau^{C}}_{A,B})(x)$, so for each particle-antiparticle pair we must agree which of the two is the particle and which is the antiparticle. Note that the partial derivatives of the fields (3.19) and (3.22) are given by

\[
\partial_{\mu}(\phi^{\tau}_{A,B})_{ab}(x) = -ip_{\mu}\,\kappa\,(\phi^{\tau}_{A,B})^{+}_{ab}(x) + ip_{\mu}\,\lambda\,(\phi^{\tau/\tau^{C}}_{A,B})^{-}_{ab}(x),
\]

where the factors $p_{\mu}$ are shorthand for factors that stand inside the momentum integrals of (3.19) and (3.22). We can regard this derivative as a different field in its own right, since it is also of the form (3.19) or (3.22). By taking partial derivatives again, it follows immediately that each component $(\phi^{\tau}_{A,B})_{ab}(x)$ of the fields satisfies the Klein-Gordon equation:

\[
(\Box + m_{\tau}^{2})\,(\phi^{\tau}_{A,B})_{ab}(x) = 0.
\]
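The Klein-Gordon equation can be checked on a single plane wave. This is a minimal sketch, assuming the metric signature $(+,-,-,-)$ and on-shell momenta $p = (\omega_{\mathbf{p}},\mathbf{p})$ with $\omega_{\mathbf{p}} = \sqrt{\mathbf{p}^{2} + m_{\tau}^{2}}$; both conventions are consistent with the sign in the Klein-Gordon equation above, but they are restated here as assumptions. Each plane wave occurring in (3.19) or (3.22) satisfies

\[
\Box\, e^{\mp ip\cdot x} = \partial_{\mu}\partial^{\mu} e^{\mp ip\cdot x} = (\mp i)^{2}\,p_{\mu}p^{\mu}\,e^{\mp ip\cdot x} = -m_{\tau}^{2}\,e^{\mp ip\cdot x},
\]

so $(\Box + m_{\tau}^{2})$ annihilates the integrand for every $\mathbf{p}$ and $\sigma$, and therefore annihilates the field itself.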

For notational convenience we will combine all the irreducible fields $\{(\phi^{\tau}_{A,B})(x)\}_{\tau,(A,B)}$ that occur in the description of a scattering experiment into a single field $\phi$ with components $\phi_{j}$. The field $\phi$ obtained in this way of course no longer transforms irreducibly. Along with the field $\phi$, we will also consider its adjoint $\phi^{*}$. We can then construct very general Hamiltonian densities

\[
\mathcal{H}(x) = \sum_{N,M}\ \sum_{j_{1},\ldots,j_{N}}\ \sum_{k_{1},\ldots,k_{M}} g_{j_{1}\ldots j_{N},k_{1}\ldots k_{M}}\ :\phi_{j_{1}}(x)\cdots\phi_{j_{N}}(x)\,\phi^{*}_{k_{1}}(x)\cdots\phi^{*}_{k_{M}}(x): \tag{3.25}
\]

where the colons $:\ :$ indicate normal ordering, i.e. the expression obtained by moving all creation operators to the left of all annihilation operators while including a minus sign whenever two fermion operators are interchanged. When we choose the coefficients $g_{j_{1}\ldots j_{N},k_{1}\ldots k_{M}}$ to transform properly, $\mathcal{H}(x)$ will automatically satisfy the scalar condition (3.3). If each term in this interaction contains an even number of fermion fields, then this interaction will also commute with itself at spacelike separations. In equations (3.23) and (3.24) we have seen that for an operator $Q$ that corresponds to some conserved quantum number (such as electric charge) our fields satisfy commutation relations of the form $[Q,\phi_{j}] = -q(j)\phi_{j}$ and $[Q,\phi^{*}_{j}] = q(j)\phi^{*}_{j}$, so if each term in $\mathcal{H}(x)$ consists of a product $\phi_{j_{1}}(x)\cdots\phi_{j_{N}}(x)\,\phi^{*}_{k_{1}}(x)\cdots\phi^{*}_{k_{M}}(x)$ of fields and adjoint fields for which $q(j_{1}) + \ldots + q(j_{N}) - q(k_{1}) - \ldots - q(k_{M}) = 0$, then $\mathcal{H}(x)$ will satisfy $[Q,\mathcal{H}(x)] = 0$.
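The charge-neutrality condition on each term of the interaction is simple bookkeeping, which the following sketch makes explicit. The function names and the numerical charges are hypothetical and chosen only for illustration; the rule itself is the one stated above, obtained by applying $[Q,AB] = [Q,A]B + A[Q,B]$ repeatedly.

```python
def charge_coefficient(field_charges, adjoint_charges):
    """Return c such that [Q, term] = c * term for the monomial
    phi_{j1} ... phi_{jN} phi*_{k1} ... phi*_{kM}.

    Each field contributes -q(j) and each adjoint field contributes +q(k),
    following [Q, phi_j] = -q(j) phi_j and [Q, phi*_k] = q(k) phi*_k.
    """
    return -sum(field_charges) + sum(adjoint_charges)


def commutes_with_charge(field_charges, adjoint_charges):
    """True when the monomial commutes with Q, i.e. when c == 0."""
    return charge_coefficient(field_charges, adjoint_charges) == 0


# Illustrative charges (hypothetical values, in units of some elementary charge).
q_psi = -1  # a charged field
q_A = 0     # a neutral field

# Term of the form psi* psi A: one charged field, its adjoint, one neutral field.
print(commutes_with_charge([q_psi, q_A], [q_psi]))
# Term of the form psi psi A: two charged fields and no adjoint.
print(commutes_with_charge([q_psi, q_psi, q_A], []))
```

An interaction passes the test only if every monomial in (3.25) with a nonzero coefficient does; a single charged term is enough to spoil $[Q,\mathcal{H}(x)] = 0$.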

Before closing this section, we will give the explicit construction of the fields described above. In order to find the coefficients $u_{ab}$ and $v_{ab}$, as well as the constants $\kappa$ and $\lambda$, we have to consider the fields of massive particles and the fields of massless particles separately.

Massive particle fields

If $\tau$ is a massive particle, then the index $\sigma$ takes values in $\{-s_{\tau},\ldots,s_{\tau}\}$ and the coefficients are given by

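\[
u_{ab}(\mathbf{p},\sigma) = \frac{1}{\sqrt{2\omega_{\mathbf{p}}}}\sum_{a',b'}\Big[e^{\boldsymbol{\theta}\cdot\mathbf{S}^{(A)}}\Big]_{aa'}\Big[e^{-\boldsymbol{\theta}\cdot\mathbf{S}^{(B)}}\Big]_{bb'}\,C_{AB}(s_{\tau},\sigma;a',b') \tag{3.26}
\]

\[
v_{ab}(\mathbf{p},\sigma) = (-1)^{s_{\tau}+\sigma}\,u_{ab}(\mathbf{p},-\sigma),
\]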

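where $\boldsymbol{\theta}$ is the boost parameter corresponding to the boost that maps $(m,0,0,0)$ to $(\omega_{\mathbf{p}},\mathbf{p})$ and $C_{AB}(j,\sigma;a,b)$ are Clebsch-Gordan coefficients defined by

\[
v^{(s_{1},s_{2})}_{s,m} = \sum_{m_{1},m_{2}} C_{s_{1}s_{2}}(s,m;m_{1},m_{2})\,v_{s_{1},m_{1}}\otimes v_{s_{2},m_{2}},
\]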


where on the right-hand side the vectors $v_{s_{i},m_{i}} \in V^{(s_{i})}$ denote the joint eigenvectors of $[\mathbf{S}^{(s_{i})}]^{2}$ and $S^{(s_{i})}_{3}$ (in physics these vectors are written as $|s_{i}m_{i}\rangle$), and on the left-hand side the vectors $v^{(s_{1},s_{2})}_{s,m} \in V^{(s_{1})}\otimes V^{(s_{2})}$ denote the joint eigenvectors of $[\mathbf{S}^{(s)}]^{2}$ and $S^{(s)}_{3}$ with $S^{(s)}_{k} := S^{(s_{1})}_{k}\otimes 1 + 1\otimes S^{(s_{2})}_{k}$.

Before we can go on, we first make some remarks concerning the relationship between particles and the fields that describe them. It is not true that any $(A,B)$-field can describe some given massive particle species $\tau$. By considering the transformation properties of the $(A,B)$-field under rotations, it can be shown that it can only describe particles with spin $A+B, A+B-1, \ldots, |A-B|$. Thus, for a given particle species $\tau$ we can only construct $(A,B)$-fields for which $|A-B| \leq s_{\tau} \leq A+B$ and $A+B-s_{\tau} \in \mathbb{Z}_{\geq 0}$. These different fields for $\tau$ are not physically distinct, however. For example, when we take the derivative $\partial_{\mu}\phi_{0,0}(x)$ of the $(0,0)$-field (or scalar field) $\phi_{0,0}(x)$, we obtain a field that transforms as a $(\tfrac{1}{2},\tfrac{1}{2})$-field (or vector field). In general, any $(A,B)$-field for a given particle type $\tau$ can be expressed either as a rank $2B$ differential operator acting on a $(s_{\tau},0)$-field or as a rank $2A$ differential operator acting on a $(0,s_{\tau})$-field. For reasons that will become clear later, we want all $(A,B)$-fields that describe a single type of particle $\tau$ to commute with each other when $\tau$ is a boson and to anticommute with each other when $\tau$ is a fermion. To achieve this, we must choose the constants $\kappa$ and $\lambda$ in (3.22) such that²⁶ $\lambda = (-1)^{2B}\kappa$. By adjusting the overall scale of the field, we can then write it as

\[
(\phi^{\tau}_{A,B})_{ab}(x) = (2\pi)^{-3/2}\sum_{\sigma=-s_{\tau}}^{s_{\tau}}\int d^{3}p\ u_{ab}(\mathbf{p},\sigma,\tau)\Big[e^{-ip\cdot x}a_{\tau}(\mathbf{p},\sigma) + (-1)^{2B+s_{\tau}+\sigma}e^{ip\cdot x}a^{*}_{\tau^{C}}(\mathbf{p},-\sigma)\Big].
\]
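The selection rule for massive fields stated above, $|A-B| \leq s_{\tau} \leq A+B$ with $A+B-s_{\tau}$ a non-negative integer, is easy to tabulate. The sketch below lists the admissible pairs $(A,B)$ up to a chosen cutoff; the function name and the cutoff parameter `max_label` are hypothetical conveniences, not notation from the text.

```python
from fractions import Fraction


def allowed_field_types(spin, max_label):
    """List the pairs (A, B), with A and B in {0, 1/2, 1, ..., max_label},
    that can describe a massive particle of the given spin.

    The conditions are |A - B| <= spin <= A + B and A + B - spin integer.
    """
    s = Fraction(spin)
    half = Fraction(1, 2)
    labels = [k * half for k in range(int(2 * Fraction(max_label)) + 1)]
    result = []
    for A in labels:
        for B in labels:
            excess = A + B - s  # must be a non-negative integer
            if abs(A - B) <= s <= A + B and excess.denominator == 1:
                result.append((A, B))
    return result


# Spin 0 with labels up to 1, and spin 1/2 with labels up to 1/2.
print(allowed_field_types(0, 1))
print(allowed_field_types(Fraction(1, 2), Fraction(1, 2)))
```

For spin $\tfrac{1}{2}$ and labels up to $\tfrac{1}{2}$ the list contains only $(0,\tfrac{1}{2})$ and $(\tfrac{1}{2},0)$, the two building blocks of the $(\tfrac{1}{2},0)\oplus(0,\tfrac{1}{2})$-field mentioned later in this section.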

As shown in section 5.7 of [35] (in particular, equation (5.7.53)), the adjoints of the components of an $(A,B)$-field for a particle with $\tau^{C} = \tau$ are proportional to the components of the $(B,A)$-field for the particle $\tau$, so if $\tau^{C} = \tau$ then the adjoint fields do not give rise to new kinds of objects.

Finally, we will mention the transformation properties of these fields under space inversions. In the $(A,A)$-representation the field transforms according to

\[
P\,\phi^{\tau}_{A,A}(x)\,P^{-1} = (-1)^{2A-s_{\tau}}\,\xi_{\tau}\,\phi^{\tau}_{A,A}(I_{s}x),
\]

with $\xi_{\tau}$ the intrinsic parity defined in the previous chapter, while in the $(A,B)\oplus(B,A)$-representation it transforms according to

\[
P\begin{pmatrix}\phi^{\tau}_{A,B}(x)\\ \phi^{\tau}_{B,A}(x)\end{pmatrix}P^{-1} = (-1)^{A+B-s_{\tau}}\,\xi_{\tau}\begin{pmatrix}\phi^{\tau}_{B,A}(I_{s}x)\\ \phi^{\tau}_{A,B}(I_{s}x)\end{pmatrix}. \tag{3.27}
\]

So under space inversion, the $(A,B)\oplus(B,A)$-representation becomes the $(B,A)\oplus(A,B)$-representation. In appendix B we give some examples of free massive fields that can be obtained by explicit calculation of the coefficients $u_{ab}(\mathbf{p},\sigma,\tau)$ for some representation $(A,B)$. As is shown in that appendix, the $(\tfrac{1}{2},0)\oplus(0,\tfrac{1}{2})$-field automatically satisfies the Dirac equation.

Massless particle fields

As we have already mentioned when we discussed parity, it is sometimes necessary to identify two massless particles that only differ from each other in the fact that they have opposite helicities. We will first consider the case where the massless particle $\tau$ can have only one helicity $\sigma_{\tau}$. In this case the field has the form

\[
(\phi^{\tau}_{A,B})_{ab}(x) = (2\pi)^{-3/2}\int d^{3}p\,\Big[\kappa\,e^{-ip\cdot x}u_{ab}(\mathbf{p},\sigma_{\tau})a_{\tau}(\mathbf{p},\sigma_{\tau}) + \lambda\,e^{ip\cdot x}v_{ab}(\mathbf{p},\sigma_{\tau^{C}})a^{*}_{\tau^{C}}(\mathbf{p},\sigma_{\tau^{C}})\Big].
\]

Just as the $(A,B)$-field for massive particles could only describe particles with spin $s_{\tau}$ for which $|A-B| \leq s_{\tau} \leq A+B$, the $(A,B)$-field for massless particles can only describe particles for which

²⁶ In deriving this result (see section 5.7 of [35]) it becomes clear that fields which describe particles with integer spin commute, while fields that describe particles with half-odd integer spin anticommute. In other words, particles with integer spin are bosons and particles with half-odd integer spin are fermions. This is the content of the famous spin-statistics theorem. We will come back to this theorem in the next chapter.


$\sigma_{\tau} = A-B$ and $\sigma_{\tau^{C}} = -\sigma_{\tau}$. Thus the simplest field that one can construct for a massless particle $\tau$ is the $(\sigma_{\tau},0)$-field if $\sigma_{\tau} \geq 0$ and the $(0,|\sigma_{\tau}|)$-field if $\sigma_{\tau} < 0$. The $(B+\sigma_{\tau},B)$-fields are $2B$-th order derivatives of the $(\sigma_{\tau},0)$-field and the $(A,A+|\sigma_{\tau}|)$-fields are $2A$-th order derivatives of the $(0,|\sigma_{\tau}|)$-field. As stated before, the only irreducible representations $(A,B)$ that can be extended to a representation that also contains space inversion are the representations for which $A = B$. In the present case of massless particles this representation can only be chosen if $\sigma_{\tau} = 0$ and in that case we have $A = B = 0$, i.e. a scalar field. For $\sigma_{\tau} \neq 0$ we again have to consider fields of type $(A,B)\oplus(B,A)$. However, in order to obtain a transformation law for massless particles analogous to (3.27), we must identify the particle type $\tau$ with the particle type which is obtained by substituting $\sigma_{\tau} \to -\sigma_{\tau}$. Otherwise the two $\tau$'s on the right-hand side of (3.27) cannot be the same as the ones on the left-hand side, because they necessarily have opposite helicity (since $A$ and $B$ are switched).

3.3 Calculation of the S-matrix using perturbation theory

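In this section we will give a brief overview of how physicists use perturbation theory to calculate the S-matrix elements for some scattering process. If $\psi_{\alpha},\psi_{\beta} \in \mathcal{H}_{\mathrm{Fock}}$ are two Fock space vectors given by $\psi_{\alpha} = \prod_{j=1}^{N}a^{*}(q^{\mathrm{in}}_{j})\Omega_{\mathrm{Fock}}$ and $\psi_{\beta} = \prod_{j=1}^{M}a^{*}(q^{\mathrm{out}}_{j})\Omega_{\mathrm{Fock}}$ with the $q_{j}$ specifying particle species, momentum and spin, then it follows from equation (3.4) that the S-matrix element $\langle S\psi_{\alpha},\psi_{\beta}\rangle$ is equal to

\[
\sum_{n=0}^{\infty}\frac{1}{i^{n}n!}\int_{\mathbb{R}^{4n}}\Big\langle \prod_{j=1}^{M}a(q^{\mathrm{out}}_{j})\ T\{\mathcal{H}(x_{1})\cdots\mathcal{H}(x_{n})\}\ \prod_{j=1}^{N}a^{*}(q^{\mathrm{in}}_{j})\,\Omega_{\mathrm{Fock}},\,\Omega_{\mathrm{Fock}}\Big\rangle\,dx_{1}\ldots dx_{n}.
\]

We can write the interaction density (3.25) as

\[
\mathcal{H}(x) = \sum_{i} g_{i}\,\mathcal{H}_{i}(x),
\]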

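where $i$ denotes a multi-index and each $\mathcal{H}_{i}(x)$ is a normal-ordered product of fields and field adjoints. The task is therefore to calculate expressions of the form

\[
\Big\langle \prod_{j=1}^{M}a(q^{\mathrm{out}}_{j})\ T\{\mathcal{H}_{i_{1}}(x_{1})\cdots\mathcal{H}_{i_{n}}(x_{n})\}\ \prod_{j=1}^{N}a^{*}(q^{\mathrm{in}}_{j})\,\Omega_{\mathrm{Fock}},\,\Omega_{\mathrm{Fock}}\Big\rangle
\]

or, in terms of the field $\phi$,

\[
\Big\langle \prod_{j=1}^{M}a(q^{\mathrm{out}}_{j})\ T\Big\{:\phi^{(*)}_{i_{1,1}}(x_{1})\cdots\phi^{(*)}_{i_{1,k(i_{1})}}(x_{1}):\ \cdots\ :\phi^{(*)}_{i_{n,1}}(x_{n})\cdots\phi^{(*)}_{i_{n,k(i_{n})}}(x_{n}):\Big\}\ \prod_{j=1}^{N}a^{*}(q^{\mathrm{in}}_{j})\,\Omega_{\mathrm{Fock}},\,\Omega_{\mathrm{Fock}}\Big\rangle,
\]

where the double indices of the fields are defined by

\[
\mathcal{H}_{i_{m}}(x) = \ :\phi^{(*)}_{i_{m,1}}(x)\cdots\phi^{(*)}_{i_{m,k(i_{m})}}(x):\ .
\]

For notational convenience, we will write such expressions simply as

\[
\langle T\{A_{1}\cdots A_{n}\}\,\Omega_{\mathrm{Fock}},\,\Omega_{\mathrm{Fock}}\rangle,
\]

where the $A_{j}$ are either fields/field adjoints or creation/annihilation operators and, by definition, the time ordering has no effect on the $a(q^{\mathrm{out}}_{j})$ and $a^{*}(q^{\mathrm{in}}_{j})$. For any pair $(A_{j},A_{k})$ we define the contraction by

\[
\overbrace{A_{j}A_{k}} = T\{A_{j}A_{k}\} - :A_{j}A_{k}:
\]

(time-ordering minus normal-ordering), where $T\{A_{j}A_{k}\} = A_{j}A_{k}$ whenever at least one of the two operators does not depend on time. Contractions always turn out to be scalar multiples of the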


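identity operator, and we will identify the contraction with this scalar (thus forgetting about the identity operator). Then according to Wick's theorem we have for $n$ even

\[
\langle T\{A_{1}\cdots A_{n}\}\,\Omega_{\mathrm{Fock}},\,\Omega_{\mathrm{Fock}}\rangle = \sum_{P}\varepsilon(P)\ \overbrace{A_{j_{1}}A_{j_{2}}}\cdots\overbrace{A_{j_{n-1}}A_{j_{n}}}, \tag{3.28}
\]

where the sum runs over all groupings $P$ of the $A_{j}$'s into $n/2$ pairs with the ordering in each pair the same as on the left-hand side, so $j_{1} < j_{2}$, $j_{3} < j_{4}$, $\ldots$, $j_{n-1} < j_{n}$. The sign $\varepsilon(P)$ depends on the number of fermion interchanges that were needed to bring the two elements in each pair next to each other. For $n$ odd, the expression on the left-hand side is simply zero. Recalling that the $A_{j}$ are actually fields or creation/annihilation operators, we see that only a few contractions are nonzero. Their contributions are²⁷:

\[
\begin{aligned}
\overbrace{a_{\tau^{(C)}}(\mathbf{p}^{\mathrm{out}}_{j},\sigma^{\mathrm{out}}_{j})\,a^{*}_{\tau^{(C)}}(\mathbf{p}^{\mathrm{in}}_{k},\sigma^{\mathrm{in}}_{k})} &= \delta_{\sigma^{\mathrm{out}}_{j}\sigma^{\mathrm{in}}_{k}}\,\delta(\mathbf{p}^{\mathrm{out}}_{j}-\mathbf{p}^{\mathrm{in}}_{k}) \\
\overbrace{a_{\tau}(\mathbf{p}^{\mathrm{out}}_{j},\sigma^{\mathrm{out}}_{j})\,(\phi^{\tau})^{*}_{\ell}(x)} &= (2\pi)^{-3/2}\,e^{ip^{\mathrm{out}}_{j}\cdot x}\,u_{\ell}(\mathbf{p}^{\mathrm{out}}_{j},\sigma^{\mathrm{out}}_{j}) \\
\overbrace{a_{\tau^{C}}(\mathbf{p}^{\mathrm{out}}_{j},\sigma^{\mathrm{out}}_{j})\,(\phi^{\tau})_{\ell}(x)} &= (2\pi)^{-3/2}\,e^{ip^{\mathrm{out}}_{j}\cdot x}\,v_{\ell}(\mathbf{p}^{\mathrm{out}}_{j},\sigma^{\mathrm{out}}_{j}) \\
\overbrace{(\phi^{\tau})_{\ell}(x)\,a^{*}_{\tau}(\mathbf{p}^{\mathrm{in}}_{j},\sigma^{\mathrm{in}}_{j})} &= (2\pi)^{-3/2}\,e^{-ip^{\mathrm{in}}_{j}\cdot x}\,u_{\ell}(\mathbf{p}^{\mathrm{in}}_{j},\sigma^{\mathrm{in}}_{j}) \\
\overbrace{(\phi^{\tau})^{*}_{\ell}(x)\,a^{*}_{\tau^{C}}(\mathbf{p}^{\mathrm{in}}_{j},\sigma^{\mathrm{in}}_{j})} &= (2\pi)^{-3/2}\,e^{-ip^{\mathrm{in}}_{j}\cdot x}\,v_{\ell}(\mathbf{p}^{\mathrm{in}}_{j},\sigma^{\mathrm{in}}_{j}) \\
\overbrace{\phi_{j}(x_{m})\,\phi^{*}_{k}(x_{m+l})} &= \theta(x^{0}_{m}-x^{0}_{m+l})\,[\phi^{+}_{j}(x_{m}),(\phi^{+}_{k})^{*}(x_{m+l})]_{\mp} \\
&\quad \pm\,\theta(x^{0}_{m+l}-x^{0}_{m})\,[(\phi^{-}_{k})^{*}(x_{m+l}),\phi^{-}_{j}(x_{m})]_{\mp} \\
&=: -i\Delta_{jk}(x_{m},x_{m+l}) \\
\overbrace{\phi^{*}_{j}(x_{m})\,\phi_{k}(x_{m+l})} &= \theta(x^{0}_{m}-x^{0}_{m+l})\,[(\phi^{-}_{j})^{*}(x_{m}),\phi^{-}_{k}(x_{m+l})]_{\mp} \\
&\quad \pm\,\theta(x^{0}_{m+l}-x^{0}_{m})\,[\phi^{+}_{k}(x_{m+l}),(\phi^{+}_{j})^{*}(x_{m})]_{\mp} \\
&= \mp i\Delta_{kj}(x_{m+l},x_{m})
\end{aligned}
\]

where in the last two equalities $m,l \geq 1$, $\theta : \mathbb{R} \to \{0,1\}$ denotes the step function, and the lower signs correspond to the case where both components $\phi_{j}$ and $\phi_{k}$ are fermionic. Furthermore, the $\phi^{\pm}_{j}(x)$ refer to the decomposition $\phi_{j}(x) = \phi^{+}_{j}(x) + \phi^{-}_{j}(x)$ of the field into an annihilation field and a creation field. The quantities $\Delta_{ij}(x,y)$ are called propagators.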
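The combinatorial content of Wick's theorem (3.28), namely the sum over groupings $P$ and the sign $\varepsilon(P)$, can be enumerated mechanically. The sketch below is illustrative only: the function names are hypothetical, operators are represented by their positions $0,\ldots,n-1$, and the sign is computed as the parity of the permutation of the fermionic operators, which is one standard way of counting the fermion interchanges mentioned above. It does not evaluate the contractions themselves.

```python
def pairings(indices):
    """Yield every way of splitting the sorted list `indices` into pairs
    (j, k) with j < k. Nothing is yielded when the length is odd."""
    if not indices:
        yield []
        return
    first, rest = indices[0], indices[1:]
    for pos, partner in enumerate(rest):
        remaining = rest[:pos] + rest[pos + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def pairing_sign(pairs, fermionic):
    """Sign of a grouping: parity of the reordering of the fermionic
    operators needed to make the members of each pair adjacent."""
    order = [j for pair in pairs for j in pair if fermionic[j]]
    inversions = sum(
        1
        for a in range(len(order))
        for b in range(a + 1, len(order))
        if order[a] > order[b]
    )
    return -1 if inversions % 2 else 1


# Four fermionic operators: three groupings with alternating signs.
for P in pairings(list(range(4))):
    print(P, pairing_sign(P, [True] * 4))
```

The number of groupings for $n$ operators is $(n-1)!! = (n-1)(n-3)\cdots 1$, so the number of terms grows quickly with the order of perturbation theory; for odd $n$ the generator yields nothing, in line with the vanishing of the left-hand side of (3.28).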

The nonzero terms in (3.28) are products of pairings of the forms above and such products of pairings can be graphically represented by Feynman diagrams. In these diagrams the initial particles are represented by vertices at the bottom of the diagram and the final particles are represented by vertices at the top of the diagram, and these external vertices are labeled by their corresponding particle species, momenta and spin states. Each $\mathcal{H}_{j}(x_{k})$ is represented by a vertex that is drawn between the initial and final vertices, and each such internal vertex is labeled by $j$ and $x_{k}$. Each pairing is then represented by a line connecting the two vertices that represent the two paired objects. Here we mean, in particular, that if a field or adjoint field in $\mathcal{H}_{j}(x_{k})$ is paired with some other object then we draw a line between that object and the vertex labeled by $j, x_{k}$. Of course this implies that if $\mathcal{H}_{j}(x_{k})$ is a product of $K$ fields and adjoint fields, then there are $K$ lines that connect the vertex $j, x_{k}$ with other vertices. Each line that connects the external vertex of a particle carries an upward arrow, while each line that connects the external vertex of an antiparticle carries a downward arrow. Internal lines, representing a pairing of a field with an adjoint field, carry an arrow that points from the vertex of the adjoint field to the vertex of the

²⁷ Here $\tau^{C}$ denotes the antiparticle of $\tau$, as usual. In each line where there are two $\tau$'s, these two $\tau$'s are the same. In the first line $\tau^{(C)}$ is either $\tau$ or $\tau^{C}$, but the same choice must be made for both factors in the first line. Furthermore, the field $\phi$ can be decomposed into irreducible components each of which corresponds to a single particle (not antiparticle) species. If we write $(\phi^{\tau})_{\ell}$ instead of $\phi_{\ell}$ we mean that the $\ell$th component of $\phi$ belongs to an irreducible representation corresponding to particle species $\tau$.


field. All lines in the diagram are labeled by the value of the corresponding contraction. The term in (3.28) that corresponds to the diagram can now be recovered from the diagram as follows. For each internal vertex $j, x_{k}$ we write a factor $-ig_{j}$ and for each line we write a factor that is equal to its label (i.e. to the corresponding contraction). The result is then integrated over all coordinates $x_{k}$, yielding the desired term in (3.28). In practice, one uses the Feynman diagrams to calculate all terms (up to some order) in the expansion of the S-matrix elements. However, we will not discuss the details of this procedure here, nor will we discuss the renormalization procedure that is needed whenever the obtained expressions are infinite.

3.4 Obtaining V from a Lagrangian

In the previous section we mentioned that perturbation theory can be used to compute S-matrix elements when we are given an expression for the interaction $V$ (in the interaction picture) in terms of the free fields. In this section we will show how this expression for $V$ can be obtained from a Lagrangian. In practice, this is very useful because Lagrangians are often easier to guess than Hamiltonians. Furthermore, if there is Poincaré invariance in the Lagrangian formalism, then the S-matrix will automatically be Poincaré invariant, even if the requirements (3.2) and (3.3) are not exactly satisfied; see section 7.4 of [35]. These Lagrangians are given in the context of a classical Lagrangian field theory, so we will first discuss classical Lagrangian field theory. Next we will make the transition to Hamiltonian classical field theory, and then we will quantize the theory to obtain Heisenberg picture quantum fields. After quantization we can then make the transition from the Heisenberg picture to the interaction picture, and finally derive the expression for $V$. Although the main structure of this chapter is based on [35], some parts of the present section are based more on [6].

Classical relativistic field theory

In classical relativistic field theory, the field components $\phi_{1}(x),\ldots,\phi_{n}(x)$ are complex-valued²⁸ functions on spacetime that transform under a transformation $(a,A) \in \mathcal{P}^{\uparrow}_{+}$ as²⁹

\[
\phi_{j}(x) \to \sum_{k=1}^{n}[D(A)]_{jk}\,\phi_{k}(\Phi(A)x + a),
\]

where $D$ is a representation of $SL(2,\mathbb{C})$ and $\Phi : SL(2,\mathbb{C}) \to \mathcal{L}^{\uparrow}_{+}$ is the covering map. Note that the partial derivatives $\partial_{\mu}\phi_{j}(x)$ also transform in such a way, namely as the tensor product of the representation $D$ with the representation $A \mapsto \Phi(A)$ (which transforms the index $\mu$ in $\partial_{\mu}$). For fixed time $x^{0} = t$, the fields $\phi_{j}(t,\mathbf{x})$ and their time derivatives $\dot{\phi}_{j}(t,\mathbf{x})$ are functions of the space coordinates $\mathbf{x}$. We will call the space of all such possible functions the configuration space of the system. This configuration space is thus a function space, the elements of which are ordered $2n$-tuples $(q_{1}(\mathbf{x}),\ldots,q_{n}(\mathbf{x}),\dot{q}_{1}(\mathbf{x}),\ldots,\dot{q}_{n}(\mathbf{x}))$ of functions $q_{j}$ and $\dot{q}_{j}$ with certain smoothness conditions (and perhaps some boundary conditions as well); the notation $\dot{q}_{j}$ has nothing to do with the time-derivative of $q_{j}$ (which does not even depend on time). The Lagrangian $L$ is a functional $L[q_{j},\dot{q}_{j}]$ of these $2n$ functions. We will always assume that there is a smooth function $\mathcal{L}(a_{j},b_{k},c_{l})$ on $\mathbb{C}^{n+3n+n} = \mathbb{C}^{5n}$ such that $L[q_{j},\dot{q}_{j}]$ can be written as

\[
L[q_{j},\dot{q}_{j}] = \int_{\mathbb{R}^{3}}\mathcal{L}(q_{j}(\mathbf{x}),\nabla q_{j}(\mathbf{x}),\dot{q}_{j}(\mathbf{x}))\,d^{3}x.
\]

In particular, when these functions $q_{j}(\mathbf{x})$ and $\dot{q}_{j}(\mathbf{x})$ are taken to be the values of the field components $\phi_{j}(x)$ and their time-derivatives $\partial_{0}\phi_{j}(x)$ at a fixed moment $x^{0} = t$ of time, we can define a

²⁸ Here we consider real-valued functions as a special subclass of complex-valued functions.

²⁹ This transformation property is only approximately true for components that represent gauge fields, where the transformation rule may contain gauge transformations. Such extra terms have no effect on the Poincaré invariance of the theory because the action $S$ (we will define actions below) is always considered to be invariant under such gauge transformations.


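functional $L(t)$ by:

\[
\begin{aligned}
L(t) = L[\phi(t,\cdot),\partial_{0}\phi(t,\cdot)] &= \int_{\mathbb{R}^{3}}\mathcal{L}(\phi(t,\mathbf{x}),\nabla\phi(t,\mathbf{x}),\partial_{0}\phi(t,\mathbf{x}))\,d^{3}x \\
&= \int_{\mathbb{R}^{3}}\mathcal{L}(\phi(t,\mathbf{x}),\partial_{\mu}\phi(t,\mathbf{x}))\,d^{3}x,
\end{aligned}
\]

where $\phi(t,\cdot)$ and $\partial_{0}\phi(t,\cdot)$ denote the functions $\mathbf{x} \mapsto \phi(t,\mathbf{x})$ and $\mathbf{x} \mapsto \partial_{0}\phi(t,\mathbf{x})$. In the last step we simply used a different convention for writing the dependence of $\mathcal{L}$ on its arguments. For given $\phi_{j}(x)$, the action $S$ is now defined as the time integral of $L(t)$ and it is a functional of the fields $\phi_{j}(x)$, but not of their time-derivatives, since these can be calculated once we know the fields (at all times),

\[
S = S[\phi(\cdot)] = \int_{\mathbb{M}}\mathcal{L}(\phi(x),\partial_{\mu}\phi(x))\,d^{4}x.
\]

The action is manifestly invariant under spacetime translations $\phi_{j}(x) \to \phi_{j}(x+a)$ and it is assumed that it is also a real-valued Lorentz scalar. The equations of motion (or field equations) are obtained by demanding that the action is stationary under variations; these equations are

\[
\frac{\partial\mathcal{L}}{\partial\phi_{j}} - \partial_{\mu}\left(\frac{\partial\mathcal{L}}{\partial(\partial_{\mu}\phi_{j})}\right) = 0,
\]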

for $j = 1,\ldots,n$, and are called the Euler-Lagrange equations.

Suppose that the action $S$ is invariant (by which we mean that $\delta S = 0$) under an infinitesimal variation

\[
\phi_{j}(x) \to \phi_{j}(x) + \varepsilon F_{j}(x),
\]

where $\varepsilon$ is a constant and the $F_{j}(x)$ are functions of the field components and their derivatives at the point $x$. We also call this an infinitesimal symmetry of the action. Then there exists a current $J^{\mu}(x)$ that is conserved, i.e. $\partial_{\mu}J^{\mu}(x) = 0$. In particular, this implies that we can define a quantity

\[
Q(t) := \int_{\mathbb{R}^{3}}J^{0}(t,\mathbf{x})\,d^{3}x
\]

that is conserved in the sense that $\frac{dQ}{dt} \equiv 0$; in other words, an infinitesimal symmetry of the action $S$ implies the existence of a conserved quantity. This statement is also known as Noether's theorem. In cases where not only the action $S$ is invariant under a certain infinitesimal variation of the fields, but also the Lagrangian (this is the case for spatial translations and rotations), one can write down an explicit expression for the conserved quantity $Q$ in terms of the Lagrangian $L$ and the functions $F_{j}(x)$. If the Lagrangian density $\mathcal{L}$ is also invariant under the variation, then one can even find an explicit expression for the current $J^{\mu}(x)$ in terms of the Lagrangian density and the functions $F_{j}(x)$. These expressions can be found in section 7.3 of [35]. As already stated above, the action is invariant under spacetime translations, which can be described in infinitesimal form as $\phi_{j}(x) \to \phi_{j}(x) + \varepsilon^{\mu}\partial_{\mu}\phi_{j}(x)$. These are in fact four independent symmetries (one for each spacetime dimension) and hence there are four conserved currents $T^{\mu 0}(x),\ldots,T^{\mu 3}(x)$. Because the Lagrangian density is not invariant under translations, one cannot use the explicit expression for the currents that we mentioned above. However, it is possible to derive an expression for these currents by more direct means, see section 7.3 of [35]; the result is

\[
T^{\mu\nu} := \sum_{j=1}^{n}\frac{\partial\mathcal{L}}{\partial(\partial_{\mu}\phi_{j})}\,\partial^{\nu}\phi_{j} - \eta^{\mu\nu}\mathcal{L},
\]

which is also called the energy-momentum tensor (the index $\nu$ is a Lorentz index, so this is indeed a tensor), and the corresponding conserved quantities are

\[
P^{\mu} = \int T^{0\mu}\,d^{3}x = \int\left(\sum_{j=1}^{n}\frac{\partial\mathcal{L}}{\partial(\partial_{0}\phi_{j})}\,\partial^{\mu}\phi_{j} - \eta^{0\mu}\mathcal{L}\right)d^{3}x.
\]


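The conserved quantities $P^{\mu}$ are interpreted as the four-momentum. The conserved currents $\{M^{\mu\nu\rho}\}_{\nu,\rho}$ corresponding to Lorentz invariance are given by

\[
M^{\mu\rho\sigma} = \frac{1}{2}\sum_{j,k=1}^{n}\frac{\partial\mathcal{L}}{\partial(\partial_{\mu}\phi_{j})}\,[D^{\rho\sigma}]_{jk}\,\phi_{k} - \frac{1}{2}\left(T^{\mu\rho}x^{\sigma} - T^{\mu\sigma}x^{\rho}\right),
\]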

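where, with a slight abuse of notation, $D$ here also denotes the Lie algebra representation of $\mathfrak{l} \simeq \mathfrak{sl}(2,\mathbb{C})$ induced by the group representation $D$, and $D^{\rho\sigma} = D(X^{\rho\sigma})$ with $X^{\mu\nu} \in \mathfrak{l}$ as defined before. The corresponding conserved quantities $J^{\rho\sigma}$ are then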

\[
J^{\rho\sigma} = \int_{\mathbb{R}^{3}}M^{0\rho\sigma}\,d^{3}x.
\]

If $(i,j,k)$ is a cyclic permutation of $(1,2,3)$, then $J^{ij}$ is interpreted as the $x^{k}$-component of the angular momentum.
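As a concrete illustration of the Euler-Lagrange equations and the energy-momentum tensor, consider a single real scalar field. The Lagrangian density below is the standard free one and is introduced here purely as an example; the signature $(+,-,-,-)$ is assumed, consistent with the sign of the Klein-Gordon equation used earlier in this chapter.

\[
\mathcal{L} = \tfrac{1}{2}\,\partial_{\mu}\phi\,\partial^{\mu}\phi - \tfrac{1}{2}\,m^{2}\phi^{2},
\qquad
\frac{\partial\mathcal{L}}{\partial\phi} = -m^{2}\phi,
\qquad
\frac{\partial\mathcal{L}}{\partial(\partial_{\mu}\phi)} = \partial^{\mu}\phi .
\]

The Euler-Lagrange equation then reads $-m^{2}\phi - \partial_{\mu}\partial^{\mu}\phi = 0$, i.e. $(\Box + m^{2})\phi = 0$, and the energy-momentum tensor and the energy become

\[
T^{\mu\nu} = \partial^{\mu}\phi\,\partial^{\nu}\phi - \eta^{\mu\nu}\mathcal{L},
\qquad
P^{0} = \int T^{00}\,d^{3}x = \int\Big(\tfrac{1}{2}(\partial_{0}\phi)^{2} + \tfrac{1}{2}(\nabla\phi)^{2} + \tfrac{1}{2}m^{2}\phi^{2}\Big)\,d^{3}x ,
\]

which is manifestly non-negative.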

Although the Lagrangian formulation of classical fields is very useful in order to describe symmetries, it will also be necessary to construct the Hamiltonian formalism from a given Lagrangian. The reason for this is that the Hamiltonian formalism involves Poisson (and Dirac) brackets, which can be used to postulate the commutation relations of the field components in the corresponding quantum theory. Also, after quantizing in the Hamiltonian formalism, the time evolution of the (quantized) fields can be stated in a simple form. As described above, the Lagrangian $L$ is a functional on the configuration space and this configuration space consists of $2n$-tuples of functions $(q_{j}(\mathbf{x}),\dot{q}_{j}(\mathbf{x}))$. We now define a second space, called the phase space, which also consists of $2n$-tuples $(q_{j}(\mathbf{x}),p_{j}(\mathbf{x}))$, where the first $n$ functions are allowed to be functions of the same class as the first $n$ functions in the elements of configuration space, but the last $n$ objects $p_{j}(\mathbf{x})$ are defined as linear functionals

\[
q \mapsto \int_{\mathbb{R}^{3}}p_{j}(\mathbf{x})\,q(\mathbf{x})\,d^{3}x
\]

on the space of all allowed functions $q(\mathbf{x})$ (note that this is very similar to the case of finitely many particles, where the configuration space is the tangent bundle of some smooth manifold and the phase space is the cotangent bundle of the same manifold). In order to define the Poisson brackets we need to define the notion of a functional derivative of a functional. Let $A$ be a functional of $k$ functions $f_{1}(\mathbf{x}),\ldots,f_{k}(\mathbf{x})$, i.e. $A$ is a map $(f_{1},\ldots,f_{k}) \mapsto A[f_{1},\ldots,f_{k}]$. We then define the functional derivative of $A$ with respect to $f_{j}$ at the point $(g_{1},\ldots,g_{k})$ as the linear map

\[
h \mapsto \left[\frac{d}{d\varepsilon}A[g_{1},\ldots,g_{j-1},g_{j}+\varepsilon h,g_{j+1},\ldots,g_{k}]\right]_{\varepsilon=0},
\]

where $h(\mathbf{x})$ is a function in the same class as the $f_{j}$. The integral kernel of this map is denoted by $\delta A/\delta f_{j}$:

\[
\left[\frac{d}{d\varepsilon}A[g_{1},\ldots,g_{j-1},g_{j}+\varepsilon h,g_{j+1},\ldots,g_{k}]\right]_{\varepsilon=0} = \int_{\mathbb{R}^{3}}\left.\frac{\delta A[f_{1},\ldots,f_{k}]}{\delta f_{j}(\mathbf{x})}\right|_{(g_{1},\ldots,g_{k})}h(\mathbf{x})\,d^{3}x
\]

and this integral kernel will also be called the functional derivative of $A$ with respect to $f_{j}$. We may interpret this functional derivative as a functional

\[
(g_{1},\ldots,g_{k},h) \mapsto \int_{\mathbb{R}^{3}}\left.\frac{\delta A[f_{1},\ldots,f_{k}]}{\delta f_{j}(\mathbf{x})}\right|_{(g_{1},\ldots,g_{k})}h(\mathbf{x})\,d^{3}x,
\]

which is linear in $h$, as already stated above. Recall that in writing partial derivatives of ordinary functions, we often denote the partial derivative of $f(x_{1},\ldots,x_{n})$ with respect to $x_{j}$ at the point $(y_{1},\ldots,y_{n})$ as $\frac{\partial f(y_{1},\ldots,y_{n})}{\partial y_{j}}$ instead of $\left.\frac{\partial f(x_{1},\ldots,x_{n})}{\partial x_{j}}\right|_{(y_{1},\ldots,y_{n})}$. Similarly, we will often write the functional derivative of the functional $A[f_{1},\ldots,f_{k}]$ above with respect to $f_{j}$ at the point $(g_{1},\ldots,g_{k})$ as

\[
\frac{\delta A[g_{1},\ldots,g_{k}]}{\delta g_{j}(\mathbf{x})}.
\]
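The definition of the functional derivative as an integral kernel can be checked numerically on a simple example. The sketch below makes several simplifying assumptions that are not part of the text: space is one-dimensional and discretized on a uniform grid with spacing `dx`, integrals are replaced by Riemann sums, and the functional is $A[f] = \int f(x)^{2}\,dx$, whose functional derivative is $2f(x)$. All function names are hypothetical.

```python
def integrate(values, dx):
    """Riemann-sum approximation of an integral over the grid."""
    return sum(values) * dx


def functional_A(f, dx):
    """Example functional A[f] = integral of f(x)^2."""
    return integrate([v * v for v in f], dx)


def directional_derivative(functional, g, h, dx, eps=1e-6):
    """Central-difference approximation of d/d(eps) A[g + eps*h] at eps = 0."""
    plus = [gv + eps * hv for gv, hv in zip(g, h)]
    minus = [gv - eps * hv for gv, hv in zip(g, h)]
    return (functional(plus, dx) - functional(minus, dx)) / (2 * eps)


def kernel_A(g):
    """Functional derivative of A at g, i.e. the integral kernel 2*g(x)."""
    return [2.0 * gv for gv in g]


# Grid on [0, 1) and two sample functions g (the point) and h (the direction).
dx = 0.01
xs = [i * dx for i in range(100)]
g = [x * (1.0 - x) for x in xs]
h = [x for x in xs]

lhs = directional_derivative(functional_A, g, h, dx)
rhs = integrate([k * hv for k, hv in zip(kernel_A(g), h)], dx)
print(abs(lhs - rhs) < 1e-6)
```

The two numbers agree because the left-hand side is the linear map $h \mapsto \frac{d}{d\varepsilon}A[g+\varepsilon h]|_{\varepsilon=0}$ and the right-hand side is the same map written with its integral kernel.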


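When the functional $A$ depends in a well-behaved manner on the functions $f_{j}$, then it might happen that these functional derivatives depend on $\mathbf{x}$ in a well-behaved manner, and in that case we can consider them as a map

\[
(\mathbf{x},g_{1},\ldots,g_{k}) \mapsto \frac{\delta A[g_{1},\ldots,g_{k}]}{\delta g_{j}(\mathbf{x})},
\]

i.e. as a functional of the $g_{j}$, depending in a well-behaved manner on the variable $\mathbf{x}$. We now apply these definitions as follows. Let $F[q,p]$ and $G[q,p]$ be two functionals on phase space. We then define the Poisson bracket $\{F,G\}_{P}$ of $F$ and $G$ by

\[
\{F,G\}_{P} := \sum_{j=1}^{n}\int_{\mathbb{R}^{3}}\left[\frac{\delta F[q,p]}{\delta q_{j}(\mathbf{x})}\frac{\delta G[q,p]}{\delta p_{j}(\mathbf{x})} - \frac{\delta G[q,p]}{\delta q_{j}(\mathbf{x})}\frac{\delta F[q,p]}{\delta p_{j}(\mathbf{x})}\right]d^{3}x.
\]

In particular, for the functionals $F_{j,\mathbf{x}}[q,p] = q_{j}(\mathbf{x})$ and $G_{k,\mathbf{y}}[q,p] = p_{k}(\mathbf{y})$ we find that their Poisson bracket is $\delta_{jk}\delta(\mathbf{x}-\mathbf{y})$. We will now use the Lagrangian $L[q,\dot{q}]$ to define a function from configuration space to phase space by

\[
(q_{1}(\mathbf{x}),\ldots,q_{n}(\mathbf{x}),\dot{q}_{1}(\mathbf{x}),\ldots,\dot{q}_{n}(\mathbf{x})) \mapsto (q_{1}(\mathbf{x}),\ldots,q_{n}(\mathbf{x}),\pi_{1}[q,\dot{q}](\mathbf{x}),\ldots,\pi_{n}[q,\dot{q}](\mathbf{x})), \tag{3.29}
\]

where the $\pi_{j}[q,\dot{q}]$ are given by

\[
\pi_{j}[q,\dot{q}](\mathbf{x}) = \frac{\delta L[q,\dot{q}]}{\delta\dot{q}_{j}(\mathbf{x})}.
\]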
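The value of the Poisson bracket of $q_{j}(\mathbf{x})$ and $p_{k}(\mathbf{y})$ quoted just before (3.29) follows directly from the definitions. For the evaluation functionals $F_{j,\mathbf{x}}[q,p] = q_{j}(\mathbf{x})$ and $G_{k,\mathbf{y}}[q,p] = p_{k}(\mathbf{y})$ the functional derivatives are, in the distributional sense,

\[
\frac{\delta F_{j,\mathbf{x}}}{\delta q_{i}(\mathbf{z})} = \delta_{ij}\,\delta(\mathbf{z}-\mathbf{x}),
\qquad
\frac{\delta F_{j,\mathbf{x}}}{\delta p_{i}(\mathbf{z})} = 0,
\qquad
\frac{\delta G_{k,\mathbf{y}}}{\delta q_{i}(\mathbf{z})} = 0,
\qquad
\frac{\delta G_{k,\mathbf{y}}}{\delta p_{i}(\mathbf{z})} = \delta_{ik}\,\delta(\mathbf{z}-\mathbf{y}),
\]

so that only the first term of the Poisson bracket contributes:

\[
\{F_{j,\mathbf{x}},G_{k,\mathbf{y}}\}_{P} = \sum_{i=1}^{n}\int_{\mathbb{R}^{3}}\delta_{ij}\,\delta(\mathbf{z}-\mathbf{x})\,\delta_{ik}\,\delta(\mathbf{z}-\mathbf{y})\,d^{3}z = \delta_{jk}\,\delta(\mathbf{x}-\mathbf{y}).
\]

Here $\mathbf{z}$ is a dummy integration variable introduced for this calculation only.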

A priori the $\mathbf{x}$-dependence should be interpreted in the sense of linear functionals as described above, but because the Lagrangian $L$ is a space integral of a Lagrangian density $\mathcal{L}(a_{j},b_{k},c_{l})$ (with $j,l \in \{1,\ldots,n\}$ and $k \in \{1,\ldots,3n\}$), the $\pi_{j}[q,\dot{q}](\mathbf{x})$ are well-behaved functions given by

\[
\pi_{j}[q,\dot{q}](\mathbf{x}) = \left.\frac{\partial\mathcal{L}(q(\mathbf{x}),(\nabla q)(\mathbf{x}),c)}{\partial c_{j}}\right|_{c=\dot{q}(\mathbf{x})}.
\]

It might happen that the map (3.29) gives rise to certain constraints, e.g. if $\mathcal{L}(a_{j},b_{k},c_{l})$ does not depend on $c_{m}$ for some $m \in \{1,\ldots,n\}$ then the image of configuration space under the map (3.29) only contains $2n$-tuples of the form $(q_{1},\ldots,q_{n},p_{1},\ldots,p_{m-1},0,p_{m+1},\ldots,p_{n})$. In this case we can write the constraint as $p_{m} \equiv 0$. In general, we will consider constraints of the form

\[
\alpha_{m,\mathbf{x}}[q,p] = 0,
\]

where for each pair $(m,\mathbf{x}) \in \{1,\ldots,M\}\times\mathbb{R}^{3}$, the functional $\alpha_{m,\mathbf{x}}[q,p]$ is a function of the $q$'s and $p$'s and their derivatives at the point $\mathbf{x}$. These constraints are called primary constraints because they already follow from the definition of the functions $\pi_{j}$, without using equations of motion. We will assume that these primary constraints are all independent of one another. We now define a functional $H'[q,\dot{q}]$ on configuration space by

\[
H'[q,\dot{q}] = H'[q,\dot{q},\pi[q,\dot{q}]] := \sum_{j=1}^{n}\int_{\mathbb{R}^{3}}\dot{q}_{j}(\mathbf{x})\,\pi_{j}[q,\dot{q}](\mathbf{x})\,d^{3}x - L[q,\dot{q}].
\]
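For orientation, here is how the map (3.29) and the functional $H'$ look in the simplest unconstrained case. The example is a single real scalar field ($n = 1$) with the standard free Lagrangian density, chosen here for illustration and written in the variables $(a,b,c)$ used above:

\[
\mathcal{L}(a,b,c) = \tfrac{1}{2}c^{2} - \tfrac{1}{2}|b|^{2} - \tfrac{1}{2}m^{2}a^{2},
\qquad
\pi[q,\dot{q}](\mathbf{x}) = \left.\frac{\partial\mathcal{L}}{\partial c}\right|_{c=\dot{q}(\mathbf{x})} = \dot{q}(\mathbf{x}).
\]

The map (3.29) is then invertible, so there are no primary constraints, and

\[
H'[q,\dot{q}] = \int_{\mathbb{R}^{3}}\Big(\dot{q}^{2} - \mathcal{L}\Big)\,d^{3}x = \int_{\mathbb{R}^{3}}\Big(\tfrac{1}{2}\pi^{2} + \tfrac{1}{2}|\nabla q|^{2} + \tfrac{1}{2}m^{2}q^{2}\Big)\,d^{3}x ,
\]

which depends on $\dot{q}$ only through $\pi$ and extends uniquely to a Hamiltonian $H[q,p]$ on phase space by replacing $\pi$ with $p$.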

It can be shown that $H'[q,\dot{q}]$ is actually a functional $H'[q,\pi[q,\dot{q}]]$, i.e. that the dependence on $\dot{q}$ is only via $\pi$. If there are primary constraints, we cannot extend $H'$ uniquely to a functional $H[q,p]$ on the entire phase space. Indeed, if $H[q,p]$ is such that $H[q,\pi[q,\dot{q}]] = H'[q,\dot{q}]$, then

\[
H_{T}[q,p] = H[q,p] + \sum_{m=1}^{M}\int_{\mathbb{R}^{3}}u_{m,\mathbf{x}}[q,p]\,\alpha_{m,\mathbf{x}}[q,p]\,d^{3}x
\]

also satisfies this property for any set of functionals $u_{m,\mathbf{x}}[q,p]$, since $\alpha_{m,\mathbf{x}}[q,p]$ vanishes when we substitute $\pi[q,\dot{q}]$ for $p$. However, as we will explain below, not all of the $H_{T}$ of the form above will give rise to Hamiltonian equations of motion in the phase space that are consistent with


the primary constraints. To see this, we first have to introduce the concept of time evolution of the $q$'s and $p$'s. Analogous to the theory of Hamiltonian particle dynamics, the time evolution in Hamiltonian field theory is described by maps

\[
t \mapsto (q^{(t)},p^{(t)}),
\]

where for each $t \in \mathbb{R}$, $(q^{(t)},p^{(t)})$ is an element in phase space. This time evolution is given by the Hamiltonian equations of motion

\[
\begin{aligned}
\frac{\partial q^{(t)}_{j}(\mathbf{x})}{\partial t} &= \frac{\partial}{\partial t}F_{j,\mathbf{x}}[q^{(t)},p^{(t)}] \approx \{F_{j,\mathbf{x}}[q^{(t)},p^{(t)}],H_{T}[q^{(t)},p^{(t)}]\}_{P} \\
\frac{\partial p^{(t)}_{j}(\mathbf{x})}{\partial t} &= \frac{\partial}{\partial t}G_{j,\mathbf{x}}[q^{(t)},p^{(t)}] \approx \{G_{j,\mathbf{x}}[q^{(t)},p^{(t)}],H_{T}[q^{(t)},p^{(t)}]\}_{P},
\end{aligned}
\]

where $F_{j,\mathbf{x}}[q,p] = q_{j}(\mathbf{x})$ and $G_{j,\mathbf{x}}[q,p] = p_{j}(\mathbf{x})$, and $\approx$ means that the equality only holds for those $q$'s and $p$'s that satisfy the constraints. This time evolution of the $q$'s and $p$'s determines the time evolution for any functional $g[q,p]$ by $t \mapsto g[q^{(t)},p^{(t)}] =: g^{(t)}[q,p]$, which is equivalent to

\[
\frac{d}{dt}g^{(t)}[q,p] \approx \{g^{(t)}[q,p],H^{(t)}_{T}[q,p]\}_{P}.
\]

In particular, for $g = H_{T}$ we find that

\[
\frac{d}{dt}H^{(t)}_{T}[q,p] \approx 0.
\]

Therefore, we often suppress the time dependence of $H_{T}$. We can also take $g$ to be a constraint functional $\alpha_{m,\mathbf{x}}$. Because the constraints must be satisfied for all time, we have the following equations

\[
\{\alpha_{m,\mathbf{x}}[q,p],H_{T}[q,p]\}_{P} \approx 0 \tag{3.30}
\]

for $m = 1,\ldots,M$. There are now three options for any one of these equations:

(1) The equation is trivially true because it follows from the primary constraints;

(2) The equation reduces to an equation not involving the $u_{m,\mathbf{x}}[q,p]$. In this case we obtain a new constraint of the form $\beta_{\mathbf{x}}[q,p] \approx 0$, which is called a secondary constraint;

(3) The equation does not reduce in either of the two ways described above. In this case we obtain an equation that describes restrictions on the $u_{m,\mathbf{x}}$, namely

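\[
\{\alpha_{m,\mathbf{x}}[q,p],H[q,p]\}_{P} + \sum_{m'=1}^{M}\int_{\mathbb{R}^{3}}d^{3}y\ u_{m',\mathbf{y}}[q,p]\,\{\alpha_{m,\mathbf{x}}[q,p],\alpha_{m',\mathbf{y}}[q,p]\}_{P} \approx 0. \tag{3.31}
\]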

The procedure is now as follows. For each value of $m$ we check which of the three options applies. If it is (1), then we obtain nothing new and we move on to the next $m$. If it is (2), then we have obtained a new constraint $\beta_{\mathbf{x}}[q,p] \approx 0$ and we must demand consistency of this new constraint by substituting it into (3.30) in place of $\alpha_{m,\mathbf{x}}[q,p]$:

\[
\{\beta_{m,\mathbf{x}}[q,p],H_{T}[q,p]\}_{P} \approx 0.
\]

We then have to check again which of the three options applies to this new equation. Finally, if option (3) applies then we have obtained a constraint on the $u_{m,\mathbf{x}}[q,p]$ and we move on to the next $m$. The final result of this procedure is that we are left with $M$ primary constraints $\alpha_{m,\mathbf{x}}[q,p] \approx 0$, $K$ secondary constraints $\beta_{k,\mathbf{x}}[q,p] \approx 0$ and $L$ constraints on the $u_{m,\mathbf{x}}$ of the form (3.31). Because the distinction between primary and secondary constraints is not really necessary, we will use the letter $\chi$ for both primary and secondary constraints from now on, $\chi_{m,\mathbf{x}} := \alpha_{m,\mathbf{x}}$


for m = 1, . . . ,M and χM+k,x = βk,x for k = 1, . . . ,K, so we can write the set of primary andsecondary constraints as

χn,x[q, p] ≈ 0,

with $n = 1, \ldots, M+K$. The constraints on the $u_{m,x}[q,p]$ are now of the form

$$\{\chi_{n,x}[q,p],\, H[q,p]\}_P + \sum_{m=1}^{M} \int_{\mathbb{R}^3} d^3y\, u_{m,y}[q,p]\, \{\chi_{n,x}[q,p],\, \chi_{m,y}[q,p]\}_P \approx 0$$

for some of the $n \in \{1, \ldots, M+K\}$. We can interpret this as a system of non-homogeneous linear equations in the unknown variables $u_{m,x}[q,p]$. If $U_{m,x}[q,p]$ is any particular solution of this system, then the general solution can be written as

$$u_{m,x}[q,p] = U_{m,x}[q,p] + \sum_{a=1}^{A} \int_{\mathbb{R}^3} d^3z\, v_{a,z}[q,p] \cdot (V^{(a,z)})_{m,x}[q,p],$$

where the $v_{a,z}[q,p]$ are arbitrary functionals and the $(V^{(a,z)})_{m,x}[q,p]$ are all independent solutions of the corresponding homogeneous system:

$$\sum_{m=1}^{M} \int_{\mathbb{R}^3} d^3y\, (V^{(a,z)})_{m,y}[q,p]\, \{\chi_{n,x}[q,p],\, \chi_{m,y}[q,p]\}_P \approx 0.$$

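The index $(a,z)$, which labels the solutions to the homogeneous system, in general takes on fewer values than the index $(m,x)$, so the constraint equations on $u_{m,x}[q,p]$ have reduced the arbitrariness of the Hamiltonian $H_T$ somewhat. The Hamiltonian $H_T$ can now be written as

$$\begin{aligned}
H_T[q,p] &= H[q,p] + \sum_{m=1}^{M} \int_{\mathbb{R}^3} d^3x \left( U_{m,x}[q,p] + \sum_{a=1}^{A} \int_{\mathbb{R}^3} d^3z\, v_{a,z}[q,p]\, (V^{(a,z)})_{m,x}[q,p] \right) \chi_{m,x}[q,p] \\
&= H[q,p] + \sum_{m=1}^{M} \int_{\mathbb{R}^3} d^3x\, U_{m,x}[q,p]\, \chi_{m,x}[q,p] + \sum_{a=1}^{A} \int_{\mathbb{R}^3} d^3z\, v_{a,z}[q,p] \left( \sum_{m=1}^{M} \int_{\mathbb{R}^3} d^3x\, (V^{(a,z)})_{m,x}[q,p]\, \chi_{m,x}[q,p] \right) \\
&=: H[q,p] + \sum_{a=1}^{A} \int_{\mathbb{R}^3} d^3z\, v_{a,z}[q,p]\, \chi_{a,z}[q,p].
\end{aligned}$$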

We have thus obtained an expression for the Hamiltonian in which the arbitrariness is made very explicit. The equations of motion for any functional $g^{(t)}[q,p]$ can be written in terms of this Hamiltonian as

$$\frac{d}{dt} g^{(t)}[q,p] \approx \{g^{(t)}[q,p],\, H[q,p]\}_P + \sum_{a=1}^{A} \int_{\mathbb{R}^3} d^3z\, v^{(t)}_{a,z}[q,p]\, \{g^{(t)}[q,p],\, \chi_{a,z}[q,p]\}_P,$$

where the time dependence of the $v^{(t)}_{a,z}[q,p]$ is arbitrary. Because of this arbitrariness in the equations of motion, the time evolution of $g^{(t)}[q,p]$ is not uniquely defined. The physical interpretation of this is that different choices for $v^{(t)}_{a,z}[q,p]$ correspond to the same physical situation; the system has some gauge freedom. The infinitesimal gauge transformations are of the form

$$g[q,p] \to g[q,p] + \sum_{a=1}^{A} \int_{\mathbb{R}^3} d^3z\, \varepsilon_{a,z}\, \{g[q,p],\, \chi_{a,z}[q,p]\}_P. \tag{3.32}$$


Transformations of this kind do not change the physical state. The $\chi_{a,y}[q,p]$ are called generating functionals for the infinitesimal gauge transformations. It can be shown that the Poisson bracket $\{\chi_{a,z}[q,p],\, \chi_{a',z'}[q,p]\}_P$ of two generating functionals is again a generating functional. In general, this gives rise to new gauge transformations, other than (3.32). For a better understanding of these new gauge transformations, we introduce some useful terminology. Any functional $A[q,p]$ for which

$$\{A[q,p],\, \chi_{n,x}[q,p]\}_P \approx 0$$

for any pair $(n,x)$ is called first class, and it is easy to see that the Poisson bracket of two first class functionals is again first class. Any functional that is not first class is called second class. We can apply this terminology to the constraint functionals $\chi_{n,x}[q,p]$ themselves. In this manner the constraints can be divided into first class constraints and second class constraints. The constraint functionals $\chi_{a,z}[q,p]$, which are generating functionals for gauge transformations of the form (3.32), are all first class (and primary). Because the Poisson bracket of two first class functionals is again first class, we find that the $\{\chi_{a,y}[q,p],\, \chi_{a',y'}[q,p]\}_P$ are first class constraints. However, it might happen that some of these are not primary (and hence secondary) and in this case we obtain gauge transformations other than (3.32). The corresponding first class secondary constraints can be added to the Hamiltonian in a similar manner as the $\chi_{a,y}[q,p]$ without changing the physics; this new (and more general) Hamiltonian is called the extended Hamiltonian. In general it is not true that every first class secondary constraint generates a gauge transformation (so we should not add them all to the Hamiltonian), but in all physically interesting models this turns out to be the case. In these models all first class constraints can be eliminated by choosing a gauge, so from now on we will only need to focus on second class constraints. In particular, we will only discuss the quantization of a system with second class constraints.
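To make the consistency procedure concrete, the following sketch works through a toy model with two degrees of freedom. The model is our own illustrative choice and does not appear in the references; we assume the standard convention $\{q_i, p_j\}_P = \delta_{ij}$, and the integrals over $x$ reduce to finite sums.

```latex
% Hypothetical toy model: L = (1/2) \dot{q}_1^2 + q_1 q_2.
% Momenta: p_1 = \dot{q}_1 and p_2 = 0, so there is one primary constraint.
\[
\alpha = p_2 \approx 0, \qquad
H = \tfrac{1}{2} p_1^2 - q_1 q_2, \qquad
H_T = H + u\, p_2 .
\]
% Consistency conditions, imposed one after the other:
\begin{align*}
\{p_2, H_T\}_P &\approx q_1 \approx 0 && \text{option (2): secondary constraint } \beta_1 = q_1, \\
\{q_1, H_T\}_P &\approx p_1 \approx 0 && \text{option (2): secondary constraint } \beta_2 = p_1, \\
\{p_1, H_T\}_P &\approx q_2 \approx 0 && \text{option (2): secondary constraint } \beta_3 = q_2, \\
\{q_2, H_T\}_P &\approx u \approx 0   && \text{option (3): a restriction on } u .
\end{align*}
```

In this example the procedure terminates with four constraints $p_2, q_1, p_1, q_2$. Since $\{q_1, p_1\}_P = 1$ and $\{q_2, p_2\}_P = 1$, none of them is first class, so all four are second class and no gauge freedom remains.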

Quantization

For a system with second class constraints $\chi_{n,x}[q,p] \approx 0$ we define the “matrix”

$$C_{(n_1,x_1),(n_2,x_2)}[q,p] = \{\chi_{n_1,x_1}[q,p],\, \chi_{n_2,x_2}[q,p]\}_P .$$

Because the constraints are second class, this matrix has an inverse $(C^{-1})_{(n_1,x_1),(n_2,x_2)}[q,p]$ (recall that we are not trying to be mathematically rigorous in this chapter). We then define the Dirac bracket of two functionals by

$$\{A[q,p],\, B[q,p]\}_D := \{A[q,p],\, B[q,p]\}_P - \sum_{n_1,n_2} \int d^3x_1\, d^3x_2\, \{A[q,p],\, \chi_{n_1,x_1}[q,p]\}_P\, (C^{-1})_{(n_1,x_1),(n_2,x_2)}[q,p]\, \{\chi_{n_2,x_2}[q,p],\, B[q,p]\}_P .$$
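The Dirac bracket can be illustrated in a finite-dimensional setting. The sketch below is our own illustration under simplifying assumptions: phase-space functions are taken to be linear in the coordinates $(q_1, \ldots, q_n, p_1, \ldots, p_n)$, so that every Poisson bracket is a constant, and the integrals over $x$ become finite sums. The function names `poisson`, `inverse` and `dirac` are hypothetical helpers, not part of any library.

```python
from fractions import Fraction


def poisson(a, b, n):
    """Poisson bracket of two linear phase-space functions.

    A function is stored as a list of 2n coefficients, ordered as
    (q_1, ..., q_n, p_1, ..., p_n). Convention assumed: {q_i, p_j} = delta_ij.
    """
    return sum(a[i] * b[n + i] - a[n + i] * b[i] for i in range(n))


def inverse(m):
    """Invert a small square matrix exactly by Gauss-Jordan elimination.

    Raises StopIteration if the matrix is singular, which happens when the
    constraints are not all second class.
    """
    size = len(m)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(size)]
           for i, row in enumerate(m)]
    for col in range(size):
        pivot = next(r for r in range(col, size) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        piv = aug[col][col]
        aug[col] = [v / piv for v in aug[col]]
        for r in range(size):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]


def dirac(a, b, constraints, n):
    """Dirac bracket {a, b}_D = {a, b}_P - sum {a, chi_i} (C^-1)_ij {chi_j, b}."""
    c = [[poisson(x, y, n) for y in constraints] for x in constraints]
    c_inv = inverse(c)
    size = len(constraints)
    correction = sum(
        poisson(a, constraints[i], n) * c_inv[i][j] * poisson(constraints[j], b, n)
        for i in range(size) for j in range(size))
    return poisson(a, b, n) - correction


# Two degrees of freedom with second class constraints q_1 = 0 and p_1 = 0.
q1, q2 = [1, 0, 0, 0], [0, 1, 0, 0]
p1, p2 = [0, 0, 1, 0], [0, 0, 0, 1]
chi = [q1, p1]

print(dirac(q1, p1, chi, 2))  # 0: the constrained pair drops out
print(dirac(q2, p2, chi, 2))  # 1: the unconstrained pair keeps its bracket
```

In this example the Dirac bracket of the constrained pair vanishes while the unconstrained pair keeps its canonical bracket, which is the behaviour one expects from the definition above.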

In quantizing this system, the functionals become operators and we impose on them the commutation relations

$$[A, B] = i\{A, B\}_D$$

and the time evolution in the Heisenberg picture of the corresponding quantum system is given by the Heisenberg equations of motion, as usual. The reason for choosing these commutation relations can be explained as follows for the case of finitely many$^{30}$ degrees of freedom. Recall that for unconstrained systems the commutation relations are defined to be $i$ times the Poisson bracket. If we have a classical system with only second class constraints, then it can be shown that there exists a canonical transformation$^{31}$ that transforms the coordinates $q_1, \ldots, q_n$ and their canonical conjugates $p_1, \ldots, p_n$ to a set of coordinates $Q_1, \ldots, Q_{n-r}$ and $\tilde{Q}_1, \ldots, \tilde{Q}_r$ with canonical conjugates $P_1, \ldots, P_{n-r}$ and $\tilde{P}_1, \ldots, \tilde{P}_r$, respectively, such that the constraints take the form $\tilde{Q}_j = \tilde{P}_j = 0$ with $j = 1, \ldots, r$. The $Q$s and the $P$s then form an unconstrained system which we know how to quantize, namely by taking the commutators to be $i$ times the Poisson brackets (in terms of the unconstrained $P$s and $Q$s of course). But it can be shown that the Dirac bracket, calculated in the original coordinates $q_j$ and $p_j$, coincides with the Poisson bracket calculated in terms of the $Q$s and $P$s, so using the Dirac bracket indeed gives the desired result. More details about this quantization procedure can be found in section 7.6 of [35].

$^{30}$ Taking finitely many degrees of freedom makes the argument a little easier.
$^{31}$ This is a coordinate transformation for which the new coordinates have the same Poisson brackets as the old ones. Here the Poisson brackets of the new coordinates are computed with respect to the old coordinates.

Transition to the interaction picture

Many (or perhaps even all) of the free fields that we constructed at the beginning of this chapter can be obtained by quantizing a classical Lagrangian field theory according to the procedure described above, and in these cases the Lagrangian is known explicitly. However, the expansion of the free fields in terms of creation and annihilation operators does not arise in any way from the quantization procedure described above. Instead, this expansion can be obtained by solving the classical equations of motion (which are linear differential equations for these free field Lagrangians) and writing the general solution as a Fourier expansion. The Fourier coefficients are then replaced with the creation and annihilation operators in the corresponding quantized field. Since the commutation relations of the quantized free fields are dictated by the quantization procedure above, the commutation relations of the creation and annihilation operators are also determined by the quantization procedure. However, these commutation relations turn out to be precisely the ones that we had before (so there is some consistency here). Furthermore, the free Hamiltonian that is derived from the Lagrangian is also of the correct form when we write it in terms of the creation and annihilation operators.

Suppose now that we are given a Lagrangian for an interacting field theory. Once we have derived the Hamiltonian from the Lagrangian by the procedure described above, we will split the Hamiltonian into a sum $H = H_0 + V$ of a free part $H_0$ and an interaction part $V$. Finding the correct free part $H_0$ of $H$ is not a very difficult task, because for all physically relevant free fields we know the explicit form of the Lagrangian (and hence also of the Hamiltonian). Once we have the decomposition $H = H_0 + V$, we make the transition to the interaction picture at time $t = 0$. So at $t = 0$ the interaction picture fields and their canonical conjugates coincide with the Heisenberg fields and their canonical conjugates, but they evolve from $t = 0$ in a different manner. The next task is to express the canonical conjugates of the interaction picture fields in terms of the fields and their time derivatives. This is done by taking the functional derivative of $H_0$ (not $H$) with respect to the canonically conjugate fields. The result is an expression for the interaction term $V$ in the interaction picture in terms of the free fields, as desired.

3.5 Some remarks on the physics of quantum fields

It is clear that the entire framework described in the preceding sections was based on the fact that we want to calculate the S-matrix elements for a given scattering experiment. These S-matrix elements describe the scattering experiment in terms of the incoming particles and the resulting outgoing particles, but they give no insight into the processes during the period that the particles are interacting. The free field expansion of the S-operator is not very useful to examine this. Also, the methods above give us no information about how to describe arbitrary relativistic quantum systems (beyond scattering experiments). One could argue that for many practical purposes this is unnecessary, but in the end any satisfactory fundamental theory of nature should in principle be able to describe any system. Therefore, we must extend our discussion of the preceding sections. The obvious extension would be to somehow give meaning to Lagrangian theories in terms of interacting Heisenberg fields, and to assume that these fields describe the exact evolution of the system. For scattering experiments, these fields then interpolate between the fields of incoming and outgoing free particles. It is difficult to imagine what these fields would be like from a mathematical point of view, since in general we cannot even solve the (non-linear) classical equations of motion corresponding to these fields. However, there are some properties that we must expect these fields to have. For example, they should transform under Poincaré transformations in such a way that the theory is Poincaré invariant, and any physical quantity that can be measured in some bounded spacetime region must be compatible with the physical quantities that can be measured in some other spacelike-separated region. The latter demand can most easily be implemented by assuming that the fields either commute or anticommute at spacelike separated points, just as the free fields described earlier. These considerations will all be used to motivate the two mathematical frameworks for quantum field theory that we will discuss in the next chapter.


4 The mathematics of quantum fields

In this chapter we will discuss the mathematical structure of quantum field theory. In the first section we will introduce the Wightman axioms and we will motivate these axioms by referring back to several physical aspects that we discussed in the previous chapters. After introducing these axioms we will formulate some of the important theorems that can be proven within the Wightman formalism. In the second section of this chapter we will briefly discuss an alternative approach to the problem of finding a mathematical framework for quantum field theory, namely the theory of local observables, also called algebraic quantum field theory. The axioms of the theory of local observables are often referred to as the Haag-Kastler axioms.

4.1 The Wightman formulation of quantum field theory

Before we can understand the Wightman formalism, we first need to develop some mathematical background. This mathematical background consists mainly of the theory of distributions and operator-valued distributions and will be summarized in the following subsection, which is based on chapter 2 of [32] and section 2.7 of [2]. The rest of this section on the Wightman formulation is also mainly based on these two books.

4.1.1 Mathematical preliminaries: Distributions and operator-valued distributions

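Let $C^\infty(\mathbb{R}^N)$ be the space of infinitely differentiable complex-valued functions $f(x_1, \ldots, x_N)$ on $\mathbb{R}^N$. For any sequence $k = (k_1, \ldots, k_N)$ with $k_j \in \mathbb{Z}_{\geq 0}$ we define a function $x = (x_1, \ldots, x_N) \mapsto x^k$ in $C^\infty(\mathbb{R}^N)$, where $x^k$ is given by

$$x^k := x_1^{k_1} \cdots x_N^{k_N}.$$

Also, for any such sequence $k$ we define the differential operator $D^k$ on $C^\infty(\mathbb{R}^N)$ by

$$D^k := \frac{\partial^{|k|}}{(\partial x_1)^{k_1} \cdots (\partial x_N)^{k_N}},$$

where $|k| = k_1 + \ldots + k_N$. For each $f \in C^\infty(\mathbb{R}^N)$ we then define for any $r, s \in \mathbb{Z}_{\geq 0}$

$$\|f\|_{r,s} = \sum_{k : |k| \leq r}\ \sum_{l : |l| \leq s}\ \sup_x |x^k D^l f(x)|,$$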

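which is either a non-negative real number or else $+\infty$. For any fixed $r, s$ the restriction of $\|\cdot\|_{r,s} : C^\infty(\mathbb{R}^N) \to \mathbb{R} \cup \{+\infty\}$ to the linear subspace $C^\infty(\mathbb{R}^N)_{r,s}$ of $C^\infty(\mathbb{R}^N)$ consisting of all functions $f$ for which $\|f\|_{r,s} < \infty$ is a norm.

Definition 4.1 The function space $\mathcal{S}(\mathbb{R}^N)$ is defined to be the set of all $f \in C^\infty(\mathbb{R}^N)$ for which $\|f\|_{r,s} < \infty$ for all $r, s \in \mathbb{Z}_{\geq 0}$. The space $\mathcal{S}(\mathbb{R}^N)$ is also called the Schwartz space.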
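As a quick check on Definition 4.1, the following worked computation (our own illustration, with $N = 1$) evaluates one of the norms for a Gaussian and shows why a slowly decaying smooth function fails to be a Schwartz function.

```latex
% N = 1 and f(x) = e^{-x^2}. For r = 1, s = 0 the sum runs over k in {0, 1} and l = 0.
\[
\|f\|_{1,0} = \sup_x |e^{-x^2}| + \sup_x |x\, e^{-x^2}| = 1 + \frac{1}{\sqrt{2e}},
\]
% where the second supremum is attained at x = \pm 1/\sqrt{2}.
% By contrast, g(x) = (1 + x^2)^{-1} is smooth, but
\[
\|g\|_{3,0} \geq \sup_x \frac{|x|^3}{1 + x^2} = +\infty ,
\]
% so g is not an element of the Schwartz space.
```

For the Gaussian all norms $\|f\|_{r,s}$ are finite, because every derivative of $f$ is a polynomial times $e^{-x^2}$.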

In particular, since the $\|\cdot\|_{r,s}$ are norms (and hence semi-norms) on the Schwartz space, we can give $\mathcal{S}(\mathbb{R}^N)$ the structure of a locally convex space by taking as a subbasis for the topology the sets of the form

$$B_{r,s}(f_0, \varepsilon) = \{f \in \mathcal{S}(\mathbb{R}^N) : \|f - f_0\|_{r,s} < \varepsilon\},$$

where $f_0 \in \mathcal{S}(\mathbb{R}^N)$, $r, s \in \mathbb{Z}_{\geq 0}$ and $\varepsilon > 0$. Thus a set $U \subset \mathcal{S}(\mathbb{R}^N)$ is open if and only if for each $f_0 \in U$ there are $r_1, \ldots, r_n$, $s_1, \ldots, s_n$ and $\varepsilon_1, \ldots, \varepsilon_n > 0$ such that $\bigcap_{j=1}^{n} B_{r_j,s_j}(f_0, \varepsilon_j) \subset U$. As in any topological space, we say that a sequence $(f_n)$ of elements in $\mathcal{S}(\mathbb{R}^N)$ converges to $f \in \mathcal{S}(\mathbb{R}^N)$ if for each open neighborhood $U$ of $f$ there exists a positive integer $N_U$ such that $f_n \in U$ for all $n \geq N_U$. In particular, this condition for $U$ must hold for any open neighborhood of the form $B_{r,s}(f, \varepsilon)$, so if $(f_n)$ converges to $f$ then for every $r, s \in \mathbb{Z}_{\geq 0}$ and for every $\varepsilon > 0$ there exists a positive integer $N_{r,s,\varepsilon}$ such that $\|f_n - f\|_{r,s} < \varepsilon$ for all $n \geq N_{r,s,\varepsilon}$. Conversely, if $(f_n)$ is a sequence such that for every $r, s \in \mathbb{Z}_{\geq 0}$ and for every $\varepsilon > 0$ there exists a positive integer $N_{r,s,\varepsilon}$ such that $\|f_n - f\|_{r,s} < \varepsilon$ for all $n \geq N_{r,s,\varepsilon}$, then $(f_n)$ must converge to $f$. This is because each open neighborhood $U$ of $f$ contains an open subset of the form $\bigcap_{j=1}^{n} B_{r_j,s_j}(f, \varepsilon_j)$. Therefore, we conclude that a sequence $(f_n)$ in the Schwartz space converges to some $f$ in the Schwartz space if and only if for all $r, s \in \mathbb{Z}_{\geq 0}$ we have $\lim_{n \to \infty} \|f_n - f\|_{r,s} = 0$.

Now that we have defined the Schwartz space we will introduce the notion of a distribution on this space.

Definition 4.2 A continuous linear functional $T : \mathcal{S}(\mathbb{R}^N) \to \mathbb{C}$ on the Schwartz space is called a tempered distribution. We will denote the space of tempered distributions by $\mathcal{S}'(\mathbb{R}^N)$.

Because the Schwartz space $\mathcal{S}(\mathbb{R}^N)$ is metrizable$^{32}$, the continuity of a linear functional $T : \mathcal{S}(\mathbb{R}^N) \to \mathbb{C}$ can be expressed in terms of sequences (see for instance [25], theorem 21.3): the linear functional $T$ is continuous if and only if for each sequence $(f_n)$ converging to $f$ we have that $T(f_n)$ converges to $T(f)$, which is equivalent to the statement that $\lim_{n \to \infty} \|f_n - f\|_{r,s} = 0$ for all $r, s \in \mathbb{Z}_{\geq 0}$ implies that $\lim_{n \to \infty} |T(f_n) - T(f)| = 0$.

The most natural topology to define on $\mathcal{S}'(\mathbb{R}^N)$ is the weak*-topology, which is the topology that is defined by the seminorms $p_f : T \mapsto |T(f)|$. With respect to this topology, a sequence $(T_n)$ of tempered distributions converges to a tempered distribution $T$ if and only if $\lim_{n \to \infty} |T_n(f) - T(f)| = 0$ for all $f \in \mathcal{S}(\mathbb{R}^N)$.

We now introduce some terminology. We say that a distribution $T \in \mathcal{S}'(\mathbb{R}^N)$ vanishes in an open set $U \subset \mathbb{R}^N$ if $T(f) = 0$ for all $f \in \mathcal{S}(\mathbb{R}^N)$ for which $\mathrm{supp}(f) \subset U$. Here $\mathrm{supp}(f)$ denotes the support of $f$, i.e. the complement of the largest open set contained in $\{x \in \mathbb{R}^N : f(x) = 0\}$. We then define the support $\mathrm{supp}(T)$ of the distribution $T$ to be the complement of the largest open set on which $T$ vanishes.
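A standard example that exercises both the sequential continuity criterion and the notion of support is the evaluation functional at a point. The short computation below is our own illustration; the symbol $\delta_a$ is introduced here only for this purpose.

```latex
% Evaluation at a point a in R^N is linear, and it is continuous because
\[
\delta_a(f) := f(a), \qquad
|\delta_a(f_n) - \delta_a(f)| = |f_n(a) - f(a)| \leq \sup_x |f_n(x) - f(x)| = \|f_n - f\|_{0,0} .
\]
% If U is open with a not in U and supp(f) is contained in U, then f(a) = 0,
% so delta_a vanishes on the open set R^N \ {a}. It does not vanish on any
% open set containing a, hence
\[
\mathrm{supp}(\delta_a) = \{a\} .
\]
```

So $\delta_a$ is a tempered distribution whose support is a single point, even though it is not given by integration against any function.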

An important example of a tempered distribution is

$$T(f) = \sum_{k : |k| \leq s} \int_{\mathbb{R}^N} t_k(x_1, \ldots, x_N)\, D^k f(x_1, \ldots, x_N)\, dx_1 \ldots dx_N, \tag{4.1}$$

where $k = (k_1, \ldots, k_N)$ (with $k_j \in \mathbb{Z}_{\geq 0}$) and the functions $t_k$ are continuous and satisfy $|t_k(x)| \leq C_k(1 + \|x\|^{j_k})$ for some $C_k \geq 0$ and $j_k \in \mathbb{Z}_{\geq 0}$. A particularly nice case of (4.1) is when $T$ is of the form $T(f) = \int_{\mathbb{R}^N} t(x) f(x)\, d^N x$. In that case we say that the tempered distribution $T$ is a function. However, it is convenient to write any distribution $T$ as $T(x)$, even though it is not a function.

For any non-singular linear transformation $L : \mathbb{R}^N \to \mathbb{R}^N$ and any vector $a \in \mathbb{R}^N$ we can define the diffeomorphism $\phi_{(a,L)}(x) = Lx + a$ on $\mathbb{R}^N$. For any function $f \in \mathcal{S}(\mathbb{R}^N)$ we then define a new function $f_{(a,L)}$ by

$$f_{(a,L)}(x) := f(\phi_{(a,L)}^{-1}(x)) = f(L^{-1}(x - a)).$$

For a distribution $T \in \mathcal{S}'(\mathbb{R}^N)$ we define a new distribution $T_{(a,L)}$ by

$$T_{(a,L)}(f) := |\det(L)|^{-1}\, T(f_{(a,L)})$$

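for all $f \in \mathcal{S}(\mathbb{R}^N)$. If we define variables $(y_1, \ldots, y_N) = (\phi^{-1}_1(x), \ldots, \phi^{-1}_N(x))$, then we find that the volume element in $\mathbb{R}^N$ satisfies $d^N x = |\det(L)|\, d^N y$. Now suppose that $T$ is a distribution that is given by $T(f) = \int_{\mathbb{R}^N} t(x) f(x)\, d^N x$, as a special case of (4.1). Then

$$\begin{aligned}
T_{(a,L)}(f) &= |\det(L)|^{-1}\, T(f_{(a,L)}) = |\det(L)|^{-1} \int_{\mathbb{R}^N} t(x)\, f(\phi^{-1}(x))\, d^N x \\
&= |\det(L)|^{-1} \int_{\mathbb{R}^N} (t \circ \phi)(\phi^{-1}(x))\, f(\phi^{-1}(x))\, d^N x \\
&= |\det(L)|^{-1} \int_{\mathbb{R}^N} (t \circ \phi)(y)\, f(y)\, |\det(L)|\, d^N y \\
&= \int_{\mathbb{R}^N} t(Lx + a)\, f(x)\, d^N x,
\end{aligned} \tag{4.2}$$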

$^{32}$ This is because the topology of $\mathcal{S}(\mathbb{R}^N)$ is determined by countably many seminorms, see also proposition IV.2.1 of [5] for this argument. It can be shown that $\mathcal{S}(\mathbb{R}^N)$ is complete as a metric space, so that it is in fact a Fréchet space, but we will not need this fact.


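where in the last step we changed dummy variables. As we have stated before, distributions are often written as $T(x)$, even though they are not functions in general. In view of equation (4.2) above, the distribution $T_{(a,L)}$ is then denoted by $T(Lx + a)$. With the definition of $T_{(a,L)}$ at hand, we can now define the partial derivative of a tempered distribution $T$ as

$$\frac{\partial T}{\partial x_j} := \lim_{h \to 0} h^{-1}\left(T_{(h e_j, 1)} - T\right),$$

where $\{e_j\}_{j=1}^{N}$ is the standard basis for $\mathbb{R}^N$ and $1$ denotes the identity transformation on $\mathbb{R}^N$. By definition of convergence in $\mathcal{S}'(\mathbb{R}^N)$ this means that for each $f \in \mathcal{S}(\mathbb{R}^N)$ we have

$$\begin{aligned}
\frac{\partial T}{\partial x_j}(f) &= \lim_{h \to 0} h^{-1}\left(T_{(h e_j, 1)} - T\right)(f) = \lim_{h \to 0} h^{-1}\left[T(f_{(h e_j, 1)}) - T(f)\right] \\
&= \lim_{h \to 0} T\left[h^{-1}(f_{(h e_j, 1)} - f)\right] = T\left[\lim_{h \to 0} h^{-1}(f_{(h e_j, 1)} - f)\right] \\
&= -T\left(\frac{\partial f}{\partial x_j}\right),
\end{aligned}$$

where in the second-to-last step we used that $h^{-1}(f_{(h e_j, 1)} - f)$ converges in $\mathcal{S}(\mathbb{R}^N)$ and that $T$ is continuous. For higher-order derivatives we have

$$(D^k T)(f) = (-1)^{|k|}\, T(D^k f).$$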
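The sign rule for derivatives of distributions can be seen at work in a classical one-dimensional example. The computation below is our own illustration; $\theta$ denotes the Heaviside step function and $\delta_0$ the evaluation functional $f \mapsto f(0)$, both introduced here only for this purpose.

```latex
% N = 1. The step function theta defines a tempered distribution by integration:
\[
T_\theta(f) := \int_0^\infty f(x)\, dx .
\]
% Applying (D^k T)(f) = (-1)^{|k|} T(D^k f) with k = 1 gives
\[
\frac{\partial T_\theta}{\partial x}(f) = -T_\theta(f') = -\int_0^\infty f'(x)\, dx = f(0) = \delta_0(f),
\]
% where we used that a Schwartz function vanishes at infinity.
```

So the derivative of the step function, taken in the sense of distributions, is the evaluation functional at the origin, although $\theta$ is not differentiable as a function.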

Suppose that for some $n \geq 2$ we have a distribution $T \in \mathcal{S}'(\mathbb{R}^{n \cdot N})$ and let $f_1, \ldots, f_n \in \mathcal{S}(\mathbb{R}^N)$. Then the product function $f_1 \cdot \ldots \cdot f_n$ is an element of $\mathcal{S}(\mathbb{R}^{n \cdot N})$, so we can consider $T(f_1 \cdot \ldots \cdot f_n)$. This expression is linear in each of the $f_j$ in the sense that

$$T(f_1 \cdot \ldots \cdot (\lambda f'_j + \mu f''_j) \cdot \ldots \cdot f_n) = \lambda\, T(f_1 \cdot \ldots \cdot f'_j \cdot \ldots \cdot f_n) + \mu\, T(f_1 \cdot \ldots \cdot f''_j \cdot \ldots \cdot f_n),$$

and it depends continuously on each of the $f_j$ in the sense that

$$\lim_{l \to \infty} T(f_1 \cdot \ldots \cdot f_{j,l} \cdot \ldots \cdot f_n) = T(f_1 \cdot \ldots \cdot f_j \cdot \ldots \cdot f_n)$$

if $\lim_{l \to \infty} f_{j,l} = f_j$. Thus, $T \in \mathcal{S}'(\mathbb{R}^{n \cdot N})$ defines a multilinear functional on $\mathcal{S}(\mathbb{R}^N)^{\times n}$ that is separately continuous in each of its arguments. Conversely, it is also true that each multilinear functional on $\mathcal{S}(\mathbb{R}^N)^{\times n}$ which is continuous in each of its arguments can be derived from a (unique) tempered distribution on $\mathcal{S}(\mathbb{R}^{n \cdot N})$ as above. This is the content of the nuclear theorem, which can be found in section 1.3 of [9].

Theorem 4.3 (Nuclear theorem) Let $T : \mathcal{S}(\mathbb{R}^N)^{\times n} \to \mathbb{C}$ be a multilinear functional which is separately continuous in each of its arguments. Then there exists a unique tempered distribution $T \in \mathcal{S}'(\mathbb{R}^{n \cdot N})$ such that for all $f_1, \ldots, f_n \in \mathcal{S}(\mathbb{R}^N)$ we have

$$T(f_1, f_2, \ldots, f_n) = T(f_1 \cdot f_2 \cdot \ldots \cdot f_n).$$
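A simple instance of the nuclear theorem, given here as our own illustration for $n = 2$, is the bilinear pairing of two test functions. The symbols $b$ and $W$ are hypothetical labels used only in this example.

```latex
% A bilinear functional on S(R^N) x S(R^N), separately continuous since
% |b(f_1, f_2)| <= sup|f_1| * \int |f_2| :
\[
b(f_1, f_2) := \int_{\mathbb{R}^N} f_1(x)\, f_2(x)\, d^N x .
\]
% The corresponding tempered distribution on S(R^{2N}) restricts a test
% function F(x, y) to the diagonal:
\[
W(F) := \int_{\mathbb{R}^N} F(x, x)\, d^N x, \qquad
W(f_1 \cdot f_2) = \int_{\mathbb{R}^N} f_1(x)\, f_2(x)\, d^N x = b(f_1, f_2) .
\]
```

In the informal notation where distributions are written as functions, $W$ is the distribution usually denoted by $\delta(x - y)$.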

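We will now discuss the Fourier transform on distributions. Recall that the Fourier transform $\mathcal{F}_B$ and the inverse Fourier transform $\overline{\mathcal{F}}_B$ of a Schwartz function $f$ are defined by

$$(\mathcal{F}_B f)(p) = \left(\frac{1}{\sqrt{2\pi}}\right)^{N} \int_{\mathbb{R}^N} e^{-iB(p,x)} f(x)\, d^N x$$

and

$$(\overline{\mathcal{F}}_B f)(p) = \left(\frac{1}{\sqrt{2\pi}}\right)^{N} \int_{\mathbb{R}^N} e^{iB(p,x)} f(x)\, d^N x,$$

respectively, where $B(\cdot\,, \cdot)$ denotes a non-degenerate symmetric bilinear form (for example the Euclidean or the Minkowskian form). Now let $f, g \in \mathcal{S}(\mathbb{R}^N)$. Because Schwartz functions are very well behaved, we can use Fubini’s theorem to conclude that the Fourier transform satisfies

$$\int_{\mathbb{R}^N} (\mathcal{F}_B f)(p)\, g(p)\, d^N p = \int_{\mathbb{R}^N} f(x)\, (\mathcal{F}_B g)(x)\, d^N x.$$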


We will use this in the following case. Suppose that $T \in \mathcal{S}'(\mathbb{R}^N)$ is a tempered distribution of the form $T(f) = \int_{\mathbb{R}^N} h(x) f(x)\, d^N x$ with $h \in \mathcal{S}(\mathbb{R}^N)$ a Schwartz function. We will write $T_h$ instead of $T$ to denote the dependence of $T$ on $h$. Then the equality above implies that we have the identity

$$T_{\mathcal{F}_B h}(g) = \int_{\mathbb{R}^N} (\mathcal{F}_B h)(p)\, g(p)\, d^N p = \int_{\mathbb{R}^N} h(x)\, (\mathcal{F}_B g)(x)\, d^N x = T_h(\mathcal{F}_B g).$$

Similarly, we also have the identity

$$T_{\overline{\mathcal{F}}_B h}(g) = T_h(\overline{\mathcal{F}}_B g)$$

for the inverse Fourier transform $\overline{\mathcal{F}}_B$. The left-hand sides of these two equations can be used as a definition of the Fourier transform and its inverse, respectively, of the tempered distribution $T_h$. This motivates the following definition for the Fourier transform and the inverse Fourier transform for tempered distributions.

Definition 4.4 Let $T \in \mathcal{S}'(\mathbb{R}^N)$ be a tempered distribution. Then the Fourier transform $\mathcal{F}_B T$ of $T$ is defined by

$$(\mathcal{F}_B T)(f) = T(\mathcal{F}_B f).$$

The inverse Fourier transform $\overline{\mathcal{F}}_B T$ of $T$ is defined by

$$(\overline{\mathcal{F}}_B T)(f) = T(\overline{\mathcal{F}}_B f).$$
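Definition 4.4 can be tested on a distribution that is not a function. The computation below is our own illustration; $\delta_0$ denotes the evaluation functional $f \mapsto f(0)$, a label introduced only for this example.

```latex
% Fourier transform of the evaluation functional at the origin:
\[
(\mathcal{F}_B \delta_0)(f) = \delta_0(\mathcal{F}_B f) = (\mathcal{F}_B f)(0)
= \left(\frac{1}{\sqrt{2\pi}}\right)^{N} \int_{\mathbb{R}^N} f(x)\, d^N x ,
\]
% since e^{-iB(0,x)} = 1 for every bilinear form B.
```

Hence $\mathcal{F}_B \delta_0$ is the tempered distribution given by the constant function $(2\pi)^{-N/2}$, independently of the choice of $B$.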

In order to also define the Laplace transform on distributions, it is convenient to start with a larger class of distributions than $\mathcal{S}'(\mathbb{R}^N)$. Let $\mathcal{D}(\mathbb{R}^N) \subset C^\infty(\mathbb{R}^N)$ denote the set of all $C^\infty$-functions with compact support. By definition, a sequence $(f_n)$ in $\mathcal{D}(\mathbb{R}^N)$ converges to $f \in \mathcal{D}(\mathbb{R}^N)$ if the supports of all $f_n$ lie in a single compact set $K$, if $f_n \to f$ uniformly in $K$ and if all derivatives of $f_n$ converge uniformly in $K$ to the derivatives of $f$. It is clear that $\mathcal{D}(\mathbb{R}^N) \subset \mathcal{S}(\mathbb{R}^N)$ as sets, so every tempered distribution defines a linear functional on $\mathcal{D}(\mathbb{R}^N)$, and because convergence of a sequence in $\mathcal{D}(\mathbb{R}^N)$ implies convergence of the same sequence in $\mathcal{S}(\mathbb{R}^N)$, we see that any tempered distribution is in fact continuous with respect to the topology on $\mathcal{D}(\mathbb{R}^N)$. So if $\mathcal{D}'(\mathbb{R}^N)$ denotes the space of all continuous linear functionals on $\mathcal{D}(\mathbb{R}^N)$, then we have the inclusion $\mathcal{S}'(\mathbb{R}^N) \subset \mathcal{D}'(\mathbb{R}^N)$. In general this inclusion will be strict and we have thus obtained a class of distributions $\mathcal{D}'(\mathbb{R}^N)$ that is larger than $\mathcal{S}'(\mathbb{R}^N)$.

Now if $T \in \mathcal{D}'(\mathbb{R}^N)$ then for each $g \in C^\infty(\mathbb{R}^N)$ we can define a distribution $gT$ by $(gT)(f) = T(fg)$ for $f \in \mathcal{D}(\mathbb{R}^N)$. It is easy to see that $gT \in \mathcal{D}'(\mathbb{R}^N)$. Sometimes it can happen that $gT$ is even a tempered distribution.

Definition 4.5 For each $T \in \mathcal{D}'(\mathbb{R}^N)$ we define a set $\Gamma(T) \subset \mathbb{R}^N$ by

$$\Gamma(T) = \{\gamma \in \mathbb{R}^N : e^{-B(\,\cdot\,,\gamma)}\, T \in \mathcal{S}'(\mathbb{R}^N)\},$$

where $B$ denotes a non-degenerate symmetric bilinear form and $e^{-B(\,\cdot\,,\gamma)}$ denotes the $C^\infty$-function $x \mapsto e^{-B(x,\gamma)}$ on $\mathbb{R}^N$.

It can be shown (see section 2.3 of [32]) that $\Gamma(T)$ is convex, i.e. if $\gamma_1, \gamma_2 \in \Gamma(T)$ then also $t\gamma_1 + (1 - t)\gamma_2 \in \Gamma(T)$ for all $0 < t < 1$. Note that this does not exclude the case $\Gamma(T) = \emptyset$, so $\Gamma(T)$ might still be empty. However, if $T \in \mathcal{S}'(\mathbb{R}^N)$ then $0 \in \Gamma(T)$, so then $\Gamma(T)$ is certainly non-empty. In general, whenever $T \in \mathcal{D}'(\mathbb{R}^N)$ is such that $\Gamma(T)$ is non-empty and such that the support of $T$ lies in some half-space of the form

$$H^{(B)}_{\alpha,r} := \{x \in \mathbb{R}^N : B(x, \alpha) > r\}$$

with $\alpha \in \mathbb{R}^N$ and $r \in \mathbb{R}$, the following theorem gives some information about $\Gamma(T)$.


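Theorem 4.6 If $T \in \mathcal{D}'(\mathbb{R}^N)$ with $\mathrm{supp}(T) \subset H^{(B)}_{\alpha,r}$ for some $\alpha \in \mathbb{R}^N$ and $r \in \mathbb{R}$, then $\Gamma(T)$ contains all points of the form $\gamma + t\alpha$ with $\gamma \in \Gamma(T)$ and $t \geq 0$.

This is theorem 2.7 of [32]. Note that the actual value of $r \in \mathbb{R}$ is not important here. We now define the Laplace transform of a distribution.

Definition 4.7 Let $T \in \mathcal{D}'(\mathbb{R}^N)$. For each $\gamma \in \Gamma(T)$ we define the Laplace transform $\mathcal{L}_B(T)_\gamma \in \mathcal{S}'(\mathbb{R}^N)$ by

$$\mathcal{L}_B(T)_\gamma = \mathcal{F}_B\left(e^{-B(\,\cdot\,,\gamma)}\, T\right).$$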

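If $T$ is given by $T(f) = \int_{\mathbb{R}^N} t(x) f(x)\, d^N x$, then its Laplace transform is given by $\mathcal{L}_B(T)_\gamma(f) = \int_{\mathbb{R}^N} \mathcal{L}_B(T)_\gamma(p)\, f(p)\, d^N p$, with the function $\mathcal{L}_B(T)_\gamma(p)$ given by

$$\begin{aligned}
\mathcal{L}_B(T)_\gamma(p) &= \left(\frac{1}{\sqrt{2\pi}}\right)^{N} \int_{\mathbb{R}^N} e^{-iB(p,x)}\, e^{-B(x,\gamma)}\, t(x)\, d^N x \\
&= \left(\frac{1}{\sqrt{2\pi}}\right)^{N} \int_{\mathbb{R}^N} e^{-iB(x,\, p - i\gamma)}\, t(x)\, d^N x,
\end{aligned}$$

where we have extended $B$ to complex vectors by making $B$ a $\mathbb{C}$-bilinear form (not a sesquilinear form). We will often identify the Laplace transform $\mathcal{L}_B(T)_\gamma$ with the function $\mathcal{L}_B(T)_\gamma(p)$. The expression for $\mathcal{L}_B(T)_\gamma(p)$ gives the impression that the Laplace transform of a distribution $T$ of the form above depends on the complex variables $p - i\gamma = (p_1 - i\gamma_1, \ldots, p_N - i\gamma_N)$ in a nice way. In fact, as stated in the following theorem, which is theorem 2.6 in [32], this is even true for general distributions in $\mathcal{D}'(\mathbb{R}^N)$.

Theorem 4.8 Let $\Gamma \subset \mathbb{R}^N$ be a convex open set and let $T \in \mathcal{D}'(\mathbb{R}^N)$ be such that $\Gamma \subset \Gamma(T)$. Then the Laplace transform $\mathcal{L}_B(T)_\gamma$ is a holomorphic function $\mathcal{L}_B(T)(p - i\gamma)$ on the tube $\mathbb{R}^N - i\Gamma \subset \mathbb{C}^N$.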
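The statements about $\Gamma(T)$ and holomorphy can be checked by hand in one dimension. The sketch below is our own illustration, assuming $N = 1$, $B(x, y) = xy$ and the distribution given by the Heaviside step function $\theta$ (a symbol introduced only for this example).

```latex
% T_theta(f) = \int_0^\infty f(x) dx has support [0, \infty), which lies in the
% half-space {x : x > -1}, i.e. alpha = 1 and r = -1.
% The function e^{-\gamma x} \theta(x) is polynomially bounded exactly when
% \gamma >= 0, so \Gamma(T_theta) = [0, \infty), in agreement with Theorem 4.6.
% For \gamma > 0 the Laplace transform is
\[
\mathcal{L}_B(T_\theta)_\gamma(p)
= \frac{1}{\sqrt{2\pi}} \int_0^\infty e^{-ix(p - i\gamma)}\, dx
= \frac{1}{\sqrt{2\pi}}\, \frac{1}{i(p - i\gamma)} .
\]
```

The right-hand side depends only on the complex variable $\zeta = p - i\gamma$ and is holomorphic for $\mathrm{Im}\, \zeta < 0$, which is the tube $\mathbb{R} - i(0, \infty)$ of Theorem 4.8 with $\Gamma = (0, \infty)$.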

We will now apply the theorems above to distributions on Minkowski space. For the inside of the future light cone we will use the notation

$$V_+ = \{x \in M : \eta(x, x) > 0,\ x^0 > 0\},$$

where we have assumed that we have already chosen an orthonormal basis in $M$. The closure of this set is $\overline{V}_+ = \{x \in M : \eta(x, x) \geq 0,\ x^0 \geq 0\}$ and is just the union of $V_+$ with the future light cone $C_+$. For each $a \in V_+$ we have that $\eta(x, a) \geq 0$ for all $x \in \overline{V}_+$, so for each $a \in V_+$ we have the inclusion

$$\overline{V}_+ \subset H^{(\eta)}_{a,-\varepsilon}$$

for any $\varepsilon > 0$. Now consider the $n$-fold product $M^n \simeq \mathbb{R}^{4n}$ of $M$. On $M^n$ we define a non-degenerate symmetric bilinear form $\eta^{(n)}$ by

$$\eta^{(n)}((x_1, \ldots, x_n), (y_1, \ldots, y_n)) = \sum_{j=1}^{n} \eta(x_j, y_j) = \sum_{j=1}^{n} \eta_{\mu\nu}\, x_j^\mu\, y_j^\nu .$$

From now on, we will denote the bilinear form $\eta^{(n)}$ simply by $\eta$. Then, analogous to the inclusion above, we have for all $a \in (V_+)^n$

$$(\overline{V}_+)^n \subset H^{(\eta)}_{a,-\varepsilon}$$

for any $\varepsilon > 0$. Thus, if $T \in \mathcal{S}'(M^n)$ (so that $0 \in \Gamma(T)$) has support in $(\overline{V}_+)^n$ then $ta \in \Gamma(T)$ for all $a \in (V_+)^n$ and $t \geq 0$, i.e. $(V_+)^n \subset \Gamma(T)$. Because $(V_+)^n$ is open and convex in $M^n$, the Laplace transform of $T$ is holomorphic on the tube $M^n - i(V_+)^n =: \mathcal{T}_n$. This is summarized in the first part of the following theorem. The second part is theorem 2.9 in [32].

Theorem 4.9 If $T \in \mathcal{S}'(M^n)$ with $\mathrm{supp}(T) \subset (\overline{V}_+)^n$ then the Laplace transform $\mathcal{L}_\eta(T)(p - i\gamma)$ is a holomorphic function on the tube $\mathcal{T}_n = M^n - i(V_+)^n$. Also, for each $f \in \mathcal{S}(M^n)$ we have

$$\lim_{\gamma \to 0} \int_{M^n} \mathcal{L}_\eta(T)(p - i\gamma)\, f(p)\, d^{4n} p = (\mathcal{F}_\eta(T))(f).$$


We will now introduce the notion of a vector-valued distribution. Our purpose for using vector-valued distributions will be to define operator-valued distributions. Vector-valued distributions can be defined to take on values in any locally convex space, but for us it will be enough to restrict ourselves to Hilbert spaces.

Definition 4.10 Let $\mathcal{H}$ be a Hilbert space. Then a linear map $T : \mathcal{S}(\mathbb{R}^N) \to \mathcal{H}$ is called a vector-valued distribution if for all $\Psi \in D$, with $D \subset \mathcal{H}$ some dense linear subspace, the linear functional $f \mapsto \langle T(f), \Psi \rangle$ is continuous. Thus, a vector-valued distribution is a linear map $T : \mathcal{S}(\mathbb{R}^N) \to \mathcal{H}$ such that $f \mapsto \langle T(f), \Psi \rangle$ is a tempered distribution for all $\Psi$ in some dense linear subspace $D \subset \mathcal{H}$.

With this definition, the definition of an operator-valued distribution can be given as follows.

Definition 4.11 Let $T$ be a linear map from the Schwartz space $\mathcal{S}(\mathbb{R}^N)$ to the set of closable$^{33}$ operators on a Hilbert space $\mathcal{H}$ which are all defined on the same dense linear subspace $D \subset \mathcal{H}$. Then the map $T$ is called an operator-valued distribution in $\mathcal{H}$ if for all $\Psi \in D$ the correspondence $f \mapsto T(f)\Psi$ is a vector-valued distribution.

4.1.2 The Wightman axioms

In section 2.2.4 we showed that in any quantum theory that is Poincaré invariant we should have a unitary representation of the double cover $\tilde{\mathcal{P}}_+^\uparrow$ of the restricted Poincaré group $\mathcal{P}_+^\uparrow$ on the Hilbert space $\mathcal{H}$ of pure states. Therefore, before we can even start discussing quantum fields the following axiom must be satisfied:

Axiom 0: Relativistic quantum theory
If $\mathcal{H}$ denotes the Hilbert space of pure states for some quantum system in Minkowski spacetime, then there is a unitary representation $U : \tilde{\mathcal{P}}_+^\uparrow \to B(\mathcal{H})$ of the double cover of the restricted Poincaré group on $\mathcal{H}$ describing the transformation of states and operators under a Poincaré transformation. In particular, any spacetime translation $(a, 1) \in \tilde{\mathcal{P}}_+^\uparrow$ is represented by a unitary operator of the form

$$U(a, 1) = e^{i a \cdot P},$$

where the self-adjoint operators $P = (P^0, P^1, P^2, P^3)$ are interpreted as the energy-momentum operators of the system. The points in the joint spectrum of these operators lie on or inside the positive light cone in momentum space (positive energy condition), i.e. the operators $P^0$ and $M^2 = P \cdot P$ are both positive operators.

Our description of quantum fields in the previous chapter seems to suggest that quantum fields are objects which assign an operator to each point in spacetime. However, the fields at a spacetime point are too singular to be a well-defined operator. Therefore, we assume that quantum fields only define well-defined operators after they are smeared out with some rapidly decreasing test function over spacetime. The quantum fields are thus operator-valued distributions. This motivates the following axiom.

Axiom 1: Quantum field
There is an object $\phi = (\phi_1, \ldots, \phi_N)$, called a quantum field, whose components are operator-valued distributions mapping each function $f$ in the Schwartz space $\mathcal{S}(M)$ of functions on Minkowski spacetime to operators $\phi_1(f), \ldots, \phi_N(f)$ on $\mathcal{H}$ whose domains all contain the same dense subspace $D \subset \mathcal{H}$ and which satisfy $\phi_j(f) D \subset D$. The adjoints $\phi_j(f)^*$ are also operators whose domains contain $D$ and which satisfy $\phi_j(f)^* D \subset D$; the adjoint field $\phi^* = (\phi_1^*, \ldots, \phi_N^*)$ is then defined by $\phi_j^*(f) = \phi_j(f)^*$. Furthermore, the dense subset $D$ is left invariant by $U$, i.e. $U(a, A) D \subset D$ for any $(a, A) \in \tilde{\mathcal{P}}_+^\uparrow$.

$^{33}$ An operator $A : \mathcal{H} \to \mathcal{H}$ is called closable if it has a closed extension, i.e. if it has an extension whose graph is closed in $\mathcal{H} \oplus \mathcal{H}$.


Here $\mathcal{S}(M)$ is of course the same as $\mathcal{S}(\mathbb{R}^4)$. Note that the fact that $D$ is invariant under the fields and their adjoints implies that for any $\Psi \in D$ we can let arbitrary products of smeared fields and their adjoints act on $\Psi$.

Equation (3.15) in the previous chapter shows that the quantum field components should transform according to a representation of $\tilde{\mathcal{P}}_+^\uparrow$. This is stated in the following axiom.

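Axiom 2: Transformation law of the field
For each $f \in \mathcal{S}(\mathcal{M})$ we have the operator identity on $D$
\[
U(a,A)\,\phi_j(f)\,U(a,A)^{-1} = \sum_{k=1}^{N} S(A^{-1})_{jk}\,\phi_k(f_{(a,A)}),
\]
where $f_{(a,A)}(x) := f(\Phi(A)^{-1}(x-a))$ (here $\Phi : \tilde{\mathcal{P}}_+^\uparrow \to \mathcal{P}_+^\uparrow$ is the covering map) and $S : SL(2,\mathbb{C}) \to GL(\mathbb{C}^N)$ is a representation of $SL(2,\mathbb{C})$ on $\mathbb{C}^N$.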

Note that this implies that for the adjoints of the smeared fields we have the transformation law
\[
U(a,A)\,\phi_j(f)^*\,U(a,A)^{-1} = \sum_{k=1}^{N} \overline{S(A^{-1})_{jk}}\,\phi_k(f_{(a,A)})^*,
\]

which follows easily by taking the adjoint of the transformation law of the fields. The representation $S$ in axiom 2 is not assumed to be irreducible; in general it will be a direct sum $S = S^{(\kappa_1)} \oplus \ldots \oplus S^{(\kappa_\ell)}$ of irreducible representations $S^{(\kappa_j)}$ of $SL(2,\mathbb{C})$. Correspondingly, the field $\phi$ can be decomposed into irreducible fields as $\phi = (\phi^{(\kappa_1)}, \ldots, \phi^{(\kappa_\ell)})$. Each of these irreducible fields has components $\phi^{(\kappa_j)}_{ab}$ with $a = -A_j, -A_j+1, \ldots, A_j$ and $b = -B_j, -B_j+1, \ldots, B_j$, where $A_j$ and $B_j$ are the labels characterizing the irreducible representation $S^{(\kappa_j)}$ of $SL(2,\mathbb{C})$ as in the previous chapter. Of course, the $A_j, B_j$ satisfy $\sum_{j=1}^{\ell}(2A_j+1)(2B_j+1) = N$. Although the parameters $\{A_j, B_j\}_{j=1}^{\ell}$ give us information about how the different components of the field $\phi$ are related to each other, we cannot say anything about the complete form of any single component $\phi_j$ as an operator-valued distribution. For instance, the components do not have to satisfy the Klein-Gordon equation, which was satisfied by our (free) field components in the previous chapter. To say more about the field components, we also need to know about the representation $U(a,A)$ of $\tilde{\mathcal{P}}_+^\uparrow$.

To obtain a Lorentz invariant S-matrix it was also necessary that the Hamiltonian density commutes with itself at spacelike distances, see equation (3.5). This was then translated to the requirement that the fields and their adjoints should in fact commute or anticommute with each other as in equation (3.20), and in section 3.5 we argued that this property should probably remain valid beyond scattering theory. In terms of operator-valued distributions this can be formulated by using Schwartz functions whose supports are spacelike separated. We say that the supports of two Schwartz functions $f$ and $g$ are spacelike separated if $f(x)g(y) = 0$ whenever $(x-y)^2 \geq 0$, i.e. whenever $x$ and $y$ are not spacelike to each other.
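The sign convention in this definition is easy to get wrong, so the following minimal sketch may help. It assumes the metric signature $(+,-,-,-)$ used elsewhere in this chapter, and it represents the supports by small finite samples of points, which is only an illustration (real supports are closed sets). The names `minkowski_sq` and `spacelike_separated` are hypothetical helpers, not notation from the text.

```python
def minkowski_sq(x, y):
    """Return (x - y)^2 for four-vectors, with signature (+, -, -, -)."""
    d = [xi - yi for xi, yi in zip(x, y)]
    return d[0] ** 2 - d[1] ** 2 - d[2] ** 2 - d[3] ** 2


def spacelike_separated(points_f, points_g):
    """True if every sampled point of one support is spacelike to every point of the other."""
    return all(minkowski_sq(x, y) < 0 for x in points_f for y in points_g)


# Finite samples standing in for supp(f) and supp(g) (illustrative values only).
support_f = [(0.0, 0.0, 0.0, 0.0), (0.5, 0.2, 0.0, 0.0)]
support_g = [(0.0, 3.0, 0.0, 0.0), (0.5, 4.0, 1.0, 0.0)]

print(spacelike_separated(support_f, support_g))
# A point that is timelike to the origin breaks the separation.
print(spacelike_separated(support_f, [(2.0, 0.0, 0.0, 0.0)]))
```

With this convention a pair of points contributes to the (anti)commutator condition of the next axiom only when `minkowski_sq` is strictly negative.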

Axiom 3: Local commutativity or microscopic causality
If $f$ and $g$ are Schwartz functions on Minkowski spacetime whose supports are spacelike separated, then for any $j, k \in \{1, \ldots, N\}$ the corresponding smeared operators either commute or anticommute, i.e.
\[
[\phi_j(f), \phi_k(g)]_\pm = 0
\]
as operator identity on $D$. Similarly, we also have
\[
[\phi_j(f), \phi_k(g)^*]_\pm = 0.
\]

From the Poincaré covariance of the fields we expect that the components of an irreducible field $\phi^{(\kappa_j)}$ either all commute with each other at spacelike distances, or else they all anticommute with each other at spacelike distances.


Finally, we will also assume that each quantum field theory has a unique vacuum state and that the entire Hilbert space $\mathcal{H}$ of pure states can be constructed by acting on the vacuum state with polynomials in the smeared fields.

Axiom 4: Vacuum state
There exists a unique$^{34}$ vector $\Omega \in D$, called the vacuum state vector, which is invariant under the action of $\tilde{\mathcal{P}}_+^\uparrow$,
\[
U(a,A)\Omega = \Omega,
\]
and which is cyclic for the smeared fields, i.e. the set $P(\phi_1, \ldots, \phi_N)\Omega$ of polynomials in the smeared fields acting on the vacuum vector forms a dense subspace in $\mathcal{H}$.

A quantum theory which satisfies the axioms 0-4 is called a quantum field theory. It is characterized by the objects $(\mathcal{H}, D, U, \phi, \Omega)$. The free fields discussed in the previous chapter provide examples of quantum field theories. We will prove this for the free hermitean scalar field in subsection 4.1.5. The existence of these examples implies that the axioms above must be compatible with each other. It can also be shown that these axioms are independent of each other, i.e. that one can find theories that satisfy only a proper subset of these axioms, but we will not discuss this here; see also section 3.2 of [32].

Finally, we want to make a remark related to the cyclicity of the vacuum state. We say that the smeared fields form an irreducible set of operators in the Hilbert space if for $A \in B(\mathcal{H})$ the condition
\[
\langle A\phi_j(f)\Psi_1, \Psi_2\rangle = \langle A\Psi_1, \phi_j(f)^*\Psi_2\rangle
\]
for all $\Psi_1, \Psi_2 \in D$, all $f \in \mathcal{S}(\mathcal{M})$ and all $j$, implies that $A$ is a constant multiple of the identity. We mention without proof that the cyclicity of the vacuum implies that in any quantum field theory the fields form an irreducible set of operators.

4.1.3 Wightman functions

Given any quantum field theory with field components $\{\phi_j\}_{j=1}^{N}$ and their corresponding adjoints, we can define for each $n \geq 0$ maps of the form
\[
\begin{aligned}
w_{i_1^{(*)}\ldots i_n^{(*)}} : (f_1, \ldots, f_n) &\mapsto \langle \phi^{(*)}_{i_1}(f_1)\,\phi^{(*)}_{i_2}(f_2) \cdots \phi^{(*)}_{i_n}(f_n)\Omega, \Omega\rangle \\
&=: \langle \phi_{i_1^{(*)}}(f_1)\,\phi_{i_2^{(*)}}(f_2) \cdots \phi_{i_n^{(*)}}(f_n)\Omega, \Omega\rangle
\end{aligned}
\]
from $\mathcal{S}(\mathcal{M})^{\times n}$ to $\mathbb{C}$. Here $\phi^{(*)}_j$ refers to either taking the adjoint of $\phi_j$ or not. Note that in the second line we also introduce the notation $\phi_{j^*}$ to denote the adjoint field $\phi_j^*$. The benefit of this notation is that we can refer to adjoint fields in expressions where only the field indices occur, such as in $w_{i_1^{(*)}\ldots i_n^{(*)}}$. However, from now on we will often suppress the $(*)$ unless it is really necessary.

The maps $w_{i_1\ldots i_n}$ are separately continuous in each of their $n$ arguments, so according to the nuclear theorem in the previous subsection, there exist unique tempered distributions $W_{i_1\ldots i_n}$ on $\mathcal{S}(\mathcal{M}^{\times n})$ that satisfy
\[
W_{i_1\ldots i_n}(f_1 \cdot f_2 \cdot \ldots \cdot f_n) = w_{i_1\ldots i_n}(f_1, \ldots, f_n)
\]
for all $f_j \in \mathcal{S}(\mathcal{M})$. Here the arguments of each of the functions $f_1, \ldots, f_n$ in the product $f_1 \cdot f_2 \cdot \ldots \cdot f_n$ lie in different copies of $\mathcal{M}$, so the product indeed defines a function in $\mathcal{S}(\mathcal{M}^{\times n})$. The distributions $W_{i_1\ldots i_n}$ are called ($n$-point) vacuum expectation values or Wightman functions. As stated in the previous subsection, we often write distributions as if they were functions. Thus we will often write the Wightman functions as $W_{i_1\ldots i_n}(x_1, \ldots, x_n)$, where each of the variables $x_j$ denotes a four-vector with components $x_j^\mu$. The Wightman functions satisfy some nice properties, as stated in the following theorem. A detailed proof can be found in section 3.3 of [32].

34 Here we mean uniqueness up to a phase factor, of course.


Theorem 4.12 In any quantum field theory the Wightman functions are tempered distributions which satisfy the following properties.
(a) (Relativistic transformation law). Under Poincaré transformations the Wightman functions transform as
\[
\sum_{j_1^{(*)}, \ldots, j_n^{(*)}} S(A^{-1})_{i_1^{(*)} j_1^{(*)}} \cdots S(A^{-1})_{i_n^{(*)} j_n^{(*)}}\, W_{j_1^{(*)}\ldots j_n^{(*)}}(\Phi(A)x_1 + a, \ldots, \Phi(A)x_n + a) = W_{i_1^{(*)}\ldots i_n^{(*)}}(x_1, \ldots, x_n),
\]
where $S(A^{-1})_{i_k^* j_k^*} := \overline{S(A^{-1})_{i_k j_k}}$. So they are translation invariant and Lorentz covariant. For each $n$, let $\xi_j = x_j - x_{j+1}$ for $j = 1, \ldots, n-1$. Then translation invariance implies that there exist tempered distributions $V_{j_1\ldots j_n}(\xi_1, \ldots, \xi_{n-1})$ such that
\[
W_{j_1\ldots j_n}(x_1, \ldots, x_n) = V_{j_1\ldots j_n}(\xi_1, \ldots, \xi_{n-1}).
\]

(b) (Spectral conditions). The (inverse) Fourier transforms $\hat{W}_{j_1\ldots j_n} = \overline{\mathcal{F}}_\eta(W_{j_1\ldots j_n})$ and $\hat{V}_{j_1\ldots j_n} = \overline{\mathcal{F}}_\eta(V_{j_1\ldots j_n})$ of $W_{j_1\ldots j_n}$ and $V_{j_1\ldots j_n}$ are tempered distributions and are related by
\[
\hat{W}_{j_1\ldots j_n}(p_1, \ldots, p_n) = (2\pi)^4\, \delta\Big(\sum_{j=1}^{n} p_j\Big)\, \hat{V}_{j_1\ldots j_n}(p_1, p_1 + p_2, \ldots, p_1 + p_2 + \ldots + p_{n-1}).
\]
Also, $\hat{V}_{j_1\ldots j_n}(q_1, \ldots, q_{n-1}) = 0$ if any $q_j$ is not in the joint spectrum of the operators $P^\mu$.
(c) (Hermiticity conditions). The Wightman functions satisfy
\[
W_{i_1\ldots i_n}(x_1, \ldots, x_n) = \overline{W_{i_n^*\ldots i_1^*}(x_n, \ldots, x_1)},
\]
where $i_k^*$ refers to the field obtained by taking the adjoint of the (adjoint) field that is referred to by the index $i_k \equiv i_k^{(*)}$.

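(d) (Local commutativity conditions). If $(x_k - x_{k+1})^2 < 0$ for some $k \in \{1, \ldots, n-1\}$, then
\[
W_{j_1\ldots j_{k+1} j_k\ldots j_n}(x_1, \ldots, x_{k+1}, x_k, \ldots, x_n) = \mp W_{j_1\ldots j_k j_{k+1}\ldots j_n}(x_1, \ldots, x_k, x_{k+1}, \ldots, x_n),
\]
where the signs $\mp$ correspond to the two cases $[\phi_{j_k}, \phi_{j_{k+1}}]_\pm = 0$, respectively.
(e) (Positive definiteness conditions). For any sequence $\{f_{i_1,\ldots,i_n}\}_{n=0}^{\infty}$ with $f_{i_1,\ldots,i_n}(x_1, \ldots, x_n)$ in $\mathcal{S}(\mathcal{M}^n)$ and with $f_{i_1,\ldots,i_n} \equiv 0$ for all but finitely many values of the multi-indices $(i_1, \ldots, i_n)$, we have the inequality
\[
\sum_{m,n=0}^{\infty}\ \sum_{(i_1^*, \ldots, i_m^*)}\ \sum_{(j_1, \ldots, j_n)} \int_{\mathcal{M}^{m+n}} W_{i_m^*, \ldots, i_1^*, j_1, \ldots, j_n}(x_m, \ldots, x_1, x'_1, \ldots, x'_n)\, \overline{f_{i_1,\ldots,i_m}(x_1, \ldots, x_m)}\, f_{j_1,\ldots,j_n}(x'_1, \ldots, x'_n)\, d^4x_1 \ldots d^4x_m\, d^4x'_1 \ldots d^4x'_n \geq 0.
\]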

(f) (Cluster decomposition property). For any spacelike vector $a \in \mathcal{M}$ and for any $m \in \{1, \ldots, n\}$ we have
\[
\lim_{\lambda \to \infty} W_{j_1\ldots j_n}(x_1, \ldots, x_m, x_{m+1} + \lambda a, x_{m+2} + \lambda a, \ldots, x_n + \lambda a) = W_{j_1\ldots j_m}(x_1, \ldots, x_m)\, W_{j_{m+1}\ldots j_n}(x_{m+1}, \ldots, x_n),
\]
where the limit is taken in the topology of $\mathcal{S}'(\mathcal{M}^n)$.

Conversely, if we have a set of tempered distributions satisfying all the properties in the theorem, then there exists a unique quantum field theory for which these distributions are the Wightman functions. This is also called the reconstruction theorem, see for example section 3.4 of [32].
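To see where the delta function in the spectral condition (b) comes from, it may help to work out the case $n = 2$ explicitly. The sketch below assumes an unnormalized Fourier convention $\hat{W}(p_1, p_2) = \int e^{i(p_1 \cdot x_1 + p_2 \cdot x_2)} W(x_1, x_2)\, d^4x_1\, d^4x_2$, treats the distributions formally as functions, and suppresses the field indices; the convention $\overline{\mathcal{F}}_\eta$ used in the text may differ from this by signs or normalization.

```latex
\begin{aligned}
\hat{W}(p_1, p_2)
  &= \int\!\!\int e^{i(p_1 \cdot x_1 + p_2 \cdot x_2)}\, V(x_1 - x_2)\, d^4x_1\, d^4x_2
     && \text{(translation invariance, part (a))} \\
  &= \int e^{i p_1 \cdot \xi}\, V(\xi)\, d^4\xi \int e^{i(p_1 + p_2) \cdot x_2}\, d^4x_2
     && \text{(substitute } x_1 = \xi + x_2\text{)} \\
  &= (2\pi)^4\, \delta(p_1 + p_2)\, \hat{V}(p_1).
\end{aligned}
```

The integral over the overall position $x_2$ produces the delta function, which expresses conservation of total momentum, while all non-trivial information sits in $\hat{V}(p_1)$, whose support is restricted by the spectrum of $P^\mu$.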


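As stated in part (b) of the theorem above, the support of the distribution $\hat{V}_{j_1,\ldots,j_n}(q_1, \ldots, q_{n-1}) \in \mathcal{S}'(\mathcal{M}^{n-1})$ lies in $(\overline{V}_+)^{n-1}$. Therefore, according to theorem 4.9, the Laplace transform $\mathcal{L}_\eta(\hat{V}_{j_1,\ldots,j_n})$ is holomorphic on the tube $\mathcal{T}_{n-1} = \mathcal{M}^{n-1} - i(V_+)^{n-1}$ and for each $f \in \mathcal{S}(\mathcal{M}^{n-1})$ we have
\[
\begin{aligned}
\lim_{\gamma \to 0} \int_{\mathcal{M}^{n-1}} \mathcal{L}_\eta(\hat{V}_{j_1,\ldots,j_n})(x_1 - i\gamma_1, \ldots, x_{n-1} - i\gamma_{n-1})\, f(x_1, \ldots, x_{n-1})\, d^{4(n-1)}x
&= (\mathcal{F}_\eta(\hat{V}_{j_1,\ldots,j_n}))(f) \\
&= (\mathcal{F}_\eta(\overline{\mathcal{F}}_\eta(V_{j_1,\ldots,j_n})))(f) \\
&= V_{j_1,\ldots,j_n}(f),
\end{aligned}
\]
so in this sense $V_{j_1,\ldots,j_n}$ is the boundary value of a holomorphic function defined on the tube $\mathcal{T}_{n-1}$, namely $\mathcal{L}_\eta(\hat{V}_{j_1,\ldots,j_n})$. We will also denote this holomorphic function as $V^{\mathrm{hol}}_{j_1,\ldots,j_n}$ from now on. Thus,
\[
V_{j_1,\ldots,j_n}(x_1, \ldots, x_{n-1}) = \lim_{\gamma \to 0} V^{\mathrm{hol}}_{j_1,\ldots,j_n}(x_1 - i\gamma_1, \ldots, x_{n-1} - i\gamma_{n-1}),
\]
where the convergence is in $\mathcal{S}'(\mathcal{M}^{n-1})$ and $\gamma_j \in V_+$. On the set $T_n := \{(x_1 - i\gamma_1, \ldots, x_n - i\gamma_n) \in \mathcal{M}^n + i\mathcal{M}^n : \gamma_j - \gamma_{j+1} \in V_+\}$ we can define another holomorphic function by
\[
W^{\mathrm{hol}}_{j_1,\ldots,j_n}(x_1 - i\gamma_1, \ldots, x_n - i\gamma_n) := V^{\mathrm{hol}}_{j_1,\ldots,j_n}(x_1 - x_2 - i(\gamma_1 - \gamma_2), \ldots, x_{n-1} - x_n - i(\gamma_{n-1} - \gamma_n)),
\]
where $\gamma_j - \gamma_{j+1} \in V_+$, and the Wightman functions $W_{j_1,\ldots,j_n}(x_1, \ldots, x_n)$ are boundary values of these functions.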

From part (a) of the theorem above it follows that under an $SL(2,\mathbb{C})$ transformation the distributions $V$ transform according to
\[
\sum_{j_1, \ldots, j_n} S(A^{-1})_{i_1 j_1} \cdots S(A^{-1})_{i_n j_n}\, V_{j_1\ldots j_n}(\Phi(A)x_1, \ldots, \Phi(A)x_{n-1}) = V_{i_1\ldots i_n}(x_1, \ldots, x_{n-1}).
\]

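Now consider the holomorphic function
\[
V^{\mathrm{hol}}_{i_1,\ldots,i_n}(\xi_1, \ldots, \xi_{n-1}) - \sum_{j_1, \ldots, j_n} S(A^{-1})_{i_1 j_1} \cdots S(A^{-1})_{i_n j_n}\, V^{\mathrm{hol}}_{j_1\ldots j_n}(\Phi(A)\xi_1, \ldots, \Phi(A)\xi_{n-1})
\]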

on the tube $\mathcal{T}_{n-1}$. From the transformation properties of $V_{i_1,\ldots,i_n}$ it follows that the boundary value $b_{i_1\ldots i_n}(x_1, \ldots, x_{n-1}) \in \mathcal{S}'(\mathcal{M}^{n-1})$ of this holomorphic function is zero. According to the generalized uniqueness theorem for holomorphic functions of several complex variables (see for example theorem B.10 of [2]) this holomorphic function must then be identically zero on the tube $\mathcal{T}_{n-1}$. This shows that $V^{\mathrm{hol}}_{j_1,\ldots,j_n}$ has the same transformation properties on the tube $\mathcal{T}_{n-1}$ as $V_{j_1,\ldots,j_n}$ on $\mathcal{M}^{n-1}$.

In order to understand the following important theorem, we have to introduce the notion of a complex Lorentz transformation. Let $\mathcal{M}_{\mathbb{C}} := \mathcal{M} + i\mathcal{M} \simeq \mathbb{C}^4$ be complex Minkowski spacetime with Minkowski metric $\eta_{\mathbb{C}}(z, w) = z^0 w^0 - \sum_{j=1}^{3} z^j w^j$ for $w, z \in \mathbb{C}^4$. Then we define a complex Lorentz transformation to be a linear map $L : \mathcal{M}_{\mathbb{C}} \to \mathcal{M}_{\mathbb{C}}$ that preserves the metric $\eta_{\mathbb{C}}$. The set $\mathcal{L}(\mathbb{C})$ of complex Lorentz transformations forms a group, called the complex Lorentz group. As for ordinary Lorentz transformations, we have $\det(L) = \pm 1$ for complex Lorentz transformations, and we define the proper complex Lorentz group $\mathcal{L}_+(\mathbb{C})$ to be the set of those complex Lorentz transformations $L$ with $\det(L) = +1$. This group $\mathcal{L}_+(\mathbb{C})$ is connected (unlike $\mathcal{L}_+$) and its universal covering group is $SL(2,\mathbb{C}) \times SL(2,\mathbb{C})$. The covering map $\Phi_{\mathbb{C}} : SL(2,\mathbb{C}) \times SL(2,\mathbb{C}) \to \mathcal{L}_+(\mathbb{C})$ is defined in the following manner, which is very similar to the definition of the map $\Phi : SL(2,\mathbb{C}) \to \mathcal{L}_+^\uparrow$. We begin with a bijection $\psi_{\mathbb{C}} : \mathcal{M}_{\mathbb{C}} \to M_2(\mathbb{C})$ that maps each element $z \in \mathcal{M}_{\mathbb{C}}$ to a matrix $\psi_{\mathbb{C}}(z)$ with $\det(\psi_{\mathbb{C}}(z)) = \eta_{\mathbb{C}}(z, z)$; the matrix $\psi_{\mathbb{C}}(z)$ is defined by precisely the same formula as $\psi(x)$ in subsection 2.1.2. Then for $(A, B) \in SL(2,\mathbb{C}) \times SL(2,\mathbb{C})$ we define the determinant preserving map $\Psi_{A,B} : M_2(\mathbb{C}) \to M_2(\mathbb{C})$ by
\[
\Psi_{A,B}(Z) = A Z B^T.
\]


Under the bijection $\psi_{\mathbb{C}}$ this determinant preserving map corresponds to a metric preserving map $\Phi_{\mathbb{C}}(A, B) : \mathcal{M}_{\mathbb{C}} \to \mathcal{M}_{\mathbb{C}}$, so $\Phi_{\mathbb{C}}(A, B) \in \mathcal{L}(\mathbb{C})$. Similar arguments as for the real case show that $\Phi_{\mathbb{C}} : SL(2,\mathbb{C}) \times SL(2,\mathbb{C}) \to \mathcal{L}(\mathbb{C})$ is actually a Lie group homomorphism with image $\mathcal{L}_+(\mathbb{C})$. The elements of the form $(A, \overline{A}) \in SL(2,\mathbb{C}) \times SL(2,\mathbb{C})$ form a subgroup isomorphic to $SL(2,\mathbb{C})$ and for such elements we have
\[
\Phi_{\mathbb{C}}(A, \overline{A}) = \Phi(A),
\]
which simply follows from $\overline{A}^T = A^*$. So if we have a representation $D$ of $SL(2,\mathbb{C})$, then we can interpret this as a representation of the subgroup $\{(A, \overline{A}) : A \in SL(2,\mathbb{C})\}$ of $SL(2,\mathbb{C}) \times SL(2,\mathbb{C})$. According to the discussion in section 9.1A of [2], this representation can be uniquely extended by analyticity to a representation $D_{\mathbb{C}}$ of $SL(2,\mathbb{C}) \times SL(2,\mathbb{C})$. We now apply these ideas to the Wightman functions. According to our discussion above, the holomorphic functions $V^{\mathrm{hol}}$ satisfy a transformation law of the form
\[
\sum_{k} D(A^{-1})_{jk}\, V^{\mathrm{hol}}_k(\Phi(A)z_1, \ldots, \Phi(A)z_n) = V^{\mathrm{hol}}_j(z_1, \ldots, z_n),
\]
where we write only a single index. This index corresponds to some basis for the vector space obtained by taking the $n$-fold tensor product of the $N$-dimensional vector space on which the representation $S$ acts (here $N$ is the number of field components in the theory and $S$ is the representation as in the Wightman axioms). We can decompose the representation $D$ into irreducible representations, which are of the form $D^{(A,B)}$ as we already noticed when we constructed general free fields in the previous chapter. If there are any representations with $A + B$ a half-odd integer, then the corresponding components must be zero, as follows from substituting the matrix $-1 \in SL(2,\mathbb{C})$ in the transformation law for $V^{\mathrm{hol}}_j$ and using that $\Phi(-1) = 1$ and $D^{(A,B)}(-1) = (-1)^{2(A+B)}$. The non-trivial irreducible components of $V^{\mathrm{hol}}_j$ thus transform according to single-valued representations of the restricted Lorentz group $\mathcal{L}_+^\uparrow$. Also, the analytic continuation of $D$ to a representation of $SL(2,\mathbb{C}) \times SL(2,\mathbb{C})$ will define a single-valued representation of the proper complex Lorentz group $\mathcal{L}_+(\mathbb{C})$. We are now ready to state the following theorem of Bargmann, Hall and Wightman, the proof of which can be found in section 9.1B of [2] (theorem 9.1) or section 2.4 of [32] (theorem 2.11).

Theorem 4.13 (Bargmann-Hall-Wightman) Let $F_j(z_1, \ldots, z_n)$ with $j = 1, \ldots, N$ be a set of holomorphic functions defined on the tube $\mathcal{T}_n$ that satisfies
\[
\sum_{k} D(A^{-1})_{jk}\, F_k(\Phi(A)z_1, \ldots, \Phi(A)z_n) = F_j(z_1, \ldots, z_n)
\]
for $A \in SL(2,\mathbb{C})$ and with $D$ a representation of $SL(2,\mathbb{C})$ the irreducible components of which are of the form $D^{(A,B)}$ with $A + B \in \mathbb{Z}_{\geq 0}$. Then the $F_j$ can be uniquely extended by analytic continuation to holomorphic functions on the so-called extended tube
\[
\mathcal{T}'_n := \bigcup_{L \in \mathcal{L}_+(\mathbb{C})} L\mathcal{T}_n
\]
and this extension satisfies
\[
\sum_{k} D_{\mathbb{C}}(A^{-1}, B^{-1})_{jk}\, F_k(\Phi_{\mathbb{C}}(A,B)z_1, \ldots, \Phi_{\mathbb{C}}(A,B)z_n) = F_j(z_1, \ldots, z_n)
\]
for $(A, B) \in SL(2,\mathbb{C}) \times SL(2,\mathbb{C})$.

In view of our discussion preceding the theorem, the theorem states that every $\mathcal{L}_+^\uparrow$-covariant holomorphic function on the tube has a unique analytic continuation to the extended tube which is $\mathcal{L}_+(\mathbb{C})$-covariant. We can apply the theorem to the functions $V^{\mathrm{hol}}_j(z_1, \ldots, z_n)$ to conclude that they are actually $\mathcal{L}_+(\mathbb{C})$-covariant holomorphic functions on the extended tube. Similarly, the holomorphic Wightman functions $W^{\mathrm{hol}}(z_1, \ldots, z_n)$ can be extended to $\mathcal{L}_+(\mathbb{C})$-covariant holomorphic functions on the extended tube
\[
T'_n := \bigcup_{L \in \mathcal{L}_+(\mathbb{C})} L T_n.
\]


Although the tube $\mathcal{T}_n$ does not contain any real points, the extended tube $\mathcal{T}'_n$ does. These points are called Jost points. There is a simple characterization of Jost points, which can be found in section 2.4 of [32] (theorem 2.12) or in section 9.1C of [2] (proposition 9.5), but we will not need it.
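The construction of the covering map $\Phi_{\mathbb{C}}$ used in this subsection can be checked numerically. The sketch below is only an illustration under a stated assumption: it takes $\psi_{\mathbb{C}}(z) = z^0 1 + \sum_j z^j \sigma_j$ with the Pauli matrices $\sigma_j$, which is a common choice satisfying $\det \psi_{\mathbb{C}}(z) = \eta_{\mathbb{C}}(z,z)$, but the formula of subsection 2.1.2 is not reproduced here and may differ. All function names are hypothetical helpers.

```python
def psi(z):
    """Map z in C^4 to a 2x2 matrix with det = eta_C(z, z) (assumed Pauli-matrix form)."""
    z0, z1, z2, z3 = z
    return [[z0 + z3, z1 - 1j * z2], [z1 + 1j * z2, z0 - z3]]


def psi_inverse(m):
    """Recover z from the matrix psi(z)."""
    z0 = (m[0][0] + m[1][1]) / 2
    z3 = (m[0][0] - m[1][1]) / 2
    z1 = (m[0][1] + m[1][0]) / 2
    z2 = (m[1][0] - m[0][1]) / 2j
    return [z0, z1, z2, z3]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def conjugate(a):
    return [[complex(a[i][j]).conjugate() for j in range(2)] for i in range(2)]


def eta_c(z, w):
    """Complex bilinear Minkowski form z^0 w^0 - sum_j z^j w^j."""
    return z[0] * w[0] - z[1] * w[1] - z[2] * w[2] - z[3] * w[3]


def phi_c(A, B, z):
    """Apply Phi_C(A, B): z -> psi^{-1}(A psi(z) B^T)."""
    return psi_inverse(matmul(matmul(A, psi(z)), transpose(B)))


A = [[1, 1j], [0, 1]]      # determinant 1
B = [[2, 0], [1j, 0.5]]    # determinant 1
z = [1 + 2j, 0.5, -1j, 3]
w = phi_c(A, B, z)
print(abs(eta_c(w, w) - eta_c(z, z)))   # metric is preserved up to rounding

# For pairs (A, conj(A)) a real point is mapped to a real point, as for Phi(A).
x = [2.0, 0.3, -1.0, 0.5]
y = phi_c(A, conjugate(A), x)
print(max(abs(complex(c).imag) for c in y))
```

The second check reflects the identity $\overline{A}^T = A^*$: for real $x$ the matrix $A\psi(x)A^*$ is again hermitean, so it corresponds to a real four-vector.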

4.1.4 Important theorems

We will now discuss some of the famous theorems that can be proved for any quantum field theory, such as the spin-statistics theorem and the PCT theorem. To understand the main arguments in the proofs of these theorems, it is useful to know something about polynomial algebras of operators corresponding to open sets in Minkowski spacetime. In a quantum field theory with field $\phi = (\phi_1, \ldots, \phi_N)$ we define for each open set $O \subset \mathcal{M}$ the set $\mathcal{P}(O)$ consisting of all operators of the form
\[
c\,1_{\mathcal{H}} + \sum_{k=0}^{M} \phi_{j_1}(f_{k,1}) \cdots \phi_{j_k}(f_{k,k})
\]
with $c \in \mathbb{C}$, $M \in \mathbb{Z}_{\geq 0}$ and the $f_{k,l}$ (with $1 \leq l \leq k$ and $0 \leq k \leq M$) functions in $\mathcal{S}(\mathcal{M})$ with support contained in $O$. It is clear that $\mathcal{P}(O)$ is a $*$-algebra; it is called the polynomial algebra of $O$. According to the following theorem, the vacuum vector $\Omega \in \mathcal{H}$ is cyclic for any $\mathcal{P}(O)$ with $O \subset \mathcal{M}$ open. We will only give a sketch of the proof, since some of the details require more knowledge of holomorphic functions than is relevant for our purposes; the full proof can be found in section 4.2 of [32].

Theorem 4.14 (Reeh-Schlieder) Given some quantum field theory $(\mathcal{H}, D, U, \phi, \Omega)$, let $O \subset \mathcal{M}$ be a non-empty open set and let $\Psi_c \in \mathcal{H}$ be cyclic for $\mathcal{P}(\mathcal{M})$. Then $\Psi_c$ is also cyclic for $\mathcal{P}(O)$.

Proof sketch
Let $\Psi \in \mathcal{H}$ be a vector which is orthogonal to the set $\{A\Psi_c\}_{A \in \mathcal{P}(O)}$. The first step in the proof consists of defining tempered distributions $F_{i_1^{(*)}\ldots i_n^{(*)}}$ by
\[
F_{i_1^{(*)}\ldots i_n^{(*)}}(-x_1, x_1 - x_2, \ldots, x_{n-1} - x_n) = \langle \phi_{i_1^{(*)}}(x_1) \cdots \phi_{i_n^{(*)}}(x_n)\Psi_c, \Psi\rangle
\]
and by arguing that the inverse Fourier transforms of these distributions vanish unless all of the variables lie in the joint spectrum of the operators $P^\mu$ (which is a subset of $\overline{V}_+$). Then theorem 4.9 is used to define a function $F^{\mathrm{hol}}_{i_1\ldots i_n}$ which is holomorphic in the tube $\mathcal{T}_n$ in the complex variables $(-x_1) - i\gamma_1, (x_1 - x_2) - i\gamma_2, \ldots, (x_{n-1} - x_n) - i\gamma_n$ and which converges to $F_{i_1\ldots i_n}$ as $\gamma_1, \ldots, \gamma_n \to 0$ in $V_+$. By definition of $\Psi$, the supports of the distributions $F_{i_1\ldots i_n}$ lie in the complement of $\{(-x_1, x_1 - x_2, \ldots, x_{n-1} - x_n) \in \mathcal{M}^n : x_1, \ldots, x_n \in O\}$, which in turn implies that $F^{\mathrm{hol}}_{i_1\ldots i_n}$ vanishes on the whole tube $\mathcal{T}_n$ (this is a non-trivial argument from the theory of holomorphic functions in several complex variables). But then the distributions $F_{i_1\ldots i_n}$ vanish on the whole space $\mathcal{M}^n$. From the definition of these distributions it then follows that $\Psi$ is in fact orthogonal to the set $\{A\Psi_c\}_{A \in \mathcal{P}(\mathcal{M})}$. Because $\Psi_c$ was cyclic for $\mathcal{P}(\mathcal{M})$, we must have $\Psi = 0$. This proves that $\Psi_c$ is also cyclic for $\mathcal{P}(O)$.

For any open set $O \subset \mathcal{M}$ we define an open set $O^\vee \subset \mathcal{M}$ by
\[
O^\vee = \{x \in \mathcal{M} : (x - y)^2 < 0 \text{ for all } y \in O\}^{\circ},
\]
where we use the notation $A^{\circ}$ to denote the interior of a set $A$. Note that if $O$ is also bounded, then $O^\vee$ will be non-empty. In that case the following theorem applies.

Theorem 4.15 Given some quantum field theory $(\mathcal{H}, D, U, \phi, \Omega)$, let $O \subset \mathcal{M}$ be a non-empty open set with $O^\vee \neq \emptyset$ and let $A \in \mathcal{P}(O)$ be a monomial with $A\Omega = 0$. Then $A = 0$.


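Proof
Let $\Psi \in D$ and let $T^\vee \in \mathcal{P}(O^\vee)$. Then,
\[
\langle A^*\Psi, T^\vee\Omega\rangle = \langle \Psi, AT^\vee\Omega\rangle = 0,
\]
where in the last step we used that $A$ either commutes or anticommutes with each term in the polynomial $T^\vee$, since $O$ and $O^\vee$ are spacelike separated, so that $AT^\vee\Omega$ is a sum of terms of the form $\pm T'A\Omega$ with $A\Omega = 0$. By the previous theorem, $\{T^\vee\Omega\}_{T^\vee \in \mathcal{P}(O^\vee)}$ is dense in $\mathcal{H}$, so we conclude that $A^*\Psi = 0$ for all $\Psi \in D$. For $\Psi_1, \Psi_2 \in D$ we then have $\langle A\Psi_1, \Psi_2\rangle = \langle \Psi_1, A^*\Psi_2\rangle = 0$. Because $D$ is dense in $\mathcal{H}$, this implies that $A\Psi = 0$ for all $\Psi \in D$.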

The (anti)commutator satisfies $[A^*, B^*]_\pm = (BA \pm AB)^* = \pm[A, B]_\pm^*$, which implies that if the field components $\phi_j$ and $\phi_k$ (anti)commute at spacelike distances then the adjoint components $\phi_j^*$ and $\phi_k^*$ also (anti)commute at spacelike distances. Using the theorem above, we can also show that if the field components $\phi_j$ and $\phi_k$ (anti)commute at spacelike distances then the field components $\phi_j$ and $\phi_k^*$ also (anti)commute at spacelike distances.

Theorem 4.16 (Dell'Antonio). Let $(\mathcal{H}, D, U, (\phi_i)_{i=1}^{N}, \Omega)$ be a quantum field theory and let $j, k \in \{1, \ldots, N\}$. If we have at spacelike distances that
\[
[\phi_j, \phi_k]_\pm = 0,
\]
while
\[
[\phi_j, \phi_k^*]_\mp = 0,
\]
then either $\phi_j$ or $\phi_k$ vanishes.

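Proof
For any non-zero $f, g \in \mathcal{S}(\mathcal{M})$ with spacelike separated supports we have for $\Psi \in D$
\[
\begin{aligned}
\phi_j(f)^*\phi_k(g)^*\phi_k(g)\phi_j(f)\Psi &= \mp\,\phi_k(g)^*\phi_j(f)^*\phi_k(g)\phi_j(f)\Psi \\
&= (\mp)(\pm)\,\phi_k(g)^*\phi_k(g)\phi_j(f)^*\phi_j(f)\Psi \\
&= -\phi_k(g)^*\phi_k(g)\phi_j(f)^*\phi_j(f)\Psi,
\end{aligned}
\]
where the first step uses $[\phi_j^*, \phi_k^*]_\pm = 0$ and the second step uses $[\phi_k, \phi_j^*]_\mp = 0$. Applying this to the vacuum vector $\Omega \in D$ we find the inequality
\[
0 \geq -\|\phi_k(g)\phi_j(f)\Omega\|^2 = -\langle \phi_j(f)^*\phi_k(g)^*\phi_k(g)\phi_j(f)\Omega, \Omega\rangle = \langle \phi_k(g)^*\phi_k(g)\phi_j(f)^*\phi_j(f)\Omega, \Omega\rangle.
\]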

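Suppose now that the supports $K(f), K(g) \subset \mathcal{M}$ of $f$ and $g$ are compact and non-empty (and still spacelike separated). Let $a \in \mathcal{M}$ be a spacelike vector such that the compact set $K_\lambda(g) := K(g) + \lambda a$ remains spacelike separated from $K(f)$ for all $\lambda > 0$, and let $g_\lambda$ be the function $g_\lambda(x) = g(x - \lambda a)$. Then the support of $g_\lambda$ is clearly $K_\lambda(g)$, and for each $\lambda \geq 0$ the inequality above gives
\[
\langle \phi_k(g_\lambda)^*\phi_k(g_\lambda)\phi_j(f)^*\phi_j(f)\Omega, \Omega\rangle \leq 0.
\]
By the cluster decomposition property of Wightman functions, we have
\[
\begin{aligned}
\lim_{\lambda \to \infty} \langle \phi_k(g_\lambda)^*\phi_k(g_\lambda)\phi_j(f)^*\phi_j(f)\Omega, \Omega\rangle &= \langle \phi_k(g)^*\phi_k(g)\Omega, \Omega\rangle \langle \phi_j(f)^*\phi_j(f)\Omega, \Omega\rangle \\
&= \|\phi_k(g)\Omega\|^2 \|\phi_j(f)\Omega\|^2 \\
&\geq 0.
\end{aligned}
\]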

Together, these inequalities imply that $\|\phi_k(g)\Omega\|^2 \|\phi_j(f)\Omega\|^2 = 0$, so either $\phi_j(f)\Omega = 0$ or $\phi_k(g)\Omega = 0$. According to the previous theorem, this in turn implies that either $\phi_j(f) = 0$ or $\phi_k(g) = 0$. We thus conclude that for all $f, g \in \mathcal{S}(\mathcal{M})$ with spacelike separated non-empty compact supports we have either $\phi_j(f) = 0$ or $\phi_k(g) = 0$. Suppose that $\phi_j$ does not vanish. Then there exists a function $h \in \mathcal{S}(\mathcal{M})$ with non-empty compact support $K(h)$ such that $\phi_j(h) \neq 0$. Then for any function $p \in \mathcal{S}(\mathcal{M})$ with compact support $K(p)$ which is spacelike separated from $K(h)$, we have $\phi_k(p) = 0$. By considering different functions $h_i \in \mathcal{S}(\mathcal{M})$ with compact supports $K(h_i) \subset K(h)$ and repeating the same argument, we find that $\phi_k(p) = 0$ for all $p \in \mathcal{S}(\mathcal{M})$ with compact support. Because the set of all such functions $p$ is dense in $\mathcal{S}(\mathcal{M})$, this implies that $\phi_k$ vanishes. Similarly, assuming that $\phi_k$ does not vanish will imply that $\phi_j$ vanishes.

As discussed before, in any quantum field theory we can decompose the field into fields which transform as an irreducible representation of $\tilde{\mathcal{P}}_+^\uparrow$. In the previous chapter we found that irreducible fields which transform according to the $(A,B)$-representation can only describe particles with spin $j \in \{|A - B|, |A - B| + 1, \ldots, A + B - 1, A + B\}$. Therefore, an irreducible field which transforms according to the $(A,B)$-representation will be called a field of integer spin if $A + B$ is an integer and a field of half-odd integer spin if $A + B$ is a half-odd integer.
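As a small illustration of this classification, the following sketch lists the spins contained in an $(A,B)$-representation and checks that the spin multiplets add up to the $(2A+1)(2B+1)$ components of the field. The function names are hypothetical, and exact fractions are used so that half-odd integers are represented without rounding.

```python
from fractions import Fraction


def spin_content(A, B):
    """Spins |A-B|, |A-B|+1, ..., A+B contained in the (A, B)-representation."""
    A, B = Fraction(A), Fraction(B)
    j = abs(A - B)
    spins = []
    while j <= A + B:
        spins.append(j)
        j += 1
    return spins


def number_of_components(A, B):
    """Dimension (2A+1)(2B+1) of the (A, B)-representation."""
    return int((2 * Fraction(A) + 1) * (2 * Fraction(B) + 1))


def is_integer_spin(A, B):
    """True for a field of integer spin, False for half-odd integer spin."""
    return (Fraction(A) + Fraction(B)).denominator == 1


half = Fraction(1, 2)
for A, B in [(0, 0), (half, 0), (half, half), (1, half)]:
    spins = spin_content(A, B)
    # Each spin j contributes 2j+1 states; the total matches the component count.
    total = sum(2 * j + 1 for j in spins)
    print((str(Fraction(A)), str(Fraction(B))), [str(j) for j in spins],
          total == number_of_components(A, B), is_integer_spin(A, B))
```

For instance, the $(\tfrac12,\tfrac12)$-representation (a four-vector field) contains the spins 0 and 1 and counts as a field of integer spin, while $(\tfrac12,0)$ contains only spin $\tfrac12$.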

The following theorem shows that the components of an irreducible field of integer spin must commute with each other at spacelike distances and that the components of an irreducible field of half-odd integer spin must anticommute with each other at spacelike distances.

Theorem 4.17 (Spin-statistics theorem). Let $(\mathcal{H}, D, U, \phi, \Omega)$ be a quantum field theory and let $\phi^{(\kappa)}$ be an irreducible field in the decomposition of $\phi$ into irreducible fields. Suppose that $\phi_j$ is a component of $\phi$ which belongs to $\phi^{(\kappa)}$. Then, if $\phi^{(\kappa)}$ is of integer spin and $\phi_j$ satisfies
\[
[\phi_j(x), \phi_j^*(y)]_+ = 0
\]
for $(x - y)^2 < 0$, or if $\phi^{(\kappa)}$ is of half-odd integer spin and $\phi_j$ satisfies
\[
[\phi_j(x), \phi_j^*(y)]_- = 0
\]
for $(x - y)^2 < 0$, then $\phi_j$ and $\phi_j^*$ vanish.

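Proof sketch
Suppose that $\phi_j$ satisfies one of the two alternatives stated in the theorem. Then
\[
\begin{aligned}
V_{jj^*}(x - y) + (-1)^\varepsilon V_{j^*j}(-(x - y)) &= \langle \phi_j(x)\phi_j^*(y)\Omega, \Omega\rangle + (-1)^\varepsilon \langle \phi_j^*(y)\phi_j(x)\Omega, \Omega\rangle \\
&= \langle (\phi_j(x)\phi_j^*(y) + (-1)^\varepsilon \phi_j^*(y)\phi_j(x))\Omega, \Omega\rangle \\
&= 0,
\end{aligned}
\]
where $\varepsilon = 0$ for integer spin and $\varepsilon = 1$ for half-odd integer spin. This implies that the corresponding holomorphic functions satisfy
\[
V^{\mathrm{hol}}_{jj^*}(\xi) + (-1)^\varepsilon V^{\mathrm{hol}}_{j^*j}(-\xi) = 0 \tag{4.3}
\]
on the tube $\mathcal{T}_1$. It can be shown (see theorem 2.11 of [32]) that there exists a single-valued analytic continuation of the holomorphic functions $V^{\mathrm{hol}}_{jj^*}(\xi)$ and $V^{\mathrm{hol}}_{j^*j}(-\xi)$ to the extended tube$^{35}$ $\mathcal{T}'_1$, and that
\[
V^{\mathrm{hol}}_{j^*j}(\xi) = (-1)^\varepsilon V^{\mathrm{hol}}_{j^*j}(-\xi),
\]
where $\varepsilon$ is as before. Combining this with (4.3) gives
\[
0 = V^{\mathrm{hol}}_{jj^*}(\xi) + (-1)^\varepsilon (-1)^\varepsilon V^{\mathrm{hol}}_{j^*j}(\xi) = V^{\mathrm{hol}}_{jj^*}(\xi) + V^{\mathrm{hol}}_{j^*j}(\xi).
\]
Passing to the boundary, we obtain
\[
\begin{aligned}
0 &= V_{jj^*}(x - y) + V_{j^*j}(x - y) = V_{jj^*}(x - y) + V_{j^*j}(-y - (-x)) \\
&= \langle \phi_j(x)\phi_j^*(y)\Omega, \Omega\rangle + \langle \phi_j^*(-y)\phi_j(-x)\Omega, \Omega\rangle.
\end{aligned} \tag{4.4}
\]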

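35 The extended tube $\mathcal{T}'_1$ is the set of all points $\xi \in \mathbb{C}^4$ of the form $\xi = \Lambda\zeta$ with $\zeta \in \mathcal{T}_1$ and $\Lambda$ a complex Lorentz transformation. A complex Lorentz transformation $\Lambda$ is a $4 \times 4$ complex matrix which satisfies $\Lambda^T \eta \Lambda = \eta$.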


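Now let $f \in \mathcal{S}(\mathcal{M})$ and define $f_-(x) := f(-x)$, then
\[
\begin{aligned}
\|\phi_j(f)^*\Omega\|^2 + \|\phi_j(f_-)\Omega\|^2 &= \langle \phi_j(f)\phi_j(f)^*\Omega, \Omega\rangle + \langle \phi_j(f_-)^*\phi_j(f_-)\Omega, \Omega\rangle \\
&= \int_{\mathcal{M}^2} f(x)\overline{f(y)}\,\big(\langle \phi_j(x)\phi_j^*(y)\Omega, \Omega\rangle + \langle \phi_j^*(-y)\phi_j(-x)\Omega, \Omega\rangle\big)\, dx\, dy \\
&= 0.
\end{aligned}
\]
This implies both $\phi_j(f)^*\Omega = 0$ and $\phi_j(f_-)\Omega = 0$. Thus, because $f \in \mathcal{S}(\mathcal{M})$ was arbitrary, we have $\phi_j^*(x)\Omega = 0$ and $\phi_j(x)\Omega = 0$. Theorem 4.15 then implies that for all $f \in \mathcal{S}(\mathcal{M})$ with compact support we have $\phi_j^*(f) = 0$ and $\phi_j(f) = 0$, which in turn implies that $\phi_j^*$ and $\phi_j$ vanish.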

Together with our assumption that the components of an irreducible field either all commute or else all anticommute with each other at spacelike distances, this theorem implies that at spacelike distances the components of an irreducible field all commute with each other if the irreducible field is of integer spin and anticommute with each other if the irreducible field is of half-odd integer spin. This is the famous connection between spin and statistics: if we identify commuting fields with bosons and anticommuting fields with fermions, then bosons are described by fields of integer spin and fermions by fields of half-odd integer spin.

The theorem gives no information about whether we should choose a commutator or anticommutator when we are dealing with components belonging to two different irreducible fields, so there is some freedom here. We say that in a quantum field theory we have a normal connection between spin and statistics if every component of a boson field commutes with any other field component in the theory and if two components of (different) fermion fields always anticommute with each other. It can be shown that in any quantum field theory in which there is no normal connection between spin and statistics, there is always a transformation of the fields, called a Klein transformation, which transforms the fields into new fields with a normal connection between spin and statistics. Therefore, we may as well assume from now on that all quantum field theories have a normal connection between spin and statistics. Then the following theorem applies.

Theorem 4.18 (PCT-theorem). Let $(\mathcal{H}, D, U, \phi, \Omega)$ be a quantum field theory with normal connection between spin and statistics and let $\phi = (\phi^{(\kappa_1)}_{j_1}, \ldots, \phi^{(\kappa_\ell)}_{j_\ell})$ be a decomposition of the field into irreducible fields $\phi^{(\kappa_j)}$ transforming according to the $(A_j, B_j)$-representation of $SL(2,\mathbb{C})$. Then there exists a unique anti-unitary operator $\Theta$ on $\mathcal{H}$ which leaves the vacuum vector $\Omega$ invariant and satisfies
\[
\Theta\,\phi^{(\kappa_j)}(x)\,\Theta^{-1} = (-1)^{2A_j}\, i^{\varepsilon_j}\, \phi^{(\kappa_j)*}(-x),
\]
where $\varepsilon_j = 0$ if $A_j + B_j$ is an integer and $\varepsilon_j = 1$ if $A_j + B_j$ is a half-odd integer.

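Proof sketch
In the first part of the proof it is shown that in any quantum field theory with normal connection between spin and statistics we have
\[
\langle \phi_{j_1}(x_1) \cdots \phi_{j_k}(x_k)\Omega, \Omega\rangle = i^{\sum_{l=1}^{k} \varepsilon_{j_l}} (-1)^{2\sum_{l=1}^{k} A_{j_l}} \langle \phi_{j_1}^*(-x_1) \cdots \phi_{j_k}^*(-x_k)\Omega, \Omega\rangle, \tag{4.5}
\]
where $\varepsilon_{j_l} = 0$ (or $\varepsilon_{j_l} = 1$) if $\phi_{j_l}$ is a component of a boson (or fermion) field transforming according to the $(A_{j_l}, B_{j_l})$-representation of $SL(2,\mathbb{C})$. For functions $f_l \in \mathcal{S}(\mathcal{M})$ this means that
\[
\langle \phi_{j_1}(f_1) \cdots \phi_{j_k}(f_k)\Omega, \Omega\rangle = i^{\sum_{l=1}^{k} \varepsilon_{j_l}} (-1)^{2\sum_{l=1}^{k} A_{j_l}} \langle \phi_{j_1}(\tilde{f}_1)^* \cdots \phi_{j_k}(\tilde{f}_k)^*\Omega, \Omega\rangle,
\]
where $\tilde{f}(x) := \overline{f(-x)}$, so that $\tilde{\tilde{f}} = f$. From this it easily follows that
\[
\begin{aligned}
\langle \phi_{j_1}(\tilde{f}_1)^* \cdots \phi_{j_k}(\tilde{f}_k)^*\Omega, \Omega\rangle &= \langle \Omega, \phi_{j_k}(\tilde{f}_k) \cdots \phi_{j_1}(\tilde{f}_1)\Omega\rangle = \overline{\langle \phi_{j_k}(\tilde{f}_k) \cdots \phi_{j_1}(\tilde{f}_1)\Omega, \Omega\rangle} \\
&= \overline{i^{\sum_{l=1}^{k} \varepsilon_{j_l}} (-1)^{2\sum_{l=1}^{k} A_{j_l}} \langle \phi_{j_k}(f_k)^* \cdots \phi_{j_1}(f_1)^*\Omega, \Omega\rangle} \\
&= (-i)^{\sum_{l=1}^{k} \varepsilon_{j_l}} (-1)^{2\sum_{l=1}^{k} A_{j_l}}\, \overline{\langle \phi_{j_k}(f_k)^* \cdots \phi_{j_1}(f_1)^*\Omega, \Omega\rangle} \\
&= (-i)^{\sum_{l=1}^{k} \varepsilon_{j_l}} (-1)^{2\sum_{l=1}^{k} A_{j_l}} \langle \phi_{j_1}(f_1) \cdots \phi_{j_k}(f_k)\Omega, \Omega\rangle.
\end{aligned}
\]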


So just like (4.5) we also get
\[
\langle \phi_{j_1}^*(x_1) \cdots \phi_{j_k}^*(x_k)\Omega, \Omega\rangle = (-i)^{\sum_{l=1}^{k} \varepsilon_{j_l}} (-1)^{2\sum_{l=1}^{k} A_{j_l}} \langle \phi_{j_1}(-x_1) \cdots \phi_{j_k}(-x_k)\Omega, \Omega\rangle. \tag{4.6}
\]
Of course we can also make combinations of (4.5) and (4.6). The difference is that replacing a field component by its adjoint gives a factor $i^{\varepsilon_j}(-1)^{2A_j}$, while replacing an adjoint field component by the corresponding field component gives a factor $(-i)^{\varepsilon_j}(-1)^{2A_j}$. The next step in the proof consists of showing that the antilinear extension of
\[
\Theta\,\phi_{j_1}(f_1) \cdots \phi_{j_k}(f_k)\Omega := (-i)^{\sum_{l=1}^{k} \varepsilon_{j_l}} (-1)^{2\sum_{l=1}^{k} A_{j_l}}\, \phi_{j_1}^*(\tilde{f}_1) \cdots \phi_{j_k}^*(\tilde{f}_k)\Omega,
\]
with $\tilde{f}_l(x) = \overline{f_l(-x)}$, defines the anti-unitary operator with the desired properties. In showing this, the identities above are used to derive the anti-unitarity.

In physics it is often convenient to consider quantum fields at a given time, for example when one wants to study equal-time commutation relations for the fields. According to the Wightman axioms, however, the fields are operator-valued distributions on Minkowski spacetime and therefore only the smeared fields $\phi_j(f)$ for $f \in \mathcal{S}(\mathcal{M})$ define operators on the Hilbert space. In other words, the Wightman fields must be smeared out both in time and space. Suppose now that in addition to the Wightman axioms, we also assume that the fields $\phi_j$ define for each $t$ and each $f \in \mathcal{S}(\mathbb{R}^3)$ a well-defined operator $\phi_j(t, f)$ on the dense set $D \subset \mathcal{H}$ such that for all $u \in \mathcal{S}(\mathbb{R})$ we have
\[
\phi_j(fu) = \int_{\mathbb{R}} \phi_j(t, f)\,u(t)\,dt,
\]
where $fu \in \mathcal{S}(\mathcal{M})$ is the function defined by $fu(t, \mathbf{x}) = f(\mathbf{x})u(t)$. Then the fields $\phi_j$ can also be considered as operator-valued distributions on $\mathcal{S}(\mathbb{R}^3)$ depending on a parameter $t$. To prevent bad $t$-dependence, we assume that for each $f \in \mathcal{S}(\mathbb{R}^3)$ and each $\Psi \in D$ the norm of the vector $\phi_j(t, f)\Psi$ is a bounded function of $|t|$. When a quantum field theory satisfies these properties we will simply say that it satisfies the sharp-time axiom.

For the following theorem we need the definition of the Euclidean group in three dimensions. The group $E_+(3)$ of proper Euclidean motions in $\mathbb{R}^3$ is generated by translations and rotations in $\mathbb{R}^3$. Its universal covering group $\tilde{E}_+(3)$ is therefore a semi-direct product of $\mathbb{R}^3$ and $SU(2)$ (compare this with $\tilde{\mathcal{P}}_+^\uparrow$, which was a semi-direct product of $\mathbb{R}^4$ and $SL(2,\mathbb{C})$) and the multiplication law is given by $(\mathbf{a}_1, R_1)(\mathbf{a}_2, R_2) = (\mathbf{a}_1 + R_1\mathbf{a}_2, R_1R_2)$.
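The semi-direct product structure can be made concrete with a short numerical sketch. For simplicity it works with $3 \times 3$ rotation matrices, i.e. with $E_+(3)$ itself rather than its covering group (in the covering group $R \in SU(2)$ acts on $\mathbb{R}^3$ through the corresponding rotation). The helper names are hypothetical.

```python
import math


def rot_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def mat_vec(R, v):
    return [sum(R[i][k] * v[k] for k in range(3)) for i in range(3)]


def mat_mat(R1, R2):
    return [[sum(R1[i][k] * R2[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def compose(g1, g2):
    """Multiplication law (a1, R1)(a2, R2) = (a1 + R1 a2, R1 R2)."""
    (a1, R1), (a2, R2) = g1, g2
    shifted = mat_vec(R1, a2)
    return ([a1[i] + shifted[i] for i in range(3)], mat_mat(R1, R2))


def act(g, x):
    """Action of (a, R) on a point x: x -> a + R x."""
    a, R = g
    rotated = mat_vec(R, x)
    return [a[i] + rotated[i] for i in range(3)]


def close(u, v, tol=1e-12):
    return all(abs(p - q) < tol for p, q in zip(u, v))


g1 = ([1.0, 0.0, 2.0], rot_z(0.7))
g2 = ([0.5, -1.0, 0.0], rot_x(1.3))
g3 = ([0.0, 3.0, 1.0], rot_z(-0.4))
x = [0.2, 0.4, -0.6]

# The product of group elements acts as the composition of the two motions.
print(close(act(compose(g1, g2), x), act(g1, act(g2, x))))
# Associativity, checked through the action on a sample point.
left = compose(compose(g1, g2), g3)
right = compose(g1, compose(g2, g3))
print(close(act(left, x), act(right, x)))
```

The first check shows why the translation part picks up the factor $R_1$: applying $(\mathbf{a}_2, R_2)$ first and $(\mathbf{a}_1, R_1)$ second sends $\mathbf{x}$ to $\mathbf{a}_1 + R_1(\mathbf{a}_2 + R_2\mathbf{x})$.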

Theorem 4.19 Let $\{(\phi_1)_j(t, \cdot)\}_{j=1}^{n}$ and $\{(\phi_2)_j(t, \cdot)\}_{j=1}^{n}$ be two sets of operator-valued distributions on $\mathcal{S}(\mathbb{R}^3)$, depending on a parameter $t$, that act on Hilbert spaces $\mathcal{H}_1$ and $\mathcal{H}_2$, respectively, and assume that the operator-valued distributions at any time $t$ form an irreducible set of operators$^{36}$. Suppose further for $i = 1, 2$ that on $\mathcal{H}_i$ there are defined unitary representations $U_i$ of $\tilde{E}_+(3)$ such that
\[
U_i(\mathbf{a}, R)\,(\phi_i)_j(t, f)\,U_i(\mathbf{a}, R)^{-1} = \sum_{k=1}^{n} S(R^{-1})_{jk}\,(\phi_i)_k(t, f_{(\mathbf{a},R)})
\]
for all $f \in \mathcal{S}(\mathbb{R}^3)$, where $S$ is a matrix representation of $SU(2)$. Finally suppose that for some $t'$ there exists a unitary operator $V : \mathcal{H}_1 \to \mathcal{H}_2$ such that
\[
(\phi_2)_j(t', \cdot) = V(\phi_1)_j(t', \cdot)V^{-1}.
\]
Then
(a) the representations $U_1$ and $U_2$ are unitarily equivalent:
\[
U_2(\mathbf{a}, R) = VU_1(\mathbf{a}, R)V^{-1};
\]
(b) if there exists in $\mathcal{H}_1$ a unique (up to a phase) normalized vector $\Omega_1$ that is invariant under $U_1$, then there also exists in $\mathcal{H}_2$ a unique (up to a phase) normalized vector, namely $\Omega_2 = V\Omega_1$, that is invariant under $U_2$.

³⁶ See the end of subsection 4.1.2 for the definition of an irreducible set of operators.


Proof
For t = t′ we have for f ∈ S(R³)

\[
\begin{aligned}
U_2(a,R)\,V(\phi_1)_j(t',f)V^{-1}\,U_2(a,R)^{-1} &= U_2(a,R)\,(\phi_2)_j(t',f)\,U_2(a,R)^{-1} \\
&= \sum_{k=1}^{n} S(R^{-1})_{jk}\,(\phi_2)_k(t', f_{(a,R)}) \\
&= V\Big(\sum_{k=1}^{n} S(R^{-1})_{jk}\,(\phi_1)_k(t', f_{(a,R)})\Big)V^{-1} \\
&= V\,U_1(a,R)\,(\phi_1)_j(t',f)\,U_1(a,R)^{-1}\,V^{-1},
\end{aligned}
\]

which is equivalent to

\[ (\phi_1)_j(t',f)\,V^{-1}U_2(a,R)^{-1} = V^{-1}U_2(a,R)^{-1}\,V\,U_1(a,R)\,(\phi_1)_j(t',f)\,U_1(a,R)^{-1}V^{-1}, \]

which in turn is equivalent to

\[ (\phi_1)_j(t',f)\,V^{-1}U_2(a,R)^{-1}V\,U_1(a,R) = V^{-1}U_2(a,R)^{-1}V\,U_1(a,R)\,(\phi_1)_j(t',f). \]

Thus, the operator V⁻¹U2(a, R)⁻¹V U1(a, R) on H1 commutes with all the (φ1)j(t′, f) and, by irreducibility, is therefore a (nonzero) multiple of the identity operator:

\[ V^{-1}U_2(a,R)^{-1}V\,U_1(a,R) = \omega(a,R)^{-1}\,1_{H_1} \]

or

\[ U_2(a,R) = \omega(a,R)\,V\,U_1(a,R)\,V^{-1}, \]

where ω(a, R) is a complex number depending on (a, R). Now for any T1 = (a1, R1) and T2 = (a2, R2) we have

\[
\begin{aligned}
\omega(T_1T_2)\,V U_1(T_1T_2) V^{-1} &= U_2(T_1T_2) = U_2(T_1)U_2(T_2) \\
&= \omega(T_1)\omega(T_2)\,V U_1(T_1)V^{-1}\,V U_1(T_2)V^{-1} \\
&= \omega(T_1)\omega(T_2)\,V U_1(T_1T_2)V^{-1},
\end{aligned}
\]

so ω is in fact a one-dimensional representation of Ẽ+(3) and hence ω(a, R) ≡ 1. This proves part (a). Part (b) follows directly from the unitary equivalence of U1 and U2.

We will now state (a generalization of) Haag’s theorem. For simplicity, we will only state it for scalar fields.

Theorem 4.20 (Generalized Haag’s theorem) Let (H1, D1, U1, φ1, Ω1) and (H2, D2, U2, φ2, Ω2) be two scalar quantum field theories which satisfy the sharp-time axiom and the fields of which have well-defined time-derivatives at each time t. For i = 1, 2, suppose that for each t the fields φi(t, .) and ∂tφi(t, .) together form an irreducible set of fields on Hi. Suppose also that for some instant t′ there exists a unitary operator V : H1 → H2 such that

\[ \phi_2(t',\cdot) = V\phi_1(t',\cdot)V^{-1}, \qquad \partial_t\phi_2(t',\cdot) = V\,\partial_t\phi_1(t',\cdot)\,V^{-1}. \]

Then
(a) the first four Wightman functions are the same in both quantum field theories;
(b) if φ1 is a free field of mass m ≥ 0, then φ2 is also a free field of mass m and both theories are unitarily equivalent.

Part (b) of the theorem is the original theorem of Haag, and its truth follows from the first part because the two-point Wightman functions in a free scalar field theory completely determine the other Wightman functions. A proof of this theorem can be found in [2], theorem 9.28.


4.1.5 Example: The free hermitean scalar field

Let H = L²(R³, d³p/2p⁰) be the one-particle state space for a spinless particle with mass m ≥ 0 that is equal to its own antiparticle. We denote the corresponding Fock space by F+(H) and on this space we define the creation and annihilation operators A∗(Ψ) := A∗+(Ψ) and A(Ψ) := A+(Ψ) for each Ψ ∈ H. On the Fock space we also have a unitary representation UFock of P̃↑+ with corresponding energy-momentum operators Pµ that satisfy the positive energy condition in axiom 0 and we also have a vacuum vector Ω that is the unique unit vector (up to a phase) in F+(H) that is invariant under UFock. We will now construct a hermitean scalar field in F+(H) that satisfies the remaining Wightman axioms.

For each Schwartz function f ∈ S(M) let f̂ denote its Fourier transform

\[ \hat f(p) = \frac{1}{(2\pi)^2}\int_M f(x)\,e^{ip\cdot x}\,d^4x. \]

Because f̂ is again a Schwartz function, its restriction to the orbit O+m is an element of L²(R³, d³p/2p⁰) = H. We can thus define a map R : S(M) → H by

\[ R(f) = \hat f\,\big|_{O_m^+}. \]

Explicitly,

\[ (Rf)(\mathbf{p}) = \frac{1}{(2\pi)^2}\int_M f(x)\,e^{i(\omega_{\mathbf p}x^0-\mathbf{p}\cdot\mathbf{x})}\,d^4x. \tag{4.7} \]

Because R(f) ∈ H, the operators A∗(Rf) and A(Rf) are well-defined and we can use them to define for each real-valued f ∈ S(M) the operators

\[ \phi(f) = \sqrt{2\pi}\,\big(A(Rf) + A^*(Rf)\big). \]

For complex-valued f = f1 + if2 ∈ S(M), with f1 and f2 the real and imaginary parts of f, we define φ(f) := φ(f1) + iφ(f2). The reason for not defining φ(f) by the same formula as for real-valued functions is that fields should depend linearly on the Schwartz function f (recall that annihilation operators A(Ψ) depend anti-linearly on Ψ). Because for each Ψ ∈ H the operators A(∗)(Ψ) are defined on the dense subspace D+ ⊂ F+(H) (which was defined in subsection 2.2.5), the operators φ(f) are defined on the dense subspace D+ for any f ∈ S(M). Also, because the A(∗)(Ψ) all leave D+ invariant, the operators φ(f) also leave D+ invariant. Furthermore, for any Ψ1, Ψ2 ∈ D+ the map S(M) → C, given by

\[ f \mapsto \langle \phi(f)\Psi_1, \Psi_2\rangle, \]

is a tempered distribution. Thus, f ↦ φ(f) is an operator-valued distribution and each such φ(f) is defined on the dense subspace D+ ⊂ F+(H) and leaves this subspace invariant. For each f the adjoint φ(f)∗ is defined on D+, so axiom 1 is satisfied. From the transformation properties of the creation and annihilation operators under P̃↑+ (as derived in subsection 2.2.5), it follows that φ transforms as

\[ U(a,A)\,\phi(f)\,U(a,A)^{-1} = \phi(f_{(a,A)}), \]

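so axiom 2 is also satisfied. For real-valued f, g ∈ S(M) we have

\[
\begin{aligned}
[A(Rf), A^*(Rg)] &= \langle Rg, Rf\rangle = \int_{\mathbb{R}^3} (Rg)(\mathbf{p})\,\overline{(Rf)(\mathbf{p})}\,\frac{d^3p}{2\omega_{\mathbf p}} \\
&= \frac{1}{(2\pi)^4}\int_{\mathbb{R}^3}\left[\int_M e^{i(\omega_{\mathbf p}y^0-\mathbf{p}\cdot\mathbf{y})}g(y)\,d^4y\right]\overline{\left[\int_M e^{i(\omega_{\mathbf p}x^0-\mathbf{p}\cdot\mathbf{x})}f(x)\,d^4x\right]}\,\frac{d^3p}{2\omega_{\mathbf p}} \\
&= \frac{1}{(2\pi)^4}\int_M\int_M\left[\int_{\mathbb{R}^3} e^{-i(\omega_{\mathbf p}(x-y)^0-\mathbf{p}\cdot(\mathbf{x}-\mathbf{y}))}\,\frac{d^3p}{2\omega_{\mathbf p}}\right]f(x)g(y)\,d^4x\,d^4y,
\end{aligned}
\]

so for real-valued f, g ∈ S(M) we have

\[
\begin{aligned}
[\phi(f), \phi(g)] &= 2\pi\big([A(Rf), A^*(Rg)] + [A^*(Rf), A(Rg)]\big) \\
&= 2\pi\big([A(Rf), A^*(Rg)] - [A(Rf), A^*(Rg)]^*\big) \\
&= -\frac{2i}{(2\pi)^3}\int_M\int_M\left[\int_{\mathbb{R}^3}\sin\big(\omega_{\mathbf p}(x-y)^0-\mathbf{p}\cdot(\mathbf{x}-\mathbf{y})\big)\,\frac{d^3p}{2\omega_{\mathbf p}}\right]f(x)g(y)\,d^4x\,d^4y.
\end{aligned}
\]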


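For complex-valued f = f1 + if2 and g = g1 + ig2 we then find

\[
\begin{aligned}
[\phi(f), \phi(g)] &= [\phi(f_1) + i\phi(f_2),\ \phi(g_1) + i\phi(g_2)] \\
&= -\frac{2i}{(2\pi)^3}\int_M\int_M\left[\int_{\mathbb{R}^3}\sin\big(\omega_{\mathbf p}(x-y)^0-\mathbf{p}\cdot(\mathbf{x}-\mathbf{y})\big)\,\frac{d^3p}{2\omega_{\mathbf p}}\right] \\
&\qquad\qquad \times\big[f_1(x)g_1(y) - f_2(x)g_2(y) + if_1(x)g_2(y) + if_2(x)g_1(y)\big]\,d^4x\,d^4y \\
&= -\frac{2i}{(2\pi)^3}\int_M\int_M\left[\int_{\mathbb{R}^3}\sin\big(\omega_{\mathbf p}(x-y)^0-\mathbf{p}\cdot(\mathbf{x}-\mathbf{y})\big)\,\frac{d^3p}{2\omega_{\mathbf p}}\right]f(x)g(y)\,d^4x\,d^4y.
\end{aligned}
\]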

The integral between the square brackets is a distribution in the variable x − y and vanishes at points where x − y is spacelike. Therefore, if the supports of f and g are mutually spacelike separated then [φ(f), φ(g)] = 0. So axiom 3 is also satisfied. It can also be shown that the Fock vacuum vector Ω is cyclic for the field operators φ(f), so axiom 4 is also satisfied. Thus all Wightman axioms are satisfied; see section 8.4 of [2] for more details. Note that for the 2-point Wightman function we have

\[
\begin{aligned}
\langle \phi(f)\phi(g)\Omega, \Omega\rangle &= 2\pi\langle A(Rf)A^*(Rg)\Omega, \Omega\rangle = 2\pi\langle [A(Rf), A^*(Rg)]\Omega, \Omega\rangle \\
&= \frac{1}{(2\pi)^3}\int_M\int_M\left[\int_{\mathbb{R}^3} e^{-i(\omega_{\mathbf p}(x-y)^0-\mathbf{p}\cdot(\mathbf{x}-\mathbf{y}))}\,\frac{d^3p}{2\omega_{\mathbf p}}\right]f(x)g(y)\,d^4x\,d^4y,
\end{aligned}
\]

or

\[ W(x,y) = \frac{1}{(2\pi)^3}\int_{\mathbb{R}^3} e^{-i(\omega_{\mathbf p}(x-y)^0-\mathbf{p}\cdot(\mathbf{x}-\mathbf{y}))}\,\frac{d^3p}{2\omega_{\mathbf p}}. \tag{4.8} \]

For odd n the n-point Wightman functions are zero and for even n one can express the n-point function in terms of the (n − 2)-point function and the 2-point function, and hence the 2-point function determines the other n-point functions. This was also mentioned briefly when we discussed Haag’s theorem, but we will not prove it here; these statements about the n-point functions can (for example) be found in section 8.4 of [2] or in section 3.3 of [32].

For any Schwartz function f ∈ S(M) the Fourier transform [(∂² + m²)f]∧ of (∂² + m²)f is given by

\[
\begin{aligned}
[(\partial^2+m^2)f]^\wedge(p) &= \frac{1}{(2\pi)^2}\int_M e^{ip\cdot x}\,(\partial^2+m^2)f(x)\,d^4x \\
&= \frac{1}{(2\pi)^2}\int_M \underbrace{[\partial^2 e^{ip\cdot x}]}_{=\,-p^2e^{ip\cdot x}}\,f(x)\,d^4x + m^2\hat f(p) \\
&= (m^2-p^2)\,\hat f(p),
\end{aligned}
\]

where in the second step we integrated by parts twice. So the restriction of this Fourier transform to O+m = {p ∈ M : p² = m², p⁰ > 0} is identically zero; in other words, R((∂² + m²)f) ≡ 0. This implies that the field φ satisfies the Klein-Gordon equation:

\[ [(\partial^2+m^2)\phi](f) = \phi\big((\partial^2+m^2)f\big) = 0 \]

for any f ∈ S(M).

We will now write the field φ in terms of the creation and annihilation operators a(∗) defined on the Fock space F+(H). As indicated in subsection 2.2.5, the physical equivalent of A(∗)(Ψ) is a(∗)(JΨ), with J : H ∋ Ψ ↦ (2ωp)^(−1/2) Ψ ∈ H. In the present case this means that we should define the map r : S(M) → H by rf = JRf, so

\[ (rf)(\mathbf{p}) = \frac{1}{(2\pi)^2\sqrt{2\omega_{\mathbf p}}}\int_M f(x)\,e^{i(\omega_{\mathbf p}x^0-\mathbf{p}\cdot\mathbf{x})}\,d^4x. \]


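On F+(H) the field φ(f) for real-valued f ∈ S(M) is now given by

\[
\begin{aligned}
\phi(f) &= \sqrt{2\pi}\,\big(a^*(JRf) + a(JRf)\big) = \sqrt{2\pi}\,\big(a^*(rf) + a(rf)\big) \\
&= \sqrt{2\pi}\int_{\mathbb{R}^3} d^3p\,\Big[(rf)(\mathbf{p})\,a^*(\mathbf{p}) + \overline{(rf)(\mathbf{p})}\,a(\mathbf{p})\Big] \\
&= (2\pi)^{-3/2}\int_{\mathbb{R}^3}\frac{d^3p}{\sqrt{2\omega_{\mathbf p}}}\left[\Big(\int_M f(x)\,e^{i(\omega_{\mathbf p}x^0-\mathbf{p}\cdot\mathbf{x})}\,d^4x\Big)a^*(\mathbf{p}) + \Big(\int_M f(x)\,e^{-i(\omega_{\mathbf p}x^0-\mathbf{p}\cdot\mathbf{x})}\,d^4x\Big)a(\mathbf{p})\right] \\
&= \int_M (2\pi)^{-3/2}\int_{\mathbb{R}^3}\frac{d^3p}{\sqrt{2\omega_{\mathbf p}}}\Big[e^{-i(\omega_{\mathbf p}x^0-\mathbf{p}\cdot\mathbf{x})}a(\mathbf{p}) + e^{i(\omega_{\mathbf p}x^0-\mathbf{p}\cdot\mathbf{x})}a^*(\mathbf{p})\Big]f(x)\,d^4x.
\end{aligned}
\]

For this reason we also write

\[ \phi(x) = (2\pi)^{-3/2}\int_{\mathbb{R}^3}\frac{d^3p}{\sqrt{2\omega_{\mathbf p}}}\Big[e^{-i(\omega_{\mathbf p}x^0-\mathbf{p}\cdot\mathbf{x})}a(\mathbf{p}) + e^{i(\omega_{\mathbf p}x^0-\mathbf{p}\cdot\mathbf{x})}a^*(\mathbf{p})\Big]. \]

As stated in the previous chapter, the a∗(p) and a(p) are not well-defined operators on the Fock space, but since we are smearing them out this is no problem. However, as we will show when we discuss the (λφ⁴)₂-model in the next chapter, it is not true that a∗(p) and a(p) cannot be given any mathematical meaning without smearing them out.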

We will now define the notion of the field φ at a fixed moment in time, on the Fock space F+(H). Analogous to (4.7) we define for each t ∈ R a map Rt : S(R³) → H by

\[ (R_tf)(\mathbf{p}) = (2\pi)^{-3/2}\int_{\mathbb{R}^3} f(\mathbf{x})\,e^{i(\omega_{\mathbf p}t-\mathbf{p}\cdot\mathbf{x})}\,d^3x = (2\pi)^{-3/2}\,e^{i\omega_{\mathbf p}t}\int_{\mathbb{R}^3} f(\mathbf{x})\,e^{-i\mathbf{p}\cdot\mathbf{x}}\,d^3x. \]

Then, for each t ∈ R and each real-valued f ∈ S(R³) we can define an operator

\[ \phi_t(f) = A^*(R_tf) + A(R_tf) \]

on the Fock space. We then extend φt to complex-valued functions f = f1 + if2 by defining φt(f) = φt(f1) + iφt(f2). The operators φt(f) are defined on D+ and it can be shown that the map t ↦ 〈φt(f)Ψ1, Ψ2〉 is smooth for any f ∈ S(R³) and Ψ1, Ψ2 ∈ D+. We will now investigate the relationship between φt and φ. For each Schwartz function u ∈ S(R) on R we find that for any f ∈ S(R³)

\[
\begin{aligned}
\int_{\mathbb{R}} (R_tf)(\mathbf{p})\,u(t)\,dt &= (2\pi)^{-3/2}\int_{\mathbb{R}}\left[\int_{\mathbb{R}^3} f(\mathbf{x})\,e^{ip^0t}\,e^{-i\mathbf{p}\cdot\mathbf{x}}\,d^3x\right]u(t)\,dt \\
&= (2\pi)^{-3/2}\int_M f(\mathbf{x})\,u(x^0)\,e^{ip\cdot x}\,d^4x \\
&= \sqrt{2\pi}\,[R(f\cdot u)](\mathbf{p}),
\end{aligned}
\]

where p⁰ = ωp. For real-valued f ∈ S(R³) and real-valued u ∈ S(R), this implies that

\[ \int_{\mathbb{R}} dt\,u(t)\,A^{(*)}(R_tf) = \sqrt{2\pi}\,A^{(*)}(R(f\cdot u)), \]

and therefore for real-valued f ∈ S(R³) and real-valued u ∈ S(R) we find that

\[ \int_{\mathbb{R}} dt\,u(t)\,\phi_t(f) = \phi(f\cdot u). \]

This can then be extended by linearity to complex-valued f and u. This establishes the relationship between φt and φ: the operator-valued distribution φt on S(R³) is nothing else than the operator-valued distribution φ on S(M) at fixed time t. For the time-derivative of the field at time t we find

\[ \partial_t\phi_t(f) = A(\partial_tR_tf)^* + A(\partial_tR_tf) = A(i\omega_{\mathbf p}R_tf)^* + A(i\omega_{\mathbf p}R_tf). \]


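So in order to obtain the commutators of the φt and their derivatives, we must compute commutators of the form [A(h1 · Rtf), A(h2 · Rtg)∗], where the hj = hj(p) are either identically one, or else iωp. For real-valued f and g we find for these commutators

\[
\begin{aligned}
[A(h_1\cdot R_tf), A(h_2\cdot R_tg)^*] &= \langle h_2\cdot R_tg,\ h_1\cdot R_tf\rangle \\
&= \frac{1}{(2\pi)^3}\int_{\mathbb{R}^3}\frac{d^3p}{2\omega_{\mathbf p}}\left[h_2(\mathbf{p})\,e^{i\omega_{\mathbf p}t}\int_{\mathbb{R}^3} g(\mathbf{y})\,e^{-i\mathbf{p}\cdot\mathbf{y}}\,d^3y\right]\overline{\left[h_1(\mathbf{p})\,e^{i\omega_{\mathbf p}t}\int_{\mathbb{R}^3} f(\mathbf{x})\,e^{-i\mathbf{p}\cdot\mathbf{x}}\,d^3x\right]} \\
&= \frac{1}{(2\pi)^3}\int_{\mathbb{R}^3}\int_{\mathbb{R}^3}\left[\int_{\mathbb{R}^3}\frac{\overline{h_1(\mathbf{p})}\,h_2(\mathbf{p})}{2\omega_{\mathbf p}}\,e^{i\mathbf{p}\cdot(\mathbf{x}-\mathbf{y})}\,d^3p\right]f(\mathbf{x})g(\mathbf{y})\,d^3x\,d^3y.
\end{aligned}
\]

The commutator of φt with itself now follows from choosing h1 = h2 ≡ 1,

\[
\begin{aligned}
[\phi_t(f), \phi_t(g)] &= [A(R_tf), A^*(R_tg)] - [A(R_tf), A^*(R_tg)]^* \\
&= \frac{2i}{(2\pi)^3}\int_{\mathbb{R}^3}\int_{\mathbb{R}^3}\left[\int_{\mathbb{R}^3}\frac{\sin[\mathbf{p}\cdot(\mathbf{x}-\mathbf{y})]}{2\omega_{\mathbf p}}\,d^3p\right]f(\mathbf{x})g(\mathbf{y})\,d^3x\,d^3y \\
&= 0,
\end{aligned}
\]

because the integrand of the p-integral is odd in p. The commutator of ∂tφt with itself follows from choosing h1 = h2 = iωp,

\[
\begin{aligned}
[\partial_t\phi_t(f), \partial_t\phi_t(g)] &= [A(i\omega_{\mathbf p}R_tf), A^*(i\omega_{\mathbf p}R_tg)] - [A(i\omega_{\mathbf p}R_tf), A^*(i\omega_{\mathbf p}R_tg)]^* \\
&= \frac{2i}{(2\pi)^3}\int_{\mathbb{R}^3}\int_{\mathbb{R}^3}\left[\int_{\mathbb{R}^3}\frac{\omega_{\mathbf p}\sin[\mathbf{p}\cdot(\mathbf{x}-\mathbf{y})]}{2}\,d^3p\right]f(\mathbf{x})g(\mathbf{y})\,d^3x\,d^3y \\
&= 0.
\end{aligned}
\]

Finally, the commutator of φt with ∂tφt is obtained by taking h1 ≡ 1 and h2 = iωp,

\[
\begin{aligned}
[\phi_t(f), \partial_t\phi_t(g)] &= [A(R_tf), A^*(i\omega_{\mathbf p}R_tg)] - [A(R_tf), A^*(i\omega_{\mathbf p}R_tg)]^* \\
&= \frac{i}{(2\pi)^3}\int_{\mathbb{R}^3}\int_{\mathbb{R}^3}\left[\int_{\mathbb{R}^3}\cos[\mathbf{p}\cdot(\mathbf{x}-\mathbf{y})]\,d^3p\right]f(\mathbf{x})g(\mathbf{y})\,d^3x\,d^3y \\
&= i\int_{\mathbb{R}^3}\int_{\mathbb{R}^3}\delta(\mathbf{x}-\mathbf{y})\,f(\mathbf{x})g(\mathbf{y})\,d^3x\,d^3y.
\end{aligned}
\]

We have thus found the commutation relations

\[
\begin{aligned}
[\phi_t(\mathbf{x}), \phi_t(\mathbf{y})] &= 0 = [\partial_t\phi_t(\mathbf{x}), \partial_t\phi_t(\mathbf{y})], \\
[\phi_t(\mathbf{x}), \partial_t\phi_t(\mathbf{y})] &= i\delta(\mathbf{x}-\mathbf{y}).
\end{aligned}
\]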
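The two facts used in these equal-time commutators (an odd momentum integrand integrates to zero, and the cosine integral acts as a delta function) can be illustrated numerically. The following is a minimal sketch of a one-dimensional analogue, assuming a sharp momentum cutoff, a mass m = 1 and the Gaussian test function exp(−y²); the function names, the cutoff and the grid sizes are illustrative choices, not part of the text.

```python
import math

def odd_momentum_integral(d, m=1.0, cutoff=20.0, n=4000):
    """Midpoint-rule value of the integral of sin(p*d) / (2*omega_p)
    over the symmetric interval [-cutoff, cutoff], omega_p = sqrt(p^2 + m^2)."""
    h = 2.0 * cutoff / n
    total = 0.0
    for k in range(n):
        p = -cutoff + (k + 0.5) * h
        total += math.sin(p * d) / (2.0 * math.sqrt(p * p + m * m)) * h
    return total

def cutoff_delta(d, cutoff):
    """(1/2pi) * integral of cos(p*d) over [-cutoff, cutoff], in closed form."""
    if d == 0.0:
        return cutoff / math.pi
    return math.sin(cutoff * d) / (math.pi * d)

def smear_against_gaussian(cutoff, half_width=8.0, n=20000):
    """Integrate the cutoff delta kernel against g(y) = exp(-y^2) at x = 0."""
    h = 2.0 * half_width / n
    total = 0.0
    for k in range(n):
        y = -half_width + (k + 0.5) * h
        total += cutoff_delta(y, cutoff) * math.exp(-y * y) * h
    return total

# Analogue of [phi_t, phi_t]: the integrand is odd in p, so this is ~0.
odd_part = odd_momentum_integral(d=0.37)
# Analogue of [phi_t, d_t phi_t]: the kernel reproduces g(0) = 1 for large cutoff.
delta_part = smear_against_gaussian(cutoff=10.0)
```

In this sketch `odd_part` should vanish up to rounding errors, while `delta_part` should be close to g(0) = 1, in line with the kernel approaching δ(x − y) as the cutoff is removed.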

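As we did for the field φ, we can also define the field φt on F+(H). This is done by using the map rt : S(R³) → H which is given by

\[ (r_tf)(\mathbf{p}) = \frac{1}{(2\pi)^{3/2}\sqrt{2\omega_{\mathbf p}}}\,e^{i\omega_{\mathbf p}t}\int_{\mathbb{R}^3} f(\mathbf{x})\,e^{-i\mathbf{p}\cdot\mathbf{x}}\,d^3x. \]

The field φt on F+(H) is then defined by

\[ \phi_t(f) = a^*(r_tf) + a(r_tf) \]

for real-valued f ∈ S(R³). The result is

\[ \phi_t(f) = \int_{\mathbb{R}^3}(2\pi)^{-3/2}\int_{\mathbb{R}^3}\frac{d^3p}{\sqrt{2\omega_{\mathbf p}}}\Big[e^{-i(\omega_{\mathbf p}t-\mathbf{p}\cdot\mathbf{x})}a(\mathbf{p}) + e^{i(\omega_{\mathbf p}t-\mathbf{p}\cdot\mathbf{x})}a^*(\mathbf{p})\Big]f(\mathbf{x})\,d^3x. \]

This suggests that we should write

\[ \phi_t(\mathbf{x}) = (2\pi)^{-3/2}\int_{\mathbb{R}^3}\frac{d^3p}{\sqrt{2\omega_{\mathbf p}}}\Big[e^{-i(\omega_{\mathbf p}t-\mathbf{p}\cdot\mathbf{x})}a(\mathbf{p}) + e^{i(\omega_{\mathbf p}t-\mathbf{p}\cdot\mathbf{x})}a^*(\mathbf{p})\Big]. \]

The right-hand side is the same as φ(x) with x⁰ = t, which reflects the fact that φt is precisely the field φ at time t.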


4.1.6 Haag-Ruelle scattering theory

In the previous chapter we showed that quantum fields arise quite naturally in the (perturbative) calculations in scattering theory. Before we introduced the perturbation theory, we mentioned that there must be some embeddings Ωin and Ωout from the Fock space HFock (describing free particles) into the physical Hilbert space H corresponding to the scattering experiment. We will now show that under some additional conditions, a quantum field theory (i.e. a theory satisfying the Wightman axioms) gives rise to such embeddings³⁷ and can therefore describe scattering experiments. The additional conditions are the following.

Haag-Ruelle axiom 1
The joint spectrum of the operators Pµ lies in the set {0} ∪ V+µ, where V+µ = {p ∈ M : p · p > µ and p⁰ ≥ 0}.

Haag-Ruelle axiom 2
The Hilbert space H contains countably many mutually orthogonal subspaces {H[τ]}τ∈T (so-called one-particle subspaces for particles of type τ) which transform according to irreducible unitary representations (mτ, sτ) of P̃↑+ and are taken into H[τC] under PCT-transformations. Furthermore, for each particle type τ ∈ T in the theory there exists an operator Aτ ∈ P(M) in the polynomial algebra such that AτΩ is a nonzero vector in H[τ] and A∗τΩ ∈ H[τC].

Given the subspaces {H[τ]}τ∈T, the operators Aτ are called the solutions of the quantum field problem of one-particle states. Sometimes the one-particle problem has a simple solution; for instance, this is the case if the mass m > 0 of the particle is an isolated point of the spectrum of the mass operator.

In a theory satisfying the Wightman axioms and the Haag-Ruelle axioms we define for each particle type τ ∈ T the linear span B[τ] of all operators of the form U(a, L)AτU(a, L)⁻¹ with (a, L) ∈ P̃↑+. We then define the space A[τ] := B[τ] + (B[τC])∗. Then A[τ] is a linear subspace of P(M) that is taken to A[τC] under hermitean conjugation, is invariant with respect to restricted Poincaré transformations and is such that D[τ] := A[τ]Ω is dense in H[τ]. Thus, for each particle type τ ∈ T essentially all one-particle states Ψ ∈ H[τ] can be constructed by letting an operator in the polynomial algebra act on the vacuum vector, and when the adjoint of this operator acts on the vacuum vector then this gives a one-particle state of the corresponding antiparticle τC.

For each operator A ∈ A[τ] we define a family of operators {At}t∈R by

\[ A^t = \int_{x^0=t}\Big(A(x)\,\partial_{x^0}D_{m_\tau}(x) - D_{m_\tau}(x)\,\partial_{x^0}A(x)\Big)\,d^3x, \]

where A(x) := U(x, 1)AU(x, 1)⁻¹ and

\[ D_m(x) := 2\pi i\int_M \varepsilon(p^0)\,\delta(p^2-m^2)\,e^{-ip\cdot x}\,\frac{d^4p}{(2\pi)^4}. \]

An important property of the family {At}t∈R is that each element At acts in the same way on the vacuum as A,

\[ A^t\Omega = A\Omega. \tag{4.9} \]

The first part of the main result of Haag-Ruelle theory is that for Aj ∈ A[τj] with j = 1, . . . , n the limits of At1 . . . AtnΩ for t → ∓∞ exist in H. These limits are denoted by Ψin and Ψout:

\[
\begin{aligned}
\Psi^{\mathrm{in}}(A_1,\ldots,A_n) &= \lim_{t\to-\infty} A_1^t\cdots A_n^t\,\Omega, \\
\Psi^{\mathrm{out}}(A_1,\ldots,A_n) &= \lim_{t\to\infty} A_1^t\cdots A_n^t\,\Omega.
\end{aligned}
\]

To understand the second part of the result, let HFock be the Fock space describing a free system of particles of types τ ∈ T. We will identify the vacuum vector ΩFock ∈ HFock with the vacuum vector Ω ∈ H and the one-particle states in HFock with the one-particle states in H. Then for each

³⁷ We will not provide the proofs of the relevant theorems here, since very detailed versions can be found in chapter 12 of [2]. Another good source for Haag-Ruelle theory is [20], sections II.3 and II.4.


A ∈ A[τ] we can interpret AΩ and A∗Ω as one-particle states in the Fock space HFock, describing a free particle of type τ and of type τC, respectively. Using the creation and annihilation operators defined in subsection 2.2.5, we define the operator

\[ \hat{A} := A^*_{\tau}(A\Omega) + A^*_{\tau^C}(A^*\Omega) \]

on the Fock space. For each n ∈ N and each n-tuple (A1, . . . , An) with Aj ∈ A[τj] we then define a Fock space vector ΨFock(A1, . . . , An) by

\[ \Psi^{\mathrm{Fock}}(A_1,\ldots,A_n) = \hat{A}_1\cdots\hat{A}_n\,\Omega^{\mathrm{Fock}}, \]

and the closed linear span of all such vectors is the entire Fock space. A nice property of ΨFock is that it satisfies

\[ U^{\mathrm{Fock}}(a,L)\,\Psi^{\mathrm{Fock}}(A_1,\ldots,A_n) = \Psi^{\mathrm{Fock}}\big(U(a,L)A_1U(a,L)^{-1},\ldots,U(a,L)A_nU(a,L)^{-1}\big), \]

where (a, L) ∈ P̃↑+ and UFock is the representation of P̃↑+ on HFock. The second part of the main result of Haag-Ruelle theory now states that there exist two linear isometries Ωin, Ωout : HFock → H satisfying

\[ \Omega^{\mathrm{in/out}}\big(\Psi^{\mathrm{Fock}}(A_1,\ldots,A_n)\big) = \Psi^{\mathrm{in/out}}(A_1,\ldots,A_n), \]

with Aj ∈ A[τj]. Furthermore, this property determines Ωin and Ωout uniquely and these maps are Poincaré invariant, i.e.

\[ U(a,L)\,\Omega^{\mathrm{in/out}} = \Omega^{\mathrm{in/out}}\,U^{\mathrm{Fock}}(a,L) \]

for all (a, L) ∈ P̃↑+, which in turn implies that the S-operator S = (Ωout)∗Ωin is Poincaré invariant:

\[
\begin{aligned}
U^{\mathrm{Fock}}(a,L)\,S\,U^{\mathrm{Fock}}(a,L)^{-1} &= U^{\mathrm{Fock}}(a,L)\,(\Omega^{\mathrm{out}})^*\Omega^{\mathrm{in}}\,U^{\mathrm{Fock}}(a,L)^{-1} \\
&= U^{\mathrm{Fock}}(a,L)\Big(U^{\mathrm{Fock}}(a,L)^{-1}(\Omega^{\mathrm{out}})^*U(a,L)\Big)\Big(U(a,L)^{-1}\Omega^{\mathrm{in}}U^{\mathrm{Fock}}(a,L)\Big)U^{\mathrm{Fock}}(a,L)^{-1} \\
&= S.
\end{aligned}
\tag{4.10}
\]

From (4.9) it follows that Ψin(A) = Ψout(A) for any A ∈ ⋃τ∈T A[τ], and thus that ΩinΨFock(A) = ΩoutΨFock(A) for any such A. This implies that the S-operator satisfies

\[ S\,\Psi^{\mathrm{Fock}}(A) = (\Omega^{\mathrm{out}})^*\Omega^{\mathrm{in}}\Psi^{\mathrm{Fock}}(A) = (\Omega^{\mathrm{out}})^*\Omega^{\mathrm{out}}\Psi^{\mathrm{Fock}}(A) = \Psi^{\mathrm{Fock}}(A), \]

where in the last step we used that Ωout is an isometry. Thus, the S-operator leaves one-particle states invariant.

4.2 The Haag-Kastler formulation of quantum field theory

In this section we discuss the Haag-Kastler axioms as an alternative to the Wightman axioms. The Haag-Kastler framework is often called algebraic quantum field theory, because it makes use of abstract C∗-algebras, rather than concrete operators on a Hilbert space.

4.2.1 The algebraic approach to quantum theory

To discuss the Haag-Kastler formulation of quantum field theory, we first need to reformulate the quantum theory that we discussed in section 2.2. In that section we assumed that the states and observables of a quantum system are given in terms of a concrete Hilbert space. In particular, the algebra of observables was B(H) and the states were given in terms of density matrices on H. In the algebraic approach to quantum theory, which we will introduce now³⁸, the algebra of observables corresponding to a quantum system is given as an abstract unital C∗-algebra U, the hermitian elements of which are called bounded observables. The set of states of this C∗-algebra,

³⁸ Good sources for the algebraic approach are chapter 6 of [2] and chapter 2 of [7].


i.e. the normalized positive linear functionals on U, is denoted by S(U), but this set will be too large for physical purposes and we will therefore define the smaller set of physical states below. In the meantime, it will be convenient to introduce some terminology concerning the set S(U). The transition probability ω1 · ω2 between two pure states ω1, ω2 ∈ PS(U) is defined as

\[ \omega_1\cdot\omega_2 = 1 - \tfrac{1}{4}\,\|\omega_1-\omega_2\|^2, \]

where ‖.‖ denotes the operator norm on S(U). Because 0 ≤ ‖ω1 − ω2‖ ≤ ‖ω1‖ + ‖ω2‖ = 2, it is clear that ω1 · ω2 ∈ [0, 1] and it follows from the positive-definiteness of ‖.‖ that ω1 · ω2 = 1 if and only if ω1 = ω2. When ω1 · ω2 = 0, we say that the states ω1 and ω2 are orthogonal, and two subsets S1, S2 ⊂ PS(U) of pure states are called mutually orthogonal if ω1 · ω2 = 0 for all ω1 ∈ S1 and ω2 ∈ S2. A non-empty subset S ⊂ PS(U) is called indecomposable if it cannot be written as the disjoint union of two non-empty mutually orthogonal subsets. Using this definition, we define a relation ∼ on PS(U) as follows: ω1 ∼ ω2 if and only if there exists an indecomposable set S ⊂ PS(U) with ω1, ω2 ∈ S.
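To see that this definition reproduces the familiar quantum-mechanical transition probability, one can look at the simplest case. The following is a minimal sketch, assuming that U is the algebra of complex 2×2 matrices, so that every pure state is a vector state and the norm of ω1 − ω2 equals the trace norm of the difference of the two rank-one projections; the function names are illustrative.

```python
import math

def normalize(psi):
    norm = math.sqrt(sum(abs(c) ** 2 for c in psi))
    return [complex(c) / norm for c in psi]

def projector(psi):
    """Rank-one projection |psi><psi| as a 2x2 complex matrix."""
    return [[psi[i] * psi[j].conjugate() for j in range(2)] for i in range(2)]

def transition_probability(psi1, psi2):
    """1 - (1/4) * ||omega_1 - omega_2||^2 for the two vector states on M_2(C)."""
    P1, P2 = projector(psi1), projector(psi2)
    D = [[P1[i][j] - P2[i][j] for j in range(2)] for i in range(2)]
    det = D[0][0] * D[1][1] - D[0][1] * D[1][0]
    # D is Hermitian and traceless, so its eigenvalues are +/- sqrt(-det)
    # and its trace norm is 2 * sqrt(-det).
    eigenvalue = math.sqrt(max(0.0, -det.real))
    norm_of_difference = 2.0 * eigenvalue
    return 1.0 - 0.25 * norm_of_difference ** 2

def overlap_squared(psi1, psi2):
    """|<psi1, psi2>|^2, the usual transition probability."""
    return abs(sum(a.conjugate() * b for a, b in zip(psi1, psi2))) ** 2

psi1 = normalize([1.0, 1.0j])
psi2 = normalize([1.0, 0.5])
agreement = abs(transition_probability(psi1, psi2) - overlap_squared(psi1, psi2))
```

In this setting `agreement` should vanish up to rounding, i.e. ω1 · ω2 = |〈Ψ1, Ψ2〉|², and orthogonal vectors give orthogonal states in the sense defined above.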

Proposition 4.21 The relation ∼ is an equivalence relation on PS(U).

Proof
By considering the indecomposable set {ω}, it is clear that ω ∼ ω (reflexivity) for all ω ∈ PS(U). Because the definition of ω1 ∼ ω2 is manifestly symmetric in ω1 and ω2, it is also clear that ω1 ∼ ω2 ⇒ ω2 ∼ ω1 (symmetry) for all ω1, ω2 ∈ PS(U). To prove transitivity, assume that ω1 ∼ ω2 and ω2 ∼ ω3. Then there exist indecomposable sets S1, S2 ⊂ PS(U) with ω1, ω2 ∈ S1 and ω2, ω3 ∈ S2. If the union S := S1 ∪ S2 were not indecomposable, there would exist two disjoint non-empty mutually orthogonal subsets S′, S′′ ⊂ PS(U) with S = S′ ∪ S′′, and hence we could write Sj = (Sj ∩ S′) ∪ (Sj ∩ S′′) for j = 1, 2. Note that either ω2 ∈ S′ or ω2 ∈ S′′. Assuming that ω2 ∈ S′ would immediately lead to Sj ∩ S′′ = ∅ for j = 1, 2 (since the Sj are indecomposable), and thus also to S′′ = ∅. Similarly, assuming that ω2 ∈ S′′ would lead to S′ = ∅. This contradiction shows that S must in fact be indecomposable. Because ω1, ω2, ω3 ∈ S, this implies that ω1 ∼ ω3, and thus that ∼ is indeed an equivalence relation.

Now consider an equivalence class C ⊂ PS(U) under ∼. We will show that C is indecomposable. If this were not true then there would be disjoint mutually orthogonal non-empty sets C1, C2 ⊂ PS(U) with C = C1 ∪ C2. Now if ω1, ω2 ∈ C with ωj ∈ Cj, then (since ω1 ∼ ω2) there exists an indecomposable set S ⊂ PS(U) with ω1, ω2 ∈ S. Because all elements of S are equivalent under ∼, we must have S ⊂ C. But then S decomposes into disjoint mutually orthogonal non-empty subsets S ∩ C1 and S ∩ C2, contradicting the indecomposability of S. Thus we conclude that the equivalence classes are indecomposable subsets of PS(U). Now suppose that for some equivalence class C we have an indecomposable set C′ with C ⊂ C′. Then all elements in C′ are equivalent under ∼ and hence we also have C′ ⊂ C, which implies that C′ = C. This shows that the equivalence classes are maximal indecomposable subsets of PS(U); we will call these sets sectors. Furthermore, note that if ω1, ω2 ∈ PS(U) with ω1 · ω2 ≠ 0, then the set {ω1, ω2} is indecomposable and hence ω1 ∼ ω2. This shows that the different sectors must be mutually orthogonal.

General facts about representations
In order to physically describe a system with abstract algebra of observables U, we choose an appropriate representation π : U → B(H) of the algebra of observables in some Hilbert space H. In this context we call π the physical representation and H the physical Hilbert space. If the system has finitely many coordinates and momenta that must satisfy the canonical commutation relations, the choice of representation is uniquely determined (up to unitary equivalence) by the Stone-Von Neumann theorem. However, if the system has infinitely many degrees of freedom, as in quantum field theory, the Stone-Von Neumann theorem is no longer applicable and in such cases there are many unitarily inequivalent representations of the canonical commutation relations. Therefore, for


such systems the physical representation π should be chosen carefully, depending on the particular dynamics of the system at hand³⁹; for instance, the Fock representation cannot be used for interacting fields. Without loss of generality, we may always assume that the physical representation π is faithful, i.e. that π is injective. The reason for this is as follows. Suppose that it would have been possible to physically describe a quantum system by using a non-faithful representation π of the algebra of observables U. Then π induces a faithful representation of the quotient C∗-algebra U/ker(π), and we could just as well have started with this quotient algebra (as the algebra of observables) from the beginning.

Given the physical representation π : U → B(H) we define for each unit vector Ψ ∈ H a state on the C∗-algebra U by

\[ U \ni A \mapsto \langle \pi(A)\Psi, \Psi\rangle =: \rho_\Psi(A). \tag{4.11} \]

We call this state the vector state associated with π corresponding to the vector Ψ ∈ H. If π is irreducible then this always defines a pure state in S(U) and in that case the set {ρΨ}Ψ∈H of all vector states associated with π coincides precisely with a sector C in PS(U). If π′ is another irreducible representation of U whose vector states correspond to some sector C′, then C′ = C if and only if π′ is unitarily equivalent to π. Also, for each sector C ⊂ PS(U) there exists an irreducible representation π : U → B(H) such that C = {ρΨ}Ψ∈H, so we conclude that the sectors of PS(U) are in one-to-one correspondence with the irreducible representations of U modulo unitary equivalence. The proof of these facts can be found in section 6.1 (proposition 6.2) of [2].

As stated at the beginning of this subsection, the space S(U) is unnecessarily large for physical purposes. For a physical representation π : U → B(H), we define the set of physical states to be the set of all states in S(U) of the form

\[ \rho(A) = \mathrm{Tr}(\rho\,\pi(A)), \qquad A \in U, \tag{4.12} \]

with ρ a density operator on H. To emphasize that this set of physical states depends on the representation π, we will denote it by Sπ. In general, Sπ is a proper subset of the set of all states S(U). Note that we can now characterize a quantum system by the pair (U, π), instead of by the pair (H, A) as we did in subsection 2.2.2. By the same reasoning as in subsection 2.2.2, we find that the vector state ρΨ defined in (4.11) is obtained as a special case of (4.12) by taking ρ to be the one-dimensional projection onto CΨ. Also, as in subsection 2.2.2, any ρ ∈ Sπ can be written as a countable convex combination of vector states. Again this shows that any pure state in Sπ must be a vector state. However, because in general π(U) is not equal to B(H), the converse is not necessarily true. To illustrate this, suppose that the physical Hilbert space is a direct sum H = H1 ⊕ H2 of Hilbert spaces and that π(U) = B(H1) ⊕ B(H2). Let Ψi ∈ Hi for i = 1, 2 be two unit vectors that define vector states ρΨi which are different from each other, and define the unit vector Ψ := (Ψ1 + Ψ2)/√2 ∈ H. Because π(A)Ψi ∈ Hi, and hence π(A)Ψi ⊥ Ψj for (i, j) ∈ {(1, 2), (2, 1)}, for every A ∈ U, the vector state defined by Ψ satisfies

\[
\begin{aligned}
\rho_\Psi(A) &= \tfrac{1}{2}\langle \pi(A)(\Psi_1+\Psi_2), \Psi_1+\Psi_2\rangle = \tfrac{1}{2}\langle \pi(A)\Psi_1, \Psi_1\rangle + \tfrac{1}{2}\langle \pi(A)\Psi_2, \Psi_2\rangle \\
&= \tfrac{1}{2}\rho_{\Psi_1}(A) + \tfrac{1}{2}\rho_{\Psi_2}(A),
\end{aligned}
\]

which shows that the vector state ρΨ is a convex combination of two different states and is therefore not pure. We note furthermore that although each state in Sπ is a countable convex combination of vector states, it is not necessarily true that each state is a countable convex combination of pure states.
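The example of a non-pure vector state can be made concrete in finite dimensions. The following is a minimal sketch, assuming H1 = C² and H2 = C, so that π(U) consists of the block-diagonal 3×3 matrices; the specific vectors and matrices are arbitrary illustrative choices.

```python
import math

def apply(A, psi):
    """Matrix-vector product A psi."""
    return [sum(A[i][k] * psi[k] for k in range(len(psi))) for i in range(len(A))]

def inner(u, v):
    """Inner product, linear in the first argument as in (4.11)."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def vector_state(A, psi):
    """rho_psi(A) = <A psi, psi>."""
    return inner(apply(A, psi), psi)

psi1 = [0.6, 0.8j, 0.0]   # unit vector in H1 = C^2 (first two components)
psi2 = [0.0, 0.0, 1.0]    # unit vector in H2 = C (last component)
psi = [(a + b) / math.sqrt(2.0) for a, b in zip(psi1, psi2)]

# An element of B(H1) + B(H2): block diagonal, so it belongs to pi(U).
A_block = [[1.0, 2.0 - 1.0j, 0.0],
           [2.0 + 1.0j, -3.0, 0.0],
           [0.0, 0.0, 5.0]]
# An operator mixing H1 and H2: it does not belong to pi(U).
A_mixing = [[0.0, 0.0, 1.0],
            [0.0, 0.0, 0.0],
            [1.0, 0.0, 0.0]]

def mixture_gap(A):
    """|rho_psi(A) - (rho_psi1(A) + rho_psi2(A)) / 2|."""
    return abs(vector_state(A, psi)
               - 0.5 * vector_state(A, psi1)
               - 0.5 * vector_state(A, psi2))

gap_on_algebra = mixture_gap(A_block)
gap_off_algebra = mixture_gap(A_mixing)
```

Here `gap_on_algebra` should vanish up to rounding, so on π(U) the vector state of Ψ cannot be distinguished from the equal mixture of ρΨ1 and ρΨ2, whereas `gap_off_algebra` is nonzero: only an operator outside π(U) could detect the relative phase between Ψ1 and Ψ2.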

We now introduce some terminology concerning representations. Two representations π1 : U → B(H1) and π2 : U → B(H2) are called phenomenologically equivalent if Sπ1 = Sπ2. Note that two unitarily equivalent representations are in particular phenomenologically equivalent. A

³⁹ In contrast to the case of finitely many degrees of freedom, where the chosen representation only depends on the number of degrees of freedom (i.e. H ≃ L²(R^N) for N degrees of freedom) and not on the specific dynamics of the system.


representation π : U → B(H) is called factorial of type I if π is a direct sum of a (possibly infinite) number of copies of some irreducible representation π̂ : U → B(Ĥ), so H = Ĥ^⊕K and π = π̂^⊕K. Two factorial representations of type I are called disjoint if they are multiples of irreducible representations that are unitarily inequivalent. Without proof we mention that a representation π is phenomenologically equivalent to some irreducible representation π̂ if and only if π is a direct sum of copies of π̂ (and is thus factorial of type I).

Once we have chosen the physical representation π, we can consider the closure of the algebra π(U) ⊂ B(H) in the σ-weak topology. By Von Neumann’s bicommutant theorem, this closure is a Von Neumann algebra and it is equal to the bicommutant π(U)′′; it is called the Von Neumann algebra of observables of the quantum system. To be able to consider observables which are represented by unbounded operators, we proceed as follows. We say that a (possibly unbounded) self-adjoint operator A on H is affiliated to the Von Neumann algebra of observables π(U)′′ if all spectral projection operators EA(∆), with ∆ a Borel set in R, belong to π(U)′′. The set of observables of the system is then defined to be the set of all self-adjoint operators on H which are affiliated to π(U)′′.

Superselection rules
For a quantum system (U, π) the elements in the commutant π(U)′ are called superselection operators and a set of operators in B(H) that generates π(U)′ is referred to as the superselection rules of the system. Of course, we always have C1H ⊂ π(U)′ and in case the inclusion is strict we say that the system has non-trivial superselection rules.

If H is a Hilbert space and V ⊂ H is a subset of non-zero vectors in H, then V is called a linked system of vectors if V cannot be written as a disjoint union of two non-empty mutually orthogonal subsets. In particular, for any linear subspace V ⊂ H the set of unit vectors in V forms a linked system. Now let W ⊂ H be a set of nonzero vectors that is total in H, i.e. the closed linear span of W is equal to H. We then define a relation ∼ on W as follows: Ψ1 ∼ Ψ2 if and only if there exists a linked system L ⊂ W with Ψ1, Ψ2 ∈ L. By using similar arguments as for indecomposable sets of PS(U) (see above), we find that ∼ defines an equivalence relation on W. The equivalence classes give rise to a partitioning of W into mutually orthogonal maximal linked systems {Wν}ν∈N, where the index set N may be uncountable. For each ν ∈ N we define a subspace Hν ⊂ H as the closed linear span of Wν. Because W is total in H, we then have

\[ H = \bigoplus_{\nu\in N} H_\nu. \]
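For a finite set W the maximal linked systems can be computed explicitly, because a set of vectors is linked precisely when the graph that joins non-orthogonal vectors is connected. The following is a minimal sketch based on that observation, using a union-find structure; the function names and the example vectors in C³ are illustrative assumptions.

```python
import math

def inner(u, v):
    """Inner product, linear in the first argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def maximal_linked_systems(vectors, tol=1e-12):
    """Partition the indices of `vectors` into maximal linked systems.

    Two vectors end up in the same class iff they are connected by a chain
    of pairwise non-orthogonal vectors (connected components of the
    non-orthogonality graph)."""
    parent = list(range(len(vectors)))

    def find(i):
        # Find the representative of i, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if abs(inner(vectors[i], vectors[j])) > tol:
                parent[find(i)] = find(j)

    classes = {}
    for i in range(len(vectors)):
        classes.setdefault(find(i), []).append(i)
    return sorted(classes.values())

s = 1.0 / math.sqrt(2.0)
W = [
    [1.0, 0.0, 0.0],  # e1
    [s, s, 0.0],      # (e1 + e2) / sqrt(2), links e1 and e2
    [0.0, 1.0, 0.0],  # e2, orthogonal to e1 but in the same linked system
    [0.0, 0.0, 1.0],  # e3, orthogonal to everything else
]
classes = maximal_linked_systems(W)
```

In this example `classes` groups the first three vectors together and leaves the last one on its own, which corresponds to the decomposition C³ = span{e1, e2} ⊕ span{e3}. Note that e1 and e2 are orthogonal and are linked only through the intermediate vector.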

So we conclude that if we have a total subset W in a Hilbert space H, then H decomposes into a direct sum of non-zero subspaces Hν such that Wν = W ∩ Hν. Note that although N might be uncountable, the direct sum is still discrete in the sense that the measure on the index set N is discrete, in contrast to the general case of a direct integral when the set N is equipped with a more general measure and in which case we would really have to write a direct integral ∫⊕ instead of ⊕.

We will now apply this in the following way. Let π : U → B(H) be a representation of the C∗-algebra U and suppose that the set P ⊂ H of all vectors in H that define pure states on U forms a total subset of H. Then according to the discussion above we can decompose H into a direct sum H = ⊕ν∈N Hν of non-zero subspaces Hν with Pν = Hν ∩ P, where {Pν}ν∈N are the maximal linked systems in P as above. Now fix some ν0 ∈ N and choose a Ψ0 ∈ Pν0. If A ∈ U with π(A)Ψ0 ≠ 0, then it can be shown⁴⁰ that the state on U defined by the unit vector π(A)Ψ0/‖π(A)Ψ0‖ is pure, so the unit vectors of π(U)Ψ0 form a subset of P. Because π(U)Ψ0 is a linear subspace, the unit vectors in π(U)Ψ0 form a linked system. Thus the set of unit vectors of π(U)Ψ0 is a subset of Pν1 for some ν1 ∈ N. But Ψ0 ∈ Pν0 ∩ Pν1, so we must in fact have ν1 = ν0 and thus the unit vectors of π(U)Ψ0 lie in Pν0. Since Ψ0 ∈ Pν0 was arbitrary, this implies that π(U)Pν0 ⊂ Hν0; since ν0 ∈ N was arbitrary, we then have π(U)Pν ⊂ Hν for all ν ∈ N. Because for each ν the set Pν is

40See exercise 6.10 of [2].

98

Page 100: Axiomatic and constructive quantum eld theory Thesis for ... · Quantum eld theory (QFT) is the physical theory that emerged when physicists tried to construct a quantum theory that

total in Hν , this in turn implies that π(U) leaves all the subspaces Hν invariant. Without proof41

we mention furthermore that the subrepresentation of π(U) on the subspaces Hν are all factorialof type I and are pairwise disjoint. We can thus write H as a double direct sum

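\[ H = \bigoplus_{\nu \in N} H_\nu = \bigoplus_{\nu \in N} \left( H_\nu^{\oplus M_\nu} \right) \tag{4.13} \]

and we can write π as

\[ \pi = \bigoplus_{\nu \in N} \pi_\nu = \bigoplus_{\nu \in N} \left( \pi_\nu^{\oplus M_\nu} \right), \tag{4.14} \]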

where πν : U → B(Hν) are irreducible representations. This decomposition into a (discrete) direct sum of irreducible representations was possible because of the assumption that P is total in H; this assumption about P is therefore called the hypothesis of discrete superselection rules. For representations satisfying this hypothesis the following proposition holds, which can be found in section 6.2 (proposition 6.6) of [2].

Proposition 4.22 Let π : U → B(H) be a representation of a C∗-algebra U, let P ⊂ H be the set of all vectors that define pure states and suppose that π satisfies the hypothesis of discrete superselection rules. Then the following statements are equivalent:
(1) The elements of PSπ are in one-to-one correspondence with the elements of P.
(2) The representations πν : U → B(Hν) in the decompositions (4.13) and (4.14) are irreducible (i.e. Mν = 1 for all ν ∈ N).
(3) P = {Ψ ∈ ⋃ν∈N Hν : ‖Ψ‖ = 1}.
(4) The commutant π(U)′ of π(U) is abelian.

Note that if (2) is satisfied, the representation is still phenomenologically equivalent to a representation where some (or all) of the Mν are larger than 1 (including the case where some Mν is infinite). So demanding that the physical representation π satisfies (2) does not restrict the possibilities for the state space Sπ, but it has the benefit that it simplifies the representation π. For this reason it is often assumed that a system (U, π) that satisfies the hypothesis of discrete superselection rules also satisfies the equivalent statements in the proposition above. Because of (4), this assumption is called the hypothesis of commutative (discrete) superselection rules. So for a system (U, π) that satisfies the hypothesis of commutative discrete superselection rules the representation decomposes into a direct sum of unitarily inequivalent irreducible representations of U. As stated earlier, all unit vectors in an irreducible representation define pure states on U, so in each of the spaces Hν in the direct sum we have the unrestricted superposition principle, i.e. the superposition of two pure states again defines a pure state. On the entire space H we then have the following restricted version of the superposition principle: a normalized linear combination of two vectors defining pure states again defines a pure state if the two vectors belong to the same space Hν. For this reason the subspaces Hν are called coherent subspaces of H. For a system (U, π) with commutative discrete superselection rules, the commutant is given by

\[ \pi(U)' = \bigoplus_{\nu \in N} \pi_\nu(U)' = \bigoplus_{\nu \in N} \mathbb{C} 1_\nu , \]

where the first equality already holds if only the hypothesis of discrete superselection rules is satisfied. The second equality follows from the irreducibility of the πν. The Von Neumann algebra of observables is now clearly

\[ \pi(U)'' = \bigoplus_{\nu \in N} B(H_\nu). \]
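The restricted superposition principle can be made explicit with a short computation. The following is a sketch under the hypothesis of commutative discrete superselection rules; the labels ν ≠ μ, the unit vectors Ψ_ν ∈ H_ν and Ψ_μ ∈ H_μ, and the coefficients a, b are introduced here for illustration. Since π(A) leaves the mutually orthogonal subspaces H_ν and H_μ invariant, the cross terms vanish:

```latex
% Assumed for illustration: \nu \neq \mu, unit vectors \Psi_\nu \in H_\nu, \Psi_\mu \in H_\mu,
% and a, b \in \mathbb{C} with |a|^2 + |b|^2 = 1.
\begin{align*}
\Psi &= a\Psi_\nu + b\Psi_\mu, \\
\langle \pi(A)\Psi, \Psi\rangle
  &= |a|^2 \langle \pi_\nu(A)\Psi_\nu, \Psi_\nu\rangle
   + |b|^2 \langle \pi_\mu(A)\Psi_\mu, \Psi_\mu\rangle
   + a\bar{b}\,\underbrace{\langle \pi(A)\Psi_\nu, \Psi_\mu\rangle}_{=\,0}
   + \bar{a}b\,\underbrace{\langle \pi(A)\Psi_\mu, \Psi_\nu\rangle}_{=\,0} \\
  &= |a|^2 \rho_{\Psi_\nu}(A) + |b|^2 \rho_{\Psi_\mu}(A).
\end{align*}
```

So for ab ≠ 0 the vector Ψ defines a proper convex combination of two pure states from different coherent subspaces, i.e. a mixed state, and the relative phase of a and b cannot be observed.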

⁴¹ See proposition 6.5 of [2] for a proof.

Symmetries in the algebraic approach

We will now discuss symmetries in the algebraic approach. The definition of a symmetry of the quantum system (H,A) that was given in 2.2.3 can be restated for a quantum system (U, π) in the algebraic approach. A symmetry of a quantum system (U, π) is defined to be a pair (s, s′) of bijections s : Us-a → Us-a and s′ : Sπ → Sπ on the set of self-adjoint (s-a) elements of U and on the set of physical states, respectively, satisfying

\[ (s'\rho)(sA) = \rho(A) \tag{4.15} \]

for all ρ ∈ Sπ and A ∈ Us-a.

The map s : Us-a → Us-a is continuous in the norm topology (it even preserves the norm) and consequently the map s′ : Sπ → Sπ is weak*-continuous. The proofs of these facts can be found in [2], proposition 6.7 and the paragraph preceding that proposition. Because π is assumed to be faithful, Sπ can be shown⁴² to be weak*-dense in S(U). Thus we can extend s′ uniquely by weak*-continuity to a map s′ : S(U) → S(U); therefore we may assume that s′ is a map from S(U) into itself; it is in fact a bijection, with inverse given by the extension of (s′)⁻¹ : Sπ → Sπ. By using the same reasoning as in section 2.2.3, the map s′ : S(U) → S(U) preserves the convex structure of S(U) and therefore maps pure states onto pure states. Thus, s′ : S(U) → S(U) is a weak*-continuous affine bijection which maps Sπ onto itself. Furthermore, s′ preserves transition probabilities and as a consequence the image of a sector of S(U) under the map s′ is again a sector. Conversely, given a weak*-continuous affine map s′ that maps Sπ onto itself, we can define a unique symmetry (s, s′). Before we close our discussion of the map s′ and go over to a discussion of the map s, we define the notion of an invariant state. A state ρ ∈ Sπ is called an invariant state under (s, s′) if

\[ (s'\rho)(A) = \rho(A) \]

for all A ∈ Us-a, or equivalently (s′ρ)(sA) = (s′ρ)(A) for all A ∈ Us-a.

Now that we have discussed s′ in some detail, we will discuss some properties of s. To see that s : Us-a → Us-a is R-linear, we note that for λ, µ ∈ R and A, B ∈ Us-a we have

\begin{align}
(s'\rho)(s(\lambda A + \mu B)) &= \rho(\lambda A + \mu B) = \lambda\rho(A) + \mu\rho(B) = \lambda (s'\rho)(sA) + \mu (s'\rho)(sB) \notag \\
&= (s'\rho)(\lambda sA + \mu sB) \tag{4.16}
\end{align}

for all ρ ∈ S(U), which implies that s(λA + µB) = λsA + µsB since S(U) separates the points of Us-a. Furthermore, it can also be shown that s satisfies s(A²) = s(A)² for all A ∈ Us-a (see for instance proposition 6.10 of [2]), which is equivalent to the property that s(AB + BA) = s(A)s(B) + s(B)s(A) for all A, B ∈ Us-a. We will now extend s to a map s : U → U by demanding that condition (4.15) also holds for all A ∈ U. Then the first step in (4.16) also makes sense for λ, µ ∈ C and it follows that s : U → U is C-linear, and is hence a vector space automorphism. Using the fact that each A ∈ U can be written as a linear combination A = ½(A + A∗) + i (1/2i)(A − A∗) =: Re(A) + i Im(A) of self-adjoint elements, it is also easy to see that s(A²) = s(A)² for all A ∈ U. Finally, for each A ∈ U we also have

\begin{align*}
s(A^*) &= s(\mathrm{Re}(A) - i\,\mathrm{Im}(A)) = s(\mathrm{Re}(A)) - i\,s(\mathrm{Im}(A)) = [s(\mathrm{Re}(A)) + i\,s(\mathrm{Im}(A))]^* \\
&= s(A)^*,
\end{align*}

where in the last step we used that s is C-linear. If U1 and U2 are C∗-algebras, a linear map s : U1 → U2 satisfying s(A²) = s(A)² and s(A∗) = s(A)∗ for all A ∈ U1 is called a Jordan*-homomorphism. Thus we have found that the symmetries of a quantum system (U, π) must be Jordan*-automorphisms of U. Conversely, if s : U → U is a Jordan*-automorphism then we get a pair (s, s′) of bijections where (s′ρ)(A) = ρ(s⁻¹A) for A ∈ U; however, this pair (s, s′) will only define a symmetry if s′ maps Sπ onto itself. The set J(U) of all Jordan*-automorphisms inherits a topology from the weak*-topology of U∗. Together with the composition law of Jordan*-automorphisms this gives J(U) the structure of a topological group. Note that a Jordan*-automorphism is more general than a C∗-automorphism because a Jordan*-automorphism is not necessarily multiplicative. For instance, a C∗-anti-automorphism of U (i.e. a vector space automorphism s : U → U which preserves the ∗-operation and satisfies s(AB) = s(B)s(A) for all A, B ∈ U) is also a Jordan*-automorphism. The following theorem of Kadison, which can be found in section 2.2 of [7] (theorem II.2.1), gives us more insight into the nature of a given Jordan*-automorphism.

⁴² See section 6.1 and 6.3 of [2]. Here it is shown that if π is faithful, Sπ distinguishes the positive elements of U (i.e. ρ(A) ≥ 0 for all ρ ∈ Sπ implies A ≥ 0), which in turn (by a result of Kadison which also uses that Sπ is convex) implies that Sπ is weak*-dense in S(U).

Theorem 4.23 Let U1 be an abstract C∗-algebra and let U2 be a C∗-algebra of operators on a Hilbert space H. Then a linear ∗-preserving surjection α : U1 → U2 is a Jordan*-homomorphism if and only if there exists a projection operator E ∈ (U2)′′ ∩ (U2)′ such that α(AB)E = α(A)α(B)E and α(AB)(1 − E) = α(B)α(A)(1 − E) for all A, B ∈ U1.
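A standard example may help to see how a Jordan*-homomorphism can fail to be multiplicative and how the two cases in the theorem arise. The matrix algebra M_n(C) and the transpose map τ used below are chosen here for illustration and are not part of the text above. The first line is the polarization identity behind the equivalence of s(A²) = s(A)² and the symmetrized product rule; the second line treats the transpose:

```latex
% Line 1: apply s(X^2) = s(X)^2 to X = A + B and subtract the terms for X = A and X = B.
% Line 2 (assumed example): U_1 = U_2 = M_n(\mathbb{C}) acting on \mathbb{C}^n, \tau(A) = A^{T}.
\begin{align*}
s(AB + BA) &= s\big((A+B)^2\big) - s(A^2) - s(B^2)
            = \big(s(A) + s(B)\big)^2 - s(A)^2 - s(B)^2
            = s(A)s(B) + s(B)s(A), \\
\tau(AB) &= (AB)^{T} = B^{T}A^{T} = \tau(B)\tau(A), \qquad
\tau(A^*) = \tau(A)^*, \qquad \tau(A^2) = \tau(A)^2 .
\end{align*}
```

Here τ is a C∗-anti-automorphism, so the conditions of the theorem hold with E = 0, whereas the identity map corresponds to E = 1. For n ≥ 2 the transpose is not multiplicative, since matrix multiplication is not commutative.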

We can apply the theorem as follows. If s : U → U is a Jordan*-automorphism that defines a symmetry of the system (U, π), then the map πs : U → π(U) defined by A ↦ π(sA) is a surjective Jordan*-homomorphism. According to the theorem, there exists a projection E ∈ π(U)′′ ∩ π(U)′ with πs(AB)E = πs(A)πs(B)E and πs(AB)(1 − E) = πs(B)πs(A)(1 − E) for all A, B ∈ U. In other words, if H denotes the representation space corresponding to π then H decomposes into a direct sum H = H1 ⊕ H2 of subspaces H1 = EH and H2 = (1 − E)H which are invariant under π(U) (this follows from the fact that E ∈ π(U)′) and such that the Jordan*-automorphism⁴³ π(A) ↦ πs(A) on π(U) is a direct sum of a C∗-automorphism of the algebra π(U)E ≅ π(U)|H1 and a C∗-anti-automorphism of the algebra π(U)(1 − E) ≅ π(U)|H2. As a special case, if π(U)′′ ∩ π(U)′ = C1 then the Jordan*-automorphism π(A) ↦ πs(A) is either a C∗-automorphism or else a C∗-anti-automorphism and we have thus obtained the following corollary to the theorem above.

Corollary 4.24 Let U be a C∗-algebra and let π : U → B(H) be a representation. If s : U → U is a Jordan*-automorphism such that s(ker(π)) ⊂ ker(π), then π(A) ↦ π(sA) =: πs(A) is a Jordan*-automorphism of π(U). Furthermore, there exists a projection operator E ∈ π(U)′′ such that π(U) leaves the subspaces H1 = EH and H2 = (1 − E)H invariant and such that π(A) ↦ πs(A) decomposes into a direct sum of a C∗-automorphism of π(U)|H1 and a C∗-anti-automorphism of π(U)|H2.

The relationship between this corollary and Wigner’s theorem in section 2.2.3 is as follows. Assume that (U, π) satisfies the hypotheses of commutative and discrete superselection rules. Then the physical Hilbert space decomposes into a direct sum of Hilbert spaces Hν

\[ H = \bigoplus_{\nu \in N} H_\nu , \]

and the physical representation π decomposes accordingly into a direct sum π = ⊕ν∈N πν of irreducible representations πν : U → B(Hν). Because the Hilbert spaces H1 = EH and H2 = (1 − E)H are invariant under π(U), the index set N can be written as a disjoint union N = N1 ∪ N2 such that

\[ H_1 = \bigoplus_{\nu \in N_1} H_\nu \quad \text{and} \quad H_2 = \bigoplus_{\nu \in N_2} H_\nu . \]

Note that it is allowed that one of the Ni is empty. It can now be shown that the C∗-automorphism of π(U)|H1 can be represented by a unitary operator U1 : H1 → H1 as

\[ \pi_s(A)|_{H_1} \mapsto U_1\, \pi(A)|_{H_1}\, U_1^{-1} \tag{4.17} \]

and that there exists a bijection b1 : N1 → N1 such that this operator U1 maps the coherent subspace Hν with ν ∈ N1 unitarily onto the coherent subspace Hb1(ν). Similarly, the C∗-anti-automorphism of π(U)|H2 can be represented by an anti-unitary operator U2 : H2 → H2 as

\[ \pi_s(A)|_{H_2} \mapsto U_2\, \pi(A)^*|_{H_2}\, U_2^{-1} \tag{4.18} \]

and there exists a bijection b2 : N2 → N2 such that U2 maps the coherent subspace Hν with ν ∈ N2 anti-unitarily onto the coherent subspace Hb2(ν). By comparing equations (4.17) and (4.18) with equations (2.19) and (2.20), the relationship with Wigner’s theorem is clear. In fact, we have obtained a generalization of Wigner’s theorem: a symmetry in the algebraic approach can be represented in the physical Hilbert space as a direct sum of a unitary operator and an anti-unitary operator, each of which maps coherent subspaces unitarily, resp. anti-unitarily, onto coherent subspaces. The proof of all these facts can be found in section 6.3 (proposition 6.11) of [2]. In the absence of commutative and discrete superselection rules, we cannot always make the step from corollary 4.24 to unitary and anti-unitary operators on H. In particular, when a Jordan*-automorphism s : U → U defines a C∗-automorphism of π(U) (rather than a direct sum of an automorphism and an anti-automorphism), it is not always true that there exists a unitary operator U : H → H such that π(sA) = Uπ(A)U⁻¹. When such an operator U does exist, we say that the symmetry s is implementable. An important example, which we will need in the following subsection when we discuss vacuum states, is obtained in the case of a GNS-representation corresponding to a state which is invariant under a symmetry (s, s′) for which s is a C∗-automorphism:

⁴³ Here we assume that s(ker(π)) ⊂ ker(π), which is certainly the case when π is faithful.

Theorem 4.25 Let U be a C∗-algebra and let (s, s′) be a symmetry for which the Jordan*-automorphism s : U → U is a C∗-automorphism and suppose that the state ρ ∈ S(U) is invariant under (s, s′). Let πρ : U → B(Hρ) be the GNS-representation associated to the state ρ and suppose that s(ker(πρ)) ⊂ ker(πρ). Then there exists a unique unitary operator Us : Hρ → Hρ on the representation space Hρ that satisfies

\[ U_s \pi_\rho(A) U_s^{-1} = \pi_\rho(sA), \qquad U_s \Omega_\rho = \Omega_\rho \tag{4.19} \]

for each A ∈ U, where Ωρ ∈ Hρ denotes the cyclic vector corresponding to πρ.

Proof
Because s(ker(πρ)) ⊂ ker(πρ), we see that if πρ(A)Ωρ = πρ(B)Ωρ for some A, B ∈ U then also πρ(sA)Ωρ = πρ(sB)Ωρ. Thus, on the dense subset πρ(U)Ωρ of Hρ we can define a linear operator Us by

\[ U_s \pi_\rho(A)\Omega_\rho := \pi_\rho(sA)\Omega_\rho . \]

This operator satisfies

\begin{align*}
\langle U_s\pi_\rho(A)\Omega_\rho, U_s\pi_\rho(B)\Omega_\rho\rangle
  &= \langle \pi_\rho(sA)\Omega_\rho, \pi_\rho(sB)\Omega_\rho\rangle
   = \langle \pi_\rho(s(B^*A))\Omega_\rho, \Omega_\rho\rangle
   = \rho(s(B^*A)) = \rho(B^*A) \\
  &= \langle \pi_\rho(A)\Omega_\rho, \pi_\rho(B)\Omega_\rho\rangle ,
\end{align*}

where we have used that s is a C∗-automorphism and that ρ is invariant under the symmetry. Because s is bijective, we have Usπρ(U)Ωρ = πρ(U)Ωρ, so Us is indeed unitary. If U has a unit, then the second equation in (4.19) follows from the fact that s1 = 1, since this implies πρ(s1)Ωρ = Ωρ. If U has no unit, then the identity follows by taking an approximate unit {eν} of U. The first equation in (4.19) follows from

\begin{align*}
U_s\pi_\rho(A)U_s^{-1}[\pi_\rho(sB)\Omega_\rho]
  &= U_s\pi_\rho(A)U_s^{-1}[U_s\pi_\rho(B)\Omega_\rho]
   = U_s\pi_\rho(AB)\Omega_\rho = \pi_\rho(s(AB))\Omega_\rho \\
  &= \pi_\rho(sA)\pi_\rho(sB)\Omega_\rho
\end{align*}

and from the fact that the set {πρ(sB)Ωρ}B∈U is dense in Hρ. To show uniqueness, suppose that U′s is a linear operator satisfying the two equations in (4.19). Then for all A ∈ U we have

\[ U'_s\pi_\rho(A)\Omega_\rho = U'_s\pi_\rho(A)(U'_s)^{-1}U'_s\Omega_\rho = \pi_\rho(sA)\Omega_\rho = U_s\pi_\rho(A)\Omega_\rho , \]

so U′s coincides with Us on πρ(U)Ωρ. Hence U′s = Us.
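The fact that Us is well defined and isometric on πρ(U)Ωρ can also be checked directly from the invariance of ρ. The following is a short sketch, assuming only that s is a C∗-automorphism and that ρ(sA) = ρ(A) for all A ∈ U, which follows from the invariance of ρ together with (4.15):

```latex
% Assumption: s is multiplicative and *-preserving, and \rho(sA) = \rho(A) for all A.
\begin{align*}
\|\pi_\rho(sA)\Omega_\rho\|^2
  = \langle \pi_\rho\big((sA)^*(sA)\big)\Omega_\rho, \Omega_\rho\rangle
  = \rho\big(s(A^*A)\big)
  = \rho(A^*A)
  = \|\pi_\rho(A)\Omega_\rho\|^2 .
\end{align*}
```

In particular, πρ(A)Ωρ = πρ(B)Ωρ implies πρ(s(A − B))Ωρ = 0, so the value of Us on a vector does not depend on the choice of A used to represent it.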

Now that we have discussed individual symmetries, we will consider symmetry groups. Recall that after proving Wigner’s theorem we argued that the elements of a connected Lie group of symmetries must be represented by unitary operators rather than anti-unitary ones. In view of this, we expect that in the algebraic approach connected symmetry groups should be represented by C∗-automorphisms. In the algebraic approach we say that a topological group G is a symmetry group of the system if there is a morphism α : G → J(U) of topological groups. As demonstrated in section 2.2 of [7] (theorem II.2.4), it is indeed the case that if a topological group G is a connected symmetry group then the elements αg := α(g) ∈ J(U) are all C∗-automorphisms. In particular, in relativistic quantum systems the elements (a, L) of the restricted Poincaré group P₊↑ give rise to C∗-automorphisms α(a,L) : U → U.

4.2.2 The Haag-Kastler axioms

We will now state and motivate the axioms of the Haag-Kastler framework. In section 4.1.6 we showed that under some extra assumptions (the Haag-Ruelle axioms) the Wightman framework is capable of describing particles in a scattering experiment. These extra assumptions concerned certain properties of the spectrum of the energy-momentum operator and also the existence of one-particle subspaces in the Hilbert space H and the existence of certain operators in the polynomial algebra P(M) which generate one-particle states from the vacuum state. Using these operators which generate one-particle states, it was possible to construct the in- and out-states that one needs in scattering theory, as well as the Poincaré-invariant isometries Ωin/out : HFock → H. It seems that in the Haag-Ruelle theory the quantum fields only play a role in the background: they were needed to obtain the correspondence O → P(O) from spacetime domains to ∗-algebras of operators, and they were needed in the proofs of the mathematical statements (although we did not consider these proofs in section 4.1.6; see for instance section 12.2 of [2] for detailed proofs). It turns out that when in a quantum theory a correspondence between spacetime domains and operators is chosen properly (i.e. as in the Haag-Kastler framework that we will introduce now), the results of the Haag-Ruelle theory can be derived without using quantum fields, see also chapter 5 of [1]. This fact should be considered as one of the reasons for discussing the Haag-Kastler theory. Apart from the fact that this framework is capable of incorporating the Haag-Ruelle scattering theory, Haag and Kastler give (in their paper [21]) as a motivation for introducing their theory that the true essence of quantum field theory is that it gives rise to the notion of observables which can be measured in some spacetime region and that the observables corresponding to spacelike separated regions are compatible. In this sense it should be expected that the Haag-Kastler theory, which focuses on the assignment of observables to spacetime domains, is more general than quantum field theory. Our discussion in this and the following subsection is inspired by the books [1] and [20] and, of course, by the article [21].

As stated in the previous subsection, the Haag-Kastler theory is formulated in the setting of algebraic quantum theory, so we should begin with the following axiom.

Axiom 0: Algebra of observables
There is a C∗-algebra U, called the algebra of observables.

Notice that there is no mention of the choice of the (faithful) physical representation here. The reason for this, as explained in the article [21], is as follows. Suppose that we have a number of physical systems that are prepared in identical ways and suppose that for each of these systems we make a measurement of a set of simultaneously measurable observables A1, . . . , An, in order to obtain some knowledge about the unknown state α ∈ S(U) of the identical systems. More specifically, these measurements provide us with estimates p(Aj, B) ∈ [0, 1] of the probabilities P_α^{A_j}(B) that a measurement of observable Aj will result in a value in the Borel set B for the system in state α. However, each measurement always involves some error and we can only prepare a finite number of identical systems, so after our measurements we can only conclude that the system is in some state α ∈ S(U) for which the following inequalities hold:

\[ |P^{A_j}_\alpha(B) - p(A_j, B)| < \varepsilon_j . \]

These inequalities do not specify the point α ∈ S(U) exactly, but rather define some neighborhood in S(U) with respect to the weak topology on S(U). A particular physical representation π should thus be considered as adequate for describing the system if every open neighborhood in S(U) that can be obtained from an experiment (in the way described above) contains an element of Sπ. In view of these considerations, it seems natural to introduce the following definition. Two physical representations π1 and π2 are called physically equivalent if every weakly open neighborhood in S(U) of an element of Sπ1 contains an element of Sπ2, and vice versa. Note that Sπ1 and Sπ2 are both subsets of the larger space S(U), so that the definition makes sense. An important result⁴⁴ in the theory of C∗-algebras now states that any two representations π1 and π2 with ker(π1) = ker(π2) are physically equivalent in the sense defined above. In particular, any two faithful representations are physically equivalent and for this reason it is not necessary to specify the particular choice of the (faithful) physical representation in the axioms.

The correspondence between spacetime domains and operators is now established by assuming that the algebra of observables has a substructure which is given in terms of spacetime domains.

Axiom 1: Local algebras
For each bounded⁴⁵ open set O ⊂ M in Minkowski spacetime there is a C∗-subalgebra U(O) ⊂ U.

The self-adjoint elements of U(O) are interpreted as the observables that can be measured in the spacetime region O, also called local observables. A local observable which is measurable in some subset of Minkowski spacetime should also be measurable in any larger subset of Minkowski spacetime. This is expressed in the following axiom.

Axiom 2: Monotonicity
If O1, O2 ⊂ M are bounded open sets satisfying O1 ⊂ O2, then U(O1) ⊂ U(O2).

When we have some bounded open set O ⊂ M and some observable A ∈ U(O) which can be measured in O, then a restricted Poincaré transformation g ∈ P₊↑ should have the effect of mapping A to some observable which can be measured in gO. So an element g ∈ P₊↑ defines for each bounded open set O ⊂ M a map α_g^{(O)} : U(O) → U(gO). Because restricted Poincaré transformations are assumed to be symmetries of any relativistic quantum system, this map must in fact be an isomorphism of C∗-algebras.

Axiom 3: Covariance
For each restricted Poincaré transformation g ∈ P₊↑ we have an automorphism αg : U → U such that for each bounded open subset O ⊂ M the restriction of αg to U(O) is a C∗-isomorphism αg : U(O) → U(gO). For a fixed observable A ∈ U, the map g ↦ αg(A) is continuous in g.

If two regions of spacetime are spacelike separated, then no physical process in one of the two regions can affect a physical process in the other region. In particular, this means that we can perform simultaneous measurements in both regions and therefore the local observables corresponding to one of the two regions must in fact commute with all local observables corresponding to the other region.

Axiom 4: Locality
If the bounded open subsets O1, O2 ⊂ M are spacelike separated, then the algebras U(O1) and U(O2) commute.
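In formulas, Axiom 4 can be sketched as follows. The sign convention for the Minkowski inner product (timelike vectors have positive square) is an assumption made here; it matches the definition of the forward light cone used in the next subsection:

```latex
% Assumed convention: x \cdot y = x^0 y^0 - \vec{x}\cdot\vec{y}.
\begin{align*}
O_1,\ O_2 \text{ spacelike separated}
  \quad&:\Longleftrightarrow\quad (x - y)\cdot(x - y) < 0 \ \text{ for all } x \in O_1,\ y \in O_2, \\
\text{Axiom 4}
  \quad&:\quad [A, B] = AB - BA = 0 \ \text{ for all } A \in U(O_1),\ B \in U(O_2).
\end{align*}
```

Since restricted Poincaré transformations preserve (x − y)·(x − y), the regions gO1 and gO2 are again spacelike separated, so this condition is compatible with the covariance axiom.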

⁴⁴ Haag and Kastler call this result Fell’s equivalence theorem. In their article [21] there is a reference to the relevant article of J.M.G. Fell.

⁴⁵ By a bounded set in M we mean a set with compact closure.

Finally, we assume that the algebra U of observables is the smallest C∗-algebra containing all the local observables. This emphasizes the importance of local observables.

Axiom 5: Generating property
The algebra ⋃_O U(O) is dense in U. Here the union is taken over all bounded open subsets of M.

4.2.3 Vacuum states in the Haag-Kastler framework

A major difference with the Wightman theory is that there is no mention of a vacuum state in the Haag-Kastler axioms. As we will show now, there is in fact a notion of vacuum states in the Haag-Kastler framework. To this end, we first consider a physical representation π : U → B(H) of the algebra of observables and we assume that there is some unitary representation T(x) of the translation group on the physical Hilbert space H with corresponding energy-momentum generators Pµ. Let EP denote the joint spectral measure of the operators Pµ. Then for any Ψ ∈ H we can define a measure µΨ on Minkowski space M by µΨ(B) = ⟨EP(B)Ψ, Ψ⟩ for any Borel set B ⊂ M. In case Ψ is a one-particle state, the support of the measure µΨ is of course precisely the support of the wave function of Ψ in energy-momentum space. In general, it is given the following name.

Definition 4.26 Let H be the Hilbert space of a physical system and let T(x) be a unitary representation of the translation group on H with corresponding energy-momentum generators Pµ which have the joint spectral measure EP. Then for a vector Ψ ∈ H the support of the measure µΨ : B ↦ ⟨EP(B)Ψ, Ψ⟩ on M is called the energy-momentum spectrum of Ψ.

We now have the following lemma, which states that (the representation of) some operators in the algebra U can shift the energy-momentum spectrum of vectors in the physical Hilbert space. This lemma is lemma 4.1 in [1].

Lemma 4.27 Let f ∈ C∞(M) be such that its Fourier transform f̂(p) = ∫_M f(x) e^{ip·x} d⁴x has bounded support ∆ ⊂ M, and for Q ∈ U define Q(f) = ∫ α(x,1)(Q) f(x) d⁴x as a Bochner integral⁴⁶. Let π : U → B(H) be a representation and suppose that there is a unitary representation T(x) of the translation group on H with energy-momentum generators Pµ. If the energy-momentum spectrum FΨ ⊂ M of a vector Ψ ∈ H is a closed set, then π(Q(f))Ψ has energy-momentum spectrum FΨ + ∆.
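A heuristic reason for this shift can be sketched as follows. The computation is formal and rests on assumptions that are not stated in the lemma: that the translations are implemented in the representation as π(α_{(x,1)}(Q)) = T(x)π(Q)T(x)⁻¹ with T(x) = e^{iP·x} (the sign convention is assumed), and that Ψ is replaced by an improper eigenvector with PΨ = kΨ:

```latex
% Formal sketch; T(x) = e^{iP\cdot x} and P\Psi = k\Psi are assumptions made for illustration.
\begin{align*}
\pi(Q(f))\Psi
  = \int f(x)\, T(x)\,\pi(Q)\,T(x)^{-1}\Psi \; d^4x
  = \int f(x)\, e^{i(P - k)\cdot x}\; d^4x \;\, \pi(Q)\Psi
  = \hat{f}(P - k)\,\pi(Q)\Psi .
\end{align*}
```

Since f̂ vanishes outside ∆, the vector π(Q(f))Ψ only has components with P − k ∈ ∆, i.e. with energy-momentum in k + ∆, in line with the statement of the lemma.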

For this reason, we say that the operator Q(f) ∈ U, with f as in the lemma, increases the energy-momentum by ∆. Now define for a future-directed timelike vector e ∈ M the set M−(e) = {p : p · e < 0}. Note that e lies along the time-axis in some particular inertial frame, so the set M−(e) contains all energy-momentum vectors which have negative energy in this particular inertial frame. If ∆ ⊂ M−(e), then the lemma implies that according to this inertial observer the operator π(Q(f)) decreases the energy of any vector Ψ ∈ H. Thus, if we want some vector Ψ0 ∈ H to represent a vacuum vector (which has the lowest possible energy in any inertial frame) then we must have π(Q(f))Ψ0 = 0 for any Q ∈ U and any smooth f whose Fourier transform has support ∆ ⊂ M−(e) for some future-directed timelike vector e ∈ M. For such functions f we thus have

\[ \rho_{\Psi_0}(Q(f)^*Q(f)) = \langle \pi(Q(f)^*Q(f))\Psi_0, \Psi_0\rangle = 0 . \]

When we translate this back into the language of the abstract algebra, we obtain the following definition.

Definition 4.28 A state ω ∈ S(U) on a C∗-algebra U is called a vacuum state if ω(Q(f)∗Q(f)) = 0 for all Q ∈ U and for any smooth function f whose Fourier transform has bounded support ∆ ⊂ M−(e) for some future-directed timelike vector e ∈ M.

⁴⁶ A Bochner integral is an integral of a function on a measure space with values in a Banach space. Its definition is very similar to that of a Lebesgue integral of a complex-valued function on a measure space.

Let V̄₊ = {p : p · p ≥ 0, p⁰ ≥ 0} denote the closed forward light cone in momentum space. Then clearly M−(e) ⊂ M\V̄₊ for any future-directed timelike vector e ∈ M, so if ω ∈ S(U) is a state which satisfies ω(Q(f)∗Q(f)) = 0 for all Q ∈ U and for any smooth function f whose Fourier transform has bounded support ∆ ⊂ M\V̄₊, then ω is a vacuum state.

Conversely, suppose that ω is a vacuum state and suppose that f is a smooth function whose Fourier transform f̂ has bounded support ∆ ⊂ M\V̄₊. If V₊ ⊂ M denotes the set of all future-directed timelike vectors, then M\V̄₊ = ⋃_{e∈V₊} M−(e); because each M−(e) is open, we have obtained an open cover {M−(e)}_{e∈V₊} of M\V̄₊ and hence also of ∆. But ∆ is compact, so there exists a finite subcover {M−(ej)}_{j=1}^{n} (with e1, . . . , en ∈ V₊) of ∆. Now let {gj}_{j=1}^{n} be a partition of unity subordinate to the cover {M−(ej)}_{j=1}^{n}, i.e. each gj is a smooth function with support in M−(ej) and ∑_{j=1}^{n} gj(p) = 1 for all p ∈ ⋃_{j=1}^{n} M−(ej). Now define the smooth functions {f̂j}_{j=1}^{n} by f̂j = gj f̂. Then the support of each f̂j lies in M−(ej) and ∑_{j=1}^{n} f̂j = f̂. If we denote the inverse Fourier transform of f̂j by fj, then we find that

\[ 0 \le \omega(Q(f)^*Q(f)) = \sum_{j=1}^{n}\sum_{k=1}^{n} \omega(Q(f_j)^*Q(f_k)) \le \sum_{j=1}^{n}\sum_{k=1}^{n} |\omega(Q(f_j)^*Q(f_k))| \le 0 , \]

where in the last step we used the property |ω(A∗B)| ≤ √ω(A∗A) √ω(B∗B) of states, together with ω(Q(fj)∗Q(fj)) = 0 for each j, which holds because ω is a vacuum state. Thus ω(Q(f)∗Q(f)) = 0 and we have proved the following proposition.

Proposition 4.29 A state ω ∈ S(U) on a C∗-algebra U is a vacuum state if and only if ω(Q(f)∗Q(f)) = 0 for all Q ∈ U and any smooth function f whose Fourier transform has bounded support ∆ ⊂ M\V̄₊.
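The covering identity used in the argument above, namely that every p outside the closed forward light cone lies in some M−(e), can be verified by hand. The following sketch assumes the convention p · e = p⁰e⁰ − p⃗·e⃗, which is consistent with the definition of the closed forward light cone but is not spelled out in this subsection; the vector e and the number v are introduced for illustration:

```latex
% Assumed convention: p \cdot e = p^0 e^0 - \vec{p}\cdot\vec{e}.
\begin{align*}
\text{Case } p^0 < 0: \quad & e = (1, \vec{0}) \ \Longrightarrow\ p \cdot e = p^0 < 0. \\
\text{Case } p^0 \ge 0,\ p\cdot p < 0: \quad & |\vec{p}| > p^0, \qquad
  e = \Big(1,\ v\,\tfrac{\vec{p}}{|\vec{p}|}\Big) \ \text{ with } \ \tfrac{p^0}{|\vec{p}|} < v < 1 \\
  & \Longrightarrow\ e \cdot e = 1 - v^2 > 0, \qquad p \cdot e = p^0 - v\,|\vec{p}| < 0 .
\end{align*}
```

In both cases e is future-directed and timelike, so p ∈ M−(e).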


5 Constructive quantum field theory

After our investigation of the Wightman and Haag-Kastler axiom schemes we might wonder whether there are any concrete models that satisfy all axioms of one or both of these axiom schemes. Of course, we have already seen that the free field theories are examples of such concrete models. There are also some other models that were constructed at a very early stage in the development of rigorous quantum field theory, such as the Schwinger and Thirring models, but these models turned out to be trivial in the sense that the corresponding fields could be expressed as functions of free fields. The goal of the constructive quantum field theory program that emerged in the 1960s was to prove that concrete non-trivial models exist within the Wightman and/or Haag-Kastler axiom scheme.

In this chapter we will discuss some of the earliest results that were obtained in constructive quantum field theory. We have used the historical notes [23] and [33] as a guide through the literature, especially concerning the chronology of the results. The two main strategies for constructive quantum field theory were the Hamiltonian and Euclidean strategy. We will discuss both of them in separate sections, with a special focus on the scalar boson models with a self-interaction. Because the proofs of almost all theorems that we will be needing are very long and technical, we have decided not to include them here. Instead, we will focus on the main arguments and we will specify how the different mathematical objects are constructed, without proving that the construction makes sense mathematically.

5.1 The Hamiltonian approach

In the Hamiltonian approach one begins with a free field theory on Fock space and uses cutoffs in order to make sense of the interaction term in the Hamiltonian of some interacting field theory. The methods that are used in this approach are of a functional-analytic nature.

5.1.1 The (λφ⁴)₂-model as a Haag-Kastler model

The scalar quantum field theory in 2-dimensional spacetime with a quartic self-interaction was one of the first non-trivial models that people tried to construct in the 1960s, because it is probably the simplest of all non-trivial models. The Hamiltonian for this model is given formally by⁴⁷

\[ H = H_0 + \lambda \int_{\mathbb{R}} {:}\phi_0^4(\mathbf{x}){:}\; d\mathbf{x}, \tag{5.1} \]

where H0 is the free field Hamiltonian (see also equation (5.2) below), λ is the coupling constant and φ0(𝐱) denotes the free field at time t = 0. This model is called the (λφ⁴)₂-model, where the subindex 2 refers to the number of spacetime dimensions. Since the interaction term is not well-defined, we will introduce a cutoff version of this interaction. However, we will begin with a discussion of the free field system in two spacetime dimensions.
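For orientation, the Wick-ordered powers appearing in (5.1) can be written out explicitly for a regularized field. The following is a sketch that is not taken from the text above: it assumes a momentum-cutoff field φ_κ(𝐱) (a hypothetical name) for which the vacuum expectation value c_κ = ⟨φ_κ(𝐱)²Ω, Ω⟩ is finite, with Ω the Fock vacuum:

```latex
% Assumed: \phi_\kappa is a regularized version of \phi_0 and
% c_\kappa = \langle \phi_\kappa(\mathbf{x})^2\,\Omega, \Omega\rangle < \infty.
\begin{align*}
{:}\phi_\kappa(\mathbf{x})^2{:} &= \phi_\kappa(\mathbf{x})^2 - c_\kappa, \\
{:}\phi_\kappa(\mathbf{x})^4{:} &= \phi_\kappa(\mathbf{x})^4 - 6\,c_\kappa\,\phi_\kappa(\mathbf{x})^2 + 3\,c_\kappa^2 .
\end{align*}
```

In two spacetime dimensions c_κ grows only logarithmically as the cutoff is removed, which is one reason why this model is comparatively tractable.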

⁴⁷ Here we use a boldface letter 𝐱 to denote a single real variable, because the notation x is already used to denote spacetime vectors, which have two components in this 2-dimensional model.

⁴⁸ We will write 𝐩 instead of p for the same reason that we write 𝐱 instead of x.

Description of the free field

Let H ≅ L²(R, d𝐩) be the Hilbert space of one-particle momentum-spin wave functions in 2-dimensional Minkowski spacetime M2 for a particle with mass m and spin s = 0 that is equal to its own antiparticle. Since the spacetime is 2-dimensional, these wave functions ψ(𝐩) depend only on a single real variable⁴⁸ 𝐩, and analogous to the 4-dimensional case we define ω_𝐩 = √(m² + 𝐩²). Let F ≡ F+(H) be the boson Fock space corresponding to H, and let φ be the free scalar field, defined on real-valued f ∈ S(M2) by

\[ \phi(f) = \sqrt{2\pi}\,\big(a^*(rf) + a(rf)\big) \]

as defined before, where for f ∈ S(M2) we define

\[ (rf)(\mathbf{p}) = \frac{1}{2\pi\sqrt{2\omega_{\mathbf{p}}}} \int_{\mathbb{R}^2} f(t,\mathbf{x})\, e^{i(\omega_{\mathbf{p}} t - \mathbf{p}\mathbf{x})}\, dt\, d\mathbf{x} . \]

In terms of the ill-defined operators a^{(∗)}(𝐩) we can express the field as

\[ \phi(t,\mathbf{x}) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \frac{d\mathbf{p}}{\sqrt{2\omega_{\mathbf{p}}}} \left[ e^{i(\omega_{\mathbf{p}} t - \mathbf{p}\mathbf{x})} a^*(\mathbf{p}) + e^{-i(\omega_{\mathbf{p}} t - \mathbf{p}\mathbf{x})} a(\mathbf{p}) \right]. \]

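We also define the sharp-time field $\phi_t$ and its derivative $\pi_t := \partial_t\phi_t$ in the same way as in the 4-dimensional case. Of special importance to us will be the $t = 0$ field

\[
\phi_0(\mathbf{x}) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \frac{d\mathbf{p}}{\sqrt{2\omega_{\mathbf{p}}}}\, e^{-i\mathbf{p}\mathbf{x}} \bigl[ a^*(\mathbf{p}) + a(-\mathbf{p}) \bigr]
\]

and its canonical conjugate $\pi_0$. For real-valued $f \in \mathcal{S}(\mathbb{R})$, both $\phi_0(f)$ and $\pi_0(f)$ are essentially self-adjoint operators that are defined on the subspace $D \equiv D_+$ of $\mathcal{F}$ consisting of all finite particle states, as defined in subsection 2.2.5:

\[
D = \bigcup_{n=0}^{\infty} \bigoplus_{j=0}^{n} \mathcal{F}^j = \bigl\{ \psi = (\psi_0, \psi_1, \psi_2, \ldots) \in \mathcal{F} : \exists N \text{ with } \psi_n = 0 \text{ for all } n \geq N \bigr\},
\]

where $\mathcal{F}^n \equiv \mathcal{F}^n_+(\mathcal{H})$ is the symmetric $n$-particle Hilbert space, consisting of all square-integrable functions $\psi_n(\mathbf{p}_1, \ldots, \mathbf{p}_n)$ with $\psi_n(\mathbf{p}_{\sigma(1)}, \ldots, \mathbf{p}_{\sigma(n)}) = \psi_n(\mathbf{p}_1, \ldots, \mathbf{p}_n)$ for all $\sigma \in S_n$; also, we have used the notation where we write an element $\psi \in \mathcal{F}$ as a sequence $\psi = (\psi_0, \psi_1, \ldots)$ with $\psi_n \in \mathcal{F}^n$ for all $n$. Because $\phi_0(f)$ and $\pi_0(f)$ are essentially self-adjoint for real-valued $f \in \mathcal{S}(\mathbb{R})$, their closures $\phi_0(f)^-$ and $\pi_0(f)^-$ are self-adjoint, and by the spectral theorem they define spectral measures $E_{\phi_0(f)}$ and $E_{\pi_0(f)}$. If $\mathbf{O} \subset \mathbb{R}$ is a bounded open set$^{49}$, let

\[
D_{\mathbb{R}}(\mathbf{O}) = \bigl\{ f \in \mathcal{S}(\mathbb{R}) : f \text{ real-valued},\ \operatorname{supp}(f) \subset \mathbf{O} \bigr\}.
\]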

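If $\mathcal{B}_{\mathbb{R}}$ denotes the Borel $\sigma$-algebra on $\mathbb{R}$, then we define the set

\[
A(\mathbf{O}) = \Bigl( \bigcup_{\Delta \in \mathcal{B}_{\mathbb{R}},\, f \in D_{\mathbb{R}}(\mathbf{O})} \bigl\{ E_{\phi_0(f)}(\Delta) \bigr\} \Bigr) \cup \Bigl( \bigcup_{\Delta \in \mathcal{B}_{\mathbb{R}},\, f \in D_{\mathbb{R}}(\mathbf{O})} \bigl\{ E_{\pi_0(f)}(\Delta) \bigr\} \Bigr).
\]

The Von Neumann algebra $\mathcal{A}(\mathbf{O})$ (notice the difference between the letters $\mathcal{A}$ and $A$) is then defined to be the Von Neumann algebra generated by the set $A(\mathbf{O}) \subset B(\mathcal{F})$. Equivalently, $\mathcal{A}(\mathbf{O})$ is the Von Neumann algebra generated by the unitary elements $e^{i\phi_0(f)}$ and $e^{i\pi_0(f)}$ with $f \in D_{\mathbb{R}}(\mathbf{O})$.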

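We will now show how the operators $a^{(*)}(\mathbf{p})$ can be defined rigorously. We define the subset $\mathcal{D} \subset \mathcal{F}$ as the set of all elements $\psi = (\psi_0, \psi_1, \ldots)$ in $D$ for which $\psi_n$ is a Schwartz function for all $n$:

\[
\mathcal{D} = \bigl\{ \psi = (\psi_0, \psi_1, \ldots) \in D : \psi_n \in \mathcal{S}(\mathbb{R}^n) \text{ for all } n \bigr\}.
\]

The annihilation operator $a(\mathbf{p})$ can now be defined as a map $a(\mathbf{p}) : \mathcal{D} \to \mathcal{D}$. The action of $a(\mathbf{p})$ on an element $\psi = (\psi_0, \psi_1, \ldots) \in \mathcal{D}$ is

\[
(a(\mathbf{p})\psi)_{n-1}(\mathbf{p}_1, \ldots, \mathbf{p}_{n-1}) = \sqrt{n}\, \psi_n(\mathbf{p}, \mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_{n-1}).
\]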
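The lowering rule above can be illustrated in a much simpler setting. The following is only a sketch under a strong simplifying assumption: we replace the continuum of momenta by a single mode, so that a finite-particle vector is just a finite list of amplitudes $[\psi_0, \psi_1, \ldots, \psi_N]$. The function names `annihilate` and `annihilate_k` are hypothetical and chosen for this illustration only.

```python
import math

def annihilate(psi):
    """Single-mode analogue of (a psi)_{n-1} = sqrt(n) * psi_n.

    psi is a finite list [psi_0, psi_1, ..., psi_N] of amplitudes,
    i.e. a finite-particle vector; the result is again a finite list.
    """
    lowered = [math.sqrt(n) * psi[n] for n in range(1, len(psi))]
    # Annihilating a vector with only a vacuum component gives zero.
    return lowered if lowered else [0.0]

def annihilate_k(psi, k):
    """Apply the annihilation operator k times (a product a ... a)."""
    for _ in range(k):
        psi = annihilate(psi)
    return psi

# A pure three-particle state: psi_3 = 1, all other components zero.
three_particles = [0.0, 0.0, 0.0, 1.0]
once = annihilate(three_particles)          # weight moves to the 2-particle slot
thrice = annihilate_k(three_particles, 3)   # weight ends up in the vacuum slot
killed = annihilate_k(three_particles, 4)   # one more application gives zero
```

In this toy model every product of annihilation operators maps finite lists to finite lists, which mirrors the statement that $a(\mathbf{p})$ maps the domain into itself.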

Because $a(\mathbf{p})$ maps $\mathcal{D}$ into itself, we can let an arbitrary product $a(\mathbf{p}_1) \cdots a(\mathbf{p}_n)$ act on $\mathcal{D}$ and hence such products are well-defined operators on $\mathcal{D}$. Furthermore, for any $\psi, \upsilon \in \mathcal{D}$ such a product gives rise to a Schwartz function

\[
(\mathbf{p}_1, \ldots, \mathbf{p}_n) \mapsto \langle a(\mathbf{p}_1) \cdots a(\mathbf{p}_n)\psi, \upsilon \rangle.
\]

$^{49}$ We will write open spatial sets (which are subsets of $\mathbb{R}$) by boldface letters to distinguish them from open sets in two-dimensional spacetime.


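Unfortunately, the creation operators $a^*(\mathbf{p})$ are not as well-behaved as the annihilation operators $a(\mathbf{p})$. Formally, their action on an element $\psi = (\psi_0, \psi_1, \ldots) \in \mathcal{D}$ is

\[
(a^*(\mathbf{p})\psi)_{n+1}(\mathbf{p}_1, \ldots, \mathbf{p}_{n+1}) = \frac{1}{\sqrt{n+1}} \sum_{j=1}^{n+1} \delta(\mathbf{p} - \mathbf{p}_j)\, \psi_n(\mathbf{p}_1, \ldots, \mathbf{p}_{j-1}, \mathbf{p}_{j+1}, \ldots, \mathbf{p}_{n+1}).
\]

The delta function makes it impossible to define $a^*(\mathbf{p})$ as an operator on a non-trivial subspace of $\mathcal{F}$, but the fact that for any $\psi \in \mathcal{H}$ the operator $a^*(\psi)$ is the adjoint of $a(\psi)$ suggests that we can make sense of $a^*(\mathbf{p})$ as a bilinear form on $\mathcal{D} \times \mathcal{D}$,

\[
\mathcal{D} \times \mathcal{D} \ni (\psi, \upsilon) \mapsto \langle a^*(\mathbf{p})\psi, \upsilon \rangle := \langle \psi, a(\mathbf{p})\upsilon \rangle.
\]

More generally, for any product $a^*(\mathbf{p}_1) \cdots a^*(\mathbf{p}_n)\, a(\mathbf{p}'_1) \cdots a(\mathbf{p}'_m)$ we can define a bilinear form on $\mathcal{D} \times \mathcal{D}$ by

\[
(\psi, \upsilon) \mapsto \langle a^*(\mathbf{p}_1) \cdots a^*(\mathbf{p}_n)\, a(\mathbf{p}'_1) \cdots a(\mathbf{p}'_m)\psi, \upsilon \rangle := \langle a(\mathbf{p}'_1) \cdots a(\mathbf{p}'_m)\psi,\ a(\mathbf{p}_n) \cdots a(\mathbf{p}_1)\upsilon \rangle.
\]

For fixed $\psi, \upsilon \in \mathcal{D}$, the right-hand side is a Schwartz function in the variables $\mathbf{p}_i, \mathbf{p}'_j$, i.e. $f_{\psi,\upsilon} \in \mathcal{S}(\mathbb{R}^{n+m})$, where

\[
f_{\psi,\upsilon}(\mathbf{p}_1, \ldots, \mathbf{p}_n, \mathbf{p}'_1, \ldots, \mathbf{p}'_m) = \langle a^*(\mathbf{p}_1) \cdots a^*(\mathbf{p}_n)\, a(\mathbf{p}'_1) \cdots a(\mathbf{p}'_m)\psi, \upsilon \rangle.
\]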

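If $F \in \mathcal{S}'(\mathbb{R}^{n+m})$ is a tempered distribution and if we write this distribution as a function $F(\mathbf{p}_1, \ldots, \mathbf{p}_n, \mathbf{p}'_1, \ldots, \mathbf{p}'_m)$, then the action of $F$ on $f_{\psi,\upsilon}$ can be written as

\begin{align*}
F(f_{\psi,\upsilon}) &= \int_{\mathbb{R}^{n+m}} F(\mathbf{p}_1, \ldots, \mathbf{p}_n, \mathbf{p}'_1, \ldots, \mathbf{p}'_m)\, f_{\psi,\upsilon}(\mathbf{p}_1, \ldots, \mathbf{p}_n, \mathbf{p}'_1, \ldots, \mathbf{p}'_m)\, d^n\mathbf{p}\, d^m\mathbf{p}' \\
&= \int_{\mathbb{R}^{n+m}} F(\mathbf{p}_1, \ldots, \mathbf{p}_n, \mathbf{p}'_1, \ldots, \mathbf{p}'_m)\, \langle a^*(\mathbf{p}_1) \cdots a^*(\mathbf{p}_n)\, a(\mathbf{p}'_1) \cdots a(\mathbf{p}'_m)\psi, \upsilon \rangle\, d^n\mathbf{p}\, d^m\mathbf{p}'.
\end{align*}

In this sense, we may say that for each distribution $F(\mathbf{p}_1, \ldots, \mathbf{p}_n, \mathbf{p}'_1, \ldots, \mathbf{p}'_m) \in \mathcal{S}'(\mathbb{R}^{n+m})$ we can define the integral

\[
\int_{\mathbb{R}^{n+m}} F(\mathbf{p}_1, \ldots, \mathbf{p}_n, \mathbf{p}'_1, \ldots, \mathbf{p}'_m)\, a^*(\mathbf{p}_1) \cdots a^*(\mathbf{p}_n)\, a(\mathbf{p}'_1) \cdots a(\mathbf{p}'_m)\, d^n\mathbf{p}\, d^m\mathbf{p}'.
\]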

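Since $\omega_{\mathbf{p}}^{\,n}\,\delta(\mathbf{p} - \mathbf{p}')$ is a Schwartz distribution in the variables $\mathbf{p}$ and $\mathbf{p}'$, i.e. $\omega_{\mathbf{p}}^{\,n}\,\delta(\mathbf{p} - \mathbf{p}') \in \mathcal{S}'(\mathbb{R}^2)$, we can use this to define for each $n$ the operator

\[
N_n := \int_{\mathbb{R}^2} \omega_{\mathbf{p}}^{\,n}\, \delta(\mathbf{p} - \mathbf{p}')\, a^*(\mathbf{p})\, a(\mathbf{p}')\, d\mathbf{p}\, d\mathbf{p}' = \int_{\mathbb{R}} \omega_{\mathbf{p}}^{\,n}\, a^*(\mathbf{p})\, a(\mathbf{p})\, d\mathbf{p}.
\]

For $n = 0$ this gives the number operator $N_0 = N$ and for $n = 1$ this gives the free Hamiltonian $N_1 = H_0$,

\[
H_0 = \int_{\mathbb{R}} \omega_{\mathbf{p}}\, a^*(\mathbf{p})\, a(\mathbf{p})\, d\mathbf{p}. \tag{5.2}
\]

This bilinear form gives rise to a self-adjoint operator, the domain of which we denote by $D(H_0)$.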

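The interaction term

For $g \in \mathcal{S}(\mathbb{R})$, let $\hat{g} \in \mathcal{S}(\mathbb{R})$ denote its Fourier transform. Then, as a bilinear form, we may define

\[
V(g) := \sum_{n=0}^{4} \binom{4}{n} \int_{\mathbb{R}^4} d\mathbf{p}_1 \cdots d\mathbf{p}_4\, \frac{\hat{g}(\mathbf{p}_1 + \cdots + \mathbf{p}_4)}{(2\pi)^{3/2} \sqrt{2\omega_{\mathbf{p}_1}} \cdots \sqrt{2\omega_{\mathbf{p}_4}}}\, a^*(\mathbf{p}_1) \cdots a^*(\mathbf{p}_n)\, a(-\mathbf{p}_{n+1}) \cdots a(-\mathbf{p}_4).
\]

This bilinear form defines a self-adjoint operator (with domain $D(V(g))$), which we will also denote by $V(g)$, that is essentially self-adjoint on the subspace

\[
D_0 = \bigcap_{n=0}^{\infty} D(H_0^n).
\]

The right-hand side in the definition of $V(g)$ can be further rewritten, resulting in

\begin{align*}
V(g) &= \int_{\mathbb{R}^4} d\mathbf{p}_1 \cdots d\mathbf{p}_4\, \frac{\frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} g(\mathbf{x})\, e^{-i(\mathbf{p}_1 + \cdots + \mathbf{p}_4)\mathbf{x}}\, d\mathbf{x}}{(2\pi)^{3/2} \sqrt{2\omega_{\mathbf{p}_1}} \cdots \sqrt{2\omega_{\mathbf{p}_4}}} \Bigl[ \sum_{n=0}^{4} \binom{4}{n} a^*(\mathbf{p}_1) \cdots a^*(\mathbf{p}_n)\, a(-\mathbf{p}_{n+1}) \cdots a(-\mathbf{p}_4) \Bigr] \\
&= \int_{\mathbb{R}} \Bigl( \int_{\mathbb{R}^4} \frac{e^{-i\mathbf{p}_1\mathbf{x}}\, d\mathbf{p}_1}{\sqrt{2\pi}\sqrt{2\omega_{\mathbf{p}_1}}} \cdots \frac{e^{-i\mathbf{p}_4\mathbf{x}}\, d\mathbf{p}_4}{\sqrt{2\pi}\sqrt{2\omega_{\mathbf{p}_4}}}\, :\![a^*(\mathbf{p}_1) + a(-\mathbf{p}_1)] \cdots [a^*(\mathbf{p}_4) + a(-\mathbf{p}_4)]\!: \Bigr) g(\mathbf{x})\, d\mathbf{x} \\
&= \int_{\mathbb{R}} :\!\Bigl[ \int_{\mathbb{R}} \frac{e^{-i\mathbf{p}\mathbf{x}}\, d\mathbf{p}}{\sqrt{2\pi}\sqrt{2\omega_{\mathbf{p}}}}\, [a^*(\mathbf{p}) + a(-\mathbf{p})] \Bigr]^4\!:\, g(\mathbf{x})\, d\mathbf{x} \\
&= \int_{\mathbb{R}} :\!\phi_0^4(\mathbf{x})\!:\, g(\mathbf{x})\, d\mathbf{x}.
\end{align*}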

So $V(g)$ is in fact a smeared out version of the interaction term in the total Hamiltonian (5.1). We define the cut-off Hamiltonian $H(g)$ to be

\[
H(g) = H_0 + V(g).
\]

This cut-off Hamiltonian is self-adjoint with domain $D(H_0) \cap D(V(g))$.

For any bounded open set $\mathbf{O} \subset \mathbb{R}$ we define the set

\[
\mathbf{O}_t = \{ \mathbf{x} \in \mathbb{R} : \operatorname{dist}(\mathbf{x}, \mathbf{O}) < |t| \}.
\]

With this notation, Glimm and Jaffe show in the article [11] that the free Hamiltonian satisfies

\[
e^{itH_0} \mathcal{A}(\mathbf{O}) e^{-itH_0} \subset \mathcal{A}(\mathbf{O}_t), \tag{5.3}
\]

where $\mathcal{A}(\mathbf{O})$ is the Von Neumann algebra that was defined above. Now let $E_{\phi_0(f)}$ be the spectral measure of the closure of $\phi_0(f)$, which was already used in the definitions of $A(\mathbf{O})$ and $\mathcal{A}(\mathbf{O})$. We then define $\mathcal{M}$ to be the Von Neumann algebra generated by the set of projections

\[
\bigcup_{\Delta \in \mathcal{B}_{\mathbb{R}},\, f \in \mathcal{S}(\mathbb{R})} \bigl\{ E_{\phi_0(f)}(\Delta) \bigr\}.
\]

Note that the functions $f$ are not assumed to have a bounded support this time. Perhaps the most important result in the article [11] is the following theorem, which will allow us to remove the cut-off in the time-evolution of a local observable.

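Theorem 5.1 Let $\mathbf{O} \subset \mathbb{R}$ be an open interval. If $g \in \mathcal{S}(\mathbb{R})$ is real-valued with $\operatorname{supp}(g) \subset \mathbf{O}$, then

\[
e^{itV(g)} \in \mathcal{A}(\mathbf{O}) \cap \mathcal{M}. \tag{5.4}
\]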

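In what follows we choose $\mathbf{O}$ to be of the form $\mathbf{O} = \{ \mathbf{x} \in \mathbb{R} : |\mathbf{x}| < M \}$. Now let $A \in \mathcal{A}(\mathbf{O})$, and fix some $n \in \mathbb{N}$ and some $g \in \mathcal{S}(\mathbb{R})$. For $1 \leq k \leq n$ and $t \in \mathbb{R}$ we then define

\[
A_{n,k}(t) := \Bigl[ e^{i\frac{t}{n}H_0} e^{i\frac{t}{n}V(g)} \Bigr]^k A \Bigl[ e^{-i\frac{t}{n}V(g)} e^{-i\frac{t}{n}H_0} \Bigr]^k.
\]

Let $\varepsilon > 0$. We can write $g$ as a sum $g = g_1 + g_2$, where $g_1$ and $g_2$ are smooth and satisfy $\operatorname{supp}(g_1) \subset \mathbf{O}_\varepsilon$ and $\operatorname{supp}(g_2) \cap \mathbf{O}_{\varepsilon/2} = \emptyset$. Then $V(g) = V(g_1) + V(g_2)$ and it follows from the definition of $V(g)$ that $V(g_1)$ and $V(g_2)$ commute. Thus, for any $t \in \mathbb{R}$ we have

\[
e^{i\frac{t}{n}V(g)} = e^{i\frac{t}{n}V(g_1)} e^{i\frac{t}{n}V(g_2)}.
\]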


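Because $\operatorname{supp}(g_2) \cap \mathbf{O}_{\varepsilon/2} = \emptyset$, $e^{i\frac{t}{n}V(g_2)}$ commutes with$^{50}$ $\mathcal{A}(\mathbf{O}_{\varepsilon/4})$ and hence, in particular, with our operator $A \in \mathcal{A}(\mathbf{O}) \subset \mathcal{A}(\mathbf{O}_{\varepsilon/4})$. From this it follows that for $A_{n,1}(t)$ we have

\begin{align*}
A_{n,1}(t) &= e^{i\frac{t}{n}H_0} e^{i\frac{t}{n}V(g)} A\, e^{-i\frac{t}{n}V(g)} e^{-i\frac{t}{n}H_0} \\
&= e^{i\frac{t}{n}H_0} e^{i\frac{t}{n}V(g_1)} e^{i\frac{t}{n}V(g_2)} A\, e^{-i\frac{t}{n}V(g_2)} e^{-i\frac{t}{n}V(g_1)} e^{-i\frac{t}{n}H_0} \\
&= e^{i\frac{t}{n}H_0} e^{i\frac{t}{n}V(g_1)} A\, e^{-i\frac{t}{n}V(g_1)} e^{-i\frac{t}{n}H_0}.
\end{align*}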

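Thus, $A_{n,1}(t)$ only depends on $g_1$; in other words, $A_{n,1}(t)$ depends only on the value of $g$ in the region $\mathbf{O}_\varepsilon$. Theorem 5.1 implies that $e^{i\frac{t}{n}V(g_1)} A\, e^{-i\frac{t}{n}V(g_1)} \in \mathcal{A}(\mathbf{O}_\varepsilon)$ and equation (5.3) then implies that

\[
A_{n,1}(t) = e^{i\frac{t}{n}H_0} e^{i\frac{t}{n}V(g_1)} A\, e^{-i\frac{t}{n}V(g_1)} e^{-i\frac{t}{n}H_0} \in \mathcal{A}\bigl((\mathbf{O}_\varepsilon)_{t/n}\bigr) = \mathcal{A}(\mathbf{O}_{\varepsilon + t/n}).
\]

Because $A_{n,k}(t) = e^{i\frac{t}{n}H_0} e^{i\frac{t}{n}V(g)} A_{n,k-1}(t)\, e^{-i\frac{t}{n}V(g)} e^{-i\frac{t}{n}H_0}$, we can repeat the procedure above (with $A_{n,k-1}(t)$ instead of $A$) by choosing in each step an appropriate decomposition $g = g_1 + g_2$. The result is that for each $1 \leq k \leq n$ we have $A_{n,k}(t) \in \mathcal{A}(\mathbf{O}_{kt/n + k\varepsilon})$ and that $A_{n,k}(t)$ depends only on the value of $g$ in $\mathbf{O}_{kt/n + k\varepsilon}$. In particular, for $k = n$ we find that $A_{n,n}(t) \in \mathcal{A}(\mathbf{O}_{t + n\varepsilon})$ and that $A_{n,n}(t)$ depends only on the value of $g$ in $\mathbf{O}_{t + n\varepsilon}$. Because $\varepsilon$ was arbitrary, $A_{n,n}(t)$ depends only on the value of $g$ in $\overline{\mathbf{O}_t}$ (the closure of $\mathbf{O}_t$), and $A_{n,n}(t) \in \bigcap_{\varepsilon > 0} \mathcal{A}(\mathbf{O}_{t+\varepsilon})$. Thus, if $\mathbf{O}' \subset \mathbb{R}$ is an open region with $\mathbf{O}' \cap \overline{\mathbf{O}_t} = \emptyset$ then $A_{n,n}(t)$ commutes with every observable $B \in \mathcal{A}(\mathbf{O}')$. Because $n$ was arbitrary, the statements above hold for all $n$ and thus also for

\[
\sigma_t(A) := e^{itH(g)} A\, e^{-itH(g)} = \operatorname*{strong\,lim}_{n \to \infty} A_{n,n}(t),
\]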

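where we have used the Trotter product formula, which states that if $S$ and $T$ are self-adjoint and $S + T$ is essentially self-adjoint on $D(S) \cap D(T)$, then for each $\psi \in \mathcal{H}$ we have

\[
e^{i(S+T)}\psi = \lim_{n \to \infty} \bigl( e^{\frac{i}{n}S} e^{\frac{i}{n}T} \bigr)^n \psi.
\]

In the present case this means that

\[
e^{itH(g)} = \operatorname*{strong\,lim}_{n \to \infty} \bigl( e^{i\frac{t}{n}H_0} e^{i\frac{t}{n}V(g)} \bigr)^n.
\]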
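The Trotter product formula can be checked numerically in a finite-dimensional toy case, where all domain questions disappear. The sketch below is not part of the construction in [11]; it assumes the hypothetical choice $S = a\sigma_x$ and $T = b\sigma_z$ (Pauli matrices), for which every exponential has a closed form, and it compares $e^{i(S+T)}$ with the $n$-step product.

```python
import math

def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matpow(a, n):
    """n-th power of a 2x2 matrix by repeated squaring."""
    result = [[1, 0], [0, 1]]
    while n > 0:
        if n & 1:
            result = matmul(result, a)
        a = matmul(a, a)
        n >>= 1
    return result

def exp_i_pauli(theta, nx, nz):
    """exp(i*theta*(nx*sigma_x + nz*sigma_z)) for a unit vector (nx, nz).

    Since (nx*sigma_x + nz*sigma_z)^2 is the identity, the exponential is
    cos(theta)*1 + i*sin(theta)*(nx*sigma_x + nz*sigma_z).
    """
    c, s = math.cos(theta), 1j * math.sin(theta)
    return [[c + s * nz, s * nx], [s * nx, c - s * nz]]

def trotter_error(n, a=1.0, b=1.0):
    """Largest entry-wise distance between exp(i(S+T)) and the n-step
    Trotter product, with S = a*sigma_x and T = b*sigma_z."""
    r = math.hypot(a, b)
    exact = exp_i_pauli(r, a / r, b / r)
    step = matmul(exp_i_pauli(a / n, 1.0, 0.0), exp_i_pauli(b / n, 0.0, 1.0))
    approx = matpow(step, n)
    return max(abs(exact[i][j] - approx[i][j])
               for i in range(2) for j in range(2))

# The error should shrink as the number of steps n grows.
errors = {n: trotter_error(n) for n in (1, 10, 100, 1000)}
```

Because $\sigma_x$ and $\sigma_z$ do not commute, a single step ($n = 1$) is visibly wrong, while increasing $n$ drives the product toward the exact exponential.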

In particular, $\sigma_t(A)$ depends only on the value of $g$ in (the closure of) the region $\mathbf{O}_t$. The idea is now to take $g \in \mathcal{S}(\mathbb{R})$ to be a nonnegative function such that it equals the coupling constant $\lambda$ in the region $\mathbf{O}_t$. The time evolution $\sigma_t(A)$ of $A \in \mathcal{A}(\mathbf{O})$ is then determined by the value of $g$ in the region where it equals $\lambda$, and hence the cut-off has been removed. This was the main result of the article [11]. We will now turn to the second article, namely [12], of Glimm and Jaffe on the $(\lambda\phi^4)_2$-model.

The ground state of $H(g)$ and the field operators

The Hamiltonian $H(g)$ defined above is bounded from below, i.e. its spectrum has an infimum $E_g \in \mathbb{R}$. This was shown by Nelson in [27], and later in a more general context by Glimm in [10]. There exists a vector $\Omega_g \in \mathcal{F}$ that satisfies $H(g)\Omega_g = E_g\Omega_g$ and $\|\Omega_g\| = 1$, and this vector is uniquely determined up to a phase factor. This phase factor is fixed by the requirement that $\langle \Omega_g, \Omega_{\mathrm{Fock}} \rangle > 0$. The existence and uniqueness of the ground state $\Omega_g$ are the content of theorems 2.2.1 and 2.3.1 of the article [12].

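Using the $t = 0$ field $\phi_0(\mathbf{x})$, we define the field $\phi_g(t,\mathbf{x})$ by

\[
\phi_g(t,\mathbf{x}) = e^{itH(g)} \phi_0(\mathbf{x}) e^{-itH(g)}
\]

as a bilinear form on some subset of $\mathcal{F}$, and if $\psi \in \mathcal{F}$ lies in this subset, then the function $(t,\mathbf{x}) \mapsto \langle \phi_g(t,\mathbf{x})\psi, \psi \rangle$ is continuous. For each $f \in \mathcal{S}(\mathbb{R}^2)$ we then define another bilinear form $A_{g,f}(t)$ by

\[
A_{g,f}(t) = \int_{\mathbb{R}} \phi_g(t,\mathbf{x}) f(t,\mathbf{x})\, d\mathbf{x}.
\]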

$^{50}$ This is argued by Glimm and Jaffe in the proof of theorem 5.1 above.


This bilinear form gives rise to a self-adjoint operator which we will also denote by $A_{g,f}(t)$. In a similar fashion, we obtain a self-adjoint operator from the bilinear form

\[
B_{g,f}(t) = \int_{\mathbb{R}} \pi_g(t,\mathbf{x}) f(t,\mathbf{x})\, d\mathbf{x},
\]

where $\pi_g(t,\mathbf{x}) = e^{itH(g)} \pi_0(\mathbf{x}) e^{-itH(g)}$. Using these self-adjoint operators, we then define the integrals

\begin{align*}
\phi_g(f)\psi &:= \int_{\mathbb{R}} A_{g,f}(t)\psi\, dt, \\
\pi_g(f)\psi &:= \int_{\mathbb{R}} B_{g,f}(t)\psi\, dt,
\end{align*}

which in turn define closed symmetric operators $\phi_g(f)$ and $\pi_g(f)$. Under certain conditions on $f$, which we will not discuss here (see section 3.2 of the article [12] for the details), these operators satisfy

\[
(\partial_t\phi_g)(f) = \pi_g(f) = [iH(g), \phi_g(f)]
\]

on a certain subset of $\mathcal{F}$. A similar reasoning then also gives that

\[
(\partial_t^2\phi_g)(f) = [iH(g), \pi_g(f)].
\]

The commutator on the right-hand side is a bilinear form equal to

\[
(\partial_{\mathbf{x}}^2\phi_g)(f) - m^2\phi_g(f) - 4\int_{\mathbb{R}^2} :\!\phi_g^3(t,\mathbf{x})\!:\, f(t,\mathbf{x})\, g(\mathbf{x})\, d\mathbf{x}\, dt,
\]

where $:\!\phi_g^3(t,\mathbf{x})\!:$ is a shorthand for $e^{itH(g)} :\!\phi_0^3(\mathbf{x})\!: e^{-itH(g)}$, so $\phi_g$ satisfies the differential equation

\[
(\Box + m^2)\phi_g(f) = -4\lambda :\!\phi_g^3\!:(f),
\]

where the equality is in the sense of bilinear forms. If $f$ has compact support, the operator $\phi_g(f)$ is self-adjoint and the differential equation above can be interpreted as an equality of self-adjoint operator-valued distributions. Also, if $f$ has compact support, say $\operatorname{supp}(f) \subset O$ for some bounded open region $O$ in $\mathbb{R}^2$, then we can remove the cutoff $g$ in a similar manner as we did for operators $A \in \mathcal{A}(\mathbf{O})$ above, i.e. by choosing a function $g^{(O)} \in \mathcal{S}(\mathbb{R})$ such that $g^{(O)}(\mathbf{x}) = \lambda$ on an interval $I$ of $\mathbb{R}$ whose causal shadow contains $O$. Thus for each bounded open subset $O$ we can define a field $\phi^{(O)}(t,\mathbf{x})$ without any cutoff, where $\phi^{(O)}(f) = \phi_{g^{(O)}}(f)$. We now want to patch such $\phi^{(O)}(t,\mathbf{x})$ together to form a field $\phi'(t,\mathbf{x})$ without cutoffs. To accomplish this, we divide $\mathbb{R}^2$ into overlapping squares $S_j$ and we define a partition of unity subordinate to this open cover of overlapping squares, see also section 3.4 of [12]. Thus, we define a set of functions $\zeta_j : \mathbb{R}^2 \to [0,1]$ with $\operatorname{supp}(\zeta_j) \subset S_j$ and with $\sum_j \zeta_j = 1_{\mathbb{R}^2}$. A function $f \in \mathcal{S}(\mathbb{R}^2)$ can then be written as $f = \sum_j f\zeta_j =: \sum_j f_j$, where $\operatorname{supp}(f_j) \subset S_j$. The idea is now to define a bilinear form

\[
A'_{g,f}(t) = \sum_j \int_{\mathbb{R}} \phi^{(S_j)}(t,\mathbf{x}) f_j(t,\mathbf{x})\, d\mathbf{x},
\]

which gives rise to a self-adjoint operator which we will also denote by $A'_{g,f}(t)$. In a similar manner we also obtain a self-adjoint operator $B'_{g,f}(t)$ by replacing $\phi^{(S_j)}$ by $\pi^{(S_j)}$. We then define the integrals

\begin{align*}
\phi'(f)\psi &:= \int_{\mathbb{R}} A'_{g,f}(t)\psi\, dt \tag{5.5} \\
\pi'(f)\psi &:= \int_{\mathbb{R}} B'_{g,f}(t)\psi\, dt, \tag{5.6}
\end{align*}


which give rise to closed symmetric operators $\phi'(f)$ and $\pi'(f)$ for $f \in \mathcal{S}(\mathbb{R}^2)$. We write $\phi'$ and $\pi'$ (instead of $\phi$ and $\pi$) to distinguish these objects from the free field $\phi$ and its time-derivative $\pi$. The field $\phi'$ is local in the sense that $\phi'(f_1)$ and $\phi'(f_2)$ commute whenever the supports of $f_1$ and $f_2$ are mutually spacelike separated.

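The algebra of local observables

For each bounded open subset $O \subset \mathbb{R}^2$ of spacetime, we define $\mathfrak{U}(O)$ to be the Von Neumann algebra generated by the set

\[
\bigl\{ e^{i\phi'(f)} : \operatorname{supp}(f) \subset O,\ f = \overline{f} \bigr\}
\]

of (bounded) operators on $\mathcal{F}$. We will show that these algebras satisfy the Haag-Kastler axioms. It is clear that for bounded open sets $O_1 \subset O_2 \subset \mathbb{R}^2$ we have $\mathfrak{U}(O_1) \subset \mathfrak{U}(O_2)$, so monotonicity is satisfied. By construction of the field $\phi'$, we can find for any bounded open $O \subset \mathbb{R}^2$ a function $g$ which equals $\lambda$ on an interval of $\mathbb{R}$ and is such that for all $f \in \mathcal{S}(\mathbb{R}^2)$ with $\operatorname{supp}(f) \subset O$ we have

\[
e^{i\Delta t H(g)} \phi'(f) e^{-i\Delta t H(g)} = \phi'(f_{\Delta t, 0}), \tag{5.7}
\]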

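where $f_{\Delta t, \Delta\mathbf{x}}(t,\mathbf{x}) = f(t - \Delta t, \mathbf{x} - \Delta\mathbf{x})$. Because $\operatorname{supp}(f_{\Delta t, 0}) \subset O + (\Delta t, 0)$, we see that the spectral measures of $\phi'(f_{\Delta t, 0})$ are in $\mathfrak{U}(O + (\Delta t, 0))$. So (5.7) induces a map $\mathfrak{U}(O) \to \mathfrak{U}(O + (\Delta t, 0))$ for each bounded open region $O \subset \mathbb{R}^2$ and this map is in fact a $C^*$-isomorphism. Thus, this map gives us a transformation of the algebras $\mathfrak{U}(O)$ under time translations that is of the form required by the Haag-Kastler axioms. For space translations, we first consider the free field generator $P$ of space translations. Because $\phi'(0,\mathbf{x}) = \phi_0(\mathbf{x})$, we have

\[
e^{-i\Delta\mathbf{x} P} \phi'(0,\mathbf{x}) e^{i\Delta\mathbf{x} P} = \phi'(0, \mathbf{x} + \Delta\mathbf{x}).
\]

If one now chooses a cutoff function $g$ such that the interval where $g = \lambda$ is large enough, we get

\begin{align*}
e^{-i\Delta\mathbf{x} P} \phi'(t,\mathbf{x}) e^{i\Delta\mathbf{x} P} &= e^{-i\Delta\mathbf{x} P} e^{itH(g)} \phi'(0,\mathbf{x}) e^{-itH(g)} e^{i\Delta\mathbf{x} P} \\
&= e^{itH(g)} e^{-i\Delta\mathbf{x} P} \phi'(0,\mathbf{x}) e^{i\Delta\mathbf{x} P} e^{-itH(g)} \\
&= e^{itH(g)} \phi'(0, \mathbf{x} + \Delta\mathbf{x}) e^{-itH(g)} \\
&= \phi'(t, \mathbf{x} + \Delta\mathbf{x}),
\end{align*}

see section 3.6 of [12]. In particular, this shows that

\[
e^{-i\Delta\mathbf{x} P} \phi'(f) e^{i\Delta\mathbf{x} P} = \phi'(f_{0, \Delta\mathbf{x}})
\]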

for any $f \in \mathcal{S}(\mathbb{R}^2)$ with bounded support. Applying the same reasoning as for time translations, we find a map that transforms $\mathfrak{U}(O)$ to $\mathfrak{U}(O + (0, \Delta\mathbf{x}))$ as required by the Haag-Kastler axioms. Showing that there also exists a transformation of the algebras $\mathfrak{U}(O)$ under Lorentz boosts with all the desired properties is much harder. In fact, Cannon and Jaffe have written an article of 61 pages to prove this covariance under Lorentz boosts, see [4]. The starting point of their solution is to consider the expression for the Lorentz boost generator as used in physics, i.e. the expression in terms of the energy-momentum tensor of the field. Analogous to the treatment of the interaction term in the Hamiltonian, they introduce cutoff functions to obtain a well-defined local version of the boost generator. This local boost generator then defines a transformation of the algebras $\mathfrak{U}(O)$ under Lorentz boosts, and this transformation can be shown to have the desired properties. This, together with the results about spacetime translations above, completes the verification of the covariance axiom. So for each Poincaré transformation $(a, L)$ in two-dimensional spacetime, we have a $C^*$-automorphism $\sigma_{(a,L)} : \mathfrak{U} \to \mathfrak{U}$ such that for each bounded region $O$ of spacetime the restriction of $\sigma_{(a,L)}$ to $\mathfrak{U}(O)$ defines a $C^*$-isomorphism

\[
\sigma_{(a,L)} : \mathfrak{U}(O) \to \mathfrak{U}((a, L)O). \tag{5.8}
\]

Finally, the locality axiom is also satisfied because we already noticed that $\phi'(f_1)$ and $\phi'(f_2)$ commute whenever the supports of the two compactly supported functions $f_1$ and $f_2$ are spacelike separated, and the algebras $\mathfrak{U}(O)$ are constructed from the smeared fields $\phi'(f)$ with $f$ compactly supported. The algebra of observables $\mathfrak{U}$ is now obtained by taking the norm completion of $\bigcup_O \mathfrak{U}(O)$.


5.1.2 The physical vacuum for the $(\lambda\phi^4)_2$-model

As stated above, the spectrum of the cutoff Hamiltonian $H(g)$ has an infimum $E_g \in \mathbb{R}$ and there exists a unique unit vector $\Omega_g \in \mathcal{F}$, up to a phase factor, that satisfies $H(g)\Omega_g = E_g\Omega_g$. This vector was called the ground state of $H(g)$. In order to obtain an operator such that the ground state has eigenvalue 0, we define

\[
H_g := H(g) - E_g.
\]

For this operator the vector $\Omega_g$ is the unique vector, up to a phase factor, with the property $H_g\Omega_g = 0$. Corresponding to the vector $\Omega_g$ we can define a linear functional $\omega_g : \mathfrak{U} \to \mathbb{C}$, with $\mathfrak{U}$ the algebra of observables defined above, by

\[
\omega_g(A) = \langle A\Omega_g, \Omega_g \rangle.
\]

This linear functional is a state in the sense of $C^*$-algebras, as is any vector state. In the article [13], Glimm and Jaffe use this state $\omega_g$ to construct a physical vacuum state. They begin with a cutoff function $g$ that equals the coupling constant $\lambda$ in some interval of the form $[-M, M]$ and then they define the sequence $(g_n)$ of functions $g_n(\mathbf{x}) = g(\mathbf{x}/n)$. If $\sigma_{\mathbf{x}} : \mathfrak{U} \to \mathfrak{U}$ denotes the transformation of $\mathfrak{U}$ corresponding to a space translation over the vector $\mathbf{x}$, then they define a sequence of states $(\omega_n)$ by

\[
\omega_n(A) = \int_{\mathbb{R}} \frac{h(\mathbf{x}/n)}{n} \langle \sigma_{\mathbf{x}}(A)\Omega_{g_n}, \Omega_{g_n} \rangle\, d\mathbf{x}, \tag{5.9}
\]

where $h$ is a smooth nonnegative function with support in $[-1, 1]$ and

\[
\int_{\mathbb{R}} h(\mathbf{x})\, d\mathbf{x} = 1,
\]

which also implies that $\mathbf{x} \mapsto h(\mathbf{x}/n)/n$ integrates to 1. This sequence $(\omega_n)$ of states can be shown to have a weakly convergent subsequence $(\omega_{n_k})$, the limit of which is denoted by $\omega$, i.e. for all $A \in \mathfrak{U}$ we have

\[
\lim_{k \to \infty} \omega_{n_k}(A) = \omega(A). \tag{5.10}
\]

The obtained state $\omega$ can now be used to define a physical vacuum state in a physical Hilbert space, via the Gelfand-Naimark-Segal construction. Thus we consider the GNS-representation $\pi_\omega : \mathfrak{U} \to B(\mathcal{H}_\omega)$ of the $C^*$-algebra $\mathfrak{U}$ in the Hilbert space $\mathcal{H}_\omega$, in which the state $\omega$ is given by

\[
\omega(A) = \langle \pi_\omega(A)\Omega_\omega, \Omega_\omega \rangle,
\]

where the unit vector $\Omega_\omega \in \mathcal{H}_\omega$ is uniquely determined up to a phase factor. As shown in theorem 2.1 of [13], on the Hilbert space $\mathcal{H}_\omega$ there exists a unitary representation $U(a)$ of the translation group and this representation satisfies

\begin{align*}
U(a)\pi_\omega(A)U(a)^* &= \pi_\omega(\sigma_{(a,1)}(A)) \\
U(a)\Omega_\omega &= \Omega_\omega,
\end{align*}

where $A \in \mathfrak{U}$ and $\sigma_{(a,L)}$ is as in equation (5.8). The existence of $U(a)$ follows from the fact that $\omega$ is a translation-invariant state, i.e. $\omega(\sigma_{(a,1)}(A)) = \omega(A)$. The representation $U(a)$ is strongly continuous, so according to the SNAG-theorem (which we also used in step 1 of the classification of the irreps of $P_+^\uparrow$) there exist two commuting self-adjoint operators $H$ and $\mathbf{P}$ such that

\[
U(a) = e^{ia \cdot P},
\]

where $P = (H, \mathbf{P})$. The operator $H$ is positive, which is a consequence of (5.10) and of the fact that each $H_{g_n}$ is positive; see also the end of the proof of theorem 2.1 of [13].


In the proof of theorem 2.2 of [13], it is shown that $\mathcal{H}_\omega$ is a separable Hilbert space and that for each bounded region $O$ of spacetime, there exists a unitary operator $U_O : \mathcal{F} \to \mathcal{H}_\omega$ such that for all $A \in \mathfrak{U}(O)$

\[
\pi_\omega(A) = U_O A U_O^*.
\]

This is also true if $O$ is replaced by a bounded region $\mathbf{O}$ of space at time $t = 0$. So locally the representation is unitarily equivalent to the local algebra of Fock space operators. For this reason, the representation $(\mathcal{H}_\omega, \pi_\omega)$ is called locally Fock. This property can be used to construct fields on the Hilbert space $\mathcal{H}_\omega$ as follows. Let $O$ be a bounded open region of spacetime and let $f \in \mathcal{S}(\mathbb{R}^2)$ be a real-valued function with support in $O$. Then $\phi'(f)$ is self-adjoint on the Fock space $\mathcal{F}$, and hence $s \mapsto e^{is\phi'(f)}$ defines a strongly continuous one-parameter unitary group on $\mathcal{F}$. Using the unitary map $U_O$ described above, we then obtain a strongly continuous one-parameter unitary group

\[
s \mapsto \pi_\omega(e^{is\phi'(f)}) = U_O e^{is\phi'(f)} U_O^*
\]

on the Hilbert space $\mathcal{H}_\omega$. According to Stone's theorem there exists a self-adjoint operator $\phi_\omega(f)$ on $\mathcal{H}_\omega$ that generates this unitary group. In Stone's theorem the self-adjoint operator is constructed explicitly in terms of the derivative of the unitary group with respect to the parameter, and from this construction it easily follows that the generators of the two unitary groups on $\mathcal{F}$ and $\mathcal{H}_\omega$ are related by

\[
\phi_\omega(f) = U_O \phi'(f) U_O^*.
\]

In the previous subsection we showed how a partition of unity can be used to define the smeared fields $\phi'(f)$ for $f \in \mathcal{S}(\mathbb{R}^2)$. This same technique can also be used to define $\phi_\omega(f)$ for arbitrary Schwartz functions, see also the last pages of [15].

5.1.3 The $\mathcal{P}(\phi)_2$-model and verification of some of the Wightman axioms

At the beginning of the 1970s, all results in the previous two sections on the $(\lambda\phi^4)_2$-model were rederived for the more general $\mathcal{P}(\phi)_2$-model, characterized (formally) by the Hamiltonian

\[
H = H_0 + \lambda \int_{\mathbb{R}} :\!\mathcal{P}(\phi_0(\mathbf{x}))\!:\, d\mathbf{x},
\]

where $\mathcal{P}$ is a polynomial that is bounded from below. So for the $\mathcal{P}(\phi)_2$-model the Haag-Kastler axioms were established, as well as the existence of a vacuum state $\omega$ that gives rise to a locally Fock representation $(\mathcal{H}_\omega, \pi_\omega)$ of the algebra of observables, and on $\mathcal{H}_\omega$ there is a unitary representation of the translation group with corresponding energy-momentum operators. Also, the locally Fock representation allows the construction of fields $\phi_\omega$ as in the $(\lambda\phi^4)_2$-model. The problem of verifying the Wightman axioms for the $(\lambda\phi^4)_2$-model could thus be investigated in the more general context of the $\mathcal{P}(\phi)_2$-model.

Some of the first (new) results in this more general context were derived in the articles [14] and [15]. In these articles, Glimm and Jaffe show that the energy-momentum spectrum lies in the forward light cone for the $\mathcal{P}(\phi)_2$-model, as required by the Wightman axioms (spectral condition), and that $H\Omega_\omega = 0 = \mathbf{P}\Omega_\omega$. They also show that, under the assumption that the model has a mass gap, the vacuum vector $\Omega_\omega$ is unique and the vacuum expectation values exist; we will come back to the vacuum expectation values later. What is not established in these articles is the Lorentz invariance of the vacuum, i.e. $\omega(\sigma_{(0,L)}(A)) = \omega(A)$ for all $A \in \mathfrak{U}$. The state of affairs for the $\mathcal{P}(\phi)_2$-model at this point in history (i.e. 1970/1971) is also summarized in part I of the lecture notes [16], which can also be found in volume 1 of the two-volume book 'collected papers' of Glimm and Jaffe. However, it soon became clear that there was a gap in the proof of the spectral condition, as was pointed out by Fröhlich and Faris. So the spectral condition was no longer established for the $\mathcal{P}(\phi)_2$-model. In the meantime, Streater proved in the article [31] that if one could prove the spectral condition, then the Lorentz covariance of the Wightman functions would follow automatically (given the results that were already established at that point). The existence of these Wightman functions (as tempered distributions) was established in the article [17] of


Glimm and Jaffe, which was published later than the article [31] of Streater, but Streater explains that Glimm and Jaffe communicated some of their results before their article was published. In [17] Glimm and Jaffe begin with the quantities of the form

\[
\langle \phi'(f_1) \cdots \phi'(f_m)\Omega_g, \Omega_g \rangle \tag{5.11}
\]

and they show that these quantities can be bounded in absolute value by a product of Schwartz space norms $\|f_1\|_1 \cdots \|f_m\|_m$,

\[
|\langle \phi'(f_1) \cdots \phi'(f_m)\Omega_g, \Omega_g \rangle| \leq \|f_1\|_1 \cdots \|f_m\|_m,
\]

independently of the cutoff $g$, and such that each of the norms is translation invariant (i.e. a translation of a function $f$ does not change the norm of $f$). In a similar manner as in equation (5.9) (with $n = 1$), they then average the quantities in (5.11) over space translations:

\[
\int_{\mathbb{R}} h(\mathbf{x}) \langle \phi'((f_1)_{((0,\mathbf{x}),1)}) \cdots \phi'((f_m)_{((0,\mathbf{x}),1)})\Omega_g, \Omega_g \rangle\, d\mathbf{x},
\]

where $h$ is a similar function as in (5.9) and $f_{(a,L)}(x) = f(L^{-1}(x - a))$ as usual. Due to translation invariance of the norms described above, this averaged quantity is also bounded by the same product of norms. By considering sequences $(g_n)$ as in (5.9) and by taking a convergent subsequence, we then obtain quantities that we denote by

\[
\langle \phi_\omega(f_1) \cdots \phi_\omega(f_m)\Omega_\omega, \Omega_\omega \rangle.
\]

This procedure allows us to define the vacuum expectation values for the field $\phi_\omega$, even though we are not sure whether the expressions $\phi_\omega(f_1) \cdots \phi_\omega(f_m)\Omega_\omega$ are well-defined. The bounds above still hold for these vacuum expectation values, which shows that they are separately continuous and therefore, by the nuclear theorem, define tempered distributions on the Schwartz space $\mathcal{S}(\mathbb{R}^{2m})$. Although it is not yet clear whether these vacuum expectation values satisfy all the properties of Wightman functions, we can still use the reconstruction theorem to construct a Hilbert space with quantum fields, but this theory might not satisfy all the Wightman axioms. For instance, it is not clear whether the spectrum condition is satisfied or whether the vacuum is unique. Also Lorentz covariance is not established, but as explained above this follows once the spectrum condition holds. A summary of all results for the $\mathcal{P}(\phi)_2$-model up to this moment in history (i.e. 1972) is given in the notes [18], which can also be found in volume 1 of the 'collected papers'.

5.1.4 Similar methods for other models

Without going into further detail, we mention that, up to the beginning of the 1970s, techniques similar to those used for the $\mathcal{P}(\phi)_2$-model were also used to establish some results for other models. Among these models were the two-dimensional Yukawa model, or $Y_2$-model, and the two-dimensional model with exponential bosonic self-interaction. However, the results for these models did not go as far as those for the $\mathcal{P}(\phi)_2$-model. For a summary of the results for the $Y_2$-model, one can consult part II of the notes [16].

5.2 The Euclidean approach

Despite the hard work of constructive field theorists that we described above, the results at the beginning of the 1970s were still very modest. For that reason there was a great need for a new approach to the constructive field theory program, other than the brute-force methods above. This new approach was Euclidean quantum field theory.

To understand Euclidean quantum field theory, recall that the Wightman functions are boundary values of holomorphic functions $W^{\mathrm{hol}}_{i_1 \ldots i_n}(z_1, \ldots, z_n)$ defined on the extended tube $T'_n$. Here we consider the general case of a quantum field theory in $d$-dimensional Minkowski spacetime $\mathbb{M}^d$. It


can be shown that in a quantum field theory (in the sense of Wightman) with a normal connection between spin and statistics, these holomorphic functions can be analytically continued to the symmetrized tube

\[
(T'_n)^S := \bigcup_{\sigma \in S_n} \sigma T'_n,
\]

where $\sigma \in S_n$ permutes the $n$ variables of $(z_1, \ldots, z_n) \in T'_n$ in the obvious way. This is (part of) the content of theorem 9.6 in [2]. To each $x \in \mathbb{M}^d$ we now assign an element $x' \in \mathbb{C}^d$ given by

\[
x' = (ix^0, \mathbf{x}). \tag{5.12}
\]

A point $(z_1, \ldots, z_n) \in \mathbb{C}^{dn}$ where each of the $z_j$ is of the form (5.12) is called a Euclidean point. If such a Euclidean point satisfies the property that $z_j \neq z_k$ for $j \neq k$, then we speak of a non-exceptional (or non-coincident) Euclidean point. The important step toward Euclidean quantum field theory is the statement that $(T'_n)^S$ contains all non-exceptional Euclidean points, which is proposition 9.10 in [2]. As a consequence, the $W^{\mathrm{hol}}_{i_1 \ldots i_n}(z_1, \ldots, z_n)$ are holomorphic on the set of non-exceptional Euclidean points. We can use this property as follows. Define the set of points

\[
(\mathbb{R}^d)^n_{\neq} := \bigl\{ (x_1, \ldots, x_n) \in (\mathbb{R}^d)^n : x_j \neq x_k \text{ for all } j \neq k \bigr\}
\]

and let $x \mapsto x'$ be as in (5.12). Then we can define the Schwinger functions

\[
S_{i_1 \ldots i_n}(x_1, \ldots, x_n) := W^{\mathrm{hol}}_{i_1 \ldots i_n}(x'_1, \ldots, x'_n),
\]

which are real-analytic functions on $(\mathbb{R}^d)^n_{\neq}$. The most important property of the Schwinger functions is that they are $E_+(d)$-covariant, i.e. covariant in the Euclidean sense.
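The elementary computation behind the passage from Lorentz to Euclidean covariance is that the Minkowski bilinear form, evaluated at a point of the form (5.12), equals minus the Euclidean square of the original vector. The sketch below checks this for one sample vector; it assumes the metric signature $(+, -, \ldots, -)$, and the function names are hypothetical.

```python
def minkowski_square(z):
    """z . z = z_0^2 - z_1^2 - ... - z_{d-1}^2 for a complex d-vector z.

    The signature (+, -, ..., -) is an assumption of this sketch.
    """
    return z[0] * z[0] - sum(c * c for c in z[1:])

def euclidean_square(x):
    """Ordinary Euclidean square x_0^2 + ... + x_{d-1}^2."""
    return sum(c * c for c in x)

def to_euclidean_point(x):
    """The assignment x -> x' = (i * x_0, x_1, ..., x_{d-1}) of (5.12)."""
    return [1j * x[0]] + list(x[1:])

x = [3.0, 1.0, -2.0, 0.5]
lhs = minkowski_square(to_euclidean_point(x))  # Minkowski form at x'
rhs = -euclidean_square(x)                     # minus the Euclidean square of x
```

Since the Euclidean square is invariant under rotations of $\mathbb{R}^d$, this identity makes it plausible that functions of Lorentz-invariant combinations become functions of rotation-invariant combinations at Euclidean points.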

5.2.1 Euclidean fields and probability theory

At the beginning of the 1970s Edward Nelson developed a framework for Euclidean quantum fields in terms of certain stochastic processes. For a good comprehension of these ideas we first recall some terminology from probability theory. The entire content of this subsection can be found in [30] and [28], but the order in which we present the material is quite different from these references.

Probability spaces

A probability space is a measure space $(X, \mathcal{A}, \mu)$ with $\mu(X) = 1$. The $\sigma$-algebra $\mathcal{A}$ has the structure of a ring when we define addition by $A \Delta B = (A \setminus B) \cup (B \setminus A)$ and multiplication by $A \cap B$. If $\mathcal{N}_\mu$ denotes the collection of all sets in $\mathcal{A}$ with $\mu$-measure zero, then $\mathcal{N}_\mu$ is an ideal in $\mathcal{A}$ and we can define the quotient ring $\mathcal{A}/\mathcal{N}_\mu$; we denote the equivalence class of $A \in \mathcal{A}$ by $[A]$. The measure $\mu$ then defines a function $[\mu]$ on this quotient in the obvious way. Two probability spaces $(X, \mathcal{A}, \mu)$ and $(X', \mathcal{A}', \mu')$ are called isomorphic if there exists a ring isomorphism $\psi : \mathcal{A}/\mathcal{N}_\mu \to \mathcal{A}'/\mathcal{N}_{\mu'}$ such that for all $[A] \in \mathcal{A}/\mathcal{N}_\mu$ we have $[\mu'](\psi([A])) = [\mu]([A])$.
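The ring structure and the ideal of null sets can be made concrete on a small finite example. The following sketch assumes a hypothetical three-point space whose $\sigma$-algebra is the full power set, with a measure that gives one point weight zero; it checks distributivity, the ideal property, and counts the classes of the quotient ring.

```python
from itertools import combinations

def powerset(universe):
    """All subsets of a finite set, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def add(a, b):
    """Ring addition: symmetric difference."""
    return a ^ b

def mul(a, b):
    """Ring multiplication: intersection."""
    return a & b

X = frozenset({0, 1, 2})
sigma_algebra = powerset(X)

# Hypothetical probability measure: the point 2 carries no mass.
weights = {0: 0.5, 1: 0.5, 2: 0.0}

def mu(a):
    return sum(weights[x] for x in a)

null_sets = [a for a in sigma_algebra if mu(a) == 0]

# Multiplication distributes over addition.
distributive = all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
                   for a in sigma_algebra
                   for b in sigma_algebra
                   for c in sigma_algebra)

# Null sets are closed under addition and absorb multiplication.
is_ideal = all(add(n1, n2) in null_sets and mul(a, n1) in null_sets
               for n1 in null_sets for n2 in null_sets
               for a in sigma_algebra)

# Equivalence classes [A] = {A + N : N a null set} of the quotient ring.
classes = {frozenset(add(a, n) for n in null_sets) for a in sigma_algebra}
```

In this example the quotient identifies every set with the set obtained by adding or removing the massless point, so the eight measurable sets collapse to four classes.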

Let $(X, \mathcal{A}, \mu)$ and $(X', \mathcal{A}', \mu')$ be two probability spaces and let $T : X \to X'$ be $(\mathcal{A}, \mathcal{A}')$-measurable, i.e. $T^{-1}(A') \in \mathcal{A}$ for all $A' \in \mathcal{A}'$. Then $T$ is called a measure-preserving transformation if $\mu(T^{-1}(A')) = \mu'(A')$ for all $A' \in \mathcal{A}'$. If $T$ is bijective and if its inverse $T^{-1} : X' \to X$ is also measure-preserving, then $T$ is called an invertible measure-preserving transformation. In particular, we can apply this terminology to the case where the two probability spaces coincide. In this case the invertible measure-preserving transformations form a group under composition, which will be denoted by $\mathcal{T}(X, \mathcal{A}, \mu)$, or simply $\mathcal{T}$ when there is no confusion about the probability space.
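As a small illustration of the definition, the sketch below tests the condition $\mu(T^{-1}(A')) = \mu(A')$ for a cyclic shift on a six-point space. Both measures used here (a uniform one and a non-uniform one) are hypothetical choices made only for this example.

```python
from fractions import Fraction
from itertools import combinations

X = range(6)
uniform = {x: Fraction(1, 6) for x in X}
biased = {x: Fraction(x + 1, 21) for x in X}  # hypothetical non-uniform weights

def mu(weights, event):
    """Measure of an event (a subset of X) under the given point weights."""
    return sum((weights[x] for x in event), Fraction(0))

def preimage(T, event):
    """T^{-1}(event) as a subset of X."""
    return frozenset(x for x in X if T(x) in event)

def preserves(T, weights):
    """Check mu(T^{-1}(A)) == mu(A) for every subset A of X."""
    events = [frozenset(c) for r in range(7) for c in combinations(X, r)]
    return all(mu(weights, preimage(T, e)) == mu(weights, e) for e in events)

def shift(x):
    """Cyclic shift on the six points."""
    return (x + 1) % 6

shift_preserves_uniform = preserves(shift, uniform)
shift_preserves_biased = preserves(shift, biased)
```

The shift is a bijection, so it preserves the uniform measure, but it fails to preserve the non-uniform one because it moves mass between points of different weight.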

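Random variables

A function $f : X \to \mathbb{R}$ on a probability space $(X, \mathcal{A}, \mu)$ is called a random variable if it is $(\mathcal{A}, \mathcal{B}_{\mathbb{R}})$-measurable, where $\mathcal{B}_{\mathbb{R}}$ denotes the Borel $\sigma$-algebra on $\mathbb{R}$. We denote the set of all random variables on $(X, \mathcal{A}, \mu)$ by $L_{\mathbb{R}}(X, \mathcal{A})$. A random variable $f \in L_{\mathbb{R}}(X, \mathcal{A})$ defines a probability measure $\mu_f$, called the probability distribution of $f$, on the measurable space $(\mathbb{R}, \mathcal{B}_{\mathbb{R}})$ by

\[
\mu_f(B) := \mu(f^{-1}(B)).
\]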


The Fourier transform $c_f$ of $\mu_f$,

$$c_f(t):=\int_{\mathbb{R}}e^{itx}\,d\mu_f(x)=\int_Xe^{itf}\,d\mu,$$

is called the characteristic function of $f$. The expectation value of an integrable random variable $f\in L_{\mathbb{R}}(X,\mathcal{A})$ is defined by

$$E_\mu(f):=\int_Xf\,d\mu=\int_{\mathbb{R}}x\,d\mu_f(x).$$

Often, we will also write $\langle f\rangle_\mu$ to denote the expectation value of $f$. If $f\in L_{\mathbb{R}}(X,\mathcal{A})$ is a random variable with $f^n$ integrable, then the $n$-th moment of $f$ is defined as

$$\int_Xf^n\,d\mu=\int_{\mathbb{R}}x^n\,d\mu_f(x).$$

If the characteristic function $c_f$ is $C^\infty$, then $f$ has moments of all orders and we can obtain these moments by differentiation of $c_f$:

$$\int_Xf^n\,d\mu=(-i)^n\left.\left(\frac{d}{dt}\right)^nc_f\right|_{t=0}.$$
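As a sanity check of the differentiation formula, one can take a characteristic function of the form $c(t)=e^{-\frac{1}{2}at^2}$ (the Gaussian form) and read off the moments from its Taylor coefficients. The sketch below is an illustration under that assumption; the function names are hypothetical and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction
from math import factorial


def gaussian_char_coeff(a, n):
    """Coefficient of t^n in the Taylor series of exp(-a t^2 / 2)."""
    if n % 2 == 1:
        return Fraction(0)
    k = n // 2
    return (Fraction(a) * Fraction(-1, 2)) ** k / factorial(k)


def moment_from_char_fn(a, n):
    """n-th moment via (-i)^n times the n-th derivative of c at t = 0."""
    if n % 2 == 1:
        # All odd derivatives of exp(-a t^2 / 2) vanish at t = 0.
        return Fraction(0)
    # The n-th derivative at 0 equals n! times the n-th Taylor coefficient.
    derivative_at_zero = factorial(n) * gaussian_char_coeff(a, n)
    # For even n the prefactor (-i)^n is real: (-i)^(2k) = (-1)^k.
    return (-1) ** (n // 2) * derivative_at_zero
```

For $a=1$ this reproduces the familiar even moments $1, 3, 15$ of a standard Gaussian, i.e. $(n-1)!!$ for even $n$.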

If we have two isomorphic probability spaces $(X,\mathcal{A},\mu)$ and $(X',\mathcal{A}',\mu')$ with isomorphism $\psi:\mathcal{A}/\mathcal{N}_\mu\to\mathcal{A}'/\mathcal{N}_{\mu'}$, then we say that two random variables $f\in L_{\mathbb{R}}(X,\mathcal{A})$ and $f'\in L_{\mathbb{R}}(X',\mathcal{A}')$ correspond under the isomorphism $\psi$ if for all $B\in\mathcal{B}_{\mathbb{R}}$ we have $\psi([f^{-1}(B)])=[(f')^{-1}(B)]$. When we come to define Markov fields, we need the following theorem, a proof of which can be found in section III.3 of [30] (theorem III.7).

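Theorem 5.2 Let $(X,\mathcal{A},\mu)$ be a probability space and let $\mathcal{A}_0\subset\mathcal{A}$ be a $\sigma$-subalgebra. We write $\mu_0$ for the restriction of $\mu$ to $\mathcal{A}_0$. If $f\in L_{\mathbb{R}}(X,\mathcal{A})$ is an integrable random variable, then there exists a unique function $(f|\mathcal{A}_0)\in L_{\mathbb{R}}(X,\mathcal{A}_0)$ such that for all $g\in L^\infty(X,\mathcal{A}_0,\mu_0)$ we have

$$\int_Xg\cdot(f|\mathcal{A}_0)\,d\mu_0=\int_Xgf\,d\mu,$$

i.e. $E_{\mu_0}(g\cdot(f|\mathcal{A}_0))=E_\mu(gf)$.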
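On a finite probability space the conditional expectation $(f|\mathcal{A}_0)$ of the theorem is simply the block-wise average of $f$ over the atoms of $\mathcal{A}_0$. The following sketch illustrates the defining identity in that setting only; the space, the partition, the sample values and the function names are all assumptions made for the example.

```python
from fractions import Fraction


def expectation(f, mu):
    """Integral of f against mu on a finite space (dicts keyed by points)."""
    return sum(f[x] * mu[x] for x in mu)


def conditional_expectation(f, mu, partition):
    """Average f over each block; blocks are the atoms of the sub-algebra.

    Assumes every block has strictly positive measure.
    """
    cond = {}
    for block in partition:
        mass = sum(mu[x] for x in block)
        average = sum(f[x] * mu[x] for x in block) / mass
        for x in block:
            cond[x] = average
    return cond


# Hypothetical data: uniform measure on four points, two atoms.
mu = {x: Fraction(1, 4) for x in range(4)}
f = {0: Fraction(1), 1: Fraction(3), 2: Fraction(-2), 3: Fraction(4)}
partition = [(0, 1), (2, 3)]
cond_f = conditional_expectation(f, mu, partition)

# g is constant on each atom, i.e. measurable with respect to the sub-algebra.
g = {0: Fraction(5), 1: Fraction(5), 2: Fraction(-1), 3: Fraction(-1)}
lhs = expectation({x: g[x] * cond_f[x] for x in mu}, mu)
rhs = expectation({x: g[x] * f[x] for x in mu}, mu)
```

Here `lhs` and `rhs` are the two sides of the identity $E_{\mu_0}(g\cdot(f|\mathcal{A}_0))=E_\mu(gf)$ and agree for every choice of block-constant `g`.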

Finally, we want to define the notion of a representation on a probability space. Let $(X,\mathcal{A},\mu)$ be a probability space and let $\mathcal{T}$ be the group of invertible measure-preserving transformations on $(X,\mathcal{A},\mu)$, as defined above. Note that any transformation $T\in\mathcal{T}$ defines a map from $L_{\mathbb{R}}(X,\mathcal{A})$ to itself, which we will also denote by $T$, given by

$$(Tf)(x):=f(T^{-1}(x)).$$

A representation of a group $G$ on the probability space $(X,\mathcal{A},\mu)$ is a homomorphism $T:G\to\mathcal{T}$. We often write $T_g$ rather than $T(g)$ in this context. In case $G$ is a topological group, we also assume that a representation $T$ is 'continuous' in the following sense. If $g_n\to g$ with respect to the topology in $G$ and if $f\in L_{\mathbb{R}}(X,\mathcal{A})$, then $T_{g_n}f\to T_gf$ in measure, which means that for all $\varepsilon>0$ we have $\lim_{n\to\infty}\mu(\{x\in X:|T_{g_n}f(x)-T_gf(x)|\geq\varepsilon\})=0$.

Sets of random variables

If $f_1,\ldots,f_n\in L_{\mathbb{R}}(X,\mathcal{A})$ are random variables on a probability space $(X,\mathcal{A},\mu)$, then we define their joint probability distribution $\mu_{f_1,\ldots,f_n}$ on $(\mathbb{R}^n,\mathcal{B}_{\mathbb{R}^n})$ by

$$\mu_{f_1,\ldots,f_n}(B)=\mu((f_1,\ldots,f_n)^{-1}(B))$$

for $B\in\mathcal{B}_{\mathbb{R}^n}$, where on the right-hand side $(f_1,\ldots,f_n):X\to\mathbb{R}^n$ is given by $(f_1,\ldots,f_n)(x)=(f_1(x),\ldots,f_n(x))$. Their joint characteristic function $c_{f_1,\ldots,f_n}$ is defined by

$$c_{f_1,\ldots,f_n}(t_1,\ldots,t_n)=\int_{\mathbb{R}^n}e^{i(t_1x_1+\ldots+t_nx_n)}\,d\mu_{f_1,\ldots,f_n}(x)=\int_Xe^{i(t_1f_1+\ldots+t_nf_n)}\,d\mu.$$


We will now define the notion of a $\sigma$-algebra generated by a set of random variables. Let $S\subset L_{\mathbb{R}}(X,\mathcal{A})$ be a set of random variables on some probability space $(X,\mathcal{A},\mu)$. Then the $\sigma$-subalgebra of $\mathcal{A}$ generated by the collection $\{f^{-1}(B):B\in\mathcal{B}_{\mathbb{R}},f\in S\}\subset\mathcal{A}$ is called the $\sigma$-algebra generated by the collection $S$ of random variables and we will denote it by $\mathcal{A}_S$ (note that it is the smallest $\sigma$-subalgebra with respect to which all $f\in S$ are measurable). The restriction $\mu_S$ of the measure $\mu$ to this $\sigma$-subalgebra defines a probability space $(X,\mathcal{A}_S,\mu_S)$. On this probability space we can again define the set of $\mu_S$-measure zero sets in $\mathcal{A}_S$, which we will denote by $\mathcal{N}_{\mu_S}$. In case the corresponding quotient ring $\mathcal{A}_S/\mathcal{N}_{\mu_S}$ happens to coincide with $\mathcal{A}/\mathcal{N}_\mu$, we say that the set $S$ is full.

Now fix some set of random variables $f_1,\ldots,f_k\in L_{\mathbb{R}}(X,\mathcal{A})$ on a probability space $(X,\mathcal{A},\mu)$. Then we can consider formal power series

$$\sum_{(n_1,\ldots,n_k)}a_{n_1\ldots n_k}f_1^{n_1}\cdots f_k^{n_k}$$

in the random variables $f_j$, with addition and multiplication of two such series defined in the obvious way. By 'formal' we mean that we do not bother about the convergence and we do not substitute actual relations that are satisfied for the $f_j$ (for instance, if $f_1\equiv1$ then we still consider $f_1$ and $f_1^2$ as two different formal power series). We define partial derivatives of these formal power series by

$$\partial_{f_j}\sum_{(n_1,\ldots,n_k)}a_{n_1\ldots n_k}f_1^{n_1}\cdots f_k^{n_k}:=\sum_{(n_1,\ldots,n_k)}n_ja_{n_1\ldots n_k}f_1^{n_1}\cdots f_j^{n_j-1}\cdots f_k^{n_k},$$

where we use the convention that $f_j^{0-1}=0$. With these formal power series we can define Wick products of random variables as follows. For each $(n_1,\ldots,n_k)\in(\mathbb{Z}_{\geq0})^k$ the Wick product ${:}f_1^{n_1}\cdots f_k^{n_k}{:}$ is the unique formal power series in the $f_j$ that is defined recursively in $n=n_1+\ldots+n_k$ by the following relations

$${:}f_1^0\cdots f_k^0{:}=1$$

$$\partial_{f_j}{:}f_1^{n_1}\cdots f_k^{n_k}{:}=n_j\,{:}f_1^{n_1}\cdots f_j^{n_j-1}\cdots f_k^{n_k}{:}$$

$$\langle{:}f_1^{n_1}\cdots f_k^{n_k}{:}\rangle_\mu=0\quad\text{for }n\geq1,$$

where $\langle\cdot\rangle_\mu$ denotes the expectation value as usual. It follows from the first two relations that ${:}f_1^{n_1}\cdots f_k^{n_k}{:}$ is a polynomial of degree $n_1$ in $f_1$, of degree $n_2$ in $f_2$, and so on. If we have computed all Wick products with $n=n_1+\ldots+n_k\leq m$ for some $m\in\mathbb{Z}_{\geq0}$, then the second relation tells us that the Wick products with $n=m+1$ can be obtained by computing anti-derivatives of the Wick products with $n=m$. The third relation then fixes the constant term in the power series expansion of the Wick product with $n=m+1$ ('the constant of integration').
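The recursion just described can be carried out mechanically in the simplest case of a single random variable $f$ (so $k=1$): integrate $n\,{:}f^{n-1}{:}$ and then fix the constant term by demanding zero expectation. The sketch below does this for a centered Gaussian $f$; the restriction to one variable, the Gaussian moments and the function names are assumptions made for illustration. Polynomials are stored as coefficient lists, lowest degree first.

```python
from fractions import Fraction


def gaussian_moments(variance, max_order):
    """Moments E[f^k], k = 0..max_order, of a centered Gaussian."""
    moments = [Fraction(1)]
    for k in range(1, max_order + 1):
        if k % 2 == 1:
            moments.append(Fraction(0))
        else:
            # E[f^k] = (k - 1) * variance * E[f^(k-2)]
            moments.append((k - 1) * Fraction(variance) * moments[k - 2])
    return moments


def wick_powers(moments, max_order):
    """Coefficient lists of :f^0:, ..., :f^max_order: for one variable."""
    powers = [[Fraction(1)]]  # first relation: :f^0: = 1
    for n in range(1, max_order + 1):
        prev = powers[-1]
        # Second relation: d/df :f^n: = n :f^(n-1):, so take n times the
        # antiderivative; the constant term is left at 0 for the moment.
        poly = [Fraction(0)] + [n * c / (k + 1) for k, c in enumerate(prev)]
        # Third relation: choose the constant so that the expectation is 0.
        poly[0] = -sum(c * moments[k] for k, c in enumerate(poly))
        powers.append(poly)
    return powers


unit = wick_powers(gaussian_moments(1, 4), 4)
```

For unit variance the entries of `unit` are the coefficient lists of $1$, $f$, $f^2-1$, $f^3-3f$ and $f^4-6f^2+3$, which are the probabilists' Hermite polynomials.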

Random processes, random fields, Markov fields and Euclidean fields

If $T$ is a set and $(X,\mathcal{A},\mu)$ is a probability space, then a map $\rho:T\to L_{\mathbb{R}}(X,\mathcal{A})$ is called a random process indexed by $T$. If $V$ is a vector space and $(X,\mathcal{A},\mu)$ is a probability space, then a linear map $\lambda:V\to L_{\mathbb{R}}(X,\mathcal{A})$ is called a linear random process indexed by $V$. In case $V$ is also a topological vector space, we also assume that $v_n\to v$ implies that the sequence $(\lambda(v_n))$ in $L_{\mathbb{R}}(X,\mathcal{A})$ converges in measure to $\lambda(v)$. In case $V=\mathcal{D}(\mathbb{R}^d)$ (see subsection 4.1.1 for the definition), we call $\lambda$ a random field. On $\mathcal{D}(\mathbb{R}^d)$ we define for $m>0$ and $q\in\mathbb{R}_{\geq0}$ an inner product $\langle\cdot,\cdot\rangle_{-q,m}$ by

$$\langle f,g\rangle_{-q,m}:=\langle(-\Delta+m^2)^{-q}f,g\rangle_{L^2}=\int_{\mathbb{R}^d}[(-\Delta+m^2)^{-q}f](x)\overline{g(x)}\,d^dx=\int_{\mathbb{R}^d}\frac{\hat{f}(k)\overline{\hat{g}(k)}}{[G(k,k)+m^2]^q}\,d^dk$$

where $\Delta=\sum_{j=1}^d\frac{\partial^2}{\partial x_j^2}$ is the Laplacian and $G$ is the Euclidean inner product on $\mathbb{R}^d$. Let the Hilbert space $\mathcal{H}^{-q}_m(\mathbb{R}^d)$ be the completion of this space. It can be shown that the embedding of $\mathcal{D}(\mathbb{R}^d)$


in $\mathcal{H}^{-1}_m(\mathbb{R}^d)$ is continuous, so every linear random process $\lambda:\mathcal{H}^{-1}_m(\mathbb{R}^d)\to L_{\mathbb{R}}(X,\mathcal{A})$ defines a random field when we restrict $\lambda$ to $\mathcal{D}(\mathbb{R}^d)$. For this reason, we will call a linear random process $\lambda:\mathcal{H}^{-1}_m(\mathbb{R}^d)\to L_{\mathbb{R}}(X,\mathcal{A})$ a random field from now on. These random fields do not exhaust the set of all random fields, but they will suffice for our purposes, so we will restrict our attention to these random fields.

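Let $(X,\mathcal{A},\mu)$ be a probability space and let $\lambda:\mathcal{H}^{-1}_m(\mathbb{R}^d)\to L_{\mathbb{R}}(X,\mathcal{A})$ be a random field. If $K\subset\mathbb{R}^d$, let $\mathcal{A}_{\lambda,K}\subset\mathcal{A}$ be the $\sigma$-subalgebra generated by the set of random variables

$$\{\lambda(h)\in L_{\mathbb{R}}(X,\mathcal{A}):h\in\mathcal{H}^{-1}_m(\mathbb{R}^d),\ \mathrm{supp}(h)\subset K\}.$$

Then the random field $\lambda$ is called a Markov field over $\mathcal{H}^{-1}_m(\mathbb{R}^d)$ if for all open sets $U\subset\mathbb{R}^d$ and for every positive random variable $f\in L_{\mathbb{R}}(X,\mathcal{A}_{\lambda,U})\subset L_{\mathbb{R}}(X,\mathcal{A})$ we have the Markov property:

$$E_\mu(f|\mathcal{A}_{\lambda,U^c})=E_\mu(f|\mathcal{A}_{\lambda,\partial U}),$$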

where $U^c=\mathbb{R}^d\setminus U$ and $\partial U$ is the boundary of $U$.

Now let $(X,\mathcal{A},\mu)$ be a probability space. A Euclidean field over $\mathcal{H}^{-1}_m(\mathbb{R}^d)$ is a Markov field $\lambda:\mathcal{H}^{-1}_m(\mathbb{R}^d)\to L_{\mathbb{R}}(X,\mathcal{A})$ together with a representation $T$ of the Euclidean group $E(d)$ on $(X,\mathcal{A},\mu)$ such that for all $h\in\mathcal{H}^{-1}_m(\mathbb{R}^d)$ and all $g\in E(d)$ we have

$$T_g(\lambda(h))=\lambda(h\circ g^{-1}).$$

This property is called Euclidean covariance. It is convenient to assume that for any Euclidean field the set $\{\lambda(h)\}_h$ is a full set of random variables. This situation can always be obtained by making the $\sigma$-algebra $\mathcal{A}$ smaller. For $s\in\mathbb{R}$ we will use the notation $Y_s$ to denote the subset $\{(x_1,\ldots,x_d)\in\mathbb{R}^d:x_1=s\}$.

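Theorem 5.3 Let $\lambda:\mathcal{H}^{-1}_m(\mathbb{R}^d)\to L_{\mathbb{R}}(X,\mathcal{A})$ be a Euclidean field with corresponding representation $T$ of $E(d)$ on the probability space $(X,\mathcal{A},\mu)$ and let $E_0:L^2(X,\mathcal{A},\mu)\to L^2(X,\mathcal{A}_{\lambda,Y_0},\mu|_{\mathcal{A}_{\lambda,Y_0}})$ be defined by $E_0(f)=(f|\mathcal{A}_{\lambda,Y_0})$. If $T_t\in E(d)$ denotes the translation $(x_1,\ldots,x_d)\mapsto(x_1+t,\ldots,x_d)$, which in turn defines a transformation on $L_{\mathbb{R}}(X,\mathcal{A})$, then we can define the operator $E_0T_tE_0$ on $L^2(X,\mathcal{A},\mu)$. If the restriction of this operator to $L^2(X,\mathcal{A}_{\lambda,Y_0},\mu|_{\mathcal{A}_{\lambda,Y_0}})$ is written as $P^t$, then there exists a positive self-adjoint operator $H$ on $L^2(X,\mathcal{A}_{\lambda,Y_0},\mu|_{\mathcal{A}_{\lambda,Y_0}})$ such that

$$P^t=e^{-|t|H}.$$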

The operator $H$ plays an important role in Nelson's axiom scheme for Euclidean field theory, as we will see later.

Gaussian random variables and Gaussian random processes

If $(X,\mathcal{A},\mu)$ is a probability space, then a random variable $f\in L_{\mathbb{R}}(X,\mathcal{A})$ is called a Gaussian random variable (G.r.v.) if its characteristic function has the form

$$c_f(t)=e^{-\frac{1}{2}at^2}$$

with $a\geq0$. If $a=0$ then $\mu_f$ is a Dirac distribution at the origin, while $\mu_f$ is a Gaussian distribution whenever $a>0$. A finite set $f_1,\ldots,f_n\in L_{\mathbb{R}}(X,\mathcal{A})$ of random variables is called jointly Gaussian if their joint characteristic function has the form

$$c_{f_1,\ldots,f_n}(t_1,\ldots,t_n)=e^{-\frac{1}{2}\sum_{i,j}a_{ij}t_it_j}$$

with $(a_{ij})$ a symmetric real positive-definite $n\times n$-matrix. We now have the following important result, which can be found in section I.1 of [30] (proposition I.2).


Proposition 5.4 (Wick's theorem) Let $(X,\mathcal{A},\mu)$ be a probability space and let $f_1,\ldots,f_{2n}\in L_{\mathbb{R}}(X,\mathcal{A})$ be (not necessarily distinct) jointly Gaussian random variables. Then

$$\langle f_1\cdots f_{2n}\rangle_\mu=\sum_{\text{pairings}}\langle f_{i_1}f_{j_1}\rangle_\mu\cdots\langle f_{i_n}f_{j_n}\rangle_\mu,$$

where $\langle\cdot\rangle_\mu:=E_\mu(\cdot)$ and the sum is over all distinct ways of partitioning the set of indices $\{1,\ldots,2n\}$ into $n$ two-element subsets $\{i_1,j_1\},\ldots,\{i_n,j_n\}$.
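The sum over pairings in Wick's theorem is easy to enumerate explicitly. The following sketch generates all pairings of a list of indices and evaluates the right-hand side of the formula from a given covariance matrix $\langle f_if_j\rangle_\mu$. It is meant as an illustration only: the function names and the sample covariance matrix are assumptions, and the variables are taken to be centered jointly Gaussian.

```python
def pairings(indices):
    """Yield all ways to split the list `indices` into unordered pairs."""
    if not indices:
        yield []
        return
    first, rest = indices[0], indices[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def wick_expectation(cov, indices):
    """Right-hand side of Wick's theorem for <f_{i_1} ... f_{i_m}>.

    `cov[i][j]` is the covariance <f_i f_j>; indices may repeat. For an odd
    number of factors there are no pairings and the result is 0.
    """
    total = 0
    for pairing in pairings(list(indices)):
        term = 1
        for i, j in pairing:
            term *= cov[i][j]
        total += term
    return total


# Hypothetical covariance matrix of two jointly Gaussian variables.
COV = [[2, 1],
       [1, 3]]
```

With this covariance, `wick_expectation(COV, [0, 0, 1, 1])` evaluates $\langle f_0^2f_1^2\rangle=\langle f_0^2\rangle\langle f_1^2\rangle+2\langle f_0f_1\rangle^2$, and the number of pairings of $2n$ indices is $(2n-1)!!$.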

If $(X,\mathcal{A},\mu)$ is a probability space and $V$ is a real vector space, then a linear random process $\gamma:V\to L_{\mathbb{R}}(X,\mathcal{A})$ is called a Gaussian random process indexed by $V$ if each $\gamma(v)$ is a G.r.v. and if the set $\{\gamma(v)\in L_{\mathbb{R}}(X,\mathcal{A}):v\in V\}$ is a full set of random variables. Note that $\gamma$ defines a semi-inner product $\langle\cdot,\cdot\rangle_\gamma:V\times V\to\mathbb{R}$ by

$$\langle v,w\rangle_\gamma:=\langle\gamma(v)\gamma(w)\rangle_\mu.$$

If we define the linear subspace $N_\gamma:=\{v\in V:\langle v,v\rangle_\gamma=0\}$, we can form the quotient vector space $V/N_\gamma$ to obtain an inner product space, and then complete it to obtain a real Hilbert space $\mathcal{H}_V$. Note that $v\in N_\gamma$ if and only if the probability distribution $\mu_{\gamma(v)}$ of the Gaussian random variable $\gamma(v):X\to\mathbb{R}$ is a Dirac distribution, since $\langle v,v\rangle_\gamma=\langle\gamma(v)^2\rangle_\mu$ is the variance of $\gamma(v)$ (which can only be zero for Dirac distributions). In terms of the random variable itself, the condition $\langle v,v\rangle_\gamma=0$ is equivalent to $\gamma(v)\equiv0$. We now give the definition of a Gaussian random process over a real Hilbert space, which is more restrictive than the definition of a Gaussian random process over a real vector space.

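Definition 5.5 Let $(X,\mathcal{A},\mu)$ be a probability space and let $\mathcal{H}$ be a real Hilbert space. A linear random process $\gamma:\mathcal{H}\to L_{\mathbb{R}}(X,\mathcal{A})$ is called a Gaussian random process indexed by $\mathcal{H}$ if
(1) each $\gamma(v)$ is a G.r.v.,
(2) the set $\{\gamma(v)\in L_{\mathbb{R}}(X,\mathcal{A}):v\in\mathcal{H}\}$ is a full set of random variables,
(3) $\langle v,w\rangle_\gamma=C_G\langle v,w\rangle_{\mathcal{H}}$ with $C_G\in\mathbb{R}_{>0}$ fixed (by some convention) for all $v,w\in\mathcal{H}$.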

In [30] the convention is that $C_G=\frac{1}{2}$, but we will keep things general here. It follows from the positive-definiteness of $\langle\cdot,\cdot\rangle_{\mathcal{H}}$ that $\langle\gamma(v)^2\rangle_\mu=0$ if and only if $v=0$, so for the Gaussian random variable $\gamma(v)$ we have $\gamma(v)\equiv0$ if and only if $v=0$. The following important theorem, which is theorem I.6 in [30], states that Gaussian random processes over a real Hilbert space are unique in a sense.

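Theorem 5.6 Let $(X,\mathcal{A},\mu)$ and $(X',\mathcal{A}',\mu')$ be two probability spaces and let $\mathcal{H}$ be a real Hilbert space. If $\gamma:\mathcal{H}\to L_{\mathbb{R}}(X,\mathcal{A})$ and $\gamma':\mathcal{H}\to L_{\mathbb{R}}(X',\mathcal{A}')$ are Gaussian random processes indexed by $\mathcal{H}$, then there exists an isomorphism $\psi:\mathcal{A}/\mathcal{N}_\mu\to\mathcal{A}'/\mathcal{N}_{\mu'}$ of the probability spaces such that for every $h\in\mathcal{H}$ the random variables $\gamma(h)$ and $\gamma'(h)$ correspond under the isomorphism $\psi$, i.e. $\psi([\gamma(h)^{-1}(B)])=[\gamma'(h)^{-1}(B)]$ for all $B\in\mathcal{B}_{\mathbb{R}}$.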

This result allows us to speak of 'the' Gaussian random process $\gamma_{\mathcal{H}}$ indexed by the real Hilbert space $\mathcal{H}$. The existence of this Gaussian random process is shown in section I.2 of [30], where several explicit constructions are given (which are of course equivalent to each other in the sense of the theorem).

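For a real Hilbert space $\mathcal{H}$, we will write the underlying probability space for the Gaussian random process as $(Q_{\mathcal{H}},\mathcal{A}_{\mathcal{H}},\mu_{\mathcal{H}})$, which is often called Q-space in the literature. For fixed $n$, the Wick products ${:}\gamma_{\mathcal{H}}(v_1)\cdots\gamma_{\mathcal{H}}(v_n){:}$ are square-integrable and we denote their linear span by $\mathcal{W}^n_{\mathcal{H}}$. For $n=0$ we define $\mathcal{W}^0_{\mathcal{H}}$ to be the 1-dimensional space spanned by $1_{Q_{\mathcal{H}}}$. The algebraic (i.e. uncompleted) direct sum $\mathcal{W}_{\mathcal{H}}:=\bigoplus^{(\mathrm{alg})}_{n\in\mathbb{Z}_{\geq0}}\mathcal{W}^n_{\mathcal{H}}$ is a dense subspace in $L^2(Q_{\mathcal{H}},\mathcal{A}_{\mathcal{H}},\mu_{\mathcal{H}})$, so

$$L^2(Q_{\mathcal{H}},\mathcal{A}_{\mathcal{H}},\mu_{\mathcal{H}})=\bigoplus_{n\in\mathbb{Z}_{\geq0}}\mathcal{W}^n_{\mathcal{H}}.$$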


Most of the time we will interpret the random variables $\gamma_{\mathcal{H}}(h)$ as multiplication operators on the Hilbert space $L^2(Q_{\mathcal{H}},\mathcal{A}_{\mathcal{H}},\mu_{\mathcal{H}})$:

$$f\mapsto\gamma_{\mathcal{H}}(h)f.$$

In this way, the Gaussian random variables $\gamma_{\mathcal{H}}(h)$ become the same kind of mathematical object as quantum fields, namely operators on a Hilbert space.

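If $A\in B(\mathcal{H})$ is a bounded operator, then we can define a linear operator $\Gamma_n(A)$ on $\mathcal{W}^n_{\mathcal{H}}$ by

$$\Gamma_n(A)\,{:}\gamma_{\mathcal{H}}(v_1)\cdots\gamma_{\mathcal{H}}(v_n){:}={:}\gamma_{\mathcal{H}}(Av_1)\cdots\gamma_{\mathcal{H}}(Av_n){:}.$$

It can be shown that this definition is algebraically consistent and that the norm of this operator on $\mathcal{W}^n_{\mathcal{H}}$ is $\leq\|A\|^n$. So if $\|A\|\leq1$, then we can extend this to a linear operator $\Gamma(A)$ on $\bigoplus^{(\mathrm{alg})}_{n\in\mathbb{Z}_{\geq0}}\mathcal{W}^n_{\mathcal{H}}$ and for this operator we have $\|\Gamma(A)\|\leq1$. So $\Gamma(A)$ is continuous, and we can extend it to all of $L^2(Q_{\mathcal{H}},\mathcal{A}_{\mathcal{H}},\mu_{\mathcal{H}})$. We thus conclude that any $A\in B(\mathcal{H})$ with $\|A\|\leq1$ defines an operator $\Gamma(A)\in B(L^2(Q_{\mathcal{H}},\mathcal{A}_{\mathcal{H}},\mu_{\mathcal{H}}))$ with $\|\Gamma(A)\|\leq1$.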

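For each operator $A$ on $\mathcal{H}$ with domain $D(A)\subset\mathcal{H}$, we can also define an operator $d\Gamma_n(A)$ on ${:}\gamma_{\mathcal{H}}(D(A))\cdots\gamma_{\mathcal{H}}(D(A)){:}\subset\mathcal{W}^n_{\mathcal{H}}$ by

$$d\Gamma_n(A)\,{:}\gamma_{\mathcal{H}}(v_1)\cdots\gamma_{\mathcal{H}}(v_n){:}=\sum_{j=1}^n{:}\gamma_{\mathcal{H}}(v_1)\cdots\gamma_{\mathcal{H}}(Av_j)\cdots\gamma_{\mathcal{H}}(v_n){:},$$

for $v_1,\ldots,v_n\in D(A)$. The extension of this operator to $\bigoplus^{(\mathrm{alg})}_{n\in\mathbb{Z}_{\geq0}}{:}\gamma_{\mathcal{H}}(D(A))\cdots\gamma_{\mathcal{H}}(D(A)){:}$ is denoted by $d\Gamma(A)$ and is sometimes called the second quantization of $A$.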

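Fock space and Gaussian random processes over Hilbert spaces

Let $\mathcal{H}$ be a real Hilbert space and let $\mathcal{H}_{\mathbb{C}}$ be its complexification. Let $\mathcal{F}(\mathcal{H}_{\mathbb{C}})$ be the symmetric Fock space with the creation and annihilation operators $A^{(*)}(h)$ defined as before for $h\in\mathcal{H}$ (not $\mathcal{H}_{\mathbb{C}}$). On $\mathcal{F}(\mathcal{H}_{\mathbb{C}})$ we define for $h\in\mathcal{H}$ the Segal field

$$\phi(h):=\sqrt{C_G}\,(A^*(h)+A(h)),$$

defined as an operator on the dense subspace $D$ of finite-particle states, which was defined several times before. The operators $\{\phi(h)\}_{h\in\mathcal{H}}$ all commute with each other in the sense that their spectral measures $E_{\phi(h)}$ commute. We denote the abelian von Neumann algebra in $B(\mathcal{F}(\mathcal{H}_{\mathbb{C}}))$ generated by these spectral measures by $\mathcal{M}$. According to the Gelfand-Naimark theorem we can represent (the unital abelian $C^*$-algebra) $\mathcal{M}$ as $C(\Sigma(\mathcal{M}))$, where (the compact Hausdorff space) $\Sigma(\mathcal{M})$ is the Gelfand spectrum of $\mathcal{M}$. We will write $\alpha:\mathcal{M}\to C(\Sigma(\mathcal{M}))$ to denote the corresponding $C^*$-isomorphism. The state $A\mapsto\langle A\Omega,\Omega\rangle$, with $\Omega$ the vacuum vector in $\mathcal{F}(\mathcal{H}_{\mathbb{C}})$, defines a probability measure $\mu_\Omega$ on $\Sigma(\mathcal{M})$ (with the Borel $\sigma$-algebra $\mathcal{B}_{\Sigma(\mathcal{M})}$ on the topological space $\Sigma(\mathcal{M})$) such that for all $A\in\mathcal{M}$ we have

$$\langle A\Omega,\Omega\rangle=\int_{\Sigma(\mathcal{M})}\alpha(A)\,d\mu_\Omega.$$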

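Although the $\phi(h)$ are not in $\mathcal{M}$, there is a natural way in which they can be represented as Borel-measurable functions (and hence random variables) on $\Sigma(\mathcal{M})$ by extending the continuous functional calculus to the Borel functional calculus. With abuse of notation we will write these random variables as $\alpha(\phi(h))$. Because on the Fock space we have (with the convention $C_G=\frac{1}{2}$) the equality $\langle e^{i\phi(h)}\Omega,\Omega\rangle=e^{-\frac{1}{4}\|h\|^2_{\mathcal{H}}}$, the random variables $\alpha(\phi(h))$ are in fact Gaussian:

$$c_{\alpha(\phi(h))}(t)=\int_{\Sigma(\mathcal{M})}e^{it\alpha(\phi(h))}\,d\mu_\Omega=\langle e^{it\phi(h)}\Omega,\Omega\rangle=e^{-\frac{1}{4}t^2\|h\|^2_{\mathcal{H}}}.$$

The set $\{\alpha(\phi(h))\}_{h\in\mathcal{H}}$ is a full set of random variables on the measurable space $(\Sigma(\mathcal{M}),\mathcal{B}_{\Sigma(\mathcal{M})})$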


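and, finally, we also have

$$\begin{aligned}
\langle h_1,h_2\rangle_{\alpha\phi}&=\langle\alpha(\phi(h_1))\alpha(\phi(h_2))\rangle_{\mu_\Omega}=\int_{\Sigma(\mathcal{M})}\alpha(\phi(h_1))\alpha(\phi(h_2))\,d\mu_\Omega\\
&=\langle\phi(h_1)\phi(h_2)\Omega,\Omega\rangle_{\mathcal{F}(\mathcal{H}_{\mathbb{C}})}=\langle\phi(h_2)\Omega,\phi(h_1)\Omega\rangle_{\mathcal{H}}\\
&=C_G\langle A^*(h_2)\Omega,A^*(h_1)\Omega\rangle_{\mathcal{H}}=C_G\langle h_2,h_1\rangle_{\mathcal{H}}\\
&=C_G\langle h_1,h_2\rangle_{\mathcal{H}},
\end{aligned}$$

so by uniqueness of the Gaussian random process indexed by $\mathcal{H}$, $\gamma_{\mathcal{H}}:=\alpha\phi:\mathcal{H}\to L_{\mathbb{R}}(\Sigma(\mathcal{M}),\mathcal{B}_{\Sigma(\mathcal{M})})$ must be the Gaussian random process indexed by $\mathcal{H}$.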

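For $h\in\mathcal{H}_{\mathbb{C}}$ we can define $\phi(h)$ by linearity, as we did when we defined the free field. Now define $U:\bigoplus_{n\in\mathbb{Z}_{\geq0}}\mathcal{H}_{\mathbb{C}}^{\otimes n}\to L^2(\Sigma(\mathcal{M}),\mathcal{B}_{\Sigma(\mathcal{M})},\mu_\Omega)$ by $U(h_1\otimes\ldots\otimes h_n)=(n!)^{-1/2}(\sqrt{2})^n\,{:}\gamma_{\mathcal{H}}(h_1)\cdots\gamma_{\mathcal{H}}(h_n){:}$ and $U\Omega=1_{\Sigma(\mathcal{M})}$. The restriction of $U$ to Fock space is a unitary operator $U:\mathcal{F}(\mathcal{H}_{\mathbb{C}})\to L^2(\Sigma(\mathcal{M}),\mathcal{B}_{\Sigma(\mathcal{M})},\mu_\Omega)$ that satisfies

$$U\Omega=1_{\Sigma(\mathcal{M})},\qquad U\phi(h)U^{-1}=\gamma_{\mathcal{H}}(h),\qquad U\mathcal{F}_n(\mathcal{H}_{\mathbb{C}})=\mathcal{W}^n_{\mathcal{H}}.$$

This shows that there is a very intimate relation between the Gaussian random process indexed by a real Hilbert space $\mathcal{H}$ and the Segal field on the Fock space $\mathcal{F}(\mathcal{H}_{\mathbb{C}})$. As operators on a Hilbert space, the Gaussian random variables $\gamma_{\mathcal{H}}(h)$ are just Segal fields on the Fock space $\mathcal{F}(\mathcal{H}_{\mathbb{C}})$.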

The free Euclidean field in two dimensions

Let $\mathcal{N}$ be the Hilbert space obtained by completing the set of real-valued elements in $\mathcal{D}(\mathbb{R}^2)$ with respect to the inner product $\langle\cdot,\cdot\rangle_{\mathcal{N}}:=\frac{1}{C_G}\langle\cdot,\cdot\rangle_{-1,m}$ and let $\gamma_{\mathcal{N}}:\mathcal{N}\to L_{\mathbb{R}}(Q_{\mathcal{N}},\mathcal{A}_{\mathcal{N}})$ be the Gaussian random process indexed by $\mathcal{N}$. An element $g\in E(2)$ of the Euclidean group acts naturally as a linear operator on an element $h\in\mathcal{N}$ by

$$(u(g)h)(x):=h(g^{-1}x).\tag{5.13}$$

The operator $u(g)$ preserves the inner product $\langle\cdot,\cdot\rangle_{\mathcal{N}}$ on $\mathcal{N}$ and therefore $\|u(g)\|=1$. We can thus define an operator $U(g):=\Gamma(u(g))$ on $L^2(Q_{\mathcal{N}},\mathcal{A}_{\mathcal{N}},\mu_{\mathcal{N}})$, and this operator is unitary. Now $U(g)$, in turn, defines an invertible measure-preserving map $T_g:Q_{\mathcal{N}}\to Q_{\mathcal{N}}$, but we will not show this. We merely mention that the correspondence $g\mapsto T_g$ defines a representation of $E(2)$ on $(Q_{\mathcal{N}},\mathcal{A}_{\mathcal{N}},\mu_{\mathcal{N}})$, see also section III.1 of [30]. Finally, it also turns out that $\gamma_{\mathcal{N}}$ satisfies the Markov property, so the Gaussian random process $\gamma_{\mathcal{N}}$ is in fact automatically a Euclidean field. We will call it the free Euclidean field in two dimensions. This free Euclidean field has the additional property that the translation subgroup of $E(2)$ acts ergodically, which means that the only random variables that are left invariant by all translations are constant random variables. Note that proposition 5.4 above implies that $\gamma_{\mathcal{N}}$ satisfies

$$\langle\gamma_{\mathcal{N}}(h_1)\cdots\gamma_{\mathcal{N}}(h_{2n})\rangle_{\mu_{\mathcal{N}}}=\sum_{\text{pairings}}\langle\gamma_{\mathcal{N}}(h_{i_1})\gamma_{\mathcal{N}}(h_{j_1})\rangle_{\mu_{\mathcal{N}}}\cdots\langle\gamma_{\mathcal{N}}(h_{i_n})\gamma_{\mathcal{N}}(h_{j_n})\rangle_{\mu_{\mathcal{N}}}\tag{5.14}$$

for all $h_1,\ldots,h_{2n}\in\mathcal{N}$.

The connection with the free hermitean scalar quantum field

We will now argue that there is a very natural connection between the free hermitean scalar quantum field in 2-dimensional spacetime and the free Euclidean field $\gamma_{\mathcal{N}}$ defined above. Consider the Schwinger functions $S(x_1,\ldots,x_n)$ for the free hermitean scalar field in 2-dimensional spacetime. If $n$ is odd, these functions are identically zero because the same is true for the Wightman functions. The even Schwinger functions $S(x_1,\ldots,x_{2n})$ can all be obtained from $S(x_1,x_2)$ by the recurrence relation

$$S(x_1,\ldots,x_{2n})=\sum_{\text{pairings}}S(x_{i_1},x_{j_1})\cdots S(x_{i_n},x_{j_n}),$$


where the sum is defined as in proposition 5.4 above. In particular, $S(x_1,x_2)$ is symmetric in its two arguments, and so all $S$ are symmetric under permutations of the arguments. This symmetry of the Schwinger functions is in fact present in any boson quantum field theory. The explicit form of $S(x_1,x_2)$ is given by

$$S(x_1,x_2)=\frac{1}{(2\pi)^2}\int_{\mathbb{R}^2}\frac{e^{iG(k,x_1-x_2)}}{G(k,k)+m^2}\,d^2k,$$

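where $G$ denotes the Euclidean inner product in $\mathbb{R}^2$ and $m$ is the mass of the free field. This formula can be formally derived as follows. The two-point Wightman function for the free scalar quantum field in 2-dimensional spacetime follows easily from the 4-dimensional case (4.8) by removing two space variables and by removing two factors $\frac{1}{2\pi}$. With some additional formal manipulations we then find⁵¹

$$\begin{aligned}
W(x,y)&=\frac{1}{2\pi}\int_{\mathbb{R}}\frac{1}{2}e^{-i[\omega_{\mathbf{p}}(x-y)^0-\mathbf{p}(\mathbf{x}-\mathbf{y})]}\frac{1}{\sqrt{\mathbf{p}^2+m^2}}\,d\mathbf{p}\\
&=\frac{1}{2\pi}\int_{\mathbb{R}}\frac{1}{2}e^{-i[\omega_{\mathbf{p}}(x-y)^0-\mathbf{p}(\mathbf{x}-\mathbf{y})]}\left[\frac{1}{\pi}\int_{\mathbb{R}}\frac{dp^0}{(p^0)^2+\mathbf{p}^2+m^2}\right]d\mathbf{p}\\
&=\frac{1}{(2\pi)^2}\int_{\mathbb{R}^2}\frac{e^{-i[\omega_{\mathbf{p}}(x-y)^0-\mathbf{p}(\mathbf{x}-\mathbf{y})]}}{(p^0)^2+\mathbf{p}^2+m^2}\,d^2p.
\end{aligned}$$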

Replacing the spacetime vectors $x_j=((x_j)^0,\mathbf{x}_j)$ by $x'_j=(i(x_j)^0,\mathbf{x}_j)$ gives the desired result for $S(x_1,x_2)$. Although $S(x_1,x_2)$ is a function on $(\mathbb{R}^2)^2_{\neq}$, it is sufficiently well-behaved to define a distribution on $\mathcal{S}(\mathbb{R}^4)$ by

$$S(f):=\int_{\mathbb{R}^4}S(x_1,x_2)f(x_1,x_2)\,d^4x,$$

where $x_1,x_2\in\mathbb{R}^2$. From this and the recurrence relation above it also follows that $S(x_1,\ldots,x_n)$ defines a distribution on $\mathcal{S}(\mathbb{R}^{2n})$. In terms of these distributions we can formulate the recurrence relation above as

$$S(f_1\cdot\ldots\cdot f_{2n})=\sum_{\text{pairings}}S(f_{i_1}\cdot f_{j_1})\cdots S(f_{i_n}\cdot f_{j_n}),\tag{5.15}$$

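where $f_1,\ldots,f_{2n}\in\mathcal{S}(\mathbb{R}^2)$ and $(f_1\cdot\ldots\cdot f_j)(x_1,\ldots,x_j):=f_1(x_1)\cdots f_j(x_j)$ for any $2\leq j\leq2n$. Now notice the resemblance between equations (5.14) and (5.15). In fact it goes much further than a mere resemblance, as we will now show. First observe that for any real-valued $f_1,f_2\in\mathcal{S}(\mathbb{R}^2)\subset\mathcal{N}$ we have

$$\begin{aligned}
\langle\gamma_{\mathcal{N}}(f_1)\gamma_{\mathcal{N}}(f_2)\rangle_{\mu_{\mathcal{N}}}&=\langle f_1,f_2\rangle_{\gamma_{\mathcal{N}}}=C_G\langle f_1,f_2\rangle_{\mathcal{N}}\\
&=C_G\frac{1}{C_G}\int_{\mathbb{R}^2}[(-\Delta+m^2)^{-1}f_1](x)f_2(x)\,d^2x\\
&=\int_{\mathbb{R}^2}\frac{\hat{f}_1(k)\overline{\hat{f}_2(k)}}{G(k,k)+m^2}\,d^2k\\
&=\frac{1}{(2\pi)^2}\int_{\mathbb{R}^2}\int_{\mathbb{R}^4}\frac{e^{iG(k,x_1-x_2)}}{G(k,k)+m^2}f_1(x_1)f_2(x_2)\,d^4x\,d^2k\\
&=S(f_1\cdot f_2),
\end{aligned}$$

where we have used that $\hat{f}_2$ is not necessarily real-valued when $f_2$ is real-valued. When we combine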

⁵¹ The reason for including this non-rigorous derivation is that we can check that all factors of $2\pi$ are correct.


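this equality with equations (5.14) and (5.15) we find that

$$\begin{aligned}
\int_{Q_{\mathcal{N}}}\gamma_{\mathcal{N}}(f_1)\cdots\gamma_{\mathcal{N}}(f_{2n})\,d\mu_{\mathcal{N}}&=\langle\gamma_{\mathcal{N}}(f_1)\cdots\gamma_{\mathcal{N}}(f_{2n})\rangle_{\mu_{\mathcal{N}}}\\
&=\sum_{\text{pairings}}\langle\gamma_{\mathcal{N}}(f_{i_1})\gamma_{\mathcal{N}}(f_{j_1})\rangle_{\mu_{\mathcal{N}}}\cdots\langle\gamma_{\mathcal{N}}(f_{i_n})\gamma_{\mathcal{N}}(f_{j_n})\rangle_{\mu_{\mathcal{N}}}\\
&=\sum_{\text{pairings}}S(f_{i_1}\cdot f_{j_1})\cdots S(f_{i_n}\cdot f_{j_n})\\
&=S(f_1\cdot\ldots\cdot f_{2n}).
\end{aligned}$$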

If we write $\delta\cdot f$ for the function $(\delta\cdot f)(t,\mathbf{x})=\delta(t)f(\mathbf{x})$, then the formula above implies that

$$\langle\gamma_{\mathcal{N}}(\delta\cdot f_1)\cdots\gamma_{\mathcal{N}}(\delta\cdot f_n)\Omega_{\mathcal{N}},\Omega_{\mathcal{N}}\rangle=W((\delta\cdot f_1)\cdot\ldots\cdot(\delta\cdot f_n)).\tag{5.16}$$

Note that for the free quantum field it is allowed to smear out over a product of a function and a delta-function (for general quantum fields this might not be the case). Despite these beautiful relations between the free Euclidean field in two dimensions and the free hermitean scalar quantum field in 2-dimensional spacetime, the two are not the same. Of course this was to be expected, since the smeared quantum fields $\phi(f)$ do not commute with each other in the general case where $f$ is complex-valued, whereas random variables always commute with each other.

As we will demonstrate now, it is possible to represent the time-zero quantum field $\phi_0$ as a Gaussian random process, but this Gaussian random process is not the two-dimensional free Euclidean field. On the space $\mathcal{D}_{\mathbb{R}}(\mathbb{R})$ of real-valued functions in $\mathcal{D}(\mathbb{R})$ we define for $m>0$ the inner product $\langle\cdot,\cdot\rangle_{\mathcal{F}}:=C_F\langle\cdot,\cdot\rangle_{-\frac{1}{2},m}$, where $C_F$ is fixed by some convention (in [30] the convention is $C_F=1$), and let the Hilbert space $\mathcal{F}$ be the completion of this inner product space. On this Hilbert space we define the operator $D=(-\Delta+m^2)^{\frac{1}{2}}$. Let $\gamma_{\mathcal{F}}:\mathcal{F}\to L_{\mathbb{R}}(Q_{\mathcal{F}},\mathcal{A}_{\mathcal{F}})$ be the Gaussian random process indexed by $\mathcal{F}$ and write $H_0$ for the second quantization $d\Gamma(D)$ of $D$ as defined above. For each $f\in\mathcal{S}(\mathbb{R}^2)$ we then define

$$\gamma'_{\mathcal{F}}(f):=\int_{\mathbb{R}}e^{itH_0}\gamma_{\mathcal{F}}(f_t)e^{-itH_0}\,dt,$$

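where $f_t$ is the function on $\mathbb{R}$ defined by $f_t(\mathbf{x})=f(t,\mathbf{x})$. By defining $\Omega_{\mathcal{F}}\in L^2(Q_{\mathcal{F}},\mathcal{A}_{\mathcal{F}},\mu_{\mathcal{F}})$ to be the random variable that is 1 everywhere, it can be shown that the quantities

$$\langle\gamma'_{\mathcal{F}}(f_1)\cdots\gamma'_{\mathcal{F}}(f_n)\Omega_{\mathcal{F}},\Omega_{\mathcal{F}}\rangle:=\int_{Q_{\mathcal{F}}}\gamma'_{\mathcal{F}}(f_1)\cdots\gamma'_{\mathcal{F}}(f_n)\,d\mu_{\mathcal{F}}\tag{5.17}$$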

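are equal to the smeared Wightman functions $W(f_1\cdot\ldots\cdot f_n)$, see also theorem II.17 in [30]. Thus, for the smeared Wightman functions we have

$$\begin{aligned}
W(f_1\cdot\ldots\cdot f_n)&=\int_{\mathbb{R}^n}dt_1\ldots dt_n\,\langle e^{it_1H_0}\gamma_{\mathcal{F}}((f_1)_{t_1})e^{-it_1H_0}\cdots e^{it_nH_0}\gamma_{\mathcal{F}}((f_n)_{t_n})e^{-it_nH_0}\Omega_{\mathcal{F}},\Omega_{\mathcal{F}}\rangle\\
&=\int_{\mathbb{R}^n}dt_1\ldots dt_n\,\langle\gamma_{\mathcal{F}}((f_1)_{t_1})e^{i(t_2-t_1)H_0}\cdots e^{i(t_n-t_{n-1})H_0}\gamma_{\mathcal{F}}((f_n)_{t_n})\Omega_{\mathcal{F}},\Omega_{\mathcal{F}}\rangle,
\end{aligned}$$

where in the last step we used that $e^{itH_0}\Omega_{\mathcal{F}}=\Omega_{\mathcal{F}}$. In this sense we may interpret $L^2(Q_{\mathcal{F}},\mathcal{A}_{\mathcal{F}},\mu_{\mathcal{F}})$ as the Fock space for a scalar particle of mass $m$, and $\gamma'_{\mathcal{F}}$ as the free quantum field $\phi$. In particular the Gaussian process $\gamma_{\mathcal{F}}$ itself can be interpreted as the time-zero quantum field.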

Now that we know that the free Euclidean field and the time-zero free quantum field are Gaussian processes $\gamma_{\mathcal{N}}$ and $\gamma_{\mathcal{F}}$, respectively, it is time to compare them. For each $r\in\mathbb{R}$ we define a linear map $j_r:\mathcal{F}\to\mathcal{N}$ by

$$(j_rf)(s,\mathbf{x}):=\sqrt{2C_GC_F}\,\delta_r(s)f(\mathbf{x}),$$


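where $\delta_r$ is the delta function concentrated at $r$, i.e. $\delta_r(s)=\delta(s-r)$. The Fourier transform of $j_rf$ is

$$\widehat{j_rf}(u,\mathbf{k})=\frac{\sqrt{2C_GC_F}}{2\pi}\int_{\mathbb{R}^2}f(\mathbf{x})\delta_r(s)e^{-i(us+\mathbf{k}\mathbf{x})}\,ds\,d\mathbf{x}=\frac{\sqrt{2C_GC_F}}{\sqrt{2\pi}}\hat{f}(\mathbf{k})e^{-iru},$$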

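so for the norm of $j_rf$ we find that

$$\begin{aligned}
\|j_rf\|^2_{\mathcal{N}}&=\frac{1}{C_G}\int_{\mathbb{R}^2}\frac{\widehat{j_rf}(u,\mathbf{k})\overline{\widehat{j_rf}(u,\mathbf{k})}}{\mathbf{k}^2+u^2+m^2}\,du\,d\mathbf{k}\\
&=\frac{1}{C_G}\frac{2C_GC_F}{2\pi}\int_{\mathbb{R}^2}\frac{|\hat{f}(\mathbf{k})|^2}{\mathbf{k}^2+u^2+m^2}\,du\,d\mathbf{k}\\
&=\frac{C_F}{\pi}\int_{\mathbb{R}}\frac{\pi}{\sqrt{\mathbf{k}^2+m^2}}|\hat{f}(\mathbf{k})|^2\,d\mathbf{k}\\
&=C_F\int_{\mathbb{R}}\frac{|\hat{f}(\mathbf{k})|^2}{\sqrt{\mathbf{k}^2+m^2}}\,d\mathbf{k}\\
&=\|f\|^2_{\mathcal{F}},
\end{aligned}$$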

so $j_r$ is an isometry. If $g_t\in E(2)$ is the time translation over $t$, then we will write $u_t$ to denote $u(g_t)$ as defined in (5.13). Then

$$(u_t(j_rf))(s,\mathbf{x})=(j_rf)(s-t,\mathbf{x})=\sqrt{2C_GC_F}\,\delta_r(s-t)f(\mathbf{x})=\sqrt{2C_GC_F}\,\delta(s-t-r)f(\mathbf{x})=(j_{r+t}f)(s,\mathbf{x}),$$

which shows that $u_tj_r=j_{r+t}$. Another property of $j_r$ is that $j^*_{r_1}j_{r_2}=e^{-|r_2-r_1|D}$, where $D$ is the differential operator on $\mathcal{F}$ defined above. In the same way as we did before, we can define a map $J_r:=\Gamma(j_r):L^2(Q_{\mathcal{F}},\mathcal{A}_{\mathcal{F}},\mu_{\mathcal{F}})\to L^2(Q_{\mathcal{N}},\mathcal{A}_{\mathcal{N}},\mu_{\mathcal{N}})$. This map is also an isometry and it satisfies $U_tJ_r=J_{r+t}$ and $J^*_rJ_t=e^{-|r-t|H_0}$. One can now prove the Feynman-Kac-Nelson formula, see theorem III.6 of [30].

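Theorem 5.7 (The free field FKN formula) Let $f_1,\ldots,f_n\in\mathcal{F}$, let $F_0,\ldots,F_k:\mathbb{R}^n\to\mathbb{C}$ be bounded functions and let $t_1,\ldots,t_k\geq0$ be fixed. Let $s_0\in\mathbb{R}$ be arbitrary and let $s_j=s_0+\sum_{i=1}^jt_i$ for $1\leq j\leq k$. Then

$$\begin{aligned}
&\langle F_0(\gamma_{\mathcal{F}}(f_1),\ldots,\gamma_{\mathcal{F}}(f_n))e^{-t_1H_0}F_1(\gamma_{\mathcal{F}}(f_1),\ldots,\gamma_{\mathcal{F}}(f_n))\cdots e^{-t_kH_0}F_k(\gamma_{\mathcal{F}}(f_1),\ldots,\gamma_{\mathcal{F}}(f_n))\Omega_{\mathcal{F}},\Omega_{\mathcal{F}}\rangle\\
&\qquad=\int_{Q_{\mathcal{N}}}\prod_{l=0}^kF_l(\gamma_{\mathcal{N}}(j_{s_l}f_1),\ldots,\gamma_{\mathcal{N}}(j_{s_l}f_n))\,d\mu_{\mathcal{N}}.
\end{aligned}$$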

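As Simon shows in theorem III.19 of [30], the FKN formula remains valid if the $F_j$ are polynomially bounded. Now consider the following special case of the FKN formula. Let $f_1,\ldots,f_n\in\mathcal{F}$ and let $F_m(x_1,\ldots,x_n)=x_{m+1}$ for $0\leq m\leq n-1$ (so $k=n-1$), which are polynomially bounded. Then the FKN formula reads

$$\begin{aligned}
\langle\gamma_{\mathcal{F}}(f_1)e^{-t_1H_0}\cdots e^{-t_{n-1}H_0}\gamma_{\mathcal{F}}(f_n)\Omega_{\mathcal{F}},\Omega_{\mathcal{F}}\rangle&=\int_{Q_{\mathcal{N}}}\gamma_{\mathcal{N}}(j_{s_0}f_1)\cdots\gamma_{\mathcal{N}}(j_{s_{n-1}}f_n)\,d\mu_{\mathcal{N}}\\
&=\langle\gamma_{\mathcal{N}}(j_{s_0}f_1)\cdots\gamma_{\mathcal{N}}(j_{s_{n-1}}f_n)\Omega_{\mathcal{N}},\Omega_{\mathcal{N}}\rangle,
\end{aligned}$$

where we have chosen $s_0=0$ and $\Omega_{\mathcal{N}}\in L^2(Q_{\mathcal{N}},\mathcal{A}_{\mathcal{N}},\mu_{\mathcal{N}})$ is the function that is identically 1.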

Nelson's axioms for Euclidean field theory

As stated in equation (5.16), the free Euclidean field can be used to obtain the Wightman functions for the free quantum field. This immediately leads one to ask under what conditions a (non-free) Euclidean field will define a Wightman quantum field theory. In his article [28], Nelson defines a set of conditions on a Euclidean field that will guarantee that the Euclidean field gives rise to a Wightman quantum field theory. These conditions are known as Nelson's axioms. The idea behind the axioms is that we try to carry out the same construction for a (non-free) Euclidean field as for


the free Euclidean field, and then see what extra conditions are needed to satisfy all characteristic properties of the Wightman functions. Thus, given a Euclidean field $\lambda:\mathcal{H}^{-1}_m(\mathbb{R}^d)\to L_{\mathbb{R}}(X,\mathcal{A})$ with associated representation $T$ of $E(d)$ on the probability space $(X,\mathcal{A},\mu)$, we begin by defining a map $j_r:\mathcal{S}(\mathbb{R}^{d-1})\to\mathcal{H}^{-1}_m(\mathbb{R}^d)$ by $(j_rf)(s,\mathbf{x})=\delta_r(s)f(\mathbf{x})$. Then we define a time-zero field $\lambda_0:\mathcal{S}(\mathbb{R}^{d-1})\to L_{\mathbb{R}}(X,\mathcal{A})$ by

$$\lambda_0(f):=\lambda(j_0f)$$

and for $t\in\mathbb{R}$ we define

$$\lambda_t(f)=e^{itH}\lambda_0(f)e^{-itH},$$

where $H$ is the positive self-adjoint operator that satisfies $P^t=e^{-|t|H}$. Finally, for $f\in\mathcal{S}(\mathbb{R}^d)$ we define

$$\theta(f):=\int_{\mathbb{R}}\lambda_t(f_t)\,dt,$$

where $f_t(\mathbf{x})=f(t,\mathbf{x})$. The candidate Wightman functions are now

$$\langle\theta(f_1)\cdots\theta(f_n)\Omega,\Omega\rangle$$

where $\Omega$ is the function that is identically 1 on $X$. The problem is now reduced to formulating conditions that guarantee that these distributions are indeed the Wightman functions of some quantum field theory. One of these conditions is that the translation subgroup of $E(d)$ acts ergodically, a property that was also present in the free Euclidean field. Besides this ergodicity, there is also a regularity condition, but we will not discuss it here. To summarize, the Nelson axioms require the existence of some Euclidean field, together with a regularity condition and the ergodicity property. This will guarantee that the construction above will yield a Wightman quantum field theory.

5.2.2 An alternative method: The Osterwalder-Schrader theory

When the Schwinger functions are computed for some given Wightman quantum field theory, the Wightman functions can be recovered by analytic continuation of the Schwinger functions and hence also the entire quantum field theory can be recovered, by the reconstruction theorem. These ideas led in the early 1970s to the question whether it is possible to somehow begin with a set of functions which can be shown to be the Schwinger functions for some quantum field theory, and then recover the Wightman functions from these Schwinger functions. Of course, one should be able to recognize whether some given set of functions is indeed the set of Schwinger functions for some quantum field theory, i.e. one should have a set of conditions that these functions must satisfy in order to be the Schwinger functions of some quantum field theory.

Soon after Nelson developed his axiom system for Euclidean field theory, Osterwalder and Schrader developed another axiom system, called the Osterwalder-Schrader axioms, in which the axioms describe properties of Schwinger functions that guarantee the existence of a corresponding Wightman quantum field theory. We will not further discuss the Osterwalder-Schrader axioms here.

5.2.3 The P(φ)2-model as a Wightman model

Now that we have seen two general axiomatic frameworks for a Euclidean approach, we briefly mention how the Euclidean framework can be used in the construction of the $P(\phi)_2$-model. The first task is of course to make sense of the formal expression

$$V=\lambda\int_{\mathbb{R}}{:}P(\phi(x)){:}\,dx$$

for the interaction of the $P(\phi)_2$-model within the Euclidean framework. In the Hamiltonian approach this was achieved by considering the interaction as a perturbation of the free field theory and by introducing a cutoff version of the interaction. The first of these two points was manifest


in the fact that we began with the free particle Fock space as our Hilbert space, on which we would later define the local observables for the interacting theory. The direct translation of this to Nelson's framework would be to begin the Euclidean construction of the $P(\phi)_2$-model with the Gaussian random process $\gamma_{\mathcal{N}}$ of which we know that it gives rise to the Schwinger functions of the free quantum field:

$$S(x_1\cdot\ldots\cdot x_n)=\int_{Q_{\mathcal{N}}}\gamma_{\mathcal{N}}(x_1)\cdots\gamma_{\mathcal{N}}(x_n)\,d\mu_{\mathcal{N}}.$$

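Path-integral arguments from physics then suggest that the Schwinger functions for the interacting theory are formally given by

$$S_{\mathrm{int}}(x_1\cdot\ldots\cdot x_n)=\frac{\int_{Q_{\mathcal{N}}}\gamma_{\mathcal{N}}(x_1)\cdots\gamma_{\mathcal{N}}(x_n)\,e^{-\lambda\int_{\mathbb{R}^2}{:}P(\gamma_{\mathcal{N}}(x)){:}\,dx}\,d\mu_{\mathcal{N}}}{\int_{Q_{\mathcal{N}}}e^{-\lambda\int_{\mathbb{R}^2}{:}P(\gamma_{\mathcal{N}}(x)){:}\,dx}\,d\mu_{\mathcal{N}}}.$$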

Recall that on the free particle Fock space we defined the normal ordering $:\varphi^m(x):$ of an unsmeared field by writing all creation operators to the left of all annihilation operators. For random variables we also defined the Wick ordering $:f_1^{n_1} \ldots f_k^{n_k}:$ of a product of powers of random variables, which can in particular be applied to define Wick products $:\gamma_N(f_1) \ldots \gamma_N(f_m):$ of the free Euclidean field $\gamma_N(f)$. The unsmeared version of these products can be written formally as $:\gamma_N(x_1) \ldots \gamma_N(x_m):$. However, these objects are not what is meant when we write $:\gamma_N^m(x):$ as in $:\mathcal{P}(\gamma_N(x)):$ above, which should be clear from the fact that we write only one variable $x$ rather than $m$ variables $x_1, \ldots, x_m$. To understand what is meant here, first consider the expression $\gamma(x)$ for a Gaussian random process $\gamma$. Formally, by this expression we mean something like

\[ \gamma(x) = \gamma(\delta_x) = \int \gamma(y)\,\delta_x(y)\,dy, \]

but of course smearing out over a $\delta$-function is not allowed. Mathematically we should thus replace this with $\gamma_h(x) = \int \gamma(y)\,h(x-y)\,dy$, where $h$ is a smooth function that looks like a $\delta$-function that is concentrated at the origin. Then $\gamma_h(x)$ is a random variable for any $x$ and we can even define random fields by $g \mapsto \int \gamma_h(x)^m g(x)\,dx$. Since $\gamma_h(x)$ is a well-defined random variable, we can also take the Wick product $:\gamma_h(x)^m:$ and define a random field by $g \mapsto \int :\gamma_h(x)^m:\, g(x)\,dx =: \; :\gamma_h^m(g):$. By taking some limit in which $h \to \delta$, we then obtain the expression that is meant by $:\gamma^m(g):$. The unsmeared version is then denoted by $:\gamma^m(x):$. This defines $:\mathcal{P}(\gamma_N(x)):$. For details about the precise conditions for $\gamma$ under which this definition is possible, see section V.1 of [30].
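The Wick powers of a single Gaussian random variable can be made concrete in a few lines. The sketch below is only an illustration: it assumes that the Wick ordering of a centered Gaussian variable $x$ with variance $c$ obeys the standard recursion $:x^{k+1}: \;= x\, :x^k: - k c\, :x^{k-1}:$, and the helper names (`wick_power`, `gaussian_moment`, `expectation`, `multiply`) are hypothetical. It represents polynomials by coefficient lists and checks that Wick powers have zero mean and are mutually orthogonal.

```python
from math import factorial


def wick_power(n, c):
    """Coefficients (lowest degree first) of :x^n: for a centered Gaussian x of variance c.

    Assumes the standard recursion :x^{k+1}: = x :x^k: - k c :x^{k-1}:.
    """
    prev, cur = [0.0], [1.0]              # :x^{-1}: is treated as 0, :x^0: = 1
    for k in range(n):
        nxt = [0.0] + cur                 # multiply :x^k: by x
        for i, a in enumerate(prev):
            nxt[i] -= k * c * a           # subtract k c :x^{k-1}:
        prev, cur = cur, nxt
    return cur


def gaussian_moment(k, c):
    """E[x^k] for a centered Gaussian with variance c, i.e. (k-1)!! c^(k/2) for even k."""
    if k % 2:
        return 0.0
    j = k // 2
    return factorial(k) / (2 ** j * factorial(j)) * c ** j


def expectation(poly, c):
    """Expectation of a polynomial in x, given as a coefficient list."""
    return sum(a * gaussian_moment(i, c) for i, a in enumerate(poly))


def multiply(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


if __name__ == "__main__":
    c = 0.5
    print(wick_power(4, c))                            # coefficients of x^4 - 6c x^2 + 3c^2
    print(expectation(wick_power(4, c), c))            # zero mean
    print(expectation(multiply(wick_power(2, c), wick_power(3, c)), c))  # orthogonality
```

For unit variance this reproduces the familiar polynomial $x^4 - 6x^2 + 3$ for $:x^4:$. The point of the subtraction terms is exactly what the limit $h \to \delta$ needs: as the variance of $\gamma_h(x)$ grows, the counterterms grow with it.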

The cutoff for the interaction term in the Hamiltonian approach can easily be translated to a cutoff in the Euclidean theory:

\[ U(g) = \lambda \int_{\mathbb{R}^2} :\mathcal{P}(\gamma_N(x)):\, g(x)\,dx, \]

where $g$ is a cutoff function that equals 1 on a large region, only this time the cutoff function is a function on $\mathbb{R}^2$ instead of on $\mathbb{R}$. The corresponding cutoff Schwinger functions are

\[ S_g(x_1, \ldots, x_n) = \frac{\int_{Q_N} \gamma_N(x_1) \ldots \gamma_N(x_n)\, e^{-U(g)}\, d\mu_N}{\int_{Q_N} e^{-U(g)}\, d\mu_N} . \]

When we define the measure $d\nu_g$ by

\[ d\nu_g = \frac{e^{-U(g)}\, d\mu_N}{\int_{Q_N} e^{-U(g)}\, d\mu_N}, \]

we can write the cutoff Schwinger functions more simply as

\[ S_g(x_1, \ldots, x_n) = \int_{Q_N} \gamma_N(x_1) \ldots \gamma_N(x_n)\, d\nu_g . \]
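The structure of the measure $d\nu_g$ can be illustrated by a zero-dimensional toy analogue, in which the field is replaced by a single standard Gaussian variable $x$ and the interaction by $U(x) = \lambda\, :x^4: \;= \lambda(x^4 - 6x^2 + 3)$. This is a sketch under those assumptions only; it is not the construction of [19], and the function name `nu_expectation` as well as the quadrature parameters are hypothetical choices.

```python
from math import exp, pi, sqrt


def nu_expectation(f, lam, half_width=10.0, steps=20001):
    """E_nu[f] for d(nu) = exp(-U) d(mu) / Z, with mu the standard Gaussian on R
    and U(x) = lam * (x^4 - 6 x^2 + 3), the Wick-ordered quartic for unit variance.

    Both integrals are approximated by the same Riemann sum on a symmetric grid,
    so the grid spacing cancels in the ratio.
    """
    h = 2 * half_width / (steps - 1)
    num = den = 0.0
    for i in range(steps):
        x = -half_width + i * h
        w = exp(-0.5 * x * x) / sqrt(2 * pi)            # density of mu
        w *= exp(-lam * (x ** 4 - 6 * x ** 2 + 3))      # weight exp(-U)
        num += f(x) * w
        den += w
    return num / den


if __name__ == "__main__":
    # lam = 0 gives back the free (Gaussian) second moment
    print(nu_expectation(lambda x: x * x, 0.0))
    # the toy analogue of an interacting two-point function
    print(nu_expectation(lambda x: x * x, 0.1))
```

In the toy model the normalisation integral is trivially finite; in the field theory, showing that $e^{-U(g)}$ is integrable with respect to $d\mu_N$ is already a nontrivial step.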


The idea is now to show that in the limit where $g \to 1$, the cutoff Schwinger functions become a set of functions $S_{\mathrm{int}}(x_1, \ldots, x_n)$ that satisfy all the Osterwalder-Schrader axioms and that these functions can be identified as the Schwinger functions of the $\mathcal{P}(\varphi)_2$-model. This is one of the main results of Glimm, Jaffe and Spencer in their article [19]. They prove the result under the assumption that $\lambda/m^2$ is sufficiently small. They also prove that the theory has a mass gap and that the infimum of the set that is obtained by removing the point 0 from the spectrum of the mass operator $(H^2 - \mathbf{P}^2)^{1/2}$ is an isolated point $m_r$ in the spectrum of the mass operator, and that the restriction of the unitary representation of $\mathcal{P}_+^\uparrow$ to the subspace corresponding to $m_r$ is irreducible (and thus describes a one-particle state). In particular, the Haag-Ruelle theory can be applied to the $\mathcal{P}(\varphi)_2$-model and this model thus has a particle structure. The techniques that were used to derive these properties are directly inspired by techniques of statistical mechanics. This can be understood by realizing that the expression for the Schwinger functions looks very much like the correlation functions that one encounters in statistical mechanics. This analogy with statistical mechanics also gave rise to the study of phase transitions and other quantities from statistical mechanics for the $\mathcal{P}(\varphi)_2$-model. Unfortunately, there is not enough time to discuss all these interesting developments in this thesis.


A Hilbert space theory

In this appendix we discuss some facts of Hilbert space theory that are usually not covered in an elementary course in functional analysis, such as direct integrals and unbounded operators. We will not discuss the more basic topics such as bounded operators on Hilbert space.

A.1 Direct sums and integrals of Hilbert spaces

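The direct sum of a finite number of Hilbert spaces $\{\mathcal{H}_n\}_{n=1}^k$ is defined to be the direct sum $\bigoplus_{n=1}^k \mathcal{H}_n$ (of vector spaces) with inner product of two elements $(h_n)_{n=1}^k$ and $(g_n)_{n=1}^k$ defined by $\langle (h_n)_{n=1}^k, (g_n)_{n=1}^k \rangle = \sum_{n=1}^k \langle h_n, g_n \rangle_{\mathcal{H}_n}$. It follows easily that $\bigoplus_{n=1}^k \mathcal{H}_n$ with this inner product becomes a Hilbert space. If we have a countably infinite collection of Hilbert spaces $\{\mathcal{H}_n\}_{n=1}^\infty$, then their direct sum is defined to be

\[ \bigoplus_{n=1}^\infty \mathcal{H}_n := \Big\{ (h_n)_{n=1}^\infty : h_n \in \mathcal{H}_n \text{ for all } n \text{ and } \sum_{n=1}^\infty \|h_n\|^2_{\mathcal{H}_n} < \infty \Big\}, \]

with similar addition and scalar multiplication as in the case of a finite number of Hilbert spaces and inner product defined by $\langle (h_n)_{n=1}^\infty, (g_n)_{n=1}^\infty \rangle = \sum_{n=1}^\infty \langle h_n, g_n \rangle_{\mathcal{H}_n}$. It follows easily that $\bigoplus_{n=1}^\infty \mathcal{H}_n$ is a Hilbert space with this inner product.

We will now generalize this notion of a direct sum of Hilbert spaces to direct integrals of Hilbert spaces. Let $(X, \mathcal{A}, \mu)$ be a measure space with positive measure $\mu$, and suppose that for each $x \in X$ we are given a Hilbert space $\mathcal{H}(x)$ of dimension $n(x) \in \mathbb{N} \cup \{\infty\}$ in such a way that the function $n : X \to \mathbb{N} \cup \{\infty\}$ is $\mathcal{A}$-measurable. We will then define the direct integral of the Hilbert spaces $\mathcal{H}(x)$ as follows. First we partition the set $X$ into (disjoint) subsets $\{X_n\}_{n \in \mathbb{N} \cup \{\infty\}}$, where $X_n = \{x \in X : \dim(\mathcal{H}(x)) = n\}$. For any $n$ we may then identify all Hilbert spaces $\{\mathcal{H}(x)\}_{x \in X_n}$ with some particular Hilbert space $\mathcal{H}(n)$ with $\dim(\mathcal{H}(n)) = n$. Now let $\mathcal{H}_n$ be the set of functions$^{52}$ $h : X_n \to \mathcal{H}(n)$ for which the function $x \mapsto \langle h(x), g \rangle_{\mathcal{H}(n)}$ from $X_n$ to $\mathbb{C}$ is $\mathcal{A}$-measurable for all $g \in \mathcal{H}(n)$ and such that $\int_{X_n} \|h(x)\|^2_{\mathcal{H}(n)}\, d\mu(x) < \infty$. We then define a vector space structure on $\mathcal{H}_n$ by $(h + g)(x) = h(x) + g(x)$ and $(\lambda h)(x) = \lambda h(x)$ for $h, g \in \mathcal{H}_n$ and $\lambda \in \mathbb{C}$, and we define an inner product on $\mathcal{H}_n$ by $\langle h, g \rangle_{\mathcal{H}_n} = \int_{X_n} \langle h(x), g(x) \rangle_{\mathcal{H}(n)}\, d\mu(x)$. It can be shown that $\mathcal{H}_n$ is in fact a Hilbert space, which we will denote by $\int^\oplus_{X_n} \mathcal{H}(x)\, d\mu(x)$. Finally, we then define the direct integral of the Hilbert spaces $\{\mathcal{H}(x)\}_{x \in X}$ by

\[ \int^\oplus_X \mathcal{H}(x)\, d\mu(x) = \bigoplus_n \int^\oplus_{X_n} \mathcal{H}(x)\, d\mu(x). \]

When we are given some Hilbert space $\mathcal{H}$, we say that $\mathcal{H}$ can be represented as a direct integral $\int^\oplus_X \mathcal{H}(x)\, d\mu(x)$ of the Hilbert spaces $\mathcal{H}(x)$, with $(X, \mu)$ some measure space, if there exists an isomorphism $\alpha : \mathcal{H} \to \int^\oplus_X \mathcal{H}(x)\, d\mu(x)$ of Hilbert spaces.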

A.2 Self-adjoint operators and the spectral theorem

If $\mathcal{H}$ is a Hilbert space and $D \subset \mathcal{H}$ is a dense linear subspace of $\mathcal{H}$, then we call an operator $A : D \to \mathcal{H}$ (which is not necessarily bounded) a densely defined operator. If $A : D \to \mathcal{H}$ is a densely defined operator, we define a linear subspace $D^* \subset \mathcal{H}$ by

\[ D^* = \{ k \in \mathcal{H} : h \mapsto \langle Ah, k \rangle \text{ is a bounded linear functional on } D \}. \]

If $k \in D^*$, then $h \mapsto \langle Ah, k \rangle$ is a bounded linear functional on the dense subspace $D \subset \mathcal{H}$ and can therefore be extended to a bounded linear functional $F$ on $\mathcal{H}$. By the Riesz representation theorem, there exists a unique vector $k^* \in \mathcal{H}$ such that $F(h) = \langle h, k^* \rangle$ for all $h \in \mathcal{H}$. In particular, we have $\langle Ah, k \rangle = \langle h, k^* \rangle$ for all $h \in D$. We then define a map $A^* : D^* \to \mathcal{H}$ by $A^* k = k^*$.

52 Actually equivalence classes of functions, where the equivalence relation is given by $h \sim g \Leftrightarrow h = g$ $\mu$-almost everywhere. Here we pretend as if we have already chosen a particular representative of each class, so we can work with functions instead of equivalence classes of functions.


Definition A.1 Let $A : D \to \mathcal{H}$ be a densely defined operator in the Hilbert space $\mathcal{H}$.
(i) We call $A$ symmetric if $\langle Ah, g \rangle = \langle h, Ag \rangle$ for all $g, h \in D$.
(ii) We call $A$ self-adjoint if $A = A^*$.

It is clear that every self-adjoint operator $A$ is also symmetric, but the converse is false since in general we do not have $D = D^*$. When $A$ is bounded, the converse is in fact true.

In order to formulate the spectral theorem for self-adjoint operators, we first need to recall the definition of a spectral measure and some facts about integration with respect to spectral measures.

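Definition A.2 If $(X, \Omega)$ is a measurable space and $\mathcal{H}$ is a Hilbert space, then a spectral measure for $(X, \Omega, \mathcal{H})$ is a function $E : \Omega \to B(\mathcal{H})$ such that:
(1) for each $\Delta \in \Omega$ the operator $E(\Delta) \in B(\mathcal{H})$ is an orthogonal projection;
(2) $E(\emptyset) = 0$ and $E(X) = 1_{\mathcal{H}}$;
(3) $E(\Delta_1 \cap \Delta_2) = E(\Delta_1)E(\Delta_2)$ for $\Delta_1, \Delta_2 \in \Omega$;
(4) if $\{\Delta_n\}_{n=1}^\infty$ are pairwise disjoint sets in $\Omega$, then

\[ E\Big( \bigcup_{n=1}^\infty \Delta_n \Big) = \sum_{n=1}^\infty E(\Delta_n). \]

From (2) and (3) it follows that if $\Delta_1, \Delta_2 \in \Omega$ with $\Delta_1 \cap \Delta_2 = \emptyset$ then $E(\Delta_1)E(\Delta_2) = E(\Delta_2)E(\Delta_1) = 0$. Also, for each $h \in \mathcal{H}$ the map $\mu : \Omega \to \mathbb{R}$ given by $\Delta \mapsto \langle E(\Delta)h, h \rangle$ is a positive measure on $(X, \Omega)$. We will denote this measure by $\mu_h$.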

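Let $E$ be a spectral measure for $(X, \Omega, \mathcal{H})$ and let $t : X \to \mathbb{C}$ be a simple function on the measurable space $(X, \Omega)$. We can write the simple function $t$ as

\[ t = \alpha_1 1_{\Delta_1} + \ldots + \alpha_n 1_{\Delta_n} \]

with $\alpha_j \in \mathbb{C}$ and all $\Delta_j \in \Omega$ disjoint. We then define the integral

\[ \int_X t(x)\, dE(x) := \sum_{j=1}^n \alpha_j E(\Delta_j). \]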

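Since for $j \neq k$ the sets $\Delta_j$ and $\Delta_k$ are disjoint, $E(\Delta_j)h$ and $E(\Delta_k)h$ are orthogonal for all $h \in \mathcal{H}$. We may thus use the Pythagoras theorem:

\[ \Big\| \int_X t(x)\, dE(x)h \Big\|^2 = \sum_{j=1}^n |\alpha_j|^2 \langle E(\Delta_j)h, E(\Delta_j)h \rangle = \sum_{j=1}^n |\alpha_j|^2 \langle E(\Delta_j)h, h \rangle = \sum_{j=1}^n |\alpha_j|^2 \mu_h(\Delta_j) = \int_X |t(x)|^2\, d\mu_h(x). \tag{A.1} \]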

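Since $\int_X d\mu_h = \mu_h(X) = \langle E(X)h, h \rangle = \|h\|^2$ it follows that

\[ \Big\| \int_X t(x)\, dE(x)h \Big\|^2 \leq \|h\|^2 \sup_{x \in X} |t(x)|^2 = \|h\|^2 \max_{x \in X} |t(x)|^2. \]

Hence $\int_X t(x)\, dE(x)$ is a bounded operator with norm $\leq \max_{x \in X} |t(x)|$.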
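The identity (A.1) and the resulting norm bound can be checked by hand in the simplest possible setting. The following sketch assumes a finite-dimensional toy case that is not taken from the text: $\mathcal{H} = \mathbb{C}^n$, $X = \{0, \ldots, n-1\}$ with all subsets measurable, and $E(\Delta)$ the projection onto the coordinates indexed by $\Delta$. All function names are hypothetical.

```python
def E(delta, h):
    """Apply the projection E(delta) to h: keep only the coordinates indexed by delta."""
    return [z if i in delta else 0j for i, z in enumerate(h)]


def integrate_simple(alphas, deltas, h):
    """Apply the operator sum_j alpha_j E(Delta_j) to the vector h."""
    out = [0j] * len(h)
    for a, d in zip(alphas, deltas):
        out = [o + a * z for o, z in zip(out, E(d, h))]
    return out


def mu_h(delta, h):
    """The positive measure mu_h(Delta) = <E(Delta) h, h>."""
    return sum(abs(z) ** 2 for i, z in enumerate(h) if i in delta)


def norm_sq(v):
    """Squared Hilbert space norm on C^n."""
    return sum(abs(z) ** 2 for z in v)


if __name__ == "__main__":
    h = [1 + 2j, 0.5, -1j, 3]
    deltas = [{0, 1}, {2}, {3}]          # pairwise disjoint sets
    alphas = [2, 1j, -0.5]               # values of the simple function t
    lhs = norm_sq(integrate_simple(alphas, deltas, h))
    rhs = sum(abs(a) ** 2 * mu_h(d, h) for a, d in zip(alphas, deltas))
    print(lhs, rhs)                      # both sides of (A.1)
    print(lhs <= max(abs(a) for a in alphas) ** 2 * norm_sq(h))
```

Disjointness of the sets $\Delta_j$ is what makes the cross terms vanish; with overlapping sets the two printed numbers would in general differ.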

An arbitrary bounded measurable function $f : X \to \mathbb{C}$ can be approximated uniformly by a sequence $\{t_n\}$ of simple functions. According to the estimate above it follows that

\[ \Big\| \int_X t_n(x)\, dE(x)h - \int_X t_m(x)\, dE(x)h \Big\|^2 = \Big\| \int_X [t_n(x) - t_m(x)]\, dE(x)h \Big\|^2 \leq \|h\|^2 \sup_{x \in X} |t_n(x) - t_m(x)|^2. \]

Because this goes to zero uniformly on the unit ball in $\mathcal{H}$ for $m, n \to \infty$, the sequence $\{\int_X t_n(x)\, dE(x)\}_n$ is a Cauchy sequence in the Banach space $B(\mathcal{H})$ and thus converges to an element in $B(\mathcal{H})$. We then define

\[ \int_X f(x)\, dE(x) := \lim_{n \to \infty} \int_X t_n(x)\, dE(x) \in B(\mathcal{H}), \]

where the limit is taken in the norm-topology of $B(\mathcal{H})$. The limit does not depend on the chosen sequence of simple functions. Because for each converging sequence $\{A_n\}_n$ in $B(\mathcal{H})$ with limit $A \in B(\mathcal{H})$ it is in particular true that $\lim_{n \to \infty}(A_n h) = Ah$ for all $h \in \mathcal{H}$, we have in the present case that $\int_X f(x)\, dE(x)h = \lim_{n \to \infty} (\int_X t_n(x)\, dE(x)h)$. Using (A.1), we then find that

\[ \Big\| \int_X f(x)\, dE(x)h \Big\|^2 = \Big\| \lim_n \Big( \int_X t_n(x)\, dE(x)h \Big) \Big\|^2 = \lim_n \Big\| \int_X t_n(x)\, dE(x)h \Big\|^2 = \lim_n \int_X |t_n(x)|^2\, d\mu_h(x) = \int_X |f(x)|^2\, d\mu_h(x). \tag{A.2} \]

Now that we have defined the integral $\int_X f(x)\, dE(x)$ for a spectral measure $E$ for $(X, \Omega, \mathcal{H})$ and a bounded measurable function $f : X \to \mathbb{C}$, we will do the same for unbounded measurable functions. Let $E$ be a spectral measure for $(X, \Omega, \mathcal{H})$, let $g : X \to \mathbb{C}$ be a measurable function and let $h \in \mathcal{H}$ be such that $g \in L^2(X, \Omega, \mu_h)$. We can approximate $g \in L^2(X, \Omega, \mu_h)$ (in the $L^2$-norm) by a sequence of bounded measurable functions. For instance, we can take the sequence $\{g_n\}_n$, defined by

\[ g_n(x) = \begin{cases} g(x) & \text{if } |g(x)| \leq n \\ 0 & \text{if } |g(x)| > n. \end{cases} \]

Because the functions $g_n$ are bounded, we can use the identity (A.2):

\[ \Big\| \int_X g_n(x)\, dE(x)h - \int_X g_m(x)\, dE(x)h \Big\|^2 = \Big\| \int_X [g_n(x) - g_m(x)]\, dE(x)h \Big\|^2 = \int_X |g_n(x) - g_m(x)|^2\, d\mu_h(x). \]

Because this goes to zero for $n, m \to \infty$ (this follows from the fact that the sequence $\{g_n\}_n$ converges to $g$ in the $L^2$-norm and is hence a Cauchy sequence), the sequence $\{\int_X g_n(x)\, dE(x)h\}_n$ is a Cauchy sequence in $\mathcal{H}$ and therefore converges in $\mathcal{H}$. We now define

\[ \int_X g(x)\, dE(x)h := \lim_{n \to \infty} \int_X g_n(x)\, dE(x)h. \]

This definition does not depend on the choice of the sequence $\{g_n\}$ and analogously to (A.2) we now also have

\[ \Big\| \int_X g(x)\, dE(x)h \Big\|^2 = \int_X |g(x)|^2\, d\mu_h(x). \tag{A.3} \]

In general, the operator $\int_X g(x)\, dE(x)$ is not bounded and it is only defined for those $h \in \mathcal{H}$ for which $g \in L^2(X, \Omega, \mu_h)$, i.e. for those $h \in \mathcal{H}$ for which the right-hand side of (A.3) is finite. Note that (A.3) implies that

\[ \Big\| \int_X x\, dE(x)h \Big\|^2 = \int_X |x|^2\, d\mu_h(x) \]

for all $h \in \mathcal{H}$ for which the right-hand side is finite.

We now formulate the spectral theorem for self-adjoint operators.

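Theorem A.3 Let $A : D \to \mathcal{H}$ be a self-adjoint operator in a separable Hilbert space $\mathcal{H}$. Then there exists a unique spectral measure $E_A$ for $(\mathbb{R}, \mathcal{B}_\mathbb{R}, \mathcal{H})$, with $\mathcal{B}_\mathbb{R}$ the Borel $\sigma$-algebra on $\mathbb{R}$, such that

\[ A = \int_\mathbb{R} x\, dE_A(x). \]

We call $E_A$ the spectral measure generated by $A$. The domain of $A$ can then be expressed as $\mathrm{dom}(A) = \{h \in \mathcal{H} : \int_\mathbb{R} |x|^2\, d\mu_h(x) < \infty\}$.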
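In finite dimensions the spectral theorem reduces to the eigendecomposition of a Hermitian matrix, and the spectral measure is supported on the eigenvalues. As a minimal sketch (the matrix $A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$, its eigenprojections and all names below are illustrative choices, not taken from the text), one can verify $A = \int x\, dE_A(x)$ and also compute $g(A) = \int g(x)\, dE_A(x)$:

```python
# Eigenvalue -> orthogonal projection onto the corresponding eigenspace of
# A = [[2, 1], [1, 2]] (eigenvalues 1 and 3, eigenvectors (1,-1) and (1,1)).
SPECTRUM = {
    1.0: [[0.5, -0.5], [-0.5, 0.5]],
    3.0: [[0.5, 0.5], [0.5, 0.5]],
}


def E(contains):
    """E_A(Delta) for a set Delta described by its indicator function `contains`."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for lam, P in SPECTRUM.items():
        if contains(lam):
            out = [[out[i][j] + P[i][j] for j in range(2)] for i in range(2)]
    return out


def spectral_integral(g):
    """int g(x) dE_A(x), which here is the finite sum of g(lambda) * P_lambda."""
    return [[sum(g(lam) * P[i][j] for lam, P in SPECTRUM.items())
             for j in range(2)] for i in range(2)]


if __name__ == "__main__":
    print(spectral_integral(lambda x: x))        # recovers A
    print(spectral_integral(lambda x: x * x))    # gives A^2
    print(E(lambda x: True))                     # E(R) is the identity
    print(E(lambda x: x < 2))                    # projection onto the eigenvalue-1 eigenspace
```

For unbounded operators the same formula holds, but the sum becomes an integral and the domain condition $\int_\mathbb{R} |x|^2\, d\mu_h(x) < \infty$ becomes a genuine restriction.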


We can slightly generalize the spectral theorem as follows. If $\{A_k\}_{k=1}^n$ is a system of pairwise commuting self-adjoint operators defined on some common dense domain in $\mathcal{H}$, then there exists a unique spectral measure $E_A$ for $(\mathbb{R}^n, \mathcal{B}_{\mathbb{R}^n}, \mathcal{H})$, with $\mathcal{B}_{\mathbb{R}^n}$ the Borel $\sigma$-algebra on $\mathbb{R}^n$, such that

\[ A_k = \int_{\mathbb{R}^n} x_k\, dE_A(x) \]

for $k = 1, \ldots, n$, where $x_k$ denotes the $k$-th component of the integration variable $x$. We will call $E_A$ the joint spectral measure generated by the system $\{A_k\}_{k=1}^n$. Note that we can apply our results about integration with respect to spectral measures in particular to the spectral measure $E_A$. Thus, for each measurable function $g : \mathbb{R}^n \to \mathbb{C}$ we can define the operator

\[ g(A_1, \ldots, A_n) := \int_{\mathbb{R}^n} g(x)\, dE_A(x). \tag{A.4} \]

The domain of this operator is the set $\{h \in \mathcal{H} : \int_{\mathbb{R}^n} |g(x)|^2\, d\mu_h(x) < \infty\}$.

Definition A.4 If $A : D \to \mathcal{H}$ is a self-adjoint operator in a (separable) Hilbert space $\mathcal{H}$ and $f \in \mathcal{H}$, we define $\mathcal{H}_f$ to be the smallest closed subspace in $\mathcal{H}$ that contains all vectors of the form $E_A(\Delta)f$ with $\Delta \in \mathcal{B}_\mathbb{R}$, and we call $\mathcal{H}_f$ the cyclic subspace in $\mathcal{H}$ generated by $f$.

Note that if $h \in \mathcal{H}$ with $h \perp \mathcal{H}_f$, then for $g \in \mathcal{H}_f$ we have $\langle E_A(\Delta)h, g \rangle = \langle h, E_A(\Delta)g \rangle = 0$, so $E_A(\Delta)h \perp \mathcal{H}_f$. This shows that $\mathcal{H}_h \perp \mathcal{H}_f$ whenever $h \perp \mathcal{H}_f$.

We will now state a theorem that uses the spectral theorem for self-adjoint operators to represent the Hilbert space as a certain direct integral of Hilbert spaces. We will not prove the theorem in detail, but we will give a sketch of the proof. The reason for not omitting the proof altogether is that the construction is of great importance in quantum physics.

Theorem A.5 Let $A : D \to \mathcal{H}$ be a self-adjoint operator in a separable Hilbert space $\mathcal{H}$. Then $\mathcal{H}$ can be represented as a direct integral

\[ \int^\oplus_\mathbb{R} \mathcal{H}(x)\, d\mu(x) \]

of Hilbert spaces $\mathcal{H}(x)$ relative to a positive measure $\mu$ on $\mathbb{R}$ such that the action of $A$ is given by multiplication by $x$.

Proof sketch
Because $\mathcal{H}$ is separable, we can choose a countable dense set $\{f_1, f_2, \ldots\}$ in $\mathcal{H}$. Define $\mathcal{H}_1 := \mathcal{H}_{f_1} = \overline{\mathrm{span}}\{E_A(\Delta)f_1\}_{\Delta \in \mathcal{B}_\mathbb{R}}$ and suppose that for some $n \geq 1$ we have already constructed cyclic subspaces $\mathcal{H}_1, \ldots, \mathcal{H}_n$ in $\mathcal{H}$ that are pairwise orthogonal, and write $\mathcal{H}^n := \mathcal{H}_1 \oplus \ldots \oplus \mathcal{H}_n$ for their direct sum. If $\mathcal{H}^n = \mathcal{H}$, then we can write $\mathcal{H} = \bigoplus_{k=1}^n \mathcal{H}_k$. If this is not the case, let $k_n = \min\{k : f_k \notin \mathcal{H}^n\}$. We choose a unit vector $h_{n+1}$ in the subspace of $\mathcal{H}$ spanned by $\mathcal{H}^n$ and $f_{k_n}$ such that $h_{n+1} \perp \mathcal{H}^n$; we then define $\mathcal{H}_{n+1} := \mathcal{H}_{h_{n+1}}$. Then $\mathcal{H}_{n+1}$ is orthogonal to $\mathcal{H}_1, \ldots, \mathcal{H}_n$ and $f_{k_n} \in \mathcal{H}_1 \oplus \ldots \oplus \mathcal{H}_{n+1}$, and since $\{f_1, f_2, \ldots\}$ is dense in $\mathcal{H}$, this construction gives rise to a decomposition

\[ \mathcal{H} = \bigoplus_n \mathcal{H}_n \]

of $\mathcal{H}$ into an orthogonal direct sum of (a finite or countably infinite number of) cyclic subspaces. It can be shown (although we will not prove this here) that the spaces $\mathcal{H}_n$ can be realized as function spaces $L^2_{\mu_n}(\mathbb{R})$, where $\mu_n(\Delta) = \langle E_A(\Delta)h_n, h_n \rangle$ with $h_n$ as defined above. The corresponding isomorphism $\pi_n : \mathcal{H}_n \to L^2_{\mu_n}(\mathbb{R})$ is defined$^{53}$ by $E_A(\Delta)h_n \mapsto 1_\Delta$ with $1_\Delta$ the characteristic function of the set $\Delta$; in particular, $h_n \mapsto 1_\mathbb{R}$. The corresponding action of $A$ on $\{f \in L^2_{\mu_n}(\mathbb{R}) : \int_\mathbb{R} x^2 |f(x)|^2\, d\mu_n(x) < \infty\} \subset L^2_{\mu_n}(\mathbb{R})$ is then given by $\pi_n(Af) = \mathrm{id}_\mathbb{R} \cdot \pi_n(f)$. Since $\mathcal{H} = \bigoplus_n \mathcal{H}_n$, we can thus realize each $g = (g_1, g_2, \ldots) \in \mathcal{H}$ as a sequence $(\pi_1(g_1), \pi_2(g_2), \ldots)$ of functions $\pi_n(g_n) \in L^2_{\mu_n}(\mathbb{R})$,

\[ g \sim \pi(g) := (\pi_1(g_1), \pi_2(g_2), \ldots) \in \bigoplus_n L^2_{\mu_n}(\mathbb{R}), \]

and if $g \in D$ then

\[ Ag \sim \pi(Ag) = (\mathrm{id}_\mathbb{R} \cdot \pi_1(g_1), \mathrm{id}_\mathbb{R} \cdot \pi_2(g_2), \ldots) \in \bigoplus_n L^2_{\mu_n}(\mathbb{R}). \]

53 The elements of $L^2$ are actually equivalence classes of functions, not functions, but often we will work with representatives whenever this is possible. So in this particular case we actually mean that $E_A(\Delta)h_n$ is mapped to the equivalence class of $1_\Delta$.

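Now define a measure $\mu$ on $\mathbb{R}$ by

\[ \mu(\Delta) = \sum_{n=1}^\infty 2^{-n} \mu_n(\Delta). \]

Note that $\mu_n(\Delta) = \langle E_A(\Delta)h_n, h_n \rangle \leq \|h_n\|^2 = 1$, so the sum converges for each Borel set $\Delta$, i.e. $\mu(\Delta) < \infty$ for each Borel set $\Delta$. It is clear that if $\mu(\Delta) = 0$, then $\mu_n(\Delta) = 0$ for all $n$, so we have $\mu_n \ll \mu$ for all $n$. By the Radon-Nikodym theorem from measure theory it then follows that for each $n$ there exists a nonnegative function $\varphi_n$ such that $\mu_n(\Delta) = \int_\Delta \varphi_n(x)\, d\mu(x)$. Now let $g_n \in \mathcal{H}_n$ and let $\pi_n(g_n)$ be the corresponding function in $L^2_{\mu_n}(\mathbb{R})$. Then the function $\tilde{\pi}_n(g_n) := \sqrt{\varphi_n}\, \pi_n(g_n)$ is in $L^2_\mu(\mathbb{R})$ and

\[ \|\tilde{\pi}_n(g_n)\|^2_{L^2_\mu(\mathbb{R})} = \int_\mathbb{R} |\tilde{\pi}_n(g_n)(x)|^2\, d\mu(x) = \int_\mathbb{R} |\pi_n(g_n)(x)|^2 \varphi_n(x)\, d\mu(x) = \int_\mathbb{R} |\pi_n(g_n)(x)|^2\, d\mu_n(x) = \|\pi_n(g_n)\|^2_{L^2_{\mu_n}(\mathbb{R})} = \|g_n\|^2_{\mathcal{H}_n}, \]

i.e. the mapping $g_n \mapsto \tilde{\pi}_n(g_n)$ is an isometry of $\mathcal{H}_n$ into $L^2_\mu(\mathbb{R})$. Now define $X_n \subset \mathbb{R}$ by $X_n = \{x \in \mathbb{R} : \varphi_n(x) > 0\}$; then the above mapping $\tilde{\pi}_n : \mathcal{H}_n \to L^2_\mu(\mathbb{R})$ defines an isomorphism $\tilde{\pi}_n : \mathcal{H}_n \to L^2_\mu(X_n)$. We thus have an isomorphism

\[ \tilde{\pi} : \mathcal{H} \to \bigoplus_n L^2_\mu(X_n). \]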

Define a function $n : \mathbb{R} \to \mathbb{N} \cup \{\infty\}$ by $n(x) = \#\{m : x \in X_m\}$. Then $n(x)$ is measurable and if we write $B_n := \{x \in \mathbb{R} : n(x) = n\}$ then clearly $\mu(B_0) = 0$. For $x \in B_n$ we write $m_1(x) < m_2(x) < \ldots < m_n(x)$ for the values of $m$ for which $x \in X_m$. If $g = (g_1, g_2, \ldots) \in \bigoplus_n L^2_\mu(X_n)$, we define for each $n$ a set of functions $\varphi^{(n)}_k(g) \in L^2_\mu(B_n)$ ($k = 1, \ldots, n$) by $\varphi^{(n)}_k(g)(x) = g_{m_k(x)}(x)$, so for each $n$ this defines a map $\alpha_n : \bigoplus_k L^2_\mu(X_k) \to L^2_\mu(B_n)^{\oplus n}$ given by $g = (g_1, g_2, \ldots) \mapsto \varphi^{(n)}(g) := (\varphi^{(n)}_1(g), \ldots, \varphi^{(n)}_n(g))$. This, in turn, gives rise to a map $\alpha : \bigoplus_n L^2_\mu(X_n) \to \bigoplus_n L^2_\mu(B_n)^{\oplus n}$ given by $g \mapsto (\varphi^{(1)}(g), \varphi^{(2)}(g), \ldots)$. Because

\begin{align*}
\|g\|^2_{\bigoplus_n L^2_\mu(X_n)} &= \sum_n \|g_n\|^2_{L^2_\mu(X_n)} = \sum_n \int_{X_n} |g_n(x)|^2\, d\mu(x) = \sum_n \int_{B_n} \sum_{k=1}^n |g_{m_k(x)}(x)|^2\, d\mu(x) \\
&= \sum_n \int_{B_n} \sum_{k=1}^n |\varphi^{(n)}_k(g)(x)|^2\, d\mu(x) = \sum_n \sum_{k=1}^n \|\varphi^{(n)}_k(g)\|^2_{L^2_\mu(B_n)} \\
&= \sum_n \|\varphi^{(n)}(g)\|^2_{L^2_\mu(B_n)^{\oplus n}} = \|\alpha(g)\|^2_{\bigoplus_n L^2_\mu(B_n)^{\oplus n}},
\end{align*}

we see that the map $\alpha$ is isometric. For each $x \in \mathbb{R}$ we now choose a Hilbert space $\mathcal{H}(x)$ with $\dim(\mathcal{H}(x)) = n_x$, where $n_x$ is the unique index in $\mathbb{N} \cup \{\infty\}$ such that $x \in B_{n_x}$, and for each $x \in \mathbb{R}$ we also specify a basis $\{e_k(x)\}_{k=1}^{n_x}$ for $\mathcal{H}(x)$. For each $n$ we can then represent the Hilbert space $L^2_\mu(B_n)^{\oplus n}$ as a direct integral $\int^\oplus_{B_n} \mathcal{H}(x)\, d\mu(x)$ of the spaces $\mathcal{H}(x)$. Thus, we now have isomorphisms $\bigoplus_n L^2_\mu(X_n) \simeq \bigoplus_n L^2_\mu(B_n)^{\oplus n} \simeq \bigoplus_n \int^\oplus_{B_n} \mathcal{H}(x)\, d\mu(x)$, which are given by

\[ g \mapsto (\varphi^{(1)}(g), \varphi^{(2)}(g), \ldots) \mapsto \big( \varphi^{(1)}_1(g)e_1,\; \varphi^{(2)}_1(g)e_1 + \varphi^{(2)}_2(g)e_2,\; \ldots,\; \varphi^{(n)}_1(g)e_1 + \ldots + \varphi^{(n)}_n(g)e_n,\; \ldots \big). \]

It now follows from the definition of the direct integral that $\mathcal{H}$ can be represented as direct integral $\int^\oplus_\mathbb{R} \mathcal{H}(x)\, d\mu(x)$.

This theorem can be slightly generalized as follows. When we have a finite system $\{A_k\}_{k=1}^n$ of pairwise commuting self-adjoint operators $A_k$ on a Hilbert space $\mathcal{H}$ (i.e. the spectral measures $E_{A_l}$ and $E_{A_m}$ of $A_l$ and $A_m$ commute for all $l, m$), then $\mathcal{H}$ can be represented as a direct integral

\[ \int^\oplus_{\mathbb{R}^n} \mathcal{H}(x)\, d\mu(x) \]

of Hilbert spaces $\mathcal{H}(x)$ relative to a positive measure $\mu$ on $\mathbb{R}^n$ such that the action of $A_k$ is given by multiplication by $x_k$, where $x_k$ denotes the $k$-th component of the integration variable $x \in \mathbb{R}^n$.

B Examples of free fields

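We will now construct some of the fields for massive particles. The computation of the coefficients in (3.26) is done by using the identities

\[ \big[ e^{\boldsymbol{\theta} \cdot \mathbf{J}^{(1/2)}} \big]_{jk} = \big[ e^{\boldsymbol{\theta} \cdot \boldsymbol{\sigma}/2} \big]_{jk} = [2m(p^0 + m)]^{-1/2} \Big[ (p^0 + m)\delta_{jk} + \sum_{l=1}^3 p^l [\sigma_l]_{jk} \Big], \]

\[ \big[ e^{-\boldsymbol{\theta} \cdot \mathbf{J}^{(1/2)}} \big]_{jk} = \big[ e^{-\boldsymbol{\theta} \cdot \boldsymbol{\sigma}/2} \big]_{jk} = [2m(p^0 + m)]^{-1/2} \Big[ (p^0 + m)\delta_{jk} - \sum_{l=1}^3 p^l [\sigma_l]_{jk} \Big]. \]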
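The two identities above describe mutually inverse $2 \times 2$ matrices, which can be confirmed numerically. The sketch below only assumes the standard Pauli matrices and the mass-shell relation $p^0 = \sqrt{m^2 + \mathbf{p}^2}$; the function names and the sample momentum are hypothetical.

```python
from math import sqrt


def boost_spin_half(m, p, sign=+1):
    """[2m(p0+m)]^(-1/2) [(p0+m) 1 + sign * sum_l p^l sigma_l] as a 2x2 complex matrix."""
    p1, p2, p3 = p
    p0 = sqrt(m * m + p1 * p1 + p2 * p2 + p3 * p3)   # mass-shell energy
    n = 1.0 / sqrt(2 * m * (p0 + m))
    a = p0 + m
    # p . sigma = [[p3, p1 - i p2], [p1 + i p2, -p3]]
    return [[n * (a + sign * p3), n * sign * (p1 - 1j * p2)],
            [n * sign * (p1 + 1j * p2), n * (a - sign * p3)]]


def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


if __name__ == "__main__":
    m, p = 1.5, (0.2, -0.7, 0.4)
    B = boost_spin_half(m, p, +1)
    B_inv = boost_spin_half(m, p, -1)
    print(matmul(B, B_inv))      # the 2x2 identity, up to rounding
    print(matmul(B, B))          # equals (p0 + p . sigma) / m, up to rounding
```

The first check rests on $(p^0 + m)^2 - \mathbf{p}^2 = 2m(p^0 + m)$, which is exactly where the normalisation factor comes from.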

B.1 The (0, 0)-field (or scalar field)

The $(0, 0)$-field (or scalar field) can only describe particles $\tau$ with spin $j_\tau = 0$. According to (3.26) the coefficients $u(p)$ are given by $u(p) = (2p^0)^{-1/2} C_{00}(0, 0; 0, 0) = (2p^0)^{-1/2}$, so (using (3.22)) the scalar field is

\[ (\psi^\tau_{0,0})_{0,0}(x) = \int \frac{d^3p}{(2\pi)^{3/2}(2p^0)^{1/2}} \Big[ e^{-ip \cdot x} a_\tau(\mathbf{p}) + e^{ip \cdot x} a^*_{\tau^C}(\mathbf{p}) \Big], \tag{B.1} \]

with $p^0 = \sqrt{m_\tau^2 + \mathbf{p}^2}$. Note that in case the particle $\tau$ coincides with its antiparticle $\tau^C$ then $\psi^*_{0,0}(x) = \psi_{0,0}(x)$. For this reason, such a field will also be called real.

B.2 The $(\frac12, \frac12)$-field (or vector field)

We will now consider the $(\frac12, \frac12)$-field (or vector field). This field can only describe particles $\tau$ with spin $j_\tau = 0$ or $j_\tau = 1$.

Spin 0

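For spin 0 the coefficients are

\[ \begin{pmatrix} u_{-\frac12 -\frac12}(p) \\ u_{-\frac12 \frac12}(p) \\ u_{\frac12 -\frac12}(p) \\ u_{\frac12 \frac12}(p) \end{pmatrix} = \frac{1}{2m\sqrt{p^0}} \begin{pmatrix} p^1 + ip^2 \\ -p^0 + p^3 \\ p^0 + p^3 \\ -p^1 + ip^2 \end{pmatrix}. \]

Now define new coefficients

\[ \begin{pmatrix} u^0(p) \\ u^1(p) \\ u^2(p) \\ u^3(p) \end{pmatrix} := \frac{-im}{\sqrt{2}} \begin{pmatrix} u_{\frac12 -\frac12} - u_{-\frac12 \frac12} \\ u_{-\frac12 -\frac12} - u_{\frac12 \frac12} \\ -i[u_{-\frac12 -\frac12} + u_{\frac12 \frac12}] \\ u_{-\frac12 \frac12} + u_{\frac12 -\frac12} \end{pmatrix} = -i(2p^0)^{-1/2} \begin{pmatrix} p^0 \\ p^1 \\ p^2 \\ p^3 \end{pmatrix}. \]

This new choice of coefficients corresponds to a basis transformation in the space $V^{(\frac12, \frac12)} = V_A^{(\frac12)} \otimes V_B^{(\frac12)}$. With respect to this new basis, the vector field for a particle of spin 0 becomes

\begin{align*}
(\psi^\tau_{\frac12, \frac12})^\mu(x) &= (2\pi)^{-3/2} \int d^3p\, u^\mu(p) \Big[ e^{-ip \cdot x} a_\tau(\mathbf{p}) - e^{ip \cdot x} a^*_{\tau^C}(\mathbf{p}) \Big] \\
&= (2\pi)^{-3/2} \int d^3p\, (2p^0)^{-1/2} \Big[ -ip^\mu e^{-ip \cdot x} a_\tau(\mathbf{p}) + ip^\mu e^{ip \cdot x} a^*_{\tau^C}(\mathbf{p}) \Big] \\
&= \partial^\mu (\psi^\tau_{0,0})_{0,0}(x),
\end{align*}

where in the first line we used that $\lambda = (-1)^{2B}\kappa = -\kappa$.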
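The claim that the recombined spin-0 coefficients are proportional to the four-momentum is a short computation that can also be checked numerically. The sketch below assumes only the coefficient column given above and $p^0 = \sqrt{m^2 + \mathbf{p}^2}$; the function name and the test momentum are hypothetical.

```python
from math import sqrt


def spin0_vector_coefficients(m, p):
    """Return (u, expected): the recombined coefficients u^mu(p) built from the four
    spin-0 coefficients of the (1/2,1/2)-field, and -i (2 p0)^(-1/2) p^mu."""
    p1, p2, p3 = p
    p0 = sqrt(m * m + p1 * p1 + p2 * p2 + p3 * p3)
    n = 1.0 / (2 * m * sqrt(p0))
    u_mm = n * (p1 + 1j * p2)      # u_{-1/2,-1/2}
    u_mp = n * (-p0 + p3)          # u_{-1/2,+1/2}
    u_pm = n * (p0 + p3)           # u_{+1/2,-1/2}
    u_pp = n * (-p1 + 1j * p2)     # u_{+1/2,+1/2}
    c = -1j * m / sqrt(2)
    u = [c * (u_pm - u_mp),
         c * (u_mm - u_pp),
         c * (-1j) * (u_mm + u_pp),
         c * (u_mp + u_pm)]
    expected = [-1j * q / sqrt(2 * p0) for q in (p0, p1, p2, p3)]
    return u, expected


if __name__ == "__main__":
    u, expected = spin0_vector_coefficients(2.0, (0.3, -1.1, 0.7))
    for a, b in zip(u, expected):
        print(a, b)                # the two columns agree up to rounding
```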

Spin 1

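For spin 1 the coefficients are

\[ \begin{pmatrix} u_{-\frac12 -\frac12}(p, 1) \\ u_{-\frac12 \frac12}(p, 1) \\ u_{\frac12 -\frac12}(p, 1) \\ u_{\frac12 \frac12}(p, 1) \end{pmatrix} = \frac{[2m(p^0 + m)]^{-1}}{\sqrt{2p^0}} \begin{pmatrix} -(p^1 + ip^2)^2 \\ (p^0 + m - p^3)(p^1 + ip^2) \\ -(p^0 + m + p^3)(p^1 + ip^2) \\ (p^0 + m)^2 - (p^3)^2 \end{pmatrix}, \]

\[ \begin{pmatrix} u_{-\frac12 -\frac12}(p, 0) \\ u_{-\frac12 \frac12}(p, 0) \\ u_{\frac12 -\frac12}(p, 0) \\ u_{\frac12 \frac12}(p, 0) \end{pmatrix} = \frac{[2m(p^0 + m)]^{-1}}{2\sqrt{p^0}} \begin{pmatrix} 2(p^1 + ip^2)p^3 \\ (p^0 + m - p^3)^2 - (p^1)^2 - (p^2)^2 \\ (p^0 + m + p^3)^2 - (p^1)^2 - (p^2)^2 \\ 2(-p^1 + ip^2)p^3 \end{pmatrix}, \]

\[ \begin{pmatrix} u_{-\frac12 -\frac12}(p, -1) \\ u_{-\frac12 \frac12}(p, -1) \\ u_{\frac12 -\frac12}(p, -1) \\ u_{\frac12 \frac12}(p, -1) \end{pmatrix} = \frac{[2m(p^0 + m)]^{-1}}{\sqrt{2p^0}} \begin{pmatrix} (p^0 + m)^2 - (p^3)^2 \\ (p^0 + m - p^3)(-p^1 + ip^2) \\ (p^0 + m + p^3)(p^1 - ip^2) \\ -(p^1 - ip^2)^2 \end{pmatrix}. \]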

For $\sigma \in \{1, 0, -1\}$ we define

\[ \begin{pmatrix} e^0(p, \sigma) \\ e^1(p, \sigma) \\ e^2(p, \sigma) \\ e^3(p, \sigma) \end{pmatrix} = \sqrt{p^0} \begin{pmatrix} u_{\frac12 -\frac12}(p, \sigma) - u_{-\frac12 \frac12}(p, \sigma) \\ u_{-\frac12 -\frac12}(p, \sigma) - u_{\frac12 \frac12}(p, \sigma) \\ -i[u_{-\frac12 -\frac12}(p, \sigma) + u_{\frac12 \frac12}(p, \sigma)] \\ u_{-\frac12 \frac12}(p, \sigma) + u_{\frac12 -\frac12}(p, \sigma) \end{pmatrix}. \]

Again, this corresponds to a basis transformation in the space $V^{(\frac12, \frac12)} = V_A^{(\frac12)} \otimes V_B^{(\frac12)}$. However, the new coefficients are not $e^\mu(p, \sigma)$, but rather $\frac{1}{\sqrt{p^0}} e^\mu(p, \sigma)$ (since any new basis must be a linear combination of the old basis, with scalar coefficients). Explicitly, the $e^\mu(p, \sigma)$ are

\[ \begin{pmatrix} e^0(p, 1) \\ e^1(p, 1) \\ e^2(p, 1) \\ e^3(p, 1) \end{pmatrix} = -\frac{1}{\sqrt{2}} [m(p^0 + m)]^{-1} \begin{pmatrix} (p^1 + ip^2)(p^0 + m) \\ m(p^0 + m) + (p^1)^2 + ip^1p^2 \\ p^1p^2 + im(p^0 + m) + i(p^2)^2 \\ (p^1 + ip^2)p^3 \end{pmatrix}, \]

\[ \begin{pmatrix} e^0(p, 0) \\ e^1(p, 0) \\ e^2(p, 0) \\ e^3(p, 0) \end{pmatrix} = [m(p^0 + m)]^{-1} \begin{pmatrix} (p^0 + m)p^3 \\ p^1p^3 \\ p^2p^3 \\ (p^3)^2 + m(p^0 + m) \end{pmatrix}, \]

\[ \begin{pmatrix} e^0(p, -1) \\ e^1(p, -1) \\ e^2(p, -1) \\ e^3(p, -1) \end{pmatrix} = \frac{1}{\sqrt{2}} [m(p^0 + m)]^{-1} \begin{pmatrix} (p^1 - ip^2)(p^0 + m) \\ (p^1)^2 + m(p^0 + m) - ip^1p^2 \\ p^1p^2 - im(p^0 + m) - i(p^2)^2 \\ (p^1 - ip^2)p^3 \end{pmatrix}. \]

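Note that for zero momentum these coefficients are

\[ e^\mu(0, 1) = -\frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 1 \\ i \\ 0 \end{pmatrix}, \qquad e^\mu(0, 0) = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}, \qquad e^\mu(0, -1) = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 1 \\ -i \\ 0 \end{pmatrix} \]

and that the $e^\mu(p, \sigma)$ are related to these by

\[ e^\mu(p, \sigma) = L(p)^\mu{}_\nu\, e^\nu(0, \sigma), \]

where $L(p)$ is the standard boost that maps the momentum vector $(m, \mathbf{0})$ to $(\sqrt{m^2 + \mathbf{p}^2}, \mathbf{p})$:

\[ L(p)^\mu{}_\nu = [m(p^0 + m)]^{-1} \begin{pmatrix} p^0(p^0 + m) & p^1(p^0 + m) & p^2(p^0 + m) & p^3(p^0 + m) \\ p^1(p^0 + m) & m(p^0 + m) + (p^1)^2 & p^1p^2 & p^1p^3 \\ p^2(p^0 + m) & p^1p^2 & m(p^0 + m) + (p^2)^2 & p^2p^3 \\ p^3(p^0 + m) & p^1p^3 & p^2p^3 & m(p^0 + m) + (p^3)^2 \end{pmatrix}. \tag{B.2} \]

The field is

\[ (\psi^\tau_{\frac12, \frac12})^\mu(x) = (2\pi)^{-3/2} \sum_{\sigma=-1}^{1} \int \frac{d^3p}{\sqrt{2p^0}}\, e^\mu(p, \sigma) \Big[ e^{-ip \cdot x} a_\tau(\mathbf{p}, \sigma) + (-1)^\sigma e^{ip \cdot x} a^*_{\tau^C}(\mathbf{p}, -\sigma) \Big]. \]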
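The properties of the standard boost (B.2) used in this section are easy to test numerically. The sketch below builds $L(p)$ from (B.2), checks that it maps $(m, \mathbf{0})$ to $(p^0, \mathbf{p})$ and preserves the Minkowski form (signature $(+,-,-,-)$ is an assumption here), and boosts the rest-frame polarization vectors. The function names and the sample momentum are hypothetical.

```python
from math import sqrt


def standard_boost(m, p):
    """The 4x4 matrix L(p) of (B.2) as nested lists; rows are mu, columns are nu."""
    P = tuple(p)
    p0 = sqrt(m * m + sum(q * q for q in P))
    a = p0 + m
    s = 1.0 / (m * a)
    L = [[s * p0 * a] + [s * q * a for q in P]]
    for i in range(3):
        row = [s * P[i] * a]
        for j in range(3):
            row.append(s * (P[i] * P[j] + (m * a if i == j else 0.0)))
        L.append(row)
    return L


def apply(L, v):
    """Matrix-vector product L v for a 4x4 matrix and a 4-vector."""
    return [sum(L[i][j] * v[j] for j in range(4)) for i in range(4)]


def minkowski(u, v):
    """Bilinear Minkowski form with signature (+,-,-,-), without complex conjugation."""
    return u[0] * v[0] - u[1] * v[1] - u[2] * v[2] - u[3] * v[3]


# Rest-frame polarization vectors e(0, sigma) as listed in the text.
REST_POLARIZATIONS = {
    1: [0, -1 / sqrt(2), -1j / sqrt(2), 0],
    0: [0, 0, 0, 1],
    -1: [0, 1 / sqrt(2), -1j / sqrt(2), 0],
}

if __name__ == "__main__":
    m, p = 1.0, (0.3, -0.4, 1.2)
    L = standard_boost(m, p)
    print(apply(L, [m, 0, 0, 0]))                 # (p0, p1, p2, p3), up to rounding
    for sigma, e_rest in REST_POLARIZATIONS.items():
        print(sigma, apply(L, e_rest))            # e(p, sigma)
```

Because $L(p)$ is a Lorentz transformation and $e^\mu(0, \sigma)$ has no time component, the boosted vectors automatically satisfy $p_\mu e^\mu(p, \sigma) = 0$.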

B.3 The $(\frac12, 0)$-field and the $(0, \frac12)$-field

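These fields can only describe particles with spin $\frac12$. The coefficients for the $(\frac12, 0)$-field are given by

\[ \begin{pmatrix} u_{\frac12 0}(p, \frac12) \\ u_{-\frac12 0}(p, \frac12) \end{pmatrix} = [4mp^0(p^0 + m)]^{-1/2} \begin{pmatrix} p^0 + m + p^3 \\ p^1 + ip^2 \end{pmatrix}, \qquad \begin{pmatrix} u_{\frac12 0}(p, -\frac12) \\ u_{-\frac12 0}(p, -\frac12) \end{pmatrix} = [4mp^0(p^0 + m)]^{-1/2} \begin{pmatrix} p^1 - ip^2 \\ p^0 + m - p^3 \end{pmatrix}. \]

So the field is given by

\begin{align*}
(\psi^\tau_{\frac12, 0})_{a0}(x) &= (2\pi)^{-3/2} \sum_{\sigma=-\frac12}^{\frac12} \int d^3p\, u_{a0}(p, \sigma) \Big[ e^{-ip \cdot x} a_\tau(\mathbf{p}, \sigma) + (-1)^{\frac12 - \sigma} e^{ip \cdot x} a^*_{\tau^C}(\mathbf{p}, -\sigma) \Big] \\
&= (2\pi)^{-3/2} \sum_{\sigma=-\frac12}^{\frac12} \int d^3p \Big[ u_{a0}(p, \sigma) e^{-ip \cdot x} a_\tau(\mathbf{p}, \sigma) + v_{a0}(p, \sigma) e^{ip \cdot x} a^*_{\tau^C}(\mathbf{p}, \sigma) \Big],
\end{align*}

where in the last line we have restored the coefficients $v_{a0}(p, \sigma) = (-1)^{\frac12 + \sigma} u_{a0}(p, -\sigma)$ and we have used that $\lambda = (-1)^{2B}\kappa = \kappa$. The coefficients for the $(0, \frac12)$-field are given by

\[ \begin{pmatrix} u_{0 \frac12}(p, \frac12) \\ u_{0 -\frac12}(p, \frac12) \end{pmatrix} = [4mp^0(p^0 + m)]^{-1/2} \begin{pmatrix} p^0 + m - p^3 \\ -p^1 - ip^2 \end{pmatrix}, \qquad \begin{pmatrix} u_{0 \frac12}(p, -\frac12) \\ u_{0 -\frac12}(p, -\frac12) \end{pmatrix} = [4mp^0(p^0 + m)]^{-1/2} \begin{pmatrix} -p^1 + ip^2 \\ p^0 + m + p^3 \end{pmatrix}. \]

So the field is given by

\begin{align*}
(\psi^\tau_{0, \frac12})_{0b}(x) &= (2\pi)^{-3/2} \sum_{\sigma=-\frac12}^{\frac12} \int d^3p\, u_{0b}(p, \sigma) \Big[ e^{-ip \cdot x} a_\tau(\mathbf{p}, \sigma) + (-1)^{\frac32 - \sigma} e^{ip \cdot x} a^*_{\tau^C}(\mathbf{p}, -\sigma) \Big] \\
&= (2\pi)^{-3/2} \sum_{\sigma=-\frac12}^{\frac12} \int d^3p \Big[ u_{0b}(p, \sigma) e^{-ip \cdot x} a_\tau(\mathbf{p}, \sigma) - v_{0b}(p, \sigma) e^{ip \cdot x} a^*_{\tau^C}(\mathbf{p}, \sigma) \Big],
\end{align*}

where $v_{0b}(p, \sigma) = (-1)^{\frac12 + \sigma} u_{0b}(p, -\sigma)$ and the minus sign in the second term comes from $\lambda = (-1)^{2B}\kappa = -\kappa$.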

B.4 The $(\frac12, 0) \oplus (0, \frac12)$-field (or Dirac field)

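This field is just

\begin{align*}
\begin{pmatrix} (\psi^\tau_{\frac12, 0})_{a0}(x) \\ (\psi^\tau_{0, \frac12})_{0b}(x) \end{pmatrix} &= (2\pi)^{-3/2} \sum_{\sigma=-\frac12}^{\frac12} \int d^3p \left[ \begin{pmatrix} u_{a0}(p, \sigma) \\ u_{0b}(p, \sigma) \end{pmatrix} e^{-ip \cdot x} a_\tau(\mathbf{p}, \sigma) + \begin{pmatrix} v_{a0}(p, \sigma) \\ -v_{0b}(p, \sigma) \end{pmatrix} e^{ip \cdot x} a^*_{\tau^C}(\mathbf{p}, \sigma) \right] \\
&= (2\pi)^{-3/2} \sum_{\sigma=-\frac12}^{\frac12} \int d^3p \Big[ u(p, \sigma) e^{-ip \cdot x} a_\tau(\mathbf{p}, \sigma) + v(p, \sigma) e^{ip \cdot x} a^*_{\tau^C}(\mathbf{p}, \sigma) \Big],
\end{align*}

where

\[ u(p, \tfrac12) = \begin{pmatrix} u_{\frac12 0}(p, \frac12) \\ u_{-\frac12 0}(p, \frac12) \\ u_{0 \frac12}(p, \frac12) \\ u_{0 -\frac12}(p, \frac12) \end{pmatrix} = [4mp^0(p^0 + m)]^{-1/2} \begin{pmatrix} p^0 + m + p^3 \\ p^1 + ip^2 \\ p^0 + m - p^3 \\ -p^1 - ip^2 \end{pmatrix}, \]

\[ u(p, -\tfrac12) = \begin{pmatrix} u_{\frac12 0}(p, -\frac12) \\ u_{-\frac12 0}(p, -\frac12) \\ u_{0 \frac12}(p, -\frac12) \\ u_{0 -\frac12}(p, -\frac12) \end{pmatrix} = [4mp^0(p^0 + m)]^{-1/2} \begin{pmatrix} p^1 - ip^2 \\ p^0 + m - p^3 \\ -p^1 + ip^2 \\ p^0 + m + p^3 \end{pmatrix}, \]

\[ v(p, \tfrac12) = \begin{pmatrix} v_{\frac12 0}(p, \frac12) \\ v_{-\frac12 0}(p, \frac12) \\ -v_{0 \frac12}(p, \frac12) \\ -v_{0 -\frac12}(p, \frac12) \end{pmatrix} = [4mp^0(p^0 + m)]^{-1/2} \begin{pmatrix} -p^1 + ip^2 \\ -p^0 - m + p^3 \\ -p^1 + ip^2 \\ p^0 + m + p^3 \end{pmatrix}, \]

\[ v(p, -\tfrac12) = \begin{pmatrix} v_{\frac12 0}(p, -\frac12) \\ v_{-\frac12 0}(p, -\frac12) \\ -v_{0 \frac12}(p, -\frac12) \\ -v_{0 -\frac12}(p, -\frac12) \end{pmatrix} = [4mp^0(p^0 + m)]^{-1/2} \begin{pmatrix} p^0 + m + p^3 \\ p^1 + ip^2 \\ -p^0 - m + p^3 \\ p^1 + ip^2 \end{pmatrix}. \]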


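If we define the $4 \times 4$-matrix $M$ by

\[ M = \begin{pmatrix} 0 & 0 & p^0 + p^3 & p^1 - ip^2 \\ 0 & 0 & p^1 + ip^2 & p^0 - p^3 \\ p^0 - p^3 & -p^1 + ip^2 & 0 & 0 \\ -p^1 - ip^2 & p^0 + p^3 & 0 & 0 \end{pmatrix}, \]

then it is easily seen that $Mu(p, \sigma) = mu(p, \sigma)$ and $Mv(p, \sigma) = -mv(p, \sigma)$. If we now define the $4 \times 4$-matrices

\[ \gamma^0 = \begin{pmatrix} 0 & 1_{\mathbb{C}^2} \\ 1_{\mathbb{C}^2} & 0 \end{pmatrix}, \qquad \gamma^i = \begin{pmatrix} 0 & -\sigma_i \\ \sigma_i & 0 \end{pmatrix}, \]

where $\sigma_i$ denote the Pauli matrices, these equations can be rewritten as $(\gamma^\mu p_\mu - m)u(p, \sigma) = 0$ and $(\gamma^\mu p_\mu + m)v(p, \sigma) = 0$. Using that $i\partial_\mu u(p, \sigma)e^{-ip \cdot x} = p_\mu u(p, \sigma)e^{-ip \cdot x}$ and $i\partial_\mu v(p, \sigma)e^{ip \cdot x} = -p_\mu v(p, \sigma)e^{ip \cdot x}$, this in turn implies that the field satisfies the Dirac equation

\[ (i\gamma^\mu \partial_\mu - m) \begin{pmatrix} (\psi^\tau_{\frac12, 0})_{a0}(x) \\ (\psi^\tau_{0, \frac12})_{0b}(x) \end{pmatrix} = 0. \]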
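The eigenvalue relations $Mu = mu$ and $Mv = -mv$ are stated as easily seen; a numerical check is a convenient safeguard against sign errors in the spinor components. The sketch below takes the spinors and the matrix $M$ exactly as listed in this section and assumes $p^0 = \sqrt{m^2 + \mathbf{p}^2}$; the function names and the sample momentum are hypothetical.

```python
from math import sqrt


def dirac_data(m, p):
    """Return (u, v, M): the spinors u(p, sigma), v(p, sigma) as dicts keyed by sigma,
    and the 4x4 matrix M, all as listed in the text."""
    p1, p2, p3 = p
    p0 = sqrt(m * m + p1 * p1 + p2 * p2 + p3 * p3)
    n = 1.0 / sqrt(4 * m * p0 * (p0 + m))
    a = p0 + m
    pp, pm = p1 + 1j * p2, p1 - 1j * p2          # p^1 + i p^2 and p^1 - i p^2
    u = {+0.5: [n * (a + p3), n * pp, n * (a - p3), -n * pp],
         -0.5: [n * pm, n * (a - p3), -n * pm, n * (a + p3)]}
    v = {+0.5: [-n * pm, n * (-a + p3), -n * pm, n * (a + p3)],
         -0.5: [n * (a + p3), n * pp, n * (-a + p3), n * pp]}
    M = [[0, 0, p0 + p3, pm],
         [0, 0, pp, p0 - p3],
         [p0 - p3, -pm, 0, 0],
         [-pp, p0 + p3, 0, 0]]
    return u, v, M


def apply(M, w):
    """Matrix-vector product for a 4x4 matrix and a 4-component spinor."""
    return [sum(M[i][j] * w[j] for j in range(4)) for i in range(4)]


def max_deviation(x, y):
    """Largest componentwise distance between two spinors."""
    return max(abs(a - b) for a, b in zip(x, y))


if __name__ == "__main__":
    m, p = 1.3, (0.5, -0.2, 0.9)
    u, v, M = dirac_data(m, p)
    for sigma in (+0.5, -0.5):
        print(max_deviation(apply(M, u[sigma]), [m * c for c in u[sigma]]))    # ~ 0
        print(max_deviation(apply(M, v[sigma]), [-m * c for c in v[sigma]]))   # ~ 0
```

Since $M = \gamma^\mu p_\mu$ in the representation given above, this is the momentum-space form of the Dirac equation for the positive- and negative-frequency parts of the field.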

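Note furthermore that under a space inversion the field transforms as

\[ P \begin{pmatrix} (\psi^\tau_{\frac12, 0})_{a0}(x) \\ (\psi^\tau_{0, \frac12})_{0b}(x) \end{pmatrix} P^{-1} = \begin{pmatrix} (\psi^\tau_{0, \frac12})_{0b}(x) \\ (\psi^\tau_{\frac12, 0})_{a0}(x) \end{pmatrix}. \]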


References

[1] H. Araki. Mathematical theory of quantum fields, Oxford University Press, Oxford, 2000

[2] N.N. Bogolubov, A.A. Logunov, A.I. Oksak, I.T. Todorov. General principles of quantum field theory, Kluwer, Dordrecht, 1990

[3] N.N. Bogolubov, A.A. Logunov, I.T. Todorov. Introduction to axiomatic quantum field theory, W.A. Benjamin, Inc., Massachusetts, 1975

[4] J. Cannon, A. Jaffe. Lorentz covariance of the $\lambda(\varphi^4)_2$ quantum field theory, Comm. Math. Phys. 17 (1970), 261-321

[5] J.B. Conway. A course in functional analysis (2nd edition), Springer, New York, 1990

[6] P.A.M. Dirac. Lectures on quantum mechanics, Yeshiva University, New York, 1964

[7] G.G. Emch. Algebraic methods in statistical mechanics and quantum field theory, John Wiley and Sons, Inc., New York, 1972

[8] G.B. Folland. Quantum field theory, A tourist guide for mathematicians, American Mathematical Society, Providence, 2008

[9] I.M. Gel'fand, N.Ya. Vilenkin. Generalized functions, Volume 4: Applications of harmonic analysis, Academic Press, New York, 1964

[10] J. Glimm. Boson fields with non-linear self-interaction in two dimensions, Comm. Math. Phys. 8 (1968), 12-25

[11] J. Glimm, A. Jaffe. A $\lambda\varphi^4$ quantum field theory without cut-offs I., Phys. Rev. 176 (1968), 1945-1961

[12] J. Glimm, A. Jaffe. The $\lambda\varphi^4$ quantum field theory without cut-offs II. The field operators and the approximate vacuum, Ann. Math. 91 (1970), 362-401

[13] J. Glimm, A. Jaffe. The $\lambda\varphi^4$ quantum field theory without cut-offs III. The physical vacuum, Acta Math. 125 (1970), 204-267

[14] J. Glimm, A. Jaffe. The energy momentum spectrum and vacuum expectation values in quantum field theory I, J. Math. Phys. 11 (1970), 3335-3338

[15] J. Glimm, A. Jaffe. The energy momentum spectrum and vacuum expectation values in quantum field theory II, Comm. Math. Phys. 22 (1971), 1-22

[16] J. Glimm, A. Jaffe. Quantum field theory models, in Statistical mechanics and quantum field theory, C. De Witt and R. Stora, Editors, Gordon and Breach, New York, 1971

[17] J. Glimm, A. Jaffe. The $\lambda\varphi^4$ quantum field theory without cut-offs IV. Perturbations of the Hamiltonian, J. Math. Phys. 13 (1972), 1568-1584

[18] J. Glimm, A. Jaffe. Boson quantum field models, in Mathematics of contemporary physics, R. Streater, Editor, Academic Press, London, 1972

[19] J. Glimm, A. Jaffe, T. Spencer. The Wightman axioms and particle structure in the $\mathcal{P}(\varphi)_2$ quantum field model, Ann. Math. 100 (1974), 585-632

[20] R. Haag. Local quantum physics: Fields, particles, algebras, Springer, Berlin, 1996

[21] R. Haag, D. Kastler. An algebraic approach to quantum field theory, J. Math. Phys. 5 (1964), 848


[22] B.C. Hall. Lie groups, Lie algebras, and representations, Springer, New York, 2003

[23] A. Jaffe. Constructive quantum field theory, Unpublished notes available at www.arthurjaffe.com

[24] A.M.L. Messiah, O.W. Greenberg. Symmetrization postulate and its experimental foundation, Phys. Rev. 136 (1964), 248-267

[25] J.R. Munkres. Topology (2nd edition), Prentice Hall, Inc., New Jersey, 2000

[26] G.L. Naber. The geometry of Minkowski spacetime, an introduction to the mathematics of the special theory of relativity, Springer, New York, 1992

[27] E. Nelson. A quartic interaction in two dimensions, in Mathematical Theory of Elementary Particles, R. Goodman and I. Segal, Editors, M.I.T. Press, Cambridge, 1966

[28] E. Nelson. Construction of quantum fields from Markoff fields, J. Func. Anal. 12 (1973), 97-112

[29] L.H. Ryder. Quantum field theory (2nd edition), Cambridge University Press, Cambridge, 1996

[30] B. Simon. The $P(\varphi)_2$ (quantum) field theory, Princeton University Press, Princeton, 1974

[31] R.F. Streater. Connection between the spectrum condition and the Lorentz invariance of $\mathcal{P}(\varphi)_2$, Comm. Math. Phys. 26 (1972), 109-120

[32] R.F. Streater, A.S. Wightman. PCT, spin and statistics, and all that (2nd revised printing), Benjamin, New York, 1978; reprinted by Princeton University Press, Princeton, 2000

[33] S.J. Summers. A perspective on constructive quantum field theory, arXiv: 1203.3991v1

[34] L.A. Takhtajan. Quantum mechanics for mathematicians, American Mathematical Society, Providence, 2008

[35] S. Weinberg. The quantum theory of fields I: Foundations, Cambridge University Press, Cambridge, 1995


Popular summary (English)

The special theory of relativity provides a mathematical model for space and time in the absence of gravity; this mathematical model is also called Minkowski spacetime. The advantage of Minkowski spacetime over the more classical Newtonian spacetime is that it remains accurate when objects move at (almost) the speed of light. Quantum theory provides a mathematical model for the behaviour of microscopically small objects, such as atoms or elementary particles. When one speaks of quantum mechanics, one usually means the quantum theory in which these objects live in a spacetime described by the Newtonian model. However, in order to make accurate predictions for systems of microscopically small objects that travel at almost the speed of light (think, for instance, of the situation in a particle accelerator), it becomes necessary to develop a quantum theory that uses Minkowski spacetime. In the development of such a theory it soon becomes very natural to introduce fields, and for this reason the corresponding theory is called quantum field theory.

Quantum field theory is a very successful theory in the sense that its predictions are in very good agreement with experimental data. From the point of view of a physicist, quantum field theory is therefore a very good theory. However, from a mathematical point of view quantum field theory is very ill-defined, because the precise nature of the 'mathematical' objects that are used in the theory is often unclear. In this thesis we investigate to what extent quantum field theory can be described as a mathematical theory. We will find that there are interesting formulations of what quantum field theory should be mathematically, but that it is not so easy to prove that quantum field theory, as it is used by physicists, is actually of the form described by these mathematical formulations.⁵⁴ That this is not so easy will follow from the difficulties that we encounter when we prove this for much simpler (and non-realistic) versions of quantum field theory.

⁵⁴ This problem, also called the quantum Yang-Mills problem, is one of the seven Millennium Prize Problems. Whoever solves one of these problems receives a million dollars from the Clay Mathematics Institute. Thus far, only one of the seven has been solved, namely the Poincaré conjecture.


Popular summary (translated from the Dutch)

The special theory of relativity gives a mathematical model for space and time in the absence of gravity; this mathematical model is also known as Minkowski spacetime. The advantage of Minkowski spacetime over the more old-fashioned Newtonian spacetime is that Minkowski spacetime stays accurate even as objects move at (almost) the speed of light. Quantum theory gives a mathematical model for the behaviour of microscopically small objects, such as atoms or elementary particles. When people speak of quantum mechanics, they usually mean the quantum theory in which the microscopically small objects are located in a spacetime described by the Newtonian model of spacetime. To make accurate predictions for systems consisting of microscopically small objects that move at almost the speed of light (think, for example, of the situation in a particle accelerator), it is however necessary to develop a quantum theory that starts from Minkowski spacetime. In developing such a theory it quickly becomes very natural to introduce fields, and for this reason the theory in question is called quantum field theory.

Quantum field theory is a very successful theory in the sense that the predictions that can be made with it agree with the experimental data to very high accuracy. Viewed from physics, quantum field theory is therefore a very good theory. Mathematically, however, quantum field theory is a very poorly defined theory, because it is often unclear what the precise nature is of the 'mathematical' objects that are used. In this thesis we investigate to what extent quantum field theory can be described as a mathematical theory. It will turn out that there are interesting formulations of what quantum field theory should be mathematically, but that it is not so simple to prove that quantum field theory, as used by physicists, really is of the form described in these mathematical formulations.⁵⁵ That this is not simple will become apparent from the difficulties we run into when we prove the same thing for much simpler (and non-realistic) versions of quantum field theory.

⁵⁵ This problem, also called the quantum Yang-Mills problem, is one of the seven Millennium Prize Problems. Whoever manages to solve such a problem is paid a million dollars by the Clay Mathematics Institute. So far only one of these seven problems has been solved, namely the Poincaré conjecture.
