non-relativistic quantum mechanics - universitetet i oslo...non-relativistic quantum mechanics...

144
Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo September 2004

Upload: others

Post on 09-Jun-2020

12 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

Non-Relativistic Quantum MechanicsLecture notes – FYS 4110

Jon Magne LeinaasDepartment of Physics, University of Oslo

September 2004

Page 2: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

2

Preface

These notes are prepared for the physics course FYS 4110, Non-relativistic Quantum Me-chanics, which is a second level course in quantum mechanics at the Physics Department inOslo. The course was first lectured in the fall semester of 2003, and the intention with this newcourse was to bring in more ”modern aspects” in the teaching of quantum physics.

The lecture notes have been developed parallel with my teaching of the course. They are stillin the process of being modified and extended. I apologize for misprints that have still survivedand welcome feedback from students concerning misprints as well as opinions on the contentsand presentation.

Department of Physics, University of Oslo, August 2004.Jon Magne Leinaas

Page 3: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

Contents

1 Quantum formalism 51.1 Summary of quantum states and observables . . . . . . . . . . . . . . . . . . . . 5

1.1.1 Classical and quantum states . . . . . . . . . . . . . . . . . . . . . . . . 51.1.2 The fundamental postulates . . . . . . . . . . . . . . . . . . . . . . . . 81.1.3 Coordinate representation and wave functions . . . . . . . . . . . . . . . 111.1.4 Spin-half system and the Stern Gerlach experiment . . . . . . . . . . . . 12

1.2 Quantum Dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141.2.1 The Schrodinger and Heisenberg pictures . . . . . . . . . . . . . . . . . 141.2.2 Path integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181.2.3 Path integral for a free particle . . . . . . . . . . . . . . . . . . . . . . . 211.2.4 The classical theory as a limit of the path integral . . . . . . . . . . . . . 231.2.5 A semiclassical approximation . . . . . . . . . . . . . . . . . . . . . . . 241.2.6 The double slit experiment revisited . . . . . . . . . . . . . . . . . . . . 25

1.3 Two-level system and harmonic oscillator . . . . . . . . . . . . . . . . . . . . . 271.3.1 The two-level system . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281.3.2 Spin dynamics and magnetic resonance . . . . . . . . . . . . . . . . . . 291.3.3 Harmonic oscillator and coherent states . . . . . . . . . . . . . . . . . . 331.3.4 Fermionic and bosonic oscillators: an example of supersymmetry . . . . 40

2 Quantum mechanics and probability 432.1 Classical and quantum probabilities . . . . . . . . . . . . . . . . . . . . . . . . 43

2.1.1 Pure and mixed states, the density operator . . . . . . . . . . . . . . . . 432.1.2 Entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 462.1.3 Mixed states for a two-level system . . . . . . . . . . . . . . . . . . . . 47

2.2 Entanglement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 492.2.1 Composite systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 492.2.2 Classical statistical correlations . . . . . . . . . . . . . . . . . . . . . . 502.2.3 States of a composite quantum system . . . . . . . . . . . . . . . . . . . 512.2.4 Correlations and entanglement . . . . . . . . . . . . . . . . . . . . . . . 522.2.5 Entanglement in a two-spin system . . . . . . . . . . . . . . . . . . . . 55

2.3 Quantum states and physical reality . . . . . . . . . . . . . . . . . . . . . . . . 572.3.1 EPR-paradox . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 572.3.2 Bell’s inequality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

3

Page 4: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

4 CONTENTS

3 Quantum physics and information 673.1 An interaction-free measurement . . . . . . . . . . . . . . . . . . . . . . . . . . 673.2 From bits to qubits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 713.3 Communication with qubits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 723.4 Principles for a quantum computer . . . . . . . . . . . . . . . . . . . . . . . . . 74

3.4.1 A universal quantum computer . . . . . . . . . . . . . . . . . . . . . . . 753.4.2 A simple algorithm for a quantum computation . . . . . . . . . . . . . . 783.4.3 Can a quantum computer be constructed? . . . . . . . . . . . . . . . . . 81

4 The quantum theory of light 834.1 Classical electromagnetism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

4.1.1 Maxwell’s equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 834.1.2 Field energy and field momentum . . . . . . . . . . . . . . . . . . . . . 864.1.3 Lagrange and Hamilton formulations of the classical Maxwell theory . . 87

4.2 Photons – the quanta of light . . . . . . . . . . . . . . . . . . . . . . . . . . . . 934.2.1 Constructing Fock space . . . . . . . . . . . . . . . . . . . . . . . . . . 934.2.2 Coherent and incoherent photon states . . . . . . . . . . . . . . . . . . . 964.2.3 Photon emission and photon absorption . . . . . . . . . . . . . . . . . . 1014.2.4 Dipole approximation and selection rules . . . . . . . . . . . . . . . . . 104

4.3 Photon emission from excited atom . . . . . . . . . . . . . . . . . . . . . . . . 1064.3.1 First order transition and Fermi’s golden rule . . . . . . . . . . . . . . . 1064.3.2 Emission rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1084.3.3 Life time and line width . . . . . . . . . . . . . . . . . . . . . . . . . . 110

4.4 Stimulated photon emission and the principle of lasers . . . . . . . . . . . . . . 1124.4.1 Three-level model of a laser . . . . . . . . . . . . . . . . . . . . . . . . 1134.4.2 Quantum coherence . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116

5 Quantum mechanics and geometry 1215.1 Geometry in Hilbert space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121

5.1.1 Real space geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1225.1.2 Geometry of the sphere . . . . . . . . . . . . . . . . . . . . . . . . . . . 1255.1.3 The complex geometry of Hilbert space . . . . . . . . . . . . . . . . . . 1275.1.4 Geometry of physical states . . . . . . . . . . . . . . . . . . . . . . . . 1285.1.5 Example 1. Geometry of the Bloch sphere . . . . . . . . . . . . . . . . . 132

5.2 Adiabatic evolution and Berry’s phase . . . . . . . . . . . . . . . . . . . . . . . 1335.2.1 The adiabatic approximation . . . . . . . . . . . . . . . . . . . . . . . . 1335.2.2 Berry’ phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1345.2.3 Example: Spin motion in a magnetic field . . . . . . . . . . . . . . . . . 135

5.3 Geometrical interpretation of electromagnetic fields . . . . . . . . . . . . . . . . 1395.3.1 Local spaces of quantum states . . . . . . . . . . . . . . . . . . . . . . . 1405.3.2 Parallel transport and gauge fields . . . . . . . . . . . . . . . . . . . . . 141

Page 5: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

Chapter 1

Quantum formalism

1.1 Summary of quantum states and observables

In this section we make a summary of fundamental assumptions and postulates of the quantumtheory. We stress the correspondence with classical theory, but at the same time focus on the rad-ically different way the quantum theory is interpreted. We summarize how an isolated quantumsystem is described in terms of abstract vectors and operators in a Hilbert space.

1.1.1 Classical and quantum states

The classical description of a system that is most closely related the standard quantum descriptionis the phase space descriptionof the system. The variables are the generalized coordinatesq = qi ; i = 1, 2, ..., N and the canonical momentap = pi ; i = 1, 2, ..., N. (Note, when noambiguity can arise we denote the whole sets of coordinates simply byq andp.) Each degree offreedom is represented by one coordinateqi with the corresponding momentumpi. A completespecification of the state of the system is given by the full set of coordinates and momenta(q, p),which identifies a point in phase space.

There is a unique time evolution of the phase space coordinates(q(t), p(t)), with a giveninitial condition (q0, p0) = (q(t0), p(t0)) at time t0. This is so, since the equation of motion,expressed in terms of the phase space coordinates islinear in time derivatives. For a Hamiltoniansystem the dynamics can be expressed in terms of the classical Hamiltonian, which is a functionof the phase space variablesH(q, p) and normally is identical to the energy function. The timeevolution is expressed by Hamilton’s equations as

qi =∂H

∂pi, pi = −∂H

∂qi(1.1)

By solving for the canonical momenta the dynamical equations can be expressed in terms ofthe coordinatesqi alone. These define theconfiguration spaceof the system. The Lagrangian

5

Page 6: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

6 CHAPTER 1. QUANTUM FORMALISM

of the system, which is related to the Hamiltonian by

L(q, q) =∑i

qipi −H(q, p) , qi =∂H

∂pi(1.2)

determines the dynamics in configuration space through the Euler-Lagrange equations,

∂L

∂qi− d

dt(∂L

∂qi) = 0 , i = 1, 2, ..., N (1.3)

This is second order in time derivatives and corresponds to Newton’s second law expressed ingeneral coordinates.

If a complete specification of the system cannot be given, a statistical description is oftenused. The state of the system is then described in terms of a probability functionρ(q, p) definedon the phase space, and this is the basis of a statistical mechanics description of the system. Thetime evolution is described through the time derivative ofρ,

d

dtρ =

∑i

(∂ρ

∂qiqi +

∂ρ

∂pipi) +

∂tρ

= ρ,HPB +∂

∂tρ (1.4)

where thePoisson bracket, defined by

A,BPB =∑i

(∂A

∂qi

∂B

∂pi− ∂B

∂qi

∂A

∂pi

)(1.5)

has been introduced. One should note that in Eq.(1.4)∂∂t

is the time derivative with fixed phasespace coordinates, whereasd

dtincludes the time variation due to the motion in phase space.

The time evolution ofρ (and any other phase space variable), when written in this way, showsa remarkable similarity with theHeisenberg equation of motionof the quantum system. Thecommutator between the variables then takes the place of the Poisson bracket.

The quantum description of the system discussed above also involves the (phase space)variables(q, p). But these dynamical variables are now re-interpreted as operators that act oncomplex-valued functions, the wave-functionsψ(q), which usually are defined as a functionsover the configuration space. To specify the variables as operators they are often writtenqi andpi. In the standard way we refer to these asobservables, and the fundamental relation betweenthese observables is the (Heisenberg) commutation relation

[qi, pj] = ihδij (1.6)

with h as Planck’s constant. A more general observableA may be viewed as a function ofqi andpi, and two observablesA andB will in general not commute. We usually restrict observables tobe hermitian operators, which correspond to real-valued variables in the classical description.

Page 7: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

1.1. SUMMARY OF QUANTUM STATES AND OBSERVABLES 7

There is a close relation between the classical description of a mechanical system and thecorresponding quantum description, and this relation is most explicit when the dynamics is ex-pressed in terms of phase space variables. The transition from the classical to the quantumdescription is referred to ascanonical quantizationand is in its simplest form described as atransition between classical variables and quantum observables

qi → qi, pi → pi (1.7)

where the quantum variables satisfy the fundamental commutation relation (1.6). This transitionbetween the classical and quantum description is often expressed in terms of a substitution be-tween Poisson brackets (for the classical variables) and commutators (for the quantum variables)

A,BPB →1

ih

[A, B

](1.8)

Clearly this simple substitution rule gives the right commutator betweenqi andpj when used onthe Poisson brackets betweenqi andpj. 1

Viewed the other way, the classical description can be seen as a special “classical limit” ofthe quantum description, where Planck’s constant disappears from the equations. This transitionto the classical description, as a limit of the quantum theory, is referred to as thecorrespondenceprinciple, and was by Bohr (and others) used as a guiding principle in the development of theearly form of quantum mechanics. For radiative transitions between atomic levels the correspon-dence principle implies that the radiation formula of the quantum theory reproduces the classicalone for highly excited atoms, in the limit where the excitation energy approaches the ionizationvalue.

The close relation between the classical and quantum dynamics is clearly seen in the simi-larity between the classical equations of motion and theHeisenberg equation of motionfor thequantum system. The latter is usually obtained from the former simply by the substitution (1.7).This correspondence relates directly toEhrenfest’s theorem, which states that the classical dy-namical equations keep their validity also in the quantum theory, if the classical variables arereplaced by their corresponding quantum expectation values. Thus, the quantum expectationvalue 〈q〉 in many respects behaves like a classical variableq, and the time evolution of theexpectation value follows a classical equation of motion. As long as the wave function is welllocalized (in theq-variable), the system is “almost classical”. However if the wave functionsmore spread or divides into separated parts, then highly “non-classical effects” may arise.

The close correspondence between the classical and quantum theory is in many respectsrather surprising, since the physical interpretation of the two theories are radically different. Thedifference is linked to the statistical interpretation of the quantum theory, which is the subjectof one of the later sections. Both classical and quantum descriptions of a system will often beof statistical nature, since the full information (especially for systems with a large number of

1In general there will, however be an ambiguity in this substitution in the form of the so calledoperator orderingproblem. Since classical observables commute, a composite variableC = AB = BA can be written in several ways.The corresponding quantum observables may be different due to non-commutativity,C = AB 6= BA = C ′. TheWeyl orderingis a way to solve the ambiguity by replacing a product by its symmetrized version,C = 1

2 (AB+BA).

Page 8: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

8 CHAPTER 1. QUANTUM FORMALISM

degree of freedom) may not be achievable. Often interactions with other systems (the surround-ings) disturb the system in such a way that only a statistical description is meaningful. If suchdisturbances are negligible the system is referred to as anisolatedor closed systemand for aclassical system all the dynamical variables can in principle be ascribed sharp values.

For a quantum system this is not the case. The quantum state of an isolated system is de-scribed by the wave functionψ(q) defined over the (classical) configuration space and this isinterpreted as anprobability amplitude. This means that the absolute value|ψ(q)|2 defines aprobability distribution in configuration space. For a general observableA this leads to a statis-tical variation or uncertainty in the measured value, expressed in terms of the variable

∆A2 =⟨(A−

⟨A⟩)2⟩

(1.9)

Even if such a probability distribution, in principle, can be sharp in the set of variablesq, itcannot at the same time be sharp in the conjugate variablesp due to the fundamental commutationrelation (1.6). This is quantified inHeisenberg’s uncertainty relation

∆qi∆pj ≥h

2δij (1.10)

The inherent probabilistic interpretation of the quantum theory in many respects make it moreclosely related to a statistical description of the classical system than to a detailed non-statisticaldescription. However, the standard description in terms of a wave functionψ(q) defined on theconfiguration space seems rather different from a classical statistical description in terms of aphase space probability distributionρ(q, p). Quantum descriptions in terms of functions similarto ρ(q, p) are possible (the so-calledWigner functionis of this type) and in some cases theyare useful description. But one important property of the wave functions is hidden in such areformulation. Thesuperposition principleis a fundamental principle of quantum mechanicswhich implies that the dynamical equations expressed in terms of the wave functions arelinearequations. The (quasi-) probability distributions derived from the standard quantum descriptionare quadratic inψ(q), and therefore the linearity is lost. In the classical statistical theory there isno counterpart to the superposition principle.

The description of quantum systems in terms of wave functions, defined as functions over theclassical configuration space, is only one of many equivalent “representations” of the quantumtheory. A more abstract formulation exists, based on superposition principle, where the states are(abstract) vectors in a complex vector space. Different representations of the theory correspondto different choices of basis in this vector space. In the following a summary of this abstract(and formal) description is given, in terms of what may be called the fundamental postulates ofquantum theory.

1.1.2 The fundamental postulates

Page 9: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

1.1. SUMMARY OF QUANTUM STATES AND OBSERVABLES 9

1. A quantum state of an isolated physical system is described by a vector with unit norm ina complex vector space (a Hilbert space)2 equipped with a scalar product.

In the Dirac notation a vector is represented by “ket”|ψ〉, which can be expanded in anycomplete set of basis vectors|ai〉,

|ψ〉 =∑i

ci|ai〉 (1.11)

where the coefficientsci are complex numbers. For an infinite dimensional Hilbert spacethe basis may be a discrete or continuous set of vectors. We refer to the vectors (kets) asstate vectorsand the vector space as thestate space.

A “bra” 〈ψ| is regarded as vector in the “dual vector space”, and is related to|ψ〉 by an“anti-linear” mapping (linear mapping + complex conjugation)

|ψ〉 → 〈ψ| =∑i

c∗i 〈ai| (1.12)

The scalar product is a complex-valued composition of a bra and a ket,〈φ|ψ〉 which is alinear function of|ψ〉 and an antilinear function of〈φ|.

2. The time evolution of the state vector,|ψ〉 = |ψ(t)〉, is (in the Schrodinger picture) definedby the Schrodinger equation, of the form

ih∂

∂t|ψ(t)〉 = H|ψ(t)〉 (1.13)

The equation is linear in the time derivative, which means that the time evolution|ψ〉 =|ψ(t)〉 is uniquely determined by the initial condition|ψ〉0 = |ψ(t0)〉. H is the Hamiltonianof the system which is alinear, hermitianoperator. It gives rise to a time evolution whichis aunitary, time dependent, mapping of the quantum states.

3. Each physical observable of a system is associated with a hermitian operator acting on theHilbert space. The eigenstates of each such operator form a complete orthonormal set.

With A as an observable, hermiticity means

〈φ|Aψ〉 = 〈Aφ|ψ〉 ≡ 〈φ|A|ψ〉 (1.14)

If the observable has a discrete spectrum, the eigenstates are orthogonal and may be nor-malized as

〈ai|aj〉 = δij (1.15)

2A Hilbert space is an (infinite dimensional) vector space with a scalar product (an inner product space) whichis completein the norm. This means that any (Cauchy) sequence of vectors|n〉, n = 1, 2, ..., where the norm of therelative vectors|n,m〉 = |n〉 − |m〉 goes to zero asn,m→∞ will have a limit (vector) belonging to the space.

Page 10: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

10 CHAPTER 1. QUANTUM FORMALISM

Completeness means ∑i

|ai〉〈ai| = 1 (1.16)

where1 is the unit operator. In general a hermitian operator will have partly a discreteand partly a continuous spectrum. For the continuous vectors orthogonality is expressed interms of Dirac’s delta function.3

4. If the system is in state|ψ >, then the result of a measurement of a physical observableA is one of the eigenvaluesan of the associated hermitian operator. The probability formeasuring a specific eigenvaluean is given by the square modulus of the scalar product ofthe state|ψ〉 with the (normalized) eigenvector|an〉,

pn = |〈an|ψ〉|2 (1.17)

If the observable has a degeneracy, so that several (orthogonal) eigenvectors have the sameeigenvalue, the probability is given as a sum over all eigenvectors with the same eigenvaluean. The expectation value of an observableA obtained by a measurement on the system is

〈A〉 =< ψ|A|ψ > . (1.18)

and is equal to the mean value obtained by an (infinite) sequence of identical measurementsperformed on the system, which before each measurement is prepared in the same state|ψ〉.

5. An idealmeasurement of observableA resulting in a valuean projects the state vectorfrom initial value|ψ > to final state

|ψ >→ |ψ′ >= Pn|ψ > . (1.19)

wherePn is the projection on the eigenstate|an〉, or more generally on the subspacespanned by the vectors with eigenvaluean.4

Note that since the projected state in general will not be normalized to unity, the stateshould also be multiplied by a normalization factor in order to satisfy the standard normal-ization condition for physical states.

The projection to the eigenstate which corresponds to the measured eigenvalue in a senseis a minimal disturbance of the system caused by the measurement. It is often referred toas the “collapse of the wave function”, and corresponds to the “collapse” of a probabilityfunction of a classical system when additional information is introduced in the description.But one should be aware of the far-reaching difference of this “collapse by adding newinformation” in the classical and quantum description.

3For an observable with a discrete spectrum the eigenstates are normalizable and belong to the Hilbert space.For a continuous spectrum the eigenstates are non-normalizable and fall outside the Hilbert space. They can beincluded in an extension of the Hilbert space. Completeness holds within this extended space, but orthonormality ofthe vectors has to be expressed in terms of Dirac’s delta function rather than the Kronecker delta.

4Such idealized measurements are often referred to asprojectivemeasurements.

Page 11: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

1.1. SUMMARY OF QUANTUM STATES AND OBSERVABLES 11

1.1.3 Coordinate representation and wave functions

A coordinate basis is a continuous set of eigenvectors of the coordinate observablesqi

qi |q〉 = qi |q〉 (1.20)

whereq denotes the set of coordinatesqi. For a Cartesian set of coordinates the standardnormalization is

〈q′|q〉 = δ(q′ − q) (1.21)

whereδ(q′ − q) is the N-dimensional Dirac delta-function, withN as the dimension of theconfiguration space. The wave functions defined over the configuration space of the systemare the components of the abstract state vector|ψ〉 on this basis,

ψ(q) = 〈q|ψ〉 (1.22)

A general observable is in the coordinate representation specified by its matrix elements

A(q′, q) ≡ 〈q′|A|q〉 (1.23)

It acts on the wave function as an integral operator

〈q|A|ψ〉 =∫dNq′A(q, q′)ψ(q′) (1.24)

A potential function is an example of alocal observable,

V (q′, q) = V (q) δ(q′ − q) (1.25)

The momentum operator isquasi-localin the sense that it can be expressed as a derivative ratherthan an integral

〈q|pi|ψ〉 = −ih ∂

∂qi〈q|ψ〉 (1.26)

Formally we can write the matrix elements of the momentum operator as the derivative of a deltafunction

〈q|pi|q′〉 = −ih ∂

∂qiδ(q − q′) (1.27)

(Check this by use of the integration formula for observables in the coordinate representation.)From the abstract formulation it is clear that the coordinate representation is only one of many

equivalent representations of quantum states and observables. Themomentum representationisdefined quite analogous to the coordinate representation, but now with the momentum states|p〉as basis vectors,

ψ(p) = 〈p|ψ〉 (1.28)

Page 12: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

12 CHAPTER 1. QUANTUM FORMALISM

The transition matrix elements between the two representations is (for Cartesian coordinates),

〈q|p〉 = (2πh)−N/2 exp(i

hq · p) (1.29)

with q · p =∑i qipi, which means that these two (conjugate) representations are related by a

Fourier transformation.Note that often a set of continuous (generalized) coordinates is not sufficient to describe the

wave function. For example, the spin variable of a particle with spin has discrete eigenvalues anddoes not have a direct counterpart in terms of a continuous classical coordinate. With discretevariables present the wave function can be described as a multicomponent function

ψm(q) = 〈q,m|ψ〉 (1.30)

wherem represents the discrete variable,e.g. the spin component in thez-direction.The coordinate representation and the momentum representation are only two specific ex-

amples ofequivalent representationsof the quantum system. In general the transitian matrixelement between two representations, defined by two orthonormal sets of basis vectors,|an〉and|bm〉 is,

Umn = 〈an|bm〉 (1.31)

and satisfies the condition∑m

UnmU∗mn′ =

∑m

〈an|bm〉〈bm|a′n〉 = δnn′ (1.32)

This means that it is aunitary matrix, and the corresponding representations are unitarily equiv-alent.

1.1.4 Spin-half system and the Stern Gerlach experiment

The postulates of quantum mechanics have far reaching implications. We have earlier stressedthe close correspondance between the classical (phase space) theory and the quantum theory.Now we will study a special representation of the simplest quantum system, thetwo-level system,where some of the basic differences between the classical and quantum theory are apparent.

The electron spin gives an example of aspin-half system, and when the (orbital) motion of theelectron is not taken into account the Hilbert space is reduced to a two-dimensional (complex)vector space. This two-dimensionality is directly related to the discovery of Stern and Gerlachof the two spin states of silver atoms. Their discovery is clearly incompatible with a classicalmodel of electron spin as due to the rotation of a small body.

We focus on the Stern-Gerlach experiment as shown schematically in Figure 1. A beam ofAg atoms is produced by an oven with a small hole. Atoms with velocity sharply peaked arounda given value are selected and sent through a strong magnetic field. Due to a weak gradient in thefield the particles in the beam are deflected, with a deflection angle depending on the component

Page 13: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

1.1. SUMMARY OF QUANTUM STATES AND OBSERVABLES 13

S

N

Figure 1.1:The Stern-Gerlach experiment. Atoms with spin 1/2 are sendt in a beam from an oven. Whenpassing between two magnets, the atoms are deflected vertically, with an angle depending on the verticalspin component. Classically a smooth distribution is expected, since there is no preferred directions for thespin. In reality only two directions are observed, consistent with the quantum prediction of quantizationof the spin.

of the magnetic moment in the direction of the gradient. The degree of deflection is measured byregistering the particles on a screen.

Let us first analyze the deflection from a classical point of view. We assume the atoms tohave a magnetic momentµ = −e/(mec)S, whereS is the intrinsic spin,e is the electron chargeandme is the electron mass. (The main contribution to the magnetic moment comes from theoutermost electron.) Between the magnets there is an approximately constant magnetic fieldB = Be1, wheree1 is a unit vector in the direction of the magnetic field. The spin will hererapidly presses around the magnetic field and the average value will be in the direction of themagnetic field,µ = µ1e1. There is a gradient in the magnetic field which produces a force onthe atom and changes its momentum

p = ∇(µ ·B) ≈ µ1∂1Be1 (1.33)

This shows that the deflection angle is proportional to the component of the magnetic momentalong the magnetic field. Since we expect the spin direction of the emitted Ag atoms to be ran-domly distributed in space we expect from this classical reasoning to se a continuous distributionof the atoms on the screen.

The surprising discovery of Stern and Gerlach was that there was no such continuous dis-tribution. Instead the position of the atoms were rather strongly restricted to two spots, whichaccording to the deflection formula would correspond to the two values

µ1 = ±µ . (1.34)

This result cannot be understood within classical theory.To see this, let us consider twohypotheticalsituations, that instead of measuring the compo-

nentµ1 (by measuring the deflection angle in thee1 direction), we had measuredµ2 along thedirectione2 rotated by an angle+ 1200 or µ3 along the directione3 rotated by an angle− 1200

relative toe1. By elementary vector addition it is clear that the sum of these (hypothetical) resultswould be

µ1 + µ2 + µ3 = µ · (e1 + e2 + e3) = 0 (1.35)

Page 14: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

14 CHAPTER 1. QUANTUM FORMALISM

But this is incompatible with the result of the Stern-Gerlach experiment, since the possible values± µ cannot be restricted to the componentµ1, but must, due to rotational symmetry, be the onlyvalues possible also forµ2 andµ3.

However, the result of the Stern-Gerlach experiment is consistent with the postulates of quan-tum mechanics, if we assume that the spin component in a given direction is an observable withonly two eigenvalues

Sx |±〉x = ± h2|±〉x (1.36)

When the components satisfy the spin algebra[Sx, Sy

]= ih Sz (+ cycl. perm.) (1.37)

then the component of the spin vector inanydirection will have the two eigenvalues± h/2.A typical feature of the spin operator is that its components in different directions do not

commute, they areincompatibleobservables. This means that they in general cannot be ascribedsharp values at the same time, since they do not have common eigenvectors. This incompatibilityis directly related to the paradox met above when we ascribed a (hypothetical) sharp value to themagnetic moment in all the three directionsµ1, µ2 andµ3. The situation that we have met here,that the components of a vector that can becontinuouslyrotated havediscreteeigenvalues, cannotbe explained within the framework of classical theory.

1.2 Quantum Dynamics

In this section we formulate the dynamical equation of a quantum system and compare the twounitarily equivalent descriptions, the Schrodinger and Heisenberg pictures. We then examine therather different Feynman’s path integral formulation of quantum dynamics.

1.2.1 The Schrodinger and Heisenberg pictures

The Schrodinger picture.The time evolution of an isolated quantum system is defined by theSchrodinger equation. Origi-nally this was formulated as a wave equation, but it can be reformulated as a differential equationin the (abstract) Hilbert space of ket-vectors as

ih∂

∂t|ψ(t)〉 = H|ψ(t)〉 (1.38)

With the state vector given for an initial timet0 it will determine the state vector at later timest (and also for earlier times) as long as the system stays isolated. The information about thedynamics is contained in the HamiltonianH, which usually can be identified with the energyobservable of the system. The original Schrodinger equation, described as a wave equation canbe viewed as the coordinate representation of Eq.(1.38).

Page 15: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

1.2. QUANTUM DYNAMICS 15

For a system described by phase space variables(q, p), the classical motion in phase spaceis described by the Hamiltonian functionH((q, p)), which (normally) is the classical energywritten as a function ofq andp. Canonical quantization implies that the quantum Hamiltonianis defined by the same function of the (quantum) phase space variables,H = H(q, p). However,one should note that in some cases there is an “operator order” ambiguity. This means that twoclassical expressions forH that are equivalent due to the commutativity of the classical phasespace variables, may be mapped into to two non-equivalent quantum observables, due to the non-commutativity of the quantum phase space variables. One way to resolve this ambiguity is theWeyl orderingwhich symmetrizes products inq andp. (However, when applying the formalismto a real physical system, the correct operator ordering may be viewed as following from thecomplete specification of the physical properties of the system.)

The time evolution of the state vector can be expressed in terms of a time evolution operatorU(t, t0), which is a unitary operator that relates the state vector of the system at timet with thatof time t0,

U(t, t0)|ψ(t0)〉 = |ψ(t)〉 (1.39)

The time evolution operator is determined by the Hamiltonian through the equation

ih∂

∂tU(t, t0) = HU(t, t0) (1.40)

which follows from the Schrodinger equation(1.13).WhenH is a time-independent operator, a closed form for the time evolution can be given

U(t− t0) = e−ihH(t−t0) (1.41)

(Note that a function of an observableH, like exp(− ihH(t− t0)) can be defined by its action on

the eigenvectors|E〉 of H,

e−ihH(t−t0)|E〉 = e−

ihE(t−t0)|E〉 .) (1.42)

If howeverH is time dependent so that the operator at different times do not commute, we mayuse a more general integral expression

U(t, t0) =∞∑n=0

(−ih

)n ∫ t

t0dt1

∫ t1

t0dt2 · · ·

∫ tn−1

t0dtn H(t1)H(t2) · · · H(tn)

(1.43)

where the term corresponding ton = 0 is simply the unit operator1. Note that the product ofthe the time dependent operatorsH(tk) is atime-ordered product.

The Heisenberg picture.The description of the quantum dynamics given above is usually referred to as the Schrodingerpicture. From the discussion of differentrepresentationsof the quantum system we know that

Page 16: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

16 CHAPTER 1. QUANTUM FORMALISM

unitary transformations of states and observables leads to a different, but equivalent representa-tion of the system. Thus, if we denote the states of a system by|ψ〉 and the observables byA andmake a unitary transformationU onall states andall observables,

|ψ〉 → |ψ′〉 = U |ψ〉 , A→ A′ = UAU † , U U † = 1 , (1.44)

then all matrix elements are left unchanged,

〈φ′|A′|ψ′〉 = 〈φ|U †UAU †U |ψ〉 = 〈φ|A|ψ〉 (1.45)

and since all measurable quantities can be expressed in terms of such matrix elements, the twodescriptions related by a unitary transformation can be viewed as equivalent. This is true alsowhenU = U(t) is a time dependent transformation.

The transition to theHeisenberg pictureis defined by a special unitary transformation

U(t) = U †(t, t0) (1.46)

This is the inverse of the time-evolution operator, and when applied to the time-dependent statevector of the Schrodinger picture it will simply cancel the time dependence of the state vector

|ψ〉H = U †(t, t0)|ψ(t)〉S = |ψ(t0)〉S (1.47)

Here we have introduced a subscriptS for the vector in the Schrodinger picture andH for theHeisenberg picture. (The initial timet0 is arbitrary and is often chosen ast0 = 0.) The timeevolution is now carried by the observables, rather than the state vectors,

AH(t) = U †(t, t0) AS U(t, t0) (1.48)

and the Schrodinger equation is replaced by the Heisenberg equation of motion

d

dtAH =

i

h[H, AH ] +

∂tAH (1.49)

One should note the difference between the (dynamical) time dependence of the observableAH(t) in the Heisenberg picture and the time dependence allowed for in the expression (1.43)for the time evolution operator. The latter is due to a possible (explicit) time dependence of theHamiltonian caused by time varying external influence on the system. This gives rise to timevariation of the observables in the Schrodinger picture. In the Heisenberg equation of motion ob-servables such a variation will give a contribution through the partial derivative in (1.49). Withoutsuch an explicit time dependence, the time evolution is caused only by the non-commutativity ofthe observable with the Hamiltonian.

The interaction pictureA third representation of the unitary time evolution of a quantum system is theinteraction picturewhich is particularly useful in the context of time-dependent perturbation theory. The Hamilto-nian is of the form

H = H0 + H1 (1.50)

Page 17: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

1.2. QUANTUM DYNAMICS 17

whereH0 is the unperturbed Hamiltonian andH1 is the (possibly time dependent) perturbation.The eigenvalue problem ofH0 we assume can be solved and the corresponding time evolutionoperator is

U0(t− t0) = e−ihH0(t−t0) (1.51)

The transition from the Schrodinger picture to the interaction picture is defined by acting withthe inverse of this on the state vectors

|ψ(t)〉I = U †0(t, t0) |ψ(t)〉S (1.52)

Note that the time variation of the state vector is only partly cancelled by this transformation,since the effect of the perturbationH1 is not included. The time evolution of the observables isgiven by

AI(t) = U †0(t, t0) AS U0(t, t0) (1.53)

This means that they satisfy the same Heisenberg equation of motion as for a system wherethe hamiltonian is simplyH = H0. The remaining part of the dynamics is described by theinteraction Hamiltonian

HI(t) = U †0(t, t0) H1 U0(t, t0) (1.54)

which acts on the state vectors through the (modified) Schrodinger equation

ih∂

∂t|ψ(t)〉I = HI(t)|ψ(t)〉I (1.55)

The corresponding time evolution operator has the same form as (1.43),

UI(t, t0) =∞∑n=0

(−ih

)n ∫ t

t0dt1

∫ t1

t0dt2 · · ·

∫ tn−1

t0dtn HI(t1)HI(t2) · · · HI(tn)

(1.56)

and this form of the operator gives a convenient starting point for a perturbative treatment of theeffect of HI . We shall apply this method when studying the interaction between photons andatoms in a later section.

We summarize the difference between the three pictures by the following table

States ObservablesSchrodinger time dependent time independentHeisenberg time independent time dependentInteraction time dependent time dependent

Page 18: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

18 CHAPTER 1. QUANTUM FORMALISM

x

t

(x0,t0)

(x1,t1)

Figure 1.2: The path integral as a “sum over histories”. All possible paths between the initial point(x0, t0) and the final point(x1, t1) contribute to the quantum transition amplitude between the points.The paths close to the classical path, here shown in dark blue, tend to be most important since theircontributions interfere constructively.

1.2.2 Path integrals

Feynman’s path integral method provides an approach to the dynamics of quantum systems thatis rather different from the methods outlined above. Instead of applying the standard descriptionof states as vectors in a Hilbert space, it focusses directly on transition matrix elements anddescribe these as integrals over classical trajectories of the system. The method is well suitedto describe quantum systems which can be expressed in terms of continuous variables, but fordiscrete variables like the intrinsic spin of the electron, it is not so well suited. (However, pathintegral methods which use “Grassman variables” can deal also with discrete variables.

The path integral method is used extensively in quantum field theory, and it has becomea standard method which is applied in many parts of physics. For this reason it is natural toinclude it also here. Although the path integral method can be viewed as a fundamental approachto quantum theory,i.e., a method that completely circumvents the standard description with statevectors and observables, it is often instead derived from the Hamiltonian formulation, and that isthe approach we shall also take.

Let us consider the time evolution of a quantum system as a waveψ(q, t) in configurationspace, whereq = q1, q2, ..., qN is a set of continuous (generalized) coordinates. In the “bra-ket”notation we write it as

ψ(q, t) = 〈q|ψ(t)〉= 〈q|U(t, t0)|ψ(t0)〉

=∫dNq′〈q|U(t, t0)|q′〉〈q′|ψ(t0)〉

≡∫dNq′〈q t|q′ t0〉 ψ(t0) (1.57)

The information about the dynamics of the system is encoded in the transition matrix element

〈q t|q′ t0〉 = 〈q|U(t, t0)|q′〉 (1.58)

Page 19: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

1.2. QUANTUM DYNAMICS 19

which gives the probability amplitude for the system which starts in the configurationq′ at timet0 to end up (in a position measurement) in the configurationq at a later timet. This amplitudeis often referred to as a propagator,G(q t, q′ t0) and is usually defined as being a “causal”propagator, in the sense that it only propagates forward in time.

G(q t, q′ t0) = 〈qt | qt0〉 t > t0

0 t < t0(1.59)

We will see how a path integral representation of this propagator can be found. For simplicitywe shall consider the case of a one-dimensional configuration space. We may visualize this as aparticle moving on a line.

First we note that the propagation betweent0 andt can be viewed as composed of the prop-agation between a series of intermediate timestk, k = 0, 1, ...n with tn = t and wheren may bearbitrary large

G(x t, x′ t0) =∫dxn−1...

∫dx2

∫dx1G(x t, xn−1 tn−1)...G(x2 t2, x1 t1)G(x1 t1, x

′ t0)

(1.60)

This follows from a repeated use of the composition rule satisfied by the time evolution operator

U(tf , ti) = U(tf , tm) U(tm, ti) (1.61)

whereti, tm and tf are arbitrary chosen times. In the expression (1.60) forG(x t, x′ t0) theintermediate timest1, t2... may also be arbitrarily distributed betweent0 andt, but for simplicitywe think of them as having a fixed distance

tk+1 − tk = ∆t ≡ (t− t0)/n (1.62)

To proceed we assume a specific form for the Hamiltonian,

H =1

2mp2 + V (x) (1.63)

which is that of a particle of massmmoving in a one-dimensional potentialV (x). The propagatorfor a small time inteval∆t is

G(x t+ ∆t, x′ t) = 〈x|e−ihH∆t|x′〉

≈ 〈x|e−ih

12m

p2∆te−ihV (x)∆t|x′〉

= 〈x|e−ih

12m

p2∆t|x′〉e−ihV (x′)∆t (1.64)

We have here made use of

ei(A+B)∆t = eiA∆teiB∆t +O(∆t2) (1.65)

whereA andB are two (non-commuting) operators and theO(∆t2) term comes from the com-mutator betweenA andB. In the present caseA = p2/(2mh), B = V /h, and these clearly do

Page 20: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

20 CHAPTER 1. QUANTUM FORMALISM

not commute. However, we will take the limit∆t → 0 (n → ∞) and this allows us to neglectthe correction term coming from the commutator. The x-space matrix element of the kinetic termcan be evaluated

< x|e−ih∆t p2

2m |x′ > =∫dp < x|p > e−

ih

p2

2m∆t < p|x′ >

=∫ dp

2πhe

ihp(x−x′)e−

ih

p2

2m∆t

=∫ dp

2πhe−

ih

∆t2m

(p−mx−x′∆t

)2eih∆tm

2

(x−x′∆t

)2

= N∆t ei

m(x−x′)22h∆t . (1.66)

whereN∆t is anx-independent normalization constant,

N∆t =∫ dp

2πhe−i

∆t2mh

(p−mx−x′∆t

)2

=∫ dp

2πhe−i

∆t2mh

p2

(1.67)

This last expression may look somewhat mysterious, since the integral does not seem to con-verge for largep. However, let us focus on a related integral, the Gaussian integral with arealcoefficient in the exponent. This is convergent,

+∞∫−∞

dpe−λ p2

=

√π

λ(1.68)

Now we shall simply assume that the correct expression forN∆t is found by making an analyticcontinuation of this result to imaginaryλ. This gives

N∆t =

√m

2πih∆t(1.69)

For the matrix element of the time evolution operator we now get,

< x|e−ih∆tH |x′ >= N∆t e

im(x−x′)2

2h∆t e−ihV (x′)∆t (1.70)

and withx′ → xk andx → xk+1 this can be used for each term in the factorized expression(1.60) for the propagator. The result is

G(x t, x′ t0) = (N∆t)n∫dxn−1...

∫dx2

∫dx1 e

ih∆t

n∑k=0

[m2

(xk+1−xk

∆t)2−V (xk) ]

(1.71)

The exponent can be further simpified in the limitn→∞,

i

h∆t

n∑k=0

[m

2(xk+1 − xk

∆t)2 − V (xk) ] → i

h

t∫t0

dt [1

2m(

dx

dt)2 − V (x) ] (1.72)

Page 21: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

1.2. QUANTUM DYNAMICS 21

where we have now assumed that the sequence of intermediate positionsxk (which we integrateover) in the limitn→∞ defines a differentiable curve. The expression we arrive at can be iden-tified as the (classical) action associated with the curve defined by the positionsxk as functionsof time,

S[x(t)] =

t∫t0

L(x, x) =

t∫t0

(1

2mx2 − V (x)) (1.73)

In the continuum limit (n→∞) we therefore write the propagator as

G(x t, x′ t0) =∫D[x(t)]e

ihS[x(t)] (1.74)

The meaning of this expression is that we integrate over all possible pathsx(t) which intercon-nect the initial pointx′ = x(t0) with the final pointx = x(t), with a complex weight factordetermined by the classical action of the path. Note that the discretized expression (1.71), withthe correct normalization factors included can now be thought of asdefiningthe integration mea-sureD[x(t)] of the path integral. However, we have noticed some complications in the pathintegral expression. Thus, it is difficult to control the oscillations in the factore

ihS[x(t)] unless we

impose some method (analytic continuation) to damp the exponential factor.In the above derivation we have used the Hilbert space description of quantum mechanics as

starting point and discretized the time variable in order to approach a situation where the interme-diate positions define a continuous curve. However, the path integral formulation, as originallydiscussed by Feynman, is usually considered to be a fundamental formulation of quantum me-chanics, not limited to the discrete formulation used here. This path integral description hasseveral appealing features. The quantum evolution can be thought of as a sum over “all possiblehistories” of the system, not only the one that is dynamically possible in a classical theory. In thissence it is quite general, not limited to a system with one dimension or with a given form of theHamiltonian. But in general we should expect that to give a precise definition to the integrationmeasure may be highly non-trivial. Even in the one-dimensional case we meet normalizationfactors that diverge in the continuum limit.

1.2.3 Path integral for a free particle

The discretization of time is convenient to discuss the connection between the Hamiltonian for-mulation and the path integral formulation of quantum mechanics. However, the discretization isnot faithful to the idea of the path integral as a sum over contributions from (continuous) paths,since the points for different times are integrated over independently. We will here try out an-other formulation which respects more the idea of paths, and apply it to the example of a freeparticle.

We then consider a path as a (continuous) curvex(t) which connects an initial pointx0 =x(t0) with a final pointx1 = x(t1), and denote the time difference asT = t1 − t0. With theendpoints of the curve fixed, an arbitrary curve between these points can be written as

x(t) = xcl(t) +∞∑n=1

cn sin(nπt− t0T

) (1.75)

Page 22: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

22 CHAPTER 1. QUANTUM FORMALISM

wherexcl(t) is a solution of the classical equation of motion with the given end points. Thedeviation from the classical curve is expanded in a Fourier series. We shall now interprete thepath integral as an independent integration over each Fourier componentscn. Note that forarbitrarily chosen coefficients we now have a continuous curve.

The action for a free particle is given by

S[x(t)] =

t1∫t0

dt1

2mx2

= S[xcl(t)] +1

2m

t1∫t0

dt∑nn′

cncn′nn′π2

T 2cos(nπ

t− t0T

) cos(n′πt− t0T

)

= S[xcl(t)] +1

2m

π∫0

dφ∑nn′

cncn′nn′π

Tcos(nφ) cos(n′φ)

= S[xcl(t)] +mπ2

4T

∑n

n2c2n (1.76)

For the propagator this gives

G0(x1 t1, x0 t0) =∫D[x(t)]e

ihS[x(t)]

= N eihS[xcl(t)]

∏n

∫dcne

imπ2

4Thn2c2n (1.77)

In the last expression we have interpreted the path integral as an integral over the Fourier coeffi-cientscn, but we have also introduced a normalization factor, which at this stage is unspecified.Also note that the integrals have to be made well-defined by the same trick as before, by changingthe imaginary coefficient to a real one. This gives

G0(x1 t1, x0 t0) = N eihS[xcl(t)]

∏n

(2

n

√iT h

mπ)

= N ′eihS[xcl(t)]

= N ′eih

12m

(x1−x0)2

T (1.78)

Note that the product is not well-defined separately fromN and is absorbed in the total normal-ization factorN ′, which we have not been able to determine. This factor depends on the presisedefinition of the path integral.

We could alternatively had used the discrete time definition to derive the propagator, andthis would also have determined the normalization factorN ′. However, in this simple case thepropagator can be evaluated directly, and we use the expression to check our result from the pathintegral formulation,

G0(x1 t1, x0 t0) = 〈x1|e−ih

p2

2mT |x0〉

Page 23: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

1.2. QUANTUM DYNAMICS 23

=∫dpe−

ih

p2

2mT 〈x1|p〉〈p|x0〉

=∫ dp

2πhe−

ih[ p2

2mT−p(x1−x0)]

=

√m

2πihTei

m(x1−x0)2

2hT (1.79)

This agrees with the expression (1.78) and determinesN ′. We note that the exponential factoris determined by the action integral of the classical path between the initial and final points. Asimilar expression for the propagator, involving the action of the classical path, can be found forany action that is quadratic in bothq andq.

1.2.4 The classical theory as a limit of the path integral

One of the advantages of the Feynman path integral is its close relation to the classical theory.This is clear already from the formulation in terms of the classical Lagrangian of the system. Letus formulate the path integral in the general form

G(qf tf , qi ti) =∫D[q(t)]e

ihS[q(t)]

=∫D[q(t)] exp(

i

h

tf∫ti

L(q, q)dt) (1.80)

where the LagrangianL(q, q) depends on a set of generalized coordinatesq = qi and theirderivativesq = qi. We note from this formulation that variations in the pathq(t) that give riseto rapid variations in the actionS[q(t)] tend to give contributions to the path integral that adddestructively. This is so because of the rapid change in the complex phase of the integrand.

The classical limit of a quantum theory is often thought of as a formal limith → 0. Fromthe expression for the path integral we note that smallerh means more rapid variation in thecomplex phase. This indicates that in the classical limit most of the paths will not contribute tothe path integral, since variations in the action from the neighbouring paths will tend to “washout” the contribution. The only paths which retain their importance are those where the action isstationary,i.e., where the action does not change under small variations in the path.

The stationary paths are characterized byδS = 0, with

δS =

tf∫ti

(∂L

∂qiδqi +

∂L

∂qiδqi)dt

=

tf∫ti

[∂L∂qi

− d

dt(∂L

∂qi)]δqidt (1.81)

In this expressionδqi denotes an (infinitesimal) variation in the path, andδqi the correspondingvariation in the time derivative. The last expression in (1.81) is found by a partial integration and

Page 24: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

24 CHAPTER 1. QUANTUM FORMALISM

applying the constraint on the variation that it vanishes in the end points,δqi(ti) = δqi(tf ) = 0.This constraint follows from the fact that the end points of the paths are fixed by the coordinatesof the propagator (1.89).

Thus, the important paths are those with stationary action, and these satisfy the Euler-Lagrangeequations,

∂L

∂qi− d

dt(∂L

∂qi) = 0 , i = 1, 2, ..., N (1.82)

sinceδS should be0 for all (infinitesimal) variations. In a Lagrangian formulation of the classicaltheory of a system with dynamics determined byL(q, q), these are exactly the classical equationsof motion. A simple example is provided by a particle moving in a three-dimensional potential,where

L(~r, ~r) =1

2~r

2 − V (~r) (1.83)

In this case it is straight forward to check that the Euler-Lagrange equations give the Newton’s2. law, in the form

m~r = −~∇V (~r) (1.84)

1.2.5 A semiclassical approximation

As discussed above, in the classical limit the solutions of the classical equations of motion appearas the only paths important to the path integral, since the action takes stationary values at thesepaths. This motivates a semiclassical approach where we include in the path integral only thepaths that are close to classical solutions. We write such a path in the form

q(t) = qcl(t) + η(t) (1.85)

whereq(t) may represent a set of coordinatesqi(t), i = 1, ..., N, and where the deviation froma classical solution,η(t), is assumed to be small. Sinceqcl(t) is supposed to satisfy the correctboundary conditions,η(t) should vanish at the endpoints of the action integral.

We introduce the approximation by assuming that the action can be expanded to second orderin η(t) and that higher orders can be neglected. Thus,

S[q(t)] = S[qcl(t)] + ∆S[q(t)] (1.86)

with

∆S[q(t)] =∑ij

∫dt

[∂2L

∂qi∂qjηiηj + 2

∂2L

∂qi∂qjηiηj +

∂2L

∂qi∂qjηiηj

]

≡∑ij

∫dt [Aij ηiηj + 2Bijηiηj + Cijηiηj] (1.87)

Page 25: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

1.2. QUANTUM DYNAMICS 25

1

2

SP

Figure 1.3:The double slit experiment. A beam of electrons is sent from the sourceS through a screenwith two holes. The intensity of electrons is then registered on a second screen behind the first one. Theintensity variations can be viewed as due to interference of the partial waves from the two slits. In thepath integral approach the intensity at a given pointP is determined by the difference in action of two(classical) paths from the source, passing through each of the openings, and and ending both inP .

In this approximation, withη(t) as a new set of variable, the Lagrangian is quadratic in thecoordinates and velocities. The path integral for such a Lagrangian can (in principle) be evaluatedand the general form is

G(qt, q0t0) = NeihS[qcl(t)] (1.88)

whereN is the contribution from the integral over pathsη(t). If the coefficientsAij,Bij andCijare time independent along the pathN will only depend of the length of the time interval, as wehave already seen for a free particle.

In some cases there may be more than one classical path connecting the two points(q0, t0)and(q, t). In that case the path integral is given by a sum over the classical paths

G(qt, q0t0) =∑cl

NcleihS[qcl(t)] (1.89)

and the strength of the transition amplitude depends on whether the contributions from differentpaths interfer constructively or destructively.

1.2.6 The double slit experiment revisited

As a simple example let us apply the semiclassical approximation to the double slit experiment,schematically shown in Fig.1.3, and compare with conditions for constructive interference asderived in an elementary way from interference between partial waves.

We then consider a beam of electrons with sharply defined energy that impinge on a screenwith two holes, as shown in the figure. Those of the electrons that pass the holes are registered on

Page 26: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

26 CHAPTER 1. QUANTUM FORMALISM

a second screen. When the experiment is running, an intensity distribution will gradually buildup, and this will show an interference pattern in accordance with the quantum mechanical wavepicture of the electrons.

When viewed as a propagating wave, the interference pattern is determined by the relativephase of the two partial waves, emerging from the two holes, at any given position on the screen.Since the phase of each wave (at a given time) is determined by the distance from the holemeasured in wave lengths, the condition for constructive interference is

∆L = nλ (1.90)

where∆L is the difference in length of the two straight lines from the holes to the point consid-ered on the second screen.λ is the de Broglie wave length of the electrons andn is an integer.

In the path integral description the conditions for constructive interference also depends onlength of paths, but now in a slightly different way. We consider paths beginning at the electronsource, with positionxP , at timet = 0 and ending at a point P on the second screen, with positionxP at a later timet = T . In principle all paths, passing through one of the two holes contributesto the path integral, but in the semiclassical approximation we only include contributions fromthe classical paths and those nearby. In this case there are two classical paths to a given point.The first one describes a free particle moving to the upper hole where it is scattered and sent asa free particle towards the pointP . On the second path the electrons is instead scattered in thelower hole and ends up in the same pointP .

Note that all paths that contribute corresponds to the same timeT spent between the initialand final point. This means that if one of the paths is longer, the particle has to move morerapidly on that path.

If we denote the length of the upper path asL1 ≡ L, the action of this path can be written as

S1 =1

2mv2

1T =1

2mL2

T(1.91)

where we have used the fact that the velocity is constant along the classical path. The action isexpressed as a function of the varable path length and the fixed time length. If the length of thesecond path is written asL2 = L+ ∆L, the action of this path is

S2 =1

2

L22

T=

1

2

(L+ ∆L)2

T= S1 +m

L∆L

T+

1

2m

∆L2

T(1.92)

To reproduce the result (1.90) from the simple analysis, we have to make the assumption thatthe term proportional to∆L2 is negligible. That is so if

mv

h

∆L2

L<< 2π (1.93)

By use of the de Broglie relationp = mv = 2π/λ, we find that the inequality can be written as

∆L <<√Lλ (1.94)

Page 27: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

1.3. TWO-LEVEL SYSTEM AND HARMONIC OSCILLATOR 27

This is satisfied whenL is much longer than the other relevant distances in the set up.The path integral (1.89), gets contributions from two terms, and is

G(xP T, xS 0) = N(eihS1 + e

ihS2) = Ne

ihS1(1 + e

ih∆S) (1.95)

with ∆S = S2 − S1. This can under the above conditions be written as

∆S = mv∆L = 2πh∆L

λ(1.96)

where again the de Broglie relations have been used. The condition for constructive interferenceis

1

h∆S = 2πn (1.97)

and with the expression (1.96) for∆S the condition (1.90) is reproduced.Thus, in this simple example we can demonstrate explicitly the relation between the effect of

interference between partial waves on one side and interference through amplitudes associatedwith classical paths on the other side. In fact, the path integral formulation is like a geometricaloptics approach to wave propagation and is in a sense complimentary toHuygens principlewherethe motion of a wave is due the combined effect of all the possible partial waves that are producedat an earlier wave front.

1.3 Two-level system and harmonic oscillator

The (one-dimensional) harmonic oscillator is an important system to study, both in the context ofclassical and quantum physics. One reason is that in many respects it is the simplest system withnon-trivial dynamics to study, and physicists always like to reduce more complicated problemsto harmonic oscillators if possible. There are also many physical systems that are well describedas harmonic oscillators. All periodic motions close to (stable) equilibrium can be viewed asapproximate harmonic oscillator motions, and the modes of free (non-interacting) fields can beviewed, both classically and quantum mechanically, as harmonic oscillators.

Even if the harmonic oscillator, in many respects, can be regarded as the simplest systemto study, there is a quantum mechanical system that in a sense is even more fundamental, thetwo-level system. A special realization of this is the spin-half system already discussed. Thereare also other physical systems which, to a good approximation, can be regarded as a two levelsystem. One example is an atomic system with two (almost) degenerate ground states, where thedynamics can be described as a transition (tunneling) between these two states. Atomic clocksare quantum systems of this type. Also for transitions between atoms with many energy levelsoften only two of the levels will be active in the transition and a two-level model is adequate.

As opposed to the harmonic oscillator there is no classical analogue to the quantum two-levelsystem. Classical spin does of course exist, but that correspond to “high-spin” quantum systemsrather than to the simplest spin-half system. In recent years the interest for two-level systemshave increased with the interest for “quantum information”, since the fundamental ”qubit” isdescribed by a two-level system.

In this section we will study these two fundamental systems, first the two-level system.

Page 28: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

28 CHAPTER 1. QUANTUM FORMALISM

1.3.1 The two-level system

The Hilbert space of this system is two-dimensional. Let us denote by|k〉, k = 0, 1 thebasis vectors of an arbitrarily chosen orthonormal basis. Any state vector can be represented asa two-component, complex matrixΨ,

|ψ〉 =1∑

k=0

ψk |k〉 ⇒ Ψ =(ψ1

ψ0

)(1.98)

An operatorA acting in this space has four independent components and can be represented as a2× 2 matrix A,

A|k〉 =∑l

Alk |l〉 ⇒ A =(A11 A12

A21 A22

)(1.99)

It can be expressed in terms of the unit matrix and the Pauli spin matrices as,

A = a01+∑m

amσm (1.100)

with

a0 =1

2(A11 + A22) , a1 =

1

2(A12 + A21) ,

a2 =i

2(A12 − A21) , a3 =

1

2(A11 − A22) (1.101)

WhenA is hermitian, all the coefficientsak are real. The Pauli matrices are represented as

σ1 =(

0 11 0

)σ2 =

(0 −ii 0

)σ3 =

(1 00 −1

)They define the fundamental commutator relations of the observables of the two level system,

[σi, σj] = 2iεijkσk (1.102)

whereεijk is the Levi-Civita symbol, which is totally antisymmetric in the indicesijk and satisfyε123 = 1 . The Pauli matrices also satisfy theanti-commutationrelations

σi, σj = 2δij (1.103)

Let us consider a general (“rotated”) Pauli matrix

σn = n · σ =(

cos θ e−iφ sin θeiφ sin θ − cos θ

)(1.104)

with n as a (three-component) unit vector. The two eigenstates of this operator are

|n〉 =(e−iϕ cos θ

2

sin θ2

), | − n〉 =

(−e−iϕ sin θ2

cos θ2

)(1.105)

Page 29: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

1.3. TWO-LEVEL SYSTEM AND HARMONIC OSCILLATOR 29

with |n〉 as the “spin up” state in then direction and|−n〉 as the “spin down” vector. We notethatanyvector in the two-dimensional state space can, up to a normalization factor, be writtenin the form of |n〉 for some vectorn. This implies that thephysically distinctstate vectors ofthe two-level system have the topology of a two-dimensional sphere. They can be identifieduniquely by the three-dimensional unit vectorn. Note however, the difference with a classicalconfiguration space with the topology of a sphere. In the latter case all the points on the spherecorrespond to independent configurations. In the quantum two-level model, there are only twoindependent states, |0〉 (the“south pole”) and|1〉 (the“north pole”). All other points on the spherecorrespond to linear superpositions of these.

1.3.2 Spin dynamics and magnetic resonance

The spin realization of the two-level system has already been briefly discussed in the context ofthe Stern-Gerlach experiment. We here consider the spin dynamics in a constant magnetic fieldin more detail and proceed to show how to solve a time-dependent problem, where the spin issubject to a periodic field.5

The basic observables of the spin half-system are the three components of the spin vector

S = (h/2)σ (1.106)

whereσ is a vector matrix with the three Pauli matrices as components. They correspond to thethree space components of the spin vector.

The observableS we will identify as the spin of an electron. The corresponding magneticmoment is given by

µ =e

mecS (1.107)

wheree is the electron charge andme is the electron mass. The spin Hamiltonian is

H = − e

mecB · S. (1.108)

The Heisenberg equation of motion for the spin, which follows from (1.49) is

d

dtSH = ωcn× SH , (1.109)

where

ωc = − eB

mec. (1.110)

and n is a unit vectorB = Bn.6 Eq.(1.109) has exactly the same form as the classical spinpresession equation withωc as the presession frequency. A natural interpretation is then that the

5The effect of a magnetic field on the electron spin, with its corresponding changes in the atomic levels, is usuallyreferred to as theZeeman effect.

6ωc may be chosen to be positive by choosingeB negative. For the electron, withe < 0 this means choosingnin the direction ofB, so thatB is positive. For a particle with positive chargee n is chosen in theoppositedirectionof B, so thatB is negative.

Page 30: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

30 CHAPTER 1. QUANTUM FORMALISM

quantum spin presesses in the magnetic field in the same way as the classical spin. However, asour discussion of the Stern-Gerlach experiment has shown the non-commutativity of the differentcomponents ofS makes the quantum spin variable qualitatively different from the classical spin.

The expression for the Hamiltonian (5.72) can be simplified by the choice of coordinate axes.If we choose the (1,2,3) components of the Pauli spin matrices to correspond to the(x, y, z)directions in space, andeB to point in the positivez-direction, the Hamiltonian gets the form

H = − e

mecB Sz

=1

2hωcσz. (1.111)

The corresponding time evolution operator is

U(t) = e−ihHt = e−

i2ωcσzt (1.112)

and shows explicitely the rotation around thez-axis.We now proceed to examine a two-state problem with atime-dependentHamiltonian, which

is directly relevant for the so-calledspin magnetic resonance. The system is the spin variable of a(bound) electron in a magnetic fieldB which now, in addition to a constant part has an oscillatingpart,

B = B0 k +B1(cosωt i + sinωtj). (1.113)

BothB0 andB1 are constants. The oscillating field is due to a circularly polarized electromag-netic field interacting with the electron.

The time variation ofB gives rise to a time-dependent Hamiltonian. Usually a time-dependentproblem like this is highly non-trivial, and can only be solved within some approximation scheme.But the present problem can be solved exactly. We show this by rewriting the Hamiltonian in thefollowing form

H = − e

mec[B0 k +B1(cosωt i + sinωtj)] · S

= − e

mec[B0Sz +B1(cosωt Sx + sinωt Sy)]

= − e

mece−

ihωtSz [B0Sz +B1Sx]e

ihωtSz (1.114)

where the last expression follows from the commutation relations between the components ofS.This form for the Hamiltonian shows that it can be transformed to a time-independent form bya unitary transformation. In fact, the Hamiltonian can be transformed into the spin Hamiltonian(5.72) that we have already discussed.

In order to show this we perform thetime-dependenttransformation

|ψ(t)〉 → |ψ(t)〉1 = eihωtSz |ψ(t)〉 (1.115)

Page 31: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

1.3. TWO-LEVEL SYSTEM AND HARMONIC OSCILLATOR 31

The transformed state vector satisfies the modified Schrodinger equation

ihd

dt|ψ(t)〉1 = [e

ihωtSzHe−

ihωtSz − ωSz]|ψ(t)〉1 (1.116)

This is the Schrodinger equation for a time-independent Hamiltonian of the form

H1 = eihωtSzHe−

ihωtSz − ωSz

= − e

mec[B0Sz +B1Sx]− ωSz

=1

2h[(ω0 − ω)σz + ω1σx] (1.117)

where we have introducedω0 = −eB0/(mec) andω1 = −eB1/(mec). The transformationabove can be interpreted as changing to a rotating reference frame, where the magnetic fieldlooks time-independent.

If we now introduce the parameters

Ω =√

(ω0 − ω)2 + ω21 (1.118)

and

cos θ =ω0 − ω√

(ω0 − ω)2 + ω21

, sin θ =ω1√

(ω0 − ω)2 + ω21

(1.119)

the transformed Hamilton gets the form

H1 =1

2hΩ (cos θσz + sin θσx) (1.120)

It has the same form as (1.111) except the magnetic field is rotated by an angleθ relative to thez-axis. Thus, the time evolution in the transformed frame is simply

U1(t) = e−i2Ωt (cos θσz+sin θσx) (1.121)

Now the time evolution operator, in the original frame, can be found by applying the timedependent transformation in reverse,

U(t) = e−i2ωtσz U1(t)

= e−i2ωtσze−

i2Ωt(cos θσz+sin θσx) (1.122)

which in matrix form is

U(t) =

(e−

i2ωt 0

0 ei2ωt

)(cos Ωt

2− i cos θ sin Ωt

2−i sin θ sin Ωt

2

−i sin θ sin Ωt2

cos Ωt2

+ i cos θ sin Ωt2

)

=

((cos Ωt

2− i cos θ sin Ωt

2)e−

i2ωt −i sin θ sin Ωt

2e−

i2ωt

−i sin θ sin Ωt2e

i2ωt (cos Ωt

2+ i cos θ sin Ωt

2)e

i2ωt

)(1.123)

Page 32: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

32 CHAPTER 1. QUANTUM FORMALISM

5 10 15 20

0.2

0.4

0.6

0.8

1

ω1t

|c1|2

0.0

1.0

2.0

Figure 1.4:Rabi oscillations in a two-level system, caused by an oscillating field. The oscillation in theoccupation of the upper level is shown as a function of time. The oscillation is shown for three differentvalues of the detuning parameter(ω0 − ω)/ω1.

The above result shows that the time-varying magnetic fieldB1 will induce oscillations inthe spin between the two eigenstates|0〉 and|1〉 of the time-independent spin HamiltonianH0 =−heB0/(mec)σz,

|ψ(t)〉 = c0(t)|0〉+ c1(t)|1〉 (1.124)

Let us choose as initial conditions,co(0) = 1, c1(0) = 0, which means that the spin starts in theground state ofH0. This gives,

c0(t) = (cosΩt

2+ i cos θ sin

Ωt

2)e

i2ωt

c1(t) = −i sin θ sinΩt

2e

i2ωt (1.125)

The time-dependent occupation probability of the upper level is

|c1(t)|2 = sin2 θ sin2 Ωt

2(1.126)

The amplitude of the oscillations is

sin2 θ =ω2

1

(ω0 − ω)2 − ω21

(1.127)

The expression shows aresonance effect, when the frequency of the oscillating field matches theenergy difference between the two levels,ω = ω0. For this frequency the maximum value of|c1(t)|2 is 1, which means that there is a complete transitions between the two levels|0〉 and|1〉during the oscillations. The frequency of the oscillations is

Ω =√

(ω0 − ω)2 + ω21 (1.128)

Page 33: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

1.3. TWO-LEVEL SYSTEM AND HARMONIC OSCILLATOR 33

with a resonance value

Ωres = |ω1| = |eB1

mec| (1.129)

Thus, the frequency depends on the amplitude of the oscillating field. In Fig.1.3.2 the oscillationsare shown as a function of time

The model studied here has important applications in the context ofnuclear magnetic reso-nanceand in atomic beam physics. It is often referred to as theRabi effect, and the oscillationfrequency is called theRabi frequency.

1.3.3 Harmonic oscillator and coherent states

The Hamiltonian of a one-dimensional (quantum) harmonic oscillator we write in the standardway as

H =1

2m(p2 +m2ω2x2) (1.130)

which means that it is realized as the energy observable of a particle with massm in the oscillatorpotentalV = (ω2/2m)x2.

The most elegant approach to solve the energy eigenvalue problem is the algebraic methodshown in all introductory text books on quantum mechanics. It is based on the closed commutatoralgebra formed by the operatorsx, p andH, 7

[x, p] = ih ,[x, H

]= i

h

mp ,

[p, H

]= −ihmω2x (1.131)

This is conveniently reformulated in terms of theraisingandloweringoperators

a =1√

2mhω(mωx+ ip) , a† =

1√2mhω

(mωx− ip) (1.132)

which gives

H =1

2hω(aa† + a†a)) (1.133)

The commutator algebra is now reformulated as[a, a†

]= 1 ,

[H, a

]= −ahω ,

[H, a†

]= a†hω (1.134)

We briefly summerize the construction of energy eigen states. SinceH is a positive definiteoperator, there is a lowest energy state, which is annihilated bya,

a|0〉 = 0 (1.135)

7The commutator algebra of the observables defines aLie algebra. When the Hamiltonian belongs to such afinite-dimensional Lie algebra, the general methods for findingrepresentationsof the Lie algebra can be used toconstruct the eigenstates and find the eigenvalues of the Hamiltonian.)

Page 34: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

34 CHAPTER 1. QUANTUM FORMALISM

The repeated application ofa† on this state generate a series of states|n〉, n = 0, 1, 2, ..., whichaccording to the commutator withH all are energy eigenstates. These states form a complete setof states in the Hilbert space. The explicit action of the operators on these states follows fromthe algebraic relations and are here summerized as

a|n〉 =√n|n− 1〉

a†|n− 1〉 =√n|n〉 ,

H|n〉 = hω(n+1

2)|n〉 (1.136)

Expansion of state vectors|ψ〉 in the orthonormal basis|n〉 defines a representation whichwe shall refer to as then-representation. The transition between this representation and thestandard coordinate (orx-) representation is defined by the matrix elements

〈x|n〉 ≡ ψn(x) (1.137)

For givenn this corresponds to the energy eigenfunction in the coordinate representation. Werefer to standard treatments of the harmonic oscillator, where these eigenstates are expressed interms of Hermite polynomials.

After this brief reminder on standard treatments of the harmonic oscillator, we turn to themain team of this section, which is discussion of the so-calledcoherent states. These are definedas the eigenstates of the annihilation operatora,

a|z〉 = z|z〉 (1.138)

What is unusual about this definition of states is thata is not a hermitian operator (and therforenot an observable in the usual sense). However, the states|z〉 defined in this way do form acomplete set, in fact an overcomplete set, and they define a new representation, thecoherentstate representationwith many useful properties.

Note that, sincea is non-hermitian, the eigenvaluesz will in general be complex rather thanreal. Based on the relation betweena with x andp it is useful to writez as,

z =1√

2mhω(mω xc + ipc) (1.139)

This indicates thatz can be interpreted as a complexphase spacevariable, with Rez proportionalto x and Imz proportinal top. However, sincex has already been used for the eigenvalues ofxandp for p, we have introducedxc andpc for the phase-space components ofz. Such a distinctionis necessary, since|z〉 is not an eigenstate forx and p, although bothx andp will be stronglypeaked aroundxc andpc, respectively. Expressed in terms of the expectation values we havehave

〈z|a|z〉 = z , 〈z|a†|z〉 = z∗ (1.140)

which for the expectation values of position and momentum gives

〈x〉z = xc , 〈p〉z = pc (1.141)

Page 35: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

1.3. TWO-LEVEL SYSTEM AND HARMONIC OSCILLATOR 35

To study the coherent states further we first focus on the ground state of the harmonic oscil-lator which is a coherent state withz = 0, as follows from (1.135). For the ground state we havethe following expectation values forx andp,

〈x〉0 = 〈p〉0 = 0

〈x〉20 =h

2mω〈0|(a+ a†)(a+ a†)|0〉 =

h

2mω

〈p〉20 = −mωh2〈0|(a− a†)(a− a†)|0〉 =

mωh

2(1.142)

From this follows that the uncertainties inx andp for the ground state satisfy

∆x20∆p

20 =

h

2mω

mωh

2=h2

4(1.143)

This is the minimum value for the product allowed by Heisenberg’s uncertainty principle. Thus,the ground state is aminimum uncertaintystate. A similar calculation for the excited states showthat they are not,

∆x2n∆p

2n =

h2

4(2n+ 1)2 (1.144)

We shall proceed to show that allcoherent states are minimum uncertainty states. Since they areoptimally focussed inx andp the coherent states are the quantum states that are closest to theclassical states, which are defined as points in phase space.

To show this we introduce the unitary operator

D(z) = e(za†−z∗a)

= eih(pcx−xcp) (1.145)

wherez is a complex number, related toxc and pc (1.139). It is the quantum version of adisplacement operator in phase space. It transforma anda† as

D(z)†aD(z) = a+ z , D(z)†a†D(z) = a† + ζ∗ (1.146)

which is shown by use of the operator identity

eBAe−B = A+[B, A

]+

1

2

[B,[B, A

]]+ ... (1.147)

It acts onx andp in the following way

D(z)†xD(z) = x+ xc , D(z)†pD(z) = p+ pc (1.148)

which explains the interpretation ofD as a displacement operator in phase space.With the displacement operatorD acting on the ground state a continuum of new states can

be generated,

|z〉 = D(z)|0〉 (1.149)

Page 36: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

36 CHAPTER 1. QUANTUM FORMALISM

and it is straight forward to demonstrate from the above relations that these are coherent statesas defined by (1.138). The eigenvaluesz take all values in the complex plane, which means thatthere is one coherent state for each point in the (two-dimensional) phase space. Furthermore,sinceD(z) simply adds a constant to observablesx andp, which means thatx− 〈x〉 andp− 〈p〉are unchanged by the displacements, the shifted state|z〉 has the same uncertainy inx andp asthe ground state. Thus, all coherent states|z〉 are minimum incertainty states.

Coherent states in the coordinate representationThe coordinate representation of the coherent states are defined by

ψz(x) = 〈x|z〉 (1.150)

where thebra corresponds to a postion eigenstate and theketto a coherent state. Thez = 0 stateis the ground state of the harmonic oscillator and is known to have the gaussian form

ψ0(x) = (mω

πh)

14 e−

mω2hx2

(1.151)

This expression can readily be generalized to arbitrary coherent states, since they satisfy a lineardifferential equation

1√2mhω

(mωx+ hd

dx)ψz(x) = zψz(x) (1.152)

or,

d

dxψz(x) = (−mω

hx+

√2mω

hz) ψz(x) (1.153)

The equation has the solution

ψz(x) = Nze−(mω

2hx2−√

2mωh

zx) (1.154)

whereNz is az-dependent normalization factor. We rewrite it in the form

ψz(x) = N ′ze

−(mω2h

(x−xc)2− ihxpc) (1.155)

with a new normalization factorN ′z.

The phase space coordinatesxc andpc correspond to the real and imaginary parts ofz asgiven by (1.139) The normalization integral determinesN ′

z to be, up to a phase factor, the sameas the prefactor of the ground state wave function (1.151) . At this stage the phase factor isarbitrary. However, implicitely this phase has already been fixed by the definition (1.149). Toshow this we shall find the expression forψz(x) in alternative, more direct way.

ψz(x) = 〈x|D(z)|0〉= 〈x|e

ih(pcx−xcp)|0〉

= e−i

2hxcpc〈x|e

ihpcxe−

ihxcp)|0〉

= e−i

2hxcpce

ihpcxe−xc

ddx 〈x|0〉

= e−i

2hxcpce

ihpcxψ0(x− xc) (1.156)

Page 37: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

1.3. TWO-LEVEL SYSTEM AND HARMONIC OSCILLATOR 37

In this calculation we have used thatexp(− ihxcp) is a translation operator inx-space. With the

ground state wave function given by (1.151), the full expression for the coherent state in thex-representation is

ψz(x) = (mω

πh)

14 e−(mω

2h(x−xc)2− i

hxpc+

i2hxcpc) (1.157)

This expression agrees with (1.155) and also gives the expression for thex-independent phasefactor of the normalization factor.

Time evolution of coherent statesIn the Heisenberg picture the time evolution of the creation and annihilation operators are

a†(t) = U(t, 0)† a† U(t, 0)

= eiωa†aa†e−iωa

†a

= eiωta† (1.158)

and

a(t) = U(t, 0)† a U(t, 0)

= eiωa†aae−iωa

†a

= e−iωta (1.159)

From the last one follows,

a U(t, 0)|z〉 = e−iωtU(t, 0)a|z〉 = e−iωtz U(t, 0)|z〉 (1.160)

which gives the time evolution

U(t, 0)|z〉 = eiα(t)|e−iωtz〉 (1.161)

whereα(t) is an undetermined complex phase. The equation shows that a coherent state con-tinues to be a coherent state during the time evolution. This means that it keeps it property ofmaximal localization in the phase space variables. The motion is given by

z(t) = e−iωt z(0) (1.162)

which means for the real and imaginary parts

mω xc(t) = cosωt mω xc(0) + sinωt pc(0)

pc(t) = cosωt pc(0)− sinωt mω xc(0) (1.163)

This shows that the coherent state moves in such a way that the phase space variablesxc andpc change in exactly the same way as the variables of a classical harmonic oscillator. This isconsistent withEhrenfest’s theorem, sincexc andpc coincide with the expectation values〈x〉

Page 38: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

38 CHAPTER 1. QUANTUM FORMALISM

and 〈p〉. Since the coherent states keep their (maximal) localization, they are as close as we canget to classical states within the quantum description.

The coherent state representationThe coherent states are expressed in then-representation in the following way

〈n|z〉 = 〈n|D(z)|0〉= 〈n|e(za†−z∗a)|0〉= 〈n|e−

12|z|2eza

†e−z

∗a|0〉= e−

12|z|2〈n|eza†|0〉

= e−12|z|2〈n|

∞∑m=0

zm

m!(a†)m|0〉

= e−12|z|2 zn√

n!(1.164)

From this follows that the overlap between two coherent states is

〈z|z′〉 =∑n

〈z|n〉〈n|z′〉

= e−12(|z|2+|z′|2)

∑n

(z′z∗)n

n!

= e−12(|z|2+|z′|2)+z′z∗ (1.165)

and the absolute value is

|〈z|z′〉| = e−|z−z′|2 (1.166)

The coherent states corresponding to two different values ofz are not orthogonal states, but theoverlap falls off exponentially fast with the distance between the two points. This overlap givesa measure of the intrinsic uncertainty of the coherent state as a probability amplitude in phasespace.

An interesting property of the coherent states is that, even if they are not orthogonal, theysatisfy a completeness relation. To see this we calculate the integral∫

d2z|z〉〈z| =∫d2ze−|z|2 ∑

n,m

znz∗m√n!m!

|n〉〈m|

=

2π∫0

∞∫0

drre−r2 ∑n,m

r(n+m)

√n!m!

eiθ(n−m)|n〉〈m|

= 2π

∞∫0

dre−r2 ∑

n

r2n+1

n!|n〉〈n|

= π∑n

|n〉〈n|

= π 1 (1.167)

Page 39: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

1.3. TWO-LEVEL SYSTEM AND HARMONIC OSCILLATOR 39

We rewrite this as the completeness relation∫ d2z

π|z〉〈z| = 1 (1.168)

With the help of the completeness relation thecoherent state representationcan be definedas an alternative to the coordinate representation and the momentum representation. The wavefunction, which is a function of the complex phase space variablez, is defined by the state vector|ψ〉 as

ψ(z) = 〈z|ψ〉 (1.169)

and the inverse relation is

|ψ〉 =∫ d2z

π|z〉〈z|ψ〉 =

∫ d2z

π|z〉ψ(z) (1.170)

One of the implications of the above relation is that the coherent states do not form a linearlyindependent set of states, they form instead an overcomplete set. Thus,

|z〉 =∫ d2z′

π|z′〉〈z′|z〉 =

∫ d2z′

π|z′〉e−

12(|z|2+|z′|2)+z′∗z (1.171)

which demonstrates the lack of linear independence. A consequence of this is that an expansionof a state vector|ψ〉 in terms of the set of coherent states is not uniquely defined. Nevertheless,the expansion given by (1.170)is uniquebecause of constraints that implicitly are posed on thewave functionsψ(z). To se this we rewrite it in terms of then-representation

ψ(z) =∑n

〈z|n〉〈n|ψ〉

=∑n

〈n|ψ〉e−12|z|2 z

∗n√n!

≡ e−12|z|2f(z∗) (1.172)

The function

f(z∗) =∑n

〈n|ψ〉 z∗m√n!

(1.173)

is ananalytic functionof z∗ since it depends only onz∗ and not onz. This is the constrainton the wave functionsψ(z) that makes the coherent state representation well-defined,the wavefunctions are up to a common factore−

12|z|2 restricted to be analytic functions.

Thus, wave functions and observables of the originally one-dimensional problem can berewritten in terms of analytic functions defined on the two-dimensional phase space. One should,however, be aware of the fact that that several relations in this representation are unfamiliar, be-cause of the non-orthogonality between the basis states|z〉.

The coherent states are important in many respects because of their close relation with clas-sical states. They were introduced in the context of the quantum description of light, where theydescribe states ofclassical lightwithin the quantum theory.

Page 40: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

40 CHAPTER 1. QUANTUM FORMALISM

1.3.4 Fermionic and bosonic oscillators: an example of supersymmetry

There is a formal similarity between the two-level system and the harmonic oscillator which weshall examine in this section. To make the similarity explicit, we write the Hamiltonian of thetwo-level system as

HF =1

2hωσz (1.174)

and introduce the raising and lowering operators

b† = σ+ =1

2(σx + iσy) , b = σ− =

1

2(σx − iσy) (1.175)

In matrix form the operators are are

HF =1

2hω

(1 00 −1

), b† =

(0 10 0

), b =

(0 01 0

)We now have the algebraic relations

b, b†

= 1 , HF =

1

2hω

[b†, b

](1.176)

whereb, b†

is the anticommutatorbb† + b†b. The corresponding relations for a harmonic

oscillator are [a, a†

]= 1 , HB =

1

2hω

a†, a

(1.177)

We note that the (formal) transition between the two systems corresponds to interchanging com-mutators with anticommutators.

There are many physical realizations of these two systems. We will now focus on a simplemany-particle realization. Let us assume that a single state is available for many identical par-ticles. The state may be considered to be one of the field modes of a free field. The identicalparticles may be either fermions or bosons.

In the fermion case the the state space will contain only two states.|0〉 is thevacuum statewith no particle present and|1〉 is the excited state with one particle present. Due to the Pauliexclusion principle the single-particle state cannot be occupied by more than one particle. Theoccupation energy of this state ishω. With this interpretation of the two-level system the operatorb† is a creation operatorfor a fermion andb is an annihilation operator. The Hamiltonian isdefined so that the vacuum energy is− 1

2hω.

In the boson case there is an infinite number of states, since the single-particle state can beoccupied by an arbitrary number of particles. The states|n〉 now are interpreted as states withnbosons present. The operatora† is a creation operator for bosons anda an annihilation operator.The boson vacuum state has a vacuum energy+ 1

2hω which is the ground state energy of the

harmonic oscillator.

Page 41: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

1.3. TWO-LEVEL SYSTEM AND HARMONIC OSCILLATOR 41

a+

Q

b+

Ehω

0

1

2

3

4

5

Figure 1.5:The energy spectrum of a supersymmetric oscillator. The bosonic creation operatora† actsvertically, while the fermionic creation operatorb† acts in the diagonal direction. The superchargeQ isa symmetry operator that maps between pairs of degenerate excited states. The ground state which isnon-degenerate is annihilated byQ andQ†.

With the interpretation above in mind we may refer to the two-level system as afermionicoscillatorand the standard harmonic oscillator as abosonic oscillator.

In recent years the idea has been extensively developed that nature has a (hidden) symmetrybetween fermions and bosons calledsupersymmetry. There is at this stage no physical evidencefor the presence of this symmetry as a fundamental symmetry of nature. Nevertheless the ideahas been pursued, since supersymmetry is an important input instring theoriesandsupergravitytheories.

We discuss here a simple realization of supersymmetry (or Fermi-Bose symmetry) as a sym-metry for the two-level model and the quantum harmonic oscillator.

The symmetrized model has a Hamiltonian that can be written as a sum of the two Hamilto-nians

H = HF + HB =1

2hω[

a†, a

+[b†, b

]] (1.178)

Since the level spacing of the two Hamiltonians have been chosen to be equal there is a dou-ble degeneracy of all the excited levels, while the ground state is non-degenerate, as shown inFig.1.3.4.

The supersymmetry is made explicit in terms of asupercharge, defined as

Q =√hω a†b , Q† =

√hω ab† (1.179)

Together with the Hamiltonian it defines asupersymmetry algebraQ, Q†

= H

Page 42: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

42 CHAPTER 1. QUANTUM FORMALISM

[Q, H

]=

[Q†, H

]= 0

Q, Q

= 2Q2 = 0Q†, Q†

= 2Q†2 = 0 (1.180)

This is not a standard Lie algebra, since it involves both commutators and anticommutators. It isreferred to as agraded Lie algebra. Q andQ† are the fermionic (or odd) elements andH is thebosonic (or even) element of this graded algebra.

The supersymmetry gives, as a general feature, a vacuum energy which is0, due to cancel-lation of the contributions from the bosonic and fermionic variables. This type of cancellationis important insupersymmetric quantum field theories, where the divergent contributions to thevacuum energy are avoided.

Page 43: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

Chapter 2

Quantum mechanics and probability

2.1 Classical and quantum probabilities

In this section we extend the quantum description to states that include “classical” uncertaintiesin addition to quantum probabilities. Such additional uncertainties may be due to lack of a fullknowledge of the state vector of a system or due to a description of the system as member of astatistical ensemble. But it may also be due toentanglement, when the system is the part of alarger quantum system.

2.1.1 Pure and mixed states, the density operator

The state of a quantum system, described either as a wave function or an abstract vector inthe state space, has aprobability interpretation. Thus, the wave function is referred to as aprobability amplitudeand it predicts the result of a measurement performed on the system onlyin a statistical sense. The state vector therefore characterizes the state of the quantum system ina way that seems closer to the statistical description of a classical system than to a detailed, non-statistical description. However, in the standard interpretation, this uncertainty about the result ofa measurement performed on the system is not ascribed to lack of information about the system.We refer to the quantum state described by a (single) state vector, as apure stateand considerthis to containmaximum available informationabout the system. Thus, if we intend to acquirefurther information about the state of the system by performing measurements on the system,this will in general lead to a change of the state vector which we interprete as areal modificationof the physical state due to the action of the measuring device. It cannot be interpreted simply asa (passive) collection of additional information.1

1An interesting question concerns the interpretation of a quantum state vector|ψ〉 as being theobjectivestateassociated with asinglesystem. An alternative understanding, as stressed by Einstein, claims that the probabilityinterpretation implies that the state vector can only be understood as anensemble variable, it describes the state ofan ensemble of identically prepared systems. From Einstein’s point of view this means that additional information,beyond that included in the wave function, shouldin principle be possible to acquire when we consider singlesystems rather than ensembles.

43

Page 44: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

44 CHAPTER 2. QUANTUM MECHANICS AND PROBABILITY

In reality we will often have less information about a quantum system than the maximallypossible information contained in the state vector. Let us for example consider the spin state ofthe silver atoms emerging from the oven of the Stern-Gerlach experiment. In principle each atomcould be in a pure spin state|n〉 with a quantized spin component in then-direction. However,the collection of atoms in the beam clearly have spins that are isotropically oriented in space andthey cannot be described by a singel spin vector. Instead they can be associated with a statisticalensemble of vectors. Since the spins are isotropically distributed, this indicates that the ensembleof spin vectors|n〉 should have a uniform distribution over all directionsn. However, since thevectors of this ensemble are not linearly independent, the isotropic ensemble is equivalent toanother ensemble with only two states, spin up and spin down inany fixed directionn, withequal probability for the two states.

We therefore proceed to consider situations where a system is described not by a single statevector, but by an ensemble of state vectors,|ψ〉1, |ψ〉2, ..., |ψ〉n with a probability distributionp1, p2, ..., pn defined over the ensemble. We may consider this ensemble to contain bothquan-tum probabilitiescarried by the state vectors|ψ〉k andclassical probabilitiescarried by thedistributionpk. A system described by such an ensemble of states is said to be in amixedstate. There seems to be a clear division between the two types of probabilities, but as we shallsee this is not fully correct. There are interesting examples of mixed states with no clear divisionbetween the quantum and classical probabilities.

The expectation value of a quantum observable in a state described by an ensemble of statevectors is

〈A〉 =n∑k=1

pk 〈A〉k =n∑k=1

pk〈ψk|A|ψk〉 (2.1)

This expression motivates the introduction of thedensity operatorassociated with the mixedstate,

ρ =n∑

k=n

pk |ψk〉〈ψk| (2.2)

The corresponding matrix, defined by reference to an (orthogonal) basis, is called thedensitymatrix,

ρij =n∑

k=n

pk 〈φi|ψk〉〈ψk|φj〉 (2.3)

The important point to note is thatall measurable information about the mixed state is containedin the density operator of the state, in the sense that the expectation value ofanyobservable canbe expressed in terms ofρ,

〈A〉 =n∑k=1

pk∑ij

〈ψk|φi〉〈φi|A|φj〉〈φj|ψk〉

=∑i

n∑k=1

pk〈φj|ψk〉〈ψk|A|φj〉

= Tr(ρA) (2.4)

Page 45: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

2.1. CLASSICAL AND QUANTUM PROBABILITIES 45

The density operator satisfies certain properties, in particular,

a) Hermiticity : ρ† = ρ ⇒ pk = p∗kb) Positivity : 〈χ|ρ|χ〉 ≥ 0 for all |χ〉 ⇒ pk ≥ 0

c) Normalization : Tr ρ = 1 ⇒∑k

pk = 1 (2.5)

Thus, the eigenvalues are real and non-negative with a sum that equals 1. These conditions followfrom (2.2) with the coefficientspk interpreted as probabilities. We also note that

Tr ρ2 =∑k

p2k ⇒ 0 < Tr ρ2 ≤ 1 (2.6)

This inequality follows from the fact that for all eigenvaluespk ≤ 1, which means thatTr ρ2 ≤Tr ρ.

The pure states are the special case where one of the probabilitiespk is equal to1 and theothers are0. In this case the density operator is the projection operator on a single state,

ρ = |ψ〉〈ψ| ⇒ ρ2 = ρ (2.7)

ThereforeTr(ρ2) = 1 for a pure state, while for all the (truly) mixed statesTr(ρ2) < 1.A general density matrix can be written in the form

ρ =∑k

pk|ψk〉〈ψk| (2.8)

where the states|ψk〉 may be identified as members of a statistical ensemble of state vectorsassociated with the mixed state. Note, however, thatthis expansion is not unique. There aremany different ensembles that give rise to the same density matrix. This means that even if allinformation about the mixed state is contained in the density operator in order to specify theexpectation value of any observable, there may in principle be additional information availablethat specifies the physical ensemble to which the system belongs.

An especially useful expansion of a density operator is the expension in terms of its eigen-states. In this case the states|ψk〉 are orthogonal and the eigenvalues are the probabilitiespkassociated with the eigenstates. This expansion is unique unless there are eigenvalues with de-generacies.

As opposed to the pure states, the mixed states are not providing the maximal possible in-formation about the system. This is due to the classical probabilities contained in the mixedstate, which to some degree makes it similar to a statistical state of a classical system. Thus,additional information may in principle be obtainable without interacting with the system. Forexample, two different observers may have different degrees of information about the systemand therefore associate different density matrices to the system. By exchanging information theymay increase their knowledge about the system without interacting with it. If, on the other hand,two observers havemaximalinformation about the system, which means that they both describeit by a pure state, they have to associate the same state vector with the system in order to have aconsistent description.

Page 46: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

46 CHAPTER 2. QUANTUM MECHANICS AND PROBABILITY

A mixed state described by the density operator (2.8) is sometimes referred to as anincoher-entmixture of the states|ψk〉. A coherentmixture is instead asuperpositionof the states,

|ψ〉 =∑k

ck|ψk〉 (2.9)

which then represents a pure state. The corresponding density matrix can be written as

ρ =∑k

|ck|2|ψk〉〈ψk|+∑k 6=l

c∗kcl|ψk〉〈ψl| (2.10)

Comparing this with (2.8) we note that the first sum in (2.10) can be identified with the incoherentmixture of the states (withpk = |ck|2). The second sum involves theinterferenceterms of thesuperposition and these are essential for the coherence effect. Consequently, if the off-diagonalmatrix elements of the density matrix (2.10) are erased, the pure state is reduced to a mixed statewith the same probabilitypk for the states|ψk〉.

The time evolution of the density operator for an isolated (closed) system is determined bythe Schrodinger equation. As follows from the expression (2.8) the density operator satisfies thedynamical equation

ih∂

∂tρ =

[H, ρ

](2.11)

This looks similar to the Heisenberg equation of motion for an observable (except for a signchange), but one should note that Eq.(2.11)is valid in the Schrodinger picture. In the Heisenbergpicture, the density operator, like the state vector is time independent.

One should note the close similarity between Eq.(2.11) and Liouville’s equation for the clas-sical probability densityρ(q, p) in phase space

∂tρ = ρ,HPB (2.12)

where, PB is the Poisson bracket.

2.1.2 Entropy

In the same way as one associatesentropywith statistical states of a classical system, one as-sociates entropy with mixed quantum states as a measure of the lack of (optimal) informationassociated with the state. Thevon Neuman entropyis defined as

S = −Tr(ρ log ρ) (2.13)

Rewritten in terms of its eigenvalues ofρ it has the form

S = −∑k

pk log pk (2.14)

Page 47: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

2.1. CLASSICAL AND QUANTUM PROBABILITIES 47

which shows that it is closely related to the entropy defined in statistical mechanics and in infor-mation theory.2

The pure states are states with zero entropy. For mixed states the entropy measures “how faraway” the state is from being pure. The entropy increases when the probabilities get distributedover many states. In particular we note that for a finite dimensional Hilbert space, a maximalentropy state exist where all states are equally probable. The corresponding density operator is

ρmax =1

n

∑k

|k〉〈k| = 1

n1 (2.15)

wheren is the dimension of the Hilbert space and|k〉 is an orthonormal set of basis vectors.Thus, the density operator is proportional to a projection operator that projects on the full Hilbertspace. The corresponding maximal value of the entropy is

Smax = log n (2.16)

A thermal stateis a special case of a mixed state, with a (statistical)Boltzman distributionover the energy levels. It is described by atemperature dependentdensity operator of the form

ρ = Ne−βH (2.17)

whereβ = 1/(kBT ) andN is a normalization factor. It is given by

N−1 = Tre−βH =∑k

e−βEk (2.18)

in order to giveρ the correct normalization (2.5) consistent with the probability interpretation.In the above expressionEk are the energy eigenvalues of the system. The close relation betweenthe normalization factor and thepartition functionin (classical) statistical mechanics is apparent.

Thequantum statistical mechanicsis based on the definition of density matrices associatedwith statistical ensembles. Thus, the density matrix (2.17) is associated with thecanonical en-semble. Furthermore thethermodynamic entropyis, in the quantum statistical mechanics, iden-tical to the von Neuman entropy (2.13) apart from a factorkB, theBoltzmann constant. We willin the following make some futher study of the entropy, but focus mainly on itsinformationcon-tent rather than thermodynamic relevance. Whereas the thermodynamic entropy is most relevantfor systems with a large number of degrees of freedom, the von Neuman entropy (2.13) is alsohighly relevant for small systems in the context of quantum information.

2.1.3 Mixed states for a two-level system

For the two-level system we can give an explicit (geometrical) representation of the densitymatrices of mixed (and pure) states.

2In statistical mechanics the natural logarithm is usually used in the definition of entropy, but in informationtheory the base 2 logarithm is more common. There is no basic difference between these choices, since the differenceis only a state independent, constant factor.

Page 48: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

48 CHAPTER 2. QUANTUM MECHANICS AND PROBABILITY

Since the matrices are hermitian, with trace1, they may be written as

ρ =1

2(1+ r · σ) (2.19)

wherer is a three-component vector an1 is the 2x2 unit matrix. The condition for this to be apure state is

ρ2 = ρ

⇒ 1

4(1+ r2 + 2r · σ) =

1

2(1+ r · σ)

⇒ r2 = 1 (2.20)

Thus, a pure state can be written as

ρ =1

2(1+ n · σ) (2.21)

with n as a unit vector. This represents a projection operator that projects on the one-dimensionalspace spanned by thespin upeigenvector ofn·σ. We note that this is consistent we the discussionof section 1.3.1, where it was shown thatany statein the two-dimensional space would be thespin up eigenstate of the operatorn · σ for some unit vectorn.

For a general, mixed state we can write the density matrix as

ρ =1

2(1+ rn · σ)

=1

2(1 + r)

1+ n · σ2

+1

2(1− r)

1− n · σ2

=1

2(1 + r)P+(n) +

1

2(1− r)P−(n) (2.22)

with r = rn andP±(n) as matrices that project on spin up (+) or spin down (-). This means thatthe eigenvectors ofρ, also in this case are the eigenvectors ofn · σ, but the eigenvaluep+ of thespin up state and the eigenvaluep− of the spin down state are given by

p± =1

2(1± r) (2.23)

The interpretation ofp± as probabilities means that they have to be positive (and less than one).This is satisfied ifr ≤ 1. This result shows that all the physical states, both pure and mixed canbe represented by points in a three dimensional sphere of radiusr = 1. The points on the surface(r = 1) correspond to the pure states. Asr decreases the states gets less pure and forr = 0 wefind the state of maximal entropi. The entropi, defined by (2.14), is a monotonic function ofr,with maximum (log 2) for r = 0 and minimum (0) forr = 1. The explicit expression for theentropy is

S = − (1

2(1 + r) log

1

2(1 + r) +

1

2(1− r) log

1

2(1− r) )

= −1

2log[(1 + r)1+r(1− r)1−r] (2.24)

Thesphere of statesfor the two-level system is called theBloch sphere. More generally theset of density matrices of a quantum system can be shown to form aconvex set.

Page 49: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

2.2. ENTANGLEMENT 49

0.2 0.4 0.6 0.8 1

0.1

0.2

0.3

0.4

0.5

0.6

0.7

S

r

Figure 2.1:The entropy of mixed states for a two level system. The entropyS is shown as a function ofthe radial variabler for theBloch sphere representationof the states.

2.2 Entanglement

In this section we considercomposite systems, which can be considered as consisting of two ormoresubsystems. These subsystems will in general becorrelateddue to interactions betweenthe two parts. A quantum system may have correlations that in some sense are stronger thanwhat is possible in a classical system. We refer to this effect as quantumentanglement. This isconsidered as one of the clearest marks of the difference between classical and quantum physics.

2.2.1 Composite systems

Up to this point we have mainly considered anisolated quantum system. Only the variables ofthis system then enters in the quantum description, in the form of of state vectors and observ-ables, while the dynamical variables of other systems are irrelevant. As long as the system staysisolated all interactions with other systems are negligible and the time evolution is described bythe Schrodinger equation.

With maximal information about the system we describe it by a pure state (a state vector),while with less than maximal information we describe it by a mixed state (a density matrix). Ifthe system is in a pure state, it will continue to be in a pure state as long as it stays isolated. Fora mixed state, the degree of “non-purity” measured by the entropy will stay constant as long asit is isolated. This follows from the fact that the time evolution is unitary and the eigenvalues ofthe density operator therefore do not change with time.

Clearly an isolated system is an abstraction since interactions with other systems (generallyreferred to as theenvironment) can never be totally absent. However, in some cases this ide-alization works perfectly well. If interactions with the environments cannot be neglected, alsovariables associated with the environment have to be taken into account. If these act randomly onthe system they tend to introducedecoherencein the system. The state develops in the directionof being less pure,i.e., the entropy increases. However, a systematic manipulation with the sys-tem may change the state in the direction of being more pure. Thus, a partial or full measurement

Page 50: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

50 CHAPTER 2. QUANTUM MECHANICS AND PROBABILITY

performed on the system will be of this kind.Even if a systemA cannot be considered as isolated, sometimes it will be a part of alarger

isolated systemS. We will consider this sitation and assume that the total systemS can bedescibed in terms of a set of variables for systemA and a set of variables for the rest of thesystem, which we denoteB. We assume that these two sets of variables can be regarded asindependent (they are associated with independent degrees of freedom for the two subsystemsAandB), and that the interactions betweenA andB do not destroy this relative independence. Wefurther assume that the totality of dynamical variables forA andB provides a complete set ofvariables for the full system. We refer to this asa composite system.

Correlations is a typical feature of interacting systems, both at the classical and quantumlevel. Thus, the interactions between the two parts of a composite system will in general correlatetheir behaviour. Such correlations will persist after the two subsystems have ceased to interact,and when that is the case they bring information about interactions in the past. For a classicalsystem the correlations are often expressed in terms ofstatistical correlation functions, whichcan be expressed in terms of mean values of products of variables belonging to the two systems.In quantum mechanics the corresponding correlation functions are expectation values of productsof observables for the subsystems. In the following we focus on such correlations with the aim ofstudying thedifferencebetween quantum and classical correlations. We first give a brief resumeof classical statistical correlations.

2.2.2 Classical statistical correlations

We consider a physical systemS with two clearly distinguishable subsystemsA andB. Let usassumeA to be a variable of systemA which we for simplicity assume to take discrete valuesa1, a2, .... SimilarlyB is a variable of subsystemB which can take valuesb1, b2, .... In a statisticaldescription these variables may each be associated with a probability distribution, so thatpAk isthe probability for the variableA to take the valueak andpBl is the probability for the variableBto take the valuebl. To study correlations between the subsystems we introducejoint probablities,wherepkl measures the probability for the the variableA to take the valueak and the variableBto take the valuebl. Clearly we have the relations

pAk =∑l

pkl , pbl =∑k

pkl (2.25)

The statistical mean values for each subsystem then are given by

〈A〉 =∑k

pAk ak , 〈B〉 =∑l

pBl bl (2.26)

And for the composite system the mean value of the product variableAB is

〈AB〉 =∑kl

pklakbl (2.27)

The subsystems of the composite system are uncorrelated when all the information about thestatistical mean values are contained in the probability distributionspAk andpBl . This means

Page 51: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

2.2. ENTANGLEMENT 51

that the joint probability has the product form

pkl = pAk pBl (2.28)

and the mean value of the productAB therefore also factorizes,

〈AB〉 = 〈A〉 〈B〉 (2.29)

If the probability distribution, and thereby the mean values, do not factorize in this way, thatexpresses the presence of statistical correlations between the subsystems. The correlations canthen be studied in the form of correlation functions defined by

C(A,B) = 〈AB〉 − 〈A〉 〈B〉 (2.30)

We note that if we have full information about the composite systemS, and thereby fullinformation about the subsystemsA andB, then the the value of the productAB will triviallyfactorize. In this sense there are no correlations between the subsystems. Thus, for statisticalcorrelations between the subsystems to be a meaningful concept, the state of the full system aswell as of its parts has to be associated with statistical ensembles. For a quantum system this isdifferent. Even if we have full information about the composite system, which means that it is ina pure state, we do not necessarily have full information about its parts. Therefore the subsystemsmay have non-trivial correlations, and with the full system in a pure state, these correlations areof non-classical nature. This type of correlations is what we refer to as entanglement.

2.2.3 States of a composite quantum system

In mathematical terms we describe the Hilbert spaceH of a composite quantum system as atensor productof two Hilbert spacesHA andHB associated with the two subsytems,

H = HA ⊗HB (2.31)

A basis|i〉 of HA and a basis|α〉 of HB then defines a tensor product basis forH,

|iα〉 = |i〉 ⊗ |α〉 (2.32)

A general vector inH is of the form

|χ〉 =∑iα

ciα|i〉 ⊗ |α〉 (2.33)

and in general it isnota tensor product of a vector fromHA with a vector fromHB.The observables of systemA and the observables of systemB definecommuting observables

of the full system. This follows directly from how they act on the basis vectors (2.32). WithAas an observable for systemA andB as an observable for systemB, we have

A|iα〉 =∑j

Aji|jα〉 , B|iα〉 =∑β

Bβα|iβ〉 (2.34)

Page 52: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

52 CHAPTER 2. QUANTUM MECHANICS AND PROBABILITY

Clearly

AB|iα〉 =∑jβ

AjiBβα|jβ〉 = BA|iα〉 (2.35)

The expectation values of observables acting only on subsystemA are determined by thereduced density matrixof systemA. This is obtained by taking thepartial tracewith respect tothe coordinates of systemB,

ρA = TrB(ρ) , ρAij =∑α

ρiα jα (2.36)

In the same way we define the reduced density matrix for systemB

ρB = TrA(ρ) , ρBαβ =∑i

ρiα iβ (2.37)

In general the reduced density matrices will correspond to mixed states for the subsystemsA andB, even if the full system is in a pure state.

For a composite system there exist certain relations between the entropyS(ρ) of the fullsystem and the entropies of the subsystems,S(ρA) andS(ρB). Thus, for abipartite system (asystem consisting of two parts), like the one considered above, the following inequalities aresatisfied,

S(ρ) ≤ SA + SB , S(ρ) ≥ |SA − SB| (2.38)

We note in particular the first inequality. This may seem somewhat surprising, since it indicatesthat there are states of a composite system that are more pure than the states of its subsystems. Acorresponding situation for a classical system would be rather paradoxical since it would meanthat the total system could be in a state that was more ordered than the states of its subsystems.However, for a quantum system this may be the case. In particular, the full system may in a purestate, withS = 0, while the subsystems may be in mixed states, so that bothSA andSB arenon-vanishing and positive. In this sense maximal information about the full quantum systemdoes not imply maximal information about its parts.

2.2.4 Correlations and entanglement

A product state of the form

ρ = ρA ⊗ ρB (2.39)

is a state withno correlationsbetween the subsystemsA andB. For any observableA actingonA andB acting onB the expectation value of the product operatorAB, is then simply theproduct of the expectation values

〈AB〉 = 〈A〉A 〈B〉B (2.40)

Page 53: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

2.2. ENTANGLEMENT 53

Written in terms of the density operators,

Tr(ρAB) = TrA(ρAA)TrB(ρBB) . (2.41)

Conversely, if the product relation (2.40) is correct forall observablesA acting onA andallobservablesB acting onB, the two subsystems are uncorrelated and the density operator can bewritten in the product form (2.40).

We next consider a (mixed) state of the form

ρ =∑k

pk ρAk ⊗ ρBk . (2.42)

With pk as a probability distribution, this density matrix can be viewed as describing astatis-tical ensembleof product states,i.e., of states which, when regarded separately, do not containcorrelations between the two subsytems. However, the density matrix of the ensembledoescontain (statistical) correlations between the two subsystems. Thus, the expectation value of aproduct of observables forA andB is

〈AB〉 =∑k

pk 〈A〉k A 〈B〉k B

=∑k

pkTrA(ρAk A)TrB(ρBk B) (2.43)

and this is generally different from the product of expectation values,

〈A〉A 〈B〉B =∑k

pk 〈A〉k A∑l

pl 〈B〉l B

=∑k l

pk pl TrA(ρAk A)TrB(ρBl B) . (2.44)

The correlations contained in a density operator of the form (2.42) we may refer to asclassi-cal correlations, since these are of the same form that we find in a classical, statistical descriptionof a composite system. Such a state we refer to asseparable. However, in a quantum systemthere may be correlations that cannot be written in the form (2.42) and which are therefore ofgenuinely quantum mechanical nature. Composite systems with this kind of correlations aresaid to beentangled, and in some sense the correlations that are found in entangled states arestronger than can be achieved due to classical statistical correlations. In recent years there hasbeen much effort devoted to give a concise quantiative meaning to entanglement in compositesystems. However, for bipartite systems in mixed states and for multipartite systems in general,it is not obvious how to quantify deviations from classical correlations. But for a bipartite systemin a pure state a unique definition of entanglement in terms of the entropy of the reduced densitymatrices of the subsystems. We focus on this case.

We therfore now consider a generalpurestate of the total systemA+B. The entropy of sucha state clearly vanishes. The state has the general form

|χ〉 =∑iα

ciα|i〉A ⊗ |α〉B (2.45)

Page 54: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

54 CHAPTER 2. QUANTUM MECHANICS AND PROBABILITY

but can also be written in the diagonal form

|χ〉 =∑n

dn|n〉A ⊗ |n〉B (2.46)

where|n〉A is a set of orthonormal states for theA system and|n〉B is a set of orthonormalstates for theB system. This form of the state vector as a simple sum over product vectors ratherthan a double sum is referred to as theSchmidt decompositionof |χ〉. It will be shown below thatsuch an expansion is generally valid for a composite bipartite system.

The form of the state vector|χ〉 is similar to that of density operator (2.42) for the correlatedstate. In the same way as for the density operator the system is uncorrelated if the sum includesonly one term,i.e., if the state has a product form. However, the correlations implied by thegeneral form of the state vector (2.46) are different from those of the (classically) correlatedstate (2.42). From the Schmidt decomposition follows that the density matrix of the full systemcorresponding to the pure state|χ〉 is

ρ =∑nm

dnd∗m |n〉A〈m|A ⊗ |n〉B〈m|B (2.47)

This is not of the form (2.42). The state (2.46) hascorrelations between state vectorsof the twosubsystems rather than correlations only between density operators.

From (2.47) also follows that the reduced density operators of the two subsystems have theform

ρA =∑n

|dn|2|n〉A〈n|A , ρB =∑n

|dn|2|n〉B〈n|B (2.48)

The expressions show that all the eigenvalues of the two reduced density matrices are the same.This implies that the von Neuman entropy of the two subsystems is the same,

SA = SB = −∑n

|dn|2 log |dn|2 (2.49)

In this case, with the full systemA+B in a pure state, the entropy of one of the subsystems (A orB) is taken as a quantitative measure of the entanglement between the subsystems.

The Schmidt decompositionWe show here that a general state of the composite system can be written in the form (2.46). Thestarting point is the general expression (2.45). We introduce a unitary transformationU for thebasis of systemA,

|i〉A =∑m

Umi |m〉A (2.50)

and rewrite the state vector as

|χ〉 =∑iα

∑m

ciαUmi|m〉A ⊗ |α〉B

=∑m

|m〉A ⊗ (∑iα

ciαUmi|α〉B)

≡∑m

|m〉A ⊗ |m〉B (2.51)

Page 55: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

2.2. ENTANGLEMENT 55

If the unitary transformation can be chosen to make the set of states|m〉 anorthogonalset ofstates, then (2.46) follows. The scalar product between two of the states is

〈n|m〉B =∑iα

∑jβ

c∗jβU∗njciαUmi〈β|α〉B

=∑ij

∑α

c∗jαU∗njciαUmi

= (UCC†U †)mn (2.52)

In the last expression the coefficientsciα have been treated as the matrix elements of the matrixC. We note that the matrixM = CC† is hermitian and positive definite, and the unitary matrixUcan therefore be chosen to diagonalizeM . With the (non-negative) eigenvalues written as|dn|2we have

〈n|m〉B = |dn|2δmn (2.53)

and if the normalized vectors|m〉 are introduced by

|m〉 = dm|m〉 (2.54)

then the Schmidt form (2.46) of the vector|χ〉 is reproduced with the basis sets of both systemsA andB as orthonormal vectors.

2.2.5 Entanglement in a two-spin system

As an example we consider entanglement in the simplest possible composite system, which is asystem that consists of two two-level subsystems (for example two spin-half systems). We shallconsider several sets of vectors with different degrees of entanglement.

An orthogonal basis of product states is given by the four vectors

| ↑ ↑〉 = | ↑〉A ⊗ | ↑〉B| ↑ ↓〉 = | ↑〉A ⊗ | ↓〉B| ↓ ↑〉 = | ↓〉A ⊗ | ↑〉B| ↓ ↓〉 = | ↓〉A ⊗ | ↓〉B (2.55)

where| ↑〉, | ↓〉 is an orthonormal basis set for the spin-half system.Another basis is given by states with well-defined total spin. As is well-known by the rules

of addition of angular momentum, two spin half systems will have states of total spin0 or 1. Thespin0 state is the antisymmetric (spin singlet) state

|0〉 =1√2(| ↑ ↓〉 − | ↓ ↑〉) (2.56)

while spin1 is described by the symmetric (spin triplet) states

|1, 1〉 = | ↑ ↑〉 , |1, 0〉 =1√2(| ↑ ↓〉+ | ↓ ↑〉) , |1,−1〉 = | ↓ ↓〉 (2.57)

Page 56: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

56 CHAPTER 2. QUANTUM MECHANICS AND PROBABILITY

Clearly the two states|1, 1〉 = | ↑ ↑〉 and|1,−1〉 = | ↓ ↓〉 are product states with no correlationbetween the two spin systems. However, the two remaining states|0〉 and|1, 0〉 are entangled.These states may be included as two of the states of a third basis of orthonormal states, calledtheBell states. They are defined by

|a,±〉 =1√2(| ↑ ↓〉 ± | ↓ ↑〉)

|c,±〉 =1√2(| ↑ ↑〉 ± | ↓ ↓〉) (2.58)

and are all states ofmaximal entanglementbetween the two subsystems. We note that the twospins are strictlyanticorrelatedfor the first two states (|a,±〉) and strictlycorrelatedfor the twoother states (|c,±〉).

Let us focus on the spin singlet state|a,−〉. It is a pure state (of systemA+B) with densitymatrix

ρ(a,−) = |a,−〉〈a,−|

=1

2(| ↑ ↓〉〈↑ ↓ |+ | ↓ ↑〉〈↓ ↑ | − | ↑ ↓〉〈↓ ↑ | − | ↓ ↑〉〈↑ ↓ |)

(2.59)

The corresponding reduced density matrices of both systems A and B are of the same form

ρA = ρB =1

2(| ↑〉〈↑ |+ | ↓〉〈↓ |) (2.60)

The symmetric stateρ(a,+) has the same reduced density matrices asρ(a,−). In fact this istrue for all the four Bell states. Thus, the information contained in the density matrices of thesubsystems do not distinguish between these four states of the total system.

The loss of information when we consider the reduced density matrices is demonstrated ex-plicitly in taking the partial trace of (2.59) with respect to subsystemA or subsystemB. The twolast terms in (2.59) simply do not contribute. If we leave out the two last terms of (2.59) we havethe following density matrix, which also have the same reduced density matrices,

ρ(a) =1

2(| ↑ ↓〉〈↑ ↓ |+ | ↓ ↑〉〈↓ ↑ |) (2.61)

This is still strictly anticorrelated in the spin of the two particles, but the correlation is nowclassicalin the sense that the full density matrix is written in the form (2.42). Thus, the termsthat are important for the quantum entanglement between the two subsystems are the ones thatare left out when we take the partial trace. These terms are the off-diagonalinterference termsof the full density matrix, and this means that we can view the entanglement as a special type ofinterference effect associated with the composite system.

When the reduced density operators are written as2× 2 matrices they have the form

ρA = ρB =1

2

(1 00 1

)(2.62)

Page 57: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

2.3. QUANTUM STATES AND PHYSICAL REALITY 57

Thus they are proportional to the2 × 2 unit operator and therefore correspond to states withmaximal entropySA = SB = log 2. With the entropy of the subsystems taken as a measure ofentanglement this means that the spin singlet state is a state ofmaximal entanglement. This istrue for all the Bell states, as already has been mentioned.

Finally, note that the reduced density matrices (3.8) arerotationally invariant. This is how-ever not the case for the “classical” density matrix (2.61) of the full system. It describes a statewhere the spins of the two particlesalong one particular directionis strictly anticorrelated. Itis interesting to note that we cannot define a density matrix of this form which predicts strictanticorrelation for the spin inany direction. Also note that neither the pure state|a,+〉 is ro-tationally invariant. In fact the state|a,−〉 is the only Bell state that is rotationally inavriant,since it corresponds tospin 0. This makes the spin singlet state particularly interesting, since itdescribes a situation where the spins are strictly anticorrelatedalong any direction in space.

2.3 Quantum states and physical reality

In 1935 Einstein, Podolsky and Rosen published a paper where they focussed on a central ques-tion in quantum mechanics. This question has to do with the relation between the theformalismof quantum mechanics, with its probability interpretation, and the underlyingphysical realitythatquantum mechanics describes. They pointed out, by way of a simple thought experiment, thatone of the implications of quantum physics is that it blurs the distinction between theobjectivereality of natureand thesubjective descriptionused by the physicist. Their thought experimenthas been referred to as the EPR paradox, and it has challenged physicists in their understandingof the relation between physics and the reality of natural phenomena up to this day. As conclu-sion drawn from the paradox Einstein, Podolsky and Rosen suggested that quantum mechanicsis an incompletetheory of nature. Later, in 1964 John Bell showed that the conflict betweenquantum mechanics and intuitive notions of reality goes deeper. Quantum theory allows types ofcorrelations that cannot be found in classical theories that obey the basic assumptions oflocalityandreality.

In this section we examine the EPR paradox in a form introduced by David Bohm and proceedto examine how the limitations of classical theory is broken in the form of Bell inequalities. Asfirst emphasized by Erwin Schrodinger the basic element of quantum physics involved in theseconsiderations is that ofentanglement.

2.3.1 EPR-paradox

We consider a thought experiment, where two spin half particles (particle A and particle B) areproduced in a spin singlet state (with total spin zero). The two particles move apart, but sincethey are considered to be well separated from any disturbance, they keep their correlations sothat the spin state is left unchanged. The full wave function of the two-particle system can beviewed as the product of a spin state and a position state, but for our purpose only the spin state isof interest. Concerning the position state it is sufficient to know that after some time the particlesare separated by a large distance.

Page 58: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

58 CHAPTER 2. QUANTUM MECHANICS AND PROBABILITY

At a point in time when the two particles are well separated, a spin measurement is performedon particle A. This will modify the spin state of this particle, but since particle B is far away,themeasurement cannot affect particle B in any real sense. However, and this is the paradox,thechange in the spin state caused by the measurement will also influence the possible outcomes ofspin measurements performed on particle B.

A B

entangled pairs

spin measurement

S

Figure 2.2: The set-up of an EPR experiment. Pairs of entangled particles are prepared in a singletspin state and sendt in opposite directions from a sourceS. ( In the figure the entangled pairs are shownconnected with dashed red curves.) The spin direction of each particle is, in this initial state, totallyundetermined. At some distance from the source one particle (A) from each pair passes a measuring device(shown by the green window) which measures the spin component in the z- direction. After passing themeasuring apparatus the particle appears with quantized spin in the z-direction, either spin up as shown inthe figure or with spin down. Due to the strict anti-correlation between the spin orientation of particle Aand particle B in a pair, at the same time as particle A appears with spin up in the z-direction particle B willappear with spin down along the z-axis. In this way the measuring of the spin of particle A effectively actsas a measurement of the spin of particle B (indicated by the dashed green window) even if no measuringdevice acts on this particle. Following EPR we make the following deduction. Since particle B has beensubject to no physical action, spin down along the z-axis for particle B after the measurement on A impliesthat B had spin up along the z-action also before the measurement (even if that was not known prior tothe measurement). But before the measurement was performed we could in principle have decided tomeasure the spin in the x-direction. If that was done the spin of particle B in the x-direction (red arrow)would have been quantized. In this way, by choosing how to perform the the measurement on A, we candecide whether to quantize the spin of B in the z- or the x-direction. Again, since no physical interactionwith B has taken place, this implies that the spin components both in the z- and the x-directions must havehad quantized, sharp (but unknown) values before the measurement. This conclusion is not consistentwith quantum mechanics, since the two components of the spin areincompatibleobservables, hence theparadox.

The experiment is schematically shown in Fig.(2.3.1). From the time when the particles areemitted and until the time when the spin experiment is performed the two particles are in theentangled state

|a,−〉 =1√2(| ↑ ↓〉 − | ↓ ↑〉) (2.63)

In this state the spin of the two particles are strictlyanticorrelated. This means that if particle

Page 59: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

2.3. QUANTUM STATES AND PHYSICAL REALITY 59

A is measured to be in a spin up state, particle B is necessarily in a spin down state. But eachparticle, when viewed separately, is with equal probability found with spin up and spin down.

The reference tospin up( ↑) andspin down( ↓) in the state vector (2.63) seems to indicatethat we have given preference to some particular direction in space. However, the singlet state isrotationally invariant. Therefore the state is left unchanged if we redefine this direction in space.Thus, we do not have to specify whether thez-axis, thex-axis or any other direction has beenchosen, they all give rise to the same (spin0) state.

We now consider, in this hypothetical experiment, that the spin of particle A is measuredalong the z-axis. If the result isspin up, we know with certainty that the spin of particle B in thez-direction isspin down. Likewise, if the result of measurement on A isspin downthe spin ofparticle B in the z-direction isspin up. In both cases, a measurement of the z spin componentof particle A will with necessity project particle B into a state with well defined (quantized) z-component of the spin. Since no real change can have been introduced in the state of particle B(it is far away) it seems natural to conclude that the spin component along the z-axis must havehad a sharp (although unknown value) alsobeforethe measurement was actually performed onparticle A.

This conclusion is on the other hand in conflict with the standard interpretation of quantummechanics. Thus, if the the z-component of the spin of particle B has a sharp value beforethe measurement, it will have a sharp value even if the measurement along the z-axis is notperformed at all, and even if we choose to measure the spin along the x-axis instead. But nowthe argument can be repeated for the x-component of the spin. This will in the same way leadto the conclusion that the x-component of the spin of particle B has a sharp value before themeasurement. Consequently, both the z-component and the x-component of the spin must have asharp value before the measurement is performed on particle A. But this is not in accordance withthe standard interpretation of quantum mechanics. The z-component and the x-component of thespin operator areincompatible observables, since they do not commute as operators. Thereforethey cannot simultanuously be assigned sharp values.

The conclusion Einstein, Podolsky and Rosen drew from the thought experiment is that bothobservables will in reality have sharp values, but since quantum mechanics is not able to predictthese values with certainty (only with probabilities) quantum theory cannot be a complete theoryof nature. From their point of view there seem to be some missing variables in the theory. If suchvariables are introduced, we refer to the theory as a “hidden variable” theory.

Note the two elements that enter into the above argument:Locality — Since particle B is far away from where the measurement takes place we concludethat no real change can take place concerning the state of particle B.Reality— When the measurement on particle A makes it possible to predictwith certaintytheoutcome of a measurement performed on particle B, we draw the conclusion that this representsa property of B which is there even without actually performing a measurement on B.

From later studies we know that the idea of Einstein, Podolsky and Rosen that quantum theoryis an incomplete theory does not really resolve the EPR paradox. Hidden variable theories canin principle be introduced, but to be consistent with the predictions of quantum mechanics, theycannot satisfy the conditions of locality and reality used by EPR. Thus, there is a realconflictbetween quantum mechanics and basic principles of classical theories of nature.

Page 60: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

60 CHAPTER 2. QUANTUM MECHANICS AND PROBABILITY

From the discussion above it is clear thatentanglementbetween the two particles is the sourceof the (apparent) problem. It seems natural to draw the conclusion that if a measurement isperformed on one of the partners of an entangled system there will be (immediately) a change inthe state also of the other partner. But if we phrase the effect in this way, one should be aware ofthe fact that the effect of entanglement is always hidden in correlations between measurementsperformed on the two parts. The measurement performed on particle A does not lead to any localchange at particle B that can be seen without consulting the result of the measurement performedon A. Thus, no change that can be interpreted as a signal sent from A to B has taken place. Thismeans that the immediate change that is affecting the total system A+B due to the measurementon A does not introduce results that are in conflict with causality.

2.3.2 Bell’s inequality

Correlation functionsThe EPR paradox shows that quantum mechanics is radically different from classical statisticaltheories when entanglement between systems is involved. In the original presentation of Einstein,Podolsky and Rosen it was left open as a possibility that a more complete description of naturecould replace quantum mechanics. However, by studying correlations between spin systems JohnBell concluded in 1964 that the predictions of quantum mechanics is in direct conflict with whatcan be deduced from a “more complete” theory with hidden variables, when this satisfy what isknown asEinstein locality. This was expressed in terms of a certain inequality (Bell inequality)that has to be satisfied by classical (local, realistic) theories. Quantum theory does not satisfythis inequality.3

To examine this difference between classical and quantum correlations we reconsider theEPR experiment, but focus on correlations between measurements of spin components performedon both particles A and B. Let us assume that the particles move in they-direction. The spincomponent in a rotated direction in thex, z-plane will have the form

Sθ = cos θSz + sin θSx (2.64)

We consider now the following correlation function between spin measurements on the two par-ticles,

E(θA, θB) ≡ 4

h2 〈SθASθB

〉 (2.65)

Thus, we are interested in correlations between spin directions that are rotated byarbitrary anglesθA andθB in the x, z-plane. The factorh2/4 is included to makeE(θA, θB) a dimensionlessfunction, normalized to take values in the interval(−1,+1).

3Bell’s analysis focusses on the conflict between the possible correlations in a quantum and a classical system.But to assume sharp values for the components of the particle spin in different directions creates a problem evenwithout making reference to correlations. As discussed in the context of the Stein-Gerlach experiment, the fact thatonly spin up and spin down can be measured for a spin component seems incompatible with rotational invariancewhen spin is treated as a classical variable.

Page 61: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

2.3. QUANTUM STATES AND PHYSICAL REALITY 61

For the spin singlet state it is easy to calculate this correlation function. It is

E(θA, θB) = − cos(θA − θB) (2.66)

With the two spin measurements oriented in the same direction,θA = θB ≡ θ, we have

E(θ, θ) = −1 (2.67)

which shows the anticorrelation between the two spin vectors.Let us now leave the predictions of quantum mechanics and look at the limitations of alocal,

realistic hidden variable theory. We write the measured spin values asSA = h2σA andSB = h

2σB

so thatσA andσB are normalized to unity in absolute value. We make the following assumptions

1. Hidden variables.The measured spin valuesSA of particle A andSB of particle B aredetermined (before the measurement) by one or more unknown varlables, denotedλ. ThusσA =σA(λ) andσB = σB(λ).

2. Einstein locality. The spin valueSA will depend on the orientationθA, but not on theorientationθB, when the two measuring devices are separated by a large distance. SimilarlySBwill be independent ofθA. We write this as

σA = σA(θA, λ) σB = σB(θB, λ)

3. Spin quantization. We assume the possible outcome of measurements are consistent withspin quantization,

σA = ±1 σB = ±1

4. Anticorrelation. We assume that the measured values are consistent with the known anti-correlation of the singlet state

σA(θ, λ) = −σB(θ, λ) ≡ σ(θ, λ)

Since the variableλ is unknown we assume that particles are emitted from the sourceS withsome probability distributionρ(λ) overλ. Thus, the expectation value (2.65) can be written as

E(θA, θB) =∫dλ ρ(λ) σA(θA, λ) σB(θB, λ)

= −∫dλ ρ(λ) σ(θA, λ) σ(θB, λ) (2.68)

We now consider the difference between correlation functionE for two different set of angles

E(θ1, θ2)− E(θ1, θ3) = −∫dλ ρ(λ)(σ(θ1, λ)σ(θ2, λ)− σ(θ1, λ)σ(θ3, λ))

=∫dλ ρ(λ)σ(θ1, λ)σ(θ2, λ) (σ(θ2, λ)σ(θ3, λ)− 1)

(2.69)

Page 62: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

62 CHAPTER 2. QUANTUM MECHANICS AND PROBABILITY

In the last expression we have usedσ(θ, λ)2 = 1, which follows from assumption 3. From thiswe deduce the following inequality

|E(θ1, θ2)− E(θ1, θ3)| ≤∫dλ ρ(λ) (1− σ(θ2, λ)σ(θ3, λ))

= 1 + E(θ2, θ3) (2.70)

which is one of the forms of theBell inequality.Let us make the further assumption that the correlation function is only dependent on the

relative angle (rotational invariance),

E(θ1, θ2) = E(θ1 − θ2) (2.71)

Let us also assume thatE(θ) increases monotonically withθ in the interval(0, π). (Both theseassumptions are true for the quantum mechanical correlation function (2.66).) We introduce theprobability function

P (θ) =1

2(E(θ) + 1) (2.72)

This function gives the probability for measuring eitherspin upor spin downon both spin mea-surements(see the discussion below). It satisfies the simplified inequality

P (2θ) ≤ 2P (θ) , 0 ≤ θ ≤ π

2(2.73)

Due to strict anticorrelation between the spin measurements forθ = 0, which means strictcorrelation forθ = π, the function satisfies the boundary conditions

P (0) = 0 , P (π) = 1 (2.74)

The inequality (2.73) further implies thatP (θ) is aconcave functionwith these boundary values.This is illustrated in Fig.(2.3.2), whereP (θ) = θ/π is a limiting function which satisfies (2.73)with equalityrather than inequality.

We now turn to the predictions of quantum theory. As follows from (2.66) the functionP (θ)is given by

P (θ) = sin2 θ

2(2.75)

This function does not satisfy the Bell inequality. This is clear from taking a special valueθ =π/3. We have

P (2π/3) = sin2 π

3=

3

4

2P (π/3) = 2 sin2 π

6=

1

2(2.76)

Page 63: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

2.3. QUANTUM STATES AND PHYSICAL REALITY 63

0.5 1 1.5 2 2.5 3

0.2

0.4

0.6

0.8

1 P(Θ)

Θ

A

B

C

Figure 2.3: The probabilityP (θ) for measuring twospin ups or twospin downs as a function of therelative angleθ between the spin directions. CurveA satisfies the Bell inequality, CurveB is the limitingcase where the Bell inequality is replaced by equality and Curve C is the probability predicted by quantummechanics. Curve C is not consistent with the Bell inequality, but is confirmed by experiment.

Thus,P (2π/3) > 2P (π/3) in clear contradiction to Bell’s inequality. This breaking of the in-equality is seen also in Fig.(2.3.2).

Measured frequenciesTo get a better understanding of the physical content of the Bell inequality we will discuss itsmeaning in the context of spin measurements performed on the two particles, where a seriesof resultsspin upor spin downis registered for each particle. We are particularly interested incorrelations between the results of pairs of entangled particles.

Let us therefore consider a spin measurement experiment where the orientations of the twomeasurement devices are set to fixed directions. We assumeN pairs of entangled particles areused for the experiment. The results are registered for each pair, and we denote byn+ the numberof times where the two particles are registered both withspin upor both withspin down. Thenumber of times where one of the particles are registered withspin upand the other withspindownwe denote byn−. Clearlyn+ + n− = N .

The functionsE(θ) andP (θ) are for largeN approximated by the frequencies

E(θ)exp =n+ − n−

N

P (θ)exp =1

2(E(θ)exp + 1) =

n+

N(2.77)

ThusP (θ) represents the probability for having the same result (bothspin upor bothspin down)for an entangled pair, as already mentioned.

Let us first focus at the list of results calledSeries Iin Fig.(2.3.2). We consider this list asthe outcome of a series of measurements for a setup where the directions of the two measuringdevices are aligned(θA = θB = 0). Each pair of entangled particles is separated in aparticle Awhose spin is measured inMA and aparticle Bwhose spin is measured inMB.

Page 64: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

64 CHAPTER 2. QUANTUM MECHANICS AND PROBABILITY

The list indicates, as we should expect, that the results of measurements atMA, when con-sidered separately, correspond tospin upandspin downwith equal probability. The same isthe case for the meaurements atMB. The list also reveals the strictanticorrelation betweenmeasurements on each entangled pair. We write the result as

P (I)exp =n+(I)

N= 0 (2.78)

Instead of performing a new series of measurements we next make some theoretical consider-ations based on the assumption oflocality. We ask the question:What would have happened if inthe experiment performed the spin detectorMB had been rotated to another directionθB = π/3.Locality is now interpreted as meaning that ifMB were rotated, that could in no way have influ-enced the results atMA. The series of measurementsSA would therefore have been unchanged.The resultsSB would however have to change since the results forθ 6= 0 are not strictly an-ticorrelated. Thus, some of the results would be different compared to those listed in Series I.Quantum mechanics in this case predictsP (π/3) = 1/4. In the list of Series II we have indicateda possible outcome consistent with this, where

P (II)exp =n+(II)

N=

3

12(2.79)

Since we have a symmetric situation betweenMA andMB we clearly could have used thesame argument ifMA were rotated instead ofMB. This situation is shown in Series III, wherethe series of measurementsSB is left unchanged, but some of the results ofSA are changed. Inthis case we have chosenθA = −π/3. We also in this list have

P (III)exp =n+(III)

N=

3

12(2.80)

consistent with the predictions of quantum mechanics.Finally, we combine these two results to a two-step argument for what would have happened

if both measurement devices had been rotated, to the directionsθA = −π/3 andθB = π/3. Wecan start with either Series II or Series III. Rotation of the second measuring device would thenlead to a change of about1/4 of the results in the series that was not changed in the first step.The total number of correlated pairs would now be

n(IV ) = n(II) + n(III)−∆ (2.81)

In this expressionn(II) is the number of flips in the result ofSB andn(III) the number offlips in the results ofSA. Such flips would change the result fromanticorrelatedto correlated.However that would be true only ifonly oneof the spin results of an entangled pair were flipped.If both spins were flipped we would be back to the situation withanticorrelations. Thus thenumber of correlated pairs would be equal to the sum of the number of flips in each series minusthe number where two flips is applied to the same pair. The last number is represented by∆ inthe formula (2.81). In the result of Series IV this is represented by the result

P (IV )exp =n+(IV )

N=

4

12

< P (II)exp + P (III)exp =6

12(2.82)

Page 65: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

2.3. QUANTUM STATES AND PHYSICAL REALITY 65

The subtraction of∆ in (2.81) explains the Bell inequality, which here takes the form

P (2π/3) ≤ 2P (π/3) (2.83)

The results extracted from the series of spin measurements listed in Fig.(2.3.2) gives (2.82) inaccordance with the Bell inequality (2.83).

The arguments given above, which reproduce Bell’s inequality, may seem to implement theassumption oflocality in a very straight forward way:Changes in the set up of the measure-ments atMA cannot influence the result of measurements performed atMB when these devicesare separated by a large distance. The predictions of quantum mechanics, on the other hand,are not consistent with the results of these hypothetical experiments, andreal experimentshaveconfirmed quantum mechanics rather than the Bell inequality.

So is there anything about the arguments given for the hypothetical experiments that indicatesthat they may be wrong? We note at least one disturbing fact. Only one of the four series ofresults in Fig.(2.3.2) can be associated with areal experiment. The others must be based onassumptions of what would have happened ifthe same series of measurementswould have beenperformed under somewhat different conditions. This is essential for the result. If the four seriesof measurements should instead correspond tofour different (real) experimentsthe situationwould be different. We then would have no reason to believe that the series of resultsSA wouldbe preserved when going from I to II orSB when going from I to II. Each series would in thiscase rather give a new (random) distribution of results for particle A (or particle B).

In any case the conclusion seems inevitable, that quantum mechanics is in conflict withEin-stein locality, i.e., with the basic assumptions that lead to the Bell inequality. Does this mean thatsome kind of influenceis transmitted from A to B when measuring on particle A, even when thetwo particles are very far apart?

Page 66: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

66 CHAPTER 2. QUANTUM MECHANICS AND PROBABILITY

SA + - + + - + - - + + - +

SB - + - - + - + + - - + -

Series I: ΘA=ΘB=0, n+=0

Spin measurements

SA + - + + - + - - + + - +

SB - + + - + - - + + - + -

Series II: ΘA=0, ΘB=π/3, n+=3

SA - - + - - + - - - + - +

SB - + - - + - + + - - + -

Series III: ΘA=-π/3, ΘB=0, n+=3

SA - - + - - + - - - - - +

SB - + + - + - - + + - + -

Series IV: ΘA=-π/3, ΘB=π/3, n+=4

Figure 2.4:Four series with results of spin measurement on pairs of entangled particles (SA andSB).Spin upis represented by + andspin downby - . Series Iis considered as the result of a real expeimentwith alignment of the directions of the measuring devicesMA andMB. A strict anticorrelation betweenthe result is observed for each pair.Series IIgives the hypothetical results that would have been obtainedin the same experiment if the direction ofMB were tilted relative toMA. About 1/4 of the spinsSBwould be flipped. These are indicated by red circles. This would lead to four pairs with correlated spins,indicated by green lines.Series IIIis a similar hypothetical situation withMA rotated. FinallySeries IVgives the results if bothMA andMB were rotated. In one of the pairs both spins are flipped relative toSeries I. This reduces the number of correlated spins relative to the sum of correlated pairs forSeries IIandSeries III.

Page 67: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

Chapter 3

Quantum physics and information

In recent years there has been an increasing interest in questions concerning the relation betweenquantum physics and information theory. The present understanding is that the characteristicfeatures of quantum physics that distinguishes it from classical physics, namely quantum inter-ference in general and quantum entanglement in particular, creates the physical foundation for anapproach to communication and to processing of information that is qualitatively different fromthe traditional one. At present there is only a partial understanding of this new approach, but thebelief of many physicists is that a new type ofquantum information theoryshould be developedas an alternative to classical information theory. This belief is supported by the discovery ofalgorithms that could speed up the computation of certain types of mathematical problems in aquantum computerand by the develpement of methods for secure and efficient communicationby use ofentangled qubits.

The development of this new approach to information and communication poses importantchallenges to the manipulation of quantum systems. This is so sincequantum coherenceis im-portant for the methods to work, and in a system with many degrees of freedomdecoherencewillunder normal conditions rapidly destroy the important quantum correlations. The very difficultchallenge is to create a quantum system where, on one side, the quantum states are effectivelyprotected from outside disturbances and, on the other side, the variables can rapidly be adressedand manipulated in a controlled way in order for the system to perform the task in question.

In this chapter we focus on some basictheoreticalelements in this new approach to physicsand information, while for the discussion of the present status ofimplementationsof the ideas onphysical systems we refer to several recent books on the subject. We first focus on an exampleof how quantum physics admits the possibility of aquiring information in a radically new waythrough aninteraction-free measurement. We then proceed to study howqubitscan replacebitsas the fundamental unit of information.

3.1 An interaction-free measurement

The usual picture of measurements performed on a quantum system is that they involve a non-negligible, minimal disturbance quantified by Planck’s constant. If photons are used to examine

67

Page 68: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

68 CHAPTER 3. QUANTUM PHYSICS AND INFORMATION

a physical object, the minimal disturbance corresponds to letting a single photon interact withthe system. The energy of the photon may be made small, but since this means making thecorresponding frequency small, one will thereby loose resolution. Thus if a certain resolutionis required, a minimal energy has to be carried by the photon and this gives rise to a finiteperturbation of the object. The picture of measurements in classical theory is different. There theenergy that is carried by light of a given frequency can be made arbitrarily small by reducing theamplitude, and therefore there is no lower limit to how much an (idealized) measurement willhave to disturb the object studied.

However, this is not the complete picture. Quantum mechanics opens up the possibility forother types of more “intelligent” interactions than the direct “mechanical” interaction betweenthe object and the measuring apparatus. With the use of quantum superposition (or interference)certain types of measurements can be performed which involveno mechanical interactionwiththe object. A particular example is discussed here.1

Let us assume that a measurement should be performed in order to examine whether or not anobject is present within a small transparent box. If the object is not present the box is transparentto light, while if it is present the box is not transparent since the object will absorb or scatter thephoton. Let us further assume that measurements are performed with single photons.

A direct measurement would be to send a photon through the box and to register whetherthe photon is transmitted through the box or whether it is not transmitted. This would clearlygive the information required. If the object is present, the information about this situation isachieved by a direct (mechanical) interaction between the photon and the object. Apparently thisis the least interaction with the object that can be made in order detect its presence. However,this is not the correct conclusion to draw, as is outlined in Fig.3.1. The figure shows a Mach-Zehnder interferometer where an incoming photon can follow two different paths and eventuallybe registered in of the two detectors. We first consider the case where both paths are open.The photon will first meet a beam splitter that with equal probability will direct the photonhorizontally into the lower path or vertically into the upper part. On both paths the photon willmeet a mirror that redirects it towards a second (50/50) beam splitter. Here the two componentsof the photon wave functions will meet and form a superposition that can either direct it in thehorizontal direction towards a detector A or vertically towards a detector B.

If we neglect the coherence effect and assume an incoherent scattering of the photon by thebeam splitters, with equal probability in the two directions, then we expect that the probabilityfor detecting the photon by detector A to be 50%. With the same probability the photon will bedetected by B. However, the quantum description of the transmission of the photon through theapparatus implies that the photon at intermediate times is not located (with a certain probability)on one of the paths, but is rather in asuperpositionof being on each of the two paths. Thismeans that the two signals following the upper and lower paths will interfere when they meet atthe second beam splitter. In the following we will assume the experimental setup to be adjustedto a situation where the interference acts constructively for a photon directed towards detector Aand destructively for a photon directet towards detector B. Thus, the probability for detecting thephoton by A is1 and the probability for detecting the photon by B is0. This will be the case as

1This example is taken from A.C. Elitzur and L. Vaidman, Found. Phys23 (1993) 987.

Page 69: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

3.1. AN INTERACTION-FREE MEASUREMENT 69

detector B

detector A

photon

alternative I:

box empty

alternative II:

object present

mirror

beam splitter

Figure 3.1:The set up of a single-photon measurement to detect the presence of an object in a transparentbox by use of a Mach-Zehnder interferometer. A photon is sent through a beam splitter that directs it, in asuperposition, either in the horizontal or vertical direction. On the lower path the box is placed. Therefore,if the object is present, the lower path is blocked, while if it is not present both paths are open. The twopaths meet again at a second beam splitter and then the photon is directed towards one out of two possibledetectors. The interferometer is arranged so that if both paths are open, destructive interference preventsthe photon from reaching detector B. Thus, if the photon is registered by B this provides the informationthat the lower path is blocked and that the object is present in the box.

long as both paths are open. Clearly, if one of the paths are closed the photons that get throughare instead detected with equal probability by A and B.

To describe the situation more formally let us denote the state of a photon moving in thehorizontal direction by|0〉 and a photon moving in the vertical direction by|1〉. (With thisnotation we do not make any distinction between where in the apparatus the photon is.) Theaction of the beam splitters on a photon is described by the mapping

|0〉 → 1√2(|0〉+ i|1〉)

|1〉 → 1√2(|1〉+ i|0〉) (3.1)

while the action of the mirrors is given by

|0〉 → i|1〉 , (lower mirror)

|1〉 → i|0〉 , (upper mirror) (3.2)

Page 70: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

70 CHAPTER 3. QUANTUM PHYSICS AND INFORMATION

The lengths of the paths are assumed to be the same, so that the difference in phase acquired bythe photon on the two paths is only due to the phase shifts produced at the mirrors and beamsplitters. The photon is subject to a serie of transformations of the form (3.1) and (3.2) in theinterferometer. Let us consider the mapping from the incoming state to the outgoing state (beforedetection) when only the upper path is open,

|0〉 → i|1〉 → −|0〉 → − 1√2(|0〉+ i|1〉) (3.3)

Since we are interested in the state of a photon that exits from the last beam splitter, the interme-diate states have been normalized to 1, thus neglecting the probability that the photon is absorbedon the lower path. The interesting point to note that the final state is a superposition with equalprobability for exit in the horizontal and vertical direction.

If both paths are open the mapping from the initial to the final state is instead

|0〉 → 1√2(|0〉+ i|1〉) → 1√

2(−|0〉+ i|1〉) → −|0〉 (3.4)

and we note that only one component survives. The photon will exit (with probability1) in thehorizontal direction.

We will now turn to the original problem and consider the the situation where the box, whicheither is empty or not empty, is placed on one of the paths of the photon. The intention is to sendone photon through the interferometer in order to investigate whether the box is empty or not.We note that if the box is empty we have the situation where both photon paths are open. If thebox is not empty only one of the paths will be open.

Let us consider the possible outcomes of the experiment where a photon is sent through theinterferometer:

1. The photon does not get through.We conclude that the object is present, the photon has interacted with the object.

2. The photon is registered by detector A.The result is inconclusive. Whether the object is there or not there is always a chance forthe photon to be detected by A. The experiment has to be repeated.

3. The photon is registered by detector B.We conclude that the object is present, since the probability of detecting the photon by Bwith both paths open is0.

Of the possible outcomes we focus on 3., which is the interesting one. In this case thepresence of the object has been detected without any interaction with it. This is so since thedetectionof the photon implies that no interaction has taken place. A natural explanation seemsto be that the photon has followed the upper (open) path, but that the detection of the photonprovides information about the lower path (that it is closed). This result depends crucially on thepossibility of quantum superposition. In a classical theory this would not happen. Note however,the curious fact that when the object blocks the path, in reality no superposition takes place. The

Page 71: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

3.2. FROM BITS TO QUBITS 71

result of the measurement will be the same as in a classical theory with a certain probability forthe photon to follow the upper path. It isour knowledgeof the possible outcomes when bothpaths are open that allows us to draw the conclusion that one path is closed.

In conclusion this thought experiment shows that a direct interaction in a measurement is notalways needed. But there has to be apossibilityfor the interaction to take place. A superpositionbetween two states where one of them interacts (when the object is there) and the other does notis an important ingredience in the set up.

The example discussed here shows that alternative ways to collect information with quantummechanical methods is a possibility. Quantum coherence or superposition is important for suchmethods to work. Also the importance of using “intelligent ways” to address the measuringproblem, instead of a (naive) direct measurement is emphasized.

3.2 From bits to qubits

In the classical information theory developed by C. Shannon information is quantified in termsof discreteunits of information. Thus, the elementary information unit is abit, and this is viewedas a function which can take two possible values, normally the numerical values0 and1. Theinformation lies in specifying which of the two values to asign to the bit. The idea is that ageneral message, to an arbitrary good precision, can be expressed in terms of a finite sequenceof bits.

The idea ofquantizing informationcreates the basis for the general theory of information. Butas we all know it also creates the basis for a practical approach to information and communicationin the form of digital signals. Whereas communication (by telephone, radio or TV) used to be inthe form ofanalog signals, presently the use of digital signals are preferred because this admitsa more precise determination (and correction) of the information content of the signal.

Information theory can be viewed (and is normally so) as a mathematical discipline. How-ever, from a physics point of view, it is natural to focus on the implementation of the theoreticalideas in terms of physical signals. Thus the information will normally be coded into signals thatare created and manipulated in physical (electronic) devices. They are transmitted by physicalmediators (electromagnetic waves or electric signals) and are again manipulated and decoded in(electronic) receivers.

A message consisting of a certain number of bits can be viewed as astateof a physicalsystem. WithN bits there are2N states, which represent all the different messages that can beencoded in theN bits. In this picture the factorization of the message into single bits correspondsto a separation of the physical system intoN two-state subsystems. Thus, the information unitbit corresponds to thetwo-state systemas a physical unit. Such a system can be realized in manyways, as a physical system that can easily be switched between two stable states, as an electricsignal with only two allowed values (“on” or “off”) etc.

The important point to note is that the two-state system considered in this way is aclassicalsystem. And the interesting question which has been addressed in recent years is whether quan-tum physics should introduce a new picture of the (physical) unit of information. The classicaltwo-state system has its counterpart in the quantum two-state or two-level system, and for the

Page 72: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

72 CHAPTER 3. QUANTUM PHYSICS AND INFORMATION

quantum system a new feature is thatcoherent superpositionsbetween states are possible. In thesame way as the classical two-state system is associated with a bit of information, the quantumtwo-level system is associated with a new information unit, aqubit. While the possible values ofa bit is restricted to0 and1, the qubit takes values in a two-dimensional Hilbert space spannedby two vectors|0〉 and|1〉. Thus a general qubit state is

|q〉 = α|0〉+ β|1〉 (3.5)

with α andβ as complex coefficients.Let us therefore assume that a message in this new picture is encoded not in a classical

state of a system, but in a quantum state of a finite-dimensional Hilbert space. Such a Hilbertspace is unitarily equivalent to a tensor product ofN two-level systems, provided we restrict thedimension of the Hilbert space toM = 2N . In this sense we can view the qubit as the elementarybuilding block of information. Note however that the general state is not a product state of thequbits, since also superpositions should be included. This means that the general state involvesentanglementbetween the qubits.

Apparently there is much more than one bit of information contained in each qubit, since thequbit states form a continuum that interpolate between the “classical” states|0〉 and|1〉. However,one should be aware of the fact that even if more information is contained in thespecificationof the qubit state, this information cannot be read out by making a measurement on the qubit.This is due to the probabilistic interpretation of the state. One may compare this to a situationwith a (classical) probability distribution over the two states0 and 1. Since the probabilitydistribution depends on a continuous parameter, much more than one bit of information is neededto specify the value of this parameter. However, since each classical two-state system can onlybe in the states0 or 1, an ensemble of these systems is needed to provide the information aboutthe probability distribution.

In the case of qubits the situation is somewhat similar, but not completely so. Unlike aclassical probability distribution the superposition of states is a resource that can be used forsome types of information processing. This has been demonstrated by specific examples.

3.3 Communication with qubits

New possibilities open up in communication when we can exploit quantum interference andquantum entanglement. We show here a simple example ofdense codingof information with thehelp of entangled qubits.

Let us assume that a sender A (often referred to asAlice) wants to send a two-bit messageto a receiver B (referred to asBob). The question that is posed is whether this can be done bytransmitting a single qubit, since the claim is that a qubit carries more information than a bit. Theapparent answer is no: If Alice prepares the the qubit in a pure state and sends it to Bob, he canread out the information by a measuring the state in a given basis (corresponding to measuringthe spin component in some direction). The result is0 or 1, where the probability for gettingthese two results is determined by the decomposition of the prepared state on the the two basisstates|0〉 and|1〉. It seems that the best they can do in order to send the meassage is to agree on

Page 73: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

3.3. COMMUNICATION WITH QUBITS 73

0

1

q

Figure 3.2: The physical states of a qubit can be viewed as points on a sphere. The poles of thesphere correspond to the two classical one-bit states0 and1

what basis to use. Then Alice can choose between two possible states|0〉 and|1〉 and Bob candetermine which of the states is chosen by making a measurement in the same basis as used byAlice. But in this way a qubit can communicate only one bit of information.

However, a more intelligent way to do it exists. Let us assume that Alice and Bob in advancehave shared a pair of qubits with maximum entanglement. They may for example be in the state

|c,+〉 =1√2(|00〉+ |11〉) (3.6)

We assume the qubits are kept in a safe way so that the entanglemet is kept unchanged until thequbits are used for communicating the message.

The four two-bit messages can be associated with a set of four states, denoted00, 01, 10 and11, by assigning one of the states to each of the messages. Alice now encodes the chosen messageto her (entangled) qubit by making a transformation on it. The transformation is determined bythe message in the following way,

00 → 1⊗ 1 |c,+〉 = |c,+〉01 → iσz ⊗ 1 |c,+〉 = i |c,−〉10 → iσx ⊗ 1 |c,+〉 = i |a,+〉11 → iσy ⊗ 1 |c,+〉 = |a,+〉 (3.7)

In the first case no change is made to her qubit, in the second case a rotation ofπ around thez-axis is performed, in the subsequent cases rotations around they-axis andz-axis are performed.(We here envisage the qubit states as spin states.) We note that in all cases no change is done tothe B-qubit, and in all cases the maximal entanglement is kept by transforming the original stateinto another Bell state.

Alice now transmits her qubit to Bob who is free to make measurements on both entangledqubits. We note that the four different (two-bit) messages are encoded in the four orthogonalBell states. Bob can determine which one is chosen by Alice by measuring the (eigen)value of

Page 74: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

74 CHAPTER 3. QUANTUM PHYSICS AND INFORMATION

an observable which has the Bell states as eigenstates (assuming different eigenvalues for thefour states). If Bob in advance has been informed about the key to decode the message from theBell states, then he will obtain the full two-bit message from the measurement.

In this way Alice has managed to transfer the two-bit message to Bob by encoding the mes-sage into a single qubit which is afterwards transmitted to Bob. The second qubit, belonging toBob is not affected by the manipulations performed by Alice. Also note that the original entan-gled state contains no information about the message. Nevertheless, the message is read out bymaking a measurement on both qubits.

The entanglement is essential for being able to transmit the full message by a single qubit. Infact, if we consider Alice’s qubit separately, it is all the time in the same state, described by thereduced density matrix

ρA =1

2(|0〉〈0|+ |1〉〈1|) (3.8)

This means thatall the information is contained in the entanglement between the two qubits,since no information lies in the reduced states of the two qubits. From this we also note theimportant point: The message can only be read by the receiver who has the second qubit of theentangled pair. In this way the message is protected from others than Bob. This principle ofprotecting information by encoding the information into entangled pairs of qubits is the basisfor quantum cryptography, which is a field of research that has been rapidly developed in recentyears.

3.4 Principles for a quantum computer

The probably most interesting suggestion for application of quantum physics to informationtechnology is in the form ofquantum computers. The idea is that a computer build on quantumprinciples will manipulate information in a qualitatively different way than a classical computerand thereby solve certain types of problems much more efficiently.

A type of problem that is most interesting for physicists issimulation of quantum systems.Today numerical solutions of physics problems are important for research in almost any fieldof physics, but the capacity of present day computers gives a clear limitation to the size of theproblem that can be solved. Thus, a quantum system withN degrees of freedom has a Hilbertspace with dimensionmN , if each degree of freedom is described by anm-dimensional space.This means thatmN complex parameters are needed to specify a Hilbert space vector, and thenumber of variables to handle therfore grows exponentially withN .

The idea is that a quantum computerworks as a quantum mechanical system, with the com-putation performed by unitary transformation on superposition of qubit states. For the simulationof quantum systems there is an obvious gain, since the number of qubits needed to represent thewave function grows linearly with the number of degrees of freedomN rather than exponentially.In addition to the simulation of quantum systems there are also certain other types of mathemat-ical problems that can be solved more efficiently with the use of quantum superposition. Twoalgorithms that have gained much interest are theShor algorithmfor factorizing large numbersand theGrover algorithmfor making efficient search through data bases.

Page 75: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

3.4. PRINCIPLES FOR A QUANTUM COMPUTER 75

The typical feature of a quantum computer is to work with superpositions of (qubit) states.From a computational point of view this can be seen as new type of(quantum) parallel comput-ing. In the picture of path integrals we may view a classical computation as a (classical) path,where each logical operation corresponds to making a (new) direction for the path. Parallel pro-cessing in this picture corresponds to working simultaneously with several paths in the “space oflogical operations”. In the quantum computer many paths are, in a natural way, involved at thesame time in the form of a superposition of states, and the final result is obtained by quantum me-chanical interference between contributions from all the (classical) paths. Clearly, if a problemshould be solved much more efficiently on a quantum computer than on a classical computer,superposition of states has to be used extensively in the computation. This means that the qubitswill be highly entangled during the computation. The serious challenge for constructing a quan-tum computer is therefore to be able to preserve and operate on such highly entangled states.

3.4.1 A universal quantum computer

The idea of a universal quantum computer is similar to that of a universal classical computer.Thus, a universal quantum computer is designed to solve general types of problems by reducingthe computation to elementary qubit operations. This means that the input wave function isencoded in a set of (input) qubits, and the computational program acts on these by performinglogical operationsin the form of unitary transformations on the qubits. A standard set of unitaryone-qubitandtwo-qubitoperations are used, where each operation is performed at a logical gate.Together the logical gates form acomputational networkof gates.

In Fig. 3.4.1 a schematic picture of a universal quantum computer is shown. The input datais encoded by preparation of an input quantum state. The computer program acts on the inputstate and by a unitary transformation produces an output state where the result of the computa-tion is read out by a quantum measurement. The computational task is specified partly by thetransformation performed on the state and partly on how the measurement is performed.

This picture of a quantum computer is quite analogous to that of a classical computer, wherebits of information are processed at logical gates that together form a logical network. The maindifference is that in the quantum computer the information is processed as quantum superpo-sition between states. And the computation isreversiblesince the unitary transformations areall invertible. This is different from a (standard) classical computer where some of the standardlogical operations are irreversible in the sense that the mapping between the input state and theoutput state is not one-to-one.

A universal set of logical gatesThe idea of a universal quantum computer is based on the possibility of reducing any unitarytransformation that acts on the quantum states of anN qubit system to a repeated operation ofa small number of standard one-qubit and two-qubit transformations. We will here only outlinehow such a reduction can be performed. It involves the following steps,

1. A unitary transformation acting on a finitedimensional Hilbert space can be factorized in

Page 76: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

76 CHAPTER 3. QUANTUM PHYSICS AND INFORMATION

q1

q2

q3

q4

q5

Ψin

preparation measurement

Ψout

unitary transformations

Figure 3.3: A schematic picture of a universal quantum computer. An input state is preparedas a quantum state of a set of qubits. A network of logical gates, that perform one-qubit andtwo-qubit transformations, operates on the input qubits (and a set of additional work qubits) toproduce an output state. The result of the computation is read out by measuring the state of eachoutput qubit. In the diagram the horisontal lines represent the qubits and the boxes represent thegates or logical operations performed on the qubits.

terms oftwo-levelunitary transformations. These transformations act on two-level subsys-tems spanned by orthonormalized vectors of a common basis in the full Hilbert space,

U =∏n

U(in, jn) (3.9)

wherein andjn denotes the basis states affected by thenth transformation

2. A two-level unitary transformation acting between basis states of the N-qubit Hilbert spacecan be factorized in terms of a set of one-qubit and two-qubit operations. If the number ofterms in the factorization is finite, a general unitary transformation can only be representedin an approximate form. Thus, the continuous parameters of the unitary transformation isreplaced by a set of finite values.

The following one-qubit and two-qubit transformations give an example of a universal set ofqubit operations,

1. The Hadamard transformation.This is a single-qubit transformation defined by the operations on the qubit states in thefollowing way,

H |0〉 =1√2(|0〉+ |1〉)

H |1〉 =1√2( |0〉 − |1〉) (3.10)

Page 77: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

3.4. PRINCIPLES FOR A QUANTUM COMPUTER 77

H S T

Hadamard Phase π/8

CNOT

Figure 3.4: Symbolic representation of logical gates. In the representation of the CNOT gate theupper line corresponds to the control qubit and the lower line to the target qubit

In matrix form this is,

H =1√2

(1 11 − 1

)(3.11)

with |0〉 corresponding to the to the upper place and|1〉 to the lower place of the matrix.

2. The Phase transformation.This is also a single-qubit transformation, defined by

S |0〉 = |0〉S |1〉 = i |1〉 (3.12)

which in matrix form is

S =(

1 00 i

)(3.13)

3. Theπ/8 transformation.This is the third single-qubit transformation. It is defined by

T |0〉 = |0〉T |1〉 = eiπ/4|1〉 (3.14)

with the matrix form

T =(

1 00 eiπ/4

)(3.15)

4. TheCNOT transformation.This is a two-qubit transformation, which is a quantum version of acontrolled notopera-tion. It is defined by

CNOT |0〉 ⊗ |0〉 = |0〉 ⊗ |0〉

Page 78: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

78 CHAPTER 3. QUANTUM PHYSICS AND INFORMATION

CNOT |0〉 ⊗ |1〉 = |0〉 ⊗ |1〉CNOT |1〉 ⊗ |0〉 = |1〉 ⊗ |1〉CNOT |1〉 ⊗ |1〉 = |1〉 ⊗ |0〉 (3.16)

We note that the state of the first qubit is left unchanged. It acts as acontrol qubiton thesecond qubit: If the first qubit is in the state|0〉 the second qubit is left unchanged. Ifthe first qubit is in the state|1〉 the second qubit switches state|0〉 ↔ |1〉. With the basisvectors of the product space written in matrix form,

|0〉 ⊗ |0〉 →

1000

, |0〉 ⊗ |1〉 →

0100

|1〉 ⊗ |0〉 →

0010

, |1〉 ⊗ |1〉 →

0001

(3.17)

theCNOT operation corresponds to the following4× 4 matrix,

CNOT =

1 0 0 00 1 0 00 0 0 10 0 1 0

(3.18)

The definitions above give the action of qubit operations on a set of basis vectors for thesingle-qubit and-two qubit spaces. With specification of the qubits involved, the action of theseoperators on a complete set of basis vectors for the fullN - qubit space is determined, and therebythe action of the operators on any state vector, by the principle of linear superposition.

3.4.2 A simple algorithm for a quantum computation

As already mentioned, certain algorithms have been designed for solving mathematical problemsmore efficiently on a quantum computer than can be done on a classical computer. A famous ex-ample isShor’s algorithm, that adresses the question of how to factorize large numbers. Thisis a hard problem on a classical computer, since the computational time used to factorize largenubers will in general increase exponentially with the number of digits. (This fact is the basis formaking use of factorization of large numbers as keys in crytographic schemes.) The demonstra-tion by P. Shor that the factorization can be done more efficiently on a quantum computer is oneof the reasons for the boost of interest for quantum computing over the last decade.

In this section we will focus on a simpler algorithm introduced by D. Deutsch some timeago. The intention is to use this algorith as a simple demonstration of how superposition makesit possible to adress certain types of problems more efficiently.

Page 79: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

3.4. PRINCIPLES FOR A QUANTUM COMPUTER 79

The problem adressed is to studyone-bit functions. Such a function gives a mapping

f(x) : 0, 1 → 0, 1 (3.19)

We will need to add these functions, and note that addition between one-bit numbers can bededuced from ordinary addition if the result is definedmodulo 2. Thus, the explicit addition ruleis

0 + 0 = 0 , 0 + 1 = 1 + 0 = 1 , 1 + 1 = 0 (3.20)

We also note that there exist four different one-bit functions (3.19),

fa : 0, 1 → 0, 1 , fb : 0, 1 → 1, 0fc : 0, 1 → 0, 0 , fd : 0, 1 → 1, 1 (3.21)

where the sets of input values and output values are here considered asordered sets. We distin-guish between two types of functions, the setA = fa, fb are theinvertiblefunctions, while thesetA = fa, fb areconstantfunctions.

Let us assume that a functionf(x) is known onlyoperationally, i.e., in a classical computa-tion one assigns to the input variablex one of the two possible values0 and1, from which oneof the two possible output valuesf(0) andf(1) are produced. (We may think of the functionas ablack boxthat produces the output result from a given input.) Initially the functionf(x) isnot known, and as a concrete problem we want to determine whether the function belongs to setA or set B. Thus, we do not seek a detailed knowledge of the function. However, by a classicalcomputations, as explained above, we clearly have to use two input valuesx = 0 andx = 1 inorder to determine which set the function belongs to, which means that a complete determinationof the function has to be done.

The intention is to show that a quantum computation, which makes use of superpositions,can distinguish between set A and set B in one operation. We first have to discuss how such aquantum computation can be implemented. Let us consider the possibility of simpel extensionof the classical operation to a linear transformation, that acts on a qubit in the following way,

Tf : |q〉 = α|0〉+ β|1〉 → α|f(0)〉+ β|(f(1)〉 (3.22)

This is considered as a single (qubit) operation even if both resultsf(0) andf(1) are presentin the output state. However, for the non-invertible functions (fc andfd) the operationTf isnon-invertible and therefore non-unitary. In order to representf as a unitary transformation wetherefore modify the one-bit operationTf to a unitary two-qubit operationUf in the followingway

Uf : |x〉 ⊗ |y〉 → |x〉 ⊗ |y + f(x)〉 (3.23)

where|x〉 and|y〉 denote standard qubit basis states (|0〉 and|1〉). We note that the first qubit (|x〉)is left unchanged by the transformation; it acts as a control qubit on the second qubit. Thus, if

Page 80: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

80 CHAPTER 3. QUANTUM PHYSICS AND INFORMATION

0

1

H

H

H

Uf

Figure 3.5: Schematic representation of qubit transformations for Deutsch’s algorithm. The firstqubit is prepared in the|0〉 state and the second qubit in the|1〉 state. They are both transformedby a Hadamardoperation before a two-qubit operation is performed. This operation,Uf , isdetermined by the (unknown) one-bit functionf(x). Finally a secondHadamardtransformationis performed on the first qubit before a measurement is performed.

f(x) = 0 the state of the second qubit is left unchanged, iff(x) = 1 the state of the second qubitis flipped (0 ↔ 1). It is straight forward to check that (3.23) defines a unitary transformation.

We now consider the computation is performed that is shown in diagrammatic form in Fig. 3.4.2.It corresponds to the following sequence of unitary transformations

|0〉 ⊗ |1〉 → 1√2(|0〉+ |1〉)⊗ 1√

2(|0〉 − |1〉)

→ 1√2((−1)f(0)|0〉+ (−1)f(1)|1〉)⊗ 1√

2(|0〉 − |1〉)

→ 1

2√

2

[((−1)f(0) + (−1)f(1))|0〉

+((−1)f(0) − (−1)f(1))|1〉]⊗ 1√

2(|0〉 − |1〉)

≡ |q1〉 ⊗ |q2〉 (3.24)

We note that the for the first qubit there are two possible final states depending on whetherf(0)andf(1) are equal or different. Thus,

f(0) + f(1) = 0 ⇒ |q1〉 = (−1)f(0)|1〉f(0) + f(1) = 1 ⇒ |q1〉 = (−1)f(0)|0〉 (3.25)

This implies that by making a measurement on this qubit, which projects it either to the state|0〉or |1〉, we can decide the value off(0)+f(1). This does not fully determine the functionf(x), butit determines whether it belongs to set A (withf(0)+f(1) = 1) or set B (withf(0)+f(1) = 0).This demonstrates the point that the use of quantum superpositions makes it possible to performin one operation what would normally correspond to two classical computations.

This use of superposition is what we have referred to asquantum parallel processing. Thereis a close relation between the evaluation discussed here, where the quantum state contains in-formation about both functional valuesf(0) andf(1), and the interaction-free measurement dis-cussed earlier, where the state vector contains information about both the paths that the photonmay follow.

Page 81: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

3.4. PRINCIPLES FOR A QUANTUM COMPUTER 81

The example given by Deutsch’s algorithm may be too simple to convincingly justify theclaim that the quantum computation is more efficient than the classical computation. Obviouslythe unitary transformationUf corresponding to the functionf(x) has to be supplemented byother qubit operations and this does not make it completely clear that there is a net gain. Theimportant point to make is that only one evaluation which involvesf(x) has been done ratherthan two. For the other algorithms mentioned one can demonstrate more explicitly the gain byshowing that the number of qubit operations scales in a different way than the number of opera-tions in a classical computer. This makes it clear that quantum parallelism may indeed speed upcertain types of calculations.

3.4.3 Can a quantum computer be constructed?

The considerations on how a quantum computer may more efficiently solve certain types ofproblems makes it a very interesting idea. But is it feasable that this idea can be implimentedin the form of a real physical computer? The difficulties to overcome are extremely demanding.At present qubit operations with a small number of qubits may be performed, but the idea thatthousands and thousands of qubits work coherently together at the quantum level is at this stagean attractive dream. Some people working in the field are rather pessimistic that the necessarycontrol of the quantum states can in reality be made. In particular the problem ofdecoherenceisextremely demanding, although algorithms for correcting quantum states that are modified dueto decoherence have been suggested.

But other workers in the field remain optimistic, and at the level of making controlled quan-tum operations on a few qubits there has been an impressive progress. There is in fact a com-petition between different types of realizations of physical qubits. In the context of electronicsystems interesting developments are based on the use of electronic spin as the two-level variable.In the context of quantum optics the use of trapped two-level atoms or ions has been extensivelystudied. One particularly interesting application is in the form ofoptical lattices, where a col-lection of laser beems are used to trap atoms in a periodic potential, where the lasers are usedto adress the atoms in the form of one-qubit operations and where where they are used to let theatoms interact in the form of two-qubit operations.

One should note that even if the construction of auniversalquantum computer at this stagemay be far away in time, there may be partial goals that can be more readily achieved. Ideasof quantum cryptography have already been implemented, and in the field of computation aquantum simulatormay be a much closer goal than a universal computer. Recent developmentssuggests that the use of optical lattices may give a realistic approach towards this goal, and thesimulation of quantum spin lattices in this way may be a possibility in a not too distant future.

Page 82: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

82 CHAPTER 3. QUANTUM PHYSICS AND INFORMATION

Page 83: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

Chapter 4

The quantum theory of light

In this chapter we study the quantum theory of interaction between light and matter. Historicallythe understanding how light is created and absorbed by atoms was central for development ofquantum theory, starting with Planck’s revolutionary idea ofenergy quantain the description ofblack body radiation. Even if now the theoretical understanding of the quantum theory of lightis embedded within the more completequantum field theorywith applications to the physicsof elementary particles, the quantum theory of light and atoms has continued to be importantand to be developed further. We will study the quantum description of the free electromagneticfield and the description ofsingle photon processes. But we will also examine somecoherentmany-photon processes that are important in the application of quantum optics today.

4.1 Classical electromagnetism

The Maxwell theory of electromagnetism is the basis for the classical as well as the quantumdescription of radiation. With some modifications due togauge invarianceand to the fact thatthis is afield theory(with an infinite number of degrees of freedom) the quantum theory can bederived from classical theory by the standard route ofcanonical quantization. In this approachthe natural choice of generalized coordinates correspond to thefield amplitudes.

In this section we make a summary of the classical theory and show how a Lagrangian andHamiltonian formulation of electromagnetic fields interacting with point charges can be given.At the next step this forms the basis for the quantum description of interacting fields and charges.

4.1.1 Maxwell’s equations

Maxwell’s equations (in Heaviside-Lorentz units) are

∇ · E = ρ

∇×B− 1

c

∂E

∂t=

1

cj

∇ ·B = 0

83

Page 84: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

84 CHAPTER 4. THE QUANTUM THEORY OF LIGHT

∇× E +1

c

∂B

∂t= 0 (4.1)

with E andB as the electric and magnetic field strengths andρ and j as the charge densityand current density. The equations are invariant under the relativistic Lorentz transformations,and this symmetry is manifest in a covariant formulation of the equations. In the covariantformulation the electric and magnetic fields are combined in theelectromagnetic field tensorF µν , where the indicesµ andν run from 0 to 3. µ = 0 represents in the usual way the timecomponent of the space-time coordinates andµ = 1, 2, 3 the space components.1 Writtenas a matrix the field tensor takes the form (when the first line and row correspond to the 0’thcomponent),

F =

0 Ex Ey Ez

− Ex 0 Bz −By

− Ey −Bz 0 Bx

− Ez By −Bx 0

(4.2)

The dual field tensoris defined byF µν = (1/2)εµνρσFρσ, whereεµνρσ is the four-dimensionalLevi-Civita symbol, which is anti-symmetric in permutation of all pairs of indices, and withthe non-vanishing elements determined by the conditionε0123 = +1. Note that in this and otherrelativistic expressions we applyEinstein’s summation conventionwith summation over repeated(upper and lower) indices. In matrix form the dual tensor is

F =

0 Bx By Bz

−Bx 0 − Ez Ey−By Ez 0 − Ex−Bz − Ey Ex 0

(4.3)

When written in terms of the field tensors and the and the 4-vector current densityjµ = (c ρ, j)Maxwell’s equations gets the following compact form

∂νFµν =

1

cjµ

∂νFµν = 0 (4.4)

The Maxwell equations are, except for the presence of source terms, symmetric under theduality transformationF µν → F µν , which corresponds to the following interchange of electricand magnetic fields

E → B , B → −E (4.5)

1We shall use the relativistic notation where the metric tensor has the diagonal matrix elementsg00 = −1, gii =+1, i = 1, 2, 3. This means that there is a sign change when raising or lowering the time-index but not a space index,in particularxµ = (ct, r) andxµ = (−ct, r). We also use the short hand notations for differentiation,∂µ = ∂

∂xµ ,and when convenient∂µφ = φ,µ and∂νAµ = Aµ,ν .

Page 85: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

4.1. CLASSICAL ELECTROMAGNETISM 85

In fact, the equations for the free fields are invariant under a continuouselectric-magnetic trans-formation, which is an extension of the discrete duality transformation,

E → cos θE + sin θB

B → − sin θE + cos θB (4.6)

This symmetry is broken by the source terms of the Maxwell equation.Formally the symmetry can be restored by introducingmagnetic chargeandmagnetic current

in addition to electric charge and current. In such an extended theory there would be two kindsof sources for electromagnetic fields, electric charge and magnetic charge, and in the classicaltheory such an extension would be unproblematic. However for the corresponding quantumtheory, some interesting complications would arise. The standard theory is based on the use ofelectromagnetic potentials, which depend on the “magnetic equations” being source free.

A consistent quantum theory can however be formulated with magnetic charges, providedboth electric and magnetic charges are associated with point particles and provided they satisfyaquantization condition.2 This is interesting, since such a consistence requirement would give afundamental explanation for the observed quantization of electric charge. Theoretical considera-tions like this have, over the years, lead to extensive searches for magnetic charges (ormagneticmonopoles), but at this stage there is no evidence for the existence of magnetic charges in nature.

We will here follow the standard approach and assume the Maxwell equations (4.1) withoutmagnetic sources to be correct. This admits the introduction of electromagnetic potentials

E = −∇φ− 1

c

∂tA , B = ∇×A (4.7)

In covariant form this is written as

F µν = ∂µAν − ∂νAµ ≡ Aν,µ − Aµ,ν (4.8)

whereAµ are the components of the four potential,A = (φ,A). When the fieldsE andB areexpressed in terms of the potentials, the source free Maxwell equations are automatically fulfilledand the inhomogeneous equations get the covariant form

∂ν∂νAµ − ∂µ∂νA

ν = −1

cjµ

(4.9)

Separated in time and space components the equations are

∇2φ+∂

∂t∇ ·A = −ρ

∇2A−∇(∇ ·A)− 1

c2∂2

∂t2A− 1

c

∂t∇φ = −1

cj (4.10)

2Dirac’s quantization conditionrelates the strength of the fundamental electric chargee to the fundamnetalmagnetic chargeg in the following way, eg/(4πhc) = 1/2. Since the (electric) fine structure constant is small,αe = e2/(4πhc) ≈ 1/137, the quantization condition gives a correspondinlylarge value to the magnetic finestructure constant,αg = g2/(4πhc) = (eg/(4πhc))2 × [e2/(4πhc)]−1 ≈ 137/4 .

Page 86: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

86 CHAPTER 4. THE QUANTUM THEORY OF LIGHT

4.1.2 Field energy and field momentum

Maxwell’s equations describe how moving charges give rise to electromagnetic fields. The fieldson the other hand act back on the charges, through theLorentz force, which for a pointlike chargeq has the form

F = q(E +v

c×B) (4.11)

When the field is acting with a force on a charged particle this implies that energy and momentumis transferred between the field and the particle. Thus, the electromagnetic field carries energyand momentum, and the form of the energy and momentum density can be determined fromMaxwell’s equation and the form of the Lorentz force, by assumingconservation of energyandconservation of momentum.

To demonstrate this we consider a single pointlike particle is affected by the field, and thecharge and momentum density therefore can be expressed as

ρ(r, t) = q δ(r− r(t))

j(r, t) = q v(t) δ(r− r(t)) (4.12)

wherer(t) andv(t) describe the posision and velocity of the particle of the particle as functionsof time. The time derivative of the field energyE will (up to a sign change) equal the effect ofthe work performed on the particle

d

dtE = −F · v = −qv · E (4.13)

For a (large) volumeV with surfaceS we rewrite this relation by use of the field equations,

d

dtE = −

∫Vd3r j · E

= −∫Vd3r

[cE · (∇×B)− E · ∂E

∂t

]=

∫Vd3r

[E · ∂E

∂t+ B · ∂B

∂t− c∇ · (E×B)

]=

d

dt

1

2

∫Vd3r

[E2 +B2

]− c

∫S(E×B) · dS (4.14)

where in the last step surface terms a part of the volume integral has been rewritten as a surfaceintegral by use og Gauss’ theorem. The two terms have a direct physical interpretation, withthe first term corresponds to the time derivative of the field energy within the volumeV and thesecond term corresponds to the energy current through the surfaceS. This gives the standardexpressions for the energy density

u =1

2

[E2 +B2

](4.15)

Page 87: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

4.1. CLASSICAL ELECTROMAGNETISM 87

and the energy current density

S = cE×B (4.16)

which is also called thePointing vector.In a similar way we consider the change in the field momentumP ,

d

dtP = −F = −q(E +

v

c×B) (4.17)

This we re-write as

d

dtP = −

∫d3r

[ρE +

1

cj×B

]= −

∫d3r

[E (∇ · E) + (∇×B− 1

c

∂tE)×B

]= −

∫d3r

[− (∇× E)× E + (∇×B− 1

c

∂tE)×B

]=

∫d3r

[− 1

c

∂tB× E +

1

c

∂tE×B

]=

d

dt

∫d3r

1

cE×B

(4.18)

where we here have taken the infinite volume limit,V →∞, and have assumed that in this limitall surface contributions vanish. The expression for the field momentum density is then

g =1

cE×B (4.19)

which up to a factor1/c2 is identical to the energy current density. In the relativistic formulationthe energy density and the momentum density are combined in the symmetricenergy-momentumtensor,

T µν = −(F µρF νρ +

1

4gµνFρσF

ρσ) (4.20)

The energy density corresponds to the time componentT 00 and the momentum density (timesc)to the componentT 0i, i = 1, 2, 3.

4.1.3 Lagrange and Hamilton formulations of the classical Maxwell theory

The Lagrangian densityMaxwell’ equation can be derived from a variational principle, similar to Hamilton’s principlefor a mechanical system. Since the variable is afield, i.e., a function of bothr andt, the action isan integral over space and time, of the form

S[Aµ(r, t)] =∫Ω

d4xL(Aµ, Aµ,ν) (4.21)

Page 88: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

88 CHAPTER 4. THE QUANTUM THEORY OF LIGHT

whereL is the Lagrangian densityand Ω is the chosen space-time region. The Lagrangiandensity is alocal function of the field variableAµ and its derivatives,

L = −1

4FµνF

µν +1

cAµjµ (4.22)

whereF µν = Aν,µ − Aµ,ν should be treated as a derivative of the field variable.A dynamical field,i.e., a solution of Maxwell’s equations for given boundary conditions (on

the boundary ofΩ), corresponds to a solution of the variational equation,

δS = 0 (4.23)

where this condition should be fulfilled forarbitrary variations in the field variables, with fixedvalues on the boundary ofΩ. The equivalence between this variational equation and the differen-tial (Maxwell) equations is shown in the same way as for the equation of motion of a mechnicalsystem, with a discrete set of coordinates.

The variation inS is, to first order in the variation of the field variableAµ, given by

δS =∫Ω

d4x[ ∂L∂Aµ

− ∂ν( ∂L∂Aµ,ν

) ]δAµ (4.24)

where in this expression we have made a partial integration and used the fact thatδAµ vanisheson the boundary ofΩ. Since the variation should otherwise be free, the variational equation(4.23) is satisfied only if the field satisfies the Euler-Lagrange equation

∂L∂Aµ

− ∂ν( ∂L∂Aµ,ν

)= 0 (4.25)

With L given by (4.22) it is straight forward to check that the Euler-Lagrange equation repro-duces the inhomogeneous Maxwell equations.

The canonical field momentumThe Lagrange and Hamilton formulations give equivalent descriptions of the dynamics of a me-chanical system. Also in the case of field theories, a Hamiltonian can be derived from the La-grangian density, but with electromagnetism there are some complications to be dealt with. Wefirst note that the Lagrangian formulation is well suited to fit the relativistic invariance of thetheory, since it makes no difference in the treatment of the space and time coordinates. That isnot so for the Hamiltonian formulation, whichdoesdistinguish the time direction. This carriesover to the (standard) quantum description, where time is distinguished in the Schrodinger equa-tion. This does not mean that relativistic invariance cannot be incorporated in the Hamiltoniandescription, but it is not as explicit as in the Lagrange formulation.

To define the Hamiltonian we first need to identify a set of independent variables and theircorresponding canonical momenta. As field variables we tentatively use the potentialAµ(r),wherer can be viewed as a continuous andµ as a discrete index for the field coordinate. (Fora mechanical systemr andµ correspond to the discrete indexk which labels the independent

Page 89: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

4.1. CLASSICAL ELECTROMAGNETISM 89

generalized coordinatesqk.) Like the field variable, the canonical momentum now is afield,πµ = πµ(r). It is defined by

πµ =∂L∂Aµ

=1

c

∂L∂Aµ,0

(4.26)

We note here the different treatment of the time and space coordinate.The Lagrangian density (4.22) gives the following expression for the canonical field momen-

tum

πµ =1

c

∂Aµ,0

[− 1

2(Aµ,νA

µ,ν − Aµ,νAν,µ)− 1

cAµjµ

]=

1

cF µ0 (4.27)

The space part (µ = 1, 2, 3) is proportional to the electric field, but note that the time componentvanishes,π0 = 0. Also note that thecanonicalmomentum defined by the Lagrangian is this wayis not directly related to thephysicalmomentum carried by the electromagnetic field, as earlierhas been found from momentum conservation.

The fact that there is no canonical momentum corresponding to the field componentA0 indi-cates that the choice we have made for the generalized coordinates of the electromagnetic field isnot complitely satisfactory. The field amplitudesAµ cannot all be seen as describing independentdegrees of freedom. This has to do with the gauge invariance of the theory, and we proceed todiscuss how this problem can be handled.

Gauge invariance and gauge fixing

We consider the following transformation of the electromagnetic potentials

A → A′ = A + ∇χ , φ→ φ′ = φ+1

c

∂tχ (4.28)

whereχ is a (scalar) function of space and time. In covariant form it is

Aµ → Aµ′ = Aµ + ∂µχ (4.29)

This is agauge transformationof the potentials, and it is straight forward to show that sucha transformation leaves the field strengthsE andB invariant. The usual way to view the in-variance of the fields under this transformation is that it reflects the presence ofnon-physicaldegrees of freedomin the potentials. The potentials define an overcomplete set of variables forthe electromagnetic field. For the Hamiltonian formulation to work it is necessary to identify theindependent, physical degrees of freedom.

There are two different constraints, or gauge conditions, that are often used to remove theunphysical degrees of freedom associated with gauge invariance. The Coulomb (or radiation)gauge condition is

∇ ·A = 0 (4.30)

Page 90: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

90 CHAPTER 4. THE QUANTUM THEORY OF LIGHT

and the Lorentz (or covariant) gauge condition is

∂µAµ = 0 (4.31)

The first one is often used when the interaction between radiation and atoms is considered, sincethe electrons then move non-relativistically and therefore there is no need for a covariant formof the gauge condition. This is the condition we will use here. The Lorentz gauge condition isoften used in the description of interaction between radiation and relativistic electrons and othercharged particles. The relativistic form of the field equations then are not explicitely broken bythe gauge condition, but the prize to pay is that in the quantum description one has to includeunphysical photon states.

Hamiltonian in the Coulomb gaugeIn Coulomb gauge Maxwell’s (inhomogeneous) equations reduce to the following form

∇2φ = −ρ (4.32)

(1

c2∂2

∂t2−∇2)A =

1

cjT (4.33)

Where

jT = j− ∂

∂t∇φ (4.34)

is the transversecomponent of th current density.3 It satisfies the transversality condition as aconsequence of the continuity equation for charge,

∇ · jT = ∇ · j− ∂

∂t∇2φ

= ∇ · j +∂

∂tρ

= 0 (4.35)

We note that the equation for the scalar fieldφ contains no time derivatives, and can be solved intems of the charge distribution,

φ(r, t) =∫d3r′

ρ(r′, t)

4π|r− r′|(4.36)

This is the electrostatic (Coulomb) potential of astationarycharge distribution which coincideswith the true charge distributionρ(r, t) at time t. Note that the relativistic retardation effectsassociated with the motion of charges are not present, and the vector potentialA is thereforeneeded to give the correct relativistic form of the electromagnetic field set up by the charges.

3Any vector fieldj(r) can be written asj = jT + jL, where∇ · jT = 0 and∇ × jL = 0. jT is referred to asthetransverseor solenoidalpart of the field andjL as thelongitudinalor irrotational part of the field. In the presentcase, withjL = −∇φ the irrotational form ofjL follows directly, while the transversality ofjT follows from chargeconservation.

Page 91: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

4.1. CLASSICAL ELECTROMAGNETISM 91

Theφ-field carries no independent degrees of freedom of the electromagnetic field, it is fullydetermined by the position of the charges. The dynamics of the Maxwell field is then carriedsolely by the vector potentialA. In the quantum description the photons are therefore associatedonly with the field amplitudeA and not withφ.4

Expressed in terms of the dynamicalA field the Lagrangian density gets the form

L = −1

4FµνF

µν +1

cjµA

µ

=1

2c2A2 +

1

cA ·∇φ+

1

2(∇φ)2 − 1

2(∇×A)2 +

1

cj ·A− ρφ (4.37)

Two of the terms we re-write as

1

cA ·∇φ =

1

c∇ · (Aφ)− 1

cφ∇ · A

=1

c∇ · (Aφ) (4.38)

and

1

2(∇φ)2 =

1

2∇ · (φ∇φ)− 1

2(φ∇2φ)

=1

2∇ · (φ∇φ) +

1

2ρφ (4.39)

where the Coulomb gauge condition has been used in the first equation. We note that the diver-gences in these expressions give unessential contributions to the Lagrangian and can be deleted.This is so since divergences only give rise to surface terms in the action integral and therefore donot affect the field equations.

With the divergence terms neglected the Lagrangian density gets the form5

L =1

2c2A2 − 1

2(∇×A)2 +

1

cj ·A− 1

2ρφ (4.40)

The field degrees of freedom are now carried by the components of the vector field, which stillhas to satisfy the constraint equation∇ ·A = 0. The conjugate momentum is

π =1

c2· A ≡ −1

cET (4.41)

4A curious consequence is that whereas for a stationary charge there are no photons present, only the non-dynamical Coulomb field, for a uniformly moving charge there will be a “cloud” of photons present to give thefield the right Lorentz transformed form. This is so even if there is no radiation from the charge. We may see thisas a consequence of the Coulomb gauge condition and our separation of the system into field degrees of freedom(photons) and particle degrees of freedom. In reality, for the interacting system of charges and fields this separationis not so clear, since on one hand the Coulomb field is non-dynamically coupled to the charges, and on the otherhand is dynamically coupled to the radiation field.

5In this expression we have used the freedom to replace the transverse currentjT with the full currentj, which ispossible since the difference gives rise to an irrelevant derivative term. Note, however, that when the full current isused, the transversality condition∇ ·A has to be imposed as a constraint to derive the correct field equations fromthe Lagrangian. When theA-field is coupled to the transverse current, that is not needed.

Page 92: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

92 CHAPTER 4. THE QUANTUM THEORY OF LIGHT

whereET denotes the transverse part of the electric field, the part where−∇φ is not included.So far we have not worried about the degrees of freedom associated with the charges. We

will now include them in the description by assuming the charges to be carried by point particles.This means that we express the charge density and current as

ρ(r, t) =∑i

eiδ(r− ri(t))

j(r, t) =∑i

eivi(t)δ(r− ri(t)) (4.42)

where the sum is over all the particles withri(t) andvi(t) as the position and velocity of particlei as functions of time. The action integral of the full interacting system we write as

S =∫dtL =

∫dt(Lfield + Lint + Lpart) (4.43)

whereLfield + Lint is defined as the space integral of the Lagrangian density (4.40) andLpart isthe standard Lagrangian of free non-relativisic particles. By performing the space integral overthe charge density and current, we find the following expression for the LagrangianL,6

L =∫d3r

[ 1

2c2A2 − 1

2(∇×A)

]+∑i

eicvi ·A(ri)

−1

2

∑i6=j

eiej4π[ri − rj|

+∑i

1

2miv

2i (4.44)

The corresponding Hamiltonian is found by performing a (Legendre) transformation of the La-grangian in the standard way

H =∫d3r π · A +

∑i

pi · vi − L

=∫d3r

1

2(E2

T + B2) +∑i<j

eiej4π[ri − rj|

+∑i

1

2mi

(p− eicA(ri))

2

(4.45)

This expression for the Hamiltonian is derived for the classical system of interacting fields andparticles, but it has the same form as the Hamiltonianoperatorthat is used to describe the quan-tum system of non-relativistic electrons interacting with the electromagnetic field. However, toinclude correctly the effect of themagnetic dipole momentof the electrons, a spin contributionhas to be added to the Hamiltonian. It has the standard form of a magnetic dipole term,

Hspin = −∑i

giei2mic

Si ·B(ri) (4.46)

wheregi is the g-factor of particlei, which is close to2 for electrons. Often the spin term is smallcorresponding to the other interaction terms and can be neglected.

6For point charges an ill-definedself energy contributionmay seem to appear from the Coulomb interaction term.It is here viewed as irrelevant since it gives no contribution to theinteractionbetween the particles and is thereforenot included. In a more complete field theoretic treatment of the interaction of charges and fields such problems willreappear and have to be handled within the framework ofrenormalization theory.

Page 93: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

4.2. PHOTONS – THE QUANTA OF LIGHT 93

4.2 Photons – the quanta of light

With the degrees of freedom of the electromagnetic field and of the electrons disentangled, wemay consider the space of states of the full quantum system as the product space of afield-statespace and aparticle-state space,

H = Hfield ⊗Hparticle (4.47)

In this section we focus on the quantum description of thefree electromagnetic field, whichdefines the spaceHfield and the operators (observables) acting there. The energy eigenstatesof the free field are thephoton states. When we as a next step include interactions, these willintroduce processes where photons are emitted and absorbed.

4.2.1 Constructing Fock space

Since the field amplitudeA is constrained by the transversality condition∇ ·A = 0 it is con-venient to make a Fourier transform to plane wave amplitudes. In a standard way we introducea periodicity condition on the components of the space coordinater, so that the components ofthe Fourier variablek take discrete values,ki = 2πni/L, with L as a (large) periodic length andni as a set of integers for the componentsi = 1, 2, 3. The field amplitude then can be written asa discrete Fourier sum

A(r, t) =1√V

∑k

2∑a=1

Aka(t)εkaeik·r (4.48)

where theV = L3 is the normalization volume and the vectorsεka are unit vectors which satisfyk · εka = 0 as a consequence of the the transversality condition∇ ·A = 0. With the constrainton the field taken care of in this way amplitudesAka can be taken to represent the independentdegrees of freedom of the field. Note, however that the amplitudes for opposite wave vectorskand − k are not independent, as we shall discuss below.

Thenormal modesof the field are the the plane wave solutions of the field equations,

Aka(t) = A0kae

±iωkt , ωk = ck (k = |k|) (4.49)

They represent plane waves ofmonochromatic, polarized light. The two vectorsεka are thepolarization vectors, which may be taken to be two orthonormal vectors in the sense,

ε∗ka · εk′a′ = δkk′δaa′ (4.50)

Both vectors are perpendicular tok. The polarization vectors can be taken to be real, in whichcase the normal modes correspond to plane polarized light or the may be complex, in which casethey correspond to circularly or more generally to elliptically polarized light.

The fact that the fieldA(r, t) is a real field gives the following relation between the fieldamplitudes at wave vectorsk and − k,

A−kaε−ka = A∗kaε

∗ka (4.51)

Page 94: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

94 CHAPTER 4. THE QUANTUM THEORY OF LIGHT

The polarization vectors can be chosen so that

ε−ka = ε∗ka (4.52)

so that for the field amplitude the relation is

A∗ka = A−ka (4.53)

The Lagrangian of the free electromagnetic field expressed in terms of the Fourier amplitudesis

L =1

2

∑ka

[ 1

c2A∗

kaAka − k2A∗kaAka

](4.54)

We note that there is no coupling between the different Fourier components, and for each com-ponent the Lagrangian has the same form as for a harmonic oscillator of frequencyω = ck. Thevariables are however complex rather than real, since reality of the field is represented by therelation (4.53) with each Fourier component being complex. Thus, the variables in (4.54), foreachk anda, correspond to the complex coordinatez of the harmonic oscillator rather than therealx. A rewriting of the Lagrangian in tems of real variables is straight forward, but it is moreconvenient to continue to work with the complex variables.

The conjugate momentum to the variableAka is

Πka =1

c2A∗

ka = −1

cE∗

ka (4.55)

whereEka is the Fourier component of the electric field. From this the form of the free fieldHamiltonian is found

H =∑ka

ΠkaAka − L

=∑ka

1

2(E∗

kaEka + k2A∗kaAka) (4.56)

which is consistent with the earlier expression found for the Hamiltonian of the electromagneticfield, represented by the first term of (4.45) written in Fourier transformed field components.

Quantization of the theory means that the classical field amplitudeA and field strengthEare now replaced by operatorsA and E, while the complex conjugate fields are replaced byhermitian conjugate operators. The conjugate field variables satisfy the canonical commutationrelations in the form [

E†ka, Ak′b

]= −1

c

[˙A

ka, Ak′b

]= −ihc δkk′ δab (4.57)

It is convenient to change to new variables,

Aka = c

√h

2ωk(aka + a†−ka)

Eka = i

√hωk2

(aka − a†−ka) (4.58)

Page 95: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

4.2. PHOTONS – THE QUANTA OF LIGHT 95

where the reality condition (5.47) has been made explicit. In tems of the new variables theHamiltonian takes the form

H =∑ka

1

2hωk(akaa

†ka + a†kaaka) (4.59)

and the canonical commutation relations are[aka, a

†k′b

]= δkk′δab

[aka, ak′b] = 0 (4.60)

Expressed in this way the Hamiltonian has exactly the form of a collection of independent quan-tum oscillators, one for each field mode, withaka as lowering operator anda†ka as raising operatorin the energy spectrum of the oscillator labelled by(ka).

It is convenient to work, in the following, in the Heisenberg picture, where the observablesare time dependent. In this picture the field amplitude, which is now afield operator, and theelectric field get the following form,

A(r, t) =∑ka

c

√h

2V ωk

[akaεkae

i(k·r−ωkt) + a†kaε∗kae

−i(k·r−ωkt)]

E(r, t) = i∑ka

√hωk2V

[akaεkae

i(k·r−ωkt) − a†kaε∗kae

−i(k·r−ωkt)]

(4.61)

The state space of the free electromagnetic field then can be viewed as a product space ofharmonic oscillator spaces, one for each normal mode of the field. The vacuum state is definedas the ground state of the Hamiltonian (4.59), which means that it is the state where all oscillatorsare unexcited,

aka|0〉 = 0 forallk, a (4.62)

The operatora†ka, which excites one of the oscillators, is interpreted as acreation operatorwhichcreates onephotonfrom the vacuum,

a†ka|0〉 = |1ka〉 (4.63)

An arbitrary number of photons can be created in the same state(ka),

(a†ka)n|0〉 =

√nka! |nka〉 (4.64)

and this shows that the photons arebosons. The operatoraka is anannihilation operatorwhichreduces the number of photons in the stateka,

aka|nka〉 =√nka |nka − 1〉 (4.65)

The photon number operator is

Nka = a†kaaka (4.66)

Page 96: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

96 CHAPTER 4. THE QUANTUM THEORY OF LIGHT

It counts the number of photons present in the state(ka),

Nka|nka〉 = nka |nka〉 (4.67)

The general photon state, often referred to as aFock state, is a product state with a well-defined number of photons for each set of quantum numbers(ka). It is specified by a set ofoccupation numbersnka for the single-photon states. The set of Fock states|nka〉 form abasis of orthonormal states that span the state space, theFock space, of the quantized field. Thusa general quantum state of the free electromagnetic field is a linear superposition of Fock states,

|ψ〉 =∑nka

c(nka)|nka〉 (4.68)

The energy operator, which is identical to the Hamiltonian, has the same form as the classicalfield energy. It is diagonal in the Fock basis and can be expressed in terms of the photon numberoperator

H =∫d3r

1

2(E2 + B2)

=∑ka

1

2hωk(a

†kaaka + akaa

†ka)

=∑ka

hωk(Nka +1

2) (4.69)

One should note that the vacuum energy is formally infinite, since the sum over the ground stateenergy1

2ωk for all the oscillators diverges. However, since this is an additive constant which is

common for all states it can be regarded as unphysical and simply be subtracted. The vacuumenergy is on the other hand related tovacuum fluctuationsof the quantized fields and in this wayit is connected to properties of the quantum theory that correspond to real, physical effects.

In a similar way the classical field momentum is replaced by an operator of the same form,

P =∫d3r

1

c(E× B)

=∑ka

1

2hk(a†kaaka + akaa

†ka)

=∑ka

hkNka (4.70)

The energy-momentum relation for a single photon,hωk = h|k|, shows that the photons behavelike massless particles.

4.2.2 Coherent and incoherent photon states

In the same way as for the quantum description of single particles, we may view expectationvalues of the fields to represent classical field configurations within the quantum description.

Page 97: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

4.2. PHOTONS – THE QUANTA OF LIGHT 97

The fields will not be sharply defined since there will in general be quantum fluctuations aroundthe expectation values. In particular we note that the expectation value of the electric field takesthe usual form of a classical field expanded in plane wave components

⟨E(r, t)

⟩= i

∑ka

√hωk2V

[αkaεkae

i(k·r−ωkt) − α∗kaε

∗kae

−i(k·r−ωkt)]

(4.71)

whereαka andα∗ka are expectation values of the annihilation and creation operators.

αka = 〈aka〉 , α∗ka =

⟨a†ka

⟩(4.72)

We note from these expressions the curious fact that for Fock states, with a sharply definedset of photon numbers, the expectation values vanish. Such states which may be highly excitedin energy, but still with vanishing expectation values for electric and magnetic fields are highlynon-classical states. Classical fields, on the other hand, correspond to states with a high degreeof coherence, in the form of superposition between Fock states. As already discussed for theone-dimensional harmonic oscillator, thecoherent statesare minimum uncertainty states in thephase spase variables. Here these variables are represented by the field amplitudeA(r, t) and theelectric field strengthE(r, t). For a single field mode the coherent state has the form as discussedin section (1.3.3) for the harmonic oscillator

|αka〉 =∑nka

e−12|αka|2 (αka)

nka

√nka!

|nka〉 (4.73)

whereαka is related to the expectation values ofaka anda†ka as in (4.72). The full coherent stateof the electromagnetic field is then a product state of the form

|ψ〉 =∏ka

|αka〉 (4.74)

and the corresponding expectation value of the electric field is given by (4.71).The fluctuations in the electric field for the coherent state can be evaluated in terms of the

following correlation function

Cij(r− r′) ≡ 〈Ei(r)Ej(r′)〉 − 〈Ei(r)〉 〈Ej(r′)〉

=∑k

hωk2V

eik·(r−r′)(δij −kikjk2

)

→ ch

2(2π)3

∫d3kkeik·(r−r′)(δij −

kikjk2

) (4.75)

where in the last step we have taken the infinite volum limit. For largek this integral has anundamped oscillatory behaviour, but the integral can be made well defined by introducing adamping factore−εk and takingε to 0. We introduce the short hand notationa = r− r′ and write

Page 98: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

98 CHAPTER 4. THE QUANTUM THEORY OF LIGHT

1.5 2 2.5 3 3.5 4

0.2

0.4

0.6

0.8

1

C

|r-r'|

Figure 4.1: The correlation function for the electric field in a coherent state,Cij(r − r′) ≡〈Ei(r)Ej(r′)〉 − 〈Ei(r)〉 〈Ej(r′)〉. The form of the non-vanishing elements,i = j, is shown asthe function of distancer− r′ between two points in space.

the integral as

Cij(r− r′) =ch

2(2π)3(∂

∂ai

∂aj− δij

∑k

∂ak

∂ak)∫d3k

1

keik·ae−εk

=ch

2(2π)3(∂

∂ai

∂aj− δij

∑k

∂ak

∂ak)

a2

=ch

π2a4(2aiaja2

− δij)

= − ch

π2|r− r′|4(δij − 2

(xi − x′i)(xj − x′j)

|r− r′|2) (4.76)

We note the fall-off of the correlation with separation between the two points and also notes thatthe function diverges asr → r′. This divergence is typical for field theories and reflects the factthat the physical fields cannot be defined with infinite resolution. A way to avoid such infinitiesis to define the physical field by averaging over a small volume. This introduces effectively amomentum cut-off in the Fourier transformation of the field.

The expression for the correlation function is independent of which of the coherent states thatis chosen. This means in particular that it is the same for the vacuum state and can be viewed asdemonstrating the presence of the vacuum fluctuations of the electric field.

A similar calculation for the magnetic field shows that the correlation function is the same asfor the electric field. This is typical for the coherent state which is symmetric in its dependenceof the field variable and its conjugate momentum. SinceA andE are conjugate variables, theelectric and magnetic fields do not commute as operator fields. Formally the commutator is

[Bi(r), Ej(r′)] = δij∇× δ(r− r′) (4.77)

Page 99: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

4.2. PHOTONS – THE QUANTA OF LIGHT 99

where one should express it in terms of the Fourier transformed fields in order to give it a moreprecise meaning. There are states where the fluctuations in theE-field are supressed relative tothat of the vacuum state, but due to the non-vanishing of the commutator withB, the fluctuationsin theB-field will then be larger than that of the vacuum. States with reduced fluctuations inone of the fields, and which still satisfy the condition of minimum (Heisenberg) uncertainty forconjugate variables, are often referred to assqueezed states.

Radio waves, produced by oscillating currents in an antenna, can clearly be regarded asclassical electromagnetic waves and are well described as coherent states of the electromagneticfield within the quantum theory of radiation. The mean value of the fields at the receiver antennainduces the secondary current that creates the electric signal to the receiver. Also for shorterwavelengths in the microwave and the optical regimes coherent states of the electromagneticcan be created, but not by oscillating macroscopic currents. In masers and lasers the intrinsictendency of atoms to correlate their behaviour in a strong electromagnetic field is used to createa monocromatic beam with a high degree of coherence. Ordinary light, on the other hand, asemitted by a hot source is highly incoherent, since the emission from different atoms only to alow degree is correlated. This means that normal light is highly non-classical in the sense that itcannotbe associated with a classical waves with oscillating (macroscopic) values ofE andB.

To demonstrate explicitely the non-classical form on the radiation from a hot source, weconsider light in a thermal state described by a density operator of the form

ρ = N e−βH , N = (Tr e−βH)−1 , β = (kBT )−1 (4.78)

We first calculate the expectation value of the photon numbers

nka ≡⟨Nka

⟩= N Tr (e−βhωka

†kaaka a†kaaka)

= N Tr(e−βhωka†kaaka a†kae

βhωka†kaaka e−βhωka

†kaaka aka)

= N Tr(e−βhωk a†ka e−βhωka

†kaaka aka)

= e−βhωkN Tr( e−βhωka†kaaka akaa

†ka)

= e−βhωkN Tr(e−βhωka†kaaka(a†kaaka + 1))

= e−βhωk(nka + 1) (4.79)

which gives

nka =1

eβhωk − 1(4.80)

with ωk = ck. This is the well-known Bose-Einstein distribution for photons in thermal equilib-rium with a heat bath. From this distribution the Planck spectrum can be determined. The totalradiation energy is

E =∑ka

hωknka

Page 100: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

100 CHAPTER 4. THE QUANTUM THEORY OF LIGHT

radiation

2000 K

1500 K

1000 K

wave length [nm]

2000 4000 6000heat bath

cavity

detector inte

nsi

ty

Figure 4.2: The Planck spectrum. A schematic experimental set up is shown where thermal light iscreated in a cavity surrounded by a heat bath (an oven). The radiation escapes through a narrow holeand can be analyzed in a detector. The right part of the figure shows the intensity (energy density) of theradiation as a function of wave length for three different temperatures.

→ 2V

(2π)3

∫d3k

hωkeβhωk − 1

=V h

π2c3

∫dω

ω3

eβhω − 1(4.81)

This corresponds to the following energy density per frequency unit to be

u(ω) =h

π2c3ω3

eβhω − 1(4.82)

which displays the form of the Planck spectrum.We now consider the expectation values of the photon annihilation and creation operators. A

similar calculation as for the number operator gives⟨a†ka

⟩= N Tr(e−βhωka

†kaaka a†ka)

= N Tr(e−βhωka†kaaka a†kae

βhωka†kaaka e−βhωka

†kaaka)

= N Tr(e−βhωk a†ka e−βhωka

†kaaka)

= e−βhωkN Tr( e−βhωka†kaaka a†ka)

= e−βhωk

⟨a†ka

⟩(4.83)

This shows that the expectation value of the creation operator vanishes. A similar reasoningapplies to the annihilation operator, so that⟨

a†ka⟩

= 〈aka〉 = 0 (4.84)

As a consequence the expectation value of the electric and magnetic field vanish identically

〈E(r, t)〉 = 〈B(r, t)〉 = 0 (4.85)

Page 101: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

4.2. PHOTONS – THE QUANTA OF LIGHT 101

The last result for the expectation values of the electric and magnetic field strengths maybe expected since thermal ligth isunpolarized. Due to rotational invariance no direction canbe distinguished, which means that the expectation values for thevector fieldsE andB shouldvanish. We note that classical fieldsE andB cannot vanish identically unless there is no radiationpresent. Rotational invariance can therefore be restored for a classical radiation field only byintroducingstatistical fluctuatuationsin the fields which average out the field without settingE2

andB2 to zero. However, the form of the density matrix (4.78) shows that the vanishing of theexpectation valuesis not due to statistical averaging over classical configurations. The densitymatrix can be interpreted as describing an incoherent mixture (i.e., a statistical ensemble) ofenergy eigenstates, where each of these is a Fock state with vanishing expectation values forthe electromagnetic field strengths. The vanishing of the expectation values therefore is due toquantum fluctuationsof the fields rather thanstatistical fluctuations.

This view of light, that it is not to be seen as (a statitistical mixture of) classical field config-urations may seem to be problematic when confronted with the classical demonstrations of thewave nature of light, in particular Young’s interference experiments. This experiment seems toconfirm the picture of light as (classical) waves that can interfere constructively or destructvely,depending on the relative phases of partial waves of light. However, one should remember thatinterferenceis notdepending of coherent behavior of many photons. We may compare this withthe double slit experiment for electrons where interference can be seen as a single particle effect.Many electrons are needed to build up the interference pattern, but no coherent effect betweenthe states ofdifferentelectrons is needed. The interference can be seen as a single-electron ef-fect, but the geometry of the experiment introduces a correlation in space between the patternassociated with each particle so that a macroscopic pattern can be built. In the same way we mayinterprete interference in incoherent light to be a single-photon effect. Each photon interfers withitself, without any coherence with other photons. But they all see the same geometrical structureof the slits and this creates a correlated interference pattern.

4.2.3 Photon emission and photon absorption

We shall in the following consider processes where only a single electron is involved. To be morespecific we may consider transitions in a hydrogen atom, or more generally in analkali atom,with a single electron in the outermost shell, where the electron either absorbs or emits a photon.The full Hamiltonian has the form

H = Hfield0 + Hatom

0 + Hint (4.86)

where

Hatom0 =

p2

2m+ V (r) (4.87)

is the unperturbed Hamiltonian of the electron, which moves in the electrostatic potentialV fromthe charges of the atom. The transitions are induced by the interaction part of the Hamiltonian ofthe electron and electromagnetic field,

Hint = − e

mcA(r) · p +

e2

2mc2A(r)2 − e

mcS · B(r) (4.88)

Page 102: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

102 CHAPTER 4. THE QUANTUM THEORY OF LIGHT

wherer is the electron coordinate andp is the (conjugate) momentum operator. Theg-factor ofthe electron has here been set to2.

The two first terms inHint arecharge interaction termswhich describe interactions betweenthe charge of the electron and the electromagnetic field. The third term is thespin interactionterm, which describes interactions between the magnetic dipole moment of the electron and themagnetic field. We note that to lowest order in perturbation expansion, the the first and third termof the interaction Hamiltonian (4.88) describe processes where a single photon is either absorbedor emitted. The second term describe scattering processes for a single photon andtwo-photonemission and absorption processes. This term is generally smaller than the first term and in aperturbative treatment it is natural to collect first order contributions from the second term withsecond order contributions from the first term. This means that we treat the perturbation series asan expansion in powers or the chargee (or rather the dimensionles fine-structure constant) ratherthan in the the interactionHint.

The spin interaction term is also generally smaller than the first (charge interaction) term.However there are different selection rules for the transitions induced by these two terms, andwhen the direct contribution from the first term is forbidden the spin term may give an importantcontribution to the transition. However, we shall in the following restrict the discussion to tran-sitions where the contribution from the first term is dominant. For simplicity we use the samenotationHint when only the first term is included.

The interaction Hamiltonian we may now separated in a creation (emmision) part and anannihilation (absorption) part

Hint = Hemis + Habs (4.89)

Separately they are are non-hermitian withH†emis = Habs. Expressed in terms of photon creation

and annihilation operators they are

Hemis = − em

∑ka

√h

2V ωkp · ε∗kaa

†kae

−i(k·r−ωkt)

Habs = − em

∑ka

√h

2V ωkp · εkaakae

i(k·r−ωkt) (4.90)

Note that when written in this way the operators are expressed in theinteraction picturewherethe time evolution is determined by the free (non-interacting) theory. The time evolution of thestate vectors are in this picture determined by the interaction Hamiltonian only, not by the free(unperturbed) Hamiltonian. This picture is most conveniently used in a perturbative expansion,as already given as Equation (1.56) in section (??).

From the above expressions we can find the interaction matrix elements corresponding toemission and absorption of a single photon. We note that to first order the photon number onlyof one mode(k, a) is changed and we write the photon number only of this mode explicitly. Wewrite the initial and final states as

|i〉 = |A, nka〉|f〉 = |B, nka ± 1〉 (4.91)

Page 103: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

4.2. PHOTONS – THE QUANTA OF LIGHT 103

where|A〉 is the (unspecified) initial state of the atom (i.e., the electron state),|B〉 is the finalstate of the atom, andnka is the photon number of the initial state. This means that we considertransitions between Fock states of the electromagnetic field.

For absorption the matrix element is

〈B, nka − 1|Hint|A, nka〉 = 〈B, nka − 1|Habs|A, nka〉

= − e

m

√h

2V ωk〈B, nka − 1|p · εkaakae

i(k·r−ωkt)|A, nka〉

= − e

m

√hnka

2V ωkεka · 〈B|peik·r|A〉e−iωkt (4.92)

and the corresponding expression for photon emission is

〈B, nka + 1|Hint|A, nka〉 = 〈B, nka + 1|Hemis|A, nka〉

= − e

m

√h

2V ωk〈B, nka + 1|p · ε∗kaa

†kae

−i(k·r−ωkt)|A, nka〉

= − e

m

√h(nka + 1)

2V ωkε∗ka · 〈B|pe−ik·r|A〉eiωkt (4.93)

In the final expressions of both (4.92) and (4.155) note that only the matrix elements for theelectron operator between the initial and final statesA andB remain, while the effect of thephoton operators is absorbed in the prefactor, which now depends on the photon number of theinitial state.

It is of interest to note that the electron matrix elements found for the interactions with aquantizedelectromagnetic field is quite analugous to those found for interaction with a classicaltime-dependentelectromagnetic field of the form

A(r, t) = A0 ei(k·r−ωkt) + A∗

0 e−i(k·r−ωkt) (4.94)

where the positive frequency part of the field (i.e., the term proportional toe−iωkt)) corresponds tothe absorption part of the matrix element and the negative frequency part (proportional toeiωkt))corresponds to the emission part. The use of this expression for theA-field gives asemi-classicalapproach to radiation theory, which in many cases is completely satisfactory. It works well foreffects likestimulated emissionwhere the classical field corresponds to a large value of thephoton number in the initial state and where the relation between the amplitude of the oscillatingfield and the photon number is given by (4.92) and (4.155). However, for small photon numbersone notes the difference in amplitude for emission and absorption. This difference is not reflectedin the classical amplitude (4.94). In the case ofspontaneous emission, with nka = 0 in the initialstate, the quantum theory correctly describes the transition of the electron from an excited stateby emission of a photon, whereas a semiclassical description does not, since electronic transitionsin this approach depends on the presence of an oscillating electromagnetic field.

Page 104: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

104 CHAPTER 4. THE QUANTUM THEORY OF LIGHT

4.2.4 Dipole approximation and selection rules

For radiation from an atom, the wave length of the radiation field is typically much larger thanthe dimension of the atom. This difference is exemplified by the wave length of blue light andthe Bohr radius of the hydrogen atom

λblue ≈ 400nm , a0 = 4πh2

me2≈ 0.05nm (4.95)

This means that the effect of spacial variations in the electromagnetic field over the dimensionsof an atom are small, and therefore thetime-variationsrather than the space variations of thefield are important. In the expression for the transition matrix elements (4.92) and (4.155) thisjustifies an expansion of the phase factors in powers ofk · r,

e±ik·r = 1± ik · r− 1

2(k · r)2 + ... (4.96)

where the first term is dominant. The approximation where only this term is kept is referred toas thedipole approximation. The other terms give rise to highermultipolecontributions. Thesemay give important contributions to atomic transitions only when the contribution from the firstterm vanishes due to a selection rule. However the transitions dominated by higher multipoleterms are normally much slower than the ones dominated by the dipole contribution.

When the dipole approximation is valid, the matrix elements of the interaction Hamiltoniansimplify to

〈B, nka − 1|Habs|A, nka〉 = − e

m

√hnka

2V ωkεka · pBA e−iωkt

〈B, nka + 1|Hemis|A, nka〉 = − e

m

√h(nka + 1)

2V ωkε∗ka · pBA eiωkt (4.97)

wherepBA is the matrix element of the operatorp between the states|A〉 and|B〉. It is convenientto re-express it in terms of the matrix elements of the position operatorr, which can be done byuse of the form of the (unperturbed) electron Hamiltonian,

Hatom0 =

p2

2m+ V (r) (4.98)

whereV (r) is the (Coulomb) potential felt by the electron. This gives

[Hatom

0 , r]

= −i hm

p (4.99)

With the initial state|a〉 and the final state|B〉 as eigenstates ofHe0 , with eigenvaluesEA and

EB, we find for the matrix element of the momentum operator between the two states

pBA = im

h(EB − EA)rBA ≡ imωBA rBA (4.100)

Page 105: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

4.2. PHOTONS – THE QUANTA OF LIGHT 105

Note that in the expression for the interaction the change frompBA to rBA corresponds to atransformation

− e

mcA · p → e

cr · A = −er · E (4.101)

where the last term is identified as the electric dipole energy of the electron.By use of the identity (4.100) the interaction matrix elements are finally written as

〈B, nka − 1|Habs|A, nka〉 = −ie√hnkaωBA

2Vεka · rBA e−iωkt

〈B, nka + 1|Hemis|A, nka〉 = ie

√h(nka + 1)ωAB

2Vε∗ka · rBA eiωkt

(4.102)

In these expressions we have usedenergy conservation, by settingωk = ωBA for photon absorp-tion andωk = ωAB for photon emission. A justification for this will be given below.

Selection rulesThe matrix elements

rBA = 〈B |r|A〉 (4.103)

are subject to certainselection ruleswhich follow from conservation of spin and parity. Thus, theoperatorr transforms as avectorunder rotation and changes sign under space inversion. Sincea vector is aspin 1quantity, the operator can change the spin of the state|A〉 by maximally oneunit of spin. Physically we interpret this as due to the spin carried by the photon. The changein sign under space inversion corresponds to the parity of the photon being− 1. This changeimplies that the parity of the final state is opposite that of the initial state.

Let us specifically consider the states of an electron in a hydrogen (or hydrogen-like) atom.We assume that the stateA is characterized by orbital spinlA and a spin component in thez-directionmA, while the parity isPA = (−1)lA. The corresponding quantities for the stateB arelB,mA andPB = (−1)lB . We note that the change in parity forces the angular moment to changeby one unit (lA 6= lB). The selection rules for the electric dipole transitions (referred to asE1transitions) are

∆l = ±1 (lA 6= 0) , ∆l ≡ lB − lA

∆l = +1 (lA = 0)

∆m = 0,±1 , ∆m ≡ mB −mA (4.104)

The transitions that do not follow these rules are “forbidden” in the sense that the interaction ma-trix element vanishes in the dipole approximation. Nevertheless such transitions may take place,but as a much slower rate than theE1 transitions. They may be induced by higher multipoleterms in the expansion (4.96), or by higher order terms inA which give rise to multi-photon pro-cesses. The multipole terms include higher powers in the components of the position operator,

Page 106: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

106 CHAPTER 4. THE QUANTUM THEORY OF LIGHT

and these terms transform differently under rotation and space inversion than the vectorr. As aconsequence they are restricted by other selection rules. Physically we may consider the highermultipole transitions as corresponding to non-central photon emmision and absorption, wherethe total spin transferred is not only due to the intrinsic photon spin but also orbital angular mo-mentum of the photon. In addition to these lowest order transitions, there may be contributionsfrom higher order processes, in particular two-photon transitions.

4.3 Photon emission from excited atom

In this section we examine the photon emission process for an excited atom in some more detail.We make the assumption that the atom at a given instantt = ti is in an excited state|A〉 andconsider the amplitude for making a transition to a state|B〉 at a later timet = tf by emission ofa single photon. The amplitude is determined to first order in the interaction, and the transitionprobability per unit time and and unit solid angle for emission of the photon in a certain directionis evaluated and expressed in terms of the dipole matrix element. We subsequently discuss theeffect of decay of the initial atomic state and its relation to the formation of aline width for thephoton emission line.

4.3.1 First order transition and Fermi’s golden rule

The perturbation expansion of the time evolution operator in the interaction picture is (see Eq.(1.56))

Uint(tf , ti) = 1− i

h

tf∫ti

dtHint(t) +1

2(− i

h)2

tf∫ti

dt

t∫ti

dt′Hint(t)Hint(t′) + ...

(4.105)

where the time evolution of the interaction Hamiltonian is

H int(t) = eihH0tHinte

− ihH0t (4.106)

with H0 as the unperturbed Hamiltonian. The transition matrix element between an initial state|i〉 at timeti and final state|f〉 at timetf is

〈f |Uint(tf , ti)|i〉 = 〈f |i〉 − i

h〈f |Hint|i〉

tf∫ti

dteih(Ef−Ei)t

+1

2(− i

h)2∑m

〈f |Hint|m〉〈m|Hint|i〉tf∫ti

dt

t∫ti

dt′eih(Ef−Em)te

ih(Em−Ei)t

′+ ...

(4.107)

In this expressions we have assumed that the initial state|i〉 and the final state|f〉 as well as acomplete set of intermediate states|m〉 are eigenstates of the unperturbed HamiltonianH0. The

Page 107: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

4.3. PHOTON EMISSION FROM EXCITED ATOM 107

corresponding eigenvalues areEi, Ef andEm. We perform the time integrals, and in order tosimplify expressions we introduce the notationωfi = (Ef−Ei)/h, T = tf−ti andt = (ti+tf )/2.To second order in the interaction the transition matrix element is

〈f |Uint(tf , ti)|i〉 = 〈f |i〉

−2isin[1

2ωfiT ]

hωfieiωfi t

[〈f |Hint|i〉 −

∑m

〈f |Hint|m〉〈m|Hint|i〉hωmi

+ ...]

−2ieiωfi t∑m

sin[1

2ωfmT ]eiωmiT

〈f |Hint|m〉〈m|Hint|i〉h2ωfm ωmi

+ ...

(4.108)

where we assume thediagonalmatrix elements ofHint to vanish in order to avoid ill-definedterms in the expansion. The factor depending ont is unimportant and can be absorbed in aredefinition of the time coordinate so, thatt = 0. (The interesting time dependence lies in therelative coordinateT = tf − ti.) The last term in (4.129) does not contribute (at average) to loworder due to rapid oscillations. Without this term the result simplifies to

〈f |Uint(tf , ti)|i〉 = 〈f |i〉 − 2isin[1

2ωfi T ]

hωfieiωfi t Tfi (4.109)

whereTfi is theT-matrixelement

Tfi =[〈f |Hint|i〉 −

∑m

〈f |Hint|m〉〈m|Hint|i〉hωmi

+ ...]

(4.110)

The transition probability forf 6= i is

Wfi =

(2 sin[1

2ωfi T ]

hωfi

)2

|Tfi|2 (4.111)

Also this is an oscillating function, but for largeT it gives a contribution proportional toT . Tosee this we consider the function

f(x) =(

sin x

x

)2

(4.112)

The function is shown in fig.(4.3.1). It is localized aroundx = 0 with oscillations that aredamped like1/x2 for largex. The integral of the function is

∞∫−∞

f(x)dx = π (4.113)

The prefactor of (4.122) can be expressed in terms of the functionf(x) as(2 sin[1

2ωfi T ]

hωfi

)2

=T 2

h2 f(1

2ωfi T ) (4.114)

Page 108: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

108 CHAPTER 4. THE QUANTUM THEORY OF LIGHT

-15 -10 -5 5 10 15

0.2

0.4

0.6

0.8

1

x

f(x)

Figure 4.3: Frequency dependence of the transition probability. For finite transition timeT thefunction has a non-vanishing width. In the limitT →∞ the function tends to a delta-function.

When regarded as a function ofωfi it gets increasingly localized aroundωfi = 0 asT increases.In the limit T → ∞ it approaches a delta function, with the strength of the delta function beingdetermined by the (normalization) integral (4.113),(

2 sin[12ωfi T ]

hωfi

)2

→ 2πhTδ(hωfi) (4.115)

This gives a constant transition rate

wfi =Wfi

T=

h|Tfi|2δ(Ef − Ei) (4.116)

whereTfi to lowest order in the interaction is simply the interaction matrix element. Whenapplied to transitions in atoms the expression (4.116) for the transition rate is often referred toasFermi’s golden rule. The delta function expresses energy conservation in the process. Note,however, that only in the limitT → ∞ the energy dependent function is really a delta function.For finite time intervals there is a certain width of the function which means thatEf can deviateslightly fromEi. This apparent breaking of energy conservation for finite times may happen sinceEf andEi are eigenvalues of theunperturbedHamiltonian rather than thefull Hamiltonian. Inreality the timeT cannot be taken to infinity for an atomic emission process, since the excitedstate has afinite life time. The width of the energy-function then has a physical interpretation interms of aline widthfor the emission line.

4.3.2 Emission rate

We consider now the case where initially the atom is in an excited stateA and finally in a stateB with a photon being emitted. The inital and final states of the full quantum system are

|i〉 = |A, 0〉 , |f〉 = |B, 1ka〉 (4.117)

where0 in the initial state indicates the photon vacuum, while1ka indicates one photon withquantum numberska. We will be interested in finding an expression for thedifferentialtransition

Page 109: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

4.3. PHOTON EMISSION FROM EXCITED ATOM 109

rate,i.e., the transition probability per unit time and unit solid angle, as well as thetotal transitionrate.

We write the transition rate summed over all final states of the photon as

wBA =∑ka

h|〈B, 1ka|Hemis|A, 0〉|2δ(EA − EB − hωk)

→ V

(2π)3

∫d3k

∑a

h|〈B, 1ka|Hemis|A, 0〉|2δ(EA − EB − hωk)

=V

(2π)2h

∫dΩ

∞∫0

dkk2∑a

|〈B, 1ka|Hemis|A, 0〉|2δ(EA − EB − hωk)

=V ω2

BA

(2π)2c3h2

∫dΩ

∑a

|〈B, 1ka|Hemis|A, 0〉|2

(4.118)

where in the integration overk the delta function fixes the frequency of the emitted photon tomatch the atomic frequecyωk = ωAB = (EA − EB)/h. The integrand gives the differentialemission rate, which in the dipole approximation is

dwBAdΩ

=e2ω3

AB

8π2hc3|rBA · ε∗ka|2 (4.119)

whereεka is the polarization vector of the emitted photon.Summed over photon states we have

∑a

|rBA · ε∗ka|2 = |rBA|2 −|rBA · k|2

k2(4.120)

From this we find the total transition rate into all final states of the photon,

wBA =e2ω3

AB

8π2hc3

∫dΩ

[|rBA|2 −

|rBA · k|2

k2

]=

e2ω3AB

4πhc3|rBA|2

π∫0

dθ sin θ(1− cos2 θ)

=e2ω3

AB

4πhc3|rBA|2

+1∫−1

du(1− u2)

=e2ω3

AB

3πhc3|rBA|2

=4α

3c2ω3AB|rBA|2 (4.121)

whereα is the fine structure constant. This expression shows how the transition rate depends onthe dipole matrix element and on the energy released in the transition.7

7Note that in the evaluation we have treatedrBA as a real vector. In realityrBA may be complex, but this doesnot change the result, one only has to repeat the evaluation for the real and imaginary part separately.

Page 110: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

110 CHAPTER 4. THE QUANTUM THEORY OF LIGHT

The formalism employed here for the case of spontaneous emission can also be used to de-scribe photonabsorptionprocesses andscatteringof photons on atoms. In the latter case theamplitude should be calculated to second order in the electric charge, and that would involve theA ·A-term as well as theA ·p-term of the interaction. We note from the expression (4.121) thatthe emission rate increases strongly with the energy of the emitted photon. To a part that can beseen as adensity of stateseffect, since for high energy photons many more states are avaliable(within an energy interval) than for low energy photons. This is demonstrated explicitely by(4.118) which shows that a factorω2

BA can be ascribed to the frequency dependence of the denityof states. A similar effect takes place when light scattering is considered. This gives a qualitativeexplanation for why blue light is more readily scattered than red light, and thereby why the skyis blue and the sunset is red.

4.3.3 Life time and line width

Fermi’s golden rule which gives a constant transition rate from the excited to the lower energylevel of the atom can be correct only in an approximate sense. Thus, the corresponding expres-sion for the integrated transition probability,

WBA(t) = twBA (4.122)

which shows thatWBA increases linearly with time, can obviously be correct only for a suffi-ciently short time intervalt < w−1

BA. One way to view this problem is that the expression (4.122)does not take into account the fact that the probability of stateA to be occupied is reduced duringthe transition. This motivates the following modification of the expression for the transition rate,

d

dtWBA = wBAPA(t) (4.123)

wherewBA is the time independent rate determined from the golden rule andPA(t) is the proba-bility for level A to be occupied after timet. This probability is on the other hand related to thesum of transition rates over all final statesB,

PA(t) = 1−∑B

WBA(t) (4.124)

and taking the time derivative and making use of (4.122), we find

d

dtPA = −

∑B

d

dtWBA

= −∑B

wBAPA

≡ −PA/τA (4.125)

whereτA = (∑BwBA)−1. This can be integrated to give

PA(t) = e−t/τA (4.126)

Page 111: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

4.3. PHOTON EMISSION FROM EXCITED ATOM 111

which shows an exponential decay of the excited stateA with τA as thelife time of the state.From this follows that the corrected transition rate is

d

dt= WBA = wBAe

−t/τA (4.127)

which integrates to

WBA = wBA τA (1− e−t/τA) (4.128)

The original expression(4.122) for the transition probability corresponds to (4.128) expanded tofirst order int/τA, and is clearly a good approximation to the full expression only whent << τA.

The decay of the initial state can be built into the expressions for the the transitions to lowerenergy states in a simple way by including a damping factorexp(−t/2τA) as a normalizationfactor in the amplitude. For the transition amplitude this gives a correction

〈f |Uint(tf , ti)|i〉 = − i

h〈f |Hint|i〉

tf∫ti

dteih(Ef−Ei)te

− (t−ti)

2τA + ...

≡ − i

h〈f |Hint|i〉eti/τA

tf∫ti

dteih(Ef−Ei+

i2ΓA)t + ...

= − 〈f |Hint|i〉Ef − Ei +

i2ΓA

(eih(Ef−Ei)tf e−

12h

ΓA)(tf−ti) − eih(Ef−Ei)ti) + ...

→ 〈f |Hint|i〉Ef − Ei +

i2ΓA

eih(Ef−Ei)ti + ... (4.129)

We have here introducedΓA = h/τA and at the last step assumedtf − ti >> τA. As before theinitial state is|i〉 = |A, 0〉 and the final state is|f〉 = |B; 1ka〉. For the transition probability thisgives

Wfi =|〈f |Hint|i〉|2

(Ef − Ei)2 + 14Γ2A

+ ... (4.130)

Viewed as a function ofωfi = (Ef−Ei)/h = (EA−EB)/h−ωk, the transition probability is nolonger proportional to a delta function in the difference between the inital and final energy. Thedelta function is replaced by aLorentzian, which is strongly peaked aroundωfi = 0, but whichhas a width proportional toΓA. For the emitted photon this is translated to a finiteline width forthe emission line corresponding to the transitionA → B. If we compare the two figures (4.3.1)and (4.3.3) we note that a finite cut-off in the time integral (i.e., a finite value ofT = tf − tigives essentially the same frequency dependence for the emitted photon as the one obtained byan exponential damping due to the finite life time of the initial atomic state.

We finally note that sinceΓA depends on the transition probabilities, to include it in thelowest order transition amplitude means that we effectively have included higher order contribu-tions. This means that the perturbative expension is no longer an order by order expansion in theinteraction Hamiltonian.

Page 112: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

112 CHAPTER 4. THE QUANTUM THEORY OF LIGHT

-10 -5 5 10

0.2

0.4

0.6

0.8

1

Figure 4.4:Level broadening. The emission probability given byWfi = |〈f |Hint|i〉|2/[(hωfi)2 + 14Γ2

A]is shown as a function ofωfi.

4.4 Stimulated photon emission and the principle of lasers

As we have already noticed, the rate forstimulatedemission of a photon by an atomic transitionis larger than the correspondingspontaneousemission. Thus, to lowest order in the interactionthe transition probability by photon emission into a given mode is enhanced by a factorn equalto the number of photons already present in the mode. This means that if an atom in an excitedstate emits a photon with equal probability in all directions, the same atom, when placed in astrong field which resonates with the atomic transition, will preferably emit the photon into themode that is already occupied. However, in free space, since a continuum of modes is available,the probability for emission into the preferred mode may still be small.

A laser is based on the principle of stimulated emission, but the probability for emission intothe preferred mode is enhanced by use of a reflecting cavity. The boundary conditions imposedby the reflecting mirrors of the cavity reduce the number of available electromagnetic modes toa discrete set, and the trapping of the photons makes it possible to build a large population ofphotons in one of the modes.

Figure 4.5:Schematic representation of a laser. A gas of atoms is trapped inside aFabry-Perotcavitywith reflecting walls. The atoms arepumpedto an excited state that subsequently make transitions to alower energy mode. The emitted photons resonate with a longitudinal mode of the cavity and build up astrong field in this mode. Some of the light from this mode escapes through one end of the cavity to forma monochromatic laser beam.

A schematic picture of a laser is shown in fig.(4.4) where a elongated cavity is filled with gas

Page 113: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

4.4. STIMULATED PHOTON EMISSION AND THE PRINCIPLE OF LASERS 113

of (Rubidium) atoms. These are continuously excited (pumped) from the ground state to a higherenergy state. The excited atoms subsequently make transitions to a lower energy state and emitphotons with energy that matches one of the modes selected by the distance between the mirrorsin the longitudinal direction. The emission tend to increase the excitation of the preferred modeto a high level. Some of the light that builds up inside the cavity escapes through a small hole inone of the mirrors in the form of a monochromatic laser beam.

The laser light is a beam ofcoherentor classical light in the sense previously discussed.The reason for this coherence is that the photons emitted by the atoms in the cavity tend toact coherently with the light already trapped in the cavity. However, even if the light to a highdegree is coherent, there will be some incoherent admixture due to spontaneous emission. In thefollowing we shall discuss the mechanism of the laser in some more detail.

4.4.1 Three-level model of a laser

We consider a simple model of a laser where a single mode is populated by stimulated emission.A three-level model is used for the atom, where most atoms are in the ground state|0〉, butwhere there is a continuous rate of “pumping” of atoms to a higher energy level|2〉, by a lightsource or some other way of excitation. The atoms in the upper level next make transitions to anintermediate level|1〉 by emission of a photon. This may be by stimulated emission of a photonto the field mode that is already strongly populated, or by spontaneous emission to one of theother photon states in the emission line. From the state|1〉 there is a fast transition back to theground state due to a strong coupling between the levels|1〉 and|0〉.

Due to the assumed strength of the transitions we have

N0 >> N2 >> N1 (4.131)

This means that the number of atoms in the ground stateN0 is essentially identical to the totalnumber of atomsN . When we consider the two levels1 and2, which are relevant for populatingthe laser mode, the inequality means that there ispopulation inversion, since the upper level ismore strongly populated than the lower mode. Population inversion is a condition for the atomsto be able to “feed” the preferred photon mode, since the probability of reabsorbing a photonfrom the preferred mode by the inverse transition|1〉 → |2〉 is much smaller than emission of aphoton by the process|2〉 → |1〉.

When the atoms are pumped to a higher energy level this creates initially a situation wherethe photon modes within the line width of the transition2 → 1 begins to get populated, butwhere no single mode is preferred. This is an unstable situation due to the effect of stimulatedemission to preferrably populate a mode which already is excited. As a result one of the modeswill spontaneously tend to grow at the expense of the others. We will consider the situation afterone of the modes has been preferred in this way.

The transition from state2 to 1 may now go in two ways, either by spontaneous emission toone of the modes that have not been populated or by stimulated emission to the preferred (laser)mode. We denote the transition rate by spontaneous emissionΓsp and by stimulated emissionΓst n, wheren is the photon number of the laser mode. The difference betweenΓsp andΓst

Page 114: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

114 CHAPTER 4. THE QUANTUM THEORY OF LIGHT

0

1

2

R

Γsp Γst n

T

Figure 4.6: Transitions between three atomic levels in a laser.R is the pumping rate from theground state0 to an excited state2. The transition from level2 to an intermediate level1 can goeither by spontaneous emission (Γsp), or by stimulated emission (Γst n) to the laser mode.n isthe photon number of the laser mode. A fast transitionT brings the atoms back to the groundstate. The direct transition probability2 → 0 is assumed to be negligible.

is due to the large number available for spontaneous emission compared to the single modeavailable for stimulated emission. Typical values are

Γsp ≈ 107s−1 , Γst ≈ 1s−1 (4.132)

The smallness ofΓst is compensated by the factorn and when a stationary situation is reachedthe photon numbern in the laser mode will be so large so that the probabilities of the two typesof transitions are comparable. The ratio

ns =ΓspΓst

(4.133)

is referred to as thesaturation photon number.The quantum state of the laser mode should be described by a (mixed state) density matrix

ρnn′ rather than a (pure state) wave function. This is so since we cannot regard the electromag-netic field as a closed (isolated) system. It is a part of a larger system consisting of both atomsand field, but even that is not a closed system due to coupling of the atoms to the pumping fieldand due to the leak from the laser mode to the escaping laser beam and to the surroundings.

We will not approach the general problem of describing the time evolution of the densitymatrix ρnn′, but rather show that the steady state form of the photon probability distributionpn = ρnn can be found with some simplifying assumptions.8 We first note that for the steadystate there is a balance between transitions to and from atomic level2,

N0R = N2(Γsp + Γstn) (4.134)

whereR is the transition rate, per atom atom, from the ground state to the excited state2. Wesimplify this expression by replacing the photon numbern by its expectation value〈n〉. This

8The discussion here mainly follows the approach of Rodney Loudon,The Quantum Theory of Light, OxfordScience Publications, 2000.

Page 115: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

4.4. STIMULATED PHOTON EMISSION AND THE PRINCIPLE OF LASERS 115

means that we neglect the (quantum and statistical) fluctuations in the photon number, which weregard as small compared to its mean value.

We next consider the photon probability distributionpn. The time evolution of the distributionis given by

dpndt

= N2Γst(np(n−1) − (n+ 1)pn)− Γcav(npn − (n+ 1)p(n+1)) (4.135)

where we have introduced thecavity loss rate per photon, Γcav. With |T |2 as the transmissionprobability to the outside for each reflection andL as the length of the cavity, it is given byΓcav = c|T |2/L. This gives for the time evolution of the mean photon number

d〈n〉dt

= N2Γst(1 + 〈n〉)− Γcav 〈n〉 (4.136)

with the steady state solution

N2Γst(1 + 〈n〉) = Γcav 〈n〉 (4.137)

This equation relates the numberN2 of atoms in level 2 to the average photon number in the lasermode. We insert this in the steady state equation (4.134), withn replaced by〈n〉, and we alsoreplaceN0 by the total number of atomsN ,

ΓstΓcav 〈n〉2 − (NRΓst − ΓspΓcav) 〈n〉 −NRΓst = 0 (4.138)

By introducing the coefficient

C =NRΓstΓspΓcav

(4.139)

the equation simplifies to

〈n〉2 − (C − 1)ns 〈n〉 − Cns = 0 (4.140)

with solution

〈n〉 =1

2

[(C − 1)ns + [(C − 1)2n2

s + 4Cns ]12

](4.141)

The expectation value for the photon number is in Fig. (4.4.1) shown as a function of the param-eterC for ns = 107. As a characteristic feature one notes that aroundC = 1 the photon numberrapidly increases from a small number to a number of the order ofns. After this rapid increasethere is a continued less dramatic increase where〈n〉 changes linearly withC.

The curve demonstrates the presence of a threshold for the laser aroundC = 1. For smallervalues the effects of spontaneous emisssion and emission from the cavity prevents the build up ofthe laser mode, while for larger values ofC there is a net input of photons into the mode whichalows the photon number to grow to a large number.

Page 116: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

116 CHAPTER 4. THE QUANTUM THEORY OF LIGHT

0.5 1 1.5 2

2

4

6

log <n>

C

Figure 4.7: The photon number of the laser mode as a function of the parameterC =NR/(nsΓcav). The curve corresponds to a saturation photon numberns = 107.

4.4.2 Quantum coherence

So far we have considered theexpectation valueof the photon number, which can increase to alarge value forC > 1. This implies that the mode is strongly excited, but does not necessarilymean that it is in a coherent state that can be described by a (classical) monochromatic wave.For this to be the casefluctuationsof the field variables have to be restricted. The fluctuations inthe photon number can be determined from the evolution equation of the probability distribution(4.135), which for a stationary situation is

N2Γst(np(n−1) − (n+ 1)pn) = Γcav(npn − (n+ 1)p(n+1)) (4.142)

This should be satisfied for alln, includingn = 0 with p−1 = 0, and from this we conclude thatfollowing simpler equation has to be satisfied

N2Γstp(n−1) = Γcavpn (4.143)

With the occupation numberN2 determined by (4.134), we get

pn =NRΓst

Γcav(Γsp + Γst(n− 1))p(n−1)

=Cns

ns + n− 1p(n−1)

(4.144)

By repeated use of the equation we find

pn =(Cns)

n(ns − 1)!

(ns + n− 1)!p0 (4.145)

wherep0 is determined by the normalization of the probability distribution.Well above the laser threshold,C >> 1, the we have

〈n〉 = (C − 1)ns (4.146)

Page 117: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

4.4. STIMULATED PHOTON EMISSION AND THE PRINCIPLE OF LASERS 117

as follows from Eq. (4.141). By use of this expression forC, the distribution can be re-writtenas

pn =(ns + 〈n〉)n

(n+ ns)!ns!p0 (4.147)

We note that this distribution has a form very similar to that of a coherent state, which we canwrite as

pcsn =〈n〉n

(n)!e−〈n〉 (4.148)

Then-dependence is the same, except for a shiftn → n + ns. For the coherent state the widthof the distribution is given by the variance

(∆n)2cs = 〈n〉 (4.149)

whereas, due to the shift, the width of the probability distribution (4.147) is

(∆n)2 = 〈n〉+ ns (4.150)

For large photon numbers,〈n〉 >> ns the last term can be neglected and the variance is thesame as expcted for a coherent state. However, for smaller valuesns introduces a non-negligiblecontribution to the fluctuation in the photon number. This can be interpreted as the influenceof spontaneous emission: It affects the fluctuations in the occupation numberN2 which in turninfluences the fluctuations in the photon number of the laser mode through stimulated emission.

The fluctuation in photon number only describes a part of the fluctuations of the laser field.If we compare with a coherent state, of the form

|α〉 = e−12|α|2 ∑

n

αn√n!

(4.151)

the probability distributionpn gives information about the absolute square of then-components,here given by|α|2n, but there is no information about the relative phases of the components. Toget information about the fluctuations both in absolute value and in phase, we apply the coherentstate representation, which for a general state|ψ〉 defines a (phase space) probability distributionin terms of the overlap of|ψ〉 with a coherent state|z〉,

fψ(z) = |〈z|ψ〉|2 (4.152)

For the coherent state|α〉 the function is

fα(z) = e−|z−α|2 (4.153)

which means that it is strongly localized around the pointα in the complex (phase) plane. AFock state, with a well defined photon number is on the other hand given by

fn(z) = e−|z|2 |z|2nn!

(4.154)

Page 118: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

118 CHAPTER 4. THE QUANTUM THEORY OF LIGHT

Re ζ

Im ζ

Fock state Coherent state

Figure 4.8: The complex phase plane of a single electromagnetic mode. A photon state is repre-sented by the overlap with acoherent state|z〉. The radial coordinate correspond to the directionof increasing energy (or photon number), while the angular variable correspond to the phase ofthe electromagnetic wave. A Fock state (blue), with sharp photon number, is completely delo-calized in the angular direction. A coherent state is well localized in both directions. The photoncreation operatora†, when acting on a coherent state will shift it in the radial direction.

which depends only on the modulus ofz, and which therefore localized in a ring in the complexplane.

Even if the photon probability distribution of the laser mode approaches that of a coherentstate for large photon numbers, this does not determine the angular distribution of the state. Tohave well defined values for for the electric and magnetic fields, like in the case of coherentstates, a restriction on the fluctuations in the angular direction is needed. We will not discussthis question in depth here, which would require that we consider the evolution of the densitymatrix, including its off diagonal matrix elements. Instead we shall make the coherence probableby considering the effect of the emission Hamiltonian on a state that is already coherent.

The matrix element of the emission matrix element between the two atomic states2 and1 isin the dipole approximation

〈1|Hemis|2〉 = −ie√hω

2Vr12 · ε a† ≡ m a† (4.155)

In this expressionV is the volume of the cavity, and the quantum numbers of the photon mode hasbeen omitted since we consider only stimulated emission to a single mode. Since the prefactorof the photon creation operator is independent of the photon number they are collected in theconstantm. The transition2 → 1 will therefore, when we consider this as a first order effectchange the state simply by the action ofa†. If the photon state is before the transition welldescribed by a coherent stateα this gives a change

|α〉 → |α〉 = N a†|α〉 (4.156)

whereN is a normalization factor.

Page 119: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

4.4. STIMULATED PHOTON EMISSION AND THE PRINCIPLE OF LASERS 119

With |α〉 normalized to unity, the spread of the state in the complex phase plane is

fα(z) = |N |2|〈z|a†|α〉|2

=|z|2

|α|2 + 1e−|z−α|2 (4.157)

For large photon number, corresponding to large|α|2, it is convenient to change variable toξ = z − α, which by the gaussian factor is restricted to values of the order of unity. This gives

fα(ξ) = |N |2|〈z|a†|α〉|2

=|α|2 + αξ∗ + α∗ξ + |ξ|2

|α|2 + 1e−|ξ|2

≈ exp[− |ξ|2 +

ξ

α+ξ∗

α∗ −1

2(ξ2

α2+ξ∗2

α∗2 − 4|ξ|2

|α|2)]

(4.158)

where the last expression is correct to order1/|α|2. The linear terms inξ andξ∗ show that thelocation of the state is slightly shifted by the action of the photon creation operator. This is dueto the added energy. The shift is in the radial direction in the complex plane, with the valueproportional to1/|α| = 1/

√〈n〉. There are also quadratic terms, which show that the width of

the gaussian is slightly changed. But these are of the order of1/|α|2 = 1/〈n〉, and are thereforeless important than the shift for large photon numbers.

Even if this discussion does not showhowcoherence of the laser mode is created, it showsthat the action of the stimulated emission on the photon state is mainly to increase its energy, andonly to a lesser degree to change the coherence of the state. However, the rotational invariancein the complex shows that there is no preferred value for the complex phase ofα. The effect ofspontaneous emission will in reality be to introduce a slow, random drift in the phase of the lasermode.

Page 120: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

120 CHAPTER 4. THE QUANTUM THEORY OF LIGHT

Page 121: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

Chapter 5

Quantum mechanics and geometry

Physics and geometry are closely linked in many different ways. In its original form Euclideangeometry describes the mathematical structure of physical space and the shapes of physical bod-ies. In the theory of relativity this geometry of space is extended to a geometry of space-time. Al-though general relativity represents in many ways the clearest realization of geometry in physics,there are also many other examples, both in classical and quantum physics, of how a geometricalformulation is both possible and important in order to understand the structure of the theory.

In quantum theory geometry is usually not emphasized to the same degree as it is in the theoryof relativity. But there are some parts of quantum theory where the underlying geometry is quiteclear. One example has to do with gauge invariance of electromagnetism and the generalizationof electromagnetic theory to gauge theories. Another example has to do with geometrical phasesthat appear when a quantum state is evolving in a periodic fashion. However, beyond theseparticular cases a geometrical view of quantum theory at a more fundamental level is possible,and an interesting question to consider is whether the geometrical view of quantum physics maybe of importance beyond the applications studied today.

In this chapter the intention is not to make an extensive study of the geometry of quantummechanics, but rather to examine the basis for such a geometrical formulation and to discusssome simple examples where the geometry is clearly revealed. In particular I will focus attentionon the geometrical phases, but I will also discuss the general aspects of quantum geometry thathave to do with the fact that the natural geometry of quantum states is complex rather than real.

5.1 Geometry in Hilbert space

The state space of a quantum system is a vector space, and in the same way as for three-dimensional space or four-dimensional space-time the geometry of this vector space is definedby the scalar product of vectors. However, the physical basis for the geometry is rather differentin the two cases. The geometry of physical space is based on the notion of the physical lengthbetween points as the fundamental quantity, but in quantum theory the scalar product insteadis linked to the probability interpretation of the theory. Thus the squared scalar product of twovectors|〈φ|ψ〉|2 can be viewed as the probability of measuring (by a projective measurement) the

121

Page 122: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

122 CHAPTER 5. QUANTUM MECHANICS AND GEOMETRY

system in stateφ when it is prepared in stateψ (or the other way around). However, even if theinterpretation is different, there are at the formal level close relations between the geometry ofphysical space and that of quantum mechanics. For this reason I will first make a brief discussionof some basic elements of the (real) geometry of curved surfaces, before I turn to the (complex)geometry of Hilbert space and the related geometry of the continuum of physical quantum states.

5.1.1 Real space geometry

As a starting point we consider a real vector space ofd dimensions equipped with a scalar prod-uct. A concrete realization may be the three-dimensional physical space of non-relativistic clas-sical or quantum physics. A general vector can be expanded in a complete set of basis vectors inthe standard way

a =d∑i=1

aiei (5.1)

and the scalar product of two vectors can be written as

a · b =∑ij

gijaibj (5.2)

with

gij = ei · ej = gji (5.3)

With the basis vectors chosen as an orthonormal set we simply havegij = δij, but in the presentdiscussion of geometry it may be convenient also to open for the use of non-orthonormalized setsof basis vectors. The fundamental geometric variable is the distance∆s between pairs of pointsand for two points specified by (position) vectorsa andb it is given by the length of the relativevector∆a = b− a,

∆s2 = ∆a2 =∑ij

gij∆ai∆aj (5.4)

From a geometrical point of view a vector space is a flat space, and this flatness is linked tothe ability of choosing an orthonormal basis set, so thatgij = δij. But the geometry of vectorspaces can readily be generalized to the geometry ofcurvedspaces, ormanifoldsin mathematicalterminology, and that is what we consider next. The discussion I will limit to cases where thecurved space is embedded in a higher-dimensional flat space. This is for convenience, since itsimplifies the discussion, although theintrinsic geometryof the space will be of most interest.

The curved space we considered to be traced out by a position vectorr(x) in the flat space byvarying a set ofd coordinatesx = x1, x2, ..., xd. These define a coordinate set for the curvedspace. The curved space defines a manifold within the flat space, but is in general not a vectorspace. However, there are vector spaces associated with the manifold, which are thetangent

Page 123: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

5.1. GEOMETRY IN HILBERT SPACE 123

r(θ,φ)

Figure 5.1:Tangent spaces of a curved surface. The curved surface is here represented by a half-sphereand the tangent spaces by two-dimensional planes. The coordinate basis vectors corresponding to thepolar angles(θ, φ) are shown. Vectors belonging to two different tangent planes cannot be compared ina unique way, since the planes have different orientation. However, all tangent vectors can be viewed asvectors in the same higher-dimensional space, namely the three-dimensional flat space in which the sphereis embedded.

vector spacesattached to each point of the manifold. A natural basis set for the tangent vectorspace at a given pointx is given by

ei(x) =∂r

∂xi, (5.5)

and are referred to ascoordinate basis vectors. They define the metric tensor as

gij(x) = ei(x) · ej(x) =∂r

∂xi· ∂r∂xj

(5.6)

The metric tensor determines the distance between neighbouring points,

ds2 = dr2 = gij(x)dxidxj (5.7)

and the curvature of the space is reflected by the fact that one cannot find a smooth coordinateset so thatgij(x) = δij everywhere. (Note that in Eq.(5.24) as well as in subsequent equationsthe summation symbol is suppressed. That means that we employ the Einstein summation con-vention with summation over repeated indices, unless otherwise stated.)

Note that even in curved space we may choose basis vectors, which we denote byei that areorthonormal,

ei(x) · ej(x) = δij (5.8)

but these are not linked to a coordinate system in the way given by (5.5).By expanding an arbitrary vectora in the coordinate basis sets, we find the completeness

relations for the sets,

a = ak ek = ek gkl el · a (5.9)

Page 124: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

124 CHAPTER 5. QUANTUM MECHANICS AND GEOMETRY

This gives the completeness relation

(ek)igkl(el)

j = gij (5.10)

where(ek)i denotes componenti of basis vectork. In the above equations we have introduced

the inverse of the metric tensor

gijgjk = δik (5.11)

Note that the the basis vectorsei form a complete set only for vectors restricted to the tangentspace. For general vectors in the higher-dimensional flat space where all the tangent spaces aresubspaces, the right-hand side of Eq. (5.9) can be viewed as defining theprojectionof the vectora on the tangent space.

Parallel transportof vectors is a direct way to demonstrate the presence of curvature. Thus ifparallelism is globally defined, which means that the result of transporting a vector between anytwo points is independent of the path between the points, that shows that the space is flat. Whenthe result of the transport depends on the path the space is curved.1In general parallel transportwill introduce rotation of the vector, thus the direction changes while the length of the vectoris preserved. For infinitesimal tanslations, however, parallel transport is well defined withoutreference to a path. This follows, since a curved space at sufficiently small distances cannot bedistinguished from a flat space. Thus, curvature appears as a second order effect in the (small)distances.

The general form of parallel transport between neighbouring pointsx andx+ dx is revealedby its action on the basis vectors. It can be written in the form

P (x, x+ dx)ei(x) = (δij − Γjikdxk)ej(x+ dx) (5.12)

whereΓjik, which is a tensor field on the manifold, is known as theconnection. The last term canbe interpreted as the change in the basis between the two pointsx andx+ dx, in the sense

dei = ei(x+ dx)− P (x, x+ dx)ei(x)

= Γjikdxkej (5.13)

In the present case, with the curved space embedded in a higher dimensional flat space, thecurved space will inherit its parallel transport from the trivial one of the flat space. However, sincethe vectors of the curved space are confined to the tangent vectors spaces the parallel transport ofthe flat space and the curved space are not identical. The latter will involve a projection into thetangent space. Thus, for an infinitesimal displacement the parallel transport on the curved spacecan be viewed as a transport in the flat space combined with a projection on the tangent space.Since for small projection angles the change in length of the projected vector will be of secondorder, the length will be preserved for infinitesimal displacements. Viewed as a projection on

1This refers to theintrinsic curvature. Note that there may be anextrinsic curvature, even if the intrinsic curvevanishes. An example is a two-dimensional flat sheet rolled up within three-dimensional space.

Page 125: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

5.1. GEOMETRY IN HILBERT SPACE 125

the tangent space at the shifted position we can write the expression for an infinitesimal paralleldisplacement as

P (x, x+ dx)ei(x) = ej(x+ dx)gjk(x+ dx) ek(x+ dx) · ei(x) (5.14)

which means that

ej(x+ dx) · ei(x) = δij − Γjikdxk (5.15)

This gives as expression for the connection

Γkij = gkl el · ∂jei (5.16)

It is symmetric in the indicesij, as follows from the definition of the coordinate basis (5.5).There is a close relation between the metric tensor and the connection, which can be found

by considering the change ingij under an infinitesimal translation,

dgij = dei · ej + ei · dej (5.17)

The right-hand-side can be expressed in terms of the connection, and sinceΓkij is symmetric inij, the equation can be solved to give the connection expressed in terms of the metric

Γkij =1

2gkm(

∂gim∂xj

+∂gmj∂xi

− ∂gji∂xm

) (5.18)

Parallel transportation around a closed curve will introduce a rotation of the transported vec-tors, and the rotation matrix gives information about the curvature in the space enclosed by thecurve. For a curve enclosing an infinitesimal area, the corresponding infinitesimal change in thevector will be proportional to the areadσ of the curve. Applied to the basis vectors the changecan be written in the form

P (dσ) ei = ei +1

2Rkimndσ

mnek (5.19)

whereP (dσ) denotes the rotation produced by parallel transport around the infinitesimal curve,Rkimn is the intrinsic orRiemanncurvature tensor anddσij is the surface element, which is an-

tisymmetric in the indicesij. The curvature tensor can be expressed in terms of the connectionΓkij. A specific way to derive this expression is to consider transport of basis vectors around aninfinitesimal parallelogram.

5.1.2 Geometry of the sphere

We consider a sphere of unit radius embedded in three-dimensional flat space. The point on thesphere are identified by the polar angles(θ, φ), and the corresponding three-dimensional vectoris

r(θ, φ) = sin θ(cosφi + sinφj) + cos θk (5.20)

Page 126: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

126 CHAPTER 5. QUANTUM MECHANICS AND GEOMETRY

The corresponding coordinate basis vectors are

eθ =∂r

∂θ= cos θ(cosφi + sinφj)− sin θk

eφ =∂r

∂θ= sin θ(− sinφi + cosφj) (5.21)

The basis vectors in this case are orthogonal, but whileeθ is normalized to 1,eφ is not. Thematrix elements of the metric tensor are

gθθ = e2θ = 1

gθφ = eθ · eφ = 0

gφφ = e2φ = sin2 θ (5.22)

In matrix form the metric tensor and its inverse therefore are,

(gij) =(

1 00 sin2 θ

), (gij) =

(1 00 1

sin2 θ

)(5.23)

and the the distance squared of an infinitesimal displacement is

ds2 = dθ2 + sin2 θdφ2 (5.24)

The connection is determined by the derivatives of the coordinate basis vectors,

∂θeθ = −r

∂φeθ = cos θ(− sinφi + cosφj) = ∂θeφ

∂φeφ = − sin θ(cosφi + sinφj) (5.25)

From this we find the non-vanishing components of the lower index form of the connection ,

Γθφφ = eθ · ∂φeφ = − cos θ sin θ

Γφφθ = eφ · ∂φeθ = cos θ sin θ = Γφθφ (5.26)

and by applying the inverse metric tensor we further obtain

Γθφφ = − cos θ sin θ

Γφφθ = cot θ = Γφθφ (5.27)

We consider parallel transport of the basis vectoreθ from the pointx = (θ, φ) to x + dx =(θ + dθ, φ+ dφ),

P (x, x+ dx)eθ(x) = eθ(x+ dx)− Γφθφ dφ eφ(x+ dx)

= eθ(x+ dx)− cot θ dφ eφ(x+ dx) (5.28)

Page 127: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

5.1. GEOMETRY IN HILBERT SPACE 127

The right-hand side can be viewed as an infinitesimal rotation of the basis vector, and since thetangent space is two-dimensional such a rotation is determined by a single rotation angledχ.Expressed in terms of the rotation angle the infinitesimal rotation can be written as

P (x, x+ dx)eθ(x) = eθ(x+ dx) +dχ

sin θeφ(x+ dx) (5.29)

where in the last term we have compensated for the lack of normalization ofeφ. By comparisonwith(5.28) we find the rotation angle to be

dχ = − cos θdφ (5.30)

For parallel transport around a finite, closed curveC the rotation angle can be found byintegrating the infinitesimal contributions along the path. The result is

χ(C) = −∮C

cos θdφ (5.31)

For an infinitesimal closed curve with displacements along the coordinate lines, with shiftsdθanddφ respectively, the rotation angle is

dχ = −(cos(θ + dθ)− cos θ)dφ

= sin θdθdphi

= dΩ (5.32)

where dΩ is the solid angle (or area of the unit sphere) covered by the loop. For a non-infinitesimal closed curve this implies that the rotation angle is equal to the solid angle traced outby the curve, since the large closed curve can be decomposed into many infinitesimal loops,

χ(C) = Ω(C) (5.33)

We check the above result by considering parallel transport from the north poleθ = 0 of thesphere along the longitudeφ = 0, then next along the equator (θ = π/2) and finally back to thenorth pole alongφ = π/2. As indicated in the figure, the rotation angle introduced by paralleltransport along the curve is equal to the change inφ, namelyπ/2. On the other hand the curveencloses the fraction1/8 of the full sphere. This corresponds to a solid angle4π/8 = π/2, whichmeans that the relation (5.33) is satisfied.

5.1.3 The complex geometry of Hilbert space

In the same way as the scalar product of a real vector space is the starting point for a geometricdescription of the vector space and its curved generalizations, the scalar product of the Hilbertspace is the starting point for a geometrical understanding of quantum mechanics. We expandagain the vectors in a complete basis

|ψ〉 =∑i

ψi|χi〉 (5.34)

Page 128: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

128 CHAPTER 5. QUANTUM MECHANICS AND GEOMETRY

N

S

Figure 5.2: Parallel transport on a sphere. A tangent vector is transported from the north pole to theequator, then along the equator and finally back to the north pole. This introduces a rotation of the vectorwhich is identical to the solid angle traced out by the closed curveC.

The scalar product now has the form

〈φ|ψ〉 =∑ij

γijφ∗iψj (5.35)

with

γij = 〈χi|χj〉 = γ∗ji (5.36)

Since the scalar product is complex rather than real it now defines two related real structures,expressed by the symmetric and antisymmetric parts ofγij,

gij =1

2(γij + γji)

ηij =1

2i(γij − γji) (5.37)

In a similar way as for the real vector space, the distance between points in Hilbert space canbe defined in terms of the scalar product. For two points specified by vectors|φ〉 and |ψ〉, thesquare norm of the relative vector∆ψ = |ψ − φ〉 may be used as the definition of the squareddistance between the two points,

∆s2 = |〈∆ψ|∆ψ〉|2 = γij∆ψi∗∆ψj (5.38)

(again using the Einstein summation convention).

5.1.4 Geometry of physical states

However, the geometry of the Hilbert space, discussed above, is not identical to the geometryof the space of physical quantum states. The reason for this is, as we have discussed before,

Page 129: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

5.1. GEOMETRY IN HILBERT SPACE 129

|ψx >

|χx >

i |χx >

Re ψ(x)

Im ψ(x)

Figure 5.3:A physical quantum state represented as aray in Hilbert space. The ray is a one-dimensionalcomplex vector space, similar to a complex plane. A basis vector|χx〉 defines the real axis of the planeand i|χx〉 defines the imaginary axis. A representative from the ray is a vector|ψx〉 in the plane, hereshown with the real and imaginary components of the vectors as the projections on the two axis definedby |χx〉 andi|χx〉.

that there is not a one-to-one correspondence between physical states and Hilbert space vectors.Physical (pure) states are restricted tonormedvectors, and furthermore all normed vectors thatdiffer only by an overall phase factor are considered equivalent. Thus, to make the correspon-dence between vectors and physical states unique we have both to restrict these to normed vectorsand to pick a single representative from each set of equivalent vectors. An alternative view is toconsider the state as corresponding to anequivalence classof vectors. Such a class consists ofall vectors that differ be a complex number,ψ ' zψ. It is a one-dimensional subspace, oftenreferred to as aray in Hilbert space. Only if we want to pick a representative from this ray, wehave to specify the normalization and the phase of the vector. The density operator associatedwith the physical state is the projection operator that projects on the one-dimensional space ofthe ray.

The space of rays is not a vector space. We refer to this as aprojective space, and sinceit can be associated with directions in Hilbert space it has topology similar to a sphere. In thesimplest case, for a quantum two-level system, the pure states can be associated with points ona two-dimensional sphere, as discussed in Sect.1.3.1. Even if the space of physical states is notidentical to the Hilbert space, the latter will introduce a natural geometry on the former. To seethis we will make use of a formulation similar to that used for the real space geometry. As aspecific example we consider the two-level system.

Let us label the physical states by a set of coordinatesx = x1, x2, .... In the two-levelcase this represents the unit vectorn that specify points on the sphere. To each pointx there is,as discussed above, associated a one-dimensional complex space, a ray, and from a geometricalpoint of view this space can be treated in a similar way as the tangent space of a curved space.Thus, in a similar way as when introducing basis vectors of the tangent space, we may choosea representative from the ray,|χx〉, and use this as a basis vector for the one-dimensional space.

Page 130: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

130 CHAPTER 5. QUANTUM MECHANICS AND GEOMETRY

Any vector in this space can then be written as

|ψx〉 = ψ(x)|χx〉 (5.39)

where the complex numberψ(x) can be viewed as the coordinate of the vector|ψx〉. For sim-plicity we in the following restrict the basis vectors to be of unit length, so that the freedom ofchoosing basis vectors is limited to the choice of a complex phase factor.

We note that the situation is similar to the one we have discussed for the real geometry, wherethe curved space was embedded in a higher-dimensional flat space. In the present case the vectorspaces associated with each pointx are embedded in a higher-dimensional flat space, which isthe full Hilbert space of the quantum system. We make use of this, in a similar way as for tangentvectors to define parallel transport and connection on the space of physical states,

P (x, x+ dx)|χx〉 = |χx+dx〉〈χx+dx|χx〉= (1− iak(x)dx

k)|χx+dx〉 (5.40)

with ∂k = ∂∂xk . The connection now is represented by

ak(x) = i〈χx|∂kχx〉 (5.41)

Parallel transport between distant points can be found by integrating the infinitesimal dis-placement. For a closed curveC it gives rise to a complex phase factor

P (C)|χx〉 = exp[i∮Cak(x)dx

k]|χx〉 (5.42)

and by use of Stoke’s theorem the phase factor can also be written as

P (C) = exp[i

2

∫Sfij(x) · dσij

]|χx〉 (5.43)

whereS is a surface withC as boundary, anddσij represents an infinitesimal surface element onS.fij is given by

fij = ∂iaj − ∂jai = 2Im〈∂iχx|∂jχx〉 (5.44)

We note the formal similarity to the expression for an electromagnetic field in terms of a vectorpotential. This similarity is not merely a coincidence. Electromagnetic field configurations canbe interpreted geometrically, as we will later show. The geometrical interpretation here offij isthat it describes some kind of local curvature, which is responsible for the curve dependence ofthe parallel transport. But curvature now is of a more abstract origin, since it is not associatedwith (tangent) vectors of real space, but rather with the complex Hilbert space vectors.

As pointed out for the geometry of vectors in Hilbert space we may make a distinction be-tween two related geometrical structures associated with the real and imaginary parts ofγij. Thisis also the case here, and we note thatfij corresponds to the imaginary (or antisymmetric) partof the scalar product of the derivatives|∂iχx〉 and|∂jχx〉. To discuss the real part we first have

Page 131: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

5.1. GEOMETRY IN HILBERT SPACE 131

to consider the distinction between changes in the Hilbert space vectors and changes in the cor-responding physical state. Since all states differing by a phase factor are considered equivalent,a change of a state|∂iχx〉 that is proportional to the state itself is not considered as a change inthe physical state. To take that into account we define the projected derivative

Di|χx〉 = ∂i|χx〉 − |χx〉〈χx|∂iχx〉 = (∂i − iai(x))|χx〉 (5.45)

This derivative only takes into account real changes in the state, and we note the similarity withthe expression for minimal coupling of a matter field to an electromagnetic field.

The scalar product of two such derivatives is

〈Diχx|Djχx〉 = 〈∂iχx|∂jχx〉 − aiaj〈χx|χx〉 (5.46)

and we note that the projection (and thereby the presence of thea-field) only affects the real(symmetric) part of the scalar product. Separated in the real and imaginary parts we define

gij = 2Re〈Diχx|Djχx〉 = 2(Re〈∂iχx|∂jχx〉 − aiaj) (5.47)

while the fieldfij previously introduced can be written as

fij = 2Im〈Diχx|Djχx〉 = 2Im〈∂iχx|∂jχx〉 (5.48)

The symmetric matrixgij we interpret as defining ametric structureon the space of quantumstates, while the antisymmetric matrixfij defines a structure that is referred to as asymplecticstructure. In reality these two structures are not independent, but closely related.

In the discussion of the geometry of quantum states given above we have referred to thestate vectors rather than the density operators that we earlier have stressed as giving a betterrepresentation of the physical states. However the metric structure can easily be reformulated interms of the density operators. We recall that the density operator specified by the parametersxprojects on the one-dimensional subspace defined by the pure state|χx〉,

ρx = |χx〉〈χx| (5.49)

It is invariant under phase changes of the vector. It is now straight forward to check that themetric tensor can be written as

gij = Tr(∂iρ∂jρ) (5.50)

which is in fact the natural metric derived from the scalar product of matrices

〈A|B〉 = Tr(A†B) (5.51)

One should also note that the expression (5.50) can be viewed as defining a metric on the spaceof all quantum states, including mixed states in addition to pure states.

As far as the antisymmetric matrixfij is concerned the situation is somewhat different. Inthe same way asgij it is invariant under phase transformations of the states|χx〉 and thereforedefines a structure on the set of physical states. However, it cannot be given a simple expressionin terms of the density matrices. This also means that the extension of this from pure to mixedstates is not so obvious.

Page 132: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

132 CHAPTER 5. QUANTUM MECHANICS AND GEOMETRY

5.1.5 Example 1. Geometry of the Bloch sphere

Let us apply the formalism to the two level system. We then identifyx with the unit vectornwhich defines the surface of the Bloch sphere, where the pure states are located. As coordinatesof the sphere we use as before the polar angles(θ, φ). As set of normalized basis vectors we use

|χn〉 ≡ |n〉 = e−iφ cosθ

2|χ+〉+ sin

θ

2|χ−〉 (5.52)

The vectors|χ+〉 and |χ−〉 is a set of orthonormalized states, corresponding to ”spin up” and”spin down” along the z-axis. The connection, or vector potential derived from this set of basisstates is,

aθ = i〈χn|∂θχn〉 = 0

aφ = i〈φn|∂φχn〉 = sin2 θ

2(5.53)

We also find

〈∂θχn|∂θχn〉 = 1

〈∂θχn|∂φχn〉 = − i2

sinθ

2cos

θ

2

〈∂φχn|∂φχn〉 = sin2 θ

2(5.54)

From these expressions we can derive the matrix elements of the metric tensor

gθθ = 2(Re〈∂θχn|∂θχn〉 − a2θ) = 2

gθφ = 2(Re〈∂θχn|∂φχn〉 − aθaφ) = 0

gφφ = 2(Re〈∂φχn|∂φχn〉 − a2φ) = 2 sin2 θ (5.55)

We note that this is (up to a constant factor) identical to the standard metric of the sphere. Wefurthermore find for the antisymmetric tensor

fθφ = 2Im〈∂θχn|∂φχn〉 =1

4sin θ (5.56)

(The θθ andφφ vanish due to the antisymmetry.) By analogy with the magnetic field we notethat this field can also be represented as a three-dimensional vector

b = f θφ eθ × eφ

= gθθgφφfθφ eθ × eφ

=1

2n (5.57)

Thus, it is like a radially directed magnetic field, amagnetic monopolefield.

Page 133: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

5.2. ADIABATIC EVOLUTION AND BERRY’S PHASE 133

5.2 Adiabatic evolution and Berry’s phase

In the previous section we have considered the geometry of the space ofall physical states,parametrized by a set of coordinatesx. But the formalism can equally well be used whenxis a set of coordinates for asubmanifoldin this space, so that the basis vectors|χx〉 define acontionuous subset of rays in the full space of rays. In this section we will consider the casewhere such a submanifold is defined by theground stateof a Hamiltonian that depends on a setof (external) parameters. If the parameters change with time sufficiently slowly (compared to thetypical time scale of the quantum system), the evolution of the quantum state will be restrictedto the submanifold of the (parameter dependent) ground state.

5.2.1 The adiabatic approximation

We consider the eigenvalue problem for a HamiltonianH = H(x) that depends on a set ofcontinuous parametersx = x1, x2, ...xd,

H(x)|χnx〉 = En(x)|χnx〉 (5.58)

The parametersx we assume to be slowly changing with time,x = x(t), and the time evolutionof the system is described by the Schrodinger equation

ihd

dt|ψ〉 = H(x(t))|ψ〉 (5.59)

The time dependent state|ψ(t)〉 can be expanded in terms of the (time dependent) eigenstates ofH(x(t)),

|ψ(t)〉 =∑n

un(t)|χnx〉 (5.60)

and we note that the time dependence then sits both in the expansion coefficientun and in thetime dependent basis vectors|χnx〉 = |χnx(t)〉.

The time derivative of the state vector is

d

dt|ψ〉 =

∑n

dundt|χnx〉+

∑n

unxk∂k|χnx〉 (5.61)

where in the last term have used the fact that the time dependence of the basis vector entersthrough the time dependence ofx. (For the coordinates we have also made use of the Einsteinsummation convention.) Entered into the time dependent Schrodinger equation, this gives for theexpansion coefficientun,

ihun + ih∑m

umxk〈χnx|∂kχmx 〉 = Enun (5.62)

We pull out the factor that depends on the the energy term,

un(t) = vn(t) exp(− i

h

∫ t

0En(t

′)dt′) (5.63)

Page 134: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

134 CHAPTER 5. QUANTUM MECHANICS AND GEOMETRY

so that the coefficientvn satisfies the equation

vn = −∑m

vmxk〈χnx|∂kχmx 〉 exp

(i

h

∫ t

0(En(t

′)− Em(t′))dt′)

(5.64)

As initial condition we assume the system at timet = 0 to be in the ground state, and by(formally) integrating the equation we get,

vn(t) = δno −∑m

∫ t

0dt1vm(t1)x

k〈χnx|∂kχmx 〉 exp(i

h

∫ t1

0(En(t2)− Em(t2))dt2

)(5.65)

To proceed we make an assumption about slow time dependence in the following way. Wedefine theadiabatic limitso that the coordinatesx make a finite change in a time intervalT thattends to infinity. This can be expressed as the limitT →∞, xk → 0, with xk(0) andxk(T ) fixed.(In reality the approximation we shall do is good also for finiteT , as long as it is long comparedto the typical time of the evolution of the quantum system, as we shall see below. ) We considerthe evolution of the system for a finite subintervalδt << T . If δt is sufficiently short comparedto T , we can neglect any change inx and the energies become time independent. On such aninterval we approximate (5.65) by

vn(t+ δt) = vn(t)−∑m

vm(t)xk〈χnx|∂kχmx 〉∫ t+δt

tdt′ exp

(i

h(En(t)− Em(t))t′

)= vn(t)(1− δxk〈χnx|∂kχnx〉)

+ih∑m6=n

vmxk〈χnx|∂kχmx 〉

exp(ih(En − Em)

)En − Em

sin(1

h(En − Em)δt)

(5.66)

where in the last expression the termm = n has been separated fromm 6= n, and we haveassumed the energy levels to be non-degenerate. Whereas the first term has a contribution thatis linear inδt the other term does not, but shows instead an oscillatory behaviour forδt >>h/(En − Em). Since the amplitude is proportional toxk, this means that there is in reality notransition between states with finite difference in energy in the limitxk → ∞. In the adiabaticlimit all terms withm 6= n are therefore eliminated and Eq.(5.65) can be simplified to

vn(t) = δno(1−∫ t

0dt1vn(t1)x

k〈χnx|∂kχnx〉) (5.67)

Thus, when the system is in the ground state att = 0, it will in the adiabatic limit continue tobe in the ground state when this state evolves with the change of parametersx. This happensprovided the excited states are separated from the ground state by a finite gap.

5.2.2 Berry’ phase

In the adiabatic approximation, where all excitations to excited states can be neglected, the sys-tem simply stays all the time in the slowly evolving ground state. The evolution relative to the

Page 135: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

5.2. ADIABATIC EVOLUTION AND BERRY’S PHASE 135

chosen basis stateχ0x of the ground state is then reduced to a phase factor given byu0. The

interesting point to note is that this factor can be separated in a natural way in two parts. Thefirst part is anenergy dependentone which reduces to the standard factorexp((−i/h)E0t) whenthe parametersx are kept fixed. The other part isenergy independentand only depends onthe geometry of thex-dependent ground states and on the particular pathx(t) which is chosenamong these states. In order to show this, we consider the time evolution ofv0, but rather thanusing(5.67) which gives the the time evolution ofv0 implicitly, we return to the equation for thetime derivative. This now takes the form

v0 = −vmxk〈χ0x|∂kχ0

x〉= −ivmxkak(x) (5.68)

where we have introduced the connectionak defined on the submanifold ofx-dependent groundstates.

It is straight forward to integrate this equation, and if we write the phase factor asv0 =exp(−iφ0), we find for the phase,

φ0 =∫ t

0xk〈χ0

x|∂kχ0x〉dt =

∫ x(t)

x(0)ak(x)dx

k (5.69)

This phase is identical to the phase angle introduced by parallel transport of a state vector of thesubmanifold defined by the statesχ0

x, as we have earlier discussed. If we in particular consider aperiodic evolution of the state, withx(T ) = x(0), the phase angle is

φ0 =∮Cakdx

k =1

2

∫Sfij(x)dσ

ij (5.70)

wherefij = ∂iaj − ∂jai andS is a surface inx-space, with the the closed curveC, traversedduring the time evolution, as boundary. The full phase factor, with the energy dependent termincluded is then

u0 = exp(− i

h

∫ t

0E0(t

′)dt′ − iφ0) (5.71)

The geometrical significance of the contributionv0 to the phase factor introduced by adiabatictime evolution was first understood and described in detail by Michael Berry, and the geometricalphaseφ0 is therefore usually referred to as Berry’s phase. We note that for periodic motion thisphase is gauge independent in the sense that it does not depend on the phases chosen for thex-dependent basis vectors|χ0

x〉. This is not so for an open path, where the geometrical phase willdepend on the relative phases of the two basis vectors associated with the end points of the path.

5.2.3 Example: Spin motion in a magnetic field

As an example we consider again the motion of electron spin in a magnetic field. The spinHamiltonian is

H(B) = − eh

2mcσ ·B ≡ 1

2hωB · σ (5.72)

Page 136: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

136 CHAPTER 5. QUANTUM MECHANICS AND GEOMETRY

This time we think of theB-field as variable, and its components, which we may take as thepolar coordinates(B, θ, φ), then correspond to the set of coordinatesx in the previous section.For each value ofB(6= 0) there is a unique ground state and excited state associated with theHamiltonian. We label these as|χ±B〉,

H(B)|χ±B〉 = ±1

2hωB|χ±B〉 (5.73)

If e > 0 (which we in the following assume to be the case) the two vectorsB andωB point inopposite directions and the ground state is the ”spin up” state alongn = B/B, while the excitedstate is the ”spin down” state,

|χ±B〉 = | ± n〉 (5.74)

From this follows that whenB varies over all possible directionsn, the ground state vector|χ−B〉will trace out the full manifold of spin states, withn as a point on the Bloch sphere. In the sameway the excited states|χ+

B〉 will trace out the full manifold of spin states whenB is varied, butthe two set of states|χ−B〉 and|χ+

B〉 give different parametrization of the manifold in terms ofB,since for givenB the spin up and spin down states are located at opposite points on the Blochsphere.

Thus, the manifold of states defined by the ground state for variableB is the full manifold ofspin states on the surface of the Bloch sphere and the parametrization of the sphere in terms of(B, θ, φ) corresponds to the standard one in the sense that(θ, φ) are the standard polar coordi-nates previously used for the states on the sphere. We note that the spin state is independent ofB,so in this sense it is superfluous. However, rather than viewingB as defining a set of coordinateson the space of states, we may take the opposite view that the mapping between the space ofBvectors and the Bloch sphere introduces a geometrical structure on the parameter space describedby B. The connection defined on the three-dimensionalB space is

a−(B) = −i〈χ−B|∇Bχ−B〉 (5.75)

where∇B is the gradient in the three-dimensionalB space. Expressed in polar coordinates theconnection is

a−(B) = −i[eB〈χ±B|∂Bχ±B〉+ eθ

1

B〈χ±B|∂θχ±B〉+ eφ

1

B sin θ〈χ±B|∂φχ±B〉

]= eθ

aθ(θ, φ)

B+ eφ

aφ(θ, φ)

B sin θ

= − 1

2Btan

θ

2eφ (5.76)

where in the above expressionaθ andaφ are the two components of the connection on the Blochsphere, previously used in Sect. 5.1.5. The corresponding pseudo magnetic field is

b−(B) = ∇B × a−(B) =1

B2eB (5.77)

Page 137: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

5.2. ADIABATIC EVOLUTION AND BERRY’S PHASE 137

which should be distinguished from the real magnetic fieldB. We see that this has the form of amagnetic monopole field, a radially directed field which falls off quadratically with the distancefrom the origin in theB space. This scaling withB represents the independence of B for thegeometry, it only depends on the anglesθ andφ. Thus the form of the geometry here representedin B space is essentially the same as the one previously discussed as the geometry of the Blochsphere.

As pointed out above, the geometry of the Bloch sphere can alternatively be represented bythe the excited state rather than the ground state. This gives a slightly different representation ofthe geometry inB space. The connection now is

a+(B) = −i〈χ+B|∇Bχ

+B〉 = − 1

2Bcot

θ

2eφ (5.78)

with a corresponding pseudo magnetic field

b+(B) = ∇B × a+(B) = − 1

B2eB (5.79)

Also this is a monopole field, but now the direction of the field inB space has been reversed.The form of the monopole field shows that there is a geometrical singularity at the point

B = 0, the location of the monopole. A more physical understanding of this singularity is thatit corresponds to a point in parameter space where the ground state is degenerate. Althoughthe relation between the degeneracy of the energy levels and the geometrical singularity hereis demonstrated only for a two-level system, it is of interest to note that such a relation existsmore generally. Thus, also in higher-dimensional quantum systems degeneracies in energy canbe viewed as geometrical singularities in the parameter space. If two levels coincide in energyat some point, the geometry close to this point is essentially determined only by these two stateswhile the other energy states in a sense are only ”spectators”. In this sense the geometry we havediscussed for the two-level system is relevant for the geometry of higher-dimensional systemsclose to points of degeneracy.

After this discussion of representations of the geometry in parameter space, we return to thequestion of adiabatic evolution and Berry phases. We consider a situation where the magneticfieldB is slowly changing at the time scale set by the precession frequency of the the spin vector,and assume that theB field after a timeT returns to its original value. The adiabatic conditionthen is

T >>2π

ωB= 2π

∣∣∣∣mceB∣∣∣∣ (5.80)

For simplicity let us assume thatB stays fixed under the evolution so that only the direction ofB changes. We assume the spin initially (t = 0) to be in the ground state, and from the previousdiscussion it will at the end of the period(t = T ), return to the same state, but now modified bya phase factor. Since the ground state energy is unchanged we find the phase factor to be

u− = exp(−eBh2mc

T + φ−) (5.81)

Page 138: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

138 CHAPTER 5. QUANTUM MECHANICS AND GEOMETRY

whereφ− is the Berry phase of the ground state associated with the closed curve in parameterspace. It is

φ− =∮C

a− · dB = Ω (5.82)

whereC is the closed curve inB space andΩ is the solid angle traced out by theB vector whenfollowing the curve.

If the spin initially is in the excited state rather than the ground state it will return to the samestate but now modified with a different phase factor. This factor is determined in a similar way,and has the form

u+ = exp(eBh

2mcT + φ+) (5.83)

where the Berry phaseφ+ is now is

φ+ =∮C

a+ · dB = −Ω (5.84)

We note that both the dynamical part of the phase angle and the Berry phase have changed signrelative to those of the ground state. The change of sign of the dynamical phase angle is due tothe change in sign of the energy while the change in sign of the Berry phase is due to the fact thatwhenB (which defines the orientation of the spin up state) traces out a curve with solid angleΩ, the vector− B (which defines the orientation of the spin down state) will trace out a curvewhich covers the same solid angle, but in the opposite direction.

To give a more physical understanding of the meaning of the Berry phases associated with thetwo energy states let us finally consider a situation where the spin initially is in a superposisionof spin up and spin down,

|ψ(0)〉 = cos ξ |χ−B〉+ sin ξ |χ+B〉 (5.85)

In this case we distinguish between the direction of theB field, described by the unit vectornB = B/B and the direction of the spin vector given by,

n = 〈ψ|σ|ψ〉 (5.86)

The qualitative description of the spin motion now is that is composed by two motions. There isa slow (adiabatic) change in the direction of theB field, and superimposed on this is the rapidprecession of the spin vectorn aroundB.

To be more specific, we first note that in the adiabatic limit the angleξ between the magneticfield and the spin vector stays fixed during the evolution. This follows since there is no transitioninduced between the two energy levels. However, the relative phase between the two componentsof the state vector will change, and as shown by earlier expressions for the spin state (Sect.1.3.1),the relative phase determines the rotation angle ofn around theB field. Thus, the state vector att = T has a similar form as the initial state when expressed in terms of spin up and spin down

Page 139: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

5.3. GEOMETRICAL INTERPRETATION OF ELECTROMAGNETIC FIELDS 139

B

S

z

x

y

Figure 5.4:Spin precession in a slowly varying magnetic field. The direction of the magnetic fieldBis slowly rotating around thez-axis (indicated by the blue cone), while the spin vectorS = (h/2)n israpidly precessing around theB field (indicated by the red cones). The total precession angle is composedof a dynamical part determined by the spin energy (and the time spent on the path) and a geomatrical partdetermined by the solid angle traced out by theB field.

alongnB, but each of these states are modified by the phase factors picked up during the timeevolution

|ψ(T )〉 = cos ξ exp(−eBh2mc

T + φ−)|χ−B〉+ sin ξ exp(eBh

2mcT + φ+)|χ+

B〉 (5.87)

This shows that the relative phase of the two states is

φ =eBh

mcT + 2φ+ =

eBh

mcT + 2Ω (5.88)

and this phase angle is equal to the total rotation angle of the spin vectorn performed during thetime evolution. Such a rotation angle can be detected experimentally, and spin measurementshave demonstrated the presence of the additional rotation angle2Ω given by the Berry phase.

5.3 Geometrical interpretation of electromagnetic fields

In the above discussion a certain formal analogy between geometrical variables and electromag-netic fields has been pointed out. Thus, the connectionak corresponds to the electromagnetic po-tentials and the curvature fieldfij (or bk in three dimensions) corresponds to the electromagneticfield tensor. Similarly invariance under phase transformations of the basis vectors correspondsto invariance under gauge transformations. The purpose of this section is to show more directlyhow the quantum description of a charged particle interacting with an electromagnetic field canbe given a geometrical interpretation.

Page 140: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

140 CHAPTER 5. QUANTUM MECHANICS AND GEOMETRY

5.3.1 Local spaces of quantum states

In the general, abstract formulation of quantum mechanics, all representations correspondingto different choices of bases in Hilbert space are treated on the same footing. They are unitar-ily equivalent. In particular the coordinate representation is intrinsically no different from themomentum representation, although the Hamiltonian will normally make a distinction betweenthese variables. However, the generalization of the quantum formalism to be studied in thissectiondoesmake a distinction, with the coordinate representation playing a special role.

As a reminder, let us focus on the relation between the abstract state vector and a coordinaterepresentation of the same vector. For a spinless particle we write this as

|ψ〉 =∫dxψ(x)|x〉 (5.89)

where|x〉 is a coordinate basis state withψ(x) as the corresponding complex valued wave func-tion. (The variablex is here represented as a single continuous coordinate, but more generally itmay correspond to a set of coordinates for one or several particles in more than one dimension.)If the particle has spin or otherinternal degrees of freedom, a complete specification of a basisvector implies the quantization of both position and spin, and the expansion on the coordinatebasis then takes the form

|ψ〉 =∫dx∑i

ψi(x)|x, i〉 (5.90)

with the discrete indexi representing the internal degree of freedom. With no special link be-tween the coordinate space and the space of the internal degree of freedom, we may view theHilbert space as a tensor product space of a coordinate state space and an internal state space,

H = Hcoord ⊗Hinternal (5.91)

Similarly the basis vectors of the expansion (5.90) can be viewed as a product basis in this space,

|x, i〉 = |x〉 ⊗ |i〉 (5.92)

In the generalization now to be discussed we restrict the discussion to the quantum mechan-ics of a single particle, wherex represents either points in three dimensional space or four-dimensional space-time. We further make the fundamental assumption that the internal space isnot completely independent of the coordinate space. We assume instead that the spaceHinternal

to be associated with a particular pointx in coordinate space in a similar way as the tangent spaceof a curved manifold is associated with a point on the manifold. Let us denote such an internalspace associated with the pointx byHint

x . Thex-dependence of this space means that the Hilbertspace can no longer be viewed as the tensor product of a coordinate space and an internal space.

In the same way as a vector fieldf(x) is a function which takes value in the tangent spaces,the wave functionΨ(x) is now considered to be a function which takes values in the local spacesHintx . This means thatΨ(x) is primarily an abstract function which only when expanded in the

local basis gets the form of a (multicomponent) complex valued wave function,

Ψ(x) =∑k

ψk(x)χxk (5.93)

Page 141: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

5.3. GEOMETRICAL INTERPRETATION OF ELECTROMAGNETIC FIELDS 141

We note that even when there is no internal degree of freedom, so that the local vector space isa one-dimensional complex space, this generalization gives meaning, so that the wave function,written as

Ψ(x) = ψ(x)χx (5.94)

is reduced to the complex wave functionψ(x) only after a local basis vectorχx is chosen.This formulation of quantum mechanics is formally similar to the one given in the preceding

sections, where the local basis vector was denoted by|χx〉. However there is a main difference,which is stressed by writing the local basis vector asχx and not as a ”ket”. In the previousdiscussion the one-dimensional space spanned by the the local basis vector was a subspace inthe larger, flat Hilbert space of states, and each subspace corresponded to a physical quantumstate. This is no longer true for the subspace spanned by the local basis vectorχx. The localspace is not a subspace of the Hilbert space, and the quantum state is not represented by a vectorbelonging to one of the local spaces. The quantum state is instead represented by afunctionΨ(x), and the Hilbert space is the vector space of these (quadratically integrable) functions.

Comparing with the geometry of tangent spaces, the correspondence now is between theHilbert space and an infinite-dimensional space of vector fields defined on the continuousbundleof tangent spaces. And since the local spaces no longer are subspaces of the same higher dimen-sional vector space, this corresponds to the situation where the tangent spaces are derived from acurved manifold that is no longer embedded in a higher dimensional flat space. It is well knownfrom the time of Riemann that such an embedding is not needed to introduce a geometrical struc-ture on the manifold. The same is the situation with the continuum of internal quantum spaces. Ageometrical structure can be introduced without reference to a higher dimensional space. Againwe focus onparallel transportas a basic operation which connect neighbouring spaces, but nowthis operation has to be introduced without reference to the scalar product of the Hilbert space.We introduce parallel transport as a basic relation of the formalism and study the question of howto interpret this relation in physical terms.

5.3.2 Parallel transport and gauge fields

At the formal level the geometry of the local spaces can be introduced in a similar way as inprevious sections, with the exception that equations that refer to the embedding in the higherdimensional space should be omitted. Thus, parallel transport between neighbouring spaces canbe represented by a mapping which acts on the basis vectors in the following way,

P (x, x+ dx)χx = (1− iqAk(x)dxk)χx+dx (5.95)

The equation expresses that, with respect to the parallel transport operation, the basis vectorvaries continuously withx. Therefore the change of the vector is first order in the displacementdx. The transformation is assumed to be unitary (norm preserving) so that the connectionAk is areal valued field. The physical interpretation of this field is that it describes the electromagneticpotential, and the parameterq which has been extracted form the connection is the charge of the

Page 142: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

142 CHAPTER 5. QUANTUM MECHANICS AND GEOMETRY

particle described by the formalism. Thus, the formalism that is now discussed is a geometricalformulation of the quantum mechanics of a charged particle in an electromagnetic field.

The curvature tensor associated with the connectionAk(x) is

Fkl = ∂kAl(x)− ∂lAk(x) (5.96)

and this is now identified as the electromagnetic field tensor. We note that ifx represents the threedimensional space coordinates only the part which represents the magnetic field is included. Ifthe full electromagnetic field tensor should be included,x has to represent the four coordinates ofspace-time. This means that the local spaces, with the basis vectorsχx, are considered as beingindependent spaces not only with respect to a displacement in space, but also to a displacementin time.

Parallel transport around a closed curveC at fixed time gives rise to a phase factor whichnow has a direct physical meaning. It is

P (C) = exp(−iq∮C

A · dx)

= exp(−iq∫SB · dS)

= exp(−iqΦm) (5.97)

with B = ∇ ×A andA as the 3-vector part of the electromagnetic potential, and furtherS isa surface withC as boundary, whileΦm is the total magnetic flux throughS. Thus the phaseangle introduced by transport around the curve is proportional to the magnetic flux enclosed bythe curve, with the electric charge as the proportionality factor.

The set of (normalized) basis vectorsχx is uniquely defined only up to a local phase factor.Thus a redefinition of the basis,

χx → χ′x = exp(iθ(x))χx (5.98)

will change the wave function of Eq. (5.94) and the potentialAk(x) in the following way,

ψ(x) → ψ′(x) = exp(−iθ(x))

Ak(x) → A′k(x) = Ak(x)−

1

q∂kθ (5.99)

since the (abstract) functionΨ(x) as well as the unitary transformationP (x, x+dx) we consideras independent of choice of basis vectors. This transformation we identify as an electromagneticgauge transformation, and within the present approach it is viewed as caused by a change of basisin the continuous set of local quantum spaces.

The gauge independent quantities are those that do not depend on a specific choice of basisstates. They represent the geometrical objects of the theory and correspond to the basic physicalvariables. The abstract wave functionΨ(x) is gauge independent, while the corresponding com-plex valued functionψ(x) is not. Differentiation of this function does not represent any gauge

Page 143: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

5.3. GEOMETRICAL INTERPRETATION OF ELECTROMAGNETIC FIELDS 143

Φm

Figure 5.5:Visualizing the geometry of a magnetic field confined to a flux tube. Only within the flux tubethere is a non-vanishing local curvature. This will however give rise to a global curvature in the regionaround the tube. This is similar to the geometry of a cone. Parallel transport of a vector around the conereveals the global curvature in the form of the rotation angle of the vector. The global effect of the fluxtube is revealed through the phase factor of the wave function for a charged particle that can traverse theregion around the tube.

independent quantity, butcovariant differentiationdoes. To introduce covariant differentiationDkΨ in an obviously gauge invariant form, we write

dxkDkΨ = Ψ(x+ dx)− P (x, x+ dx)Ψ(x) (5.100)

The covariant derivativeDk is in a specific basis represented byDk, with

DkΨ = (Dkψ)χx (5.101)

and from the expression for the parallel transport (5.95) we find

Dk = ∂k − iqAk (5.102)

This form of the derivative gives rise to theminimal couplingof the charged particle to theelectromagnetic field, and we note that the standard non-relativistic Schrodinger equation by useof the covariant derivative can be written in the gauge independent form

ihDtΨ = − 1

2m

3∑k=1

Dk2Ψ (5.103)

with Dt = ∂t − iqA0 as the covariant derivative in the time direction.

Page 144: Non-Relativistic Quantum Mechanics - Universitetet i oslo...Non-Relativistic Quantum Mechanics Lecture notes – FYS 4110 Jon Magne Leinaas Department of Physics, University of Oslo

144 CHAPTER 5. QUANTUM MECHANICS AND GEOMETRY

To summarize this section we note from the above discussion that the electromagnetic fieldcan be viewed as defining a geometrical structure that affects a charged particle through the con-nection it defines on the set of local quantum spaces. In the discussion of adiabatic spin motionmotion we have earlier showed how the geometry of quantum states can be viewed as defining ageometry in parameter space instead of on the manifold of quantum states. A similar view can betaken here. The electromagnetic field defines a geometrical structure on physical space-time, andfor a charged particle this structure affects the quantum field through the geometrical structure ofthe continuous set of local spaces. It is interesting to note that the geometrical structure definedby the antisymmetric field tensorFij in this sense supplements the metric structure of space-timedefined by the symmetric tensorgij.