8 lorentz invariance and special relativity

87
8 Lorentz Invariance and Special Relativity The principle of special relativity is the assertion that all laws of physics take the same form as described by two observers moving with respect to each other at constant velocity v. If the dynamical equations for a system preserve their form under such a change of coordinates, then they must show a corresponding symmetry. Newtonian mechanics satisfied this principle in the form of Galilei relativity, for which the relation between the coordinates was simply r (t)= r(t) vt, and for which time in the two frames was identical. However, Maxwell’s field equations do not preserve their form under this change of coordinates, but rather under a modified transformation: the Lorentz transformations. 8.1 Space-time symmetries of the wave equation Let us first study the space-time symmetries of the wave equation for a field component in the absence of sources: 2 1 c 2 2 ∂t 2 ψ = 0 (411) As we discussed last semester spatial rotations x k = R kl x l are realized by the field transfor- mation ψ (x ,t)= ψ(x,t)= ψ(R 1 x ,t). Then k ψ = R 1 lk l ψ = R kl l ψ, and the wave equation is invariant under rotations because 2 is a rotational scalar. If we have set up a fixed Cartesian coordinate system we may build up any rotation by a sequence of rotations about any of the three axes. Instead of specifying the axis of one of these basic rotations, it is more convenient to specify the plane in which the coordinate axes rotate. For example, we describe a rotation by angle θ about the z -axis as a rotation in the xy-plane. x = cos θx sin θy , y = sin θx + cos θy . (412) Then it is easy to see from the properties of trig functions that 2 x + 2 y + 2 z = 2 x + 2 y + 2 z , (413) under a rotation in any of the three planes, and through composition under any spatial rotation. There must be a similar symmetry in the xt-, yt-, and zt-planes. But because of the relative minus sign we have to use hyperbolic trig functions instead of trig functions: x = cosh λx + sinh λ c∂t , c∂t = + sinh λx + cosh λ c∂t (414) The invariance of 2 x 2 /c 2 ∂t 2 under this transformation then follows. Clearly there are analogous symmetries in the yt- and zt-planes. These transformations will replace the Galilei boosts of Newtonian relativity. 85 c 2010 by Charles Thorn

Upload: others

Post on 12-Sep-2021

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 8 Lorentz Invariance and Special Relativity

8 Lorentz Invariance and Special Relativity

The principle of special relativity is the assertion that all laws of physics take the same formas described by two observers moving with respect to each other at constant velocity v. Ifthe dynamical equations for a system preserve their form under such a change of coordinates,then they must show a corresponding symmetry. Newtonian mechanics satisfied this principlein the form of Galilei relativity, for which the relation between the coordinates was simplyr′(t) = r(t) − vt, and for which time in the two frames was identical. However, Maxwell’sfield equations do not preserve their form under this change of coordinates, but rather undera modified transformation: the Lorentz transformations.

8.1 Space-time symmetries of the wave equation

Let us first study the space-time symmetries of the wave equation for a field component inthe absence of sources:

−(

∇2 − 1

c2∂2

∂t2

)

ψ = 0 (411)

As we discussed last semester spatial rotations x′k = Rklxl are realized by the field transfor-

mation ψ′(x′, t) = ψ(x, t) = ψ(R−1x′, t). Then ∇′kψ

′ = R−1lk ∇lψ = Rkl∇lψ, and the wave

equation is invariant under rotations because ∇2 is a rotational scalar. If we have set up afixed Cartesian coordinate system we may build up any rotation by a sequence of rotationsabout any of the three axes. Instead of specifying the axis of one of these basic rotations, itis more convenient to specify the plane in which the coordinate axes rotate. For example,we describe a rotation by angle θ about the z-axis as a rotation in the xy-plane.

∇′x = cos θ∇x − sin θ∇y, ∇′

y = sin θ∇x + cos θ∇y. (412)

Then it is easy to see from the properties of trig functions that

∇′2x +∇′2

y +∇′2z = ∇2

x +∇2y +∇2

z, (413)

under a rotation in any of the three planes, and through composition under any spatialrotation.

There must be a similar symmetry in the xt-, yt-, and zt-planes. But because of therelative minus sign we have to use hyperbolic trig functions instead of trig functions:

∇′x = coshλ∇x + sinhλ

c∂t,

c∂t′= +sinhλ∇x + coshλ

c∂t(414)

The invariance of ∇2x − ∂2/c2∂t2 under this transformation then follows. Clearly there are

analogous symmetries in the yt- and zt-planes. These transformations will replace the Galileiboosts of Newtonian relativity.

85 c©2010 by Charles Thorn

Page 2: 8 Lorentz Invariance and Special Relativity

Now let us interpret this symmetry in terms of how the coordinates transform:

x′ = x coshλ− ct sinhλ ≡ γ(x− vt) (415)

ct′ = ct coshλ− x sinhλ ≡ γ(ct− vx/c) (416)

where v = c tanhλ. From the identity cosh−2 = 1 − tanh2, we see that γ ≡ coshλ =1/√

1− v2/c2. We see from the first equation that the origin of the primed coordinatesystem x′ = 0 corresponds to x = vt which means that the origin of the primed system ismoving on the x-axis at the speed v. We say that the primed system is boosted by velocityv = vx with respect to the unprimed system. It is also essential for the invariance of thewave equation that the time t transform as shown.

The three rotations in the xy-, yz-, and zx-plane together with the three boosts in thext-, yt-, and zt-planes generate the group of Lorentz transformations. If the only entitiesin the physical world were fields satisfying the wave equation, we would have to identifythe Lorentz boosts with transformations to new inertial frames. As we shall see, Maxwell’sequations are also invariant under Lorentz transformations, provided that the electric andmagnetic fields are appropriately transformed among themselves. Indeed we already knowthat the electric and magnetic fields transform as vectors under ordinary rotations. Todiscuss their transformation under boosts we will have to develop the formalism of 4-vectorsand 4-tensors.

8.2 Einstein’s Insights

Einstein’s contribution to special relativity was not the discovery of Lorentz transformations(which were already well-known). Rather, he recognized that the clash between Lorentzinvariance and the Galilei invariance of Newtonian mechanics was inconsistent with thephysical principle of relativity. The laws of Maxwell electrodynamics and Newtonian me-chanics cannot both look the same to different inertial observers, since they require differenttransformation rules. One or the other required modification, and Einstein realized thatthe defect was in Newtonian mechanics, for which inertial frames are related by Galileitransformations

r′ = r − vt, t′ = t (417)

For v << c Lorentz transformations are indistinguishable from Galilei transformations,which accounts for the lack of experimental evidence at the time that Newtonian mechanicswas wrong. On the other hand experiments explicitly designed to show that Maxwell’sequations took different forms in different inertial frames had repeatedly obtained null results.Einstein recognized this as direct evidence that Lorentz transformations gave the correctrelationship between inertial frames. If this is the case, Newtonian mechanics has to bemodified. Accordingly, Einstein developed a new relativistic (i.e. Lorentz invariant) particlemechanics.

An alternative way to discover the necessary modifications to Newtonian mechanics is todescribe massive matter with Lorentz invariant wave equations. Unlike light, which always

86 c©2010 by Charles Thorn

Page 3: 8 Lorentz Invariance and Special Relativity

travels at speed c, material bodies can be brought to rest. The wave equation for lightdescribes wave packets with definite wave number with group velocity vg = ck, which alwayshas magnitude c. However consider the (Klein-Gordon) wave equation

(

−∇2 +1

c2∂2

∂t2+ µ2

)

ψ = 0 (418)

Then a plane wave eik·r−iωt will solve this equation if ω2 = c2(k2+µ2), so the group velocity

is vg = ck/√

k2 + µ2, which vanishes as k → 0. On the other hand the new wave equationis transparently Lorentz invariant, in the same way that the massless wave equation is.

Of course, in Einstein’s time the idea that material bodies were wave packets must haveseemed a little far-fetched. so he, instead, worked to directly modify the laws of particlemechanics. Ironically, Einstein was actually the first to realize that because of quantummechanics electromagnetic waves display particle-like properties: the photon can be thoughtof as a plane wave with energy E = ~ω and momentum p = ~k. Applying this insight toour massive wave packet leads, with ~µ ≡ mc, to

E = c√

p2 +m2c2 ∼ mc2 +p2

2m+O

(

p4

m3c2

)

, vg = cp

p2 +m2c2(419)

where the last form on the right displays the low momentum limit p≪ mc. This shows thatthe Newtonian kinetic energy is obtained at low momentum and that m is the Newtonianmass. At p = 0 we have Einstein’s famous E = mc2!

We can easily solve for p in terms of v:

v2

c2=

p2

p2 +m2c2, E =

mc2√

1− v2/c2= γmc2, p =

vE

c2= γmv (420)

As we shall soon see this modification of the relation between momentum and velocity issufficient to make particle mechanics consistent with Lorentz invariance. In particular weshall see that with the Lorentz force law and this relation, Newton’s equation of motion inthe form

dp

dt= F = q(E + v ×B) (421)

is valid in all inertial frames connected by Lorentz transformations. But to see this clearly,we need to develop the machinery of 4-vectors and 4-tensors and their transformation laws.

8.3 Some Kinematical Aspects of Lorentz transformations

Time Dilatation Let us consider a clock moving down the x-axis according to x(t) =vt, y(t) = z(t) = 0. The clock is at rest at the origin of coordinates in the primed frame withspacetime coordinates given by

x′ = γ(x− vt), t′ = γ(t− vx/c2), y′ = y, z′ = z (422)

87 c©2010 by Charles Thorn

Page 4: 8 Lorentz Invariance and Special Relativity

The time registered on the clock is t′, which is related to the time in the unprimed frame by

t′ = γ(v)(t− vx(t)/c2) = γ(v)(t− v2t/c2) = t√

1− v2/c2 (423)

where we took into account the fact that the x coordinate of the clock is changing in theunprimed frame, x(t) = vt. Thus the observer in the unprimed frame judges the clock torun more slowly than in its rest frame by a factor of 1/

1− v2/c2 = γ(v).Now consider a clock at rest at the origin in the unprimed frame. Now the relation

between the times in the two frames is simply t′ = γt. And the primed observer also judgesthis clock to run more slowly by the same factor γ(v). Each observer concludes that the otherobserver’s clocks slow down! This effect is literally confirmed daily in accelerator experimentswhere fast unstable particles are able to travel distances many times their lifetime times c.

More generally, consider a clock in arbitrary motion, described by a trajectory r(t) in theunprimed frame. At a given moment we can consider a primed frame in which the clock isinstantaneously at rest. Then assuming at that moment r = vx the Lorentz transformationof an infinitesimal time interval dt′ to the Lab frame reads dt′ = γ(dt − vdx/c2) = γ(1 −v2/c2)dt = dt

1− v2/c2. By rotational invariance, if r = v is in a general direction this

implies dt′ = dt√

1− v2/c2. Then a finite lapse of time in the instantaneous rest frame∆T = t′2 − t′1 is given by the integral

∆T =

∫ t2

t1

dt

1−(

1

c

dr

dt

)2

< t2 − t1 . (424)

The inequality confirms that a moving clock slows down no matter what the details of themotion. In particular the clock could start out at rest at the origin, accelerate to some speed,decelerate to rest, turn around accelerate and decelerate to arrive at rest where it started.The clock will show an elapsed time ∆T whereas a clock that stays at rest at the origin willshow an elapsed time t2− t1 > ∆T . Because the clock is undergoing acceleration an observermoving with the clock is no longer an inertial observer (he is feeling all kinds of (fictitious)centrifugal forces), so there is no longer a symmetry between the moving observer and theone at rest.

Invariant intervals and the Light Cone Points in spacetime are more precisely thoughtof as events. By construction Lorentz transformations leave the quantity x · x = x2 − c2t2

invariant. But since all events are subject to the same transformation, the “interval” betweentwo events s212 = (x1 − x2) · (x1 − x2) is also invariant. Intervals can be positive (space-like),negative (time-like) or zero (light-like). If one of the two events is at the origin xµ1 = 0,the events light-like separated from the origin lie on a double cone with vertices at theorigin. Events within the cone have time-like separation from the origin, and those outsidehave space-like separation. There can be no causal connection between space-like separatedevents because they cannot be connected by light signals. We may define the proper time τof a body in motion through dτ 2 = −dx · dx/c2 = dt2 − dr2/c2 = dt2(1− r2/c2). Note thata clock moving with the body registers the body’s proper time.

88 c©2010 by Charles Thorn

Page 5: 8 Lorentz Invariance and Special Relativity

Length Contraction Consider a rod of length L at rest in the unprimed frame parallel tothe x-axis, with ends, say at x1 = 0 and x2 = L. In the primed system it is moving in thenegative x direction at speed v: x′1 = γ(−vt), x′2 = γ(L − vt). Its length will be judged tobe L′ = x′2(t

′) − x′1(t′) where the coordinates are measured at the same t′. But these will

correspond to different values of t: t′ = γ(t1) = γ(t2 − vL/c2), so t2 = t1 + Lv/c2. Then

L′ = γ(L− vt2) + γvt1 = γL(1− v2/c2) = L√

1− v2/c2 (425)

A moving object is therefor observed to be contracted by a factor 1/γ in the direction ofmotion.

We can also consider the rod at rest in the primed frame, in which case x′1 = 0, x′2 = L.At a particular time t in the unprimed frame these coordinates are related to the unprimedones by x′1 = γ(v)(x1−vt), x′2 = γ(v)(x2−vt), so L = x′2−x′1 = γ(v)(x2−x1), so we reach thesame conclusion, that length is perceived to be contracted by motion x2−x1 = L

1− v2/c2.

Simultaneity is relative Simultaneous events (x1, t), (x2, t) in the unprimed frame cor-respond to events (x′1, t

′1), (x

′2, t

′2) where t

′2 − t′1 = γv(x1 − x2)/c

2 in the primed frame. Ofcourse two simultaneous events at the same spatial point are simultaneous in all frames.

Addition of velocities Suppose a particle is moving with constant velocity (Vx, Vy, Vz)in the primed frame, so x′ = Vxt

′, y′ = Vyt′, z′ = Vzt

′. Then x − vt = Vx(t − vx/c2),y = Vyγ(t− vx/c2), z = Vzγ(t− vx/c2).

x =v + Vx

1 + vVx/c2t, y =

Vyγ(1 + vVx/c2)

t, z =Vz

γ(1 + vVx/c2)t (426)

Calling the components of velocity in the unprimed frame (Ux, Uy, Uz) we see that

Ux =v + Vx

1 + vVx/c2, Uy =

Vyγ(1 + vVx/c2)

, Uz =Vz

γ(1 + vVx/c2)(427)

U 2 =(v + Vx)

2 + (V 2 − V 2x )(1− v2/c2)

(1 + vVx/c2)2

= c2 − c2(1− V 2/c2)(1− v2/c2)

(1 + vVx/c2)2(428)

From the last form we see that U 2 → c2 when either v2 → c2 or V 2 → c2 or both. Thus onenever achieves a velocity greater than c by adding two velocities, no matter how close theyare to c.

8.4 Space-time Tensors and their Transformation Laws

Let us introduce a unified notation xµ for space-time coordinates where the index assumesthe values µ = 0, 1, 2, 3 and we identify x0 = ct, x1 = x, x2 = y, x3 = z. The factor of cin x0 assures that all components of xµ have dimensions of length. Then a general Lorentz

89 c©2010 by Charles Thorn

Page 6: 8 Lorentz Invariance and Special Relativity

transformation can be expressed as a linear transformation x′µ = Λµνx

ν , where the 4 × 4matrix Λµ

ν is restricted by the requirement that x · x ≡ x2 − c2t2 is invariant under thetransformation. To express the consequences of this invariance, it is convenient to introducethe diagonal matrix ηµν , which has nonvanishing components η11 = η22 = η33 = −η00 = 1.Then we can write x · x = ηµνx

µxν , and the invariance condition becomes

x′ · x′ = ηµνΛµρΛ

νσx

ρxσ = ηρσxρxσ = x · x, ηµνΛ

µρΛ

νσ = ηρσ (429)

An immediate consequence of the second equation is (det Λ)2 = 1 implying det Λ = ±1.Notice that with the stated condition on Λ, it follows that v′ ·w′ = v ·w for any vµ, wµ whichtransform like coordinates: v′µ = Λµ

ρvρ and w′µ = Λµ

ρwρ.

Matrix Notation We can arrange the components of xµ in a column vector

x =

x0

x1

x2

x2

(430)

and the components of Λµν in a 4× 4 matrix

Λ =

Λ00 Λ0

1 Λ02 Λ0

3

Λ10 Λ1

1 Λ12 Λ1

3

Λ20 Λ2

1 Λ22 Λ2

3

Λ30 Λ3

1 Λ32 Λ3

3

. (431)

Then the equation x′µ = Λµνx

ν becomes simply x′ = Λx, and the equation ηµνΛµρΛ

νσ = ηρσ

reads

ΛTηΛ = η, η =

−1 0 0 00 1 0 00 0 1 00 0 0 1

(432)

As an example a boost v in the x-direction and a rotation in the xy plane are given by

Bx =

γ −γv/c 0 0−γv/c γ 0 0

0 0 1 00 0 0 1

Rxy =

1 0 0 00 cos θ − sin θ 00 sin θ cos θ 00 0 0 1

(433)

respectively.

Contravariant Tensors We define a contravariant 4-vector vµ as a 4 component entitywhose components transform just like the coordinates v′µ = Λµ

νvν . We also stipulate that

4-vectors are unchanged under space-time translations. For this reason the four coordinates

90 c©2010 by Charles Thorn

Page 7: 8 Lorentz Invariance and Special Relativity

xµ are not components of a 4-vector, because they change under translations x → x + a.But displacements, which are differences of coordinates are 4-vectors. In particular, thedifferentials dxµ are contravariant 4-vectors.

Another important contravariant 4-vector is the energy momentum 4-vector of a particlepµ = (E/c,p) = γm(c,v). The easiest way to see that it transforms as a 4-vector is toconsider the boost which takes the particle from rest to velocity v = vx. Let the unprimedframe be the one where the particle is at rest p1,2,3 = 0 and p0 = mc. Then the primed frameshould move in the negative x direction with speed v. If pµ is a contravariant 4-vector, itscomponents in the new frame should then be

p′0 = γ(p0 +v

cp1) = γmc, p′1 = γ(p1 +

v

cp0) = γmv (434)

which are precisely the values we have found for a particle with this velocity. An ordinaryrotation can then point the velocity in any direction. Incidentally, in our discussion of theDoppler shift we shall see that the frequency and wave number of a plane wave kµ = (ω/c,k)also transform like a 4-vector. This ensures that the Einstein/deBroglie quantum relationspµ = ~kµ are compatible with Lorentz invariance.

Doppler Shift We examine how a plane wave eik·r−iωt in the unprimed frame looks in theprimed frame. Thinking of kµ = (ω/c,k) as a 4-vector, the plane wave can be writteneik·x = eik

′·x′

. thus in the primed frame, we also have a plane wave but with frequency andwave numbers

ω′ = γ(ω − vkx), k′x = γ(kx − vω/c2), k′y = ky k′z = kz (435)

Writing kx = ω cos θ/c, ky = ω sin θ cosϕ/c, kz = ω sin θ sinϕ/c, and similarly for k′, θ′, ϕ′ =ϕ, these transformation laws lead to

ω′ = γω(1− v

ccos θ), tan θ′ =

sin θ

γ(cos θ − v/c)(436)

If θ = 0 the primed observer is moving away from the source of light and we see in this caseω′ = ω

(c− v)/(c+ v) < ω which means a red shift. On the other hand, if θ = π the ob-

server is moving toward the source of light and we find a blue shift ω′ = ω√

(c+ v)/(c− v) >ω. Note that there is a relativistic Doppler effect when θ = π/2 when ω′ = γω > ω.This transverse Doppler shift is a blue shift. In contrast, at θ′ = π/2, cos θ = v/c, andω′ = ω

1− v2/c2, a red shift!

Covariant Tensors In special relativity, there is another type of 4-vector called a covari-ant 4-vector, which transforms as the 4-gradient operator ∂µ ≡ ∂/∂xµ. To see that thetransformation is different we use the chain rule to write

∂µ =∂

∂xµ=∂x′ν

∂xµ∂

∂x′ν= Λν

µ∂′ν , ∂′ν = ∂µ(Λ

−1)µν (437)

More generally any covariant vector vµ transforms as v′µ = vν(Λ−1)νµ. We see two differences

with the contravariant transformation law: first Λ−1 occurs and secondly the first rather

91 c©2010 by Charles Thorn

Page 8: 8 Lorentz Invariance and Special Relativity

than the second index is summed, i.e. the transpose occurs. Note that because of thisdifference the scalar vµw

µ is invariant under all linear transformations, not just Lorentztransformations. The condition for ordinary rotations R−1 = RT obliterates the distinctionbetween the two kinds of vectors. But for Lorentz transformations, we have instead thecondition

Λ−1 = η−1ΛTη

(Λ−1)µν = ηµσΛρσηρν = ηνρΛ

ρση

σµ (438)

where ησµ has identical components to ησµ but we raise the indices because we are visualizingit as η−1. Because of the −1 entries in η, covariant and contravariant 4-vectors have slightlydifferent transformation laws. We also see from this exercise that a contravariant vector canbe converted to a covariant one using the matrix η: vµ ≡ ηµνv

ν . Or the other way aroundvµ = ηµνvν . We see that v1,2,3 = v1,2,3 but v0 = −v0. Although the distinction is very slightwe shall maintain it throughout. The following forms for the scalar product of two fourvectors summarize the distinctions we are making:

v · w = ηµνvµwν = vµwµ = vνw

ν = ηµνvµwν (439)

From now on the summation convention for repeated indices will always involve the contrac-tion of an upper (contravariant) index with a lower (covariant) index.

Once we have understood contravariant and covariant 4-vectors, multi-index tensors (e.g.T µν

λ are an easy generalization. Each contravariant (upper) index transforms as a contravari-ant vector and each covariant (lower) index like a covariant vector. A tensor field also hasits space-time argument transformed:

T ′µνλ (x

′) = ΛµρΛ

νσ(Λ

−1)κλTρσκ(Λ

−1x′) (440)

8.5 Lorentz covariance of Maxwell’s equations

Because tensors transform homogeneously under Lorentz transformations, a set of dynamicalequations that set a tensor to zero will automatically retain its form in all inertial frames.Thus we seek to cast Maxwell’s equations in this way. The first step is to identify the electricand magnetic fields as the components of some tensor. We start by expressing them in termsof potentials

Ek = −∇kφ− ∂Ak

∂t= ∂k(−φ)− c∂0Ak ≡ c∂kA0 − c∂0Ak (441)

ǫijkBk = ǫijkǫ

kmn∇mAn = ∂iAj − ∂jAi (442)

We already know that ∂µ transforms as a covariant 4-vector. So the above forms stronglysuggest that we should interpret Aµ = (−φ/c, Ak) also as a covariant four vector and Fµν =c(∂µAν − ∂νAµ) as a second rank covariant 4-tensor. Since it is antisymmetric, there are

92 c©2010 by Charles Thorn

Page 9: 8 Lorentz Invariance and Special Relativity

precisely 6 independent components, corresponding to the 6 components of E and B. Wemay write out the components of Fµν as a 4× 4 matrix array as follows:

Fµν =

0 −E1 −E2 −E3

E1 0 cB3 −cB2

E2 −cB3 0 cB1

E3 cB2 −cB1 0

(443)

Let us work out what the tensor transformation law F ′µν = ΛµρΛ

νσF

ρσ implies for theelectric and magnetic fields. Consider boosts in the x direction, so Λ0

0 = Λ11 = γ, Λ0

1 =Λ1

0 = −γv/c, and Λ22 = Λ3

3 = 1. Then

E ′1 = F ′10 = F ′01 = Λ0

0Λ11F

01 + Λ10Λ

01F

10 = F 01 = E1 (444)

cB′1 = F ′23 = F ′23 = Λ2

2Λ33F

23 = F 23 = cB1 (445)

E ′2 = F ′02 = Λ22(Λ

00F

02 + Λ01F

12)

= γ(

E2 − vB3)

= γ(

E2 + (v ×B)2)

(446)

cB′2 = F ′31 = Λ33(Λ

11F

31 + Λ10F

30)

= γ(

cB2 +v

cE3)

= γ(

cB2 − (v

c×E)2

)

(447)

E ′3 = γ(

E3 + (v ×B)3)

, cB′3 = γ(

cB3 − (v

c×E)3

)

(448)

We can infer from these formulas the boosts in a general direction v, by defining ‖ and ⊥ asthe components of a vector parallel to and perpendicular to v respectively. Then

E′‖ = E‖, B′

‖ = B‖ (449)

E′⊥ = γ(E⊥ + v ×B), B′

⊥ = γ(

B⊥ − v

c2×E

)

(450)

Since we have used potentials, the homogeneous Maxwell equations are automatically satis-fied. The remaining ones are

ρ = ǫ0∇ ·E = ǫ0∂kFk0 = −ǫ0∂kF k0 = ǫ0∂kF0k (451)

J i =1

µ0

ǫikm∂kBm − cǫ0∂0E

i =1

cµ0

∂kFik − cǫ0∂0Fi0

= cǫ0∂kFik + cǫ0∂0F

i0 (452)

Thus we see that if we interpret Jµ = (cρ, Jk) as a contravariant 4-vector, Maxwell’s equa-tions assume the form5

ǫ0∂νFµν =

1

cJµ ≡ ρµ, (453)

5In SI units electric and magnetic fields have different units. Since we have defined Fµν to have theunits of the electric field, ǫ0∂νF

µν has dimensions of charge density as does Jµ/c ≡ ρµ. Perhaps we shouldhave defined a field strength Bµν = ∂µAν − ∂νAµ = Fµν/c with dimensions of the magnetic field. ThenMaxwell’s equations would assume the form ǫ0c

2∂νBµν = ∂νB

µν/µ0 = Jµ, which resembles the magneticAmpere-Maxwell equation.

93 c©2010 by Charles Thorn

Page 10: 8 Lorentz Invariance and Special Relativity

which resembles the Gauss-Maxwell equation. This puts the equations in manifestly Lorentzcovariant form, so they take this form in all inertial frames connected by Lorentz transfor-mations. Note that charge conservation 0 = ∂µJ

µ = ∇ · J + ∂ρ∂t

is a direct consequence ofthe antisymmetry of F µν = −F νµ. Its validity in all inertial frames follows from the 4-vectorinterpretation of Jµ.

Although we have used potentials to identify the fields as components of Fµν , we candispense with them by restoring the homogeneous Maxwell equations. We cast them incovariant form as follows:

0 = ∇ ·B =1

c(∂1F23 + ∂2F31 + ∂3F12)

0 = (∇×E)i +∂Bi

∂t= ǫijk∂jFk0 +

1

2∂0ǫ

ijkFjk

=1

2ǫijk(∂jFk0 + ∂kF0j + ∂0Fjk) (454)

These 4 equations are the various components of the covariant condition

∂µFνρ + ∂ρFµν + ∂νFρµ = 0 (455)

One can check directly that the left side is antisymmetric under the interchange of anypair of indices, which means that no information is lost if one multiplies by the completelyantisymmetric 4-tensor ǫκµνρ and sums over repeated indices:

0 = ǫκµνρ∂µFνρ = ∂µ(ǫκµνρFνρ) ≡ 2∂µF

∗κµ (456)

where F ∗κµ ≡ ǫκµνρFνρ/2 is known as the dual of Fµν . This dual transformation interchangesthe roles of electric and magnetic fields.

Incidentally, from the definition of determinants ΛκρΛ

λσΛ

µτΛ

νωǫ

ρστω = ǫκλµν det Λ, we seethat ǫ is invariant under Lorentz transformations (actually under all linear transformations!)with unit determinant. The defining condition of Lorentz transformations shows that trans-forming ηµν as a covariant tensor leaves it unchanged. Thus we can freely use both ηµν andǫκλµν in the process of constructing new tensors (or pseudotensors) from old ones. Someexamples:

ηµρηνσFµνFρσ = FµνFµν = 2(c2B2 −E2) (457)

1

2ǫµνρσFµνFρσ = FµνF

∗µν = −4cE ·B (458)

are two important scalar fields. The first times −ǫ0/4 is just the Lagrangian density for theelectromagnetic field.

S =ǫ02

dtd3x(E2 − c2B2) =1

2µ0

dtd3x(E2/c2 −B2)

= −ǫ04

dtd3xηµρηνσFµνFρσ = −ǫ04

dtd3xFµνFµν

= − ǫ04c

d4xFµνFµν = − 1

4cµ0

d4xBµνBµν (459)

94 c©2010 by Charles Thorn

Page 11: 8 Lorentz Invariance and Special Relativity

where we understand d4x = d(ct)d3x. These are the only two invariants quadratic in thefields.

Finally we need to confirm that the equations of motion for a charged particle in electricand magnetic fields are the same in all inertial frames. Before we worry about a particlemoving in fields we should say a word about the motion of free particles. Then Newton’sequation just says that the three momentum p of a free particle is a constant. Then theenergy E =

p2c2 +m2c4 is also a constant and we have the Lorentz covariant statementthat pµ is constant. More generally, there will be conservation of the total energy andmomentum of a system made up of several interacting particles. For example a scatteringprocess in which two particles of momenta pµ1 , p

µ2 collide and emerge with altered momenta

p′µ1 , p′µ2 , will respect the conservation laws pµ1 + pµ2 = p′µ1 + p′µ2 in all inertial systems. In this

context one can define Lorentz invariant scalar products like (p1 + p2) · (p1 + p2) which havethe same values in all frames. Two common frames are the Lab frame where p2 = 0 and thecenter of mass frame where p1 + p2 = 0. By evaluating the invariant in these two frames welearn that

(p1 + p2)2 = − 1

c2E2

CM = p21 + p22 + 2p1 · p2 = −m21c

2 −m22c

2 − 2m2E1,Lab

ECM = c√

m21c

2 +m22c

2 + 2m2E1,Lab ∼ c√

2m2E1,Lab (460)

at high energies. To make heavy particles we need high ECM which means much higherE1,Lab because of the square root. This is why colliding beams are much more effective thansingle beams colliding with stationary targets in producing novel heavy particles.

Turning to the motion of a particle in an electromagnetic field, we seek to cast the Newtonequation

dp

dt= q(E + v ×B), p = γmv (461)

as a 4-tensor equation. We have already seen that pµ = (E/c,p) is a contravariant 4-vector.It is convenient to divide by the mass to form the “velocity” 4-vector Uµ = (γc, γv). Thenit is natural to consider the quantity

FµνUν = γ(cFµ0 + Fµkv

k) (462)

FjνUν = γ(cEj + cǫjklv

kBl) = cγ(E + v ×B)j (463)

F0νUν = γF0kv

k = −γv ·E (464)

Then Newton’s equation can be written

dpj

dt=

q

γcFjνU

ν (465)

which are the spatial components of the 4-vector equation

γdpµdt

=q

cFµνU

ν (466)

95 c©2010 by Charles Thorn

Page 12: 8 Lorentz Invariance and Special Relativity

But this equation has a time component which seems independent of the Newton equation:

dp0dt

= −1

c

dE

dt= −q

cv ·E, d

dt

p2c2 +m2c4 = qv ·E (467)

which, since qv · E is just the rate at which work is done on the particle by the electricfield, we recognize as the relativistic work energy theorem. It is an easy consequence of theNewton equation:

d

dt

p2c2 +m2c4 =pc2

γmc2· q(E + v ×B) = qv ·E (468)

and so is not really an independent equation. Finally, we note that dt/γ = dt√

1− v2/c2 ≡dτ is just the time interval dt as measured by a clock in the instantaneous particle rest frame.The measure of time given by τ is therefore a Lorentz invariant notion and τ is accordinglyknown as the particle’s proper time. With this concept we can write Newton’s equation inmanifestly Lorentz covariant form:

dpµdτ

= mdUµ

dτ=q

cFµνU

ν (469)

Here Lorentz covariance implicitly requires that the charge q has the same value for allinertial observers. Since it is a property of the particle we could simply assume it. However,the statement should be consistent with the 4-vector transformation properties of Jµ. Forthis we need the local statement of charge conservation: ∂µJ

µ = 0, from which we can write

0 =

d4x∂µJµ =

d3SnµJµ (470)

where on the right we have used Gauss’ theorem to write the integral of a divergence as aboundary integral. Now we assume all components of Jµ vanish rapidly enough at spatial

infinity, so that any part of the boundary at spatial infinity can be dropped. We may freelychoose the spatially finite parts of the closed contour. Consider the region bounded by twohypersurfaces, one at t = t1 and the other at t′ = t′2, where t, t

′ are times in two differentinertial frames. Then the relation becomes

0 =

d3xJ0(x, t1)−∫

d3x′J ′0(x′, t′2) = Q(t1)−Q′(t′2) (471)

This equation can be read in two ways. First, if the primed observer is in fact the same asthe unprimed observer, it reads Q(t2) = Q(t1), i.e. charge is conserved in all inertial frames.Then if we take the case that the primed observer is boosted, we learn that Q′ = Q, i.e.charge is a scalar and has the same value at all times in all inertial frames.

This completes the demonstration that Maxwell’s electrodynamics takes the same formin all inertial frames connected by Lorentz transformations. This statement is conciselysummarized by:

ǫ0∂νFµν =

1

cJµ, ǫµνρσ∂νFρσ = 0,

dpµdτ

=q

cFµνU

ν =q

mcFµνp

ν (472)

96 c©2010 by Charles Thorn

Page 13: 8 Lorentz Invariance and Special Relativity

We should emphasize, that apart from justifying the choice of any convenient inertial framefor the solution of a given problem, the covariant formalism does not really offer new tech-niques for solving problems. For that one simply settles on a convenient frame and solvesthe equations as we did last semester.

8.6 Action Principles

We have already mentioned the action for electromagnetic fields coupled to sources whichcan be put in covariant form as follows6:

Sfields =

dtd3x[ǫ02(E2 − c2B2)− ρφ+ J ·A

]

=1

c

d4x[

−ǫ04FµνF

µν + JµAµ

]

(473)

Maxwell’s equations follow from δS = 0 under variations of the vector potential δAµ, soδFµν = c(∂µδAν − ∂νδAµ):

δS =1

c

d4x[

−ǫ02F µνδFµν + JµδAµ

]

=1

c

d4x [−ǫ0F µν∂µδcAν + JµδAµ]

=1

c

d4x [−c∂µ(ǫ0F µνδAν) + (Jν + ǫ0∂µcFµν)δAν ] (474)

Taking δAµ = 0 at the space-time boundaries, but otherwise arbitrary, then δS = 0 impliesMaxwell’s equations, ǫ0∂νF

µν = Jµ/c.The relativistic version of the action for a particle moving in fixed fields can also be put

in covariant form:

Spart = −mc2∫

dt

1− v2

c2− q

dt(φ(r(t), t)− v ·A(r(t)), t)

= −mc2∫

dτ + q

dτUµAµ (475)

The last form shows that the action is a scalar, assuming the same form in all inertialreference frames. This is another way of understanding why Newton’s equation is covariant.

The complete Maxwell electrodynamics follows from the combined action

S = − ǫ04c

d4xFµνFµν −mc2

dτ + q

dτUµAµ (476)

6One could also add a term θ∫

d4xǫµνρσFµνFρσ to the action. This term is invariant under proper Lorentztransformations but it is odd under parity and also odd under time reversal. If we want to keep parity ortime reversal symmetry, it would be forbidden. But even if it is allowed, it has no effect on the classicalequations of motion because one can write ǫµνρσFµνFρσ = 2∂µ(ǫ

µνρσAνFρσ) showing that its variation willnot contribute to δS under the conditions of Hamilton’s principle.

97 c©2010 by Charles Thorn

Page 14: 8 Lorentz Invariance and Special Relativity

which is also a Lorentz scalar. It therefore implies covariant equations of motion, which ofcourse we have already shown directly. The current due to the particle is, by equating thelast term to (1/c)

d4xJµAµ, yielding Jµ = qc

dτUµδ4(x− x(τ)). Explicitly, we have

ρc = J0 = qc2∫

dτγδ(ct− x0(τ))δ(x− x(τ)) = qc2γ

dx0/dτδ(x− x(τ(t)))

ρ = qδ(x− x(τ(t))) (477)

J = qc

dτγdx

dtδ(ct− x0(τ))δ(x− x(τ)) = q

dx

dtδ(x− x(τ)) (478)

where τ(t) is implicitly defined by ct = x0(τ(t)). These are the familiar equations we usedlast semester.

8.7 Some particle motions in electromagnetic fields

The motion of a charged particle in uniform fields Fµν(x) =constant is an important case.The covariant form of the equations of motion

dUµ

dτ=

q

mcF µ

νUν , Uµ =

dxµ

dτ, UµU

µ = −c2 (479)

reduces to a set of 4 coupled first order differential equations for Uµ(τ) with constant coeffi-cients. If we employ matrix notation, the solution can be expressed as U(τ) = eqFτ/mcU(0).This formal solution can be made more concrete by expressing it as a linear combination of4 “normal mode” solutions in each of which all components of U have a common exponen-tial τ dependence Uµ = V µ

r eqrτ/mc. Then one needs to solve a matrix eigenvalue problem

FVr = rVr, with the matrix elements of F given by F µν . This is a straightforward problem

in linear algebra. One determines the possible eigenvalues r from the characteristic equationdet(F − rI) = 0. The Lorentz transformation rules

F ′µν = Λµ

ρFρσ(Λ

−1)σν = (ΛFΛ−1)µν (480)

Show that det(F ′ − rI) = det(F − rI) = 0, so the characteristic equation may be evaluatedin any Lorentz frame, for example, the one in which the electric and magnetic fields are bothparallel to the x-axis. In that frame

F =

0 E 0 0E 0 0 00 0 0 cB0 0 −cB 0

, det(F − rI) = (r2 − E2)(r2 +B2) (481)

so the four possible eigenvalues are r = ±E and r = ±icB. The general solution will thenbe Uµ =

r crVµr e

qrτ/mc. The constraint UµUµ = −c2 will be independent of τ because

ηµνVµr V

νr′ = 0 unless r + r′ = 0. A simple further integration yields

xµ(τ) = xµ(0) +∑

r

crmc

qrV µr (e

qrτ/mc − 1) ≡ Xµ +∑

r

crmc

qrV µr e

qrτ/mc (482)

98 c©2010 by Charles Thorn

Page 15: 8 Lorentz Invariance and Special Relativity

To represent the trajectory in space as a function of the frame’s time one would have toinvert ct = x0(τ) to get τ(t) and then obtain x(τ(t)). In the general case this last step iscomplicated.

A general feature of the motion can be brought out by noticing that

(x(τ)−X)2 =∑

r,s

crcsm2c2

q2rsVr · Vs eq(r+s)τ/mc

(x(τ)−X)2 = −∑

r

crc−rm2c2

q2r2Vr · V−r (483)

which is to say that xµ(τ) lies on a hyperboloid. We next find the motion in a few specialsituations.

Uniform electric field The equation of motion can be immediately integrated to yieldp(t) = p0 + qE(t− t0), where without loss of generality we can take p0 ·E = 0 and t0 = 0.Then since v = dr/dt = pc/

p2 +m2c2, we have

r(t) = r0 +

∫ t

0

dt′p0c+ qcEt′

q2E2t′2 + p20 +m2c2

∼ ctE +p0c

q|E| ln t+O(1) (484)

which shows that at large times the particle’s speed approaches that of light.

Boosted uniform electric field A boost parallel to E leaves the field invariant and simplymaps the solution on to the analogous solution in the primed frame. To see this, takeE = Exand p0 = p0y. Then taking a boost in the x-direction the transformed momentum is

p′x = γ

(

qEt− v

c

q2E2t2 + p20 +m2c2

)

, p′y = p0, p′z = 0 (485)

The time in the new frame t′ = γ(t− vx(t)/c2) can be obtained by computing

x(t) = x0 +c

qE

(

q2E2t2 + p20 +m2c2 −

p20 +m2c2

)

(486)

t′ = γ

(

t− vx0c2

− v

qcE(

q2E2t2 + p20 +m2c2 −

p20 +m2c2)

)

(487)

p′ = p0 + qEt′ + const (488)

A boost perpendicular to E generates a magnetic field in the primed frame, E′⊥ = γE⊥

and B′⊥ = −v ×E′

⊥/c2. This is a new physical situation: we can express v = c2B′/E ′ and

γ = 1/√

1− c2B′2/E ′2. Consider the above solution in the special case p0 = 0, with E = Eyand putting x0 = z0 = 0,

x(t) = z(t) = 0

y(t) = y0 +

∫ t

0

dt′qcEt′

q2E2t′2 +m2c2= Y + cy

t2 +m2c2/q2E2 (489)

99 c©2010 by Charles Thorn

Page 16: 8 Lorentz Invariance and Special Relativity

in a frame moving with velocity vx in the x-direction, so E′ = γEy and B′ = −γvEz/c2 =−vE ′z/c2.

t′ = γ(t− vx(t)/c2) = γt, x′ = γ(x(t)− vt) = −vt′ = −c2B′

E ′ t′, z′ = 0

y′(t′) = y(t) = y(t′/γ) = Y + c

t′2

γ2+m2c2γ2

q2E ′2

= Y + c

t′2(

1− c2B′2

E ′2

)

+m2c2

q2(E ′2 − c2B′2)(490)

This gives us the solution for motion in perpendicular electric and magnetic fields. There isaccelerated motion in the direction of E and a uniform drift at speed c2B′/E ′ in the directionof E′ ×B′.

Uniform magnetic field

dp

dt= qv ×B (491)

Dotting both sides with p shows that p2 and hence v2 and γ are constants. Thus we canwrite the equation of motion as dv/dt = v × ω, where ω = qB/mγ. Thus v‖ is constantand v⊥ rotates in the transverse plane with angular frequency |ω|. Explicitly, with B = Bz,vz = constant, vx = v⊥ cosωt, vy = v⊥ sinωt, and

z = vzt, x = x0 +v⊥ω

sinωt, y = y0 −v⊥ω

cosωt (492)

Thus the particle moves in a circle of radius RB = v⊥/ω = mv⊥γ/qB = p⊥/qB.

Boosted uniform magnetic field Again a boost parallel to the field just scales the fieldwithout altering the physical situation. So we just examine a boost perpendicular to thefield, which we take in the z-direction B = Bz. So our canonical boost in the x-directionv = vx does what we want. In the primed frame B′ = γBz And an electric field E′ = E ′y =γv ×B = v ×B′ = −vB′y appears. So we have perpendicular electric and magnetic fieldswith |E ′| < c|B′|. To find the motion in such fields all we have to do is boost the solutionwe found with E = 0.

For simplicity let’s just consider the circular motion in the xy-plane:

x(t) =v0ωB

cosωBt, y(t) = − v0ωB

sinωBt, z(t) = 0 (493)

The boost is then

x′ = γ

(

v0ωB

cosωBt− vt

)

, t′ = γ

(

t− vv0c2ωB

cosωBt

)

, y′ = − v0ωB

sinωBt, z′ = 0

We see from the x′ equation that there is an average drift in the negative x′-direction. Toexpress the boosted solution in primed frame variables, we note that v = −E′ × B′/B′2

100 c©2010 by Charles Thorn

Page 17: 8 Lorentz Invariance and Special Relativity

so the average drift −v of the gyrating particle is in the direction of E′ × B′. Also ωB =qB/mγ0 = qB′/mγ0γ, where γ0 = 1/

1− v20/c2. Ideally we would eliminate t in favor of t′,

but we can only do that implicitly. We can do a partial elimination however

x′ = −vt′ + γ(1− v2/c2)v0ωB

cosωBt = −vt′ + v0ωBγ

cosωBt (494)

A simple statement can be made at the times when cosωBt = 0, i.e. the times when x′ =−vt′, and y′ = ±v0/ωB ≡ ±Y0, its extreme values. These are at t′ = γt = γ(n+ 1/2)π/ωB.

The special case v0 = 0 is interesting. The particle is at rest in the unprimed frame, whereE = 0. In the primed frame it moves uniformly at velocity −v. This is consistent with theequations of motion since E′ = v×B′ so that F = q(E′ − v×B′) = 0. The magnetic forcecancels the electric force. This means that perpendicular electric and magnetic fields can beused as a velocity filter: only particles with velocity V = E′ × B′/B′2 will pass throughundeflected.

Drift in Inhomogeneous Magnetic Fields. It is not generally possible to find analyticsolutions for nonuniform magnetic fields. But if the fields vary slowly one may use thesolution in uniform fields as the starting point for an approximate solution. First considerthe case where the magnetic field is in the z direction, but is consistently stronger in onetransverse direction, say in the x direction: B ≈ z(B0 + xB′). if we ignore the B′ term, asimple motion would follow a circle in the xy plane at constant speed. Including the xB′

term would tighten the circle for x > 0 but loosen the circle for x < 0. Thus there should bea systematic drift parallel to the y-axis. To estimate this drift velocity, we write v = v0+v1,where v0 is the solution for uniform field. Then

dv1

dt≈ q

m[xB′v0 × z + v1 × B0z] +O(v1B

′) (495)

We can identify the drift velocity as 〈v1〉, the time average over a cycle of the zeroth ordermotion. Averaging the above equation gives

[B′〈xv0〉+ 〈v1〉B0]× z] ≈ 0 (496)

Presuming the drift velocity is in the xy plane, we conclude

vdrift ≈ −B′

B0

〈xv0〉 (497)

The average involves the zeroth order solutions

x(t) = a cosωBt, y = −a sinωBt,

vx0 = −aωB sinωbt, vy0 = −aωB cosωBt (498)

with ωB = qB/mγ. We then see that

vdrift ≈B′

2B0

a2ωB y = −∇⊥B ×B

2B2a2ωB (499)

101 c©2010 by Charles Thorn

Page 18: 8 Lorentz Invariance and Special Relativity

where we wrote the last form in a way that doesn’t depend on the choice of coordinate axes.The sign of ωB matters: it is the same as the sign of the charge.

Adiabatic Invariants for slowly varying magnetic fields. I will just say a few wordsabout the use of Adiabatic invariants to explain some interesting features of the motion ifthe gyrating particle has a component of velocity in the direction of the B field.

The canonical formalism of classical mechanics draws attention to the action integrals

Ik =

pkdqk (500)

over canonical coordinates that are periodic as the particle follows its trajectory. If someparameter λ in the Hamiltonian varies sufficiently slowly with time, i.e. λ≪ λ/T , then evenafter a long time T during which the Hamiltonian makes a finite change, the values of theIk will remain unchanged. An example is the helical motion of a particle about a magneticfield line which is periodic in the coordinates perpendicular to the field. The correspondingaction integral is

I =

(γmv + qA) · dl = 2πγmωBa2 + q

dSn ·B (501)

where we used Stokes theorem. The normal n is in the direction such that the contour is in acounterclockwise sense looking down on it. The sign of the first integral is written assumingthe velocity is in the same sense. This requires that the B is antiparallel to n. Thus

I = 2πγmωBa2 − qπa2B = πa2qB (502)

Now consider a particle initially gyrating around a field line with a component of initialvelocity parallel to the field in a direction of a slowly increasing field. If the field is staticthe particle’s energy γmc2 will be constant. The adiabatic invariant I will also be constant,so a will decrease as 1/

√B. Meanwhile the transverse velocity v⊥ = aωB = aqB/γm will

increase as√B. Thus the parallel velocity will decrease as

v2‖ = v20 − v2⊥ = v20 − v20⊥B

B0

(503)

If B increases enough v‖ will drop to 0 and the particle will be reflected back from regionsof strong magnetic field.

8.8 Electrodynamics of a Scalar Field

Up until now we have developed electrodynamics by modeling the currents and charges withpoint particles. In such a description two very different types of physical entities, particlesand fields must interact with each other in a consistent way. While this hybrid approach toelectrodynamics can achieve some level of success, it engenders many nasty complications.A more coherent approach is to try describing all entities in terms of fields. This is possible

102 c©2010 by Charles Thorn

Page 19: 8 Lorentz Invariance and Special Relativity

because quantization endows fields with particle-like properties. Indeed the fields of thestandard model have the scope to describe all phenomena seen in experiment thus far. Thesefields include, in addition to electromagnetic fields, nonabelian gauge fields which mediatethe weak and strong interactions, fermionic quark and lepton fields, and finally scalar fieldswhich generate mass. We limit our field description of charged matter to scalar fields.

We start with a free scalar field, φ(x). The only linear field equation, second order inderivatives and consistent with Lorentz invariance, φ′(x′) = φ(Λ−1x′), is

(−∂2 + µ2)φ =

(

µ2 −∇2 +1

c2∂2

∂t2

)

φ = 0 (504)

An action which leads to this field equation is easy to write down:

Sφ =

dtd3x1

2(−∂µφ∂µφ− µ2φ2) (505)

δSφ =

dtd3x(−∂µδφ∂µφ− µ2φδφ) =

dtd3xδφ(∂2 − µ2)φ = 0 (506)

We have already considered such a field when we initially motivated Einstein’s modificationof particle mechanics. The key was to notice that a packet of plane waves which have the

frequency wave number relation ω2 = c2(k2 + µ2) have a group velocity vg = k/√

k2 + µ2,which with the Planck-Einstein connection p = ~k, E = ~ω, leads to the usual Einsteinmodification of momentum p = γmv and E =

m2c4 + p2c2. Now we explore the scalarfield in its own right.

Before turning to electrodynamics let’s review some basics of Hamilton’s principle forfields in a more general setting. In a field theory the action

dtL is a space time integralrather than just a time integral. We generally restrict the integrand, the Lagrangian density,to be a function of the fields ψi and their space-time derivatives

S =1

c

d4xL(ψi, ∂µψi) (507)

δS =

d4x

[

i

δψi∂L∂ψi

+∑

i

∂µδψi∂L

∂(∂µψi)

]

=

d4x∑

i

δψi

[

∂L∂ψi

− ∂µ∂L

∂(∂µψi)

]

where we have dropped a surface term. Thus in this general context the field equationsfollowing from δS = 0 are

∂µ∂L

∂(∂µψi)=∂L∂ψi

(508)

The analogue of a particle Hamiltonian in field theory is the conserved canonical energymomentum tensor which is constructed as follows

T µν = −∑

i

∂µψi∂L

∂(∂νψi)+ ηµνL (509)

103 c©2010 by Charles Thorn

Page 20: 8 Lorentz Invariance and Special Relativity

∂νTµν = −

i

[

∂µ∂νψi∂L

∂(∂νψi)+ ∂µψi∂ν

∂L∂(∂νψi)

]

+ ∂µL

= −∑

i

[

∂µ∂νψi∂L

∂(∂νψi)+ ∂µψi

∂L∂ψi

]

+ ∂µL = −∂µL+ ∂µL = 0 (510)

For the simple scalar field theory we introduced above we have

∂L∂(∂νφ)

= −∂νφ, ∂L∂φ

= −µ2φ (511)

T µν = ∂µφ∂νφ− ηµν1

2[∂ρφ∂

ρφ+ µ2φ2] (512)

T 00 =1

2[φ2 + (∇φ)2 + µ2φ2 (513)

In this case T µν is symmetric in its indices, but the construction doesn’t guarantee that.Another example is the free electromagnetic field, with L = −ǫ0FµνF

µν/4 = −ǫ0c2∂µAν(∂µAν−

∂νAµ)/2. Then

∂L∂(∂νAρ)

= −ǫ0cF νρ,∂L∂Aρ

= 0

T µν = ǫ0c∂µAρF

νρ + ηµνL = ǫ0FµρF

νρ + ηµνL+ ǫ0c∂ρAµF νρ

= ǫ0FµρF

νρ + ηµνL+ ǫ0c∂ρ(AµF νρ)− ǫ0cA

µ∂ρFνρ (514)

→ ǫ0FµρF

νρ + ηµνL+ ǫ0c∂ρ(AµF νρ) (515)

because of the equations of motion. The first two terms in the final form are symmetric underµ ↔ ν. The last term satisfies ∂νǫ0c∂ρ(A

µF νρ) = 0 by virtue of the antisymmetry of F . Itfollows that the first two terms are conserved by themselves. Furthermore the contributionof the last term to the total energy-momentum vanishes:

d3xǫ0c∂ρ(AµF 0ρ) =

d3xǫ0c∂k(AµF 0k) = ǫ0c

dSn · (AµE) → 0 (516)

for localized fields. Therefore the improved energy momentum tensor T µνS = ǫ0F

µρF

νρ+ηµνLis just as good as the canonical one: indeed it is better since it is both gauge invariant andsymmetric. It is also traceless (in 4 dimensions):

T µνS = ǫ0F

µρF

νρ + ηµνL (517)

T µSµ = ǫ0FµρF

µρ + 4L = 0 (518)

The reason that symmetric is better is that the simple intuitive construction of theangular momentum/boost tensor

Mµνρ = xµT νρS − xνT µρ

S (519)

104 c©2010 by Charles Thorn

Page 21: 8 Lorentz Invariance and Special Relativity

is automatically conserved ∂ρMµνρ = T νµS − T µν

S = 0. If the canonical one were used onewould need to discover an extra term to make it conserved. The densities of energy andmomentum computed from TS are the familiar ones

T 00 = ǫ0E2 − ǫ0

2(E2 − c2B2) =

ǫ02E2 +

ǫ0c2

2B2 =

ǫ02E2 +

1

2µ0

B2 (520)

T i0 = ǫ0FiρF

0ρ = ǫ0FikF

0k = ǫ0cǫiklBlEk = ǫ0c(E ×B)i (521)

= c(D ×B)i = cgi (522)

=1

c(E ×H)i =

1

cSi (523)

the factors of c just reflecting the fact that T i0 has dimensions of energy density, whereas ghas dimensions of momentum density and S has dimensions of energy flux.

The next question is how to set up a consistent interaction between a scalar field andthe electromagnetic fields. For starters we need to find a current Jν built out of the scalarfield which is conserved. ∂νJ

ν = 0. With a single real field one possibility would be ∂νφ, butthen current conservation would require µ = 0. To have µ > 0 we need to make φ complex.Then we can form the combination Jν = −iQ(φ∗∂νφ− φ∂νφ

∗). Now we check

∂νJν = −iQ(φ∗∂2φ− φ∂2φ∗) = −iQ(µ2 − µ2)|φ|2 = 0 (524)

by the field equation for φ. Consider the combination

−∂νφ∗∂νφ+ AνJν = −(∂νφ

∗ + iQAνφ∗)(∂νφ− iQAνφ) +Q2AνA

νφ∗φ (525)

We can make the first term on the right gauge invariant under Aν → Aν + ∂νΛ, providedφ→ eiQΛφ, because then

(∂ν − iQ(Aν + ∂νΛ))eiQΛφ = eiQΛ(∂ν − iQAν)φ (526)

The last term is not gauge invariant so to make a gauge invariant Lagrangian we should dropit and just keep the first term. Thus the complete Action should be

S =1

c

d4x(

−ǫ04FρνF

ρν − (∂ν + iQAν)φ∗(∂ν − iQAν)φ− µ2φ∗φ

)

≡ 1

c

d4xL(527)

Any function of φ∗φ can be added to the integrand without disturbing gauge invariance orLorentz invariance. Thus we could replace −µ2φ∗φ with −U(φ∗φ), so we write more generally

L = −ǫ04FρνF

ρν − (∂ν + iQAν)φ∗(∂ν − iQAν)φ− U(φ∗φ) (528)

= −ǫ04FρνF

ρν − (Dµφ)∗Dµφ− U(φ∗φ) (529)

where we introduced the shorthand Dµ ≡ ∂ν − iQAν for the gauge covariant derivative. Thedimension of q = Q~ is Coulombs. To see this, simply recall from quantum mechanics thatp ↔ (~/i)∇ and p and qA have the same dimensions.

105 c©2010 by Charles Thorn

Page 22: 8 Lorentz Invariance and Special Relativity

We now need to rederive the field equations. We need:

∂L∂(∂νAρ)

= −ǫ0cF νρ,∂L∂Aρ

= −iQ[φ∗(∂ν − iQAν)φ− φ(∂ν + iQAν)φ∗] ≡ Jρ

∂L∂(∂νφ∗)

= −(∂ν − iQAν)φ,∂L

∂(∂νφ)= −(∂ν + iQAν)φ∗ (530)

∂L∂φ∗ = −iQAρ(∂

ρ − iQAρ)φ− φU ′(φ∗φ),∂L∂φ

= iQAρ(∂ρ + iQAρ)φ∗ − φ∗U ′(φ∗φ)

Then the equations of motion are

−ǫ0c∂νF νρ = +ǫ0c∂νFρν = Jρ, Jρ = −iQ[φ∗∂ρφ− φ∂ρφ

∗]− 2Q2Aρφ∗φ

∂ν(∂ν − iQAν)φ = iQAρ(∂

ρ − iQAρ)φ+ φU ′(φ∗φ), (∂ − iQA)2φ = φU ′(φ∗φ) (531)

∂ν(∂ν + iQAν)φ∗ = −iQAρ(∂

ρ + iQAρ)φ∗ + φ∗U ′(φ∗φ), (∂ + iQA)2φ∗ = φ∗U ′(φ∗φ)

Notice that the last line is just the complex conjugate of the second line. Notice also the veryinteresting feature that the current Jρ has a linear term in Aρ. This arose from our consistenttreatment of gauge invariance. It is easy to check that this current is gauge invariant.

Finally for completeness we quote the improved symmetric energy momentum tensorwhose derivation is assigned in the homework:

T µνS = ǫ0F

µρF

νρ + (∂µ + iQAµ)φ∗(∂ν − iQAν)φ+ (∂ν + iQAν)φ∗(∂µ − iQAµ)φ

−ηµν[ǫ04FρσF

ρσ + (∂ρ + iQAρ)φ∗(∂ρ − iQAρ)φ+ U(φ∗φ)

]

(532)

In particular the energy density is given by

T 00S =

ǫ02(E2 + c2B2) + (∂0 + iQA0)φ

∗(∂0 − iQA0)φ+ (∇+ iQA)φ∗ · (∇− iQA)φ+ U(φ∗φ)

8.9 Lorentz Invariant Superconductivity: The Higgs Mechanism

We have introduced the scalar field into our discussion of electromagnetism with two pur-poses in mind. The first was to give you a glimpse into the only known systematic wayof consistently coupling charged matter to the electromagnetic field. However, to properlyunderstand charged particles in terms of the excitations of a field–scalar or other wise– re-quires quantum mechanics: the quantum field has particle-like properties, as first realized byEinstein in his explanation of the photoelectric effect in terms of photons. The particles thatmake up atoms and molecules and indeed all materials must be understood in this way. Onestarts with quantum field theory, and then imagines a classical limiting description in whichthe particle aspects of the quantum field theory instead of the classical field/wave aspectsdominate the approximation. Then one is back to studying the dynamics of charged pointparticles interacting with the electromagnetic fields.

We shall use the classical scalar field to understand two related phenomena. The first isthe Higgs mechanism, which provides a way to consistently give the photon a mass. Thenwe will use the insights gained to explain some aspects of superconductivity.

106 c©2010 by Charles Thorn

Page 23: 8 Lorentz Invariance and Special Relativity

The way a scalar field can generate a mass for the photon can be seen from the behaviorof the current for constant nonzero φ → φ0, Jρ → −2Q2Aρφ

∗0φ0. In this limit Maxwell’s

equations read

ǫ0c∂νFµν + 2Q2Aµφ∗

0φ0 = ǫ0c2∂µ∂νA

ν + (−ǫ0c2∂2 + 2Q2φ∗0φ0)A

µ = 0

∂νAν = 0,

(

−∂2 + 2Q2φ∗0φ0

ǫ0c2

)

Aµ = 0 (533)

which shows that the components of Aµ satisfy a massive wave equation with mass parameterµ2 = 2Q2φ∗

0φ0/ǫ0c2 = 2µ0Q

2|φ0|2. So the question is: under what circumstances such aconstant solution exists? Plugging a constant ansatz into the equation for φ at A = 0 showsthat one must have φ0U

′(|φ0|2) = 0. That is |φ0| > 0 must be an extremum of U . A simpleexample would be U = λ(|φ|2 − |φ0|2)2/4. If U ′ were a (positive) constant the only constantsolution would be φ = 0. It is true that φU ′ is also zero at φ = 0 in the example, but thevalue of U at φ = 0 is higher than at φ = φ0. Of course a finite difference in energy densitytranslates to a total energy difference proportional to the volume of the system, a very largeenergy difference indeed!

After finding a constant solution, we can then describe the excitations by writing φ =φ0 + h+ ia and then expanding the action in powers of Aν , h. One can conveniently choosea = 0 as a gauge condition. Then

U = λ(h2 + 2φ0h)2/4 = λφ2

0h2 +O(h3)

L = −ǫ04FρσF

ρσ −Q2φ20A

2 − (∂h)2 − λφ20h

2 +O(A2h, h3) (534)

So the theory describes a massive photon, with mass MA = ~µ/c or M2A = 2~2Q2φ2

0/ǫ0c4,

interacting with a massive scalar M2h = ~

2λφ20/c

2. The Higgs mechanism is important inelementary particle physics not to give a mass to the photon, for which there is no evidence,but rather to give mass to the gauge bosons that mediate the weak interactions, which areextremely short range. Indeed the effect of the mass term on the analogue of the Coulombpotential is dramatic. Instead of Poisson’s equation one has

(−∇2 + µ2)φ = δ(r), φ =e−µr

4πr(535)

The weak interaction vector bosons have a mass roughly 100 times the proton mass, whichcorresponds to µ ∼ 1/(10−16cm)! A particle associated with the scalar field h was discoveredin 2012. Finding the “Higgs” particle was one of the goals of the large hadron collider (LHC)still running at CERN in Geneva. From the quartic potential model for U the Higgs (mass)2

is M2H = ~

2λφ20/c

2 compared to the vector boson mass (Note that Q = e/~, where −e isthe electron charge.) M2

Z = 2e2φ20/ǫ0c

4. The ratio M2H/M

2Z = λǫ0~

2c2/2e2 = λ~c/8πα wasnot determined by the properties of any of the particles previously observed. But noticethat if MH > 10MZ, λ~c/8π > 100α = O(1), and the electroweak theory would cease tobe perturbative. Fortunately for theoretical physics, MH waas measured to be only 125GeV≈ 1.4MZ so λ~c/8π ≈ 2α ≪ 1.

107 c©2010 by Charles Thorn

Page 24: 8 Lorentz Invariance and Special Relativity

In condensed matter physics, the Higgs mechanism is in essence a model of supercon-ductivity, which is caused by the condensation, at low enough temperature, of pairs ofelectrons (Cooper pairs) which are quasi bound states with total spin S = 0 and total chargeq = Q~ = −2e. In this case the scalar field provides a phenomenological description of theCooper pair condensate. The energy density difference U(0) − U(φ2

0) = λφ40/4 models the

energy gain when the pairs condense, which is reflected in the scalar field acquiring a finitebackground value φ0 6= 0. In this application the scalar field is only present in the material.The induced current in the material, in the approximation that the scalar field is just itsuniform background value, is just Jρ ≈ −2Q2φ2

0Aρ. Its conservation in this approximationspecifies the Lorenz gauge condition ∂ρA

ρ = 0.We can relate this form for the current to the effective magnetic permeability of the

superconducting material. Recall that for a static situation, the magnetization M is relatedto the bound current by J = ∇×M . Thus for our model superconductor we can write

∇× (∇×M ) = −2Q2φ20∇×A

−∇2M +∇(∇ ·M ) = −2Q2φ20B (536)

Remembering that M = B/µ0 − H = (µ−10 − µ−1)B. This means that in Fourier space,

assuming for simplicity that ∇ ·M = 0, (µ−10 − µ−1) = −2Q2φ2

0/k2, or µ(k2) = µ0k

2/(k2 +2µ0Q

2φ20). In particular µ → 0 for k = 0, i.e. for a uniform applied field. This is why we

say that a superconductor is a perfect diamagnet. The fact that µ 6= 0 for finite k meansthat a spatially varying magnetic field would be allowed to exist to some extent within thesuperconductor. Notice that µ→ µ0 at very short wavelength (k → ∞).

To understand this application a little better, let’s consider the interface between a“superconductor” filling the half space z < 0, with a static uniform electric or magnetic fieldin otherwise empty space for z > 0. The static vector potential satisfies

(−∇2 +M2)Aµ = 0, z < 0; −∇2Aµ = 0, z > 0 (537)

At the interface z = 0 we require that Aµ and ∂Aµ/∂z be continuous. A solution with Aµ(z)a function of z only is easily found:

Aµ = aµeMz, z < 0; Aµ(z) = aµ(Mz + 1), z > 0 (538)

the Lorenz gauge condition requires a · z = 0. With this solution the uniform fields forz > 0 are E = −cMa0z and B = z × a. Thus they automatically fulfill the conditionsEt = 0,Bn = 0 expected of a perfect conducting diamagnet. However, with M finite,the fields penetrate into the superconductor, a distance M−1 called the London penetrationdepth. The mass of the scalar determines another length scale characterizing the “size” ofthe scalar particle. In conventional superconductivity this would be the size of the Cooperpair which is actually quite large compared to the atomic size. This second length scale iscalled the coherence length.

Meissner Effect

108 c©2010 by Charles Thorn

Page 25: 8 Lorentz Invariance and Special Relativity

The expulsion of magnetic field from the interior of a superconductor is the Meissnereffect. At a rough level postulating that µ = 0 in superconducting material accounts forthe effect. We have just seen that interpreting the effect in terms of an effective photonmass incorporates the more subtle effect of a finite penetration depth for the magnetic field.But so far we have not allowed the scalar field to vary spatially, though the field equationscertainly allow it. Superconductivity occurs when φ = φ0, but we can imagine a regionin which superconductivity is destroyed because φ ≈ 0 in that region. The energy cost ofdestroying superconductivity in a volume V is ∆E = [U(0)− U(|φ0|2)]V → λφ4

0V/4 for ourquartic model. This energy difference determines a critical magnetic field Hc via

1

2µ0H

2c ≡ U(0)− U(|φ0|2). (539)

For H > Hc in a region of superconductor, there is enough magnetic energy to drive thatregion of the superconductor normal. Of course, at the boundary of this region, the field φshould smoothly vary from 0 to φ0, so there will be some surface energy arising from the(∇φ)2 term in the energy. To go further we need to look to the field equations.

Magnetic Flux tubes

Let us consider static solutions (φ = A = 0) of the equations of motion, with E = 0(A0 = 0). In this case the energy density of the system is proportional to the Lagrangiandensity, so Hamilton’s principle for static solutions is equivalent to extremizing the totalenergy, the density of which reduces to

T 00 → ǫ02c2B2 + (∇+ iQA)φ∗ · (∇− iQA)φ+

λ

4(φ∗φ− φ2

0)2 (540)

where we have assumed the simple quartic model for U . A very interesting solution describesa magnetic flux vortex tube which penetrates the superconductor. This solution describes areal physical phenomenon in which magnetic fields penetrate a finite sized superconductorthrough narrow tubes in which the superconductor has been driven normal. To make atractable problem we assume that the superconductor fills all of space and that the vortextube is centered on the z-axis and extends from z = −∞ to z = +∞. All the fields will beassumed independent of z, and we will be interested in the energy per unit length. Since weare looking for a static z-independent solution it is enough to extremize the energy per unitlength for the axially symmetric situation you worked out in Problem Set 4:

T = 2π

∫ ∞

0

[

ǫ0c2 [(ρA)

′]2

2ρ+ ρf ′2 + ρ

(

m

ρ−QA

)2

f 2 + ρλ

4(f 2 − φ2

0)2

]

(541)

where A = A(ρ)φ and φ = f(ρ)eimϕ. As you showed, such a potential produces a magneticfield B = [ρA]′/ρ.

In Set 5 I have asked you to find an estimate for T based on the variational principle.Here we will discuss the necessary conditions on the fields at ρ → 0,∞. Since each of the

109 c©2010 by Charles Thorn

Page 26: 8 Lorentz Invariance and Special Relativity

four terms in the integrand is positive, there are 4 independent integrals that must be finite.At ρ → ∞ we must have f → φ0 and QA ∼ m/ρ. By Stokes theorem, the last behaviorshows that the magnetic flux ΦB =

dl ·A = 2πm/Q. This is why we call this solution amagnetic flux tube. In order for the first integral to converge at ρ = 0 you will show thatnecessarily A→ 0 as ρ→ 0. If this is so the convergence of the third term requires f(0) = 0.Thus on the axis of the flux tube the superconductor has been driven normal. If all theseconditions are met T will be finite. It then remains to find its minimum for each integervalue m. Clearly if m = 0 the minimum is for f = φ0, A = 0 for which T = 0. This is justthe ground state of the superconductor. For m 6= 0 the minimum is for a flux tube with totalflux 2πm/Q. Without going any further we know a lot about the solution. As ρ varies from0 to ∞ f rises from 0 to φ0. Meanwhile B(ρ) falls from some nonzero value to zero. Yourexercise is to flesh out these qualitative conclusions with a concrete variational approximatesolution.

110 c©2010 by Charles Thorn

Page 27: 8 Lorentz Invariance and Special Relativity

9 Propagation of Plane waves in Materials

9.1 Oscillator model for frequency dependence of a dielectric

The dielectric constant and magnetic permeability describe the response of a material toapplied fields. As such they depend on two space-time points: the location and time thefield is applied and the location and time the response is measured. For example, in fullgenerality we should write, in the linear response approximation,

Ek(x) =

d4yǫ−1kj (x, y)D

j(y), Bk(x) =

d4yµkj(x, y)Hj(y) (542)

We shall generally be interested in homogeneous and isotropic materials, in which caseǫ−1kj = δkjǫ and µkj = δkjµ will depend only on x − y, but the relationship will still, ingeneral, be non-local in spacetime

E(x) =

d4yǫ−1(x− y)D(y), B(x) =

d4yµ(x− y)H(y) (543)

Convolution integrals of this type can be diagonalized by simple Fourier transforms:

E(x) =

d4k

(2π)4E(k)eik·x, D(x) =

d4k

(2π)4D(k)eik·x

ǫ−1(x− y) =

d4k

(2π)4ǫ−1(k)eik·(x−y) (544)

and similarly for B,H , µ. Then we have

E(k) = ǫ−1(k)D(k), B(k) = µ(k)H(k) (545)

We shall not attempt a serious treatment of the theory of matter in this course. Insteadwe will use simplified models. In an insulator, all of the charged particles are locally boundwithin atoms and molecules. The main qualitative fact we need to account for is that if anelectron is pulled by the action of an electric field, say, the binding mechanism supplies arestoring force. For many purposes an adequate model for the restoring force is a dampedharmonic oscillator force. Then the equation of motion for the electron in an applied electricfield, assuming nonrelativistic motion, would be

r + γr + ω20r = − e

mE(r, t) (546)

We make the simplifying assumption that E ≈ E0e−iωt, so that the steady state solution

is r = (−e/m)(ω20 − ω2 − iγω)−1E. If there are N molecules per unit volume and each

molecule has Z electrons fj of which have binding frequency ωj and damping parameter γj, so∑

j fj = Z, then the dipole moment density will be P = (e2/m)N∑

j fj(ω2j −ω2−iγjω)−1E.

Then D = ǫ0E + P ≡ ǫE implies that

ǫ(ω) = ǫ0 +e2N

m

j

fjω2j − ω2 − iγjω

= ǫ0 +e2N

m

j

fj(ω2j − ω2 + iγjω)

(ω2j − ω2)2 + γ2jω

2(547)

111 c©2010 by Charles Thorn

Page 28: 8 Lorentz Invariance and Special Relativity

The essential structure of the frequency dependence given by this simple minded model isa feature of more serious quantum mechanical calculations, where the ωj, fj, γj are effectivequantum mechanical parameters. The resonance behavior given by the denominators is anubiquitous feature of metastable excited states in quantum mechanics.

The consequences of such frequency dependence for a plane wave eikz−iωt propagatingin the dielectric can be read off from the relation k = ω

√ǫµ → (ω/c)

ǫ(ω)/ǫ0. Since ǫ iscomplex, the wave number will also be complex k = β + iα/2:

β2 − α2

4=

ω2

c2Re

ǫ

ǫ0, αβ =

ω2

c2Im

ǫ

ǫ0(548)

Re ǫ = ǫ0 +e2N

m

j

fj(ω2j − ω2)

(ω2j − ω2)2 + γ2jω

2(549)

Im ǫ =e2N

m

j

fjγjω

(ω2j − ω2)2 + γ2jω

2(550)

Thus the imaginary part of ǫ, which is manifestly positive, implies an attenuation factore−αz/2 multiplying the phase of the plane wave. This represents a loss of energy from theplane wave to the medium. Since the model result for Im ǫ is proportional to the dampingcoefficients of the oscillator, this outcome makes good sense.

For ω < ωj for all j, Re ǫ is manifestly larger than ǫ0, corresponding to a n index ofrefraction larger than 1. Assuming none of the ωj vanish, limit ω → 0 reads

ǫ(0) = ǫ0 +e2N

m

j

fjω2j

(551)

which is characteristic of an insulator. However as ω increases past the resonant frequencies,more and more terms become negative, until eventually ǫ becomes less than ǫ0. Far abovethe largest ωj, ǫ ∼ ǫ0(1 − ω2

p/ω2) where the plasma frequency is given by ω2

p = Nee2/ǫ0m,

where Ne = NZ is just the total number density of electrons in the material. For such highfrequencies, the restoring forces have no time to act and all electrons are effectively free. Forinsulating materials, this characteristic form only applies for ω ≫ ωp, so that the index ofrefraction is only slightly less than 1.

9.2 Conductivity

The simple resonance model discussed above can be used as a model of conductivity if oneof the ωj, say ω0 = 0. Then separating their contribution from the rest gives

ǫ(ω) = ǫ0 +e2N

m

j

fjω2j − ω2 − iγjω

− e2N

f0ω + iγ0

≡ ǫb(ω)−e2Nf0

mω(ω + iγ0)

∼ ie2N free

e

mωγ0, as ω → 0 (552)

112 c©2010 by Charles Thorn

Page 29: 8 Lorentz Invariance and Special Relativity

where we have identified Nf0 ≡ N freee as the number density of free (i.e. unbound) electrons.

To interpret this behavior, notice that the displacement current

∂D/∂t→ −iωD = −iωǫ(ω)E = −iωǫb(ω)E +e2N free

e

m(γ0 − iω)E (553)

The last term can be interpreted in the zero frequency limit as Ohm’s law

e2N freee

m(γ0 − iω)E → σE, σ =

e2N freee

mγ0(554)

So this model reproduces Ohm’s law in the quasi-static limit. In the formula for the con-ductivity, 1/γ0 can be interpreted as the mean time τ that an electron travels freely betweencollisions with impurities or lattice perturbations. Then the average drift velocity in anelectric field would be vd = −eEτ/m, so J = −eN free

e vd = (e2N freee τ/m)E ≡ σE.

We can write the dielectric constant for a conductor in terms of the effective plasmafrequency ω2

p ≡ e2N freee /mǫ0,

ω2p =

e2N freee

mǫ0(555)

ǫ(ω) = ǫb(ω)− ǫ0ω2p

ω(ω + iγ0)(556)

Since ωp is typically many orders of magnitude larger than γ0, there is a range of frequenciesfor which the second term is real negative and large. In that range the dielectric constant isnegative and the wave number k = ω

ǫ/ǫ0/c is imaginary meaning that em waves cannotenter the conductor. In this frequency range light incident on such a metal is completelyreflected. However at some critical frequency ǫ(ωc) = 0 and above that value the metal startsto transmit radiation reducing the reflectivity drastically.

9.3 Plasmas and the Ionosphere

An ionized gas is called a plasma. For a dilute (tenuous) plasma the charged particles areessentially free with negligible damping. In our resonance model none of the ωj 6= 0 appearand the formula collapses to

ǫ(ω)

ǫ0= 1−

ω2p

ω(ω + iγ0)≈ 1−

ω2p

ω2(557)

where the approximate form neglects damping. This is of course the high frequency behaviorfor all materials, but for a tenuous plasma it is also valid for ω << ωp for which ǫ < 0. Then

kc = ω

ǫ

ǫ0=√

ω2 − ω2p → iωp, α = 2

ωp

c(558)

In typical laboratory plasmas α−1 is .002 cm to .2 cm.

113 c©2010 by Charles Thorn

Page 30: 8 Lorentz Invariance and Special Relativity

If we rewrite the relation between frequency and wave number in the form ω2 = ω2p+k

2c2,we see that the group velocity of a wave packet propagating in a plasma will be vg =∇kω(k) = c2k/ω(k). Traveling waves with frequency ω > ωp are supported by a plasma,and they travel at a speed less than light. Oscillations at the plasma frequency can existwhen k = 0, i.e. these disturbances are at rest in the plasma.

Plasma in a Magnetic field In the ionosphere the presence of the earth’s magnetic fieldalters the propagation of waves in an interesting way. To see how we return to the equationof motion for a particle in a background uniform magnetic field in addition to the em wave:

mr = −er ×B0 − eEe−iωt

−mω2r0 = ieωr0 ×B0 − eE (559)

To solve this equation first note that dotting both sides withB0 yields r0·B0 = eE ·B0/mω2.

Then taking the cross product of both sides with B0 gives;

−mω2r0 ×B0 = ieω(B0r0 ·B0 −B20r0)− eE ×B0

= ieω(B0eE ·B0/mω2 −B2

0r0)− eE ×B0

Plugging this back into the first equation then yields

−mω2

ieω(eE −mω2r0) = ieω(B0eE ·B0/mω

2 −B20r0)− eE ×B0

mω2

ieω(mω2r0) + ieω(B2

0r0) =mω2

ieωeE +

ie2ω

mω2B0(E ·B0)− eE ×B0

Introduce ωB = eB0/m and find

(ω2 − ω2B)r0 =

e

mE − e

mω2(ωBE · ωB)−

ie

mωE × ωB

P = −eNr0 = −ǫ0ω2p

ω2 − ω2B

(

E − iE × ωB

ω− ωB

ω

ωB

ω·E)

(560)

D = ǫ0E + P

= ǫ0

[

E −ω2p

ω2 − ω2B

(

E − iE × ωB

ω− ωB

ω

ωB

ω·E)

]

(561)

This is an example where the effective dielectric constant is a matrix, not just a number.We next use Maxwell’s equations to determine how a plane wave propagates through this

medium. Plugging Faraday’s law for a plane wave B = k ×E/ω into the Ampere-Maxwellequation yields

k × (k ×E) = −ω2µ0D

k2(E − kk ·E) =ω2

c2

[

E −ω2p

ω2 − ω2B

(

E − iE × ωB

ω− ωB

ω

ωB

ω·E)

]

(562)

114 c©2010 by Charles Thorn

Page 31: 8 Lorentz Invariance and Special Relativity

The right side of this equation dotted into k must be zero, which puts a constraint oncomponents of the electric field:

0 = k ·[

E −ω2p

ω2 − ω2B

(

E − iE × ωB

ω− ωB

ω

ωB

ω·E)

]

= E ·[

k −ω2p

ω2 − ω2B

(

k − iωB

ω× k − k · ωB

ω

ωB

ω

)]

(563)

(When B = 0 this is simply the constraint that E is perpendicular to the propagationdirection. But when B 6= 0 it causes complications in the solution for general k,ωB.

We consider two simpler cases. First consider propagation parallel to the magnetic field,k = ωB/ωB. Then it easily follows (assuming ω 6= ωp) that the constraint becomes k ·E = 0,and the equation reduces to

k2E =ω2

c2

[

E −ω2p

ω2 − ω2B

(

E − iωB

ωE × k

)

]

(564)

We want to find two (necessarily complex) solutions for E such that E × k is a multipleof E. Taking k in the z-direction, one notes that (x ± iy) × z = ±i(x ± iy) so that theeigenvectors are the circularly polarized states and the relation between k and ω is differentfor each:

k2∓ =ω2

c2

[

1−ω2p

ω2 − ω2B

(

1± ωB

ω

)

]

≡ ω2

c2ǫ∓ǫ0

(565)

E∓ =x± iy√

2, ǫ∓ = ǫ0

(

1−ω2p

ω(ω ∓ ωB)

)

(566)

Notice that the relative phase of the two polarization states will not be maintained in space.In particular, if E starts out at z = 0 linearly polarized, say in the x direction, then atz 6= 0, we have

E =E

2[(x+ iy)eiωn−z/c + (x− iy)eiωn+z/c] = Eeiω(n++n−)z/2c[x cos θ(z) + y sin θ(z)] (567)

where θ(z) = ωz(n+ − n−)/2c. Thus the direction of linear polarization rotates with z(Faraday rotation).

The second simplifying assumption is to take a general propagation direction but assumeωB ≪ ω. Then the effect of the magnetic field is small, and it is appropriate to perturbaround B = 0. With k = kz, the two zeroth order solutions are E0∓ = (x ± iy) withdegenerate k20 = (ω2 − ω2

p)/c2. Then the first order equation reads

(k2 − k20)E0∓ + k20(E1∓ − kk ·E1∓) = k20E1∓ +ω2

c2

[

−ω2p

ω2

(

−iE0∓ × ωB

ω

)

]

115 c©2010 by Charles Thorn

Page 32: 8 Lorentz Invariance and Special Relativity

Taking the scalar product of both sides with x∓ iy, the E1 terms drop out, and one finds

2(k2 − k20) =ω2p

c2(x∓ iy) ·

(

i(x± iy)× ωB

ω

)

= iω2p

c2[(x∓ iy)× (x± iy)] · ωB

ω

k2 = k20 ∓ω2p

c2z · ωB

ω=ω2

c2

(

1−ω2p

ω2∓ω2p

ω2

z · ωB

ω

)

(568)

ǫ∓ = 1−ω2p

ω2

(

1± z · ωB

ω

)

(569)

which agrees with the first case when the magnetic field is in the z-direction.

9.4 Group Velocity

The frequency dependence of ǫ leads to a frequency dependent group velocity as we havealready discussed in the context of wave packets. In our previous discussion our packets weretreated as superpositions of a narrow band of wave numbers with the frequency a function ofk. Then the formula for the group velocity is vg = ∇kω(k). In our discussion of frequencydependence we have presented the wave number as a function of frequency kc = ωn(ω) wheren =

ǫµ/ǫ0µ0 is the index of refraction. Then we can express the group velocity in termsof frequency derivatives: c = vg(n+ ωdn/dω) or

vg =c

n+ ωdn/dω(570)

An interesting example is the ǫ− mode of a plasma with magnetic background field. We havein the low frequency limit

n− ≈ ωp√ωωB

, n− + ωdn−dω

≈ ωp

2√ωωB

, vg = c2√ωωB

ωp

(571)

Lower frequency components travel more slowly, so a broad frequency pulse will be receivedsmeared out in time with the later arrivals lower in frequency (whistlers).

In our resonance model of dispersion (i.e. frequency dependent index of refraction), ǫand hence n increase with frequency except for narrow windows surrounding the resonancelocations ωj. This is called normal dispersion and from the formula for the group velocitywe see that vg < c/n for normal dispersion. However, as one passes through a resonance,the index of refraction starts falling, and for narrow resonances there is a narrow rangewhere the decline is rapid, dn/dω is large and negative (anomalous dispersion). In a regionof anomalous dispersion vg can get large, even larger than the speed of light–it can evenbecome negative! There is no violation of causality though. The interpretation of vg as thespeed of a signal requires that the frequency width of the packet is much smaller than thescale over which the integrand varies. But in regions of anomalous dispersion the integrandis rapidly varying so the width has to be extremely tiny corresponding to a huge spread inthe duration of the packet in time and a corresponding fuzziness in the location of any wave

116 c©2010 by Charles Thorn

Page 33: 8 Lorentz Invariance and Special Relativity

front. To establish an actual signal velocity there must be a sharp wave front, which requiresa broad range of frequencies. The regions of anomalous dispersion contribute very little tosuch a broad packet.

The variation in the group velocity with frequency produces a spreading of the wavepacket. The current homework assignment includes a problem where this can be explicitlyseen using a Gaussian wave packet. But the phenomenon is inevitable because a band ofwave numbers ∆k has a spread of group velocities ∆vg = ∆kd2ω/dk2, and an additionalcontribution to the width in space of ∆ktd2ω/dk2. The initial width is at best 1/∆k, so thewidth after a time t is at least

∆x(t) =

1

∆k2+∆k2t2

(

d2ω

dk2

)2

(572)

This matches the behavior you found for a Gaussian wave packet. Notice that the narrowerthe band of wave numbers, the longer it takes for the spreading to take effect. Of course at thesame time the initial width of the packet is correspondingly large. In scattering experimentsthe only practical limit on the size of the wave-packets is the size of the apparatus.

9.5 Causality and Dispersion Relations

As we have mentioned the relation between D and E in the linear response approximationis a nonlocal one of the form

D(r, t) = ǫ0E(r, t) + ǫ0

d3r′dt′G(r − r′, t− t′)E(r′, t′) (573)

where we have further assumed translation invariance in space and time. Here G is theFourier transform of the wave number and frequency dependent dielectric constant

ǫ0G(r, t) =

d3qdω

(2π)4(ǫ(q, ω)− ǫ0)e

iq·r−iωt (574)

The physical interpretation is that a field applied at r′, t′ induces a response at r, t. In orderfor causality to be respected, the response cannot occur unless enough time has elapsed fora light signal to travel from r′ to r. That is

G(r − r′, t− t′) = 0, unless c(t− t′) > |r − r′| (575)

Considering the formula for ǫ

ǫ(q, ω)

ǫ0− 1 =

d3r

∫ ∞

|r|/cdtG(r, t)e−iq·r+iωt (576)

we see that the dependence on ω can be extended into the upper half complex plane becausethe iωt term in the exponent acquires a negative real part, exponentially improving the con-vergence of the t integral. Viewed as an analytic function of ω, there will be no singularities

117 c©2010 by Charles Thorn

Page 34: 8 Lorentz Invariance and Special Relativity

in the upper half plane. Singularities in the lower half plane are allowed by causality, becausethen iωt has a positive real part causing worse convergence of the t integration.

In the remaining discussion we ignore the q dependence. Assuming ǫ is independent ofq implies that the response is spatially local, i.e.

ǫ0G(r, t) =

(2π)(ǫ(ω)− ǫ0)e

−iωtδ(r) ≡ ǫ0G(t)δ(r) (577)

This will be valid if the distortion of the system due to the applied field is over a region smallcompared to the scale over which the field varies appreciably. In the case of a wave packetthis scale is the typical wavelength.

Let us now recall our resonance model for the frequency dependence of ǫ. Each resonancecontribution had the form

fjω2j − ω2 − iγjω

(578)

which has pole locations

ω = −iγj2

±

ω2j −

γ2j4

(579)

and we see that these poles are in the lower half plane for positive γj. So in this modelcausality demands γj > 0. Of course in our model positive γj corresponds to a damped asopposed to a runaway oscillator. But it is nice to know that this sort of unphysical runawaybehavior is forbidden in models that simply respect causality.

Analyticity is a very powerful mathematical fact about functions. This is manifested byCauchy’s theorem,

f(z) =

C

dz′

2πi

f(z′)

z′ − z(580)

where C is a closed curve within a simply connected region of analyticity for f(z).Mathematical Detour; We can understand Cauchy’s theorem in terms of known featuresof two dimensional elctrostatics. Write z = x+ iy and f = f1 + if2. Then

dzf(z) = dxf1 − dyf2 + i(dxf2 + dyf1) ≡ dl · f + i(dl · f) (581)

where the two-vector f = (f1,−f2)T and f = (f2, f1)T . Then by Stokes theorem

dzf(z) =

dA(

[∇× f ]z + i[∇× f ]z

)

(582)

But if f(z) = f(x+ iy) is analytic it satisfies

i∂f

∂x=∂f

∂y

i∂f1∂x

− ∂f2∂x

=∂f1∂y

+ i∂f2∂y

(583)

118 c©2010 by Charles Thorn

Page 35: 8 Lorentz Invariance and Special Relativity

which are the Cauchy-Rieman equations:

∂f1∂x

=∂f2∂y

−∂f2∂x

=∂f1∂y

(584)

They easily imply that f1 and f2 each satisfy the 2D Laplace equation. They also imply

∇× f = ∂x(−f2)− ∂yf1 = 0

∇× f = ∂x(f1)− ∂yf2 = 0 (585)

which together give the Cauchy theorem∮

C

dzf(z) = 9 (586)

for any closed curve C within a simply connected domaain of analyticity. The Cauchyformula follows by first using the Cauchy theorem to deform any closed curve that enclosesthe point z′ = z to an infinitesimal circle Cδ of radius δ centered on the point z′ = z. Then

C

dz′

2πi

f(z′)

z′ − z=

dz′

2πi

f(z′)

z′ − z

→∫ 2π

0

idθδeiθf(z)

2πiδeiθ= f(z) (587)

Dispersion Relations for Dielectrics Because ǫ(ω) is analytic in the whole upper half complexω plane, we can pick C to follow the real axis and close on a large semicircle at ω = ∞. Thebehavior of ǫ(ω) at large ω depends on the behavior of G(t) as t→ 0.

ǫ(ω)

ǫ0− 1 =

∫ ∞

0

dtG(t)eiωt =1

∫ ∞

0

dtG(t)d

dteiωt =

iG(0)

ω− 1

∫ ∞

0

dtG′(t)eiωt

=iG(0)

ω− G′(0)

ω2+iG′′(0)

ω3+ · · · (588)

It is reasonable to assume a smooth turn-on of G at t = 0 in which case the first term isabsent and then ǫ→ ǫ0+O(ω

−2). at large ω. This is easily fast enough that the contributionfrom the semicircle at infinity to the Cauchy integral vanishes and we have

ǫ(ω)

ǫ0− 1 =

∫ ∞

−∞

dω′

2πi(ω′ − ω)

[

ǫ(ω′)

ǫ0− 1

]

, Im ω > 0 (589)

We want to take ω to the real axis, but we have to be careful about the resulting singularityin the integrand at ω′ = ω. If there is finite conductivity there is also a pole in ǫ(ω′) atω′ = 0, ǫ ∼ iσ/ω. This pole was not enclosed by the original Cauchy contour C, so when wetake the contour to the real axis, it must pass above the conductivity pole. We can handle

119 c©2010 by Charles Thorn

Page 36: 8 Lorentz Invariance and Special Relativity

these singularities by deforming the contour near ω′ = ω into a semicircle of radius δ about ωinto the lower half plane (lhp). Similarly we deform the contour near ω′ = 0 into a semicircleof radius δ about 0 into the upper half plane (uhp). Then the contour follows the real axisfrom −∞ to −δ, a semicircle in the uhp from −δ to +δ, the real axis from +δ to ω − δ, asemicircle in the lhp from ω − δ to ω + δ and then, finally, the real axis from ω + δ to +∞.The real axis contribution, in the limit δ → 0 is what is traditionally called the PrincipalPart. The semicircle contributions are just half the full residues of the respective poles. Theresidue of the pole at 0 contributes negatively because the contour in that case is clockwise.

ǫ(ω)

ǫ0− 1 = P

∫ ∞

−∞

dω′

2πi(ω′ − ω)

[

ǫ(ω′)

ǫ0− 1

]

+1

2

(

ǫ(ω)

ǫ0− 1

)

+iσ

2ǫ0ω(590)

ǫ(ω)

ǫ0= 1 +

ǫ0ω+ P

∫ ∞

−∞

dω′

πi(ω′ − ω)

[

ǫ(ω′)

ǫ0− 1

]

(591)

Taking the real and imaginary parts of both sides leads to

Reǫ(ω)

ǫ0= 1 + P

∫ ∞

−∞

dω′

π(ω′ − ω)Im

ǫ(ω′)

ǫ0(592)

Imǫ(ω)

ǫ0=

σ

ǫ0ω− P

∫ ∞

−∞

dω′

π(ω′ − ω)Re

[

ǫ(ω′)

ǫ0− 1

]

(593)

These dispersion relations are also known as Kramers-Kronig relations after the physicistswho first introduced them. They reflect the consequences of a few very basic and fundamentalphysical principles.

For real ω, it follows from the reality of G(t) that ǫ(ω)∗ = ǫ(−ω). Hence Re ǫ is even inω, whereas Im ǫ is odd. Thus one can write the dispersion integrals with range 0 < ω <∞.

Reǫ(ω)

ǫ0= 1 + P

∫ ∞

0

dω′

π

2ω′

ω′2 − ω2Im

ǫ(ω′)

ǫ0(594)

Imǫ(ω)

ǫ0=

σ

ǫ0ω− P

∫ ∞

0

dω′

π

ω′2 − ω2Re

[

ǫ(ω′)

ǫ0− 1

]

(595)

9.6 Causal Propagation

To set up a test of causal wave propagation, we imagine a wave with a sharp wave fronttimed to arrive, at normal incidence, at the surface of a dispersive material, located at x = 0,at time t = 0. For t < 0, the wave is entirely in the region x < 0 and can be written

Incident wave : ψ(x, t) =

dωA(ω)eiωx/c−iωt, x < 0. (596)

A(ω) =

dtψ(0, t)eiωt (597)

Because ψ(0, t) = 0 for t < 0, A(ω) can be taken to be analytic in the upper half ω plane.

120 c©2010 by Charles Thorn

Page 37: 8 Lorentz Invariance and Special Relativity

For x > 0 we have, from the normal incidence refraction formula At(ω) = 2A(ω)/(1 +n(ω)),

Refracted wave : ψ(x, t) =

dω2

1 + n(ω)A(ω)eiωn(ω)x/c−iωt, x > 0 (598)

The analyticity in the uhp of A, ǫ and n, means that the integrand for x > 0 is analytic in theuhp. If the integrand vanishes rapidly enough as ω → ∞, we can close the contour. To decidewhen we can do this we need the fact that n→ 1 as ω → ∞, so that eiωn(ω)x/c−iωt → eiω(x/c−t).Thus we can close the contour in the uhp when x > ct. Then analyticity in the uhp implies,via Cauchy’s theorem, that ψ(x, t) = 0 for t < x/c. That is, no signal can propagate fasterthan light.

121 c©2010 by Charles Thorn

Page 38: 8 Lorentz Invariance and Special Relativity

10 Waveguides and Cavities

10.1 The approximation of perfect conductors

In our study of waveguides and cavities, we shall make the simplifying assumption of perfectconductivity. That is time varying fields (ω 6= 0) are strictly zero in the body of the conductorand surface currents and charge densities are instantly adjusted to maintain zero fields whenfields just outside the conductor are non-zero. Then Maxwell’s equations imply for the fieldsjust outside the conductor the boundary conditions

n ·B = 0, E‖ = 0, n ·D = Σ, n×H‖ = K (599)

where Σ is the free surface charge density and K is the free surface current density.To appreciate when this assumption is valid we briefly recall the behavior of time-varying

fields near the boundary of conductors with large but finite σ in the quasistatic approxi-mation, where the displacement current is neglected. Then, with J = σE, the Maxwellequations read (assuming time dependence e−iωt in all fields)

∇×B = µσE, ∇×E = iωB, ∇ ·B = 0, ∇ ·E = 0 (600)

−∇2B = iµσωB ≡ 2i

δ2B = −

(

1− i

δ

)2

B (601)

Suppose we have a non-zero magnetic field parallel to the surface just outside the conductor.Then instead of falling abruptly to zero just inside the conductor, the field drops off expo-nentially B(ξ) = B(0)e−(1−i)ξ/δ, where ξ is a coordinate that measures the normal depthinto the conductor. The skin depth δ =

2/µσω, which measure how deeply the fieldspenetrate, goes to zero as σ → ∞. Here we are assuming that over a scale δ the fieldsare essentially constant in the directions parallel to the surface of the conductor. In thissituation the corresponding electric field and current density are

E =1

µσ∇×B =

1− i

µσδn×B(0)e−(1−i)ξ/δ

=

ω

2µσ(1− i)n×B(0)e−(1−i)ξ/δ (602)

J =

ωσ

2µ(1− i)n×B(0)e−(1−i)ξ/δ, K =

∫ ∞

0

dξJ = n× B(0)

µ(603)

where n is the outward normal to the conductor. Notice that at ξ = 0 the electric field istangential to the surface and small. (For a perfect conductor the tangential component ofE would be 0.) This surface tangential field is simply proportional to the effective surfacecurrent

E‖ =

ωµ

2σ(1− i)K =

1− i

σδK (604)

122 c©2010 by Charles Thorn

Page 39: 8 Lorentz Invariance and Special Relativity

The proportionality constant has the units of resistance and is called the surface impedanceZs = (1−i)/σδ. Similarly Faraday’s law shows that there would be a small normal componentto B at the surface.

The last expression for the effective surface current matches that for the boundary condi-tion at a perfect conductor. We see that the effect of finite conductivity is to spread out theabrupt transition for a perfect conductor to a smooth but rapid transition over a distanceof the skin depth. Finite conductivity is also responsible for Ohmic losses. These can beevaluated either via the time averaged Poynting vector at the surface

Power

Area= n · 〈S〉 =

1

2µn · Re E ×B∗

=1

ω

2µσn · Re (1− i)(n×B(0))×B∗(0) (605)

= −ωδ4µ

|B(0)|2 = −ωµδ4

|H(0)|2 = − 1

2σδ|H(0)|2 (606)

or as the integral∫

dξRe J · E∗/2. These losses go to 0 as σ → ∞. The advantage ofthe last expression is that H is continuous across the interface and so in the limit of highconductivity it may be taken as its value just outside the conductor.

10.2 Waveguides

A typical waveguide is a long hollow conductor within which electromagnetic waves canpropagate. The fields satisfy Maxwell’s equations in the nonconducting hollow region. Theeffect of the conducting wave guide is, in the limit of a perfect conductor, just the impositionof boundary conditions as discussed in the previous section. We first consider a long cylinderwith uniform cross section. The interior of the cylinder is filled with a material with uniformǫ, µ. We also restrict attention to a perfect conductor, so the boundary conditions on theinner cylindrical wall are E‖ = 0 and n · B = 0. Maxwell’s equations imply that eachcomponent of E,B satisfies the wave equation:

(

−∇2 + ǫµ∂2

∂t2

)

E = 0,

(

−∇2 + ǫµ∂2

∂t2

)

B = 0 (607)

but they also imply relations among those components. For harmonic time variation e−iωt,Faraday’s and the Ampere-Maxwell equations become

B =1

iω∇×E, E = − 1

iωµǫ∇×B (608)

These formula’s automatically impose the divergence equations ∇ · E = ∇ · B = 0. Withcylindrical geometry we choose the z-axis parallel to the interior walls of the cylinder. Thewalls of the cylinder projected onto the xy-plane will be a closed curve, which for the momentwe allow to be arbitrary in shape. Because of this generality we cannot find a plane wave

123 c©2010 by Charles Thorn

Page 40: 8 Lorentz Invariance and Special Relativity

solution in all three coordinates, but we can find solutions with plane wave z dependenceeikz−iωt. In effect we can find a 1D plane wave solution. So we assume the form f(x, y)eikz−iωt

where f is any of the field components. We write ∇ = ∇⊥+ z∂/∂z, and also E = Ez z+E⊥,and B = Bz z+B⊥. Then we write out the curl and take its scalar and vector product withz:

∇×E = ∇⊥ ×E⊥ +∇⊥Ez × z + ikz ×E⊥ (609)

z · ∇ ×E = z · ∇⊥ ×E⊥ (610)

z × (∇×E) = z × (∇⊥Ez × z) + ikz × (z ×E⊥) = ∇⊥Ez − ikE⊥ (611)

z · ∇ ×B = z · ∇⊥ ×B⊥, z × (∇×B) = ∇⊥Bz − ikB⊥ (612)

Employing these decompositions in the Maxwell equations then gives

Bz =1

iωz · ∇⊥ ×E⊥, Ez = − 1

iωµǫz · ∇⊥ ×B⊥ (613)

z ×B⊥ =1

iω(∇⊥Ez − ikE⊥), z ×E⊥ = − 1

iωµǫ(∇⊥Bz − ikB⊥) (614)

∇⊥ ·E⊥ + ikEz = 0, ∇⊥ ·B⊥ + ikBz = 0 (615)

The equations on the middle line can be rearranged by taking the cross product of say thefirst one with z as

−B⊥ +k

ωz ×E⊥ =

1

iωz ×∇⊥Ez

− k

ωµǫB⊥ + z ×E⊥ = − 1

iωµǫ∇⊥Bz (616)

B⊥

(

k2

ω2µǫ− 1

)

=1

iωz ×∇⊥Ez +

k

iω2µǫ∇⊥Bz (617)

z ×E⊥

(

k2

ω2µǫ− 1

)

=k

iω2µǫz ×∇⊥Ez +

1

iωµǫ∇⊥Bz (618)

which shows that, except in the case k = ±ω√µǫ, we can express E⊥,B⊥ in terms of Ez, Bz.TEM Modes That exceptional case implies that all components of E and B satisfy the2D Laplace equation. Since Ez is parallel to the waveguide walls, the perfect conductorboundary conditions require Ez = 0 on all boundaries so the uniqueness theorem of theLaplace equation implies Ez = 0 everywhere inside the guide. Then the above equationsimply

Ez = 0, B⊥ =k

ωz ×E⊥, ∇⊥Bz = 0, for k2 = ω2µǫ (619)

The third equation shows that Bz is independent of x, y. This constant is not directlyrestricted by the perfect conductor boundary condition n · B = 0 since Bz is a parallelcomponent of B. However 0 = ∇ ·B = ∇⊥ ·B⊥ + ikBz show by Gauss’ theorem that

dxdy Bz =i

k

dl n ·B⊥ = 0 (620)

124 c©2010 by Charles Thorn

Page 41: 8 Lorentz Invariance and Special Relativity

by the perfect conductor boundary condition on B⊥7. Then Bz, being a constant, must in

fact be 0. Thus the exceptional solution is purely transverse (Ez = Bz = 0) and so is calledthe TEM (Transverse Electric Magnetic) mode. In order for it to exist, the electric field mustsatisfy the 2D electrostatic equations ∇⊥ · E⊥ = 0, ∇⊥ × E⊥ = 0, which is equivalent to∇2

⊥φ = 0, with the boundary condition E‖ = 0 equivalent to φ =constant on the boundary.There is no such solution for a simply connected 2D region, so it can only be relevant to asituation at least as complicated as a coaxial waveguide.

A simple example waveguide that supports a TEM mode is a pair of concentric conduct-ing cylinders of radii a < b with the gap between them a nonconducting material. Thenthe 2D electrostatics problem is solved with a radial electric field E(ρ) = ρE0a/ρ. ThenB =

√µǫϕE0a/ρ, and the perfectly conducting boundary conditions E‖ = Bn = 0 are auto-

matically satisfied. This example is explored further in a homework problem, in which theeffects of finite conductivity are also considered.TE and TM modes: Returning to the general case k2 6= ω2µǫ we see that we can describethe general solution in terms of two distinct modes: Ez = 0 (TE modes) and Bz = 0 (TMmodes). TE mode solutions are

B⊥ =ik

ω2µǫ− k2∇⊥Bz =

ik

γ2∇⊥Bz (622)

z ×E⊥ =iω

ω2µǫ− k2∇⊥Bz =

γ2∇⊥Bz =

ω

kB⊥ =

ωµ

kH⊥ (623)

(−∇2⊥ + k2 − ω2µǫ)Bz = 0, n · ∇⊥Bz = 0 on boundaries (624)

In this case the Neumann boundary conditions on Bz are equivalent to the Bn = E‖ = 0, ascan be easily seen by dotting the first two lines with n.

Similarly the TM mode solutions are

B⊥ =iωµǫ

ω2µǫ− k2z ×∇⊥Ez =

iωµǫ

γ2z ×∇⊥Ez (625)

z ×E⊥ =ik

ω2µǫ− k2z ×∇⊥Ez =

ik

γ2z ×∇⊥Ez =

k

ωǫH⊥ (626)

(−∇2⊥ + k2 − ω2µǫ)Ez = 0, Ez = 0 on boundaries (627)

Here Dirichlet conditions are necessary since Ez is parallel to the conducting surface. Butthis is also sufficient since it implies n×∇⊥Ez = 0. Then dotting the first two equations withn shows that Bn = E‖ = 0 on the boundaries, as required by perfect conductor boundaryconditions. In both TM and TE cases we have to solve the Helmholtz equation, but withNeumann or Dirichlet conditions in TE and TM cases respectively.

7Alternatively we can write

dxdy Bz =1

dxdyz · (∇×E) =1

dl ·E = 0. (621)

125 c©2010 by Charles Thorn

Page 42: 8 Lorentz Invariance and Special Relativity

We can regard the Helmholtz equation as an eigenvalue problem for the transverse Lapla-cian

−∇2⊥ψa = γ2aψa, γ2 = ω2µǫ− k2 (628)

where γ2a are the possible eigenvalues. Since the eigenvalues are a discrete set of numbers,fixed for given geometry of waveguide and boundary conditions, we can interpret ωa(k) =√

k2 + γ2a/√µǫ as the dispersion formula for 1D waves along the waveguide. The group

velocity of these waves is then vg = k/(ωaµǫ).

10.3 Rectangular Waveguide

Let the cross section be a rectangle of dimensions a× b with a > b. The eigenfunctions are

TE : Bzmn = B0 cos

mπx

acos

nπy

b, γmn = π

m2

a2+n2

b2(629)

TM : Ezmn = E0 sin

mπx

asin

nπy

b, γmn = π

m2

a2+n2

b2(630)

The cutoff frequencies of TE and TM modes have a large overlap. The difference is that themodes with m or n equal to 0 are possible for TE but not TM. In particular the lowest cutofffrequency is the TE mode γ10/

√µǫ = π/(a

√µǫ). For this mode Ez = By = Ex = 0 and

Bz = B0 cosπx

a, Bx = − ik

γ2π

aB0 sin

πx

a, Ey = −ω

kBx =

γ2π

aB0 sin

πx

a(631)

10.4 Energy Flow and Attenuation

To calculate the energy and energy flow in a wave guide we use the formulas for the time-averaged energy density and Poynting vectors valid for harmonic time dependence e−iωt:

〈u〉 =1

2Re( ǫ

2E ·E∗ +

µ

2H ·H∗

)

, 〈S〉 = 1

2ReE ×H∗ (632)

We first evaluate these expressions assuming perfect conductivity. The evaluation is slightlydifferent for TE and TM modes.TE Modes For the former

〈STE〉 =1

2Re [E⊥ ×H∗

⊥ +E⊥ × zH∗z ]

=1

2µRe

[

k

ωE⊥ × (z ×E∗

⊥) +E⊥ × zB∗z

]

=1

2µRe

[

zk

ωE⊥ ·E∗

⊥ +iω

k2 − ω2µǫB∗

z∇⊥Bz

]

=1

2µRe

[

zωk

(k2 − ω2µǫ)2∇⊥Bz · ∇⊥B

∗z +

k2 − ω2µǫB∗

z∇⊥Bz

]

126 c©2010 by Charles Thorn

Page 43: 8 Lorentz Invariance and Special Relativity

The total power transported along the guide is

PTE =

dxdyz · 〈S〉

=1

ωk

(k2 − ω2µǫ)2

d2x∇⊥Bz · ∇⊥B∗z =

1

ωk

γ2

d2x|Bz|2 (633)

Finally, the time averaged energy density and energy per unit length are

〈uTE〉 =1

2Re

(

ǫ

2E⊥ ·E∗

⊥ +1

2µB⊥ ·B∗

⊥ +1

2µ|Bz|2

)

=1

2Re

(

ω2ǫµ+ k2

2µ(ω2µǫ− k2)2∇⊥Bz · ∇⊥B

∗z +

1

2µ|Bz|2

)

(634)

T ≡∫

d2x〈uTE〉 =∫

d2x1

2

(

ω2ǫµ+ k2

2µ(ω2µǫ− k2)BzB

∗z +

1

2µ|Bz|2

)

=1

ω2ǫµ

γ2

d2x|Bz|2 =ωµǫ

kPTE =

PTE

vg(635)

The relation PTE = vgT precisely corresponds to energy flowing down the guide with thegroup velocity.TM Modes The corresponding evaluation for TM modes is similar;

〈STM〉 =1

2Re [E⊥ ×H∗

⊥ +Ez z ×H∗⊥]

=1

2µRe[ωµǫ

kE⊥ × (z ×E∗

⊥) +Ez z ×B∗⊥

]

=1

2µRe

[

zωµǫ

kE⊥ ·E∗

⊥ +iωµǫ

k2 − ω2µǫEz∇⊥E

∗z

]

2Re

[

zωk

(k2 − ω2µǫ)2∇⊥Ez · ∇⊥E

∗z +

k2 − ω2µǫEz∇⊥E

∗z

]

The total power transported along the guide is

PTM =

dxdyz · 〈S〉

2

ωk

(k2 − ω2µǫ)2

d2x∇⊥Ez · ∇⊥E∗z =

ǫ

2

ωk

γ2

d2x|Ez|2 (636)

Finally, the time averaged energy density and energy per unit length are

〈uTM〉 =1

2Re

(

ǫ

2E⊥ ·E∗

⊥ +1

2µB⊥ ·B∗

⊥ +ǫ

2|Ez|2

)

=1

2Re

(

ǫ(ω2ǫµ+ k2)

2(ω2µǫ− k2)2∇⊥Ez · ∇⊥E

∗z +

ǫ

2|Ez|2

)

(637)

127 c©2010 by Charles Thorn

Page 44: 8 Lorentz Invariance and Special Relativity

T ≡∫

d2x〈uTE〉 =∫

d2xǫ

2

(

ω2ǫµ+ k2

2(ω2µǫ− k2)EzE

∗z +

1

2|Ez|2

)

2

ω2ǫµ

γ2

d2x|Ez|2 =ωµǫ

kPTM =

PTM

vg(638)

Resistive Losses Now that we have the solutions for perfect conductivity we can use themto get a first order estimate of the losses that occur with large but finite conductivity. TheOhmic losses are concentrated within a depth δ from the surface of the conductor and theloss per unit area was found in Eq.(??) to be ωδB‖ ·B∗

‖/4µc = |H‖|2/2σδ. We express theresult in terms of H‖ because that field is continuous across the interface. Then the onlyproperties of the conductor that come in are δ, σ. The loss per unit length is the integral ofthis quantity along a ribbon of width dz that encircles the waveguide. For the TM modes theparallel component of the magnetic field just outside the conductor’s surface is approximatelythe value for perfect conductivity and can be obtained from the cross productTM Modes:

n×H⊥ =−iωǫ

k2 − ω2µǫn× (z ×∇⊥Ez) =

−iωǫk2 − ω2µǫ

(zn · ∇⊥Ez) (639)

dPTM

dz= − 1

2σδ

ω2ǫ2

(k2 − ω2µǫ)2

dl|n · ∇⊥Ez|2

= − 1

2σδ

ω2ǫ2

γ4

dl|n · ∇⊥Ez|2 (640)

TE Modes For TE modes, there are two contributions to the parallel magnetic field: Hz

and the parallel component of H⊥ which can be inferred from

n×H⊥ =−ik

k2 − ω2ǫn×∇⊥Hz (641)

dPTE

dz= − 1

2σδ

dl

(

H2z +

k2

(k2 − ω2µǫ)2|n×∇⊥Hz|2

)

= − 1

2σδ

dl

(

H2z +

k2

γ4|n×∇⊥Hz|2

)

(642)

The attenuation constant β is defined according to an exponential law P (z) = P (0)e−2βz,so that

β = − 1

2P

dP

dz(643)

βTE =γ2

2σδ

[∮

dlH2z + (k2/γ4)

dl|n×∇⊥Hz|2µωk

d2x|Hz|2]

(644)

βTM =1

2σδ

ωǫ

γ2k

dl|n · ∇⊥Ez|2∫

d2x|Ez|2(645)

Since the integrals in these expressions depend only on the geometry of the wave guide, thefrequency dependence of the β’s is explicit (remembering that δ ∼ ω−1/2) and assuming that

128 c©2010 by Charles Thorn

Page 45: 8 Lorentz Invariance and Special Relativity

µ, µc, ǫ, σ are all frequency independent). Then βTM is proportional to

ω

δk=

1√µǫδ0

ω

ω0

(

1− ω20

ω2

)−1/2

(646)

where we have written ω0 = γ/√µǫ for the cutoff frequency and δ0 for the skin depth at

cutoff frequency. This expression blows up at the cutoff frequency, grows as√ω at high

frequency, and has a minimum at ω = ω0

√3. The frequency dependence of βTE is more

involved because of the two contributions:

βTE =

ω

ω0

(

1− ω20

ω2

)−1/2 [

I1ω20

ω2+ I2

(

1− ω20

ω2

)]

(647)

It also blows up at cutoff and increases as√ω at high frequency, but its minimum depends

on the details of the wave guide.

10.5 Resonant Cavities

Waveguides, because they leave one direction (z in our discussion) unrestricted, allow wavesto propagate in that direction. The walls of the guide prevent propagation in the x andy directions. These waves were described mathematically by assuming the dependencef(x, y)eikz−iωt for the various components of the fields and then finding solutions for a con-tinuous range of kc =

ω2 − ω2c . Except for TEM modes, the confinement of fields in the xy

dimensions generally produces a discrete set of lower cutoffs ωc > 0 on the allowed frequencyof propagation. It is instructive to invert the relationship between ω and k: ω = c

k2 + γ2,which is reminiscent of the relativistic relation of energy to momentum. If we compare thisto the relation for 3D space ω = c

k2x + k2y + k2z , we can interpret the effect of the waveguidewalls as quantizing the possible values of k2x + k2y. Similarly, we could have set up a guidethat allowed propagation in two directions by considering waves in the region between twoparallel planes, say at x = 0 and x = L. Then the wave ansatz would be f(x)eikyy+ikzz−iωt,and we would obtain a dispersion law ω = c

k2y + k2z + γ2, where now the effect of thewalls is just to quantize the values of kx. Notice that the effect of confining motion in onedimension has the effect of dimensional reduction. The extra dimension is not completelylost but makes itself felt through the spectrum of “masses” represented by the cutoff fre-quencies. If the extra dimension is very small, it would take a huge energy (~ω) to excitethese extra modes. Some very speculative notions about high energy physics include thehypothesis that our space-time actually has extra dimensions that are extremely tiny, thatwe will only discover in extremely high energy experiments by measuring the spectrum ofcutoff frequencies, which to us will be a spectrum of particle masses.

Returning to reality, we now consider confinement in all three dimensions, that is weconsider electromagnetic fields in cavities. Then we can expect that the frequency itselfwill be quantized. Again we simplify life by first assuming perfectly conducting walls, andconsider the effect of finite conductivity later as a small perturbation. Since we have alreadystudied cylindrical waveguides, the easiest cavities to discuss are cylinders with conducting

129 c©2010 by Charles Thorn

Page 46: 8 Lorentz Invariance and Special Relativity

end plates at fixed values of z. But of course cavities of any shape could be considered. Oneof the homework problems asks you to investigate a spherical cavity.

To describe a cylindrical cavity we can start with our known solutions for a cylindricalwaveguide and restrict the z dependence to satisfy the appropriate boundary conditions atz = 0 and z = L. We use the fact that for a given frequency, we get a waveguide solution forboth signs of k = ±

ω2 − ω2c/c. Thus each field component will have the general dependence

f(x, y)(A sin kz + B cos kz). We can maintain the distinction between TM and TE modes.At the endplates we must have Ex,y = 0 and Bz = 0. For TE modes, which are given in termsof Bz, we see that these boundary conditions imply Bz = Bz(x, y) sin(nπz/L). Because Ex,y

are determined in terms of ∂x,yBz, those components of E will automatically vanish as well.

Thus the frequency will be quantized to the discrete values ω = c√

γ2 + n2π2/L2. The effectof the endplates has been to quantize k = ±nπ/L. The x, y components of the fields are

Bz = Bz(x, y) sinnπz

L

B⊥ =i

γ2∇⊥Bz(x, y)

(

2iLeiπnz/L − −nπ

2iLe−iπnz/L

)

=nπ

γ2L∇⊥Bz(x, y) cos

nπz

L

z ×E⊥ =iω

γ2∇⊥Bz(x, y) sin

nπz

LTE Modes (648)

For TM modes Bz = 0 and the solution is specified in terms of Ez. For the wave guidesolution with z dependence e±ikz we had

E⊥ = ± ikγ2

∇⊥Eze±ikz (649)

to get a linear combination of these to vanish at z = 0, L we clearly need the two terms toappear with the same coefficients in Ez (so they appear with opposite coefficients in E⊥) toget E⊥ = 0 at z = 0. Then E⊥ = 0 at z = l requires k = nπ/L. Thus we have

Ez = Ez(x, y) cosnπz

L, B⊥ =

iωµǫ

γ2z ×∇⊥Ez(x, y) cos

nπz

L

z ×E⊥ = − nπ

γ2Lz ×∇⊥Ez(x, y) sin

nπz

LTM Modes (650)

For the right circular cylinder the transverse solutions are Bessel functions Jm(γmnρ)eimϕ.

For TM modes Dirichlet conditions on Ez imply that Rγmn ≡ xmn the zeroes of Jm(x).Then the cavity frequencies are ωmnp = c

x2mn/R2 + p2π2/L2. The lowest TM frequency

is ω010 = cx01/R ≈ 2.405c/R. For TE modes the Neumann conditions of Bz imply thatRγmn = ymn the zeroes of J ′

m(x). For TE modes p = 1 is the lowest allowed z mode,and y11 ≈ 1.841 is the lowest zero of the various J ′

m. Thus the lowest TE frequency isω111 = c

y211/R2 + π2/L2. since y11 < x01, this TE mode will have the lowest frequency for

sufficiently large L/R.

Energy, Finite Conductivity, and Q. The energy stored in a cylindrical cavity for asingle mode can be obtained from the energy per unit length of a wave guide by multiplying

130 c©2010 by Charles Thorn

Page 47: 8 Lorentz Invariance and Special Relativity

by cos2(nπz/L) or sin2(nzπ/L) and integrating z from 0 to L, which amounts, for n 6= 0 tomultiplying the energy per unit length by L/2. For the n = 0 mode one instead multipliesby L. This gives

UTE =L

γ2 + n2π2/L2

γ2

d2x|Bz|2 =µL

4

(

1 +n2π2

γ2L2

)∫

d2x|Hz|2 (651)

UTM =ǫL

4

(

1 +n2π2

γ2L2

)∫

d2x|Ez|2, n 6= 0; UTM =ǫL

2

d2x|Ez|2, n = 0 (652)

The rate of energy loss can be obtained by integrating the rate per unit length for a waveguide(times cos2 kz or sin2 kz) over z from 0 to L, and then adding the contribution from the twoendplates.

PTM = − 1

2σδ

[

L

2

ω2ǫ2

γ4

dl|n · ∇⊥Ez|2 + 2

d2x|H⊥|2]

= − 1

2σδ

[

L

2

ω2ǫ2

γ4

dl|n · ∇⊥Ez|2 + 2ω2ǫ2

γ4

d2x|∇⊥Ez|2]

= − ǫ

σδµ

(

1 +n2π2

γ2L2

)[

L

4γ2

dl|n · ∇⊥Ez|2 +∫

d2x|Ez|2]

, n 6= 0 (653)

The last term in square brackets is the contribution of the ends, and the first term in squarebrackets is multiplied by 2 for the case n = 0. The quality factor Q of a resonant cavity isdefined by

Q = ωEnergyStored

PowerLoss. (654)

Writing Cξ/A =∮

dl|n · ∇⊥Ez|2/γ2∫

d2x|Ez|2, with C the circumference and A the crosssectional area of the cylinder, we can write

QTM =ωσµδL

4(1 + ξCL/4A)=

µL

2µcδ(1 + ξCL/4A), n 6= 0

QTM =ωσµδL

2(1 + ξCL/2A)=

µL

µcδ(1 + ξCL/2A), n = 0. (655)

For TE modes, there are two contributions to the parallel magnetic field on the cylinderwall, Hz and the parallel component of H⊥, as well as the two endplate contributions:

PTE = − 1

2σδ

[∮

dlL

2

(

H2z +

n2π2

L2γ4|n×∇⊥Hz|2

)

+ 2n2π2

L2γ4

d2x|∇⊥Hz|2]

= − 1

σδ

[∮

dlL

4

(

H2z +

n2π2

L2γ4|n×∇⊥Hz|2

)

+n2π2

L2γ2

d2x|Hz|2]

(656)

The expression for QTE is more cumbersome because of the extra contribution. There arenow two parameters associated with the transverse geometry.

A=

dl|n×∇⊥Hz|2γ2∫

d2x|Hz|2,

A=

dl|Hz|2∫

d2x|Hz|2(657)

131 c©2010 by Charles Thorn

Page 48: 8 Lorentz Invariance and Special Relativity

Then we can write

QTE =ωσδµL

4

L2γ2 + n2π2

(L2γ2η + n2π2ζ) (LC/4A) + n2π2

=µL

2µcδ

L2γ2 + n2π2

(L2γ2η + n2π2ζ) (LC/4A) + n2π2(658)

The quality factor controls how rapidly an excited cavity loses its energy. From its definitionwe have

dU

dt= −ω0

QU, U(t) = U(0)e

−ω0t

Q (659)

Where we write ω0 for the eigenfrequency of the mode excited. Such exponential dampingcorresponds to damping in the fields at half the rate. So the effect of finite conductivityis to replace harmonic time dependence in the fields e−iω0t by e−ω0t/2Q−i(ω0+∆ω)t. This timedependence corresponds to a smoothed out frequency profile

E(ω) =

∫ ∞

0

dt

2πeiωtE0e

−ω0t/2Q−i(ω0+∆ω)t =E0

1

ω0/2Q+ i(ω0 +∆ω − ω)(660)

|E(ω)|2 =|E0|24π2

1

(ω − ω0 −∆ω)2 + ω20/4Q

2(661)

So instead of showing a single discrete frequency ω0, damping spreads out the frequencyprofile in the energy density over a peak with width at half maximum of Γ = ω0/Q. AsQ→ ∞ the profile sharpens to a delta function centered at ω0.

10.6 Perturbation of Boundary Conditions

We can get a little more information about the effects of finite conductivity by consideringhow its presence alters the boundary value problem for the Helmholtz equation. We haveseen that the tangential electric field is not strictly zero near the conductor, but rather

E‖ ≈1− i

σδK =

1− i

σδn×H (662)

Thus, taking the simpler case of TM modes, we see that instead of Ez = 0 on the walls ofthe cylinder, we should have

Ez z ≈ 1− i

σδn×H⊥ ≈ z

1− i

σδ

iωǫ

γ2n · ∇⊥Ez

Ez ≈ 1 + i

σδ

ωǫ

γ20n · ∇⊥Ez0 = (1 + i)

ω2µcδǫ

2γ20n · ∇⊥Ez0 (663)

where we added a 0 subscript on quantities valid in the perfect conductor limit. On the endplates instead of E⊥ = 0, we should have

E⊥ ≈ 1− i

σδ±z ×H⊥ ≈ ∓1− i

σδ

iωǫ

γ20∇⊥Ez0 = ∓(1 + i)

ω2µcδǫ

2γ20∇⊥Ez0

∂Ez

∂z= −∇⊥ ·E⊥ ≈ ±(1 + i)

ω2µcδǫ

2γ20∇2

⊥Ez0 = ∓(1 + i)ω2µcδǫ

2Ez0 (664)

132 c©2010 by Charles Thorn

Page 49: 8 Lorentz Invariance and Special Relativity

where the upper (lower) sign applies to the endplate at z = L (z = 0). Denoting the perfectconductor solutions with a 0 subscript, we have for infinite and finite conductivity

(−∇2 − ǫµω20)Ez0 = 0, Ez0 = 0 on walls,

∂Ez0

∂z= 0 on endplates (665)

(−∇2 − ǫµω2)Ez = 0, Ez = (1 + i)ω2µcδǫ

2γ20n · ∇⊥Ez0 on walls,

∂Ez

∂z= ∓(1 + i)

ω2µcδǫ

2Ez0 on endplates (666)

We now use Green’s Theorem∫

dzd2x(Ez∇2E∗z0 − E∗

z0∇2Ez) =

dS(−n) · (Ez∇E∗z0 − E∗

z0∇Ez))

µǫ(ω2 − ω20)

d3xEzE∗z0 =

dz

dl(−n) · (Ez∇E∗z0 − E∗

z0∇Ez))

−∫

d2x

(

Ez∂E∗

z0

∂z− E∗

z0

∂Ez

∂z

)

z=L

z=0

=

dz

dl(−n) · Ez∇E∗z0 +

d2xE∗z0

∂Ez

∂z

z=L

z=0

where n is the normal vector pointing out from the conductor or into the hollow region, andin the last line we used the fact that Ez0 = 0 on the cylinder walls and ∂Ez0/∂z = 0 on theendplates. Putting everything together we have

(ω2 − ω20)

d3xEz0E∗z0 ≈ −(1 + i)

ω20µcδ

µ

(

L

4γ20

dl|n · ∇⊥Ez0|2 +∫

d2x|Ez0|2)

ω − ω0 ≈ −(1 + i)ω0µcδ

(

ξLC

4A+ 1

)

= −(1 + i)ω0

2Q(667)

From the time dependence implied by finite Q, we infer that ω − ω0 = ∆ω − iω0/2Q fromwhich we infer that

∆ω = −ω0µcδ

(

ξLC

4A+ 1

)

= − ω0

2Q, Q =

2δµc(1 + ξCL/4A)(668)

The result for Q agrees with that obtained from our energy analysis and the result for ∆ωis some new information.

10.7 Excitation of Waveguide Modes

One way to set up waves in waveguides is to apply an oscillating current distribution atsome spot within the guide. This current would in turn produce fields oscillating at thesame frequency, which far from the current source would be in a superposition of the allowedpropagating modes at that frequency. That is, the modes with γ2 < ω2µǫ. Close to the source

133 c©2010 by Charles Thorn

Page 50: 8 Lorentz Invariance and Special Relativity

all the other modes will be present to form a proper solution of the Maxwell equations inthe presence of the source. This process is a mini-radiation problem in which the outgoingwaves are limited to two directions +z and −z. If we assume the driving source is localizedin z, say Jµ = 0 when |z| > L/2, then the fields will solve the sourceless Maxwell equationsin the whole region to the right z > L/2 and in the whole region to the left z < −L/2. Inboth cases the solution should be a superposition of outgoing waves. This means that thefields on the right will be superpositions of wave guide modes with eikz dependence for realk =

ω2µǫ− γ2 or e−z|k| for imaginary k. similarly on the left they will be superpositionsof e−ikz or e+z|k|.

An effective way to describe this physical situation is to expand the fields in waveguidenormal modes. As we have seen these modes divide into TM and TE modes, in which allcomponents of the field are expressed in terms of Ez and Hz respectively. As discussed in J,Problem 8.18, different modes of each type are orthogonal in the sense that

d2xEzλEzµ = 0, or

d2xHzλHzµ = 0, λ 6= µ (669)

if λ 6= µ. For λ = µ the integrals are nonzero and we may establish a convenient normalizationfor them. Because the Helmholtz equation and boundary conditions are real, the modescould be chosen to be real. However, it will be convenient to allow some modes to be purelyimaginary, and for later convenience adopt the normalizations

d2xEzλEzµ = −γ2λ

k2λδλµ, TM Modes (670)

d2xHzλHzµ = − γ2λk2λZ

δλµ, TE Modes (671)

Zλ =µω

k=

µ

ǫ

(

ω√µǫ

k

)

, λ ∈ TE

Zλ =k

ǫω=

µ

ǫ

(

k

ω√ǫµ

)

, λ ∈ TM (672)

The wave impedance Zλ (it has dimensions of resistance) is defined so that H⊥ = Z−1λ z×E⊥

for both kinds of modes. We see that this normalization implies that for TM modes Ez isimaginary for real kλ and Ez is real for imaginary kλ. For TE modes, however, Hz isimaginary for both cases. Finally note that, with these conventions, the dimensions of themode functions Ezλ are taken to be 1/Length, instead of the physical dimensions of an electricfield (Force/Charge). This is a convenient choice for their use as expansion functions, andsimply means that the physical units of the electric field will be restored by the units of theexpansion coefficients. However, the dimensions of Hzλ/Ezλ remain 1/Ohms, the same asfor physical fields.

The purpose of these strange normalization conditions on the z-components was to ar-range simple orthonormality conditions on E⊥:

d2xE⊥λ ·E⊥µ = δλµ, and

d2xH⊥λ ·H⊥µ =1

Z2λ

δλµ (673)

134 c©2010 by Charles Thorn

Page 51: 8 Lorentz Invariance and Special Relativity

It is also easy to check that the transverse fields for TE modes are orthogonal to those forTM modes. Notice that the signs of the right sides of these equations reflect that fact thatE⊥ is always real with our conventions, whereas H⊥ is real for propagating modes (kλ real)but imaginary for nonpropagating modes (kλ imaginary). Finally the orthogonality relation

d2xE⊥λ ×H⊥µ =1

δλµ (674)

will be useful in what follows.We can use the same set of mode fields to construct both right and left moving waves,

except that the relative sign between the longitudinal and transverse fields flips in going fromright moving to left moving. Thus if we fix kλ = +

ω2ǫµ− γ2λ if real and k = i√

γ2λ − ω2ǫµif imaginary, the right (+) moving and left (−) moving modes will be defined via

E+λ ≡ [Eλ⊥ + zEλz]e

ikz, H+λ ≡ [Hλ⊥ + zHλz]e

ikz (675)

Eλ− ≡ [Eλ⊥ − zEλz]e

−ikz, H−λ ≡ [−Hλ⊥ + zHλz]e

−ikz (676)

This notation subsumes both TE and TM modes, the index λ distinguishing different modes.If λ is a TE (TM) mode then Eλz = 0 (Hλz = 0). Each mode is associated with its owncutoff frequency γλ/

√µǫ.

Returning to the waveguide excitation problem, we can expand the solution of Maxwell’sequations outside the source in a compete set of waveguide modes

E = E+ +E− =∑

λ

A+λE

+λ +

λ

A−λE

−λ (677)

H = H+ +H− =∑

λ

A+λH

+λ +

λ

A−λH

−λ (678)

Maxwell’s equations imply that the expansion coefficients will be a fixed set of constantsthroughout any region of z completely free of sources. However, if two such regions areseparated by a region containing sources, the expansion coefficients will differ for the tworegions. Since we are assuming a localized source we have one set of expansion coefficientsA±

Rλ in the region to the right of the source and another set A±Lλ to the left of the source. For

the analysis of the waveguide excitation, we shall impose that E+ = 0 (A+Lλ = 0 for all λ)

everywhere to the left of the source and E− = 0 (A−Rλ = 0 for all λ) everywhere to the right

of the source. The problem is to relate the nonzero coefficients, A−Lλ, A

+Rλ to the properties

of the source.We can then use a trick with the Poynting vector to relate the expansion coefficients on

the left to those on the right and a volume integral of the expressions J ·E±λ . It is based on

the identity

∇ · (E ×H±λ ) = (∇×E) ·H±

λ −E · (∇×H±λ )

= iωµH ·H±λ + iωǫE ·E±

λ

∇ · (E±λ ×H) = (∇×E±

λ ) ·H −E±λ · (∇×H)

= iωµH ·H±λ + iωǫE ·E±

λ − J ·E±λ

∇ ·(

E ×H±λ −E±

λ ×H)

= J ·E±λ (679)

135 c©2010 by Charles Thorn

Page 52: 8 Lorentz Invariance and Special Relativity

We can now integrate this over the volume within the waveguide between −L/2 < z < L/2which completely contains the source. The left side can be written as a surface integral overthe boundary of the volume. On the walls of the guide the normal component of E × H

involves only the parallel component of E which vanishes on the walls. Thus the onlycontribution comes from the cross sections at z = ±L/2. At z = L/2, E− = H− = 0, sothat contribution is

µ

A+Rµ

d2xz ·(

E+µ ×H±

λ −E±λ ×H+

)

=∑

µ

A+Rµ

d2xz · (E⊥µ × (±)H⊥λ −E⊥λ ×H⊥µ)

=∑

µ

A+Rµ

(

± 1

δλµ −1

δλµ

)

=

0− 2

ZλA+

Rλ(680)

Similarly at z = −L/2, E+ = H+ = 0, so that contribution is

µ

A−Lµ

d2x(−z) ·(

E−µ ×H±

λ −E±λ ×H−

µ

)

= −∑

µ

A−Lµ

d2xz · (E⊥µ × (±)H⊥λ +E⊥λ ×H⊥µ)

= −∑

µ

A−Lµ

(

± 1

δλµ +1

δλµ

)

=

− 2ZλA−

0(681)

Putting the two contributions together gives

A±RLλ

= −Zλ

2

d3xJ ·E∓λ = −Zλ

2

d3x (J⊥ ·E⊥λ ± JzEλz) e∓ikλz (682)

Thus the strength of the excitation of each mode, propagating or not is determined by thissingle integral8. Far down the wave guide only the propagating modes with γ2λ < ω2ǫµ willsurvive. For a fixed frequency, there will be only a finite number of these.

8More generally if we had not set A±L

Rλ= 0, the left side of these equations would be the differences

A±R

Lλ− A±

L

Rλ. So, for instance, if J = 0, the equation would simply say that the coefficients on the left and

right would be the same: A±

Rλ = A±

Lλ.

136 c©2010 by Charles Thorn

Page 53: 8 Lorentz Invariance and Special Relativity

11 Radiation from Localized Sources

We shall use Green function methods to calculate how electromagnetic waves are emittedfrom time varying sources. Last semester we obtained the Green function for the waveequation

(

−∇2 +1

c2∂2

∂t2

)

G(r, t; r′, t′) = δ(r − r′)δ(t− t′) (683)

G(r, t; r′, t′) =δ(|r − r′|/c− t+ t′)

4π|r − r′| (684)

where a choice has been made so that G is zero for t < t′, that is: G is the retarded Greenfunction. Maxwell’s equations can be reduced to the wave equation by choosing Lorenzgauge ∂µA

µ = 0. This is easily seen in covariant notation:

Jµ = cǫ0∂νFµν = c2ǫ0∂ν(∂

µAν − ∂νAµ) = − 1

µ0

∂ · ∂Aµ (685)

−∂ · ∂Aµ =

(

−∇2 +1

c2∂2

∂t2

)

Aµ = µ0Jµ (686)

Aµ(r, t) = µ0

d3r′Jµ(r

′, t−R/c)

4πR, R ≡ |r − r′| (687)

where we note that the source contributes at the retarded time.Radiation, by definition, is electromagnetic energy that escapes to large distances, and

is determined by the large r behavior of the integral. We can write R ≈ r − r · r′/r

Aµ(r, t) ∼ µ0

4πr

d3r′Jµ(r′, t−R/c), r ≫ a (688)

If we Fourier analyze J(r, t) in time we get for the component with frequency ω,

Aµ(r, ω)e−iωt ∼ e−iωt µ0

4πr

d3r′Jµ(r′, ω)eik|r−r′|

Aµ(r, ω) ∼ µ0eikr

4πr

d3r′Jµ(r′, ω)e−ikr·r′/r (689)

The simplest radiating physical systems are point particles with harmonic trajectories,for instance moving in a circle of radius a in the xy-plane at constant speed ωa, r(t) =a(cos(ωt), sin(ωt, 0). The current for such a particle is

J = qdr

dtδ(r − r(t)) (690)

Jx ± iJy = ±iqaωe±iωtδ(r − a)δ(z)1

aδ(ϕ− ωt)

= ±iqωe±iωtδ(r − a)δ(z)δ(ϕ− ωt), Jz = 0 (691)

137 c©2010 by Charles Thorn

Page 54: 8 Lorentz Invariance and Special Relativity

The angle delta function in this formula must be understood as requiring ϕ = ωt mod 2π,which means it has the Fourier representation

δ(ϕ− ωt) =1

∞∑

n=−∞ein(ϕ−ωt) (692)

Thus we see that although the motion has the frequency ω, the current oscillates at all theharmonics nω of this fundamental frequency.

Jx ± iJy = ± iqω2π

e±iωtδ(r − a)δ(z)∞∑

n=−∞ein(ϕ−ωt−ϕ0) (693)

But notice that the higher harmonics are accompanied with high powers of eiϕ, which meansthat they contribute only to higher multipole moments. In the homework problems you areasked to analyze dipole and quadrupole arrangements of such revolving charges.

To calculate the Poynting vector which describes the energy flow of radiation, we needthe fields.

H =1

µ0

B =1

µ0

∇×A (694)

E = −∂A∂t

−∇φ = iωA−∇φ

= iωA− c2

iω∇(∇ ·A) (695)

where the last line uses the Lorenz gauge condition 0 = ∂A0/∂(ct)+∇·A → −iωφ/c2+∇·A.In the radiation zone, characterized by the fields falling off as 1/r, the only contribution to thespatial derivatives which does not add a further factor of 1/r is when they act on the phasefactor ∇eikr = ikreikr/r ≡ ikeikr. Thus, in the radiation zone characterized by kr ≫ 1, it isvalid to substitute ∇ → ik.

H ∼ ieikr

4πrk ×

d3r′J(r′, ω)e−ik·r′

(696)

E ∼ µ0eikr

4πωr

d3r′(

iω2J(r′, ω)− ic2kk · J(r′, ω))

e−ik·r′

∼ −i eikr

4πǫ0ωrk ×

(

k ×∫

d3r′J(r′, ω)e−ik·r′

)

= −Z0k ×H (697)

where Z0 = µ0c =√

µ0/ǫ0 ≈ 377Ω is sometimes called the impedance of the vacuum. Notethat, had we not used the Lorenz gauge condition to relate φ to A, the integrand wouldinvolve

iωJ(r′, ω)− ikc2ρ(r′, ω) = iωJ(r′, ω)− kc2

ω∇′ · J(r′, ω)

→ iωJ − ikc2

ωk · J = −ic

2

ωk × (k × J) (698)

138 c©2010 by Charles Thorn

Page 55: 8 Lorentz Invariance and Special Relativity

where we used current conservation ∇ · J = iωρ in the first line and the arrow denotesthe change in the integrand after an integration by parts. Finally the flux of energy in theradiation is given by the time averaged Poynting vector, which at large r is

〈S〉 =1

2Re E ×H∗

∼ −Z0

2kRe (k ×H)×H∗ =

Z0k

2kH ·H∗

dP

dΩ= r2r · 〈S〉 ∼ Z0

32π2

k ×∫

d3r′J(r′, ω)e−ik·r′

2

=Z0

32π2k2

k ×(

k ×∫

d3r′J(r′, ω)e−ik·r′

)∣

2

(699)

The main point of using the formula on the last line is that the vector inside the absolutevalue points in the direction of the electric field and hence determines the polarization of theradiation.

11.1 Long Wavelength Limit

We have just seen that the intensity of radiation involves the Fourier transform of the currentdensity

d3rJ(r, ω)e−ik·r (700)

When only one frequency contributes the normalization is that Re(Je−iωt) is the actualphysical current density. Suppose the source is localized within a region of size a. Thenthe exponent is of order ka = 2πa/λ. If the wavelength is large compared to the size of thesource, then ka≪ 1 and we may expand the exponential, relating the terms of the expansionto various multipole moments. We examine the first few terms of this expansion.

e−ik·r = 1− ik · r − 1

2(k · r)2 + · · · (701)

The first term contributes∫

d3rJk(r) =

d3r∂rk

∂rlJ l(r) = −

d3rrk∇ · J(r) = −iω∫

d3rrkρ = −iωpk (702)

where p is the electric dipole moment.The next term contributes

−ikl∫

d3rrlJk = −ikl

2

d3r(rlJk + rkJ l)− i

2

d3r(r · kJk − rkk · J)

= −ikl

2

d3rJm∇m(rkrl)− i

2

d3r[k × (J × r)]k

139 c©2010 by Charles Thorn

Page 56: 8 Lorentz Invariance and Special Relativity

= ikl

2

d3rrkrl∇ · J + ik ×m = −ωkl

2

d3rrkrlρ+ ik ×m

= −ωkl

2

(

Qkl

3+δkl3

d3rr2ρ

)

+ i[k ×m]k (703)

where we have recalled the definition of the electric quadrupole moment Qkl =∫

d3x(3rkrl−r2δkl)ρ. Finally, we have

d3rJk(r)e−ik·r = −iωpk + i[k ×m]k − ω

6Qklk

l − ωkk

6

d3rr2ρ (704)

The last term, which is not expressed in terms of a multipole moment does not contribute toradiation which involves the vector product of the current with k. The three terms that docontribute are electric dipole, magnetic dipole, and electric quadrupole respectively. Eachtype of radiation has a characteristic angular distribution.

For electric and magnetic dipole radiation one has

ω2

k2|k × (k × p)|2 =

ω2

k2|k(k · p)− k2p|2 = ω2(k2p · p∗ − k · pk · p∗)

1

k2|k × (k × (k ×m))|2 = k2(k2m ·m∗ − k ·mk ·m∗) (705)

Only if the components of p or m are all relatively real can we identify the right sides ofthese equations with ω2k2|p|2 sin2 α or k4|m|2 sin2 α, where α is the angle between k and thedipole moment.

dP

dΩ=

Z0

32π2ω2(k2|p|2 − |k · p|2), dP

dΩ=

Z0

32π2k2(k2|m|2 − |k ·m|2) (706)

The only qualitative difference between electric and magnetic dipole radiation is the po-larization, i.e. the orientation of the electric field. For the electric dipole it points in thedirection k × (k × p) which lies in the plane defined by k and p. For the magnetic dipoleit points in the direction perpendicular to the plane defined by k and m. The total powerradiated requires

dΩkikj = 4πk2/3δij

P =Z0

12πω2k2|p|2 = Z0

12πk4c2|p|2, P =

Z0

12πk4|m|2 (707)

respectively.Quadrupole radiation has a more complicated angular distribution. We have to square

the vector with components V i = ǫijkkjQklk

l

|V |2 = ǫijkkjQklk

lǫimnkmQ∗

npkp = (δjmδkn − δjnδkm)k

jQklklkmQ∗

npkp

= k2QklQ∗kpk

lkp − |Qmlkmkl|2 (708)

dP

dΩ=

ω2k4Z0

1152π2(QmlQ

∗mpk

lkp − |Qmlkmkl|2) (709)

140 c©2010 by Charles Thorn

Page 57: 8 Lorentz Invariance and Special Relativity

where k = (sin θ cosϕ, sin θ sinϕ, cos θ).To see the angular distribution in a simple example, take Qij diagonal with Q33 =

−2Q11 = −2Q22. Then QijQinkj kn/Q2

11 = sin2 θ + 4 cos2 θ = 1 + 3 cos2 θ and Qij kikj/Q11 =

sin2 θ − 2 cos2 θ = 1− 3 cos2 θ. then

dP

dΩ= 9

c2k6Z0

1152π2Q2

11 sin2 θ cos2 θ =

c2k6Z0

128π2Q2

11 sin2 θ cos2 θ. (710)

This shows a four lobed structure with vanishing radiation at θ = 0, π/2.To integrate over angles one can use rotational invariance and symmetry to argue that

dΩkmkl = C1δml,

dΩkmklknkp = C2(δmlδnp + δmnδlp + δmpδln) (711)

Then tracing over ml determines C1 = 4π/3, 5C2 = C1 so C2 = 4π/15. Then we find

P =4ω2k4Z0

1152π

(

1

3

ml

|Qml|2 −1

15(|TrQ|2 + 2

ml

|Qml|2))

=ω2k4Z0

1440π

ml

|Qml|2 (712)

for the total readiated power.

11.2 Beyond the Multipole Expansion

If the wavelength of radiation is comparable to the size of the source, the multipole expansionis no longer valid, and we have to try to evaluate the full Fourier transform

d3xJe−ik·r. Fora line antenna we can for example prescribe a simple form for J = zIδ(x)δ(y) sin(kd/2−k|z|)for −d/2 < z < d/2. Then

d3xJe−ik·r = zI

∫ d/2

−d/2

dz sin

(

kd

2− k|z|

)

e−ikz cos θ

=Iz

2i

∫ d/2

0

dz(

eikd/2−ikz(1−cos θ) + eikd/2−ikz(1+cos θ) − e−ikd/2+ikz(1+cos θ) − e−ikd/2+ikz(1−cos θ))

=Iz

2i

(

ei(kd/2) cos θ) + e−i(kd/2) cos θ) − eikd/2 − e−ikd/2

−ik(1− cos θ)+ei(kd/2) cos θ) + e−i(kd/2) cos θ) − eikd/2 − e−ikd/2

−ik(1 + cos θ)

)

=2Iz

k

cos[(kd/2) cos θ]− cos(kd/2)

1− cos2 θ=

2Iz

k

cos[(kd/2) cos θ]− cos(kd/2)

sin2 θ(713)

The angular power distribution is

dP

dΩ=

Z0

32π2

4I2

k4

cos[(kd/2) cos θ]− cos(kd/2)

sin2 θ

2

|k × (k × z)|2

=Z0I

2

8π2

cos[(kd/2) cos θ]− cos(kd/2)

sin θ

2

(714)

141 c©2010 by Charles Thorn

Page 58: 8 Lorentz Invariance and Special Relativity

If kd ≪ 1 we can expand in powers of kd which generates a multipole expansion. Thenumerator

cos[(kd/2) cos θ]− cos(kd/2) =∞∑

n=0

(−)n

(2n)!(kd/2)2n(cos2n θ − 1)

= − sin2 θ

∞∑

n=1

(−)n

(2n)!(kd/2)2n

n−1∑

m=0

cos2m θ

∼ k2d2

8sin2 θ − k4d4

384sin2 θ(1 + cos2 θ) + · · · (715)

We see that the multipole expansion of the angular distribution will be sin2 θ times a powerseries in cos2 θ.

But the formula also gives us a closed form expression even when kd = O(1). For exampleif kd = π, 2π the formula simplifies to

dP

dΩ kd=π=

Z0I2

8π2

cos2[(π/2) cos θ]

sin2 θ,

dP

dΩ kd=2π=Z0I

2

8π2

4 cos4[(π/2) cos θ]

sin2 θ(716)

The intensity at θ = π/2 is 4 times larger in the 2π case than in the π case. On the otherhand as θ → 0, π the intensity drops to zero much faster in the 2π case.

142 c©2010 by Charles Thorn

Page 59: 8 Lorentz Invariance and Special Relativity

11.3 Systematics of the Multipole Expansion

The way to systematically handle multipole radiation is to expand the angular dependencein spherical harmonics. The Helmholtz equation satisfied by all 6 field components separatesin spherical coordinates in the same way as the Laplace equation. We seek solutions of theform R(r)Ylm(θ, ϕ) and find the radial equation

(

−1

r

∂2

∂r2r +

l(l + 1)

r2− k2

)

R = 0 (717)

This equation can be compared to the Bessel equation(

d2

dr2+

1

r

d

dr+ k2 − m2

r2

)

Jm(kr) =

(

1√r

d2

dr2√r + k2 − m2 − 1/4

r2

)

Jm(kr) = 0 (718)

We see that the solution is R = 1√rJl+1/2(kr). These special Bessel functions are called

spherical Bessel functions and are denoted by lower case letters:

jl(x) ≡√

π

2xJl+1/2(x) ∼

cos(x− (l + 1)π/2)

x,

nl(x) ≡√

π

2xNl+1/2(x) ∼

sin(x− (l + 1)π/2)

x, (719)

h(1,2)l ≡ jl ± inl ∼ exp(±i(x−(l+1)π/2))

x

where the asymptotics shown is for x >> 1. Amazingly the spherical Bessel functions arerational functions of x and linear functions of sin x, cos x. E.g.

j0(x) =sin x

x, n0(x) = −cos x

x, h

(1)0 (x) = −ie

ix

x(720)

By a familiar argument we have the spherical expansion of the Green function for theHelmholtz equation:

eik|r−r′|

4π|r − r′| = ik∞∑

l=0

h(1)l (kr>)jl(kr<)

l∑

m=−l

Ylm(θ, ϕ)Y∗lm(θ

′, ϕ′) (721)

This expansion immediately gives an expansion in spherical waves for the potential every-where completely outside a harmonically varying source so r > r′:

Aµ(r, ω) = µ0

d3r′Jµ(r′)

eik|r−r′|

4π|r − r′|

= ikµ0

∞∑

l=0

l∑

m=−l

h(1)l (kr)Ylm(θ, ϕ)

d3r′Jµ(r′)jl(kr

′)Y ∗lm(θ

′, ϕ′) (722)

∼ eikr

rµ0

∞∑

l=0

(−i)ll∑

m=−l

Ylm(θ, ϕ)

d3r′Jµ(r′)jl(kr

′)Y ∗lm(θ

′, ϕ′) (723)

143 c©2010 by Charles Thorn

Page 60: 8 Lorentz Invariance and Special Relativity

in the radiation zone. Notice that the strength of each Ylm is not literlly a multipole momentbecause the rl of the latter has been replaced by jl(kr). But they become multipole momentss kr → 0, the low frequency limit. This expansion does not quite a completely organize theangular dependence of the fields, because of the angular information contained in J and inthe derivatives relating the potential to the fields. Next we turn to a more elegant description.

11.4 Vector Spherical Harmonics and Multipole Radiation

We can construct a vector from the components of Ylm by applying the angular momen-tum operator L = −ir × ∇. This suggests introducing vector spherical harmonics X lm ≡LYlm/

l(l + 1). Since L does not change the l value, the components of X lm are linearcombinations of the Ylm′ , so fl(kr)X lm satisfies the Helmholtz equation if fl is a sphericalBessel function. Furthermore ∇ · X lm = 0 and r · X lm = 0, which together imply that∇ · (fl(kr)X lm) = 0. Thus fl(kr)X lm is a candidate for either E or H outside of sources.Using ikH =

ǫ/µ∇ × E or −ikE =√

µ/ǫ∇ ×H then determines the expansions for ageneral solution of the sourceless Maxwell equations:

H =∑

lm

[

aElmfl(kr)X lm − i

kaMlm∇× (gl(kr)X lm)

]

(724)

E = Z0

lm

[

i

kaElm∇× fl(kr)X lm + aMlmgl(kr)X lm)

]

(725)

If all aM = 0, the fields satisfy r ·H = 0 and r ·E 6= 0, so they have the quality of electricmultipoles. With all aE = 0, the properties of E, H are switched, and the fields have thequality of magnetic multipoles.

To determine aE, aM in terms of sources, note that

r ·H = − i

k

lm

aMlmiL · (gl(kr)X lm) =1

k

lm

aMlm√

l(l + 1)gl(kr)Ylm (726)

r ·E = −Z0

k

lm

aElm√

l(l + 1fl(kr)Ylm (727)

gl(kr)aMlm =

k√

l(l + 1)

dΩY ∗lmr ·H ,

Z0fl(kr)aElm =

−k√

l(l + 1)

dΩY ∗lmr ·E (728)

The Helmholtz equations for the fields

(−∇2 − k2)H = ∇× J

(−∇2 − k2)E =iZ0

k(k2J +∇∇ · J), (729)

144 c©2010 by Charles Thorn

Page 61: 8 Lorentz Invariance and Special Relativity

where the second equation follows by taking the curl of the first one. From these, we canderive Helmholtz equations for r ·H , r ·E:

(−∇2 − k2)r ·H = r · (−∇2 − k2)H − 2∇ ·H= r · (∇× J) = iL · J (730)

(−∇2 − k2)r ·E = r · (−∇2 − k2)E − 2∇ ·E

=iZ0

kr · (k2J +∇∇ · J)− 2

Z0

ik∇ · J

=iZ0

kr · [(k2 +∇2)J +∇× (∇× J)] + 2

iZ0

k∇ · J

=iZ0

k[(k2 +∇2)r · J + iL · (∇× J)] (731)

Solving these equations with outgoing radiation boundary conditions

r ·H =

d3r′eik|r−r′|

4π|r − r′| iL · J (732)

r ·E = − iZ0

kr · J − Z0

k

d3r′eik|r−r′|

4π|r − r′|L · (∇× J) (733)

Letting r be outside all sources, so that r > r′ and J(r, ω = 0), we insert the expansion ofthe Green function in spherical harmonics to get

r ·H = −k∞∑

l=0

l∑

m=−l

h(1)l (kr)Ylm(θ, ϕ)

d3r′jl(kr′)Y ∗

lm(θ′, ϕ′)L · J (734)

r ·E = −iZ0

∞∑

l=0

l∑

m=−l

h(1)l (kr)Ylm(θ, ϕ)

d3r′jl(kr′)Y ∗

lm(θ′, ϕ′)L · (∇× J) (735)

from which we read off gl = fl = h(1)l (kr) and

aMlm =−k2

l(l + 1)

d3r′jl(kr′)Y ∗

lmL · J = −k2∫

d3r′jlX∗lm · J (736)

aElm =ik

l(l + 1)

d3r′jl(kr′)Y ∗

lmL · (∇× J)

= ik

d3r′jlX∗lm · (∇× J) (737)

Note that at finite k these are not strict lth moments of the current densities. But in thelong wavelength limit jl(kr

′) → (kr′)l/(2l + 1)!! and they do approach pure lth moments.with these definitions the fields outside the source but produced by the source J are

H =∑

lm

[

aElmh(1)l (kr)X lm − i

kaMlm∇× (h

(1)l (kr)X lm)

]

(738)

E = Z0

lm

[

i

kaElm∇× h

(1)l (kr)X lm + aMlmh

(1)l (kr)X lm)

]

(739)

145 c©2010 by Charles Thorn

Page 62: 8 Lorentz Invariance and Special Relativity

These expressions are exact, except for the stipulation that the observation point is outsidethe source. Going to the radiation zone kr ≫ 1 we use h

(1)l (kr) → (−i)l+1eikr/kr, and the

fact that in the leading behavior ∇ acts only on the factor eikr, ∇eikr = ikreikr ≡ ikeikr, wefind

H ∼ eikr

kr

lm

(−i)l+1[

aElmX lm + aMlmr ×X lm

]

(740)

E ∼ Z0eikr

kr

lm

(−i)l+1[

−aElmr ×X lm + aMlmX lm

]

∼ −Z0r ×H (741)

which determine dP/dΩ through the time averaged Poynting vector.

dP

dΩ=

Z0

2k2

lm

(−i)l+1[

aElmX lm + aMlmr ×X lm

]

2

(742)

When a single multipole contributes (e.g. in the long wavelength limit when the lowestnonvanishing multipole dominates) the angular distribution is carried by

|X lm|2 =1

l(l + 1)

(

1

2|L+Ylm|2 +

1

2|L−Ylm|2 +m2|Ylm|2

)

(743)

Example: lm = 11: L+Y11 = 0, L−Y11 = Y10√2 so

|X11|2 =1

42|Y10|2 +

1

2|Y11|2 =

3

8πcos2 θ +

3

16πsin2 θ =

3

16π(1 + cos2 θ)

We get the total radiated power by integrating over angles. The vector spherical harmonicssatisfy orthonormality conditions

dΩX∗l′m′ ·X lm = δl′lδm′m (744)

dΩX∗l′m′ · (r ×X lm) =

1√

l(l + 1)l′(l′ + 1)

dΩY ∗l′m′L · (r ×L)Ylm

=1

l(l + 1)l′(l′ + 1)

dΩY ∗l′m′L · (−irr · ∇+ ir∇)Ylm

=1

l(l + 1)l′(l′ + 1)

dΩY ∗l′m′ ·

(

−iL · r∂Ylm∂r

+ irL · ∇Ylm)

= 0 (745)

because L · ∇ = −ir · (∇×∇) = 0 and ∂Ylm/∂r = 0. This means that the angular integralover all the cross terms is 0, so we have the simple result:

P =Z0

2k2

lm

[

|aElm|2 + |aMlm|2]

, (746)

146 c©2010 by Charles Thorn

Page 63: 8 Lorentz Invariance and Special Relativity

12 Scattering of Electromagnetic Waves

So far we have described how systems with prescribed current density radiate, but themanner in which such current densities can be set up still needs to be understood. Oneway of course is to drive current in an antenna, say for the purpose of broadcast. Butspectroscopic observation of the radiation from atomic systems is an important tool forstudying those systems. To radiate the systems must be excited and a basic way to do thatis to scatter photons (or other fundamental particles like electrons or neutrons) off them.Here we consider the scattering of classical electromagnetic waves.

The basic quantity that characterizes the outcome of a scattering process is the differentialcross section which is the angular distribution of outgoing radiation normalized to unitflux. For example if the outgoing radiation is measured as power per unit solid angle, thedifferential cross section is obtained by dividing by the incident energy flux: energy per unitarea per unit time incident on the system.

12.1 Long Wavelength Scattering

To start with a very simple example, consider a long wavelength photon scattering from anonrelativistic particle. The particle responds to the fields in the wave according to

−mω2r = qEe−iωt (747)

where we assume that the velocities stay small so the magnetic force is negligible and thatthe amplitude of vibration is small compared with the wavelength so the particle only seesa uniform harmonically varying field. Then the induced dipole moment is p = −q2E/mω2,where we omit the e−iωt factor. This oscillating dipole emits dipole radiation according to

dP

dΩ=

Z0c2k4

32π2|r × (r × p)|2 = q4Z0c

2k4

32π2m2ω4|r × (r ×E)|2 = q4Z0

32π2m2c2|r × (r ×E)|2

The time averaged Poynting vector of the incident wave E(r, t) = ε0Eeik0·r−iωt, where we

normalize ε0 · ε∗0, is

〈S〉 =E2

2Re ε0 ×

(

k0

µ0ω× ε∗0

)

= k0E2

2µ0cRe ε0 · ε∗0 = k0

E2

2Z0

(748)

Dividing dP/dΩ by E2/2Z0 gives the differential cross section

dΩ=

q4Z20

16π2m2c2|r × (r × ε0)|2 =

(

q2

4πǫ0~c

)2(~

mc

)2

|r × (r × ε0)|2 (749)

If we observe a particular polarization ǫ in the scattered wave we replace r × (r × ε0) →ε∗ · r × (r × ε0) = −ε∗ · ε0 inside the absolute values:

dΩ=

(

q2

4πǫ0~c

)2(~

mc

)2

|ε∗ · ε0|2 =q4

e4m2

e

m2

(

α~

mec

)2

|ε∗ · ε0|2 (750)

147 c©2010 by Charles Thorn

Page 64: 8 Lorentz Invariance and Special Relativity

which is just the Thomson cross section. Here α = e2/4πǫ0~c ≈ 1/137 is the fine structureconstant. The length rc = α~/mec is actually independent of ~ and is sometimes called theclassical radius of the electron. The Bohr radius is the length a0 = ~/mecα ≈ .5× 10−8cm.we see that rc is α

2 = O(10−4) smaller than the Bohr radius. If polarization is not measuredwe use

λ

εiλεj∗λ = δij − kikj (751)

to obtain

dΩ=

q4

e4

(

α~

mc

)2

(1− |k · ε0|2) (752)

If in addition the beam is unpolarized we find

dΩ=

q4

e4

(

α~

mc

)21 + cos2 θ

2(753)

This simple long wavelength calculation of the cross section can be extended to any targetfor which the dipole moment induced by a uniform (since ka ≪ 1) field is known. In ourelectrostatic study of a dielectric sphere of radius a we found the induced electric dipolemoment

p = 4πǫ0a3 ǫ− ǫ0ǫ+ 2ǫ0

E (754)

Since an em wave contains a magnetic field as well as an electric field, a magnetic momentis also induced.

m = 4πa3µ− µ0

µ+ 2µ0

H (755)

The limits ǫ→ ∞ and µ→ 0 give the induced electric and magnetic moments of a perfectlyconducting sphere:

p = 4πǫ0a3E, m = −2πa3H , Perfectly Conducting Sphere (756)

In the radiation zone expressions for the fields p and m come together in the combination

i

ω

d3rJe−ik·r ∼ (p− 1

ck ×m+ · · · )

→ 4πa3ǫ0

(

E0 +1

2cǫ0k ×H0

)

= 4πa3ǫ0

(

E0 +1

2k × (k0 ×E0)

)

We can obtain the cross section for a perfectly conducting sphere from that of a particlethrough the substitutions q2/mω2 → 4πǫ0a

3 and ε0 → ε0 + k × (k0 × ε0)/2:

dΩ≈ a6k4|ε∗ · k × (k × (ε0 +

1

2k × (k0 × ε0)))|2

148 c©2010 by Charles Thorn

Page 65: 8 Lorentz Invariance and Special Relativity

≈ a6k4|ε∗ · (ε0 +1

2k × (k0 × ε0))|2 = a6k4

ε∗ · ε0 −1

2(k × ε∗) · (k0 × ε0)

2

≈ a6k4∣

ε∗ · ε0(

1− 1

2cos θ

)

+1

2k0 · ε∗k · ε0

2

(757)

The angular dependence of this expression is quite involved. But it is easy to see thatscattering is maximal in the backward direction: at θ = π the cross section is nine times itsvalue at θ = 0. This asymmetry is due to the interference between the electric and magneticdipole contributions to the amplitude. The k4 = ω4/c2 frequency dependence shows thatred light is much less scattered than blue light and is the essential reason the sky is blueand sunsets are red. This same frequency dependence is a typical feature of long wavelengthscattering of bound systems of finite size: the size a is needed for dimensional reasons.

Scattering off a collection of objects located at various positions rk can be easily handledin this approximation. The total fields are obtained by adding the contributions from eachobject. The incident field at kth location will be a common amplitude times the phasefactor eik0·rk . Also the field in the radiation zone from the kth object has an additionalphase factor e−ik·rk . If all the scatterers are otherwise identical and multiple scattering isneglected (valid for a dilute enough gas) the cross section from many scatterers is just thatof a single scatterer times the factor

F (q) =

k

eiq·rk

2

(758)

where q = k0 − k. There are two opposite extremes in the evaluation of F . First, if thescatterer locations are random, the cross terms in the absolute square have random phasesand so average to zero. Then F = N the total number of particles. The other extreme isif the locations are regularly spaced (as in a crystal). Suppose for example that they areequally spaced on a line with spacing L. Then we have

N−1∑

k=0

eiq·Lk

2

=

1− eiNq·L

1− eiq·L

2

=sin2(Nq ·L/2)sin2(q ·L/2) (759)

which for large N is strongly peaked in the forward direction (q = 0), where it is of orderO(N2).

12.2 General Formulation of Scattering

Suppose we were able to solve the full set of Maxwell equations and particle equations ofmotion exactly. How would we use the solution to calculate the differential cross section? Theanswer is that we would find the solution which satisfies the boundary conditions appropriateto the scattering process. These boundary conditions are that at large r well outside thetarget each field be a linear combination of a plane wave, representing the incident beamand outgoing radial waves of the form eikr/r. We lose no information by taking the incident

149 c©2010 by Charles Thorn

Page 66: 8 Lorentz Invariance and Special Relativity

wave in the positive z-direction. Then, assuming all fields have time dependence e−iωt, thefields should have the asymptotic behavior

E = E0eikz + F

eikr

r(760)

H =1

Z0

(z ×E0)eikz +

eikr

r

k × F

Z0

(761)

The square of the coefficient of eikr/r enters the radiated power while the square of thecoefficient of eikz enters the incident flux:

dP

dΩ=

1

2Z0

F ∗ · F , Flux =1

2Z0

E∗ ·E (762)

dΩ=

F ∗ · FE∗ ·E (763)

It is then convenient to define the scattering amplitude f as the F associated with thenormalized incident electric field E0 = ε0, ε

∗0 · ε0 = 1:

E ∼ ε0eikz +

eikr

rf ,

dΩ= |ε∗ · f |2 (764)

12.3 The Born Approximation

When the interaction of the incident beam with the target is sufficiently weak, a perturbativecalculation can be attempted. The Born approximation is the lowest order approximationto the scattering amplitude in such an approach. Our discussion will be restricted to targetscharacterized by dielectric and magnetic properties, D 6= ǫ0E and B 6= µ0H over a localizedregion. We also assume that all fields have the same harmonic dependence e−iωt. Then wecan rearrange Maxwell’s equations,

−iωD = ∇×H = ∇× B

µ0

+∇×(

H − B

µ0

)

= ∇× ∇×E

iωµ0

+∇×(

H − B

µ0

)

= ∇× ∇×D

iωǫ0µ0

+∇× ∇× (ǫ0E −D)

iωǫ0µ0

+∇×(

H − B

µ0

)

(−∇2 − k2)D = −∇× (∇× (ǫ0E −D))− iωǫ0∇× (µ0H −B) (765)

where k = ωǫ0µ0 = ω/c. We can then use the Green function for the Helmholtz equation towrite an integral equation with scattering boundary conditions:

D = D0eikz −

d3x′eik|r−r′|

4π|r − r′| [∇× (∇× (ǫ0E −D)) + iωǫ0∇× (µ0H −B)](766)

∼ D0eikz − eikr

r

d3x′

4πe−ik·r′

[∇× (∇× (ǫ0E −D)) + iωǫ0∇× (µ0H +B)]

∼ D0eikz +

eikr

r

d3x′

4πe−ik·r′

[k × (k × (ǫ0E −D)) + ωǫ0k × (µ0H −B)] (767)

150 c©2010 by Charles Thorn

Page 67: 8 Lorentz Invariance and Special Relativity

with a similar equation for B. If D0 is normalized to ε0 the coefficient of eikr/r is thescattering amplitude f .

f =

d3x′

4πe−ik·r′

[k × k × (ǫ0E −D) + ωǫ0k × (µ0H −B)] (768)

The Born approximation is to write

D − ǫ0E = (ǫ− ǫ0)E ≈ (ǫ− ǫ0)E0eikz = (ǫr − 1)ε0e

ikz

andB − µ0H = (µ− µ0)H ≈ (µ− µ0)H0e

ikz = (µ− µ0)z ×ε0

Z0ǫ0eikz

where E0 and H0 are the fields of the incident plane wave.

fBorn =

d3x′

4πeikz

′−ik·r′

[−k × (k × (ǫr − 1)ε0)− kk × ((µr − 1)z × ε0)]

ε∗ · fBorn =k2

d3x′e−iq·r′

[

(ǫr − 1)ε∗ · ε0 + (µr − 1)(k × ε∗) · (z × ε0)]

(769)

where q = k−kz, which is analogous to the momentum transfer in quantum mechanics. Thedepartures of ǫ, µ from their vacuum values plays the role of the potential in the quantummechanical born approximation.

12.4 Scattering from a Perfectly Conducting Sphere

If the target of the scattering process is a fixed perfectly conducting sphere of radius R, theinteraction of the target with the em field is completely subsumed in the boundary conditionsE‖(r = R) = 0 and r ·H = 0. Expanding the fields in vector spherical harmonics

H =∑

lm

[

flm(kr)X lm − i

k∇× (glm(kr)X lm)

]

(770)

E = Z0

lm

[

i

k∇× flm(kr)X lm + glm(kr)X lm)

]

(771)

we see that these boundary conditions amount to

glm(kR) = 0,∂

∂r(rflm(kr))

r=R= 0 (772)

To apply these conditions we need to expand the incident plane wave in vector sphericalharmonics. Since the plane wave is finite everywhere both spherical Bessel functions mustbe jl(kr):

E0 = ε0eikz = Z0

lm

[

i

kaE0lm∇× jl(kr)X lm + aM0lmjl(kr)X lm)

]

(773)

H0 =z × ε0

Z0

eikz =∑

lm

[

aE0lmjl(kr)X lm − i

kaM0lm∇× (jl(kr)X lm)

]

(774)

151 c©2010 by Charles Thorn

Page 68: 8 Lorentz Invariance and Special Relativity

Then, dotting both fields into r, we find

Z0jl(kr)aMlm =

k√

l(l + 1)

dΩY ∗lmr · (z × ε0)e

ikz (775)

Z0jl(kr)aElm =

−k√

l(l + 1)

dΩY ∗lmr · ε0eikz (776)

the easiest way to do these integrals is to identify the spherical harmonic expansion of eikz,which we can get by taking the r → ∞ and θ → 0 limit of

eik|r−r′|

4π|r − r′| = ik

∞∑

l=0

h(1)l (kr>)jl(kr<)

l∑

m=−l

Ylm(θ, ϕ)Y∗lm(θ

′, ϕ′)

e−ik·r′

4π= i

∞∑

l=0

(−i)l+1jl(kr′)Yl0(0, ϕ)Y

∗l0(θ

′, ϕ′))

eik·r′

=∞∑

l=0

il√

4π(2l + 1)jl(kr′)Yl0(θ

′, ϕ′)

eikz = eikr cos θ =∞∑

l=0

il√

4π(2l + 1)jl(kr)Yl0(θ, ϕ)

r sin θe±iϕeikz = ie±iϕ

k

∂θeikr cos θ =

i

k

∞∑

l=0

il√

4π(2l + 1)jl(kr)e±iϕ ∂

∂θYl0(θ, ϕ)

= ± i

k

∞∑

l=0

il√

4π(2l + 1)jl(kr)√

l(l + 1)Yl±1 (777)

Here we used the fact that L±Yl0 = ±e±iϕ∂Yl0/∂θ which is true because Yl0 is independentof ϕ. Next we write, with ǫ0± = (ǫ0x ± iǫ0y)

r · ε0 = r sin θ(ǫ0x cosϕ+ ǫ0y sinϕ) =1

2r sin θ(ǫ0+e

−iϕ + ǫ0−eiϕ) (778)

r · (z × ε0) = r sin θ(ǫ0x sinϕ− ǫ0y cosϕ) =i

2r sin θ(ǫ0+e

−iϕ − ǫ0−eiϕ) (779)

Then

Z0aM0lm =

1

2il√

4π(2l + 1) (ǫ0+δm,−1 + ǫ0−δm,1) (780)

Z0aE0lm =

i

2il√

4π(2l + 1) (ǫ0+δm,−1 − ǫ0−δm,1) (781)

H0 =1

2Z0

lm

il√

4π(2l + 1)[

jl(kr)i(ǫ0+X l−1 − ǫ0−X l1)

152 c©2010 by Charles Thorn

Page 69: 8 Lorentz Invariance and Special Relativity

− i

k∇× (jl(kr)(ǫ0+X l−1 + ǫ0−X l1)

]

(782)

E0 =1

2

lm

il√

4π(2l + 1)[ i

k∇× jl(kr)i(ǫ0+X l−1 − ǫ0−X l1)

+jl(kr)(ǫ0+X l−1 + ǫ0−X l1)]

(783)

To get the total fields we add the scattered wave which has only outgoing waves, and soinvolves the first spherical Hankel functions:

HS =∑

lm

[

aESlmh(1)l (kr)X lm − i

kaMSlm∇× (h

(1)l (kr)X lm)

]

(784)

ES = Z0

lm

[

i

kaESlm∇× h

(1)l (kr)X lm + aMSlmh

(1)l (kr)X lm)

]

(785)

We read off the scattering amplitude as the coefficient of eikr/r in E:

f =Z0

k

lm

(−i)l+1[

aMSlmX lm − aESlmk ×X lm

]

(786)

Then we have

flm(kr) = aE0lmjl(kr) + aESlmh(1)l (kr) ≡ aE0lm

[

jl(kr) +βl2h(1)l (kr)

]

glm(kr) = aM0lmjl(kr) + aMSlmh(1)l (kr) ≡ aM0lm

[

jl(kr) +αl

2h(1)l (kr)

]

(787)

and the boundary conditions then read

aM0lmjl(kR) + aMSlmh(1)l (kR) = 0,

aMSlmaM0lm

= − jl(kR)

h(1)(kR)≡ αl

2(788)

d

drr(aE0lmjl(kr) + aESlmh

(1)l (kr))

r=R= 0,

aESlmaE0lm

= − (rjl)′(kR)

(rh(1)l )′(kR)

≡ βl2

(789)

The scattering amplitude is then

f =Z0

2k

lm

(−i)l+1[

αlaM0lmX lm − βla

E0lmk ×X lm

]

= −i√π

2k

∞∑

l=1

√2l + 1

[

αl (ǫ0+X l−1 + ǫ0−X l1)− iβlk × (ǫ0+X l−1 − ǫ0−X l1)]

(790)

αl = − 2jl(kR)

h(1)l (kR)

= −1− h(2)l (kR)

h(1)l (kR)

≡ e2iδl − 1 (791)

βl = − 2(kRjl(kR))′

(kRh(1)l (kR))′

= −1− (kRh(2)l (kR))′

(kRh(1)l (kR))′

≡ e2iδ′

l − 1 (792)

153 c©2010 by Charles Thorn

Page 70: 8 Lorentz Invariance and Special Relativity

where δl, δ′l are called the scattering phase shifts. The formula for f gives the exact scattering

amplitude for a perfectly conducting sphere, albeit as an infinite sum. Notice that the sumover l starts at l = 1 because the m values of the vector spherical harmonics are m = ±1.

As a check on the formula, let’s take the long wavelength limit kR ≪ 1 to compare withour earlier calculation. For this we need the small argument limits of the the spherical Besselfunctions:

jl(x) ∼ xl

(2l + 1)!!, (xjl(x))

′ ∼ (l + 1)xl

(2l + 1)!!(793)

h(1)l (x) ∼ −i(2l − 1)!!

xl+1, (xh

(1)l (x))′ ∼ il

(2l − 1)!!

xl+1(794)

αl ∼ −2i(kR)2l+1

(2l + 1)!!(2l − 1)!!, βl ∼ 2i

l + 1

l

(kR)2l+1

(2l + 1)!!(2l − 1)!!(795)

Thus the l = 1 terms dominate the long wavelength limit, α1 ∼ −2i(kR)3/3 and β1 ∼4i(kR)3/3.

f = −√

π

3k2R3

[

(ǫ0+X1−1 + ǫ0−X11) + (2i)k × (ǫ0+X1−1 − ǫ0−X11)]

(796)

The vector spherical harmonics for l = 1 can be evaluated

Xz1m =

m√2Y1m, Xx

1m ± iXy1m =

2−m(m± 1)√2

Y1m±1 (797)

With a little algebra one can work out the relations

ǫ0+X1−1 − ǫ0−X11 = −i√

3

4πk × ε0 (798)

ǫ0+X1−1 + ǫ0−X11 = −√

3

4πk × (k0 × ε0) (799)

f = −k2R3

2

[

−k × (k0 × ε0) + 2k × (k × ε0)]

ε∗ · f = k2R3

[

−1

2(k × ε∗) · (k0 × ε0) + ε∗ · ε0

]

(800)

which precisely agrees with our direct long wavelength calculation. The first term is themagnetic and the second the electric dipole contributions.

12.5 Short wavelength approximation and diffraction

When the wavelength of the radiation is much shorter than the system size, we can bringin a different simplification. It is based on the idea that in the short wavelength limitgeometrical optics becomes valid. In quantum mechanics this idea is expressed in the WKB

154 c©2010 by Charles Thorn

Page 71: 8 Lorentz Invariance and Special Relativity

approximation in which the wave mechanics goes over to particle mechanics and interferencephenomena become negligible. For scattering of light it becomes a good approximation tothink of the beam as a bundle of rays that follow definite trajectories. Classic diffractiontheory has to do with the first corrections to the geometrical optics approximation in whichsome interference is taken into account.

We begin our discussion with diffraction. Imagine a screen with several apertures occu-pying the xy-plane. Light is incident on the screen from the region z < 0 and is observed atlarge z > 0. In zeroth approximation ka → ∞ the light rays are either completely reflectedor pass through the apertures undeflected casting sharp shadow images of the apertures ona distant screen. If ka is large but finite the wave nature of light comes into play. Thecomponents of the fields satisfy the Helmholtz equation. An approximate way to thinkabout this is to imagine that the fields in the region z > 0 are sourced by the fields in theapertures, which can be approximated by the fields of the incident wave in the apertures.To implement this idea introduce a Green function G(r, r′) for the Helmholtz equation,(−∇2 − k2)G = δ(r − r′) and write out Green’s Theorem in the half-space z > 0 for a fieldcomponent solving (−∇2 − k2)F = 0

F (r) = −∮

S

dS ′ (F (r′)n · ∇′G(r′, r)−Gn · ∇′F ) (801)

Here we take the surface to be the xy-plane closed by a hemisphere of infinite radius in theupper half space. Since G and F both will have large r behavior eikr/r the leading behaviorof the normal derivative on the hemisphere of either F or G is simply a factor of ik timesitself, so this leading 1/r2 term cancels between the two terms in the integrand, which thenhas large r behavior 1/r3. Thus the contribution of the hemisphere to the right side vanishes,leaving only the contribution of the xy-plane. On the xy-plane n = −z so we obtain

F (r) =

dx′dy′(

F∂G

∂z′−G

∂F

∂z′

)

z′=0(802)

this exact formula can then be used to implement plausible approximations. Intuitively wewould like the only contribution to come from the apertures, where the values of F, ∂z′F areapproximated by the their values in the incident field.

A self-consistent way to implement this approximation is to choose G to satisfy eitherNeumann of Dirichlet boundary conditions on the xy plane. Then the above formula can bewritten in two ways

F (r) = −∫

dx′dy′GN∂F

∂z′

z′=0, F (r) =

dx′dy′F∂GD

∂z′

z′=0(803)

GND

=eik|r−r′|

4π|r − r′| ±eik|r−r′

P |

4π|r − r′P |

(804)

GN → eik|r−r′|

2π|r − r′| , ∂z′GD → −(

ikz − z

|r − r′|

)

eik|r−r′|

2π|r − r′|2 (805)

155 c©2010 by Charles Thorn

Page 72: 8 Lorentz Invariance and Special Relativity

where rP = (x, y,−z). The limiting forms in the last line are for z′ = 0. Then the approxi-mation for diffraction is

F (r) ≈ −∫

apertures

dx′dy′eik|r−r′|

2π|r − r′|∂F0

∂z′

z′=0(806)

F (r) ≈ −∫

apertures

dx′dy′F0

(

ikz − z

|r − r′|

)

eik|r−r′|

2π|r − r′|2∣

z′=0(807)

where F0 is the corresponding field in the incident beam. An earlier less consistent versionof this approximation chooses G = G0 = eik|r−r′|/4π|r − r′|, the empty space version andrestricts the original expression to the apertures.

F (r) =

apertures

dx′dy′(

F0∂G0

∂z′−G0

∂F0

∂z′

)

(808)

This last form is less consistent because it implicitly assumes both F and ∂z′F vanish onthe screen off the apertures, which is impossible unless F = 0 everywhere. In the first twoversions one requires F = 0 or ∂z′F = 0 but not both. However, in the regime where theapproximation is valid, namely in the far forward region, all three forms agree.

What we have described so far treats all field components independently, whereas weknow that Maxwell’s equations imply relations among the field components, only two ofwhich are independent. A proper vectorial formulation is complicated except for the caseof a perfectly conducting screen with apertures. In that case a remarkably simple relationemerges due to Smythe. The derivation, which is subtle and involved, is sketched in Jackson.The upshot is the exact formula for the diffracted electric field in the region z > 0

Ediff = ∇×∫

dx′dy′z ×Eeik|r−r′|

2π|r − r′| = ∇×∫

apertures

dx′dy′ z ×Eeik|r−r′|

2π|r − r′| (809)

by perfect conductor boundary conditions. The approximate form then just inserts theincident field in the integral

Ediff ≈ ∇×∫

apertures

dx′dy′ z ×E0eik|r−r′|

2π|r − r′| (810)

∼ ik × eikr

2πr

apertures

dx′dy′ z ×E0 e−ik·r′

(811)

You will study an application of this formula in one of the new homework problems.

12.6 Short Wavelength Scattering

In order to apply geometrical optics concepts to a general scattering process including po-larization dependence, we need a vectorial version of Green’s theorem. A convenient form isgiven by

E(r) = −∮

S

dS ′ [iω(n′ ×B)G+ (n′ ×E)×∇′G+ (n′ ·E)∇′G] (812)

156 c©2010 by Charles Thorn

Page 73: 8 Lorentz Invariance and Special Relativity

To prove this one writes the right side as a volume integral of the expression with n′ → ∇′

and reduces that expression using the sourceless Maxwell equations:

iω(∇′ ×B)G+ (∇′ ×E)×∇′G+ (∇′ ·E)∇′G

= ∇′ × (G∇′ ×E) +∇′i(E∇′

iG)−∇′(E · ∇′G) +∇′i(Ei∇′G)

= ∇′i(G∇′Ei)−∇′

i(G∇′iE) +∇′

i(E∇′iG)−∇′(E · ∇′G) +∇′

i(Ei∇′G)

= G∇′2E +E∇′2G = −Eδ(r − r′) (813)

Integrating both sides over the region bounded by S completes the proof.We next choose S to bound the region between a surface S1 that encloses the target system

and is just outside it and a large spherical surface at infinity (for which n′ = r). Since (??)applies for any solution of the sourceless Maxwell equations, it applies separately to the totalfield E = E0 +Es and the scattered field Es when we discuss scattering. For the scatteredfield we now argue that the contribution of the spherical surface at infinity vanishes. Sinceboth Es and G have large r behavior eikr/r, the 1/r2 behavior of the integrand at infinitycomes from replacing ∇′ → ikn′:

iω(n′ ×Bs)G+ (n′ ×Es)×∇′G+ (n′ ·Es)∇′G → G(iω(n′ ×Bs) + ikEs)

→ O(r−3) (814)

where we used

iω(n′ ×Bs) = n′ × (∇′ ×Es) → ikn′ × (n′ ×Es) = ikn′n′ ·Es − ikEs → −ikEs +O(r−3)

since the the scattered electric field at large r is perpendicular to the radial direction. Thuswe have

Es(r) = −∮

S1

dS ′ [iω(n′ ×Bs)G+ (n′ ×Es)×∇′G+ (n′ ·Es)∇′G] (815)

Since the scattered field Es has asymptotic large r behavior feikr/r, we can read off thescattering amplitude in terms of the surface integral over S1 by taking the large r limit ofthe Green function

G ∼ eikr

4πre−ik·r′

, ∇′G ∼ −ik eikr

4πre−ik·r′

(816)

f =−i4π

S1

dS ′ [ω(n′ ×Bs)− (n′ ×Es)× k − k(n′ ·Es)] e−ik·r′

(817)

Since k·f = 0, the last term in square brackets should after integration cancel the componentfrom the first term parallel to k. However this issue can be finessed by specifying a definitefinal polarization, which automatically removes all terms parallel to k:

ε∗ · f =−i4π

S1

dS ′ε∗ · [ω(n′ ×Bs) + k × (n′ ×Es)] e−ik·r′

(818)

157 c©2010 by Charles Thorn

Page 74: 8 Lorentz Invariance and Special Relativity

This formula is exact, valid for all wavelengths. Turning now to the problem of short wave-length scattering, we consider an opaque target system that is large in all respects comparedto the wavelength. Then first considering the geometrical optics description, we can saythat the incident light illuminates the front of the object, leaving the back in total darkness.In the geometrical optics limit the boundary between the illuminated and shadow regionsis sharp. In applying the above formula we can divide the surface S1 into dark and lightregions. In the dark region it is reasonable to approximate Es ≈ −E0.

ε∗ · f shadow ≈ i

shadow

dS ′ε∗ · [(n′ × (k0 × ε0) + k × (n′ × ε0)] ei(k0−k)·r′

≈ i

shadow

dS ′ε∗ · [k0ε0 · n′ − ε0(k0 + k) · n′ + n′k · ε0] ei(k0−k)·r′

Since kr, k0r ≫ 1 the integrand oscillates rapidly and is suppressed unless k ≈ k0. Thus wecan set k = k0 in the prefactors and the amplitude simplifies to

ε∗ · f shadow ≈ −ε∗ · ε0i

shadow

dS ′k0 · n′ei(k0−k)·r′

(819)

Let d2x⊥ be an element of area perpendicular to k0. Then dS′k0·n′ = −kd2x (the sign coming

because n′ was the inward normal to S1). Furthermore r′ ·(k0−k) = z′k(1−cos θ)−k⊥ ·r⊥ ≈−k⊥ · r⊥+O(θ2). That is the integral is over the region projected by the shadow surface onthe plane perpendicular to k0:

ε∗ · f shadow ≈ ε∗ · ε0ik

shadow

d2x′e−ik⊥·r′

⊥ (820)

If the projected area is a disk of radius R, the integral is

shadow

d2x′e−ik⊥·r′

⊥ =

∫ R

0

ρdρ

∫ 2π

0

dϕe−ikρ sin θ cosϕ (821)

= 2π

∫ R

0

ρdρJ0(kρ sin θ) = 2πR2J1(kR sin θ)

kR sin θ(822)

ε∗ · f shadow ≈ ikR2ε∗ · ε0J1(kR sin θ)

kR sin θ(823)

The homework problem J, 10.16 shows that the part of the total cross section due to shadowscattering is the projected area, regardless of the shape of scatterer. In the case of the discthat is πR2. The part of the scattering amplitude coming from the illuminated part of S1

depends in detail on the nature of the surface.

12.7 The Optical Theorem

Conservation of energy in classical physics (and of probability in quantum physics) lead toan interesting useful connection between the total scattering cross section and the forward

158 c©2010 by Charles Thorn

Page 75: 8 Lorentz Invariance and Special Relativity

elastic scattering amplitude. We discuss this optical theorem in the context of classicalelectromagnetism. We assume harmonic time variation and take the time averaged Poyntingvector to measure the energy flow.

〈S〉 =1

2Re E ×H∗ (824)

The total cross section is total power removed from the incident beam divided by the incidentflux of energy. Power can be removed from the beam in two distinct ways: (1) the targetcan absorb energy and (2) the target can scatter energy out to infinity. to get expressionsfor these power losses we surround the target by a surface S1. Let us define the outwardnormal at each point on the surface to be n. We write the total fields as a sum of incidentand scattered fields

E = E0 +Es, H = H0 +Hs (825)

Then the total power scattered out to infinity is carried by the scattered fields:

Ps =1

2Re

S1

dSn ·Es ×H∗s (826)

The total power absorbed by the target is determined by the total fields

Pa = −1

2Re

S1

dSn ·E ×H∗ (827)

And of course the incident beam is not affected by the target so it carries zero net powerinto the region bounded by S1

1

2Re

S1

dSn ·E0 ×H∗0 = 0 (828)

The total power removed from the beam is then

P =1

2Re

S1

dSn · (Es ×H∗s −E ×H∗)

=1

2Re

S1

dSn · (−E0 ×H∗0 −E0 ×H∗

s −Es ×H∗0)

= −1

2Re

S1

dSn · (E∗0 ×Hs +Es ×H∗

0) (829)

The final step is to parameterize the incident beam as a plane wave, which we normalize tounit field strength E0 = 1:

E0 = ε0eik0·r, H0 = Z−1

0 k0 × ε0eik0·r (830)

159 c©2010 by Charles Thorn

Page 76: 8 Lorentz Invariance and Special Relativity

With this normalization convention, our expressions for the “power” P have the units ofwatts/(volts/m)2 or m2/Ω. Also the incident “flux” is just 1/2Z0! Plugging these expressionsinto the total power gives

σtotal ≡ P

1/2Z0

= −Re

S1

dSn · (ε∗0 × cµ0Hs +Es × (k0 × ε∗0))e−ik0·r

= −Re

S1

dSn · (ε∗0 × cBs +Es × (k0 × ε∗0))e−ik0·r

= Re

S1

dSε∗0 · (n× cBs + k0 × (n×Es))e−ik0·r (831)

Finally, recall our formula for the scattering amplitude

ε∗ · f =−i4π

S1

dS ′ε∗ · [ω(n′ ×Bs) + k × (n′ ×Es)] e−ik·r′

(832)

where we remember that the normal in this formula is inwardly directed w.r.t. S1. Changingthe sign and evaluating this formula in the forward elastic direction (k = k0 and ε = ε0) wefind

ε∗0 · f |k=k0=

ik

S1

dSε∗0 ·[

c(n×Bs) + k0 × (n×Es)]

e−ik0·r′

Im ε∗0 · f |k=k0=

k

4πRe

S1

dSε∗0 ·[

c(n×Bs) + k0 × (n×Es)]

e−ik0·r′

(833)

The optical theorem immediately follows

σtotal =4π

kIm ε∗0 · f |k=k0

(834)

Notice that the right side is the imaginary part of the completely forward amplitude: theinitial and final states being exactly the same. It is also important to appreciate that theright side involves the exact scattering amplitude. For example the Born approximation (firstorder perturbation theory) is real, with the imaginary part appearing only in corrections tothe approximation. In this regard, notice that a direct calculation of the left side involvesthe square of scattering amplitudes, whereas the right side is linear. Thus if f = O(g) theleft side of the optical theorem is O(g2) and it follows that the O(g) contribution to the rightside vanish, implying that the O(g) contribution to the forward scattering amplitude mustbe real.

160 c©2010 by Charles Thorn

Page 77: 8 Lorentz Invariance and Special Relativity

13 Energy Loss and Cherenkov Radiation

A charged particle moving through matter will lose energy through its interaction with theatoms in the material. This is a very broad subject, covered in some detail by Chapter 13of Jackson. Here, of all the different aspects of the subject, I will single out just one: theradiation by a particle passing through matter at a speed v that exceeds the phase velocityof light in the material v > c/n where n > 1 is the index of refraction. This is Cherenkovradiation, named after the physicist who discovered it. I will follow the treatment of Landauand Lifshitz, Electrodynamics of Continuous Matter, Chapter 12.

To set the stage, we first discuss the energy loss from a particle with any velocity, so wecan appreciate what is special about v > c/n. We need to solve Maxwell’s equations withthe free charge and current densities of a point particle moving at velocity v:

ρ(r, t) = Qδ(r − vt) =

d3k

(2π)3(Qe−ik·vt)eik·r

J(r, t) = Qvδ(r − vt) =

d3k

(2π)3(Qve−ik·vt)eik·r. (835)

We see that the Fourier transforms of the sources have harmonic time dependence e−ik·vt!This means that if we Fourier transform the Maxwell equations the fields in k space will alsohave this time dependence:

H =

d3k

(2π)3Hke

−ik·vteik·r (836)

E =

d3k

(2π)3Eke

−ik·vteik·r. (837)

We will assume that µ = µ0 and ǫ(ω) > ǫ0. Since the fields oscillate with frequency ω = k ·v,we use the dielectric constant ǫ(k · v) in the Maxwell equations for Hk,Ek:

k ·Hk = 0, ik ×Hk + ik · vǫEk = Qv (838)

ik · ǫE = Q, ik ×Ek = ik · vµ0Hk (839)

Hk =k ×Ek

k · vµ0

, Ek = −iQǫ

k − v · kµ0ǫv

k2 − (k · v)2µ0ǫ(840)

Choose coordinates so v = vz and calculate the electric field in coordinate space

E = −i∫

d2k⊥dkz(2π)3

Q

ǫ

k⊥ + zkz(1− v2µ0ǫ)

k2⊥ + k2z(1− v2µ0ǫ)

eik·(r−vt) (841)

This field exerts a force on the particle given by

F = QE(vt) = −iQ2z

d2k⊥dkz(2π)3

1

ǫ

kz(1− v2µ0ǫ)

k2⊥ + k2z(1− v2µ0ǫ)

(842)

= −iQ2z

πdk2⊥dkz(2π)3

1

ǫ

kz(1− v2µ0ǫ)

k2⊥ + k2z(1− v2µ0ǫ)

(843)

161 c©2010 by Charles Thorn

Page 78: 8 Lorentz Invariance and Special Relativity

This force should of course be 0 for motion in the vacuum ǫ = ǫ0. It will be if we interpretthe (log divergent) kz integral as the limit of the integral over the domain −K < kz < Kas K → ∞. In the material ǫ(kzv) and we can get a nonzero result. If v <

√µ0ǫ = c/n,

a nonzero integral will arise only if Imǫ 6= 0. This is because the real part is even and theimaginary part is odd.

We now see the what is special about the case v > c/n. In that case the denominatorin the integrand vanishes in the range of integration over k⊥, with the possibility of anonzero integral with real ǫ, i.e. a transparent material. To interpret the integral near thiszero we recall that ǫ will always have a small positive (negative) imaginary part for ω > 0(ω < 0). In the first case the denominator has a small negative imaginary part, so wetake the contribution of a small semi-circle contour into the lower half plane, which yields acontribution to F of

−iQ2z(πi)πdkz(2π)3

1

ǫkz(1− v2µ0ǫ) = −Q

2

8πzdkz

|kz|ǫ

(v2µ0ǫ− 1) (844)

the contribution for negative kz doubles this to give

dF = −Q2

4πzdkz

|kz|ǫ

(v2µ0ǫ− 1) = − Q2

4πǫ0zωdω

c2

(

1− c2

n2v2

)

(845)

This force is opposite to the direction of v, so its magnitude gives the energy loss per unitlength due to Cherenkov radiation, which is a measure of the intensity of the radiation.

Since the entire contribution comes from the wave number for which the denominator ofthe integrand vanishes, we conclude that the radiation wave vector satisfies

0 = k2 − k2zv2n2

c2= k2

(

1− v2n2

c2cos2 θ

)

(846)

cos θ =c

vn(ω)(847)

Here θ is the angle between the Cherenkov radiation wave vector and the velocity of thecharged particle. The radiation comes out on a cone of opening angle 2θ. This featuremakes Cherenkov radiation into a tool to measure velocities of particles: measuring theopening angle of the cone of radiation and its frequency (to determine n(ω)), one can thencalculate the speed of the particle. Furthermore, from the direction of the electric field it isclear that the polarization of the radiation is in the plane determined by k and v.

162 c©2010 by Charles Thorn

Page 79: 8 Lorentz Invariance and Special Relativity

14 Radiation from a particle in relativistic motion

In most applications so far we have assumed that the systems of particles producing radiationare in nonrelativistic motion. Here we remove this restriction, at least for the case of a singlecharged point particle.

14.1 Lienard-Wiechert Potentials and Fields

Let the position of the particle at time t be r(t). Then the charge and current densities aregiven by

ρ(r, t) = qδ(r − r(t)), J(r, t) = qr(t)δ(r − r(t)) (848)

In Lorenz gauge the vector potential satisfies the inhomogeneous wave equation, which wesolve in terms of the retarded Green function

−∂2Aµ(r, t) =

(

−∇2 +1

c2∂2

∂t2

)

Aµ = µ0Jµ(r, t) (849)

Aµ(r, t) = µ0

d3x′dt′Gr(r, t; r′, t′)Jµ(r

′, t′) (850)

Gr(r, t; r′, t′) =

δ(t− t′ − |r − r′|/c)4π|r − r′| (851)

Plugging the current for a point particle into these formulas leads to the remarkably simpleLienard-Wiechert potentials

φ(r, t) =1

4πǫ0

q

(1− β(t′) · n(t′))|r − r(t′)|∣

t′=t−|r−r′|/c(852)

A(r, t) =µ0

qr(t′)

(1− β(t′) · n(t′))|r − r(t′)|∣

t′=t−|r−r′|/c= ǫ0µ0r(t

′)φ(r, t) (853)

where we have introduced β(t′) ≡ r(t′)/c and n(t′) ≡ (r − r(t′))/|r − r(t′)|. The factors(1 − β(t′) · n(t′)) come from the integral of the δ(t − t′ − |r − r(t′)|/c) over t′: recallthat δ(f(x)) =

δ(x − zi)/|f ′(zi)| where zi are the zeroes of f . The simplicity of theseformulas is somewhat deceptive, since one only knows t′(r, t) implicitly as the solution oft′ = t− |r − r(t′)|/c.

To calculate the fields from the potentials we have to take into account the dependenceof t′ on both t and r:

∂t′

∂t= 1 +

∂t′

∂t

r(t′)

c· n, ∂t′

∂t=

1

1− β(t′) · n(t′) (854)

∇t′ = −n

c+∇t′ r(t

′)

c· n, ∇t′ = −n

c

1

1− β(t′) · n(t′) (855)

163 c©2010 by Charles Thorn

Page 80: 8 Lorentz Invariance and Special Relativity

Armed with these results it suffices to calculate the derivatives of the potentials w.r.t. r, t′:

∂φ

∂t′=

q

4πǫ0

∂t′[|r − r(t′)| − β(t′) · (r − r(t′))]−1

=−q4πǫ0

(β − n) · βc− β · (r − r(t′))

[|r − r(t′)| − β(t′) · (r − r(t′)]2= −q (β − n) · βc− β · (r − r(t′))

4πǫ0|r − r(t′)|2[1− β · n]2 (856)

∇φ|t′ =−q4πǫ0

n(t′)− β(t′)

[|r − r(t′)| − β(t′) · (r − r(t′)]2=

−q4πǫ0

n(t′)− β(t′)

|r − r(t′)|2[1− β · n]2 (857)

E = −∇φ|t −∂A

∂t= −∇φ|t′ −∇t′∂φ

∂t′− ∂t′

∂t

∂A

∂t′

= −∇φ|t′ −∇t′∂φ∂t′

− 1

c2∂t′

∂t

(

rφ+ r∂φ

∂t′

)

=q

4πǫ0

[ n(t′)− β(t′)

|r − r(t′)|2[1− β · n]2 − (n− β)(β − n) · βc− β · (r − r(t′))

c|r − r(t′)|2[1− β · n]3

− β

c

1

|r − r(t′)|[1− β · n]2]

=q

4πǫ0

[

[n− β](1− β2)

|r − r(t′)|2[1− β · n]3 +1

c

n× [(n− β)× β]

|r − r(t′)|[1− β · n]3

]

(858)

B = ∇×A = ǫ0µ0

[

∇t′ ×(

rφ+ r∂φ

∂t′

)

+∇φ|t′ × r

]

=qµ0

[

− n

c×(

r

|r − r(t′)|[1− β · n]2 + r(β − n) · βc− β · (r − r(t′))

|r − r(t′)|2[1− β · n]2

)

− (n− β)× βc

|r − r(t′)|2[1− β · n]2]

=qµ0

4πn×

[

− β[1− β · n] + βn · β|r − r(t′)|[1− β · n]3 − βc(1− β2)

|r − r(t′)|2[1− β · n]3]

(859)

= µ0ǫ0cn×E =µ0

Z0

n×E = µ0H (860)

In these formulas it is understood that β and n are evaluated at the retarded time t′.Radiation can be identified by taking r in the radiation zone |r| ≫ |r(t′)| sufficiently

large so the second terms (involving the acceleration) dominate. One can always do this whenthe motion of the charged particle is bounded. Even if it is not, one can always evaluatethe instantaneous Poynting vector on a large sphere centered on r(t′) which will then be anequal time surface, since the observation time t = t′ + |r− r(t′)|/c. When the radius of thissphere R = |r − r(t′)| is large enough the “acceleration terms” of the fields dominate: ThePoynting vector is

S = E ×H =1

Z0

E × (n×E) =1

Z0

(nE2 − n ·EE) ∼ 1

Z0

nE2 (861)

164 c©2010 by Charles Thorn

Page 81: 8 Lorentz Invariance and Special Relativity

where the last form uses the fact that the acceleration term of E is perpendicular to n.Then the power per unit solid angle passing through this sphere is

R2n · S ∼ q2

16π2ǫ0c

(n× [(n− β)× β])2

(1− β · n)6 (862)

This gives the energy per unit t per solid angle emitted into a given solid angle. We mustbear in mind that the quantities β(t′)and n(t′) appearing on the right are all evaluated atthe retarded time t′. If we want to get the total power by integrating over t′, we shouldmultiply it by a factor dt/dt′ = 1− β · n to get the energy per unit t′ which is the relevantquantity to integrate over the particle’s trajectory to get the total radiated energy.

dP

dΩ(t′) =

q2

e2α~

(n× [(n− β)× β])2

(1− β · n)5 (863)

where we introduced the finite structure constant α = e2/4πǫ0~c ≈ 1/137.The virtue of this formula is that it gives the instantaneous radiated power with no

restrictions on the speed of the radiating particle. When the particle is highly relativistic,the factor (1− β ·n)5 in the denominator gets very small in the forward direction, showingthat the angular distribution is sharply peaked in the forward direction. For example if β isparallel to β, we have

dP

dΩ(t′) =

q2

e2α~

β2sin2 θ

(1− β cos θ)5(864)

where θ is the angle between n and β.Calculating the total radiated power by integrating the differential power over angles is

a challenge with general β, but in a Lorentz frame in which the particle is instantaneouslyat rest it is a piece of cake:

dP

dΩ(t′) → q2

e2α~

4π(n× [n× β])2 =

q2

e2α~

4πβ

2sin2 θ (865)

P → 2

3

q2

e2α~β

2, β ≪ 1 (866)

which gives the total radiated power for nonrelativistic motion, first calculated by Larmor.For general β Lienard obtained the result9

P =2

3

q2

e2α~γ6

[

β2 − (β × β)2

]

=2

3

q2

e2α~[

γ4β2+ γ6(β · β)2

]

(870)

9The integral is elementary but tedious:

(n× [(n− β)× β])2 = (n− β)2(n · β)2 + (1− n · β)2β2 − 2n · β(1− n · β)(n− β) · β= (β2 − 1)(n · β)2 + (1− n · β)2β2

+ 2n · β(1− n · β)(β · β) (867)

For purposes of doing the angular integrals, choose β = βz and β = |β|(z cosα + x sinα). Then after the

azimuthal integral n · β → |β| cosα cos θ and (n · β)2 → β2

(sin2 α sin2 θ + 2 cos2 α cos2 θ). Thus under the

165 c©2010 by Charles Thorn

Page 82: 8 Lorentz Invariance and Special Relativity

One can rewrite the right side in terms of the time derivative of the four momentum dpµ/dtas follows:

dp0

dt= mc

dt= γ3mcβ · β, dp

dt= γmcβ + γ3mcββ · β (871)

dpµ

dt

dpµdt

= (γmc)2[

(β + γ2ββ · β)2 − γ4(β · β)2]

= (γmc)2[

β2+ γ2(β · β)2

]

(872)

P =2

3

q2

e2α~

m2c2γ2dpµ

dt

dpµdt

=2

3

q2

e2α~

m2c2dpµ

dpµdτ

(873)

which shows that P is in fact a Lorentz invariant.

14.2 Particle Accelerators

The radiation of highly relativistic particles has serious consequences for high energy physicswhich aims to accelerate particles to high momentum to probe the short distance structureof elementary particles as well as to produce new ones. First lets consider linear accelerators,so v is parallel to v. In this case

(

dp0

dt

)2

= β2

(

dp

dt

)2

,dpµ

dpµdτ

=1

γ2

(

dp

)2

=

(

dp

dt

)2

=

(

dE

dx

)2

(874)

P =2

3

q2

e2α~

m2c2γ2dpµ

dt

dpµdt

=2

3

q2

e2α~

m2c2

(

dE

dx

)2

(875)

The power required to increase the energy of the particle is dE/dt so in comparison to thisthe radiated power,

P

dE/dt=

2

3

q2

e2α~

m2c2v

dE

dx=

2

3

q2

e2α2a0mcv

dE

dx= O

(

10−14m

1MeV

)

dE

dx(876)

integral we can substitute

(n× [(n− β)× β])2 → β2

2(β2 − 1)(sin2 α sin2 θ + 2 cos2 α cos2 θ)

+(1− β cos θ)2β2

+ 2β2

β cos2 α cos θ(1− β cos θ)

→ β2

(β2 − 1) cos2 θ +β2

2sin2 α(1− 3 cos2 θ)(β2 − 1)

+[1− β2 cos2 θ]β2 − 2β

2

β sin2 α cos θ(1− β cos θ)

→ β2

(1− cos2 θ) +β2

sin2 α

2

[

(1− 3 cos2 θ)(β2 − 1)− 4β cos θ(1− β cos θ)]

→ β2

(1− cos2 θ) +β2

sin2 α

2

[

(3 + β2) cos2 θ + β2 − 1− 4β cos θ]

(868)

Then

P =q2α~

2e2β2

1

−1

dz(1− z2) + sin2 α

[

(3 + β2)z2 + β2 − 1− 4βz]

/2

(1− βz)5=

2q2α~

3e2β2

(1− β2 sin2 α)(869)

166 c©2010 by Charles Thorn

Page 83: 8 Lorentz Invariance and Special Relativity

for electrons. Linacs typically accelerate only 100 MeV/m so that for them radiation lossesare completely negligible.

For circular accelerators (synchrotrons) the radiation coming from the changing directionof β is always present whether or not the speed of the particle is increasing. This energyloss must be continually replenished just to keep particles in the ring at fixed energy. In thiscase we have

(

dp

dt

)2

= ω2p2 (877)

P =2

3

q2

e2α~

m2c2γ2ω2p2 =

2

3

q2

e2α~γ4ω2β2 =

2

3

q2

e2α~γ4β4 c

2

R2(878)

= O

(

10−19Wm2

R2γ4)

(879)

where R is the radius of hte ring. At the design energy of the LHC (7 TeV protons) γ ≈ 7000.Whereas at LEP (60 GeV electrons) γ ≈ 120000.

14.3 Charge in uniform motion

We next begin exploring the consequences of these exact expressions for the fields, startingwith the case of uniform motion. For definiteness let’s take the charge to move up the x-axisat constant speed βc. The expressions should then reduce to those obtained by a Lorentztransformation from the Coulomb electric field in the frame in which the particle is at rest:E′ = qr′/4πǫ0r

′3,B′ = 0. To check this we need the ingredients of the formulas: β = βxand r(t) = xctβ. The retarded time is determined by c(t− t′) =

(x− βct′)2 + y2 + z2. Weget t′ by solving a quadratic equation:

ct′ =ct− βx−

(ct− βx)2 + (r2 − c2t2)(1− β2)

1− β2

=ct− βx−

(x− βct)2 + (y2 + z2)(1− β2)

1− β2

|r − r(t′)|(1− n · β) = |r − r(t′)| − β · (r − xβct′)

= ct− ct′(1− β2)− βx =√

(x− βct)2 + (y2 + z2)(1− β2)

= r′√

1− β2

|r − r(t′)|(n− β) = r − xβct′ − xβc(t− t′) = r − xβct (880)

Putting these ingredients into the formula for the electric field gives

E =q

4πǫ0

γ(r − xβct)

[γ2(x− βct)2 + y2 + z2)3/2=

q

4πǫ0

γ(r − xβct)

r′3(881)

where r′ is the distance from the charge to the observation point as measured in the charge’srest frame. We now see that Ex = E ′

x and Ey,z = γE ′y,z, precisely the desired Lorentz

167 c©2010 by Charles Thorn

Page 84: 8 Lorentz Invariance and Special Relativity

transform of the fields. From the identities

n× (n− β) = n× (−β) = β × n = β × (n− β) (882)

we see that

B =µ0

Z0

n×E =µ0

Z0

β ×E =1

cβ ×E (883)

as required by the Lorentz transformation from the frame in which the particle is at rest.Of course, since the fields fall of as 1/r2, the Poynting vector falls off as 1/r4 and so there isno radiation of energy to infinity in nonaccelerating motion.

14.4 Charge moving with constant proper acceleration

Let us now consider a charge accelerating along the x-axis with constant acceleration inthe particle’s instantaneous rest frame. This problem has interesting implications for theequivalence principle of general relativity, because an observer moving with the particle willinterpret the acceleration as a static gravitational field, and surely a particle held fixed in astatic gravitational field should not radiate. We therefore focus on the nature of the radiationthat can be produced by such a motion.

The trajectory of such a motion is given by y(t) = z(t) = 0 and

x(t) =c2

a

1 +a2t2

c2, (884)

β = xat

c

(

1 +a2t2

c2

)−1/2

, (885)

β = xa

c

(

1 +a2t2

c2

)−3/2

=β(1− β2)

t(886)

At early times t→ −∞ the particle comes in from x = +∞ at the speed of light, deceleratesto rest at t = 0, reverses direction and accelerates toward x = +∞ eventually attaining thespeed of light once again.

Retarded time for t = 0

For simplicity we first work out the fields produced by this accelerating particle at t = 0.Then the retarded time is given by the equation

ct′ = −|r − r(t′)| = −√

(x− x(t′))2 + y2 + z2

x(t′) =r2 + c4/a2

2x, ct′ = −

(r2 + c4/a2)2

4x2− c4

a2(887)

Then

E =q

4πǫ0

[

[n− β](1− β2)

|r − r(t′)|2[1− β · n]3 +1

c

nn · β − β

|r − r(t′)|[1− β · n]3

]

, B =µ0

Z0

n×E

168 c©2010 by Charles Thorn

Page 85: 8 Lorentz Invariance and Special Relativity

Now β(t′) = β(1− β2)/t′ = −cβ(1− β2)/|r − r(t′)| by the definition of the retarded time.Thus the two terms in large square brackets combine very neatly into

E =q

4πǫ0

n(1− β2)

|r − r(t′)|2[1− β · n]2 , B = 0 (888)

The vanishing of B follows immediately from the fact that E is parallel to n. At t = 0 thePoynting vector vanishes everywhere, implying that there is no flow flow of energy anywhereat that moment. Continuing the evaluation

|r − r(t′)|[1− β · n] = |r − r(t′)| − β(x− x(t′)) = −βx1− β2

|r − r(t′)|2[1− β · n]2 =c2

a2x2t′2=

4c4

a2(r2 + c4/a2)2 − 4x2c4(889)

so we find

E =q

4πǫ0

4c4n

a2(r2 + c4/a2)2 − 4x2c4=

q

4πǫ0

8|x|c4(ρ+ x(x− x(t′))

a2((r2 + c4/a2)2 − 4x2c4/a2)3/2

=q

πǫ0

c4(2|x|ρ− x(r2 − 2x2 + c4/a2)

a2((r2 + c4/a2)2 − 4x2c4/a2)3/2(890)

In these expressions, we should understand that the fields using the retarded propagator areonly non-zero in the domain x ≥ 0− ct, since we assume they are generated by a charge ona hyperbolic trajectory that keeps x > 0. For t = 0 this means that the fields are zero forx < 0.

Retarded time for t 6= 0 [Omit on a first reading)

For t 6= 0 the expressions for the fields are considerably more complicated. The retardedtime t′ is then determined as the solution of a quadratic equation

2xx(t′) = r2 − c2t2 +c4

a2+ 2c2tt′ ≡ ξ2(r, t) + 2c2tt′

0 = (x2 − c2t2)t′2 − ξ2tt′ + x2c2

a2− ξ4

4c2

t′ =ξ2t±

ξ4t2 − (x2 − c2t2)(4x2c2/a2 − ξ4/c2)

2(x2 − c2t2)

ct′ =ξ2ct± x

ξ4 − 4(x2 − c2t2)c4/a2

2(x2 − c2t2)(891)

Of the two solutions, taking into account that the fields are nonzero only when x + ct ≥ 0,the choice − gives the correct retarded time

ct′ =ξ2ct− x

ξ4 − 4(x2 − c2t2)c4/a2

2(x2 − c2t2)(892)

169 c©2010 by Charles Thorn

Page 86: 8 Lorentz Invariance and Special Relativity

|r − r(t′)| = c(t− t′) =(2(x2 − c2t2)− ξ2)ct+ x

ξ4 − 4(x2 − c2t2)c4/a2

2(x2 − c2t2)

x(t′) =ξ2x− ct

ξ4 − 4(x2 − c2t2)c4/a2

2(x2 − c2t2)(893)

β(t′) =ct′

x(t′)=ξ2ct− x

ξ4 − 4(x2 − c2t2)c4/a2

ξ2x− ct√

ξ4 − 4(x2 − c2t2)c4/a2(894)

|r − r(t′)|(1− n · β) = |r − r(t′)| − β(x− x(t′)) = ct− β(t′)x

=ct′

x(t′)=

(x2 − c2t2)√

ξ4 − 4(x2 − c2t2)c4/a2

ξ2x− ct√

ξ4 − 4(x2 − c2t2)c4/a2(895)

The last line shows the denominator of the Lienard-Wiechert potentials which then read:

φ = −1

cA0 =

qθ(ct+ x)

4πǫ0

ξ2x− ct√

ξ4 − 4(x2 − c2t2)c4/a2

(x2 − c2t2)√

ξ4 − 4(x2 − c2t2)c4/a2(896)

Ax = A1 =qθ(ct+ x)

4πǫ0c

ξ2ct− x√

ξ4 − 4(x2 − c2t2)c4/a2

(x2 − c2t2)√

ξ4 − 4(x2 − c2t2)c4/a2, Ay = Az = 0 (897)

In these formulae we have supplied factors of θ(ct + x) to remind us that by constructionthe retarded potentials are zero when ct < −x

If we repeat the calculation of the potentials for a particle trajectory following the “image”hyperbola x−(t

′) = −x(t′) we find the results

φ− = −1

cA−

0 =qθ(ct− x)

4πǫ0

−ξ2x− ct√

ξ4 − 4(x2 − c2t2)c4/a2

(x2 − c2t2)√

ξ4 − 4(x2 − c2t2)c4/a2(898)

A−x = A−

1 =qθ(ct− x)

4πǫ0c

−ξ2ct− x√

ξ4 − 4(x2 − c2t2)c4/a2

(x2 − c2t2)√

ξ4 − 4(x2 − c2t2)c4/a2, A−

y = A−z = 0 (899)

What is interesting here is that in their common domain of support, i.e. ct ≥ |x| the sum ofthe original plus image potentials is a pure gauge:

A0 + A−0 =

q

4πǫ0

2t

x2 − c2t2= −∂0

q

4πǫ0cln(c2t2 − x2) (900)

A1 + A−1 =

q

4πǫ0c

−2x

x2 − c2t2= −∂1

q

4πǫ0cln(c2t2 − x2) (901)

What this means is that in the domain ct > |x| the fields produced by the image charge areprecisely the negative of the fields produced by the original charge.

Radiation and the Equivalence Principle

For the discussion of radiation, let’s divide the xt plane into four sectors, following theapproach of D. Boulware, Annals of Physics 124 (1980) 169-188. Region I is characterized

170 c©2010 by Charles Thorn

Page 87: 8 Lorentz Invariance and Special Relativity

by x2 > c2t2 and x > 0. The hyperbolic trajectory lies entirely within region I. It is also theentire space-time accessible to an observer moving with the charged particle. Region II ischaracterized by c2t2 > x2 and t > 0. The fields produced by the accelerating particle arenonzero only in I ∪ II. Region III, the image of region I, is characterized by x2 > c2t2 andx < 0. Every point in I is space-like separated from every point in Region III. The physicsin III is completely decoupled from the physics in I. However charges in either region canproduce fields in region II. Finally, region IV is the image of II, characterized by c2t2 > x2

and t < 0. Because of retarded boundary conditions, the fields are strictly zero in region IV.It is elementary that radiation emitted in region I will enter region II (see Peierls, Surprises

in Theoretical Physics). Assume it is emitted at x0, t0, x0 > ct0 at an angle θ with the x-axis.Then at time t = t0 +R/c it will have reached x = x0 +R cos θ. Then

ct− x = ct0 +R− x0 −R cos θ = ct0 − x0 +R(1− cos θ) (902)

which will be in region II as soon as R is large enough. This argument also shows that thedevice of separating radiation fields from quasi-static fields by observations on a sufficientlylarge sphere cannot be applied strictly within region I (Pauli: no radiation zone can develop).The fact that fields in region II produced by the original trajectory in Region I, will becancelled by fields produced by the image trajectory in region III, shows that it is consistentwith Maxwell’s equations to say that no radiation leaves region I. But since it is impossibleto unambiguously identify radiation by observations in region I, it is fair to say there is noradiation at all: this interpretation is confirmed by the known absence of radiation reactionforces for hyperbolic motion. Thus there is no conflict with the equivalence principle.

On the other hand, if no fields are produced by charges in region III, radiation is definitelyproduced in Region II. As Boulware shows, the energy of this radiation comes not fromradiation reaction on the particle, but from a cross term in the energy between the continuousfield and a boundary delta function field. To calculate the radiation in region II we go to alarge enough sphere so that the acceleration term of the Lienard-Wiechert field dominates:

E ∼ q

4πǫ0

[

1

c

n× [n× β]

|r − r(t′)|[1− β · n]3

]

(903)

dP

dΩ(t′) =

q2

e2α~

β2sin2 θ

(1− β cos θ)5(904)

P (t′) =2

3

q2

e2α~γ6β

2(905)

For hyperbolic motion 1/γ2 = 1− β2 = (1 + a2t2/c2)−1 and β = (a/cγ3), so

P (t′) =2a2

3c2q2

e2α~ (906)

which is identical to the nonrelativistic Larmor result. Since it is independent of t′ the totalenergy radiated is just PT where T is the time over which acceleration occurs.

171 c©2010 by Charles Thorn