
Physics 215 Solution Set 1 Winter 2018

1. (a) Let ω_i (i = 1, 2, …, n) be the eigenvalues of the n×n matrix Ω.¹ Prove the following two results:

    \[ \det\Omega = \prod_{i=1}^{n}\omega_i\,, \qquad \operatorname{Tr}\Omega = \sum_{i=1}^{n}\omega_i\,. \]

Do not assume that Ω is a diagonalizable matrix.

The eigenvalues ω_i are the solutions to the characteristic equation,

    \[ \det(\Omega - \omega I) = 0\,, \]

where I is the n×n identity matrix. Explicitly, the characteristic equation is

    \[ \det\begin{pmatrix} \Omega_{11}-\omega & \Omega_{12} & \cdots & \Omega_{1n} \\ \Omega_{21} & \Omega_{22}-\omega & \cdots & \Omega_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \Omega_{n1} & \Omega_{n2} & \cdots & \Omega_{nn}-\omega \end{pmatrix} = 0\,. \]

The determinant is defined by the equation,

    \[ \det A = \sum_{i_1=1}^{n}\sum_{i_2=1}^{n}\cdots\sum_{i_n=1}^{n} \epsilon_{i_1 i_2\cdots i_n}\,A_{1i_1}A_{2i_2}\cdots A_{ni_n}\,, \tag{1} \]

where²

    \[ \epsilon_{i_1 i_2\cdots i_n} = \begin{cases} +1\,, & \text{when } (i_1, i_2, \ldots, i_n) \text{ is an even permutation of } (1, 2, \ldots, n), \\ -1\,, & \text{when } (i_1, i_2, \ldots, i_n) \text{ is an odd permutation of } (1, 2, \ldots, n), \\ \hphantom{+}0\,, & \text{otherwise.} \end{cases} \]

Using eq. (1), it follows that

    \[ \det(\Omega-\omega I) = (\Omega_{11}-\omega)(\Omega_{22}-\omega)\cdots(\Omega_{nn}-\omega) + X\,, \tag{2} \]

where X is the sum of products of matrix elements of Ω − ωI such that each term in the sum contains at least two off-diagonal factors. Eq. (2) is a consequence of the fact that ε_{i₁i₂⋯iₙ} = 0 unless the indices i₁, i₂, …, iₙ are all distinct. Thus, the only way to produce terms proportional to ω^n or ω^{n−1} in the characteristic equation is from the product of the diagonal elements of Ω − ωI. Hence eq. (2) yields

    \[ \det(\Omega-\omega I) = (-\omega)^n + (-\omega)^{n-1}\bigl[\Omega_{11}+\Omega_{22}+\cdots+\Omega_{nn}\bigr] + \sum_{k=0}^{n-2} c_k\,\omega^k = 0\,, \tag{3} \]

¹Note that if there are degenerate eigenvalues, then they must be repeated according to their degeneracy in the list of eigenvalues ω_i.

²That is, ε_{i₁i₂⋯iₙ} = 0 if at least two of the integers appearing in the index set (i₁, i₂, …, iₙ) are the same.

where the expressions for the coefficients c_k will not be explicitly needed here. Multiplying through by (−1)^n yields the following form for the characteristic equation:

    \[ (-1)^n\det(\Omega-\omega I) = \omega^n - \omega^{n-1}\operatorname{Tr}\Omega + (-1)^n\sum_{k=0}^{n-2} c_k\,\omega^k = 0\,. \tag{4} \]

The roots of the characteristic equation are the eigenvalues ω_i. This means that the characteristic equation can be written in the following form:

    \[ (\omega-\omega_1)(\omega-\omega_2)\cdots(\omega-\omega_n) = \omega^n - \omega^{n-1}(\omega_1+\omega_2+\cdots+\omega_n) + (-1)^n\sum_{k=0}^{n-2} c_k\,\omega^k = 0\,. \tag{5} \]

Comparing the coefficients of ω^{n−1} in eqs. (4) and (5), it immediately follows that

    \[ \operatorname{Tr}\Omega = \sum_{k=1}^{n}\omega_k\,. \tag{6} \]

Finally, if we set ω = 0 in eq. (3), it follows that c₀ = det Ω. Likewise, if we set ω = 0 in eq. (5), it follows that c₀ = ω₁ω₂⋯ωₙ. Hence, we conclude that

    \[ \det\Omega = \prod_{k=1}^{n}\omega_k\,. \tag{7} \]
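As an added numerical spot-check (not part of the original solution; it assumes NumPy is available), the two identities can be tested on a matrix that is deliberately not diagonalizable, as the proof above allows:

```python
import numpy as np

# Check det Ω = ∏ ω_i and Tr Ω = Σ ω_i on a defective (non-diagonalizable)
# matrix: a Jordan block with eigenvalues 2, 2, 5, conjugated by a random
# similarity transformation so it is no longer triangular.
J = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])
rng = np.random.default_rng(0)
S = rng.normal(size=(3, 3))
Omega = S @ J @ np.linalg.inv(S)     # same eigenvalues as J, still defective

w = np.linalg.eigvals(Omega)         # eigenvalues listed with multiplicity
assert np.isclose(np.prod(w).real, np.linalg.det(Omega))   # det Ω = ∏ ω_i = 20
assert np.isclose(np.sum(w).real, np.trace(Omega))         # Tr Ω = Σ ω_i = 9
```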

(b) Show that:

    \[ \det\exp(\Omega) = \exp(\operatorname{Tr}\Omega)\,. \]

Starting with Ω|ω_i⟩ = ω_i|ω_i⟩, it follows that³

    \[ \exp(\Omega)\,|\omega_i\rangle = e^{\omega_i}\,|\omega_i\rangle\,. \tag{8} \]

Then eq. (7) yields⁴

    \[ \det\exp(\Omega) = \prod_i e^{\omega_i}\,. \tag{9} \]

On the other hand, eqs. (6) and (9) imply that

    \[ \exp(\operatorname{Tr}\Omega) = \exp\Bigl(\sum_{i=1}^{n}\omega_i\Bigr) = \prod_i e^{\omega_i} = \det\exp(\Omega)\,, \]

which is the desired result.

³To formally prove eq. (8), we note that exp(Ω) = Σ_k Ω^k/k! is defined via its Taylor series. Using Ω^k|ω_i⟩ = ω_i^k|ω_i⟩ and resumming the resulting series then yields eq. (8).

⁴To obtain eq. (9), we are implicitly assuming that any degeneracies in the eigenvalues of Ω are in one-to-one correspondence with the degeneracies in the eigenvalues of exp(Ω). Although this assertion is easy to show if Ω is diagonalizable (in which case the number of linearly independent eigenvectors of Ω is equal to n), one has to work harder in the general case. A proof of the more general result using the Jordan canonical form can be found in Peter Lancaster and Miron Tismenetsky, The Theory of Matrices, 2nd edition (Academic Press, Orlando, FL, 1985); in particular, see Theorem 6 on p. 312 of this reference.

ADDED NOTE: A sophisticated proof that det exp(Ω) = exp(Tr Ω)

Define the function,

    \[ f(t) = \det e^{t\Omega}\,. \tag{10} \]

Then using the definition of the derivative,

    \[ \frac{df}{dt} = \lim_{\delta t\to 0}\frac{1}{\delta t}\Bigl[\det e^{(t+\delta t)\Omega} - \det e^{t\Omega}\Bigr]\,. \tag{11} \]

Noting that

    \[ \det e^{(t+\delta t)\Omega} = \det\bigl(e^{t\Omega}e^{\Omega\,\delta t}\bigr) = \det\bigl(e^{t\Omega}\bigr)\det\bigl(e^{\Omega\,\delta t}\bigr) = f(t)\det\bigl(e^{\Omega\,\delta t}\bigr)\,, \]

it then follows from eq. (11) that

    \[ \frac{df}{dt} = \lim_{\delta t\to 0}\frac{1}{\delta t}\,f(t)\Bigl[\det\bigl(e^{\Omega\,\delta t}\bigr) - 1\Bigr]\,. \tag{12} \]

We therefore require an expression for det(e^{Ω δt}) that is accurate up to first order in δt. But this is straightforward to obtain as follows,

    \[ \det\bigl(e^{\Omega\,\delta t}\bigr) \simeq \det(I+\Omega\,\delta t) = \det\begin{pmatrix} 1+\Omega_{11}\delta t & \Omega_{12}\delta t & \cdots & \Omega_{1n}\delta t \\ \Omega_{21}\delta t & 1+\Omega_{22}\delta t & \cdots & \Omega_{2n}\delta t \\ \vdots & \vdots & \ddots & \vdots \\ \Omega_{n1}\delta t & \Omega_{n2}\delta t & \cdots & 1+\Omega_{nn}\delta t \end{pmatrix} \]
    \[ = 1 + \delta t\,(\Omega_{11}+\Omega_{22}+\cdots+\Omega_{nn}) + O\bigl((\delta t)^2\bigr) \simeq 1 + \delta t\operatorname{Tr}\Omega\,, \tag{13} \]

after dropping all terms of O((δt)²). Inserting this result into eq. (12) then yields,

    \[ \lim_{\delta t\to 0}\frac{1}{\delta t}\Bigl[\det\bigl(e^{\Omega\,\delta t}\bigr) - 1\Bigr] = \operatorname{Tr}\Omega\,. \]

That is, eq. (12) reduces to a first order differential equation,

    \[ \frac{df}{dt} = f(t)\operatorname{Tr}\Omega\,. \tag{14} \]

Furthermore, by setting t = 0 in eq. (10) it follows that f(0) = det I = 1. This serves as the boundary condition for eq. (14). Integrating eq. (14) and fixing the constant of integration such that f(0) = 1 then yields

    \[ f(t) = \exp\bigl(t\operatorname{Tr}\Omega\bigr)\,. \tag{15} \]

Finally, setting t = 1 in eqs. (10) and (15), we again end up with det exp(Ω) = exp(Tr Ω).
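A quick added numerical check of the identity (assuming NumPy; exp(Ω) is computed here from its defining Taylor series, so no extra libraries are needed):

```python
import numpy as np

# Verify det exp(Ω) = exp(Tr Ω) for a random real matrix.
def mat_exp(M, terms=60):
    """Matrix exponential via the series Σ_k M^k / k!."""
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for j in range(1, terms):
        term = term @ M / j
        result = result + term
    return result

rng = np.random.default_rng(1)
Omega = rng.normal(size=(4, 4))

lhs = np.linalg.det(mat_exp(Omega))
rhs = np.exp(np.trace(Omega))
assert np.isclose(lhs, rhs)
```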


2. (a) Consider a differentiable function f(x) such that f(x_n) = 0. Assume that the x_n are isolated zeros of the function f(x). Under the assumption that df/dx_n ≠ 0, where df/dx_n ≡ (df/dx)_{x=x_n}, show that:

    \[ \delta(f(x)) = \sum_n \frac{\delta(x-x_n)}{|df/dx_n|}\,. \]

Recall that the δ-function satisfies the following two properties. First, δ(x − y) = 0 for x ≠ y. Second,

    \[ \int_{y-a}^{y+b}\delta(x-y)\,dx = 1\,, \tag{16} \]

for any choice of positive constants a and b.

Using the first property of the delta function given above, it follows that

    \[ \delta\bigl(f(x)\bigr) = 0\,, \quad \text{for } x\neq x_n\,, \]

since the delta function always vanishes when its argument is nonzero. Next, we use the fact that for x ≃ x_n, we have

    \[ f(x) \simeq A_n(x-x_n) + O\bigl((x-x_n)^2\bigr)\,, \tag{17} \]

by virtue of the fact that f(x_n) = 0. Moreover,

    \[ \frac{df}{dx_n} = A_n\,, \tag{18} \]

which by assumption is nonzero. That is, A_n ≠ 0. It then follows that

    \[ \delta\bigl(f(x)\bigr) = \sum_n c_n\,\delta(x-x_n)\,, \tag{19} \]

since both sides of eq. (19) vanish for x ≠ x_n. We shall determine the c_n from the second property of the delta function given above.

Since the zeros of f(x) are assumed to be isolated, one can always find a positive number ǫ > 0 such that

    \[ \int_{x_n-\epsilon}^{x_n+\epsilon}\delta(x-x_m)\,dx = \delta_{mn}\,. \]

Thus, integrating both sides of eq. (19) from x_n − ǫ to x_n + ǫ yields

    \[ c_n = \int_{x_n-\epsilon}^{x_n+\epsilon}\delta\bigl(f(x)\bigr)\,dx\,. \tag{20} \]

To evaluate this integral, we make a change of integration variables. Let y = f(x). Then dy = (df/dx)dx. Applying this to eq. (20) yields,

    \[ c_n = \int_{f(x_n-\epsilon)}^{f(x_n+\epsilon)}\delta(y)\,\frac{dy}{df/dx} \simeq \int_{-\epsilon A_n}^{\epsilon A_n}\delta(y)\,\frac{dy}{df/dx}\,, \tag{21} \]

where df/dx should be re-expressed in terms of y. In the final step above, we have used eq. (17) and dropped the quadratic term, which is a very good approximation if ǫ ≪ 1.⁵ Using the fact that

    \[ \int_{y_-}^{y_+}\delta(y)\,g(y)\,dy = g(0)\,, \tag{22} \]

where y₋ < 0 < y₊, it then follows from eq. (21) that

    \[ c_n = \frac{\operatorname{sgn}A_n}{(df/dx)_{y=0}}\,, \tag{23} \]

where sgn A_n is the sign of A_n.⁶ Note that y = 0 corresponds to x = x_n. In light of eq. (18), we conclude that

    \[ c_n = \frac{1}{|df/dx_n|}\,, \]

since |z| = z sgn z for any real number z. Inserting this result back into eq. (19) yields,

    \[ \delta(f(x)) = \sum_n \frac{\delta(x-x_n)}{|df/dx_n|}\,, \tag{24} \]

as required.

⁵In fact, we do not have to take the limit as ǫ → 0. It is sufficient to choose the positive number ǫ small enough such that the sign of f(x_n + ǫ) ≃ ǫA_n + O(ǫ²) is the same as the sign of A_n.

⁶If A_n < 0, then the limits of integration in eq. (21) run from a positive value to a negative value. In order to make use of eq. (22), one must reverse the limits of integration, which yields an extra sign. Hence the factor of sgn A_n in eq. (23).

An alternate proof

Let us start from the integral representation,

    \[ \delta(x-y) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ik(x-y)}\,dk\,. \tag{25} \]

We make a change of variables k → ak, where a ≠ 0. Then,⁷

    \[ \delta(x-y) = \frac{a\operatorname{sgn}a}{2\pi}\int_{-\infty}^{\infty} e^{iak(x-y)}\,dk = |a|\,\delta\bigl(a(x-y)\bigr)\,, \tag{26} \]

where we have used |a| = a sgn a and employed eq. (25) in the last step above. Hence, we conclude that

    \[ \delta\bigl(a(x-y)\bigr) = \frac{1}{|a|}\,\delta(x-y)\,. \tag{27} \]

⁷Note that if a < 0, then the limits of integration run from ∞ to −∞. Reversing the direction of integration yields an extra sign. Hence the factor of sgn a in eq. (26) above.

Applying eq. (27) to eq. (17) then yields

    \[ \delta\bigl(f(x)\bigr) \simeq \delta\bigl(A_n(x-x_n)\bigr) = \frac{1}{|A_n|}\,\delta(x-x_n) = \frac{1}{|df/dx_n|}\,\delta(x-x_n)\,, \quad \text{for } x \simeq x_n\,. \]

We now repeat the above calculation in the vicinity of all the other zeros of f(x). The separate contributions add, and the end result is eq. (24).

(b) Use the result of part (a) to obtain simplified expressions for δ(ax) and δ(x² − a²), assuming that a ≠ 0.

Using eq. (24), it follows that

    \[ \delta(ax) = \frac{1}{|a|}\,\delta(x)\,, \]

and

    \[ \delta(x^2-a^2) = \frac{1}{2|a|}\bigl[\delta(x+a)+\delta(x-a)\bigr]\,. \]
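An added numerical illustration (assumes NumPy): replacing δ by the narrow Gaussian of eq. (28) and integrating against a smooth test function reproduces the second identity to good accuracy.

```python
import numpy as np

# Check δ(x² - a²) = [δ(x+a) + δ(x-a)]/(2|a|) with the nascent delta
# δ_eps(u) = exp(-u²/eps²)/(eps√π), integrated against a test function g.
def delta_eps(u, eps=1e-3):
    return np.exp(-(u / eps) ** 2) / (eps * np.sqrt(np.pi))

g = lambda u: np.exp(-u**2) * (1.0 + u)   # arbitrary smooth test function
a = 1.5

x = np.linspace(-5.0, 5.0, 2_000_001)
dx = x[1] - x[0]
lhs = np.sum(g(x) * delta_eps(x**2 - a**2)) * dx
rhs = (g(a) + g(-a)) / (2 * abs(a))
assert np.isclose(lhs, rhs, rtol=1e-3)
```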

3. The δ-function may be defined as:

    \[ \delta(x-x') = \lim_{\epsilon\to 0}\frac{1}{\sqrt{\pi\epsilon^2}}\,\exp\left(-\frac{(x-x')^2}{\epsilon^2}\right)\,. \tag{28} \]

(a) Assuming that a is a real positive constant, verify that:

    \[ \delta(x-x') = \lim_{a\to 0}\frac{1}{2\pi}\int_{-\infty}^{+\infty} e^{-ik(x-x')-ak^2}\,dk\,. \tag{29} \]

It is convenient to define y = x − x'. Then, we consider the integral,

    \[ I(y,a) \equiv \frac{1}{2\pi}\int_{-\infty}^{+\infty} e^{-iky-ak^2}\,dk\,, \]

where a > 0. We can explicitly evaluate this integral by "completing the square,"

    \[ -iky - ak^2 = -a\left(k+\frac{iy}{2a}\right)^2 - \frac{y^2}{4a}\,. \]

Then,

    \[ I(y,a) = \frac{1}{2\pi}\,e^{-y^2/(4a)}\int_{-\infty}^{\infty}\exp\left\{-a\left(k+\frac{iy}{2a}\right)^2\right\}dk\,. \]

Changing the integration variable to k' = k + iy/(2a) then yields

    \[ I(y,a) = \frac{1}{2\pi}\,e^{-y^2/(4a)}\int_{-\infty+iy/(2a)}^{\infty+iy/(2a)} e^{-ak'^2}\,dk' = \frac{1}{2\pi}\,e^{-y^2/(4a)}\int_{-\infty}^{\infty} e^{-ak'^2}\,dk'\,, \]

where in the last step, we deformed the contour of integration back down to the real axis. This deformation does not change the value of the integral, since no poles in the complex plane were traversed when moving the contour of integration in this manner. Using

    \[ \int_{-\infty}^{\infty} e^{-ak'^2}\,dk' = \sqrt{\frac{\pi}{a}}\,, \]

for a > 0, we end up with

    \[ I(y,a) = \frac{1}{\sqrt{4\pi a}}\,e^{-y^2/(4a)}\,. \tag{30} \]

If we set ǫ² ≡ 4a and y = x − x', then eqs. (29) and (30) yield,

    \[ \delta(x-x') = \lim_{a\to 0} I(x-x',a) = \lim_{\epsilon\to 0}\frac{1}{\sqrt{\pi\epsilon^2}}\,e^{-(x-x')^2/\epsilon^2}\,, \]

in agreement with eq. (28).
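Eq. (30) is easy to confirm by brute-force quadrature; the following added sketch (assuming NumPy) compares the direct evaluation of I(y, a) with the closed form:

```python
import numpy as np

# I(y, a) = (1/2π) ∫ e^{-iky - ak²} dk  vs.  e^{-y²/(4a)} / √(4πa)
a, y = 0.3, 1.1
k = np.linspace(-60.0, 60.0, 400_001)   # e^{-ak²} is negligible beyond |k| ≈ 60
dk = k[1] - k[0]
I_num = np.sum(np.exp(-1j * k * y - a * k**2)) * dk / (2 * np.pi)

I_exact = np.exp(-y**2 / (4 * a)) / np.sqrt(4 * np.pi * a)
assert np.isclose(I_num.real, I_exact) and abs(I_num.imag) < 1e-8
```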

(b) Let

    \[ f(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty} a(k)\,e^{ikx}\,dk\,. \]

Show that:

    \[ \int_{-\infty}^{+\infty}|f(x)|^2\,dx = \int_{-\infty}^{+\infty}|a(k)|^2\,dk\,. \tag{31} \]

We provide a physicist's derivation by interpreting eq. (29) as

    \[ \delta(x-x') = \frac{1}{2\pi}\int_{-\infty}^{+\infty} e^{-ik(x-x')}\,dk\,. \tag{32} \]

Inserting

    \[ f(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty} a(k)\,e^{ikx}\,dk \]

into the left hand side of eq. (31), it follows that

    \[ \int_{-\infty}^{+\infty}|f(x)|^2\,dx = \frac{1}{2\pi}\int_{-\infty}^{\infty}dk\int_{-\infty}^{\infty}d\ell\; a(k)\,a^*(\ell)\int_{-\infty}^{\infty} e^{i(k-\ell)x}\,dx \]
    \[ = \int_{-\infty}^{\infty}dk\int_{-\infty}^{\infty}d\ell\; a(k)\,a^*(\ell)\,\delta(k-\ell) = \int_{-\infty}^{+\infty}|a(k)|^2\,dk\,, \]

after using eq. (32) at the penultimate step.
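A discrete analogue of eq. (31) can be checked with the FFT; in this added sketch (assuming NumPy), the unitary (`norm="ortho"`) DFT plays the role of the symmetric 1/√(2π) Fourier convention:

```python
import numpy as np

# Discrete Parseval: Σ|f|² = Σ|a|² under the unitary DFT.
rng = np.random.default_rng(2)
f = rng.normal(size=1024) + 1j * rng.normal(size=1024)   # sampled f(x)

a = np.fft.fft(f, norm="ortho")   # unitary transform
assert np.isclose(np.sum(np.abs(f)**2), np.sum(np.abs(a)**2))
```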

(c) Interpret the result of part (b) in terms of Dirac bras and kets.

In the notation of bras and kets, the result of part (b) is equivalent to

    \[ \langle f|f\rangle = \int_{-\infty}^{\infty}dx\,\langle f|x\rangle\langle x|f\rangle = \int_{-\infty}^{\infty}dx\,|f(x)|^2 = \int_{-\infty}^{\infty}dk\,\langle f|k\rangle\langle k|f\rangle = \int_{-\infty}^{\infty}dk\,|a(k)|^2\,, \]

depending on whether we insert a complete set of states in the x-basis or in the k-basis.

An alternate proof of eq. (31)

Eq. (31) is known as Parseval's relation. A more mathematically careful derivation is as follows. First, we consider the Fourier transforms of two functions,

    \[ f(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty} a(k)\,e^{ikx}\,dk\,, \tag{33} \]
    \[ g(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty} b(k)\,e^{ikx}\,dk\,. \tag{34} \]

The convolution of two integrable functions, a(k) and b(k), is defined by c(k) = (a ∗ b)(k), where

    \[ c(k) \equiv \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} a(k-\ell)\,b(\ell)\,d\ell\,. \tag{35} \]

We first prove the following result,

    \[ f(x)g(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} c(k)\,e^{ikx}\,dk\,. \tag{36} \]

To verify eq. (36), we use eq. (35) to evaluate

    \[ \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} c(k)\,e^{ikx}\,dk = \frac{1}{2\pi}\int_{-\infty}^{\infty}dk\int_{-\infty}^{\infty}d\ell\; a(k-\ell)\,b(\ell)\,e^{ikx} \]
    \[ = \frac{1}{2\pi}\int_{-\infty}^{\infty}dm\int_{-\infty}^{\infty}d\ell\; a(m)\,b(\ell)\,e^{ix(\ell+m)} \]
    \[ = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} a(m)\,e^{imx}\,dm\;\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} b(\ell)\,e^{i\ell x}\,d\ell = f(x)g(x)\,, \tag{37} \]

where we changed integration variables, m = k − ℓ, in the second line above.

Taking the inverse Fourier transform of eq. (36) yields

    \[ c(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(x)g(x)\,e^{-ikx}\,dx\,. \tag{38} \]

In light of eqs. (35) and (38), it follows that

    \[ \int_{-\infty}^{\infty} a(k-\ell)\,b(\ell)\,d\ell = \int_{-\infty}^{\infty} f(x)g(x)\,e^{-ikx}\,dx\,. \]

Setting k = 0 yields

    \[ \int_{-\infty}^{\infty} a(-\ell)\,b(\ell)\,d\ell = \int_{-\infty}^{\infty} f(x)g(x)\,dx\,. \tag{39} \]

We now choose g(x) = f*(x), where the ∗ indicates complex conjugation. Then, eqs. (33) and (34) yield

    \[ f^*(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty} a^*(-k)\,e^{ikx}\,dk = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty} b(k)\,e^{ikx}\,dk\,, \]

where we changed the integration variable, k → −k, in the first integral above. It then follows that

    \[ \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}\bigl[b(k)-a^*(-k)\bigr]e^{ikx}\,dk = 0\,. \]

Inverting this Fourier transform, we conclude that

    \[ g(x) = f^*(x) \;\Longrightarrow\; b(k) = a^*(-k)\,. \]

Inserting this result into eq. (39), we end up with

    \[ \int_{-\infty}^{+\infty}|a(k)|^2\,dk = \int_{-\infty}^{+\infty}|f(x)|^2\,dx\,, \]

which is Parseval's relation.
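The convolution theorem, eq. (36), also has a clean discrete counterpart, checked in this added sketch (assuming NumPy). Note that the 1/√(2π) factors of the continuum convention are absent in NumPy's unnormalized forward DFT:

```python
import numpy as np

# DFT of a circular convolution equals the product of the DFTs.
rng = np.random.default_rng(3)
n = 64
a = rng.normal(size=n)
b = rng.normal(size=n)

# circular convolution  c[k] = Σ_ℓ a[(k-ℓ) mod n] b[ℓ]
c = np.array([sum(a[(k - l) % n] * b[l] for l in range(n)) for k in range(n)])

assert np.allclose(np.fft.fft(c), np.fft.fft(a) * np.fft.fft(b))
```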

4. The step function is defined as:

    \[ \Theta(k) = \begin{cases} 1\,, & \text{if } k>0\,, \\ 0\,, & \text{if } k<0\,. \end{cases} \]

(a) Using contour integration in the complex plane and the residue theorem, derive the following integral representation for Θ(k):

    \[ \Theta(k) = \lim_{\varepsilon\to 0}\frac{1}{2\pi i}\int_{-\infty}^{+\infty} dx\,\frac{e^{ikx}}{x-i\varepsilon}\,, \]

where ε is a positive infinitesimal real number.

We shall evaluate the integral,

    \[ I(k,\varepsilon) \equiv \frac{1}{2\pi i}\int_{-\infty}^{+\infty} dx\,\frac{e^{ikx}}{x-i\varepsilon}\,, \]

where ε > 0, by considering a semicircular contour in the complex z plane. We consider two cases.

Case 1: k > 0. Then it follows that

    \[ I(k,\varepsilon) = \frac{1}{2\pi i}\oint_C dz\,\frac{e^{ikz}}{z-i\varepsilon}\,, \]

[Figure: the closed contour C runs along the real axis and closes with a semicircle in the upper half of the complex z plane, enclosing the pole at z = iε.]

where C is the closed contour shown above, and the radius of the contour is taken to infinity. Note that because k > 0, the integrand is exponentially damped along the semicircular part of the contour C, and thus the contribution to the integral along the semicircular arc goes to zero as the radius of the semicircle is taken to infinity.

Inside the contour C there exists a simple pole at z = iε (since by assumption, ε > 0). Thus, by the residue theorem of complex analysis,

    \[ I(k,\varepsilon) = 2\pi i\,\frac{1}{2\pi i}\,\operatorname*{Res}_{z=i\varepsilon}\left(\frac{e^{ikz}}{z-i\varepsilon}\right) = e^{-k\varepsilon}\,, \]

where Res f(z) = lim_{z→z₀}(z − z₀)f(z) is the residue due to a simple pole at z = z₀.

Case 2: k < 0. Then it follows that

    \[ I(k,\varepsilon) = \frac{1}{2\pi i}\oint_C dz\,\frac{e^{ikz}}{z-i\varepsilon}\,, \]

[Figure: the contour C is now closed with a semicircle in the lower half plane; the pole at z = iε lies outside C.]

where the contour C is now closed in the lower half plane. Since in this case k < 0, the integrand is again exponentially damped along the semicircular part of the contour C, and thus the contribution to the integral along the semicircular arc goes to zero as the radius of the semicircle is taken to infinity. But, now the pole lies outside the closed contour C. Hence, by Cauchy's theorem of complex analysis, it follows that I(k, ε) = 0 for k < 0.

Taking the limit as ε → 0, we conclude that

    \[ \lim_{\varepsilon\to 0} I(k,\varepsilon) = \lim_{\varepsilon\to 0}\frac{1}{2\pi i}\int_{-\infty}^{+\infty} dx\,\frac{e^{ikx}}{x-i\varepsilon} = \begin{cases} 1\,, & \text{if } k>0, \\ 0\,, & \text{if } k<0. \end{cases} \]

That is,

    \[ \lim_{\varepsilon\to 0}\frac{1}{2\pi i}\int_{-\infty}^{+\infty} dx\,\frac{e^{ikx}}{x-i\varepsilon} = \Theta(k)\,. \tag{40} \]
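The intermediate result I(k, ε) = e^{−kε} (k > 0) and I(k, ε) = 0 (k < 0) can be verified by direct quadrature at fixed ε; an added sketch (assuming NumPy, with a finite but large integration window standing in for the infinite range):

```python
import numpy as np

# I(k, ε) = (1/2πi) ∫ e^{ikx}/(x - iε) dx, evaluated numerically.
def I_num(kval, eps, L=2000.0, n=2_000_001):
    x = np.linspace(-L, L, n)
    vals = np.exp(1j * kval * x) / (x - 1j * eps)
    return np.sum(vals) * (x[1] - x[0]) / (2j * np.pi)

eps = 0.1
assert np.isclose(I_num(2.0, eps), np.exp(-2.0 * eps), atol=1e-3)   # k > 0
assert np.isclose(I_num(-2.0, eps), 0.0, atol=1e-3)                 # k < 0
```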

(b) Prove that:

    \[ \delta(k) = \frac{d}{dk}\,\Theta(k)\,. \]

First we present the physicist's proof. Using the result of part (a),

    \[ \frac{d\Theta(k)}{dk} = \lim_{\varepsilon\to 0}\frac{1}{2\pi i}\,\frac{d}{dk}\int_{-\infty}^{+\infty} dx\,\frac{e^{ikx}}{x-i\varepsilon} = \lim_{\varepsilon\to 0}\frac{1}{2\pi}\int_{-\infty}^{+\infty} dx\,\frac{x}{x-i\varepsilon}\,e^{ikx} = \frac{1}{2\pi}\int_{-\infty}^{+\infty} dx\, e^{ikx} = \delta(k)\,, \]

after making use of eq. (32).

A somewhat more rigorous argument can be given as follows. First, for k ≠ 0, Θ(k) is a constant, so it is clear that

    \[ \frac{d\Theta(k)}{dk} = 0\,, \quad \text{for } k\neq 0\,. \tag{41} \]

For k = 0, Θ(k) is not well defined. However, note that for any pair of real positive numbers a, b > 0,

    \[ \int_{-a}^{b}\frac{d\Theta(k)}{dk}\,dk = \int_{-a}^{b} d\Theta(k) = \Theta(k)\Big|_{-a}^{\,b} = \Theta(b)-\Theta(-a) = 1\,. \tag{42} \]

But, eqs. (41) and (42) are precisely the defining properties of the delta function [cf. eq. (16)]. Namely, δ(k) = 0 for k ≠ 0 and

    \[ \int_{-a}^{b}\delta(k)\,dk = 1\,, \]

for any pair of real positive numbers a, b > 0. Hence, we can identify,

    \[ \delta(k) = \frac{d}{dk}\,\Theta(k)\,. \]

Note that in the theory of distributions, a similar proof to the one just provided can be given by examining

    \[ \int_{-\infty}^{\infty}\frac{d\Theta(k)}{dk}\,f(k)\,dk\,, \tag{43} \]

where f(k) is a "test function" (often taken to be an element of the set of infinitely differentiable functions that vanish as k → ±∞ faster than any inverse power of k). Then, one evaluates eq. (43) by integrating by parts. Since f(k) vanishes as k → ±∞, the surface terms do not contribute, and one obtains,

    \[ \int_{-\infty}^{\infty}\frac{d\Theta(k)}{dk}\,f(k)\,dk = -\int_{-\infty}^{\infty}\Theta(k)\,\frac{df}{dk}\,dk = -\int_{0}^{\infty} df = f(0)-f(\infty) = f(0)\,. \]

This is to be compared with

    \[ \int_{-\infty}^{\infty}\delta(k)\,f(k)\,dk = f(0)\,. \]

We conclude that

    \[ \int_{-\infty}^{\infty}\left[\frac{d\Theta(k)}{dk}-\delta(k)\right]f(k)\,dk = 0\,. \]

Since this result must be valid for any test function, it follows that the integrand must vanish exactly. That is,

    \[ \delta(k) = \frac{d}{dk}\,\Theta(k)\,. \]

(c) Consider the following important result in the theory of distributions:⁸

    \[ \lim_{\varepsilon\to 0}\frac{1}{x\pm i\varepsilon} = P\,\frac{1}{x} \mp i\pi\delta(x)\,, \tag{44} \]

where ε is a positive infinitesimal and P denotes the Cauchy principal value. Formally, eq. (44) only makes sense when multiplied by a well-behaved function F(x) and integrated from −∞ to +∞. (Here, well-behaved means that the resulting integrals are convergent.) In this case, the Cauchy principal value is defined by

    \[ P\int_{-\infty}^{+\infty}\frac{1}{x}\,F(x)\,dx \equiv \lim_{a\to 0}\left\{\int_{-\infty}^{-a}\frac{F(x)}{x}\,dx + \int_{a}^{+\infty}\frac{F(x)}{x}\,dx\right\}\,. \]

Using the result of part (a), show that the Fourier transform of eq. (44) is satisfied. That is, eq. (44) is valid when multiplied by the function F(x) = e^{ikx} and integrated from −∞ to +∞.

⁸Eq. (44) is called the Sokhotski-Plemelj formula. Three different proofs of this formula are given in a class handout entitled, The Sokhotski-Plemelj Formula.

If we take the complex conjugate of eq. (40), and then take k → −k, we obtain

    \[ \lim_{\varepsilon\to 0}\frac{1}{2\pi i}\int_{-\infty}^{+\infty} dx\,\frac{e^{ikx}}{x+i\varepsilon} = -\Theta(-k)\,. \tag{45} \]

Thus, eqs. (40) and (45) yield

    \[ \lim_{\varepsilon\to 0}\int_{-\infty}^{\infty}\frac{e^{ikx}}{x\pm i\varepsilon}\,dx = \mp 2\pi i\,\Theta(\mp k)\,. \tag{46} \]

Next, we consider

    \[ P\int_{-\infty}^{\infty}\frac{e^{ikx}}{x}\,dx = P\int_{-\infty}^{\infty}\frac{\cos(kx)}{x}\,dx + iP\int_{-\infty}^{\infty}\frac{\sin(kx)}{x}\,dx\,. \tag{47} \]

Since cos(kx)/x is an odd function of x (i.e., it changes sign under x → −x), it immediately follows from the definition of the Cauchy principal value that

    \[ P\int_{-\infty}^{\infty}\frac{\cos(kx)}{x}\,dx = 0\,. \tag{48} \]

Next, we observe that lim_{x→0} sin(kx)/x = k, so that the function sin(kx)/x is regular at x = 0. Thus,

    \[ P\int_{-\infty}^{\infty}\frac{\sin(kx)}{x}\,dx = \int_{-\infty}^{\infty}\frac{\sin(kx)}{x}\,dx = \pi\,\operatorname{sgn}(k)\,, \tag{49} \]

where the sign of k is defined by

    \[ \operatorname{sgn}(k) = \Theta(k)-\Theta(-k) = \begin{cases} \hphantom{-}1\,, & \text{for } k>0, \\ -1\,, & \text{for } k<0. \end{cases} \]

That is, the P symbol has no effect on the integral defined in eq. (49). The factor of sgn(k) arises after changing the integration variable, y = kx. When k < 0, the integration limits must be reversed, which then leads to the extra sign.

Inserting eqs. (48) and (49) into eq. (47) then yields,

    \[ P\int_{-\infty}^{\infty}\frac{e^{ikx}}{x}\,dx = \pi i\bigl[\Theta(k)-\Theta(-k)\bigr]\,. \tag{50} \]

Finally, we note that

    \[ \mp i\pi\int_{-\infty}^{\infty} e^{ikx}\,\delta(x)\,dx = \mp i\pi\,. \tag{51} \]

Combining eqs. (46), (50) and (51),

    \[ \lim_{\varepsilon\to 0}\int_{-\infty}^{\infty}\frac{e^{ikx}}{x\pm i\varepsilon}\,dx - P\int_{-\infty}^{\infty}\frac{e^{ikx}}{x}\,dx \pm i\pi\int_{-\infty}^{\infty} e^{ikx}\,\delta(x)\,dx \]
    \[ = \mp 2\pi i\,\Theta(\mp k) - \pi i\bigl[\Theta(k)-\Theta(-k)\mp 1\bigr] = \mp\pi i\bigl[\Theta(k)+\Theta(-k)-1\bigr] = 0\,, \]

after using the identity Θ(k) + Θ(−k) = 1, which is valid for all values of k.⁹

Hence, we can symbolically write,

    \[ \int_{-\infty}^{\infty} e^{ikx}\left\{\lim_{\varepsilon\to 0}\frac{1}{x\pm i\varepsilon} - P\,\frac{1}{x} \pm i\pi\delta(x)\right\}dx = 0\,, \]

for all values of k. Thus, we have verified that

    \[ \lim_{\varepsilon\to 0}\frac{1}{x\pm i\varepsilon} = P\,\frac{1}{x} \mp i\pi\delta(x)\,, \]

is a valid identity when multiplied by e^{ikx} and integrated over all x. Indeed, this symbolic identity is valid when replacing e^{ikx} with any choice of reasonable test function that can be written as a Fourier transform,

    \[ F(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} a(k)\,e^{ikx}\,dk\,. \]
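An added numerical illustration of eq. (44) (assuming NumPy): for a smooth, rapidly decaying test function F, the integral of F(x)/(x − iε) at small ε should approach P∫F(x)/x dx + iπF(0).

```python
import numpy as np

# Sokhotski-Plemelj check on F(x) = e^{-x²}(1 + x/2):
#   real part → P∫ F/x dx = (1/2)√π   (the odd part of F integrates to zero),
#   imag part → π F(0) = π.
F = lambda u: np.exp(-u**2) * (1.0 + 0.5 * u)
eps = 1e-4

x = np.linspace(-20.0, 20.0, 4_000_001)   # symmetric grid with a point at x = 0
dx = x[1] - x[0]
lhs = np.sum(F(x) / (x - 1j * eps)) * dx

assert np.isclose(lhs.real, 0.5 * np.sqrt(np.pi), atol=1e-3)
assert np.isclose(lhs.imag, np.pi, atol=1e-3)
```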

5. Define the operators X and K such that X|f⟩ = |xf⟩ and K|f⟩ = −i|df/dx⟩. Then [X, K] = iI, where I is the identity operator in an infinite dimensional space.

(a) It appears that by using Tr(XK) = Tr(KX), we obtain 0 = ∞. Resolve the paradox.

The paradox arises when one assumes that the relation, Tr(XK) = Tr(KX), which is valid for finite-dimensional matrices, continues to hold in general for operators acting on vectors that live in infinite-dimensional spaces. Indeed, in the latter case,

    \[ \operatorname{Tr}\Omega = \sum_i \langle i|\Omega|i\rangle\,, \]

may fail to converge. I will not give a detailed mathematical discussion on whether Tr Ω is well-defined, nor will I provide the conditions under which Tr(AB) = Tr(BA) for operators A and B that act in an infinite-dimensional space.¹⁰ Suffice it to say that for the operators X and K, neither Tr(XK) nor Tr(KX) converges, so that

    \[ \operatorname{Tr}[X,K] = \operatorname{Tr}(XK) - \operatorname{Tr}(KX) = \infty - \infty\,, \]

which is not well defined. Thus, one cannot conclude that Tr[X, K] = 0, and there is no paradox.

⁹Strictly speaking, Θ(k) + Θ(−k) is not defined at k = 0. But, it is convenient to define this sum in the limit as k → 0, in which case we can take Θ(k) + Θ(−k) = 1 for all values of k.

¹⁰For a mathematical treatment of these issues, see e.g. Guido Fano, Mathematical Methods of Quantum Mechanics (McGraw Hill, New York, 1971) pp. 375–380.

A proof that Tr(XK) and Tr(KX) do not converge

To show that Tr(XK) does not converge, consider first

    \[ \operatorname{Tr}(XK) = \int_{-\infty}^{\infty} dx\,\langle x|XK|x\rangle = \int_{-\infty}^{\infty} x\,dx\,\langle x|K|x\rangle\,. \]

But, ⟨x|K|x'⟩ = −iδ'(x − x'), so one might be tempted to conclude that

    \[ \operatorname{Tr}(XK) = -i\,\delta'(0)\int_{-\infty}^{\infty} x\,dx\,. \tag{52} \]

Since δ'(x) is an odd function of x, it follows that δ'(0) = 0. Hence, eq. (52) yields

    \[ \operatorname{Tr}(XK) = 0\times\infty\,, \]

which is not well defined. Thus, the argument above is not able to determine whether Tr(XK) is finite or infinite.

Here is another approach. Consider a basis consisting of a complete set of orthonormal eigenstates, {|n⟩}. Then, the diagonal elements of XK are

    \[ \langle n|XK|n\rangle = \int_{-\infty}^{\infty}dx\int_{-\infty}^{\infty}dx'\; x\,\langle n|x\rangle\langle x|K|x'\rangle\langle x'|n\rangle \]
    \[ = -i\int_{-\infty}^{\infty} x\,dx\,\psi_n^*(x)\int_{-\infty}^{\infty}dx'\,\delta'(x-x')\,\psi_n(x') \]
    \[ = i\int_{-\infty}^{\infty} x\,dx\,\psi_n^*(x)\int_{-\infty}^{\infty}dx'\,\delta'(x'-x)\,\psi_n(x') \]
    \[ = -i\int_{-\infty}^{\infty} x\,dx\,\psi_n^*(x)\,\frac{d\psi_n(x)}{dx}\,, \tag{53} \]

where ψ_n(x) ≡ ⟨x|n⟩. In deriving eq. (53), we used the fact that δ'(x − x') = −δ'(x' − x) and then integrated by parts, under the assumption that the surface terms vanish, or equivalently lim_{x→±∞} ψ_n(x) = 0.

A similar computation yields

    \[ \langle n|KX|n\rangle = i\int_{-\infty}^{\infty} x\,dx\,\frac{d\psi_n^*(x)}{dx}\,\psi_n(x) = -i\int_{-\infty}^{\infty}dx\,\psi_n^*(x)\left[\psi_n(x) + x\,\frac{d\psi_n(x)}{dx}\right] = -i + \langle n|XK|n\rangle\,, \]

where we have integrated by parts (dropping the surface terms) and have used the fact that the {|n⟩} are orthonormal, i.e.,

    \[ 1 = \langle n|n\rangle = \int_{-\infty}^{\infty}dx\,\langle n|x\rangle\langle x|n\rangle = \int_{-\infty}^{\infty}dx\,|\psi_n(x)|^2\,. \]

Thus,

    \[ \langle n|[X,K]|n\rangle = i\,\langle n|n\rangle = i\,, \]

as expected.

To show that neither Tr(XK) nor Tr(KX) is convergent, consider the following example. Let us choose the {|n⟩} to be eigenstates of the operator d²/dx², and consider the function space defined on an interval of the real axis, 0 ≤ x ≤ L. We choose boundary conditions such that ψ_n(0) = ψ_n(L) = 0. Then,

    \[ \psi_n(x) = \begin{cases} \sqrt{\dfrac{2}{L}}\,\sin\Bigl(\dfrac{n\pi x}{L}\Bigr)\,, & \text{for } 0\le x\le L\,, \\[1ex] 0\,, & \text{for } x<0 \text{ and for } x>L\,, \end{cases} \]

for all positive integers n = 1, 2, 3, …. Then,

    \[ \langle n|XK|n\rangle = -\frac{2in\pi}{L^2}\int_0^L x\,\sin\Bigl(\frac{n\pi x}{L}\Bigr)\cos\Bigl(\frac{n\pi x}{L}\Bigr)\,dx = \frac{i}{2}\,, \tag{54} \]
    \[ \langle n|KX|n\rangle = -i + \langle n|XK|n\rangle = -\frac{i}{2}\,. \tag{55} \]

Then

    \[ \operatorname{Tr}(XK) = \sum_{n=1}^{\infty}\langle n|XK|n\rangle = \frac{i}{2}\sum_{n=1}^{\infty}1\,, \]
    \[ \operatorname{Tr}(KX) = \sum_{n=1}^{\infty}\langle n|KX|n\rangle = -\frac{i}{2}\sum_{n=1}^{\infty}1\,, \]

which shows explicitly that Tr(XK) and Tr(KX) do not converge. Moreover, Tr[X, K] does not converge as well,

    \[ \operatorname{Tr}[X,K] = \sum_{n=1}^{\infty}\langle n|[X,K]|n\rangle = i\sum_{n=1}^{\infty}1\,. \]

Thus, there is no paradox.
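Eq. (54) is easy to confirm numerically; an added sketch (assuming NumPy) checks that the diagonal element is i/2 independently of n:

```python
import numpy as np

# ⟨n|XK|n⟩ = -i ∫₀ᴸ x ψ_n(x) ψ_n'(x) dx = i/2 for the box modes
# ψ_n(x) = √(2/L) sin(nπx/L), for every positive integer n.
L = 3.0
x = np.linspace(0.0, L, 1_000_001)
dx = x[1] - x[0]

for n in (1, 2, 7):
    psi = np.sqrt(2.0 / L) * np.sin(n * np.pi * x / L)
    dpsi = np.sqrt(2.0 / L) * (n * np.pi / L) * np.cos(n * np.pi * x / L)
    val = -1j * np.sum(x * psi * dpsi) * dx   # integrand vanishes at both endpoints
    assert np.isclose(val, 0.5j, atol=1e-5)
```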

(b) Show that if the functions F and G can be expressed as power series in their arguments, then:

    \[ [X, G(K)] = i\,\frac{dG}{dK} \qquad\text{and}\qquad [K, F(X)] = -i\,\frac{dF}{dX}\,. \]

First, we shall employ the commutator identity,

    \[ [A, BC] = B[A,C] + [A,B]C\,, \]

to evaluate [X, K^n]. After n steps,

    \[ [X,K^n] = K[X,K^{n-1}] + [X,K]K^{n-1} = K[X,K^{n-1}] + iK^{n-1} \]
    \[ = K\bigl\{K[X,K^{n-2}] + [X,K]K^{n-2}\bigr\} + iK^{n-1} = K^2[X,K^{n-2}] + 2iK^{n-1} \]
    \[ = K^2\bigl\{K[X,K^{n-3}] + [X,K]K^{n-3}\bigr\} + 2iK^{n-1} = K^3[X,K^{n-3}] + 3iK^{n-1} \]
    \[ \quad\vdots \]
    \[ = K^{n-2}\bigl\{K[X,K] + [X,K]K\bigr\} + (n-2)iK^{n-1} = K^{n-1}[X,K] + (n-1)iK^{n-1} \]
    \[ = inK^{n-1}\,, \tag{56} \]

where we have used [X, K] = iI.

Assuming G(K) is defined via its Taylor series,

    \[ G(K) = \sum_{n=0}^{\infty} g_n K^n\,, \]

then eq. (56) yields

    \[ [X,G(K)] = \sum_{n=0}^{\infty} g_n[X,K^n] = i\sum_{n=1}^{\infty} g_n\,nK^{n-1} = i\,\frac{d}{dK}\sum_{n=0}^{\infty} g_n K^n = i\,\frac{dG}{dK}\,. \]

In light of [K, X] = −[X, K] = −iI, a computation analogous to the one shown in eq. (56) yields,

    \[ [K,X^n] = -inX^{n-1}\,. \tag{57} \]

Hence, if we assume that F(X) is defined via its Taylor series,

    \[ F(X) = \sum_{n=0}^{\infty} f_n X^n\,, \]

then eq. (57) yields

    \[ [K,F(X)] = \sum_{n=0}^{\infty} f_n[K,X^n] = -i\sum_{n=1}^{\infty} f_n\,nX^{n-1} = -i\,\frac{d}{dX}\sum_{n=0}^{\infty} f_n X^n = -i\,\frac{dF}{dX}\,. \]
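As an added grid-based illustration (assuming NumPy), the special case G(K) = K², i.e. [X, K²]ψ = 2iKψ, can be checked by realizing K = −i d/dx spectrally with the FFT on a wavepacket that vanishes at the edges of the box:

```python
import numpy as np

# X acts as multiplication by x; K^p acts as multiplication by k^p in Fourier space.
N, Lbox = 1024, 40.0
x = (np.arange(N) - N // 2) * (Lbox / N)
k = 2.0 * np.pi * np.fft.fftfreq(N, d=Lbox / N)

def K_pow(psi, p):
    """Apply K^p = (-i d/dx)^p via the FFT."""
    return np.fft.ifft(k**p * np.fft.fft(psi))

psi = np.exp(-x**2) * np.exp(0.7j * x)        # smooth, negligible at x = ±20

lhs = x * K_pow(psi, 2) - K_pow(x * psi, 2)   # [X, K²] ψ
rhs = 2j * K_pow(psi, 1)                      # (i dG/dK) ψ = 2iK ψ
assert np.allclose(lhs, rhs, atol=1e-8)
```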

(c) Let |x⟩ be an eigenstate of X with eigenvalue x. Prove that exp(−iKa)|x⟩ is an eigenstate of X. What is the corresponding eigenvalue?

Using the results of part (b),

    \[ [X, \exp(-iKa)] = i\,\frac{d}{dK}\exp(-iKa) = a\exp(-iKa)\,. \]

Writing out the commutator, it follows that

    \[ X\exp(-iKa) = \exp(-iKa)\,X + a\exp(-iKa) = \exp(-iKa)\bigl[X + aI\bigr]\,. \]

Hence,

    \[ X\exp(-iKa)\,|x\rangle = \exp(-iKa)\bigl[X + aI\bigr]|x\rangle = (x+a)\exp(-iKa)\,|x\rangle\,. \]

Rewriting this last result as follows,

    \[ X\bigl\{\exp(-iKa)\,|x\rangle\bigr\} = (x+a)\bigl\{\exp(-iKa)\,|x\rangle\bigr\}\,, \]

we see that exp(−iKa)|x⟩ is an eigenstate of X with eigenvalue x + a.

An alternate method

The exponential of an operator is defined via its Taylor series,

    \[ \exp(-iKa) = \sum_{n=0}^{\infty}\frac{(-ia)^n}{n!}\,K^n\,. \]

To evaluate exp(−iKa)|x⟩, we first evaluate

    \[ \langle x|\exp(-iKa)|\psi\rangle = \sum_{n=0}^{\infty}\frac{(-ia)^n}{n!}\,\langle x|K^n|\psi\rangle\,. \]

Using

    \[ \langle x|K^n|\psi\rangle = \left(-i\,\frac{d}{dx}\right)^n\psi(x)\,, \]

we obtain,

    \[ \langle x|\exp(-iKa)|\psi\rangle = \sum_{n=0}^{\infty}\frac{(-a)^n}{n!}\,\frac{d^n\psi}{dx^n} = \psi(x-a)\,, \tag{58} \]

after using the Taylor series expansion for ψ(x − a). Recalling that

    \[ \psi(x-a) = \langle x-a|\psi\rangle\,, \]

it then follows from eq. (58) that

    \[ \langle x|\exp(-iKa)|\psi\rangle = \langle x-a|\psi\rangle\,. \]

Since this equation is true for an arbitrary ket |ψ⟩, it then follows that¹¹

    \[ \langle x|\exp(-iKa) = \langle x-a|\,. \tag{59} \]

Taking the adjoint of eq. (59) yields

    \[ \exp(iKa)\,|x\rangle = |x-a\rangle\,. \]

Finally, replacing a → −a, we conclude that

    \[ \exp(-iKa)\,|x\rangle = |x+a\rangle\,. \]

Thus, exp(−iKa)|x⟩ is an eigenstate of X with eigenvalue x + a.
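The translation property of eq. (58) can be seen directly on a grid; an added sketch (assuming NumPy) applies e^{−ika} in Fourier space and recovers ψ(x − a):

```python
import numpy as np

# exp(-iKa) multiplies each Fourier component by e^{-ika}, shifting ψ by +a.
N, Lbox = 2048, 40.0
x = (np.arange(N) - N // 2) * (Lbox / N)
k = 2.0 * np.pi * np.fft.fftfreq(N, d=Lbox / N)

psi = lambda u: np.exp(-u**2)             # Gaussian, negligible at the edges
a = 1.25

shifted = np.fft.ifft(np.exp(-1j * k * a) * np.fft.fft(psi(x)))
assert np.allclose(shifted.real, psi(x - a), atol=1e-10)
assert np.allclose(shifted.imag, 0.0, atol=1e-10)
```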

6. (a) Show that if A commutes with [A, B] then:

    \[ e^A B e^{-A} = B + [A,B]\,. \]

Consider the function f(a) = e^{aA}Be^{−aA}. Let us compute its Taylor series,

    \[ f(a) = f(0) + af'(0) + \tfrac{1}{2}a^2 f''(0) + \cdots\,, \tag{60} \]

where f' = df/da, etc. First we observe that f(0) = B. Next,

    \[ f'(a) = e^{aA}ABe^{-aA} - e^{aA}BAe^{-aA} = e^{aA}[A,B]e^{-aA}\,. \]

Under the assumption that A commutes with [A, B], it follows that [A, B]e^{−aA} = e^{−aA}[A, B]. Hence,

    \[ f'(a) = e^{aA}[A,B]e^{-aA} = e^{aA}e^{-aA}[A,B] = [A,B]\,, \]

which is independent of a. Thus, all higher derivatives of f are zero. Plugging the above results into eq. (60), we conclude that

    \[ f(a) = B + a[A,B]\,. \tag{61} \]

Setting a = 1, we end up with

    \[ e^A B e^{-A} = B + [A,B]\,, \]

under the assumption that [A, [A,B]] = 0.

(b) Suppose that A and B are two non-commuting operators such that their commutator [A, B] commutes with both A and B. Prove that

    \[ \exp A\,\exp B = \exp\bigl(A + B + \tfrac{1}{2}[A,B]\bigr)\,. \]

¹¹If ⟨χ₁|ψ⟩ = ⟨χ₂|ψ⟩ for all kets |ψ⟩, then ⟨χ₁ − χ₂|ψ⟩ = 0. But the only vector that is simultaneously orthogonal to all vectors of the Hilbert space is the zero vector. Hence, χ₁ = χ₂.

Under the assumption that [A, [A, B]] = [B, [A, B]] = 0, we shall evaluate e^A e^B. Our strategy will be to define the function

G(t) = e^{At} e^{Bt} ,   (62)

and deduce a differential equation for G. Differentiating G with respect to t yields

dG/dt = A e^{At} e^{Bt} + e^{At} B e^{Bt} = ( A + e^{At} B e^{−At} ) G .

Using the results of part (a) [cf. eq. (61)], e^{At} B e^{−At} = B + t[A, B]. Hence,

dG/dt = ( A + B + t[A, B] ) G .   (63)

Note that the general solution to

dG/dt = C(t) G ,

under the assumption that [C(t₁), C(t₂)] = 0 for all t₁, t₂, is given by

G(t) = G(0) exp{ ∫₀ᵗ C(t′) dt′ } .

We can apply this result to eq. (63) since both A and B commute with [A, B]. Hence, the solution to eq. (63) is

G(t) = exp{ (A + B)t + ½[A, B] t² } ,   (64)

where we have noted that eq. (62) implies that G(0) = I (where I is the identity matrix). Finally, setting t = 1, eq. (64) yields

exp A exp B = exp( A + B + ½[A, B] ) .
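These two operator identities can be spot-checked numerically. Strictly upper-triangular 3×3 matrices provide a convenient test case, since for them [A, B] has its only nonzero entry in the upper-right corner and therefore commutes with both A and B. The following pure-Python sketch (all helper names are our own, not part of the solutions) verifies both eq. (61) at a = 1 and the final result:

```python
import math

# Pure-Python 3x3 matrix helpers (names are our own choices).
N = 3

def matmul(X, Y):
    return [[sum(X[i][k]*Y[k][j] for k in range(N)) for j in range(N)] for i in range(N)]

def madd(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(N)] for i in range(N)]

def mscale(c, X):
    return [[c*X[i][j] for j in range(N)] for i in range(N)]

def expm(X, terms=30):
    """Matrix exponential via its defining Taylor series."""
    R = [[float(i == j) for j in range(N)] for i in range(N)]
    P = R
    for n in range(1, terms):
        P = matmul(P, X)
        R = madd(R, mscale(1.0/math.factorial(n), P))
    return R

# Strictly upper-triangular matrices: [A,B] commutes with both A and B.
A = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
B = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]
C = madd(matmul(A, B), mscale(-1, matmul(B, A)))   # C = [A,B]

# eq. (61) at a = 1:  e^A B e^{-A} = B + [A,B]
lhs1 = matmul(matmul(expm(A), B), expm(mscale(-1, A)))
rhs1 = madd(B, C)

# exp A exp B = exp(A + B + (1/2)[A,B])
lhs2 = matmul(expm(A), expm(B))
rhs2 = expm(madd(madd(A, B), mscale(0.5, C)))

def close(X, Y):
    return all(abs(X[i][j] - Y[i][j]) < 1e-12 for i in range(N) for j in range(N))

assert close(lhs1, rhs1) and close(lhs2, rhs2)
```

Because the matrices here are nilpotent, the Taylor series for each exponential terminates after a few terms, so the comparison is essentially exact.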

7. In Chapter 1, Sakurai and Napolitano introduce the spin-½ operators Sx, Sy and Sz.

(a) With respect to the basis {|+⟩, |−⟩}, determine the explicit matrix representation for Sx, Sy and Sz. Denote the corresponding matrices by ½ℏσᵢ, where i = 1, 2, 3 corresponds to x, y, z, respectively. The three 2×2 matrices, σ₁, σ₂ and σ₃, are called the Pauli matrices.

Recall that in class, we defined the matrix elements, Ωᵢⱼ, of a linear operator Ω by¹²

Ω|j⟩ = Σᵢ Ωᵢⱼ |i⟩ .   (65)

¹²Under the assumption that the basis {|i⟩} is orthonormal, eq. (65) implies that Ωᵢⱼ = ⟨i|Ω|j⟩.

When we apply this result to the operators Sx, Sy and Sz, the index i will run over the values 1 and 2, which correspond to the basis vectors |+⟩ and |−⟩, respectively. Note that the basis {|+⟩, |−⟩} is orthonormal, i.e.,

⟨+|+⟩ = ⟨−|−⟩ = 1 ,   ⟨+|−⟩ = ⟨−|+⟩ = 0 .

First we use eq. (1.2.6) on p. 12 of Sakurai and Napolitano,

Sz|+⟩ = ½ℏ|+⟩ ,   Sz|−⟩ = −½ℏ|−⟩ .

Using eq. (65) one obtains the matrix representation, Sz = ½ℏσ₃, where

σ₃ = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} .

Next, we use eqs. (1.4.17a) and (1.4.17b) on p. 27 of Sakurai and Napolitano,

Sx = ½ℏ[ |+⟩⟨−| + |−⟩⟨+| ] ,   (66)

Sy = ½ℏ[ −i|+⟩⟨−| + i|−⟩⟨+| ] .   (67)

Using the orthonormality of the {|+⟩, |−⟩} basis, it follows from eq. (66) that

Sx|+⟩ = ½ℏ|−⟩ ,   Sx|−⟩ = ½ℏ|+⟩ .

Using eq. (65), one obtains the matrix representation, Sx = ½ℏσ₁, where

σ₁ = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} .

Likewise, it follows from eq. (67) that

Sy|+⟩ = ½iℏ|−⟩ ,   Sy|−⟩ = −½iℏ|+⟩ .

Using eq. (65), one obtains the matrix representation, Sy = ½ℏσ₂, where

σ₂ = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} .
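The ket-bra constructions above can be checked directly. The sketch below (helper names are our own) builds the three matrices from outer products of the basis column vectors, with ℏ set to 1 for simplicity; for Sz we use the spectral form ½(|+⟩⟨+| − |−⟩⟨−|), which is equivalent to eq. (1.2.6):

```python
# Reconstruct the matrices of S_x, S_y, S_z from ket-bra forms, hbar = 1.
plus  = [1+0j, 0+0j]   # |+> as a column vector
minus = [0+0j, 1+0j]   # |->

def outer(u, v):
    # |u><v| as a 2x2 complex matrix (conjugate the bra entries)
    return [[u[i] * v[j].conjugate() for j in range(2)] for i in range(2)]

def madd2(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def mscale2(c, X):
    return [[c * X[i][j] for j in range(2)] for i in range(2)]

half = 0.5   # plays the role of hbar/2 with hbar = 1

Sx = mscale2(half, madd2(outer(plus, minus), outer(minus, plus)))            # eq. (66)
Sy = mscale2(half, madd2(mscale2(-1j, outer(plus, minus)),
                         mscale2( 1j, outer(minus, plus))))                  # eq. (67)
Sz = mscale2(half, madd2(outer(plus, plus), mscale2(-1, outer(minus, minus))))

sigma1 = [[0, 1], [1, 0]]
sigma2 = [[0, -1j], [1j, 0]]
sigma3 = [[1, 0], [0, -1]]

for S, sigma in [(Sx, sigma1), (Sy, sigma2), (Sz, sigma3)]:
    assert all(abs(S[i][j] - half*sigma[i][j]) < 1e-12
               for i in range(2) for j in range(2))
```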

(b) It is often convenient to write σ⃗ = (σ₁, σ₂, σ₃). That is, σ⃗ is a three-dimensional "vector" whose "components" are the three 2×2 Pauli matrices. Prove the following identities involving the Pauli matrices:

(i) σᵢσⱼ = δᵢⱼ 1_{2×2} + iεᵢⱼₖσₖ ,

(ii) (σ⃗·a⃗)(σ⃗·b⃗) = (a⃗·b⃗) 1_{2×2} + iσ⃗·(a⃗ × b⃗) ,

(iii) exp(−iθ n̂·σ⃗/2) = 1_{2×2} cos(θ/2) − i n̂·σ⃗ sin(θ/2) ,

where 1_{2×2} is the 2×2 identity matrix, and n̂ is a unit vector. Note that in (i) above, there is an implicit sum over the repeated index k = 1, 2, 3.

Proof of (i)

For i = j, the identity above reduces to σᵢ² = 1_{2×2}. This is verified by explicit calculation,

σ₁² = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} ,   σ₂² = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} ,

σ₃² = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} .

For i ≠ j, we get six relations,

σ₁σ₂ = −σ₂σ₁ = iσ₃ ,   σ₂σ₃ = −σ₃σ₂ = iσ₁ ,   σ₃σ₁ = −σ₁σ₃ = iσ₂ ,

which again can be verified explicitly. For example,

σ₁σ₂ = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} = iσ₃ ,

σ₂σ₁ = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix} = −iσ₃ .   (68)

The four other relations can be similarly verified.

Proof of (ii)

Starting from identity (i), multiply by aᵢbⱼ and then sum over i and j. Using aᵢσᵢ = σ⃗·a⃗ (due to the implicit sum over the repeated index i) and (a⃗ × b⃗)ₖ = εᵢⱼₖaᵢbⱼ (where the implicit sum over the repeated indices i and j is assumed), we see that identity (ii) is an immediate consequence of identity (i).

Proof of (iii)

By definition,

exp(−iθ n̂·σ⃗/2) = Σ_{j=0}^{∞} (1/j!) (−iθ n̂·σ⃗/2)^j .

Using identity (ii), (n̂·σ⃗)² = 1_{2×2}. It then follows that

(n̂·σ⃗)^{2j} = 1_{2×2} ,   (n̂·σ⃗)^{2j+1} = n̂·σ⃗ ,

for any non-negative integer j. Hence, we end up with

exp(−iθ n̂·σ⃗/2) = 1_{2×2} Σ_{j even} (1/j!)(−iθ/2)^j + n̂·σ⃗ Σ_{j odd} (1/j!)(−iθ/2)^j

 = 1_{2×2} Σ_{j=0}^{∞} [(−1)^j/(2j)!] (θ/2)^{2j} − i n̂·σ⃗ Σ_{j=0}^{∞} [(−1)^j/(2j+1)!] (θ/2)^{2j+1}

 = 1_{2×2} cos(θ/2) − i n̂·σ⃗ sin(θ/2) ,   (69)

which is the desired result.
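Identities (i) and (iii) lend themselves to a direct numerical check. The sketch below compares the Taylor series of exp(−iθ n̂·σ⃗/2) with the closed form of eq. (69) for one arbitrary choice of θ and n̂ (the helper names and the particular numerical values are our own):

```python
import math

# Pauli matrices and the 2x2 identity.
I2 = [[1, 0], [0, 1]]
sig = [[[0, 1], [1, 0]],        # sigma_1
       [[0, -1j], [1j, 0]],     # sigma_2
       [[1, 0], [0, -1]]]       # sigma_3

def mul(X, Y):
    return [[sum(X[i][k]*Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def eps(i, j, k):
    # Levi-Civita symbol for indices 0, 1, 2
    return ((i - j) * (j - k) * (k - i)) // 2

# Identity (i): sigma_i sigma_j = delta_ij I + i eps_ijk sigma_k
for i in range(3):
    for j in range(3):
        rhs = [[(i == j)*I2[r][c] + sum(1j*eps(i, j, k)*sig[k][r][c] for k in range(3))
                for c in range(2)] for r in range(2)]
        lhs = mul(sig[i], sig[j])
        assert all(abs(lhs[r][c] - rhs[r][c]) < 1e-12 for r in range(2) for c in range(2))

# Identity (iii): Taylor series of exp(-i theta n.sigma/2) vs the closed form
theta = 0.7
n = [2/7, 3/7, 6/7]   # a unit vector: (2, 3, 6)/7
nsig = [[sum(n[k]*sig[k][r][c] for k in range(3)) for c in range(2)] for r in range(2)]
X = [[-1j*theta/2*nsig[r][c] for c in range(2)] for r in range(2)]

expX = [[complex(r == c) for c in range(2)] for r in range(2)]
P = [row[:] for row in expX]
for m in range(1, 30):
    P = mul(P, X)
    expX = [[expX[r][c] + P[r][c]/math.factorial(m) for c in range(2)] for r in range(2)]

closed = [[math.cos(theta/2)*(r == c) - 1j*math.sin(theta/2)*nsig[r][c]
           for c in range(2)] for r in range(2)]
assert all(abs(expX[r][c] - closed[r][c]) < 1e-9 for r in range(2) for c in range(2))
```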

8. Consider an arbitrary real 3×3 antisymmetric matrix A, whose matrix elements satisfy Aᵢⱼ = −Aⱼᵢ.

(a) Show that the matrix elements of A can be written in the form,

Aᵢⱼ = Σ_{k=1}^{3} εᵢⱼₖ aₖ ,   (70)

where the aₖ are real numbers.

Given a 3×3 antisymmetric matrix Aᵢⱼ = −Aⱼᵢ, we first observe that the diagonal elements of A must vanish. That is, Aᵢᵢ = 0 (no sum over i). Of the remaining six elements of the matrix, only three are independent. Thus, eq. (70) provides the most general form for a 3×3 antisymmetric matrix, which in matrix form is

A = \begin{pmatrix} 0 & a_3 & -a_2 \\ -a_3 & 0 & a_1 \\ a_2 & -a_1 & 0 \end{pmatrix} .   (71)

(b) Evaluate exp(A).

The exponential of a matrix is defined via its Taylor series,

e^A = Σ_{n=0}^{∞} Aⁿ/n! .   (72)

We shall compute successive powers of the matrix A. We use eq. (70), which we write using the Einstein summation convention,

Aᵢⱼ = εᵢⱼₖ aₖ .

Then,

(A²)ᵢⱼ = Aᵢₗ Aₗⱼ = εᵢₗₘ εₗⱼₙ aₘ aₙ .   (73)

We can evaluate the product of ε-symbols using the identity,

εᵢₗₘ εₗⱼₙ = −εₗᵢₘ εₗⱼₙ = −(δᵢⱼ δₘₙ − δᵢₙ δⱼₘ) .

Then, we end up with

(A²)ᵢⱼ = aᵢ aⱼ − δᵢⱼ a² ,

where we have defined

a² ≡ Σ_{i=1}^{3} aᵢ² .

Next, we evaluate

(A³)ᵢⱼ = (A²)ᵢₗ Aₗⱼ = (aᵢaₗ − δᵢₗ a²) εₗⱼₙ aₙ = εₗⱼₙ aᵢ aₗ aₙ − a² εᵢⱼₙ aₙ = −a² εᵢⱼₙ aₙ = −a² Aᵢⱼ .

Note that in this derivation, we noted that εₗⱼₙ aᵢ aₗ aₙ = 0, since εₗⱼₙ is antisymmetric under the interchange of the indices ℓ and n, whereas the product aₗaₙ is symmetric under the interchange of the indices ℓ and n. This means that when the sum over ℓ and n is performed, for every term in the sum there is another term of opposite sign. Hence, the full sum is zero due to the cancellation of positive and negative terms.

Thus, we have demonstrated that A³ = −a²A.¹³ This result immediately implies that

A^{2n+1} = (−a²)ⁿ A ,   A^{2n+2} = (−a²)ⁿ A² ,   (74)

for n = 1, 2, 3, . . .. Hence, we can sum the Taylor series given in eq. (72),

e^A = 1_{3×3} + A + A²/2! + A³/3! + A⁴/4! + · · ·

 = 1_{3×3} + A² [ 1/2! + (−a²)/4! + (−a²)²/6! + · · · ] + A [ 1 + (−a²)/3! + (−a²)²/5! + · · · ]

 = 1_{3×3} + (A²/a²) [ a²/2! − a⁴/4! + a⁶/6! + · · · ] + (A/a) [ a − a³/3! + a⁵/5! + · · · ]

 = 1_{3×3} + A² (1 − cos a)/a² + A (sin a)/a ,   (75)

where 1_{3×3} is the 3×3 identity matrix.

¹³One can also derive A³ = −a²A by explicitly computing the third power of the matrix given in eq. (71). However, the most elegant method for deriving this result is to make use of the Cayley–Hamilton theorem. Details can be found in the Appendix to these solutions (see Alternative Method 1).
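Eq. (75) can be checked against the defining Taylor series (72) for an arbitrary choice of the aₖ. A pure-Python sketch (the numerical values and helper names are our own):

```python
import math

# Arbitrary choice of (a_1, a_2, a_3) for the check.
a1, a2, a3 = 0.3, -1.1, 0.7
A = [[0.0,  a3, -a2],
     [-a3, 0.0,  a1],
     [ a2, -a1, 0.0]]                       # eq. (71)
a = math.sqrt(a1*a1 + a2*a2 + a3*a3)

def mul(X, Y):
    return [[sum(X[i][k]*Y[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

# Taylor series for e^A, eq. (72)
expA = [[float(i == j) for j in range(3)] for i in range(3)]
P = [row[:] for row in expA]
for n in range(1, 40):
    P = mul(P, A)
    expA = [[expA[i][j] + P[i][j]/math.factorial(n) for j in range(3)] for i in range(3)]

# Closed form, eq. (75)
A2 = mul(A, A)
closed = [[float(i == j) + (math.sin(a)/a)*A[i][j] + ((1 - math.cos(a))/a**2)*A2[i][j]
           for j in range(3)] for i in range(3)]

assert all(abs(expA[i][j] - closed[i][j]) < 1e-9 for i in range(3) for j in range(3))
```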

APPENDIX: Two Alternative Solutions to Problem 8(b)

ALTERNATIVE METHOD 1

We first evaluate the eigenvalues of eq. (71). The characteristic equation is given by

p(λ) = det(A − λ 1_{3×3}) = det \begin{pmatrix} -\lambda & a_3 & -a_2 \\ -a_3 & -\lambda & a_1 \\ a_2 & -a_1 & -\lambda \end{pmatrix}

 = −λ(λ² + a₁²) − a₃(a₃λ − a₁a₂) − a₂(a₃a₁ + λa₂) = −λ³ − λ(a₁² + a₂² + a₃²) .

That is,

p(λ) = −[ λ³ + a²λ ] ,   (76)

where a² ≡ a₁² + a₂² + a₃². Since A satisfies its own characteristic equation (by the Cayley–Hamilton theorem), it follows that A³ = −a²A. Hence, eqs. (74) and (75) immediately follow.
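The Cayley–Hamilton consequence A³ = −a²A is easy to confirm numerically for the matrix of eq. (71); a short sketch with an arbitrary choice of the aₖ:

```python
# Check A^3 = -a^2 A for the antisymmetric matrix of eq. (71).
a1, a2, a3 = 1.5, -0.4, 2.0          # arbitrary values for the check
A = [[0.0,  a3, -a2],
     [-a3, 0.0,  a1],
     [ a2, -a1, 0.0]]
asq = a1*a1 + a2*a2 + a3*a3          # a^2

def mul(X, Y):
    return [[sum(X[i][k]*Y[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

A3 = mul(mul(A, A), A)
assert all(abs(A3[i][j] + asq*A[i][j]) < 1e-9 for i in range(3) for j in range(3))
```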

ALTERNATIVE METHOD 2

Since A is a real antisymmetric matrix, we see that A†A = AA† = −A². That is, A is a normal matrix, which implies that it can be diagonalized by a unitary matrix U. The elements of the resulting diagonal matrix are the eigenvalues of A. Solving the characteristic equation obtained in eq. (76) yields three roots,

λ = 0 , ia , −ia .

Hence, there exists a unitary matrix U such that

U†AU = ia diag(0, 1, −1) .

It is convenient to introduce three diagonal matrices,

K ≡ diag(0, 1, −1) ,   (77)
L ≡ K² = diag(0, 1, 1) ,   (78)
M ≡ 1_{3×3} − L = diag(1, 0, 0) .   (79)

Using this notation,

U†AU = iaK .   (80)

Exponentiating eq. (80) yields

U† e^A U = U† { Σ_{n=0}^{∞} Aⁿ/n! } U = Σ_{n=0}^{∞} (U†AU)ⁿ/n! = e^{U†AU} = e^{iaK} = diag(1, e^{ia}, e^{−ia}) ,   (81)

after making repeated use of UU† = 1_{3×3}. Employing the matrices K, L and M defined above [cf. eqs. (77)–(79)] and noting that e^{±ia} = cos a ± i sin a, we can rewrite eq. (81) as

U† e^A U = M + L cos a + iK sin a .   (82)

Hence,

e^A = U[ M + L cos a + iK sin a ] U† .   (83)

To simplify eq. (83), we first note that eq. (80) implies that

A = ia UKU† .   (84)

It then follows that

A² = (ia)² (UKU†)(UKU†) = −a² UK²U† = −a² ULU† .   (85)

Finally, since eq. (85) yields ULU† = −A²/a²,

UMU† = U(1_{3×3} − L)U† = 1_{3×3} + A²/a² .   (86)

Using the results of eqs. (84)–(86) in eq. (83) yields,

e^A = 1_{3×3} + (sin a / a) A + ((1 − cos a)/a²) A² .   (87)

Thus, we have successfully rederived eq. (75). Note that in this analysis, we were able to obtain the final result without having to determine U explicitly. The columns of the matrix U consist of the eigenvectors of A. However, the eigenvectors corresponding to the eigenvalues ±ia are quite messy. Thus, it was quite fortunate that we were able to avoid writing the eigenvectors out explicitly in this derivation of eq. (87).

Reference: Nicholas J. Higham, Functions of Matrices: Theory and Computation (SIAM, the Society for Industrial and Applied Mathematics, Philadelphia, 2008), p. 373.
