
FUNCTIONAL ANALYSIS LECTURE NOTES

CHAPTER 2. OPERATORS ON HILBERT SPACES

CHRISTOPHER HEIL

1. Elementary Properties and Examples

First recall the basic definitions regarding operators.

Definition 1.1 (Continuous and Bounded Operators). Let X, Y be normed linear spaces, and let L : X → Y be a linear operator.

(a) L is continuous at a point f ∈ X if f_n → f in X implies Lf_n → Lf in Y.

(b) L is continuous if it is continuous at every point, i.e., if f_n → f in X implies Lf_n → Lf in Y for every f.

(c) L is bounded if there exists a finite K ≥ 0 such that

\[ \forall\, f \in X, \qquad \|Lf\| \le K \, \|f\|. \]

Note that ‖Lf‖ is the norm of Lf in Y, while ‖f‖ is the norm of f in X.

(d) The operator norm of L is

\[ \|L\| = \sup_{\|f\|=1} \|Lf\|. \]

(e) We let B(X, Y) denote the set of all bounded linear operators mapping X into Y, i.e.,

\[ B(X,Y) = \{ L : X \to Y \,:\, L \text{ is bounded and linear} \}. \]

If X = Y then we write B(X) = B(X, X).

(f) If Y = F then we say that L is a functional. The set of all bounded linear functionals on X is the dual space of X, and is denoted

\[ X' = B(X,\mathbb{F}) = \{ L : X \to \mathbb{F} \,:\, L \text{ is bounded and linear} \}. \]

We saw in Chapter 1 that, for a linear operator, boundedness and continuity are equivalent. Further, the operator norm is a norm on the space B(X, Y) of all bounded linear operators from X to Y, and we have the composition property that if L ∈ B(X, Y) and K ∈ B(Y, Z), then KL ∈ B(X, Z), with ‖KL‖ ≤ ‖K‖ ‖L‖.

Date: February 20, 2006.

These notes closely follow and expand on the text by John B. Conway, “A Course in Functional Analysis,”

Second Edition, Springer, 1990.


Exercise 1.2. Suppose that L : X → Y is a bounded map of a Banach space X into a Banach space Y. Prove that if there exists a c > 0 such that ‖Lf‖ ≥ c ‖f‖ for every f ∈ X, then range(L) is a closed subspace of Y.

Exercise 1.3. Let C_b(R^n) be the set of all bounded, continuous functions f : R^n → F. Let C_0(R^n) be the set of all continuous functions f : R^n → F such that lim_{|x|→∞} f(x) = 0 (i.e., for every ε > 0 there exists a compact set K such that |f(x)| < ε for all x ∉ K). Prove that these are closed subspaces of L^∞(R^n) (under the L^∞-norm; note that for a continuous function we have ‖f‖_∞ = sup |f(x)|).

Define δ : C_b(R^n) → F by

\[ \delta(f) = f(0). \]

Prove that δ is a bounded linear functional on C_b(R^n), i.e., δ ∈ (C_b)′, and find ‖δ‖. This linear functional is the delta distribution (see also Exercise 1.26 below).

Example 1.4. In finite dimensions, all linear operators are given by matrices; this is just standard finite-dimensional linear algebra.

Suppose that X is an n-dimensional complex normed vector space and Y is an m-dimensional complex normed vector space. By definition of dimension, this means that there exists a basis B_X = {x_1, . . . , x_n} for X and a basis B_Y = {y_1, . . . , y_m} for Y. If x ∈ X, then x = c_1 x_1 + · · · + c_n x_n for a unique choice of scalars c_i. Define the coordinates of x with respect to the basis B_X to be

\[ [x]_{B_X} = \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} \in \mathbb{C}^n. \]

The vector x is completely determined by its coordinates, and conversely each vector in C^n is the coordinates of a unique x ∈ X. The mapping x ↦ [x]_{B_X} is a linear mapping of X onto C^n. We similarly define [y]_{B_Y} ∈ C^m for vectors y ∈ Y.

Let A : X → Y be a linear map (it is automatically bounded since X is finite-dimensional). Then A transforms vectors x ∈ X into vectors Ax ∈ Y. The vector x is determined by its coordinates [x]_{B_X}, and likewise Ax is determined by its coordinates [Ax]_{B_Y}. The vectors x and Ax are related through the linear map A; we will show that the coordinate vectors [x]_{B_X} and [Ax]_{B_Y} are related by multiplication by an m × n matrix determined by A. We call this matrix the standard matrix of A with respect to B_X and B_Y, and denote it by [A]_{B_X,B_Y}. That is, the standard matrix should satisfy

\[ [Ax]_{B_Y} = [A]_{B_X,B_Y} \, [x]_{B_X}, \qquad x \in X. \]

We claim that the standard matrix is the matrix whose columns are the coordinates of the vectors Ax_k, i.e.,

\[ [A]_{B_X,B_Y} = \begin{bmatrix} [Ax_1]_{B_Y} & \cdots & [Ax_n]_{B_Y} \end{bmatrix}. \]


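To see this, choose any x ∈ X and let x = c_1 x_1 + · · · + c_n x_n be its unique representation with respect to the basis B_X. Then

\[
\begin{aligned}
[A]_{B_X,B_Y} \, [x]_{B_X}
&= \begin{bmatrix} [Ax_1]_{B_Y} & \cdots & [Ax_n]_{B_Y} \end{bmatrix}
   \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} \\
&= c_1 [Ax_1]_{B_Y} + \cdots + c_n [Ax_n]_{B_Y} \\
&= [c_1 Ax_1 + \cdots + c_n Ax_n]_{B_Y} \\
&= [A(c_1 x_1 + \cdots + c_n x_n)]_{B_Y} \\
&= [Ax]_{B_Y}.
\end{aligned}
\]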
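As a small illustration of the column rule in Example 1.4, the following Python sketch builds the standard matrix of the differentiation map on polynomials of degree at most 2, using the basis {1, x, x^2} for both the domain and the codomain. The choice of map and basis, and the helper names apply_D and matvec, are assumptions made for this illustration and do not come from the notes.

```python
# Hypothetical example: A = d/dx acting on polynomials c0 + c1*x + c2*x^2,
# with the basis {1, x, x^2} used for both X and Y.

def apply_D(coords):
    """Coordinates of p' given the coordinates [c0, c1, c2] of p."""
    c0, c1, c2 = coords
    return [c1, 2 * c2, 0]

def matvec(M, v):
    """Multiply the matrix M (a list of rows) by the column vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # coordinates of 1, x, x^2
columns = [apply_D(b) for b in basis]       # coordinates of A x_k for each basis vector
M = [[columns[j][i] for j in range(3)] for i in range(3)]  # store the columns as rows

x = [5, -3, 4]                              # p(x) = 5 - 3x + 4x^2
print(M)
print(matvec(M, x), apply_D(x))             # both are the coordinates of p'
```

The matrix is assembled one column at a time from the images of the basis vectors, which is exactly the claim being proved; the final line compares matrix multiplication on coordinates with applying the map directly.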

Exercise 1.5. Extend the idea of the preceding example to show that any linear mapping L : ℓ^2(N) → ℓ^2(N) (and more generally, L : H → K with H, K separable) can be realized in terms of multiplication by an (infinite but countable) matrix.

Exercise 1.6. Let A be an m × n complex matrix, which we view as a linear transformation A : C^n → C^m. The operator norm of A depends on the choice of norm for C^n and C^m. Compute an explicit formula for ‖A‖, in terms of the entries of A, when the norm on C^n and C^m is taken to be the ℓ^1 norm. Then do the same for the ℓ^∞ norm. Compare your results to the version of Schur’s Lemma given in Theorem 1.23.

The following example is one that we will return to many times.

Example 1.7. Let {e_n}_{n∈N} be an orthonormal basis for a separable Hilbert space H. Then we know that every f ∈ H can be written

\[ f = \sum_{n=1}^{\infty} \langle f, e_n \rangle \, e_n. \]

Fix any sequence of scalars λ = (λ_n)_{n∈N}, and formally define

\[ Lf = \sum_{n=1}^{\infty} \lambda_n \, \langle f, e_n \rangle \, e_n. \tag{1.1} \]

This is a “formal” definition because we do not know a priori that the series above will converge; in other words, equation (1.1) may not make sense for every f.

Note that if H = ℓ^2(N) and {e_n}_{n∈N} is the standard basis, then L is given by the formula

\[ Lx = (\lambda_1 x_1, \lambda_2 x_2, \dots), \qquad x = (x_1, x_2, \dots) \in \ell^2(\mathbb{N}). \]

We will show the following (the ℓ^∞-norm of the sequence λ is ‖λ‖_∞ = sup_n |λ_n|).

(a) The series defining Lf in (1.1) converges for each f ∈ H if and only if λ ∈ ℓ^∞. In this case L is a bounded linear mapping of H into itself, and ‖L‖ = ‖λ‖_∞.


(b) If λ ∉ ℓ^∞, then L defines an unbounded linear mapping from the domain

\[ \operatorname{domain}(L) = \Bigl\{ f \in H \,:\, \sum_{n=1}^{\infty} |\lambda_n \langle f, e_n \rangle|^2 < \infty \Bigr\} \tag{1.2} \]

(which is dense in H) into H.

Proof. (a) Suppose that λ ∈ ℓ^∞, i.e., λ is a bounded sequence. Then for any f we have

\[ \sum_{n=1}^{\infty} |\lambda_n \langle f, e_n \rangle|^2 \le \sum_{n=1}^{\infty} \|\lambda\|_\infty^2 \, |\langle f, e_n \rangle|^2 = \|\lambda\|_\infty^2 \, \|f\|^2 < \infty, \]

so the series defining Lf converges (because {e_n} is an orthonormal sequence). Moreover, the preceding calculation also shows that ‖Lf‖^2 = Σ_{n=1}^∞ |λ_n 〈f, e_n〉|^2 ≤ ‖λ‖_∞^2 ‖f‖^2, so we see that ‖L‖ ≤ ‖λ‖_∞. On the other hand, by orthonormality we have Le_n = λ_n e_n (i.e., each e_n is an eigenvector for L with eigenvalue λ_n). Since ‖e_n‖ = 1 and ‖Le_n‖ = |λ_n| ‖e_n‖ = |λ_n| we conclude that

\[ \|L\| = \sup_{\|f\|=1} \|Lf\| \ge \sup_{n\in\mathbb{N}} \|Le_n\| = \sup_{n\in\mathbb{N}} |\lambda_n| = \|\lambda\|_\infty. \]

The converse direction will be covered by the proof of part (b).

(b) Suppose that λ ∉ ℓ^∞, i.e., λ is not a bounded sequence. Then we can find a subsequence (λ_{n_k})_{k∈N} such that |λ_{n_k}| ≥ k for each k. Let c_{n_k} = 1/k and define all other c_n to be zero. Then Σ_n |c_n|^2 = Σ_k 1/k^2 < ∞, so f = Σ_n c_n e_n converges (and c_n = 〈f, e_n〉). But the formal series Lf = Σ_n λ_n c_n e_n does not converge, because

\[ \sum_{n=1}^{\infty} |c_n \lambda_n|^2 = \sum_{k=1}^{\infty} |c_{n_k} \lambda_{n_k}|^2 \ge \sum_{k=1}^{\infty} \frac{k^2}{k^2} = \infty. \]

In fact, the series defining Lf in (1.1) only converges for those f which lie in the domain defined in (1.2). That domain is dense because it contains the finite span of {e_n}_{n∈N}, which we know is dense in H. Further, that domain is a subspace of H (exercise), so it is an inner-product space. The map L : domain(L) → H is a well-defined, linear map, so it remains only to show that it is unbounded. This follows from the facts that e_n ∈ domain(L), ‖e_n‖ = 1, and ‖Le_n‖ = |λ_n| ‖e_n‖ = |λ_n|, together with the hypothesis that sup_n |λ_n| = ∞.
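The divergence argument in part (b) can be watched numerically. The sketch below assumes the specific choice λ_n = n (so that n_k = k) and c_n = 1/n; these values and the helper name partial_norms are illustrative assumptions, not part of the notes. It prints the partial sums of Σ |c_n|^2 and of Σ |λ_n c_n|^2.

```python
import math

def partial_norms(N):
    """Return (sum |c_n|^2, sum |lambda_n c_n|^2) over n = 1..N
    for the assumed choice lambda_n = n and c_n = 1/n."""
    norm_f_sq = sum((1.0 / n) ** 2 for n in range(1, N + 1))
    norm_Lf_sq = sum((n * (1.0 / n)) ** 2 for n in range(1, N + 1))
    return norm_f_sq, norm_Lf_sq

for N in (10, 100, 1000):
    f_sq, Lf_sq = partial_norms(N)
    print(N, f_sq, Lf_sq)

# The first sum stays below pi^2 / 6, while the second grows like N.
print(math.pi ** 2 / 6)
```

The first column of sums is bounded, so f lies in H, while the second grows without bound, so this f is not in domain(L).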

Exercise 1.8. Continuing Example 1.7, suppose that λ ∈ ℓ^∞ and set δ = inf_n |λ_n|. Prove the following.

(a) L is injective if and only if λ_n ≠ 0 for every n.

(b) L is surjective if and only if δ > 0 (if δ = 0, use an argument similar to the one used in part (b) of Example 1.7 to show that range(L) is a proper subset of H).

(c) If δ = 0 but λ_n ≠ 0 for every n then range(L) is a dense but proper subspace of H.

(d) Prove that L is unitary if and only if |λ_n| = 1 for every n.


In Example 1.7, we saw an unbounded operator whose domain was a dense but proper subspace of H. This situation is typical for unbounded operators, and we often write L : X → Y even when L is only defined on a subset of X, as in the following example.

Example 1.9 (Differentiation). Consider the Hilbert space H = L^2(0, 1), and define an operator D : L^2(0, 1) → L^2(0, 1) by Df = f′. Implicitly, we mean by this that D is defined on the largest domain that makes sense, namely,

\[ \operatorname{domain}(D) = \{ f \in L^2(0,1) \,:\, f \text{ is differentiable and } f' \in L^2(0,1) \}. \]

Note that if f ∈ domain(D), then Df is well-defined, Df ∈ L^2(0, 1), and ‖Df‖_2 < ∞. Thus every vector in domain(D) maps to a vector in L^2(0, 1) which necessarily has finite norm. Yet D is unbounded. For example, if we set e_n(x) = e^{inx} then ‖e_n‖_2 = 1, but De_n(x) = e_n′(x) = i n e^{inx} so ‖De_n‖_2 = n. While each vector De_n has finite norm, there is no upper bound to these norms. Since the e_n are unit vectors, we conclude that ‖D‖ = ∞.
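The growth of ‖De_n‖_2 in Example 1.9 can be checked with a crude quadrature. The sketch below approximates the L^2(0, 1) norm by a midpoint rule; the helper names l2_norm, e, and De and the number of sample points are assumptions made for this illustration.

```python
import cmath
import math

def l2_norm(g, samples=2000):
    """Midpoint-rule approximation of the L^2(0,1) norm of g."""
    h = 1.0 / samples
    total = sum(abs(g((j + 0.5) * h)) ** 2 for j in range(samples))
    return math.sqrt(total * h)

def e(n):
    """The function e_n(x) = exp(i n x)."""
    return lambda x: cmath.exp(1j * n * x)

def De(n):
    """Its derivative, i n exp(i n x)."""
    return lambda x: 1j * n * cmath.exp(1j * n * x)

for n in (1, 10, 100):
    print(n, l2_norm(e(n)), l2_norm(De(n)))
```

Each e_n has norm (approximately) 1 while the norm of De_n grows like n, which is the unboundedness claimed in the example.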

The following definitions recall the basic notions of measures and measure spaces. For full details, consult a book on real analysis.¹

Definition 1.10 (σ-Algebras, Measurable Sets and Functions). Let X be a set, and let Ω be a collection of subsets of X. Then Ω is a σ-algebra if

(a) X ∈ Ω,

(b) If E ∈ Ω then X \ E ∈ Ω (i.e., Ω is closed under complements),

(c) If E_1, E_2, · · · ∈ Ω then ⋃ E_k ∈ Ω (i.e., Ω is closed under countable unions).

The elements of Ω are called the measurable subsets of X.

If we choose F = R then we usually allow functions on X to take extended-real values, i.e., f(x) is allowed to take the values ±∞. An extended-real-valued function f : X → [−∞, ∞] is called a measurable function if {x ∈ X : f(x) > a} is measurable for each a ∈ R.

If we choose F = C then we require functions on X to take (finite) complex values; there is no complex analogue of ±∞. A complex-valued function f : X → C is called a measurable function if its real and imaginary parts are measurable (real-valued) functions.

Definition 1.11 (Measure Space). Let X be a set and Ω a σ-algebra of subsets of X. Then a function µ on Ω is a (positive) measure if

(a) 0 ≤ µ(E) ≤ +∞ for all E ∈ Ω,

(b) If E_1, E_2, . . . is a countable family of disjoint sets in Ω, then

\[ \mu\Bigl( \bigcup_{k=1}^{\infty} E_k \Bigr) = \sum_{k=1}^{\infty} \mu(E_k). \]

¹ For example, R. Wheeden and A. Zygmund, “Measure and Integral,” Marcel Dekker, 1977, or G. Folland, “Real Analysis,” Second Edition, Wiley, 1999.


In this case, (X, Ω, µ) is called a measure space.

If µ(X) < ∞, then we say that µ is a finite measure.

If there exist countably many subsets E_1, E_2, . . . such that X = ⋃ E_k and µ(E_k) < ∞ for all k, then we say that µ is σ-finite. For example, Lebesgue measure on R^n is σ-finite.

It is often useful to allow measures to take negative values.

Definition 1.12 (Signed Measure). Let X be a set and Ω a σ-algebra of subsets of X. Then a function µ on Ω is a signed measure if

(a) −∞ ≤ µ(E) ≤ +∞ for all E ∈ Ω and µ(∅) = 0,

(b) If E_1, E_2, . . . is a countable family of disjoint sets in Ω, then

\[ \mu\Bigl( \bigcup_{k=1}^{\infty} E_k \Bigr) = \sum_{k=1}^{\infty} \mu(E_k). \]

Definition 1.13 (Integration). Let (X, Ω, µ) be a measure space.

(a) If f : X → [0, ∞] is a nonnegative, measurable function, then the integral of f over X with respect to µ is

\[ \int_X f \, d\mu = \int_X f(x) \, d\mu(x) = \sup \Bigl\{ \sum_j \Bigl( \inf_{x \in E_j} f(x) \Bigr) \, \mu(E_j) \Bigr\}, \]

where the supremum is taken over all decompositions X = E_1 ∪ · · · ∪ E_N of X as the union of a finite number of disjoint measurable sets E_j (and where we take the convention that ∞ · 0 = 0 · ∞ = 0).

(b) If f : X → [−∞, ∞] and we define

\[ f^+(x) = \max\{f(x), 0\}, \qquad f^-(x) = -\min\{f(x), 0\}, \]

then

\[ \int_X f \, d\mu = \int_X f^+ \, d\mu - \int_X f^- \, d\mu, \]

as long as this does not have the form ∞ − ∞ (in that case the integral would be undefined). Since |f| = f^+ + f^- and ∫_X |f| dµ always exists (either as a finite number or as ∞), it follows that

\[ \int_X f \, d\mu \ \text{exists and is finite} \iff \int_X |f| \, d\mu < \infty. \]

(c) If f : X → C, then

\[ \int_X f \, d\mu = \int_X \operatorname{Re}(f) \, d\mu + i \int_X \operatorname{Im}(f) \, d\mu, \]

as long as both integrals on the right are defined and finite.

There are many other equivalent definitions of the integral.


Definition 1.14 (L^p Spaces). Let (X, Ω, µ) be a measure space, and fix 1 ≤ p < ∞. Then L^p(X) consists of all measurable functions f : X → [−∞, ∞] (if we choose F = R) or f : X → C (if we choose F = C) such that

\[ \|f\|_p^p = \int_X |f(x)|^p \, d\mu(x) < \infty. \]

Then L^p(X) is a vector space under the operations of addition of functions and multiplication of a function by a scalar. Additionally, the function ‖ · ‖_p defines a semi-norm on L^p(X). Usually we identify functions that are equal almost everywhere (we say that f = g a.e. if µ{x ∈ X : f(x) ≠ g(x)} = 0), and then ‖ · ‖_p becomes a norm on L^p(X).

For p = ∞ we define L^∞(X) to be the set of measurable functions that are essentially bounded, i.e., for which there exists a finite constant M such that |f(x)| ≤ M a.e. Then

\[ \|f\|_\infty = \operatorname*{ess\,sup}_{x \in X} |f(x)| = \inf \{ M \ge 0 \,:\, |f(x)| \le M \text{ a.e.} \} \]

is a semi-norm on L^∞(X), and is a norm if we identify functions that are equal almost everywhere.

For each 1 ≤ p ≤ ∞, the space L^p(X) is a Banach space under the above norm.

Exercise 1.15 (ℓ^p Spaces). Counting measure on a set X is defined by µ(E) = card(E) if E is a finite subset of X, and µ(E) = ∞ if E is an infinite subset. Let Ω = P(X) (the set of all subsets of X), and show that (X, Ω, µ) is a measure space. Show that L^p(X, Ω, µ) = ℓ^p(X). Show that µ is σ-finite if and only if X is countable.

Exercise 1.16 (The Delta Measure). Let X = R^n and Ω = P(X). Define δ(E) = 1 if 0 ∈ E and δ(E) = 0 if 0 ∉ E. Prove that δ is a measure, and find a formula for

\[ \int_{\mathbb{R}^n} f(x) \, d\delta(x). \]

Sometimes this integral is written informally as ∫_{R^n} f(x) δ(x) dx, but note that δ is a measure on R^n, not a function on R^n (see also Exercise 1.26 below).

Exercise 1.17. Fix 0 ≤ g ∈ L^1(R^n), under Lebesgue measure. Prove that µ(E) = ∫_E g(x) dx defines a finite measure on R^n.

With this preparation, we can give some additional examples of operators on Banach or Hilbert spaces.

Example 1.18 (Multiplication Operators). Let (X, Ω, µ) be a measure space, and let φ ∈ L^∞(X) be a fixed measurable function. Then for any f ∈ L^2(X) we have that fφ is measurable, and

\[ \|f\phi\|_2^2 = \int_X |f(x)\phi(x)|^2 \, d\mu(x) \le \int_X |f(x)|^2 \, \|\phi\|_\infty^2 \, d\mu(x) = \|\phi\|_\infty^2 \, \|f\|_2^2 < \infty, \]

so fφ ∈ L^2(X). Therefore, the multiplication operator M_φ : L^2(X) → L^2(X) given by M_φ f = fφ is well-defined, and the calculation above shows that ‖M_φ f‖_2 ≤ ‖φ‖_∞ ‖f‖_2. Therefore M_φ is bounded, and ‖M_φ‖ ≤ ‖φ‖_∞.

If we assume that µ is σ-finite, then we can show that ‖M_φ‖ = ‖φ‖_∞, as follows. Choose any ε > 0. Then by definition of L^∞-norm, the set E = {x ∈ X : |φ(x)| > ‖φ‖_∞ − ε} has positive measure. Since X is σ-finite, we can write X = ∪ F_m where each µ(F_m) < ∞. Since E = ∪ (E ∩ F_m) is a countable union, we must have µ(E ∩ F_m) > 0 for some m. Let F = E ∩ F_m, and set f = µ(F)^{−1/2} χ_F. Then ‖f‖_2 = 1, but ‖M_φ f‖_2 ≥ (‖φ‖_∞ − ε) ‖f‖_2. Hence ‖M_φ‖ ≥ ‖φ‖_∞ − ε. Since ε > 0 was arbitrary, ‖M_φ‖ ≥ ‖φ‖_∞.

Exercise: Find an example of a measure µ that is not σ-finite and a function φ such that ‖M_φ‖ < ‖φ‖_∞.

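Exercise 1.19. Let (X, Ω, µ) be a measure space, and let φ be a fixed measurable function. Prove that if fφ ∈ L^2(X) for every f ∈ L^2(X), then we must have φ ∈ L^∞(X).

Solution. Assume φ ∉ L^∞(X). Set

\[ E_k = \{ x \in X \,:\, k \le |\phi(x)| < k + 1 \}. \]

The E_k are measurable and disjoint, and since φ is not in L^∞(X) there must be infinitely many E_k with positive measure. Choose any E_{n_k}, k ∈ N, all with positive measure and with n_k ≥ k (possible because infinitely many E_k have positive measure), and let E = ∪ E_{n_k}. Define

\[ f(x) = \begin{cases} \dfrac{1}{k \, \mu(E_{n_k})^{1/2}}, & x \in E_{n_k}, \\[1ex] 0, & x \notin E. \end{cases} \]

Then

\[ \int_X |f|^2 \, d\mu = \sum_{k=1}^{\infty} \int_{E_{n_k}} \frac{1}{k^2 \, \mu(E_{n_k})} \, d\mu = \sum_{k=1}^{\infty} \frac{1}{k^2} < \infty, \]

but, since |φ| ≥ n_k ≥ k on E_{n_k},

\[ \int_X |f\phi|^2 \, d\mu \ge \sum_{k=1}^{\infty} \int_{E_{n_k}} \frac{k^2}{k^2 \, \mu(E_{n_k})} \, d\mu = \sum_{k=1}^{\infty} 1 = \infty, \]

which is a contradiction.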

Exercise 1.20. Continuing Example 1.18, do the following.

(a) Determine a necessary and sufficient condition on φ which implies that M_φ : L^2(X) → L^2(X) is injective.

(b) Determine a necessary and sufficient condition on φ which implies that M_φ : L^2(X) → L^2(X) is surjective.

(c) Prove that if M_φ is injective but not surjective then M_φ^{−1} : range(M_φ) → L^2(X) is unbounded.

(d) Extend from the case p = 2 to any 1 ≤ p ≤ ∞.


Example 1.21 (Integral Operators). Let (X, Ω, µ) be a σ-finite measure space. An integral operator is an operator of the form

\[ Lf(x) = \int_X k(x, y) \, f(y) \, d\mu(y). \tag{1.3} \]

This is just a formal definition; we have to provide conditions under which this makes sense, and the following two theorems will provide such conditions. The function k that determines the operator is called the kernel of the operator (not to be confused with the kernel/nullspace of the operator!).

Note that an integral operator is just a generalization of matrix multiplication. For, if A is an m × n matrix with entries a_{ij} and u ∈ C^n, then Au ∈ C^m, and its components are given by

\[ (Au)_i = \sum_{j=1}^{n} a_{ij} \, u_j, \qquad i = 1, \dots, m. \]

Thus, the values k(x, y) are analogous to the entries a_{ij} of the matrix A, and the values Lf(x) are analogous to the entries (Au)_i.
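The analogy with matrix multiplication can be made concrete by discretizing (1.3) on [0, 1] with a midpoint rule, so that the kernel samples k(x_i, y_j) times the step size play the role of the entries a_{ij}. The sketch below is a rough illustration under stated assumptions: the grid size, the helper name apply_integral_operator, and the test kernel (k(x, y) = 1 for y ≤ x and 0 otherwise, applied to f ≡ 1, for which Lf(x) = x) are all choices made here.

```python
def apply_integral_operator(kernel, f, n=200):
    """Approximate Lf(x) = integral over [0,1] of kernel(x, y) f(y) dy on a midpoint grid."""
    h = 1.0 / n
    grid = [(j + 0.5) * h for j in range(n)]
    # The "matrix" of the operator: entry (i, j) is kernel(x_i, y_j) * h.
    K = [[kernel(x, y) * h for y in grid] for x in grid]
    u = [f(y) for y in grid]
    Lf = [sum(K[i][j] * u[j] for j in range(n)) for i in range(n)]
    return grid, Lf

# Assumed test case: k(x, y) = 1 if y <= x, else 0, and f = 1, so that Lf(x) = x.
grid, Lf = apply_integral_operator(lambda x, y: 1.0 if y <= x else 0.0,
                                   lambda y: 1.0)
print(max(abs(a - b) for a, b in zip(Lf, grid)))   # discretization error
```

The printed error is of the order of the step size, because the kernel jumps across the diagonal; a smooth kernel would be approximated more accurately by the same rule.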

The following result shows that if the kernel is square-integrable, then the corresponding integral operator is bounded. Later we will define the notion of a Hilbert–Schmidt operator. For the case of integral operators mapping L^2(X) into itself, it can be shown that L is a Hilbert–Schmidt operator if and only if the kernel k belongs to L^2(X × X).

Theorem 1.22 (Hilbert–Schmidt Integral Operators). Let (X, Ω, µ) be a σ-finite measure space, and choose a kernel k ∈ L^2(X × X). That is, assume that

\[ \|k\|_2^2 = \int_X \int_X |k(x, y)|^2 \, d\mu(x) \, d\mu(y) < \infty. \]

Then the integral operator given by (1.3) defines a bounded mapping of L^2(X) into itself, and ‖L‖ ≤ ‖k‖_2.

Proof. Although a slight abuse of the order of logic (technically we should show Lf exists before trying to compute its norm), the following calculation shows that L is well-defined and is a bounded mapping of L^2(X) into itself:

\[
\begin{aligned}
\|Lf\|_2^2 &= \int_X |Lf(x)|^2 \, d\mu(x) \\
&= \int_X \Bigl| \int_X k(x, y) \, f(y) \, d\mu(y) \Bigr|^2 \, d\mu(x) \\
&\le \int_X \Bigl( \int_X |k(x, y)|^2 \, d\mu(y) \Bigr) \Bigl( \int_X |f(y)|^2 \, d\mu(y) \Bigr) \, d\mu(x) \\
&= \int_X \int_X |k(x, y)|^2 \, d\mu(y) \, \|f\|_2^2 \, d\mu(x) \\
&= \|k\|_2^2 \, \|f\|_2^2,
\end{aligned}
\]

where the inequality follows by applying Cauchy–Schwarz to the inner integral. Thus L is bounded, and ‖L‖ ≤ ‖k‖_2.
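For a finite set with counting measure, the kernel is a matrix and ‖k‖_2 is its entrywise square-sum norm, so the bound of Theorem 1.22 becomes ‖Au‖_2 ≤ ‖A‖_HS ‖u‖_2. The following sketch checks this on one small matrix; the matrix, the test vectors, and the helper names are assumptions chosen for illustration.

```python
import math

def matvec(A, u):
    """Matrix-vector product, the discrete form of the integral operator (1.3)."""
    return [sum(a * t for a, t in zip(row, u)) for row in A]

def norm2(v):
    """Euclidean (l^2) norm of a vector."""
    return math.sqrt(sum(abs(t) ** 2 for t in v))

def hilbert_schmidt_norm(A):
    """Square root of the sum of |a_ij|^2, the discrete analogue of ||k||_2."""
    return math.sqrt(sum(abs(a) ** 2 for row in A for a in row))

A = [[1.0, 2.0], [3.0, 4.0]]                 # hypothetical kernel on a 2-point set
for u in ([1.0, 0.0], [1.0, -1.0], [0.6, 0.8]):
    print(norm2(matvec(A, u)), hilbert_schmidt_norm(A) * norm2(u))
```

In each printed pair the first number does not exceed the second. The bound is generally not sharp: the operator norm can be strictly smaller than the Hilbert–Schmidt norm.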

The following result is one version of Schur’s Lemma. There are many forms of Schur’s Lemma; this is one particular special case.

Exercise: Compare the hypotheses of the following result to the operator norms you calculated in Exercise 1.6.

Theorem 1.23. Let (X, Ω, µ) be a σ-finite measure space, and assume that k is a measurable function on X × X which satisfies the “mixed-norm” conditions

\[ C_1 = \operatorname*{ess\,sup}_{x \in X} \int_X |k(x, y)| \, d\mu(y) < \infty \qquad \text{and} \qquad C_2 = \operatorname*{ess\,sup}_{y \in X} \int_X |k(x, y)| \, d\mu(x) < \infty. \]

Then the integral operator given by (1.3) defines a bounded mapping of L^2(X) into itself, and ‖L‖ ≤ (C_1 C_2)^{1/2}.

Proof. Choose any f ∈ L^2(X). Then, by applying the Cauchy–Schwarz inequality, we have

\[
\begin{aligned}
\|Lf\|_2^2 &= \int_X |Lf(x)|^2 \, d\mu(x) \\
&= \int_X \Bigl| \int_X k(x, y) \, f(y) \, d\mu(y) \Bigr|^2 \, d\mu(x) \\
&\le \int_X \Bigl( \int_X |k(x, y)|^{1/2} \bigl( |k(x, y)|^{1/2} \, |f(y)| \bigr) \, d\mu(y) \Bigr)^2 \, d\mu(x) \\
&\le \int_X \Bigl( \int_X |k(x, y)| \, d\mu(y) \Bigr) \Bigl( \int_X |k(x, y)| \, |f(y)|^2 \, d\mu(y) \Bigr) \, d\mu(x) \\
&\le \int_X C_1 \int_X |k(x, y)| \, |f(y)|^2 \, d\mu(y) \, d\mu(x) \\
&= C_1 \int_X |f(y)|^2 \int_X |k(x, y)| \, d\mu(x) \, d\mu(y) \\
&\le C_1 \int_X |f(y)|^2 \, C_2 \, d\mu(y) \\
&= C_1 C_2 \, \|f\|_2^2,
\end{aligned}
\]

where we have used Tonelli’s Theorem to interchange the order of integration (here is where we needed the fact that µ is σ-finite). Thus L is bounded and ‖L‖ ≤ (C_1 C_2)^{1/2}.
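In the finite setting with counting measure, C_1 is the largest absolute row sum of a matrix and C_2 is the largest absolute column sum, so Theorem 1.23 reads ‖A‖ ≤ (C_1 C_2)^{1/2} for the ℓ^2 operator norm. The sketch below estimates the operator norm of one 2 × 2 matrix by sampling unit vectors and compares it with the Schur bound; the matrix and the sampling resolution are assumptions made for this illustration.

```python
import math

A = [[1.0, 2.0], [3.0, 4.0]]   # hypothetical kernel on a 2-point set

# C1: sup over rows x of sum_y |k(x, y)|;  C2: sup over columns y of sum_x |k(x, y)|.
C1 = max(sum(abs(a) for a in row) for row in A)
C2 = max(sum(abs(A[i][j]) for i in range(2)) for j in range(2))
schur_bound = math.sqrt(C1 * C2)

def norm_Au(t):
    """l^2 norm of A u for the real unit vector u = (cos t, sin t)."""
    u = (math.cos(t), math.sin(t))
    Au = [A[i][0] * u[0] + A[i][1] * u[1] for i in range(2)]
    return math.hypot(Au[0], Au[1])

# Sampled estimate of the operator norm (a lower estimate of the true value).
largest = max(norm_Au(k * math.pi / 1000) for k in range(1000))
print(C1, C2, schur_bound, largest)
```

For this matrix the sampled operator norm is well below the Schur bound, consistent with the theorem giving an upper bound rather than an exact value.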


Exercise 1.24. Consider what happens in the preceding example if we take 1 ≤ p ≤ ∞ instead of p = 2. In particular, under the hypotheses of Theorem 1.23, show that if C_1, C_2 < ∞ then L : L^p(X) → L^p(X) is a bounded mapping for each 1 ≤ p ≤ ∞ (try to do p = 1 or p = ∞ first).

Exercise 1.25 (Volterra Operator). Define L : L^2[0, 1] → L^2[0, 1] by

\[ Lf(x) = \int_0^x f(y) \, dy. \]

Show directly that L is bounded. Then show that L is an integral operator with kernel k : [0, 1]^2 → F defined by

\[ k(x, y) = \begin{cases} 1, & y \le x, \\ 0, & y > x. \end{cases} \]

Observe that k ∈ L^2([0, 1]^2), so L is compact. This operator is called the Volterra operator.

Exercise 1.26 (Convolution). Convolution is one of the most important examples of integral operators. Consider the case of Lebesgue measure on R^n. Given functions f, g on R^n, their convolution is the function f ∗ g defined by

\[ (f * g)(x) = \int_{\mathbb{R}^n} f(y) \, g(x - y) \, dy, \]

provided that the integral makes sense. Note that with g fixed, the mapping f ↦ f ∗ g is an integral operator with kernel k(x, y) = g(x − y).

(a) Let g ∈ L^1(R^n) be fixed. Use Schur’s Lemma (Theorem 1.23) to show that Lf = f ∗ g is a bounded mapping of L^2(R^n) into itself. In fact, use Exercise 1.24 to prove Young’s Inequality: If f ∈ L^p(R^n) (1 ≤ p ≤ ∞) and g ∈ L^1(R^n), then f ∗ g ∈ L^p(R^n), and

\[ \|f * g\|_p \le \|f\|_p \, \|g\|_1. \]

In particular, L^1(R^n) is closed under convolution.

(b) Note that we cannot use the Hilbert–Schmidt condition (Theorem 1.22) to prove Young’s Inequality, since

\[ \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} |g(x - y)|^2 \, dx \, dy = \infty, \]

even if we assume that g ∈ L^2(R^n).

(c) Prove that convolution is commutative, i.e., that f ∗ g = g ∗ f.

(d) Prove that there is no identity element in L^1(R^n), i.e., there is no function g ∈ L^1(R^n) such that f ∗ g = f for all f ∈ L^1(R^n). This is not trivial; it is easier to do if you make use of the Fourier transform on R^n, and in particular use the Riemann–Lebesgue Lemma to derive a contradiction.


(e) Some texts do talk informally about a “delta function” that is an identity element for convolution, defined by the conditions

\[ \delta(x) = \begin{cases} \infty, & x = 0, \\ 0, & x \ne 0, \end{cases} \qquad \text{and} \qquad \int_{\mathbb{R}^n} \delta(x) \, dx = 1, \]

but no such function actually exists. In particular, the function δ defined on the left-hand side of the line above is equal to zero a.e., and hence is the zero function as far as Lebesgue integration is concerned. That is, we have ∫_{R^n} δ(x) dx = 0, not 1. The “delta function” is really just an informal use of the delta distribution (see Exercise 1.3) or the delta measure (see Exercise 1.16). Show that if we define the convolution of a function f with the delta measure δ to be

\[ (f * \delta)(x) = \int_{\mathbb{R}^n} f(x - y) \, d\delta(y), \tag{1.4} \]

then f ∗ δ = f for all f ∈ L^1(R^n). Note that in the “informal” notation of Exercise 1.16, (1.4) reads

\[ (f * \delta)(x) = \int_{\mathbb{R}^n} f(x - y) \, \delta(y) \, dy, \]

which perhaps explains the use of the term “delta function.”

Exercise 1.27. Prove that L^1(R^n) is not closed under pointwise multiplication. That is, prove that there exist f, g ∈ L^1(R^n) such that the pointwise product h(x) = (fg)(x) = f(x)g(x) does not belong to L^1(R^n).

Exercise 1.28 (Convolution Continued). (a) Consider the space L^p[0, 1], where we think of functions in L^p[0, 1] as being extended 1-periodically to the real line. Define convolution on the circle by

\[ (f * g)(x) = \int_0^1 f(y) \, g(x - y) \, dy, \]

where the periodicity is used to define g(x − y) when x − y lies outside [0, 1] (equivalently, replace x − y by x − y mod 1, the fractional part of x − y). Prove a version of Young’s Inequality for L^p[0, 1].

(b) Consider the sequence space ℓ^p(Z). Define convolution on Z by

\[ (x * y)_n = \sum_{m \in \mathbb{Z}} x_m \, y_{n-m}. \]

Prove a version of Young’s Inequality for ℓ^p(Z).

Prove that ℓ^1(Z) contains an identity element with respect to convolution, i.e., there exists a sequence in ℓ^1(Z) (typically denoted δ) such that δ ∗ x = x for every x ∈ ℓ^p(Z).

(c) Identify the essential features needed to define convolution on more general domains, and prove a version of Young’s Inequality for that setting.
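For experimenting with part (b), here is a minimal Python sketch of convolution on Z for finitely supported sequences, stored as dictionaries mapping an index to a value. The representation, the sample sequences, and the helper names convolve and norm_p are assumptions made for this illustration; the sketch checks single instances and proves nothing.

```python
def convolve(x, y):
    """(x * y)_n = sum over m of x_m * y_{n-m}, for finitely supported sequences."""
    out = {}
    for m, xm in x.items():
        for j, yj in y.items():
            out[m + j] = out.get(m + j, 0) + xm * yj   # the term x_m y_j lands at n = m + j
    return out

def norm_p(x, p):
    """l^p norm of a finitely supported sequence, for finite p."""
    return sum(abs(v) ** p for v in x.values()) ** (1.0 / p)

delta = {0: 1}                 # candidate identity: 1 at n = 0, zero elsewhere
x = {-1: 2, 0: -1, 3: 4}
y = {0: 1, 1: 1}

print(convolve(delta, x) == x)                     # delta * x = x
print(convolve(x, y) == convolve(y, x))            # commutativity on this instance
print(norm_p(convolve(x, y), 2), norm_p(x, 2) * norm_p(y, 1))
```

The last line prints the two sides of the Young-type inequality ‖x ∗ y‖_2 ≤ ‖x‖_2 ‖y‖_1 for this particular pair of sequences.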


Exercise 1.29 (Convolution and the Fourier Transform). Let ℱ be the Fourier transform on the circle, i.e., it is the isomorphism ℱ : L^2[0, 1] → ℓ^2(Z) given by ℱf = f̂ = {f̂(n)}_{n∈Z}, where

\[ \hat{f}(n) = \langle f, e_n \rangle = \int_0^1 f(x) \, e^{-2\pi i n x} \, dx, \qquad e_n(x) = e^{2\pi i n x}. \]

(a) Prove that the Fourier transform converts convolution into multiplication. That is, prove that if f, g ∈ L^2[0, 1], then (f ∗ g)^∧ = f̂ ĝ, i.e.,

\[ (f * g)^{\wedge}(n) = \hat{f}(n) \, \hat{g}(n), \qquad n \in \mathbb{Z}. \]

(b) Note that if g ∈ L^2[0, 1], then g ∈ L^1[0, 1], so by Young’s Inequality we have that f ∗ g ∈ L^2[0, 1]. Holding g fixed, define an operator L : L^2[0, 1] → L^2[0, 1] by Lf = f ∗ g. Since {e_n}_{n∈Z} is an orthonormal basis for L^2[0, 1], we have

\[ f = \sum_{n \in \mathbb{Z}} \hat{f}(n) \, e_n, \qquad f \in L^2[0, 1]. \]

Show that

\[ Lf = f * g = \sum_{n \in \mathbb{Z}} \hat{g}(n) \, \hat{f}(n) \, e_n, \qquad f \in L^2[0, 1]. \]

Thus, in the “Fourier domain,” convolution acts by changing or adjusting the amount that each “component” or “frequency” e_n contributes to the representation of the function in this basis: the weight f̂(n) for frequency n is replaced by the weight ĝ(n) f̂(n). Explain why this says that L is analogous to multiplication by a diagonal operator. In engineering parlance, convolution is also referred to as filtering. Explain why this terminology is appropriate. Compare this operator L to Example 1.7.

2. The Adjoint of an Operator

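Example 2.1. Note that the dot product on R^n is given by x · y = x^T y, while the dot product on C^n is x · y = x^T ȳ, where ȳ denotes the entrywise complex conjugate of y.

Let A be an m × n real matrix. Then x ↦ Ax defines a linear map of R^n into R^m, and its transpose A^T satisfies

\[ \forall\, x \in \mathbb{R}^n, \ \forall\, y \in \mathbb{R}^m, \qquad Ax \cdot y = (Ax)^T y = x^T A^T y = x \cdot (A^T y). \]

Similarly, if A is an m × n complex matrix, then its Hermitian or adjoint matrix A^H (the conjugate transpose of A) satisfies

\[ \forall\, x \in \mathbb{C}^n, \ \forall\, y \in \mathbb{C}^m, \qquad Ax \cdot y = (Ax)^T \bar{y} = x^T A^T \bar{y} = x \cdot (A^H y), \qquad A^H = \overline{A^T}. \]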
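The identity Ax · y = x · (A^H y) can be spot-checked numerically. The sketch below uses Python's built-in complex numbers; the particular matrix and vectors, and the helper names dot, matvec, and hermitian, are assumptions made for this illustration.

```python
def dot(x, y):
    """Dot product on C^n: linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]

def hermitian(A):
    """Conjugate transpose A^H of an m x n matrix."""
    return [[A[i][j].conjugate() for i in range(len(A))] for j in range(len(A[0]))]

A = [[1 + 2j, 0j, 3j],
     [2 - 1j, 1 + 0j, -1j]]          # a 2 x 3 complex matrix, so A : C^3 -> C^2
x = [1j, 2 + 0j, 1 - 1j]             # vector in C^3
y = [3 + 0j, 1 + 1j]                 # vector in C^2

lhs = dot(matvec(A, x), y)           # Ax . y
rhs = dot(x, matvec(hermitian(A), y))  # x . (A^H y)
print(lhs, rhs)
```

Replacing hermitian by a plain transpose (without conjugation) generally breaks the equality for complex entries, which is why the conjugate appears in A^H.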

Theorem 2.2 (Adjoint). Let H and K be Hilbert spaces, and let A : H → K be a bounded, linear map. Then there exists a unique bounded linear map A∗ : K → H such that

\[ \forall\, x \in H, \ \forall\, y \in K, \qquad \langle Ax, y \rangle = \langle x, A^* y \rangle. \]

Proof. Fix y ∈ K. Then Lx = 〈Ax, y〉 is a bounded linear functional on H. By the Riesz Representation Theorem, there exists a unique vector h ∈ H such that

\[ \langle Ax, y \rangle = Lx = \langle x, h \rangle. \]

Define A∗y = h. Verify that this map A∗ is linear (exercise). To see that it is bounded, observe that

\[
\begin{aligned}
\|A^* y\| = \|h\| &= \sup_{\|x\|=1} |\langle x, h \rangle| \\
&= \sup_{\|x\|=1} |\langle Ax, y \rangle| \\
&\le \sup_{\|x\|=1} \|Ax\| \, \|y\| \\
&\le \sup_{\|x\|=1} \|A\| \, \|x\| \, \|y\| = \|A\| \, \|y\|.
\end{aligned}
\]

We conclude that A∗ is bounded, and that ‖A∗‖ ≤ ‖A‖.

Finally, we must show that A∗ is unique. Suppose that B ∈ B(K, H) also satisfied 〈Ax, y〉 = 〈x, By〉 for all x ∈ H and y ∈ K. Then for each fixed y we would have that 〈x, By − A∗y〉 = 0 for every x, which implies By − A∗y = 0. Hence B = A∗.

Exercise 2.3 (Properties of the adjoint).

(a) If A ∈ B(H, K) then (A∗)∗ = A.

(b) If A, B ∈ B(H, K) and α, β ∈ F, then (αA + βB)∗ = ᾱA∗ + β̄B∗, where the bar denotes complex conjugation.

(c) If A ∈ B(H_1, H_2) and B ∈ B(H_2, H_3), then (BA)∗ = A∗B∗.

(d) If A ∈ B(H) is invertible in B(H) (meaning that there exists A^{−1} ∈ B(H) such that AA^{−1} = A^{−1}A = I), then A∗ is invertible in B(H) and (A^{−1})∗ = (A∗)^{−1}.

Remark 2.4. Later we will prove the Open Mapping Theorem. A remarkable consequence of this theorem is that if X and Y are Banach spaces and A : X → Y is a bounded bijection, then A^{−1} : Y → X is automatically bounded.

Proposition 2.5. If A ∈ B(H, K), then ‖A‖ = ‖A∗‖ = ‖A∗A‖^{1/2} = ‖AA∗‖^{1/2}.

Proof. In the course of proving Theorem 2.2, we already showed that ‖A∗‖ ≤ ‖A‖. If f ∈ H, then

\[ \|Af\|^2 = \langle Af, Af \rangle = \langle A^*Af, f \rangle \le \|A^*Af\| \, \|f\| \le \|A^*\| \, \|Af\| \, \|f\|. \tag{2.1} \]

Hence ‖Af‖ ≤ ‖A∗‖ ‖f‖ (even if ‖Af‖ = 0, this is still true). Since this is true for all f we conclude that ‖A‖ ≤ ‖A∗‖. Therefore ‖A‖ = ‖A∗‖.

Next, we have ‖A∗A‖ ≤ ‖A‖ ‖A∗‖ = ‖A‖^2. But also, from the calculation in (2.1), we have ‖Af‖^2 ≤ ‖A∗Af‖ ‖f‖. Taking the supremum over all unit vectors, we obtain

\[ \|A\|^2 = \sup_{\|f\|=1} \|Af\|^2 \le \sup_{\|f\|=1} \|A^*Af\| \, \|f\| = \|A^*A\|. \]

Consequently ‖A‖^2 = ‖A∗A‖. The final equality follows by interchanging the roles of A and A∗.

Exercise 2.6. Prove that if U ∈ B(H, K), then U is an isomorphism if and only if U is invertible and U^{−1} = U∗.

Exercise 2.7. (a) Let λ = (λ_n)_{n∈N} ∈ ℓ^∞(N) be given and let L be defined as in Example 1.7. Find L∗.

(b) Prove that the adjoint of the multiplication operator M_φ defined in Example 1.18 is the multiplication operator M_{φ̄}, where φ̄ is the complex conjugate of φ.

Exercise 2.8. Let L and R be the left- and right-shift operators on ℓ^2(N), i.e.,

\[ L(x_1, x_2, \dots) = (x_2, x_3, \dots) \qquad \text{and} \qquad R(x_1, x_2, \dots) = (0, x_1, x_2, \dots). \]

Prove that L = R∗.
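Before proving Exercise 2.8, it may help to test the defining identity 〈Rx, y〉 = 〈x, Ly〉 on finitely supported sequences. The sketch below stores such a sequence as a Python list (entries beyond the end of the list are taken to be zero); this representation, the sample vectors, and the helper names are assumptions made for this illustration.

```python
def inner(x, y):
    """Inner product on l^2 for finitely supported sequences, conjugate-linear in y."""
    n = max(len(x), len(y))
    x = x + [0] * (n - len(x))       # pad with zeros to a common length
    y = y + [0] * (n - len(y))
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def left_shift(x):
    """L(x1, x2, ...) = (x2, x3, ...)."""
    return x[1:]

def right_shift(x):
    """R(x1, x2, ...) = (0, x1, x2, ...)."""
    return [0] + x

x = [1, 2j, 3]
y = [4, 5, 6j, 7]
print(inner(right_shift(x), y), inner(x, left_shift(y)))   # <Rx, y> and <x, Ly>
```

Both printed values agree for these vectors, as the exercise predicts; the check covers one pair only and does not replace the proof.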

Example 2.9. Let L be the integral operator defined in (1.3), determined by the kernel function k. Assume that k is chosen so that L : L^2(X) → L^2(X) is bounded. The adjoint is the unique operator L∗ : L^2(X) → L^2(X) which satisfies

\[ \langle Lf, g \rangle = \langle f, L^* g \rangle, \qquad f, g \in L^2(X). \]

To find L∗, let A : L^2(X) → L^2(X) be the integral operator whose kernel is the complex conjugate of k(y, x), i.e.,

\[ Af(x) = \int_X \overline{k(y, x)} \, f(y) \, d\mu(y). \]

Then, given any f and g ∈ L^2(X), we have

\[
\begin{aligned}
\langle f, L^* g \rangle = \langle Lf, g \rangle
&= \int_X Lf(x) \, \overline{g(x)} \, d\mu(x) \\
&= \int_X \int_X k(x, y) \, f(y) \, d\mu(y) \, \overline{g(x)} \, d\mu(x) \\
&= \int_X f(y) \int_X k(x, y) \, \overline{g(x)} \, d\mu(x) \, d\mu(y) \\
&= \int_X f(y) \, \overline{\int_X \overline{k(x, y)} \, g(x) \, d\mu(x)} \, d\mu(y) \\
&= \int_X f(y) \, \overline{Ag(y)} \, d\mu(y) \\
&= \langle f, Ag \rangle.
\end{aligned}
\]

By uniqueness of the adjoint, we must have L∗ = A.

Exercise: Justify the interchange in the order of integration in the above calculation, i.e., provide hypotheses under which the calculations above are justified.


Exercise 2.10. Let {e_n}_{n∈N} be an orthonormal basis for a separable Hilbert space H. Define T : H → ℓ^2(N) by T(f) = {〈f, e_n〉}_{n∈N}. Find a formula for T∗ : ℓ^2(N) → H.

Definition 2.11. Let A ∈ B(H).

(a) We say that A is self-adjoint or Hermitian if A = A∗.

(b) We say that A is normal if AA∗ = A∗A.

Example 2.12. A real n × n matrix A is self-adjoint if and only if it is symmetric, i.e., if A = A^T. A complex n × n matrix A is self-adjoint if and only if it is Hermitian, i.e., if A = A^H.

Exercise 2.13. Show that every self-adjoint operator is normal. Show that every unitary operator is normal, but that a unitary operator need not be self-adjoint. For H = C^n, find examples of matrices that are not normal. Are the left- and right-shift operators on ℓ^2(N) normal?

Exercise 2.14. (a) Show that if A, B ∈ B(H) are self-adjoint, then AB is self-adjoint if and only if AB = BA.

(b) Give an example of self-adjoint operators A, B such that AB is not self-adjoint.

(c) Show that if A, B ∈ B(H) are self-adjoint then A + A∗, AA∗, A∗A, A + B, ABA, and BAB are all self-adjoint. What about A − A∗ or A − B? Show that AA∗ − A∗A is self-adjoint.

Exercise 2.15. (a) Let λ = (λ_n)_{n∈N} ∈ ℓ^∞(N) be given and let L be defined as in Example 1.7. Show that L is normal, find a formula for L∗, and prove that L is self-adjoint if and only if each λ_n is real.

(b) Determine a necessary and sufficient condition on φ so that the multiplication operator M_φ defined in Example 1.18 is self-adjoint.

(c) Determine a necessary and sufficient condition on the kernel k so that the integral operator L defined in (1.3) is self-adjoint.

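The following result gives a useful condition for telling when an operator on a complex Hilbert space is self-adjoint.

Proposition 2.16. Let H be a complex Hilbert space (i.e., F = C), and let A ∈ B(H) be given. Then:

\[ A \text{ is self-adjoint} \iff \langle Af, f \rangle \in \mathbb{R} \quad \forall\, f \in H. \]

Proof. ⇒. Assume A = A∗. Then for any f ∈ H we have

\[ \overline{\langle Af, f \rangle} = \langle f, Af \rangle = \langle A^* f, f \rangle = \langle Af, f \rangle. \]

Therefore 〈Af, f〉 is real.

⇐. Assume that 〈Af, f〉 is real for all f. Choose any f, g ∈ H. Then

\[ \langle A(f + g), f + g \rangle = \langle Af, f \rangle + \langle Af, g \rangle + \langle Ag, f \rangle + \langle Ag, g \rangle. \]

Since 〈A(f + g), f + g〉, 〈Af, f〉, and 〈Ag, g〉 are all real, we conclude that 〈Af, g〉 + 〈Ag, f〉 is real. Hence it equals its own complex conjugate, i.e.,

\[ \langle Af, g \rangle + \langle Ag, f \rangle = \overline{\langle Af, g \rangle + \langle Ag, f \rangle} = \langle g, Af \rangle + \langle f, Ag \rangle. \tag{2.2} \]

Similarly, since

\[ \langle A(f + ig), f + ig \rangle = \langle Af, f \rangle - i \langle Af, g \rangle + i \langle Ag, f \rangle + \langle Ag, g \rangle \]

we see that

\[ -i \langle Af, g \rangle + i \langle Ag, f \rangle = \overline{-i \langle Af, g \rangle + i \langle Ag, f \rangle} = i \langle g, Af \rangle - i \langle f, Ag \rangle. \]

Multiplying through by i yields

\[ \langle Af, g \rangle - \langle Ag, f \rangle = -\langle g, Af \rangle + \langle f, Ag \rangle. \tag{2.3} \]

Adding (2.2) and (2.3) together, we obtain

\[ 2 \langle Af, g \rangle = 2 \langle f, Ag \rangle = 2 \langle A^* f, g \rangle. \]

Since this is true for every f and g, we conclude that A = A∗.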

Example 2.17. The preceding result is false for real Hilbert spaces. After all, if F = R then 〈Af, f〉 is real for every f no matter what A is. Therefore, any non-self-adjoint operator provides a counterexample. For example, if H = R^n then any non-symmetric matrix A is a counterexample.
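One concrete instance of such a counterexample is rotation by 90 degrees in R^2. The sketch below checks that 〈Af, f〉 is real (in fact zero) for several vectors f even though A is not symmetric; the choice of matrix, the sample vectors, and the helper names are assumptions made for this illustration.

```python
def matvec(A, f):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * t for a, t in zip(row, f)) for row in A]

def dot(x, y):
    """Real inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def transpose(A):
    return [list(col) for col in zip(*A)]

A = [[0, -1],
     [1, 0]]                         # rotation by 90 degrees: not symmetric

vectors = [(1, 0), (1, 2), (-3, 5)]
for f in vectors:
    print(f, dot(matvec(A, f), f))   # <Af, f> for each sample vector

print(transpose(A) == A)             # False: A is not self-adjoint
```

Every value of 〈Af, f〉 printed here is 0, which is certainly real, yet A differs from its transpose, so the equivalence of Proposition 2.16 fails over R.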

The next result provides a useful way of calculating the operator norm of a self-adjointoperator.

Proposition 2.18. If A ∈ B(H) is self-adjoint, then

‖A‖ = sup‖f‖=1

|〈Af, f〉|.

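Proof. Set $M = \sup_{\|f\|=1} |\langle Af, f\rangle|$. By Cauchy–Schwarz and the definition of operator norm, we have

\[ M = \sup_{\|f\|=1} |\langle Af, f\rangle| \le \sup_{\|f\|=1} \|Af\|\,\|f\| \le \sup_{\|f\|=1} \|A\|\,\|f\|\,\|f\| = \|A\|. \]

To get the opposite inequality, note that if f is any nonzero vector in H then f/‖f‖ is a unit vector, so

\[ \left| \left\langle A\frac{f}{\|f\|}, \frac{f}{\|f\|} \right\rangle \right| \le M. \]

Rearranging, we see that

\[ \forall\, f \in H, \quad |\langle Af, f\rangle| \le M\,\|f\|^2. \tag{2.4} \]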


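Now choose any f, g ∈ H with ‖f‖ = ‖g‖ = 1. Then, by expanding the inner products, canceling terms, and using the fact that A = A∗, we see that

\[
\begin{aligned}
\langle A(f+g), f+g\rangle - \langle A(f-g), f-g\rangle
&= 2\langle Af, g\rangle + 2\langle Ag, f\rangle \\
&= 2\langle Af, g\rangle + 2\langle g, Af\rangle \\
&= 4\,\operatorname{Re}\langle Af, g\rangle.
\end{aligned}
\]

Therefore, applying (2.4) and the Parallelogram Law, we have

\[
\begin{aligned}
4\,\operatorname{Re}\langle Af, g\rangle
&\le |\langle A(f+g), f+g\rangle| + |\langle A(f-g), f-g\rangle| \\
&\le M\,\|f+g\|^2 + M\,\|f-g\|^2 \\
&= 2M\bigl(\|f\|^2 + \|g\|^2\bigr) \\
&= 4M.
\end{aligned}
\]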

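That is, Re 〈Af, g〉 ≤ M for every choice of unit vectors f and g. Write $\langle Af, g\rangle = |\langle Af, g\rangle|\, e^{i\theta}$. Then $e^{i\theta} g$ is another unit vector, so

\[ M \ge \operatorname{Re}\langle Af, e^{i\theta} g\rangle = \operatorname{Re}\bigl(e^{-i\theta}\langle Af, g\rangle\bigr) = |\langle Af, g\rangle|. \]

Hence

\[ \|Af\| = \sup_{\|g\|=1} |\langle Af, g\rangle| \le M. \]

Since this is true for every unit vector f, we conclude that ‖A‖ ≤ M.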
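Proposition 2.18 can be sanity-checked numerically in a small case. The sketch below assumes the real Hilbert space H = R² and approximates both suprema by sampling unit vectors (cos t, sin t) on a grid; the matrices `S` and `J` and all function names are illustrative assumptions. It also shows that the self-adjointness hypothesis matters: for the non-symmetric matrix `J` the two quantities differ.

```python
import math

def unit_vectors(samples):
    """Real unit vectors (cos t, sin t) for t on a uniform grid of [0, pi).

    Sampling [0, pi) suffices because f and -f give the same values below.
    """
    for k in range(samples):
        t = math.pi * k / samples
        yield math.cos(t), math.sin(t)

def quad_form_sup(A, samples=3600):
    """Approximate sup |<Af, f>| over real unit vectors f in R^2."""
    return max(
        abs(A[0][0] * c * c + (A[0][1] + A[1][0]) * c * s + A[1][1] * s * s)
        for c, s in unit_vectors(samples)
    )

def op_norm(A, samples=3600):
    """Approximate ||A|| = sup ||Af|| over real unit vectors f in R^2."""
    return max(
        math.hypot(A[0][0] * c + A[0][1] * s, A[1][0] * c + A[1][1] * s)
        for c, s in unit_vectors(samples)
    )

S = [[2.0, 1.0], [1.0, 2.0]]   # symmetric, hence self-adjoint on R^2
J = [[0.0, 1.0], [0.0, 0.0]]   # not self-adjoint

print(quad_form_sup(S), op_norm(S))   # the two values agree for S
print(quad_form_sup(J), op_norm(J))   # the two values differ for J
```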

The following corollary is a very useful consequence.

Corollary 2.19. Assume that A ∈ B(H).

(a) If F = R, A = A∗, and 〈Af, f〉 = 0 for every f , then A = 0.

(b) If F = C and 〈Af, f〉 = 0 for every f , then A = 0.

Proof. Assume the hypotheses of either statement (a) or statement (b). In the case of statement (a), we have by hypothesis that A is self-adjoint. In the case of statement (b), we can conclude from Proposition 2.16 that A is self-adjoint because 〈Af, f〉 = 0 is real for every f. Hence in either case we can apply Proposition 2.18 to conclude that

\[ \|A\| = \sup_{\|f\|=1} |\langle Af, f\rangle| = 0. \]
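The self-adjointness hypothesis in part (a) cannot be dropped. As a quick worked illustration, assume H = R² with the usual dot product and let A be rotation by 90 degrees (this particular choice of A is only an example):

```latex
A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad
f = \begin{pmatrix} f_1 \\ f_2 \end{pmatrix} \in \mathbb{R}^2, \qquad
Af = \begin{pmatrix} -f_2 \\ f_1 \end{pmatrix}, \qquad
\langle Af, f\rangle = -f_2 f_1 + f_1 f_2 = 0 .
```

Here 〈Af, f〉 = 0 for every f, yet A ≠ 0. This does not contradict the corollary, since A∗ = −A ≠ A.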

Lemma 2.20. If A ∈ B(H), then the following statements are equivalent.

(a) A is normal, i.e., AA∗ = A∗A.

(b) ‖Af‖ = ‖A∗f‖ for every f ∈ H.


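Proof. (b) ⇒ (a). Assume that (b) holds. Then for every f we have

\[
\begin{aligned}
\bigl\langle (A^*A - AA^*)f, f\bigr\rangle
&= \langle A^*Af, f\rangle - \langle AA^*f, f\rangle \\
&= \langle Af, Af\rangle - \langle A^*f, A^*f\rangle \\
&= \|Af\|^2 - \|A^*f\|^2 = 0.
\end{aligned}
\]

Since A∗A − AA∗ is self-adjoint, it follows from Corollary 2.19 that A∗A − AA∗ = 0.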

(a) ⇒ (b). Exercise.

Corollary 2.21. If A ∈ B(H) is normal, then ker(A) = ker(A∗).

Exercise 2.22. Suppose that A ∈ B(H) is normal. Prove that A is injective if and only if range(A) is dense in H.

Exercise 2.23. If A ∈ B(H), then the following statements are equivalent.

(a) A is an isometry, i.e., ‖Af‖ = ‖f‖ for every f ∈ H.

(b) A∗A = I.

(c) 〈Af,Ag〉 = 〈f, g〉 for every f , g ∈ H.

Exercise 2.24. If H = Cⁿ and A, B are n × n matrices, then AB = I implies BA = I. Give a counterexample to this for an infinite-dimensional Hilbert space. Consequently, the hypothesis A∗A = I in the preceding result does not imply that AA∗ = I.

Exercise 2.25. If A ∈ B(H), then the following statements are equivalent.

(a) A∗A = AA∗ = I.

(b) A is unitary, i.e., it is a surjective isometry.

(c) A is a normal isometry.

The following result provides a very useful relationship between the range of A∗ and the kernel of A.

Theorem 2.26. Let A ∈ B(H,K).

(a) ker(A) = range(A∗)⊥.

(b) $\ker(A)^\perp = \overline{\operatorname{range}(A^*)}$.

(c) A is injective if and only if range(A∗) is dense in H.


Proof. (a) Assume that f ∈ ker(A) and let h ∈ range(A∗), i.e., h = A∗g for some g ∈ K. Then since Af = 0, we have 〈f, h〉 = 〈f, A∗g〉 = 〈Af, g〉 = 0. Thus f ∈ range(A∗)⊥, so ker(A) ⊆ range(A∗)⊥.

Now assume that f ∈ range(A∗)⊥. Then for any h ∈ K we have 〈Af, h〉 = 〈f, A∗h〉 = 0. But this implies Af = 0, so f ∈ ker(A). Thus range(A∗)⊥ ⊆ ker(A).

(b), (c) Exercises.

3. Projections and Idempotents: Invariant and Reducing Subspaces

Definition 3.1. a. If E ∈ B(H) satisfies E² = E then E is said to be idempotent.

b. If E ∈ B(H) satisfies E² = E and ker(E) = range(E)⊥ then E is called a projection.

Exercise 3.2. If E ∈ B(H) is an idempotent operator, then ker(E) and range(E) are closed subspaces of H. Further, ker(E) = range(I − E) and range(E) = ker(I − E).

Lemma 3.3 (Characterization of Orthogonal Projections). Let E ∈ B(H) be a nonzero idempotent operator. Then the following statements are equivalent.

(a) E is a projection.

(b) E is the orthogonal projection of H onto range(E).

(c) ‖E‖ = 1.

(d) E is self-adjoint.

(e) E is normal.

(f) E is positive, i.e., 〈Ef, f〉 ≥ 0 for every f ∈ H.

Proof. (e) ⇒ (a). Assume that E² = E and E is normal. Then from Lemma 2.20 we know that ‖Ef‖ = ‖E∗f‖ for every f ∈ H. Hence Ef = 0 if and only if E∗f = 0, or in other words, ker(E) = ker(E∗). But we know from Theorem 2.26 that ker(E∗) = range(E)⊥. Hence we conclude that ker(E) = range(E)⊥, and therefore E is a projection.

The remaining implications are exercises.
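Not every idempotent is a projection. The following sketch, assuming H = R² with the dot product, checks a specific 2 × 2 idempotent against several of the conditions in Lemma 3.3; the matrix `E` and the helper names are illustrative choices.

```python
import math

E = [[1.0, 1.0], [0.0, 0.0]]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(A, f):
    return [sum(A[i][j] * f[j] for j in range(len(f))) for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

is_idempotent = matmul(E, E) == E          # E^2 = E
is_self_adjoint = transpose(E) == E        # fails: E is not symmetric

# ker(E) is spanned by (1, -1) and range(E) by (1, 0).
kernel_vec = [1.0, -1.0]
range_vec = [1.0, 0.0]
kernel_image = apply(E, kernel_vec)        # the zero vector
dot = sum(a * b for a, b in zip(kernel_vec, range_vec))   # nonzero: not orthogonal

# ||E f|| for the unit vector f = (1, 1)/sqrt(2) exceeds 1, so ||E|| > 1.
f = [1 / math.sqrt(2), 1 / math.sqrt(2)]
norm_Ef = math.hypot(*apply(E, f))

print(is_idempotent, is_self_adjoint, dot, norm_Ef)
```

So this E is idempotent, but its kernel is not orthogonal to its range, it is not self-adjoint, and its norm exceeds 1, in agreement with the equivalences of the lemma.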

Definition 3.4 (Orthogonal Direct Sum of Subspaces). Let $\{M_i\}_{i\in I}$ be a collection of closed subspaces of H such that $M_i \perp M_j$ whenever $i \ne j$. Then the orthogonal direct sum of the $M_i$ is the smallest closed subspace which contains every $M_i$. This space is the closed linear span

\[ \bigoplus_{i\in I} M_i = \overline{\operatorname{span}}\Bigl(\,\bigcup_{i\in I} M_i\Bigr). \]


Exercise 3.5. Suppose that M, N are closed subspaces of H such that M ⊥ N. Prove that M + N = {m + n : m ∈ M, n ∈ N} is a closed subspace of H, and that

M ⊕ N = M + N.

Show that every vector x ∈ M ⊕ N can be written uniquely as x = m + n with m ∈ M and n ∈ N.

Extend by induction to finite collections of closed, pairwise orthogonal subspaces. (Unfortunately, the analogous statement is not true for infinite collections.)

Exercise 3.6. Show that if A ∈ B(H,K) then $H = \ker(A) \oplus \overline{\operatorname{range}(A^*)}$.

Definition 3.7. Let A ∈ B(H) and M ≤ H.

(a) We say that M is invariant under A if A(M) ⊆ M, where

A(M) = {Ax : x ∈ M}.

That is, M is invariant if x ∈ M implies Ax ∈ M. Note that it need not be the case that A(M) = M.

(b) We say that M is a reducing subspace for A if both M and M⊥ are invariant under A, i.e., A(M) ⊆ M and A(M⊥) ⊆ M⊥.

Proposition 3.8. Let A ∈ B(H) and M ≤ H be given. Then the following statements are equivalent.

(a) M is invariant under A.

(b) PAP = AP , where P = PM is the orthogonal projection of H onto M .
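A finite-dimensional instance makes the criterion PAP = AP concrete. The sketch below assumes H = R², M = span{e₁}, and an upper triangular matrix `A`; these choices and the helper `matmul` are illustrative only.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2], [0, 3]]   # upper triangular, so A e1 is a multiple of e1
P = [[1, 0], [0, 0]]   # orthogonal projection of R^2 onto M = span{e1}

AP = matmul(A, P)
PAP = matmul(P, AP)
m_is_invariant = (PAP == AP)

# The orthogonal complement of M is span{e2}. A e2 is the second column of A;
# its first component is nonzero, so A e2 does not lie in span{e2}.
Ae2 = [A[0][1], A[1][1]]
complement_is_invariant = (Ae2[0] == 0)

print(m_is_invariant, complement_is_invariant)
```

In this example M is invariant under A but is not a reducing subspace, since M⊥ is not invariant.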

Exercise 3.9. Define L : ℓ²(Z) → ℓ²(Z) by

\[ L(\dots, x_{-1}, x_0, x_1, \dots) = (\dots, x_0, x_1, x_2, \dots), \]

where on the right-hand side the entry $x_1$ sits in the 0th component position. That is, L slides each component one unit to the left (L is called a bilateral shift). Find a closed subspace of ℓ²(Z) that is invariant but not reducing under L.

Exercise 3.10. Assume that M ≤ H is invariant under L ∈ B(H). Prove that M⊥ is invariant under L∗.


4. Compact Operators

Definition 4.1 (Compact and Totally Bounded Sets). Let X be a Banach space, and let E ⊆ X be given.

(a) We say that E is compact if every open cover of E contains a finite subcover. That is, E is compact if whenever $\{U_\alpha\}_{\alpha\in I}$ is a collection of open sets whose union contains E, then there exist finitely many $\alpha_1, \dots, \alpha_N$ such that $E \subseteq U_{\alpha_1} \cup \dots \cup U_{\alpha_N}$.

(b) We say that E is sequentially compact if every sequence $\{f_n\}_{n\in\mathbb{N}}$ of points of E contains a convergent subsequence $\{f_{n_k}\}_{k\in\mathbb{N}}$ whose limit belongs to E.

(c) We say that E is totally bounded if for every ε > 0 there exist finitely many points $f_1, \dots, f_N \in E$ such that

\[ E \subseteq \bigcup_{k=1}^{N} B(f_k, \varepsilon), \]

where $B(f_k, \varepsilon)$ is the open ball of radius ε centered at $f_k$. That is, E is totally bounded if and only if for every ε > 0 there exist finitely many points $f_1, \dots, f_N \in E$ such that every element of E is within ε of some $f_k$.

In finite dimensions, a set is compact if and only if it is closed and bounded. In infinite dimensions, all compact sets are closed and bounded, but the converse fails. Instead, we have the following characterization of compact sets. (This characterization actually holds in any complete metric space.)

Theorem 4.2. Let E be a subset of a Banach space X. Then the following statements are equivalent.

(a) E is compact.

(b) E is sequentially compact.

(c) E is closed and totally bounded.

Proof. (b) ⇒ (a).² Assume that E is sequentially compact. Our first step will be to prove the following claim, where the diameter of a set S is defined to be

diam(S) = sup{‖f − g‖ : f, g ∈ S}.

Claim 1. For any open cover $\{U_\alpha\}_{\alpha\in I}$ of E, there exists a number δ > 0 (called a Lebesgue number for the cover) such that if S ⊆ E satisfies diam(S) < δ, then there is an α ∈ I such that $S \subseteq U_\alpha$.

To prove the claim, suppose that $\{U_\alpha\}_{\alpha\in I}$ was an open cover of E such that no δ with the required property existed. Then for each n ∈ N, we could find a set $S_n \subseteq E$ with $\operatorname{diam}(S_n) < \frac{1}{n}$ such that $S_n$ is not contained in any $U_\alpha$. Choose any $f_n \in S_n$. Since E is sequentially compact, there must be a subsequence $\{f_{n_k}\}_{k\in\mathbb{N}}$ that converges to an element of E, say $f_{n_k} \to a \in E$. But we must have $a \in U_\alpha$ for some α, and since $U_\alpha$ is open there must exist some ε > 0 such that $B(a, \varepsilon) \subseteq U_\alpha$. Now choose k large enough that we have both

\[ \frac{1}{n_k} < \frac{\varepsilon}{2} \qquad\text{and}\qquad \|a - f_{n_k}\| < \frac{\varepsilon}{2}. \]

The first inequality above implies that $\operatorname{diam}(S_{n_k}) < \frac{\varepsilon}{2}$. Therefore, using this and the second inequality, we have $S_{n_k} \subseteq B(a, \varepsilon) \subseteq U_\alpha$, which is a contradiction. Therefore the claim is proved.

²This proof is adapted from one given in J. R. Munkres, “Topology,” Second Edition, Prentice Hall, 2000.

Next, we will prove the following claim.

Claim 2. For any ε > 0, there exist finitely many $f_1, \dots, f_N \in E$ such that

\[ E \subseteq \bigcup_{k=1}^{N} B(f_k, \varepsilon). \]

To prove this claim, assume that there is an ε > 0 such that E cannot be covered by finitely many ε-balls centered at points of E. Choose any $f_1 \in E$. Since E cannot be covered by a single ε-ball, we have $E \not\subseteq B(f_1, \varepsilon)$. Hence there exists $f_2 \in E \setminus B(f_1, \varepsilon)$, i.e., $f_2 \in E$ and $\|f_2 - f_1\| \ge \varepsilon$. But E cannot be covered by two ε-balls, so there must exist an $f_3 \in E \setminus \bigl(B(f_1, \varepsilon) \cup B(f_2, \varepsilon)\bigr)$. In particular, we have $\|f_3 - f_1\|, \|f_3 - f_2\| \ge \varepsilon$. Continuing in this way we obtain a sequence of points $\{f_n\}_{n\in\mathbb{N}}$ in E with $\|f_m - f_n\| \ge \varepsilon$ whenever m ≠ n. Such a sequence has no Cauchy subsequence, and hence no convergent subsequence, which is a contradiction. Hence the claim is proved.
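The selection procedure in the proof of Claim 2 is essentially a greedy algorithm. The sketch below runs it on a finite set of points in the plane, where it must terminate; the point set, the value of `eps`, and the function name `greedy_net` are assumptions made for illustration. In the proof, termination comes instead from sequential compactness.

```python
import math

def greedy_net(points, eps):
    """Pick centers one at a time, each at distance >= eps from all earlier ones.

    Every point that is not picked lies within eps of some picked center,
    so the open eps-balls around the centers cover the whole set.
    """
    centers = []
    for p in points:
        if all(math.dist(p, c) >= eps for c in centers):
            centers.append(p)
    return centers

# A finite stand-in for E: a 21 x 21 grid in the unit square.
points = [(i / 20, j / 20) for i in range(21) for j in range(21)]
eps = 0.25
centers = greedy_net(points, eps)

covered = all(min(math.dist(p, c) for c in centers) < eps for p in points)
separated = all(
    math.dist(a, b) >= eps
    for i, a in enumerate(centers)
    for b in centers[i + 1:]
)
print(len(centers), covered, separated)
```

The centers are pairwise at least ε apart, which is exactly the property that rules out a convergent subsequence when the procedure never stops.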

Finally, we show that E is compact. Let $\{U_\alpha\}_{\alpha\in I}$ be any open cover of E. Let δ be the Lebesgue number given by Claim 1, and set $\varepsilon = \frac{\delta}{3}$. By Claim 2, there exists a covering of E by finitely many ε-balls. The part of each ball that lies in E has diameter at most 2ε < δ, so by Claim 1 it is contained in some $U_\alpha$. Thus we find finitely many $U_\alpha$ that cover E.

(c) ⇒ (b). Assume that E is closed and totally bounded, and let $\{f_n\}_{n\in\mathbb{N}}$ be any sequence of points in E. Since E is covered by finitely many balls of radius $\frac{1}{2}$, one of those balls must contain infinitely many $f_n$, say $\{f_n^{(1)}\}_{n\in\mathbb{N}}$. Then we have

\[ \forall\, m, n \in \mathbb{N}, \quad \|f_m^{(1)} - f_n^{(1)}\| < 1. \]

Since E is covered by finitely many balls of radius $\frac{1}{4}$, we can find a subsequence $\{f_n^{(2)}\}_{n\in\mathbb{N}}$ of $\{f_n^{(1)}\}_{n\in\mathbb{N}}$ such that

\[ \forall\, m, n \in \mathbb{N}, \quad \|f_m^{(2)} - f_n^{(2)}\| < \frac{1}{2}. \]

By induction we keep constructing subsequences $\{f_n^{(k)}\}_{n\in\mathbb{N}}$ such that $\|f_m^{(k)} - f_n^{(k)}\| < \frac{1}{k}$ for all m, n ∈ N.

Now consider the “diagonal subsequence” $\{f_n^{(n)}\}_{n\in\mathbb{N}}$. Given ε > 0, let N be large enough that $\frac{1}{N} < \varepsilon$. If m ≥ n > N, then $f_m^{(m)}$ is one element of the sequence $\{f_k^{(n)}\}_{k\in\mathbb{N}}$, say $f_m^{(m)} = f_k^{(n)}$. Then

\[ \|f_m^{(m)} - f_n^{(n)}\| = \|f_k^{(n)} - f_n^{(n)}\| < \frac{1}{n} < \varepsilon. \]

Thus $\{f_n^{(n)}\}_{n\in\mathbb{N}}$ is Cauchy and hence, since X is complete, converges. Since E is closed, it must converge to some element of E.


(a) ⇒ (c). Exercise.

Exercise 4.3. Show that if E is a totally bounded subset of a Banach space X, then its closure $\overline{E}$ is compact. A set whose closure is compact is said to be precompact.

Notation 4.4. We let $\mathrm{Ball}_H$ denote the closed unit ball in H, i.e.,

\[ \mathrm{Ball}_H = \mathrm{Ball}(H) = \{f \in H : \|f\| \le 1\}. \]

Exercise 4.5. Prove that if H is infinite-dimensional, then $\mathrm{Ball}_H$ is not compact.

Definition 4.6 (Compact Operators). Let H, K be Hilbert spaces. A linear operator T : H → K is compact if $T(\mathrm{Ball}_H)$ has compact closure in K. We define

B0(H,K) = {T : H → K : T is compact},

and set B0(H) = B0(H,H).

By definition, a compact operator is linear, and we will see that all compact operators are bounded. Thus it will turn out that B0(H,K) ⊆ B(H,K). In fact, we will see that B0(H,K) is a closed subspace of B(H,K).

The following result gives some useful reformulations of the definition of compact operator.

Proposition 4.7 (Characterizations of Compact Operators). Let T : H → K be linear. Then the following statements are equivalent.

(a) T is compact.

(b) $T(\mathrm{Ball}_H)$ is totally bounded.

(c) If $\{f_n\}_{n\in\mathbb{N}}$ is a bounded sequence in H, then $\{Tf_n\}_{n\in\mathbb{N}}$ contains a convergent subsequence.

Proof. (a) ⇔ (b). This follows from Theorem 4.2 and Exercise 4.3.

(a) ⇒ (c). Suppose that T is compact and that $\{f_n\}_{n\in\mathbb{N}}$ is a bounded sequence in H. By rescaling the sequence (i.e., multiplying by an appropriate scalar), we may assume that $f_n \in \mathrm{Ball}_H$ for every n. Therefore $Tf_n \in T(\mathrm{Ball}_H) \subseteq \overline{T(\mathrm{Ball}_H)}$. Since $\overline{T(\mathrm{Ball}_H)}$ is compact, it follows from Theorem 4.2 that $\{Tf_n\}_{n\in\mathbb{N}}$ contains a subsequence which converges to an element of $\overline{T(\mathrm{Ball}_H)}$.

(c) ⇒ (a). Exercise.

Proposition 4.8. If T : H → K is compact, then it is bounded. That is,

B0(H,K) ⊆ B(H,K).

Proof. Assume that T : H → K is linear but unbounded. Then there exist vectors $f_n \in H$ such that $\|f_n\| = 1$ but $\|Tf_n\| \ge n$. Therefore every subsequence of $\{Tf_n\}_{n\in\mathbb{N}}$ is unbounded, and hence cannot converge. Therefore T is not compact by Proposition 4.7.

Exercise 4.9. Show that if H is infinite-dimensional then the identity operator on H is not compact. Hence a bounded operator need not be compact in general.

The following exercise shows that a compact operator maps an orthonormal sequence to a sequence that converges to the zero vector.

Exercise 4.10. (a) Let $\{h_n\}_{n\in\mathbb{N}}$ be a sequence of vectors in H, and let h ∈ H. Suppose that every subsequence of $\{h_n\}_{n\in\mathbb{N}}$ contains a subsequence that converges to h. Prove that $h_n \to h$.

Hint: Proceed by contradiction. Suppose that $h_n$ does not converge to h. Show that this implies that there is an ε > 0 and a subsequence $\{h_{n_k}\}_{k\in\mathbb{N}}$ such that $\|h - h_{n_k}\| \ge \varepsilon$ for every k.

(b) Suppose that T : H → K is compact, and let $\{e_n\}_{n\in\mathbb{N}}$ be an orthonormal sequence in H. Show that $Te_n \to 0$.

Hint: Choose any subsequence $\{f_n\}_{n\in\mathbb{N}}$ of $\{e_n\}_{n\in\mathbb{N}}$. Since T is compact, this sequence has a subsequence $\{g_n\}_{n\in\mathbb{N}}$ such that $\{Tg_n\}_{n\in\mathbb{N}}$ converges, say $Tg_n \to h$. Prove that $\langle Tg_n, h\rangle \to 0$ (use Bessel’s Inequality to find a bound for the ℓ²-norm of $\{\langle Tg_n, h\rangle\}_{n\in\mathbb{N}}$). Use part (a) to complete the proof.

The following exercise shows that a compact operator maps weakly convergent sequences to convergent sequences.

Definition 4.11. Let $\{f_n\}_{n\in\mathbb{N}}$ be a sequence of vectors in H and let f ∈ H. We say that $f_n$ converges weakly to f, written $f_n \xrightarrow{w} f$, if

\[ \forall\, g \in H, \quad \langle f_n, g\rangle \to \langle f, g\rangle \ \text{ as } n \to \infty. \]

Exercise 4.12. (a) Show that if $f_n \to f$, then $f_n \xrightarrow{w} f$.

(b) Show that if $\{e_n\}_{n\in\mathbb{N}}$ is an orthonormal sequence in H, then $e_n \xrightarrow{w} 0$.

(c) Suppose that T ∈ B(H) is compact. Show that if $f_n \xrightarrow{w} f$, then $Tf_n \to Tf$.

Exercise 4.13. Let φ ∈ L∞(Rⁿ) be fixed, with φ ≠ 0. Then by Exercise 1.18 we know that the multiplication operator Mφ : L²(Rⁿ) → L²(Rⁿ) given by Mφf = fφ is bounded. Show that Mφ is not compact.

Hint: There must exist an ε > 0 and a set E ⊆ Rⁿ with positive measure such that |φ(x)| ≥ ε for all x ∈ E.

Exhibit a measure space (X,Ω, µ) and a bounded, nonzero φ ∈ L∞(X) such that Mφ is compact. Hint: Consider Example 4.23.

Exercise 4.14. Prove that if H is infinite-dimensional and T : H → K is compact and injective, then T⁻¹ : range(T) → H is unbounded.


Theorem 4.15 (Limits of Compact Operators). B0(H,K) is a closed subspace of B(H,K) (under the operator norm). That is,

(a) if S, T ∈ B0(H,K) and α, β ∈ F, then αS + βT ∈ B0(H,K),

(b) if Tn ∈ B0(H,K), T ∈ B(H,K), and ‖T − Tn‖ → 0, then T ∈ B0(H,K).

Proof. (a) Exercise.

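(b) Assume that $T_n$ are compact operators and that $T_n \to T$ in operator norm. By Proposition 4.7, it suffices to show that $T(\mathrm{Ball}_H)$ is a totally bounded subset of K.

Choose any ε > 0. Then there exists an n such that $\|T - T_n\| < \frac{\varepsilon}{3}$. Now, $T_n$ is compact, so $T_n(\mathrm{Ball}_H)$ is totally bounded. Hence there exist finitely many points $h_1, \dots, h_m \in \mathrm{Ball}_H$ such that

\[ T_n(\mathrm{Ball}_H) \subseteq \bigcup_{j=1}^{m} B\Bigl(T_n h_j, \frac{\varepsilon}{3}\Bigr). \tag{4.1} \]

We will show that $T(\mathrm{Ball}_H)$ is totally bounded by showing that

\[ T(\mathrm{Ball}_H) \subseteq \bigcup_{j=1}^{m} B\bigl(T h_j, \varepsilon\bigr). \tag{4.2} \]

Choose any element of $T(\mathrm{Ball}_H)$, i.e., any point Tf with ‖f‖ ≤ 1. Then $T_n f \in T_n(\mathrm{Ball}_H)$, so by (4.1) there must be some j such that $\|T_n f - T_n h_j\| < \frac{\varepsilon}{3}$. Consequently,

\[
\begin{aligned}
\|Tf - Th_j\| &\le \|Tf - T_n f\| + \|T_n f - T_n h_j\| + \|T_n h_j - T h_j\| \\
&< \|T - T_n\|\,\|f\| + \frac{\varepsilon}{3} + \|T_n - T\|\,\|h_j\| \\
&< \frac{\varepsilon}{3}\cdot 1 + \frac{\varepsilon}{3} + \frac{\varepsilon}{3}\cdot 1 \\
&= \varepsilon.
\end{aligned}
\]

Hence (4.2) follows, so T is compact.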

Exercise 4.16. Another way to prove Theorem 4.15 is to apply a Cantor diagonalization argument. Fill in the details in the following sketch of this argument.

Suppose that $\{f_n\}_{n\in\mathbb{N}}$ is a bounded sequence in H. Then since $T_1$ is compact, there exists a subsequence $\{f_n^{(1)}\}_{n\in\mathbb{N}}$ of $\{f_n\}_{n\in\mathbb{N}}$ such that $\{T_1 f_n^{(1)}\}_{n\in\mathbb{N}}$ converges. Then since $T_2$ is compact, there exists a subsequence $\{f_n^{(2)}\}_{n\in\mathbb{N}}$ of $\{f_n^{(1)}\}_{n\in\mathbb{N}}$ such that $\{T_2 f_n^{(2)}\}_{n\in\mathbb{N}}$ converges (and note that $\{T_1 f_n^{(2)}\}_{n\in\mathbb{N}}$ also converges!). Continue to construct subsequences in this way, and then show that the “diagonal subsequence” $\{T f_n^{(n)}\}_{n\in\mathbb{N}}$ converges (use the fact that there exists a k such that $\|T - T_k\| < \varepsilon$). Therefore T is compact.

Theorem 4.17 (Compositions and Compact Operators). Let H1, H2, H3 be Hilbert spaces.

(a) If A : H1 → H2 is bounded and T : H2 → H3 is compact, then TA : H1 → H3 is compact.

(b) If T : H1 → H2 is compact and A : H2 → H3 is bounded, then AT : H1 → H3 is compact.

Proof. (b) Assume that A is bounded and T is compact. Let $\{f_n\}_{n\in\mathbb{N}}$ be any bounded sequence in H1. Then since T is compact, there is a subsequence $\{Tf_{n_k}\}_{k\in\mathbb{N}}$ that converges in H2. Since A is bounded, the subsequence $\{ATf_{n_k}\}_{k\in\mathbb{N}}$ therefore converges in H3. Hence AT is compact.

(a) Exercise.

Exercise 4.18. Prove that if T ∈ B0(H,K), then range(T ) is a separable subspace of K.

Hints: Since $\overline{T(\mathrm{Ball}_H)}$ is compact, it is totally bounded. Hence for each n ∈ N we can find finitely many balls of radius 1/n with centers in $T(\mathrm{Ball}_H)$ that cover $T(\mathrm{Ball}_H)$. If we consider all these balls for every n, we have countably many balls that cover $T(\mathrm{Ball}_H)$. Show that this implies that $T(\mathrm{Ball}_H)$ contains a countable, dense subset. Then do the same for the image under T of each ball of radius k ∈ N in H, instead of just k = 1. Combine all of these together to get a countable dense subset of range(T).

Definition 4.19 (Finite-Rank Operators). Recall that the rank of an operator T : H → K is the dimension of range(T). We say that T is a finite-rank operator if range(T) is finite-dimensional. We set

B00(H,K) = {T ∈ B(H,K) : T is finite-rank},

and set B00(H) = B00(H,H).

A linear, finite-rank operator need not be bounded (that is why we include the assumption of boundedness in the definition of B00(H,K) above). However, the following result shows that if a finite-rank operator is bounded, then it is actually compact.

Proposition 4.20. If T : H → K is bounded, linear, and has finite rank, then T is compact. Thus,

B00(H,K) ⊆ B0(H,K).

Proof. Since T is bounded, $T(\mathrm{Ball}_H)$ is a bounded subset of the finite-dimensional space range(T). All finite-dimensional spaces are closed. Hence the closure of $T(\mathrm{Ball}_H)$ is a closed and bounded subset of range(T), and therefore is compact.

This gives us the following very useful way to show that a general operator T is compact: try to construct a sequence of finite-rank operators $T_n$ that converge to T in operator norm.

Corollary 4.21. Suppose that $T_n$ ∈ B(H,K) are finite-rank operators, T ∈ B(H,K), and $T_n \to T$ in operator norm. Then T is compact.

Exercise 4.22. Show that if E ∈ B(H) is compact and idempotent, then E has finite rank.


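Example 4.23. Let $\{e_n\}_{n\in\mathbb{N}}$ be an orthonormal basis for a separable Hilbert space H, and let $\lambda = (\lambda_n)_{n\in\mathbb{N}}$ be a bounded sequence of scalars. Then we know from Example 1.7 that

\[ Lf = \sum_{n=1}^{\infty} \lambda_n \langle f, e_n\rangle\, e_n \]

defines a bounded operator on H.

Suppose that $\lambda_n \to 0$ as n → ∞. Define

\[ L_N f = \sum_{n=1}^{N} \lambda_n \langle f, e_n\rangle\, e_n. \]

Since $\operatorname{range}(L_N) \subseteq \operatorname{span}\{e_1, \dots, e_N\}$ (must it be equality?), we have that $L_N$ is finite-rank. (Exercise: Show that L is not finite-rank if there are infinitely many $\lambda_n \ne 0$.)

Further, $L_N$ is a good approximation to L, because (using the Plancherel Theorem) we have

\[
\begin{aligned}
\|(L - L_N)f\|^2
&= \Bigl\| \sum_{n=N+1}^{\infty} \lambda_n \langle f, e_n\rangle\, e_n \Bigr\|^2 \\
&= \sum_{n=N+1}^{\infty} |\lambda_n|^2\, |\langle f, e_n\rangle|^2 \\
&\le \Bigl( \sup_{n>N} |\lambda_n|^2 \Bigr) \sum_{n=N+1}^{\infty} |\langle f, e_n\rangle|^2 \\
&\le \Bigl( \sup_{n>N} |\lambda_n|^2 \Bigr) \|f\|^2.
\end{aligned}
\]

It follows that $L_N$ converges to L in operator norm:

\[ \lim_{N\to\infty} \|L - L_N\|^2 \le \lim_{N\to\infty} \Bigl( \sup_{n>N} |\lambda_n|^2 \Bigr) = \limsup_{n\to\infty} |\lambda_n|^2 = 0. \]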

Since each LN is compact, we conclude that L is compact as well.
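The tail estimate above is easy to observe numerically. The following sketch assumes the specific sequence λₙ = 1/n and works in a finite truncation of dimension `D`, identifying a vector with its list of coefficients 〈f, eₙ〉; the truncation, the sequence, and the function names are all assumptions made for illustration.

```python
import math

D = 1000                                   # truncation dimension (assumption)
lam = [1.0 / n for n in range(1, D + 1)]   # lam[n-1] = lambda_n = 1/n

def apply_tail(f, N):
    """Coefficients of (L - L_N) f: zero for n <= N, lambda_n <f, e_n> for n > N."""
    return [0.0 if n < N else lam[n] * f[n] for n in range(D)]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def tail_sup(N):
    """sup over n > N of |lambda_n|, the bound on ||L - L_N|| from the estimate."""
    return max(abs(x) for x in lam[N:])

f = [1.0 / math.sqrt(D)] * D               # a unit vector spread over all e_n

for N in (1, 10, 100):
    e_next = [0.0] * D
    e_next[N] = 1.0                        # the basis vector e_{N+1}
    ratio = norm(apply_tail(f, N)) / norm(f)
    attained = norm(apply_tail(e_next, N))
    print(N, ratio, attained, tail_sup(N))
```

For each N the ratio ‖(L − L_N)f‖/‖f‖ stays below the tail supremum, and the bound is attained at the basis vector e_{N+1}, so in this truncated model ‖L − L_N‖ equals the tail supremum and tends to zero.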

Exercise 4.24. Continuing the preceding example, prove the following.

(a) Prove that if $\lambda_n$ does not converge to zero then L is not compact. Hint: We know at least some of the eigenvectors of L.

(b) Prove that, with only the assumption that λ ∈ ℓ∞, we have

\[ \forall\, f \in H, \quad L_N f \to Lf. \tag{4.3} \]

That is, for each individual vector f we have $\|Lf - L_N f\| \to 0$, where this is the norm in H. A sequence of operators which satisfies (4.3) is said to converge strongly or in the strong operator topology (SOT). Prove that strong convergence of operators does not imply convergence in operator norm, i.e., (4.3) does not imply that $\|L - L_N\| \to 0$.

(c) Assuming that λ ∈ ℓ∞, prove that L is self-adjoint if and only if every $\lambda_n$ is real.


We can characterize all the finite-rank operators, as follows.

Proposition 4.25 (Finite-Rank Operators). Let L : H → K be bounded and linear. Then the following statements are equivalent.

(a) L has finite rank.

(b) There exist vectors $\varphi_1, \dots, \varphi_N \in H$ and $\psi_1, \dots, \psi_N \in K$ such that

\[ Lf = \sum_{k=1}^{N} \langle f, \varphi_k\rangle\, \psi_k, \qquad f \in H. \tag{4.4} \]

Proof. (a) ⇒ (b). Since L has finite rank, we know that range(L) is a finite-dimensional subspace of K. Every finite-dimensional subspace is closed, so we can find a finite orthonormal basis $\{\psi_k\}_{k=1}^{N}$ for range(L). Therefore, if f ∈ H then we can express Lf in terms of this orthonormal basis:

\[ Lf = \sum_{k=1}^{N} \langle Lf, \psi_k\rangle\, \psi_k = \sum_{k=1}^{N} \langle f, L^*\psi_k\rangle\, \psi_k = \sum_{k=1}^{N} \langle f, \varphi_k\rangle\, \psi_k, \]

where $\varphi_k = L^*\psi_k$.

(b) ⇒ (a). We have $\operatorname{range}(L) \subseteq \operatorname{span}\{\psi_1, \dots, \psi_N\}$.
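In finite dimensions, formula (4.4) has a familiar matrix form. As an illustrative sketch, assume H = Cⁿ and K = Cᵐ with the standard inner products, so that 〈f, ϕ〉 = ϕ∗f when vectors are regarded as columns; the particular vectors in the example are arbitrary choices.

```latex
Lf = \sum_{k=1}^{N} \langle f, \varphi_k\rangle\, \psi_k
   = \sum_{k=1}^{N} \psi_k\, (\varphi_k^* f)
   = \Bigl( \sum_{k=1}^{N} \psi_k \varphi_k^* \Bigr) f,
\qquad\text{e.g.}\qquad
\psi = \begin{pmatrix} 1 \\ 2 \end{pmatrix},\quad
\varphi = \begin{pmatrix} 3 \\ 4 \end{pmatrix}
\ \Longrightarrow\
\psi \varphi^* = \begin{pmatrix} 3 & 4 \\ 6 & 8 \end{pmatrix}.
```

Thus in this setting L is a sum of N outer products, each of rank at most one, and the displayed 2 × 2 matrix has proportional rows, as a rank-one matrix must.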

Corollary 4.26. If L ∈ B(H,K) has rank 1, then there exist ϕ ∈ H and ψ ∈ K such that

Lf = 〈f, ϕ〉ψ, f ∈ H.

In particular, if K = H and ϕ = ψ is a unit vector, then Lf = 〈f, ϕ〉ϕ is the orthogonal projection of H onto span{ϕ}.

Exercise 4.27. Compute the adjoint of L given by (4.4). Conclude that the adjoint of a finite-rank operator is also finite-rank.

Exercise 4.28. Show that if A ∈ B(H) and AT = TA for every finite-rank T then A = cI for some scalar c.

Exercise 4.29. Use the idea of Proposition 4.25 to show that if H is separable and L : H → H is any bounded linear operator, then there exist finite-rank operators $L_N$ that converge to L in the strong operator topology, i.e., $\|Lf - L_N f\| \to 0$ for each individual f. However, observe that if L is not compact, then we cannot have $L_N \to L$ in operator norm.

The following result shows that not only is the operator norm limit of a sequence of finite-rank operators compact, but every compact operator can be realized as the operator norm limit of finite-rank operators.

Theorem 4.30. If T ∈ B(H,K), then the following statements are equivalent.

(a) T is compact.

(b) There exist finite-rank operators $T_n$ ∈ B(H,K) such that $T_n \to T$ in operator norm.

As a consequence, we have that B00(H,K) is a dense subspace of B0(H,K), i.e.,

\[ \overline{B_{00}(H,K)} = B_0(H,K) \qquad \text{(closure in operator norm)}. \]

Proof. (b) ⇒ (a). This follows from Theorem 4.15 and the fact that all bounded finite-rank operators are compact.

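(a) ⇒ (b). Assume that T is compact. Let $R = \overline{\operatorname{range}(T)}$. If R is finite-dimensional, then T is finite-rank, and we are done. So, assume that R is infinite-dimensional. By Exercise 4.18, we know that range(T), and hence R, is separable, so there exists a countable orthonormal basis $\{e_n\}_{n\in\mathbb{N}}$ for R. For any f ∈ H we have Tf ∈ R, so

\[ Tf = \sum_{n=1}^{\infty} \langle Tf, e_n\rangle\, e_n, \qquad f \in H. \]

Define

\[ T_N f = \sum_{n=1}^{N} \langle Tf, e_n\rangle\, e_n, \qquad f \in H, \]

and note that $T_N = P_N T$ where $P_N$ is the orthogonal projection of K onto the closed subspace $\operatorname{span}\{e_1, \dots, e_N\}$. By definition, we have that $T_N$ converges to T in the strong operator topology, i.e., $T_N f \to Tf$ for every f. Our goal is to show more, namely, to show that $T_N \to T$ in operator norm. That is, we need to show that $\sup_{\|f\|=1} \|Tf - T_N f\| \to 0$.

Choose any ε > 0. Since $T(\mathrm{Ball}_H)$ is totally bounded, it is covered by finitely many balls of radius $\frac{\varepsilon}{3}$ centered at points in $T(\mathrm{Ball}_H)$. Hence, there exist $h_1, \dots, h_m \in \mathrm{Ball}_H$ such that

\[ T(\mathrm{Ball}_H) \subseteq \bigcup_{k=1}^{m} B\Bigl(T h_k, \frac{\varepsilon}{3}\Bigr). \]

Since $\lim_{N\to\infty} \|Th_k - T_N h_k\| = 0$ for k = 1, . . . , m, we can find an $N_0$ such that

\[ \forall\, N > N_0, \quad \|Th_k - T_N h_k\| < \frac{\varepsilon}{3}, \qquad k = 1, \dots, m. \]

Choose any f with ‖f‖ = 1 and any $N > N_0$. Then $Tf \in B\bigl(Th_k, \frac{\varepsilon}{3}\bigr)$ for some k, i.e.,

\[ \|Tf - Th_k\| < \frac{\varepsilon}{3}. \]

Therefore we also have

\[
\begin{aligned}
\|T_N f - T_N h_k\|
&= \Bigl\| \sum_{n=1}^{N} \langle T(f - h_k), e_n\rangle\, e_n \Bigr\| \\
&= \Bigl( \sum_{n=1}^{N} |\langle T(f - h_k), e_n\rangle|^2 \Bigr)^{1/2} \\
&\le \Bigl( \sum_{n=1}^{\infty} |\langle T(f - h_k), e_n\rangle|^2 \Bigr)^{1/2} \\
&= \|Tf - Th_k\| < \frac{\varepsilon}{3}.
\end{aligned}
\]

Alternatively, this follows even more simply from the fact that

\[ \|T_N f - T_N h_k\| = \|P_N Tf - P_N Th_k\| \le \|P_N\|\, \|Tf - Th_k\| < 1 \cdot \frac{\varepsilon}{3}. \]

In any case, it follows that

\[ \|Tf - T_N f\| \le \|Tf - Th_k\| + \|Th_k - T_N h_k\| + \|T_N h_k - T_N f\| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon. \]

This is true for every unit vector, so we have $\|T - T_N\| \le \varepsilon$ for all $N > N_0$. Therefore, we do indeed have $\|T - T_N\| \to 0$.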

Corollary 4.31. If T ∈ B(H,K), then

T is compact ⇐⇒ T∗ is compact.

Proof. Assume that T is compact. Then there exist finite-rank operators $T_N$ such that $T_N \to T$. Hence $T_N^* \to T^*$ (why?), but each $T_N^*$ is finite-rank, so T∗ is compact. The converse is symmetrical.

Exercise 4.32. Extend Example 4.23 as follows. Let H be a separable Hilbert space, and let $\{e_n\}_{n\in\mathbb{N}}$ be an orthonormal basis for H. Let $\lambda = (\lambda_n)_{n\in\mathbb{N}} \in \ell^\infty(\mathbb{N})$ be given. Define $Le_n = \lambda_n e_n$. Prove that the definition of L can be extended to all of H in such a way that L is a bounded linear operator. Prove that this operator L is compact if and only if $\lambda_n \to 0$.

The next result shows that an integral operator with a square-integrable kernel is compact.

Theorem 4.33. Let (X,Ω, µ) be a σ-finite measure space. If k ∈ L²(X × X), then the integral operator

\[ Lf(x) = \int_X k(x, y)\, f(y)\, d\mu(y), \qquad f \in L^2(X), \]

defines a compact mapping of L²(X) into itself. Further, $\|L\| \le \|k\|_2$.

Proof. Note that by Theorem 1.22 we already know that L defines a bounded operator, and that $\|L\| \le \|k\|_2$. So, we need only show that L is compact.

For simplicity, we will consider only the case where L²(X) is separable. In this case there exists an orthonormal basis $\{e_n\}_{n\in\mathbb{N}}$ for L²(X). Define

\[ e_{mn}(x, y) = e_m(x)\, \overline{e_n(y)}, \qquad x, y \in X. \]

Then it is easy to see that $\{e_{mn}\}_{m,n\in\mathbb{N}}$ is an orthonormal sequence in L²(X × X), and with more work (exercise³) it can be shown that it is also complete and hence forms an orthonormal basis for L²(X × X). Since k ∈ L²(X × X), we therefore have

\[ k = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} \langle k, e_{mn}\rangle\, e_{mn}, \]

where this series converges in the norm of L²(X × X), and in fact it converges unconditionally. For each N ∈ N define an approximation to k by setting

\[ k_N = \sum_{m=1}^{N} \sum_{n=1}^{N} \langle k, e_{mn}\rangle\, e_{mn}. \]

Then $k_N \to k$ in L²-norm.

Now define an approximation to L by defining $L_N$ to be the integral operator with kernel $k_N$, i.e.,

\[ L_N f(x) = \int_X k_N(x, y)\, f(y)\, d\mu(y), \qquad f \in L^2(X). \]

Since $k_N$ ∈ L²(X × X), we know that $L_N$ is bounded. Further, since the sums involved are finite, we can interchange sums and integrals in the following calculation to obtain that

\[
\begin{aligned}
L_N f(x) &= \int_X k_N(x, y)\, f(y)\, d\mu(y) \\
&= \int_X \sum_{m=1}^{N} \sum_{n=1}^{N} \langle k, e_{mn}\rangle\, e_{mn}(x, y)\, f(y)\, d\mu(y) \\
&= \sum_{m=1}^{N} \sum_{n=1}^{N} \langle k, e_{mn}\rangle \int_X e_m(x)\, \overline{e_n(y)}\, f(y)\, d\mu(y) \\
&= \sum_{m=1}^{N} \sum_{n=1}^{N} \langle k, e_{mn}\rangle\, \langle f, e_n\rangle\, e_m(x).
\end{aligned}
\]

This is an equality of functions, i.e., $L_N f = \sum_{m=1}^{N} \sum_{n=1}^{N} \langle k, e_{mn}\rangle \langle f, e_n\rangle\, e_m$ a.e. In any case we have $L_N f \in \operatorname{span}\{e_1, \dots, e_N\}$, so $L_N$ has finite rank. Since $L_N$ is bounded (why?), it is therefore compact.

Consequently, if we can show that $L_N \to L$, then we can conclude that L itself is compact. Note that $L - L_N$ is simply the integral operator with kernel $k - k_N$. Since $k - k_N$ ∈ L²(X × X), we know that $L - L_N$ is bounded, and that

\[ \|L - L_N\| \le \|k - k_N\|_2 \to 0 \quad \text{as } N \to \infty. \]

Hence $L_N \to L$ in operator norm, so L is compact.

³For details on this type of argument, see the “Real Analysis Review” handout on the instructor’s webpage.
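The bound ‖L‖ ≤ ‖k‖₂ can be observed in a discretized model. The sketch below assumes X = [0, 1] with Lebesgue measure and the kernel k(x, y) = min(x, y), replaces the integral by a midpoint rule on `n` cells, and estimates the operator norm of the resulting matrix by power iteration. The kernel, the grid size, and the helper names are illustrative assumptions, and the discretization is only an approximation of the integral operator.

```python
import math

n = 50
xs = [(i + 0.5) / n for i in range(n)]     # midpoints of a uniform grid on [0, 1]

def k(x, y):
    return min(x, y)                       # illustrative real, symmetric kernel

# Midpoint rule: (Lf)(x_i) is approximated by sum_j k(x_i, x_j) f(x_j) / n.
M = [[k(x, y) / n for y in xs] for x in xs]

# Discrete analogue of ||k||_2: the Frobenius norm of M.
hs_norm = math.sqrt(sum(entry * entry for row in M for entry in row))

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

# Power iteration on the symmetric matrix M. For a unit vector v,
# ||M v|| never exceeds the operator norm of M, so this is a lower estimate.
v = [1.0 / math.sqrt(n)] * n
op_estimate = 0.0
for _ in range(200):
    w = matvec(M, v)
    op_estimate = norm(w)
    v = [x / op_estimate for x in w]

print(op_estimate, hs_norm)
```

For this kernel ‖k‖₂ = 1/√6, and the Frobenius norm of the matrix approximates that value; the power-iteration estimate of the operator norm stays below it.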

For the remainder of this section, we consider eigenvalues and eigenvectors of compact operators.

Definition 4.34. Let A ∈ B(H) be given.

(a) A scalar λ ∈ F is an eigenvalue of A if there exists a nonzero vector f ∈ H such that Af = λf. Equivalently, λ is an eigenvalue if ker(A − λI) ≠ {0}.

(b) If λ ∈ F is an eigenvalue of A, then any nonzero vector in ker(A − λI) is called an eigenvector of A corresponding to the eigenvalue λ, or simply a λ-eigenvector for short. Equivalently, a nonzero vector f ∈ H is a λ-eigenvector if Af = λf.

(c) If λ ∈ F is an eigenvalue of A, then ker(A − λI) is called the eigenspace of A corresponding to the eigenvalue λ, or simply a λ-eigenspace for short. Every nonzero vector in the λ-eigenspace is a λ-eigenvector of A.

(d) The point spectrum σp(A) of A is the set of eigenvalues of A:

σp(A) = {λ ∈ F : λ is an eigenvalue of A}.

Exercise 4.35. Let $\{e_n\}_{n\in\mathbb{N}}$ be an orthonormal basis for a separable Hilbert space H, and let $\lambda = (\lambda_n)_{n\in\mathbb{N}} \in \ell^\infty(\mathbb{N})$ be fixed. Let L : H → H be the bounded operator $Lf = \sum \lambda_n \langle f, e_n\rangle\, e_n$ defined in Example 1.7.

(a) Show that $\sigma_p(L) = \{\lambda_n : n \in \mathbb{N}\}$.

(b) Show that if µ is one component of λ and $J = \{n \in \mathbb{N} : \lambda_n = \mu\}$, then the µ-eigenspace of L is $\overline{\operatorname{span}}\{e_n\}_{n\in J}$.

(c) Show that the eigenspaces of L corresponding to distinct eigenvalues are orthogonal.

Exercise 4.36. Let L ∈ B(H) be given. Prove the following.

(a) If L is self-adjoint, then all eigenvalues of L are real.

(b) If L is positive (〈Lf, f〉 ≥ 0 for all f), then all eigenvalues of L are nonnegative.

(c) If L is positive definite (〈Lf, f〉 > 0 for all f ≠ 0), then all eigenvalues of L are strictly positive.

(d) If L is unitary, then every eigenvalue λ satisfies |λ| = 1.

Exercise 4.37. Suppose that L ∈ B(H) is normal. Prove that if λ ≠ µ are distinct eigenvalues of L, then the corresponding eigenspaces are orthogonal, i.e., ker(L − λI) ⊥ ker(L − µI).

While any linear operator A : Cⁿ → Cⁿ must have an eigenvalue, bounded operators on infinite-dimensional Hilbert spaces need not have any eigenvalues.

Exercise 4.38. (a) Prove that the Volterra operator defined in Exercise 1.25 is compact but has no eigenvalues, i.e., its point spectrum is empty.

(b) Prove that the right-shift operator R on ℓ²(N) has no eigenvalues.

(c) Prove that every scalar λ with |λ| < 1 is an eigenvalue of the left-shift operator L on ℓ²(N), and find the corresponding eigenvectors. Thus, this operator has uncountably many eigenvalues.

(d) Let φ(x) = x. Prove that the multiplication operator Mφ : L²[0, 1] → L²[0, 1], defined by Mφf(x) = xf(x), is self-adjoint but has no eigenvalues.

(e) Define

\[ k(x, y) = \begin{cases} i, & y \le x, \\ -i, & y > x, \end{cases} \]

and let L : L²[0, 1] → L²[0, 1] be the integral operator with kernel k. Prove that L is both compact and self-adjoint. Prove that the eigenvalues of L are $\lambda_k = \frac{2}{(2k+1)\pi}$ (and only these), and find the corresponding eigenvectors.

Exercise 4.39 (Convolution). Fix g ∈ L²[0, 1], where we consider functions in L²[0, 1] to be 1-periodically extended to the real line. Let T be the convolution operator

\[ Tf(x) = (f * g)(x) = \int_0^1 g(x - y)\, f(y)\, dy. \]

(a) Prove that T is compact. Hint: Write T as an integral operator and show that its kernel is square-integrable. Note that the fact that [0, 1] has finite measure is important.

(b) Prove that the complex exponential functions $e_n(x) = e^{2\pi i n x}$ are eigenvectors of T, with corresponding eigenvalues $\hat{g}(n)$ (the Fourier coefficients of g).

Exercise 4.40. (a) Assume that A ∈ B(H) is normal and let λ ∈ F be given. Show that A − λI is normal. Use this to show that $\ker(A - \lambda I) = \ker(A^* - \bar{\lambda} I)$. Conclude that if λ is an eigenvalue of A then $\bar{\lambda}$ is an eigenvalue of A∗, and the corresponding eigenspaces are equal.

Hint: Consider Corollary 2.21.

(b) Find an example of a non-normal operator for which the conclusion of part (a) fails. Hint: Consider a shift operator.

The next result shows that the eigenspaces (if any) of a compact operator corresponding to nonzero eigenvalues must be finite-dimensional.

Proposition 4.41. Assume that T : H → H is compact and that λ ≠ 0 is an eigenvalue of T. Then ker(T − λI) is finite-dimensional.

Proof. Since T is bounded, we know that ker(T − λI) is a closed subspace of H. Suppose that it was infinite-dimensional. Then we could find an infinite orthonormal sequence $\{e_n\}_{n\in\mathbb{N}}$ in ker(T − λI). Each $e_n$ is a λ-eigenvector, i.e., $Te_n = \lambda e_n$. But then $\{e_n\}_{n\in\mathbb{N}}$ is a bounded sequence in H, yet for m ≠ n we have

\[ \|Te_m - Te_n\| = |\lambda|\, \|e_m - e_n\| = |\lambda| \sqrt{2}, \]

so since λ ≠ 0 there can be no convergent subsequences of $\{Te_n\}_{n\in\mathbb{N}}$, which contradicts the fact that T is compact.
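The value √2 in the proof comes directly from orthonormality. Spelled out for m ≠ n:

```latex
\|e_m - e_n\|^2
  = \langle e_m - e_n,\, e_m - e_n\rangle
  = \|e_m\|^2 - 2\operatorname{Re}\langle e_m, e_n\rangle + \|e_n\|^2
  = 1 - 0 + 1 = 2,
\qquad m \ne n .
```

Hence any two distinct terms of the image sequence are at distance |λ|√2 from each other, so no subsequence of it can be Cauchy.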

The following is one useful theoretical result which implies the existence of an eigenvalue of a compact operator T. It states that if $\inf_{\|f\|=1} \|Tf - \lambda f\| = 0$, then this infimum is actually achieved, i.e., ‖Tf − λf‖ = 0 for some unit vector f, or in other words, there exists a λ-eigenvector for T.

Proposition 4.42. Assume that T : H → H is compact and that λ ≠ 0 is given. Then:

\[ \inf_{\|f\|=1} \|Tf - \lambda f\| = 0 \quad\Longrightarrow\quad \lambda \in \sigma_p(T). \]


Proof. Assume that $\inf_{\|f\|=1} \|Tf - \lambda f\| = 0$. Then we can find unit vectors $f_n$ such that $\|Tf_n - \lambda f_n\| \to 0$. Since T is compact, $\{Tf_n\}_{n \in \mathbb{N}}$ has a convergent subsequence, say $Tf_{n_k} \to g \in H$. Since λ ≠ 0 we have

$$f_{n_k} = \frac{(\lambda f_{n_k} - Tf_{n_k}) + Tf_{n_k}}{\lambda} \to \frac{0 + g}{\lambda} = \frac{g}{\lambda}. \tag{4.5}$$

Since the $f_{n_k}$ are unit vectors, we conclude that g ≠ 0 (indeed, ‖g‖ = |λ|). Moreover, since T is continuous it follows from (4.5) that $Tf_{n_k} \to Tg/\lambda$. But we also know that $Tf_{n_k} \to g$, so we conclude that Tg/λ = g, or in other words that g is a λ-eigenvector.

Corollary 4.43. Assume T : H → H is compact and that λ ≠ 0. If $\lambda \notin \sigma_p(T)$ and $\bar{\lambda} \notin \sigma_p(T^*)$, then T − λI is a bounded bijection of H onto itself, and $(T - \lambda I)^{-1}$ is bounded.

Proof. Since λ is not an eigenvalue, we know that T − λI is injective. Further, it follows from the preceding proposition that $\inf_{\|f\|=1} \|Tf - \lambda f\| > 0$. Hence there exists a C > 0 such that ‖Tf − λf‖ ≥ C for every unit vector f, and hence, by scaling,

$$\forall\, f \in H, \quad \|Tf - \lambda f\| \ge C\, \|f\|. \tag{4.6}$$

It follows from Exercise 1.2 that range(T − λI) is a closed subspace of H. But then, since $\bar{\lambda}$ is not an eigenvalue of $T^*$, we have that

$$\operatorname{range}(T - \lambda I) = \overline{\operatorname{range}(T - \lambda I)} = \ker\bigl((T - \lambda I)^*\bigr)^\perp = \ker(T^* - \bar{\lambda} I)^\perp = \{0\}^\perp = H.$$

Thus T − λI is a bounded bijection. It remains to show that $(T - \lambda I)^{-1}$ is bounded. Given f ∈ H we have from (4.6) that

$$\|f\| = \|(T - \lambda I)(T - \lambda I)^{-1} f\| \ge C\, \|(T - \lambda I)^{-1} f\|.$$

Rearranging, we see that $\|(T - \lambda I)^{-1}\| \le \frac{1}{C} < \infty$.

Actually, it can be shown that if T : H → H is compact, λ ≠ 0, and $\lambda \notin \sigma_p(T)$, then $\bar{\lambda} \notin \sigma_p(T^*)$ follows automatically.

5. The Diagonalization of Compact Self-Adjoint Operators

First let us summarize the facts that have been developed regarding the operator L introduced in Example 1.7 and studied in other examples in previous sections.

Theorem 5.1. Let $\{e_n\}_{n \in \mathbb{N}}$ be an orthonormal basis for a separable Hilbert space H, and let $\lambda = (\lambda_n)_{n \in \mathbb{N}} \in \ell^\infty(\mathbb{N})$ be a bounded sequence of scalars. Define

$$Lf = \sum_{n=1}^{\infty} \lambda_n \langle f, e_n \rangle\, e_n, \qquad f \in H. \tag{5.1}$$

Then the following statements hold.

(a) L is bounded, and ‖L‖ = ‖λ‖∞.


(b) L is normal, and $L^* f = \sum_{n=1}^{\infty} \bar{\lambda}_n \langle f, e_n \rangle\, e_n$.

(c) L is self-adjoint if and only if λn ∈ R for every n.

(d) L is compact if and only if λn → 0.
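Parts (a) and (d) can be illustrated numerically. The sketch below is only a finite-dimensional truncation under stated assumptions: it works in the standard basis of a 50-dimensional space, takes the hypothetical sequence $\lambda_n = 1/n$, and uses helper names that do not appear in the notes.

```python
import math

def apply_diagonal(lam, f):
    """Apply L f = sum_n lam[n] <f, e_n> e_n in the standard basis."""
    return [l * c for l, c in zip(lam, f)]

def norm(f):
    return math.sqrt(sum(abs(c) ** 2 for c in f))

d = 50
lam = [1.0 / (n + 1) for n in range(d)]    # lam_n = 1/n, which tends to 0
sup_lam = max(abs(l) for l in lam)          # the sup norm of lam

# (a) ||L f|| <= ||lam||_inf ||f|| for an arbitrary test vector f ...
f = [math.sin(n + 1) for n in range(d)]
ratio = norm(apply_diagonal(lam, f)) / norm(f)

# ... and the bound is attained at the basis vector carrying the largest |lam_n|.
e1 = [1.0] + [0.0] * (d - 1)
attained = norm(apply_diagonal(lam, e1))

# (d) Keeping only the first N terms gives a finite-rank operator L_N.
# L - L_N is again diagonal, so by (a) its norm is the largest remaining |lam_n|.
def tail_norm(lam, N):
    return max((abs(l) for l in lam[N:]), default=0.0)

tails = [tail_norm(lam, N) for N in (1, 5, 10, 25)]
print(ratio, attained, tails)
```

The shrinking tail norms show L as an operator-norm limit of finite-rank operators, which is the mechanism behind the "if" direction of (d).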

Exercise 5.2. Assume that $\lambda_n \to 0$ and that L is defined by (5.1). In the definition of L, combine those terms corresponding to identical $\lambda_n$ together. That is, let $\mu = (\mu_k)_{k \in I}$ be the sequence of distinct values in λ (so I is either $\{1, \dots, N\}$ if there are only finitely many distinct values, or $I = \mathbb{N}$ if there are infinitely many). If we set $J_k = \{n \in \mathbb{N} : \lambda_n = \mu_k\}$, then

$$P_k f = \sum_{n \in J_k} \langle f, e_n \rangle\, e_n$$

is the orthogonal projection of H onto $\overline{\operatorname{span}}\{e_n\}_{n \in J_k}$. Show that the operator L defined in (5.1) can be rewritten as

$$Lf = \sum_{k \in I} \mu_k P_k f, \qquad f \in H,$$

with convergence of the series in the norm of H. Show further that

$$L = \sum_{k \in I} \mu_k P_k,$$

with convergence of the series in operator norm. Show that $\overline{\operatorname{span}}\{e_n\}_{n \in J_k}$ is the $\mu_k$-eigenspace of L. Show that $P_j P_k = P_k P_j = 0$ for all j ≠ k ∈ I, and consequently the eigenspaces corresponding to distinct eigenvalues are orthogonal.
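The regrouping in this exercise can be tried on a small example. The sketch below assumes a six-dimensional space with the standard basis and a made-up sequence `lam` containing repeated values. It is an illustration of the bookkeeping only, not a solution to the exercise.

```python
lam = [1.0, 1.0, 0.5, 0.5, 0.5, 0.25]   # hypothetical sequence with repeats

# Distinct values mu_k (in order of first appearance) and the index sets J_k.
mu = list(dict.fromkeys(lam))
J = [[n for n, l in enumerate(lam) if l == m] for m in mu]

def project(Jk, f):
    """P_k f: keep the coordinates of f indexed by J_k and zero out the rest."""
    return [c if n in Jk else 0.0 for n, c in enumerate(f)]

def apply_L(f):
    """L f = sum_n lam_n <f, e_n> e_n in the standard basis."""
    return [l * c for l, c in zip(lam, f)]

def apply_grouped(f):
    """The regrouped form sum_k mu_k P_k f."""
    out = [0.0] * len(f)
    for m, Jk in zip(mu, J):
        Pf = project(Jk, f)
        out = [o + m * p for o, p in zip(out, Pf)]
    return out

f = [3.0, -1.0, 2.0, 0.5, 4.0, -2.0]
print(apply_L(f), apply_grouped(f))
```

Because the index sets $J_k$ are disjoint, composing two different projections zeroes every coordinate, which is the finite-dimensional shadow of $P_j P_k = 0$.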

In this section we will prove a converse result, showing that all compact, self-adjoint operators on a Hilbert space can be represented in the form of (5.1). First, however, we need to develop some useful machinery.

Exercise 5.3. If λ is an eigenvalue of L ∈ B(H), then |λ| ≤ ‖L‖.

Exercise 5.4. Let A be an n × n complex matrix. Define its spectral radius to be

$$\rho(A) = \max\{|\lambda| : \lambda \text{ is an eigenvalue of } A\} = \max\{|\lambda| : \lambda \in \sigma_p(A)\}.$$

By the preceding exercise, if we choose any norm on $\mathbb{C}^n$ and let ‖A‖ be the corresponding operator norm, we have

ρ(A) ≤ ‖A‖.

(a) Prove that if A is self-adjoint and we use the Euclidean ($\ell^2$) norm on $\mathbb{C}^n$, then

‖A‖ = ρ(A).

(b) Prove that the same equality holds if A is normal. Find an example of a non-normal matrix for which ρ(A) < ‖A‖.


(c) Prove that if A is any n × n matrix, then (still using the Euclidean norm on $\mathbb{C}^n$),

$$\|A\| = \rho(A^* A)^{1/2}.$$

(d) (Harder). Prove that if A is a fixed but arbitrary n × n complex matrix and ε > 0 is given, then there exists a norm on $\mathbb{C}^n$ such that the corresponding operator norm of A satisfies

‖A‖ ≤ ρ(A) + ε.
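The gap between spectral radius and Euclidean operator norm is easy to see for real 2 × 2 matrices. The sketch below assumes the formula from part (c) and the closed-form eigenvalues of a symmetric 2 × 2 matrix. The two sample matrices and the function names are illustrative choices.

```python
import math

def spectral_radius_sym2(a, b, d):
    """Largest |eigenvalue| of the real symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return max(abs(mean + radius), abs(mean - radius))

def operator_norm_2x2(A):
    """Euclidean operator norm of a real 2x2 matrix via ||A|| = rho(A^T A)^(1/2)."""
    (p, q), (r, s) = A
    # Entries of the symmetric matrix A^T A.
    a = p * p + r * r
    b = p * q + r * s
    d = q * q + s * s
    return math.sqrt(spectral_radius_sym2(a, b, d))

# Self-adjoint case: the norm equals the spectral radius, as in part (a).
S = [[2.0, 1.0], [1.0, 2.0]]
rho_S = spectral_radius_sym2(2.0, 1.0, 2.0)
norm_S = operator_norm_2x2(S)

# Non-normal case: upper triangular, so both eigenvalues are the diagonal entry 1.
A = [[1.0, 1.0], [0.0, 1.0]]
rho_A = 1.0
norm_A = operator_norm_2x2(A)

print(rho_S, norm_S, rho_A, norm_A)
```

For the triangular matrix the norm works out to the golden ratio $(1 + \sqrt{5})/2$, strictly larger than its spectral radius 1, while the symmetric matrix has norm and spectral radius both equal to 3.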

Although an arbitrary compact operator need not have any eigenvalues (see Exercise 4.38), the following result shows that a compact, self-adjoint operator must have at least one eigenvalue.

Proposition 5.5. If T : H → H is compact and self-adjoint, then either ‖T‖ or −‖T‖ is an eigenvalue of T.

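Proof. Since T is self-adjoint, we know from Proposition 2.18 that

$$\|T\| = \sup_{\|f\|=1} |\langle Tf, f \rangle|.$$

Hence, there must exist unit vectors $f_n$ such that $|\langle Tf_n, f_n \rangle| \to \|T\|$. Since T is self-adjoint, all the inner products $\langle Tf_n, f_n \rangle$ are real, so we can find a subsequence that converges either to ‖T‖ or to −‖T‖. Call this subsequence $\{g_n\}_{n \in \mathbb{N}}$, and let λ be either ‖T‖ or −‖T‖ as appropriate. Then we have $\|g_n\| = 1$ for every n, and $\langle Tg_n, g_n \rangle \to \lambda$. Hence, since both λ and $\langle Tg_n, g_n \rangle$ are real,

$$\begin{aligned}
\|Tg_n - \lambda g_n\|^2 &= \|Tg_n\|^2 - 2\lambda\, \langle Tg_n, g_n \rangle + \lambda^2\, \|g_n\|^2 \\
&\le \|T\|^2\, \|g_n\|^2 - 2\lambda\, \langle Tg_n, g_n \rangle + \lambda^2\, \|g_n\|^2 \\
&= \lambda^2 - 2\lambda\, \langle Tg_n, g_n \rangle + \lambda^2 \\
&\to \lambda^2 - 2\lambda^2 + \lambda^2 = 0.
\end{aligned}$$

If T = 0 the claim is trivial. Otherwise λ ≠ 0, and it therefore follows from Proposition 4.42 that λ is an eigenvalue of T.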
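The first equality in the chain above is where the realness of λ and of $\langle Tg_n, g_n \rangle$ is used. As a worked step, assuming an inner product that is linear in its first variable (the conclusion does not depend on this convention once λ is real):

```latex
\|Tg_n - \lambda g_n\|^2
  = \langle Tg_n - \lambda g_n,\; Tg_n - \lambda g_n \rangle
  = \|Tg_n\|^2
    - \bar{\lambda}\, \langle Tg_n, g_n \rangle
    - \lambda\, \langle g_n, Tg_n \rangle
    + |\lambda|^2\, \|g_n\|^2 .
```

Since λ is real we have $\bar{\lambda} = \lambda$ and $|\lambda|^2 = \lambda^2$, and since $\langle Tg_n, g_n \rangle$ is real we have $\langle g_n, Tg_n \rangle = \overline{\langle Tg_n, g_n \rangle} = \langle Tg_n, g_n \rangle$, so the two cross terms combine into $-2\lambda \langle Tg_n, g_n \rangle$.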

Now we can prove that every compact, self-adjoint operator has a very simple and special form.

Theorem 5.6 (Spectral Theorem for Compact Self-Adjoint Operators). Let T : H → H be compact and self-adjoint. Then there exist nonzero real numbers $\{\lambda_n\}_{n \in J}$, either finitely many or $\lambda_n \to 0$ if infinitely many, and an orthonormal basis $\{e_n\}_{n \in J}$ of $\overline{\operatorname{range}(T)}$, such that

$$Tf = \sum_{n \in J} \lambda_n \langle f, e_n \rangle\, e_n, \qquad f \in H.$$

Each $\lambda_n$ is an eigenvalue of T, and each $e_n$ is a corresponding eigenvector.


Proof. Note that since T is compact, range(T) is separable by Exercise 4.18. If T = 0 then the result is trivial, so assume that T is not the zero operator.

Let $H_1 = H$ and $T_1 = T$. By Proposition 5.5, $T_1$ has an eigenvalue $\lambda_1$ which satisfies $|\lambda_1| = \|T_1\| > 0$. Let $e_1$ be a corresponding eigenvector, normalized to $\|e_1\| = 1$.

Let $H_2 = \{e_1\}^\perp$ and let $T_2 = T|_{H_2}$ (the restriction of T to $H_2$). If $T_2 = 0$, then stop at this point. Otherwise, continue as follows.

Since $\operatorname{span}\{e_1\}$ is invariant under $T_1$ (after all, $e_1$ is an eigenvector), we know from Exercise 3.10 that $H_2$ is invariant under $T_1^* = T_1$. Exercise: Show that $T_2 : H_2 \to H_2$ is compact and self-adjoint. Therefore $T_2$ has an eigenvalue $\lambda_2$ such that $|\lambda_2| = \|T_2\| > 0$. Note that since $T_2$ is a restriction of $T_1$, we have $|\lambda_2| = \|T_2\| \le \|T_1\| = |\lambda_1|$. Let $e_2$ be a corresponding eigenvector, normalized to $\|e_2\| = 1$. Note that by definition of $H_2$, we have $e_2 \perp e_1$. Further, $\lambda_2$ is an eigenvalue of T (not just $T_2$), and $e_2$ is a corresponding eigenvector of T.

Let $H_3 = \{e_1, e_2\}^\perp$ and let $T_3 = T|_{H_3}$. If $T_3 = 0$, then stop at this point. Otherwise, continue as before to construct an eigenvalue $\lambda_3$ and eigenvector $e_3$ (which will be orthogonal to both $e_1$ and $e_2$).

Continuing in this process, there are two possibilities.

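Case 1: $T_{N+1} = 0$ for some N. In this case, since $H_{N+1} = \{e_1, \dots, e_N\}^\perp$, we have

$$H = \operatorname{span}\{e_1, \dots, e_N\} \oplus H_{N+1}.$$

Therefore, if f ∈ H then we can write f uniquely as

$$f = \sum_{n=1}^{N} \langle f, e_n \rangle\, e_n + v$$

where $v \in H_{N+1}$. Since $T(v) = T_{N+1}(v) = 0$, we therefore have

$$Tf = \sum_{n=1}^{N} \langle f, e_n \rangle\, T(e_n) + T(v) = \sum_{n=1}^{N} \lambda_n \langle f, e_n \rangle\, e_n.$$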

In this case T is finite-rank and the proof is complete.

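Case 2: $T_N \ne 0$ for every N. In this case we obtain countably many eigenvalues $\lambda_n$ and corresponding orthonormal eigenvectors $e_n$. Since T is compact, we have by Exercise 4.10 that $\lambda_n e_n = T(e_n) \to 0$. Since $\|e_n\| = 1$, we conclude that $\lambda_n \to 0$.

Let $M = \overline{\operatorname{span}}\{e_n\}_{n \in \mathbb{N}}$. Then $\{e_n\}_{n \in \mathbb{N}}$ is an orthonormal basis for M, and $H = M \oplus M^\perp$. Hence, if f ∈ H then we can write f uniquely as

$$f = \sum_{n=1}^{\infty} \langle f, e_n \rangle\, e_n + v$$

for some $v \in M^\perp$. Therefore, since T is continuous,

$$Tf = \sum_{n=1}^{\infty} \langle f, e_n \rangle\, T(e_n) + T(v) = \sum_{n=1}^{\infty} \lambda_n \langle f, e_n \rangle\, e_n + T(v).$$

If we show that T(v) = 0, then we are done.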


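Note that since $\operatorname{span}\{e_1, \dots, e_N\} \subseteq M$, we have $v \in M^\perp \subseteq \{e_1, \dots, e_N\}^\perp = H_{N+1}$. Hence

$$\|T(v)\| = \|T_{N+1}(v)\| \le \|T_{N+1}\|\, \|v\| = |\lambda_{N+1}|\, \|v\| \to 0 \quad \text{as } N \to \infty.$$

Consequently T(v) = 0.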
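The greedy construction in this proof can be mimicked numerically for a small symmetric matrix. The sketch below is an illustration under assumptions, not the proof's argument: power iteration stands in for the existence statement of Proposition 5.5, and it is only reliable here because the sample matrix has eigenvalues of distinct absolute value and the starting vector is not orthogonal to any eigenvector. All names are hypothetical.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def matvec(A, x):
    return [dot(row, x) for row in A]

def norm(x):
    return math.sqrt(dot(x, x))

def normalize(x):
    n = norm(x)
    return [c / n for c in x]

def project_out(x, basis):
    """Project x onto the orthogonal complement of an orthonormal list `basis`."""
    for e in basis:
        c = dot(x, e)
        x = [xi - c * ei for xi, ei in zip(x, e)]
    return x

def greedy_eigenpairs(A, start, iterations=500, tol=1e-12):
    """Step n works on H_n = {e_1, ..., e_{n-1}}^perp and looks for an eigenpair
    of the restriction T_n whose eigenvalue has the largest absolute value."""
    pairs = []                                     # list of (lambda_n, e_n)
    for _step in range(len(A)):
        basis = [e for (_lam, e) in pairs]
        x = project_out(start, basis)              # a starting vector in H_n
        if norm(x) < tol:
            break
        x = normalize(x)
        for _it in range(iterations):
            y = project_out(matvec(A, x), basis)   # apply T_n and stay inside H_n
            if norm(y) < tol:                      # T_n is numerically zero (Case 1)
                return pairs
            x = normalize(y)
        pairs.append((dot(matvec(A, x), x), x))    # lambda_n = <T e_n, e_n>
    return pairs

A = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 1.0],
     [0.0, 1.0, 2.0]]
pairs = greedy_eigenpairs(A, start=[1.0, 0.5, 0.25])

# Rebuild A from the representation T f = sum_n lambda_n <f, e_n> e_n.
rebuilt = [[sum(lam * e[i] * e[j] for lam, e in pairs) for j in range(3)]
           for i in range(3)]
print([lam for lam, _ in pairs])
```

As in the proof, the eigenvalues come out in order of nonincreasing absolute value, and summing the rank-one pieces recovers the original matrix up to rounding.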

Since each eigenspace corresponding to nonzero eigenvalues is finite-dimensional, we can group terms corresponding to the same eigenvalue together. Alternatively, we could write a more efficient proof of the Spectral Theorem (as Conway does), by using the same argument on the distinct eigenvalues and corresponding eigenspaces, instead of one eigenvalue and eigenvector at a time. Either way, an extension of the ideas used in the preceding result gives the following expanded form of the Spectral Theorem.

Theorem 5.7 (Spectral Theorem for Compact Self-Adjoint Operators). Let T : H → H be compact and self-adjoint. Then the following statements hold.

(a) T has only a finite or countably infinite number of distinct eigenvalues, and each eigenvalue is real.

(b) Let $\{\mu_1, \mu_2, \dots\} = \{\mu_k\}_{k \in I}$ be the distinct nonzero eigenvalues, where either $I = \{1, \dots, N\}$ or $I = \mathbb{N}$. Then each eigenspace

$$E_k = \ker(T - \mu_k I)$$

is finite-dimensional.

(c) If I is infinite, then $\mu_k \to 0$ as k → ∞.

(d) If $P_k$ is the orthogonal projection onto the eigenspace $E_k$, then $P_j P_k = P_k P_j = 0$ for j ≠ k ∈ I. That is, eigenspaces corresponding to distinct eigenvalues are orthogonal.

(e) We have

$$T = \sum_{k \in I} \mu_k P_k,$$

where the series converges in operator norm.

(f) There exist nonzero real numbers $\{\lambda_n\}_{n \in J}$, either finitely many or $\lambda_n \to 0$ if infinitely many, and an orthonormal sequence $\{e_n\}_{n \in J}$ such that

$$Tf = \sum_{n \in J} \lambda_n \langle f, e_n \rangle\, e_n.$$

The $\lambda_n$ are obtained by repeating each $\mu_k$ according to its multiplicity (the dimension of the eigenspace $E_k$). The sequence $\{e_n\}_{n \in J}$ forms an orthonormal basis for

$$\ker(T)^\perp = \overline{\operatorname{range}(T)}.$$

Corollary 5.8. If T : H → H is compact, self-adjoint, and injective, then H is separable.

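Proof. Since $T = T^*$ and T is injective, we have that $\overline{\operatorname{range}(T)} = \ker(T^*)^\perp = \{0\}^\perp = H$. On the other hand, since T is compact we know from Exercise 4.18 that range(T) is separable. Hence H, being the closure of a separable subspace, is separable.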


Example 5.9 (Diagonalization of Self-Adjoint Matrices). Let us examine what the Spectral Theorem says in finite dimensions. Let A be a self-adjoint n × n matrix (i.e., symmetric if real, and Hermitian if complex). Then the Spectral Theorem says that there exist real nonzero eigenvalues $\lambda_1, \dots, \lambda_k$ and corresponding orthonormal eigenvectors $u_1, \dots, u_k$ such that

$$Ax = \sum_{j=1}^{k} \lambda_j\, (x \cdot u_j)\, u_j. \tag{5.2}$$

We can extend this representation by including the zero eigenvalues of A, as follows. From (5.2), we see that the column space, or range, of A is

$$C(A) = \operatorname{range}(A) = \operatorname{span}\{u_1, \dots, u_k\}.$$

Since A is self-adjoint, its nullspace is the orthogonal complement of its column space, for

$$N(A) = \ker(A) = \operatorname{range}(A)^\perp = C(A)^\perp.$$

Let $u_{k+1}, \dots, u_n$ be an orthonormal basis for N(A), and let $\lambda_{k+1} = \cdots = \lambda_n = 0$. Then $u_1, \dots, u_n$ is an orthonormal basis for $\mathbb{C}^n$ consisting of eigenvectors of A, with corresponding eigenvalues $\lambda_1, \dots, \lambda_n$. Further, we have the following representations:

$$x = \sum_{j=1}^{n} (x \cdot u_j)\, u_j \quad \text{and} \quad Ax = \sum_{j=1}^{n} \lambda_j\, (x \cdot u_j)\, u_j, \qquad x \in \mathbb{F}^n.$$

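Let us rewrite this representation as follows:

$$\begin{aligned}
Ax &= \sum_{j=1}^{n} \lambda_j\, (x \cdot u_j)\, u_j \\
&= \begin{bmatrix} | & & | \\ u_1 & \cdots & u_n \\ | & & | \end{bmatrix}
   \begin{bmatrix} \lambda_1 (x \cdot u_1) \\ \vdots \\ \lambda_n (x \cdot u_n) \end{bmatrix} \\
&= \begin{bmatrix} | & & | \\ u_1 & \cdots & u_n \\ | & & | \end{bmatrix}
   \begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix}
   \begin{bmatrix} x \cdot u_1 \\ \vdots \\ x \cdot u_n \end{bmatrix} \\
&= \begin{bmatrix} | & & | \\ u_1 & \cdots & u_n \\ | & & | \end{bmatrix}
   \begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix}
   \begin{bmatrix} \text{---}\ u_1^H\ \text{---} \\ \vdots \\ \text{---}\ u_n^H\ \text{---} \end{bmatrix} x \\
&= U \Lambda U^H x,
\end{aligned}$$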

where U is the matrix that has $u_1, \dots, u_n$ as columns, and Λ is the diagonal matrix with $\lambda_1, \dots, \lambda_n$ on the diagonal. On the one hand, this is nothing more than the diagonalization of A. However, this says much more: every self-adjoint matrix can be diagonalized (even if some eigenvalues are repeated), and furthermore, the eigenvector matrix is unitary (because it has orthonormal columns). We summarize this next as a theorem.
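The factorization can be checked by hand on a 2 × 2 example. The sketch below assumes the real symmetric matrix with rows (2, 1) and (1, 2), whose eigenpairs are easy to verify directly; in the real case $U^H$ is just the transpose. The helper names are illustrative.

```python
import math

s = 1.0 / math.sqrt(2.0)
u1, u2 = [s, s], [s, -s]          # orthonormal eigenvectors of A
lam1, lam2 = 3.0, 1.0             # A u1 = 3 u1 and A u2 = 1 u2
A = [[2.0, 1.0], [1.0, 2.0]]

def matmul(X, Y):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

U = transpose([u1, u2])           # eigenvectors as columns
Lam = [[lam1, 0.0], [0.0, lam2]]  # diagonal matrix of eigenvalues
UH = transpose(U)                 # real case: conjugate transpose = transpose

rebuilt = matmul(matmul(U, Lam), UH)   # should reproduce A
gram = matmul(UH, U)                   # should be the identity, since U is unitary
print(rebuilt, gram)
```

Both checks hold up to rounding: the product recovers A, and the Gram matrix of the columns of U is the identity.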


Theorem 5.10 (Diagonalization of Self-Adjoint Matrices). Let A be an n × n matrix. Then the following statements are equivalent.

(a) A is self-adjoint.

(b) $A = U \Lambda U^*$ where U is unitary and Λ is diagonal with real scalars on its diagonal.

(c) There exist real scalars $\lambda_1, \dots, \lambda_n$ and orthonormal vectors $u_1, \dots, u_n$ such that

$$Ax = \sum_{j=1}^{n} \lambda_j\, (x \cdot u_j)\, u_j, \qquad x \in \mathbb{F}^n.$$

(d) There exists an orthonormal basis $u_1, \dots, u_n$ for $\mathbb{F}^n$ consisting of eigenvectors of A with corresponding real eigenvalues $\lambda_1, \dots, \lambda_n$.
