
Hilbert Spaces

Jean Gallier and Jocelyn Quaintance
Department of Computer and Information Science
University of Pennsylvania
Philadelphia, PA 19104, USA
e-mail: [email protected]

© Jean Gallier
Please do not reproduce without permission of the authors

March 10, 2017

Contents

1 A Glimpse at Hilbert Spaces
1.1 The Projection Lemma, Duality
1.2 Total Orthogonal Families, Fourier Coefficients
1.3 The Hilbert Space l2(K) and the Riesz-Fischer Theorem


Chapter 1

A Glimpse at Hilbert Spaces

1.1 The Projection Lemma, Duality

Given a Hermitian space $\langle E, \varphi\rangle$, we showed in Section ?? that the function $\|\ \| : E \to \mathbb{R}$ defined such that $\|u\| = \sqrt{\varphi(u, u)}$ is a norm on $E$. Thus, $E$ is a normed vector space. If $E$ is also complete, then it is a very interesting space.

Recall that completeness has to do with the convergence of Cauchy sequences. A normed vector space $\langle E, \|\ \|\rangle$ is automatically a metric space under the metric $d$ defined such that $d(u, v) = \|v - u\|$ (see Chapter ?? for the definition of a normed vector space and of a metric space, or Lang [5, 6], or Dixmier [3]). Given a metric space $E$ with metric $d$, a sequence $(a_n)_{n \geq 1}$ of elements $a_n \in E$ is a Cauchy sequence iff for every $\epsilon > 0$, there is some $N \geq 1$ such that

$$d(a_m, a_n) < \epsilon \quad \text{for all } m, n \geq N.$$

We say that $E$ is complete iff every Cauchy sequence converges to a limit (which is unique, since a metric space is Hausdorff).

Every finite dimensional vector space over $\mathbb{R}$ or $\mathbb{C}$ is complete. For example, one can show by induction that given any basis $(e_1, \ldots, e_n)$ of $E$, the linear map $h : \mathbb{C}^n \to E$ defined such that

$$h((z_1, \ldots, z_n)) = z_1 e_1 + \cdots + z_n e_n$$

is a homeomorphism (using the sup-norm on $\mathbb{C}^n$). One can also use the fact that any two norms on a finite dimensional vector space over $\mathbb{R}$ or $\mathbb{C}$ are equivalent (see Chapter ??, or Lang [6], Dixmier [3], Schwartz [9]).

However, if $E$ has infinite dimension, it may not be complete. When a Hermitian space is complete, a number of the properties that hold for finite dimensional Hermitian spaces also hold for infinite dimensional spaces. For example, any closed subspace has an orthogonal complement, and in particular, a finite dimensional subspace has an orthogonal complement. Hermitian spaces that are also complete play an important role in analysis. Since they were first studied by Hilbert, they are called Hilbert spaces.


Definition 1.1. A (complex) Hermitian space $\langle E, \varphi\rangle$ which is a complete normed vector space under the norm $\|\ \|$ induced by $\varphi$ is called a Hilbert space. A real Euclidean space $\langle E, \varphi\rangle$ which is complete under the norm $\|\ \|$ induced by $\varphi$ is called a real Hilbert space.

All the results in this section hold for complex Hilbert spaces as well as for real Hilbert spaces. We state all results for the complex case only, since they also apply to the real case, and since the proofs in the complex case need a little more care.

Example 1.1. The space $l^2$ of all countably infinite sequences $x = (x_i)_{i \in \mathbb{N}}$ of complex numbers such that $\sum_{i=0}^{\infty} |x_i|^2 < \infty$ is a Hilbert space. It will be shown later that the map $\varphi : l^2 \times l^2 \to \mathbb{C}$ defined such that

$$\varphi((x_i)_{i \in \mathbb{N}}, (y_i)_{i \in \mathbb{N}}) = \sum_{i=0}^{\infty} x_i \overline{y_i}$$

is well defined, and that $l^2$ is a Hilbert space under $\varphi$. In fact, we will prove a more general result (Proposition 1.11).

Example 1.2. The set $C^{\infty}[a, b]$ of smooth functions $f : [a, b] \to \mathbb{C}$ is a Hermitian space under the Hermitian form

$$\langle f, g\rangle = \int_a^b f(x)\overline{g(x)}\,dx,$$

but it is not a Hilbert space because it is not complete. It is possible to construct its completion $L^2([a, b])$, which turns out to be the space of (equivalence classes of) Lebesgue square-integrable functions on $[a, b]$.

Remark: Given a Hermitian space $E$ (with Hermitian product $\langle -, -\rangle$), it is possible to construct a Hilbert space $E_h$ (with Hermitian product $\langle -, -\rangle_h$) and a linear map $j : E \to E_h$, such that

$$\langle u, v\rangle = \langle j(u), j(v)\rangle_h$$

for all $u, v \in E$, and $j(E)$ is dense in $E_h$. Furthermore, $E_h$ is unique up to isomorphism. For details, see Bourbaki [2].

One of the most important facts about finite-dimensional Hermitian (and Euclidean) spaces is that they have orthonormal bases. This implies that, up to isomorphism, every finite-dimensional Hermitian space is isomorphic to $\mathbb{C}^n$ (for some $n \in \mathbb{N}$) and that the inner product is given by

$$\langle (x_1, \ldots, x_n), (y_1, \ldots, y_n)\rangle = \sum_{i=1}^{n} x_i \overline{y_i}.$$

Furthermore, every subspace $W$ has an orthogonal complement $W^{\perp}$, and the inner product induces a natural duality between $E$ and $E^*$, where $E^*$ is the space of linear forms on $E$.


When $E$ is a Hilbert space, $E$ may be infinite dimensional, often of uncountable dimension. Thus, we can't expect that $E$ always has an orthonormal basis. However, if we modify the notion of basis so that a "Hilbert basis" is an orthogonal family whose linear span is dense in $E$, i.e., every $v \in E$ is the limit of a sequence of finite combinations of vectors from the Hilbert basis, then we can recover most of the "nice" properties of finite-dimensional Hermitian spaces. For instance, if $(u_k)_{k \in K}$ is a Hilbert basis, for every $v \in E$, we can define the Fourier coefficients $c_k = \langle v, u_k\rangle / \|u_k\|^2$, and then, $v$ is the "sum" of its Fourier series $\sum_{k \in K} c_k u_k$. However, the cardinality of the index set $K$ can be very large, and it is necessary to define what it means for a family of vectors indexed by $K$ to be summable. We will do this in Section 1.2. It turns out that every Hilbert space is isomorphic to a space of the form $l^2(K)$, where $l^2(K)$ is a generalization of the space of Example 1.1 (see Theorem 1.16, usually called the Riesz-Fischer theorem).

Our first goal is to prove that a closed subspace of a Hilbert space has an orthogonal complement. We also show that duality holds if we redefine the dual $E'$ of $E$ to be the space of continuous linear maps on $E$. Our presentation closely follows Bourbaki [2]. We also were inspired by Rudin [7], Lang [5, 6], Schwartz [9, 8], and Dixmier [3]. In fact, we highly recommend Dixmier [3] as a clear and simple text on the basics of topology and analysis. We first prove the so-called projection lemma.

Recall that in a metric space $E$, a subset $X$ of $E$ is closed iff for every convergent sequence $(x_n)$ of points $x_n \in X$, the limit $x = \lim_{n \to \infty} x_n$ also belongs to $X$. The closure $\overline{X}$ of $X$ is the set of all limits of convergent sequences $(x_n)$ of points $x_n \in X$. Obviously, $X \subseteq \overline{X}$. We say that the subset $X$ of $E$ is dense in $E$ iff $E = \overline{X}$, the closure of $X$, which means that every $a \in E$ is the limit of some sequence $(x_n)$ of points $x_n \in X$. Convex sets will again play a crucial role.

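First, we state the following easy identity, the parallelogram law (it is an equality, although it is sometimes loosely called the "parallelogram inequality"), whose proof is left as an exercise.

Proposition 1.1. If $E$ is a Hermitian space, for any two vectors $u, v \in E$, we have

$$\|u + v\|^2 + \|u - v\|^2 = 2(\|u\|^2 + \|v\|^2).$$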
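For readers who want to check the exercise, here is one possible derivation, written as a sketch. It only assumes that $\|w\|^2 = \langle w, w\rangle$ and that $\langle v, u\rangle = \overline{\langle u, v\rangle}$, so that $\langle u, v\rangle + \langle v, u\rangle = 2\Re\langle u, v\rangle$.

```latex
\begin{align*}
\|u + v\|^2 &= \langle u + v, u + v\rangle
             = \|u\|^2 + 2\Re\langle u, v\rangle + \|v\|^2, \\
\|u - v\|^2 &= \langle u - v, u - v\rangle
             = \|u\|^2 - 2\Re\langle u, v\rangle + \|v\|^2, \\
\|u + v\|^2 + \|u - v\|^2 &= 2\bigl(\|u\|^2 + \|v\|^2\bigr).
\end{align*}
```

The cross terms cancel when the two expansions are added, which is all the identity says.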

From the above, we get the following proposition:

Proposition 1.2. If $E$ is a Hermitian space, given any $d, \delta \in \mathbb{R}$ such that $0 \leq \delta < d$, let

$$B = \{u \in E \mid \|u\| < d\} \quad \text{and} \quad C = \{u \in E \mid \|u\| \leq d + \delta\}.$$

For any convex set $A$ such that $A \subseteq C - B$, we have

$$\|v - u\| \leq \sqrt{12 d\delta},$$

for all $u, v \in A$ (see Figure 1.1).


Figure 1.1: Inequality of Proposition 1.2

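Proof. Since $A$ is convex, $\frac{1}{2}(u + v) \in A$ if $u, v \in A$, and thus, $\|\frac{1}{2}(u + v)\| \geq d$. From the parallelogram law (Proposition 1.1) written in the form

$$\Bigl\|\tfrac{1}{2}(u + v)\Bigr\|^2 + \Bigl\|\tfrac{1}{2}(u - v)\Bigr\|^2 = \tfrac{1}{2}\bigl(\|u\|^2 + \|v\|^2\bigr),$$

since $\delta < d$, we get

$$\Bigl\|\tfrac{1}{2}(u - v)\Bigr\|^2 = \tfrac{1}{2}\bigl(\|u\|^2 + \|v\|^2\bigr) - \Bigl\|\tfrac{1}{2}(u + v)\Bigr\|^2 \leq (d + \delta)^2 - d^2 = 2d\delta + \delta^2 \leq 3d\delta,$$

from which

$$\|v - u\| \leq \sqrt{12 d\delta}.$$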

If $X$ is a nonempty subset of a metric space $(E, d)$, for any $a \in E$, recall that we define the distance $d(a, X)$ of $a$ to $X$ as

$$d(a, X) = \inf_{b \in X} d(a, b).$$

Also, the diameter $\delta(X)$ of $X$ is defined by

$$\delta(X) = \sup\{d(a, b) \mid a, b \in X\}.$$

It is possible that $\delta(X) = \infty$. We leave the following two standard facts as an exercise (see Dixmier [3]):

Proposition 1.3. Let $E$ be a metric space.

(1) For every subset $X \subseteq E$, $\delta(X) = \delta(\overline{X})$.

(2) If $E$ is a complete metric space, for every sequence $(F_n)$ of closed nonempty subsets of $E$ such that $F_{n+1} \subseteq F_n$, if $\lim_{n \to \infty} \delta(F_n) = 0$, then $\bigcap_{n=1}^{\infty} F_n$ consists of a single point.


We are now ready to prove the crucial projection lemma.

Proposition 1.4. (Projection lemma) Let E be a Hilbert space.

(1) For any nonempty convex and closed subset $X \subseteq E$, for any $u \in E$, there is a unique vector $p_X(u) \in X$ such that

$$\|u - p_X(u)\| = \inf_{v \in X} \|u - v\| = d(u, X).$$

(2) The vector $p_X(u)$ is the unique vector $w \in E$ satisfying the following property (see Figure 1.2):

$$w \in X \quad \text{and} \quad \Re\langle u - w, z - w\rangle \leq 0 \quad \text{for all } z \in X. \qquad (*)$$

Figure 1.2: Inequality of Proposition 1.4

Proof. (1) Let $d = \inf_{v \in X} \|u - v\| = d(u, X)$. We define a sequence $X_n$ of subsets of $X$ as follows: for every $n \geq 1$,

$$X_n = \Bigl\{v \in X \;\Big|\; \|u - v\| \leq d + \frac{1}{n}\Bigr\}.$$

It is immediately verified that each $X_n$ is nonempty (by definition of $d$), convex, and that $X_{n+1} \subseteq X_n$. Also, by Proposition 1.2, we have

$$\sup\{\|w - v\| \mid v, w \in X_n\} \leq \sqrt{12 d/n},$$

and thus, $\bigcap_{n \geq 1} X_n$ contains at most one point. We will prove that $\bigcap_{n \geq 1} X_n$ contains exactly one point, namely, $p_X(u)$. For this, define a sequence $(w_n)_{n \geq 1}$ by picking some $w_n \in X_n$ for every $n \geq 1$. We claim that $(w_n)_{n \geq 1}$ is a Cauchy sequence. Given any $\epsilon > 0$, if we pick $N$ such that

$$N > \frac{12 d}{\epsilon^2},$$

since $(X_n)_{n \geq 1}$ is a monotonic decreasing sequence, for all $m, n \geq N$, we have

$$\|w_m - w_n\| \leq \sqrt{12 d/N} < \epsilon,$$

as desired. Since $E$ is complete, the sequence $(w_n)_{n \geq 1}$ has a limit $w$, and since $w_n \in X$ and $X$ is closed, we must have $w \in X$. Also observe that

$$\|u - w\| \leq \|u - w_n\| + \|w_n - w\|,$$

and since $w$ is the limit of $(w_n)_{n \geq 1}$ and

$$\|u - w_n\| \leq d + \frac{1}{n},$$

given any $\epsilon > 0$, there is some $n$ large enough so that

$$\frac{1}{n} < \frac{\epsilon}{2} \quad \text{and} \quad \|w_n - w\| \leq \frac{\epsilon}{2},$$

and thus

$$\|u - w\| \leq d + \epsilon.$$

Since the above holds for every $\epsilon > 0$, and since $\|u - w\| \geq d$ because $w \in X$, we have $\|u - w\| = d$. Thus, $w \in X_n$ for all $n \geq 1$, which proves that $\bigcap_{n \geq 1} X_n = \{w\}$. Now, any $z \in X$ such that $\|u - z\| = d(u, X) = d$ also belongs to every $X_n$, and thus $z = w$, proving the uniqueness of $w$, which we denote as $p_X(u)$.

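(2) Let $w \in X$. Since $X$ is convex, $z = (1 - \lambda)p_X(u) + \lambda w \in X$ for every $\lambda$, $0 \leq \lambda \leq 1$. Then, we have

$$\|u - z\| \geq \|u - p_X(u)\|$$

for all $\lambda$, $0 \leq \lambda \leq 1$, and since

$$\|u - z\|^2 = \|u - p_X(u) - \lambda(w - p_X(u))\|^2 = \|u - p_X(u)\|^2 + \lambda^2\|w - p_X(u)\|^2 - 2\lambda\,\Re\langle u - p_X(u), w - p_X(u)\rangle,$$

for all $\lambda$, $0 < \lambda \leq 1$, we get

$$\Re\langle u - p_X(u), w - p_X(u)\rangle = \frac{1}{2\lambda}\bigl(\|u - p_X(u)\|^2 - \|u - z\|^2\bigr) + \frac{\lambda}{2}\|w - p_X(u)\|^2.$$

Since $\|u - z\| \geq \|u - p_X(u)\|$, the first term on the right-hand side is nonpositive, so that

$$\Re\langle u - p_X(u), w - p_X(u)\rangle \leq \frac{\lambda}{2}\|w - p_X(u)\|^2$$

for every $\lambda$, $0 < \lambda \leq 1$. Letting $\lambda$ tend to $0$, we have

$$\Re\langle u - p_X(u), w - p_X(u)\rangle \leq 0.$$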


Conversely, assume that $w \in X$ satisfies the condition

$$\Re\langle u - w, z - w\rangle \leq 0$$

for all $z \in X$. For all $z \in X$, we have

$$\|u - z\|^2 = \|u - w\|^2 + \|z - w\|^2 - 2\Re\langle u - w, z - w\rangle \geq \|u - w\|^2,$$

which implies that $\|u - w\| = d(u, X) = d$, and from (1), that $w = p_X(u)$.

The vector $p_X(u)$ is called the projection of $u$ onto $X$, and the map $p_X : E \to X$ is called the projection of $E$ onto $X$. In the case of a real Hilbert space, there is an intuitive geometric interpretation of the condition

$$\langle u - p_X(u), z - p_X(u)\rangle \leq 0$$

for all $z \in X$. If we restate the condition as

$$\langle u - p_X(u), p_X(u) - z\rangle \geq 0$$

for all $z \in X$, this says that the absolute value of the measure of the angle between the vectors $u - p_X(u)$ and $p_X(u) - z$ is at most $\pi/2$. This makes sense, since $X$ is convex, and points in $X$ must be on the side opposite to the "tangent space" to $X$ at $p_X(u)$, which is orthogonal to $u - p_X(u)$. Of course, this is only an intuitive description, since the notion of tangent space has not been defined!
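To make condition (∗) concrete, the following sketch works in the real Hilbert space $\mathbb{R}^3$ and takes $X$ to be the closed unit ball, for which the projection has a simple closed form. The helper names (`project_onto_ball`, `random_point_in_ball`) and the sample data are illustrative assumptions, not part of the text, and the check is done on random samples only, so it illustrates the inequality rather than proving it.

```python
import math
import random

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

def project_onto_ball(u, radius=1.0):
    """Nearest point to u in the closed ball {x : ||x|| <= radius}."""
    n = norm(u)
    if n <= radius:
        return list(u)
    return [radius * a / n for a in u]

def random_point_in_ball(dim, radius, rng):
    """Draw some point of the ball by rescaling a random vector (not uniform)."""
    x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    scale = radius * rng.random() / max(norm(x), 1e-12)
    return [scale * a for a in x]

rng = random.Random(0)
u = [3.0, 4.0, 0.0]
w = project_onto_ball(u)  # u / ||u||, since ||u|| = 5 > 1

# Condition (*): <u - w, z - w> <= 0 for every z in X, checked on samples.
u_minus_w = [a - b for a, b in zip(u, w)]
worst = max(
    dot(u_minus_w, [c - b for c, b in zip(z, w)])
    for z in (random_point_in_ball(3, 1.0, rng) for _ in range(1000))
)
```

Here `worst` is the largest sampled value of $\langle u - w, z - w\rangle$; it should be nonpositive up to rounding.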

The map pX : E → X is continuous, as shown below.

Proposition 1.5. Let $E$ be a Hilbert space. For any nonempty convex and closed subset $X \subseteq E$, the map $p_X : E \to X$ is continuous.

Proof. For any two vectors $u, v \in E$, let $x = p_X(u) - u$, $y = p_X(v) - p_X(u)$, and $z = v - p_X(v)$. Clearly,

$$v - u = x + y + z,$$

and from Proposition 1.4 (2), we also have

$$\Re\langle x, y\rangle \geq 0 \quad \text{and} \quad \Re\langle z, y\rangle \geq 0,$$

from which we get

$$\|v - u\|^2 = \|x + y + z\|^2 = \|x + z + y\|^2 = \|x + z\|^2 + \|y\|^2 + 2\Re\langle x, y\rangle + 2\Re\langle z, y\rangle \geq \|y\|^2 = \|p_X(v) - p_X(u)\|^2.$$

The inequality $\|p_X(v) - p_X(u)\| \leq \|v - u\|$ implies that $p_X$ is continuous (it is even Lipschitz with constant $1$).
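The inequality $\|p_X(v) - p_X(u)\| \leq \|v - u\|$ can be observed numerically. The sketch below assumes the real space $\mathbb{R}^4$ and takes $X$ to be the box $[0, 1]^4$, whose projection is coordinatewise clamping; the function names and the random test points are hypothetical choices made for illustration.

```python
import math
import random

def project_onto_box(u, lo=0.0, hi=1.0):
    """Nearest point to u in the box [lo, hi]^n (clamp each coordinate)."""
    return [min(max(a, lo), hi) for a in u]

def dist(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

rng = random.Random(1)
max_ratio = 0.0
for _ in range(1000):
    u = [rng.uniform(-3.0, 3.0) for _ in range(4)]
    v = [rng.uniform(-3.0, 3.0) for _ in range(4)]
    d_uv = dist(u, v)
    if d_uv > 0.0:
        # Ratio ||p(u) - p(v)|| / ||u - v||, which should never exceed 1.
        ratio = dist(project_onto_box(u), project_onto_box(v)) / d_uv
        max_ratio = max(max_ratio, ratio)
```

A box is used here only because its projection is easy to write down; any nonempty closed convex set would do.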


We can now prove the following important proposition.

Proposition 1.6. Let E be a Hilbert space.

(1) For any closed subspace $V \subseteq E$, we have $E = V \oplus V^{\perp}$, and the map $p_V : E \to V$ is linear and continuous.

(2) For any $u \in E$, the projection $p_V(u)$ is the unique vector $w \in E$ such that

$$w \in V \quad \text{and} \quad \langle u - w, z\rangle = 0 \quad \text{for all } z \in V.$$

Proof. (1) First, we prove that $u - p_V(u) \in V^{\perp}$ for all $u \in E$. For any $v \in V$, since $V$ is a subspace, $z = p_V(u) + \lambda v \in V$ for all $\lambda \in \mathbb{C}$, and since $V$ is convex and nonempty (since it is a subspace), and closed by hypothesis, by Proposition 1.4 (2), we have

$$\Re\bigl(\overline{\lambda}\langle u - p_V(u), v\rangle\bigr) = \Re\langle u - p_V(u), \lambda v\rangle = \Re\langle u - p_V(u), z - p_V(u)\rangle \leq 0$$

for all $\lambda \in \mathbb{C}$. In particular, the above holds for $\lambda = \langle u - p_V(u), v\rangle$, which yields

$$|\langle u - p_V(u), v\rangle|^2 \leq 0,$$

and thus, $\langle u - p_V(u), v\rangle = 0$. As a consequence, $u - p_V(u) \in V^{\perp}$ for all $u \in E$. Since $u = p_V(u) + u - p_V(u)$ for every $u \in E$, we have $E = V + V^{\perp}$. On the other hand, since $\langle -, -\rangle$ is positive definite, $V \cap V^{\perp} = \{0\}$, and thus $E = V \oplus V^{\perp}$.

We already proved in Proposition 1.5 that $p_V : E \to V$ is continuous. Also, since

$$p_V(\lambda u + \mu v) - (\lambda p_V(u) + \mu p_V(v)) = p_V(\lambda u + \mu v) - (\lambda u + \mu v) + \lambda(u - p_V(u)) + \mu(v - p_V(v)),$$

for all $u, v \in E$, and since the left-hand side term belongs to $V$, and from what we just showed, the right-hand side term belongs to $V^{\perp}$, and $V \cap V^{\perp} = \{0\}$, we have

$$p_V(\lambda u + \mu v) - (\lambda p_V(u) + \mu p_V(v)) = 0,$$

showing that $p_V$ is linear.

(2) This is basically obvious from (1). We proved in (1) that $u - p_V(u) \in V^{\perp}$, which is exactly the condition

$$\langle u - p_V(u), z\rangle = 0$$

for all $z \in V$. Conversely, if $w \in V$ satisfies the condition

$$\langle u - w, z\rangle = 0$$

for all $z \in V$, since $w \in V$, every vector $z \in V$ is of the form $y - w$, with $y = z + w \in V$, and thus, we have

$$\langle u - w, y - w\rangle = 0$$

for all $y \in V$, which implies the condition of Proposition 1.4 (2):

$$\Re\langle u - w, y - w\rangle \leq 0$$

for all $y \in V$. By Proposition 1.4, $w = p_V(u)$ is the projection of $u$ onto $V$.


Let us illustrate the power of Proposition 1.6 on the following "least squares" problem. Given a real $m \times n$ matrix $A$ and some vector $b \in \mathbb{R}^m$, we would like to solve the linear system

$$Ax = b$$

in the least-squares sense, which means that we would like to find some solution $x \in \mathbb{R}^n$ that minimizes the Euclidean norm $\|Ax - b\|$ of the error $Ax - b$. It is actually not clear that the problem has a solution, but it does! The problem can be restated as follows: Is there some $x \in \mathbb{R}^n$ such that

$$\|Ax - b\| = \inf_{y \in \mathbb{R}^n} \|Ay - b\|,$$

or equivalently, is there some $z \in \mathrm{Im}(A)$ such that

$$\|z - b\| = d(b, \mathrm{Im}(A)),$$

where $\mathrm{Im}(A) = \{Ay \in \mathbb{R}^m \mid y \in \mathbb{R}^n\}$, the image of the linear map induced by $A$. Since $\mathrm{Im}(A)$ is a closed subspace of $\mathbb{R}^m$, because we are in finite dimension, Proposition 1.6 tells us that there is a unique $z \in \mathrm{Im}(A)$ such that

$$\|z - b\| = \inf_{y \in \mathbb{R}^n} \|Ay - b\|,$$

and thus, the problem always has a solution since $z \in \mathrm{Im}(A)$, and since there is at least some $x \in \mathbb{R}^n$ such that $Ax = z$ (by definition of $\mathrm{Im}(A)$). Note that such an $x$ is not necessarily unique. Furthermore, Proposition 1.6 also tells us that $z \in \mathrm{Im}(A)$ is the solution of the equation

$$\langle z - b, w\rangle = 0 \quad \text{for all } w \in \mathrm{Im}(A),$$

or equivalently, that $x \in \mathbb{R}^n$ is the solution of

$$\langle Ax - b, Ay\rangle = 0 \quad \text{for all } y \in \mathbb{R}^n,$$

which is equivalent to

$$\langle A^{\top}(Ax - b), y\rangle = 0 \quad \text{for all } y \in \mathbb{R}^n,$$

and thus, since the inner product is positive definite, to $A^{\top}(Ax - b) = 0$, i.e.,

$$A^{\top}Ax = A^{\top}b.$$

Therefore, the solutions of the original least-squares problem are precisely the solutions of the so-called normal equations

$$A^{\top}Ax = A^{\top}b,$$

discovered by Gauss and Legendre around 1800. We also proved that the normal equations always have a solution.


Computationally, it is best not to solve the normal equations directly, and instead, to use methods such as the QR-decomposition (applied to $A$) or the SVD (in the form of the pseudo-inverse). We will come back to this point later on.
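For illustration only, and on a tiny well-conditioned system where the caveat above does not bite, the following sketch solves the normal equations $A^{\top}Ax = A^{\top}b$ directly and checks that the residual $Ax - b$ is orthogonal to the columns of $A$. The data (fitting a line through three points) and the helper names are assumptions made for the example; a production solver would rather rely on QR or the SVD.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(M, N):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*N)] for row in M]

def matvec(M, x):
    return [sum(a * b for a, b in zip(row, x)) for row in M]

def solve_2x2(G, c):
    """Solve G x = c for an invertible 2x2 matrix G by Cramer's rule."""
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    return [
        (c[0] * G[1][1] - G[0][1] * c[1]) / det,
        (G[0][0] * c[1] - c[0] * G[1][0]) / det,
    ]

# Fit a line t -> x0 + x1 * t through the points (0, 1), (1, 2), (2, 2).
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
b = [1.0, 2.0, 2.0]

At = transpose(A)
x = solve_2x2(matmul(At, A), matvec(At, b))  # solves A^T A x = A^T b
residual = [r - s for r, s in zip(matvec(A, x), b)]
orthogonality = matvec(At, residual)  # A^T (A x - b), numerically zero
```

For this data, $A^{\top}A = \begin{pmatrix} 3 & 3 \\ 3 & 5 \end{pmatrix}$ and $A^{\top}b = (5, 6)$, so the solution is $x = (7/6, 1/2)$.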

As another corollary of Proposition 1.6, for any continuous nonnull linear map $h : E \to \mathbb{C}$, the null space

$$H = \mathrm{Ker}\, h = \{u \in E \mid h(u) = 0\} = h^{-1}(0)$$

is a closed hyperplane $H$, and thus, $H^{\perp}$ is a subspace of dimension one such that $E = H \oplus H^{\perp}$. This suggests defining the dual space of $E$ as the set of all continuous linear maps $h : E \to \mathbb{C}$.

Remark: If $h : E \to \mathbb{C}$ is a linear map which is not continuous, then it can be shown that the hyperplane $H = \mathrm{Ker}\, h$ is dense in $E$! Thus, $H^{\perp}$ is reduced to the trivial subspace $\{0\}$. This goes against our intuition of what a hyperplane in $\mathbb{R}^n$ (or $\mathbb{C}^n$) is, and warns us not to trust our "physical" intuition too much when dealing with infinite dimensions. As a consequence, the map $\flat : E \to E^*$ introduced in Section ?? (see just after Definition 1.2 below) is not surjective, since the linear forms of the form $u \mapsto \langle u, v\rangle$ (for some fixed vector $v \in E$) are continuous (the inner product is continuous).

We now show that by redefining the dual space of a Hilbert space as the set of continuous linear forms on $E$, we recover Theorem ??.

Definition 1.2. Given a Hilbert space $E$, we define the dual space $E'$ of $E$ as the vector space of all continuous linear forms $h : E \to \mathbb{C}$. Maps in $E'$ are also called bounded linear operators, bounded linear functionals, or simply, operators or functionals.

As in Section ??, for all $u, v \in E$, we define the maps $\varphi^l_u : E \to \mathbb{C}$ and $\varphi^r_v : E \to \mathbb{C}$ such that

$$\varphi^l_u(v) = \overline{\langle u, v\rangle},$$

and

$$\varphi^r_v(u) = \langle u, v\rangle.$$

In fact, $\varphi^l_u = \varphi^r_u$, and because the inner product $\langle -, -\rangle$ is continuous, it is obvious that $\varphi^r_v$ is continuous and linear, so that $\varphi^r_v \in E'$. To simplify notation, we write $\varphi_v$ instead of $\varphi^r_v$.

Theorem ?? is generalized to Hilbert spaces as follows.

Proposition 1.7. (Riesz representation theorem) Let $E$ be a Hilbert space. Then, the map $\flat : E \to E'$ defined such that

$$\flat(v) = \varphi_v,$$

is semilinear, continuous, and bijective.

Proof. The proof is basically identical to the proof of Theorem ??, except that a different argument is required for the surjectivity of $\flat : E \to E'$, since $E$ may not be finite dimensional. For any nonnull linear operator $h \in E'$, the hyperplane $H = \mathrm{Ker}\, h = h^{-1}(0)$ is a closed subspace of $E$, and by Proposition 1.6, $H^{\perp}$ is a subspace of dimension one such that $E = H \oplus H^{\perp}$. Then, picking any nonnull vector $w \in H^{\perp}$, observe that $H$ is also the kernel of the linear operator $\varphi_w$, with

$$\varphi_w(u) = \langle u, w\rangle,$$

and thus, since any two nonzero linear forms defining the same hyperplane must be proportional, there is some nonzero scalar $\lambda \in \mathbb{C}$ such that $h = \lambda\varphi_w$. But then, $h = \varphi_{\overline{\lambda} w}$, proving that $\flat : E \to E'$ is surjective.

Proposition 1.7 is known as the Riesz representation theorem, or "Little Riesz Theorem." It shows that the inner product on a Hilbert space induces a natural semilinear isomorphism between $E$ and its dual $E'$.
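In finite dimension the representing vector can be written down explicitly. The sketch below assumes $E = \mathbb{C}^3$ with the inner product $\langle u, w\rangle = \sum_i u_i \overline{w_i}$ (semilinear on the right, as in the text) and a functional given by hypothetical coefficients $a_i$, so that $h(u) = \sum_i a_i u_i$; the vector representing $h$ is then $w = (\overline{a_i})$. The function names are illustrative.

```python
def inner(u, v):
    """Hermitian inner product on C^n, linear in u and semilinear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def riesz_vector(coeffs):
    """Vector w with h(u) = <u, w> for h(u) = sum_i coeffs[i] * u[i]."""
    return [a.conjugate() for a in coeffs]

def apply_functional(coeffs, u):
    return sum(a * x for a, x in zip(coeffs, u))

coeffs = [1 + 2j, -3j, 0.5 + 0j]  # a hypothetical functional h
u = [2 - 1j, 1 + 1j, 4 + 0j]
w = riesz_vector(coeffs)

# Semilinearity: the functional lam * h is represented by conj(lam) * w.
lam = 2 - 3j
scaled_w = riesz_vector([lam * a for a in coeffs])
```

The last two lines reflect the step $h = \lambda\varphi_w = \varphi_{\overline{\lambda} w}$ used in the surjectivity argument.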

Remarks:

(1) Actually, the map $\flat : E \to E'$ turns out to be an isometry. To show this, we need to recall the notion of norm of a linear map, which we do not want to do right now.

(2) Many books on quantum mechanics use the so-called Dirac notation to denote objects in the Hilbert space $E$ and operators in its dual space $E'$. In the Dirac notation, an element of $E$ is denoted as $|x\rangle$, and an element of $E'$ is denoted as $\langle t|$. The scalar product is denoted as $\langle t| \cdot |x\rangle$. This uses the isomorphism between $E$ and $E'$, except that the inner product is assumed to be semi-linear on the left, rather than on the right.

Proposition 1.7 allows us to define the adjoint of a continuous linear map, as in the Hermitian case (see Proposition ??). The proof is unchanged.

Proposition 1.8. Given a Hilbert space $E$, for every continuous linear map $f : E \to E$, there is a unique linear map $f^* : E \to E$, such that

$$\langle f^*(u), v\rangle = \langle u, f(v)\rangle$$

for all $u, v \in E$. The map $f^*$ is called the adjoint of $f$.

It is easy to show that $f^*$ is also continuous. As in the Hermitian case, given two Hilbert spaces $E$ and $F$, with inner products $\langle -, -\rangle_1$ and $\langle -, -\rangle_2$ respectively, for any continuous linear map $f : E \to F$, there is a unique linear map $f^* : F \to E$ such that

$$\langle f(u), v\rangle_2 = \langle u, f^*(v)\rangle_1$$

for all $u \in E$ and all $v \in F$. The linear map $f^*$ is also called the adjoint of $f$.
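In the finite-dimensional case $E = \mathbb{C}^n$, the adjoint of the map with matrix $F$ in the canonical basis is given by the conjugate transpose of $F$. The sketch below checks the defining identity $\langle f^*(u), v\rangle = \langle u, f(v)\rangle$ on one arbitrary example; the matrix, the vectors, and the helper names are assumptions chosen for illustration.

```python
def inner(u, v):
    """Hermitian inner product on C^n, linear in u and semilinear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def matvec(M, x):
    return [sum(a * b for a, b in zip(row, x)) for row in M]

def adjoint(M):
    """Conjugate transpose of a matrix given as a list of rows."""
    return [[entry.conjugate() for entry in col] for col in zip(*M)]

F = [[1 + 1j, 2 + 0j], [0 - 1j, 3 + 2j]]  # matrix of f in the canonical basis
u = [1 - 2j, 0.5 + 1j]
v = [2 + 0j, -1 + 3j]

lhs = inner(matvec(adjoint(F), u), v)  # <f*(u), v>
rhs = inner(u, matvec(F, v))           # <u, f(v)>
```

Both sides agree up to rounding, and applying `adjoint` twice returns the original matrix.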


1.2 Total Orthogonal Families (Hilbert Bases), Fourier Coefficients

We conclude our quick tour of Hilbert spaces by showing that the notion of orthogonal basis can be generalized to Hilbert spaces. However, the useful notion is not the usual notion of a basis, but a notion which is an abstraction of the concept of Fourier series. Every element of a Hilbert space is the "sum" of its Fourier series.

Definition 1.3. Given a Hilbert space $E$, a family $(u_k)_{k \in K}$ of nonnull vectors is an orthogonal family iff the $u_k$ are pairwise orthogonal, i.e., $\langle u_i, u_j\rangle = 0$ for all $i \neq j$ ($i, j \in K$), and an orthonormal family iff $\langle u_i, u_j\rangle = \delta_{i,j}$, for all $i, j \in K$. A total orthogonal family (or system) or Hilbert basis is an orthogonal family whose linear span is dense in $E$. This means that for every $v \in E$, for every $\epsilon > 0$, there is some finite subset $I \subseteq K$ and some family $(\lambda_i)_{i \in I}$ of complex numbers, such that

$$\Bigl\|v - \sum_{i \in I} \lambda_i u_i\Bigr\| < \epsilon.$$

Given an orthogonal family $(u_k)_{k \in K}$, for every $v \in E$, for every $k \in K$, the scalar $c_k = \langle v, u_k\rangle / \|u_k\|^2$ is called the $k$-th Fourier coefficient of $v$ over $(u_k)_{k \in K}$.
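As a finite-dimensional illustration of this definition, the sketch below computes the Fourier coefficients $c_k = \langle v, u_k\rangle / \|u_k\|^2$ of a vector of $\mathbb{C}^3$ over an orthogonal family that is deliberately not normalized, and rebuilds $v$ from them. The family, the vector, and the function name are hypothetical choices; in $\mathbb{C}^3$ a family of three nonnull orthogonal vectors is automatically total.

```python
def inner(u, v):
    """Hermitian inner product on C^n, linear in u and semilinear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def fourier_coefficients(v, family):
    """c_k = <v, u_k> / ||u_k||^2 for each u_k of an orthogonal family."""
    return [inner(v, u) / inner(u, u) for u in family]

# An orthogonal (not orthonormal) basis of C^3, chosen for illustration.
family = [
    [1 + 0j, 1 + 0j, 0j],
    [1 + 0j, -1 + 0j, 0j],
    [0j, 0j, 2 + 0j],
]
v = [3 + 1j, 1 - 1j, 4 + 0j]

c = fourier_coefficients(v, family)
# Sum of c_k * u_k, computed coordinate by coordinate.
reconstructed = [
    sum(ck * uk[i] for ck, uk in zip(c, family)) for i in range(3)
]
```

For this data the coefficients are $c = (2,\ 1 + i,\ 2)$, and the sum $\sum_k c_k u_k$ gives back $v$.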

Remark: The terminology Hilbert basis is misleading, because a Hilbert basis $(u_k)_{k \in K}$ is not necessarily a basis in the algebraic sense. Indeed, in general, $(u_k)_{k \in K}$ does not span $E$. Intuitively, it takes linear combinations of the $u_k$'s with infinitely many nonnull coefficients to span $E$. Technically, this is achieved in terms of limits. In order to avoid the confusion between bases in the algebraic sense and Hilbert bases, some authors refer to algebraic bases as Hamel bases and to total orthogonal families (or Hilbert bases) as Schauder bases.

Given an orthogonal family $(u_k)_{k \in K}$, for any finite subset $I$ of $K$, we often call sums of the form $\sum_{i \in I} \lambda_i u_i$ partial sums of Fourier series, and if these partial sums converge to a limit denoted as $\sum_{k \in K} c_k u_k$, we call $\sum_{k \in K} c_k u_k$ a Fourier series.

However, we have to make sense of such sums! Indeed, when $K$ is unordered or uncountable, the notion of limit or sum has not been defined. This can be done as follows (for more details, see Dixmier [3]):

Definition 1.4. Given a normed vector space $E$ (say, a Hilbert space), for any nonempty index set $K$, we say that a family $(u_k)_{k \in K}$ of vectors in $E$ is summable with sum $v \in E$ iff for every $\epsilon > 0$, there is some finite subset $I$ of $K$, such that

$$\Bigl\|v - \sum_{j \in J} u_j\Bigr\| < \epsilon$$

for every finite subset $J$ with $I \subseteq J \subseteq K$. We say that the family $(u_k)_{k \in K}$ is summable iff there is some $v \in E$ such that $(u_k)_{k \in K}$ is summable with sum $v$. A family $(u_k)_{k \in K}$ is a Cauchy family iff for every $\epsilon > 0$, there is a finite subset $I$ of $K$, such that

$$\Bigl\|\sum_{j \in J} u_j\Bigr\| < \epsilon$$

for every finite subset $J$ of $K$ with $I \cap J = \emptyset$.

If $(u_k)_{k \in K}$ is summable with sum $v$, we usually denote $v$ as $\sum_{k \in K} u_k$. The following technical proposition will be needed:

Proposition 1.9. Let $E$ be a complete normed vector space (say, a Hilbert space).

(1) For any nonempty index set $K$, a family $(u_k)_{k \in K}$ is summable iff it is a Cauchy family.

(2) Given a family $(r_k)_{k \in K}$ of nonnegative reals $r_k \geq 0$, if there is some real number $B > 0$ such that $\sum_{i \in I} r_i < B$ for every finite subset $I$ of $K$, then $(r_k)_{k \in K}$ is summable and $\sum_{k \in K} r_k = r$, where $r$ is the least upper bound of the set of finite sums $\sum_{i \in I} r_i$ ($I \subseteq K$).

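Proof. (1) If $(u_k)_{k \in K}$ is summable, for every finite subset $I$ of $K$, let

$$u_I = \sum_{i \in I} u_i \quad \text{and} \quad u = \sum_{k \in K} u_k.$$

For every $\epsilon > 0$, there is some finite subset $I$ of $K$ such that

$$\|u - u_L\| < \epsilon/2$$

for all finite subsets $L$ such that $I \subseteq L \subseteq K$. For every finite subset $J$ of $K$ such that $I \cap J = \emptyset$, since $I \subseteq I \cup J \subseteq K$ and $I \cup J$ is finite, we have

$$\|u - u_{I \cup J}\| < \epsilon/2 \quad \text{and} \quad \|u - u_I\| < \epsilon/2,$$

and since

$$\|u_{I \cup J} - u_I\| \leq \|u_{I \cup J} - u\| + \|u - u_I\|$$

and $u_{I \cup J} - u_I = u_J$ since $I \cap J = \emptyset$, we get

$$\|u_J\| = \|u_{I \cup J} - u_I\| < \epsilon,$$

which is the condition for $(u_k)_{k \in K}$ to be a Cauchy family.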

Conversely, assume that $(u_k)_{k \in K}$ is a Cauchy family. We define inductively a decreasing sequence $(X_n)$ of subsets of $E$, each of diameter at most $1/n$, as follows: For $n = 1$, since $(u_k)_{k \in K}$ is a Cauchy family, there is some finite subset $J_1$ of $K$ such that

$$\|u_J\| < 1/2$$

for every finite subset $J$ of $K$ with $J_1 \cap J = \emptyset$. We pick some finite subset $J_1$ with the above property, and we let $I_1 = J_1$ and

$$X_1 = \{u_I \mid I_1 \subseteq I \subseteq K,\ I \text{ finite}\}.$$

For $n \geq 1$, there is some finite subset $J_{n+1}$ of $K$ such that

$$\|u_J\| < 1/(2n + 2)$$

for every finite subset $J$ of $K$ with $J_{n+1} \cap J = \emptyset$. We pick some finite subset $J_{n+1}$ with the above property, and we let $I_{n+1} = I_n \cup J_{n+1}$ and

$$X_{n+1} = \{u_I \mid I_{n+1} \subseteq I \subseteq K,\ I \text{ finite}\}.$$

Since $I_n \subseteq I_{n+1}$, it is obvious that $X_{n+1} \subseteq X_n$ for all $n \geq 1$. We need to prove that each $X_n$ has diameter at most $1/n$. Since $J_n$ was chosen such that

$$\|u_J\| < 1/(2n)$$

for every finite subset $J$ of $K$ with $J_n \cap J = \emptyset$, and since $J_n \subseteq I_n$, it is also true that

$$\|u_J\| < 1/(2n)$$

for every finite subset $J$ of $K$ with $I_n \cap J = \emptyset$ (since $I_n \cap J = \emptyset$ and $J_n \subseteq I_n$ implies that $J_n \cap J = \emptyset$). Then, for every two finite subsets $J, L$ such that $I_n \subseteq J, L \subseteq K$, we have

$$\|u_{J - I_n}\| < 1/(2n) \quad \text{and} \quad \|u_{L - I_n}\| < 1/(2n),$$

and since

$$\|u_J - u_L\| \leq \|u_J - u_{I_n}\| + \|u_{I_n} - u_L\| = \|u_{J - I_n}\| + \|u_{L - I_n}\|,$$

we get

$$\|u_J - u_L\| < 1/n,$$

which proves that $\delta(X_n) \leq 1/n$. Now, if we consider the sequence of closed sets $(\overline{X_n})$, we still have $\overline{X_{n+1}} \subseteq \overline{X_n}$, and by Proposition 1.3, $\delta(\overline{X_n}) = \delta(X_n) \leq 1/n$, which means that $\lim_{n \to \infty} \delta(\overline{X_n}) = 0$, and by Proposition 1.3, $\bigcap_{n=1}^{\infty} \overline{X_n}$ consists of a single element $u$. We claim that $u$ is the sum of the family $(u_k)_{k \in K}$.

For every $\epsilon > 0$, there is some $n \geq 1$ such that $n > 2/\epsilon$, and since $u \in \overline{X_m}$ for all $m \geq 1$, there is some finite subset $J_0$ of $K$ such that $I_n \subseteq J_0$ and

$$\|u - u_{J_0}\| < \epsilon/2,$$

where $I_n$ is the finite subset of $K$ involved in the definition of $X_n$. However, since $\delta(X_n) \leq 1/n$, for every finite subset $J$ of $K$ such that $I_n \subseteq J$, we have

$$\|u_J - u_{J_0}\| \leq 1/n < \epsilon/2,$$

and since

$$\|u - u_J\| \leq \|u - u_{J_0}\| + \|u_{J_0} - u_J\|,$$

we get

$$\|u - u_J\| < \epsilon$$

for every finite subset $J$ of $K$ with $I_n \subseteq J$, which proves that $u$ is the sum of the family $(u_k)_{k \in K}$.

(2) Since every finite sum $\sum_{i \in I} r_i$ is bounded by the uniform bound $B$, the set of these finite sums has a least upper bound $r \leq B$. For every $\epsilon > 0$, since $r$ is the least upper bound of the finite sums $\sum_{i \in I} r_i$ (where $I$ is finite, $I \subseteq K$), there is some finite $I \subseteq K$ such that

$$\Bigl|r - \sum_{i \in I} r_i\Bigr| < \epsilon,$$

and since $r_k \geq 0$ for all $k \in K$, we have

$$\sum_{i \in I} r_i \leq \sum_{j \in J} r_j$$

whenever $I \subseteq J$, which shows that

$$\Bigl|r - \sum_{j \in J} r_j\Bigr| \leq \Bigl|r - \sum_{i \in I} r_i\Bigr| < \epsilon$$

for every finite subset $J$ such that $I \subseteq J \subseteq K$, proving that $(r_k)_{k \in K}$ is summable with sum $\sum_{k \in K} r_k = r$.

Remark: The notion of summability implies that the sum of a family $(u_k)_{k \in K}$ is independent of any order on $K$. In this sense, it is a kind of "commutative summability". More precisely, it is easy to show that for every bijection $\varphi : K \to K$ (intuitively, a reordering of $K$), the family $(u_k)_{k \in K}$ is summable iff the family $(u_l)_{l \in \varphi(K)}$ is summable, and if so, they have the same sum.
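A small numerical sketch may help to see what summability over an unordered index set means. It assumes the family $r_k = 2^{-|k|}$ indexed by $K = \mathbb{Z}$ (a choice made here for illustration), whose finite sums are bounded by $3$ and have least upper bound $3$, and it evaluates finite sums over different finite subsets and in different orders.

```python
def r(k):
    """Nonnegative family indexed by the integers: r_k = 2^(-|k|)."""
    return 2.0 ** (-abs(k))

def finite_sum(indices):
    return sum(r(k) for k in indices)

# Two different finite subsets of Z, each containing {-60, ..., 60}.
symmetric = finite_sum(range(-60, 61))
lopsided = finite_sum(list(range(-200, 0)) + list(range(0, 61)))

# The same finite subset listed in a scrambled order gives the same value.
scrambled = finite_sum(sorted(range(-60, 61), key=lambda k: (k * 37) % 121))
```

All three values agree with the sum $3$ up to rounding, in line with Proposition 1.9 (2): once a finite subset $I$ is large enough, every finite $J \supseteq I$ gives a sum within $\epsilon$ of $r$, whatever the order of enumeration.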

The following proposition gives some of the main properties of Fourier coefficients. Among other things, at most countably many of the Fourier coefficients may be nonnull, and the partial sums of a Fourier series converge. Given an orthogonal family $(u_k)_{k \in K}$, we let $U_k = \mathbb{C}u_k$, and $p_{U_k} : E \to U_k$ is the projection of $E$ onto $U_k$.

Proposition 1.10. Let $E$ be a Hilbert space, $(u_k)_{k \in K}$ an orthogonal family in $E$, and $V$ the closure of the subspace generated by $(u_k)_{k \in K}$. The following properties hold, where the statements involving $|c_k|^2$ are written for unit vectors $u_k$ (for a general orthogonal family, replace $|c_k|^2$ by $|c_k|^2\|u_k\|^2$):

(1) For every $v \in E$, for every finite subset $I \subseteq K$, we have

$$\sum_{i \in I} |c_i|^2 \leq \|v\|^2,$$

where the $c_k$ are the Fourier coefficients of $v$.

(2) For every vector $v \in E$, if $(c_k)_{k \in K}$ are the Fourier coefficients of $v$, the following conditions are equivalent:

(2a) $v \in V$.

(2b) The family $(c_k u_k)_{k \in K}$ is summable and $v = \sum_{k \in K} c_k u_k$.

(2c) The family $(|c_k|^2)_{k \in K}$ is summable and $\|v\|^2 = \sum_{k \in K} |c_k|^2$.

(3) The family $(|c_k|^2)_{k \in K}$ is summable, and we have the Bessel inequality:

$$\sum_{k \in K} |c_k|^2 \leq \|v\|^2.$$

As a consequence, at most countably many of the $c_k$ may be nonzero. The family $(c_k u_k)_{k \in K}$ forms a Cauchy family, and thus, the Fourier series $\sum_{k \in K} c_k u_k$ converges in $E$ to some vector $u = \sum_{k \in K} c_k u_k$. Furthermore, $u = p_V(v)$.
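The Bessel inequality and the identity $u = p_V(v)$ can be observed on a small example. The sketch below assumes the real space $\mathbb{R}^3$ and an orthonormal family of two vectors (so the family is not total and $V$ is a plane); the vectors are illustrative choices.

```python
def inner(u, v):
    """Euclidean inner product on R^n (the real case of the text)."""
    return sum(a * b for a, b in zip(u, v))

# Orthonormal family (u_1, u_2) in R^3; V is the plane it spans.
family = [[1 / 3, 2 / 3, 2 / 3], [2 / 3, 1 / 3, -2 / 3]]
v = [1.0, 0.0, 0.0]

c = [inner(v, u) for u in family]      # Fourier coefficients (unit vectors)
bessel_lhs = sum(ck ** 2 for ck in c)  # sum of |c_k|^2
proj = [sum(ck * u[i] for ck, u in zip(c, family)) for i in range(3)]
residual = [a - b for a, b in zip(v, proj)]  # v minus the sum of its Fourier series
```

Here $c = (1/3, 2/3)$, so $\sum_k |c_k|^2 = 5/9 < 1 = \|v\|^2$; the inequality is strict because $v \notin V$. The residual is orthogonal to both $u_1$ and $u_2$, which by Proposition 1.6 (2) characterizes `proj` as $p_V(v)$.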

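Proof. (1) Let

$$u_I = \sum_{i \in I} c_i u_i$$

for any finite subset $I$ of $K$. We claim that $v - u_I$ is orthogonal to $u_i$ for every $i \in I$. Indeed,

$$\langle v - u_I, u_i\rangle = \Bigl\langle v - \sum_{j \in I} c_j u_j, u_i\Bigr\rangle = \langle v, u_i\rangle - \sum_{j \in I} c_j \langle u_j, u_i\rangle = \langle v, u_i\rangle - c_i\|u_i\|^2 = \langle v, u_i\rangle - \langle v, u_i\rangle = 0,$$

since $\langle u_j, u_i\rangle = 0$ for all $i \neq j$ and $c_i = \langle v, u_i\rangle / \|u_i\|^2$. As a consequence, we have

$$\|v\|^2 = \Bigl\|v - \sum_{i \in I} c_i u_i + \sum_{i \in I} c_i u_i\Bigr\|^2 = \Bigl\|v - \sum_{i \in I} c_i u_i\Bigr\|^2 + \Bigl\|\sum_{i \in I} c_i u_i\Bigr\|^2 = \Bigl\|v - \sum_{i \in I} c_i u_i\Bigr\|^2 + \sum_{i \in I} |c_i|^2,$$

since the $u_i$ are pairwise orthogonal (and $\|u_i\| = 1$). Thus,

$$\sum_{i \in I} |c_i|^2 \leq \|v\|^2,$$

as claimed.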

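(2) We prove the chain of implications (2a) $\Rightarrow$ (2b) $\Rightarrow$ (2c) $\Rightarrow$ (2a).

(2a) $\Rightarrow$ (2b): If $v \in V$, since $V$ is the closure of the subspace spanned by $(u_k)_{k \in K}$, for every $\epsilon > 0$, there is some finite subset $I$ of $K$ and some family $(\lambda_i)_{i \in I}$ of complex numbers, such that

$$\Bigl\|v - \sum_{i \in I} \lambda_i u_i\Bigr\| < \epsilon.$$

Now, for every finite subset $J$ of $K$ such that $I \subseteq J$, we have

$$\Bigl\|v - \sum_{i \in I} \lambda_i u_i\Bigr\|^2 = \Bigl\|v - \sum_{j \in J} c_j u_j + \sum_{j \in J} c_j u_j - \sum_{i \in I} \lambda_i u_i\Bigr\|^2 = \Bigl\|v - \sum_{j \in J} c_j u_j\Bigr\|^2 + \Bigl\|\sum_{j \in J} c_j u_j - \sum_{i \in I} \lambda_i u_i\Bigr\|^2,$$

since $I \subseteq J$ and the $u_j$ (with $j \in J$) are orthogonal to $v - \sum_{j \in J} c_j u_j$ by the argument in (1), which shows that

$$\Bigl\|v - \sum_{j \in J} c_j u_j\Bigr\| \leq \Bigl\|v - \sum_{i \in I} \lambda_i u_i\Bigr\| < \epsilon,$$

and thus, that the family $(c_k u_k)_{k \in K}$ is summable with sum $v$, so that

$$v = \sum_{k \in K} c_k u_k.$$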

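(2b) $\Rightarrow$ (2c): If $v = \sum_{k \in K} c_k u_k$, then for every $\epsilon > 0$, there is some finite subset $I$ of $K$, such that

$$\Bigl\|v - \sum_{j \in J} c_j u_j\Bigr\| < \sqrt{\epsilon},$$

for every finite subset $J$ of $K$ such that $I \subseteq J$, and since we proved in (1) that

$$\|v\|^2 = \Bigl\|v - \sum_{j \in J} c_j u_j\Bigr\|^2 + \sum_{j \in J} |c_j|^2,$$

we get

$$\|v\|^2 - \sum_{j \in J} |c_j|^2 < \epsilon,$$

which proves that $(|c_k|^2)_{k \in K}$ is summable with sum $\|v\|^2$.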


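(2c) $\Rightarrow$ (2a): Finally, if $(|c_k|^2)_{k \in K}$ is summable with sum $\|v\|^2$, for every $\epsilon > 0$, there is some finite subset $I$ of $K$ such that

$$\|v\|^2 - \sum_{j \in J} |c_j|^2 < \epsilon^2$$

for every finite subset $J$ of $K$ such that $I \subseteq J$, and again, using the fact that

$$\|v\|^2 = \Bigl\|v - \sum_{j \in J} c_j u_j\Bigr\|^2 + \sum_{j \in J} |c_j|^2,$$

we get

$$\Bigl\|v - \sum_{j \in J} c_j u_j\Bigr\| < \epsilon,$$

which proves that $(c_k u_k)_{k \in K}$ is summable with sum $\sum_{k \in K} c_k u_k = v$, and $v \in V$.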

(3) Since $\sum_{i \in I} |c_i|^2 \leq \|v\|^2$ for every finite subset $I$ of $K$, by Proposition 1.9, the family $(|c_k|^2)_{k \in K}$ is summable. The Bessel inequality

$$\sum_{k \in K} |c_k|^2 \leq \|v\|^2$$

is an obvious consequence of the inequality $\sum_{i \in I} |c_i|^2 \leq \|v\|^2$ (for every finite $I \subseteq K$). Now, for every natural number $n \geq 1$, if $K_n$ is the subset of $K$ consisting of all $k$ such that $|c_k| \geq 1/n$, the number of elements in $K_n$ is at most

$$\sum_{k \in K_n} |n c_k|^2 \leq n^2 \sum_{k \in K} |c_k|^2 \leq n^2\|v\|^2,$$

which is finite. Since the set of indices $k$ with $c_k \neq 0$ is the union of the finite sets $K_n$ ($n \geq 1$), at most a countable number of the $c_k$ may be nonzero.

Since $(|c_k|^2)_{k \in K}$ is summable with sum $c$, for every $\epsilon > 0$, there is some finite subset $I$ of $K$ such that

$$\sum_{j \in J} |c_j|^2 < \epsilon^2$$

for every finite subset $J$ of $K$ such that $I \cap J = \emptyset$. Since

$$\Bigl\|\sum_{j \in J} c_j u_j\Bigr\|^2 = \sum_{j \in J} |c_j|^2,$$

we get

$$\Bigl\|\sum_{j \in J} c_j u_j\Bigr\| < \epsilon.$$

This proves that $(c_k u_k)_{k \in K}$ is a Cauchy family, which, by Proposition 1.9, implies that $(c_k u_k)_{k \in K}$ is summable, since $E$ is complete. Thus, the Fourier series $\sum_{k \in K} c_k u_k$ is summable, with its sum denoted $u \in V$.


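Since $\sum_{k \in K} c_k u_k$ is summable with sum $u$, for every $\epsilon > 0$, there is some finite subset $I_1$ of $K$ such that

$$\Bigl\|u - \sum_{j \in J} c_j u_j\Bigr\| < \epsilon$$

for every finite subset $J$ of $K$ such that $I_1 \subseteq J$. By the triangle inequality, for every finite subset $I$ of $K$,

$$\|u - v\| \leq \Bigl\|u - \sum_{i \in I} c_i u_i\Bigr\| + \Bigl\|\sum_{i \in I} c_i u_i - v\Bigr\|.$$

By (2), every $w \in V$ is the sum of its Fourier series $\sum_{k \in K} \lambda_k u_k$, and for every $\epsilon > 0$, there is some finite subset $I_2$ of $K$ such that

$$\Bigl\|w - \sum_{j \in J} \lambda_j u_j\Bigr\| < \epsilon$$

for every finite subset $J$ of $K$ such that $I_2 \subseteq J$. By the triangle inequality, for every finite subset $I$ of $K$,

$$\Bigl\|v - \sum_{i \in I} \lambda_i u_i\Bigr\| \leq \|v - w\| + \Bigl\|w - \sum_{i \in I} \lambda_i u_i\Bigr\|.$$

Letting $I = I_1 \cup I_2$, since we showed in (2) that

$$\Bigl\|v - \sum_{i \in I} c_i u_i\Bigr\| \leq \Bigl\|v - \sum_{i \in I} \lambda_i u_i\Bigr\|$$

for every finite subset $I$ of $K$, we get

$$\|u - v\| \leq \Bigl\|u - \sum_{i \in I} c_i u_i\Bigr\| + \Bigl\|\sum_{i \in I} c_i u_i - v\Bigr\| \leq \Bigl\|u - \sum_{i \in I} c_i u_i\Bigr\| + \Bigl\|\sum_{i \in I} \lambda_i u_i - v\Bigr\| \leq \Bigl\|u - \sum_{i \in I} c_i u_i\Bigr\| + \|v - w\| + \Bigl\|w - \sum_{i \in I} \lambda_i u_i\Bigr\|,$$

and thus

$$\|u - v\| \leq \|v - w\| + 2\epsilon.$$

Since this holds for every $\epsilon > 0$, we have

$$\|u - v\| \leq \|v - w\|$$

for all $w \in V$, i.e. $\|v - u\| = d(v, V)$, with $u \in V$, which proves that $u = p_V(v)$.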


1.3 The Hilbert Space l2(K) and the Riesz-Fischer Theorem

Proposition 1.10 suggests looking at the space of sequences $(z_k)_{k \in K}$ (where $z_k \in \mathbb{C}$) such that $(|z_k|^2)_{k \in K}$ is summable. Indeed, such spaces are Hilbert spaces, and it turns out that every Hilbert space is isomorphic to one of those. Such spaces are the infinite-dimensional version of the spaces $\mathbb{C}^n$ under the usual Euclidean norm.

Definition 1.5. Given any nonempty index set $K$, the space $l^2(K)$ is the set of all sequences $(z_k)_{k\in K}$, where $z_k \in \mathbb{C}$, such that $(|z_k|^2)_{k\in K}$ is summable, i.e.,
\[
\sum_{k\in K} |z_k|^2 < \infty.
\]

Remarks:

(1) When $K$ is a finite set of cardinality $n$, $l^2(K)$ is isomorphic to $\mathbb{C}^n$.

(2) When $K = \mathbb{N}$, the space $l^2(\mathbb{N})$ corresponds to the space $l^2$ of Example 2 in Section ??. In that example, we claimed that $l^2$ was a Hermitian space, and in fact, a Hilbert space. We now prove this fact for any index set $K$.
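As a concrete illustration of Definition 1.5, the sketch below compares the partial sums of squares for two sequences indexed by $K = \{1, 2, \ldots\}$. The sequences `harmonic` and `slow` and the helper `partial_sum_of_squares` are illustrative choices, not taken from the text. The first sequence, $(1/k)$, belongs to $l^2(K)$ because its partial sums stay below $\pi^2/6$; the second, $(1/\sqrt{k})$, does not, because its partial sums are those of the harmonic series.

```python
import math

def partial_sum_of_squares(z, n):
    """Sum of |z(k)|^2 over the finite subset {1, ..., n} of the index set."""
    return sum(abs(z(k)) ** 2 for k in range(1, n + 1))

harmonic = lambda k: 1.0 / k             # squares sum to pi^2 / 6
slow = lambda k: 1.0 / math.sqrt(k)      # squares give the divergent harmonic series

sizes = (10, 1000, 100000)
bounded = [partial_sum_of_squares(harmonic, n) for n in sizes]
unbounded = [partial_sum_of_squares(slow, n) for n in sizes]
```

Only finitely many partial sums are computed here, so this illustrates the boundedness criterion of Proposition 1.9 rather than proving anything.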

Proposition 1.11. Given any nonempty index set $K$, the space $l^2(K)$ is a Hilbert space under the Hermitian product
\[
\langle (x_k)_{k\in K}, (y_k)_{k\in K}\rangle = \sum_{k\in K} x_k \overline{y_k}.
\]
The subspace consisting of sequences $(z_k)_{k\in K}$ such that $z_k = 0$, except perhaps for finitely many $k$, is a dense subspace of $l^2(K)$.

Proof. First, we need to prove that $l^2(K)$ is a vector space. Assume that $(x_k)_{k\in K}$ and $(y_k)_{k\in K}$ are in $l^2(K)$. This means that $(|x_k|^2)_{k\in K}$ and $(|y_k|^2)_{k\in K}$ are summable, which, in view of Proposition 1.9, is equivalent to the existence of some positive bounds $A$ and $B$ such that $\sum_{i\in I} |x_i|^2 < A$ and $\sum_{i\in I} |y_i|^2 < B$, for every finite subset $I$ of $K$. To prove that $(|x_k + y_k|^2)_{k\in K}$ is summable, it is sufficient to prove that there is some $C > 0$ such that $\sum_{i\in I} |x_i + y_i|^2 < C$ for every finite subset $I$ of $K$. However, the parallelogram inequality implies that
\[
\sum_{i\in I} |x_i + y_i|^2 \le \sum_{i\in I} 2(|x_i|^2 + |y_i|^2) \le 2(A + B),
\]
for every finite subset $I$ of $K$, and we conclude by Proposition 1.9. Similarly, for every $\lambda \in \mathbb{C}$,
\[
\sum_{i\in I} |\lambda x_i|^2 \le \sum_{i\in I} |\lambda|^2 |x_i|^2 \le |\lambda|^2 A,
\]
and $(\lambda x_k)_{k\in K}$ is summable. Therefore, $l^2(K)$ is a vector space.
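The termwise bound $|x_i + y_i|^2 \le 2(|x_i|^2 + |y_i|^2)$ used above can be obtained from the parallelogram law in $\mathbb{C}$. A short worked derivation, where $x$ and $y$ stand for one pair $x_i$, $y_i$:

```latex
\begin{align*}
|x + y|^2 + |x - y|^2 &= 2|x|^2 + 2|y|^2, \\
|x + y|^2 &= 2\bigl(|x|^2 + |y|^2\bigr) - |x - y|^2 \;\le\; 2\bigl(|x|^2 + |y|^2\bigr).
\end{align*}
```

Summing this inequality over $i \in I$ gives the bound $2(A + B)$.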


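By the Cauchy-Schwarz inequality,
\[
\sum_{i\in I} |x_i y_i| \le \sum_{i\in I} |x_i||y_i| \le \Bigl(\sum_{i\in I} |x_i|^2\Bigr)^{1/2} \Bigl(\sum_{i\in I} |y_i|^2\Bigr)^{1/2} \le \sum_{i\in I} (|x_i|^2 + |y_i|^2)/2 \le (A + B)/2,
\]
for every finite subset $I$ of $K$. Here, we used the fact that
\[
4CD \le (C + D)^2,
\]
which is equivalent to
\[
(C - D)^2 \ge 0.
\]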

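By Proposition 1.9, $(|x_k y_k|)_{k\in K}$ is summable. Since $|x_k \overline{y_k}| = |x_k y_k|$, the customary language is that $(x_k \overline{y_k})_{k\in K}$ is absolutely summable. However, it is a standard fact that this implies that $(x_k \overline{y_k})_{k\in K}$ is summable. (For every $\epsilon > 0$, there is some finite subset $I$ of $K$ such that
\[
\sum_{j\in J} |x_j \overline{y_j}| < \epsilon
\]
for every finite subset $J$ of $K$ such that $I \cap J = \emptyset$, and thus
\[
\Bigl|\sum_{j\in J} x_j \overline{y_j}\Bigr| \le \sum_{j\in J} |x_j \overline{y_j}| < \epsilon,
\]
proving that $(x_k \overline{y_k})_{k\in K}$ is a Cauchy family, and thus summable.) We still have to prove that $l^2(K)$ is complete.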

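Consider a sequence $((\lambda^n_k)_{k\in K})_{n\ge 1}$ of sequences $(\lambda^n_k)_{k\in K} \in l^2(K)$, and assume that it is a Cauchy sequence. This means that for every $\epsilon > 0$, there is some $N \ge 1$ such that
\[
\sum_{k\in K} |\lambda^m_k - \lambda^n_k|^2 < \epsilon^2
\]
for all $m, n \ge N$. For every fixed $k \in K$, this implies that
\[
|\lambda^m_k - \lambda^n_k| < \epsilon
\]
for all $m, n \ge N$, which shows that $(\lambda^n_k)_{n\ge 1}$ is a Cauchy sequence in $\mathbb{C}$. Since $\mathbb{C}$ is complete, the sequence $(\lambda^n_k)_{n\ge 1}$ has a limit $\lambda_k \in \mathbb{C}$. We claim that $(\lambda_k)_{k\in K} \in l^2(K)$ and that this is the limit of $((\lambda^n_k)_{k\in K})_{n\ge 1}$.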

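Given any $\epsilon > 0$, the fact that $((\lambda^n_k)_{k\in K})_{n\ge 1}$ is a Cauchy sequence implies that there is some $N \ge 1$ such that for every finite subset $I$ of $K$, we have
\[
\sum_{i\in I} |\lambda^m_i - \lambda^n_i|^2 < \epsilon/4
\]
for all $m, n \ge N$. Let $p = |I|$. Since $\lambda_i$ is the limit of $(\lambda^n_i)_{n\ge 1}$, we can find some $n \ge N$ large enough so that
\[
|\lambda^n_i - \lambda_i| < \frac{\sqrt{\epsilon}}{2\sqrt{p}}
\]
for every $i \in I$, and thus $\sum_{i\in I} |\lambda^n_i - \lambda_i|^2 < \epsilon/4$. Since
\[
|\lambda^m_i - \lambda_i| \le |\lambda^m_i - \lambda^n_i| + |\lambda^n_i - \lambda_i|,
\]
the triangle inequality for the Euclidean norm on $\mathbb{C}^p$ yields
\[
\Bigl(\sum_{i\in I} |\lambda^m_i - \lambda_i|^2\Bigr)^{1/2} \le \Bigl(\sum_{i\in I} |\lambda^m_i - \lambda^n_i|^2\Bigr)^{1/2} + \Bigl(\sum_{i\in I} |\lambda^n_i - \lambda_i|^2\Bigr)^{1/2} < \frac{\sqrt{\epsilon}}{2} + \frac{\sqrt{\epsilon}}{2} = \sqrt{\epsilon},
\]
and thus,
\[
\sum_{i\in I} |\lambda^m_i - \lambda_i|^2 < \epsilon,
\]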

for all $m \ge N$. Since the above holds for every finite subset $I$ of $K$, by Proposition 1.9, we get
\[
\sum_{k\in K} |\lambda^m_k - \lambda_k|^2 \le \epsilon,
\]
for all $m \ge N$. This proves that $(\lambda^m_k - \lambda_k)_{k\in K} \in l^2(K)$ for all $m \ge N$, and since $l^2(K)$ is a vector space and $(\lambda^m_k)_{k\in K} \in l^2(K)$ for all $m \ge 1$, we get $(\lambda_k)_{k\in K} \in l^2(K)$. However,
\[
\sum_{k\in K} |\lambda^m_k - \lambda_k|^2 \le \epsilon
\]
for all $m \ge N$, means that the sequence $((\lambda^m_k)_{k\in K})_{m\ge 1}$ converges to $(\lambda_k)_{k\in K} \in l^2(K)$. The fact that the subspace consisting of sequences $(z_k)_{k\in K}$ such that $z_k = 0$ except perhaps for finitely many $k$ is a dense subspace of $l^2(K)$ is left as an easy exercise.

Remark: When $K$ is infinite, the subspace consisting of all sequences $(z_k)_{k\in K}$ such that $z_k = 0$, except perhaps for finitely many $k$, provides an example of a subspace which is not closed in $l^2(K)$. Indeed, this space is strictly contained in $l^2(K)$, since there are countable sequences of nonnull elements in $l^2(K)$ (why?).
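One way to see the failure of closedness concretely, sketched under the assumption $K = \{1, 2, \ldots\}$: the sequence $z = (1/k)_{k\ge 1}$ has no zero term, yet its truncations $z^{(n)}$ (which are finitely supported) converge to it in $l^2(K)$. The helper `tail_norm_squared` is an illustrative name, and the code relies on the known value $\sum_{k\ge 1} 1/k^2 = \pi^2/6$.

```python
import math

def tail_norm_squared(n):
    """||z - z^(n)||^2 for z = (1/k)_{k>=1} and z^(n) its truncation after n terms.

    The squared distance is the tail sum_{k>n} 1/k^2, computed here as
    pi^2/6 minus the first n terms.
    """
    return math.pi ** 2 / 6 - sum(1.0 / k ** 2 for k in range(1, n + 1))

tails = [tail_norm_squared(n) for n in (1, 10, 100, 1000)]
```

The values in `tails` decrease toward 0, so $z$ is a limit of finitely supported sequences without being finitely supported itself.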

We just need two more propositions before being able to prove that every Hilbert space is isomorphic to some $l^2(K)$.

Proposition 1.12. Let $E$ be a Hilbert space, and $(u_k)_{k\in K}$ an orthogonal family in $E$. The following properties hold:

(1) For every family $(\lambda_k)_{k\in K} \in l^2(K)$, the family $(\lambda_k u_k)_{k\in K}$ is summable. Furthermore, $v = \sum_{k\in K} \lambda_k u_k$ is the only vector such that $c_k = \lambda_k$ for all $k \in K$, where the $c_k$ are the Fourier coefficients of $v$.

(2) For any two families $(\lambda_k)_{k\in K} \in l^2(K)$ and $(\mu_k)_{k\in K} \in l^2(K)$, if $v = \sum_{k\in K} \lambda_k u_k$ and $w = \sum_{k\in K} \mu_k u_k$, we have the following equation, also called Parseval identity:
\[
\langle v, w\rangle = \sum_{k\in K} \lambda_k \overline{\mu_k}.
\]

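Proof. (1) The fact that $(\lambda_k)_{k\in K} \in l^2(K)$ means that $(|\lambda_k|^2)_{k\in K}$ is summable. The proof given in Proposition 1.10 (3) applies to the family $(|\lambda_k|^2)_{k\in K}$ (instead of $(|c_k|^2)_{k\in K}$), and yields the fact that $(\lambda_k u_k)_{k\in K}$ is summable. Letting $v = \sum_{k\in K} \lambda_k u_k$, recall that $c_k = \langle v, u_k\rangle / \|u_k\|^2$. Pick some $k \in K$. Since $\langle -, -\rangle$ is continuous, for every $\epsilon > 0$, there is some $\eta > 0$ such that
\[
|\langle v, u_k\rangle - \langle w, u_k\rangle| < \epsilon \|u_k\|^2
\]
whenever
\[
\|v - w\| < \eta.
\]
However, since for every $\eta > 0$, there is some finite subset $I$ of $K$ such that
\[
\Bigl\|v - \sum_{j\in J} \lambda_j u_j\Bigr\| < \eta
\]
for every finite subset $J$ of $K$ such that $I \subseteq J$, we can pick $J = I \cup \{k\}$, and letting $w = \sum_{j\in J} \lambda_j u_j$, we get
\[
\Bigl|\langle v, u_k\rangle - \Bigl\langle \sum_{j\in J} \lambda_j u_j, u_k\Bigr\rangle\Bigr| < \epsilon \|u_k\|^2.
\]
However,
\[
\langle v, u_k\rangle = c_k \|u_k\|^2 \quad\text{and}\quad \Bigl\langle \sum_{j\in J} \lambda_j u_j, u_k\Bigr\rangle = \lambda_k \|u_k\|^2,
\]
and thus, the above proves that $|c_k - \lambda_k| < \epsilon$ for every $\epsilon > 0$, and thus, that $c_k = \lambda_k$.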

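(2) Since $\langle -, -\rangle$ is continuous, for every $\epsilon > 0$, there are some $\eta_1 > 0$ and $\eta_2 > 0$, such that
\[
|\langle x, y\rangle| < \epsilon
\]
whenever $\|x\| < \eta_1$ and $\|y\| < \eta_2$. Since $v = \sum_{k\in K} \lambda_k u_k$ and $w = \sum_{k\in K} \mu_k u_k$, there is some finite subset $I_1$ of $K$ such that
\[
\Bigl\|v - \sum_{j\in J} \lambda_j u_j\Bigr\| < \eta_1
\]
for every finite subset $J$ of $K$ such that $I_1 \subseteq J$, and there is some finite subset $I_2$ of $K$ such that
\[
\Bigl\|w - \sum_{j\in J} \mu_j u_j\Bigr\| < \eta_2
\]
for every finite subset $J$ of $K$ such that $I_2 \subseteq J$. Letting $I = I_1 \cup I_2$, we get
\[
\Bigl|\Bigl\langle v - \sum_{i\in I} \lambda_i u_i,\; w - \sum_{i\in I} \mu_i u_i\Bigr\rangle\Bigr| < \epsilon.
\]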

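Furthermore,
\[
\begin{aligned}
\langle v, w\rangle &= \Bigl\langle v - \sum_{i\in I} \lambda_i u_i + \sum_{i\in I} \lambda_i u_i,\; w - \sum_{i\in I} \mu_i u_i + \sum_{i\in I} \mu_i u_i\Bigr\rangle \\
&= \Bigl\langle v - \sum_{i\in I} \lambda_i u_i,\; w - \sum_{i\in I} \mu_i u_i\Bigr\rangle + \sum_{i\in I} \lambda_i \overline{\mu_i},
\end{aligned}
\]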

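since the $u_i$ are orthogonal to $v - \sum_{i\in I} \lambda_i u_i$ and $w - \sum_{i\in I} \mu_i u_i$ for all $i \in I$. This proves that for every $\epsilon > 0$, there is some finite subset $I$ of $K$ such that
\[
\Bigl|\langle v, w\rangle - \sum_{i\in I} \lambda_i \overline{\mu_i}\Bigr| < \epsilon.
\]
We already know from Proposition 1.11 that $(\lambda_k \overline{\mu_k})_{k\in K}$ is summable, and since $\epsilon > 0$ is arbitrary, we get
\[
\langle v, w\rangle = \sum_{k\in K} \lambda_k \overline{\mu_k}.
\]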
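The Parseval identity can be sanity-checked in finite dimension. The sketch below is an illustration under assumptions not made in the text: the space is $\mathbb{C}^4$, the family is the normalized discrete Fourier family (an orthonormal basis), and `inner`, `basis`, `lam`, and `mu` are illustrative names.

```python
import cmath

def inner(x, y):
    """Hermitian product <x, y> = sum_k x_k * conj(y_k)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

n = 4
# Assumed orthonormal family in C^n: normalized discrete Fourier vectors.
basis = [[cmath.exp(2j * cmath.pi * j * k / n) / n ** 0.5 for j in range(n)]
         for k in range(n)]

v = [1 + 1j, 2, -1j, 0.5]
w = [3, -1 + 2j, 1j, 1]

lam = [inner(v, u) for u in basis]   # coefficients of v
mu = [inner(w, u) for u in basis]    # coefficients of w

lhs = inner(v, w)
rhs = sum(l * m.conjugate() for l, m in zip(lam, mu))
```

Up to floating-point error, `lhs` and `rhs` agree, which is the identity $\langle v, w\rangle = \sum_k \lambda_k \overline{\mu_k}$ for this family.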

The next proposition states properties characterizing Hilbert bases (total orthogonal families).

Proposition 1.13. Let $E$ be a Hilbert space, and let $(u_k)_{k\in K}$ be an orthogonal family in $E$. The following properties are equivalent:

(1) The family $(u_k)_{k\in K}$ is a total orthogonal family.

(2) For every vector $v \in E$, if $(c_k)_{k\in K}$ are the Fourier coefficients of $v$, then the family $(c_k u_k)_{k\in K}$ is summable and $v = \sum_{k\in K} c_k u_k$.

(3) For every vector $v \in E$, we have the Parseval identity:
\[
\|v\|^2 = \sum_{k\in K} |c_k|^2.
\]

(4) For every vector $u \in E$, if $\langle u, u_k\rangle = 0$ for all $k \in K$, then $u = 0$.

Proof. The equivalence of (1), (2), and (3), is an immediate consequence of Proposition 1.10 and Proposition 1.12.

(4) If $(u_k)_{k\in K}$ is a total orthogonal family and $\langle u, u_k\rangle = 0$ for all $k \in K$, since $u = \sum_{k\in K} c_k u_k$ where $c_k = \langle u, u_k\rangle / \|u_k\|^2$, we have $c_k = 0$ for all $k \in K$, and $u = 0$.

Conversely, assume that the closure $V$ of $(u_k)_{k\in K}$ is different from $E$. Then, by Proposition 1.6, we have $E = V \oplus V^{\perp}$, where $V^{\perp}$ is the orthogonal complement of $V$, and $V^{\perp}$ is nontrivial since $V \neq E$. As a consequence, there is some nonnull vector $u \in V^{\perp}$. But then, $u$ is orthogonal to every vector in $V$, and in particular,
\[
\langle u, u_k\rangle = 0
\]
for all $k \in K$, which, by assumption, implies that $u = 0$, contradicting the fact that $u \neq 0$.

Remarks:

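(1) If $E$ is a Hilbert space and $(u_k)_{k\in K}$ is a total orthogonal family in $E$, there is a simpler argument to prove that $u = 0$ if $\langle u, u_k\rangle = 0$ for all $k \in K$, based on the continuity of $\langle -, -\rangle$. The argument is to prove that the assumption implies that $\langle v, u\rangle = 0$ for all $v \in E$. Since $\langle -, -\rangle$ is positive definite, this implies that $u = 0$. By continuity of $\langle -, -\rangle$, for every $\epsilon > 0$, there is some $\eta > 0$ such that for every finite subset $I$ of $K$, for every family $(\lambda_i)_{i\in I}$, for every $v \in E$,
\[
\Bigl|\langle v, u\rangle - \Bigl\langle \sum_{i\in I} \lambda_i u_i, u\Bigr\rangle\Bigr| < \epsilon
\]
whenever
\[
\Bigl\|v - \sum_{i\in I} \lambda_i u_i\Bigr\| < \eta.
\]
Since $(u_k)_{k\in K}$ is dense in $E$, for every $v \in E$, there is some finite subset $I$ of $K$ and some family $(\lambda_i)_{i\in I}$ such that
\[
\Bigl\|v - \sum_{i\in I} \lambda_i u_i\Bigr\| < \eta,
\]
and since by assumption, $\bigl\langle \sum_{i\in I} \lambda_i u_i, u\bigr\rangle = 0$, we get
\[
|\langle v, u\rangle| < \epsilon.
\]
Since this holds for every $\epsilon > 0$, we must have $\langle v, u\rangle = 0$.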


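(2) If $V$ is any nonempty subset of $E$, the kind of argument used in the previous remark can be used to prove that $V^{\perp}$ is closed (even if $V$ is not), and that $V^{\perp\perp}$ is the closure of the subspace spanned by $V$ (the closure of $V$ itself when $V$ is a subspace).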

We will now prove that every Hilbert space has some Hilbert basis. This requires using a fundamental theorem from set theory known as Zorn’s Lemma, which we quickly review.

Given any set $X$ with a partial ordering $\le$, recall that a nonempty subset $C$ of $X$ is a chain if it is totally ordered (i.e., for all $x, y \in C$, either $x \le y$ or $y \le x$). A nonempty subset $Y$ of $X$ is bounded iff there is some $b \in X$ such that $y \le b$ for all $y \in Y$. Some $m \in X$ is maximal iff for every $x \in X$, $m \le x$ implies that $x = m$. We can now state Zorn’s Lemma. For more details, see Rudin [7], Lang [4], or Artin [1].

Proposition 1.14. Given any nonempty partially ordered set $X$, if every (nonempty) chain in $X$ is bounded, then $X$ has some maximal element.

We can now prove the existence of Hilbert bases. We define a partial order on families $(u_k)_{k\in K}$ as follows: For any two families $(u_k)_{k\in K_1}$ and $(v_k)_{k\in K_2}$, we say that

\[
(u_k)_{k\in K_1} \le (v_k)_{k\in K_2}
\]
iff $K_1 \subseteq K_2$ and $u_k = v_k$ for all $k \in K_1$. This is clearly a partial order.

Proposition 1.15. Let $E$ be a Hilbert space. Given any orthogonal family $(u_k)_{k\in K}$ in $E$, there is a total orthogonal family $(u_l)_{l\in L}$ containing $(u_k)_{k\in K}$.

Proof. Consider the set $S$ of all orthogonal families greater than or equal to the family $B = (u_k)_{k\in K}$. We claim that every chain in $S$ is bounded. Indeed, if $C = (C_l)_{l\in L}$ is a chain in $S$, where $C_l = (u_{k,l})_{k\in K_l}$, the union family
\[
(u_k)_{k \in \bigcup_{l\in L} K_l}, \quad\text{where } u_k = u_{k,l} \text{ whenever } k \in K_l,
\]
is clearly an upper bound for $C$, and it is immediately verified that it is an orthogonal family. By Zorn’s Lemma 1.14, there is a maximal family $(u_l)_{l\in L}$ containing $(u_k)_{k\in K}$. If $(u_l)_{l\in L}$ is not dense in $E$, then its closure $V$ is strictly contained in $E$, and by Proposition 1.6, the orthogonal complement $V^{\perp}$ of $V$ is nontrivial since $V \neq E$. As a consequence, there is some nonnull vector $u \in V^{\perp}$. But then, $u$ is orthogonal to every vector in $(u_l)_{l\in L}$, and we can form an orthogonal family strictly greater than $(u_l)_{l\in L}$ by adding $u$ to this family, contradicting the maximality of $(u_l)_{l\in L}$. Therefore, $(u_l)_{l\in L}$ is dense in $E$, and thus, it is a Hilbert basis.
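In finite dimension no appeal to Zorn’s Lemma is needed, and the extension step of the proof ("add a nonnull vector orthogonal to the current family") can be carried out explicitly. The following is a sketch of that finite-dimensional analogue only, assuming real rational vectors in $\mathbb{Q}^3$ and using the standard basis as the source of candidate vectors; `dot` and `extend_to_orthogonal_basis` are illustrative names.

```python
from fractions import Fraction

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def extend_to_orthogonal_basis(family, dim):
    """Enlarge an orthogonal family of nonzero vectors in Q^dim until no
    nonzero vector orthogonal to it remains (a stand-in for maximality)."""
    basis = [[Fraction(a) for a in u] for u in family]
    for i in range(dim):
        e = [Fraction(int(i == j)) for j in range(dim)]
        # Remove from e its components along the current orthogonal family.
        r = e[:]
        for u in basis:
            c = dot(e, u) / dot(u, u)
            r = [a - c * b for a, b in zip(r, u)]
        if any(r):            # nonzero residual: orthogonal to all of basis
            basis.append(r)
    return basis

start = [[1, 1, 0], [1, -1, 0]]
full = extend_to_orthogonal_basis(start, 3)
```

The result `full` keeps the two starting vectors and adds one more, giving an orthogonal basis of $\mathbb{Q}^3$ that contains the original family.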

Remark: It is possible to prove that all Hilbert bases for a Hilbert space $E$ have index sets $K$ of the same cardinality. For a proof, see Bourbaki [2].

At last, we can prove that every Hilbert space is isomorphic to some Hilbert space $l^2(K)$ for some suitable $K$.


Theorem 1.16. (Riesz-Fischer) For every Hilbert space $E$, there is some nonempty set $K$ such that $E$ is isomorphic to the Hilbert space $l^2(K)$. More specifically, for any Hilbert basis $(u_k)_{k\in K}$ of $E$, the maps $f\colon l^2(K) \to E$ and $g\colon E \to l^2(K)$ defined such that
\[
f((\lambda_k)_{k\in K}) = \sum_{k\in K} \lambda_k u_k \quad\text{and}\quad g(u) = \bigl(\langle u, u_k\rangle / \|u_k\|^2\bigr)_{k\in K} = (c_k)_{k\in K},
\]
are bijective linear isometries such that $g \circ f = \mathrm{id}$ and $f \circ g = \mathrm{id}$.

Proof. By Proposition 1.12 (1), the map $f$ is well defined, and it is clearly linear. By Proposition 1.10 (3), the map $g$ is well defined, and it is also clearly linear. By Proposition 1.10 (2b), we have
\[
f(g(u)) = u = \sum_{k\in K} c_k u_k,
\]
and by Proposition 1.12 (1), we have
\[
g(f((\lambda_k)_{k\in K})) = (\lambda_k)_{k\in K},
\]
and thus $g \circ f = \mathrm{id}$ and $f \circ g = \mathrm{id}$. By Proposition 1.12 (2), the linear map $g$ is an isometry. Therefore, $f$ is a linear bijection and an isometry between $l^2(K)$ and $E$, with inverse $g$.

Remark: The surjectivity of the map g : E → l2(K) is known as the Riesz-Fischer theorem.
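The two maps of Theorem 1.16 can be written out directly when $K$ is finite. The sketch below assumes $E = \mathbb{C}^2$ with an orthonormal Hilbert basis (an illustrative choice), so that $l^2(K)$ is just $\mathbb{C}^2$ with its usual norm; the Python names `f`, `g`, `inner`, and `basis` mirror the notation of the theorem but are otherwise hypothetical.

```python
def inner(x, y):
    """Hermitian product <x, y> = sum_k x_k * conj(y_k)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

s = 2 ** -0.5
basis = [(s, s), (s, -s)]    # assumed orthonormal Hilbert basis of C^2

def g(u):
    """u |-> (c_k), with c_k = <u, u_k> / ||u_k||^2."""
    return [inner(u, uk) / inner(uk, uk).real for uk in basis]

def f(lam):
    """(lambda_k) |-> sum_k lambda_k u_k."""
    dim = len(basis[0])
    return [sum(l * uk[i] for l, uk in zip(lam, basis)) for i in range(dim)]

u = [1 + 2j, 3 - 1j]
lam = [0.5j, -2]

round_trip_E = f(g(u))       # should reproduce u
round_trip_l2 = g(f(lam))    # should reproduce lam
```

Up to rounding, both round trips return their input, and $\|u\|^2$ equals $\sum_k |c_k|^2$, in line with $g \circ f = \mathrm{id}$, $f \circ g = \mathrm{id}$, and the isometry property.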

Having done all this hard work, we sketch how these results apply to Fourier series. Again, we refer the readers to Rudin [7] or Lang [5, 6] for a comprehensive exposition.

Let $C(T)$ denote the set of all periodic continuous functions $f\colon [-\pi, \pi] \to \mathbb{C}$ with period $2\pi$. There is a Hilbert space $L^2(T)$ containing $C(T)$ and such that $C(T)$ is dense in $L^2(T)$, whose inner product is given by
\[
\langle f, g\rangle = \int_{-\pi}^{\pi} f(x)\overline{g(x)}\,dx.
\]
The Hilbert space $L^2(T)$ is the space of Lebesgue square-integrable periodic functions (of period $2\pi$).

It turns out that the family $(e^{ikx})_{k\in\mathbb{Z}}$ is a total orthogonal family in $L^2(T)$, because it is already dense in $C(T)$ (for instance, see Rudin [7]). Then, the Riesz-Fischer theorem says that for every family $(c_k)_{k\in\mathbb{Z}}$ of complex numbers such that
\[
\sum_{k\in\mathbb{Z}} |c_k|^2 < \infty,
\]
there is a unique function $f \in L^2(T)$ such that $f$ is equal to its Fourier series
\[
f(x) = \sum_{k\in\mathbb{Z}} c_k e^{ikx},
\]


where the Fourier coefficients $c_k$ of $f$ are given by the formula
\[
c_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) e^{-ikt}\,dt.
\]

The Parseval theorem says that
\[
\sum_{k=-\infty}^{+\infty} c_k \overline{d_k} = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t)\overline{g(t)}\,dt
\]
for all $f, g \in L^2(T)$, where $c_k$ and $d_k$ are the Fourier coefficients of $f$ and $g$.

Thus, there is an isomorphism between the two Hilbert spaces $L^2(T)$ and $l^2(\mathbb{Z})$, which is the deep reason why the Fourier coefficients “work”. Theorem 1.16 implies that the Fourier series $\sum_{k\in\mathbb{Z}} c_k e^{ikx}$ of a function $f \in L^2(T)$ converges to $f$ in the $L^2$-sense, i.e., in the mean-square sense. This does not necessarily imply that the Fourier series converges to $f$ pointwise! This is a subtle issue, and for more on this subject, the reader is referred to Lang [5, 6] or Schwartz [10, 11].
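To make the Parseval theorem concrete, here is a small numerical sketch for one function chosen purely for illustration, $f(t) = t$ on $[-\pi, \pi)$ (extended periodically), whose Fourier coefficients are $c_0 = 0$ and $c_k = i(-1)^k/k$ for $k \neq 0$. The integral is approximated with a midpoint rule, so the values are approximate; `fourier_coefficient` and its `samples` parameter are hypothetical names.

```python
import cmath
import math

def fourier_coefficient(f, k, samples=2000):
    """Midpoint-rule approximation of c_k = (1/2pi) * int_{-pi}^{pi} f(t) e^{-ikt} dt."""
    h = 2 * math.pi / samples
    total = 0j
    for j in range(samples):
        t = -math.pi + (j + 0.5) * h
        total += f(t) * cmath.exp(-1j * k * t)
    return total * h / (2 * math.pi)

f = lambda t: t    # sawtooth on [-pi, pi), an illustrative choice

coeffs = {k: fourier_coefficient(f, k) for k in range(-30, 31)}
partial_parseval = sum(abs(c) ** 2 for c in coeffs.values())
mean_square = math.pi ** 2 / 3    # (1/2pi) * integral of t^2 over [-pi, pi]
```

The partial sum `partial_parseval` stays below `mean_square` and approaches it as more coefficients are included, as the Parseval theorem (with $g = f$) predicts. This says nothing about pointwise convergence, which fails at the jump of this particular $f$.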

We can also consider the set $C([-1, 1])$ of continuous functions $f\colon [-1, 1] \to \mathbb{C}$. There is a Hilbert space $L^2([-1, 1])$ containing $C([-1, 1])$ and such that $C([-1, 1])$ is dense in $L^2([-1, 1])$, whose inner product is given by
\[
\langle f, g\rangle = \int_{-1}^{1} f(x)\overline{g(x)}\,dx.
\]

The Hilbert space $L^2([-1, 1])$ is the space of Lebesgue square-integrable functions over $[-1, 1]$. The Legendre polynomials $P_n(x)$ defined in Example 5 of Section ?? (Chapter ??) form a Hilbert basis of $L^2([-1, 1])$. Recall that if we let $f_n$ be the function
\[
f_n(x) = (x^2 - 1)^n,
\]
$P_n(x)$ is defined as follows:
\[
P_0(x) = 1, \quad\text{and}\quad P_n(x) = \frac{1}{2^n n!} f_n^{(n)}(x),
\]
where $f_n^{(n)}$ is the $n$th derivative of $f_n$. The reason for the leading coefficient is to get $P_n(1) = 1$. It can be shown with some effort that
\[
P_n(x) = \sum_{0 \le k \le n/2} \frac{(-1)^k (2(n - k))!}{2^n (n - k)!\, k!\, (n - 2k)!}\, x^{n-2k}.
\]
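The explicit sum for $P_n$ lends itself to an exact check with rational arithmetic. The sketch below builds the coefficients of $P_0, \ldots, P_4$ from that formula and integrates products of polynomials over $[-1, 1]$ term by term; the function names are illustrative, and only real polynomials are handled (so no conjugation is needed).

```python
from fractions import Fraction
from math import factorial

def legendre_coeffs(n):
    """Coefficients of P_n (list index = power of x), from the explicit sum."""
    coeffs = [Fraction(0)] * (n + 1)
    for k in range(n // 2 + 1):
        num = (-1) ** k * factorial(2 * (n - k))
        den = 2 ** n * factorial(n - k) * factorial(k) * factorial(n - 2 * k)
        coeffs[n - 2 * k] = Fraction(num, den)
    return coeffs

def evaluate(coeffs, x):
    return sum(c * x ** i for i, c in enumerate(coeffs))

def integral_of_product(p, q):
    """Exact integral over [-1, 1] of the product of two real polynomials."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:          # odd powers integrate to 0
                total += a * b * Fraction(2, i + j + 1)
    return total

P = [legendre_coeffs(n) for n in range(5)]
```

With these definitions, `evaluate(P[n], 1)` equals 1 for each $n$, and `integral_of_product(P[m], P[n])` vanishes for $m \neq n$, which is the orthogonality underlying the Hilbert basis claim. Totality is not something a finite computation can confirm.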

Bibliography

[1] Michael Artin. Algebra. Prentice Hall, first edition, 1991.

[2] Nicolas Bourbaki. Espaces Vectoriels Topologiques. Elements de Mathematiques. Masson, 1981.

[3] Jacques Dixmier. General Topology. UTM. Springer Verlag, first edition, 1984.

[4] Serge Lang. Algebra. Addison Wesley, third edition, 1993.

[5] Serge Lang. Real and Functional Analysis. GTM 142. Springer Verlag, third edition, 1996.

[6] Serge Lang. Undergraduate Analysis. UTM. Springer Verlag, second edition, 1997.

[7] Walter Rudin. Real and Complex Analysis. McGraw Hill, third edition, 1987.

[8] Laurent Schwartz. Topologie Generale et Analyse Fonctionnelle. Collection Enseignement des Sciences. Hermann, 1980.

[9] Laurent Schwartz. Analyse I. Theorie des Ensembles et Topologie. Collection Enseignement des Sciences. Hermann, 1991.

[10] Laurent Schwartz. Analyse III. Calcul Integral. Collection Enseignement des Sciences. Hermann, 1993.

[11] Laurent Schwartz. Analyse IV. Applications a la Theorie de la Mesure. Collection Enseignement des Sciences. Hermann, 1993.
