functional analysis i - files.warwick · definition 1.9 if a linear map l : v !w is a bijection,...

89
Functional Analysis I Term 1, 2009–2010 Vassili Gelfreich

Upload: others

Post on 08-Sep-2020

8 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Functional Analysis I

Term 1, 2009–2010

Vassili Gelfreich

Page 2: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic
Page 3: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Contents1 Vector spaces 1

1.1 Definition. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.2 Examples of vector spaces . . . . . . . . . . . . . . . . . . . . . . . 21.3 Hamel bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

2 Normed spaces 62.1 Norms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62.2 Four famous inequalities . . . . . . . . . . . . . . . . . . . . . . . . 72.3 Examples of norms on a space of functions . . . . . . . . . . . . . . 82.4 Equivalence of norms . . . . . . . . . . . . . . . . . . . . . . . . . . 92.5 Linear Isometries . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

3 Convergence in a normed space 123.1 Definition and examples . . . . . . . . . . . . . . . . . . . . . . . . 123.2 Topology on a normed space . . . . . . . . . . . . . . . . . . . . . . 143.3 Closed sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153.4 Compactness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

4 Banach spaces 184.1 Completeness: Definition and examples . . . . . . . . . . . . . . . . 184.2 The completion of a normed space . . . . . . . . . . . . . . . . . . . 204.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

5 Lebesgue spaces 255.1 Integrable functions . . . . . . . . . . . . . . . . . . . . . . . . . . . 255.2 Properties of Lebesgue integrals . . . . . . . . . . . . . . . . . . . . 285.3 Lebesgue space L1(R) . . . . . . . . . . . . . . . . . . . . . . . . . 315.4 Lp spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

6 Hilbert spaces 366.1 Inner product spaces . . . . . . . . . . . . . . . . . . . . . . . . . . 366.2 Natural norms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376.3 Parallelogram law and polarisation identity . . . . . . . . . . . . . . 386.4 Hilbert spaces: Definition and examples . . . . . . . . . . . . . . . . 40

7 Orthonormal bases in Hilbert spaces 417.1 Orthonormal sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . 417.2 Gram-Schmidt orthonormalisation . . . . . . . . . . . . . . . . . . . 427.3 Bessel’s inequality . . . . . . . . . . . . . . . . . . . . . . . . . . . 447.4 Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447.5 Orthonormal basis in a Hilbert space . . . . . . . . . . . . . . . . . . 45

iii

Page 4: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

8 Closest points and approximations 488.1 Closest points in convex subsets . . . . . . . . . . . . . . . . . . . . 488.2 Orthogonal complements . . . . . . . . . . . . . . . . . . . . . . . . 498.3 Best approximations . . . . . . . . . . . . . . . . . . . . . . . . . . 518.4 Weierstrass Approximation Theorem . . . . . . . . . . . . . . . . . . 53

9 Separable Hilbert spaces 569.1 Definition and examples . . . . . . . . . . . . . . . . . . . . . . . . 569.2 Isometry to `2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

10 Linear maps between Banach spaces 5810.1 Continuous linear maps . . . . . . . . . . . . . . . . . . . . . . . . . 5810.2 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6010.3 Kernel and range . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

11 Linear functionals 6311.1 Definition and examples . . . . . . . . . . . . . . . . . . . . . . . . 6311.2 Riesz representation theorem . . . . . . . . . . . . . . . . . . . . . . 63

12 Linear operators on Hilbert spaces 6512.1 Complexification . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6512.2 Adjoint operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6512.3 Self-adjoint operators . . . . . . . . . . . . . . . . . . . . . . . . . . 68

13 Introduction to Spectral Theory 7113.1 Point spectrum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7113.2 Invertible operators . . . . . . . . . . . . . . . . . . . . . . . . . . . 7213.3 Resolvent and spectrum . . . . . . . . . . . . . . . . . . . . . . . . . 7313.4 Compact operators . . . . . . . . . . . . . . . . . . . . . . . . . . . 7613.5 Spectral theory for compact self-adjoint operators . . . . . . . . . . . 78

14 Sturm-Liouville problems 82

iv

Page 5: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

PrefaceThese notes follow the lectures on Functional Analysis given in the Autumn Term of2009. If you find a mistake or misprint please inform the author by sending an e-mailto [email protected]. The author thanks James Robinson for his set ofnotes and selection of exercises which significantly facilitated the preparation of thelectures. The author also thanks all students who helped with proofreading the notes.

1 Vector spaces

1.1 Definition.A vector space V over a field K is a set equipped with two binary operations calledvector addition and multiplication by scalars. Elements of V are called vectors andelements of K are called scalars. The sum of two vectors x,y ∈V is denoted x+ y, theproduct of a scalar α ∈K and vector x ∈V is denoted αx.

It is possible to consider vector spaces over an arbitrary field K, but we will con-sider the fields R and C only. So we will always assume that K denotes either R or Cand refer to V as a real or complex vector space respectively.

In a vector space, addition and multiplication have to satisfy the following set ofaxioms: Let x,y,z be arbitrary vectors in V , and α,β be arbitrary scalars in K, then

• Associativity of addition: x+(y+ z) = (x+ y)+ z.

• Commutativity of addition: x+ y = y+ z.

• There exists an element 0 ∈V , called the zero vector, such that x+0 = x for allx ∈V .

• For all x ∈V , there exists an element y ∈V , called the additive inverse of x, suchthat x+ y = 0. The additive inverse is denoted −x.

• “Associativity” of multiplication:1 α(βx) = (αβ )x.

• Distributivity:

α(x+ y) = αx+αy and (α +β )x = αx+βx.

• There is an element 1 ∈K such that 1x = x for all x ∈V . This element is calledthe multiplicative identity in K.

1The purist would not use the word “associativity” for this property as it includes two differentoperations: αβ is a product of two scalars and βx involves a vector and scalar.

1

Page 6: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

It is convenient to define two additional operations: subtraction of two vectors anddivision by a (non-zero) scalar are defined by

x− y = x+(−y),x/α = (1/α)x.

1.2 Examples of vector spaces1. Rn is a real vector space.

2. Cn is a complex vector space.

3. Cn is a real vector space.

4. The set of all polynomials P is a vector space:

P =

{n

∑k=0

αkxk : αk ∈K, n ∈ N

}.

5. The set of all bounded sequences `∞(K) is a vector space:

`∞(K) ={

(x1,x2, . . .) : xk ∈K for all k ∈ N, supk∈N|xk|< ∞

}.

For two sequences x,y ∈ `∞(K), we define x+ y by

x+ y = (x1 + y1,x2 + y2, . . .) .

For α ∈K, we setαx = (αx1,αx2, . . .) .

We will always use these definitions of addition and multiplication by scalars forsequences.

6. Let 1 ≤ p < ∞. The set `p(K) of all pth power summable sequences is a vectorspace:

`p(K) =

{(x1,x2, . . .) : xk ∈K,

∑k=1|xk|p < ∞

}.

The definition of the multiplication by scalars and vector addition is the sameas in the previous example. Let us check that the sum x + y ∈ `p(K) for anyx,y ∈ `p(K). Indeed,

∑k=1|xk + yk|p ≤

∑k=1

(|xk|+ |yk|)p ≤∞

∑k=1

2p(|xk|p + |yk|p)

≤ 2p∞

∑k=1|xk|p +2p

∑k=1|yk|p < ∞ .

2

Page 7: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

7. The space C[0,1] of all real-valued continuous functions on the closed interval[0,1] is a vector space. The addition and multiplication by scalars are definednaturally: for f ,g ∈C[0,1] and α ∈ R we defined by f + g the function whosevalues are given by

( f +g)(t) = f (t)+g(t) , t ∈ [0,1] ,

and α f is the function whose values are

(α f )(t) = α f (t) , t ∈ [0,1] ,

We will always use these definitions on all spaces of functions to be consideredlater.

8. The set L1(0,1) of all real-valued continuous functions f on the open interval(0,1) for which ∫ 1

0| f (t)|dt < ∞

is a vector space.

If f ∈C[0,1] then f ∈ L1(0,1). Indeed, since [0,1] is compact f is bounded (andattains its lower and upper bounds). Then∫ 1

0| f (t)|dt ≤ max

t∈[0,1]| f (t)|< ∞ ,

i.e. f ∈ L1(0,1).

We note that L1(0,1) contains some functions which do not belong to C[0,1].For example, f (t) = t−1/2 is not continuous on [0,1] but it is continuous on(0,1) and ∫ 1

0| f (t)|dt =

∫ 1

0|t|−1/2dx = 2t1/2

∣∣∣10= 2 < ∞ ,

so f ∈ L1(0,1).

We conclude that C[0,1] is a strict subset of f ∈ L1(0,1).

1.3 Hamel basesDefinition 1.1 The linear span of a subset E of a vector space V is the collection ofall finite linear combinations of elements of E:

Span(E) =

{x ∈V : x =

n

∑j=1

α je j, n ∈ N, α j ∈K, e j ∈ E

}.

We say that E spans V if V = Span(E), i.e. every element of V can be written as afinite linear combination of elements of E.

3

Page 8: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Definition 1.2 A set E is linearly independent if any finite collection of elements of Eis linearly independent:

n

∑j=1

α je j = 0 =⇒ α1 = α2 = · · ·= αn = 0

for any choice of n ∈ N, e j ∈ E and α j ∈K.

Definition 1.3 A Hamel basis E for V is a linearly independent subset of V whichspans V .

Examples:

1. Any basis in Rn is a Hamel basis.

2. The set E ={

1,x,x2, . . .}

is a Hamel basis in the space of all polynomials.

Lemma 1.4 If E is a Hamel basis for a vector space V then any element x ∈V can beuniquely written in the form

x =n

∑j=1

α je j

where n ∈ N, α j ∈K and e j ∈ E.

Exercise: Prove the lemma.

Definition 1.5 We say that a set is finite if it consists of a finite number of elements.

Theorem 1.6 If V has a finite Hamel basis then every Hamel basis for V has the samenumber of elements.

Definition 1.7 If V has a finite basis E then the dimension of V (denoted dimV ) isthe number of elements in E. If V has no finite basis then we say that V is infinite-dimensional.

Example: In Rn any basis consists of n vectors. Therefore dimRn = n.

Let V and W be two vector spaces over K.

Definition 1.8 A map L : V →W is called linear if for any x,y ∈V and any α ∈K

L(x+αy) = L(x)+αL(y) .

Definition 1.9 If a linear map L : V →W is a bijection, then L is called a linearisomorphism. We say that V and W are linearly isomorphic if there is a bijectivelinear map L : V →W.

4

Page 9: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Proposition 1.10 Any n-dimensional vector space over K is linearly isomorphic toKn.

Proof: If E = {e j : 1≤ j ≤ n} is a basis in V , then every element x ∈V is representeduniquely in the form

x =n

∑j=1

α je j .

The map L : x 7→ (α1, . . . ,αn) is a linear bijection V → Kn. Therefore V is linearlyisomorphic to Kn. �

In order to show that a vector space is infinite-dimensional it is sufficient to find aninfinite linearly independent subset. Let’s consider the following examples:

1. `p(K) is infinite-dimensional (1≤ p≤ ∞).

Proof. The set

E = {(1,0,0,0, . . .), (0,1,0,0, . . .), (0,0,1,0, . . .), . . .}

is not finite and linearly independent. Therefore dim`p(K) = ∞.

Remark: This linearly independent subset is not a Hamel basis. Indeed, thesequence x = (x1,x2,x3, . . .) with xk = e−k belongs to `p(K) for any p ≥ 1 butcannot be represented as a sum of finitely many elements of the set E.

2. C[0,1] is infinite-dimensional.

Proof: The set E = {xk : k ∈ N} is linearly independent subset of C0[0,1]: In-deed, suppose

p(x) =n

∑k=1

αkxk = 0 for all x ∈ [0,1].

Differentiating the equality n times we get p(n)(x) = n!αn = 0. Which impliesαn = 0. Therefore p(x)≡ 0 implies αk = 0 for all k.

The linearly independent sets provided in the last two examples are not Hamelbases. This is not a coincidence: we will see later that `p(K) and C[0,1] (as well asmany other functional spaces) do not have a countable Hamel basis.

Theorem 1.11 Every vector space has a Hamel basis.

The proof of this theorem is based on Zorn’s Lemma.We note that in many interesting vector spaces (called normed spaces), a very large

number of elements should be included into a Hamel basis in order to enable repre-sentation of every element in the form of a finite sum. Then the basis is too large tobe useful for the study of the original vector space. A natural idea would be to allowinfinite sums in the definition of a basis. In order to use infinite sums we need to de-fine convergence which cannot be done using the axioms of vector spaces only. Anadditional structure on the vector space should be defined.

5

Page 10: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

2 Normed spaces

2.1 NormsDefinition 2.1 A norm on a vector space V is a map ‖ · ‖ : V → R such that for anyx,y ∈V and any α ∈K:

1. ‖x‖ ≥ 0, and ‖x‖= 0⇔ x = 0 (positive definiteness);

2. ‖αx‖= |α|‖x‖ (positive homogeneity);

3. ‖x+ y‖ ≤ ‖x‖+‖y‖ (triangle inequality).

The pair (V,‖ · ‖) is called a normed space.

In other words, a normed space is a vector space equipped with a norm.

Examples:

1. Rn with each of the following norms is a normed space:

(a) ‖x‖=

√n

∑k=1|xk|2

(b) ‖x‖p =

(n

∑k=1|xk|p

)1/p

, 1≤ p < ∞

(c) ‖x‖∞ = max1≤k≤n

|xk|.

2. `p(K) is a vector space with the following norm (1≤ p < ∞)

‖x‖`p =

(∞

∑k=1|xk|p

)1/p

.

3. `∞(K) is a vector space with the following norm

‖x‖`∞ = supk∈N|xk| .

In order to prove the triangle inequality for the `p norm, we will state and proveseveral inequalities.

6

Page 11: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

2.2 Four famous inequalitiesLemma 2.2 (Young’s inequality) If a,b > 0, 1 < p,q < ∞, 1

p + 1q = 1, then

ab≤ ap

p+

bq

q.

Proof: Consider the function f (t) = t p

p − t + 1q defined for t ≥ 0. Since f ′(t) = t p−1−1

vanishes at t = 1 only, and f ′′(t) = (p−1)t p−2≥ 0, the point t = 1 is a global minimumfor f . Consequently, f (t)≥ f (1) = 0 for all t ≥ 0. Now substitute t = ab−q/p:

f (ab−q/p) =apb−q

p−ab−q/p +

1q≥ 0 .

Multiplying the inequality by bq yields Young’s inequality. �

Lemma 2.3 (Holder’s inequality) If 1≤ p,q≤ ∞, 1p + 1

q = 1, x ∈ `p(K), y ∈ `q(K),then

∑j=1|x jy j| ≤ ‖x‖`p‖y‖`q .

Proof. If 1 < p,q < ∞, we use Young’s inequality to get that for any n ∈ N

n

∑j=1

|x j|‖x‖`p

|y j|‖y‖`q

≤n

∑j=1

(1p|x j|p

‖x‖p`p

+1q|y j|q

‖y‖q`q

)≤ 1

p+

1q

= 1

Therefore for any n ∈ Nn

∑j=1|x jy j| ≤ ‖x‖`p‖y‖`q .

Since the partial sums are monotonically increasing and bounded above, the seriesconverge and Holder’s inequality follows by taking the limit as n→ ∞.

If p = 1 and q = ∞:

n

∑j=1|x jy j| ≤ max

1≤ j≤n|y j|

n

∑j=1|x j| ≤ ‖x‖`1‖y‖`∞ .

Therefore the series converges and Holder’s inequality follows by taking the limit asn→ ∞. �

Lemma 2.4 (Cauchy-Schwartz inequality) If x,y ∈ `2(K) then

∑j=1|x jy j| ≤

(∞

∑j=1|x j|2

)1/2(∞

∑j=1|y j|2

)1/2

.

7

Page 12: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Proof: This inequality coincides with Holder’s inequality with p = q = 2. �

Now we state and prove the triangle inequality for the `p norm.

Lemma 2.5 (Minkowski’s inequality) If x,y ∈ `p(K) for 1 ≤ p ≤ ∞ then x + y ∈`p(K) and

‖x+ y‖`p ≤ ‖x‖`p +‖y‖`p .

Proof: If 1 < p < ∞, define q from 1p + 1

q = 1. Then using Holder’s inequality (finitesequences belong to `p with any p) we get2

n

∑j=1|x j + y j|p =

n

∑j=1|x j + y j|p−1|x j + y j|

≤n

∑j=1|x j + y j|p−1|x j|+

n

∑j=1|x j + y j|p−1|y j|

(n

∑j=1|x j + y j|(p−1)q

)1/q( n

∑j=1|x j|p

)1/p

(Holder’s inequality)

+

(n

∑j=1|x j + y j|(p−1)q

)1/q( n

∑j=1|y j|p

)1/p

.

Dividing the inequality by(

∑nj=1 |x j + y j|p

)1/qand using that (p− 1)q = p and 1−

1q = 1

p , we get for all n

(n

∑j=1|x j + y j|p

)1/p

(n

∑j=1|x j|p

)1/p

+

(n

∑j=1|y j|p

)1/p

.

The series on the right hand side converge to ‖x‖`p + ‖y‖`p . Consequently the serieson the left hand side also converge, x+y ∈ `p(K), and Minkowski’s inequality followsby taking the limit as n→ ∞.

Exercise: Prove Minkowski’s inequality for p = 1 and p = ∞. �

2.3 Examples of norms on a space of functionsEach of the following formulae defines a norm on C[0,1], the space of all continuousfunctions on [0,1]:

2We do not start directly with n = ∞ because a priori we do not know convergence for some of theseries involved in the proof.

8

Page 13: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

1. the “sup(remum) norm”‖ f‖∞ = sup

t∈[0,1]| f (t)| ;

2. the “L1 norm”

‖ f‖L1 =∫ 1

0| f (t)|dt ;

3. the “L2 norm”

‖ f‖L2 =(∫ 1

0| f (t)|2 dt

)1/2

.

Exercise: Check that each of these formulae defines a norm. For the case of the L2

norm, you will need a Cauchy-Schwartz inequality for integrals.

Example: Let k ∈ N. The space Ck[0,1] consists of all continuous real-valued func-tions which have continuous derivatives up to order k. The norm on Ck[0,1] is definedby

‖ f‖Ck =k

∑j=0

supt∈[0,1]

| f ( j)(t)| ,

where f ( j) denotes the derivative of order j.

2.4 Equivalence of normsWe have seen that various different norms can be introduced on a vector space. In orderto compare different norms it is convenient to introduce the following equivalencerelation.

Definition 2.6 Two norms ‖ · ‖1 and ‖ · ‖2 on a vector space V are equivalent if thereare constants c1,c2 > 0 such that

c1‖x‖1 ≤ ‖x‖2 ≤ c2‖x‖1 for all x ∈V .

In this case we write ‖ · ‖1 ∼ ‖ ·‖2.

Theorem 2.7 Any two norms on Rn are equivalent.3

Example: The norms ‖ · ‖L1 and ‖ · ‖∞ on C[0,1] are not equivalent.

3You already saw this statement in Analysis III and Differentiation in Year 2. The proof is basedon the observation that the unit sphere S ⊂ Rn is sequentially compact. Then we checked that f (x) =‖x‖2/‖x‖1 is continuous on S and consequently it is bounded and attains its lower and upper bounds onS. We set c1 = minS f and c2 = maxS f .

9

Page 14: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Proof: Consider the sequence of functions fn(t) = tn with n ∈ N. Obviously fn ∈C[0,1]. We see that

‖ fn‖∞ = maxt∈[0,1]

|t|n = 1 ,

‖ fn‖L1 =∫ 1

0tn dt =

1n+1

.

Suppose the norms are equivalent. Then there is a constant c2 > 0 such that for all fn:

‖ fn‖∞

‖ fn‖L1= n+1≤ c2 ,

which is not possible for all n. This contradiction implies the norms are not equivalent.�

2.5 Linear IsometriesSuppose V and W are normed spaces.

Definition 2.8 If a linear map L : V →W preserves norms, i.e. ‖L(x)‖ = ‖x‖ for allx ∈V , it is called a linear isometry.

This definition implies L is injective, i.e., L : V → L(V ) is bijective, but it does not im-ply L(V ) =W , i.e., L is not necessarily invertible. Note that sometimes the invertibilityproperty is included into the definition of the isometry. Finally, in Metric Spaces theword “isometry” is used to denote distance-preserving transformations.

Definition 2.9 We say that two normed spaces are isometrically isomorphic (or simplyisometric), if there is an invertible linear isometry between them.

A linear invertible map can be used to “pull back” a norm as follows.

Proposition 2.10 Let (V,‖ ·‖V ) be a normed space, W a vector space, and L : W →Va linear isomorphism. Then

‖x‖W := ‖L(x)‖Vdefines a norm on W.

Proof: For any x,y ∈V and any α ∈K we have:

‖x‖W = ‖L(x)‖V ≥ 0 ,

‖αx‖W = ‖L(αx)‖V = |α|‖L(x)‖V = |α|‖x‖W .

If ‖x‖W = ‖L(x)‖V = 0, then L(x) = 0 due to non-degeneracy of the norm ‖ ·‖V . SinceL is invertible, we get x = 0. Therefore ‖ · ‖W is non-degenerate.

10

Page 15: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Finally, the triangle inequality follows from the triangle inequality for ‖ · ‖V :

‖x+ y‖W = ‖L(x)+L(y)‖V ≤ ‖L(x)‖V +‖L(y)‖V = ‖x‖W +‖y‖W .

Therefore, ‖ · ‖W is a norm. �

Note that in the proposition the new norm is introduced in such a way that L :(W,‖ · ‖W )→ (V,‖ · ‖V ) is a linear isometry.

Let V be a finite dimensional vector space and n = dimV . We have seen that V islinearly isomorphic to Kn. Then the proposition implies the following statements.

Corollary 2.11 Any finite dimensional vector space V can be equipped with a norm.

Corollary 2.12 Any n-dimensional normed space V is isometrically isomorphic to Kn

equipped with a suitable norm.

Since any two norms on Rn (and therefore on Cn) are equivalent we also get thefollowing statement.

Theorem 2.13 Let V be a finite-dimensional vector space. Then all norms on V areequivalent.

11

Page 16: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

3 Convergence in a normed space

3.1 Definition and examplesThe norm on a vector space V can be used to measure distances between points x,y∈V .So we can define the limit of a sequence.

Definition 3.1 A sequence (xn)∞n=1, xn ∈V , n∈N, converges to a limit x ∈V if for any

ε > 0 there is N ∈ N such that

‖xn− x‖< ε for all n > N.

Then we write xn→ x.

We note that the sequence of vectors xn → x if and only if the sequence of non-negative real numbers ‖xn− x‖→ 0.

Proposition 3.2 The limit of a convergent sequence is unique.

Exercise: Prove it.

Proposition 3.3 Any convergent sequence is bounded.

Exercise: Prove it.

Proposition 3.4 If xn converges to x, then ‖xn‖→ ‖x‖.

Exercise: Prove it.

It is possible to check convergence of a sequence of real numbers without actuallyfinding its limit: it is sufficient to check that it satisfies the following definition:

Definition 3.5 (Cauchy sequence) A sequence (xn)∞n=1 in a normed space V is Cauchy

if for any ε > 0 there is an N such that

‖xn− xm‖< ε for all m,n > N.

Theorem 3.6 A sequence of real numbers converges iff it is Cauchy.

Proposition 3.7 Any convergent sequence is Cauchy.

Exercise: Prove it.

Proposition 3.8 Any Cauchy sequence is bounded.

Exercise: Prove it.

12

Page 17: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Example: Consider the sequence fn ∈C[0,1] defined by fn(t) = tn.

1. fn→ 0 in the L1 norm.

Proof: We have already computed the norms:

‖ fn‖L1 =1

n+1→ 0 .

Consequently, fn→ 0. �

2. fn does not converge in the sup norm.

Proof: If m > 2n≥ 1 then

fn(2−1/n)− fm(2−1/n) =12− 1

2m/n≥ 1

4.

Consequently ( fn) is not Cauchy in the sup norm and hence not convergent.

This example shows that the convergence in the L1 norm does not imply the point-wise convergence and, as a results, does not imply the convergence in the sup norm(often called the uniform convergence). Note that in contrast to the uniform and L1

convergences the notion of pointwise convergence is not based on a norm on the spaceof continuous function.

Exercise: The pointwise convergence does not imply the L1 convergence.Hint: Construct fn with a very small support but make the maximum of fn very

large to ensure that ‖ fn‖L1 > n. Therefore fn is not bounded in the L1 norm, hence notconvergent.

We can also make supp fn∩ supp fm = /0 for all m,n such that n 6= m. Then for anyt there is at most one n such that fn(t) 6= 0. The last property guarantees pointwiseconvergence: fn(t)→ 0 for any t. �

Proposition 3.9 If fn ∈C[0,1] for all n ∈ N and fn→ f in the sup norm, then fn→ fin the L1 norm, i.e.,

‖ fn− f‖∞→ 0 =⇒ ‖ fn− f‖L1 → 0 .

Proof:

0≤ ‖ fn− f‖L1 =∫ 1

0| fn(t)− f (t)|dt ≤ sup

0≤t≤1| fn(t)− f (t)|= ‖ fn− f‖∞→ 0 .

Therefore ‖ fn− f‖L1 → 0. �

We have seen that different norms may lead to different conclusions about conver-gence of a given sequence but sometime convergence in one norm implies convergencein another one. The following lemma shows that equivalent norms give rise to the samenotion of convergence.

13

Page 18: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Lemma 3.10 Suppose ‖·‖1 and ‖·‖2 are equivalent norms on a vector space V . Thenfor any sequence (xn):

‖xn− x‖1→ 0 ⇔ ‖xn− x‖2→ 0 .

Proof: Since the norms are equivalent, there are constant c1,c2 > 0 such that

0≤ c1‖xn− x‖1 ≤ ‖xn− x‖2 ≤ c2‖xn− x‖1

for all n. Then ‖xn− x‖2→ 0 implies ‖xn− x‖1→ 0, and vice versa. �

3.2 Topology on a normed spaceWe say that a collection T of subsets of V is a topology on V if it satisfies the followingproperties:

1. /0,V ∈T ;

2. any finite intersection of elements of T belongs to T ;

3. any union of elements of T belongs to T .

A set equipped with a topology is called a topological space. The elements of Tare called open sets. The topology can be used to define a convergent sequence andcontinuous function.

We note that each norm on V can be used to define a topology on V , i.e., to definethe notion of an open set.

Definition 3.11 A subset X ⊂ V is open, if for any x ∈ X there is ε > 0 such that theball of radius ε centred around x belongs to X:

B(x,ε) = {y ∈V : ‖y− x‖< ε} ⊂ X .

Example: In any normed space V :

1. The unit ball centred around the zero, B0 = {x : ‖x‖< 1}, is open.

2. Any open ball B(x,ε) is open.

3. V is open.

4. The empty set is open.

It is not too difficult to check that the collection of open sets defines a topologyon V . You can easily check from the definition that equivalent norms generate thesame topology, i.e., open sets are exactly the same. The notion of convergence can bedefined in terms of the topology.

14

Page 19: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Definition 3.12 An open neighbourhood of x is an open set which contains x.

Lemma 3.13 A sequence xn → x if and only if for any open neighbourhood X of xthere is N ∈ N such that xn ∈ X for all n > N.

Proof: ( =⇒ ). Let xn→ x. Take any open X such that x ∈ X . Then there is ε > 0 suchthat B(x,ε) ⊂ X . Since the sequence converges there is N such that ‖xn− x‖ < ε forall n > N. Then xn ∈ B(x,ε)⊂ X for the same values of n.

(⇐=). Take any ε > 0. The ball B(x,ε) is open, therefore there is N such thatxn ∈ B(x,ε) for all n > N. Hence ‖xn− x‖< ε and xn→ x. �

3.3 Closed setsDefinition 3.14 A set X ⊂V is closed if its complement V \X is open.

Example: In any normed space V :

1. The unit sphere S = {x : ‖x‖= 1} is closed

2. Any closed ballB(x,ε) = {y ∈V : ‖y− x‖ ≤ ε }

is closed.

3. V is closed.

4. The empty set is closed.

Lemma 3.15 A subset X ⊂ V is closed if and only if any convergent sequence in Xhas its limit in X.

Proof: The proof literally repeats the proof given in Differentiation. �

Definition 3.16 We say that a subset L⊂V is a linear subspace, if it is a vector spaceitself, i.e., if x1,x2 ∈ L and λ ∈K imply x1 +λx2 ∈ L.

Proposition 3.17 Any finite dimensional linear subspace L of a normed space V isclosed.

Proof: Since L is finite-dimensional, it has a finite Hamel basis

E = {e1,e2, . . . ,en } ⊂ L

such that L = Span(E). Suppose L is not closed, then by Lemma 3.15 there is aconvergent sequence xk → x∗, xk ∈ L but x∗ ∈ V \L. Then x∗ is linearly independentfrom E (otherwise it would belong to L). Consequently

E = {e1,e2, . . . ,en,x∗ }

15

Page 20: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

is a Hamel basis in L = Span(E). In this basis, the components of xk are given by(αk

1 , . . . ,αkn ,0) and x∗ corresponds to the vector (0, . . . ,0,1). We get in the limit as

k→ ∞

(αk1 , . . . ,αk

n ,0)→ (0, . . . ,0,1) ,

which is obviously impossible. Therefore L is closed.4 �

Example: The subspace of polynomial functions is linear but not closed in C[0,1]equipped with the sup norm.

3.4 CompactnessDefinition 3.18 (sequential compactness) A subset K of a normed space (V,‖ ·‖V ) is(sequentially) compact if any sequence (xn)∞

n=1 with xn ∈ K has a convergent subse-quence xn j → x∗ with x∗ ∈ K.

Proposition 3.19 A compact set is closed and bounded.

Theorem 3.20 A subset of Rn is compact iff it is closed and bounded.

Corollary 3.21 A subset of a finite-dimensional vector space is compact iff it is closedand bounded.

Example: The unit sphere in `p(K) is closed, bounded but not compact.

Proof: Take the sequence e j such that

e j = (0, . . . ,0, 1︸︷︷︸jth place

,0, . . .) .

We note that ‖e j−ek‖`p = 21/p for all j 6= k. Consequently, (e j)∞j=1 does not have any

convergent subsequence, hence S is not compact. �

Lemma 3.22 (Riesz’ Lemma) Let X be a normed vector space and Y be a closedlinear subspace of X such that Y 6= X and α ∈ R, 0 < α < 1. Then there is xα ∈ Xsuch that ‖xα‖= 1 and ‖xα − y‖> α for all y ∈ Y .

Proof: Since Y ⊂ X and Y 6= X there is x ∈ X \Y . Since Y is closed, X \Y is open andtherefore

d := inf{‖x− y‖ : y ∈ Y }> 0 .

4This proof implicitly uses equivalence of norms on Rn+1 to establish that convergence in the normobtained by restricting the original norm ‖ · ‖V onto L implies convergence of the components of thevectors.

16

Page 21: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Since α−1 > 1 there is a point z ∈ Y such that ‖x− z‖< dα−1. Let xα = x−z‖x−z‖ . Then

‖xα‖= 1 and for any y ∈ Y ,

‖xα − y‖=∥∥∥∥ x− z‖x− z‖

− y∥∥∥∥=

∥∥x− (z+‖x− z‖y)∥∥

‖x− z‖>

ddα−1 = α ,

as z+‖x− z‖y ∈ Y because Y is a linear subspace. �

Theorem 3.23 A normed space is finite dimensional iff the unit sphere is compact.

Proof: Bolzano-Weierstrass Theorem and Lemma 3.15 imply that in a finite dimen-sional normed space the unit sphere is compact (the unit sphere is bounded and closed).

So we only need to show that if the unit sphere S⊂ B is sequentially compact, thenthe normed space V is finite dimensional. Indeed, if V is infinite dimensional, thenRiesz’ Lemma can be used to construct an infinite sequence of xn ∈ S such that ‖xn−xm‖ > α > 0 for all m 6= n. This sequence does not have a convergent subsequence(none of the subsequences is Cauchy) and therefore S is not compact.

We construct xn inductively. Fix α ∈ (0,1) and take any x1 ∈ S.Suppose that for some n≥ 1 we have found En = {x1, . . . ,xn } such that xk ∈ S and

‖xl− xk‖> α for all 1≤ k, l ≤ n, k 6= l (note that the second property is automaticallyfulfilled for n = 1). The linear subspace Yn = Span(En) is n-dimensional and henceclosed (see Proposition 3.17). Since X is infinite dimensional Yn 6= X . Then Riesz’Lemma implies that there is xn+1 ∈ S such that ‖xn+1− xk‖> α for all 1≤ k ≤ n.

Repeating this argument we generate xn for all n ∈ N. �

17

Page 22: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

4 Banach spaces

4.1 Completeness: Definition and examplesDefinition 4.1 (Banach space) A normed space V is called complete if any Cauchysequence in V converges to a limit in V . A complete normed space is called a Banachspace.

Theorem 4.2 Every finite-dimensional normed space is complete.

Proof: Theorem 3.6 implies that R is complete, i.e., every Cauchy sequence of num-bers has a limit.

Now let V be a real vector space, dimV = n < ∞. Take any basis in V . Thena sequence of vectors in V converges iff each component of the vectors converges,and a sequence of vectors is Cauchy iff each component is Cauchy. Therefore eachcomponent has a limit, and those limits constitute the limit vector for the originalsequence. Hence V is complete.

Considering C as a real vector space we conclude that it is also complete. There-fore, any finite-dimensional complex vector space V is also complete. �

In particular, Rn and Cn are complete.

Theorem 4.3 (`p is a Banach space) The space `p(K) equipped with the standard `p

norm is complete.

Proof: Suppose that xk = (xk1,x

k2, . . .) ∈ `p(K) is Cauchy. Then for every ε > 0 there is

N such that

‖xm− xn‖`p =∞

∑j=1|xm

j − xnj |p < ε

for all m,n > N. Consequently, for each j ∈ N the sequence xkj is Cauchy, and the

completeness of K implies that there is a j ∈K such that

xkj→ a j

as k→ ∞. Let a = (a1,a2, . . .). First we note that for any M ≥ 1 and m,n > N:

M

∑j=1|xm

j − xnj |p ≤

∑j=1|xm

j − xnj |p < ε .

Taking the limit as n→ ∞ we get

M

∑j=1|xm

j −a j|p ≤ ε .

18

Page 23: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

This holds for any M, so we can take the limit as M→ ∞:

∑j=1|xm

j −a j|p ≤ ε .

We conclude that xm−a ∈ `p(K). Since `p(K) is a vector space and xm ∈ `p(K), thena ∈ `p(K). Moreover, ‖xm− a‖`p < ε for all m > N. Consequently xm→ a in `p(K)with the standard norm, and so `p(K) is complete. �

Theorem 4.4 (C is a Banach space) The space C[0,1] equipped with the sup norm iscomplete.

Proof: Let fk be a Cauchy sequence. Then for any ε > 0 there is N such that

supt∈[0,1]

| fn(t)− fm(t)|< ε

for all m,n > N. In particular, fn(t) is Cauchy for any fixed t and consequently has alimit. Set

f (t) = limn→∞

fn(t) .

Let’s prove that fn(t)→ f (t) uniformly in t. Indeed, we already know that

| fn(t)− fm(t)|< ε

for all n,m > N and all t ∈ [0,1]. Taking the limit as m→ ∞ we get

| fn(t)− f (t)| ≤ ε

for all n > N and all t ∈ [0,1]. Therefore fn converges uniformly:

‖ fn− f‖∞ = supt∈[0,1]

| fn(t)− f (t)|< ε .

for all n > N. The uniform limit of a sequence of continuous functions is continuous.Consequently, f ∈C[0,1] which completes the proof of completeness. �

Example: The space C[0,2] equipped with the L1 norm is not complete.

Proof: Consider the following sequence of functions:

fn(t) =

{tn for 0≤ t ≤ 1,

1 for 1≤ t ≤ 2 .

This is a Cauchy sequence in the L1 norm. Indeed for any n < m:

‖ fn− fm‖L1 =∫ 1

0(tn− tm) dt =

1n+1

− 1m+1

<1

n+1,

19

Page 24: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

and consequently for any m,n > N

‖ fn− fm‖L1 <1N

.

Now let us show that fn do not converge to a continuous function in the L1 norm.Indeed, suppose such a limit exists and call it f . Then

‖ fn− f‖L1 =∫ 1

0|tn− f (t)|dt +

∫ 2

1|1− f (t)|dt→ 0 .

Since| f (t)|− |tn| ≤ |tn− f (t)| ≤ | f (t)|+ |tn|

implies that∫ 1

0| f (t)|dt−

∫ 1

0tn dt ≤

∫ 1

0|tn− f (t)|dt ≤

∫ 1

0| f (t)|dt +

∫ 1

0tn dt ,

we have∫ 1

0 |tn− f (t)|dt→∫ 1

0 | f (t)|dt as n→ ∞ and consequently∫ 1

0| f (t)|dt +

∫ 2

1|1− f (t)|dt = 0 .

As f is assumed to be continuous, it follows

f (t) ={

0, 0 < t < 1,1, 1 < t < 2.

We see that the limit function f cannot be continuous. This contradiction implies thatC[0,2] is not complete with respect to the L1 norm.

4.2 The completion of a normed spaceA normed space may be incomplete. However, every normed space X can be con-sidered as a subset of a larger Banach space X . The minimal among these spaces5 iscalled the completion of X .

Informally we can say that X consists of limit points of all Cauchy sequences in X .Of course, every point x ∈ X is a limit point of the constant sequence (xn = x for alln ∈ N) and therefore X ⊂ X . If X is not complete, some of the limit points are not inX , so X is larger then the original set X .

Definition 4.5 (dense set) We say that a subset X ⊂ V is dense in V if for any v ∈ Vand any ε > 0 there is x ∈ X such that ‖x− v‖< ε .

5In this context “minimal” means that if any other space X has the same property, then the minimalX is isometric to a subspace of X . It turns out that this property can be achieved by requiring X to bedense in X .

20

Page 25: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Note that X is dense in V iff for every point v ∈V there is a sequence xn ∈ X suchthat xn→ v.

Theorem 4.6 Let (X ,‖ · ‖X) be a normed space. Then there is a complete normedspace (X ,‖ · ‖X ) and a linear map i : X →X such that i is an isometrical isomor-phism between (X ,‖ · ‖X) and (i(X),‖ · ‖X ), and i(X) is dense in X .

Moreover, X is unique up to isometry, i.e., if there is another complete normedspace (X ,‖·‖X ) with these properties, then X and X are isometrically isomorphic.

Proof: The proof is relatively long so we break it into a sequence of steps.Construction of X . Let Y be the set of all Cauchy sequences in X . We say that twoCauchy sequences x = (xn)∞

n=1, xn ∈ X , and y = (yn)∞n=1, yn ∈ X , are equivalent, and

write x∼ y, iflimn→∞‖xn− yn‖X = 0 .

Let X be the space of all equivalence classes in Y , i.e., it is the factor space: X =Y / ∼. The elements of X are collections of equivalent Cauchy sequences from X .We will use [x] to denote the equivalence class of x.Exercises: Show that X is a vector space.Norm on X . For an η ∈ X take any representative x = (xn)∞

n=1, xn ∈ X , of theequivalence class η . Then the equation

‖η‖X = limn→∞‖xn‖X . (4.1)

defines a norm on X . Indeed:

1. Equation (4.1) defines a function X → R, i.e., for any η ∈X and any repre-sentative x ∈ η the limit exists and is independent from the choice of the repre-sentative. (Exercise)

2. The function defined by (4.1) satisfies the axioms of norm. (Exercise)

Definition of i : X →X . For any x ∈ X let

i(x) = [(x,x,x,x, . . .)]

(the equivalence class of the constant sequence). Obviously, i is a linear isometry,and it is a bijection X → i(X). Therefore the spaces X and i(X) are isometricallyisomorphic.

Completeness of X . Let(

η(k))∞

k=1be a Cauchy sequence in (X ,‖ · ‖X ). For every

k ∈N take a representative x(k) ∈ η(k). Note that x(k) ∈Y is a Cauchy sequence in thespace (X ,‖ · ‖X). Then there is a strictly monotone sequence of integers nk such that∥∥∥x(k)

j − x(k)l

∥∥∥X≤ 1

kfor all j, l ≥ nk . (4.2)

21

Page 26: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Now consider the sequence x∗ defined by

x∗ =(

x(k)nk

)∞

k=1.

Next we will check that x∗ is Cauchy, and consider its equivalence class η∗ = [x∗] ∈X . Then we will prove that η(k)→ η∗ in (X ,‖ · ‖X ).The sequence x∗ is Cauchy. Since the sequence of η(k) is Cauchy, for any ε > 0 thereis Mε such that

limn→∞‖x(k)

n − x(l)n ‖X = ‖η(k)−η

(l)‖X < ε for all k, l > Mε .

Consequently, for every k, l > Mε there is Nk,lε such that

‖x(k)n − x(l)

n ‖X < ε for all n > Nk,lε . (4.3)

Then fix any ε > 0. If j, l > 3ε

and m > max{n j,nl,Nk,lε/3} we have

‖x∗j − x∗l ‖X = ‖x( j)n j − x(l)

nl ‖X

≤ ‖x( j)n j − x( j)

m ‖X +‖x( j)m − x(l)

m ‖X +‖x(l)m − x(l)

nl ‖X

<1j+

ε

3+

1l

< ε

where we used (4.3) and (4.2). Therefore x∗ is Cauchy and η = [x∗] ∈X .The sequence η(k)→ [x∗]. Indeed, take any ε > 0 and k > 3ε−1, then

‖η(k)−η∗‖X = lim

j→∞‖x(k)

j − x∗j‖X = limj→∞‖x(k)

j − x( j)n j ‖X

≤ limj→∞

(‖x(k)

j − x(k)nk ‖X +‖x(k)

nk − x( j)n j ‖X

)≤ 1

k+ ε < 2ε

Therefore η(k)→ η∗.

We have proved that any Cauchy sequence in X has a limit in X , so X is com-plete.

Density of i(X) in X . Take an η ∈X and let x ∈ η . Take any ε > 0. Since x isCauchy, there is Nε such that ‖xm− xk‖X < ε for all k,m > Nε . Then

‖η− i(xk)‖X = limm→∞‖xm− xk‖X ≤ ε.

Therefore i(X) is dense in X .Uniqueness of X up to isometry. I will not show you the details of the proof. Theproof uses the fact that i(X) is isometrically isomorphic to i(X) (since each one isisometrically isomorphic to X). Moreover, i(X) is dense in X and i(X) is dense inX . In order to complete the proof one has to show that an isometry between twodense subsets can be extended to an isometry between the sets. �

22

Page 27: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

4.3 ExamplesThe theorem provides an explicit construction for the completion of a normed space.Often this description is not sufficiently convenient and a more direct description isdesirable.

1. Example: Consider the space P[0,1] of all polynomial functions restricted tothe interval [0,1] and equip this space with the sup norm. This space is not com-plete. On the other hand any polynomial is continuous, and therefore P[0,1]can be considered as a subspace of C[0,1] which is complete. The Weierstrassapproximation theorem states that any continuous function on [0,1] can be uni-formly approximated by polynomials. In other words, the polynomials are densein C[0,1]. Then Theorem 4.6 implies that the completion of P[0,1] is isometri-cally isomorphic to C[0,1] equipped with the sup norm.

2. Example: Let ` f (K) be the space of all sequences which have only a finitenumber of non-zero elements. This space is not complete in the `p norm. Thecompletion of ` f (K) in the `p norm is isometric to `p(K).

Indeed, we have already seen that `p(K) is complete. So in order to prove theclaim you only need to check that ` f (K) is dense in `p(K).

We see that the completion of a space depends both on the space and on thenorm.

3. Example: L1(0,1) is the completion of C[0,1] in the L1 norm.

According to this definition any f ∈ L1(0,1) is an equivalence class of a Cauchysequence fn ∈C[0,1] equipped with the L1 norm. The norm of f is defined by

‖ f‖L1 = limn→∞‖ fn‖L1 = lim

n→∞

∫ 1

0| fn(t)|dt .

In spite of the fact that f can be considered as a limit of the sequence of continu-ous functions fn in the L1 norm, we cannot define the value of f (t) for a given tas the limit of fn(t), because the limit may not exist or may depend on the choiceof the representative in the equivalence class.

In spite of that we can define the integral of f by setting∫ 1

0f (t)dt := lim

n→∞

∫ 1

0fn(t)dt . (4.4)

Indeed, the limit exists: the sequence of the integrals is Cauchy as∣∣∣∣∫ 1

0fn(t)dt−

∫ 1

0fm(t)dt

∣∣∣∣ =∣∣∣∣∫ 1

0

(fn(t)− fm(t)

)dt∣∣∣∣

≤∫ 1

0| fn(t)− fm(t)| dt = ‖ fn− fm‖L1 ,

23

Page 28: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

hence convergent. It is not difficult to check that the limit does not depend fromthe choice of a representative in the equivalence class f .

Obviously, if fn is a constant sequence, i.e., fn(t) = f0(t), t ∈ [0,1], for all n ∈N, then

∫ 10 f (t) =

∫ 10 f0(t)dt. Therefore this definition can be considered as an

extension of the classical Riemann’s integral used for continuous functions.

Taking into account the new definition of the integral we can write

‖ f‖L1 =∫ 1

0| f (t)|dt .

Then we recall that the norm is non-degenerate, therefore f = g for f ,g ∈L1(0,1) if and only if ‖ f −g‖L1 = 0, i.e.,∫ 1

0| f (t)−g(t)|dt = 0 ,

where the integral should be interpreted in the sense of the new definition. Notethat f and g are not continuous and consequently the equality above cannot beused to deduce that f (t) = g(t) for all t.

Nevertheless a more direct description of L1(0,1) is possible: The space L1(0,1)is isometrically isomorphic to the space of equivalence classes of Lebesgue in-tegrable functions on (0,1): the functions f and g are equivalent if∫ 1

0| f (t)−g(t)|dt = 0 .

Here the integral should be considered as the Lebesgue integral. We note that itcoincides with the definition provided above but its construction is more direct.Since the notion of the Lebesgue integration is very important for the functionalanalysis and its applications we will discuss it in more details in the next fewlectures. A more detailed study of this topics is a part of MA359 Measure Theorymodule.

24

Page 29: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

5 Lebesgue spacesThe exposition of the Lebesgue integral is based on the book H.A.Priesly, Introductionto integration, Oxford Sc.Publ., 1997, 306 p.

5.1 Integrable functions

Integrals of step functionsWe say that ϕ : R→ R is a step function if it is piecewise constant on a finite numberof intervals, i.e., it can be represented as a finite sum

ϕ(x) =n

∑j=1

c jχI j(x) ,

where c j ∈ R, I j ⊂ R is an interval, I j ∩ Ik = /0 if k 6= j, and χI is the characteristicfunction of I:

χI(x) ={

1, x ∈ I ,0, x 6∈ I .

We note that the intervals are allowed to be of any of the four possible types (e.g.(a,b), [a,b), (a,b] or [a,b]).

We define the integral of a step function ϕ by∫ϕ :=

n

∑j=1

c j|I j|,

where |I j| is the length of I j. We note that this sum equals to the Riemann integralwhich you studied in Year 1, i.e.,

∫ϕ is the “algebraic” area under the graph of the

step function ϕ (the area is counted negative on those intervals where ϕ(x) < 0).

Sets of measure zeroDefinition 5.1 We say that a set A ⊂ R has measure zero if for any ε > 0 there is an(at most countable) collection of intervals that cover A and whose total length is lessthan ε:

A⊂∞⋃

j=1

[a j,b j] and∞

∑j=1

(b j−a j) < ε .

Exercise: Show that a countable union of measure zero sets has measure zero. Hint:for An choose a cover with εn = ε/2n.

Examples. The set Q of all rational numbers has measure zero. The Cantor set hasmeasure zero.

Definition 5.2 A property is said to hold “almost everywhere” or “for almost everyx” (and abbreviated to “a.e.”) if the set of points at which the property does not holdhas measure zero.

25

Page 30: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Almost everywhere convergenceTheorem 5.3 Let (ϕn(x))∞

n=1 be an increasing sequence of step functions (ϕn+1(x)≥ϕn(x) for all x) such that

∫ϕn < K. Then ϕn(x) converges for a.e. x.

Proof: First note that an increasing sequence of numbers has a limit if and only if it isbounded from above. So in order to prove the theorem it is sufficient to show that theset

E = {x ∈ R : ϕn(x)→+∞}

has measure zero.Without loosing in generality, we can assume that ϕn(x)≥ 0.6 Let us define the set

En,m = {x : ϕn(x) > m} .

This set is a finite union of intervals. Indeed, ϕn(x) = ∑ j c(n)j χ

I(n)j

(x) is a step function.

Let In,m = { j : c(n)j > m}. Then

En,m =⋃

j∈In,m

I(n)j .

The total length of those intervals is less than K/m. Indeed, since c(n)j ≥ 0 for all j and

c(n)j > n for j ∈In,

K >∫

ϕn = ∑j

c(n)j |I

(n)j |> m ∑

j∈In,m

|I(n)j | .

Finally, E ⊂ Em =⋃

∞l=1 El,m for every m. Since the sequence ϕn is increasing, En,m ⊂

En+1,m. Moreover En+1,m \En,m consists of a finite number of intervals. Then

Em =∞⋃

l=1

El,m \El−1,m

is at most countable union of intervals (we denote E0,m = /0). Since En,m =⋃n

l=1 El,m \El−1,m and the total length of the intervals in En,m is less than K/m for all n, the totallength of the intervals in Em is not larger than K/m.

Since Em is at most countable union of intervals whose total length is less thanK/m and E ⊂ Em for all m, the set E has measure zero. �

6Otherwise replace ϕn by ϕn−ϕ1 which is non-negative. The new sequence satisfies the assumptionof the theorem (possibly with a different constant K). Moreover, the convergence of ϕn(x)−ϕ1(x) isequivalent to the convergence of ϕn(x).

26

Page 31: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Lemma 5.4 If ϕn and ψn are two increasing sequences of step functions which respec-tively tend to f and g a.e. and f (x)≥ g(x) a.e., then

limn→∞

∫ϕn ≥ lim

n→∞

∫ψn .

Proof: The following sets

E1 ={

x : limϕn(x) 6= f (x)}

,

E2 ={

x : limψn(x) 6= g(x)}

,

E3 ={

x : f (x) < g(x)}

have measure zero. Let E = E1∪E2∪E3. Assume x 6∈ E.Fix arbitrary k ∈N. The sequence ψk(x)−ϕn(x) is decreasing in n. Then as n→∞

ψk(x)−ϕn(x)→ ψk(x)− f (x)≤ g(x)− f (x)≤ 0 .

Consequently(ψk−ϕn)+(x) = max{ψk(x)−ϕn(x), 0}→ 0

Since it is a decreasing sequence of non-negative step functions which converges to 0a.e.,7 ∫

(ψk−ϕn)+→ 0 .

Since ψk and ϕn are step functions∫ψk−

∫ϕn =

∫(ψk−ϕn)≤

∫(ψk−ϕn)+ .

Then taking the limit as n→ ∞ we get∫ψk ≤ lim

n→∞

∫ϕn

Finally, take the limit as k→ ∞ to obtain the desired inequality. �

Corollary 5.5 If ϕn and ψn are two increasing sequences of step functions which tendto a function f a.e., then

limn→∞

∫ϕn = lim

n→∞

∫ψn .

Proof: Use the previous lemma with g = f .

7This statement is provided without a proof.

27

Page 32: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Lebesgue integrable functionsDefinition 5.6 If a function f : R→ R can be represented as an a.e. limit of an in-creasing sequence of step functions ϕn, then the integral of f is given by∫

f = limn→∞

∫ϕn .

If the limit is finite we write f ∈ Linc(R).

Note that the limit is independent of the choice of the increasing sequence of stepfunctions.

Unfortunately Linc(R) is not a vector space as f ∈ Linc(R) does not imply − f ∈Linc(R). Indeed, f is bounded from below by ϕ1 but not necessarily bounded fromabove. Then− f is not bounded from below and therefore− f 6∈ Linc(R). For example,

f (x) =

{1√|x|

, x 6= 0 ,

0, x = 0 ,or f =

∑k=1

k1/2χ[(k+1)−1,k−1] .

Definition 5.7 (Lebesgue integral) A function f : R→ R is (Lebesgue) integrable iff = g−h where g,h ∈ Linc(R) and∫

f :=∫

g−∫

h .

This integral is called the Lebesgue integral.8

Of course in the definition the choice of g and h is not unique. So we will have tocheck that the value of the integral does not depend from this freedom.

5.2 Properties of Lebesgue integralsThe properties of the Lebesgue integral are based on the properties of the integrals forthe functions from Linc(R) .

Proposition 5.8 If f ,g ∈ Linc(R) and λ ∈ R, λ ≥ 0, then

1. f +g, λ f , max{ f ,g}, min{ f ,g} ∈ Linc(R) .

2.∫( f +g) =

∫f +

∫g .

3. If additionally f (x)≥ g(x) a.e., then∫

f ≥∫

g.

8This construction is different (but equivalent) to the original definition of the Lebesgue integral.For an alternative approach see Measure Theory module.

28

Page 33: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Exercise: Prove the proposition.

First we state the main elementary properties of the Lebesgue integration.

Theorem 5.9 If f , f1, f2 are integrable and λ ∈ R, then

1. f1 +λ f2 is also integrable and∫( f1 +λ f2) =

∫f1 +λ

∫f2 .

2. | f (x)| is also integrable and |∫

f | ≤∫| f |.

3. If additionally f (x)≥ 0 a.e., then∫

f ≥ 0.

Proof:

1. Since f1, f2 are integrable, there are functions g1,g2,h1,h2 ∈ Linc(R) such thatfk = gk−hk, k = 1,2. If λ ≥ 0 then g1 +λg2,h1 +λh2 ∈ Linc(R) and∫

( f1 + f2) =∫

(g1 +λg2)−∫

(h1 +λh2)

=∫

g1 +λ

∫g2−

∫h1−λ

∫h2 =

∫f1 +λ

∫f2 .

The case of λ < 0 can be reduced to the previous one. Indeed, in this casewe can write f1 + λ f2 = f1 +(−λ )(− f2) and observe that − f2 = h2− g2 andconsequently is also integrable. Linearity is proved.

2. Since f is integrable, f = g−h with g,h ∈ Linc(R). Obviously for every x

| f (x)|= max{g(x),h(x)}−min{g(x),h(x)} ,

then Proposition 5.8 implies that the maximum and the minimum belong toLinc(R) and hence | f | is integrable. The inequality

g(x)+min{g(x),h(x)} ≤ h(x)+max{g(x),h(x)}

is valid for every x and in combination with Proposition 5.8 implies∫g+

∫min{g,h} ≤

∫h+

∫max{g,h} .

Consequently∫f =

∫g−

∫h≤

∫max{g,h}−

∫min{g,h}=

∫| f | .

Applying this result for f replaced by − f we conclude

−∫

f =∫

(− f )≤∫|− f |=

∫| f |.

Consequently, |∫

f | ≤∫| f |.

29

Page 34: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

3. Since f (x) = g(x)− h(x) with g,h ∈ Linc(R) and f (x) ≥ 0 a.e., we concludeg(x)≥ h(x) a.e.. Then Proposition 5.8 implies that

∫g≥

∫h. Consequently∫

f =∫

g−∫

h≥ 0 . �

We note that | f | is integrable does not imply that f is integrable.9

Exercise: If f is integrable than f+ = max{ f ,0} and f− = min{ f ,0} are integrable.(Hint: f+ = ( f + | f |)/2 and f− = ( f −| f |)/2.)

Integrals and limitsYou should be careful when swapping lim and

∫:

Examples:

limn→∞

∫nχ(

0,1n

) = 1 6=∫

limn→∞

nχ(0,

1n

) = 0 .

limn→∞

∫ 1n

χ(0,n) = 1 6=∫

limn→∞

1n

χ(0,n) = 0 .

The following two theorems establish conditions which allow swapping the limitand integration. They play the fundamental role in the theory of Lebesgue integrals.

Theorem 5.10 (Monotone Convergence Theorem) Suppose that fn are integrable,fn(x) ≤ fn+1(x) a.e., and

∫fn < K for some constant independent of n. Then there is

an integrable function g such that fn(x)→ g(x) a.e. and∫g = lim

n→∞

∫fn .

Corollary 5.11 If f is integrable and∫| f |= 0, then f (x) = 0 a.e.

Proof: Let fn(x) = n| f (x)|. This sequence satisfies MCT (integrable, increasing and∫fn = 0 < 1), consequently there is an integrable g(x) such that fn(x)→ g(x) for a.e.

x. Since the sequence is increasing, fn(x) ≤ g(x) a.e. which implies | f (x)| ≤ g(x)/nfor all n and a.e. x. Consequently f (x) = 0 a.e. �

Theorem 5.12 (Dominated Convergence Theorem) Suppose that fn are integrablefunctions and fn(x)→ f (x) for a.e. x.. If there is an integrable function g such that| fn(x)| ≤ g(x) for every n and a.e. x, then f is integrable and∫

f = limn→∞

∫fn .

9Indeed, we can sketch an example. It is based on partitioning the interval [0,1] into two very nastysubsets. So let f (x) = 0 outside [0,1], for x ∈ [0,1] let f (x) = 1 if x belongs to the Vitali set andf (x) =−1 otherwise. Then | f |= χ[0,1] but f is not integrable.

30

Page 35: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

It is also possible to integrate complex valued functions: f : R→C is integrable ifits real and imaginary parts are both integrable, and∫

f :=∫

Re f + i∫

Im f .

The MCT has no meaning for complex valued functions. The DCT is valid withoutmodifications (and indeed follows easily from the real version).

5.3 Lebesgue space L1(R)

Definition 5.13 The Lebesgue space L1(R) is the space of integrable functions modulothe following equivalence relation: f ∼ g iff f (x) = g(x) a.e. The Lebesgue space isequipped with the L1 norm:

‖ f‖L1 =∫| f | .

It is convenient to think about elements of L1(R) as functions R→ R interpretingthe equality f = g as f (x) = g(x) a.e.

From the viewpoint of Functional Analysis, the equivalence relation is introducedto ensure non-degeneracy of the L1 norm. Indeed, in the space of integrable functions∫| f |= 0 is equivalent to f (x) = 0 a.e. and therefore does not imply the latter equality

for all x.

Theorem 5.14 L1(R) is a Banach space.

Revise the properties of the Lebesgue integral which imply that L1(R) is a normedspace. The completeness of L1(R) follows from the combination of the following twostatements: The first lemma gives a criterion for completeness of a normed space,and the second one implies that the assumptions of the first lemma are satisfied forX = L1(R).

Lemma 5.15 If (X ,‖ · ‖X) is a normed space in which

∑j=1‖y j‖X < ∞

implies the series ∑∞j=1 y j converges, then X is complete.

Proof: Let x j ∈ X be a Cauchy sequence. Then there is a monotone increasing se-quence nk ∈ N such that for every k ∈ N

‖x j− xl‖X < 2−k for all k, l ≥ nk .

31

Page 36: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Let y1 = xn1 and yk = xnk− xnk−1 for k ≥ 2. Since ‖yk‖X ≤ 21−k for k ≥ 2,∞

∑k=1‖yk‖X ≤ ‖y1‖X +

∑k=1

21−k = ‖y1‖X +2 < ∞ .

Consequently there is x∗ ∈ X such that

x∗ =∞

∑j=1

y j .

On the other handk

∑j=1

y j = xn1 +k

∑j=2

(xn j − xn j−1) = xnk

and therefore xnk → x∗. Consequently xk→ x∗ and the space X is complete. �

Lemma 5.16 If ( fk)∞k=1 is a sequence of integrable functions such that ∑

∞k=1 ‖ fk‖L1 <

∞, then

1. ∑∞k=1 | fk(x)| converges a.e. to an integrable function,

2. ∑∞k=1 fk(x) converges a.e. to an integrable function.

Proof: The first statement follows from MCT applied to the sequence gn = ∑nk=1 | fk|

and K = ∑∞k=1 ‖ fk‖L1 . So there is an integrable function g(x) such that

g(x) =∞

∑k=1| fk(x)|

for almost all x. For these values of x the partial sums hn(x) = ∑nk=1 fk(x) obviously

converge, so let

h(x) =∞

∑k=1

fk(x).

Moreover

|hn(x)|=

∣∣∣∣∣ n

∑k=1

fk(x)

∣∣∣∣∣≤ n

∑k=1| fk(x)| ≤

∑k=1| fk(x)|= g(x) .

Therefore the partial sums hn satisfy DCT and the second statement follows. �

In addition to L1(R) we will sometimes consider the Lebesgue spaces L1(I) whereI is an interval. We say that f ∈ L1(I) if χI f ∈ L1(R), i.e., we extend the function byzero outside its original domain.

Proposition 5.17 The space C[0,1] is dense in L1(0,1).

Proof: First show that step functions are dense in L1(0,1). Then check that every stepfunction can be approximated by a piecewise linear continuous function. �

Consequently the space we constructed in this section is isometrically isomorphicto the completion of C[0,1] in the L1 norm.

32

Page 37: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

5.4 Lp spacesAnother important class of Lebesgue spaces consists of Lp spaces for 1 ≤ p < ∞,among those the L2 space is the most remarkable (it is also a Hilbert space, see thenext chapter for details). In this section we will sketch the main definitions of thosespaces noting that the full discussion requires more knowledge of Measure Theorythan we can fit into this module).

If I = (a,b) is an interval, then Lp(I) can be defined in terms of the integrationprocedure developed earlier in this chapter. This definition is equivalent to the standardone which will be given a bit later.

The Lebesgue space Lp(I) is the space of all integrable functions such that

‖ f‖pLp =

∫I| f |p < ∞

modulo the equivalence relation: f = g iff f (x) = g(x) a.e. We note that in this caseLp(I)⊂ L1(I). The definition of Lp(R) is slightly different. We say that f ∈ Lp(R) if fis locally integrable (i.e., f ∈ L1(I) for any interval I)10 and its pth power is integrable.The norm is defined by the same formula:

‖ f‖pLp =

∫R| f |p < ∞ .

We note that although L1(R)∩L2(R) 6= /0 (e.g. both spaces contain all step functions)none of those spaces is a subset of the other one. For example, f (x) = 1/(1 + |x|)belongs to L2(R) but not to L1(R). Indeed,

∫f 2 < ∞ but

∫f = ∞ so it is not integrable

on R. On the other hand

g(x) =χ(0,1)(x)

|x|1/2

belongs to L1(R) but not to L2(R).

Theorem 5.18 Lp(R) and Lp(I) are Banach spaces for p≥ 1 and any interval I.

We will not give a complete proof but sketch the main ideas instead.Let 1

p + 1q = 1 and f ∈ Lp(R), g ∈ Lq(R). Then the Holder inequality states11 that∫

| f g| ≤ ‖ f‖p‖g‖q .

10This is an important requirement. It is not sufficient to define Lp as a set of all functions suchthat f p is integrable: this space would not be a vector space. Indeed, let p = 2, g = χ[0,1] and f bethe non-integrable function from the footnote9. Then f 2 = g2 = χ[0,1] is integrable. But ( f + g)2 =f 2 + 2 f g + g2 = 2 + 2 f is not integrable. Therefore if we followed this definition f ,g ∈ L2 would notimply f +g ∈ L2.

11We will not discuss the proof of this inequality in these lectures.

33

Page 38: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Note that the characteristic function χI ∈ Lq(R) for any interval I and any q≥ 1, more-

over ‖χI‖Lq = |I|1q where |I| = b− a is the length of I. The Holder inequality with

g = χI implies that ∫χI| f |=

∫I| f | ≤ |I|1/q ‖ f‖p .

The left hand side of this inequality is the norm of f in L1(I):

‖ f‖L1(I) ≤ |I|1/q ‖ f‖Lp(I) .

Consequently any Cauchy sequence in Lp(I) is automatically a Cauchy sequence inL1(I). Since L1 is complete the Cauchy sequence converges to a limit in L1(I) (andconsequently converges a.e.). In order to proof completeness of Lp it is sufficient toshow that the pth power of this limit is integrable. This can be done on the basis of theDominated Convergence Theorem.

Exercise: The next two exercises show that L2(R) is complete (compare with the proofof completeness for L1(R)).

1. Let ( fk)∞k=1 be a sequence in L2(R) such that

∑k=1‖ fk‖L2 < ∞ .

Applying the MCT to the sequence

gn =

(n

∑k=1| fk|

)2

show that ∑k fk converges to a function f with integrable f 2.

2. Now use the DCT applied to hn =∣∣ f −∑

nk=1 fk

∣∣2 to deduce that ∑k fk convergesin the L2 norm to a function in L2.

If you look into a textbook, you will probably see a differently looking definitionof the Lebesgue spaces. Traditionally a function f is asked to be measurable insteadof locally integrable. Local integrability is a stronger property: every locally inte-grable function is measurable but there are measurable functions which are not locallyintegrable, e.g. x−2 is measurable but not locally integrable since

∫ 1−1 x−2 = +∞. Nev-

ertheless the two alternative definitions of the Lebesgue space are equivalent.Let us discuss the notion of a measurable function from the perspective of our

definitions.

34

Page 39: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

First we need to define the measure, which can be considered as a generalisationof the length of an interval. We say that a subset A ⊂ R has finite Lebesgue measureµ(A) if the characteristic function χA is Lebesgue integrable. Then

µ(A) :=∫

χA ≥ 0 .

Obviously µ([a,b]) = b− a for an interval [a,b] and consequently its Lebesgue mea-sure coincides with the length.

In order to study large sets (like R) we need to extend this definition to allowmeasuring sets with infinitely large measures. We say that A ⊂ R is measurable if χAis locally integrable. In particular, if A is measurable then An = A∩ [−n,n] has finitemeasure for each n. Since µ(An) is an increasing sequence the following limit exists(but can be +∞)

µ(A) = limn→∞

µ(An)≤+∞ .

Note that if∫

χA < ∞ MCT implies that the limit coincides with the previous definitionof µ(A).

For example R is measurable and µ(R) = +∞.

Definition 5.19 A function f : R→ R is measurable if preimage of any interval ismeasurable.

We note that sums, products and pointwise limits of measurable functions are mea-surable.

Consider the set of all measurable functions from R to R (or C) whose absolutevalue raised to the pth power has a finite Lebesgue integral, i.e.,

‖ f‖Lp :=(∫| f |p

)1/p

< ∞.

This space modulo the equivalence relation “ f = g iff f (x) = g(x) a.e.” is called theLebesgue space Lp(R).

35

Page 40: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

6 Hilbert spaces

6.1 Inner product spacesYou have already seen the inner product on Rn.

Definition 6.1 An inner product on a vector space V is a map (·, ·) : V ×V →K suchthat for all x,y,z ∈V and for all λ ∈K:

(i) (x,x)≥ 0, and (x,x) = 0 iff x = 0;

(ii) (x+ y,z) = (x,z)+(y,z);

(iii) (λx,y) = λ (x,y);

(iv) (x,y) = (y,x).

A vector space equipped with an inner product is called an inner product space.

• In a real vector space the complex conjugate in (iv) is not necessary.

• If K = C, then (iv) with y = x implies that (x,x) is real and therefore the require-ment (x,x)≥ 0 make sense.

• (iii) and (iv) imply that (x,λy) = λ (x.y).

1. Example: Rn is an inner product space

(x,y) =n

∑k=1

xkyk .

2. Example: Cn is an inner product space

(x,y) =n

∑k=1

xkyk .

3. Example: `2(K) is an inner product space

(x,y) =∞

∑k=1

xkyk .

Note that the sum converges because ∑k |xkyk| ≤ 12 ∑k

(|xk|2 + |yk|2

).

4. Example: L2(a,b) is an inner product space

( f ,g) =∫ b

af (x)g(x)dx .

36

Page 41: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

6.2 Natural normsEvery inner product space is a normed space as well.

Proposition 6.2 If V is an inner product space, then

‖v‖=√

(v,v)

defines a norm on V .

Definition 6.3 We say that ‖x‖ =√

(x,x) is the natural norm induced by the innerproduct.

Lemma 6.4 (Cauchy-Schwartz inequality) If V is an inner product space and ‖v‖=√(v,v) for all v ∈V , then

|(x,y)| ≤ ‖x‖‖y‖ for all x,y ∈V .

Proof of the lemma: The inequality is obvious if y = 0. So suppose that y 6= 0. Thenfor any λ ∈K:

0≤ (x−λy,x−λy) = (x,x)−λ (y,x)−λ (x,y)+ |λ |2(y,y) .

Then substitute λ = (x,y)/‖y‖2:

0≤ (x,x)−2|(x,y)|‖y‖2 +

|(x,y)|‖y‖2 = ‖x‖2− |(x,y)|

‖y‖2 ,

which implies the desired inequality. �

Proof of the proposition: Now we can complete the proof of Proposition 6.2. We notethat positive definiteness and homogeneity of ‖ · ‖ easily follow from (i), and (iii), (iv)in the definition of the inner product. In order to establish the triangle inequality weuse the Cauchy-Schwartz inequality. Let x,y ∈V . Then

‖x+ y‖2 = (x+ y,x+ y) = (x,x)+(x,y)+(y,x)+(y,y)≤ ‖x‖2 +2‖x‖‖y‖+‖y‖2 = (‖x‖+‖y‖)2 ,

and the triangle inequality follows by taking the square root.Therefore ‖ · ‖ is a norm. �

We have already proved the Cauchy-Schwartz inequality for `2(K) using a differentstrategy (see Lemma 2.4).

The Cauchy-Schwartz inequality in L2(a,b) takes the form∣∣∣∣∫ b

af (x)g(x)dx

∣∣∣∣≤ (∫ b

a| f (x)|2dx

)1/2(∫ b

a|g(x)|2dx

)1/2

.

In particular, it states that f ,g ∈ L2(a,b) implies f g ∈ L1(a,b).

37

Page 42: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Lemma 6.5 If V is an inner product space equipped with the natural norm, then xn→x and yn→ y imply that

(xn,yn)→ (x,y) .

Proof: Since any convergent sequence is bounded, the inequality

|(xn,yn)− (x,y)| = |(xn− x,yn)+(x,yn− y)|≤ |(xn− x,yn)|+ |(x,yn− y)|≤ ‖xn− x‖‖yn‖+‖x‖‖yn− y‖

implies that (xn,yn)→ (x,y). �

The lemma implies that we can swap inner products and limits.

6.3 Parallelogram law and polarisation identityNatural norms have some special properties.

Lemma 6.6 (Parallelogram law) If V is an inner product space with the natural norm‖ · ‖, then

‖x+ y‖2 +‖x− y‖2 = 2(‖x‖2 +‖y‖2) for all x,y ∈V .

Proof: The linearity of the inner product implies that for any x,y ∈V

‖x+ y‖2 +‖x− y‖2 = (x+ y,x+ y)+(x− y,x− y)= (x,x)+(x,y)+(y,x)+(y,y)

+(x,x)− (x,y)− (y,x)+(y,y)= 2

(‖x‖2 +‖y‖2) �

Example (some norms are not induced by an inner product): There is no innerproduct which induces the following norms on C[0,1]:

‖ f‖∞ = supt∈[0,1]

| f (t)| or ‖ f‖L1 =∫ 1

0| f (t)|dt .

Indeed, these norms do not satisfy the parallelogram law, e.g., take f (x) = x and g(x) =1− x, obviously f ,g ∈C[0,1] and

‖ f‖∞ = ‖g‖∞ = ‖ f −g‖∞ = ‖ f +g‖∞ = 1 ,

substituting these numbers into the parallelogram law we see 2 6= 4.Exercise: Is the parallelogram law for the L1 norm satisfied for these f ,g?

38

Page 43: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Lemma 6.7 (Polarisation identity) Let V be an inner product space with the naturalnorm ‖ · ‖. Then

1. If V is real4(x,y) = ‖x+ y‖2−‖x− y‖2 ;

2. If V is complex

4(x,y) = ‖x+ y‖2−‖x− y‖2 + i‖x+ iy‖2− i‖x− iy‖2 .

Proof: Plug in the definition of the natural norm into the right hand side and uselinearity of the inner product. �

Lemma 6.7 shows that the inner product can be restored from its natural norm.Although the right hand sides of the polarisation identities is meaningful for any norm,we should not rush to the conclusion that any normed space is automatically an innerproduct space. Indeed, the example above implies that for some norms these formulaecannot define an inner product. Nevertheless, if the norm satisfy the parallelogram law,we indeed get an inner product:

Proposition 6.8 Let V be a real normed space with the norm ‖ · ‖ satisfying the par-allelogram law, then

(x,y) =‖x+ y‖2−‖x− y‖2

4=‖x+ y‖2−‖x‖2−‖y‖2

2

defines an inner product on V .12

Proof: Let us check that (x,y) satisfy the axioms of inner product. Positivity andsymmetry are straightforward (Exercise). The linearity:

4(x,y)+4(z,y) = ‖x+ y‖2−‖x− y‖2 +‖z+ y‖2−‖z− y‖2

= 12(‖x+2y+ z‖2 +‖x− z‖2)− 1

2(‖x−2y+ z‖2 +‖x− z‖2)

= 12‖x+2y+ z‖2− 1

2‖x−2y+ z‖2

= 12(2‖x+ y+ z‖2 +2‖y‖2−‖x+ z‖2)

−12(2‖x− y+ z‖2 +2‖y‖2−‖x+ z‖2)

= ‖x+ y+ z‖2−‖x− y+ z‖2 = 4(x+ z,y) .

We have proved that(x,y)+(z,y) = (x+ z,y) .

Applying this identity several times and setting z = x/m we obtain

n(x/m,y) = (nx/m,y) and m(x/m,y) = (x,y)

12Can you find a simpler proof?

39

Page 44: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

for any m ∈ Z and n ∈ N. Consequently, for any rational λ = nm

(λx,y) = λ (x,y) .

We note that the right hand side of the definition involves the norms only, which com-mute with the limits. Any real number is a limit of rational numbers and therefore thelinearity holds for all λ ∈ R. �

6.4 Hilbert spaces: Definition and examplesDefinition 6.9 A Hilbert space is a complete inner product space (equipped with thenatural norm).

Of course, any Hilbert space is a Banach space.

1. Example: Rn is a Hilbert space

(x,y) =n

∑k=1

xkyk, ‖x‖=

(n

∑k=1

x2k

)1/2

.

2. Example: Cn is a Hilbert space

(x,y) =n

∑k=1

xkyk, ‖x‖=

(n

∑k=1|xk|2

)1/2

.

3. Example: `2(K) is a Hilbert space

(x,y) =∞

∑k=1

xkyk, ‖x‖=

(∞

∑k=1|xk|2

)1/2

.

4. Example: L2(a,b) is a Hilbert space

( f ,g) =∫ b

af (x)g(x)dx, ‖x‖=

(∫ b

a| f (x)|2dx

)1/2

.

40

Page 45: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

7 Orthonormal bases in Hilbert spacesThe goal of this section is to discuss properties of orthonormal bases in a Hilbert spaceH. Unlike Hamel bases, the orthonormal ones involve a countable number of elements:i.e. a vector x is represented in the form of an infinite sum

x =∞

∑k=1

αkek

for some αk ∈K.We will mainly consider complex spaces with K = C. The real case K = R is not

very different. We will use (·, ·) to denote an inner product on H, and ‖ · ‖ will standfor the natural norm induced by the inner product.

7.1 Orthonormal setsDefinition 7.1 Two vectors x,y ∈ H are called orthogonal if (x,y) = 0. Then we writex⊥ y.

Theorem 7.2 (Pythagoras theorem) If x⊥ y then ‖x+ y‖2 = ‖x‖2 +‖y‖2.

Proof: Since (x,y) = 0

‖x+ y‖2 = (x+ y,x+ y) = (x,x)+(x,y)+(y,x)+(y,y) = ‖x‖2 +‖y‖2 . �

Definition 7.3 A set E is orthonormal if ‖e‖= 1 for all e ∈ E and (e1,e2) = 0 for alle1,e2 ∈ E such that e1 6= e2 .

Note that this definition does not require the set E to be countable.

Exercise: Any orthonormal set is linearly independent.

Indeed, suppose ∑nk=1 αkek = 0 with ek ∈ E and αk ∈K. Multiplying this equality

by e j we get

0 =

(n

∑k=1

αkek,e j

)=

n

∑k=1

αk(ek,e j) = α j .

Since α j = 0 for all j, we conclude that the set E is linearly independent.

Definition 7.4 (Kronecker delta) The Kronecker delta is the function defined by

δ jk ={

1, if j = k,0, if j 6= k .

41

Page 46: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Example: For every j ∈ N, let e j = (δ jk)∞k=1 (it is an infinite sequence of zeros with 1

at the jth position). The set E = {e j : j ∈ N} is orthonormal in `2. Indeed, from thedefinition of the scalar product in `2 we see that (e j,ek) = δ jk for all j,k ∈ N.

Example: The set13

E ={

fk =eikx√

2π: k ∈ Z

}is an orthonormal set in L2(−π,π). Indeed, since | fk(x)|= 1√

2πfor all x:

‖ fk‖2L2 =

∫π

−π

| fk(x)|2dx = 1 ,

and if j 6= k

( fk, f j) =∫

π

−π

fk(x) f j(x)dx =1

∫π

−π

ei(k− j)xdx =ei(k− j)x

i(k− j)

∣∣∣∣∣x=π

x=−π

= 0 .

Lemma 7.5 If {e1, . . . ,en} is an orthonormal set in an inner product space V , then forany α j ∈K ∥∥∥∥∥ n

∑j=1

α je j

∥∥∥∥∥2

=n

∑j=1|α j|2 .

Proof: The following computation is straightforward:∥∥∥∥∥ n

∑j=1

α je j

∥∥∥∥∥2

=

(n

∑j=1

α je j,n

∑l=1

αlel

)=

n

∑j=1

n

∑l=1

α jα l(e j,el)

=n

∑j=1

n

∑l=1

α jα lδ jl =n

∑j=1

α jα j . �

7.2 Gram-Schmidt orthonormalisationLemma 7.6 (Gram-Schmidt orthonormalisation) Let V be an inner product spaceand (vk) be a sequence of linearly independent vectors in V (finite or infinite). Thenthere is an orthonormal sequence (ek) such that

Span{v1, . . . ,vk }= Span{e1, . . . ,ek } for all k.

13Remember that for any x ∈ R and any k ∈ Z: eikx = coskx + isinkx. Then |eikx| = 1 and e±ikπ =coskπ± isinkπ = (−1)k.

42

Page 47: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Proof: Let e1 = v1‖v1‖ . Then

Span{v1 }= Span{e1 }

and the statement is true for n = 1 as the set E1 = {e1 } is obviously orthonormal.Then we continue inductively.14 Suppose that for some k ≥ 2 we have found an

orthonormal set Ek−1 = {e1, . . . ,ek−1 } such that its span coincides with the span of{v1, . . . ,vk−1 }. Then set

ek = vk−k−1

∑j=1

(vk,e j)e j .

Since ∑k−1j=1(vk,e j)e j ∈ Span(Ek−1) = Span{v1, . . . ,vk−1 } and v1, . . . ,vk are linearly

independent, we conclude that ek 6= 0. For every j < k

(ek,e j) = (vk,el)−k−1

∑j=1

(vk,e j)(e j,el) = (vk,el)− (vk,el) = 0

which implies that ek ⊥ e j. Finally let ek = ek/‖ek‖. Then {e1, . . . ,ek } is an orthonor-mal set such that

Span{e1, . . . ,ek }= Span{v1, . . . ,vk } .If the original sequence is finite, the orthonormalisation procedure will stop after afinite number of steps. Otherwise, we get an infinite sequence of ek. �

Corollary 7.7 Any infinite-dimensional inner product space contains a countable or-thonormal sequence.

Corollary 7.8 Any finite-dimensional inner product space has an orthonormal basis.

Proposition 7.9 Any finite dimensional inner product space is isometric to Cn (or Rn

if the space is real) equipped with the standard inner product.

Proof: Let n = dimV and e j, j = 1, . . . ,n be an orthonormal basis in V . Note that(ek,e j) = δk j. Any two vectors x,y ∈V can be written as

x =n

∑k=1

xkek and y =n

∑j=1

y je j .

Then

(x,y) =

(n

∑k=1

xkek,n

∑j=1

y je j

)=

n

∑k=1

n

∑j=1

xky j(ek,e j) =n

∑k=1

xkyk .

Therefore the map x 7→ (x1, . . . ,xn) is an isometry. �

We see that an arbitrary inner product, when written in orthonormal coordinates,takes the form of the “canonical” inner product on Cn (or Rn if the original space isreal).

14For example, let k = 2. We define e2 = v2−(v2,e1)e1. Then (e2,e1) = (v2,e1)−(v2,e1)(e1,e1) = 0.Since v1,v2 are linearly independent e2 6= 0. So we can define e2 = e2

‖e2‖.

43

Page 48: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

7.3 Bessel’s inequalityLemma 7.10 (Bessel’s inequality) If V is an inner product space and E = (ek)∞

k=1 isan orthonormal sequence, then for every x ∈V

∑k=1|(x,ek)|2 ≤ ‖x‖2 .

Proof: We note that for any n ∈ N:∥∥∥∥∥x−n

∑k=1

(x,ek)ek

∥∥∥∥∥2

=

(x−

n

∑k=1

(x,ek)ek,x−n

∑k=1

(x,ek)ek

)

= ‖x‖2−2n

∑k=1|(x,ek)|2 +

n

∑k=1|(x,ek)|2

= ‖x‖2−n

∑k=1|(x,ek)|2 .

Since the left hand side is not negative,n

∑k=1|(x,ek)|2 ≤ ‖x‖2

and the lemma follows by taking the limit as n→ ∞. �

Corollary 7.11 If E is an orthonormal set in an inner product space V , then for anyx ∈V the set

Ex = {e ∈ E : (x,e) 6= 0}is at most countable.

Proof: For any m ∈ N the set Em = {e : |(x,e)|> 1m } has a finite number of elements.

Otherwise there would be an infinite sequence (ek)∞k=1 with ek ∈ Em, then the series

∑∞k=1 |(x,ek)|2 = +∞ which contradicts to Bessel’s inequality. Therefore Ex =∪∞

m=1Emis a countable union of finite sets and hence at most countable. �

7.4 ConvergenceIn this section we will discuss convergence of series which involve elements from anorthonormal set.

Lemma 7.12 Let H be a Hilbert space and E = (ek)∞k=1 an orthonormal sequence.

The series ∑∞k=1 αkek converges iff ∑

∞k=1 |αk|2 < +∞. Then∥∥∥∥∥ ∞

∑k=1

αkek

∥∥∥∥∥2

=∞

∑k=1|αk|2 . (7.1)

44

Page 49: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Proof: Let xn = ∑nk=1 αkek and βn = ∑

nk=1 |αk|2. Lemma 7.5 implies that ‖xn‖2 = βn

and that for any n > m

‖xn− xm‖2 =

∥∥∥∥∥ n

∑k=m+1

αkek

∥∥∥∥∥2

=n

∑k=m+1

|αk|2 = βn−βm .

Consequently, xn is a Cauchy sequence in H iff βn is Cauchy in R. Since both spacesare complete, the sequences converge or diverge simultaneously.

If they converge, we take the limit as n→ ∞ in the equality ‖xn‖2 = βn to get (7.1)(the limit commutes with ‖ · ‖2). �

Definition 7.13 A series ∑∞n=1 xn in a Banach space X is unconditionally convergent

if for every permutation σ : N→ N the series ∑∞n=1 xσ(n) converges.

Every absolutely convergent series is unconditionally convergent, but the converseimplication does not hold in general. In Rn a series is unconditionally convergent ifand only if it is absolutely convergent.

Lemma 7.12 and Bessel’s inequality imply:

Corollary 7.14 If H is a Hilbert space and E = (ek)∞k=1 is an orthonormal sequence,

then for every x ∈ H the sequence

∑k=1

(x,ek)ek

converges unconditionally.

Lemma 7.15 Let H be a Hilbert space, E = (ek)∞k=1 an orthonormal sequence and

x ∈ H. If x = ∑∞k=1 αkek, then

αk = (x,ek) for all k ∈ N.

Proof: Exercise. �

7.5 Orthonormal basis in a Hilbert spaceDefinition 7.16 A set E is a basis for H if every x ∈ H can be written uniquely in theform

x =∞

∑k=1

αkek

for some αk ∈ K and ek ∈ E. If additionally E is an orthonormal set, then E is anorthonormal basis.

45

Page 50: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

If E is a basis, then it is a linearly independent set. Indeed, if ∑nk αkek = 0 then

αk = 0 due to the uniqueness.

Note that in this definition the uniqueness is a delicate point. Indeed, the sum∑

∞k=1 αkek is defined as a limit of partial sums xn = ∑

nk=1 αkek. A permutation of ek

changes the partial sums and may lead to a different limit. In general, we cannot evenguarantee that after a permutation the series remains convergent.

If E is countable, we can assume that the sum involves all elements of the basis(some αk can be zero) and that the summation is taken following the order of a selectedenumeration of E. The situation is more difficult if E is uncountable since in this casethere is no natural way of numbering the elements.

The situation is much simpler if E is orthonormal as in this case the series convergeunconditionally and the order of summations is not important.

Proposition 7.17 Let E = {e j : j ∈ N} be an orthonormal set in a Hilbert space H.Then the following statements are equivalent:

(a) E is a basis in H;

(b) x = ∑∞k=1(x,ek)ek for all x ∈ H;

(c) ‖x‖2 = ∑∞k=1 |(x,ek)|2;

(d) (x,en) = 0 for all n ∈ N implies x = 0;

(e) the linear span of E is dense in H.

Proof:(a)⇐⇒ (b): use Lemma 7.15.

(b) =⇒ (c): use Lemma 7.12.

(c) =⇒ (d): Let (x,ek) = 0 for all k, then (c) implies that ‖x‖= 0 hence x = 0.

(d) =⇒ (b): let y = x−∑∞k=1(x,ek)ek. Corollary 7.14 implies that the series con-

verges. Then Lemma 6.5 implies we can swap the limit and the inner product to getfor every n

(y,en) =

(x−

∑k=1

(x,ek)ek,en

)

= (x,en)−∞

∑k=1

(x,ek)(ek,en) = (x,en)− (x,en) = 0 .

Since (y,en) = 0 for all n, then (d) implies that y = 0 which is equivalent to x =∑

∞k=1(x,ek)ek as required.

46

Page 51: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

(e) =⇒ (d): since Span(E) is dense in H for any x ∈ H there is a sequence xn ∈Span(E) such that xn→ x. Take x such that (x,en) = 0 for all n. Then (xn,x) = 0 andconsequently

‖x‖2 =(

limn→∞

xn,x)

= limn→∞

(xn,x) = 0 .

Therefore x = 0.

(a) =⇒ (e): Since E is a basis any x = limn→∞ xn with xn = ∑nk=1 αkek ∈ Span(E).

Example: The orthonormal sets from examples of Section 7.1 are also examples oforthonormal bases.

47

Page 52: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

8 Closest points and approximations

8.1 Closest points in convex subsetsDefinition 8.1 A subset A of a vector space V is convex if λx +(1−λ )y ∈ A for anytwo vectors x,y ∈V and any λ ∈ [0,1].

Lemma 8.2 Let A be a non-empty closed convex subspace of a Hilbert space H andx ∈ H. Then there is a unique a∗ ∈ A such that

‖x−a∗‖= infa∈A‖x−a‖ .

Proof: The parallelogram rule implies:

‖(x−u)+(x− v)‖2 +‖(x−u)− (x− v)‖2 = 2‖x−u‖2 +2‖x− v‖2 .

Then‖u− v‖2 = 2‖x−u‖2 +2‖x− v‖2−4‖x− 1

2(u+ v)‖2 .

Let d = infa∈A ‖x− a‖. Since A is convex, 12(u + v) ∈ A for any u,v ∈ A, and conse-

quently ‖x− 12(u+ v)‖ ≥ d. Then

‖u− v‖2 ≤ 2‖x−u‖2 +2‖x− v‖2−4d2 . (8.1)

Since d is the infinum, for any n there is an ∈ A such that ‖x− an‖2 < d2 + 1n . Then

equation (8.1) implies that

‖an−am‖ ≤ 2d2 +2n

+2d2 +2m−4d2 =

2n

+2m

.

Consequently (an) is Cauchy and, since H is complete, it converges to some a∗. SinceA is closed, a∗ ∈ A. Then

‖x−a∗‖2 = limn→∞‖x−an‖2 = d2 .

Therefore a∗ is the point closest to x. Now suppose that there is another point a ∈ Asuch that ‖x− a‖= d, then (8.1) implies

‖a∗− a‖ ≤ 2‖x−a∗‖2 +2‖x− a‖2−4d2 = 2d2 +2d2−4d2 = 0 .

So a = a∗ and a∗ is unique. �

48

Page 53: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

8.2 Orthogonal complementsIn an infinite dimensional space a linear subspace does not need to be closed. Forexample the space ` f of all sequences with only a finite number of non-zero elementsis a linear subspace of `2 but it not closed in `2 (e.g. consider the sequence xn =(1,2−1,2−2, . . . ,2−n,0,0, . . .)).

Definition 8.3 Let X ⊆ H. The orthogonal complement of X in H is the set

X⊥ = {u ∈ H : (u,x) = 0 for all x ∈ X } .

Proposition 8.4 If X ⊆ H, then X⊥ is a closed linear subspace of H.

Proof: If u,v ∈ X⊥ and α ∈ K then

(u+αv,x) = (u,x)+α(v,x) = 0

for all x ∈ X . Therefore X⊥ is a linear subspace. Now suppose that un ∈ X⊥ andun→ u ∈ H. Then for all x ∈ X

(u,x) = ( limn→∞

un,x) = limn→∞

(un,x) = 0 .

Consequently, u ∈ X⊥ and so X⊥ is closed. �

Exercises:

1. If E is a basis in H, then E⊥ = {0}.

2. If Y ⊆ X , then X⊥ ⊆ Y⊥.

3. X ⊆ (X⊥)⊥

4. If X is a closed linear subspace in H, then X = (X⊥)⊥

Definition 8.5 The closed linear span of E ⊂H is a minimal closed set which containsSpan(E):

Span(E) = {u ∈ H : ∀ε > 0 ∃x ∈ Span(E) such that ‖x−u‖< ε } .

Proposition 8.6 If E ⊂ H then E⊥ = (Span(E))⊥ = (Span(E))⊥ .

Proof: Since E ⊆ Span(E)⊆ Span(E) we have (Span(E))⊥ ⊆ (Span(E))⊥ ⊆ E⊥. Sowe need to prove the inverse inclusion. Take u ∈ E⊥ and x ∈ Span(E). Then there isxn ∈ Span(E) such that xn→ x. Then

(x,u) = ( limn→∞

xn,u) = limn→∞

(xn,u) = 0 .

Consequently, u ∈ (Span(E))⊥ and we proved E⊥ ⊆ (Span(E))⊥. �

49

Page 54: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Theorem 8.7 If U is a closed linear subspace of a Hilbert space H then

1. any x ∈H can be written uniquely in the form x = u+v with u ∈U and v ∈U⊥.

2. u is the closest point to x in U.

3. The map PU : H→U defined by PU x = u is linear and satisfies

P2U x = PU x and ‖PU(x)‖ ≤ ‖x‖ for all x ∈ H .

Definition 8.8 The map PU is called the orthogonal projector onto U.

Proof: Any linear subspace is obviously convex. Then Lemma 8.2 implies that thereis a unique u ∈U such that

‖x−u‖= infa∈U‖x−a‖ .

Let v = x− u. Let us show that v ∈ U⊥. Indeed, take any y ∈ U and consider thefunction ∆ : C→ R defined by

∆(t) = ‖v+ ty‖2 = ‖x− (u− ty)‖2 .

Since the definition of u together with u− ty ∈U imply that ∆(t) ≥ ∆(0) = ‖x−u‖2,the function ∆ has a minimum at t = 0. On the other hand

∆(t) = ‖v+ ty‖2 = (v+ ty,v+ ty)= (v,v)+ t(y,v)+ t(v,y)+ |t|2(y,y) .

First suppose that t is real. Then t = t and d∆

dt (0) = 0 implies

(y,v)+(v,y) = 0 .

Then suppose that t is purely imaginary, Then t =−t and d∆

dt (0) = 0 implies

(y,v)− (v,y) = 0 .

Taking the sum of these two equalities we conclude

(y,v) = 0 for every y ∈U .

Therefore v ∈U⊥.

In order to prove the uniqueness of the representation suppose x = u1 + v1 = u+ vwith u1,u∈U and v1,v∈U⊥. Then u1−u = v−v1. Since u−u1 ∈U and v−v1 ∈U⊥,

‖v− v1‖2 = (v− v1,v− v1) = (v− v1,u1−u) = 0 .

Therefore u and v are unique.

Finally x = u+v with u⊥ v implies ‖x‖2 = ‖u‖2 +‖v‖2. Consequently ‖PU(x)‖=‖u‖ ≤ ‖x‖. We also note that PU(u) = u for any u∈U . So P2

U(x) = PU(x) as PU(x)∈U�

Corollary 8.9 If U is a closed linear subspace in a Hilbert space H and x ∈ H, thenPU(x) is the closest point to x in U.

50

Page 55: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

8.3 Best approximationsTheorem 8.10 Let E be an orthonormal sequence: E = {e j : j ∈J } where J iseither finite or countable set. Then for any x ∈ H, the closest point to x in Span(E) isgiven by

y = ∑j∈J

(x,e j)e j .

Corollary 8.11 If E is an orthonormal basis in a closed subspace U ⊂ H, then theorthogonal projection onto U is given by

PU(x) = ∑j∈J

(x,e j)e j .

Proof: Corollary 7.14 implies that u = ∑ j∈J (x,e j)e j converges. Then obviously u ∈Span(E) which is a closed linear subset. Let v = x−u. Since (v,ek) = (x,ek)−(u,ek) =0 for all k ∈ J, we conclude v ∈ E⊥ = (Span(E))⊥ (Lemma 8.6). Theorem 8.7 impliesthat u is the closest point. �

Example: The best approximation of an element x ∈ `2 in terms of the elements of thestandard basis (e j)n

j=1 is given by

n

∑k=1

(x,e j)e j = (x1,x2, . . . ,xn,0,0, . . .) .

Example: Let (e j)∞j=1 be an orthonormal basis in H. The best approximation of an

element x ∈ H in terms of the first n elements of the orthonormal basis is given by

n

∑k=1

(x,e j)e j .

Now suppose that the set E is not orthonormal. If the set E is finite or countable wecan use the Gram-Schmidt orthonormalisation procedure to construct an orthonormalbasis in Span(E). After that the theorem above gives us an explicit expression for thebest approximation. Let’s consider some examples.

Example: Find the best approximation of a function f ∈ L2(−1,1) with polynomialsof degree up to n. In other words, let E = {1,x,x2, . . . ,xn }. We need to find u ∈Span(E) such that

‖ f −u‖L2 = infp∈Span(E)

‖ f − p‖L2 .

The set E is not orthonormal. Let’s apply the Gram-Schmidt orthonormalisation pro-cedure to construct an orthonormal basis in Span(E). For the sake of shortness, let’swrite ‖ · ‖= ‖ · ‖L2(−1,1).

51

Page 56: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

First note that ‖1‖=√

2 and let

e1 =1√2

.

Then (1,x) =∫ 1−1 xdx = 0 and ‖x‖2 =

∫ 1−1 |x|2 dx = 2

3 so let

e2 =

√32

x .

Then

e3 = x2− (x2,e2)e2− (x2,e1)e1

= x2−√

32

x∫ 1

−1t2

√32

t dt− 1√2

∫ 1

−1t2 1√

2dt

= x2− 12

∫ 1

−1t2 dt = x2− 1

3.

Taking into account that

‖e3‖2 =∫ 1

−1

(x2− 1

3

)2

dt =8

45

we obtain

e3 =e3

‖e3‖=

√58

(3x2−1) .

Exercise: Show that e4 =√

78(5x3−3x) is orthogonal to e1, e2 and e3.

The best approximation of any function f ∈ L2(−1,1) by a polynomial of thirddegree is given by

78(5x3−3x)

∫ 1

−1f (t)(5t3−3t)dt +

58(3x2−1)

∫ 1

−1f (t)(3t2−1)dt

+32

x∫ 1

−1t f (t)dt +

12

∫ 1

−1f (t)dt

For example, if f (x) = |x| its best approximation by a third degree polynomial is

p3 =15x2 +3

16.

We can check (after computing the corresponding integral);

‖ f − p3‖2 =3

16.

52

Page 57: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Note that the best approximation in the L2 norm is not necessarily the best approx-imation in the sup norm. Indeed, for example,

supx∈[−1,1]

∣∣∣∣|x|− 15x2 +316

∣∣∣∣> 316

(the supremum is larger than the values at x = 0). At the same time

supx∈[−1,1]

∣∣∣∣|x|−(x2 +18

)∣∣∣∣= 18

.

8.4 Weierstrass Approximation TheoremIn this section we will prove an approximation theorem which is independent from thediscussions of the previous lectures. This theorem implies that polynomials are densein the space of continuous functions on an interval.

Theorem 8.12 If f : [0,1]→R is continuous on [0,1] then the sequence of polynomials

Pn(x) =n

∑p=0

(np

)f (p/n)xp(1− x)n−p

uniformly converges to f on [0,1].

Proof: The binomial theorem states that

(x+ y)n =n

∑p=0

(np

)xpyn−p .

Differentiating with respect to x and multiplying by x we get

nx(x+ y)n−1 =n

∑p=0

p(

np

)xpyn−p .

Differentiating the original identity twice with respect to x and multiplying by x2 weget

n(n−1)x2(x+ y)n−2 =n

∑p=0

p(p−1)(

np

)xpyn−p .

Now substitute y = 1− x and denote

rp(x) =(

np

)xp(1− x)n−p .

53

Page 58: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

We get

n

∑p=0

rp(x) = 1 ,

n

∑p=0

prp(x) = nx .

n

∑p=0

p(p−1)rp(x) = n(n−1)x2 .

Consequently,

n

∑p=0

(p−nx)2rp(x) =n

∑p=0

p2rp(x)−2nxn

∑p=0

prp(x)+n2x2n

∑p=0

rp(x)

= n(n−1)x2 +nx−2(nx)2 +n2x2 = nx(1− x) .

Let M = supx∈[0,1] | f (x)|. Note that f is uniformly continuous, i.e., for every ε > 0there is δ > 0 such that

|x− y|< δ =⇒ | f (x)− f (y)|< ε .

Now we can estimate

| f (x)−Pn(x)| =

∣∣∣∣∣ f (x)− n

∑p=0

f (p/n)rp(x)

∣∣∣∣∣=

∣∣∣∣∣ n

∑p=0

( f (x)− f (p/n))rp(x)

∣∣∣∣∣≤

∣∣∣∣∣ ∑|x−p/n|<δ

( f (x)− f (p/n))rp(x)

∣∣∣∣∣+

∣∣∣∣∣ ∑|x−p/n|>δ

( f (x)− f (p/n))rp(x)

∣∣∣∣∣The first sum is bounded by∣∣∣∣∣ ∑

|x−p/n|<δ

( f (x)− f (p/n))rp(x)

∣∣∣∣∣≤ ε ∑|x−p/n|<δ

rp(x) < ε

54

Page 59: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

The second sum is bounded by∣∣∣∣∣ ∑|x−p/n|>δ

( f (x)− f (p/n))rp(x)

∣∣∣∣∣ ≤ 2M ∑|nx−p|>nδ

rp(x)

≤n

∑p=0

(p−nx)2

n2δ 2 rp(x)

=2Mx(1− x)

nδ 2 ≤ 2Mnδ 2

which is less than ε for any n > 2Mnδ 2ε

. Therefore for these values of n

| f (x)−Pn(x)|< 2ε .

Consequently,

supx∈[0,1]

| f (x)−Pn(x)| → 0 as n→ ∞. �

Corollary 8.13 The set of polynomials is dense in C[0,1] equipped with the supremumnorm.

55

Page 60: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

9 Separable Hilbert spaces

9.1 Definition and examplesDefinition 9.1 A normed space is separable if it contains a countable dense subset.

In other words, a space H is separable if there is a countable set {xn ∈ H : n ∈ N}such that for any u ∈ H and any ε > 0 there is n ∈ N such that

‖xn−u‖< ε .

Examples: R is separable (Q is dense). Rn is separable (Qn is dense), Cn is separable(Qn + iQn is dense).

Example: `2 is separable. Indeed, the set of sequences (x1,x2, . . . ,xn,0,0,0, . . .) withx j ∈Q is dense and countable.

Example: The space C[0,1] is separable. Indeed, the Weierstrass approximation the-orem states that every continuous function can be approximated (in the sup norm) bya polynomial. The dense countable set is given by polynomials with rational coeffi-cients.

Example: L2(0,1) is separable. Indeed, continuous functions are dense in L2(0,1)(in the L2-norm). The polynomials are dense in C[0,1] (in the supremum norm andtherefore in the L2 norm as well). The set of polynomials with rational coefficients isdense in the set of all polynomials and, consequently, it is also dense in L2[0,1] (in theL2 norm).

9.2 Isometry to `2

If H is a Hilbert space, then its separability is equivalent to existence of a countableorthonormal basis.

Proposition 9.2 An infinite-dimensional Hilbert space is separable iff it has a count-able orthonormal basis.

Proof: If a Hilbert space has a countable basis, then we can construct a countable denseset by taking finite linear combinations of the basis elements with rational coefficients.Therefore the space is separable.

If H is separable, then it contains a countable dense subset V = {xn : n ∈ N}.Obviously, the closed linear span of V coincides with H. First we construct a linearindependent set V which has the same linear span as V by eliminating from V thosexn which are not linearly independent from {x1, . . . ,xn−1 }. Then the Gram-Schmidtprocess gives an orthonormal sequence with the same closed linear span, i.e., it is abasis by characterisation (e) of Proposition 7.17. �

56

Page 61: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

The following theorem shows that all infinite dimensional separable spaces areisometric. So in some sense `2 is essentially the “only” separable infinite-dimensionalspace.

Theorem 9.3 Any infinite-dimensional Hilbert space is isometric to `2.

Proof: Let {e j } be an orthonormal basis in H. The map A : H→ `2 defined by

A : u→ ((u,e1),(u,e2),(u,e3), . . .}

is invertible. Indeed, the image of A is in `2 due to Lemma 7.12, and the inverse mapis given by

A−1 : (xk)∞k=1 7→

∑k=1

xkek .

The characterisation of a basis in Proposition 7.17 implies that ‖u‖H = ‖A(u)‖`2 .�

Note that there are Hilbert spaces which are not separable.

Example: Let J be uncountable. The space of all functions f : J → R such that

∑j∈J| f ( j)|2 < ∞

is a Hilbert space. It is not separable.15

15How do we define the sum over an uncountable set? For any n∈N the set Jn = { j ∈J : | f ( j)|>1n } is finite (otherwise the sum is obviously infinite). Consequently, the set J ( f ) := { j ∈J : | f ( j)|>0} is countable because it is a countable union of finite sets: J ( f ) = ∪∞

n=1Jn. Therefore, the numberof non-zero terms in the sum is countable and the usual definition of an infinite sum can be used.

57

Page 62: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

10 Linear maps between Banach spacesA linear map on a vector space is traditionally called a linear operator. All linearfunctions defined on a finite-dimensional space are continuous. This statement is nolonger true in the case of an infinite dimensional space.

We will begin our study with continuous operators: this class has a rich theory andnumerous applications. We will only slightly touch some of them (the most remarkableexamples will be the shift operators on `2, and integral operators and multiplicationoperators on L2).

Of course many interesting linear maps are not continuous, i.e., the differentialoperator A : f 7→ f ′ on the space of continuously differentiable functions. More accu-rately, let D(A) =C1[0,1]⊂ L2(0,1) be the domain of A Obviously A : D(A)→ L2(0,1)is linear but not continuous. Indeed, consider the sequence xn(t) = n−1 sin(nt). Obvi-ously ‖xn‖L2 ≤ n−1 so xn→ 0, but A(xn) = cos(nt) does not converge to A(0) = 0 inthe L2 norm so A is not continuous.

Some definitions and properties from the theory of continuous linear operators canbe literally extended onto unbounded ones, but sometimes subtle differences appear:e.g., we will see that a bounded operator is self-adjoint iff it is symmetric, which isno longer true for unbounded operators. In a study of unbounded operators a specialattention should be paid to their domains.

10.1 Continuous linear mapsLet U and V be vector spaces over K.

Definition 10.1 A function A : U →V is called a linear operator if

A(αx+βy) = αA(x)+βA(y) for all x,y ∈U and α,β ∈K.

We will often write Ax to denote A(x).The collection of all linear operators from U to V is a vector space. If A,B : U→V

are linear operators and α,β ∈K then we define

(αA+βB)(x) = αAx+βBx .

Obviously, αA+βB is also linear.

Definition 10.2 A linear operator A : U →V is bounded if there is a constant M suchthat

‖Ax‖V ≤M‖x‖U for all x ∈U. (10.1)

If an operator is bounded, then the image of a bounded set is also bounded.

Lemma 10.3 A linear operator A : U →V is continuous iff it is bounded.

58

Page 63: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Proof: Suppose A is bounded. Then there is M > 0 such that

‖A(x)−A(y)‖= ‖A(x− y)‖ ≤M‖x− y‖

for all x,y ∈V and consequently A is continuous.Now suppose A is continuous. Obviously A(0) = 0. Then for ε = 1 there is δ > 0

such that ‖A(x)‖< ε = 1 for all ‖x‖< δ . For any u ∈U , u 6= 0,

A(u) =2‖u‖

δA(

δ

2‖u‖u)

.

Since∥∥∥ δ

2‖u‖ u∥∥∥= δ

2 < δ we get ‖A(u)‖ ≤ 2‖u‖δ

and consequently A is bounded. �

The space of all bounded linear operators from U to V is denoted by B(U,V ).

Definition 10.4 The operator norm of A : U →V is

‖A‖B(U,V ) = supx 6=0

‖A(x)‖V‖x‖U

.

We will often write ‖A‖op instead of ‖A‖B(U,V ).

Since A is linear‖A‖B(U,V ) = sup

‖x‖U=1‖A(x)‖V .

We note that ‖A‖B(U,V ) is the smallest M such that (10.1) holds: indeed, it is easy tosee that the definition of operator norm implies

‖A(x)‖V ≤ ‖A‖B(U,V ) ‖x‖U

and (10.1) holds with M = ‖A‖B(U,V ). On the other hand, (10.1) implies M ≥ ‖Ax‖V‖x‖U for

any x 6= 0 and consequently M ≥ ‖A‖B(U,V ).

Theorem 10.5 Let U be a normed space and V be a Banach space. Then B(U,V ) is aBanach space.

Proof: Let (An)∞n=1 be a Cauchy sequence in B(U,V ). Take a vector u ∈ U . The

sequence vn = An(u) is a Cauchy sequence in V :

‖vn− vm‖= ‖An(u)−Am(u)‖= ‖(An−Am)(u)‖ ≤ ‖An−Am‖op‖u‖.

Since V is complete there is v ∈V such that vn→ v. Let A(u) = v.The operator A is linear. Indeed,

A(α1u1 +α2u2) = limn→∞

An(α1u1 +α2u2) = limn→∞

(α1An(u1)+α2An(u2))

= α1 limn→∞

Anu1 +α2 limn→∞

Anu2 = α1Au1 +α2Au2 .

59

Page 64: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

The operator A is bounded. Indeed, An is Cauchy and hence bounded: there is constantM ∈R such that ‖An‖op < M for all n. Taking the limit in the inequality ‖Anu‖≤M‖u‖implies ‖Au‖ ≤M‖u‖. Therefore A ∈ B(U,V ).

Finally, An→ A in the operator norm. Indeed, Since An is Cauchy, for any ε > 0there is N such that ‖An−Am‖op < ε or

‖An(u)−Am(u)‖ ≤ ε‖u‖ for all m,n > N.

Taking the limit as m→ ∞

‖An(u)−A(u)‖ ≤ ε‖u‖ for all n > N.

Consequently ‖An−A‖ ≤ ε and so An→ A. Therefore B(U,V ) is complete. �

10.2 Examples1. Example: Shift operator: Tl,Tr : `2→ `2:

Tr(x) = (0,x1,x2,x3, . . .) and Tl(x) = (x2,x3,x4, . . .) .

Both operators are obviously linear. Moreover,

‖Tr(x)‖2`2 =

∑k=1|xk|2 = ‖x‖`2 .

Consequently, ‖Tr‖op = 1. We also have

‖Tl(x)‖2`2 =

∑k=2|xk|2 ≤ ‖x‖`2 .

Consequently, ‖Tl‖op ≤ 1. However, if x = (0,x2,x3,x4, . . .) then ‖Tl(x)‖`2 =‖x‖`2 . Therefore ‖Tl‖op = 1.

2. Example: Multiplication operator: Let f be a continuous function on [a,b]. Theequation

(Ax)(t) = f (t)x(t)

defines a bounded linear operator A : L2[a,b]→ L2[a,b]. Indeed, A is obviouslylinear. It is bounded since

‖Ax‖2 =∫ b

a| f (t)x(t)|2 dt ≤ ‖ f‖2

∫ b

a|x(t)|dt = ‖ f‖2

∞‖x‖L2 .

Consequently ‖A‖op ≤ ‖ f‖∞. Now let t0 be a maximum of f . If t0 6= b, considerthe characteristic function

xε = χ[t0,t0+ε] .

60

Page 65: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

(If t0 = b let xε = χ[t0−ε,t0].) Since f is continuous,

‖Axε‖‖xε‖

=1ε

∫ t0+ε

t0| f (t)|2dt→ | f (t)|2 as ε → 0.

Therefore ‖A‖op = ‖ f‖∞.

3. Example: Integral operator on L2(a,b):

(Ax)(t) =∫ b

aK(t,s)x(s)ds for all t ∈ [a,b],

where ∫ b

a

∫ b

a|K(s, t)|dsdt < +∞ .

Let us estimate the norm of A:

‖Ax‖2 =∫ b

a

∣∣∣∣∫ b

aK(t,s)x(s)ds

∣∣∣∣2 dt

≤∫ b

a

(∫ b

a|K(t,s)|2ds

∫ b

a|x(s)|2ds

)dt (Cauchy-Schwartz)

=∫ b

a

∫ b

a|K(t,s)|2dsdt ‖x‖2 .

Consequently

‖A‖2op ≤

∫ b

a

∫ b

a|K(t,s)|2dsdt .

Note that this example requires a bit more from the theory of Lebesgue integralsthan we discussed in Section 5. If you are not taking Measure Theory and feeluncomfortable with these integrals, you may assume that x,y and K are continu-ous functions.

10.3 Kernel and rangeDefinition 10.6 Kernel of A:

KerA = {x ∈U : Ax = 0}

Range of A:RangeA = {y ∈V : ∃x ∈U such that y = Ax}

We note that 0 ∈ KerA for any linear operator A. We say that KerA is trivial ifKerA = {0}.

Proposition 10.7 If A ∈ B(U,V ) then KerA is a closed linear subspace of U.

61

Page 66: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Proof: If x,y ∈ KerA and α,β ∈K, then

K(αx+βy) = αK(x)+βK(y) = 0 .

Consequently αx +βy ∈ KerA and it is a linear subspace. Furthermore if xn→ x andA(xn) = 0 for all n, then A(x) = 0 due to continuity of A. �

Note that the range is a linear subspace but not necessarily closed (see Examples3).

62

Page 67: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

11 Linear functionals

11.1 Definition and examplesDefinition 11.1 If U is a vector space then a linear map U → K is called a linearfunctional on U,

Definition 11.2 The space of all continuous functionals on a normed space U is calledthe dual space, i.e., U∗ = B(U,K) .

The dual space equipped with the operator norm is Banach. Indeed, K = R or Cwhich are both complete. Then Theorem 10.5 implies that U∗ is Banach.

1. Example: δx( f ) = f (x), x ∈ [a,b], is a bounded linear functional on C[a,b].

2. Example: Let φ ∈ C[a,b] (or φ ∈ L2(a,b)). Then `φ (x) =∫ b

a φ(t)x(t)dt is abounded linear functional on L2(a,b).

11.2 Riesz representation theoremLet H be a Hilbert space. Then for any y ∈ H

`y(x) = (x,y)

is a bounded functional `y : H→K (by the Cauchy-Schwartz inequality). The follow-ing theorem is one of the fundamental results of Functional Analysis: it states that themap y 7→ `y is an isometry between H and its dual space H∗.

Theorem 11.3 (Riesz Representation Theorem) Let H be a Hilbert space. For anybounded linear functional f : H→K there is a unique y ∈ H such that

f (x) = (x,y) for all x ∈ H.

Moreover, ‖ f‖H∗ = ‖y‖H .

Proof: Let K = Ker f . It is a closed linear subspace of H. If K = H then f (x) = 0for all x and the statement of the theorem is true with y = 0. Otherwise K⊥ 6= {0} andthere is a vector z ∈ K⊥ with ‖z‖H = 1.

Now we show that dimK⊥ = 1. Indeed, let u ∈ K⊥. Since K⊥ is a linear subspacev = f (z)u− f (u)z ∈ K⊥. On the other hand

f (v) = f ( f (z)u− f (u)z) = f (z) f (u)− f (u) f (z) = 0

and so v ∈ K. For any linear subspace K ∩K⊥ = {0}, and so v = 0. Then f (z)u−f (u)z = v = 0, i.e. u = f (u)

f (z) z. Consequently {z} is the basis in K⊥.

63

Page 68: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Theorem 8.7 implies that every vector x ∈ H can be written uniquely in the form

x = u+ v where u ∈ K and v ∈ K⊥.

Since K⊥ is one dimensional and ‖z‖= 1, u = (x,z)z. Moreover,

f (x) = f (u)+ f (v) = f (u) = (x,z) f (z) = (x, f (z)z) .

Set y = f (z)z to get the desired equality:

f (x) = (x,y) .

If there is another y′ ∈ H such that f (x) = (x,y′) for all x ∈ H, then (x,y) = (x,y′) forall x, i.e., (x,y− y′) = 0. Setting x = y− y′ we conclude ‖y− y′‖2 = 0, i.e. y = y′ isunique.

Finally, the Cauchy-Schwartz inequality implies

| f (x)|= |(x,y)| ≤ ‖x‖‖y‖,

i.e., ‖ f‖H∗ = ‖ f‖op ≤ ‖y‖. On the other hand,

‖ f‖op ≥| f (y)|‖y‖

=|(y,y)|‖y‖

= ‖y‖ .

Consequently, ‖ f‖H∗ = ‖y‖H . �

64

Page 69: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

12 Linear operators on Hilbert spaces

12.1 ComplexificationIn the next lectures we will discuss the spectral theory of linear operators. The spectraltheory looks more natural in complex spaces. For example, a part of the theory studieseigenvalues and eigenvectors of linear maps (i.e. non-zero solutions of the equationAx = λx). In the finite-dimensional space a linear operator can be describe by a ma-trix. You already know that a matrix (even a real one) can have complex eigenvalues.Fortunately a real Hilbert space can always be considered as a part of a complex onedue to the “complexification” procedure.

Definition 12.1 Let H be a Hilbert space of R. The complexification of H is the vectorspace

HC = {x+ iy : x,y ∈ H }where the addition and multiplication are respectively defined by

(x+ iy)+(u+ iw) = (x+u)+ i(y+w)(α + iβ )(x+ iy) = (αx−βy)+ i(αy+βx) .

The inner product is defined by

(x+ iy,u+ iw) = (x,u)− i(x,w)+ i(y,u)+(y,w) .

Exercise: Show that HC is a Hilbert space.

Exercise: Show that ‖x+ iy‖2HC

= ‖x‖2 +‖y‖2 for all x,y ∈ H.

The following lemma states that any bounded operator on H can be extended to abounded operator on HC.

Lemma 12.2 Let H be a real Hilbert space and A : H → H be a bounded operator.Then

AC(x+ iy) = A(x)+ iA(y)

is a bounded operator HC→ HC.

Exercise: Prove the lemma.

12.2 Adjoint operatorsTheorem 12.3 If A : H→ H is a bounded linear operator on a Hilbert space H, thenthere is a unique bounded operator A∗ : H→ H such that

(Ax,y) = (x,A∗y) for all x,y ∈ H .

Moreover, ‖A∗‖op ≤ ‖A‖op.

65

Page 70: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Proof: Let y ∈ H and f (x) = (Ax,y) for all x ∈ H. The map f : H→K is linear and

| f (x)|= |(Ax,y)| ≤ ‖Ax‖‖y‖ ≤ ‖A‖op‖x‖‖y‖

where we have used the Cauchy-Schwartz inequality. Consequently, f is a boundedfunctional on H. The Riesz representation theorem implies that there is a unique z∈Hsuch that

(Ax,y) = (x,z) for all x ∈ H.

Define the function A∗ : H→ H by A∗y = z. Then

(Ax,y) = (x,A∗y) for all x,y ∈ H.

First, A∗ is linear since for any x,y1,y2 ∈ H and α1,α2 ∈K

(x,A∗(α1y1 +α2y2)) = (Ax,α1y1 +α2y2) = α1(Ax,y1)+ α2(Ax,y2)= α1(x,A∗y1)+ α2(x,A∗y2) = (x,α1A∗y1 +α2A∗y2) .

Since the equality is valid for all x ∈ H, it implies

A∗(α1y1 +α2y2) = α1A∗y1 +α2A∗y2 .

Second, A∗ is bounded since

‖A∗y‖2 = (A∗y,A∗y) = (AA∗y,y)≤ ‖AA∗y‖‖y‖ ≤ ‖A‖op‖A∗y‖‖y‖ .

Dividing by ‖A∗y‖ (do not forget to consider the case A∗y = 0 separately), we concludethat

‖A∗y‖ ≤ ‖A‖op‖y‖ .

Therefore A∗ is bounded and ‖A∗‖op ≤ ‖A‖op. �

Definition 12.4 The operator A∗ from Theorem 12.3 is called the adjoint operator.

1. Example: If A : Cn → Cn, then A∗ is the Hermitian conjugate of A, i.e. ifA∗ = AT (the complex conjugate of the transposed matrix).

2. Example: Integral operator on L2(0,1)

(Ax)(t) =∫ 1

0K(t,s)x(s)ds

The adjoint operator

(A∗y)(s) =∫ 1

0K(t,s)y(t)dt

66

Page 71: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Indeed, for any x,y ∈ L2(0,1):

(Ax,y) =∫ 1

0

(∫ 1

0K(t,s)x(s)ds

)y(t)dt

=∫ 1

0

∫ 1

0K(t,s)x(s)y(t)dsdt

=∫ 1

0x(s)

(∫ 1

0K(t,s)y(t)dt

)ds

= (x,A∗y) .

Note that we used Fubini’s Theorem to change the order of integration.

3. Example: Shift operators: T ∗l = Tr and T ∗r = Tl . Indeed,

(Trx,y) =∞

∑k=1

xkyk+1 = (x,Tly) .

The following lemma states some elementary properties of adjoint operators.

Lemma 12.5 If A,B : H→H are bounded operators on a Hilbert space H and α,β ∈C, then

1. (αA+βB)∗ = αA∗+ βB∗

2. (AB)∗ = B∗A∗

3. (A∗)∗ = A

4. ‖A∗‖= ‖A‖

5. ‖A∗A‖= ‖AA∗‖= ‖A‖2

Proof: Statements 1—3 follow directly from the definition of an adjoint operator (Ex-ercise). Statement 4 follows from 3 and the estimate of Theorem 12.3: indeed,

‖A∗‖ ≤ ‖A‖= ‖(A∗)∗‖ ≤ ‖A∗‖.

Finally since

‖Ax‖2 = (AX ,Ax) = (x,A∗Ax)≤ ‖x‖‖A∗Ax‖ ≤ ‖A∗A‖‖x‖2

implies ‖A‖2 ≤ ‖AA∗‖ and on the other hand ‖A∗A‖ ≤ ‖A∗‖‖A‖ = ‖A‖2, it followsthat ‖A∗A‖= ‖A‖2. �

67

Page 72: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

12.3 Self-adjoint operatorsDefinition 12.6 A linear operator A is self-adjoint, if A∗ = A.

Lemma 12.7 An operator A ∈ B(H,H) is self-adjoint iff

(x,Ay) = (Ax,y) for all x,y ∈ H.

1. Example: H = Rn, a linear map defined by a symmetric matrix is self-adjoint.

2. Example: H = Cn, a linear map defined by a Hermitian matrix is self-adjoint.

3. Example: A : L2(0,1)→ L2(0,1)

A f (t) =∫ 1

0K(t,s) f (s)ds

with real symmetric K, K(t,s) = K(s, t), is self-adjoint.

Theorem 12.8 Let A be a self-adjoint operator on a Hilbert space H. Then all eigen-values of A are real and the eigenvectors corresponding to distinct eigenvalues areorthogonal.

Proof: Suppose Ax = λx with x 6= 0. Then

λ‖x‖2 = (λx,x) = (Ax,x) = (x,A∗x) = (x,Ax) = (x,λx) = λ‖x‖2 .

Consequently, λ is real.Now if λ1 and λ2 are distinct eigenvalues and Ax1 = λ1x1, Ax2 = λ2x2, then

0 = (Ax1,x2)− (x1,Ax2) = (λ1x1,x2)− (x1,λ2x2) = (λ1−λ2)(x1,x2).

Since λ1−λ2 6= 0, we conclude (x1,x2) = 0. �

Exercise: Let A be a self-adjoint operator on a real space. Show that the complexifi-cation of A is also self-adjoint and has the same eigenvalues as the original operatorA.

Theorem 12.9 If A is a bounded self-adjoint operator then

1. (Ax,x) is real for all x ∈ H

2. ‖A‖op = sup‖x‖=1 |(Ax,x)|

68

Page 73: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Proof: For any x ∈ H(Ax,x) = (x,Ax) = (Ax,x)

which implies (Ax,x) is real. Now let

M = sup‖x‖=1

|(Ax,x)| .

The Cauchy-Schwartz inequality implies

|(Ax,x)| ≤ ‖Ax‖‖x‖ ≤ ‖A‖op‖x‖2 = ‖A‖op

for all x ∈ H such that ‖x‖= 1. Consequently M ≤ ‖A‖op. On the other hand, for anyu,v ∈ H we have

4Re(Au,v) = (A(u+ v),u+ v)− (A(u− v),u− v)≤ M

(‖u+ v‖2 +‖u− v‖2)

= 2M(‖u‖2 +‖v‖2)

using the parallelogram law. If Au 6= 0 let

v =‖u‖‖Au‖

Au

to obtain, since ‖u‖= ‖v‖, that

‖u‖‖Au‖ ≤M‖u‖2 .

Consequently ‖Au‖ ≤M‖u‖ (for all u, including those with Au = 0) and ‖A‖op ≤M.Therefore ‖A‖op = M. �

69

Page 74: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Unbounded operators and their adjoint operators16

The notions of adjoint and self-adjoint operators play an important role in the generaltheory of linear operators. If an operator is not bounded a special care should be takenin the consideration of its domain of definition.

Let D(A) be a linear subspace of a Hilbert space H, and A : D(A)→ H be a linearoperator. If D(A) is dense in H we say that A is densely defined.

Example: Consider the operator A( f ) = d fdt on the set of all continuously differentiable

functions, i.e., D(A) = C1[0,1]⊂ L2(0,1). This operator is densely defined.

Given a densely defined linear operator A on H, its adjoint A∗ is defined as follows:

• D(A∗), the domain of A∗, consists of all vectors x ∈ H such that

y 7→ (x,Ay)

is a continuous linear functional D(A)→K. By continuity and density of D(A),it extends to a unique continuous linear functional on all of H.

• By the Riesz representation theorem, if x∈D(A∗), there is a unique vector z∈Hsuch that

(x,Ay) = (z,y) for all y ∈ D(A).

This vector z is defined to be A∗x.

It can be shown that the dependence of A∗ : D(A∗)→ H is linear.

Note that two properties play a key role in this definition: the density of the domainof A in H, and the uniqueness part of the Riesz representation theorem.

A linear operator is symmetric if (Ax,y) = (x,Ay) for all x,y ∈ D(A). From thisdefinition we see that D(A)⊆D(A∗) and symmetric A coincides with the restriction ofA∗ onto D(A). An operator is self adjoint if A = A∗, i.e., it is symmetric and D(A) =D(A∗). In general, the condition for a linear operator on a Hilbert space to be self-adjoint is stronger than to be symmetric. If an operator is bounded then it is normallyassumed that D(A) = D(A∗) = H and therefore a symmetric operator is self-adjoint.

The Hellinger-Toeplitz theorem states that an everywhere defined symmetric oper-ator on a Hilbert space is bounded.

16Optional topic

70

Page 75: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

13 Introduction to Spectral Theory

13.1 Point spectrumLet H be a complex Hilbert space and A : H → H a linear operator. If Ax = λx forsome x ∈ H, x 6= 0, and λ ∈ C, then λ is an eigenvalue of A and x is an eigenvector.The space

Eλ = {x ∈ H : Ax = λx}is called the eigenspace.

Exercise: Prove the following: If A ∈ B(H,H) and λ is an eigenvalue of A, then Eλ isa closed linear subspace in H. Moreover, Eλ is invariant, i.e., A(Eλ ) = Eλ .

Definition 13.1 The point spectrum of A consists of all eigenvalues of A:

σp(A) = {λ ∈ C : Ax = λx for some x ∈ H, x 6= 0} .

Proposition 13.2 If A : H→ H is bounded and λ is its eigenvalue then

‖λ‖ ≤ ‖A‖op.

Proof: If Ax = λx with x 6= 0, then

‖A‖op = supy6=0

‖Ay‖‖y‖

≥ ‖Ax‖‖x‖

= |λ | . �

Examples:

1. A linear map on an n-dimensional vector space has at least one and at most ndifferent eigenvalues.

2. The right shift Tr : `2→ `2 has no eigenvalues, i.e., the point spectrum is empty.Indeed, suppose Trx = λx, then

(0,x1,x2,x3,x4, . . .) = λ (x1,x2,x3,x4, . . .)

implies 0 = λx1, x1 = λx2, x2 = λx3, . . . If λ 6= 0, we divide by λ and concludex1 = x2 = · · ·= 0. If λ = 0 we also get x = 0. Consequently

σp(Tr) = /0.

3. The point spectrum of the left shift Tl : `2 → `2 is the open unit disk. Indeed,suppose Tlx = λx with λ ∈ C. Then

(x2,x3,x4, . . .) = λ (x1,x2,x3,x4, . . .)

is equivalent to x2 = λx1, x3 = λx2, x4 = λx3, . . . Consequently, x = (xk)∞k=1 with

xk = λ k−1x1 for all k≥ 2. This sequence belongs to `2 if and only if ∑∞k=1 |xk|2 =

∑∞k=1 |x1| |λ |2k converges or equivalently |λ |< 1. Therefore

σp(Tl) = {λ ∈ C : |λ |< 1} .

71

Page 76: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

13.2 Invertible operatorsLet us discuss the concept of an inverse operator.

Definition 13.3 (injective operator) We say that A : U→V is injective if the equationAx = y has a unique solution for every y ∈ Range(A).

Definition 13.4 (bijective operator) We say that A : U→V is bijective if the equationAx = y has exactly one solution for every y ∈V .

Definition 13.5 (inverse operator) We say that A is invertible if it is bijective. Thenthe equation Ax = y has a unique solution for all y ∈V and we define A−1y = x.

1. Exercise: Show that A−1 is a linear operator.

2. Exercise: Show that if A−1 is invertible, then A−1 is also invertible and

(A−1)−1 = A.

3. Exercise: Show that if A and B are two invertible linear operators, then AB isalso invertible and (AB)−1 = B−1A−1.

Proposition 13.6 A linear operator A : U →V is invertible iff

Ker(A) = {0} and Range(A) = V.

Exercise: prove the proposition.

We will use IV : V → V to denote the identity operator on V , i.e., IV (x) = x for allx ∈ V . Moreover, we will skip the subscript V if there is no danger of a mistake. It iseasy to see that if A : U →V is invertible then

AA−1 = IV and A−1A = IU .

Example: The right shift Tr : `2→ `2 has a trivial kernel and

TlTr = I.

but it is not invertible since Range(Tr) 6= `2. (Indeed, any sequence in the range of Trhas a zero on the first place). Consequently, the equality AB = I along does not implythat B = A−1.

Lemma 13.7 If A : U →V and B : V →U are linear operators such that

AB = IV and BA = IU

then A and B are both invertible and B = A−1.

Proof: The equality ABy = y for all y ∈V implies that KerB = {0} and RangeA = V .On the other hand BAx = x for all x ∈ U implies KerA = {0} and RangeB = U .Therefore both A and B satisfy the definition of invertible operator. �

72

Page 77: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

13.3 Resolvent and spectrumLet A : V → V be a linear operator on a vector space V . A complex number λ is aneigenvalue of A if Ax = λx for some x 6= 0. This equation is equivalent to (A−λ I)x = 0.Then we immediately see that A− λ I is not invertible since 0 has infinitely manypreimages: α x with α ∈ C.

If V is finite dimensional the reversed statement is also true: if A−λ I is not invert-ible then λ is an eigenvalue of A (recall the Fredholm alternative from the first yearLinear Algebra). In the infinite dimensional case this is not necessarily true.

Definition 13.8 (resolvent set and spectrum) The resolvent set of a linear operatorA : H→ H is defined by

R(A) = {λ ∈ C : (A−λ I)−1 ∈ B(H,H)} .

The resolvent set consists of regular values. The spectrum is the complement to theresolvent set in C:

σ(A) = C\R(A) .

Note that the definition of the resolvent set assumes existence of the inverse opera-tor (A−λ I)−1 for λ ∈R(A). If λ ∈σp(A) then (A−λ I) is not invertible. Consequentlyany eigenvalue λ ∈ σ(A) and

σp(A)⊆ σ(A) .

The spectrum of A can be larger than the point spectrum.

Example: The point spectrum of the right shift operator Tr is empty but sinceRangeTr 6= `2 it is not invertible and therefore 0 ∈ σ(Tr). So σp(Tr) 6= σ(Tr).

Technical lemmasLemma 13.9 If T ∈ B(H,H) and ‖T‖< 1, then (I−T )−1 ∈ B(H,H). Moreover

(I−T )−1 = I +T +T 2 +T 3 + . . .

and‖(I−T )−1‖ ≤ (1−‖T‖)−1 .

Proof: Consider the sequence Vn = I +T +T 2 + · · ·+T n. Since

‖T nx‖ ≤ ‖T‖‖T n−1x‖

we conclude that ‖T n‖ ≤ ‖T‖n. Consequently for any m > n we have

‖Vm−Vn‖ =∥∥T n+1 +T n+2 + · · ·+T m∥∥

≤ ‖T‖n+1 +‖T‖n+2 + · · ·+‖T‖m

=‖T‖n+1−‖T‖m+1

1−‖T‖≤ ‖T‖

n+1

1−‖T‖.

73

Page 78: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Since ‖T‖ < 1, Vn is a Cauchy sequence in the operator norm. The space B(H,H) iscomplete and there is V ∈ B(H,H) such that Vn→V . Moreover,

‖V‖ ≤ 1+‖T‖+‖T‖2 + · · ·= (1−‖T‖)−1 .

Finally, taking the limit as n→ ∞ in the equalities

Vn(I−T ) = Vn−VnT = I−T n+1 ,

(I−T )Vn = Vn−TVn = I−T n+1

and using that T n+1 → 0 in the operator norm we get V (I − T ) = (I − T )V = I.Lemma 13.7 implies (I−T )−1 = V . �

Lemma 13.10 Let H be a Hilbert space and T,T−1 ∈ B(H,H). If U ∈ B(H,H) and‖U‖< ‖T−1‖−1, then the operator T +U is invertible and

∥∥(T +U)−1∥∥≤ ‖T−1‖1−‖U‖‖T−1‖

.

Proof: Consider the operator V = T−1(T +U) = I +T−1U . Since

‖T−1U‖ ≤ ‖T−1‖‖U‖< 1,

Lemma 13.9 implies that V is invertible and

‖V−1‖ ≤(1−‖T−1‖‖U‖

)−1.

Moreover, since both V and T are invertible

I = V−1V = V−1T−1(T +U)

implies that T +U is invertible with

(T +U)−1 = V−1T−1 .

Finally, ‖(T +U)−1‖ ≤ ‖V−1‖‖T−1‖ implies the desired upper bound for the norm ofthe inverse operator. �

74

Page 79: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Properties of the spectrumLemma 13.11 If A is bounded and λ ∈ σ(A) then λ ∈ σ(A∗).

Proof: If λ ∈ R(A) then A−λ I has a bounded inverse:

(A−λ I)(A−λ I)−1 = I = (A−λ I)−1(A−λ I) .

Taking adjoints we obtain((A−λ I)−1)∗ (A∗− λ I) = I = (A∗− λ I)

((A−λ I)−1)∗ .

Consequently, (A∗− λ I) has a bounded inverse((A−λ I)−1)∗ (an adjoint of a bounded

operator). Therefore λ ∈ R(A) iff λ ∈ R(A∗). Since the spectrum is the complementof the resolvent set we also get λ ∈ σ(A) iff λ ∈ σ(A∗). �

Proposition 13.12 If A is bounded and λ ∈ σ(A) then |λ | ≤ ‖A‖op.

Proof: Take λ ∈ C such that |λ | > ‖A‖op. Since ‖λ−1A‖op < 1 Lemma 13.9 impliesthat I−λ−1A is invertible and the inverse operator is bounded. Consequently, A−λ I =−λ (I−λ−1A) also has a bounded inverse and so λ ∈ R(A). The proposition followsimmediately since σ(A) is the complement of R(A). �

Proposition 13.13 If A is bounded then R(A) is open and σ(A) is closed.

Proof: Let λ ∈ R(A). Then T = (A− λ I) has a bounded inverse. Set U = −δ I.Obviously, ‖U‖= |δ |. If

|δ | ≤ ‖T−1‖−1

Lemma 13.10 implies that T +U = A− (λ + δ )I also has a bounded inverse and soλ +δ ∈ R(A). Consequently R(A) is open and σ(A) = C\R(A) is closed. �

Example: The spectrum of Tl and of Tr are both equal to the closed unit disk on thecomplex plane.

Indeed, σp(Tl) = {λ ∈C : |λ |< 1}. Since σp(Tl)⊂ σ(Tl) and σ(Tl) is closed, weconclude that σ(Tl) includes the closed unit disk. On the other hand, Proposition 13.12implies that σ(Tl) is a subset of the closed disk |λ | ≤ ‖Tl‖op = 1. Therefore

σ(Tl) = {λ ∈ C : |λ | ≤ 1} .

Since Tr = T ∗l and σ(Tl) is invariant, Lemma 13.11 implies σ(Tr) = σ(Tl).

75

Page 80: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

13.4 Compact operatorsDefinition 13.14 Let X be a normed space and Y be a Banach space. Then a linearoperator A : X→Y is compact if the image of any bounded sequence has a convergentsubsequence.

Obviously a compact operator is bounded. Indeed, otherwise there is a sequencexn with ‖xn‖ = 1 such that ‖Axn‖ > n for each n. The sequence Axn does not containa convergent subsequence (it does not even contain a bounded subsequence) and so isnot sequentially compact.

Lemma 13.15 Let X be a normed space and Y be Banach. A linear operator A : X →Y is compact iff the image of the unit sphere is sequentially compact.

Example: Any bounded operator with finite-dimensional range is compact. Indeed, ina finite dimensional space any bounded sequence has a convergent subsequence.

Theorem 13.16 If X is a normed space and Y is a Banach space, then compact linearoperators form a closed linear subspace in B(X ,Y ).

Proof: If K1,K2 are compact operators and α1,α2 ∈K, then α1K1 +α2K2 is also com-pact. Indeed, take any bounded sequence (xn) in H. There is a subsequence xn1 j

such that K1xn1 j converges. This subsequence is also bounded, so it contains a subse-quence xn2 j such that K2xn2 j converges. Obviously K1xn2 j also converges and thereforeα1K1xn2 j + α2K2xn2 j is convergent and consequently α1K1 + α2K2 is compact There-fore the compact operators form a linear subspace.

Let us prove that this subspace is closed. Let Kn be a convergent sequence ofcompact operators: Kn→ K in B(H,H). Take any bounded sequence (xn) in X . SinceK1 is compact, there is a subsequence xn1 j such that K1xn1 j converges. Since xn1 j isbounded and K2 is compact, there is a subsequence xn2 j such that K2xn2 j converges.Repeat this inductively: for each k there is a subsequence xnk j of the original sequencesuch that Klxnk j converges as j→ ∞ for all l ≤ k.

Consider the diagonal sequence y j = xn j j . Obviously (y j)∞j=k is a subsequence of

(xnk j)∞j=1. Consequently Kly j converges as j→ ∞ for every l.

In order to show that K is compact it is sufficient to prove that Ky j is Cauchy:

‖Ky j−Kyl‖ ≤ ‖Ky j−Kny j‖+‖Kny j−Knyl‖+‖Kyl−Knyl‖≤ ‖K−Kn‖

(‖y j‖+‖yl‖

)+‖Kny j−Knyl‖ .

Given ε > 0 choose n sufficiently large to ensure that the first term is less than ε

2 ,then choose N sufficiently large to guarantee that the second term is less than ε

2 forall j, l > N. So Ky j is Cauchy and consequently converges. Therefore K is a compactoperator, and the subspace formed by compact operators is closed. �

76

Page 81: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Proposition 13.17 The integral operator A : L2(a,b)→ L2(a,b) defined by

(A f )(t) =∫ b

aK(t,s) f (s)ds with

∫ b

a

∫ b

a|K(t,s)|2 dsdt < ∞

is compact.

Proof: Let {ϕk : k ∈ N} be an orthonormal basis in L2(a,b). Let κ jk = (Aϕ j,ϕk) andn ∈ N, and define an operator

An f =n

∑k=1

n

∑j=1

κ jk( f ,ϕ j)ϕk .

Obviously dimRange(An) = n and consequently An is compact. In order to completethe proof we need to show that An→ A in the operator norm.17 �

17We will not discuss this proof further.

77

Page 82: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

13.5 Spectral theory for compact self-adjoint operatorsLemma 13.18 Let H be an infinitely dimensional Hilbert space and T : H → H acompact self-adjoint operator. Then at least one of λ± = ±‖T‖op is an eigenvalue ofT .

Proof: Assume T 6= 0 (otherwise the lemma is trivial). Since

‖T‖op = sup‖x‖=1

|(T x,x)|

there is a sequence xn ∈ H such that ‖xn‖= 1 and (T xn,xn)→±‖T‖op = α . Since Tis compact, yn = T xn has a convergent subsequence. Relabel this subsequence as xnand let y = limn→∞ T xn. Then

‖T xn−αxn‖2 = ‖T xn‖2−2α(T xn,xn)+α2 ≤ 2α

2−2α(T xn,xn) .

The right hand side converges to 0 as n→ ∞. Consequently T xn−αxn→ 0. On theother hand T xn→ y and consequently

xn→ x = α−1y .

The operator T is continuous and consequently T x = αx. Finally, since ‖xn‖ = 1 forall n, we have ‖x‖= 1, and consequently α is an eigenvalue. �

Proposition 13.19 Let H be an infinitely dimensional Hilbert space and T : H → Ha compact self-adjoint operator. Then σp(T ) is either a finite set or countable se-quence tending to zero. Moreover, every non-zero eigenvalue corresponds to a finitedimensional eigenspace.

Proof: Suppose there is ε > 0 such that T has infinitely many different eigenvalueswith |λn|> ε . Let xn be corresponding eigenvectors with ‖xn‖= 1. Since the operatoris self-adjoint, this sequence is orthonormal and for any n 6= m

‖T xn−T xm‖2 = ‖λnxn−λmxm‖2

= (λnxn−λmxm,λnxn−λmxm) = |λn|2 + |λm|2 > 2ε .

Consequently, (T xn) does not have a convergent subsequence. This contradicts tothe compactness of T . Consequently, σp(T ) is either finite or a converging to zerosequence.

Now let λ 6= 0 be an eigenvalue and Eλ the corresponding eigenspace. Let A :Eλ → Eλ be the restriction of A onto Eλ . Since Ax = λx for any x ∈ Eλ , the operatorA maps the unit sphere into the sphere of radius λ . Since A is compact, the image ofthe unit sphere is sequentially compact. Therefore the sphere of radius λ is compact.Since Eλ is a Hilbert (and consequently Banach) space itself, Theorem 3.23 impliesthat Eλ is finite dimensional. �

78

Page 83: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Theorem 13.20 (Hilbert-Schmidt theorem) Let H be a Hilbert space and T : H→Hbe a compact self-adjoint operator. Then there is a finite or countable orthonormalsequence (en) of eigenvectors of T with corresponding real eigenvalues (λn) such that

T x = ∑j

λ j (x,e j)e j for all x ∈ H.

Proof: We construct the sequence e j inductively. Let H1 = H and T1 = T : H1 →H1. Lemma 13.18 implies that there is an eigenvector e1 ∈ H1 with ‖e1‖ = 1 and aneigenvalue λ1 ∈ R such that |λ1|= ‖T1‖B(H1,H1).

Then let H2 = {x ∈ H1 : x ⊥ e1 }. If x ∈ H2 then T x ∈ H2. Indeed, since T is selfadjoint

(T x,e1) = (x,Te1) = λ1(x,e1) = 0

and T x ∈ H2. Therefore the restriction of T onto H2 is an operator T2 : H2 → H2.Since H2 is an orthogonal complement, it is closed and so a Hilbert space itself.Lemma 13.18 implies that there is an eigenvector e2 ∈H2 with ‖e2‖= 1 and an eigen-value λ2 ∈R such that |λ2|= ‖T2‖B(H2,H2). Then let H3 = {x∈H2 : x⊥ e2 } and repeatthe procedure as long as Tn is not zero.

Suppose Tn = 0 for some n ∈ N. Then for any x ∈ H let

y = x−n−1

∑j=1

(x,e j)e j .

Applying T to the equality we get:

Ty = T x−n−1

∑j=1

(x,e j)Te j = T x−n−1

∑j=1

(x,e j)λ je j .

Since y⊥ e j for j < n we have y ∈ Hn and consequently Ty = Tny = 0. Therefore

T x =n−1

∑j=1

(x,e j)λ je j

which is the required formula for T .Suppose Tn 6= 0 for all n ∈ N. Then for any x ∈ H and any n consider

yn = x−n−1

∑j=1

(x,e j)e j .

Since yn ⊥ e j for j < n we have y ∈ Hn and

‖x‖2 = ‖yn‖2 +n−1

∑j=1|(x,e j)|2

79

Page 84: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

implies ‖yn‖2 ≤ ‖x‖2. On the other hand∥∥∥∥∥T x−n−1

∑j=1

(x,e j)λ je j

∥∥∥∥∥= ‖Tyn‖ ≤ ‖Tn‖‖yn‖ ≤ |λn|‖x‖

and since λn→ 0 as n→ ∞ we have

T x =∞

∑j=1

(x,e j)λ je j . �

Corollary 13.21 Let H be an infinite dimensional separable Hilbert space and T :H→H a compact self-adjoint operator. Then there is an orthonormal basis E = {e j :j ∈ N} in H such that Te j = λ je j for all j ∈ N and

T x =∞

∑j=1

λ j (x,e j)e j for all x ∈ H.

Exercise: Deduce that operators with finite range are dense among compact self-adjoint operators.

Theorem 13.22 If H is an infinite dimensional Hilbert space and T : H → H is acompact self-adjoint operator, then σ(T ) = σp(T ).

Proposition 13.19 implies that zero is the only possible limit point of σp(T ). There-fore the theorem means that either σ(T ) = σp(T ) or σ(T ) = σp(T )∪{0}. In partic-ular, σ(T ) = σp(T ) if zero is an eigenvalue. Note that if zero is not an eigenvalue,then there is a sequence of eigenvalues which accumulates to zero since H is infinitedimensional.

Proof: According to the Hilbert-Schmidt theorem

T x =∞

∑j=1

λ j(x,e j)e j

where {e j } is an orthonormal basis in H.18 Then x = ∑∞j=1(x,e j)e j and for any µ ∈C

(T −µI)x =∞

∑j=1

(λ j−µ)(x,e j)e j .

18The proof uses a countable basis in H, therefore it assumes that H is separable. The theorem remainvalid for a non-separable H but the proof should be slightly modified. The modification is based on thefollowing observation: Proposition 13.19 implies that (KerT )⊥ has a countable orthonormal basis ofeigenvectors {e j}. Then for any vector x ∈ H write x = PKerT (x) + ∑

∞j=1(x,e j)e j where PKerT is the

orthogonal projection on the kernel of T . Then follow the arguments of the proof (adding µ−1PKerT (y)to the definition of S).

80

Page 85: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Let µ ∈C\σp(T ) which is an open subset of C. Consequently there is ε > 0 such that|µ−λ |> ε for all λ ∈ σp(T )⊂ σp(T ). Consider an operator S defined by

Sy =∞

∑k=1

(y,ek)λk−µ

ek .

Lemma 7.12 implies that the series converges since |λk−µ|> ε and

‖Sy‖2 =∞

∑k=1

∣∣∣∣ (y,ek)λk−µ

∣∣∣∣2 ≤ ε−2

∑j=k|(y,ek)|2 = ε

−2‖y‖2 .

In particular we see that S is bounded with ‖S‖op ≤ ε−1. Moreover S = (T − µI)−1.Indeed,

(T −µI)Sy =∞

∑j=1

(λ j−µ)(Sy,e j)e j =∞

∑j=1

λ j−µ

λ j−µ(y,e j)e j = y .

and

S(T −µI)x =∞

∑j=1

((T −µI)x,e j)λ j−µ

e j =∞

∑j=1

λ j−µ

λ j−µ(x,e j)e j = x .

Then S = (T − µI)−1 and µ ∈ R(T ), and so σ(T ) ⊆ σp(T ). On the other hand,σp(T )⊆ σ(T ). We conclude σ(T ) = σp(T ). �

81

Page 86: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

14 Sturm-Liouville problemsIn this chapter we will study the Sturm-Liouville problem:

− ddx

(p(x)

dudx

)+q(x)u = λu with u(a) = u(b) = 0

where p and q are given function. The values of λ for which the problem has a non-trivial solution are called eigenvalues of the Sturm-Liouville problem and the corre-sponding solutions u are called eigenfunctions. We will prove that the eigenfunctionsform an orthonormal basis in L2(a,b).

We assume that p ∈C1[a,b], q ∈C[a,b] and

p(x) > 0 and q(x)≥ 0 for x ∈ [a,b].

For a function u ∈C2[a,b] we define

L(u) =− ddx

(p(x)

dudx

)+q(x)u .

Obviously L : C2[a,b]→ C0[a,b] is linear. We will see that RangeL = C0[a,b] andKerL = Span{u1,u2 }where u1,u2 are two linear independent solutions of the equationLu = 0. Therefore the operator is not invertible.

We will restrict L onto the space

D(L) :={

u ∈C2[a,b] : u(a) = u(b) = 0}

and show that L : D(L)→C0[a,b] is invertible. We will prove that L−1 is a restrictionon RangeL of a compact self-adjoint operator A : L2(a,b)→ L2(a,b). The Sturm-Liouville theorem states that the eigenfunction of A form an orthonormal basis inL2(a,b). Moreover, we will see that A and L have the same eigenfunctions.

Lemma 14.1 If both u1 and u2 are solutions of the equation Lu = 0, then

Wp(u1,u2)(x) = p(x)(u′1(x)u2(x)−u1(x)u′2(x))

is constant. Moreover, the solutions u1 and u2 are linearly independent iff

Wp(u1,u2) 6= 0.

Proof: The equation Lu = 0 implies pu′′ = −p′u′+ qu. Then differentiate W withrespect to x:

W ′p = p′(u′1u2−u1u′2)+ p(u′′1u2−u1u′′2)

= p′(u′1u2−u1u′2)+((−p′u′1 +qu1)u2− (−p′u′2 +qu2)u1) = 0 .

82

Page 87: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Therefore Wp is constant.Finally, since p 6= 0, Wp = 0 implies u′1(x)u2(x)−u1(x)u′2(x) = 0 for all x. Conse-

quently at all points where u1 and u2 do not vanish

u′1u1

=u′2u2

⇔ d ln |u1|dx

=d ln |u2|

dx,

which implies that u1 = Cu2 for some constant C, i.e., if Wp = 0 then u1 and u2 are notlinearly independent. In the reverse direction the statement is straightforward. �

Lemma 14.2 The equation Lu = 0 has two linearly independent solutions, u1,u2 ∈C2[a,b], such that u1(a) = u2(b) = 0.

Proof: Let u1,u2 be solutions of the Cauchy problems

Lu1 = 0 u1(a) = 0, u′1(a) = 1,

Lu2 = 0 u2(b) = 0, u′2(b) = 1 .

According to the theory of linear ordinary differential equations u1 and u2 exist, belongto C2[a,b] and are unique.

Moreover, u1 and u2 are linearly independent. Indeed, suppose Lu = 0 for someu ∈C2[a,b] such that u(a) = u(b) = 0. Then

0 = (Lu,u) =∫ b

a

(−(pu′)′u+qu2)dx

= p(x)u′(x)u(x)∣∣∣ba+∫ b

a

(p(u′)2 +qu2)dx (integration by parts)

=∫ b

a

(p(u′)2 +qu2)dx

Since p > 0 on [a,b], we conclude that u′ ≡ 0. Then u(a) = u(b) = 0 implies u(x) = 0for all x ∈ [a,b].

Consequently since u2(b) = 0 and u2 is not identically zero, u2(a) 6= 0 and so

Wp(u1,u2) = p(a)(u′1(a)u2(a)−u1(a)u′2(a)) = p(a)u′1(a)u2(a) 6= 0.

Therefore u1,u2 are linearly independent by Lemma 14.1. �

Proposition 14.3 If u1 and u2 are linearly independent solutions of the equation Lu =0 such that u1(a) = u2(b) = 0 and

G(x,y) =1

Wp(u1,u2)

{u1(x)u2(y), a≤ x < y≤ b,u1(y)u2(x), a≤ y≤ x≤ b,

then for any f ∈C0[a,b] the function

u(x) =∫ b

aG(x,y) f (y)dy

belongs to C2[a,b], satisfies the equation Lu = f and the boundary conditions u(a) =u(b) = 0.

83

Page 88: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Proof: The statement is proved by a direct substitution of

u(x) =u2(x)

Wp(u1,u2)

∫ x

au1(y) f (y)dy+

u1(x)Wp(u1,u2)

∫ b

xu2(y) f (y)dy

into the differential equation. Moreover, since u1(a) = u2(b) = 0, we get u(a) = u(b) =0. �

Lemma 14.4 The operator A : L2(a,b)→ L2(a,b) defined by

(A f )(x) =∫ b

aG(x,y) f (y)dy .

is compact and self-adjoint. Moreover, RangeA is dense in L2(a,b), KerA = {0} andall its eigenfunctions, Au = µu, belong to C2[a,b] and satisfy u(a) = u(b) = 0.

Proof: The operator A is compact by Proposition 13.17. Moreover, G is symmetricand so A is self-adjoint. Proposition 14.3 implies the range of A contains all functionsfrom C2[a,b] such that u(a) = u(b) = 0. This set is dense in L2(a,b).

Now suppose Au = 0 for some u ∈ L2[a,b]. Then for any v ∈ L2

0 = (Au,v) = (u,Av) ,

which implies u = 0 because u is orthogonal to a dense set.Finally, let u be an eigenfunction of A, i.e., Au = µu. Since µ 6= 0 we can write

u = µ−1Au = µ−2A2u. A priori u ∈ L2(a,b), then Au is continuous and A2u is C2.Therefore, the eigenvectors of A are smooth. �

Proposition 14.5 The operator L : D(L)→C0[a,b] has a bounded inverse (in the op-erator norm induced by the L2 norm on both spaces).

Proof: Lemma 14.2 implies that KerL = {0}. Consequently, L is injective. Proposi-tion 14.3 implies that RangeL = C0[a,b] and the inverse operator is defined by

(L−1 f )(x) =∫ b

aG(x,y) f (y)dy .

Lemma 14.4 states that this operator is bounded. In other words, L−1 coincides withthe restriction of A onto RangeL. �

Theorem 14.6 The eigenfunctions of the Sturm-Liouville problem form an orthonor-mal basis in L2(a,b).

84

Page 89: Functional Analysis I - Files.Warwick · Definition 1.9 If a linear map L : V !W is a bijection, then L is called a linear isomorphism. We say that V and W are linearly isomorphic

Proof: Since A : L2(a,b)→ L2(a,b) is compact and self adjoint, Theorem 13.21 im-plies that its eigenvectors form an orthonormal basis in L2(a,b). If u is an eigenfunc-tion of A, then Au = µu, µ 6= 0 and u ∈C2[a,b]. Consequently Lu = λu = µ−1u, i.e.,u is also an eigenvector of L which corresponds to the eigenvalue λ = µ−1. �

Example: An application for Fourier series. Consider the Strum-Liouville problem

−d2udx2 = λu, u(0) = u(1) = 0 .

It corresponds to the choice p = 1, q = 0. Theorem 14.6 implies that the normalisedeigenfunctions of this problem form an orthonormal basis in L2(0,1). In this examplethe eigenfunctions are easy to find:{

1√2

sinkπx : k ∈ N}

.

Consequently any function f ∈ L2(0,1) can be written in the form

f (x) =∞

∑k=1

αk sinkπx

where

αk =12

∫ 1

0f (x)sinkπxdx .

The series converges in the L2 norm.

85