
Analysis of Functions

Lecturer: Clément Mouhot

Scribe: Paul Minter

Lent Term 2017

These notes are produced entirely from the course I took, and my subsequent thoughts. They are not necessarily an accurate representation of what was presented, and may have in places been substantially edited. Please send any corrections to [email protected]

Recommended books: Brezis, Functional analysis, Sobolev spaces and partial differential equations; Lieb & Loss, Analysis.


Contents

0. Motivation and Introduction
1. Integration
  1.1. Recap from Measure Theory
  1.2. Integrability and Convergence Theorems
  1.3. Lebesgue Spaces
  1.4. How Regular are Measurable and Integrable Functions?
2. Vector Spaces of Functions: Weak Topologies, Reflexivity and Separability
  2.1. Normed Vector Spaces Recap
  2.2. Dual Spaces and Weak Topologies
  2.3. Reflexivity and Weak Compactness
  2.4. Uniform Convexity and Reflexivity
  2.5. Separability and Metrisability
  2.6. Concrete Function Spaces: $L^p$ and $C^0$
  2.7. Applications of the Baire Category Theorem to Weak Topologies
  2.8. Results on Compactness
3. Fourier Decomposition of Functions
  3.1. The Fourier Transform
  3.2. Fourier Bases
  3.3. Solving PDEs with Fourier Theory
4. Generalised Derivatives and Sobolev Spaces
  4.1. Sobolev Spaces
  4.2. Regularity of Sobolev Functions: The Sobolev Embeddings
  4.3. Sobolev Spaces on Open Sets and Distributions
  4.4. The Dirichlet Problem for Poisson’s Equation


0. MOTIVATION AND INTRODUCTION

In this course our focus is on developing the necessary tools for tackling the existence and uniqueness of solutions to various PDE problems. In the end, we will prove a uniqueness and existence result for solutions of the Dirichlet problem for Poisson’s equation, although similar techniques can be used for more general PDEs (this will be discussed in Part III Analysis of PDEs).

To do this we will require the notion of a weak solution to a PDE. This involves generalising the concept of a derivative to a so-called weak derivative, which is essentially one for which the integration by parts formula holds in an appropriate sense. The reason for generalising in this way is that the new spaces of functions we are working with, known as Sobolev spaces, are complete and have good compactness properties, unlike the usual $C^k$ spaces. Thus the method is broken down into two parts:

(i) First prove the existence of a solution in an appropriate Sobolev space.

(ii) Then prove that this solution was actually differentiable in the classical sense that we wanted to begin with.

The latter step is about establishing appropriate regularity of the solution. To do this we will prove that the Sobolev spaces interact well with the Hölder spaces $C^{k,\alpha}$ through a group of embeddings known as the Sobolev embeddings.

For the existence results we tend to need to use compactness properties of the spaces we are working with to extract a limit. Recall the fact (proved in the Linear Analysis course) that the closed unit ball is compact in the norm topology if and only if the space is finite dimensional. This is a problem when we are working with function spaces such as $L^p$ or Sobolev spaces, as they are not finite dimensional. To get around this we can look at weaker topologies and hope to get better compactness results for those topologies which are sufficient for our needs. We will look at this “hunt for compactness” problem more generally and study so-called weak topologies and weak-star topologies on normed vector spaces.

So the plan for this course is as follows. First in §1 we will recall and establish the various facts about integration and the $L^p$ spaces which most of the course is built up from. In §2 we discuss the “hunt for compactness” and talk about weak topologies in general. In §3 we take a brief detour into Fourier analysis and talk about Fourier decomposition, and finally in §4 we will discuss Sobolev spaces and apply all of this to the Dirichlet problem for Poisson’s equation.


1. INTEGRATION

Measure theory allows us to measure more and more complicated sets, and Lebesgue integration helps us overcome the theoretical deficiency of Riemann integration theory. For instance, consider the sequence $f_n : [0,1] \to \mathbb{R}$ defined by

$$f_n(x) = \mathbf{1}_{\{q_1, \dots, q_n\}}(x)$$

for $\{q_1, q_2, \dots\}$ some enumeration of the rationals in $\mathbb{Q} \cap [0,1]$. Then $f_n \to f$ pointwise, where

$$f = \mathbf{1}_{\mathbb{Q} \cap [0,1]}.$$

Now each $f_n$ is Riemann integrable, whilst the pointwise limit $f$ is not Riemann integrable. This tells us that there is no analogue of the dominated convergence theorem for Riemann integration, and this is essentially what sets Lebesgue integration and Riemann integration apart, as the dominated convergence theorem is what makes the Lebesgue theory so powerful.

Remark: It turns out that there is a form of the dominated convergence theorem for Riemann integration, but you need to also assume that the pointwise limit is Riemann integrable itself (see Question 18 on Example Sheet 1).
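As a small illustration (not part of the lectures), the sequence of indicator functions above can be sketched in Python. The particular enumeration of the rationals used here, by increasing denominator, is an assumption made only for this example; any enumeration would do. At a fixed rational point the values are 0 until that point is enumerated and 1 afterwards, which is the pointwise convergence to the indicator of the rationals.

```python
from fractions import Fraction

def enumerate_rationals(count):
    """First `count` distinct rationals in [0, 1], listed by increasing denominator."""
    found, seen = [], set()
    q = 1
    while len(found) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen and len(found) < count:
                seen.add(r)
                found.append(r)
        q += 1
    return found

def f_n(n, x):
    """Indicator of {q_1, ..., q_n} at the point x (pass x as a Fraction)."""
    return 1 if x in enumerate_rationals(n) else 0

x = Fraction(3, 7)
values = [f_n(n, x) for n in (1, 5, 10, 20, 40)]
print(values)  # [0, 0, 0, 1, 1]
```

Each f_n is non-zero at only n points, so its Riemann integral is 0, while the limit is non-zero on a dense set and zero on another dense set.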

1.1. Recap from Measure Theory.

Let $X$ be a set and let $\mathcal{P}(X)$ denote its power set.

Definition 1.1. A subset $\mathcal{A} \subset \mathcal{P}(X)$ is an algebra if:

(i) $X \in \mathcal{A}$.

(ii) It is closed under complements, i.e. $A \in \mathcal{A} \Rightarrow X \setminus A \in \mathcal{A}$.

(iii) It is closed under finite unions, i.e. if $A_1, \dots, A_n \in \mathcal{A}$, then $\bigcup_{i=1}^n A_i \in \mathcal{A}$.

Thus an algebra is a subset of $\mathcal{P}(X)$ which is closed under finite set operations.

Definition 1.2. A subset $\mathcal{A} \subset \mathcal{P}(X)$ is a σ-algebra if it is an algebra which is closed under countable unions, i.e. a subset of $\mathcal{P}(X)$ which is closed under countable set operations.

Contrast the definition of a σ-algebra with the definition of a topology: $\mathcal{T} \subset \mathcal{P}(X)$ is a topology if: (i) $\emptyset, X \in \mathcal{T}$, (ii) $\mathcal{T}$ is closed under arbitrary unions, (iii) $\mathcal{T}$ is closed under finite intersections. Recall that elements of $\mathcal{T}$ are then called the open sets of the topology $\mathcal{T}$.

Thus the distinction between being closed under finite, countable, or arbitrary unions/intersections is important, and gives rise to very different objects.


Note: Using the fact that $A \cap B = (A^c \cup B^c)^c$, properties (ii) and (iii) of an algebra show that an algebra is also closed under finite intersections. Similarly this shows that a σ-algebra is automatically closed under countable intersections.

Note: If we take two σ-algebras $\sigma_1, \sigma_2$, then it is easily verified that $\sigma_1 \cap \sigma_2$ is also a σ-algebra (and the same check works for an arbitrary intersection of σ-algebras). This gives a notion of the “smallest σ-algebra containing a given set $M \subset \mathcal{P}(X)$”. Indeed, given a collection of subsets $M \subset \mathcal{P}(X)$, the smallest σ-algebra containing $M$ is:

$$\sigma(M) := \bigcap_{\sigma \supset M} \sigma$$

where the intersection is over all σ-algebras $\sigma$ which contain $M$, i.e. $M \subset \sigma$. Note that, as the intersection of σ-algebras is still a σ-algebra, this is indeed a σ-algebra, and moreover it is well-defined since the intersection is over a non-empty family, as $M \subset \mathcal{P}(X)$ and $\mathcal{P}(X)$ is always a σ-algebra. We often refer to this $\sigma(M)$ as the σ-algebra generated by $M$.
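On a finite set the generated σ-algebra can be computed explicitly, which may help make the definition concrete. The sketch below is only an illustration under that finiteness assumption: instead of intersecting all σ-algebras containing M (the definition above), it builds the same family "from below" by repeatedly closing under complements and pairwise unions, which on a finite set is equivalent. The function name `generated_sigma_algebra` is hypothetical.

```python
from itertools import combinations

def generated_sigma_algebra(X, M):
    """Smallest family containing M and X that is closed under complements
    and unions. X must be finite, so countable unions reduce to finite ones."""
    X = frozenset(X)
    family = {frozenset(), X} | {frozenset(A) for A in M}
    changed = True
    while changed:
        changed = False
        current = list(family)
        for A in current:                      # close under complements
            if X - A not in family:
                family.add(X - A)
                changed = True
        for A, B in combinations(current, 2):  # close under pairwise unions
            if A | B not in family:
                family.add(A | B)
                changed = True
    return family

X = {1, 2, 3, 4}
sigma = generated_sigma_algebra(X, [{1}, {1, 2}])
print(sorted(sorted(A) for A in sigma))
```

Here the generating sets separate the points 1 and 2 from everything else but never separate 3 from 4, so the result consists of all unions of the blocks {1}, {2}, {3, 4}.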

Definition 1.3. Given a topology $\mathcal{T}$ on $X$, the Borel σ-algebra is the σ-algebra generated by the topology $\mathcal{T}$, i.e. $\sigma(\mathcal{T})$. We usually denote this σ-algebra by $\mathcal{B}(X)$.

Elements of $\mathcal{B}(X)$ are called Borel sets.

A key point in measure theory is actually knowing what it means to “measure” something. As usual we can only consistently define a measure on certain subsets of $X$, and those subsets are exactly those in a σ-algebra.

Definition 1.4. A measure on a measurable space $(X, \mathcal{A})$ is a function $\mu : \mathcal{A} \to [0, +\infty]$ such that

(i) $\mu(\emptyset) = 0$.

(ii) $\mu$ is σ-additive (i.e. countably additive), i.e. for $(A_i)_{i=1}^\infty \subset \mathcal{A}$ a countable collection of pairwise-disjoint subsets we have

$$\mu\left(\bigcup_{i=1}^\infty A_i\right) = \sum_{i=1}^\infty \mu(A_i).$$

We then call the triple $(X, \mathcal{A}, \mu)$ a measure space.

Definition 1.5. We say a measure space $(X, \mathcal{A}, \mu)$ is complete if whenever we have $B \subset A$ with $A \in \mathcal{A}$ and $\mu(A) = 0$, then $B \in \mathcal{A}$ and $\mu(B) = 0$,

i.e. all subsets of a set of measure zero are also measurable with measure 0.

For $\mathbb{R}^n$ with its usual (norm-) topology we have the Borel σ-algebra $\mathcal{B}(\mathbb{R}^n)$, which turns out to just be the σ-algebra generated by the open balls. We can then “complete” this σ-algebra, which forms the so-called Lebesgue σ-algebra, denoted $\mathcal{L}(\mathbb{R}^n)$. An interesting question then is how much bigger $\mathcal{L}(\mathbb{R}^n)$ is than $\mathcal{B}(\mathbb{R}^n)$; you may wish to think about calculating the cardinalities of these.


Definition 1.6. The completion of a σ-algebra $\mathcal{A}$ is the smallest σ-algebra which contains $\mathcal{A}$ as well as all subsets of sets of measure 0 in $\mathcal{A}$.

It is natural to wonder if there is a measure on $\mathbb{R}^n$ which assigns hypercubes their “usual” volume. This turns out to be possible in a unique way, and the resulting measure is known as the Lebesgue measure, as the next theorem says:

Theorem 1.1 (Existence of Lebesgue Measure). There is a unique measure $\mu$ on $(\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n))$, called the Lebesgue measure, such that

$$\mu\left(\prod_{i=1}^n [a_i, b_i]\right) = \prod_{i=1}^n (b_i - a_i).$$

Proof. None given, see the Probability and Measure course. □

Definition 1.7. We say that a measure $\mu$ is σ-finite if ∃ a countable increasing sequence of subsets $(A_i)_i \subset \mathcal{A}$ such that $\bigcup_i A_i = X$ and $\mu(A_i) < \infty$ […]


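Proof. Note first that it is enough to prove that for all open sets $U$ in the topology of $Y$ we have $f^{-1}(U) \in \mathcal{A}$, since the topology generates $\mathcal{B}(Y)$ (see footnote (i)).

So let $U$ be an open set in $Y$. As $Y$ has a metric $d_Y$ we can define approximations to $U$ via:

$$U_n := \{x \in Y : d_Y(x, Y \setminus U) > 1/n\} \quad \text{and} \quad V_n := \{x \in Y : d_Y(x, Y \setminus U) \ge 1/n\}$$

i.e. these are the $x \in Y$ which are further than $1/n$ away from $Y \setminus U$. Thus they must lie inside $U$ and be $1/n$ from the “boundary” of $U$ (this is almost like the points in $U$ such that $B_{1/n}(x)$ or $\overline{B}_{1/n}(x)$ lies in $U$).

Then clearly $U_n \subset V_n$ for all $n$ and $U_n, V_n \subset U$ for all $n$. Moreover $V_n \subset U_{n+1}$ for all $n$, i.e.

$$U_1 \subset V_1 \subset U_2 \subset V_2 \subset \cdots \subset U.$$

Moreover if $x \in U$, since $U$ is open $x$ must be some positive distance away from the outside of $U$, and thus $x \in U_N$ for some $N$. Hence

$$U = \bigcup_{n \ge 1} U_n = \bigcup_{n \ge 1} V_n.$$

Now,

$$f^{-1}(U) = f^{-1}\left(\bigcup_{n \ge 1} U_n\right) = \bigcup_{n \ge 1} f^{-1}(U_n).$$

But note that from the pointwise convergence we have:

$$\bigcup_{n \ge 1} f^{-1}(U_n) \subset \bigcup_{m \ge 1} \bigcup_{l \ge 1} \bigcap_{k \ge l} f_k^{-1}(U_m).$$

Indeed, if $x \in \bigcup_{n \ge 1} f^{-1}(U_n)$, i.e. $f(x) \in U_n$ for some $n$, then $B_{1/n}(f(x)) \subset U$. Thus as $f_k(x) \to f(x)$ we can find $l$ such that for all $k \ge l$ we have $f_k(x) \in B_{1/(2n)}(f(x))$. Then we have $B_{1/(2n)}(f_k(x)) \subset B_{1/n}(f(x))$ for all $k \ge l$, and so $f_k(x) \in U_{2n}$ for all $k \ge l$, i.e.

$$x \in \bigcup_{l \ge 1} \bigcap_{k \ge l} f_k^{-1}(U_{2n}).$$

Thus (taking $m = 2n$) for any $x \in \bigcup_{n \ge 1} f^{-1}(U_n)$ we have

$$x \in \bigcup_{m \ge 1} \bigcup_{l \ge 1} \bigcap_{k \ge l} f_k^{-1}(U_m)$$

and so we get the desired inclusion. Hence we have shown

$$f^{-1}(U) \subset \bigcup_{m \ge 1} \bigcup_{l \ge 1} \bigcap_{k \ge l} f_k^{-1}(U_m).$$

Then since $f_k^{-1}(U_m) \subset f_k^{-1}(V_m)$ for all $m$ we have

$$\bigcup_{l \ge 1} \bigcap_{k \ge l} f_k^{-1}(U_m) \subset \bigcup_{l \ge 1} \bigcap_{k \ge l} f_k^{-1}(V_m)$$

and since the $V_m$ are closed we have

$$\bigcup_{l \ge 1} \bigcap_{k \ge l} f_k^{-1}(V_m) \subset f^{-1}(V_m).$$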

(i) This is essentially because we have results such as $f^{-1}(U \cup V) = f^{-1}(U) \cup f^{-1}(V)$ and $f^{-1}(Y \setminus U) = X \setminus f^{-1}(U)$, and that $\mathcal{A}$ is a σ-algebra. So if we can prove the individual sets $f^{-1}(U)$ are in $\mathcal{A}$, then by the properties of the σ-algebra we will automatically get that all sets in $\mathcal{B}(Y)$ are pulled back to sets in $\mathcal{A}$, as $\mathcal{B}(Y)$ is generated by countable unions, complements, etc. of the open sets.


Indeed, if $x \in \bigcup_{l \ge 1} \bigcap_{k \ge l} f_k^{-1}(V_m)$, then ∃ $l \ge 1$ with $f_k(x) \in V_m$ for all $k \ge l$. But as $f_k(x) \to f(x)$ and $V_m$ is closed, this implies $f(x) \in V_m$, i.e. $x \in f^{-1}(V_m)$. Hence we have this inclusion.

So combining we have

$$f^{-1}(U) \subset \bigcup_{m \ge 1} \bigcup_{l \ge 1} \bigcap_{k \ge l} f_k^{-1}(U_m) \subset \bigcup_{m \ge 1} f^{-1}(V_m) = f^{-1}(U)$$

where the last set equality we noted above. Thus all the previous set inclusions must actually be set equalities, and so

$$f^{-1}(U) = \bigcup_{m \ge 1} \bigcup_{l \ge 1} \bigcap_{k \ge l} f_k^{-1}(U_m).$$

But as the $f_k$ are measurable, the RHS of this is a measurable set since it is a countable union and intersection of measurable sets (note each $U_m$ is open, being the preimage of $(1/m, \infty)$ under the continuous map $x \mapsto d_Y(x, Y \setminus U)$), and thus as $\mathcal{A}$ is a σ-algebra the RHS is also in $\mathcal{A}$. Hence $f^{-1}(U) \in \mathcal{A}$, and so $f$ is measurable and we are done.

□

Remark: In the case $(Y, \mathcal{B}(Y)) = (\mathbb{R}, \mathcal{B}(\mathbb{R}))$ the above proof can be adapted, as then we can write $f = \limsup_k f_k$ and use properties of $\limsup_k$ (as the lim sup of measurable functions is measurable).

Things to think about: What happens in Proposition 1.1 when $Y$ is merely a topological space? Or what happens if instead of pointwise convergence of the $f_k$ we only have convergence almost everywhere?

1.2. Integrability and Convergence Theorems.

Recall that Riemann integration theory is based on subdividing the input space with the order of $\mathbb{R}$, whereas the output space can be a general Banach space. By contrast Lebesgue integration theory is based on subdividing the output space with the order of $\mathbb{R}$, whereas the input space can be a general measure space. For Riemann integration we go from discrete to continuous analysis via Riemann sums, whilst for Lebesgue integration we use simple functions. Here we shall look at the Lebesgue approach, and we shall look at real or complex valued functions over a measure space $(X, \mathcal{A}, \mu)$ ($\mathbb{R}$ or $\mathbb{C}$ will always have the Borel σ-algebra with the usual topology).

Definition 1.9. A function $f : X \to [0, \infty)$ is a simple function if it is measurable and only takes on a finite number of values.

Example 1.2. Consider the characteristic/indicator function for a set $A$:

$$\mathbf{1}_A(x) := \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{otherwise.} \end{cases}$$

Then $\mathbf{1}_A$ is a measurable function if and only if $A$ is measurable. Hence we see that $\mathbf{1}_A$ is a simple function exactly when $A$ is a measurable set, giving our first class of simple functions (namely characteristic functions of measurable sets).


Note: Positive linear combinations of such indicator functions give simple functions also, and moreover we can write any simple function $s$ as $s = \sum_{i=1}^n \alpha_i \mathbf{1}_{A_i}$, for $n \in \mathbb{N}$, some measurable sets $A_i$, and some $\alpha_i \ge 0$ [Exercise to check].

Now we prove a result which will help us define and prove results about the integral of a positive measurable function.

Proposition 1.2. Let $f : X \to [0, \infty]$ be measurable. Then ∃ an increasing sequence $(s_n)_n$ of simple functions converging pointwise to $f$.

Proof. Note that if $f(x) = +\infty$ for some $x$, then pointwise convergence means that $s_n(x) \to \infty$ as $n \to \infty$.

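So define:

$$s_n(x) = \begin{cases} \frac{i-1}{2^n} & \text{if } f(x) \in \left[\frac{i-1}{2^n}, \frac{i}{2^n}\right), \text{ for } i = 1, 2, \dots, n2^n \\ n & \text{elsewhere} \end{cases}$$

i.e.

$$s_n = \sum_{i=1}^{n2^n} \frac{i-1}{2^n} \mathbf{1}_{f^{-1}\left(\left[\frac{i-1}{2^n}, \frac{i}{2^n}\right)\right)} + n \mathbf{1}_{f^{-1}([n, \infty])}.$$

Clearly $s_n$ is a simple function as it is a positive finite sum of indicator functions of measurable sets (as $f$ is measurable). Moreover if $f(x) < \infty$, then for $n > f(x)$ we have $|s_n(x) - f(x)| \le \frac{1}{2^n}$ and so $s_n(x) \to f(x)$. If $f(x) = +\infty$ then $s_n(x) = n$ for all $n$ and so $s_n(x) \to \infty$.

Finally the $s_n$ are clearly increasing since the partition is finer each time and so $s_n(x)$ can never decrease. So these $(s_n)_n$ work.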

□
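The dyadic construction in this proof is easy to experiment with numerically. The following Python sketch is illustrative only: it works with the value y = f(x) at a single point rather than with a function on a measure space, and the name `s_n` simply mirrors the notation above. It rounds y down to the nearest multiple of 2^{-n} and caps the result at n.

```python
import math

def s_n(n, y):
    """Value of the n-th dyadic approximation at a point where f takes the value y.

    Rounds y down to a multiple of 2**-n when y < n, and returns n otherwise
    (this also covers y = math.inf).
    """
    if y >= n:
        return float(n)
    return math.floor(y * 2**n) / 2**n

samples = [0.0, 0.3, 1.0, 2.75, math.pi, 10.2, math.inf]
for y in samples:
    print(y, [s_n(n, y) for n in (1, 2, 3, 4)])
```

For a finite value y the approximations increase with n and are within 2^{-n} of y as soon as n > y; for y = ∞ they equal n and so diverge, matching the two cases in the proof.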

We can easily define the integral of a simple function to “be what it should be”, and then use Proposition 1.2 to extend this to more general measurable functions.

Definition 1.10. For a simple function $s$ on a measure space $(X, \mathcal{A}, \mu)$, writing $s = \sum_{i=1}^n \alpha_i \mathbf{1}_{A_i}$, we define the integral of $s$ over a measurable set $E$ by:

$$\int_E s \, d\mu := \sum_{i=1}^n \alpha_i \mu(A_i \cap E)$$

i.e. the “total size” of $s$ over $E$.

Remark: For a given simple function $s$, the map $\tilde{\mu} : \mathcal{A} \to [0, \infty]$ sending $E \mapsto \int_E s \, d\mu$ defines a measure on $(X, \mathcal{A})$. Countable additivity is due to that of $\mu$ since the sum is finite. Moreover if $s \equiv 1 = \mathbf{1}_X$, then $\tilde{\mu} \equiv \mu$.

We can now use Proposition 1.2 to define the integral of a positive measurable function.


Definition 1.11. Let $f : X \to [0, \infty]$ be a positive measurable function. Then we define the integral of $f$ over a measurable set $E$ by:

$$\int_E f \, d\mu := \sup\left\{\int_E s \, d\mu : s \le f \text{ is a simple function}\right\}.$$

Lemma 1.1 (Chebyshev’s Inequality). Let $f : X \to [0, \infty]$ be a positive measurable function. Then for any $\alpha > 0$ we have

$$\mu(\{x \in X : f(x) \ge \alpha\}) \le \frac{1}{\alpha} \int_X f \, d\mu.$$

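Proof. Set $A = \{x \in X : f(x) \ge \alpha\}$ and let $s = \mathbf{1}_A$. Then for $x \in A$ we have $f(x)/\alpha \ge 1$, and so as $f(x) \ge 0$ for all $x$ this shows that $\frac{f(x)}{\alpha} \ge s(x)$ for all $x$. Thus by definition of the integral being the supremum:

$$\mu(A) = \int_X s \, d\mu \le \int_X \frac{f}{\alpha} \, d\mu = \frac{1}{\alpha} \int_X f \, d\mu$$

as required. □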
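To see the integral of a simple function and Chebyshev’s inequality in a concrete case, here is a minimal Python sketch on a finite measure space. Everything in it is an assumption made for illustration: the set X = {0, 1, 2, 3} with every subset measurable, the point masses in `mu`, and the function f(x) = x². On such a space every non-negative function is simple, so its integral is just the finite sum from Definition 1.10.

```python
def integrate(f, mu, E=None):
    """Integral of f over E for a measure given by point masses mu[x] on a finite set."""
    points = mu.keys() if E is None else E
    return sum(f(x) * mu[x] for x in points)

mu = {0: 0.5, 1: 1.0, 2: 0.25, 3: 2.0}   # hypothetical point masses on X = {0, 1, 2, 3}

def f(x):
    return x * x                          # a non-negative function on X

def chebyshev_sides(alpha):
    """Return (mu({f >= alpha}), (1/alpha) * integral of f) for comparison."""
    level_set = [x for x in mu if f(x) >= alpha]
    lhs = sum(mu[x] for x in level_set)
    rhs = integrate(f, mu) / alpha
    return lhs, rhs

print(integrate(f, mu))        # 20.0
print(chebyshev_sides(4))      # (2.25, 5.0)
```

The inequality is far from sharp here; equality would require f to take only the values 0 and α.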

Remark: From this we see that if $\int_X f \, d\mu < \infty$ […]


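So let $E = X$. Note that as $f_{k+1} \ge f_k$ for all $k$, the sequence $\left(\int_X f_k \, d\mu\right)_k \subset [0, \infty]$ is monotone increasing, and thus it converges to a limit $\alpha \in [0, \infty]$.

But then since $f_k \le f$ always, we have $\int_X f_k \, d\mu \le \int_X f \, d\mu$ for all $k$, and so $\alpha \le \int_X f \, d\mu$. So we only need to prove the other inequality.

So let $s$ be any simple function with $s \le f$. Let $c \in (0, 1)$. Set:

$$E_k := \{x \in X : f_k(x) \ge c s(x)\}.$$

Since the $f_k$ are increasing we clearly have $E_k \subset E_{k+1}$ for all $k$. Also any $x \in X$ lies in $E_k$ for $k$ large. If $s(x) = 0$ this is trivial as $f_k \ge 0$, and if $f(x) = \infty$ it follows from $f_k(x) \to \infty$. Otherwise we can find $\varepsilon > 0$ with $f(x) - c s(x) = |f(x) - c s(x)| > \varepsilon$ (as $0 < s(x) \le f(x)$ and $c < 1$), and hence as $f_k(x) \to f(x)$, if we choose $k$ sufficiently large so that $f(x) - f_k(x) < \varepsilon/2$, then

$$f_k(x) = f(x) - (f(x) - f_k(x)) > f(x) - \varepsilon/2 > c s(x) + \varepsilon/2 > c s(x)$$

and thus $x \in E_k$. Hence this shows

$$X = \bigcup_{k \ge 1} E_k.$$

Now note that $E_k$ is measurable for all $k$ since $f_k$ and $s$ are measurable. From a previous remark we know that the map

$$\tilde{\mu}(A) := \int_A s \, d\mu$$

is a measure on $(X, \mathcal{A})$, and so by countable additivity it is continuous from below, and so

$$\tilde{\mu}\left(\bigcup_{k \ge 1} E_k\right) = \lim_{k \to \infty} \tilde{\mu}(E_k)$$

i.e.

$$\int_X s \, d\mu = \tilde{\mu}(X) = \tilde{\mu}\left(\bigcup_{k \ge 1} E_k\right) = \lim_{k \to \infty} \tilde{\mu}(E_k) = \lim_{k \to \infty} \int_{E_k} s \, d\mu.$$

Thus we have

$$\begin{aligned} \int_X f_k \, d\mu &\ge \int_{E_k} f_k \, d\mu && \text{as } f_k \ge 0 \text{ and } E_k \subset X \\ &\ge \int_{E_k} c s \, d\mu && \text{by definition of } E_k \\ &= c \int_{E_k} s \, d\mu && \text{by linearity of the integral} \end{aligned}$$

and so taking $k \to \infty$ in this inequality gives

$$\alpha \ge c \int_X s \, d\mu$$

and so taking $c \uparrow 1$ shows that $\alpha \ge \int_X s \, d\mu$. So this shows for any simple function $s$ with $s \le f$, we have $\int_X s \, d\mu \le \alpha$. So by the definition of $\int_X f \, d\mu$ being a supremum over all such $s$, we see that $\int_X f \, d\mu \le \alpha$, and hence combining we have

$$\lim_{k \to \infty} \int_X f_k \, d\mu = \alpha = \int_X f \, d\mu$$

as we wanted to show. □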


Note: Applying monotone convergence to the partial sums $g_n := \sum_{k=1}^n f_k$, we see that whenever we have a sequence $(f_k)_k$ of measurable positive-valued functions $f_k : X \to [0, \infty]$, we have

$$\int_X \left(\sum_{k=1}^\infty f_k\right) d\mu = \sum_{k=1}^\infty \int_X f_k \, d\mu$$

i.e. we can exchange the limit from the infinite sum with the limit from the integral when all functions are measurable and $\ge 0$.

Recall the following two results from the Probability and Measure course:

Proposition 1.3 (Change of Measure Formula). Let $f : X \to [0, \infty]$ be measurable on a measure space $(X, \mathcal{A}, \mu)$. Then $\nu : \mathcal{A} \to [0, \infty]$ defined by

$$\nu(A) := \int_A f \, d\mu$$

is a measure, and moreover if $g : X \to [0, \infty]$ is measurable we have

$$\int_X g \, d\nu = \int_X f g \, d\mu.$$

Proof. None given (see the Probability and Measure course). □

Proposition 1.4 (Fatou’s Lemma). Let $(f_k)_k$ be a sequence of positive measurable functions $f_k : X \to [0, +\infty]$. Then:

$$\int_X \liminf_k f_k \, d\mu \le \liminf_k \int_X f_k \, d\mu.$$

Proof. Define $g_n := \inf_{k \ge n} f_k$. Then $(g_n)_n$ is a positive increasing sequence with $g_n \to g := \liminf_k f_k$ pointwise. Thus by monotone convergence,

$$\int_X g \, d\mu = \lim_{n \to \infty} \int_X g_n \, d\mu.$$

But note for any $k \ge n$ we have $g_n \le f_k$, and thus $\int_X g_n \, d\mu \le \int_X f_k \, d\mu$ for all $k \ge n$. In particular we have $\int_X g_n \, d\mu \le \inf_{k \ge n} \int_X f_k \, d\mu$, and thus

$$\lim_{n \to \infty} \int_X g_n \, d\mu \le \lim_{n \to \infty} \inf_{k \ge n} \int_X f_k \, d\mu = \liminf_k \int_X f_k \, d\mu$$

and so combining gives the result. □
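The inequality in Fatou’s lemma can be strict. A minimal sketch of one such example, chosen here purely for illustration and not taken from the lectures: take X to be two points with counting measure and let f_k be the indicator of the single point k mod 2. Because this sequence is periodic in k, its lim inf is the minimum over one period, which is what the code computes.

```python
X = [0, 1]                        # two points, each of counting measure 1

def f(k, x):
    """f_k is the indicator of the single point {k mod 2}."""
    return 1 if x == k % 2 else 0

def integral(g):
    """Integral with respect to counting measure on X."""
    return sum(g(x) for x in X)

# k -> f_k has period 2, so the lim inf over k is the minimum over one period.
period = range(2)

def liminf_f(x):
    return min(f(k, x) for k in period)

lhs = integral(liminf_f)                                      # integral of lim inf f_k
rhs = min(integral(lambda x, k=k: f(k, x)) for k in period)   # lim inf of the integrals
print(lhs, rhs)  # 0 1
```

The mass of f_k keeps jumping between the two points, so pointwise the lim inf vanishes even though every f_k has integral 1.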

Now we extend our notion of integrals to more general measurable functions.



Definition 1.12. If $f : X \to \mathbb{R}$ is measurable, we say that $f$ is integrable if $|f| : X \to [0, \infty)$ has $\int_X |f| \, d\mu < \infty$ […]


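Now using the fact that for real sequences $(a_n)_n$, $(b_n)_n$, if $a_n \to a$ then

$$\liminf_n (a_n + b_n) = a + \liminf_n b_n$$

we have by definition of $h_k$ and linearity of the integral

$$\liminf_k \int_X h_k \, d\mu = \liminf_k \left(\int_X 2g \, d\mu - \int_X |f_k - f| \, d\mu\right) = \int_X 2g \, d\mu - \limsup_{k \to \infty} \int_X |f_k - f| \, d\mu$$

i.e.

$$\int_X 2g \, d\mu \le \int_X 2g \, d\mu - \limsup_k \int_X |f_k - f| \, d\mu$$

and so rearranging (as $\int_X g \, d\mu < \infty$ […]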


Remark: To compare Riemann and Lebesgue integration, we will see on Example Sheet 1 that for $f : [a, b] \to \mathbb{R}$ a bounded function,

$f$ is Riemann integrable ⇔ $\{x \in [a, b] : f \text{ is discontinuous at } x\}$ has Lebesgue measure 0.

For such an $f$ it is also true that Riemann integrability implies that $f$ is measurable with respect to the Lebesgue σ-algebra but not necessarily the Borel σ-algebra, and that if $f$ is Riemann integrable then it is automatically Lebesgue integrable, and the Riemann integral agrees with the Lebesgue integral. Thus the Lebesgue integral is a more general object than the Riemann integral.

1.3. Lebesgue Spaces.

Next we need to prove some results about the Lebesgue spaces, $L^p$, such as completeness, density results, and separability (for $p < +\infty$ in the case $(X, \mu) = (\mathbb{R}^n, \text{Lebesgue measure})$). Since the $L^p$ spaces will form the basis of the Sobolev spaces we study later, these properties are critical.

Definition 1.14. For $p \in [1, \infty]$ we define the Lebesgue space $L^p(X)$ to be the set of equivalence classes of measurable functions $f : X \to \mathbb{R}$ (or $\mathbb{C}$) such that $|f|^p$ is integrable if $p < \infty$, or $\operatorname{ess.sup}_X |f| < \infty$ if $p = \infty$, where

$$\operatorname{ess.sup}_X |f| := \inf\{M \ge 0 : \mu(\{x \in X : |f(x)| > M\}) = 0\}$$

i.e. the smallest constant $M$ such that $|f| \le M$ almost everywhere. We say that a measurable function $f$ is essentially bounded if $\operatorname{ess.sup}_X |f| < \infty$, i.e. $f$ is bounded except on a set of measure zero.

Theorem 1.4 (Completeness of $L^p$). Endowed with the norms: $\|f\|_p := \left(\int_X |f|^p \, d\mu\right)^{1/p}$ if $p < \infty$ […] $N_2 > N_1$ such that for all $n, m \ge N_2$ we have $\|f_n - f_m\|_p < \frac{1}{4}$. Set $f_{n_2} = f_{N_2}$. Then repeat this inductively). So setting

$$g_k := f_{n_{k+1}} - f_{n_k}$$

we have $\|g_k\|_p < \frac{1}{2^k}$ for all $k$. Also set $h_k = |g_k|$, so $\|h_k\|_p < \frac{1}{2^k}$. Then for any $M \ge 0$,

$$\left\|\sum_{k=1}^M h_k\right\|_p \le \sum_{k=1}^M \|h_k\|_p \le \sum_{k=1}^M \frac{1}{2^k} \le 1$$

and so applying monotone convergence (taking $M \to \infty$, and we can do this as all the $h_k$ are $\ge 0$) we see that

$$\left\|\sum_{k=1}^\infty h_k\right\|_p \le 1.$$

From this and a corollary of Chebyshev’s inequality (see the remark after Lemma 1.1) we know this implies that $\sum_{k=1}^\infty h_k < \infty$ […] $\varepsilon > 0$. Then since $(f_n)_n$ is Cauchy, we can find $N$ such that for all $n, m \ge N$,

$$\|f_n - f_m\|_p < \varepsilon.$$

In particular we get (taking $m = n_k$) $\|f_n - f_{n_k}\|_p < \varepsilon$ for all $k$ sufficiently large. Hence by Fatou’s lemma,

$$\int_X |f_n - f|^p \, d\mu = \int_X \liminf_{k \to \infty} |f_n - f_{n_k}|^p \, d\mu \le \liminf_{k \to \infty} \int_X |f_n - f_{n_k}|^p \, d\mu = \liminf_{k \to \infty} \|f_n - f_{n_k}\|_p^p \le \varepsilon^p$$

for all $n \ge N$, i.e. $\|f_n - f\|_p \le \varepsilon$.

In particular this shows that $f_n \to f$ in $(L^p, \|\cdot\|_p)$, and that $f \in L^p$, since $\|f\|_p \le \|f_N - f\|_p + \|f_N\|_p \le \varepsilon + \|f_N\|_p < \infty$. […] $A_{m,n} := \{x \in X : |f_n(x) - f_m(x)| > \|f_n - f_m\|_\infty\}$ has measure zero for each $m, n$ by definition of $\|\cdot\|_\infty$. Thus the set $A = \bigcup_{m,n} A_{m,n}$ has measure zero as it is a countable union of sets of measure zero. Then on $X \setminus A$ the functions $f_n - f_m$ are bounded and $(f_n)_n$ is a uniformly Cauchy sequence (since now the essential supremum is just a supremum). Hence $(f_n)_n$ converges uniformly to some $f$ on $X \setminus A$, and so extend $f$ to a function on all of $X$ by setting it to be 0 on $A$. Clearly $f$ is then bounded and so is in $L^\infty$.

Then we have $f_n \to f$ in $\|\cdot\|_\infty$ since $f_n \to f$ uniformly on $X \setminus A$, and $A$ has measure zero. So done.

□



Remark: Note that we have proven the following result within the proof of the completeness of $L^p$: if $f_n \to f$ in $L^p$, then ∃ a subsequence $(f_{n_k})_k$ which converges to $f$ pointwise a.e.

The converse to this remark is also true if we assume monotonicity or domination (by the monotone convergence/dominated convergence theorems); see the Probability and Measure course for more.

Theorem 1.5 (Simple Functions are dense in $L^p(X)$). Let $(X, \mathcal{A}, \mu)$ be a measure space and $p \in [1, \infty]$. Then simple functions in $L^p(X)$ are dense in $L^p(X)$.

Proof. By splitting up the real and imaginary parts, and then their positive and negative parts, we see that it is enough to just deal with the case when $f \in L^p(X)$ has $f \ge 0$.

In this case, define:

$$s_n(x) := \begin{cases} \frac{i-1}{2^n} & \text{if } f(x) \in \left[\frac{i-1}{2^n}, \frac{i}{2^n}\right), \text{ for } i = 1, \dots, n2^n \\ n & \text{if } f(x) \ge n. \end{cases}$$

Then since $f \in L^p$ and $0 \le s_n \le f$ we see that $s_n \in L^p$. Moreover since $s_n \uparrow f$ pointwise, in particular $|f - s_n|^p = (f - s_n)^p \downarrow 0$ pointwise, and so we can apply monotone/dominated convergence to see that $s_n \to f$ in $L^p$ [Exercise to check details].

□

Note: For any measurable set $A$ we always have $\mathbf{1}_A \in L^\infty$, but for $p < \infty$ we have $\mathbf{1}_A \in L^p \Leftrightarrow \mu(A) < \infty$. […]

• Outer regular if for any given $A$ and $\varepsilon > 0$, ∃ $U \subset X$ open with $A \subset U$ and $\mu(U \setminus A) < \varepsilon$,
• Inner regular if for any given $A$ with $\mu(A) < \infty$ and $\varepsilon > 0$, ∃ $K \subset X$ compact with $K \subset A$ and $\mu(A \setminus K) < \varepsilon$.
• Regular if it is both outer and inner regular.

Intuitively, outer regularity says that any measurable set can be approximated to arbitrary accuracy in measure by larger open sets (i.e. “outer open sets”), whilst inner regularity says that any measurable set of finite measure can be approximated by inner compact sets.

Then we have:


Theorem 1.6. The Lebesgue measure on $\mathbb{R}^n$ is a regular measure.

Proof. None given. □

Definition 1.16. A topological space is separable if it has a countable dense subset.

Theorem 1.7 (Separability of $L^p(U)$, $p < \infty$ […]


FIGURE 1. An illustration of approximating an open set $U$ by hypercubes with rational endpoints and disjoint interiors. The cubes shaded green are those “kept” at the first stage of the iteration, whilst the cubes shaded red are the “ambiguous” ones which lie partially in $U$ and partially outside $U$. We further subdivide these in half along each direction, so the endpoints are always rationals, and keep subdividing until we cover all of $U$.

So we now know how we can approximate open sets by cubes with rational endpoints. But not all Borel sets are open sets. To get around this we use the outer regularity of the Lebesgue measure (Theorem 1.6). So let $A$ be a bounded Borel set. Then by outer regularity, for any given $\varepsilon > 0$ we can find an open set $V$ with $\mu(V \setminus A) < \varepsilon$ (thus we also have that $\mu(V) < \infty$ […]


Thus for any simple function $s = \sum_{k=1}^l \alpha_k \mathbf{1}_{A_k}$, as $s_n \in L^p$ we must have $\mu(A_k) < \infty$ […] $> 0$, and use that for any $r \ne r'$, $\|\mathbf{1}_{B_r(0)} - \mathbf{1}_{B_{r'}(0)}\|_\infty = 1$.]


1.4. How Regular are Measurable and Integrable Functions?

Before moving onto the main content of the course, we first quickly address the question of “how regular” measurable functions or integrable functions are. Clearly they can be bad on sets of small measure (e.g. measure zero), as for example we can redefine an integrable function on a set of measure zero to whatever we want without changing its integrability. However we will prove two results here which tell us about how “nice” such functions are, namely:

• The Lebesgue differentiation theorem, which tells us that the “fundamental theorem of calculus” holds a.e. for an $L^1(\mathbb{R}^n)$ function.
• Lusin’s theorem, which tells us that for any measurable function $f : \mathbb{R} \to \mathbb{R}$, we can redefine $f$ on sets of arbitrarily small measure to make it continuous.

So measurable functions are “close to continuous functions” in some sense.

Let us start with some definitions.

Definition 1.18. Consider $f : \mathbb{R}^n \to \mathbb{R}$ (or $\mathbb{C}$). Then we say $x \in \mathbb{R}^n$ is a Lebesgue point of $f$ if

$$\frac{1}{\mu(B_r(x))} \int_{B_r(x)} |f(y) - f(x)| \, d\mu(y) \to 0 \quad \text{as } r \downarrow 0$$

where $\mu$ is the Lebesgue measure, i.e. intuitively this says that the average difference of $f$ from $f(x)$ tends to 0.

As a first observation, note that if $f$ is continuous at $x$ then $x$ is a Lebesgue point of $f$. Indeed, let $\varepsilon > 0$. Then by continuity ∃ $\delta > 0$ such that if $|y - x| < \delta$, i.e. $y \in B_\delta(x)$, then $|f(y) - f(x)| < \varepsilon$. Thus for all $r < \delta$ we have that if $y \in B_r(x)$ then $|f(y) - f(x)| < \varepsilon$, and so

$$\frac{1}{\mu(B_r(x))} \int_{B_r(x)} |f(y) - f(x)| \, d\mu(y) \le \frac{1}{\mu(B_r(x))} \int_{B_r(x)} \varepsilon \, d\mu(y) = \varepsilon$$

and so as $\varepsilon > 0$ was arbitrary this shows the LHS $\to 0$ as $r \to 0$.

We will need the following technical covering lemma to streamline the proof of the Lebesgue differentiation theorem:

Lemma 1.2 (Vitali’s Covering Lemma). Let $X \subset \mathbb{R}^n$ and suppose $X$ is covered by a finite number of open balls, i.e.

$$X \subset \bigcup_{i=1}^N B_{r_i}(x_i)$$

for some $N$, $x_i$, $r_i$. Then ∃ a subset of indices $J \subset \{1, \dots, N\}$ such that $(B_{r_i}(x_i))_{i \in J}$ are all disjoint, and

$$X \subset \bigcup_{i \in J} B_{3r_i}(x_i).$$


Proof. Firstly order the radii so wlog $r_1 \ge r_2 \ge \cdots \ge r_N$. We build up our set $J$ as follows. Consider first $B_{r_1}(x_1)$. We remove any ball which intersects this (i.e. do not include any such index in $J$). This is then OK, since if $B_{r_j}(x_j) \cap B_{r_1}(x_1) \ne \emptyset$ for some $j$, then as $r_1 \ge r_j$ we have

$$B_{r_j}(x_j) \subset B_{3r_1}(x_1)$$

and so when we increase the radii by a factor of 3 these will still cover all of $X$.

So now we are left with a new, smaller collection of balls (if all balls are disjoint already then we can just take $J = \{1, \dots, N\}$, otherwise the set of balls we are looking at decreases by at least 1), where $B_{r_1}(x_1)$ is disjoint from the rest and when we multiply the radii by a factor of 3 the balls still cover.

But then we can just repeat this process inductively on the new set of balls, starting with the largest remaining radius other than $r_1$ (so as to not look at $B_{r_1}(x_1)$ again). As the set is finite and we either decrease the set size by one at each step or, if we don’t, we are done, this inductive process terminates after a finite number of steps, and it preserves the fact that $X$ is contained in the required union and the collection is disjoint. So done.

□
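The selection procedure in this proof is a greedy algorithm, and it can be sketched in a few lines of Python. This is an illustrative sketch, not part of the notes: balls are represented as hypothetical `(centre, radius)` pairs with centres given as tuples, and the helper names are made up for the example. Processing balls in order of decreasing radius and keeping a ball only if it is disjoint from all balls kept so far gives the same index set J as removing intersecting balls step by step.

```python
import math

def vitali_select(balls):
    """Greedy selection: return indices J of pairwise disjoint balls, largest radii first."""
    order = sorted(range(len(balls)), key=lambda i: balls[i][1], reverse=True)
    chosen = []
    for i in order:
        centre, radius = balls[i]
        # Two open balls are disjoint iff the distance between centres >= sum of radii.
        if all(math.dist(centre, balls[j][0]) >= radius + balls[j][1] for j in chosen):
            chosen.append(i)
    return chosen

def tripled_balls_cover(balls, chosen):
    """Check that every original ball lies inside some chosen ball with tripled radius."""
    return all(
        any(math.dist(c, balls[j][0]) + r <= 3 * balls[j][1] for j in chosen)
        for c, r in balls
    )

balls = [((0, 0), 2), ((1, 0), 1), ((5, 0), 1.5), ((6, 0), 1), ((10, 10), 0.5)]
J = vitali_select(balls)
print(J, tripled_balls_cover(balls, J))  # [0, 2, 4] True
```

The covering check is slightly stronger than the lemma needs: it verifies that each original ball, and hence any set covered by the original balls, sits inside the tripled selection.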

Theorem 1.8 (Lebesgue Differentiation Theorem). Let $f \in L^1(\mathbb{R}^n)$. Then almost every $x \in \mathbb{R}^n$ is a Lebesgue point.

Proof. Note that we know the result holds true for continuous functions in $L^1(\mathbb{R}^n)$ (i.e. on $C^0(\mathbb{R}^n) \cap L^1(\mathbb{R}^n)$) by the discussion after the definition of a Lebesgue point. We also know (by Theorem 1.7) that continuous functions in $L^1(\mathbb{R}^n)$ are dense in $L^1(\mathbb{R}^n)$. Thus we know the theorem is true on a dense subset of $L^1(\mathbb{R}^n)$, and so we just need to use this to extend the result to the general case.

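Let $f \in L^1(\mathbb{R}^n)$. Consider the following function:

$$Mf(x) := \sup_{r > 0} m_f(x, r) \quad \text{where} \quad m_f(x, r) := \frac{1}{\mu(B_r(x))} \int_{B_r(x)} |f| \, d\mu$$

(this $Mf$ is sometimes called the Hardy-Littlewood maximal operator). Consider the set

$$E_a := \{x : Mf(x) > a\}.$$

We claim that $E_a$ is open for any $a$. Indeed, let $x \in E_a$. Then we know $Mf(x) > a$, and thus ∃ $r > 0$ such that $m_f(x, r) > a$ by definition of $Mf$. Now let $\varepsilon > 0$ (to be chosen later), and consider $y \in B_\varepsilon(x)$. Then since $\mu(B_r(x)) = \omega_n r^n$ for $\omega_n$ the Lebesgue measure of the open unit ball in $\mathbb{R}^n$,

$$\begin{aligned} m_f(y, r + \varepsilon) &= \frac{1}{\mu(B_{r+\varepsilon}(y))} \int_{B_{r+\varepsilon}(y)} |f| \, d\mu \\ &\ge \frac{1}{\omega_n (r + \varepsilon)^n} \int_{B_r(x)} |f| \, d\mu && \text{as } B_r(x) \subset B_{r+\varepsilon}(y) \\ &= \left(\frac{r}{r + \varepsilon}\right)^n \cdot \frac{1}{\mu(B_r(x))} \int_{B_r(x)} |f| \, d\mu \\ &= \left(\frac{r}{r + \varepsilon}\right)^n m_f(x, r) \end{aligned}$$

and so if we choose $\varepsilon > 0$ such that $\left(\frac{r}{r + \varepsilon}\right)^n m_f(x, r) > a$ (which we can do as $m_f(x, r) > a$) then we get

$$m_f(y, r + \varepsilon) > a$$

i.e. $y \in E_a$, since $Mf(y) \ge m_f(y, r + \varepsilon)$. Hence we have shown $B_\varepsilon(x) \subset E_a$, and so $E_a$ is open. So as $E_a = (Mf)^{-1}((a, \infty))$ and the sets $\{(a, \infty)\}_{a \in \mathbb{R}}$ generate the Borel σ-algebra on $\mathbb{R}$, this shows that $Mf : \mathbb{R}^n \to [0, \infty]$ is a measurable function and in particular that $E_a$ is a Borel set.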

Next we prove the following:

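Claim: ∀ $a > 0$ we have $\mu(E_a) \le \frac{3^n}{a} \|f\|_{L^1(\mathbb{R}^n)}$.

Proof. Let $K \subset E_a$ be any compact subset. Just as before, if $x \in K$, then $x \in E_a$ and so we can find $r_x > 0$ with $m_f(x, r_x) > a$. Hence

$$K \subset \bigcup_{x \in K} B_{r_x}(x)$$

is an open cover of $K$, and so by compactness we can extract a finite subcover

$$K \subset \bigcup_{i=1}^N B_{r_i}(x_i).$$

By the Vitali covering lemma (Lemma 1.2), we can find a subset of indices $J \subset \{1, \dots, N\}$ such that $K \subset \bigcup_{j \in J} B_{3r_j}(x_j)$ and the balls $(B_{r_j}(x_j))_{j \in J}$ are disjoint. Thus we have

$$\begin{aligned} \mu(K) &\le \sum_{j \in J} \mu(B_{3r_j}(x_j)) \\ &= 3^n \sum_{j \in J} \mu(B_{r_j}(x_j)) \\ &\le 3^n \sum_{j \in J} \frac{1}{a} \int_{B_{r_j}(x_j)} |f| \, d\mu && \text{by definition of } m_f(x, r_x) \quad (‡) \\ &= \frac{3^n}{a} \int_{\bigcup_{j \in J} B_{r_j}(x_j)} |f| \, d\mu && \text{as these balls are disjoint} \\ &\le \frac{3^n}{a} \int_{\mathbb{R}^n} |f| \, d\mu \end{aligned}$$

where in (‡) we have just used that by definition of $m_f(x, r_x)$ and by choice of $r_x$,

$$a < m_f(x_j, r_j) = \frac{1}{\mu(B_{r_j}(x_j))} \int_{B_{r_j}(x_j)} |f| \, d\mu \implies \mu(B_{r_j}(x_j)) \le \frac{1}{a} \int_{B_{r_j}(x_j)} |f| \, d\mu.$$

But then by inner regularity of the Lebesgue measure, we have (as $E_a$ is a Borel set)

$$\mu(E_a) = \sup_{K \subset E_a \text{ compact}} \mu(K) \le \frac{3^n}{a} \int_{\mathbb{R}^n} |f| \, d\mu = \frac{3^n}{a} \|f\|_{L^1(\mathbb{R}^n)}.$$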

□



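Finally since continuous functions in $L^1(\mathbb{R}^n)$ are dense in $L^1(\mathbb{R}^n)$, we can find a continuous function $g \in L^1(\mathbb{R}^n)$ with $\|f - g\|_{L^1(\mathbb{R}^n)} < \varepsilon$, for any $\varepsilon > 0$ given. So setting $h = f - g$, we can write

$$f = g + h$$

where $g \in L^1(\mathbb{R}^n)$ is continuous and $\|h\|_{L^1(\mathbb{R}^n)} < \varepsilon$. Now define

$$Tf(x) := \limsup_{r \downarrow 0} t_f(x, r) \quad \text{where} \quad t_f(x, r) := \frac{1}{\mu(B_r(x))} \int_{B_r(x)} |f(x) - f(y)| \, d\mu(y).$$

Then by the triangle inequality and the fact $f = g + h$,

$$\begin{aligned} t_f(x, r) &= \frac{1}{\mu(B_r(x))} \int_{B_r(x)} |g(x) - g(y) + h(x) - h(y)| \, d\mu(y) \\ &\le \frac{1}{\mu(B_r(x))} \int_{B_r(x)} \big(|g(x) - g(y)| + |h(x)| + |h(y)|\big) \, d\mu(y) \\ &= t_g(x, r) + |h(x)| + \frac{1}{\mu(B_r(x))} \int_{B_r(x)} |h(y)| \, d\mu(y) \\ &\le t_g(x, r) + |h(x)| + Mh(x) \end{aligned}$$

where in the last line we have used the definition of $Mh(x)$ being a supremum. Hence taking $r \downarrow 0$, using the fact that $g$ is continuous and so every point is a Lebesgue point of $g$, we get

$$Tf(x) \le |h(x)| + Mh(x).$$

So note that this implies for any $k > 0$, if $Tf(x) > \frac{1}{k}$ then we must have one of $|h(x)| > \frac{1}{2k}$ or $Mh(x) > \frac{1}{2k}$ (simply because if neither happened we would get $Tf(x) \le \frac{1}{2k} + \frac{1}{2k} = \frac{1}{k}$, a contradiction), which tells us:

$$\{x : Tf(x) > 1/k\} \subset \{x : |h(x)| > 1/2k\} \cup \{x : Mh(x) > 1/2k\}$$

and so taking measures of this we see (writing $E_{1/2k}$ for the set $\{x : Mh(x) > 1/2k\}$, i.e. the set $E_a$ above with $h$ in place of $f$)

$$\begin{aligned} \mu(\{x : Tf(x) > 1/k\}) &\le \mu\left(\left\{x : |h(x)| > \tfrac{1}{2k}\right\} \cup \left\{x : Mh(x) > \tfrac{1}{2k}\right\}\right) \\ &\le \mu\left(\left\{x : |h(x)| > \tfrac{1}{2k}\right\}\right) + \mu\left(\left\{x : Mh(x) > \tfrac{1}{2k}\right\}\right) \\ &\le 2k \int_{\mathbb{R}^n} |h| \, d\mu + \mu\left(E_{\frac{1}{2k}}\right) && \text{by Chebyshev’s inequality} \\ &\le 2k \|h\|_{L^1(\mathbb{R}^n)} + 3^n \cdot 2k \cdot \|h\|_{L^1(\mathbb{R}^n)} && \text{by previous claim} \\ &< 2k(3^n + 1)\varepsilon && \text{since } \|h\|_{L^1(\mathbb{R}^n)} < \varepsilon. \end{aligned}$$

But then $\varepsilon > 0$ was arbitrary, and so this shows (taking $\varepsilon \downarrow 0$ for fixed $k$) that

$$\mu(\{x : Tf(x) > 1/k\}) = 0$$

for every $k$, and thus $\mu(\{x : Tf(x) > 0\}) = 0$ (as $\{x : Tf(x) > 0\} = \bigcup_{k \ge 1} \{x : Tf(x) > 1/k\}$ is a countable union of sets of measure 0). Hence as $Tf(x) \ge 0$ always, this shows that we have $Tf(x) = 0$ a.e., i.e.

$$\lim_{r \downarrow 0} \frac{1}{\mu(B_r(x))} \int_{B_r(x)} |f(x) - f(y)| \, d\mu(y) = 0 \quad \text{for a.e. } x$$

which is what we wanted. □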



Theorem 1.9 (Lebesgue Density Theorem). Let $E \in \mathcal{B}(\mathbb{R}^n)$ be a Borel set. Then for almost every $x \in \mathbb{R}^n$ we have the density ratio:

\[
\frac{\mu(E \cap B_r(x))}{\mu(B_r(x))} \to \mathbf{1}_E(x) \quad \text{as } r \downarrow 0
\]

i.e. for almost every point $x$, when we zoom in close enough, $E$ fills either almost all of the ball around $x$ (if $x \in E$) or almost none of it (if $x \notin E$).

Proof. We want to just take $f = \mathbf{1}_E$ and apply the Lebesgue differentiation theorem, but $E$ might not have finite measure and so $f$ might not be in $L^1(\mathbb{R}^n)$. But the point is that we are only looking at a given $x$ at a time: so fix $M > 0$. Then note for any $x \in B_M(0)$, for any $r \in (0, 1)$ we have $B_r(x) \subset B_{M+1}(0)$, and so

\[
E \cap B_r(x) = E \cap B_{M+1}(0) \cap B_r(x).
\]

So take $f = \mathbf{1}_{E \cap B_{M+1}(0)} \in L^1(\mathbb{R}^n)$, since $E \cap B_{M+1}(0)$ has finite measure. Then by the Lebesgue differentiation theorem, we have for almost all $x \in B_M(0)$,

\[
f(x) = \lim_{r \downarrow 0} \frac{1}{\mu(B_r(x))} \int_{B_r(x)} f(y) \, d\mu(y) = \lim_{r \downarrow 0} \frac{\mu(E \cap B_{M+1}(0) \cap B_r(x))}{\mu(B_r(x))} = \lim_{r \downarrow 0} \frac{\mu(E \cap B_r(x))}{\mu(B_r(x))}
\]

and so as $x \in B_M(0) \subset B_{M+1}(0)$ the LHS is just $\mathbf{1}_E(x)$ and so we have for almost every $x \in B_M(0)$,

\[
\mathbf{1}_E(x) = \lim_{r \downarrow 0} \frac{\mu(E \cap B_r(x))}{\mu(B_r(x))}.
\]

Thus if we set $A_M := \{x \in B_M(0) : \text{claim fails}\}$, we have shown that $\mu(A_M) = 0$ for all $M > 0$. Hence as

\[
\{x \in \mathbb{R}^n : \text{claim fails}\} = \bigcup_{M=1}^{\infty} A_M
\]

is a countable union of sets of measure zero, we see this set has measure zero, and so the result holds for almost every $x \in \mathbb{R}^n$.

□
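To see the density ratio in action, here is a small numerical sketch for the hypothetical choice $E = [0, 1] \subset \mathbb{R}$ (so $n = 1$ and $B_r(x) = (x - r, x + r)$). The function name is illustrative. Note that the boundary points $0$ and $1$ give the ratio $1/2$ rather than $\mathbf{1}_E(x) = 1$; this does not contradict the theorem, since they form a set of measure zero.

```python
def density_ratio(x, r, a=0.0, b=1.0):
    """mu(E ∩ B_r(x)) / mu(B_r(x)) for E = [a, b] and B_r(x) = (x - r, x + r)."""
    lo, hi = max(a, x - r), min(b, x + r)
    overlap = max(0.0, hi - lo)  # length of the intersection (0 if empty)
    return overlap / (2 * r)


for x in (0.5, 0.0, 2.0):
    # Shrinking radii: the ratios settle near 1 (interior), 1/2 (boundary), 0 (exterior).
    print(x, [round(density_ratio(x, r), 6) for r in (1.0, 0.1, 0.01)])
```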

Now let us use these results to explore the links between Lebesgue integration and differentiation.

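Corollary 1.1 (Lebesgue Fundamental Theorem of Calculus). Let $f \in L^1(\mathbb{R})$ and define

\[
F(x) = \int_{-\infty}^{x} f(y) \, dy.
\]

Then $F$ is differentiable a.e., with $F'(x) = f(x)$ a.e.

Proof. For $\delta > 0$ we have

\[
\frac{F(x + \delta) - F(x)}{\delta} = \frac{1}{\mu([x, x + \delta])} \int_{[x, x+\delta]} f(y) \, dy
\]

and so

\[
\begin{aligned}
\left| \frac{F(x + \delta) - F(x)}{\delta} - f(x) \right| &\le \frac{1}{\mu([x, x + \delta])} \int_{[x, x+\delta]} |f(y) - f(x)| \, dy \\
&\le \frac{2}{\mu(B_\delta(x))} \int_{B_\delta(x)} |f(y) - f(x)| \, dy \qquad \text{as } B_\delta(x) = (x - \delta, x + \delta) \text{ here} \\
&\to 0 \text{ a.e.} \qquad \text{by the Lebesgue differentiation theorem.}
\end{aligned}
\]

The case $\delta < 0$ is identical, with $[x + \delta, x]$ in place of $[x, x + \delta]$. Hence this shows $F$ is differentiable a.e., with $F'(x) = f(x)$ for a.e. $x \in \mathbb{R}$.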

□

Remark: It turns out the converse to Corollary 1.1 is not true, i.e. if $F$ is differentiable a.e. with $F' = f \in L^1$, it is not necessarily true that $F(y) - F(x) = \int_x^y f(z) \, dz$. Indeed, Cantor's 'stair' function is a counterexample. One can however prove that $F$ can be written as the integral of an $L^1$ function if and only if $F$ is absolutely continuous.

Now we turn to our last point, namely the link between measurable functions and continuous functions. Let us first prove that pointwise convergence and uniform convergence are "the same" except on sets of arbitrarily small measure.

Theorem 1.10 (Egorov's Theorem). Let $(f_k)_k$ be a sequence of measurable functions $f_k : \mathbb{R}^n \to \mathbb{R}$. Then suppose we have $f_k \to f$ pointwise on a Borel set $A$ which has finite measure. Then for any $\varepsilon > 0$, we can find a Borel set $A_\varepsilon \subset A$ with $\mu(A \setminus A_\varepsilon) \le \varepsilon$ and $f_k \to f$ uniformly on $A_\varepsilon$.

i.e. pointwise convergence on a set of finite measure gives uniform convergence "on almost all of the set".

Remark: Be cautious, since when removing a set of small measure we can change the function dramatically (e.g. we can remove $\mathbb{Q}$ as it has measure zero).

Remark: Note that the assumption that $A$ has finite measure is necessary: e.g. if we took $f_k = \mathbf{1}_{[k, k+1]}$ on $\mathbb{R}$, then $f_k \to f \equiv 0$ pointwise on $\mathbb{R}$, but we can never have $f_k \to f$ uniformly on any $\mathbb{R} \setminus B$ with $B$ of finite measure.
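As a quick illustration of the statement (not part of the proof), consider the standard example $f_k(x) = x^k$ on $A = [0, 1)$, which converges pointwise to $0$ but not uniformly. A minimal sketch, with illustrative function names: removing the small set $(1 - \varepsilon, 1)$ leaves $A_\varepsilon = [0, 1 - \varepsilon]$, on which $\sup |f_k| = (1 - \varepsilon)^k \to 0$.

```python
def sup_on_truncated(k, eps):
    """sup of x**k over A_eps = [0, 1 - eps]; attained at the right endpoint."""
    return (1 - eps) ** k


def value_near_one(k):
    """f_k evaluated at x = 1 - 1/k, a point of A = [0, 1) that moves with k."""
    return (1 - 1 / k) ** k


for k in (10, 100, 1000):
    # First column tends to 0 (uniform convergence on A_eps);
    # second column stays near 1/e (no uniform convergence on all of A).
    print(k, sup_on_truncated(k, 0.05), value_near_one(k))
```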

Proof. Define the sets:

\[
E_N^{(k)} := \bigcap_{p \ge N} \{x \in A : |f_p(x) - f(x)| \le 1/k\}.
\]

Then we clearly have

(i) $E_N^{(k)} \subset E_{N+1}^{(k)}$, as there are fewer sets in the intersection.

(ii) $E_N^{(k+1)} \subset E_N^{(k)}$, as $\frac{1}{k+1} \le \frac{1}{k}$.

(iii) For every $k \ge 1$ we have $A = \bigcup_{N \ge 1} E_N^{(k)}$ from the pointwise convergence, as for any $x \in A$ we are eventually always within $1/k$ of $f(x)$.

So fix $\varepsilon > 0$. Then for a fixed $k$, as $(E_N^{(k)})_N$ is increasing (by (i)) towards $A$ (by (iii)) we can choose $N_k$ such that $\Delta_k := A \setminus E_{N_k}^{(k)}$ has

\[
\mu(\Delta_k) \le \frac{\varepsilon}{2^k}
\]

(simply because (i) and (iii) give $\sup_{N \ge 1} \mu(E_N^{(k)}) = \mu(A)$ […]


FIGURE 3. An illustration of the sets being used in the proof of the first version ofLusin’s theorem.

Thus

\[
\mu\big( F \setminus (K_n \cup K'_n) \big) < \frac{\varepsilon}{2^{n+1}} + \frac{\varepsilon}{2^{n+1}} = \frac{\varepsilon}{2^n}
\]

(simply because both are disjoint, one in $f^{-1}(V_n)$ the other in $F \setminus f^{-1}(V_n)$, and so $F \setminus (K_n \cup K'_n) = \big( f^{-1}(V_n) \setminus K_n \big) \cup \big( (F \setminus f^{-1}(V_n)) \setminus K'_n \big)$).

So define:

\[
K := \bigcap_{n \ge 1} (K_n \cup K'_n).
\]

Then we have

\[
\mu(F \setminus K) \le \sum_{n \ge 1} \mu\big( F \setminus (K_n \cup K'_n) \big) \le \varepsilon
\]

(once again by summing the series).

Now take $E = F \setminus K$. Now since $K_n, K'_n$ are disjoint compact sets, and $\mathbb{R}$ is a normal topological space, we can find an open set $U_n$ separating $K_n, K'_n$, so $K_n \subset U_n$ and $U_n \cap K'_n = \emptyset$ (this isn't the full strength of normality but it is sufficient).

We claim that $f|_K$ is continuous. Indeed, if $x \in K \subset F$, then $x \in F$ and so we can consider $f(x)$. But this is a real number, and so $f(x) \in V_N$ for some $N$. But as $x \in K$ we know that $x \in K_n \cup K'_n$ for all $n$, in particular $x \in K_N \cup K'_N$. If we had $x \in K'_N$, then $x \notin f^{-1}(V_N)$, a contradiction. So we must have $x \in K_N$. But then as $K_N \subset U_N \subset F \setminus K'_N$ this shows

\[
f(U_N \cap K) \subset V_N
\]

(since $U_N \cap K \subset (F \cap K) \setminus K'_N$, but to be in $K$ and not $K'_N$ (like on the RHS) means you must be in $K_N$ and so must be in $f^{-1}(V_N)$, i.e. $(F \cap K) \setminus K'_N \subset f^{-1}(V_N)$).

But this shows that $f|_K$ is continuous at $x$. Indeed, take any open neighbourhood $V$ of $f(x)$. Then since $V$ is open we can find $N$ with $f(x) \in V_N \subset V$. By the above we then have $f(U_N \cap K) \subset V_N \subset V$, i.e. $f|_K(U_N \cap K) \subset V$, and so $f|_K$ is continuous at $x$.

So as $x \in K$ was arbitrary, we are done.

□



Theorem 1.12 (Lusin's Theorem - Version 2). Let $f : \mathbb{R} \to \mathbb{R}$ be measurable. Then for any $\varepsilon > 0$, $\exists$ a measurable set $G \subset \mathbb{R}$ and a continuous function $g : \mathbb{R} \to \mathbb{R}$ such that $\mu(G) < \varepsilon$ and $f = g$ on $\mathbb{R} \setminus G$,

i.e. we can modify $f$ on arbitrarily small sets to make it continuous (but this modification might change things drastically as we have seen, as "small" sets can be uncountable, etc!).

Proof. Apply version 1 of Lusin's theorem to find $E$ measurable with $\mu(E) < \varepsilon/2$ and $f|_{\mathbb{R} \setminus E}$ continuous. Now by outer regularity of the Lebesgue measure, we can find $G \supset E$ open with $\mu(G) < \varepsilon$. Since $G \subset \mathbb{R}$ is open we can write $G$ as a pairwise disjoint countable union of open intervals, i.e. $G = \bigcup_{k \ge 1} (a_k, b_k)$ (do this iteratively, using that $G$ is open). Then define $g : \mathbb{R} \to \mathbb{R}$ by:

\[
g(x) :=
\begin{cases}
f(x) & \text{on } \mathbb{R} \setminus G \\
f(a_k) + \dfrac{x - a_k}{b_k - a_k} \big( f(b_k) - f(a_k) \big) & \text{on } (a_k, b_k)
\end{cases}
\]

i.e. on any of the "small" intervals, simply take a linear interpolation of $f$ at the endpoints. Then $g$ is continuous and $f = g$ on $\mathbb{R} \setminus G$, where $\mu(G) < \varepsilon$. So done.

FIGURE 4. An illustration of the g we construct in the second version of Lusin’s theorem.

□

Exercise: Give a proof that continuous functions are dense in $L^1(\mathbb{R})$ by using Lusin's theorem.



2. VECTOR SPACES OF FUNCTIONS: WEAK TOPOLOGIES, REFLEXIVITY AND SEPARABILITY

In this section we are going to go hunting for compactness in infinite dimensional vector spaces. We need to work in vector spaces because we need to consider the dual space associated to the vector space to define the topologies we are interested in. As mentioned before, we know that the closed unit ball is not compact with respect to a norm topology within an infinite dimensional space, and so we cannot extract norm-convergent subsequences. To recover compactness, we will need to reduce the strength of the topology and look at the so-called weak topology and weak-∗ topology.

We will see that we can recover compactness of the closed unit ball on the dual space with respect to the weak-∗ topology. Now since not every space is the dual space of another, the problem becomes transferring this compactness from the dual space back to the original space. By embedding our space into its double dual in a natural way (via the evaluation maps) we will see that we can "pull back" this compactness from the double dual to our original space (now in the weak topology) provided our space is reflexive. This is the real importance of reflexivity, as a means of knowing when the closed unit ball of a given space is compact with respect to the weak topology.

Separability comes in when we want to know when the weak topology is actually metrizable, i.e. induced by a metric. We know that compactness and sequential compactness are equivalent in metric spaces, and thus if we want to actually extract convergent subsequences, what we really want is sequential compactness. Thus separability allows us to make this last step, from weak compactness to weak sequential compactness.

This is the big picture of what is going on, and we will slowly uncover the detail and many other interesting results along the path to proving these results. Note that the useful examples of $\ell^p$ and $L^p$, for $p \in (1, \infty)$, we already know to be reflexive and separable, and thus the results we will develop here will be applicable to them, and this forms the basis of the PDE theory we will develop later on in the course.

2.1. Normed Vector Spaces Recap.

Here all of our vector spaces will be taken to be over $\mathbb{R}$.

Definition 2.1. $E$ is a normed vector space (nvs) if $E$ is a vector space and furthermore $\exists$ a function $\|\cdot\| : E \to \mathbb{R}_{\ge 0}$, called a norm, such that

(i) $\|\alpha f\| = |\alpha| \cdot \|f\|$ for all $\alpha \in \mathbb{R}$, $f \in E$.

(ii) $\|f + g\| \le \|f\| + \|g\|$ (the triangle inequality).

(iii) $\|f\| = 0 \Rightarrow f = 0$.

Remark: A function $\|\cdot\| : E \to \mathbb{R}_{\ge 0}$ which obeys all of the above properties except (iii) is called a semi-norm.



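Example 2.1. Let $I$ be a finite or countable index set. Consider $\mathbb{R}^I$, and a sequence $(w_i)_{i \in I} \subset \mathbb{R}_{>0}$ of positive real numbers called the weights. Then the $x \in \mathbb{R}^I$ such that:

\[
\|x\|_{p,w} := \Big( \sum_{i \in I} |x_i|^p w_i^p \Big)^{1/p} \quad \text{if } p \in [1, \infty)
\]

\[
\|x\|_{\infty,w} := \sup_{i \in I} |x_i| w_i \quad \text{if } p = \infty
\]

[…] $\varepsilon > 0$. Then $\exists N$ such that for all $n \ge N$ we have

\[
\|f_n - f\|_{\ell^p} \le \varepsilon \implies \sum_{k=1}^{\infty} |f_n(k) - f(k)|^p \le \varepsilon^p \implies |f_n(k) - f(k)| \le \varepsilon \quad \forall k \ge 1
\]

and thus $f_n(k) \to f(k)$.

Moreover the converse to this is true if we are only considering $\ell^p(I, \mathbb{R})$, for $I$ a finite set, i.e. finite sequences in $\ell^p$. However it is false in general if $I$ is infinite.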

Recall that a Banach space is a complete normed vector space. We have already seen that $L^p$ is a Banach space, and we also know that $\ell^p$ is a Banach space for any $p \in [1, \infty]$ (see Linear Analysis notes). A useful criterion for checking completeness of a normed vector space is the following:

Lemma 2.1. Let $(E, \|\cdot\|)$ be a normed vector space. Then: $E$ is complete $\iff$ $E$ is s.t. whenever $(f_n)_n \subset E$ has $\sum_n \|f_n\|$ […]

A subspace $F$ of a normed vector space $E$ is simply a subset of $E$ which is closed under finite addition and scalar multiplication (i.e. $F$ itself is a vector space with the inherited operations).

Definition 2.3. A subspace $F$ of $E$ is closed if $E \setminus F$ is open.

Remark: As usual, we know that a subspace $F$ is closed if and only if it contains all of its limit points. If $E$ is finite dimensional, then all subspaces are closed simply because all norms on $E$ are equivalent, and so if we take $E$ with the corresponding Euclidean norm, we know $E$ is complete and so is any subspace (as isomorphic to a corresponding $\mathbb{R}^n$).

However if $E$ is infinite dimensional, not all subspaces need to be closed. Indeed, consider $E = \ell^1$ and consider

\[
F = c_0 := \{x \in \ell^1 : x_i = 0 \text{ eventually always}\}
\]

i.e. $c_0$ is the subspace of all sequences which have only finitely many non-zero entries. Then if we consider

\[
x^{(n)} := \Big( 1, \frac{1}{2}, \frac{1}{2^2}, \dots, \frac{1}{2^n}, 0, 0, \dots \Big)
\]

then we have $x^{(n)} \in c_0$ for all $n$, and $x^{(n)} \to x$ in $\ell^1$, where

\[
x = \Big( 1, \frac{1}{2}, \frac{1}{2^2}, \dots \Big).
\]

But $x \notin c_0$ and thus $c_0$ is not closed.
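The convergence $x^{(n)} \to x$ claimed above comes down to a geometric tail sum, which we can check exactly. The following is a small sketch using exact rational arithmetic (the helper name is ours): the $\ell^1$ distance between $x^{(n)}$ and $x^{(m)}$ for $m > n$ is $\sum_{j=n+1}^{m} 2^{-j} = 2^{-n} - 2^{-m}$, so letting $m \to \infty$ gives $\|x^{(n)} - x\|_{\ell^1} = 2^{-n} \to 0$.

```python
from fractions import Fraction


def l1_distance(n, m):
    """Exact l^1 distance between the truncations x^(n) and x^(m), for m > n.

    The two sequences agree in entries 0..n and x^(n) vanishes afterwards,
    so the distance is the sum of the entries 2^{-j} for j = n+1, ..., m.
    """
    return sum(Fraction(1, 2 ** j) for j in range(n + 1, m + 1))


for n in (1, 5, 10):
    # Compare with the closed form 2^{-n} - 2^{-m} for m = 50.
    print(n, l1_distance(n, 50), Fraction(1, 2 ** n) - Fraction(1, 2 ** 50))
```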

Exercise: Show that if F is a subspace of a Banach space E, then

F is closed ⇐⇒ F is Banach.

Definition 2.4. A normed vector space $E$ is said to be Euclidean if $\exists$ an inner product $\langle \cdot, \cdot \rangle : E \times E \to \mathbb{R}$ which induces the norm, i.e. $\|x\| = \sqrt{\langle x, x \rangle}$ for all $x \in E$.

We say $E$ is a Hilbert space if it is a complete inner product space.

Recall: If it exists, the inner product inducing the norm is unique, since it is completely determined by the norm (via the polarisation identity). Moreover, we can show that:

\[
\exists \text{ an inner product inducing the norm } \|\cdot\| \iff \text{the parallelogram law holds for } \|\cdot\|.
\]

Recall that the parallelogram law says that:

\[
\|f + g\|^2 + \|f - g\|^2 = 2\|f\|^2 + 2\|g\|^2 \quad \forall f, g \in E.
\]
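The parallelogram law gives a quick practical test for whether a norm can come from an inner product. As a minimal numerical sketch (the function names and the choice of test vectors in $\mathbb{R}^2$ are ours), the Euclidean norm satisfies the law while the $\ell^1$ norm already fails it on the standard basis vectors:

```python
def norm_p(x, p):
    """The l^p norm of a finite vector, for 1 <= p < infinity."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def parallelogram_defect(f, g, p):
    """||f+g||^2 + ||f-g||^2 - 2||f||^2 - 2||g||^2 in the l^p norm.

    This is zero for all f, g exactly when the parallelogram law holds.
    """
    add = [a + b for a, b in zip(f, g)]
    sub = [a - b for a, b in zip(f, g)]
    return (norm_p(add, p) ** 2 + norm_p(sub, p) ** 2
            - 2 * norm_p(f, p) ** 2 - 2 * norm_p(g, p) ** 2)


f, g = [1.0, 0.0], [0.0, 1.0]
print(parallelogram_defect(f, g, 2))  # approximately 0: the law holds for p = 2
print(parallelogram_defect(f, g, 1))  # 4.0: the law fails for p = 1
```

So the $\ell^1$ norm on $\mathbb{R}^2$ is not induced by any inner product, even though in finite dimensions it is equivalent to the Euclidean norm.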

2.2. Dual Spaces and Weak Topologies.

Definition 2.5. For $E$ a normed vector space, the dual space $E^*$ is:

\[
E^* := \{F : E \to \mathbb{R} : F \text{ is linear and continuous}\}.
\]



In the linear analysis course properties of the dual space were studied via the Hahn-Banach theorem. Here we give a slightly different formulation of the Hahn-Banach theorem which is more geometric.

Theorem 2.1 (Geometric Hahn-Banach). Let $E$ be a normed vector space. Suppose that $A, B \subset E$ are convex, non-empty, disjoint subsets (not necessarily subspaces!). Then:

(i) If $A$ is open, then $A$ and $B$ can be weakly separated by a closed hyperplane, i.e. $\exists F \in E^*$ and $\alpha \in \mathbb{R}$ such that $A \subset \{F < \alpha\}$ and $B \subset \{F \ge \alpha\}$.

(ii) If $A$ is instead closed and $B$ is compact, then $A$ and $B$ can be strictly separated by a closed hyperplane, i.e. $\exists F \in E^*$, $\alpha \in \mathbb{R}$ and $\varepsilon > 0$ such that $A \subset \{F \le \alpha - \varepsilon\}$ and $B \subset \{F \ge \alpha + \varepsilon\}$.

Proof. None given (see Part III Functional Analysis if interested!). □

Remark: The "closed hyperplane" separating $A$ and $B$ in each case is the set $\{F = \alpha\}$. In the first case we might need this hyperplane to touch $B$, whilst in the second we can ensure it touches neither set.

FIGURE 5. An illustration of the two cases in the geometric Hahn-Banach result. We can either separate $A, B$ with a hyperplane which may touch one of $A$ or $B$, or if we have other assumptions we can choose it to be a positive distance away from both $A, B$.

We will see that this theorem has many applications in this course, as well as many applications in general (e.g. the Krein-Milman theorem - see Part III Functional Analysis).

Definition 2.6. Let $X$ be a set and $(Y_i)_{i \in I}$ topological spaces, and suppose we have maps $\varphi_i : X \to Y_i$. Then the initial topology on $X$ generated by the $(\varphi_i)_{i \in I}$ is the smallest/coarsest topology on $X$ such that all of the $\varphi_i$ are continuous.

Thus if $U \subset Y_i$ is open, then we need $\varphi_i^{-1}(U)$ to be open in $X$. Thus the initial topology generated by $(\varphi_i)_i$ is the topology generated by:

\[
\{V : V = \varphi_i^{-1}(U) \text{ for some } i \text{ and some } U \subset Y_i \text{ open}\}
\]

and so in particular such a topology does exist. In particular, the topology generated by such a collection of sets is formed by all the arbitrary unions of finite intersections of the sets (i.e. finite intersections first, then unions - taking unions first and then intersections will not create a topology without taking unions at the end).

Thus the initial topology generated by $(\varphi_i)_i$ can be written as [Exercise to check this is a topology]:

\[
\mathcal{T} = \Big\{ \bigcup_{i \in \mathcal{I}} V_i : \mathcal{I} \text{ is arbitrary and } V_i = \bigcap_{j \in J_i} \varphi_j^{-1}\big( U_j^{(i)} \big) \text{ for some finite } J_i \subset I \text{ and } U_j^{(i)} \subset Y_j \text{ open} \Big\}.
\]

Example 2.3 (Indiscrete topology is an initial topology). Suppose we have $\varphi_i : X \to \mathbb{R}$ being constant, i.e. $\varphi_i(x) = c_i$ for all $x$, for some $c_i \in \mathbb{R}$. Then what is the initial topology generated by the $\varphi_i$? Well for the $\varphi_i$ to be continuous we need $\varphi_i^{-1}(U)$ to be open in $X$ for any $U \subset \mathbb{R}$ open. But we have

\[
\varphi_i^{-1}(U) =
\begin{cases}
X & \text{if } c_i \in U \\
\emptyset & \text{otherwise}
\end{cases}
\]

and thus the smallest topology on $X$ such that all the $\varphi_i$ are continuous is the smallest topology containing $X, \emptyset$, i.e. $\mathcal{T} = \{X, \emptyset\}$, which is the indiscrete (trivial) topology.
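On a finite set the recipe "finite intersections first, then unions" can be carried out by brute force. The following is a minimal sketch under our own assumptions: the set $X = \{1, 2, 3\}$, the two-point Sierpiński space as target (open sets $\emptyset, \{1\}, \{0, 1\}$), and all function names are hypothetical choices for illustration. A constant map generates the indiscrete topology, as in the example above, while a non-constant map generates one extra open set.

```python
def initial_topology(X, maps):
    """Initial topology on a finite set X.

    `maps` is a list of pairs (phi, opens): phi is a dict X -> Y and
    opens is a list of frozensets, the open subsets of the target Y.
    """
    X = frozenset(X)
    # Sub-basis: all preimages of open sets under the given maps.
    subbasis = {frozenset(x for x in X if phi[x] in U)
                for phi, opens in maps for U in opens}
    # Step 1: close under finite intersections (X is the empty intersection).
    basis = {X}
    changed = True
    while changed:
        changed = False
        for A in list(basis):
            for S in subbasis:
                if A & S not in basis:
                    basis.add(A & S)
                    changed = True
    # Step 2: close under unions (the empty set is the empty union).
    topology = {frozenset()}
    changed = True
    while changed:
        changed = False
        for A in list(topology):
            for B in basis:
                if A | B not in topology:
                    topology.add(A | B)
                    changed = True
    return topology


X = {1, 2, 3}
sierpinski_opens = [frozenset(), frozenset({1}), frozenset({0, 1})]
constant_map = {1: 1, 2: 1, 3: 1}
other_map = {1: 0, 2: 1, 3: 1}

print(initial_topology(X, [(constant_map, sierpinski_opens)]))
print(initial_topology(X, [(other_map, sierpinski_opens)]))
```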

Definition 2.7. Let $(X, \tau)$ be an arbitrary topological space, and $x \in X$. Then we say:

• $V$ is a neighbourhood of $x$ if $\exists U \in \tau$ open with $x \in U \subset V$.

• A neighbourhood system of $x$, denoted $\mathcal{N}(x)$, is the collection of all neighbourhoods of $x$.

• A neighbourhood basis of $x$, denoted $\mathcal{B}(x)$, is a subcollection of $\mathcal{N}(x)$ such that any $V \in \mathcal{N}(x)$ has $B \subset V$ for some $B \in \mathcal{B}(x)$.

Proposition 2.1. Let $X$ be a set, $(Y_i)_{i \in I}$ topological spaces and $\varphi_i : X \to Y_i$. Let $\mathcal{T}$ be the initial topology on $X$ generated by the $(\varphi_i)_{i \in I}$. Then for any $x \in X$, a neighbourhood basis of $x$ is:

\[
\mathcal{B}(x) = \Big\{ \bigcap_{\text{finite}} \varphi_i^{-1}(U_i) : U_i \text{ is an open neighbourhood of } \varphi_i(x) \text{ in } Y_i \Big\}.
\]

Proof. First note that each element of $\mathcal{B}(x)$ is indeed a (open) neighbourhood of $x$ in the initial topology generated by the $(\varphi_i)_i$. Indeed, each $\varphi_i^{-1}(U_i)$ is open containing $x$, and a finite intersection of open sets each containing $x$ will also be an open set containing $x$, and thus $\bigcap_{\text{finite}} \varphi_i^{-1}(U_i)$ is a (open) neighbourhood of $x$.

Now let $V \in \mathcal{N}(x)$ be an arbitrary neighbourhood of $x$. So $\exists U \subset V$ open with $x \in U$. So as $U$ is open in the initial topology and we know what the open sets of the initial topology look like, we have

\[
U = \bigcup_{i \in \mathcal{I}} V_i
\]

where the $V_i$ are finite intersections of pre-images under the $\varphi_i$. In particular we must have $x \in V_i$ for some $V_i$, and wlog (for notational simplicity) say $x \in V_1$. Then we know $V_1$ is of the form

\[
V_1 = \bigcap_{j \in J} \varphi_j^{-1}(W_j)
\]

for some $W_j \subset Y_j$ open and $J$ finite. In particular, $x \in V_1$ and so $\varphi_j(x) \in W_j$ for all $j \in J$, and so $W_j$ is an open neighbourhood of $\varphi_j(x)$ in $Y_j$. Hence $V_1$ is an element of $\mathcal{B}(x)$ with $x \in V_1 \subset U \subset V$, and thus this $\mathcal{B}(x)$ is a neighbourhood basis of $x$.

□

So we know what the open sets of an initial topology look like. However in applications, we tend to work with convergent sequences as opposed to the actual open sets themselves. So what does convergence look like in an initial topology? It turns out to be very simple:

Lemma 2.2 (Convergence in initial topologies). Let $X$ be a set, $(Y_i)_{i \in I}$ topological spaces and $\varphi_i : X \to Y_i$. Let $\mathcal{T}$ be the initial topology on $X$ generated by the $(\varphi_i)_{i \in I}$. Suppose we have a sequence $(x_n)_n \subset X$. Then:

\[
x_n \to x \text{ in } (X, \mathcal{T}) \iff \varphi_i(x_n) \to \varphi_i(x) \text{ in } Y_i, \ \forall i \in I.
\]

Proof. (⇒) : Suppose $x_n \to x$ in $(X, \mathcal{T})$. Then this means that for any open neighbourhood $U$ of $x$, we have $x_n \in U$ eventually always.

So fix $i \in I$. To show $\varphi_i(x_n) \to \varphi_i(x)$ we need to show that for any open neighbourhood $W$ of $\varphi_i(x)$ we have $\varphi_i(x_n) \in W$ eventually always. So let $W$ be an open neighbourhood of $\varphi_i(x)$. Then $\varphi_i^{-1}(W)$ is an open neighbourhood of $x$ in $X$, by definition of the initial topology. Thus $\exists N$ such that for all $n \ge N$ we have $x_n \in \varphi_i^{-1}(W)$, i.e. $\varphi_i(x_n) \in W$ for all $n \ge N$. Hence this shows $\varphi_i(x_n) \to \varphi_i(x)$ for any $i$, since $i \in I$ was arbitrary.

(⇐) : Suppose $\varphi_i(x_n) \to \varphi_i(x)$ in $Y_i$ for all $i$. Let $U \ni x$ be an open neighbourhood of $x$ in $X$. Then by Proposition 2.1 (although we don't really need to use the full power of Proposition 2.1 - we could just use the fact that we know what the open sets in $\mathcal{T}$ look like) we know that some element of $\mathcal{B}(x)$ is contained in $U$, i.e. we have

\[
x \in \bigcap_{\text{finite}} \varphi_i^{-1}(W_i) \subset U
\]

for finitely many $i$ and $W_i \subset Y_i$ open. Hence $W_i$ is an open neighbourhood of $\varphi_i(x)$ for these $i$, and so for each $i$ we can choose $N_i$ such that $\forall n \ge N_i$ we have $\varphi_i(x_n) \in W_i$, since $\varphi_i(x_n) \to \varphi_i(x)$. Taking $N = \max_{\text{finite}} \{N_i\}$, we have $\varphi_i(x_n) \in W_i$ for all these $i$, $\forall n \ge N$. Hence for all $n \ge N$ we have

\[
x_n \in \bigcap_{\text{finite}} \varphi_i^{-1}(W_i) \subset U
\]

i.e. we are eventually always within $U$. Thus as $U \ni x$ was an arbitrary open neighbourhood this shows $x_n \to x$ in $(X, \mathcal{T})$ and so we are done.

□

Lemma 2.3 ("Universal Property" of initial topologies). Let $X$ be a set, $(Y_i)_{i \in I}$ topological spaces and $\varphi_i : X \to Y_i$. Let $\mathcal{T}$ be the initial topology on $X$ generated by the $(\varphi_i)_{i \in I}$. Let $Z$ be another topological space and suppose we have $\psi : Z \to X$. Then:

\[
\psi \text{ is continuous} \iff \varphi_i \circ \psi : Z \to Y_i \text{ is continuous } \forall i \in I.
\]

Proof. (⇒) : Suppose $\psi$ is continuous. Then we know each $\varphi_i$ is continuous, since by definition the initial topology is the smallest topology such that all $\varphi_i$ are continuous. Hence $\varphi_i \circ \psi$ is a composition of continuous maps and so is continuous.

(⇐) : To see that $\psi$ is continuous we need to show that for any open $U \subset X$, $\psi^{-1}(U)$ is open. But any open $U \subset X$ can be written as a union of finite intersections of sets of the form $\varphi_i^{-1}(U_i)$ for $U_i \subset Y_i$ open, and so as $\psi^{-1}(A \cup B) = \psi^{-1}(A) \cup \psi^{-1}(B)$ and $\psi^{-1}(A \cap B) = \psi^{-1}(A) \cap \psi^{-1}(B)$ in general (and likewise for arbitrary unions), it suffices to show that $\psi^{-1}(\varphi_i^{-1}(U_i))$ is open for any $i$ and $U_i$. But

\[
\psi^{-1}(\varphi_i^{-1}(U_i)) = (\varphi_i \circ \psi)^{-1}(U_i)
\]

and thus this is open since $\varphi_i \circ \psi$ is continuous.

□

To motivate initial topologies, let us see that many topologies we already know are initial topologies.

Example 2.4 (Product topology is an initial topology). Let $(X_i)_i$ be an arbitrary collection of topological spaces. Then the product topology on $\prod_i X_i$ is exactly the initial topology generated by the projection maps $\pi_i : \prod_i X_i \to X_i$, defined by $\pi_i(x) := x_i$.

Example 2.5 (Subspace topology is an initial topology). Let $X$ be a topological space and $Y \subset X$ a subset. Then the subspace topology on $Y$ is exactly the initial topology generated by the inclusion map $\iota : Y \to X$, defined by $\iota(x) := x$.

Example 2.6 (Quotient Topology is a final topology). Similarly to an initial topology, we can define a final topology as follows. Suppose $X$ is a set and $(Y_i)_{i \in I}$ are topological spaces and we have maps $\varphi_i : Y_i \to X$ (so now the maps land in $X$). Then the final topology on $X$ is the largest/finest topology on $X$ such that all $\varphi_i$ are continuous.

Now let $X$ be a topological space and $Y \subset X$ a subset. Then the quotient topology on $X/Y$ is exactly the final topology generated by the quotient map $\pi : X \to X/Y$ sending $x \mapsto [x]$.


Let us now define the main topologies we will be working with and see how they behave.

Definition 2.8. Let $E$ be a normed vector space, with dual space $E^*$. Then the weak topology on $E$, denoted $\sigma(E, E^*)$ or $w$, is the initial topology on $E$ generated by $E^*$,

i.e. it is the smallest topology on $E$ with respect to which all elements of $E^*$ are continuous.

Notation: If we have an initial topology on a set $X$ generated by maps $(\varphi_i)_{i \in I}$, we tend to denote the initial topology by $\sigma(X, (\varphi_i)_{i \in I})$.

Remark: If we have $(x_n)_n \subset E$ and $x_n$ converges to $x$ in the weak topology on $E$, we denote this by: $x_n \rightharpoonup x$, or $x_n \xrightarrow{w} x$. From Lemma 2.2 we then have:

\[
x_n \rightharpoonup x \iff F(x_n) \to F(x) \quad \forall F \in E^*.
\]
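To get a feel for this characterisation, consider the standard example of the unit vectors $e_n$ in $\ell^2$. The sketch below relies on an assumption not made in this section: that every $F \in (\ell^2)^*$ has the form $F(x) = \sum_k y_k x_k$ for some $y \in \ell^2$ (the Riesz representation theorem). We also truncate to finitely many coordinates, and the particular $y$ and all names are illustrative. Then $F(e_n) = y_n \to 0 = F(0)$, so $e_n \rightharpoonup 0$, even though $\|e_n\| = 1$ and $\|e_n - e_m\| = \sqrt{2}$, so there is no norm convergence.

```python
import math

DIM = 1000  # truncation dimension used to imitate l^2 on a computer


def e(n):
    """The n-th standard unit vector (0-indexed), truncated to DIM coordinates."""
    return [1.0 if k == n else 0.0 for k in range(DIM)]


def pair(y, x):
    """The functional F_y(x) = sum_k y_k x_k."""
    return sum(a * b for a, b in zip(y, x))


def norm2(x):
    return math.sqrt(sum(t * t for t in x))


# A fixed square-summable sequence representing one functional F_y.
y = [1.0 / (k + 1) for k in range(DIM)]

for n in (0, 9, 99, 999):
    # F_y(e_n) = 1/(n+1) tends to 0, while the norm of e_n stays equal to 1.
    print(n, pair(y, e(n)), norm2(e(n)))
```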

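Proposition 2.2. The weak topology $\sigma(E, E^*)$ on a normed vector space $E$ is Hausdorff.

Proof. This comes immediately from the (geometric) Hahn-Banach theorem. Suppose $x \ne y$ in $E$. Then we have $A = \{x\}$ is closed, convex and non-empty, and $B = \{y\}$ is compact, convex and non-empty, with $A$ and $B$ disjoint ("closed" and "compact" here are with respect to the norm topology on $E$).

Then from case (ii) of the geometric Hahn-Banach theorem (using $\{F \le \alpha - \varepsilon\} \subset \{F < \alpha\}$ and $\{F \ge \alpha + \varepsilon\} \subset \{F > \alpha\}$), we get that $\exists F \in E^*$ and $\alpha \in \mathbb{R}$ such that

\[
A \subset \{F < \alpha\} = F^{-1}((-\infty, \alpha)) =: U_1, \qquad B \subset \{F > \alpha\} = F^{-1}((\alpha, +\infty)) =: U_2.
\]

Then by definition of an initial topology, since $(-\infty, \alpha), (\alpha, +\infty)$ are open in $\mathbb{R}$ we have that $F^{-1}((-\infty, \alpha))$ and $F^{-1}((\alpha, +\infty))$ are open in $\sigma(E, E^*)$. Hence $x \in U_1$, $y \in U_2$, with $U_1 \cap U_2 = \emptyset$ and $U_1, U_2$ are open in the weak topology. Thus $(E, \sigma(E, E^*))$ is Hausdorff.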

□

A comment on language: For a normed vector space $E$, we call the topology induced by the norm the strong topology on $E$. When we talk about open and closed sets, we tend to say a set is strongly open, strongly compact, etc, if it is open/compact with respect to the strong topology, whilst we say that a set is weakly open, weakly compact, etc, if it is open/compact with respect to the weak topology, $\sigma(E, E^*)$.

Proposition 2.3 (Relation between strong and weak topologies). Let $E$ be a normed vector space. Then the weak topology is always coarser than the strong topology, i.e.

\[
\sigma(E, E^*) \subset \mathcal{T}_{\text{strong}}
\]

for $\mathcal{T}_{\text{strong}}$ the strong topology on $E$. Moreover we have

\[
E \text{ is finite dimensional} \iff \sigma(E, E^*) = \mathcal{T}_{\text{strong}}.
\]

Remark: Proposition 2.3 tells us that in particular for infinite dimensional normed vector spaces we always have $\sigma(E, E^*) \subsetneq \mathcal{T}_{\text{strong}}$. However this does not mean that convergence of sequences in the weak and strong topologies can't be the same, e.g. for $E = \ell^1$ and $(x_n)_n \subset \ell^1$ we have

\[
x_n \to x \text{ strongly} \iff x_n \rightharpoonup x \text{ weakly}
\]

[see Example Sheet 2].

Proof of Proposition 2.3. For the first claim, note that by definition of $E^*$, we know all elements $F \in E^*$ are continuous with respect to the strong topology on $E$. Thus as $\sigma(E, E^*)$ is the smallest topology on $E$ for which all the $F \in E^*$ are continuous, this gives $\sigma(E, E^*) \subset \mathcal{T}_{\text{strong}}$ in general.

(⇒) : Suppose $E$ is finite dimensional. We just need to show $\mathcal{T}_{\text{strong}} \subset \sigma(E, E^*)$. To do this, it suffices to show that every open ball $B_r(x)$ in the strong topology is weakly open, since the open balls generate the strong topology.

Moreover since $E$ is finite dimensional all norms on $E$ are equivalent, and the topology generated by any norm on $E$ is the same as the topology generated by any other norm. Thus we can wlog take the norm on $E$ to be the supremum norm, i.e. $\|\cdot\|_\infty$. Then,

\[
B_r(x) := \{y \in E : |y_i - x_i| < r \text{ for all } i = 1, \dots, n\}
\]

if $n = \dim(E)$. Consider the projection maps $\pi_i : E \to \mathbb{R}$, sending $(x_1, \dots, x_n) \mapsto x_i$ (this is all w.r.t. some chosen basis on $E$). Then clearly we have $\pi_i \in E^*$ for each $i$, and so

\[
\begin{aligned}
B_r(x) &= \{y \in E : y_i \in (x_i - r, x_i + r) \text{ for all } i = 1, \dots, n\} \\
&= \bigcap_{i=1}^{n} \{y \in E : \pi_i(y) \in (x_i - r, x_i + r)\} \\
&= \bigcap_{i=1}^{n} \pi_i^{-1}((x_i - r, x_i + r))
\end{aligned}
\]

which is a finite intersection of weakly open sets (as $\pi_i \in E^*$ for each $i$) and thus is weakly open. Hence $B_r(x)$ is weakly open for any $x, r$ and thus any strongly open set is weakly open. So we are done.

(⇐) : We prove the contrapositive. Suppose that $E$ is infinite dimensional. Consider the unit sphere in $E$:

\[
S := \{x \in E : \|x\| = 1\}.
\]

Clearly $S$ is strongly closed. We will show that $S$ is not weakly closed by showing that $0$ is in the weak closure of $S$ (and then this shows $\sigma(E, E^*) \ne \mathcal{T}_{\text{strong}}$ and we are done).

So let $U$ be any weak neighbourhood of $0$. Then by Proposition 2.1 we know that we can find $F_1, \dots, F_n \in E^*$ such that

\[
0 \in \bigcap_{i=1}^{n} F_i^{-1}(U_i) \subset U
\]

where $U_i$ is an open neighbourhood of $F_i(0) = 0$ in $\mathbb{R}$. Thus for each $i$ we can find $\varepsilon_i > 0$ with $(-\varepsilon_i, \varepsilon_i) \subset U_i$, and so taking $\varepsilon = \min\{\varepsilon_1, \dots, \varepsilon_n\} > 0$, we have

\[
0 \in A := \bigcap_{i=1}^{n} F_i^{-1}((-\varepsilon, \varepsilon)) \subset U.
\]

Now consider the map $\Phi : E \to \mathbb{R}^n$ defined by

\[
\Phi(x) := (F_1(x), \dots, F_n(x)).
\]

Then $\Phi$ is clearly linear as the $F_i$ are, and we have

\[
\ker(\Phi) = \bigcap_{i=1}^{n} \ker(F_i).
\]

Then since $E$ is infinite dimensional and $\mathbb{R}^n$ is finite dimensional, we must have that $\ker(\Phi)$ is infinite dimensional (by the rank-nullity theorem - if you don't like applying the rank-nullity theorem when $E$ is not finite dimensional, apply the rank-nullity theorem on $\Phi|_{E_N}$ for $E_N$ a subspace of dimension $N$, and then take $N \to \infty$).

In particular, we have $\ker(\Phi) \ne \{0\}$, and so $\exists x \ne 0$, $x \in \ker(\Phi) = \bigcap_{i=1}^{n} \ker(F_i)$, i.e. $\exists x \in E \setminus \{0\}$ with $F_i(x) = 0$ for all $i$. Thus as the $F_i$ are linear, we have $F_i(\lambda x) = 0$ for all $\lambda \in \mathbb{R}$ and so $\lambda x \in A$ for all $\lambda \in \mathbb{R}$ and so $\lambda x \in U$ for all $\lambda \in \mathbb{R}$ (iv). Taking $\lambda = 1/\|x\|$ this shows that

\[
\frac{x}{\|x\|} \in U \cap S.
\]

Hence any weakly open neighbourhood of $0$ intersects $S$. Hence $0$ belongs to the weak closure of $S$: indeed, if $0 \notin \overline{S}^{\sigma(E, E^*)} =: B$ (the weak closure), then $0 \in E \setminus B$ which is a weakly open set (being the complement of a weakly closed set), and thus it must intersect $S$, i.e. $\exists y \in S \cap (E \setminus B)$. But as $y \notin B$ and $S \subset B$ this implies $y \notin S$, which is a contradiction.

Thus $0$ is in the weak closure of $S$, and so as $0$ is not in the strong closure of $S$ (as this is just $S$) we are done.

□
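The linear algebra at the heart of the (⇐) direction is that finitely many functionals always share a non-zero kernel vector once the dimension exceeds their number. Here is a minimal sketch in $\mathbb{R}^3$ with two hypothetical functionals $F_1, F_2$ (represented by coefficient vectors of our choosing), where the cross product produces such a vector. Of course in $\mathbb{R}^3$ the weak and strong topologies coincide, so this only illustrates the kernel step and not the infinite dimensional conclusion, where the argument works for every finite family $F_1, \dots, F_n$.

```python
import math


def dot(a, b):
    return sum(s * t for s, t in zip(a, b))


def cross(a, b):
    """Cross product in R^3; the result is orthogonal to both a and b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


# Two functionals on R^3, F_i(x) = dot(F_i, x) (hypothetical coefficients).
F1 = (1.0, 2.0, 0.0)
F2 = (0.0, 1.0, -1.0)

x = cross(F1, F2)                 # non-zero vector in ker(F1) ∩ ker(F2)
norm_x = math.sqrt(dot(x, x))
u = tuple(t / norm_x for t in x)  # rescaled onto the unit sphere S

# Every multiple of x stays in the common kernel, including the unit vector u.
print(x, dot(F1, x), dot(F2, x), math.sqrt(dot(u, u)))
```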

Now we define the next important initial topology, but this time it is a topology only defined on dual spaces. Recall that if $E$ is a normed vector space, then we know that $E^*$ is a normed vector space as well, with norm

\[
\|F\|_{E^*} := \sup_{x \in E \setminus \{0\}} \frac{|F(x)|}{\|x\|_E} \equiv \sup_{x \in E : \|x\|_E \le 1} |F(x)|.
\]

Definition 2.9. Let $E$ be a normed vector space. Then the weak-∗ topology (i.e. weak star topology) on $E^*$, denoted $\sigma(E^*, E)$ or $w^*$, is the initial topology on $E^*$ generated by the evaluation maps:

\[
\varphi_f : E^* \to \mathbb{R} \quad \text{defined by} \quad \varphi_f(F) := F(f) \quad \text{for each } f \in E
\]

(note that $\varphi_f \in E^{**}$ for each $f \in E$).

Remark: Recall that we have a canonical embedding $\Phi : E \to E^{**}$ defined by $f \mapsto \varphi_f$, where $\varphi_f$ is as in Definition 2.9. We sometimes write $\Phi(f) \equiv \hat{f}$ for the canonical embedding, i.e. $\varphi_f \equiv \hat{f}$. Thus the notation $\sigma(E^*, E)$ for the weak-∗ topology really means $\sigma(E^*, \hat{E})$ to be consistent with our previous notation, i.e. it is the initial topology generated by all elements of $\hat{E}$, which are the evaluation maps.

(iv) Note that we have shown that in infinite dimensional spaces weakly open neighbourhoods of, e.g., $0$ are huge, as this shows that they contain entire lines.

Note: The weak topology can be defined on any normed vector space, whilst the weak-∗ topology is only defined on dual spaces. We usually write $F_n \overset{*}{\rightharpoonup} F$ to mean that $(F_n)_n \subset E^*$ converges in the weak star topology to $F \in E^*$.

Note that once again using Lemma 2.2, we can see exactly what convergence in the weak star topology actually means, i.e. if $(F_n)_n \subset E^*$, then

\[
F_n \overset{*}{\rightharpoonup} F \iff \varphi_f(F_n) \to \varphi_f(F) \ \forall f \in E, \quad \text{i.e.} \iff F_n(f) \to F(f) \ \forall f \in E.
\]

So weak star convergence is just pointwise convergence on $E$.

Proposition 2.4. $E^*$ with the weak star topology $\sigma(E^*, E)$ is Hausdorff.

Proof. Suppose $F_1 \ne F_2$ in $E^*$. Then we can find $f \in E$ with $F_1(f) \ne F_2(f)$. So wlog assume $F_1(f) < \alpha < F_2(f)$ for some $\alpha \in \mathbb{R}$. Then let

\[
U_1 := \varphi_f^{-1}((-\infty, \alpha)), \qquad U_2 := \varphi_f^{-1}((\alpha, \infty)).
\]

Note that these are both open in the weak-∗ topology (as the topology is generated by the $\varphi_f$'s) and clearly $U_1 \cap U_2 = \emptyset$, $F_1 \in U_1$, $F_2 \in U_2$. So these separate $F_1, F_2$ and so we are done. □

So now we have 3 potential topologies we could place on $E^*$: the strong topology (induced by the norm $\|\cdot\|_{E^*}$), the weak topology (generated by all elements of $E^{**}$) and the weak-∗ topology (generated by the $\varphi_f$, i.e. $\hat{E} \subset E^{**}$). Since $\hat{E} \subset E^{**}$, the weak-∗ topology is generated by a smaller class of functions than the weak topology, and thus we have

\[
\text{weak-}{*} \subset \text{weak}
\]

and because the weak topology is always weaker than the strong topology, we have

\[
\underbrace{\sigma(E^*, E)}_{\text{weak-}*} \subset \underbrace{\sigma(E^*, E^{**})}_{\text{weak}} \subset \mathcal{T}_{\text{strong}}.
\]

The next result is of critical importance in weak topologies, and is the first main result demonstrating why weak topologies (and in particular the weak-∗ topology) are useful. It shows that the "hunt for compactness" is possible. However we do not prove it:

Theorem 2.2 (Banach-Alaoglu). The closed unit ball of $E^*$ is always weak-∗ compact.

Proof. None given - see Part III Functional Analysis (the proof uses Tychonoff's theorem: a product of an arbitrary collection of compact topological spaces is compact in the product topology). □

So the weak-∗ topology finds the right balance between having enough information whilst having few enough open sets. The fewer open sets there are in a topology the more compact sets there are (just from the definition of compactness in terms of open covers) but the fewer open sets the less precise the convergence is. Thus the weak-∗ topology contains few enough open sets for the closed unit ball to be compact.


2.3. Reflexivity and Weak Compactness.

We have seen how Banach-Alaoglu gives weak-∗ compactness of the closed unit ball of the dual.However this is only useful if the space we are looking at is the dual of another space, which is notalways the case. So next we look at how we can use the Banach-Alaoglu result to recover compactnesson the original space E for the weak topology.

Indeed, recall that we have a canonical embedding ϕ : E → E∗∗ sending f → ϕ f . Moreover weknow that this map is an isometry (from Hahn-Banach), meaning that the closed unit ball of E, BE ,is mapped into the closed unit ball of E∗∗, i.e.

ϕ(BE) ⊂ BE∗∗ .This is useful, since E∗∗ is a dual space, we know that BE∗∗ is compact in the weak-∗ topology. Sowhat could we say if we ever had ϕ(BE) = BE∗∗? By rescaling this would tell us that ϕ is a surjectionE → E∗∗ and so we can consider its inverse, ϕ−1 : E∗∗ → E. Now consider ϕ−1 as a map from(E∗∗,σ(E∗∗, E∗)) (E∗∗ with its weak-∗ topology) to (E,σ(E, E∗)) (E with its weak topology). Weclaim that this map is continuous.

Indeed, to see this we can use the Universal Property, Lemma 2.3. To show that this map is continuous we just need to show that ϕi ◦ ϕ−1 is continuous for each ϕi generating σ(E, E∗), i.e. for each ϕi = F ∈ E∗. So if ϕi = F ∈ E∗ we want to show that F ◦ ϕ−1 is continuous on (E∗∗,σ(E∗∗, E∗)). So note that for any g ∈ E∗∗, we have g = ϕ(f) = ϕ_f for some f ∈ E, and so

(F ◦ ϕ−1)(g) = F(f) = f̂(F) = ϕ_f(F) = g(F).

But now note if we define Ψ : E∗∗ → ℝ by Ψ(g) := g(F) ≡ F̂(g), this is one of the evaluation maps generating σ(E∗∗, E∗), and thus is continuous. Hence F ◦ ϕ−1 is continuous on (E∗∗,σ(E∗∗, E∗)), and hence ϕ−1 : (E∗∗,σ(E∗∗, E∗)) → (E,σ(E, E∗)) is a continuous map. Thus we see that since BE∗∗ is weak-∗ compact, we have

BE = ϕ−1(BE∗∗)

is the continuous image of a compact set, and thus BE is weakly compact in E.

Thus we have shown that the closed unit ball of E is weakly compact, provided we had ϕ(BE) = BE∗∗. But we can easily see that this condition is equivalent to ϕ being surjective, and thus is equivalent to E being reflexive (in particular as reflexive spaces must be Banach, we must have E being a Banach space). This shows the true importance of reflexivity, as being a means of showing that the closed unit ball in our space is weakly compact. It turns out that the converse is true as well, and so we have the following:

Theorem 2.3 (Kakutani). Let E be a Banach space. Then:

E is reflexive ⇐⇒ The closed unit ball in E is weakly compact.

Proof. (⇒) : This direction is just what we have seen above. If E is reflexive then ϕ : E → E∗∗ is surjective and so ϕ(BE) = BE∗∗. Then above we showed that

ϕ−1 : (E∗∗,σ(E∗∗, E∗))→ (E,σ(E, E∗))

is a continuous map, and thus BE = ϕ−1(BE∗∗) is the continuous image of a compact set (by Banach-Alaoglu) and so is compact in (E,σ(E, E∗)), as we wanted.

(⇐) : We need a preliminary lemma before proving this. □

Proving the converse to Theorem 2.3 requires the following lemma, due to Goldstine.

Lemma 2.4 (The Goldstine Lemma). Let ϕ : E → E∗∗ be the canonical embedding. Then ϕ(BE) is weak-∗ dense in BE∗∗ (i.e. dense with respect to the σ(E∗∗, E∗)-topology on E∗∗).
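As an illustration of the statement, here is a sketch for E = c_0. It assumes the standard identifications of the dual of c_0 with ℓ^1 and of the dual of ℓ^1 with ℓ^∞, which are not established in this section; under them the canonical embedding is the inclusion of c_0 into ℓ^∞. The constant sequence is not in c_0, but its truncations converge to it weak-∗:

```latex
% Assumed identifications (not proved here): c_0^* = \ell^1, (\ell^1)^* = \ell^\infty,
% with \varphi : c_0 \to \ell^\infty the inclusion map.
\[
  \psi = (1,1,1,\dots) \in B_{\ell^\infty}, \qquad
  f_N = (\underbrace{1,\dots,1}_{N},0,0,\dots) \in B_{c_0}.
\]
% For every F = (a_k)_k \in \ell^1 the series \sum_k a_k converges absolutely, so
\[
  \varphi(f_N)(F) = \sum_{k \le N} a_k \;\longrightarrow\; \sum_{k=1}^{\infty} a_k = \psi(F)
  \qquad (N \to \infty).
\]
```

So every weak-∗ neighbourhood of ψ determined by finitely many F ∈ ℓ^1 contains ϕ(f_N) for N large, even though ψ itself does not lie in ϕ(c_0).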

Proof. We already know since ϕ is an isometry that ϕ(BE) ⊂ BE∗∗, and since by Banach-Alaoglu BE∗∗ is weak-∗ closed, we have

\[
  \overline{\varphi(B_E)}^{\,w*} \subset \overline{B_{E^{**}}}^{\,w*} = B_{E^{**}}.
\]

So we just need the other inclusion. Let ψ ∈ BE∗∗, and let V be a w∗-neighbourhood of ψ in E∗∗. Then from Proposition 2.1 we know what a neighbourhood basis of ψ is, and thus we can wlog (just by restricting V) assume that V is an element of the neighbourhood basis and thus takes the form:

V = {Ψ ∈ E∗∗ : |Ψ(Fi) − ψ(Fi)| < ε for i = 1, . . . , n}

for some ε > 0 and F1, . . . , Fn ∈ E∗ (as the topology is generated by the evaluation maps on elements of E∗).

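We try to find f ∈ BE with |Fi(f) − ψ(Fi)| < ε for i = 1, . . . , n. Indeed, write αi := ψ(Fi) and note that for any β1, . . . , βn ∈ ℝ,

\[
  \Bigl|\sum_i \beta_i \alpha_i\Bigr|
  = \Bigl|\sum_i \beta_i \psi(F_i)\Bigr|
  = \Bigl|\psi\Bigl(\sum_i \beta_i F_i\Bigr)\Bigr|
  \le \|\psi\|_{E^{**}} \cdot \Bigl\|\sum_i \beta_i F_i\Bigr\|_{E^*}
  \le \Bigl\|\sum_i \beta_i F_i\Bigr\|_{E^*}
\]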

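where we have used the fact that ψ ∈ BE∗∗ and so ‖ψ‖_{E∗∗} ≤ 1. Now consider the linear map H : E → ℝn sending f ↦ (F1(f), . . . , Fn(f)), and write α = (α1, . . . , αn). If no such f ∈ BE existed, then no point of H(BE) would lie within ε of α (in the maximum norm on ℝn), and so α would not lie in the closure of H(BE). But {α} and the closure of H(BE) are convex, closed and disjoint subsets of ℝn (convexity coming from the linearity of H and the convexity of BE), with {α} compact, and so they can be strictly separated by a hyperplane. Thus ∃β ∈ ℝn and γ ∈ ℝ with

\[
  \beta \cdot H(f) < \gamma < \beta \cdot \alpha
\]

for all f ∈ BE. Thus we have

\[
  \sum_i \beta_i F_i(f) = \beta \cdot H(f) < \gamma < \beta \cdot \alpha
\]

for all f ∈ BE, which shows (taking the supremum over f ∈ BE, which is symmetric) that

\[
  \Bigl\|\sum_i \beta_i F_i\Bigr\|_{E^*} \le \gamma < \beta \cdot \alpha \le \Bigl|\sum_i \beta_i \alpha_i\Bigr|
\]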

which contradicts the inequality above. Hence this means that we must be able to find f ∈ BE with |Fi(f) − ψ(Fi)| < ε for i = 1, . . . , n, i.e.

|ϕ_f(Fi) − ψ(Fi)| < ε for all i = 1, . . . , n,

i.e. ϕ_f ≡ ϕ(f) ∈ V. Hence this shows that

V ∩ ϕ(BE) ≠ ∅

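for any weak-∗ neighbourhood V of ψ ∈ BE∗∗ in E∗∗. But then this implies that ψ lies in the weak-∗ closure $\overline{\varphi(B_E)}^{\,w*}$: indeed, if not then $E^{**} \setminus \overline{\varphi(B_E)}^{\,w*}$ is a weak-∗ open neighbourhood of ψ, and so must intersect ϕ(BE), which is a contradiction. Thus $\psi \in \overline{\varphi(B_E)}^{\,w*}$ for any ψ ∈ BE∗∗, i.e. $B_{E^{**}} \subset \overline{\varphi(B_E)}^{\,w*}$, giving the other inclusion and so we are done.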

□
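The case n = 1 of the argument above can be checked by hand, without the separation theorem. The following is only a sketch under the assumption of real scalars; the auxiliary point f_0 is introduced purely for this illustration.

```latex
% Single functional F \in E^*, F \neq 0, and \alpha := \psi(F), so |\alpha| \le \|F\|_{E^*}.
% By definition of the dual norm, pick f_0 \in B_E with F(f_0) > \|F\|_{E^*} - \varepsilon.
\[
  f := \frac{\alpha}{\|F\|_{E^*}}\, f_0 \in B_E ,
  \qquad
  |F(f) - \alpha|
  = \frac{|\alpha|}{\|F\|_{E^*}}\,\bigl(\|F\|_{E^*} - F(f_0)\bigr)
  \le \|F\|_{E^*} - F(f_0) < \varepsilon .
\]
% If F = 0 then \alpha = 0 and any f \in B_E works.
```

For n ≥ 2 the functionals can no longer be handled one at a time, which is where the hyperplane separation in the proof is needed.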

Proof of Theorem 2.3 continued.

(⇐): First note that ϕ : (E,σ(E, E∗)) → (E∗∗,σ(E∗∗, E∗∗∗)) is continuous. Indeed, from the Universal Property (Lemma 2.3) this is equivalent to ζ ◦ ϕ : (E,σ(E, E∗)) → ℝ being continuous for all ζ ∈ E∗∗∗. But both ζ and ϕ are continuous in the strong topologies, and thus ζ ◦ ϕ : E → ℝ is linear and strongly continuous. Hence it is an element of E∗, and so by definition of σ(E, E∗), ζ ◦ ϕ is continuous as a map on (E,σ(E, E∗)).

So we know ϕ : E → E∗∗ is continuous with respect to the weak topologies on both E and E∗∗. But since the weak-∗ topology is weaker than the weak topology, this tells us (just from the topological definition of continuity) that ϕ : (E,σ(E, E∗)) → (E∗∗,σ(E∗∗, E∗)) is continuous (i.e. we know ϕ : (E, w) → (E∗∗, w) is continuous and hence so is ϕ : (E, w) → (E∗∗, w∗)).

Hence as BE is weakly compact, ϕ(BE) is weak-∗ compact in E∗∗ (as it is the continuous image of a compact set). By the Goldstine Lemma we know that it is weak-∗ dense in BE∗∗. But since the weak-∗ topology is Hausdorff, weak-∗ compact sets are weak-∗ closed, and so ϕ(BE) is a weak-∗ closed dense subset of BE∗∗, and so must equal BE∗∗ (seen just by taking weak-∗ closures).

Hence ϕ(BE) = BE∗∗, and so by rescaling ϕ(E) = E∗∗. Thus ϕ is surjective and hence E is reflexive.

□

Remark: When E is not reflexive, ϕ(BE) is never dense in BE∗∗ for the strong topology. Indeed, note that ϕ(BE) is also closed in E∗∗ for the strong topology: if (ϕ(fn))n is convergent in (E∗∗, ‖·‖_{E∗∗}) it is Cauchy, and so since ϕ is an isometry this tells us that (fn)n is Cauchy in BE and so converges in E (as it is a Banach space). Thus if fn → f we have f ∈ BE (as BE is closed) and

ϕ(fn) → ϕ(f) ∈ ϕ(BE).

Hence ϕ(BE) is a closed subset of BE∗∗, and so if it were dense for the strong topology then we would need ϕ(BE) = BE∗∗, i.e. ϕ(E) = E∗∗ and so E would be reflexive.
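For a concrete instance of this remark, here is a sketch assuming the standard identification of the bidual of c_0 with ℓ^∞ (not proved in this section), under which ϕ is the inclusion. No element of c_0 comes within norm distance less than 1 of the constant sequence:

```latex
% Assumption: c_0^{**} = \ell^\infty with \varphi the inclusion c_0 \subset \ell^\infty.
% Let \psi = (1,1,1,\dots) \in B_{\ell^\infty} and f = (f_k)_k \in c_0, so that f_k \to 0.
\[
  \|\varphi(f) - \psi\|_{\ell^\infty}
  = \sup_k |f_k - 1|
  \ge \lim_{k\to\infty} |f_k - 1| = 1 .
\]
```

So the strong closure of ϕ(B_{c_0}) misses ψ, in contrast with the weak-∗ density given by the Goldstine Lemma.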

Let us now see some other useful ways of showing spaces are reflexive.

Proposition 2.5. A closed subspace of a reflexive Banach space is reflexive.

Proof. Let M ⊂ E be a closed subspace (with the induced norm). Consider the natural embedding ϕ : M → M∗∗. We need to show this is surjective. So let q ∈ M∗∗. Then define q̃ ∈ E∗∗ by

q̃(F) := q(F |M )

(here F ∈ E∗ and so F|M ∈ M∗). This q̃ is well-defined, linear and continuous, and so q̃ ∈ E∗∗. Since E is reflexive we therefore have q̃ = f̂ for some f ∈ E.

Then we must have f ∈ M. Indeed, if f ∉ M, then (since M is closed) by the Hahn-Banach theorem we can find F ∈ E∗ with F(f) = 1 and F|M ≡ 0 (i.e. define F like this on the subspace span{M, f} and extend by Hahn-Banach). But then we would have

q̃(F) = q(F |M ) = q(0) = 0

but at the same time

q̃(F) = f̂ (F) = F( f ) = 1

a contradiction. So we must have f ∈ M .

Now we would be finished if we can show that ϕ(f) = q (where ϕ here is M → M∗∗, not E → E∗∗). So let G ∈ M∗. Then by Hahn-Banach we can extend G to some H ∈ E∗. So,

\begin{align*}
  (\varphi(f))(G) = G(f) &= H(f) && \text{as } H|_M = G \text{ and } f \in M \\
  &= \hat{f}(H) \\
  &= \tilde{q}(H) \\
  &= q(H|_M) \\
  &= q(G),
\end{align*}

i.e. ϕ( f ) = q as G ∈ M∗ was arbitrary. Hence ϕ is surjective and we are done.

□

Proposition 2.6. Let E be a Banach space. Then

E is reflexive ⇐⇒ E∗ is reflexive.

Proof. (⇒) : If E is reflexive then ϕ(E) = E∗∗ and so σ(E∗, E) = σ(E∗, E∗∗), i.e. the weak-∗ and weak topologies on E∗ are generated by the same functionals and so are equal. But Banach-Alaoglu tells us BE∗ is weak-∗ compact, and thus is weakly compact as the topologies agree. Hence by Kakutani (Theorem 2.3) we see E∗ is reflexive.

(⇐): If E∗ is reflexive, then by the (⇒) direction just shown we have that E∗∗ is reflexive. But then E is isometrically isomorphic to its image in E∗∗ under ϕ, which in particular is a closed subspace of E∗∗ (as E is complete and ϕ is an isometry). Thus by Proposition 2.5, ϕ(E) is reflexive, which tells us that E is reflexive (as E ≅ ϕ(E) are isometrically isomorphic).

□
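Proposition 2.6 lets non-reflexivity propagate along duals. A sketch, assuming the standard identifications of the dual of c_0 with ℓ^1 and of the dual of ℓ^1 with ℓ^∞ (neither is proved in this section):

```latex
% c_0 is not reflexive: \varphi(c_0) = c_0 \subsetneq \ell^\infty = c_0^{**},
% since e.g. (1,1,1,\dots) \in \ell^\infty \setminus c_0.
\[
  c_0 \text{ not reflexive}
  \;\overset{\text{Prop. 2.6}}{\Longrightarrow}\;
  \ell^1 \cong c_0^{*} \text{ not reflexive}
  \;\overset{\text{Prop. 2.6}}{\Longrightarrow}\;
  \ell^\infty \cong (\ell^1)^{*} \text{ not reflexive}.
\]
```

Each implication is the contrapositive of one direction of the proposition, combined with the fact that reflexivity is preserved by isometric isomorphisms.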

2.4. Uniform Convexity and Reflexivity.

Uniform convexity provides us with a criterion which implies reflexivity.

Definition 2.10. A Banach space E is uniformly convex if ∀ε > 0, ∃δ > 0 such that for all f, g ∈ BE,

\[
  \|f - g\|_E > \varepsilon \;\Longrightarrow\; \Bigl\|\frac{f+g}{2}\Bigr\|_E < 1 - \delta .
\]

So uniform convexity is a geometric property that some unit balls have. Roughly, it says that if two points of the unit ball are at least ε apart then their midpoint lies a definite distance δ away from the boundary of the unit ball, and the same δ works uniformly for all such pairs of points.

FIGURE 6. An illustration of the unit balls in ℝ² with respect to different norms. The unit ball in the L2 (Euclidean) norm is just a disc, which can be seen to be uniformly convex. For the L1 norm the unit ball is a diamond-like shape, which is not uniformly convex. Similarly for the L∞ norm. To see that these last two are not uniformly convex, just consider two distinct points on one side of the shape - they are a positive distance apart, but their midpoint still lies on the side and so has norm 1.
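The failure described in the caption can be made explicit with one pair of points for each norm. The particular points below are chosen only for illustration:

```latex
% \ell^1 norm on \mathbb{R}^2: two points on the same edge of the diamond.
\[
  f = (1,0),\; g = (0,1): \qquad
  \|f\|_1 = \|g\|_1 = 1, \quad \|f-g\|_1 = 2, \quad
  \Bigl\|\tfrac{f+g}{2}\Bigr\|_1 = \tfrac12 + \tfrac12 = 1 .
\]
% \ell^\infty norm on \mathbb{R}^2: two points on the same edge of the square.
\[
  f = (1,1),\; g = (1,-1): \qquad
  \|f\|_\infty = \|g\|_\infty = 1, \quad \|f-g\|_\infty = 2, \quad
  \Bigl\|\tfrac{f+g}{2}\Bigr\|_\infty = \|(1,0)\|_\infty = 1 .
\]
```

In both cases the points are more than ε = 1 apart while the midpoint has norm exactly 1, so no δ > 0 can satisfy the definition.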

Remark: Any Hilbert norm (i.e. a complete norm induced by an inner product) is uniformly convex.
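This remark follows from the parallelogram law, which also gives an explicit δ. A short derivation, for 0 < ε < 2 (for ε ≥ 2 no pair in the unit ball has distance greater than ε, so there is nothing to check):

```latex
% Parallelogram law in an inner product space:
%   \|f+g\|^2 + \|f-g\|^2 = 2\|f\|^2 + 2\|g\|^2 .
% For f, g \in B_E with \|f-g\| > \varepsilon:
\[
  \Bigl\|\frac{f+g}{2}\Bigr\|^2
  = \frac{\|f\|^2 + \|g\|^2}{2} - \Bigl\|\frac{f-g}{2}\Bigr\|^2
  < 1 - \frac{\varepsilon^2}{4},
  \qquad\text{so}\qquad
  \delta := 1 - \sqrt{1 - \tfrac{\varepsilon^2}{4}} > 0 \text{ works.}
\]
```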

Theorem 2.4 (Milman-Pettis). All uniformly convex Banach spaces are reflexive.

Note: As usual, a reflexive space is necessarily Banach, so completeness has to be part of the assumption here.

Proof. Note that from a previous remark we know ϕ(BE) is closed in E∗∗ for the strong topology of E∗∗. So it is enough to show that ϕ(BE) is dense in BE∗∗ for the strong topology, as then

\[
  \varphi(B_E) = \overline{\varphi(B_E)} = B_{E^{**}}
\]

and so ϕ is surjective onto BE∗∗ and thus is surjective onto E∗∗, giving that E is reflexive.

So consider ψ ∈ E∗∗ with ‖ψ‖_{E∗∗} = 1. Then given ε > 0, choose δ > 0 as in the definition of uniform convexity. Then since

‖ψ‖_{E∗∗} = sup{ψ(F) : ‖F‖_{E∗} ≤ 1} = 1

we can find F ∈ BE∗ such that ψ(F) > 1 − δ/2. Then consider the set

V = {η ∈ E∗∗ : |ψ(F)−η(F)|< δ/2}

which is open in the weak-∗ topology σ(E∗∗, E∗) (as it is the preimage of an open interval under the evaluation map at F). Now the Goldstine Lemma (Lemma 2.4) tells us that ϕ(BE) is weak-∗ dense in BE∗∗, and so as V is a weak-∗ open neighbourhood of ψ ∈ BE∗∗, ϕ(BE) must intersect V. Hence we can find f ∈ BE with ϕ(f) ∈ V.

We claim that ‖ϕ(f) − ψ‖_{E∗∗} ≤ ε, which will then imply the density we want. Indeed to see this, suppose it were not true, i.e. we had

‖ϕ(f) − ψ‖_{E∗∗} > ε.

Then this is saying that ψ ∈ W, where W = E∗∗ \ (ϕ(f) + εBE∗∗) is the complement of the closed ball of radius ε centred at ϕ(f) in E∗∗. But by Banach-Alaoglu, we know BE∗∗ is weak-∗ compact, and as the weak-∗ topology is Hausdorff, BE∗∗ is weak-∗ closed. Translations and dilations are weak-∗ homeomorphisms of E∗∗, so the ball ϕ(f) + εBE∗∗ is weak-∗ closed as well. Thus W is the complement of a weak-∗ closed set and so is weak-∗ open.

So now we have ψ ∈ W ∩ V, which is a finite intersection of weak-∗ open sets and so is weak-∗ open. Hence using the density from the Goldstine Lemma again we can find g ∈ BE with ϕ(g) ∈ V ∩ W.

But then we have ϕ(f), ϕ(g) ∈ V, and so

|ψ(F) − ϕ(f)(F)| < δ/2 and |ψ(F) − ϕ(g)(F)| < δ/2

which implies

2ψ(F)< (δ/2+ϕ( f )(F)) + (δ/2+ϕ(g)(F)) = F( f ) + F(g) +δ = F( f + g) +δ

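and thus

\[
  \psi(F) < F\Bigl(\frac{f+g}{2}\Bigr) + \frac{\delta}{2}
  \le \|F\|_{E^*} \cdot \Bigl\|\frac{f+g}{2}\Bigr\|_E + \frac{\delta}{2}
  \le \Bigl\|\frac{f+g}{2}\Bigr\|_E + \frac{\delta}{2}
\]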

since ‖F‖_{E∗} ≤ 1. But then ψ(F) > 1 − δ/2, which implies

\[
  \Bigl\|\frac{f+g}{2}\Bigr\|_E > 1 - \delta ,
\]

but this then contradicts ϕ(g) ∈ W: that membership tells us ‖ϕ(g) − ϕ(f)‖_{E∗∗} > ε, i.e. ‖g − f‖_E > ε, and so the definition of uniform convexity would force ‖(f + g)/2‖_E < 1 − δ.

Thus we have ‖ϕ(f) − ψ‖_{E∗∗} ≤ ε, and as ε > 0 was arbitrary we can find a sequence in ϕ(BE) which converges to ψ in the strong topology. So this deals with the case when ‖ψ‖_{E∗∗} = 1. For the case 0 < ‖ψ‖_{E∗∗} < 1, simply consider η = ψ/‖ψ‖_{E∗∗} which has norm 1, so by the above we can find a sequence (ϕ(fn))n, fn ∈ BE, with ϕ(fn) → η, and thus

ϕ(‖ψ‖_{E∗∗} fn) = ‖ψ‖_{E∗∗} ϕ(fn) → ψ

in the strong topology, and ‖ψ‖_{E∗∗} fn ∈ BE as well. Finally, ψ = 0 is just ϕ(0), so we are done.

□

Note: The main application of uniform convexity which we will see is to the Lp spaces for p ∈ (1,∞), which are all uniformly convex.
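For 2 ≤ p < ∞ this uniform convexity can be read off from Clarkson's first inequality, which is quoted here as an assumption rather than proved. The range 1 < p < 2 needs a different inequality and is not covered by this sketch.

```latex
% Assumed (Clarkson's first inequality), for 2 \le p < \infty and f, g \in L^p:
\[
  \Bigl\|\frac{f+g}{2}\Bigr\|_{L^p}^p + \Bigl\|\frac{f-g}{2}\Bigr\|_{L^p}^p
  \le \frac12\bigl(\|f\|_{L^p}^p + \|g\|_{L^p}^p\bigr).
\]
% Hence for f, g in the closed unit ball with \|f-g\|_{L^p} > \varepsilon (0 < \varepsilon < 2):
\[
  \Bigl\|\frac{f+g}{2}\Bigr\|_{L^p}^p < 1 - \Bigl(\frac{\varepsilon}{2}\Bigr)^p,
  \qquad
  \delta := 1 - \Bigl(1 - \bigl(\tfrac{\varepsilon}{2}\bigr)^p\Bigr)^{1/p} > 0 .
\]
```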

One natural question is how “different” convergence is between the strong and weak topologies. For uniformly convex spaces we get a nice characterisation:

Proposition 2.7. Let E be a uniformly convex Banach space, and let (fn)n ⊂ E. Then:

fn → f w.r.t. the strong topology ⇐⇒ fn ⇀ f w.r.t. the weak topology and ‖fn‖_E → ‖f‖_E

i.e. strong convergence is the same as weak convergence together with convergence of the norms (in uniformly convex Banach spaces).
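The condition on the norms cannot be dropped. A standard example, sketched here in ℓ^2 (a Hilbert space, hence uniformly convex) and assuming the Riesz representation of functionals on ℓ^2 by sequences a = (a_k) in ℓ^2:

```latex
% e_n = (0,\dots,0,1,0,\dots) with the 1 in position n; F_a(x) = \sum_k a_k x_k for a \in \ell^2.
\[
  F_a(e_n) = a_n \longrightarrow 0 \quad \text{for every } a \in \ell^2
  \qquad\Longrightarrow\qquad e_n \rightharpoonup 0 \text{ weakly},
\]
\[
  \text{but}\quad \|e_n\|_{\ell^2} = 1 \not\to 0 = \|0\|_{\ell^2},
  \qquad \|e_n - 0\|_{\ell^2} = 1 \not\to 0 .
\]
```

So weak convergence alone does not give strong convergence, even in a uniformly convex space.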

Proof. (⇒): This is immediate from strong convergence implying weak convergence, and strong convergence implying convergence of the norms (e.g. |‖fn‖_E − ‖f‖_E| ≤ ‖fn − f‖_E → 0).

(⇐): If f = 0, then by assumption we have ‖fn‖ → 0 and hence we have strong convergence of fn to 0 and we are done.

If f ≠ 0, then ‖f‖ > 0 and so from ‖fn‖ → ‖f‖ we can take N sufficiently large such that for all n ≥ N we have ‖fn‖ > 0. By restricting ourselves to n ≥ N we can wlog assume ‖fn‖ > 0 for all n.

Then define gn := fn/‖fn‖ and g := f/‖f‖, i.e. move everything to the unit ball so that we can use uniform convexity. Then clearly gn ⇀ g weakly, since for every F ∈ E∗,

F(gn) = F(fn)/‖fn‖ → F(f)/‖f‖ = F(g)

where we have used that fn ⇀ f weakly so that F(fn) → F(f) and that ‖fn‖ → ‖f‖. In particular we have (gn + g)/2 ⇀ g weakly, which gives

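\[
  1 = \|g\|_E \le \liminf_n \Bigl\|\frac{g_n + g}{2}\Bigr\|_E \qquad \text{(v)}.
\]

But then we have for every n, by the triangle inequality, ‖(gn + g)/2‖_E ≤ ½(‖gn‖ + ‖g‖) = 1, and hence lim sup_n ‖(g + gn)/2‖_E ≤ 1. In particular this shows that we have

\[
  \Bigl\|\frac{g + g_n}{2}\Bigr\|_E \to 1 .
\]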

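Hence by the definition of uniform convexity we must have ‖g − gn‖_E → 0: otherwise some ε > 0 would have ‖g − gn‖_E > ε along a subsequence, forcing ‖(g + gn)/2‖_E < 1 − δ along it. This then gives

\[
  \|f_n - f\| = \bigl\|\,\|f_n\|\,g_n - \|f\|\,g\,\bigr\|
  \le \|g_n\| \cdot \bigl|\,\|f_n\| - \|f\|\,\bigr| + \|f\| \cdot \|g_n - g\|
  = \bigl|\,\|f_n\| - \|f\|\,\bigr| + \|f\| \cdot \|g_n - g\|
  \to 0 .
\]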
