REAL ANALYSIS I

Fall 2002

Note. The course as such begins with Section 3; more precisely, with 3.1, Measurable spaces. The first two sections are preliminary chatter.

1 Equivalence of sets and cardinality; some basics

Definition 1 Let A, B be sets. We say that A is equivalent to B, or that A is equipotent with B, and we write A ∼ B, if there exists a one-to-one, onto function f : A → B.

It is quite easy to prove that

Lemma 1.1 The relation just defined is an equivalence relation; that is,

1. For every set A, we have A ∼ A.

2. If A,B are sets and A ∼ B, then B ∼ A.

3. If A,B, C are sets, if A ∼ B and B ∼ C, then A ∼ C.

There are a number of simple things one can do here. Their proofs depend on the Peano axioms (especially mathematical induction) and some axioms of set theory. In all events, they are obvious enough that one may safely assume them proved. Or so I hope.

Let N denote the set of positive integers (natural numbers). We will write $J_0$ to denote the empty set and, if n ∈ N, we’ll write

$$J_n = \{1, 2, \ldots, n\} = J_{n-1} \cup \{n\}.$$

Lemma 1.2 Let n, m be non-negative integers. Then $J_n \sim J_m$ if and only if n = m.

Definition 2 A set A is finite if there exists n ∈ N ∪ {0} such that $A \sim J_n$. We then say that n is the cardinality of A. If a set A is not finite, we say it is infinite.

We are still doing the simple things, but the following definition is very important.

Definition 3 A set A is said to be countable if it is either finite or A ∼ N.

Some authors define countable as being equipotent to N and exclude finite sets. I feel it is preferable to keep finite sets as part of the class of countable sets. De gustibus non est disputandum. In all events, N is the smallest infinite set. It is easy to see that N is indeed an infinite set.

Lemma 1.3 Let A be an infinite set. Then A contains a subset B such that B ∼ N.

Intuitively, this lemma says that there is no set which is infinite and smaller than N, but there is a snag: Could N contain things bigger than itself? Why does B ⊂ A imply B is smaller than A? Answering these questions takes us out of the simple things, and we are still simple.

Some examples. You have all probably seen proofs that the following sets are countable:

i. N (of course!).

ii. The set Z of all integers.

iii. The set Q of all rational numbers.

iv. The set A of all algebraic numbers; a complex number z being algebraic if there exist n ∈ N and integers $a_0, a_1, \ldots, a_n$ with $a_n \neq 0$ such that $a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0 = 0$. Some examples of algebraic numbers and the equations they satisfy are the following:

1. Every rational number is algebraic; if r = a/b where a, b are integers, b ≠ 0, then it satisfies bz + (−a) = 0.

2. The number i satisfies $z^2 + 1 = 0$.

3. Square and cubic roots of integers; for example, $\sqrt{3}$ satisfies $z^2 + (-3) = 0$, while $\sqrt[3]{7}$ satisfies $z^3 + (-7) = 0$.

With some difficulty one shows that A is a field. Numbers which are not algebraic are called transcendental; the best known examples of transcendental numbers are π and e.

An example of a non-countable set is the set R of real numbers. The following lemma concludes our real simple things.

Lemma 1.4 Let I be a countable set and for every ι ∈ I let $A_\iota$ be countable. Then

$$\bigcup_{\iota \in I} A_\iota$$

is countable. (A countable union of countable sets is countable.)

Well, the last lemma is perhaps not that simple, but its proof uses the same idea used to prove that Q is countable. Frequently it is used with the sets $A_\iota$ being finite; here we see one reason why one wants to include finite sets among the countable ones; otherwise results like this one become more awkward to state.
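The idea shared by the countability of Q and Lemma 1.4 is to walk through a doubly indexed list along its anti-diagonals. The following is a minimal sketch of that idea, not a proof; the accessor `family(i, j)` and the example family are hypothetical names introduced here, and every set is assumed infinite to keep the code short.

```python
from fractions import Fraction
from itertools import count, islice

def enumerate_union(family):
    """Yield each element of A_0 ∪ A_1 ∪ ... exactly once.

    family(i, j) is a hypothetical accessor returning the j-th element of A_i
    (0-based). The pairs (i, j) are visited along the anti-diagonals i + j = d,
    so every pair is reached after finitely many steps.
    """
    seen = set()
    for d in count():
        for i in range(d + 1):
            x = family(i, d - i)
            if x not in seen:      # skip elements already listed from another A_i
                seen.add(x)
                yield x

def rationals_by_denominator(i, j):
    # Example family (an assumption): A_i = {1/(i+1), 2/(i+1), 3/(i+1), ...}
    return Fraction(j + 1, i + 1)

first = list(islice(enumerate_union(rationals_by_denominator), 10))
print(first)
```

The union of this example family is the set of positive rationals, so the generator lists each positive rational once.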

Let’s get to a more serious result.

Definition 4 Let A, B be sets. We’ll write A ⪯ B, and say (somewhat informally) that A is smaller than B, or has a smaller cardinality than B, if there exists a one-to-one function f : A → B.

It is still quite easy to prove:

Lemma 1.5 The relation ⪯ satisfies:

1. For every set A, we have A ⪯ A.

2. If A, B, C are sets, if A ⪯ B and B ⪯ C, then A ⪯ C.

If A is a set and B ⊂ A, then we clearly have B ⪯ A; the identity map of A when restricted to B is one-to-one from B to A. It is also easy to see that if we have finite sets A, B of cardinalities n, m respectively, then A ⪯ B if and only if n ≤ m. And, obviously, if A ∼ B then A ⪯ B and B ⪯ A. What is not so easy to see is the converse of this last result. It is difficult enough to have a name.

Theorem 1.6 (Cantor-Schroeder-Bernstein). Let A, B be sets. If A ⪯ B and B ⪯ A, then A ∼ B.

To appreciate a bit the difficulty of the theorem, try to think how you would prove it. You are given that there is a one-to-one function f : A → B and a one-to-one function g : B → A. You have to construct a one-to-one and ONTO function from A to B. How do you do this? You could try to prove that f is actually onto, but there is no way of doing this because it is usually false. Consider the following example. We define the following function f : N → Q; f(n) = n for all n ∈ N. This function is clearly one-to-one, and clearly NOT onto. Now we’ll define g : Q → N as follows: Let x ∈ Q. Then there is a unique way of representing it in the form x = m/n where m, n are integers, n > 0 and the greatest common divisor of m and n is 1. We define then

$$g(x) = 2^n 3^m \quad \text{if } m \ge 0,$$

$$g(x) = 2^n 5^{-m} \quad \text{if } m < 0.$$

For example, $g(1/2) = 2^2 \cdot 3^1 = 12$, $g(6) = 2^1 \cdot 3^6 = 1458$, $g(0) = g(0/1) = 2^1 \cdot 3^0 = 2$, etc. The uniqueness of the prime factorization of an integer n > 1 shows that the function g is one-to-one. Clearly, it isn’t onto. Now, due to the existence of f we have N ⪯ Q, due to g we have Q ⪯ N. Cantor-Schroeder-Bernstein says that N ∼ Q and it seems that somehow we should be able to construct a one-to-one, onto function h : N → Q from f and g. Do you see how?
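The map g can be tried out directly. The sketch below assumes that the exponent of 5 is |m| when m < 0, so that g(x) is a natural number; Python's `Fraction` already stores x in lowest terms with a positive denominator, which is exactly the representation m/n used in the text.

```python
from fractions import Fraction

def g(x: Fraction) -> int:
    """g(m/n) = 2^n 3^m if m >= 0, and 2^n 5^|m| if m < 0 (m/n in lowest terms, n > 0)."""
    m, n = x.numerator, x.denominator
    return 2**n * 3**m if m >= 0 else 2**n * 5**(-m)

print(g(Fraction(1, 2)), g(Fraction(6)), g(Fraction(0)))

# Spot check of injectivity on a small sample of rationals.
sample = {Fraction(p, q) for p in range(-5, 6) for q in range(1, 6)}
images = {g(x) for x in sample}
print(len(sample) == len(images))
```

The sample check is only an illustration; the actual proof of injectivity is the uniqueness of prime factorization.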

Now we can rephrase Lemma 1.3 in the form:

Lemma 1.7 N ⪯ A for every infinite set A.

Definition 5 Let A be a set. The power set of A is the set denoted by P(A) consisting of all subsets of A. Thus C ∈ P(A) if and only if C ⊂ A.

An important, though not too difficult result is

Lemma 1.8 (The Russell paradox, positive version) Let A be a set. Then A ≺ P(A), that is, A ⪯ P(A) but A ≁ P(A).

One consequence of this lemma is that there is no end to the power of sets; one can always get a set of higher cardinality by taking the power set of what one has.
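For a small finite set, the statement A ≁ P(A) can be checked exhaustively. The sketch below uses the standard diagonal set D = {a ∈ A : a ∉ F(a)}; this is one common way to prove the lemma and is an assumption here, since the notes do not spell out a proof. The helper names are hypothetical.

```python
from itertools import combinations, product

def power_set(A):
    items = sorted(A)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def diagonal_set(A, F):
    """D = {a in A : a not in F(a)}, where F is a dict from A to subsets of A."""
    return frozenset(a for a in A if a not in F[a])

A = {0, 1, 2}
subsets = power_set(A)

# Run over all 8^3 maps F : A -> P(A); the diagonal set is never one of the values of F.
missed_every_time = all(
    diagonal_set(A, dict(zip(sorted(A), images))) not in images
    for images in product(subsets, repeat=len(A))
)
print(len(subsets), missed_every_time)
```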

Now let A = N and let us compare P(N) with [0, 1) = {x ∈ R : 0 ≤ x < 1}. For each x ∈ [0, 1) there is only one decimal expansion $x = 0.a_0a_1\ldots$ if we agree not to allow expansions ending in “99999 . . .”. That is, we do not allow x = 0.499999 . . .; we write x = 0.5 instead. Now let $p_n$ for n = 1, 2, . . . be the n-th prime, so $p_1 = 2$, $p_2 = 3$, $p_3 = 5$, etc. Define, if $x = 0.a_0a_1\ldots$, with $a_i \in \{0, 1, 2, \ldots, 9\}$,

$$f(x) = \{p_1^{a_0},\; p_1^{a_0} p_2^{a_1},\; p_1^{a_0} p_2^{a_1} p_3^{a_2},\; \ldots\}.$$

For example,

$$f(0) = \{1, 1, 1, \ldots\}, \qquad f(0.5) = \{32, 32, 32, \ldots\}, \qquad f(1/3) = \{8, 216, 27000, \ldots\},$$

$$f(1/\sqrt{2}) = f(0.70710678\ldots) = \{2^7,\; 2^7 3^0,\; 2^7 3^0 5^7,\; \ldots\}.$$
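The examples above can be reproduced from a finite list of leading digits. This is a small sketch with hypothetical helper names; it computes only the first few elements of f(x), one for each digit supplied.

```python
def first_primes(count):
    """Return the first `count` primes by trial division (fine for small counts)."""
    primes, candidate = [], 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

def partial_products(digits):
    """digits = [a0, a1, ...]; returns [p1^a0, p1^a0 p2^a1, ...]."""
    out, prod = [], 1
    for p, a in zip(first_primes(len(digits)), digits):
        prod *= p**a
        out.append(prod)
    return out

print(partial_products([3, 3, 3]))   # leading digits of 1/3
print(partial_products([5, 0, 0]))   # leading digits of 0.5
print(partial_products([7, 0, 7]))   # leading digits of 1/sqrt(2)
```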

It is clear (I think) that this function is one-to-one from [0, 1) to P(N). We define now g : P(N) → [0, 1) as follows. Let C ⊂ N. We can order this set and write it in the form

$$C = \{c_1, c_2, \ldots, c_n\}$$

if it is a finite set of n elements, or

$$C = \{c_1, c_2, \ldots\}$$

if it is infinite, where $1 \le c_1 < c_2 < \cdots$. Now let $k_n$ be the number of digits of the number $c_n$. In the finite case, we set $k_j = 0$ for every index j larger than the number of elements of C. We assign to this set a real number g(C) as follows. Schematically we write

$$g(C) = 0.k_1 0 c_1 k_2 0 c_2 k_3 0 c_3 \ldots$$

recalling that $k_i$, $c_i$ are numbers that could have several digits. For example, the set of powers of 5, C = {1, 5, 25, 125, 625, 3125, . . .}, gets assigned to

g(C) = 0.10110520253012530625403125 . . .

The set C = {1234, 23456, 4455660099887721} gets assigned to

g(C) = 0.401234502345616044556600998877210000 . . .

We assign g(∅) = 0.

One can reconstruct C from the rule. If the first digit of g(C) is 0, then all digits should be 0 and C = ∅. If not, find the first 0. The digits between the first 0 and the decimal point form a number telling you how many digits the first element of C has. The 0 is then followed by the first entry of C. Once you have determined the first entry, the digits following it up to the next 0 tell you how many digits the second entry of C has. And so forth. The function g is one-to-one.
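The encoding and the reconstruction rule can be sketched for finite sets as follows. The function names are hypothetical, and the sketch assumes that no digit count $k_i$ itself contains the digit 0 (for instance, no element with exactly 10 digits), so that the "find the next 0" step is unambiguous.

```python
def encode(C):
    """Digits of g(C) after the decimal point, for a finite set C of positive integers."""
    out = []
    for c in sorted(C):
        k = str(len(str(c)))          # number of digits of c, written in decimal
        assert "0" not in k, "sketch assumes the digit count contains no 0"
        out.append(k + "0" + str(c))
    return "".join(out)

def decode(digits):
    """Recover the elements of C; a 0 where a digit count should start ends the set."""
    C, pos = [], 0
    while pos < len(digits) and digits[pos] != "0":
        zero = digits.index("0", pos)             # the 0 that closes the digit count
        k = int(digits[pos:zero])
        C.append(int(digits[zero + 1: zero + 1 + k]))
        pos = zero + 1 + k
    return C

code = encode({1234, 23456, 4455660099887721})
print(code)
print(decode(code + "0000"))
print(encode({1, 5, 25, 125, 625, 3125}))
```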

The point of this is that we have P(N) ⪯ [0, 1) and [0, 1) ⪯ P(N), so that P(N) ∼ [0, 1). On the other hand, [0, 1) ∼ R. We can see, for example (all this can be done in many different ways) that setting

$$f(x) = \frac{x}{1 + x}$$

for x ∈ R, x ≥ 0 defines a one-to-one, onto function from [0,∞) to [0, 1), so [0, 1) ∼ [0,∞). Of course [0,∞) ⪯ R; on the other hand the map $x \mapsto e^x$ is one-to-one and has range in [0,∞) so that R ⪯ [0,∞). Thus [0,∞) ∼ R. By transitivity, we see that P(N) ∼ R.

Is there anything in between? That is, we have N, the smallest infinite set. Then comes R ∼ P(N), a larger set. Is there a set C such that N ⪯ C ⪯ R, but N ≁ C and C ≁ R? Equivalently: Is every uncountable subset of R equipotent with R? Cantor, who started this, looked for sets as strange as possible in R. But for every set he managed to prove that it was either countable or equipotent with R. Finally, he conjectured that the answer to the first question above is NO; this conjecture is known as the continuum hypothesis. Kurt Gödel, in 1940, proved that the continuum hypothesis is consistent with the axioms of set theory; you can’t disprove it. In 1963, Paul Cohen proved that it is independent of these axioms; you can’t prove it. The same is true of the stronger generalized continuum hypothesis: If A is an infinite set and if C is a set such that A ⪯ C ⪯ P(A), then either C ∼ A or C ∼ P(A).

Operations with cardinals The following results are not terribly hard to prove, and useful when working with infinite sets.

Lemma 1.9 Let A, B be sets, assume B ⪯ A and A is infinite. Then A ∪ B ∼ A, and if B ≠ ∅, A × B ∼ A.

Finally, let A, B be sets. Is it always true that one either has A ⪯ B or B ⪯ A? The answer is a guarded yes. A more precise statement is that a yes answer is equivalent to the axiom of choice. Since we usually accept the axiom of choice, the answer is yes. This allows one to define the cardinal number of a set as an equivalence class of sets. Every set ends up having a cardinal number and two sets are equipotent if and only if they have the same cardinal number.

2 Digression on sums

Definition 6 Let I be a set (finite, infinite, countable or uncountable). For every ι ∈ I let $a_\iota \in R$, $a_\iota \ge 0$. Then

$$\sum_{\iota \in I} a_\iota = \sup\Big\{ \sum_{\iota \in F} a_\iota : F \subset I,\ F \text{ finite} \Big\}.$$

If F is a finite set, then one has, for some n, $F = \{\iota_1, \ldots, \iota_n\}$ and

$$\sum_{\iota \in F} a_\iota = a_{\iota_1} + \cdots + a_{\iota_n} = \sum_{j=1}^{n} a_{\iota_j}$$

is the usual sum.

The following properties are sort of immediate. Or very easy.

Lemma 2.1 Let I be a set and for each ι ∈ I let $a_\iota \in R$, $a_\iota \ge 0$.

1. If K ⊂ J ⊂ I, then

$$\sum_{\iota \in K} a_\iota \le \sum_{\iota \in J} a_\iota.$$

2. If J, K ⊂ I are such that J ∩ K = ∅ and I = J ∪ K, then

$$\sum_{\iota \in I} a_\iota = \sum_{\iota \in J} a_\iota + \sum_{\iota \in K} a_\iota.$$

3. If I = {1, . . . , n} (I is a finite set, and then we can assume it is a section of the integers), then

$$\sum_{\iota \in I} a_\iota = a_1 + \cdots + a_n.$$
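For a finite index set, Definition 6 and the properties of Lemma 2.1 can be checked by brute force. The sketch below is only an illustration on a small example; the function name and the sample values are assumptions made here.

```python
from fractions import Fraction
from itertools import combinations

def sup_of_finite_sums(a):
    """Definition 6 on a FINITE index set: the largest sum over all finite subsets F.

    a is a dict mapping each index to a non-negative number.
    """
    idx = list(a)
    return max(
        sum(a[i] for i in F)
        for r in range(len(idx) + 1)
        for F in combinations(idx, r)
    )

a = {"x": Fraction(1, 2), "y": Fraction(1, 3), "z": Fraction(0)}
a_J = {"x": a["x"]}                       # J = {x}
a_K = {"y": a["y"], "z": a["z"]}          # K = {y, z}, disjoint from J, J ∪ K = I

total = sup_of_finite_sums(a)
print(total, sup_of_finite_sums(a_J) + sup_of_finite_sums(a_K))
```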

The following result is more interesting. We provide a proof.

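Lemma 2.2 Let I be a set and for each ι ∈ I let $a_\iota \in R$, $a_\iota \ge 0$. Assume that I is an infinite set and

$$\sum_{\iota \in I} a_\iota < \infty.$$

Then:

1. The set $J = \{\iota \in I : a_\iota > 0\}$ is countable and

$$\sum_{\iota \in I} a_\iota = \sum_{\iota \in J} a_\iota.$$

2. If J is infinite let $\{j_1, j_2, \ldots\}$ denote any ordering of the elements of J into a sequence. Then

$$\sum_{\iota \in J} a_\iota = \lim_{n \to \infty} \sum_{k=1}^{n} a_{j_k}.$$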

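Proof. Let $\alpha = \sum_{\iota \in I} a_\iota$. For n ∈ N, denote by $J_n$ the set of indices ι ∈ I such that $a_\iota > 1/n$. Then $J = \bigcup_{n \in N} J_n$. Let F be a finite subset of $J_n$; let’s say that F has k elements. Then

$$\alpha \ge \sum_{\iota \in F} a_\iota \ge k \cdot \frac{1}{n},$$

so that k ≤ nα. The fact that every finite subset of $J_n$ has at most nα elements means that $J_n$ itself can’t have more than nα elements. Thus $J_n$ is finite (hence countable), and so $J = \bigcup_{n \in N} J_n$ is countable. This proves 1. For 2, assume $J = \{j_k : k \in N\}$. For n ∈ N set

$$\alpha_n = a_{j_1} + \cdots + a_{j_n} = \sum_{k=1}^{n} a_{j_k}.$$

We have to prove that

$$\sum_{\iota \in J} a_\iota = \lim_{n \to \infty} \alpha_n = \sum_{\iota \in I} a_\iota.$$

However, the fact that

$$\sum_{\iota \in J} a_\iota = \sum_{\iota \in I} a_\iota$$

is immediate, so we just do

$$\sum_{\iota \in J} a_\iota = \lim_{n \to \infty} \alpha_n.$$

In the first place, the sequence $\{\alpha_n\}$ increases, thus its limit exists. Let us call the limit β. Let F be a finite subset of J. Then there exists n such that $F \subset \{j_1, \ldots, j_n\}$ and it follows that

$$\sum_{\iota \in F} a_\iota \le \sum_{k=1}^{n} a_{j_k} = \alpha_n \le \beta.$$

Thus $\sum_{\iota \in F} a_\iota \le \beta$ for every finite subset F of J, hence also

$$\sum_{\iota \in J} a_\iota = \sup_{F \subset J,\ F \text{ finite}} \sum_{\iota \in F} a_\iota \le \beta.$$

On the other hand, $F_n = \{j_1, \ldots, j_n\} \subset J$ so that

$$\alpha_n = \sum_{\iota \in F_n} a_\iota \le \sum_{\iota \in J} a_\iota$$

for all n, thus also $\beta \le \sum_{\iota \in J} a_\iota$. This proves that $\sum_{\iota \in J} a_\iota = \beta$.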

We notice that in proving that $\sum_{\iota \in J} a_\iota = \beta$ we did not really need to assume that β < ∞; the same proof shows: Let $a_1, a_2, \ldots$ be a sequence of non-negative numbers. Then

$$\sum_{n \in N} a_n = \lim_{n \to \infty} \sum_{k=1}^{n} a_k.$$

Notice that the left hand side of this equation is independent of the order in which the numbers are listed; that is, if φ : N → N is one-to-one and onto, then the map F ↦ φ(F) is one-to-one and onto from finite subsets of N to finite subsets of N. Thus

$$\sum_{n \in N} a_{\varphi(n)} = \sum_{n \in N} a_n.$$

It follows that the right hand sides are also the same, proving the rearrangement invariance of the sum of series of NON-NEGATIVE terms.
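A numerical illustration of the rearrangement invariance, under assumptions chosen here for the example: the terms are $a_n = 2^{-n}$ and the bijection φ swaps 1 with 2, 3 with 4, and so on. Exact rational arithmetic is used so that the partial sums can be compared without rounding.

```python
from fractions import Fraction

def a(n):
    """a_n = 2^(-n) for n = 1, 2, ... (an example sequence of non-negative terms)."""
    return Fraction(1, 2**n)

def phi(n):
    """A bijection of N onto N that swaps 1<->2, 3<->4, ..."""
    return n + 1 if n % 2 == 1 else n - 1

def partial_sum(term, N):
    return sum(term(k) for k in range(1, N + 1))

original = [partial_sum(a, N) for N in range(1, 21)]
rearranged = [partial_sum(lambda k: a(phi(k)), N) for N in range(1, 21)]

# The partial sums differ at odd N but agree at every even N, so the limits agree.
print(original[0], rearranged[0])
print(original[19] == rearranged[19])
```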

3 Measurable sets and measures

The basic ingredients of measure theory are a set X and some notion of a measure for the subsets of X. What we want (and one should remember that what we want is not necessarily what we get) is to define the measure of subsets of X. Our abstract theory should somehow agree with our intuitive ideas about measure. So let us suppose we have a set X. We could say that a measure in X is a function µ (it is traditional to use symbols like µ, ν, for measures) assigning to every subset of X a positive number. Well, not quite; the empty set should have measure 0. We’ll also want the measure of a union of two sets with no common elements to be the sum of the individual measures:

$$\mu(A \cup B) = \mu(A) + \mu(B) \quad \text{if } A, B \subset X \text{ and } A \cap B = \emptyset.$$

And, of course, if A ⊂ B, we’ll want µ(A) ≤ µ(B).

This could be a problem with large sets. Suppose, for example, that A is an uncountable subset of X; for x ∈ X, let us write $a_x = \mu(\{x\})$, the measure of the set consisting of the single element x. If $a_x > 0$ for every x, then one can show that for every real number r there exist points $x_1, \ldots, x_N \in A$ with

$$a_{x_1} + \cdots + a_{x_N} \ge r;$$

i.e., if $B_N = \{x_1, \ldots, x_N\}$, then

$$\mu(B_N) = a_{x_1} + \cdots + a_{x_N} \ge r.$$

Because we must have $\mu(A) \ge \mu(B_N)$, the only way out is µ(A) = ∞. Well, this isn’t quite what we want; many subsets of the line and plane are uncountable and yet their usual measure is quite finite. The real way out of this is to relax the definition a bit, to allow µ(A) = 0 even if A ≠ ∅. Is this enough? Is it good enough to define: A measure on X is a map assigning to every subset A of X a non-negative real number µ(A) such that µ(A ∪ B) = µ(A) + µ(B) whenever A ∩ B = ∅? Notice that µ(A) ≤ µ(B) if A ⊂ B is a consequence:

$$A \subset B \;\Rightarrow\; B = A \cup (B \setminus A) \;\Rightarrow\; \mu(B) = \mu(A) + \mu(B \setminus A) \ge \mu(A).$$

If we think of how we measure, the answer is no. Let us think of the case of the plane, and suppose we want to find the area of a circle of radius 1. We can fit squares into it, smaller and smaller squares, and add those areas. The area of any figure with curved boundaries is calculated by some sort of limiting process, and we need to allow for this situation. We could try to require: Given any family $\{A_\alpha\}$ of subsets which are pairwise disjoint, that is, $A_\alpha \cap A_\beta = \emptyset$ if α ≠ β, then

$$\mu\Big(\bigcup_\alpha A_\alpha\Big) = \sum_\alpha \mu(A_\alpha).$$

But this won’t work. In the usual notion of measure in the line (length) or in the plane (area), points have measure 0. Every set is a union of points, and two sets consisting of a single point each are disjoint. So ALL sets in such circumstances would have measure 0. A bit of reflection shows that one wants to restrict to countable unions, and only require: If $A_1, A_2, \ldots$ is a sequence of subsets of X such that $A_n \cap A_m = \emptyset$ if n ≠ m, then

$$\mu\Big(\bigcup_{n=1}^{\infty} A_n\Big) = \sum_{n=1}^{\infty} \mu(A_n).$$

At this point we are almost OK, but still not quite OK. The difficulty one still can run into is more subtle. It has to do with the fact that if a set X is uncountable, it will contain very strange subsets. Trying to assign a measure to ALL of them can be impossible. We’ll get back to this point later on; for now, accept that one needs to consider the possibility that not every subset can (or should) be assigned a measure.

As it turned out, a satisfactory way of doing measure theory in a set is to decide on a family of subsets which can be measured. For things to work smoothly, this family needs to have certain properties. For example, it should be closed under countable unions. We start looking at such families of sets in a more precise way next.
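The limiting process mentioned earlier in this section, fitting smaller and smaller squares into a disc of radius 1 and adding their areas, can be sketched numerically. The grid construction below is one possible choice made for this illustration, not a construction taken from the notes.

```python
import math

def inner_square_area(n):
    """Total area of the grid squares of side 1/n that lie inside the closed unit disc."""
    count = 0
    for i in range(-n, n):
        for j in range(-n, n):
            # The square [i/n, (i+1)/n] x [j/n, (j+1)/n] lies inside the disc exactly
            # when its corner farthest from the origin does (integer arithmetic, scaled by n).
            fx = max(abs(i), abs(i + 1))
            fy = max(abs(j), abs(j + 1))
            if fx * fx + fy * fy <= n * n:
                count += 1
    return count / (n * n)

for n in (10, 50, 100, 200):
    print(n, inner_square_area(n), math.pi - inner_square_area(n))
```

The sums stay below π and the gap shrinks as the squares get smaller, which is the kind of countable, limiting behaviour a definition of measure has to accommodate.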

3.1 Measurable spaces

Definition 7 Let X be a set. An algebra in X is a family A of subsets of X such that

i. ∅, X ∈ A.

ii. If A ∈ A, then $A^c \in A$.

iii. If A, B ∈ A, then A ∪ B ∈ A.

Definition 8 Let X be a set. A σ-algebra in X is a family S of subsets of X such that

i. ∅, X ∈ S.

ii. If A ∈ S, then $A^c \in S$.

iii. If $\{A_n\}_{n=1}^{\infty}$ is a sequence of subsets of X such that $A_n \in S$ for each n ∈ N, then $\bigcup_{n=1}^{\infty} A_n \in S$.

Remarks

1. Every σ-algebra is an algebra. In fact, if A, B ∈ S, and S is a σ-algebra, then setting $A_1 = A$, $A_2 = B$ and $A_n = \emptyset$ for n ≥ 3, we see that $\{A_n\}_{n=1}^{\infty}$ is a sequence of sets in S, hence $A \cup B = \bigcup_{n=1}^{\infty} A_n \in S$.

2. Algebras are closed under set operations; specifically, if A is an algebra and A, B ∈ A, then

$$A \cap B = (A^c \cup B^c)^c \in A,$$

$$A \setminus B = A \cap B^c \in A,$$

$$A \Delta B = (A \setminus B) \cup (B \setminus A) \in A.$$

3. In addition, σ-algebras are closed under countable intersections: If $\{A_n\}$ is a sequence of sets in the σ-algebra S, then

$$\bigcap_{n=1}^{\infty} A_n = \Big( \bigcup_{n=1}^{\infty} A_n^c \Big)^c \in S.$$
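On a finite set X the conditions of Definition 7 can be checked mechanically, and there every algebra is automatically a σ-algebra because only finitely many distinct unions exist. The checker below is a minimal sketch with a hypothetical function name; it also confirms Remark 2 on one example.

```python
from itertools import combinations

def is_algebra(X, family):
    """Check conditions i-iii of Definition 7 for a family of subsets of a finite set X."""
    X = frozenset(X)
    fam = {frozenset(A) for A in family}
    return (
        frozenset() in fam and X in fam                           # i.  empty set and X
        and all(X - A in fam for A in fam)                        # ii. complements
        and all(A | B in fam for A, B in combinations(fam, 2))    # iii. unions
    )

X = {1, 2, 3, 4}
A = {1, 2}
family = [set(), X, A, X - A]

print(is_algebra(X, family))             # the family {∅, X, A, A^c}
print(is_algebra(X, [set(), X, {1}]))    # missing the complement of {1}

# Remark 2 on this example: intersections and differences stay in the family.
fam = {frozenset(S) for S in family}
print(all(S & T in fam and S - T in fam for S in fam for T in fam))
```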

Examples.

1. Let X be any set. Then S = {∅, X} is a σ-algebra in X.

2. Let X be any set. Then P(X), the power set of X, is a σ-algebra.

3. Let X be a set, assume A ⊂ X, ∅ ≠ A ≠ X. Then $S = \{\emptyset, X, A, A^c\}$ is a σ-algebra.

4. Let X be a set, assume X is uncountable (to make it interesting). Let

$$S = \{A : A \subset X, \text{ either } A \text{ or } A^c \text{ is countable}\}.$$

Then S is a σ-algebra in X.

5. Let X = R. Let A consist of all subsets of R which can be written as a finite union of intervals. An interval is, of course, any subset of R of the form

(a, b) = {x ∈ R : a < x < b}, with −∞ ≤ a ≤ b ≤ ∞;

[a, b) = {x ∈ R : a ≤ x < b}, with −∞ < a ≤ b ≤ ∞;

(a, b] = {x ∈ R : a < x ≤ b}, with −∞ ≤ a ≤ b < ∞;

[a, b] = {x ∈ R : a ≤ x ≤ b}, with −∞ < a ≤ b < ∞.

R = (−∞,∞) is an interval and so is ∅ = (0, 0) = (0, 0]. A more concise definition is that an interval is a connected subset of R.

Then A is an algebra which is NOT a σ-algebra.

Exercise 1 Prove all the assertions in the previous examples.

Lemma 3.1 Let X be a set. For each λ in some index set Λ, let $S_\lambda$ be a σ-algebra in X. Then

$$\bigcap_{\lambda \in \Lambda} S_\lambda$$

is a σ-algebra in X. (The intersection of an arbitrary number of σ-algebras is a σ-algebra.)

Exercise 2 Prove Lemma 3.1.

Definition 9 Let X be a set, let C be a family of subsets of X. The σ-algebra generated by C is the smallest σ-algebra σ(C) in X containing C. The precise definition is

$$\sigma(C) = \bigcap \{S : S \text{ is a σ-algebra in } X \text{ and } C \subset S\}.$$

Exercise 3 Determine the σ-algebra generated by the following families of sets. (Each point gives a set X and a family C of subsets of X. Determine σ(C).)

1. X an arbitrary set and C = ∅ (the empty family of subsets of X).

2. X a set, A ⊂ X, ∅ ≠ A ≠ X, C = {A}.

3. X = N and C = {{n} : n ∈ N}. That is, C is the family of all singleton subsets of the set N of positive integers.

4. X = R and C = {{x} : x ∈ R}. That is, C is the family of all singleton subsets of the set R of real numbers.

Exercise 4 This exercise introduces some simple though useful set theoretical techniques. Let $A_1, A_2, A_3, \ldots$ be subsets of some set X.

1. (Writing unions as disjoint unions). Define sets $B_1, B_2, \ldots$ by

$$B_1 = A_1, \qquad B_n = A_n \setminus \bigcup_{j=1}^{n-1} A_j \quad \text{if } n > 1.$$

Prove:

(a) $B_n \subset A_n$ for all n ∈ N.

(b) For every m ∈ N, $\bigcup_{n=1}^{m} A_n = \bigcup_{n=1}^{m} B_n$.

(c) $\bigcup_{n=1}^{\infty} A_n = \bigcup_{n=1}^{\infty} B_n$.

(d) $B_n \cap B_m = \emptyset$ if n, m ∈ N and n ≠ m.

(e) Assume A is an algebra in X and $A_n \in A$ for all n ∈ N. Then $B_n \in A$ for all n ∈ N.

Thing to remember The union of a sequence of sets in an algebra can be rewritten as a disjoint union of smaller sets in the same algebra. (A union is said to be disjoint if all sets in the union are pairwise disjoint.)

2. (Writing unions as increasing unions) This time set

$$B_n = \bigcup_{j=1}^{n} A_j \quad \text{for } n = 1, 2, \ldots.$$

Prove:

(a) $A_n \subset B_n$ for all n ∈ N.

(b) For every m ∈ N, $\bigcup_{n=1}^{m} A_n = B_m$.

(c) $\bigcup_{n=1}^{\infty} A_n = \bigcup_{n=1}^{\infty} B_n$.

(d) $B_1 \subset B_2 \subset B_3 \subset \cdots$.

(e) Assume A is an algebra in X and $A_n \in A$ for all n ∈ N. Then $B_n \in A$ for all n ∈ N.

Thing to remember The union of a sequence of sets in an algebra can be rewritten as a union of an increasing sequence of sets in the same algebra.

3. (Writing intersections as decreasing intersections) This time set

$$B_n = \bigcap_{j=1}^{n} A_j \quad \text{for } n = 1, 2, \ldots.$$

Prove:

(a) $A_n \supset B_n$ for all n ∈ N.

(b) For every m ∈ N, $\bigcap_{n=1}^{m} A_n = B_m$.

(c) $\bigcap_{n=1}^{\infty} A_n = \bigcap_{n=1}^{\infty} B_n$.

(d) $B_1 \supset B_2 \supset B_3 \supset \cdots$.

(e) Assume A is an algebra in X and $A_n \in A$ for all n ∈ N. Then $B_n \in A$ for all n ∈ N.

Thing to remember The intersection of a sequence of sets in an algebra can be rewritten as the intersection of a decreasing sequence of sets in the same algebra.
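The first two constructions of Exercise 4 are easy to try on a finite list of sets. The sketch below only illustrates the statements on one example and is no substitute for the proofs asked for; the function names are hypothetical.

```python
def disjointify(sets):
    """B_1 = A_1 and B_n = A_n minus (A_1 ∪ ... ∪ A_{n-1}) for n > 1."""
    seen, out = set(), []
    for A in sets:
        out.append(set(A) - seen)
        seen |= set(A)
    return out

def increasing(sets):
    """B_n = A_1 ∪ ... ∪ A_n."""
    out, acc = [], set()
    for A in sets:
        acc = acc | set(A)
        out.append(acc)
    return out

A = [{1, 2}, {2, 3}, {1, 3, 4}, {5}]
B = disjointify(A)
print(B)
print(increasing(A))
```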

Exercise 5 Let A be an algebra in some set X. Prove: If

$$\bigcup_{n=1}^{\infty} A_n \in A$$

whenever $\{A_n\}$ is a sequence of pairwise disjoint sets in A, then A is a σ-algebra.

The examples of generated σ-algebras of Exercise 3 are fairly simple. Things are not as easy in general. The way one goes from a family of sets C in X to the generated σ-algebra in a more or less constructive way is roughly as follows.

Step 0. Add the empty set and X to C (if they are not already in it).

Step 1. Add (to C) all countable unions of sets in C. The family is now closed under countable unions. If by some miracle it is also closed under complementation, we are done; but that usually doesn’t happen in interesting situations.

Step 2. Add all complements of the sets in the family. We now have a family closed under complementation. Unfortunately, it most likely ceased to be closed under countable unions.

Step 3. Repeat Step 1.

Step 4. Repeat Step 2.

Step 5-ℵ0 Keep on repeating.

The process is essentially an uncountable one. For details, see [1]. The following exercise may be of interest in this context.

Exercise 6 1. Give an example of a finite σ-algebra.

2. Let M be an infinite σ-algebra. Prove it is uncountable.
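When X is finite, the step-by-step process described before Exercise 6 stops after finitely many rounds, which makes it easy to sketch. The code below is an illustration under that finiteness assumption only; in general, as noted above, the process is essentially uncountable. The function name is hypothetical.

```python
def generated_sigma_algebra(X, C):
    """Close C under unions and complements until nothing new appears (finite X only)."""
    X = frozenset(X)
    family = {frozenset(), X} | {frozenset(A) for A in C}        # Step 0
    while True:
        unions = {A | B for A in family for B in family}         # Step 1 (pairwise unions;
                                                                 # repeating rounds gives all finite unions)
        complements = {X - A for A in unions}                    # Step 2
        new_family = unions | complements
        if new_family == family:                                 # nothing new: closed under both
            return family
        family = new_family                                      # Steps 3, 4, ...: repeat

X = {1, 2, 3, 4}
small = generated_sigma_algebra(X, [{1, 2}])
larger = generated_sigma_algebra(X, [{1}, {2}])
print(sorted(map(sorted, small)))
print(len(larger))
```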

Note on topological spaces. To simplify, and because it really suffices for our purposes, we’ll work mostly in metric spaces. However, in most definitions it is possible to replace every occurrence of the word metric by the word topological.

Definition 10 Let X be a metric space (or, more generally, a topological space). The σ-algebra of Borel sets of X is the σ-algebra generated by the family of open subsets of X. A Borel subset of X is, by definition, an element of this σ-algebra. If X is a metric space, we denote the σ-algebra of Borel subsets of X by B(X).

In general, given a metric space, there is no necessary and sufficient condition which allows us to determine whether a set is a Borel set or not. There are, of course, plenty of sufficient conditions.

Exercise 7 Let X be a metric space. Verify that all of the following subsets (or collections of subsets) of X are Borel sets.

1. All open sets.

2. All closed sets.

3. All singleton sets.

4. All $G_\delta$ sets, where by definition, a subset of X is a $G_\delta$-set if it is a countable intersection of open sets.

5. All $F_\sigma$ sets, where by definition, a subset of X is an $F_\sigma$-set if it is a countable union of closed sets.

Exercise 8 You could have used this in the previous exercise. Let X be a metric space. Show that every open subset of X is an $F_\sigma$ set and that every closed subset is a $G_\delta$ set.

Exercise 9 Let X be a metric space. Show that the σ-algebra generated by the closed subsets of X coincides with the σ-algebra of Borel sets.

Exercise 10 Consider $R^n$ as a metric space, as usual. Show that the σ-algebra generated by all compact subsets also coincides with the σ-algebra of Borel sets. Is the same true in a general metric space?

Definition 11 A measurable space is a pair (X, M) consisting of a set X and a σ-algebra M in X. The elements of M are, by definition, the measurable subsets of X.

Note. By a usual abuse of language, one says “X is a measurable space.” By this one understands that one has specified some σ-algebra in X with which X becomes a measurable space; one just doesn’t feel the need to give it a special symbol. One rarely needs to refer to the σ-algebra by name; it is the σ-algebra of measurable subsets of X and one says E is measurable as an alternative to saying E ∈ M.

Exercise 11 Let (X, M) be a measurable space and let E ∈ M. Define

(1) $M_E = \{F \in M : F \subset E\}$.

Prove that $M_E$ is a σ-algebra in E and that a subset F of E is in $M_E$ if and only if F = E ∩ A for some A ∈ M.

Definition 12 Let (X, M) be a measurable space and let E be a measurable subset of X. When we consider E as a measurable space, which occurs with some frequency, we consider it as such with the σ-algebra $M_E$ of (1). More precisely: Let (X, M) be a measurable space. A subspace of (X, M) is a pair $(E, M_E)$, where E ∈ M and $M_E$ is defined by (1).

4 Measurable Functions.

In this whole section we assume given a measurable space (X,M).

Definition 13 Let Y be a metric (or, more generally, a topological) space. A function f : X → Y is said to be measurable if the inverse image of every open subset of Y is a measurable subset of X. In symbols: f is measurable if whenever U is open in Y, $f^{-1}(U) = \{x \in X : f(x) \in U\} \in M$.

Exercise 12 Describe all measurable functions for the following measurable spaces.

1. (X,P(X)), X a set.

2. (X, {X, ∅}), X a set.

3. $(X, \{X, A, A^c, \emptyset\})$, X a set; A ⊂ X, ∅ ≠ A ≠ X.

4. (X, M) where X is an uncountable set and M is the family of all subsets A of X such that either A or X\A is countable.

Remark. The definition of measurable functions is a simple one, and is quite satisfactory as long as one works in metric spaces which are not too horribly large. One condition that ensures that the space is not too large is second countability. A metric space Y is second countable if there exists a countable family U of open subsets of Y such that every open subset of Y is a union of sets in U. The simplest examples of such metric spaces are the separable ones; a metric space Y is separable if it contains a countable dense subset; i.e., there is

$$D = \{y_1, y_2, \ldots\} \subset Y$$

such that the closure of D is Y: $\overline{D} = Y$. If Y is separable and D is a countable dense subset, we can take U = {B(y, r) : y ∈ D, r ∈ Q, r > 0} (where B(y, r) = {z ∈ Y : d(z, y) < r}). The family U is then a countable collection of open sets and every open subset of Y is easily seen to be a union of sets in U. If the metric space Y is not separable (or, worse, not second countable), the definition we are using of measurability begins to have serious difficulties, solved by adding more conditions (which are automatically satisfied in the separable case). In all events, most of our measurable functions will be either complex or extended real valued (see below).

The following exercises and lemmas give some concrete examples of measurable functions.

Exercise 13 Let E ⊂ X; define $\chi_E : X \to R$ (the characteristic function of E) by

$$\chi_E(x) = \begin{cases} 1 & \text{if } x \in E, \\ 0 & \text{if } x \in X \setminus E. \end{cases}$$

Prove $\chi_E$ is measurable if and only if E is measurable. Do NOT use the as yet unpresented characterization of real valued measurable functions in terms of the sets {f(x) < a}, etc.

Exercise 14 Let Y be a metric space. Show that constant functions from X to Y are measurable.

Exercise 15 Let E be a measurable subset of X, let Y be a metric space and let f : X → Y be measurable. Prove that the restriction $f|_E$ of f to E is measurable from E to Y. (We consider E as a measurable space with the σ-algebra $M_E$ defined earlier.)

Prove also that if g : E → Y is measurable, if η ∈ Y, then the function f : X → Y defined by

$$f(x) = \begin{cases} g(x) & \text{if } x \in E, \\ \eta & \text{if } x \in X \setminus E, \end{cases}$$

is measurable.

Lemma 4.1 Let f : X → Y, Y a metric space. Then f is measurable if and only if $f^{-1}(B)$ is measurable for every Borel subset B of Y.

Proof. In this proof we see, for the first time, how to deal with Borel sets; more generally, how one proves that something holds for all sets of a generated σ-algebra.

One direction is easy. If $f^{-1}(B)$ is measurable for every Borel subset B of Y, then $f^{-1}(U)$ is measurable for every open subset U of Y, since open sets are Borel sets. Hence f is measurable.

Conversely, assume f is measurable. Normally to prove $f^{-1}(B)$ is measurable for every Borel subset B of Y, we’d start by saying: Let B be a Borel subset of Y. But to say this is a dead end, since we don’t have any general, specific property that Borel sets satisfy. Instead we use the following idea: To prove that a property holds for all elements of a σ-algebra generated by some family C of sets, we let S denote the (possibly empty) family of all sets having the desired property. Then we prove that all sets in C have the property, thus C ⊂ S. Next we prove that S is a σ-algebra. Once we have that S is a σ-algebra containing C, it follows that S must contain σ(C), the smallest σ-algebra containing C. This means that every element of σ(C) has the desired property, and we are done.

Back to our proof. Assuming f measurable, let

$$S = \{E \subset Y : f^{-1}(E) \in M\}.$$

By definition of measurability, S contains all open subsets of Y. It is an easy exercise (see below) to prove S is a σ-algebra in Y. It follows that S must contain all Borel subsets of Y.

Exercise 16 Show that the set S defined in Lemma 4.1 is indeed a σ-algebra.

Lemma 4.2 Let Y, Z be metric (or topological) spaces, let f : X → Y be measurable and let g : Y → Z be continuous. Then g ◦ f : X → Z is measurable.

Proof. Let U be open in Z. By continuity of g, $g^{-1}(U)$ is open in Y; by measurability of f, $f^{-1}(g^{-1}(U))$ is measurable in X. Since $f^{-1}(g^{-1}(U)) = (g \circ f)^{-1}(U)$, we proved that the inverse image under g ◦ f of every open subset of Z is measurable in X, hence g ◦ f is measurable.
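The set identity used in the proof, $(g \circ f)^{-1}(U) = f^{-1}(g^{-1}(U))$, holds for arbitrary maps and can be checked on a toy example. The finite sets and the maps below are made up for this illustration.

```python
from itertools import combinations

def preimage(func, U):
    """Inverse image of U under a map given as a dict on a finite set."""
    return {x for x, y in func.items() if y in U}

f = {"a": 1, "b": 2, "c": 2, "d": 3}            # f : X -> Y
g = {1: "red", 2: "blue", 3: "red"}             # g : Y -> Z
g_after_f = {x: g[y] for x, y in f.items()}     # g ∘ f : X -> Z

Z = sorted(set(g.values()))
all_subsets_of_Z = [set(c) for r in range(len(Z) + 1) for c in combinations(Z, r)]

identity_holds = all(
    preimage(g_after_f, U) == preimage(f, preimage(g, U)) for U in all_subsets_of_Z
)
print(preimage(g_after_f, {"red"}), identity_holds)
```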

Definition 14 Let Y, Z be metric spaces. A function g : Y → Z is Borel measurable iff it is measurable when Y is considered as a measurable space with the σ-algebra of Borel sets.

Exercise 17 Let Y, Z be metric spaces. Prove: If g : Y → Z is continuous, then g is Borel measurable.

Exercise 18 Let (X, M) be a measurable space, let Y, Z be metric spaces. Prove: If f : X → Y is measurable and g : Y → Z is Borel measurable, then g ◦ f : X → Z is measurable.

We now need some results on real, extended real, and complex valued measurable functions. First a definition.

Definition 15 The set of extended reals is the set $\overline{R}$ obtained by adjoining to R two distinct non-real objects, usually denoted by −∞, ∞; in symbols,

$$\overline{R} = R \cup \{-\infty, \infty\}.$$

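The order relation and some operations are extended to $\overline{R}$ defining

$$x < \infty \ \ \forall x \in R \cup \{-\infty\}, \qquad -\infty < x \ \ \forall x \in R \cup \{\infty\}.$$

If $a, b \in \overline{R}$, we define the intervals (a, b), [a, b), (a, b], [a, b] in the usual way; in particular $\overline{R} = [-\infty, \infty]$.

The defined operations are, for now,

$$x + \infty = \infty + x = \infty \quad \text{if } x \in (-\infty, \infty],$$

$$x - \infty = -\infty + x = -\infty \quad \text{if } x \in [-\infty, \infty),$$

$$x \cdot \infty = \infty \cdot x = \begin{cases} \infty & \text{if } x \in (0, \infty], \\ -\infty & \text{if } x \in [-\infty, 0), \end{cases}$$

$$x \cdot (-\infty) = (-\infty) \cdot x = \begin{cases} -\infty & \text{if } x \in (0, \infty], \\ \infty & \text{if } x \in [-\infty, 0), \end{cases}$$

$$\frac{x}{\infty} = \frac{x}{-\infty} = 0 \quad \text{if } x \in R.$$

(We are not defining yet (±∞) · 0.)

We also need a topology for $\overline{R}$. $\overline{R}$ is a metric space, but we’ll rarely, if ever, have any use for the metric. So we just define what it means for a subset of $\overline{R}$ to be open. The definition is: Let $U \subset \overline{R}$. We say U is open if it satisfies:

i. U ∩ R is open. (So automatically, all open subsets of R are also open in $\overline{R}$.)

ii. If ∞ ∈ U, then there exists a ∈ R with (a,∞] ⊂ U.

iii. If −∞ ∈ U, then there exists a ∈ R with [−∞, a) ⊂ U.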

Exercise 19 Define d : R × R → [0,∞) by

$$d(x, y) = \left| \frac{x}{1 + |x|} - \frac{y}{1 + |y|} \right|$$

if x, y ∈ R; complete it to a metric in $\overline{\mathbb{R}}$ by defining

$$d(x, \infty) = d(\infty, x) = \left| \frac{x}{1 + |x|} - 1 \right|, \qquad d(x, -\infty) = d(-\infty, x) = \left| \frac{x}{1 + |x|} + 1 \right|,$$

and finally d(−∞,∞) = d(∞,−∞) = 2. Show that d is a metric in $\overline{\mathbb{R}}$ having precisely the open sets defined above.
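The metric of Exercise 19 is the usual distance after squashing the extended line onto [−1, 1]. The Python sketch below is only a numerical spot-check on a finite sample, not a proof; the helper name `squash` and the sample points are illustrative choices that do not appear in the notes.

```python
import math
from itertools import product

def squash(x: float) -> float:
    """Map the extended reals onto [-1, 1]: x -> x / (1 + |x|), with +-inf -> +-1."""
    if math.isinf(x):
        return 1.0 if x > 0 else -1.0
    return x / (1.0 + abs(x))

def d(x: float, y: float) -> float:
    """Distance on the extended reals, as defined in Exercise 19."""
    return abs(squash(x) - squash(y))

points = [-math.inf, -10.0, -1.0, 0.0, 0.5, 3.0, 1e6, math.inf]

# Spot-check symmetry and the triangle inequality on the sample
# (a small tolerance absorbs floating point rounding).
for x, y, z in product(points, repeat=3):
    assert d(x, y) == d(y, x)
    assert d(x, z) <= d(x, y) + d(y, z) + 1e-12
```

Writing d through a single squashing map makes the three cases of the definition (both points finite, one infinite, both infinite) instances of one formula.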

Theorem 4.3 (Characterization of extended real valued measurable functions) Let f : X → [−∞,∞]. The following statements are equivalent.

1. f is measurable.

2. For every a ∈ R, the set {x ∈ X : f(x) < a} is measurable.

3. For every a ∈ R, the set {x ∈ X : f(x) ≥ a} is measurable.

4. For every a ∈ R, the set {x ∈ X : f(x) > a} is measurable.

5. For every a ∈ R, the set {x ∈ X : f(x) ≤ a} is measurable.

Notation From now on, if f : X → R and α ∈ R, we'll abbreviate the notation, writing

{f < α} = {x ∈ X : f(x) < α},

with the sets {f ≤ α}, {f ≥ α}, {f > α}, {f = α} being similarly defined.

Proof.

1 ⇒ 2 Assume f is measurable and let a ∈ R. Since [−∞, a) is open in $\overline{\mathbb{R}}$, we see that

$$\{f < a\} = f^{-1}([-\infty, a))$$

is measurable.

2 ⇒ 3 Assume 2; i.e., that all sets of the form {f < a} are measurable. Let a ∈ R. Since {f ≥ a} = X\{f < a}, we conclude {f ≥ a} is measurable.

3 ⇒ 4 Assume 3; i.e., that all sets of the form {f ≥ a} are measurable. Let a ∈ R. Since

$$\{f > a\} = \bigcup_{n=1}^{\infty} \left\{ f \ge a + \tfrac{1}{n} \right\}$$

(as is easily verified), we see {f > a} is measurable.

4 ⇒ 5 Assume 4; i.e., that all sets of the form {f > a} are measurable. Let a ∈ R. Since {f ≤ a} = X\{f > a}, we are done with this part.

5 ⇒ 2 Assume 5; i.e., that all sets of the form {f ≤ a} are measurable. Let a ∈ R. Since

$$\{f < a\} = \bigcup_{n=1}^{\infty} \left\{ f \le a - \tfrac{1}{n} \right\}$$

(as is easily verified), we see {f < a} is measurable.

At this stage we see that statements 2, 3, 4, and 5 are equivalent, and that 1 implies all of them. We now want to complete the proof.

2 ⇒ 1 Assume 2; then 3, 4, and 5 also hold, and we see that if I is an open interval in R, then $f^{-1}(I)$ is measurable. In fact, say I = (a, b). Then

$$f^{-1}(I) = \{f > a\} \cap \{f < b\} \in \mathcal{M}.$$

Assume now U is open in R. Then U is a countable union of open intervals in R, say $U = \bigcup_{n=1}^{\infty} I_n$, where $I_n = (a_n, b_n)$, $-\infty < a_n < b_n < \infty$, for n = 1, 2, . . . . Then

$$f^{-1}(U) = \bigcup_{n=1}^{\infty} f^{-1}(I_n)$$

is measurable. Now, as one sees quite easily, an open subset V of $\overline{\mathbb{R}}$ is of precisely one of the following four forms, where U denotes an open subset of R.

1. An open subset of R. In this case we just saw $f^{-1}(V)$ is measurable.

2. V = U ∪ (a,∞] for some a ∈ R. Then

$$f^{-1}(V) = f^{-1}(U) \cup \{f > a\} \in \mathcal{M}.$$

3. V = U ∪ [−∞, a) for some a ∈ R. Then

$$f^{-1}(V) = f^{-1}(U) \cup \{f < a\} \in \mathcal{M}.$$

4. V = U ∪ [−∞, c) ∪ (a,∞] for some c, a ∈ R, c ≤ a. Then

$$f^{-1}(V) = f^{-1}(U) \cup \{f < c\} \cup \{f > a\} \in \mathcal{M}.$$

The theorem is proved.
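To see criterion 2 of Theorem 4.3 at work in the simplest setting, the following Python sketch takes a finite set X whose σ-algebra is generated by a partition into blocks, so that a set is measurable exactly when it is a union of blocks. The data layout (a dict for f, a list of sets for the blocks) and the function names are assumptions made for this illustration only.

```python
import math

def is_measurable_set(subset, blocks):
    """In the sigma-algebra generated by a partition, a set is measurable
    iff each block lies entirely inside it or entirely outside it."""
    return all(block <= subset or block.isdisjoint(subset) for block in blocks)

def measurable_by_level_sets(f, blocks):
    """Check that {f < a} is measurable for every real a.  On a finite X the
    sets {f < a} only change when a crosses a finite value of f, so finitely
    many thresholds suffice."""
    X = set().union(*blocks)
    finite_vals = sorted({v for v in f.values() if math.isfinite(v)})
    if finite_vals:
        thresholds = [finite_vals[0] - 1] + finite_vals + [finite_vals[-1] + 1]
    else:
        thresholds = [0.0]
    return all(
        is_measurable_set({x for x in X if f[x] < a}, blocks) for a in thresholds
    )

blocks = [{1, 2}, {3}, {4, 5, 6}]
f = {1: 0.0, 2: 0.0, 3: math.inf, 4: -1.5, 5: -1.5, 6: -1.5}  # constant on blocks
g = dict(f)
g[2] = 7.0                                                     # splits the block {1, 2}
```

Here f passes the test and g fails it, in line with the fact that for a partition-generated σ-algebra the measurable functions are those constant on each block.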

Notice For a measurable f : X → R one does have that {f = a}, {f ≤ a}, etc. are measurable sets for every a ∈ [−∞,∞], even though the theorem only mentions a ∈ R.

Exercise 20 Let X be a measurable space, let f : X → R such that the set {x ∈ X : f(x) ≥ r} is measurable for every r ∈ Q. Prove that f is measurable.

The following lemma gives a useful characterization of vector valued measurable functions. Its proof uses the fact that every open subset U of $\mathbb{R}^m$ is the countable union of open "rectangles"; i.e., of sets of the form

$$R = \{x = (x_1, \dots, x_m) \in \mathbb{R}^m : a_i < x_i < b_i \text{ for } i = 1, \dots, m\} \tag{2}$$

where $a_1, \dots, a_m, b_1, \dots, b_m \in \mathbb{R}$; $a_i < b_i$ for i = 1, . . . , m. One can even assume that $a_i, b_i \in \mathbb{Q}$.

Lemma 4.4 Let f1, . . . , fm : X → R. Define f = (f1, . . . , fm) : X → Rm by

f(x) = (f1(x), . . . , fm(x)) for x ∈ X.

Then f is measurable if and only if every fi is measurable, i = 1, . . . , m.

Proof. Assume first f is measurable and let i ∈ {1, . . . ,m}. The map

πi : Rm → R defined by πi(x) = xi if x = (x1, . . . , xm),

is continuous. Thus fi = πi ◦ f is measurable.

Conversely, assume that all the fi's are measurable. Let U be open in $\mathbb{R}^m$. Then $U = \bigcup_{n=1}^{\infty} R_n$, where each $R_n$ is an open "rectangle" in $\mathbb{R}^m$. Then

$$f^{-1}(U) = \bigcup_{n=1}^{\infty} f^{-1}(R_n)$$

and to see that $f^{-1}(U)$ is measurable it suffices to see that $f^{-1}(R)$ is measurable for every open rectangle R in $\mathbb{R}^m$. So let R be given by (2). Then

$$f^{-1}(R) = \bigcap_{i=1}^{m} \left( \{f_i > a_i\} \cap \{f_i < b_i\} \right),$$

hence measurable.

The previous lemma can be generalized to any (finite) number of metric spaces, as long as they are separable.

Notation. Given a measurable space (X,M) (as we are given), we'll denote by Me(X) (the subscript e for extended) the set of all extended real valued measurable functions on X. We'll denote by M(X) the set of all real valued elements in Me(X).

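Definition 16 Let {fn} be a sequence of extended real valued functions on X. The functions

$$\inf_n f_n, \qquad \sup_n f_n, \qquad \liminf_{n\to\infty} f_n, \qquad \limsup_{n\to\infty} f_n,$$

from X to $\overline{\mathbb{R}}$ are defined as follows at points x ∈ X:

$$\left(\inf_n f_n\right)(x) = \inf\{f_n(x) : n \in \mathbb{N}\}, \qquad \left(\sup_n f_n\right)(x) = \sup\{f_n(x) : n \in \mathbb{N}\},$$

$$\left(\liminf_{n\to\infty} f_n\right)(x) = \liminf_{n\to\infty} \left(f_n(x)\right), \qquad \left(\limsup_{n\to\infty} f_n\right)(x) = \limsup_{n\to\infty} \left(f_n(x)\right).$$

Notice that one has:

$$\liminf_{n\to\infty} f_n = \sup_n \left( \inf_{m \ge n} f_m \right), \qquad \limsup_{n\to\infty} f_n = \inf_n \left( \sup_{m \ge n} f_m \right). \tag{3}$$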

Theorem 4.5 The elements of Me(X) have the following properties:

1. If f ∈ Me(X) and assumes infinite values, we define 0f to be the null function anyway. That is, we define 0 · (±∞) = 0 in this case. With this definition, we have: Let f ∈ Me(X) and let c ∈ R. Then cf ∈ Me(X).

2. Let f, g ∈ Me(X) and let

C = ({f = ∞} ∩ {g = −∞}) ∪ ({f = −∞} ∩ {g = ∞}).

(C is the set where f + g is undefined; with luck, C = ∅.) Then E = X\C is measurable and f|E + g|E ∈ Me(E). (f|E is the restriction of f to E; we'll usually write f + g ∈ Me(E) using, with some care, the same symbol for a function and its restrictions (or extensions).)

3. Let f ∈ Me(X). The functions f², |f| (the latter being defined by |f|(x) = |f(x)| for x ∈ X, with the added definition that |±∞| = ∞) are in Me(X).

4. Let f, g ∈ Me(X). Then fg ∈ Me(X) if we define fg(x) = 0 when one of f(x), g(x) is zero and the other one ∞ or −∞.

5. Let f ∈ Me(X) and assume either that f(x) ≠ 0 for all x ∈ X or that f(x) ≥ 0 for all x ∈ X. Then 1/f ∈ Me(X) if in the second case we define 1/f(x) = ∞ in case f(x) = 0.

6. Let {fn} be a sequence of elements of Me(X). Then

$$\inf_n f_n, \quad \sup_n f_n, \quad \liminf_{n\to\infty} f_n, \quad \limsup_{n\to\infty} f_n \ \in M_e(X).$$

Proof. We use the characterizations of Theorem 4.3.

1. Let f ∈ Me(X) and let c ∈ R. If c = 0 then cf ≡ 0 is certainly measurable, so assume c ≠ 0. Let a ∈ R. If c > 0, then for every x ∈ X, cf(x) > a if and only if f(x) > a/c; i.e., {cf > a} = {f > a/c}. The last set being measurable, and a being arbitrary, we proved that {cf > a} is measurable for every a ∈ R; hence, by Theorem 4.3, cf is measurable. Similarly, if c < 0, then {cf > a} = {f < a/c} is measurable, hence cf is measurable.

2. The set C is measurable since f, g are measurable. Restrictions of measurable functions to measurable sets are measurable (see below), so we may as well prove this point assuming C = ∅, E = X. Let a ∈ R. We claim:

$$\{f + g < a\} = \bigcup_{r \in \mathbb{Q}} \left( \{f < r\} \cap \{g < a - r\} \right).$$

In fact, assume x is in the set on the right hand side. Then there exists some r ∈ Q such that f(x) < r and g(x) < a − r, so that (f + g)(x) = f(x) + g(x) < r + (a − r) = a, so that x is also in the set on the left hand side. Conversely, assume x is in the left hand side set. Then f(x) + g(x) < a. But a ∈ R, so this forces f(x) < ∞, g(x) < ∞ and it follows that f(x) < a − g(x) (even if g(x) = −∞, in which case the inequality is trivially true since f(x) < ∞). By the density of the rationals in R, there is r ∈ Q such that f(x) < r < a − g(x). Thus x ∈ {f < r}; it also is in {g < a − r} because g(x) < a − r is immediate from r < a − g(x) if g(x) is finite and trivial if g(x) = −∞. Thus x ∈ {f < r} ∩ {g < a − r}, so that it is in the set on the right hand side. The claim is established.

Every set in the union of the right hand side is measurable by the measurability of f and of g; since the union is countable (because Q is countable), we see that {f + g < a} is measurable. Since a ∈ R was arbitrary, we are done.

Note: If one can assume f, g ∈ M(X) (are finite valued) one can obtain this theorem as an immediate consequence of Lemma 4.4, plus the continuity of the map τ : R² → R defined by τ(s, t) = s + t. In fact, the function F = (f, g) : x ↦ (f(x), g(x)) from X to R² is measurable by Lemma 4.4, hence so is f + g = τ ◦ F. One could extend Lemma 4.4 so that it also applies to $\overline{\mathbb{R}}^m$.

3. The maps s ↦ s² and s ↦ |s| are continuous from R to R, so for finite valued functions this point follows at once from Lemma 4.2. One can show that the maps in question are actually continuous from $\overline{\mathbb{R}}$ to $\overline{\mathbb{R}}$, so the result also follows from Lemma 4.2 in the more general case. But we'll use again Theorem 4.3. We assume f ∈ Me(X) and notice that for a ∈ R

$$\{f^2 \ge a\} = \begin{cases} X \in \mathcal{M} & \text{if } a \le 0, \\ \{f \ge \sqrt{a}\} \cup \{f \le -\sqrt{a}\} \in \mathcal{M} & \text{if } a > 0, \end{cases}$$

proving f² ∈ Me(X). Similarly

$$\{|f| \ge a\} = \begin{cases} X \in \mathcal{M} & \text{if } a \le 0, \\ \{f \ge a\} \cup \{f \le -a\} \in \mathcal{M} & \text{if } a > 0, \end{cases}$$

so |f| ∈ Me(X).

4. We can get fg measurable if f, g are measurable as an easy consequence of Lemmas 4.2, 4.4 if f, g are finite valued. In fact fg is the composition of x ↦ (f(x), g(x)), which is measurable from X to R² by Lemma 4.4, and of the continuous map (s, t) ↦ st from R² to R. We could extend things so they are valid for $\overline{\mathbb{R}}$ replacing R, but there is also a quick direct proof. Unfortunately, we still have to worry about infinite values. Let us first sort of eliminate them, defining

D = {|f| = ∞} ∪ {|g| = ∞};

D is the set where at least one of f and g assumes infinite values. Then D is measurable, so is E = X\D and

$$fg = \frac{1}{4} \left( (f + g)^2 - (f - g)^2 \right)$$

shows that fg ∈ Me(E). We can now write D as the disjoint union of three sets, $D = D_0 \cup D_+ \cup D_-$, which are defined as the subsets of D where fg is 0, ∞, −∞, respectively. Specifically, $x \in D_0$ if f(x) = 0 and |g(x)| = ∞ or if |f(x)| = ∞ and g(x) = 0; $x \in D_+$ if f(x) = ∞, g(x) > 0, or if f(x) > 0, g(x) = ∞, or if f(x) < 0, g(x) = −∞, or if f(x) = −∞, g(x) < 0; similarly for $D_-$, the set where fg = −∞. These three sets are measurable. Now, if a ∈ R, a ≥ 0, then

$$\{fg > a\} = \{x \in E : (fg)(x) > a\} \cup D_+ \in \mathcal{M};$$

if a < 0 then

$$\{fg > a\} = \{x \in E : (fg)(x) > a\} \cup D_+ \cup D_0 \in \mathcal{M}.$$

The measurability of fg follows.

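5. Let f ∈ Me(X) and assume first f(x) ≠ 0 for all x ∈ X. Since there is no reason why f should be continuous, we can have f assuming both positive and negative values, as well as infinite ones. However,

$$\left\{ \frac{1}{f} > a \right\} = \begin{cases} \{f > 0\} \cap \{f < \frac{1}{a}\} & \text{if } a > 0, \\ \{f > 0\} \cap \{f < \infty\} & \text{if } a = 0, \\ \{f > 0\} \cup \{f < \frac{1}{a}\} & \text{if } a < 0. \end{cases}$$

It follows that 1/f is measurable. In case f(x) ≥ 0 for all x ∈ X, we get

$$\left\{ \frac{1}{f} > a \right\} = \begin{cases} \{f < \frac{1}{a}\} & \text{if } a > 0, \\ \{f < \infty\} & \text{if } a = 0, \\ X & \text{if } a < 0. \end{cases}$$

Measurability follows.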

6. Let {fn} be a sequence of extended real valued measurable functions. We notice (prove as an exercise) that for every a ∈ R

$$\left\{ \inf_n f_n < a \right\} = \bigcup_{n=1}^{\infty} \{f_n < a\}, \qquad \left\{ \sup_n f_n > a \right\} = \bigcup_{n=1}^{\infty} \{f_n > a\},$$

which proves the measurability of $\inf_n f_n$ and of $\sup_n f_n$; the formulas (3) now prove the measurability of $\liminf_{n\to\infty} f_n$ and of $\limsup_{n\to\infty} f_n$.
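The case analysis for {1/f > a} in item 5 of the proof is easy to get wrong by a sign, so a numerical cross-check can be reassuring. The Python sketch below compares the direct condition 1/y > a with the description by cases, for sample values y ≠ 0 (including ±∞) and sample thresholds a. The function names and the sample values are illustrative choices, not part of the notes.

```python
import math

def direct(y: float, a: float) -> bool:
    """Is 1/y > a?  Python floats already give 1/inf == 0.0 and 1/-inf == -0.0."""
    return 1.0 / y > a

def by_cases(y: float, a: float) -> bool:
    """The description of {1/f > a} used in the proof when f is never zero."""
    if a > 0:
        return y > 0 and y < 1.0 / a
    if a == 0:
        return y > 0 and y < math.inf
    return y > 0 or y < 1.0 / a

# Powers of two keep every reciprocal exact in floating point.
ys = [-math.inf, -4.0, -1.0, -0.25, 0.25, 1.0, 4.0, math.inf]
thresholds = [-2.0, -0.5, 0.0, 0.5, 2.0]
mismatches = [(y, a) for y in ys for a in thresholds if direct(y, a) != by_cases(y, a)]
```

An empty list of mismatches on such a sample is of course no proof, but it does catch a swapped inequality or a union written as an intersection.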

Corollary 4.6 M(X) is a real vector space with the usual operations. It is also closed under products; in other words, M(X) is an algebra over R.

Exercise 21 Let E be a measurable subset of X, let Y be a metric space and let f : X → Y be measurable. Prove that the restriction f|E of f to E is measurable from E to Y. (We consider E as a measurable space with the σ-algebra $\mathcal{M}_E$ defined earlier.)

Prove also that if g : E → Y is measurable, if η ∈ Y, then the function f : X → Y defined by

$$f(x) = \begin{cases} g(x) & \text{if } x \in E, \\ \eta & \text{if } x \in X \setminus E, \end{cases}$$

is measurable.

Corollary 4.7 Let {fn} be a sequence of extended real valued measurable functions on X. The set

$$E = \{x \in X : \lim_{n\to\infty} f_n(x) \text{ exists}\}$$

is measurable (in M) and if we define $\lim_n f_n = \lim_{n\to\infty} f_n$ on E by

$$\left( \lim_{n\to\infty} f_n \right)(x) = \lim_{n\to\infty} \left( f_n(x) \right),$$

then $\lim_n f_n$ is measurable on E. In particular, if the sequence converges at each x ∈ X, then $\lim_n f_n : X \to \overline{\mathbb{R}}$ is measurable. Here convergence is understood in $\overline{\mathbb{R}}$; i.e., it means either regular convergence or divergence to ∞ or to −∞.

Proof. The following result should, perhaps, have been mentioned earlier: Let f, g be two extended real valued measurable functions on X. Then the set {f = g} = {x ∈ X : f(x) = g(x)} is measurable. If f, g are finite valued, then the proof is immediate; in this case

$$\{f = g\} = \{f - g = 0\} = (f - g)^{-1}(\{0\})$$

is measurable, since f − g is measurable. If f, g assume infinite values, then f − g may be undefined. We could still use a variant of the previous argument, but we can also proceed as follows. We simply notice that

$$\{f < g\} = \{x \in X : f(x) < g(x)\} = \bigcup_{r \in \mathbb{Q}} \left( \{f < r\} \cap \{g > r\} \right)$$

and, since Q is countable, it follows that {f < g} is measurable. Switching the roles of f, g we get that {f > g} is measurable. Thus {f = g} = ({f < g} ∪ {f > g})ᶜ is measurable. With this result out of the way, the corollary follows easily from the fact that the set E defined in the statement is the set where the two measurable functions $\liminf_n f_n$ and $\limsup_n f_n$ coincide. In fact, once one sees that E ∈ M, the rest follows at once from previous work.

Let A, B be sets. The functions of finite range are among the simplest functions from A to B. These include constant functions, functions taking on only two values, etc. They are so simple they are called simple functions. Of course, if B is finite, all functions from A to B are simple. We are interested in the measurable ones.

Definition 17 Let Y be a metric space. A measurable simple function from X to Y is a measurable function which only assumes a finite number of values of Y.

The following lemma is sort of immediate.

Lemma 4.8 Let Y be a metric space and let f : X → Y be simple; say $f(X) = \{y_1, \dots, y_m\}$ where $y_i \ne y_j$ if i ≠ j. Then f is measurable if and only if the sets $E_i = f^{-1}(\{y_i\}) = \{x \in X : f(x) = y_i\}$ are measurable for i = 1, . . . , m.

Exercise 22 Prove Lemma 4.8.

Mostly we will be concerned with real valued simple functions. Let f : X → R be simple and let $\{a_1, \dots, a_m\}$ be its range, with $a_i \ne a_j$ if i ≠ j. Defining $E_i = \{f = a_i\}$, Lemma 4.8 tells us that f is measurable if and only if each $E_i \in \mathcal{M}$. We notice the rather immediate fact that $E_i \cap E_j = \emptyset$ if i ≠ j and that

$$f = \sum_{i=1}^{m} a_i \chi_{E_i}.$$

Writing out f in this way, we can even assume that $a_i \ne 0$ for all i. In fact, assume that (for example) $a_m = 0$. Then obviously

$$f = \sum_{i=1}^{m} a_i \chi_{E_i} = \sum_{i=1}^{m-1} a_i \chi_{E_i}.$$

We will say that the simple function f : X → R is written in canonical form if we write

$$f = \sum_{i=1}^{m} a_i \chi_{E_i}$$

where $E_i \cap E_j = \emptyset$ for i ≠ j and $a_1, \dots, a_m$ are distinct, non-zero, real numbers.

As an example assume X = R (with some σ-algebra that's totally beside the point at this stage) and let

$$f = \chi_{(0,1]} + 2\chi_{[1/2,5]} - 3\chi_{(1/4,4)}.$$

This is a simple function, but definitely not in canonical form. To write it in canonical form, we have to figure out what values it takes where.

Exercise 23 Verify that the canonical form of the simple function of the last example is

$$f = \chi_{(0,1/4]} - 2\chi_{(1/4,1/2)} - \chi_{(1,4)} + 2\chi_{[4,5]}.$$
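Figuring out "what values it takes where" is mechanical: f is constant at each endpoint of the three intervals and on each open gap between consecutive endpoints, so it is enough to sample those pieces. The Python sketch below does this with exact rational arithmetic; the tuple encoding of the terms is an assumption made for this illustration.

```python
from fractions import Fraction as F

# Each term is (coefficient, left end, left closed?, right end, right closed?).
terms = [
    (1, F(0), False, F(1), True),       # chi_(0,1]
    (2, F(1, 2), True, F(5), True),     # 2 chi_[1/2,5]
    (-3, F(1, 4), False, F(4), False),  # -3 chi_(1/4,4)
]

def f(x):
    """Evaluate the simple function at the point x."""
    total = 0
    for c, lo, lo_closed, hi, hi_closed in terms:
        above = x > lo or (lo_closed and x == lo)
        below = x < hi or (hi_closed and x == hi)
        if above and below:
            total += c
    return total

# Sample every endpoint and the midpoint of every gap between endpoints.
cuts = sorted({p for _, lo, _, hi, _ in terms for p in (lo, hi)})
pieces = []
for i, p in enumerate(cuts):
    pieces.append((f"{{{p}}}", f(p)))
    if i + 1 < len(cuts):
        q = cuts[i + 1]
        pieces.append((f"({p},{q})", f((p + q) / 2)))

for label, value in pieces:
    print(label, value)
```

Endpoints are where mistakes hide: for instance f(4) = 2, because 4 belongs to [1/2, 5] but not to the open interval (1/4, 4). Grouping the pieces with equal non-zero value then gives the sets of the canonical form.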

We have the following lemma.

Lemma 4.9 The measurable, real valued simple functions are a linear subspace and a subalgebra of the space M(X). It is the smallest linear subspace containing all characteristic functions χE of measurable sets E.

Exercise 24 Prove Lemma 4.9. The proof consists in verifying that if f, g are (measurable) simple functions, then f + g, fg and also cf for every c ∈ R are simple. Plus, the already pointed out fact, that a function is simple if and only if it is a linear combination of characteristic functions.

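Definition 18 Let {An} be a sequence of subsets of X. We define the sets $\liminf_n A_n = \liminf_{n\to\infty} A_n$, $\limsup_n A_n = \limsup_{n\to\infty} A_n$ by

$$\liminf_{n\to\infty} A_n = \bigcup_{n=1}^{\infty} \left( \bigcap_{m=n}^{\infty} A_m \right), \qquad \limsup_{n\to\infty} A_n = \bigcap_{n=1}^{\infty} \left( \bigcup_{m=n}^{\infty} A_m \right).$$

We say the sequence converges if $\liminf_{n\to\infty} A_n = \limsup_{n\to\infty} A_n$, in which case we write $\lim_{n\to\infty} A_n = \liminf_{n\to\infty} A_n = \limsup_{n\to\infty} A_n$ (and call this set the limit of the sequence).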

Exercise 25 Let {An} be a sequence of subsets of X and denote the liminf and the limsup by

$$\underline{A} = \liminf_{n\to\infty} A_n, \qquad \overline{A} = \limsup_{n\to\infty} A_n.$$

Prove the following statements.

1. If An is measurable for each n ∈ N, then $\underline{A}$ and $\overline{A}$ are measurable.

2. $\overline{A}$ = {x ∈ X : the set of n ∈ N with x ∈ An is infinite}, $\underline{A}$ = {x ∈ X : the set of n ∈ N with x ∉ An is finite}.

3. $\underline{A} \subset \overline{A}$.

4. $\chi_{\underline{A}} = \liminf_{n\to\infty} \chi_{A_n}$, $\chi_{\overline{A}} = \limsup_{n\to\infty} \chi_{A_n}$.

5. The sequence {An} converges if and only if the sequence $\{\chi_{A_n}\}$ of characteristic functions converges, in which case

$$\chi_{\lim_n A_n} = \lim_n \chi_{A_n}.$$

6. Assume A1 ⊂ A2 ⊂ A3 ⊂ · · · . Then {An} converges and $\lim_n A_n = \bigcup_{n=1}^{\infty} A_n$.

7. Assume A1 ⊃ A2 ⊃ A3 ⊃ · · · . Then {An} converges and $\lim_n A_n = \bigcap_{n=1}^{\infty} A_n$.
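For a sequence of sets that repeats with a fixed period, the liminf and limsup can be computed exactly, which makes a convenient playground for the statements of Exercise 25. The Python sketch below is such an illustration; the restriction to periodic sequences, the function names, and the example sets are all assumptions of the sketch.

```python
from functools import reduce

def liminf_limsup_periodic(period):
    """For a sequence that repeats the list `period` forever, every tail
    contains a full period, so each tail intersection (union) equals the
    intersection (union) over one period."""
    liminf = reduce(set.intersection, period)
    limsup = reduce(set.union, period)
    return liminf, limsup

def by_definition(A, n_max, m_max):
    """Evaluate the defining formulas with n <= n_max and tails cut at m_max.
    This is exact for a periodic A once each truncated tail covers a period."""
    caps = [reduce(set.intersection, (A(m) for m in range(n, m_max + 1)))
            for n in range(1, n_max + 1)]
    cups = [reduce(set.union, (A(m) for m in range(n, m_max + 1)))
            for n in range(1, n_max + 1)]
    return reduce(set.union, caps), reduce(set.intersection, cups)

def A(n):
    """A_n = {1, 2} for odd n and {2, 3} for even n."""
    return {1, 2} if n % 2 == 1 else {2, 3}

lower, upper = liminf_limsup_periodic([A(1), A(2)])
```

In this example the point 2 lies in every A_n, while 1 and 3 each lie in infinitely many A_n and miss infinitely many, so the liminf is {2}, the limsup is {1, 2, 3}, and the sequence does not converge.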

Finally, we get to consider complex valued functions. In this course, for the most part, C = R² and it is just a matter of convenience whether we write z = (x, y) or z = x + iy. The following lemma is thus an immediate consequence of Lemma 4.4.

Lemma 4.10 Let f = u + iv : X → C, where u, v are real valued. Then f is measurable if and only if both u and v are measurable.

Exercise 26 Prove the following statements.

1. Let f : X → C be measurable, let α ∈ C. The following complex valued functions are measurable: αf, f², |f|.

2. Let f, g : X → C be measurable. Then f + g, fg : X → C are measurable.

3. Let fn : X → C be measurable for all n ∈ N. Then the set $E = \{x \in X : \lim_{n\to\infty} f_n(x) \text{ exists}\}$ is measurable and the function $\lim_{n\to\infty} f_n : E \to \mathbb{C}$ is measurable.

Let's return to the real valued (and the extended real valued) case for a while.

Definition Let f : X → $\overline{\mathbb{R}}$. We define $f^+, f^- : X \to [0,\infty]$ by

$$f^+(x) = \max(f(x), 0), \qquad f^-(x) = -\min(f(x), 0)$$

for x ∈ X.

Exercise 27

1. Let f : X → $\overline{\mathbb{R}}$. Prove $f = f^+ - f^-$, $|f| = f^+ + f^-$. Conversely, show that if g, h are non-negative functions on X satisfying g − h = f, g + h = |f|, then $g = f^+$, $h = f^-$.

2. Show that if f ∈ Me(X), then $f^+, f^- \in M_e(X)$.

The following result (Theorem 1.17 in [2]) is rather important and of frequent use. Some notation/terminology first. If f, g : X → $\overline{\mathbb{R}}$ we write f ≤ g to indicate that f(x) ≤ g(x) for all x ∈ X. Similarly, f ≥ 0 means that f(x) ≥ 0 for all x ∈ X and we say that f is non-negative. We define f ≤ 0 analogously.

Theorem 4.11 Let f be a non-negative element of Me(X); i.e., let f ∈ Me(X) and assume f(x) ≥ 0 for all x ∈ X. There exists a sequence of non-negative measurable simple functions {sn} with the following properties:

1. s1 ≤ s2 ≤ · · · ≤ f.

2. $\lim_{n\to\infty} s_n(x) = f(x)$ for all x ∈ X.

3. If f is bounded (there exists M such that f(x) ≤ M for all x ∈ X) then the convergence of {sn} to f is uniform.

Proof. Let n ∈ N. We define $s_n : X \to \mathbb{R}$ as follows: Let x ∈ X. If f(x) < n, then there is a unique integer k, $1 \le k \le n2^n$, such that

$$\frac{k-1}{2^n} \le f(x) < \frac{k}{2^n}.$$

We set $s_n(x) = \frac{k-1}{2^n}$. On the other hand, if f(x) ≥ n, we set $s_n(x) = n$. Briefly,

$$s_n(x) = \begin{cases} \dfrac{k-1}{2^n} & \text{if } \dfrac{k-1}{2^n} \le f(x) < \dfrac{k}{2^n}, \ k = 1, \dots, n2^n, \\ n & \text{if } f(x) \ge n. \end{cases}$$

The function $s_n$ is clearly simple; it only takes on the finite set of values

$$0, \ \frac{1}{2^n}, \ \dots, \ n - \frac{1}{2^n}, \ n.$$

It is measurable because f is measurable; in fact

$$\left\{ s_n = \frac{k-1}{2^n} \right\} = \left\{ \frac{k-1}{2^n} \le f < \frac{k}{2^n} \right\}$$

for $k = 1, \dots, n2^n$, while $\{s_n = n\} = \{f \ge n\}$.

The way $s_n(x)$ is defined, it is clear that $s_n \le f$. To see that $s_n \le s_{n+1}$ is also easy and can be done as follows. Let x ∈ X. Assume first f(x) < n. Then

$$s_n(x) = \frac{k-1}{2^n} \quad \text{where} \quad \frac{k-1}{2^n} \le f(x) < \frac{k}{2^n}.$$

Multiplying and dividing by 2 we get

$$\frac{2k-2}{2^{n+1}} \le f(x) < \frac{2k}{2^{n+1}},$$

which means that either

$$\frac{2k-2}{2^{n+1}} \le f(x) < \frac{2k-1}{2^{n+1}} \quad \text{or} \quad \frac{2k-1}{2^{n+1}} \le f(x) < \frac{2k}{2^{n+1}}$$

holds. If the first one of these holds, then

$$s_{n+1}(x) = \frac{2k-2}{2^{n+1}} = \frac{k-1}{2^n} = s_n(x);$$

if the second, then

$$s_{n+1}(x) = \frac{2k-1}{2^{n+1}} = \frac{k-1}{2^n} + \frac{1}{2^{n+1}} > s_n(x);$$

in either case $s_{n+1}(x) \ge s_n(x)$. Now assume f(x) ≥ n, so $s_n(x) = n$. If actually f(x) ≥ n + 1, then $s_{n+1}(x) = n + 1 > s_n(x)$. If f(x) < n + 1 let k be a positive integer such that

$$\frac{k-1}{2^{n+1}} \le f(x) < \frac{k}{2^{n+1}}.$$

Clearly $k \ge n2^{n+1} + 1$ (otherwise we can't have f(x) ≥ n) so that then

$$s_{n+1}(x) = \frac{k-1}{2^{n+1}} \ge n = s_n(x).$$

Part 1 has been proved. For part 2 (and 3), assume first x ∈ X and f(x) = ∞. Then f(x) ≥ n for all n ∈ N and $s_n(x) = n$ for all n ∈ N, hence $\lim_{n\to\infty} s_n(x) = \infty = f(x)$. This proves convergence at all points where f is infinite. For N ∈ N, let $X_N = \{f \le N\}$. Then (a frequently used fact)

$$\{f < \infty\} = \bigcup_{N=1}^{\infty} X_N$$

so, to complete proving $\lim_n s_n = f$, it suffices to see convergence on each set $X_N$. So let N ∈ N, let $x \in X_N$. If n ≥ N + 1, then f(x) < n so that if k is such that $2^{-n}(k-1) \le f(x) < 2^{-n}k$, then $s_n(x) = 2^{-n}(k-1)$ and then $0 \le f(x) - s_n(x) < 2^{-n}k - 2^{-n}(k-1) = 2^{-n}$. It follows that if ε > 0 is given, selecting n > max(1/ε, N), we get $|f(x) - s_n(x)| < 2^{-n} < 1/n < \varepsilon$ for all $x \in X_N$. This proves uniform convergence of $s_n$ to f on $X_N$. As mentioned before, this completes the proof that $s_n(x) \to f(x)$ as n → ∞ for all x ∈ X. Moreover, assuming that f is bounded, say f(x) ≤ M for all x ∈ X, some M > 0, we see that $X = X_N$ once N ≥ M; uniform convergence follows.
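The construction in the proof only looks at the value y = f(x), so it can be tried out pointwise. The Python sketch below computes s_n at a point where f takes the value y and spot-checks monotonicity and the error bound on a few sample values; the function name `s` and the sample values are illustrative choices.

```python
import math

def s(n: int, y: float) -> float:
    """Value of s_n at a point where f equals y >= 0 (y may be math.inf)."""
    if y >= n:
        return float(n)
    # (k-1)/2**n with (k-1)/2**n <= y < k/2**n, i.e. y rounded down
    # to the dyadic grid of mesh 2**-n.
    return math.floor(y * 2**n) / 2**n

values = [0.0, 0.3, 0.75, 2.5, 7.0, math.inf]
for y in values:
    approximations = [s(n, y) for n in range(1, 12)]
    # Property 1: the sequence is non-decreasing and never exceeds y.
    assert all(u <= v for u, v in zip(approximations, approximations[1:]))
    assert all(u <= y for u in approximations)
    # Error bound from the proof: y - s_n(y) < 2**-n as soon as y < n.
    assert all(y - s(n, y) < 2.0**-n for n in range(1, 12) if y < n)
```

For example, s(1, 0.75) is 0.5 and s(2, 0.75) is already 0.75, while at a point where f is infinite the values s(n, math.inf) are simply n.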

5 Measure spaces and Integration

Definition 19 Let X be a set and let A be an algebra in X. A measure on A is a map µ : A → [0,∞] such that the following properties hold:

1. µ(∅) = 0.

2. If E1, E2, . . . is a sequence of elements of A such that En ∩ Em = ∅ if n ≠ m and such that $\bigcup_{n=1}^{\infty} E_n \in \mathcal{A}$, then

$$\mu\left( \bigcup_{n=1}^{\infty} E_n \right) = \sum_{n=1}^{\infty} \mu(E_n).$$

The main case is the case in which A is a σ-algebra rather than an algebra; then the condition $\bigcup_{n=1}^{\infty} E_n \in \mathcal{A}$ is automatically satisfied. We have the following definition.

Definition 20 A measure space is a triple (X, M, µ) such that X is a set, M is a σ-algebra of subsets of X and µ : M → [0,∞] is such that

1. µ(∅) = 0.

2. If E1, E2, . . . is a sequence of elements of M such that En ∩ Em = ∅ if n ≠ m, then

$$\mu\left( \bigcup_{n=1}^{\infty} E_n \right) = \sum_{n=1}^{\infty} \mu(E_n).$$

Briefly, a measure space is a triple (X, M, µ) such that (X, M) is a measurable space and µ is a measure on the σ-algebra M. Incidentally, notice that the condition µ(∅) = 0 for a measure µ on an algebra A can be replaced by

2′. ∃ E ∈ A such that µ(E) < ∞.

In fact, if µ(∅) = 0, then E = ∅ ∈ A satisfies µ(E) = 0 < ∞. Conversely, if E ∈ A and µ(E) < ∞, let A1 = E, An = ∅ for n > 1. Then An ∈ A for all n, An ∩ Am = ∅ if n ≠ m and $\bigcup_{n=1}^{\infty} A_n = E \in \mathcal{A}$. Thus

$$\mu(E) = \sum_{n=1}^{\infty} \mu(A_n) = \mu(E) + \lim_{n\to\infty} n\mu(\emptyset)$$

and the limit on the right is infinity except if µ(∅) = 0. It follows that µ(∅) = 0, since the left side is finite.

The reason for the first definition is that in a large number of cases of interest (Lebesgue measure, Hausdorff measures, etc.), one can only define the measure easily on an algebra. For example, consider Lebesgue measure on the real line. It is the measure assigning each interval its length. For the purpose of integration theory, one needs to show that this measure is defined on a σ-algebra containing the intervals; the smallest such σ-algebra being the σ-algebra of Borel sets. But how can we define the measure (length) of a Borel set if we have no intrinsic definition of a Borel set? What we do, instead, is we look at the algebra generated by the intervals, and see that we have a measure on that algebra. Then we just appeal to Carathéodory's Theorem of extension of measures (to be seen later on in this course), which states that every measure on an algebra can be extended to a measure on the σ-algebra generated by that algebra. The extension is even unique if the measure satisfies a so-called σ-finiteness condition (see below), which Lebesgue measure satisfies.

A first set of obvious properties of measures on algebras (or σ-algebras) is contained in the next Lemma.

Lemma 5.1 Let X be a set, A an algebra in X and let µ be a measure on A. The following properties hold:

1. If E1, . . . , En are pairwise disjoint sets in A (Ei ∩ Ej = ∅ if i ≠ j), then

µ(E1 ∪ E2 ∪ · · · ∪ En) = µ(E1) + · · · + µ(En).

2. If A, B ∈ A and A ⊂ B, then µ(A) ≤ µ(B).

3. If E1, . . . , En ∈ A, then

µ(E1 ∪ E2 ∪ · · · ∪ En) ≤ µ(E1) + · · · + µ(En).

4. If A ∈ A and if for each n ∈ N we have En ∈ A and En ⊂ A, the sets En being pairwise disjoint, then

$$\sum_{n=1}^{\infty} \mu(E_n) \le \mu(A).$$

5. If A, B ∈ A, if A ⊂ B and µ(A) < ∞, then µ(B\A) = µ(B) − µ(A).

Exercise 28 Prove the last lemma.

Exercise 29 Let (X, M, µ) be a measure space. Prove: If A, B ∈ M and µ(A∆B) = 0, then µ(A) = µ(B).

EXAMPLES

1. Let X be a set. If E ⊂ X and E is finite, let µ(E) be the number of elements of E (i.e., the cardinality of E). If E is infinite, define µ(E) = ∞. Then (X, P(X), µ) is a measure space; we call this measure µ counting measure. It is, perhaps, the simplest interesting measure.

2. Let X be an uncountable set and let M consist of all subsets of X which are either countable or have countable complement. Define µ : M → [0,∞] by

$$\mu(E) = \begin{cases} 0, & \text{if } E \text{ is countable}, \\ 1, & \text{if } X \setminus E \text{ is countable}. \end{cases}$$

Then (X, M, µ) is a measure space.

3. Let X be a set; a ∈ X. If E ⊂ X define

$$\delta_a(E) = \chi_E(a) = \begin{cases} 1, & \text{if } a \in E, \\ 0, & \text{if } a \notin E. \end{cases}$$
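On finite sets the first and third examples can be tried out directly. The Python sketch below models counting measure (finite sets only) and the point mass δ_a, and checks additivity on one family of pairwise disjoint sets; the function names and the sample sets are illustrative, and a single check is of course no substitute for the proof.

```python
def counting_measure(E):
    """Number of elements of a finite set E (the value infinity for
    infinite sets is not modelled in this sketch)."""
    return len(E)

def dirac(a):
    """The point mass at a: E -> 1 if a is in E, else 0."""
    return lambda E: 1 if a in E else 0

delta_3 = dirac(3)
pieces = [{1, 2}, {3}, {4, 5, 6}]   # pairwise disjoint sets
union = set().union(*pieces)

# Additivity on this particular disjoint family.
assert counting_measure(union) == sum(counting_measure(P) for P in pieces)
assert delta_3(union) == sum(delta_3(P) for P in pieces)
```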

These were fairly simple examples. We follow with a harder, but also muchmore important example.

Lebesgue-Stieltjes measures on R; first steps.

We begin with an exercise.

Exercise 30 Let X be a set, let $\mathcal{E}$ be a family of subsets of X such that

E1 ∅ ∈ $\mathcal{E}$.

E2 If A, B ∈ $\mathcal{E}$, then A ∩ B ∈ $\mathcal{E}$.

E3 If A ∈ $\mathcal{E}$, then Aᶜ is a finite union of sets in $\mathcal{E}$.

Prove: The family $\mathcal{A}$ consisting of all subsets of X which can be expressed as a finite union of sets in $\mathcal{E}$ is an algebra in X. (A subset A of X is in $\mathcal{A}$ if and only if there exist $E_1, \dots, E_m \in \mathcal{E}$ such that $A = E_1 \cup \dots \cup E_m$.) (G. Folland calls such a family $\mathcal{E}$ an elementary family of sets.)

We have a particular family in mind, the family $\mathcal{E}$ consisting of the real line, plus all half-open intervals. That is, for the rest of this example, $\mathcal{E}$ is the family of all sets I of one of the following forms:

i. I = (a, b] = {x ∈ R : a < x ≤ b} for some a, b ∈ R, a ≤ b. Notice that I = ∅ if a = b.

ii. I = (−∞, b] = {x ∈ R : x ≤ b} for some b ∈ R.

iii. I = (a,∞) = {x ∈ R : x > a} for some a ∈ R.

iv. I = R.

It should be quite easy to see that $\mathcal{E}$ satisfies properties E1, E2, E3. It follows that the family

$$\mathcal{A} = \left\{ \bigcup_{j=1}^{n} I_j : n \in \mathbb{N}, \ I_1, \dots, I_n \in \mathcal{E} \right\}$$

is an algebra in R. The notation $\mathcal{A}$ will be reserved for this algebra for the remainder of this example.

Now let φ : R → R be an increasing function (also to remain fixed); that is, φ(x) ≤ φ(y) if x, y ∈ R and x < y. It is well known (I hope) that an increasing function has one-sided limits at all points and the set of its discontinuities is countable (maybe empty). That is, for every x ∈ R

$$\varphi(x-) = \lim_{y \to x-} \varphi(y) \quad \text{and} \quad \varphi(x+) = \lim_{y \to x+} \varphi(y)$$

exist and, in fact,

$$\varphi(x-) = \sup\{\varphi(y) : y < x\}, \qquad \varphi(x+) = \inf\{\varphi(y) : y > x\}.$$

Clearly φ(x−) ≤ φ(x) ≤ φ(x+) for all x ∈ R and it is also well known (I hope) that the set of discontinuities of φ, which is the set

$$D_\varphi = \{x \in \mathbb{R} : \varphi(x+) - \varphi(x-) > 0\},$$

is countable. Modifying the function at some points of $D_\varphi$ (if necessary), which is not a great deal of modifying, we can make it continuous from the right. That is, we just redefine φ(x) = φ(x+) for all x ∈ R; the new φ differs from φ at most on a countable set of points. In all events, from now on we assume φ is increasing and continuous from the right; the previous discussion was just to show that the condition of being continuous from the right is not terribly restrictive.

Important examples of such functions φ:

1. φ(x) = x for all x ∈ R. Could be the most important of all.

2. φ a "jump" or "saltus" function; let $x_0 < x_1 < \dots < x_m$; $a_0 < a_1 < \dots < a_{m+1}$ and define φ by

$$\varphi(x) = \begin{cases} a_0 & \text{if } x < x_0, \\ a_{j+1} & \text{if } x_j \le x < x_{j+1}, \ j = 0, \dots, m-1, \\ a_{m+1} & \text{if } x \ge x_m. \end{cases}$$

We will now associate a measure with φ. We begin defining it for elements of E.

Case i. If I = (a, b] for some a, b ∈ R, a ≤ b, we set µφ(I) = φ(b) − φ(a).

Case ii. If I = (−∞, b] for some b ∈ R, we set µφ(I) = φ(b) − limx→−∞ φ(x).

Case iii. If I = (a,∞) for some a ∈ R, we set µφ(I) = limx→∞ φ(x) − φ(a).

Case iv. µφ(R) = limx→∞ φ(x) − limx→−∞ φ(x).

We remark that if limx→∞ φ(x) = ∞ or if limx→−∞ φ(x) = −∞ (and both happen if φ(x) = x for all x), then some sets have infinite measure. For example, if limx→∞ φ(x) = ∞ then µφ((a,∞)) = ∞ for all a, −∞ ≤ a < ∞. We also remark that the empty set can be represented in the form ∅ = (a, a] (any a ∈ R) so that µφ(∅) = φ(a) − φ(a) = 0.
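The four cases above fit into one formula if the endpoints a = −∞ and b = ∞ are allowed and φ is replaced there by its limits. The Python sketch below takes that route; the limits at ±∞ must be supplied by the caller (their defaults ±∞ suit φ(x) = x), and the names `mu_phi`, `identity` and `jump` are illustrative, not notation from the notes.

```python
import math

def mu_phi(phi, a, b, lim_minus=-math.inf, lim_plus=math.inf):
    """Measure of the interval (a, b].  a = -inf and b = +inf encode the
    unbounded cases ii-iv; lim_minus and lim_plus are the limits of phi at
    -infinity and +infinity, which the caller has to provide."""
    left = lim_minus if a == -math.inf else phi(a)
    right = lim_plus if b == math.inf else phi(b)
    return right - left

def identity(x):
    return x

def jump(x):
    """Right-continuous step function with a jump of 1 at 0 and of 2 at 1."""
    if x < 0:
        return 0.0
    if x < 1:
        return 1.0
    return 3.0

# Length for phi(x) = x; jump sizes for the step function.
print(mu_phi(identity, 0, 2))
print(mu_phi(jump, -1, 0), mu_phi(jump, 0, 1), mu_phi(jump, 0.2, 0.9))
```

With the step function, an interval (a, b] gets as measure the total size of the jumps located in it, and right continuity is what makes a jump at the right endpoint b count while a jump at the left endpoint a does not.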

To extend this measure to the algebra A (and see it is a measure) we need the following result, which we state as an exercise. It says that the elements of the algebra A can be written as finite unions of pairwise disjoint sets in E.

Exercise 31 Let $E \in \mathcal{A}$. Then we can write $E = \bigcup_{k=1}^{m} J_k$, where $J_k \in \mathcal{E}$ for k = 1, . . . , m and $J_k \cap J_\ell = \emptyset$ if k ≠ ℓ.

Solution. We have to show: If $E = \bigcup_{i=1}^{n} I_i$, with $I_i \in \mathcal{E}$ for i = 1, . . . , n, then we can find pairwise disjoint intervals $J_1, \dots, J_m$ in $\mathcal{E}$ such that $E = \bigcup_{i=1}^{m} J_i$. One can prove this by induction on n, the case n = 1 being obvious. But it gets a bit messy. Here is a better approach. Assume $E = \bigcup_{i=1}^{n} I_i$, with $I_i \in \mathcal{E}$ for i = 1, . . . , n. If x, y ∈ E, write x ∼ y iff [x, y] ⊂ E. This is an equivalence relation in E and thus partitions E into a number of equivalence classes. The number of equivalence classes is not too large; if $x, y \in I_i$ for some i, then $[x, y] \subset I_i \subset E$. It follows that all of $I_i$ is contained in a single equivalence class, so there are no more than n equivalence classes, maybe fewer. So let us denote the equivalence classes by $J_1, \dots, J_m$ (m ≤ n); from general nonsense about such classes we have $J_i \cap J_j = \emptyset$ if i ≠ j and $E = \bigcup_{i=1}^{m} J_i$. All that remains to be seen is that $J_i \in \mathcal{E}$ for all i. It is easy to see that $x, y \in J_i$ implies $[x, y] \subset J_i$, hence $J_i$ is an interval. To get it to be in $\mathcal{E}$ we have to see two things: If it has a left end-point, that endpoint is not in $J_i$; if it has a right one, the right one is in $J_i$. In other words,

$$\inf J_i \notin J_i; \qquad \sup J_i \in J_i \ \text{ if } \sup J_i < \infty.$$

Let $\alpha = \inf J_i$; assume $\alpha \in J_i$ (so α > −∞). Then α ∈ E, so $\alpha \in I_j$ for some j. As remarked before, all elements of $I_j$ are in the same equivalence class; because $I_j \cap J_i \ne \emptyset$ (α is in the intersection), we have $I_j \subset J_i$. But the interval $I_j$ is in $\mathcal{E}$, which means that if $\alpha \in I_j$, there is δ > 0 so $(\alpha - \delta, \alpha] \subset I_j$. Thus $(\alpha - \delta, \alpha] \subset J_i$, contradicting the definition of α. Did we? Assume now $\beta = \sup J_i < \infty$. Then we can find a sequence $\{b_k\}$ in $J_i$, $b_1 \le b_2 \le \cdots$, converging to β. Because $b_k \in E$ for all k, there exists $i_k$ with $b_k \in I_{i_k}$ for all k. But we only have a finite number of choices for these $i_k$, one of them must repeat infinitely many times: Passing to a subsequence, there is i such that a subsequence of $\{b_k\}$ stays in $I_i$. It is easy to see that elements of $\mathcal{E}$ contain with any increasing sequence the limit of that sequence; because $I_i \in \mathcal{E}$, we conclude $\beta \in I_i$. Then $\beta \sim b_k$, so β is in the same equivalence class as $b_k$, hence $\beta \in J_i$.
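For concrete intervals the disjoint pieces can also be found by a sort-and-sweep procedure, which is a different technique from the equivalence-relation argument of the solution but produces the same maximal pieces. In the Python sketch below an interval (a, b] is encoded as the pair (a, b); the encoding and the function name are assumptions of the sketch.

```python
def disjoint_pieces(intervals):
    """Rewrite a finite union of half-open intervals (a, b] as a union of
    pairwise disjoint intervals of the same kind."""
    merged = []
    # Empty intervals (a, a] are dropped; the rest are swept left to right.
    for a, b in sorted(iv for iv in intervals if iv[0] < iv[1]):
        if merged and a <= merged[-1][1]:
            # (a, b] overlaps or abuts the last piece, e.g. (0, 1] and (1, 2]
            # together form the single interval (0, 2]: extend that piece.
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged

pieces = disjoint_pieces([(0, 1), (1, 2), (3, 4), (0.5, 1.5)])
```

In this example the four intervals collapse to the two pieces (0, 2] and (3, 4], consistent with the bound m ≤ n from the solution. Note that abutting intervals are merged because of the half-open convention: the point 1 belongs to (0, 1], so no gap separates it from (1, 2].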

Or make up your own proof.

With this done, we can define µφ(E) if E is in A by writing first E as a disjoint union of intervals, and then defining µφ(E) to be the sum of the measures of all these intervals. There are still uniqueness issues and we solve all problems at once by proving:

Lemma 5.2 Let $I, I_1, I_2, \dots \in \mathcal{E}$; assume $I_n \cap I_m = \emptyset$ if n ≠ m and $I = \bigcup_{n=1}^{\infty} I_n$. Then

$$\mu_\varphi(I) = \sum_{n=1}^{\infty} \mu_\varphi(I_n).$$

Proof. The main step in the proof is to reduce it to the case of a finite number of intervals $I_n$, and this is done using the Heine-Borel theorem. We will consider first the case in which I = (a, b], where −∞ < a < b < ∞. First we shrink I a little bit. Let ε > 0 be given. Because φ is continuous from the right at a, there is δ > 0 such that φ(a + δ) < φ(a) + ε (φ increases, so φ(a + δ) ≥ φ(a) for every δ > 0. But by right continuity, we'll have φ(a + δ) < φ(a) + ε as long as 0 < δ < δ0 for some δ0 which depends on ε). We may assume a + δ < b. Second we expand all the intervals $I_n$ by a very small amount. Since each $I_n$ is included in I and I = (a, b], the only form $I_n$ can have is $I_n = (a_n, b_n]$ with $a \le a_n \le b_n \le b$. Because φ is continuous from the right at $b_n$, there is $\delta_n > 0$ such that $\varphi(b_n + \delta_n) < \varphi(b_n) + \dfrac{\varepsilon}{2^{n+1}}$. Now let $J_n = (a_n, b_n + \delta_n)$. We have

$$[a + \delta, b] \subset (a, b] = \bigcup_{n=1}^{\infty} I_n \subset \bigcup_{n=1}^{\infty} J_n.$$

But the interval [a + δ, b], being closed and bounded, is compact; each of the intervals $J_n$ is open. By the Theorem of Heine-Borel, a finite number of the $J_n$'s already cover [a + δ, b]. That is, there exist intervals $J_{n_1}, \dots, J_{n_N}$ such that

$$[a + \delta, b] \subset \bigcup_{k=1}^{N} J_{n_k}.$$

We can (and will) make two further assumptions: 1)All the intervals {Jnk: k =

1, . . . , N} are needed to cover [a + δ, b] (achieved by throwing out any one thathas empty intersection with [a + δ, b]). 2)The left endpoints are in order; thatis,

an1 < an2 < . . . < anN .

This is achieved, if necessary, by relabelling. Notice that we can’t have the leftend points of different Jn’s be equal; the same would be true of the correspondingIn’s, contradicting the pairwise disjointness of the In’s.

It is sort of nasty to work with double sub-indices, and one could say now that we'll write $a_k$ for $a_{n_k}$ (which really amounts to another relabelling of the intervals), but I'll stick to the double sub-indices.

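We have, as a purely algebraic matter,

\[ \mu_\phi(I) = \phi(b) - \phi(a) = (\phi(b) - \phi(a_{n_N})) + (\phi(a_{n_N}) - \phi(a_{n_{N-1}})) + \cdots + (\phi(a_{n_2}) - \phi(a_{n_1})) + (\phi(a_{n_1}) - \phi(a)). \]

If we introduce $a_{n_{N+1}}$ by defining $a_{n_{N+1}} = b$ and we also set $a_{n_0} = a$, we can write this in the form

\[ \mu_\phi(I) = \sum_{k=0}^{N} \left( \phi(a_{n_{k+1}}) - \phi(a_{n_k}) \right). \tag{4} \]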

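We consider the terms in this sum. If $1 \le k \le N - 1$, then $a_{n_{k+1}} < b_{n_k} + \delta_{n_k}$. In fact, we have

\[ [a + \delta, b] \subset \bigcup_{k=1}^{N} (a_{n_k}, b_{n_k} + \delta_{n_k}). \]

Suppose that for some $k$, $1 \le k \le N - 1$, it holds that $a_{n_{k+1}} \ge b_{n_k} + \delta_{n_k}$. Consider the point $z = b_{n_k} + \delta_{n_k}$. This point is in $[a + \delta, b]$. In fact, if not we have $z < a + \delta$ or $z > b$. If $z < a + \delta$, then

\[ J_{n_k} \cap [a + \delta, b] = (a_{n_k}, b_{n_k} + \delta_{n_k}) \cap [a + \delta, b] = \emptyset \]

and the interval $J_{n_k}$ should have been eliminated from the list. If $z > b$, then

\[ (a_{n_{k+1}}, b_{n_{k+1}} + \delta_{n_{k+1}}) \cap [a + \delta, b] = \emptyset \]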

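and $J_{n_{k+1}}$ should have been thrown away. Because $\phi$ is increasing and by the definition of $\delta_n$, we have for $k = 1, \ldots, N - 1$,

\[ \phi(a_{n_{k+1}}) - \phi(a_{n_k}) \le \phi(b_{n_k} + \delta_{n_k}) - \phi(a_{n_k}) < \phi(b_{n_k}) - \phi(a_{n_k}) + \frac{\varepsilon}{2^{n_k+1}} = \mu_\phi(I_{n_k}) + \frac{\varepsilon}{2^{n_k+1}}. \tag{5} \]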

The term with $k = N$ can be similarly handled; we simply use that $a_{n_{N+1}} = b \le b_{n_N} + \delta_{n_N}$; if not there would be a part of the interval $[a + \delta, b]$, namely the part between $b_{n_N} + \delta_{n_N}$ and $b$, left uncovered. This means that (5) also holds for $k = N$. The term with $k = 0$ behaves in a slightly different fashion; this time we use that we must have $a_{n_1} < a + \delta$ (or the portion of $[a + \delta, b]$ between $a + \delta$ and $a_{n_1}$ would be uncovered) so that

\[ \phi(a_{n_1}) - \phi(a_{n_0}) = \phi(a_{n_1}) - \phi(a) \le \phi(a + \delta) - \phi(a) < \varepsilon. \tag{6} \]

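Thus, using (6) for the term $k = 0$ and (5) for the terms $k = 1, \ldots, N$,

\[
\begin{aligned}
\mu_\phi(I) &= \sum_{k=0}^{N} \left( \phi(a_{n_{k+1}}) - \phi(a_{n_k}) \right) < \varepsilon + \sum_{k=1}^{N} \left( \phi(a_{n_{k+1}}) - \phi(a_{n_k}) \right) \\
&< \varepsilon + \sum_{k=1}^{N} \mu_\phi(I_{n_k}) + \sum_{k=1}^{N} \frac{\varepsilon}{2^{n_k+1}} \\
&\le \varepsilon + \sum_{n=1}^{\infty} \mu_\phi(I_n) + \sum_{n=1}^{\infty} \frac{\varepsilon}{2^{n+1}} \le \sum_{n=1}^{\infty} \mu_\phi(I_n) + 2\varepsilon.
\end{aligned}
\]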

Since $\varepsilon > 0$ is arbitrary, we proved

\[ \mu_\phi(I) \le \sum_{n=1}^{\infty} \mu_\phi(I_n). \]

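For the converse inequality, it suffices to prove that

\[ \sum_{n=1}^{N} \mu_\phi(I_n) \le \mu_\phi(I) \]

for every $N \in \mathbb{N}$. Letting $N \in \mathbb{N}$, we may as well assume that we have $a_1 < a_2 < \cdots < a_N$, where $I_n = (a_n, b_n]$ for $n = 1, \ldots, N$. Then disjointness and the fact that $\bigcup_{n=1}^{N} I_n \subset I$ give

\[ a \le a_1 \le b_1 \le a_2 \le \cdots \le a_N \le b_N \le b. \]

Thus,

\[
\begin{aligned}
\sum_{n=1}^{N} \mu_\phi(I_n) &= \sum_{n=1}^{N} (\phi(b_n) - \phi(a_n)) \le (\phi(b_N) - \phi(a_N)) + \sum_{n=1}^{N-1} (\phi(a_{n+1}) - \phi(a_n)) \\
&= (\phi(b_N) - \phi(a_N)) + (\phi(a_N) - \phi(a_1)) = \phi(b_N) - \phi(a_1) \le \phi(b) - \phi(a) = \mu_\phi(I).
\end{aligned}
\]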

At this point, we proved $\mu_\phi(I) = \sum_n \mu_\phi(I_n)$, assuming that $I = (a, b]$, $-\infty < a < b < \infty$.

Assume now $a = -\infty$, $b < \infty$. Our proof then shows that for every $\alpha \in \mathbb{R}$, $\alpha < b$,

\[ \phi(b) - \phi(\alpha) = \mu_\phi(I \cap (\alpha, b]) = \sum_{n=1}^{\infty} \mu_\phi(I_n \cap (\alpha, b]) \le \sum_{n=1}^{\infty} \mu_\phi(I_n). \]

Letting $\alpha \to -\infty$, we get $\mu_\phi(I) \le \sum_{n=1}^{\infty} \mu_\phi(I_n)$. The proof of the converse inequality is essentially the same as before. That is, we have to prove that $\mu_\phi(I) \ge \sum_{n=1}^{N} \mu_\phi(I_n)$ for all $N$ and we have two cases: 1) All the intervals $I_n$ are finite; 2) Not all the intervals $I_n$ are finite. In the first case, there is $c_N \in \mathbb{R}$ such that $c_N < b$ and $I_n \subset (c_N, b]$ for $n = 1, 2, \ldots, N$. Then we get, as for the case $a > -\infty$,

\[ \mu_\phi(I) \ge \mu_\phi((c_N, b]) \ge \sum_{n=1}^{N} \mu_\phi(I_n). \]

In the second case, because two intervals with left endpoint equal to $-\infty$ cannot be disjoint, there is precisely one $I_n$ with $a_n = -\infty$; we may assume it is $I_1$. We then have

\[ \mu_\phi(I) = \phi(b) - \lim_{x \to -\infty} \phi(x) = \phi(b) - \phi(b_1) + \phi(b_1) - \lim_{x \to -\infty} \phi(x); \]

that is, $\mu_\phi(I) = \mu_\phi((b_1, b]) + \mu_\phi(I_1)$, and it suffices to prove that

\[ \mu_\phi((b_1, b]) \ge \sum_{n=2}^{N} \mu_\phi(I_n) \]

for all $N \ge 2$. However, this is a consequence of the case $a > -\infty$ (with $a$ now replaced by $b_1$).

So we are done with the case $a = -\infty$, $b < \infty$. Similar arguments work in the case $a > -\infty$, $b = \infty$ and in the case $a = -\infty$, $b = \infty$.

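As a first corollary we get

Corollary 5.3 Let $I_1, \ldots, I_m$; $J_1, \ldots, J_n$ be intervals in $\mathcal{E}$; assume $I_i \cap I_j = \emptyset$ if $i, j \in \{1, \ldots, m\}$ and $i \ne j$, $J_i \cap J_j = \emptyset$ if $i, j \in \{1, \ldots, n\}$ and $i \ne j$; assume also that

\[ \bigcup_{i=1}^{m} I_i = \bigcup_{j=1}^{n} J_j. \]

Then

\[ \sum_{i=1}^{m} \mu_\phi(I_i) = \sum_{j=1}^{n} \mu_\phi(J_j). \]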

Remark It is easy to give a direct proof of this corollary, since it only has to do with finite unions. But, since we have the previous result, we may as well use it.

Proof. We have for $i = 1, \ldots, m$,

\[ I_i \subset \bigcup_{j=1}^{n} J_j \]

so that

\[ I_i = \bigcup_{j=1}^{n} (I_i \cap J_j) \]

and since the intervals $I_i \cap J_j$ are pairwise disjoint, the lemma gives

\[ \mu_\phi(I_i) = \sum_{j=1}^{n} \mu_\phi(I_i \cap J_j). \]

Similarly, we obtain

\[ \mu_\phi(J_j) = \sum_{i=1}^{m} \mu_\phi(I_i \cap J_j) \]

for $j = 1, \ldots, n$. Thus

\[ \sum_{i=1}^{m} \mu_\phi(I_i) = \sum_{i=1}^{m} \sum_{j=1}^{n} \mu_\phi(I_i \cap J_j) = \sum_{j=1}^{n} \left( \sum_{i=1}^{m} \mu_\phi(I_i \cap J_j) \right) = \sum_{j=1}^{n} \mu_\phi(J_j). \]

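Finally! We can extend $\mu_\phi$ to the algebra $\mathcal{A}$. Let $E \in \mathcal{A}$. By the exercise, we can write $E = \bigcup_{n=1}^{N} I_n$ where each $I_n$ is in $\mathcal{E}$ and $I_n \cap I_m = \emptyset$ if $n \ne m$. We define $\mu_\phi(E) = \sum_{n=1}^{N} \mu_\phi(I_n)$; by the corollary to the Lemma, $\mu_\phi(E)$ is well defined.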

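Theorem 5.4 The map $E \mapsto \mu_\phi(E)$ defines a measure on the algebra $\mathcal{A}$.

Proof. The only property that needs to be proved is: If $E_1, E_2, \ldots$ is a sequence of pairwise disjoint sets in $\mathcal{A}$ such that $E = \bigcup_{n=1}^{\infty} E_n$, then

\[ \mu_\phi(E) = \sum_{n=1}^{\infty} \mu_\phi(E_n). \]

We can write

\[ E = \bigcup_{j=1}^{r} I_j, \]

where $I_j \in \mathcal{E}$, $I_j \cap I_k = \emptyset$ if $j \ne k$. Similarly, for each $n$ we can write

\[ E_n = \bigcup_{j=1}^{r_n} J_{n,j} \]

where $J_{n,j} \in \mathcal{E}$, $J_{n,j} \cap J_{n,k} = \emptyset$ if $j \ne k$. Here $r, r_1, r_2, \ldots$ are positive integers. Now for $j = 1, \ldots, r$,

\[ I_j \subset E = \bigcup_{n=1}^{\infty} E_n = \bigcup_{n=1}^{\infty} \bigcup_{\ell=1}^{r_n} J_{n,\ell} \]

hence

\[ I_j = \bigcup_{n=1}^{\infty} \bigcup_{\ell=1}^{r_n} (I_j \cap J_{n,\ell}). \]

Though double indexed, $\{I_j \cap J_{n,\ell}\}_{n,\ell}$ is a countable family of pairwise disjoint intervals in $\mathcal{E}$, and $I_j$ is an interval in $\mathcal{E}$, so that

\[ \mu_\phi(I_j) = \sum_{n=1}^{\infty} \sum_{\ell=1}^{r_n} \mu_\phi(I_j \cap J_{n,\ell}). \]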

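Thus

\[
\begin{aligned}
\mu_\phi(E) &= \sum_{j=1}^{r} \mu_\phi(I_j) = \sum_{j=1}^{r} \sum_{n=1}^{\infty} \sum_{\ell=1}^{r_n} \mu_\phi(I_j \cap J_{n,\ell}) \\
&= \sum_{n=1}^{\infty} \sum_{\ell=1}^{r_n} \sum_{j=1}^{r} \mu_\phi(I_j \cap J_{n,\ell}) = \sum_{n=1}^{\infty} \sum_{\ell=1}^{r_n} \mu_\phi(J_{n,\ell})
\end{aligned}
\]

where we have used that

\[ \mu_\phi(J_{n,\ell}) = \sum_{j=1}^{r} \mu_\phi(I_j \cap J_{n,\ell}), \]

valid because $J_{n,\ell} \subset E_n \subset E$, hence

\[ J_{n,\ell} = E \cap J_{n,\ell} = \bigcup_{j=1}^{r} (I_j \cap J_{n,\ell}); \]

and all the elements in the last union are pairwise disjoint intervals in $\mathcal{E}$. Since $\sum_{\ell=1}^{r_n} \mu_\phi(J_{n,\ell}) = \mu_\phi(E_n)$, we are done.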

Once we have the extension of measure theory, this result will imply: There exists a unique measure, also denoted by $\mu_\phi$, defined on the σ-algebra of Borel subsets of $\mathbb{R}$ such that $\mu_\phi((a, b]) = \phi(b) - \phi(a)$. This measure is called the Lebesgue-Stieltjes measure induced by $\phi$. If $\phi(x) = x$ for all $x$, it is called Lebesgue measure.
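As a small sanity check of the construction, here is a sketch in Python that evaluates $\mu_\phi$ on a finite disjoint union of half-open intervals and confirms that two different decompositions of the same set give the same value, as Corollary 5.3 asserts. The function `phi` (the identity plus a unit jump at 0), the helper `mu_phi`, and the two decompositions are illustrative assumptions, not part of the notes.

```python
from fractions import Fraction


def mu_phi(phi, intervals):
    """Sum of phi(b) - phi(a) over pairwise disjoint half-open intervals (a, b]."""
    return sum(phi(b) - phi(a) for a, b in intervals)


def phi(x):
    # Hypothetical increasing, right-continuous function: x plus a unit jump at 0.
    return Fraction(x) + (1 if x >= 0 else 0)


# Two decompositions of the same set E = (-2, 1] U (3, 5].
coarse = [(-2, 1), (3, 5)]
fine = [
    (-2, -1), (-1, 0), (0, Fraction(1, 2)), (Fraction(1, 2), 1),
    (3, 4), (4, 5),
]

total_coarse = mu_phi(phi, coarse)
total_fine = mu_phi(phi, fine)
print(total_coarse, total_fine)
```

Exact rational arithmetic is used so that the two totals can be compared with `==` rather than up to rounding. Note that the jump of `phi` at 0 is charged to the interval $(-1, 0]$, which contains 0, and not to $(0, 1/2]$.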

We’ll return to this after discussing exterior measures and extension theory.

6 Basic properties of measures.

We begin with some definitions.

Definition 21 Let $(X, \mathcal{M}, \mu)$ be a measure space. We say it is:

1. finite (or a finite measure space) iff $\mu(X) < \infty$;

2. a probability space iff $\mu(X) = 1$;

3. σ-finite (or a σ-finite measure space) iff there exist measurable sets $E_1, E_2, \ldots$ such that $X = \bigcup_{n=1}^{\infty} E_n$ and $\mu(E_n) < \infty$ for each $n \in \mathbb{N}$;

4. complete iff all subsets of null sets are measurable; i.e., if $N \in \mathcal{M}$ and $\mu(N) = 0$, then $E \in \mathcal{M}$ for all subsets $E$ of $N$.

If the measure space is finite, we also say that the measure is finite; similarly one says that $\mu$ is σ-finite iff $(X, \mathcal{M}, \mu)$ is a σ-finite measure space; $\mu$ is a probability measure iff $\mu(X) = 1$.

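Theorem 6.1 (Continuity of measures) Let $(X, \mathcal{M}, \mu)$ be a measure space.

1. If $A_n \in \mathcal{M}$ for each $n \in \mathbb{N}$ and $A_1 \subset A_2 \subset \cdots$, then

\[ \mu\left( \bigcup_{n=1}^{\infty} A_n \right) = \lim_{n \to \infty} \mu(A_n). \]

2. If $A_n \in \mathcal{M}$ for each $n \in \mathbb{N}$, if $\mu(A_1) < \infty$, and $A_1 \supset A_2 \supset \cdots$, then

\[ \mu\left( \bigcap_{n=1}^{\infty} A_n \right) = \lim_{n \to \infty} \mu(A_n). \]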

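Proof.

1. Set $A_0 = \emptyset$, $B_n = A_n \setminus A_{n-1}$ for $n = 1, 2, \ldots$. Then $B_n \cap B_m = \emptyset$ if $n \ne m$,

\[ A_N = \bigcup_{n=1}^{N} B_n \]

for all $N \in \mathbb{N}$ and

\[ \bigcup_{n=1}^{\infty} A_n = \bigcup_{n=1}^{\infty} B_n. \]

It follows that

\[ \mu(A_N) = \sum_{n=1}^{N} \mu(B_n) \]

for every $N \in \mathbb{N}$, hence

\[ \mu\left( \bigcup_{n=1}^{\infty} A_n \right) = \mu\left( \bigcup_{n=1}^{\infty} B_n \right) = \sum_{n=1}^{\infty} \mu(B_n) = \lim_{N \to \infty} \sum_{n=1}^{N} \mu(B_n) = \lim_{N \to \infty} \mu(A_N). \]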

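2. Set $B_n = A_1 \setminus A_n$ for $n = 1, 2, \ldots$. Then $B_1 \subset B_2 \subset \cdots$ so that by part 1,

\[ \mu\left( \bigcup_{n=1}^{\infty} B_n \right) = \lim_{n \to \infty} \mu(B_n). \]

Now, because $\mu(A_n) \le \mu(A_1) < \infty$ and $\mu(\bigcap_n A_n) \le \mu(A_1) < \infty$, we have

\[ \mu\left( \bigcup_{n=1}^{\infty} B_n \right) = \mu\left( A_1 \setminus \bigcap_{n=1}^{\infty} A_n \right) = \mu(A_1) - \mu\left( \bigcap_{n=1}^{\infty} A_n \right), \qquad \mu(B_n) = \mu(A_1) - \mu(A_n), \]

so that we proved

\[ \mu(A_1) - \mu\left( \bigcap_{n=1}^{\infty} A_n \right) = \mu\left( \bigcup_{n=1}^{\infty} B_n \right) = \lim_{n \to \infty} \mu(B_n) = \mu(A_1) - \lim_{n \to \infty} \mu(A_n); \]

since $\mu(A_1) < \infty$, we can cancel $\mu(A_1)$ to obtain the conclusion.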
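The hypothesis $\mu(A_1) < \infty$ in part 2 cannot simply be dropped. The following Python sketch compares two decreasing sequences of intervals for Lebesgue measure, both with empty intersection: $A_n = (0, 1/n]$, where the measures tend to $0$ as the theorem predicts, and $A_n = (n, \infty)$, where every measure is infinite. The helper `length` is a hypothetical stand-in for the Lebesgue measure of an interval.

```python
import math


def length(a, b):
    """Lebesgue measure of the interval (a, b], allowing b = +infinity."""
    return b - a if b > a else 0.0


# Decreasing sets A_n = (0, 1/n]: mu(A_1) = 1 is finite, the intersection is empty.
finite_case = [length(0.0, 1.0 / n) for n in range(1, 6)]

# Decreasing sets A_n = (n, +inf): every measure is infinite, the intersection is empty.
infinite_case = [length(float(n), math.inf) for n in range(1, 6)]

print(finite_case)
print(infinite_case)
```

In the second sequence $\lim_n \mu(A_n) = \infty$ while $\mu(\bigcap_n A_n) = \mu(\emptyset) = 0$, so the conclusion of part 2 fails.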

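Exercise 32 Recall the definitions of limsup, liminf and lim of a sequence of sets (see Exercise 18). Let $(X, \mathcal{M}, \mu)$ be a measure space and let $\{A_n\}$ be a sequence of sets in $\mathcal{M}$. Let

\[ \underline{A} = \liminf_{n \to \infty} A_n = \bigcup_{n=1}^{\infty} \left( \bigcap_{m=n}^{\infty} A_m \right), \qquad \overline{A} = \limsup_{n \to \infty} A_n = \bigcap_{n=1}^{\infty} \left( \bigcup_{m=n}^{\infty} A_m \right). \]

Prove the following statements.

1. $\mu(\underline{A}) \le \liminf_{n \to \infty} \mu(A_n)$.

2. If $\mu(X) < \infty$, then $\mu(\overline{A}) \ge \limsup_{n \to \infty} \mu(A_n)$.

3. (Conclude from the previous points that) if $\mu(X) < \infty$ and the sequence $\{A_n\}$ converges to the set $A$, then

\[ \mu(A) = \lim_{n \to \infty} \mu(A_n). \]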

Null sets and completion. In this subsection we assume that $(X, \mathcal{M}, \mu)$ is a measure space. A null set is a set $N \in \mathcal{M}$ such that $\mu(N) = 0$. Null sets are the negligible objects in measure theory, almost the same thing as the empty set (which is, of course, a null set). The following result is immediate:

Lemma 6.2 Let $N_n$ be a null set for all $n \in \mathbb{N}$. Then $\bigcup_{n=1}^{\infty} N_n$ is a null set.

As we saw a short while ago, a measure space is said to be complete iff every subset of a null set is measurable. Those subsets are then, by necessity, null sets. It is always possible to complete measure spaces.

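Theorem 6.3 Let $(X, \mathcal{M}, \mu)$ be a measure space. The measure $\mu$ extends to a unique measure $\overline{\mu}$ on the σ-algebra $\overline{\mathcal{M}}$ generated by

\[ \mathcal{M} \cup \{E : \exists N \in \mathcal{M},\ \mu(N) = 0,\ E \subset N\}. \]

The new measure space $(X, \overline{\mathcal{M}}, \overline{\mu})$ is complete.

Proof. One shows that

\[ \Sigma = \{E \cup F : E \in \mathcal{M},\ \exists N \in \mathcal{M},\ \mu(N) = 0,\ F \subset N\} \]

is a σ-algebra. The only non-trivial part of this proof is showing that $A \in \Sigma$ implies $A^c \in \Sigma$, and this is easy. Since $\Sigma$ contains $\mathcal{M}$ and all subsets of null sets, we have $\overline{\mathcal{M}} = \Sigma$. One also shows that if $E_1, E_2, N_1, N_2 \in \mathcal{M}$, $\mu(N_1) = \mu(N_2) = 0$, $F_1 \subset N_1$, $F_2 \subset N_2$, and if $E_1 \cup F_1 = E_2 \cup F_2$, then $\mu(E_1) = \mu(E_2)$. One can therefore define $\overline{\mu}(E \cup F) = \mu(E)$ if $E \in \mathcal{M}$, $F \subset N \in \mathcal{M}$, $\mu(N) = 0$. Everything that remains to be done is trivial.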

Example. Take $X = \mathbb{R}$ and let $\mathcal{B}$ be the σ-algebra of Borel sets. We will accept the following result (to be proved later): There exists a unique measure $m$ on $\mathcal{B}$ such that the measure of each interval is its length:

\[ m([a, b]) = m((a, b]) = m([a, b)) = m((a, b)) = b - a \]

if $a, b \in \mathbb{R}$, $a \le b$; this is an infinite but σ-finite measure. The completion σ-algebra is the σ-algebra of Lebesgue sets; thus a subset $E$ of $\mathbb{R}$ is a Lebesgue set if and only if $E = B \cup F$, where $B$ is a Borel set and $F$ is the subset of a Borel null set (null with respect to Lebesgue measure).

One can show that if a family $\mathcal{C}$ of subsets of a set $X$ is countable, then the cardinality of $\sigma(\mathcal{C})$ is at most the cardinality of the power set of $\mathbb{N}$; i.e., of $\mathbb{R}$ (and exactly the cardinality of $\mathbb{R}$ if $\mathcal{C}$ is infinite countable). Because $\mathcal{B}$ can be generated by the countable family $\mathcal{C} = \{(a, b) : a, b \in \mathbb{Q},\ a \le b\}$, it follows that $\mathcal{B} \sim \mathbb{R}$ ($\mathcal{B}$ is equipotent with $\mathbb{R}$).

The Cantor set plays a role here. The Cantor set is a compact subset $C$ of $\mathbb{R}$ which is uncountable and is a Lebesgue null set. Thus all of its subsets are Lebesgue sets, but being uncountable, the family of its subsets is of higher cardinality than $\mathbb{R}$, thus some of these subsets cannot be Borel sets. Or, with $\Lambda$ denoting the σ-algebra of Lebesgue sets,

\[ \operatorname{card}(\mathcal{B}) = 2^{\aleph_0} < 2^{2^{\aleph_0}} = \operatorname{card}(\mathcal{P}(C)) \]

and since $\mathcal{P}(C) \subset \Lambda$, we see that $\mathcal{B} \ne \Lambda$.

We end this section with a definition related to null sets.

Definition 22 Let $(X, \mathcal{M}, \mu)$ be a measure space and let $P$ be some property that can hold for points of $X$. We say the property holds almost everywhere (abbreviated to a.e.) or, if the measure needs to be mentioned, that it holds almost everywhere with respect to $\mu$ (abbreviated to a.e. $[\mu]$) iff the set $\{x \in X : P(x) \text{ does not hold}\}$ is a null set.

Examples. Let $f, g : X \to S$, $S$ some set. We say that $f$ and $g$ are equal a.e., and write $f = g$ a.e. $[\mu]$ iff the set $\{x \in X : f(x) \ne g(x)\}$ is a $\mu$-null set.

Let $f_n : X \to [-\infty, \infty]$ be measurable for $n = 1, 2, \ldots$. We say the sequence $\{f_n\}$ converges a.e. iff

\[ \mu(\{x \in X : \{f_n(x)\} \text{ has no limit}\}) = 0. \]

7 Integration

We begin defining the integral of a non-negative simple measurable function. The non-negativity is not terribly important here, but we only need it in this case. That the same formula holds in general is a simple consequence of the basic properties of the integral, once we have it.

Definition 23 Let $s : X \to [0, \infty)$ be a simple measurable function and let

\[ s = \sum_{i=1}^{n} a_i \chi_{E_i} \]

be its canonical representation; that is, $\{E_1, \ldots, E_n\}$ are pairwise disjoint measurable sets and $\{a_1, \ldots, a_n\}$ is a set of $n$ distinct positive real numbers. Then

\[ \int_X s \, d\mu = \sum_{i=1}^{n} a_i \mu(E_i). \]

Remark. If any of the sets $E_i$ in the canonical decomposition of $s$ has infinite measure, then $\int_X s \, d\mu = \infty$.

The first order of business may be to relax a little bit this canonical decomposition part of the definition. But we'll do it as part of the first set of properties of the integral.

Lemma 7.1 The following properties hold.

1. Let $s, t : X \to [0, \infty)$ be measurable simple functions. Then

\[ \int_X (s + t) \, d\mu = \int_X s \, d\mu + \int_X t \, d\mu. \]

2. Let $s : X \to [0, \infty)$ be a measurable simple function, let $c \in \mathbb{R}$, $c \ge 0$. Then

\[ \int_X (cs) \, d\mu = c \int_X s \, d\mu. \]

3. Let $s, t : X \to [0, \infty)$ be measurable simple functions and assume that $s \le t$; i.e., $s(x) \le t(x)$ for all $x \in X$. Then

\[ \int_X s \, d\mu \le \int_X t \, d\mu. \]

Proof. The only non-trivial (though easy) property is the first one. In preparation of a proof, we show first: Let $s : X \to [0, \infty)$ be simple and assume that $s = \sum_{j=1}^{m} c_j \chi_{F_j}$, where $F_1, \ldots, F_m$ are measurable and pairwise disjoint (so the representation of $s$ differs from the canonical one in that we allow repetitions among the $c_j$'s; we do not assume that the $c_j$'s are distinct and allow that one or several of them could be 0). Then

\[ \int_X s \, d\mu = \sum_{j=1}^{m} c_j \mu(F_j) \]

with the proviso that if $c_j = 0$, $\mu(F_j) = \infty$ for some $j$, then we set $c_j \mu(F_j) = 0$.

We want to assume that the union of all the $F_j$'s is $X$, and just in case it isn't let

\[ F_0 = X \setminus \bigcup_{j=1}^{m} F_j \]

and then

\[ s = \sum_{j=0}^{m} c_j \chi_{F_j}, \]

where $c_0 = 0$. Clearly

\[ \sum_{j=0}^{m} c_j \mu(F_j) = \sum_{j=1}^{m} c_j \mu(F_j) \]

so nothing has changed.

Now let $s = \sum_{i=1}^{n} a_i \chi_{E_i}$ be the canonical representation, so all $a_i$'s are distinct, all positive, and $\{E_1, \ldots, E_n\}$ is also a family of pairwise disjoint measurable sets. We set

\[ E_0 = X \setminus \left( \bigcup_{i=1}^{n} E_i \right) \]

so that $X = \bigcup_{i=0}^{n} E_i$. Here $E_0 = \{x \in X : s(x) = 0\}$. We'll also set $a_0 = 0$. There is an obvious relation between the two representations: If $F_j \cap E_i \ne \emptyset$, then $F_j \subset E_i$ and $c_j = a_i$. In fact, let $x \in F_j \cap E_i$. Then $s(x) = c_j$ and $s(x) = a_i$, so $a_i = c_j$. But then

\[ F_j \subset s^{-1}(\{c_j\}) = s^{-1}(\{a_i\}) = E_i. \]

If for every $i \in \{0, 1, \ldots, n\}$ we set

\[ T_i = \{j \in \{0, \ldots, m\} : F_j \subset E_i\}, \]

then the $T_i$'s are pairwise disjoint and their union is $\{0, \ldots, m\}$. It is also immediate that

\[ E_i = \bigcup_{j \in T_i} F_j \]

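for $i = 0, \ldots, n$. With the proviso that $a_0 \mu(E_0) = 0$ even if $\mu(E_0) = \infty$, we get

\[
\begin{aligned}
\int_X s \, d\mu &= \sum_{i=1}^{n} a_i \mu(E_i) = \sum_{i=0}^{n} a_i \mu(E_i) = \sum_{i=0}^{n} a_i \sum_{j \in T_i} \mu(F_j) \\
&= \sum_{i=0}^{n} \sum_{j \in T_i} a_i \mu(F_j) = \sum_{i=0}^{n} \sum_{j \in T_i} c_j \mu(F_j) = \sum_{j=0}^{m} c_j \mu(F_j),
\end{aligned}
\]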

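proving the statement. With this nonsense out of the way, let $s, t$ be two non-negative, simple measurable functions and let

\[ s = \sum_{i=1}^{n} a_i \chi_{E_i}, \qquad t = \sum_{j=1}^{m} b_j \chi_{F_j} \]

be their canonical representations. We add, as above,

\[ E_0 = X \setminus \left( \bigcup_{i=1}^{n} E_i \right), \qquad F_0 = X \setminus \left( \bigcup_{j=1}^{m} F_j \right), \]

$a_0 = 0$, $b_0 = 0$, which does not affect, as we saw, the integrals. Because

\[ E_i \subset X = \bigcup_{j=0}^{m} F_j, \]

we get

\[ E_i = \bigcup_{j=0}^{m} (E_i \cap F_j). \]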

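It is clear that we can write

\[ s = \sum_{i=0}^{n} \sum_{j=0}^{m} a_i \chi_{E_i \cap F_j}. \]

Similarly,

\[ t = \sum_{i=0}^{n} \sum_{j=0}^{m} b_j \chi_{E_i \cap F_j}. \]

It follows that

\[ s + t = \sum_{i=0}^{n} \sum_{j=0}^{m} (a_i + b_j) \chi_{E_i \cap F_j}. \]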

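Some of the sets $E_i \cap F_j$ may be empty, not all the values $a_i + b_j$ on non-empty sets are distinct, but because

\[ (E_i \cap F_j) \cap (E_k \cap F_\ell) \ne \emptyset \Rightarrow i = k,\ j = \ell, \]

by what we proved we have

\[
\begin{aligned}
\int_X (s + t) \, d\mu &= \sum_{i=0}^{n} \sum_{j=0}^{m} (a_i + b_j) \mu(E_i \cap F_j) \\
&= \sum_{i=0}^{n} a_i \left( \sum_{j=0}^{m} \mu(E_i \cap F_j) \right) + \sum_{j=0}^{m} b_j \left( \sum_{i=0}^{n} \mu(E_i \cap F_j) \right) \\
&= \sum_{i=0}^{n} a_i \mu(E_i) + \sum_{j=0}^{m} b_j \mu(F_j) = \int_X s \, d\mu + \int_X t \, d\mu.
\end{aligned}
\]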

As mentioned before, the other properties are trivial and their proof is left as an exercise.

Corollary 7.2 Let

\[ s = \sum_{i=1}^{n} c_i \chi_{E_i} \]

where $c_1, \ldots, c_n \in [0, \infty)$ (not necessarily distinct, not necessarily all positive) and the sets $E_1, \ldots, E_n$ are measurable (not necessarily pairwise disjoint). Then

\[ \int_X s \, d\mu = \sum_{i=1}^{n} c_i \mu(E_i) \]

with the usual proviso that $c_i \mu(E_i) = 0$ if $c_i = 0$, even if $\mu(E_i) = \infty$.
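To see Corollary 7.2 at work on a toy case, the Python sketch below builds a simple function on a four-point measure space from a representation with overlapping sets, a repeated coefficient and a zero coefficient, and compares $\sum_i c_i \mu(E_i)$ with the value obtained from the canonical representation. The space, the point masses `mass`, and the helpers `mu`, `integral_from_terms` and `canonical` are all illustrative assumptions.

```python
from fractions import Fraction

# Hypothetical finite measure space: X = {0, 1, 2, 3} with point masses.
mass = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(2), 3: Fraction(1, 6)}


def mu(E):
    """Measure of a subset E of X."""
    return sum(mass[x] for x in E)


def integral_from_terms(terms):
    """Sum of c_i * mu(E_i) for a representation s = sum_i c_i * chi_{E_i}."""
    return sum(c * mu(E) for c, E in terms)


def canonical(terms):
    """Canonical representation: distinct positive values paired with their level sets."""
    values = {x: sum(c for c, E in terms if x in E) for x in mass}
    levels = {}
    for x, v in values.items():
        if v > 0:
            levels.setdefault(v, set()).add(x)
    return list(levels.items())


# Overlapping sets, a repeated coefficient and a zero coefficient.
terms = [(2, {0, 1}), (2, {1, 2}), (0, {3}), (5, {2})]

print(integral_from_terms(terms), integral_from_terms(canonical(terms)))
```

Here $s$ takes the values $2, 4, 7, 0$ at the points $0, 1, 2, 3$, so the canonical representation has three level sets, and both computations give $49/3$.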

Let $s : X \to [0, \infty)$ be simple measurable. If $E$ is a measurable subset of $X$, then (as we have seen), $\mathcal{M}_E = \{F \in \mathcal{M} : F \subset E\}$ is a σ-algebra in $E$. If we restrict $\mu$ to this σ-algebra, we get the measure space $(E, \mathcal{M}_E, \mu_E)$, where $\mu_E = \mu|_{\mathcal{M}_E}$. We can define

Definition 24

\[ \int_E s \, d\mu = \int_E s|_E \, d\mu_E. \]

This definition has the advantage of reducing all results about $\int_E$ to results about $\int_X$; we are just changing the space. But it is easier to work using the following result:

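Exercise 33 Let $E \in \mathcal{M}$. Show that if $s = \sum_{i=1}^{n} c_i \chi_{E_i}$ is a simple measurable function with $E_1, \ldots, E_n \in \mathcal{M}$ and $c_1, \ldots, c_n \ge 0$, then

\[ \int_E s \, d\mu = \sum_{i=1}^{n} c_i \mu(E \cap E_i) = \int_X \chi_E s \, d\mu. \]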

The following result will be useful.

Lemma 7.3 Let $s : X \to [0, \infty]$ be a measurable simple function in the measure space $(X, \mathcal{M}, \mu)$. The map

\[ E \mapsto \int_E s \, d\mu : \mathcal{M} \to [0, \infty] \]

is a measure on $\mathcal{M}$.

Proof. Exercise.

The next step is to define the integral for all non-negative measurable functions.

Definition 25 Let $f : X \to [0, \infty]$ be measurable. Then

\[ \int_X f \, d\mu = \sup\left\{ \int_X s \, d\mu : s \text{ simple, measurable, } 0 \le s \le f \right\}. \]

The first observation is that apparently we now have two definitions of $\int_X s \, d\mu$ if $s$ is simple, measurable, non-negative. That the two coincide is an easy consequence of the fact that $0 \le s \le t$ implies $\int_X s \, d\mu \le \int_X t \, d\mu$, for simple measurable functions $s, t$. So we really have only one definition. A second observation could be that the integral of every non-negative measurable function is defined, and is non-negative. By Theorem 4.11 there exists $s$ simple, $0 \le s \le f$ if $f \ge 0$ is measurable, so that the set of which $\int_X f \, d\mu$ is the supremum contains at least one (non-negative) element. The integral can, of course, be equal to $\infty$. We can now extend all the properties of Lemma 7.1; the proof is a simple exercise in handling suprema of sets. We leave it as an exercise.

Theorem 7.4 Let $(X, \mathcal{M}, \mu)$ be a measure space. The following properties hold.

1. Let $f : X \to [0, \infty)$ be a measurable function, let $c \in \mathbb{R}$, $c \ge 0$. Then

\[ \int_X (cf) \, d\mu = c \int_X f \, d\mu. \]

2. Let $f, g : X \to [0, \infty)$ be measurable functions and assume that $f \le g$; i.e., $f(x) \le g(x)$ for all $x \in X$. Then

\[ \int_X f \, d\mu \le \int_X g \, d\mu. \]

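Calculating suprema can sometimes be hard and life is made simpler thanks to the following theorem.

Theorem 7.5 (Lebesgue's monotone convergence theorem) Let $(X, \mathcal{M}, \mu)$ be a measure space and let $f_n : X \to [0, \infty]$ be measurable for $n = 1, 2, \ldots$. Assume that the sequence is increasing; i.e., $f_n \le f_{n+1}$ for every $n \in \mathbb{N}$. Then

\[ \lim_{n \to \infty} \left( \int_X f_n \, d\mu \right) = \int_X \left( \lim_{n \to \infty} f_n \right) d\mu. \]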

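Proof. Since $\{f_n(x)\}$ is increasing, $\lim_{n \to \infty} f_n(x)$ exists for all $x \in X$. Let $f$ be defined on $X$ by $f(x) = \lim_{n \to \infty} f_n(x)$. By Theorem 4.5, $f$ is measurable. By Theorem 7.4, the sequence $\left\{ \int_X f_n \, d\mu \right\}$ is increasing and bounded above by $\int_X f \, d\mu$. It follows that

\[ L = \lim_{n \to \infty} \int_X f_n \, d\mu \]

exists and $L \le \int_X f \, d\mu$. It remains to prove that $L = \int_X f \, d\mu$. A nice trick proves this. Let $s$ be a simple measurable function such that $0 \le s \le f$. Let $\alpha \in (0, 1)$. For $n = 1, 2, \ldots$, let

\[ E_n = \{x \in X : f_n(x) \ge \alpha s(x)\}. \]

The sets $E_n$ are measurable (is this clear?) and, because the sequence $\{f_n\}$ increases, we have $E_1 \subset E_2 \subset \cdots$. For every $x \in X$ either $s(x) = 0$, and then $x \in E_1$, or $\alpha s(x) < s(x) \le f(x) = \lim_{n \to \infty} f_n(x)$; hence $X = \bigcup_n E_n$. Observe also that we have $\alpha \chi_{E_n} s \le f_n$. By Lemma 7.3 and Theorem 6.1 it follows that

\[ \alpha \int_X s \, d\mu = \int_X \alpha s \, d\mu = \lim_{n \to \infty} \int_{E_n} \alpha s \, d\mu = \lim_{n \to \infty} \int_X \alpha \chi_{E_n} s \, d\mu \le \lim_{n \to \infty} \int_X f_n \, d\mu = L. \]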

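Since $s$ was arbitrary as long as $0 \le s \le f$, this proves that

\[ \alpha \sup\left\{ \int_X s \, d\mu : s \text{ simple, measurable, } 0 \le s \le f \right\} \le L; \quad \text{i.e.,} \quad \alpha \int_X f \, d\mu \le L. \]

Since $\alpha \in (0, 1)$ was arbitrary, we proved

\[ \int_X f \, d\mu \le L, \quad \text{hence} \quad \int_X f \, d\mu = L. \]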

We now get as a simple corollary the fact that the integral of a sum is the sum of the integrals. It is somewhat harder to attempt a direct proof.

Corollary 7.6 Let $f, g : X \to [0, \infty]$ be measurable. Then

\[ \int_X (f + g) \, d\mu = \int_X f \, d\mu + \int_X g \, d\mu. \]

Proof. Let $\{s_n\}$, $\{t_n\}$ be increasing sequences of measurable simple functions, converging to $f, g$, respectively. Such sequences exist by Theorem 4.11. It is then clear that $\{s_n + t_n\}$ is an increasing sequence of simple measurable functions converging to $f + g$. By Lebesgue's monotone convergence theorem, and the fact that the conclusion of this corollary has been proved for measurable simple functions, we get

\[
\begin{aligned}
\int_X (f + g) \, d\mu &= \lim_{n \to \infty} \int_X (s_n + t_n) \, d\mu = \lim_{n \to \infty} \left( \int_X s_n \, d\mu + \int_X t_n \, d\mu \right) \\
&= \lim_{n \to \infty} \int_X s_n \, d\mu + \lim_{n \to \infty} \int_X t_n \, d\mu = \int_X f \, d\mu + \int_X g \, d\mu.
\end{aligned}
\]

The next corollary is also immediate; it is widely used as we shall see.

Corollary 7.7 (Beppo-Levi) Let $f_n : X \to [0, \infty]$ be measurable for $n = 1, 2, \ldots$. Then

\[ \int_X \left( \sum_{n=1}^{\infty} f_n \right) d\mu = \sum_{n=1}^{\infty} \left( \int_X f_n \, d\mu \right). \]

Proof. For $N \in \mathbb{N}$, set $F_N = \sum_{n=1}^{N} f_n$. Then $\{F_N\}$ is an increasing sequence of measurable functions converging to $\sum_{n=1}^{\infty} f_n$. The corollary follows from the previous corollary and Lebesgue's monotone convergence theorem.

We get to the second of the basic integration theorems, namely:

Theorem 7.8 (Fatou's lemma) Let $f_n : X \to [0, \infty]$ be measurable for $n = 1, 2, \ldots$. Then

\[ \int_X \liminf_{n \to \infty} f_n \, d\mu \le \liminf_{n \to \infty} \int_X f_n \, d\mu. \]

Proof. For $n = 1, 2, \ldots$, define $g_n : X \to [0, \infty]$ by $g_n(x) = \inf_{m \ge n} f_m(x)$ for $x \in X$; i.e., $g_n = \inf_{m \ge n} f_m$. Then $g_n$ is measurable for each $n \in \mathbb{N}$, the sequence $\{g_n\}$ is increasing and $g_n \le f_n$ for all $n \in \mathbb{N}$, and $\lim_{n \to \infty} g_n = \liminf_{n \to \infty} f_n$. By all this, and Lebesgue's monotone convergence theorem,

\[ \int_X \liminf_{n \to \infty} f_n \, d\mu = \int_X \lim_{n \to \infty} g_n \, d\mu = \lim_{n \to \infty} \int_X g_n \, d\mu \le \liminf_{n \to \infty} \int_X f_n \, d\mu. \]
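The inequality in Fatou's lemma can be strict. A standard example (not taken from these notes) is $f_n = n \chi_{(0, 1/n]}$ on $X = (0, 1]$ with Lebesgue measure: each $f_n$ is a simple function with integral $n \cdot \frac{1}{n} = 1$, while $f_n(x) \to 0$ for every $x$. The Python sketch below checks both facts with exact arithmetic; the helpers `f` and `integral_f` and the sample point `x` are illustrative assumptions.

```python
from fractions import Fraction


def f(n, x):
    """f_n = n * chi_{(0, 1/n]} evaluated at a point x of X = (0, 1]."""
    return n if 0 < x <= Fraction(1, n) else 0


def integral_f(n):
    """Integral of the simple function f_n for Lebesgue measure: n * m((0, 1/n])."""
    return n * Fraction(1, n)


integrals = [integral_f(n) for n in range(1, 50)]

# At a fixed point x, f_n(x) = 0 as soon as n > 1/x, so the pointwise limit is 0.
x = Fraction(1, 10)
tail_values = [f(n, x) for n in range(11, 50)]

print(set(integrals), set(tail_values))
```

So $\int_X \liminf_n f_n \, d\mu = 0 < 1 = \liminf_n \int_X f_n \, d\mu$. The same sequence is not increasing, which is why it does not contradict the monotone convergence theorem.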

In the most frequent applications of Fatou's lemma the sequence $\{f_n\}$ converges, or converges a.e.

Definition 26 Let $S$ be a set and, as usual, $(X, \mathcal{M}, \mu)$ a measure space. We say that $f$ is defined a.e. from $X$ to $S$ iff there exists a null set, i.e., a set $E \in \mathcal{M}$ with $\mu(E) = 0$, such that $f : X \setminus E \to S$.

Almost everywhere defined functions are almost as good as ones that are everywhere defined. We recall Exercise 21; surely you have done it by now! It shows that if we have a function from $X$ to a metric space which is defined a.e., then it can always be extended to be a measurable function defined on all of $X$. Different extensions are obviously equal a.e. Because the most important thing we can do with functions on a measure space is to integrate them, it frequently pays NOT to distinguish between functions that are equal a.e., nor to require that a function be everywhere defined. We'll see this in our next theorem, a vamped-up version of Lebesgue's monotone convergence theorem. But first, more null sets nonsense.

Lemma 7.9 Let $f : X \to [0, \infty]$ be measurable. Then

1. If $E \in \mathcal{M}$ and $\mu(E) = 0$, then $\int_E f \, d\mu = 0$.

2. If $\int_X f \, d\mu = 0$, then $f = 0$ a.e.

Proof.

1. Let $E \in \mathcal{M}$ be a null set. Let $s$ be a simple measurable function, $0 \le s \le f$. We leave it as a (very easy) exercise to check that $\int_E s \, d\mu = 0$ and to see that, because $s$ is arbitrary, this implies that $\int_E f \, d\mu = 0$.

2. If $f = 0$ a.e., let $E = \{x \in X : f(x) > 0\}$. Then $E$ is a null set. Moreover, $f = \chi_E f$. It follows that

\[ \int_X f \, d\mu = \int_X \chi_E f \, d\mu = \int_E f \, d\mu = 0. \]

Conversely, assume $\int_X f \, d\mu = 0$. Let $E_n = \{x \in X : f(x) > 1/n\}$. Then

\[ \frac{1}{n} \chi_{E_n} \le f \]

so that

\[ \frac{1}{n} \mu(E_n) = \int_X \frac{1}{n} \chi_{E_n} \, d\mu \le \int_X f \, d\mu = 0. \]

It follows that $\mu(E_n) = 0$ for all $n$, hence we also have that

\[ \mu(\{x \in X : f(x) > 0\}) = \mu\left( \bigcup_{n=1}^{\infty} E_n \right) \le \sum_{n=1}^{\infty} \mu(E_n) = 0. \]

It follows that $f = 0$ a.e.

As a trivial corollary, we get that if $f = g$ a.e., then $\int_X f \, d\mu = \int_X g \, d\mu$; assuming both integrals make sense. Well, it becomes more trivial once we get to integrate non-negative functions. Right now, one has to be a bit careful because while it is true that $f - g = 0$ a.e., its integral is not defined yet (it isn't necessarily a non-negative function). The way to do it, for example, is to let $E = \{x \in X : f(x) \ne g(x)\}$. Define $h(x) = f(x)$ if $x \in X \setminus E$, $h(x) = 0$ if $x \in E$. Then $f - h \ge 0$ and $f - h = 0$ a.e.; thus $\int_X f \, d\mu = \int_X h \, d\mu$. Similarly, one sees that $\int_X g \, d\mu = \int_X h \, d\mu$. The conclusion follows. This allows us to define integrals of functions defined almost everywhere, since their values on a null set are ignorable. But we'll do this once we discuss integrals of complex valued functions.

We are ready for a new version of Lebesgue's monotone convergence theorem.

Theorem 7.10 (Lebesgue's monotone convergence theorem; version 2) Let $f_n : X \to [0, \infty]$ be measurable for $n = 1, 2, \ldots$. Assume that for every $n \in \mathbb{N}$ the inequality $f_n \le f_{n+1}$ holds a.e. Then $\{f_n\}$ converges a.e. to a measurable function $f : X \to [0, \infty]$ and

\[ \lim_{n \to \infty} \int_X f_n \, d\mu = \int_X f \, d\mu. \]

Proof. For $n \in \mathbb{N}$, let $E_n = \{f_n > f_{n+1}\}$. The sets $E_n$ are all measurable and, by hypothesis, null sets. Thus $E = \bigcup_{n=1}^{\infty} E_n$ also is a null set. If $x \notin E$, then the sequence $\{f_n(x)\}$ is increasing, hence converges; let $f(x)$ be the limit. Then $f : X \setminus E \to [0, \infty]$ is measurable. Extend $f$ to all of $X$, so it remains measurable (for example, set $f(x) = 0$ if $x \in E$). Then

\[ \int_X f \, d\mu = \int_{X \setminus E} f \, d\mu = \lim_{n \to \infty} \int_{X \setminus E} f_n \, d\mu = \lim_{n \to \infty} \int_X f_n \, d\mu; \]

the second equal sign being due to Lebesgue's Monotone Convergence Theorem, the first version, all others to the fact that $\mu(E) = 0$.

Definition 27 Let $f : X \to [0, \infty]$ be measurable. We say that $f$ is integrable iff

\[ \int_X f \, d\mu < \infty. \]

Integrable functions are not too large, in the sense of the following two theorems.

Theorem 7.11 Let $f$ be a non-negative, extended real valued, measurable function defined on the measure space $X$. Assume $\int_X f \, d\mu < \infty$. Then $f$ is a.e. finite valued; i.e.,

\[ \mu(\{x \in X : f(x) = \infty\}) = 0. \]

Proof. Let $E = \{x \in X : f(x) = \infty\}$. Then for every $n \in \mathbb{N}$, $n \chi_E \le f$. Integrating we get $n \mu(E) \le \int_X f \, d\mu$. The theorem follows.

Theorem 7.12 Let $f$ be a non-negative, extended real valued, measurable function defined on the measure space $X$. Assume $\int_X f \, d\mu < \infty$. For every $\varepsilon > 0$, the set $\{x \in X : f(x) > \varepsilon\}$ has finite measure.

Proof. Exercise.

As a corollary, we get if $f$ is as in the last theorem that the set $\{f > 0\}$ is σ-finite; i.e., the union of a countable family of sets of finite measure; in fact,

\[ \{f > 0\} = \bigcup_{n=1}^{\infty} \left\{ f > \frac{1}{n} \right\}. \]

And now we begin to consider extended real valued and complex valued functions.

Definition 28 Let $f : X \to [-\infty, \infty]$ be measurable. If at least one of $\int_X f^+ \, d\mu$, $\int_X f^- \, d\mu$ is finite, we define the integral of $f$ over $X$ by

\[ \int_X f \, d\mu = \int_X f^+ \, d\mu - \int_X f^- \, d\mu. \]

We say $f$ is integrable iff

\[ \int_X |f| \, d\mu < \infty. \]

A few remarks may be in order. If $f : X \to \mathbb{R}$ is measurable, then $|f|$ is a measurable, non-negative function. Thus $\int_X |f| \, d\mu$ is always defined. If and only if it is finite, we say that $f$ is integrable. The reader (if there is any) should have no problem proving

Exercise 34 Let $f : X \to [-\infty, \infty]$ be measurable. Then $f$ is integrable if and only if $f^-, f^+$ are integrable.

If $f$ is complex valued, one only defines its integral when all quantities involved are finite. We define

Definition 29 Let $f : X \to \mathbb{C}$ be measurable. We say $f$ is integrable if and only if

\[ \int_X |f| \, d\mu < \infty. \]

The set of all complex valued integrable functions of the measure space $(X, \mathcal{M}, \mu)$ is denoted by $L^1(\mu)$, or by $L^1(X, \mu)$, or by $L^1(X, \mathcal{M}, \mu)$.

If $f : X \to \mathbb{C}$ is measurable, then $|f|$ is a non-negative measurable function, and its integral is defined. It is a trivial exercise to prove that $L^1(\mu)$ is a complex vector space. Let's state it as part of our next theorem.

Theorem 7.13 Let $(X, \mathcal{M}, \mu)$ be a measure space. The set $L^1(\mu)$ is a subspace of the complex vector space of complex valued functions. Moreover, $f \in L^1(\mu)$ if and only if setting $u = \operatorname{Re} f$, $v = \operatorname{Im} f$, we have that $u^+, u^-, v^+, v^-$ are integrable.

We leave the proof as an exercise. It basically reduces to triangle inequalities and Corollary 7.6. The last part is immediate from the inequality

\[ \max(u^+, u^-, v^+, v^-) \le |f| \le u^+ + u^- + v^+ + v^-. \]

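Definition 30 Let $f = u + iv \in L^1(\mu)$, where $u, v$ are real valued. Then we define

\[ \int_X f \, d\mu = \int_X u \, d\mu + i \int_X v \, d\mu = \left( \int_X u^+ \, d\mu - \int_X u^- \, d\mu \right) + i \left( \int_X v^+ \, d\mu - \int_X v^- \, d\mu \right). \]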

Before we see the linearity of the integral of $f$ with respect to the measure $\mu$, we want to establish a few other results. Moreover, we want to say a few more words on the extended real case. We now have $\int_X f \, d\mu$ defined if $f : X \to [-\infty, \infty]$ and either $f^+$ or $f^-$ has finite integral, and also if $f : X \to \mathbb{C}$ and $|f|$ has a finite integral. The extended real case is not quite a particular case of the complex case because of the possibility of assuming infinite values. But, suppose $f : X \to [-\infty, \infty]$ is integrable. Then $f^+, f^-$ have finite integrals hence, by Theorem 7.11, both are a.e. finite valued. Modifying a function on a null set does not change its integral, hence there exists $\tilde{f} : X \to \mathbb{R}$ measurable, such that $\tilde{f} = f$ a.e. and

\[ \int_X \tilde{f} \, d\mu = \int_X f \, d\mu. \]

So if we identify functions which are equal a.e. (as we soon shall be doing), then one can see the extended real valued integrable functions as being a subset of $L^1(\mu)$. When dealing with extended real valued functions, it becomes almost necessary to work with functions defined only a.e., to avoid too many convolutions. So we can define: Let $f$ be defined a.e. as an extended real valued function on $X$. Then $f^+, f^-$ are defined a.e., hence so are their integrals over $X$. We can then define $\int_X f \, d\mu$ as before, assuming at least one of $\int_X f^+ \, d\mu$, $\int_X f^- \, d\mu$ is finite.

Suppose $f, g : X \to [-\infty, \infty]$ are defined a.e. (there is some abuse of notation here). Then $f + g$ will be defined a.e. iff the set

\[ (\{f = \infty\} \cap \{g = -\infty\}) \cup (\{f = -\infty\} \cap \{g = \infty\}) \]

is a null set.

Theorem 7.14 (a) Let $f, g : X \to [-\infty, \infty]$ be measurable. If $\int_X f \, d\mu$ and $\int_X g \, d\mu$ exist, and they are not infinities of opposite sign (one equal to $\infty$, the other one to $-\infty$), then $f + g$ is defined a.e., $\int_X (f + g) \, d\mu$ exists and

\[ \int_X (f + g) \, d\mu = \int_X f \, d\mu + \int_X g \, d\mu. \]

(b) Let $f : X \to [-\infty, \infty]$ be measurable and let $c \in \mathbb{R}$. If $\int_X f \, d\mu$ exists, then so does $\int_X (cf) \, d\mu$ and

\[ \int_X (cf) \, d\mu = c \int_X f \, d\mu. \]

(c) Let $f, g : X \to [-\infty, \infty]$ be measurable. If $\int_X f \, d\mu$ and $\int_X g \, d\mu$ exist, and if $f \le g$ a.e., then

\[ \int_X f \, d\mu \le \int_X g \, d\mu. \]

Proof.

(a) We begin seeing that $f + g$ is defined a.e., given all our assumptions. If the set $\{f = \infty\} \cap \{g = -\infty\}$ is not a null set, then neither $\{f = \infty\} = \{f^+ = \infty\}$ nor $\{g = -\infty\} = \{g^- = \infty\}$ can be null sets; by Theorem 7.11 this forces $\int_X f^+ \, d\mu = \infty$, hence $\int_X f \, d\mu = \infty$, and $\int_X g^- \, d\mu = \infty$, hence $\int_X g \, d\mu = -\infty$. But we are assuming that $\int_X f \, d\mu$ and $\int_X g \, d\mu$ are not infinities of opposite signs. Similarly one sees that $\{f = -\infty\} \cap \{g = \infty\}$ is a null set. With this out of the way, we see that the following equalities hold (at least) a.e.:

\[ f + g = (f + g)^+ - (f + g)^-, \quad \text{and also} \quad f + g = f^+ - f^- + g^+ - g^-, \]

hence, equating and rearranging,

\[ (f + g)^+ + f^- + g^- = (f + g)^- + f^+ + g^+. \]

By Corollary 7.6, we see that

\[ \int_X (f + g)^+ \, d\mu + \int_X f^- \, d\mu + \int_X g^- \, d\mu = \int_X (f + g)^- \, d\mu + \int_X f^+ \, d\mu + \int_X g^+ \, d\mu. \tag{7} \]

Suppose first that $\int_X f^+ \, d\mu = \infty$. The right hand side of (7) is then equal to $\infty$. Moreover, $\int_X f^- \, d\mu < \infty$ and $\int_X f \, d\mu = \infty$, by all our assumptions. Also, $\int_X g^- \, d\mu < \infty$, otherwise $\int_X g \, d\mu = -\infty$, a no-no given that $\int_X f \, d\mu = \infty$. It follows that we must have $\int_X (f + g)^+ \, d\mu = \infty$ for the equality (7) to hold. We see

\[ (f + g)^- \le f^- + g^- \]

and because both $f^-$ and $g^-$ are integrable (as remarked above for $f^-$; the integrability of $g^-$ is a consequence of the assumption that $\int_X g \, d\mu$ exists and is not $-\infty$) we see that $(f + g)^-$ is integrable. Thus

\[ \int_X (f + g) \, d\mu = \int_X (f + g)^+ \, d\mu - \int_X (f + g)^- \, d\mu \]

is defined and equals $\infty$. Since $\int_X f \, d\mu = \infty$ and $\int_X g \, d\mu > -\infty$, we get

\[ \int_X (f + g) \, d\mu = \infty = \int_X f \, d\mu + \int_X g \, d\mu \]

proving the equality in this case. The cases in which $f^-$, $g^+$ or $g^-$ have infinite integrals are identical, so we can assume now that $f^+, f^-, g^+$ and $g^-$ are integrable. Then so are $(f + g)^+ \le f^+ + g^+$, $(f + g)^- \le f^- + g^-$, so that all integrals in (7) are finite and we get at once

\[ \int_X (f + g)^+ \, d\mu - \int_X (f + g)^- \, d\mu = \int_X f^+ \, d\mu - \int_X f^- \, d\mu + \int_X g^+ \, d\mu - \int_X g^- \, d\mu \]

concluding the proof of part (a).

(b) This is immediate. All one needs to see is that $(cf)^+ = c f^+$ and $(cf)^- = c f^-$ if $c > 0$, $(cf)^+ = |c| f^-$ and $(cf)^- = |c| f^+$ if $c < 0$ (the case $c = 0$ being totally trivially trivial).

(c) It is clear that if f ≥ 0 then its integral is non- negative. If g− f is defineda.e., we’d have g − f ≥ 0, hence, by parts (a), (b),

0 ≤∫

X

(f − g) dµ =∫

X

f dµ−∫

X

g dµ

and we would be done. Then we could deal with the exceptional case. But it might be better to do a direct proof. All one needs to observe is that $f = f^+ - f^- \le g = g^+ - g^-$ a.e. implies $f^+ + g^- \le g^+ + f^-$ a.e. Now we are dealing with non-negative measurable functions, so that by our previous theorems and results we get

\[ \int_X f^+\,d\mu + \int_X g^-\,d\mu \le \int_X g^+\,d\mu + \int_X f^-\,d\mu. \]

Now one simply has to rearrange, checking that one is not subtracting infinity from infinity, which wouldn’t make any sense.
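The bookkeeping with positive and negative parts in part (a) can be checked on a toy example. The following is only a sketch under stated assumptions: the measure space is a hypothetical four-point set with positive point masses (so every integral is a finite weighted sum), and the names `pos`, `neg`, `integral`, `weights` are illustrative and not part of the notes.

```python
from fractions import Fraction

def pos(x):
    """Positive part x^+ = max(x, 0)."""
    return max(x, Fraction(0))

def neg(x):
    """Negative part x^- = max(-x, 0), so that x = x^+ - x^-."""
    return max(-x, Fraction(0))

def integral(values, weights):
    """Integral of a function on a finite set with point masses `weights`."""
    return sum(v * w for v, w in zip(values, weights))

# Hypothetical data: a four-point space with positive point masses.
weights = [Fraction(1, 2), Fraction(2), Fraction(1), Fraction(3, 4)]
f = [Fraction(3), Fraction(-1), Fraction(-5, 2), Fraction(0)]
g = [Fraction(-4), Fraction(2), Fraction(1), Fraction(-1, 3)]
s = [a + b for a, b in zip(f, g)]

# Pointwise identity (f+g)^+ + f^- + g^- = (f+g)^- + f^+ + g^+.
identity_holds = all(
    pos(c) + neg(a) + neg(b) == neg(c) + pos(a) + pos(b)
    for a, b, c in zip(f, g, s)
)

# Additivity recovered from the integrals of the six non-negative pieces.
lhs = integral([pos(c) for c in s], weights) - integral([neg(c) for c in s], weights)
rhs = (integral([pos(a) for a in f], weights) - integral([neg(a) for a in f], weights)
       + integral([pos(b) for b in g], weights) - integral([neg(b) for b in g], weights))
additive = lhs == rhs
```

Exact rational arithmetic is used so that the equalities are tested exactly; in this finite setting no infinite integrals occur, so the rearrangement is always legitimate.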


Linearity of the integral on $L^1(\mu)$ is very easy at this point. We see that if $f = u + iv \in L^1(\mu)$, then, if $c \in \mathbb{R}$, $cf = (cu) + i(cv)$ and

\[ \int_X (cf)\,d\mu = \int_X cu\,d\mu + i\int_X cv\,d\mu = c\int_X u\,d\mu + ic\int_X v\,d\mu = c\int_X f\,d\mu \]

follows. We also have $if = -v + iu$ so that

\[ \int_X if\,d\mu = -\int_X v\,d\mu + i\int_X u\,d\mu = i\left(i\int_X v\,d\mu + \int_X u\,d\mu\right) = i\int_X f\,d\mu. \]

It follows that $\int_X \alpha f\,d\mu = \alpha\int_X f\,d\mu$ for all $\alpha \in \mathbb{C}$. That the integral of a sum is the sum of the integrals is just as easy; we thus state without proof:

Theorem 7.15 Let $(X,\mathcal{M},\mu)$ be a measure space. The map

\[ f \mapsto \int_X f\,d\mu : L^1(\mu) \to \mathbb{C} \]

is linear.

Theorem 7.16 Let $f \in L^1(\mu)$. Then

\[ \left|\int_X f\,d\mu\right| \le \int_X |f|\,d\mu. \]

Remark. The obvious way of proving this theorem is writing $f = u + iv$. It is fairly simple to see that the inequality holds in the real case, where it reduces to

\[ \left|\int_X u\,d\mu\right| = \left|\int_X u^+\,d\mu - \int_X u^-\,d\mu\right| \le \int_X u^+\,d\mu + \int_X u^-\,d\mu = \int_X (u^+ + u^-)\,d\mu = \int_X |u|\,d\mu. \]

One could try

\[ \left|\int_X f\,d\mu\right| = \left|\int_X u\,d\mu + i\int_X v\,d\mu\right| \le \int_X |u|\,d\mu + \int_X |v|\,d\mu \]

but this won’t work. For example, it is possible to have $|u| = |v|$; then $|f| = \sqrt{|u|^2 + |v|^2} = \sqrt{2}\,|u|$. If we bound as in the last displayed inequalities, we are bounding

\[ \left|\int_X f\,d\mu\right| \le \int_X |u|\,d\mu + \int_X |v|\,d\mu = 2\int_X |u|\,d\mu, \]

when in reality we want

\[ \left|\int_X f\,d\mu\right| \le \int_X |f|\,d\mu = \sqrt{2}\int_X |u|\,d\mu. \]

In other words, this obvious approach can’t work. One has to prove, in fact, that

\[ \left[\left(\int_X u\,d\mu\right)^2 + \left(\int_X v\,d\mu\right)^2\right]^{1/2} \le \int_X \sqrt{u^2 + v^2}\,d\mu. \]

A direct approach is possible, but not really recommended.


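Proof. Let

\[ \alpha = \int_X f\,d\mu; \]

we write $\alpha = re^{i\theta}$, where $r \ge 0$, $\theta \in \mathbb{R}$ (if $\alpha = 0$, a trivial case, we take $r = 0$ and $\theta$ could be anything; say $\theta = 0$). Then

\[ \left|\int_X f\,d\mu\right| = |\alpha| = e^{-i\theta}\alpha = e^{-i\theta}\int_X f\,d\mu = \int_X (e^{-i\theta}f)\,d\mu. \]

We now take real parts. Notice for this that, by definition,

\[ \Re\left(\int_X f\,d\mu\right) = \int_X \Re f\,d\mu \]

and, of course, the real part of a non-negative real quantity is the quantity itself. Thus

\[ \left|\int_X f\,d\mu\right| = \Re\left(\left|\int_X f\,d\mu\right|\right) = \Re\left(\int_X (e^{-i\theta}f)\,d\mu\right) = \int_X \Re\left(e^{-i\theta}f\right)d\mu. \]

But $\Re\left(e^{-i\theta}f\right) \le \left|e^{-i\theta}f\right| = |f|$ so that by Theorem 7.14, part (c), we have

\[ \left|\int_X f\,d\mu\right| = \int_X \Re\left(e^{-i\theta}f\right)d\mu \le \int_X |f|\,d\mu. \]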
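The rotation trick of the proof, and the weakness of the crude bound discussed in the remark, can be seen numerically. This is a minimal sketch, assuming counting measure on a hypothetical five-point set (so integrals are plain sums); the sample values are made up, with $|u| = |v|$ at every point.

```python
import cmath

# Hypothetical example: counting measure on a five-point set, so the
# integral of a function is just the sum of its values.
f = [1 + 1j, -2 + 2j, 3 - 3j, 0.5 + 0.5j, -1 - 1j]   # here |u| = |v| pointwise

integral_f = sum(f)
integral_abs_f = sum(abs(z) for z in f)
crude_bound = sum(abs(z.real) for z in f) + sum(abs(z.imag) for z in f)

# Rotation trick from the proof: write the integral as r * exp(i*theta),
# multiply f by exp(-i*theta), and integrate the real part.
r, theta = cmath.polar(integral_f)
rotated = [cmath.exp(-1j * theta) * z for z in f]
integral_real_part = sum(z.real for z in rotated)
```

Here `integral_real_part` agrees with `r`, the modulus of the integral, up to rounding, and the crude bound exceeds the integral of $|f|$ by the factor $\sqrt{2}$ mentioned in the remark.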

We finally get to the third basic integration theorem, perhaps the most useful one of the three.

Theorem 7.17 (Lebesgue’s dominated convergence theorem) Let $(X,\mathcal{M},\mu)$ be a measure space and let $f_n : X \to \mathbb{C}$ be measurable for $n = 1, 2, \ldots$. Assume

1. The sequence $\{f_n\}$ converges a.e., say to $f$. (That is, $f(x) = \lim_{n\to\infty} f_n(x)$ exists for a.e. $x \in X$.)

2. There exists $g \in L^1(\mu)$ such that $|f_n| \le g$ a.e. for every $n$.

Then $f \in L^1(\mu)$ and it holds that

\[ \lim_{n\to\infty} \int_X |f_n - f|\,d\mu = 0 \tag{8} \]

and

\[ \lim_{n\to\infty} \int_X f_n\,d\mu = \int_X f\,d\mu. \tag{9} \]

Proof. The function $f$ is measurable, being the a.e. limit of measurable functions. More precisely, the set on which $f$ is the limit is a measurable set; we consider $f$ as being a.e. defined as this limit and then extend it in a measurable way (say as 0) to the rest of $X$. The function $g$ (which is necessarily a.e. $\ge 0$ and thus can be assumed to be $\ge 0$ everywhere) is integrable and $f = \lim_{n\to\infty} f_n$, $|f_n| \le g$ a.e., implies $|f| \le g$ a.e. Careful! It is trivial but not tremendously trivial, because the null sets on which $|f_n| \le g$ fails to hold could depend on $n$. But their union (being a countable union) is also a null set. We see that $f \in L^1(\mu)$ since $\int_X |f|\,d\mu \le \int_X g\,d\mu < \infty$. To conclude the proof, it suffices to show that (8) holds, since (9) is an immediate consequence of (8). In fact,

\[ \lim_{n\to\infty} \left|\int_X f_n\,d\mu - \int_X f\,d\mu\right| = \lim_{n\to\infty} \left|\int_X (f_n - f)\,d\mu\right| \le \lim_{n\to\infty} \int_X |f_n - f|\,d\mu = 0. \]

(One really should use lim sup in the first two expressions, until one knows there is a limit.) To prove (8), let $g_n = 2g - |f_n - f|$. Then $g_n \ge 0$ (a.e., but we can modify things on null sets and get everywhere) because

\[ |f_n - f| \le |f_n| + |f| \le 2g. \]

Since

\[ \liminf_{n\to\infty} g_n = \lim_{n\to\infty} g_n = 2g \]

(once more, a.e.), Fatou’s lemma implies

\[ \begin{aligned} \int_X (2g)\,d\mu &\le \liminf_{n\to\infty} \int_X (2g - |f_n - f|)\,d\mu \\ &= \liminf_{n\to\infty} \left(\int_X (2g)\,d\mu - \int_X |f_n - f|\,d\mu\right) \\ &= \int_X (2g)\,d\mu - \limsup_{n\to\infty} \int_X |f_n - f|\,d\mu. \end{aligned} \]

Because $\int_X (2g)\,d\mu = 2\int_X g\,d\mu < \infty$, we can subtract it from both sides to get

\[ \limsup_{n\to\infty} \int_X |f_n - f|\,d\mu \le 0. \]

The sequence of which we are taking a lim sup consists of non-negative terms, so that it has to converge to 0.
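To see what the domination hypothesis buys, one can compare two sequences on $[0,1]$ that both converge to 0 almost everywhere. The sketch below is only a numerical illustration: it replaces the Lebesgue integral by a midpoint Riemann sum (an assumption that is harmless for these piecewise smooth integrands), and the helper names are hypothetical.

```python
def midpoint_integral(func, n_cells=20000):
    """Midpoint Riemann sum of func over [0, 1]; a numerical stand-in for the integral."""
    h = 1.0 / n_cells
    return sum(func((i + 0.5) * h) for i in range(n_cells)) * h

def dominated(n):
    # f_n(x) = x**n: |f_n| <= 1 on [0, 1], and f_n -> 0 for every x < 1.
    return lambda x: x ** n

def escaping(n):
    # f_n = n on (0, 1/n), 0 elsewhere: f_n -> 0 pointwise on (0, 1],
    # but no integrable g dominates every f_n.
    return lambda x: float(n) if 0.0 < x < 1.0 / n else 0.0

dominated_integrals = [midpoint_integral(dominated(n)) for n in (1, 10, 100)]
escaping_integrals = [midpoint_integral(escaping(n)) for n in (1, 10, 100)]
```

The first list decreases towards 0, in agreement with (9) with $g = 1$; the second stays at 1 even though the pointwise limit is 0, which is consistent with the theorem because the supremum of the second family behaves like $1/x$ near 0 and is not integrable.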

We haven’t had any exercise in a while. Let’s do one or two before it is too late.

Exercise 35 Let $X$ be a set, consider $X$ as a measure space with counting measure $\mu$. That is, consider the measure space $(X, \mathcal{P}(X), \mu)$ where $\mu(E)$ is the cardinality of $E$ if $E$ is a finite set, $\mu(E) = \infty$ if it isn’t. All functions $f : X \to [-\infty,\infty]$ or $f : X \to \mathbb{C}$ are, of course, measurable. Prove:

1. If $f : X \to [0,\infty]$ then

\[ \int_X f\,d\mu = \sum_{x\in X} f(x). \]

2. If $f \in L^1(\mu)$, then $\{x \in X : f(x) \ne 0\}$ is countable. Order the elements of this set so that we either have $\{x \in X : f(x) \ne 0\} = \{x_1, \ldots, x_N\}$ for some $N \in \mathbb{N}$ (finite case) or $\{x \in X : f(x) \ne 0\} = \{x_1, x_2, \ldots\}$ (infinite case). Show that then

\[ \int_X f\,d\mu = \sum_{j=1}^{N} f(x_j) \]

in the finite case,

\[ \int_X f\,d\mu = \sum_{j=1}^{\infty} f(x_j) = \lim_{N\to\infty} \sum_{j=1}^{N} f(x_j) \]

in the infinite (though countable) case.
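A finite instance of part 1 may help fix ideas before attempting the exercise. This is a hedged sketch, not a solution: the set `X`, the function `f` and the helper `integral_of_simple` are hypothetical, and the second computation only illustrates the partial sums appearing in part 2 for the made-up choice $f(x_j) = 2^{-j}$.

```python
from fractions import Fraction

def integral_of_simple(values_on_sets):
    """Integral of a simple function sum(c * chi_E) against counting measure:
    the sum of c * (cardinality of E)."""
    return sum(c * len(E) for c, E in values_on_sets)

X = ["a", "b", "c", "d"]
f = {"a": Fraction(1, 2), "b": Fraction(3), "c": Fraction(0), "d": Fraction(3)}

# Write f as a simple function: one term c * chi_{f = c} per value c.
level_sets = [(c, [x for x in X if f[x] == c]) for c in set(f.values())]
integral = integral_of_simple(level_sets)
pointwise_sum = sum(f[x] for x in X)

# Partial sums of f(x_j) = 2**-j; they increase to 1.
partial_sums = [sum(Fraction(1, 2 ** j) for j in range(1, N + 1)) for N in (1, 5, 20)]
```

On a finite set every non-negative function is simple, so the two quantities `integral` and `pointwise_sum` coincide by the definition of the integral of a simple function.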


The next exercise introduces an important property of the integral of a non-negative function. It generalizes Lemma 7.3.

Exercise 36 Let $(X,\mathcal{M},\mu)$ be a measure space and let $f : X \to [0,\infty]$ be measurable. If $E \in \mathcal{M}$, define $\lambda(E) = \int_E f\,d\mu = \int_X \chi_E f\,d\mu$. Prove that $\lambda$ is a measure on $\mathcal{M}$ and that

\[ \int_X g\,d\lambda = \int_X fg\,d\mu \]

for all non-negative measurable functions $g$ on $X$.

Hint: There is a standard way of proving certain properties of integration, and this is a first example. A certain result is to be proved for measurable functions. One proves it first for characteristic functions. Then one sees (if one can) that the result is “linear” in the sense that if it is true for functions $f, g$, it is also true for $af + bg$, $a, b \in \mathbb{R}$. Now one has the result for all simple functions. Next, one sees that the result is “monotone;” i.e., if it holds for all functions in a sequence $\{f_n\}$ such that $0 \le f_1 \le f_2 \le \cdots$, then it also holds for $f = \lim_{n\to\infty} f_n$. Now one has it for all non-negative measurable functions; linearity kicks in again to extend it to all real-valued measurable functions, finally to all complex valued functions. In this exercise, the process stops once one reaches all non-negative measurable functions.

Intermezzo on Normed vector spaces.

We recall a few concepts. We assume $V$ is a complex vector space (all results hold, mutatis mutandis, for real vector spaces). A norm in $V$ is a map $x \mapsto \|x\| : V \to [0,\infty)$ such that $\|x\| = 0$ if and only if $x = 0$, $\|cx\| = |c|\|x\|$ for all $c \in \mathbb{C}$, $x \in V$ and $\|x+y\| \le \|x\| + \|y\|$ for all $x, y \in V$. A normed vector space is automatically a metric space, the metric being defined by $d(x,y) = \|x - y\|$. Thus it always makes sense to talk of open sets, closed sets, compact sets, convergence, etc., in a normed vector space. A Banach space is a complete normed vector space; i.e., a normed vector space in which all Cauchy sequences converge. In other words, if $x_n \in V$ for all $n \in \mathbb{N}$ and if for every $\varepsilon > 0$ there exists $N \in \mathbb{N}$ such that $\|x_n - x_m\| < \varepsilon$ whenever $n, m \ge N$, then there exists $x \in V$ such that for every $\varepsilon > 0$ there is $N \in \mathbb{N}$ (not necessarily the same as before) such that $\|x_n - x\| < \varepsilon$ for all $n \ge N$.

A useful result is

Lemma 7.18 Let $V$ be a normed vector space. Then $V$ is complete (i.e., a Banach space) if and only if every absolutely convergent series converges. This means that completeness is equivalent to the following property: If $x_n \in V$ for each $n \in \mathbb{N}$ and if

\[ \sum_{n=1}^{\infty} \|x_n\| < \infty, \]

then there exists $x \in V$ such that

\[ \lim_{n\to\infty} \left\|\sum_{k=1}^{n} x_k - x\right\| = 0. \]

Proof. The proof is quite straightforward. Assume first $V$ is complete and $\sum_{n=1}^{\infty} \|x_n\| < \infty$. If we set $y_n = \sum_{k=1}^{n} x_k$ for $n = 1, 2, \ldots$, it is immediate that $\{y_n\}$ is a Cauchy sequence in $V$ (for $m < n$ we have $\|y_n - y_m\| \le \sum_{k=m+1}^{n} \|x_k\|$). Convergence of this sequence to an element $x \in V$ is equivalent to convergence of the series to $x$. Conversely, assume that every absolutely convergent series converges, and let $\{y_n\}$ be a Cauchy sequence in $V$. We define a sequence of positive integers $\{n_k\}$ with $1 \le n_1 < n_2 < n_3 < \cdots$ as follows. Because the sequence is Cauchy, for every $k \in \mathbb{N}$, there is $N_k \in \mathbb{N}$ such that $n, m \ge N_k$ implies $\|y_n - y_m\| < 2^{-k}$. Now set $n_1 = N_1$ and assuming $n_k$ found for some $k \ge 1$, set $n_{k+1} = \max(N_{k+1}, n_k + 1)$. The fact that the sequence $\{n_k\}$ is strictly increasing is clear; so is the fact that

\[ \|y_{n_k} - y_{n_{k+1}}\| < 2^{-k} \]

for $k = 1, 2, \ldots$ because $n_k, n_{k+1} \ge N_k$. Setting $x_k = y_{n_k} - y_{n_{k+1}}$ for $k = 1, 2, \ldots$, we have

\[ \sum_{k=1}^{\infty} \|x_k\| \le \sum_{k=1}^{\infty} 2^{-k} = 1 < \infty. \]

By our assumption, there is $x \in V$ such that

\[ \lim_{k\to\infty} \left\|\sum_{j=1}^{k} x_j - x\right\| = 0. \]

But

\[ \sum_{j=1}^{k} x_j = (y_{n_1} - y_{n_2}) + (y_{n_2} - y_{n_3}) + \cdots + (y_{n_k} - y_{n_{k+1}}) = y_{n_1} - y_{n_{k+1}} \]

so that the sequence $\{y_{n_k}\}$ converges to $y = y_{n_1} - x$. It is well known that if a Cauchy sequence has a convergent subsequence, it converges. In all events, here is the proof. We have proved that the subsequence $\{y_{n_k}\}$ of the Cauchy sequence $\{y_n\}$ converges (to $y$). Let $\varepsilon > 0$. There is then $N \in \mathbb{N}$ such that $n, m \ge N$ implies $\|y_n - y_m\| < \varepsilon/2$. There is $K \in \mathbb{N}$ such that $k \ge K$ implies $\|y_{n_k} - y\| < \varepsilon/2$. Assume now $n \ge N$. We can find $k \in \mathbb{N}$ such that $k \ge K$ and $k \ge N$; then also $n_k \ge k \ge N$ and we have both $\|y_n - y_{n_k}\| < \varepsilon/2$ and $\|y_{n_k} - y\| < \varepsilon/2$. By the triangle inequality, $\|y_n - y\| < \varepsilon$.
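The recursive choice of indices $n_{k+1} = \max(N_{k+1}, n_k + 1)$ used in this proof can be sketched in a few lines. The example is hypothetical: it takes $V = \mathbb{R}$ and the Cauchy sequence $y_n = 1/n$, for which $N_k = 2^k$ is one valid choice of threshold (an assumption checked in the comment, not something taken from the notes).

```python
from fractions import Fraction

def y(n):
    """Hypothetical Cauchy sequence in R: y_n = 1/n."""
    return Fraction(1, n)

def N(k):
    """A valid threshold for y_n = 1/n: n, m >= 2**k gives |y_n - y_m| < 2**-k."""
    return 2 ** k

def subsequence_indices(count):
    """n_1 = N_1 and n_{k+1} = max(N_{k+1}, n_k + 1), as in the proof."""
    indices = [N(1)]
    for k in range(1, count):
        indices.append(max(N(k + 1), indices[-1] + 1))
    return indices

n = subsequence_indices(10)
# x_k = y_{n_k} - y_{n_{k+1}}; the norms are summable with total below 1.
x_norms = [abs(y(n[k]) - y(n[k + 1])) for k in range(len(n) - 1)]
telescoped = sum(y(n[k]) - y(n[k + 1]) for k in range(len(n) - 1))
```

The list `n` is strictly increasing, each `x_norms[k]` is below $2^{-(k+1)}$ (the list is 0-indexed), and the telescoping sum equals $y_{n_1}$ minus the last selected term, mirroring the display in the proof.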

The intermezzo is over.

Definition 31 If $f \in L^1(\mu)$ (given a measure space $(X,\mathcal{M},\mu)$), we set

\[ \|f\|_1 = \int_X |f|\,d\mu. \]

The following exercise gives some first properties of this object. It should bevery easy or trivial.

Exercise 37 Show that the “norm” we just defined has the following properties:

1. If f ∈ L1(µ), then ‖f‖1 = 0 if and only if f = 0 a.e.

2. ‖cf‖1 = |c|‖f‖1 for f ∈ L1(µ), c ∈ C.

3. ‖f + g‖1 ≤ ‖f‖1 + ‖g‖1 for all f, g ∈ L1(µ).

With $\|\cdot\|_1$, $L^1(\mu)$ is almost a normed space, except that $\|f\|_1 = 0$ only implies $f = 0$ a.e. (Of course, if the only null set is the empty set, as happens with counting measure, then we have a normed space.) We say $\|\cdot\|_1$ is only a semi-norm. The rigorous, but very cumbersome, thing to do is to form the quotient space $L^1(\mu)/N$, where $N = \{f \in L^1(\mu) : f = 0 \text{ a.e.}\}$. We do this, but with a bit of hand waving. From now on, when talking about measurable functions on a measure space, except if otherwise indicated (the escape clause), we identify functions which are equal a.e.

What this means is that when we say $f \in L^1(\mu)$, for example, we are really considering a whole equivalence class of functions; $f$ and every function differing from $f$ on at most a null set. We have to exercise some care, and only do things to $f$ which do not depend on the particular representative of the equivalence class. Thus if $f : X \to \mathbb{C}$ is measurable, in most cases it becomes a no-no to talk of the value of $f$ at a single point, except if the singleton set consisting of that point has positive measure. But $\int_X f\,d\mu$ always makes sense. So do notions like sums, differences and limits. For example if the sequence $\{f_n\}$ converges at every point $x \in X$ to $f(x)$, if $g_n = f_n$ a.e., then the sequence $\{g_n\}$ converges a.e. to $f$. The following concept is meaningless, however: Assume $f_x : X \to \mathbb{R}$ is measurable for each $x \in [0,1]$ and define $g : X \to \mathbb{R}$ by

\[ g(t) = \sup_{0\le x\le 1} f_x(t). \]

The function $g$ is a badly defined function and does not make sense in this theory, because if we change every $f_x$ on a null set, there is no guarantee that the end result will be a.e. equal to $g$, since we now have an uncountable family of null sets whose union might have positive measure.

The next Theorem is the main step in proving the completeness of the spaceswe just introduced.

Theorem 7.19 Let $(X,\mathcal{M},\mu)$ be a measure space, let $f_n \in L^1(\mu)$ for $n = 1, 2, \ldots$ and assume that

\[ \sum_{n=1}^{\infty} \|f_n\|_1 < \infty. \]

Then

1. The series

\[ \sum_{n=1}^{\infty} f_n \]

converges a.e.; i.e., there exists a null set $E \subset X$ such that for every $x \in X\setminus E$ the series of complex terms $\sum_{n=1}^{\infty} f_n(x)$ converges.

2. Setting

\[ g = \sum_{n=1}^{\infty} f_n; \]

i.e., defining the measurable function $g : X \to \mathbb{C}$ by

\[ g(x) = \sum_{n=1}^{\infty} f_n(x) \]

for the set of all $x$ for which the series converges, as any constant (0, for example) on the set where the series does not converge, one has

\[ \lim_{n\to\infty} \left\|g - \sum_{k=1}^{n} f_k\right\|_1 = 0. \]

Proof. We define a function $G : X \to [0,\infty]$ by

\[ G(x) = \sum_{n=1}^{\infty} |f_n(x)| \]

for $x \in X$. This makes perfectly good sense; of course $G(x) = \infty$ for a lot, maybe all, $x \in X$ is, in principle, possible. But only in principle, as we are about to see. First we observe the usual nonsense; $G$ is measurable and, being non-negative, its integral makes sense. By Beppo Levi’s Theorem,

\[ \int_X G\,d\mu = \sum_{n=1}^{\infty} \int_X |f_n|\,d\mu < \infty. \]

Thus $G$ is integrable; by Theorem 7.11 it is finite valued almost everywhere; that is

\[ G(x) = \sum_{n=1}^{\infty} |f_n(x)| < \infty \quad \text{a.e.} \]

In other words, the series $\sum_n f_n$ converges absolutely a.e., and since absolute convergence implies convergence, we conclude that the series

\[ \sum_{n=1}^{\infty} f_n(x) \]

converges a.e., showing that the formula

\[ g(x) = \sum_{n=1}^{\infty} f_n(x) \]

defines $g$ almost everywhere on $X$. Now

\[ g - \sum_{k=1}^{n} f_k = \sum_{k=n+1}^{\infty} f_k \]

(in the sense of a.e. convergence), thus $|g - \sum_{k=1}^{n} f_k| \le \sum_{k=n+1}^{\infty} |f_k|$ and hence, integrating,

\[ \left\|g - \sum_{k=1}^{n} f_k\right\|_1 \le \sum_{k=n+1}^{\infty} \|f_k\|_1. \]

Since $\sum_n \|f_n\|_1 < \infty$, the right hand side is the tail of a convergent series, and

\[ \lim_{n\to\infty} \left\|g - \sum_{k=1}^{n} f_k\right\|_1 = 0 \]

follows.

The completeness of $L^1(\mu)$ is now immediate. In fact, it follows at once from Lemma 7.18 and part 2 of Theorem 7.19. We state it as a Theorem for possible reference later on.

Theorem 7.20 The space $L^1(\mu)$ is a Banach space.

The next theorem is a useful corollary of Theorem 7.19 (and if one proves it in a direct way, one can then use it to deduce Theorem 7.20 from it).

Theorem 7.21 Let $(X,\mathcal{M},\mu)$ be a measure space and assume that the sequence $\{f_n\}$ converges in $L^1(\mu)$ to some element $f \in L^1(\mu)$. Then $\{f_n\}$ has a subsequence which converges a.e. to $f$.

Proof. The sequence $\{f_n\}$ is a Cauchy sequence in $L^1(\mu)$. Proceeding as in the proof of Lemma 7.18 we can find a subsequence $\{f_{n_k}\}$ such that

\[ \|f_{n_k} - f_{n_{k+1}}\|_1 < 2^{-k} \tag{10} \]

for $k = 1, 2, \ldots$. As a reminder, what we do is to define the sequence of positive integers $\{n_k\}$ with $1 \le n_1 < n_2 < n_3 < \cdots$ as follows. Because the sequence is Cauchy, for every $k \in \mathbb{N}$, there is $N_k \in \mathbb{N}$ such that $n, m \ge N_k$ implies $\|f_n - f_m\|_1 < 2^{-k}$. Now set $n_1 = N_1$ and assuming $n_k$ found for some $k \ge 1$, set $n_{k+1} = \max(N_{k+1}, n_k + 1)$. The fact that the sequence $\{n_k\}$ is strictly increasing is clear; so is the fact that (10) holds. Now define $g_1 = f_{n_1}$, $g_k = f_{n_k} - f_{n_{k-1}}$ if $k \ge 2$. Then

\[ \sum_{k=1}^{\infty} \|g_k\|_1 \le \|f_{n_1}\|_1 + \sum_{k=1}^{\infty} 2^{-k} = \|f_{n_1}\|_1 + 1 < \infty \]

and by Theorem 7.19 the series $\sum_k g_k$ converges a.e. Set $h(x) = \sum_{k=1}^{\infty} g_k(x)$ for every $x \in X$ at which the series converges. Since

\[ \sum_{j=1}^{k} g_j = f_{n_k} \]

we proved that the subsequence $\{f_{n_k}\}$ of $\{f_n\}$ converges a.e. to $h$. By part 2 of Theorem 7.19, the fact that $\sum_{k=1}^{\infty} \|g_k\|_1 < \infty$ implies that the series $\sum_{k=1}^{\infty} g_k$ converges to $h$ in $L^1(\mu)$; i.e., the sequence $\{f_{n_k}\}$ converges to $h$ also in the norm of $L^1(\mu)$. But as a subsequence of $\{f_n\}$ it converges to $f$ in $L^1(\mu)$. Thus $h = f$ (in the a.e. sense) and the theorem is proved.

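Exercise 38

1. Let $p \in \mathbb{R}$, $1 < p < \infty$ and let $p' = p/(p-1)$ so that

\[ \frac{1}{p} + \frac{1}{p'} = 1. \]

Let $\psi : (0,\infty) \to (0,\infty)$ be defined by

\[ \psi(s) = \frac{s^p}{p} + \frac{s^{-p'}}{p'}. \]

Show that the minimum value of $\psi$ is assumed for $s = 1$ and deduce the inequality

\[ 1 \le \frac{s^p}{p} + \frac{s^{-p'}}{p'} \]

for all $s > 0$.

2. Let $a > 0$, $b > 0$ and apply the inequality of the last point with $s = a/(ab)^{1/p}$. Conclude

\[ ab \le \frac{a^p}{p} + \frac{b^{p'}}{p'} \]

for all $a, b > 0$, hence for all $a, b \ge 0$ (since it is trivially true if $a$ or $b$ is 0).

3. Let $f \in L^p(\mu)$ and $g \in L^{p'}(\mu)$, and assume $\|f\|_p > 0$ and $\|g\|_{p'} > 0$ (otherwise $fg = 0$ a.e. and there is nothing to prove). By the previous point, we have

\[ \frac{|f(x)||g(x)|}{\|f\|_p\|g\|_{p'}} \le \frac{1}{p}\,\frac{|f(x)|^p}{\|f\|_p^p} + \frac{1}{p'}\,\frac{|g(x)|^{p'}}{\|g\|_{p'}^{p'}} \]

for all $x \in X$. Integrate to get

\[ \int_X |fg|\,d\mu \le \|f\|_p\|g\|_{p'}. \]

Conclude that you have proved the following theorem: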


Theorem 7.22 Let $f \in L^p(\mu)$, $g \in L^{p'}(\mu)$, where $1 < p < \infty$ and $1/p + 1/p' = 1$. Then $fg \in L^1(\mu)$ and

\[ \|fg\|_1 \le \|f\|_p\|g\|_{p'}. \tag{11} \]

Inequality (11) is known as Hölder’s inequality. Notice that if $p = 2$, then $p' = 2$, and Hölder’s inequality becomes Schwarz’ inequality:

\[ \int_X |fg|\,d\mu \le \|f\|_2\|g\|_2. \]

4. Let $f, g \in L^p(\mu)$, $1 < p < \infty$. It is easy to see that $f + g \in L^p(\mu)$, but we want to see a bit more. We write

\[ |f+g|^p = |f+g||f+g|^{p-1} \le |f||f+g|^{p-1} + |g||f+g|^{p-1}. \]

Use Hölder’s inequality on both terms of the right hand side of the inequality, noticing that

\[ \left\||h|^{p-1}\right\|_{p'} = \left(\int_X \left(|h|^{p-1}\right)^{p/(p-1)} d\mu\right)^{1/p'} = \|h\|_p^{p/p'}, \]

to get

\[ \|f+g\|_p^p \le \left(\|f\|_p + \|g\|_p\right)\|f+g\|_p^{p/p'}. \]

Conclude that $\|f+g\|_p \le \|f\|_p + \|g\|_p$.

5. Prove the following theorem:

Theorem 7.23 Let $1 < p < \infty$. Then $f \mapsto \|f\|_p$ defines a norm in $L^p(\mu)$. With this norm, $L^p(\mu)$ is a Banach space.
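Before working through the exercise, it can be reassuring to test the three inequalities (the numerical inequality of part 2, Hölder, and the triangle inequality of part 4) on a small example. The sketch below assumes a hypothetical measure on a four-point set given by point masses; `p_norm`, the data and the grid are illustrative choices, and floating-point arithmetic is used, so a small tolerance appears in the first check.

```python
def p_norm(values, weights, p):
    """(sum |v|**p * w)**(1/p): the L^p norm for a measure with point masses."""
    return sum(abs(v) ** p * w for v, w in zip(values, weights)) ** (1.0 / p)

p = 3.0
q = p / (p - 1.0)          # the conjugate exponent p' = p/(p-1)

# Hypothetical data on a four-point measure space.
weights = [0.5, 2.0, 1.0, 0.25]
f = [1.0, -2.0, 0.5, 4.0]
g = [3.0, 0.1, -1.5, 2.0]

# Part 2: ab <= a**p/p + b**q/q, checked on a small grid of a, b > 0.
grid = [0.1 * k for k in range(1, 40)]
young_ok = all(a * b <= a ** p / p + b ** q / q + 1e-12 for a in grid for b in grid)

# Hoelder: ||fg||_1 <= ||f||_p ||g||_q.
fg_norm = sum(abs(a * b) * w for a, b, w in zip(f, g, weights))
holder_gap = p_norm(f, weights, p) * p_norm(g, weights, q) - fg_norm

# Part 4: ||f+g||_p <= ||f||_p + ||g||_p.
sum_norm = p_norm([a + b for a, b in zip(f, g)], weights, p)
minkowski_gap = p_norm(f, weights, p) + p_norm(g, weights, p) - sum_norm
```

Both gaps come out non-negative for this data; of course a numerical check on one example proves nothing, it only guards against misreading the exponents.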

The scale of spaces $L^p(\mu)$ is extended to the case $p = \infty$ as follows. First we need to define the essential supremum of a function.

Definition 32 Let $(X,\mathcal{M},\mu)$ be a measure space and let $f : X \to [-\infty,\infty]$ be measurable. The essential supremum of $f$ is the extended real number $\operatorname{ess\,sup} f$ defined by

\[ \operatorname{ess\,sup} f = \inf\{\alpha \in [-\infty,\infty] : \mu(\{f > \alpha\}) = 0\}. \]

There are many equivalent definitions. For example, suppose $\beta = \operatorname{ess\,sup} f$. Notice that the set $\{f > \beta\}$ is a null set. In fact, if $\beta = \infty$, it is empty; otherwise it is the union of the null sets $\{f > \beta + 1/n\}$, $n = 1, 2, \ldots$. If we define a new function $g : X \to [-\infty,\infty]$ by $g(x) = f(x)$ if $f(x) \le \beta$, $g(x) = \beta$ if $f(x) > \beta$, it follows that $g = f$ a.e.; moreover $\sup g = \beta$. On the other hand, if $g = f$ a.e. it is clear that $\operatorname{ess\,sup} g = \operatorname{ess\,sup} f$. It is also clear that $\operatorname{ess\,sup} f \le \sup f$. Putting all this together one gets the following result, which we state in the form of an exercise.

Exercise 39 Let $f$ be an extended real valued measurable function on the measure space $X$. Show that

\[ \operatorname{ess\,sup} f = \inf\{\sup g : g : X \to [-\infty,\infty] \text{ measurable}, g = f \text{ a.e.}\}. \]

Definition 33 Let $(X,\mathcal{M},\mu)$ be a measure space and let $f : X \to \mathbb{C}$ be measurable. The $L^\infty$ norm of $f$ is defined by

\[ \|f\|_\infty = \operatorname{ess\,sup} |f|. \]

We say $f$ is in $L^\infty(\mu)$ iff $\|f\|_\infty < \infty$.
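For a measure carried by finitely many point masses the essential supremum can be computed directly, which makes the difference between $\sup f$ and $\operatorname{ess\,sup} f$ concrete. This is a sketch under that assumption only; the function `ess_sup` and the data are hypothetical, and the third point is deliberately given mass 0.

```python
import math

def ess_sup(values, masses):
    """Essential supremum for a measure on a finite set given by point masses.

    {f > alpha} is null exactly when every point with f > alpha has mass 0,
    so the infimum of such alpha is the largest value taken on a point of
    positive mass (or -infinity if the whole space is null).
    """
    charged = [v for v, m in zip(values, masses) if m > 0]
    return max(charged) if charged else -math.inf

# Hypothetical example: the third point is a null set and carries the largest value.
masses = [1.0, 0.5, 0.0, 2.0]
f = [3.0, -1.0, 100.0, 2.5]

beta = ess_sup(f, masses)
# The modified function g = min(f, beta) equals f a.e. and has sup g = beta.
g = [min(v, beta) for v in f]
differs_on = [i for i, (a, b) in enumerate(zip(f, g)) if a != b]
```

Here `g` is the truncated function described after Definition 32: it differs from `f` only on the null point, and its ordinary supremum equals the essential supremum of `f`.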


It is easy to see that $\|\cdot\|_\infty$ is indeed a norm, at least in the a.e. sense ($\|f\|_\infty = 0$ if and only if $f = 0$ a.e.). Convergence in this norm is essentially uniform convergence; it is uniform convergence if we modify all functions on a null set. If we recall the Introductory Analysis proof that a sequence of functions converges uniformly if and only if it is a uniform Cauchy sequence, we can adapt it at once to show that $L^\infty(\mu)$ is complete in this norm. In other words, we have the following theorem:

Theorem 7.24 Let (X,M, µ) be a measure space. Then L∞(µ) is a Banachspace when provided with the indicated norm.

The spaces $L^p(\mu)$, $1 \le p \le \infty$, are the basic objects of functional analysis. Due to their importance, we’ll visit them more than once in the future. The following result has a certain interest in itself; more importantly, it will be needed later on to get the density of continuous functions in $L^p$ spaces built using a Radon measure, $1 \le p < \infty$.

Theorem 7.25 Let $(X,\mathcal{M},\mu)$ be a measure space. Let $\Sigma(\mu)$ denote the space of all complex valued measurable simple functions $s$ on $X$ such that $\mu(\{x \in X : s(x) \ne 0\}) < \infty$. Then $\Sigma(\mu) \subset \bigcap_{1\le p<\infty} L^p(\mu)$ and for each $p \in [1,\infty)$, $\Sigma(\mu)$ is a dense subspace of $L^p(\mu)$; i.e., $\Sigma(\mu)$ is a vector subspace of $L^p(\mu)$ with the property that for every $f \in L^p(\mu)$, $\varepsilon > 0$, there exists $s \in \Sigma(\mu)$ such that $\|f - s\|_p < \varepsilon$.

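Proof. Let $p \in [1,\infty)$, let $f \in L^p(\mu)$. Write $f = u + iv$, with $u, v$ real valued. By Theorem 4.11, there exist sequences $\{s_n^+\}$, $\{s_n^-\}$, $\{t_n^+\}$, $\{t_n^-\}$ of measurable simple functions such that

\[ \begin{aligned} 0 \le s_1^+ \le s_2^+ \le \cdots \le u^+, &\quad \lim_{n\to\infty} s_n^+ = u^+, \\ 0 \le s_1^- \le s_2^- \le \cdots \le u^-, &\quad \lim_{n\to\infty} s_n^- = u^-, \\ 0 \le t_1^+ \le t_2^+ \le \cdots \le v^+, &\quad \lim_{n\to\infty} t_n^+ = v^+, \\ 0 \le t_1^- \le t_2^- \le \cdots \le v^-, &\quad \lim_{n\to\infty} t_n^- = v^-. \end{aligned} \]

Setting $\sigma_n = (s_n^+ - s_n^-) + i(t_n^+ - t_n^-)$, we have a sequence $\{\sigma_n\}$ of (complex valued) measurable simple functions, converging to $f$ and such that $|\sigma_n| \le |f|$ for all $n$. Since $\sigma_n$ takes only finitely many values and $|\sigma_n|^p \le |f|^p$ is integrable, the set where $\sigma_n \ne 0$ has finite measure; that is, $\sigma_n \in \Sigma(\mu)$. Then

\[ \lim_{n\to\infty} |\sigma_n(x) - f(x)|^p = 0 \]

for all $x \in X$, and

\[ |\sigma_n(x) - f(x)|^p \le (|\sigma_n(x)| + |f(x)|)^p \le 2^p|f(x)|^p \]

for all $x \in X$. Since $|f|^p \in L^1(\mu)$, Lebesgue’s dominated convergence theorem implies

\[ \lim_{n\to\infty} \|\sigma_n - f\|_p^p = \lim_{n\to\infty} \int_X |\sigma_n - f|^p\,d\mu = \int_X \lim_{n\to\infty} |\sigma_n - f|^p\,d\mu = 0. \]

Density has been proved.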

8 Outer Measures

Definition 34 Let $X$ be a set. An outer measure in $X$ is a map $\mu : \mathcal{P}(X) \to [0,\infty]$ such that

i. $\mu(\emptyset) = 0$.

ii. (Monotonicity) If $A \subset B \subset X$, then $\mu(A) \le \mu(B)$.

iii. (Sub-additivity) If $A_1, A_2, \ldots$ is a countable family of subsets of $X$, then

\[ \mu\left(\bigcup_{n=1}^{\infty} A_n\right) \le \sum_{n=1}^{\infty} \mu(A_n). \]

The following examples of outer measures are among the most important ones. To make a possible future reference to them easier, and to emphasize their importance, we state them in the form of theorems.

1. Outer measures coming from measures on an algebra.

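Theorem 8.1 Let $X$ be a set and let $\mathcal{A}$ be an algebra in $X$, and let $\mu$ be a measure on $\mathcal{A}$. If $A \subset X$ define

\[ \mu^*(A) = \inf\left\{\sum_{n=1}^{\infty} \mu(E_n) : \text{all sequences } \{E_n\}_{n=1}^{\infty} \text{ in } \mathcal{A} \text{ such that } A \subset \bigcup_{n=1}^{\infty} E_n\right\}. \tag{12} \]

Then $\mu^* : \mathcal{P}(X) \to [0,\infty]$ is an outer measure in $X$ which coincides with $\mu$ on $\mathcal{A}$; i.e., $\mu^*(E) = \mu(E)$ if $E \in \mathcal{A}$.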

Exercise 40 Prove Theorem 8.1.

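Exercise 41 Let $X$ be a set and let $\mathcal{A}$ be an algebra in $X$, and let $\mu$ be a measure on $\mathcal{A}$. Define $\mu^*$ by (12). Prove that one also has

\[ \mu^*(A) = \inf\left\{\sum_{n=1}^{\infty} \mu(E_n) : \text{all disjoint sequences } \{E_n\}_{n=1}^{\infty} \text{ in } \mathcal{A} \text{ such that } A \subset \bigcup_{n=1}^{\infty} E_n\right\}. \]

(A sequence of sets $\{E_n\}$ is disjoint iff $E_n \cap E_m = \emptyset$ for $n \ne m$.)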

2. Hausdorff measures. We’ll be pretty general at first, but get more concrete eventually. Let $X$ be a metric space, and let $d$ denote its distance function. If $E \subset X$, $E \ne \emptyset$, then we define the diameter of $E$ by

\[ \operatorname{diam}(E) = \sup\{d(x,y) : x, y \in E\}. \]

It is clear that a non-empty set $E$ is bounded if and only if $\operatorname{diam}(E) < \infty$, that $\operatorname{diam}(E) = 0$ if and only if $E$ is a singleton set and that if for $x \in X$, $r > 0$, we denote by $B(x,r) = \{y \in X : d(y,x) < r\}$, then $\operatorname{diam}(B(x,r)) \le 2r$ for all $r > 0$. Another simple property of the diameter is that if $A$ is a non-empty subset of $X$ and $\overline{A}$ denotes its closure, then $\operatorname{diam}(\overline{A}) = \operatorname{diam}(A)$.

Now let $\mathcal{F}$ be a family of subsets of $X$ and assume $\zeta$ is a non-negative extended real valued function defined on $\mathcal{F}$ such that the following properties hold:

H1. For every $\delta > 0$ there exists $E \in \mathcal{F}$ such that $\operatorname{diam}(E) < \delta$ and such that $\zeta(E) < \delta$.

H2. For every $\delta > 0$ there exist sets $E_1, E_2, \ldots \in \mathcal{F}$ such that $\operatorname{diam}(E_n) < \delta$ for all $n \in \mathbb{N}$ and $X = \bigcup_{n=1}^{\infty} E_n$.

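One thing to notice is how little one demands of the function $\zeta$ or of the family $\mathcal{F}$ for that matter. $\mathcal{F}$ can’t be empty because it has to contain a lot of small sets. Some small sets must have small $\zeta$ values, that is all. We can define now a whole slew of outer measures.

Definition 35 Let $\delta > 0$. If $A \subset X$ define

\[ \lambda_\delta(A) = \inf\left\{\sum_{n=1}^{\infty} \zeta(E_n) : \text{all sequences } \{E_n\}_{n=1}^{\infty} \text{ in } \mathcal{F} \text{ such that } \operatorname{diam}(E_n) < \delta\ \forall n \text{ and } A \subset \bigcup_{n=1}^{\infty} E_n\right\}. \tag{13} \]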

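It is easy to see that for every $A \subset X$, $\lambda_\delta(A) \ge \lambda_\eta(A)$ if $0 < \delta < \eta$. In fact, this is merely due to the inclusion

\[ \begin{aligned} &\left\{\sum_{n=1}^{\infty} \zeta(E_n) : E_n \in \mathcal{F},\ \operatorname{diam}(E_n) < \delta\ \forall n \in \mathbb{N},\ A \subset \bigcup_{n=1}^{\infty} E_n\right\} \\ &\qquad \subset \left\{\sum_{n=1}^{\infty} \zeta(E_n) : E_n \in \mathcal{F},\ \operatorname{diam}(E_n) < \eta\ \forall n \in \mathbb{N},\ A \subset \bigcup_{n=1}^{\infty} E_n\right\}. \end{aligned} \]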

It follows that

\[ \lambda(A) = \lim_{\delta\to 0} \lambda_\delta(A) = \sup_{\delta>0} \lambda_\delta(A) \tag{14} \]

exists. We verify that $\lambda$ is an outer measure and a bit more. For the bit more we recall (or state) that if $A, B$ are non-empty subsets of a metric space $X$, then one defines

\[ \operatorname{dist}(A,B) = \inf\{d(x,y) : x \in A, y \in B\}. \]

Exercise 42

1. If $A, B$ are non-empty subsets of $X$, prove that $\operatorname{dist}(A,B) > 0$ implies $A \cap B = \emptyset$.

2. Prove: If $A$ is a closed non-empty set and $B$ is a compact non-empty set, then $\operatorname{dist}(A,B) > 0$ if and only if $A \cap B = \emptyset$.

3. Show that it is possible to have two non-empty closed sets $A, B$ in a metric space such that $A \cap B = \emptyset$ but $\operatorname{dist}(A,B) = 0$.

Theorem 8.2 Let $X$ be a metric space, let $\mathcal{F} \subset \mathcal{P}(X)$ and $\zeta : \mathcal{F} \to [0,\infty]$ satisfy H1 and H2. Then the map $\lambda$ defined by (14), with $\lambda_\delta$ defined by (13), is an outer measure in $X$ which satisfies the following property: If $A, B$ are non-empty subsets of $X$ and $\operatorname{dist}(A,B) > 0$, then

\[ \lambda(A \cup B) = \lambda(A) + \lambda(B). \tag{15} \]

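Proof. Step 1. $\lambda_\delta$ is an outer measure for each $\delta > 0$. To prove this, let $\delta > 0$. We begin seeing that we have $\lambda_\delta(\emptyset) = 0$, which could be one of the harder things to prove before one can declare $\lambda_\delta$ an outer measure. I find the following argument sort of nasty, so if you have a better one please say so! First, let $\varepsilon > 0$, $\varepsilon \le \delta$. By H1 (applied with $\varepsilon/2^n$ in place of $\delta$), there exist sets $E_1, E_2, \ldots$ in $\mathcal{F}$ such that $\operatorname{diam}(E_n) < \varepsilon/2^n < \varepsilon \le \delta$ and $\zeta(E_n) < \varepsilon/2^n$ for each $n \in \mathbb{N}$. Then $\emptyset \subset \bigcup_{n=1}^{\infty} E_n$ and because all these sets have diameter less than $\delta$, we see that

\[ \lambda_\delta(\emptyset) \le \sum_{n=1}^{\infty} \zeta(E_n) \le \sum_{n=1}^{\infty} \frac{\varepsilon}{2^n} = \varepsilon. \]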

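It follows that $\lambda_\delta(\emptyset) = 0$. Assume now $A \subset B \subset X$. If $\{E_n\}$ is any family of sets in $\mathcal{F}$ of diameter $< \delta$ such that $B \subset \bigcup_n E_n$, then we also have $A \subset \bigcup_n E_n$ and $\lambda_\delta(A) \le \lambda_\delta(B)$ follows easily from this. Next, assume $A_n \subset X$ for $n = 1, 2, \ldots$ and set $A = \bigcup_n A_n$. The argument used to prove that

\[ \lambda_\delta(A) \le \sum_{n=1}^{\infty} \lambda_\delta(A_n) \tag{16} \]

is referred to as an $\varepsilon/2^n$ argument. First assume that one of $\lambda_\delta(A_n) = \infty$. Then (16) holds, so we may assume that $\lambda_\delta(A_n) < \infty$ for all $n$. Let $\varepsilon > 0$. Then there exists for each $n \in \mathbb{N}$ a sequence $\{E_{nk}\}_{k=1}^{\infty}$ of sets of $\mathcal{F}$ such that $\operatorname{diam}(E_{nk}) < \delta$ for each $k \in \mathbb{N}$, $A_n \subset \bigcup_k E_{nk}$, and

\[ \sum_{k=1}^{\infty} \zeta(E_{nk}) < \lambda_\delta(A_n) + \frac{\varepsilon}{2^n}. \]

The family $\{E_{nk}\}_{n,k=1}^{\infty}$ is a countable family of sets in $\mathcal{F}$, and could be arranged as a sequence, if so wished. Each one of these sets has diameter $< \delta$ and $A \subset \bigcup_{n,k} E_{nk}$. Thus

\[ \lambda_\delta(A) \le \sum_{n=1}^{\infty} \sum_{k=1}^{\infty} \zeta(E_{nk}) \le \sum_{n=1}^{\infty} \left(\lambda_\delta(A_n) + \frac{\varepsilon}{2^n}\right) = \sum_{n=1}^{\infty} \lambda_\delta(A_n) + \varepsilon. \]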

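Since $\varepsilon > 0$ is arbitrary, (16) follows. We proved that $\lambda_\delta$ is an outer measure. It is now immediate that $\lambda$ is also an outer measure, since $\lambda = \lim_{\delta\to 0} \lambda_\delta$.

The only not quite immediate thing may be subadditivity, so we’ll prove it. Assume $A_n \subset X$ for $n = 1, 2, \ldots$ and set $A = \bigcup_n A_n$. Because $\lambda_\delta(E) \le \lambda(E)$ for all $\delta > 0$, $E \subset X$, (16) implies

\[ \lambda_\delta(A) \le \sum_{n=1}^{\infty} \lambda_\delta(A_n) \le \sum_{n=1}^{\infty} \lambda(A_n) \]

for all $\delta > 0$. Now just let $\delta \to 0$.

Finally, let $A, B$ be non-empty subsets of $X$, and assume that $\operatorname{dist}(A,B) > 0$. We select $\delta$ such that $0 < \delta < \operatorname{dist}(A,B)$. If $\{E_n\}$ is a family of sets in $\mathcal{F}$ of diameter $< \delta$ such that $A \cup B \subset \bigcup_n E_n$, we can set $S_1 = \{n \in \mathbb{N} : E_n \cap A \ne \emptyset\}$ and $S_2 = \{n \in \mathbb{N} : E_n \cap B \ne \emptyset\}$. Clearly

\[ A \subset \bigcup_{n\in S_1} E_n, \qquad B \subset \bigcup_{n\in S_2} E_n, \]

and, because $\operatorname{dist}(A,B) > \delta$, no set of diameter $< \delta$ can meet both $A$ and $B$, so $S_1 \cap S_2 = \emptyset$. Thus

\[ \lambda_\delta(A) + \lambda_\delta(B) \le \sum_{n\in S_1} \zeta(E_n) + \sum_{n\in S_2} \zeta(E_n) = \sum_{n\in S_1\cup S_2} \zeta(E_n) \le \sum_{n=1}^{\infty} \zeta(E_n); \]

since $\{E_n\}$ was an arbitrary covering of $A \cup B$ by sets of $\mathcal{F}$ of diameter $< \delta$, we proved that $\lambda_\delta(A) + \lambda_\delta(B) \le \lambda_\delta(A \cup B)$. Letting $\delta \to 0$ gives $\lambda(A) + \lambda(B) \le \lambda(A \cup B)$. Since $\lambda(A \cup B) \le \lambda(A) + \lambda(B)$ because $\lambda$ is an outer measure, (15) is proved and we are done.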

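Let $s \ge 0$. If we take $\mathcal{F} = \mathcal{P}(X)$ and $\zeta(E) = \gamma_s \operatorname{diam}(E)^s$, where $\gamma_s$ is a constant depending on $s$ which plays a normalizing role, then $\lambda$, usually denoted by $H^s$, is Hausdorff measure of dimension $s$. (There are slightly different definitions, depending on the author; they all agree up to a constant factor if $X = \mathbb{R}^n$.)

3. Lebesgue Outer Measure. In this section, $n$ is a fixed positive integer. Let $\mathcal{E}_n$ be the family of all subsets of $\mathbb{R}^n$ of the form

\[ \prod_{i=1}^{n} (a_i, b_i] = \{x = (x_1, \ldots, x_n) \in \mathbb{R}^n : a_i < x_i \le b_i,\ i = 1, \ldots, n\}, \]

where $a_i \le b_i$ for $i = 1, \ldots, n$. If $E = \prod_{i=1}^{n} (a_i, b_i] \in \mathcal{E}_n$, we define its volume by

\[ V(E) = \prod_{i=1}^{n} (b_i - a_i). \]

If $A \subset \mathbb{R}^n$ we define

\[ m(A) = \inf\left\{\sum_{k=1}^{\infty} V(E_k) : \text{all sequences } \{E_k\}_{k=1}^{\infty} \text{ in } \mathcal{E}_n \text{ such that } A \subset \bigcup_{k=1}^{\infty} E_k\right\}. \tag{17} \]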

We then have the following theorem:

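Theorem 8.3 The map $m : \mathcal{P}(\mathbb{R}^n) \to [0,\infty]$ defined by (17) is an outer measure in $\mathbb{R}^n$ satisfying:

1. If $A, B \subset \mathbb{R}^n$ and $\operatorname{dist}(A,B) > 0$ (usual distance in $\mathbb{R}^n$), then $m(A \cup B) = m(A) + m(B)$.

2. $m(E) = V(E)$ for all $E \in \mathcal{E}_n$.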

Remarks. 1. This definition is not too dissimilar from the definition of the Hausdorff (and Hausdorff type) measures, so it isn’t surprising that part of the proof is essentially the same as that of Theorem 8.2.

2. The hardest part, by far, of this proof is to see that $m(E) = V(E)$ if $E \in \mathcal{E}_n$. It is, of course, obvious that $m(E) \le V(E)$ since we can cover $E$ by $E \subset E \cup \emptyset \cup \emptyset \cup \cdots$. But is it so obvious that one can’t improve over $V(E)$, that there isn’t some covering $E \subset \bigcup_n E_n$ for which $\sum_n V(E_n)$ is smaller than $V(E)$? I don’t think that there is any non-messy, half way elegant proof of this apparently simple fact with the means at our disposal at this point. Which is why we will postpone the full proof of Theorem 8.3 and merely prove the following version.

Theorem 8.4 The map $m : \mathcal{P}(\mathbb{R}^n) \to [0,\infty]$ defined by (17) is an outer measure in $\mathbb{R}^n$ satisfying: If $A, B \subset \mathbb{R}^n$ and $\operatorname{dist}(A,B) > 0$ (usual distance in $\mathbb{R}^n$), then $m(A \cup B) = m(A) + m(B)$.

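Proof. A lot is repetition of previous arguments. We have, as remarked, $m(E) \le V(E)$, in particular $m(\emptyset) \le V(\emptyset) = 0$. That $A \subset B$ implies $m(A) \le m(B)$ is an immediate consequence of the definition of $m$. Assume $A_k \subset \mathbb{R}^n$ for $k = 1, 2, \ldots$; let $A = \bigcup_{k=1}^{\infty} A_k$. As for the Hausdorff type measures, if $m(A_k) = \infty$ for some $k \in \mathbb{N}$, then

\[ m(A) \le \sum_{k=1}^{\infty} m(A_k) \]

is clearly true. Otherwise, let $\varepsilon > 0$; there exist sets $E_{kj} \in \mathcal{E}_n$ for $k, j \in \mathbb{N}$ such that

\[ A_k \subset \bigcup_{j=1}^{\infty} E_{kj}, \qquad \sum_{j=1}^{\infty} V(E_{kj}) < m(A_k) + \frac{\varepsilon}{2^k} \]

for $k = 1, 2, \ldots$. Then $A \subset \bigcup_{k,j} E_{kj}$ and it follows that

\[ m(A) \le \sum_{k=1}^{\infty} \sum_{j=1}^{\infty} V(E_{kj}) \le \sum_{k=1}^{\infty} \left(m(A_k) + \frac{\varepsilon}{2^k}\right) = \sum_{k=1}^{\infty} m(A_k) + \varepsilon. \]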

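It follows that $m(A) \le \sum_{k=1}^{\infty} m(A_k)$. This proves that $m$ is an outer measure in $\mathbb{R}^n$. Now let $A, B \subset \mathbb{R}^n$ and assume that $\operatorname{dist}(A,B) > 0$. To reduce the proof of this result to the same argument used in the proof of Theorem 8.2, we notice that if $E = \prod_{i=1}^{n} (a_i, b_i] \in \mathcal{E}_n$, if $K \in \mathbb{N}$, if we divide each interval $(a_i, b_i]$ into $K$ equal subintervals, we decompose $E$ into $K^n$ disjoint subintervals $E_{k_1,\ldots,k_n} \in \mathcal{E}_n$. To be specific, set $\delta_i = (b_i - a_i)/K$ for $i = 1, \ldots, n$ and if $1 \le k_1, \ldots, k_n \le K$, set

\[ E_{k_1,\ldots,k_n} = \prod_{i=1}^{n} (a_i + (k_i - 1)\delta_i,\ a_i + k_i\delta_i]. \]

Then

\[ V(E_{k_1,\ldots,k_n}) = \delta_1 \cdots \delta_n = (b_1 - a_1) \cdots (b_n - a_n) K^{-n} \]

and it is clear that we have

\[ V(E) = \prod_{i=1}^{n} (b_i - a_i) = K^n \prod_{i=1}^{n} \delta_i = \sum_{k_1=1}^{K} \cdots \sum_{k_n=1}^{K} V(E_{k_1,\ldots,k_n}). \]

Now $\operatorname{diam}(E_{k_1,\ldots,k_n}) = (\delta_1^2 + \cdots + \delta_n^2)^{1/2}$, which can be made as small as we wish by taking $K$ large, and these computations prove that we have for every $\delta > 0$, $A \subset \mathbb{R}^n$,

\[ m(A) = \inf\left\{\sum_{k=1}^{\infty} V(E_k) : E_k \in \mathcal{E}_n,\ \operatorname{diam}(E_k) < \delta\ \forall k \in \mathbb{N},\ A \subset \bigcup_{k=1}^{\infty} E_k\right\}. \]

Now we can proceed exactly as in the proof of Theorem 8.2.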
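The subdivision step of this proof is easy to mimic with exact arithmetic. The sketch below assumes a hypothetical box in $\mathbb{R}^3$ with rational endpoints and represents a box as a list of pairs $(a_i, b_i)$ standing for $(a_i, b_i]$; the helper names `volume`, `diameter_squared` and `subdivide` are illustrative only.

```python
from fractions import Fraction
from itertools import product

def volume(box):
    """V(E) for a box given as a list of half-open intervals (a_i, b_i]."""
    result = Fraction(1)
    for a, b in box:
        result *= b - a
    return result

def diameter_squared(box):
    """Square of the Euclidean diameter, the sum of squared side lengths."""
    return sum((b - a) ** 2 for a, b in box)

def subdivide(box, K):
    """Split each side into K equal half-open pieces, giving K**n sub-boxes."""
    sides = []
    for a, b in box:
        step = (b - a) / K
        sides.append([(a + (k - 1) * step, a + k * step) for k in range(1, K + 1)])
    return [list(cell) for cell in product(*sides)]

# Hypothetical box in R^3 with rational endpoints.
E = [(Fraction(0), Fraction(2)), (Fraction(-1), Fraction(1, 2)), (Fraction(1), Fraction(4))]
pieces = subdivide(E, 4)
total_volume = sum(volume(piece) for piece in pieces)
```

With $K = 4$ there are $4^3$ pieces, their volumes add up to $V(E)$, and each has diameter one quarter of that of $E$; increasing $K$ pushes the diameters below any prescribed $\delta$ without changing the total volume, which is exactly what the proof needs.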

Remark. In proving Theorem 8.4 we proved a very, very particular case of the following result: Let $E, E_1, \ldots, E_r \in \mathcal{E}_n$ and assume that $E_k \cap E_j = \emptyset$ if $k \ne j$ and that $E = \bigcup_{k=1}^{r} E_k$. Then

\[ V(E) = \sum_{k=1}^{r} V(E_k). \]

We assumed the pairwise disjoint elements of $\mathcal{E}_n$ which joined to form the element $E \in \mathcal{E}_n$ came from a partition of the “sides” of $E$. If this is not so, for example if the partition looks like the one portrayed in the picture below for $n = 2$, the proof is considerably harder. I suggest as a challenge that you prove the result just stated merely for $n = 2$. It may make you appreciate more the simple proof we’ll give later on when our artillery has a few more big guns.

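[Figure: a rectangle $E$ decomposed into six subrectangles $E_1, \ldots, E_6$ that do not come from a partition of its sides.]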

This ends our examples section; we continue now developing the concept of outer measure.

Definition 36 Let $\mu$ be an outer measure in the set $X$. A subset $E$ of $X$ is said to be $\mu$-measurable, or simply measurable, iff it satisfies the following condition, called the Carathéodory condition: The equality

(18) µ(A) = µ(A ∩ E) + µ(A ∩ Ec)

holds for all subsets A of X.

We notice that because µ is subadditive, to see that (18) holds for all A ⊂ X,it suffices to prove that

µ(A) ≥ µ(A ∩ E) + µ(A ∩ Ec)

holds for all A ⊂ X; the converse inequality being always true.

Theorem 8.5 Let $\mu$ be an outer measure in the set $X$ and let $\mathcal{M}$ denote the family of all $\mu$-measurable sets. Then $\mathcal{M}$ is a $\sigma$-algebra in $X$ and the restriction of $\mu$ to $\mathcal{M}$ is a measure. The measure space $(X, \mathcal{M}, \mu|_{\mathcal{M}})$ is complete; in fact $\mathcal{M}$ contains all subsets $E$ of $X$ such that $\mu(E) = 0$. Moreover,

\[ \mu\left(\bigcup_{n=1}^{\infty} A \cap E_n\right) = \sum_{n=1}^{\infty} \mu(A \cap E_n) \]

holds for all disjoint sequences $\{E_n\}$ in $\mathcal{M}$ and for all $A \subset X$.

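Proof. The definition of $E \in \mathcal{M}$ is symmetric in $E$, $E^c$, thus $E \in \mathcal{M}$ implies $E^c \in \mathcal{M}$. Because $\mu(\emptyset) = 0$, it is also clear that $\emptyset$ (and $X$) are in $\mathcal{M}$. Assume $E, F \in \mathcal{M}$. Then for every $A \subset X$ one gets, using first that $E \in \mathcal{M}$, then that $F \in \mathcal{M}$,

\[ \mu(A) = \mu(A \cap E \cap F) + \mu(A \cap E \cap F^c) + \mu(A \cap E^c \cap F) + \mu(A \cap E^c \cap F^c). \]

Now the last term on the right hand side satisfies $\mu(A \cap E^c \cap F^c) = \mu(A \cap (E \cup F)^c)$ while the first three terms add up to $\mu(A \cap (E \cup F))$. To see this, replace $A$ by $A \cap (E \cup F)$ in this equality and use that

\[ \begin{aligned} A \cap (E \cup F) \cap E \cap F &= A \cap E \cap F, \\ A \cap (E \cup F) \cap E \cap F^c &= A \cap E \cap F^c, \\ A \cap (E \cup F) \cap E^c \cap F &= A \cap E^c \cap F, \\ A \cap (E \cup F) \cap (E \cup F)^c &= \emptyset. \end{aligned} \]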


Thus

µ(A) = µ(A ∩ (E ∪ F )) + µ(A ∩ (E ∪ F )c),

proving E ∪ F ∈ M. By induction, all finite unions of sets in M are in M. Now let {En} be a disjoint sequence of sets in M. If A ⊂ X, then (because E2 ∈ M)

µ(A ∩ (E1 ∪ E2)) = µ(A ∩ (E1 ∪ E2) ∩ E2) + µ(A ∩ (E1 ∪ E2) ∩ Ec2).

Since E1 ∩ E2 = ∅ this works out to

µ(A ∩ (E1 ∪ E2)) = µ(A ∩ E1) + µ(A ∩ E2).

By induction

$$\mu\Bigl(\bigcup_{n=1}^{m} (A \cap E_n)\Bigr) = \sum_{n=1}^{m} \mu(A \cap E_n)$$

for all m ∈ N. Let $F_m = \bigcup_{n=1}^{m} E_n$; then (as proved) Fm ∈ M and the last equality above can be written in the form

$$\mu(A \cap F_m) = \sum_{n=1}^{m} \mu(A \cap E_n).$$

Let also $F = \bigcup_{n=1}^{\infty} E_n$ and notice that Fm ⊂ F, hence $F^c \subset F_m^c$ for all m ∈ N. We have (because Fm ∈ M)

$$\mu(A) = \mu(A \cap F_m) + \mu(A \cap F_m^c) \ge \mu(A \cap F_m) + \mu(A \cap F^c) = \sum_{n=1}^{m} \mu(A \cap E_n) + \mu(A \cap F^c).$$

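Since this holds for all m ∈ N, we proved

$$\mu(A) \ge \sum_{n=1}^{\infty} \mu(A \cap E_n) + \mu(A \cap F^c). \tag{19}$$

We can replace A in (19) by A ∩ F to get

$$\mu(A \cap F) \ge \sum_{n=1}^{\infty} \mu(A \cap E_n)$$

since En ∩ F = En for all n and F ∩ F c = ∅. By the subadditivity of µ,

$$\mu(A \cap F) = \mu\Bigl(\bigcup_{n=1}^{\infty} (A \cap E_n)\Bigr) \le \sum_{n=1}^{\infty} \mu(A \cap E_n)$$

so that we proved

$$\mu(A \cap F) = \sum_{n=1}^{\infty} \mu(A \cap E_n). \tag{20}$$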

This proves the last statement of the theorem; using it in (19) gives

µ(A) ≥ µ(A ∩ F ) + µ(A ∩ F c)

which proves F ∈ M. So far we proved that M is an algebra which is closed under disjoint countable unions. However, since every countable union in an algebra can be written as a disjoint union of sets in the algebra (i.e., as a union of pairwise disjoint sets in the algebra), we are done proving that M is a σ-algebra. Notice that (20) is a stronger statement than simply saying that µ is a measure when restricted to M; (20) with A = X suffices to see we have a measure. Finally, assume E ⊂ X and µ(E) = 0. For every A ⊂ X we’ll have µ(A ∩ E) ≤ µ(E) = 0; i.e., µ(A ∩ E) = 0. Thus µ(A) ≥ µ(A ∩ Ec) = µ(A ∩ E) + µ(A ∩ Ec), proving E ∈ M.
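The step "every countable union in an algebra can be written as a disjoint union of sets in the algebra" is the usual trick of replacing An by Bn = An minus (A1 ∪ · · · ∪ An−1). Here is a minimal sketch of that trick for finitely many finite sets; the function name `disjointify` and the sample sets are mine, chosen only for illustration.

```python
def disjointify(sets):
    """Return B_1, B_2, ... with B_n = A_n minus (A_1 ∪ ... ∪ A_{n-1})."""
    seen = set()   # union of the sets processed so far
    out = []
    for a in sets:
        a = set(a)
        out.append(a - seen)  # keep only the points not seen before
        seen |= a
    return out

A = [{1, 2, 3}, {2, 3, 4}, {1, 5}, {4}]
B = disjointify(A)
print(B)
```

The sets Bn are pairwise disjoint, Bn ⊂ An, and the union of the Bn's is the union of the An's; since each Bn is obtained from finitely many An's by unions and differences, it stays in the algebra.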

Of course, in general it can be hard to decide which sets are measurable. Or it can be easy, because only ∅, X are measurable:

Exercise 43 Define µ : P(N) → [0,∞] by µ(∅) = 0, µ(A) = max A if A is a bounded, nonempty set of natural numbers, µ(A) = ∞ if A is an unbounded set of natural numbers. Show that µ is an outer measure. Show that the only µ-measurable sets are ∅ and N.
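On a small finite set the Caratheodory condition (18) can be checked by brute force over all subsets. The sketch below does this for a finite analogue of the exercise (X = {1, 2, 3, 4} with µ(A) = max A) and, for contrast, for counting measure. It is only an illustration under these assumptions, not a solution of the exercise; the names `subsets`, `measurable_sets` and `max_measure` are mine.

```python
from itertools import combinations

def subsets(X):
    """All subsets of the finite set X, as frozensets."""
    xs = sorted(X)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def measurable_sets(X, mu):
    """Sets E with mu(A) = mu(A ∩ E) + mu(A ∩ E^c) for every A ⊂ X."""
    all_sets = subsets(frozenset(X))
    return [E for E in all_sets
            if all(mu(A) == mu(A & E) + mu(A - E) for A in all_sets)]

def max_measure(A):
    """Finite analogue of the outer measure of the exercise."""
    return max(A) if A else 0

X = {1, 2, 3, 4}
print(measurable_sets(X, max_measure))   # only the empty set and X survive
print(len(measurable_sets(X, len)))      # counting measure: every subset is measurable
```

For the max outer measure, testing the condition with A = X already rules out every nonempty proper subset E, because one of the two pieces contains max X and the other contributes at least 1.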

It is thus useful to have the following result in the case of metric spaces.

Theorem 8.6 Let X be a metric space and assume µ is an outer measure in X satisfying: If dist(A,B) > 0, then µ(A ∪ B) = µ(A) + µ(B). Then all Borel sets are measurable.

Proof. Let B = B(X) denote the σ-algebra of Borel sets of X. Because the measurable sets form a σ-algebra, it suffices to prove that all open subsets of X are measurable; i.e., that we have: If U is an open subset of X and A ⊂ X, then

µ(A) ≥ µ(A ∩ U) + µ(A\U).

So let U be open in X, let A be a subset of X. We can of course assume that A\U ≠ ∅, otherwise there is nothing to prove; in particular, U ≠ X. (We don’t really need to make this assumption, it just avoids some trivialities. Without it, the sets Fε defined below are all equal to U = X and the sets En satisfy E0 = A, En = ∅ if n ≥ 1. Let’s carry on.) We can also assume, and we will assume this, that µ(A) < ∞. Otherwise the inequality to prove is trivially true.

If the distance between A ∩ U and A\U were positive, we would be done, but there is hardly a chance of this happening except in some very trivial situations. So we have to break things up. The argument we use is actually quite natural. A first approach is to define, for ε > 0,

Fε = {x ∈ X : dist(x,X\U) ≥ ε}.

Then dist((A ∩ Fε), A\U) ≥ ε > 0 and we get, since A ⊃ (A ∩ Fε) ∪A\U ,

µ(A) ≥ µ ((A ∩ Fε) ∪A\U) = µ(A ∩ Fε) + µ(A\U).

If we could just prove that $\lim_{\varepsilon\to 0}\mu(A\cap F_\varepsilon) = \mu(A\cap U)$, we would be done. If we remember how this was proved for measures (Theorem 6.1) we might want to rewrite the sequence $\{F_{1/n}\}_{n=1}^{\infty}$ as a sequence of pairwise disjoint sets and use a similar argument to the one used for measures. Mimicking that proof, we introduce $E_0 = F_1 \cap A$,

$$E_n = (F_{1/(n+1)} \setminus F_{1/n}) \cap A, \qquad n = 1, 2, \ldots;$$

these sets are all pairwise disjoint and their union is U ∩ A (because U is open, every element x ∈ U is in F1/n for some n ∈ N). The problem is that dist(En, En+1) is not necessarily positive (in fact, most likely it is 0), so we can’t conclude that µ of a union of these sets is the sum of µ of the sets, as we do in the proof of Theorem 6.1. But we do have: If k, n ∈ N ∪ {0} and |n − k| ≥ 2, then dist(En, Ek) > 0. In fact, let us assume (as we may) that 0 ≤ k ≤ n − 2. Let x ∈ Ek, y ∈ En. Then $x \in F_{1/(k+1)}$, so that dist(x, X\U) ≥ 1/(k + 1), while $y \notin F_{1/n}$, so that dist(y, X\U) < 1/n. There exists thus z ∈ X\U such that d(y, z) < 1/n, hence

$$\frac{1}{k+1} \le d(x,z) \le d(x,y) + d(y,z) < d(x,y) + \frac{1}{n},$$

thus

$$d(x,y) \ge \frac{1}{k+1} - \frac{1}{n} = \frac{n-k-1}{(k+1)n} \ge \frac{1}{(k+1)n} > 0.$$
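To see the estimate at work, take X = R and U = A = (0, 1), an example of my own choosing. Then F1/n = [1/n, 1 − 1/n] and, for n ≥ 2, En = [1/(n+1), 1/n) ∪ (1 − 1/n, 1 − 1/(n+1)]. The sketch below computes the distances between these sets exactly with rational arithmetic; the helper names `pieces` and `dist` are hypothetical, and the restriction to n ≥ 2 just avoids the degenerate sets E0 and E1.

```python
from fractions import Fraction

def pieces(n):
    """Closures of the two intervals making up E_n for U = A = (0, 1), n >= 2."""
    lo, hi = Fraction(1, n + 1), Fraction(1, n)
    return [(lo, hi), (1 - hi, 1 - lo)]

def dist(P, Q):
    """Distance between two finite unions of closed intervals."""
    return min(max(Fraction(0), c - b, a - d) for (a, b) in P for (c, d) in Q)

# Neighbouring sets touch, sets two or more steps apart do not.
print(dist(pieces(3), pieces(4)))   # distance between E_3 and E_4
print(dist(pieces(2), pieces(5)))   # distance between E_2 and E_5
```

In this example the distance between Ek and En, for 2 ≤ k ≤ n − 2, is exactly 1/(k+1) − 1/n, so the lower bound obtained above cannot be improved in general, while dist(En, En+1) = 0.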

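This suggests the following approach: Let

$$B = \bigcup_{n\ge 0,\ n\ \mathrm{even}} E_n, \qquad C = \bigcup_{n\ge 0,\ n\ \mathrm{odd}} E_n.$$

Then A ∩ U = B ∪ C. If we set also

$$B_N = \bigcup_{0\le n\le N,\ n\ \mathrm{even}} E_n, \qquad C_N = \bigcup_{0\le n\le N,\ n\ \mathrm{odd}} E_n$$

for N ∈ N then we have by induction on N and our remarks above,

$$\mu(B_N) = \sum_{0\le n\le N,\ n\ \mathrm{even}} \mu(E_n), \qquad \mu(C_N) = \sum_{0\le n\le N,\ n\ \mathrm{odd}} \mu(E_n).$$

It is now easy to prove that

$$\mu(B) = \sum_{n\ge 0,\ n\ \mathrm{even}} \mu(E_n), \qquad \mu(C) = \sum_{n\ge 0,\ n\ \mathrm{odd}} \mu(E_n). \tag{21}$$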

In fact, because µ is an outer measure (hence countably subadditive), we get that the left hand sides in (21) are dominated by the right hand sides. But BN ⊂ B so that we also have

$$\sum_{0\le n\le N,\ n\ \mathrm{even}} \mu(E_n) = \mu(B_N) \le \mu(B)$$

for all N; similarly for C. This implies that the left hand sides in (21) dominate the right hand sides.

We are getting there! Because B ⊂ A, C ⊂ A and µ(A) < ∞, we have that

$$\sum_{n=0}^{\infty} \mu(E_n) = \mu(B) + \mu(C) \le 2\mu(A) < \infty,$$

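hence given ε > 0 there is N ∈ N such that

$$\sum_{n=N}^{\infty} \mu(E_n) < \varepsilon.$$

But then

$$(A \cap U)\setminus F_{1/N} = \bigcup_{n=N}^{\infty} E_n$$

implies

$$\mu((A \cap U)\setminus F_{1/N}) \le \sum_{n=N}^{\infty} \mu(E_n) < \varepsilon$$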


and it follows that

µ(A ∩ U) ≤ µ(A ∩ F1/N ) + µ((A ∩ U)\F1/N ) < µ(A ∩ F1/N ) + ε.

Let us recapitulate what we proved. We proved (that’s all that we need) that given ε > 0, there is N ∈ N such that µ(A ∩ U) < µ(A ∩ F1/N) + ε. Since by now we may have forgotten how all of this came up, we repeat some of the arguments as we finish the proof. Because dist(A ∩ F1/N, A\U) > 0, we get from the fact that µ is an outer measure with the special property of being additive on sets at positive distance from each other,

µ(A) ≥ µ((A ∩ F1/N) ∪ (A\U)) = µ(A ∩ F1/N) + µ(A\U) > µ(A ∩ U) + µ(A\U) − ε.

Since ε > 0 is arbitrary, we are done.

It follows that all the Hausdorff style measures are Borel measures. By Theorem 8.4, the measure m on Rn is also a Borel measure. By this we mean simply that their restrictions to the σ-algebra of measurable sets are measures and that this σ-algebra contains the Borel sets. We use the same notation to denote the outer measure and the corresponding measure; one is just a restriction of the other.

But what happens if we begin with a measure µ on an algebra (possibly a σ-algebra) A and extend it to an outer measure µ∗ by (12)? The answer, at least in the σ-finite case, is that the family M of measurable sets of µ∗ contains the σ-algebra σ(A) generated by A, and the measure space (X, M, µ∗) is exactly the measure theoretic completion (in the sense of Theorem 6.3) of (X, σ(A), µ∗). We explore all this in the next section.

9 Extension of Measures

The main result of this section is the following theorem.

Theorem 9.1 Let A be an algebra in the set X and let µ : A → [0,∞] be a measure on A. Define the outer measure µ∗ : P(X) → [0,∞] as in Definition 12 and let M be the σ-algebra of µ∗-measurable sets provided by Theorem 8.5. Then the following results hold:

1. A ⊂M, hence also σ(A) ⊂M.

2. If µ is σ-finite and if ν is a measure on σ(A) such that ν(E) = µ(E) for all E ∈ A, then ν(E) = µ∗(E) for all E ∈ σ(A).

3. If µ is σ-finite, the measure space (X, M, µ∗|M) is the completion of the measure space (X, σ(A), µ∗|σ(A)).

As stated, the theorem is a bit cumbersome and technical; it has the following corollary which is quite adequate for most applications.

Corollary 9.2 Let A be an algebra in the set X and let µ : A → [0,∞] be a measure on A. There exists a measure $\bar{\mu}$ defined on the σ-algebra σ(A) generated by A such that $\bar{\mu}(E) = \mu(E)$ if E ∈ A. If the measure µ is σ-finite, then the extension $\bar{\mu}$ to σ(A) is unique.

It should be clear that Theorem 9.1 implies Corollary 9.2.

The proof of Theorem 9.1 will be done in a series of steps. Let us begin by seeing that A ⊂ M. So let E ∈ A and let A ⊂ X. We have to see that the Caratheodory condition is satisfied; i.e., that

µ∗(A) ≥ µ∗(A ∩ E) + µ∗(A ∩ Ec)


(the converse inequality is trivially satisfied). Since the inequality is clear if µ∗(A) = ∞, assume µ∗(A) < ∞. Let ε > 0. By the definition of µ∗(A) as an infimum, there exist sets E1, E2, . . . ∈ A such that $A \subset \bigcup_{n=1}^{\infty} E_n$ and

$$\mu^*(A) + \varepsilon > \sum_{n=1}^{\infty} \mu(E_n).$$

Now $A \cap E \subset \bigcup_{n=1}^{\infty} (E_n \cap E)$ and since En ∩ E ∈ A for all n, the definition of µ∗ gives

$$\mu^*(A \cap E) \le \sum_{n=1}^{\infty} \mu(E_n \cap E).$$

Similarly,

$$\mu^*(A \cap E^c) \le \sum_{n=1}^{\infty} \mu(E_n \cap E^c).$$

But µ is a measure so that µ(En ∩ E) + µ(En ∩ Ec) = µ(En) for all n ∈ N and it follows that

$$\mu^*(A) + \varepsilon > \sum_{n=1}^{\infty} \bigl(\mu(E_n \cap E) + \mu(E_n \cap E^c)\bigr) \ge \mu^*(A \cap E) + \mu^*(A \cap E^c).$$

Since ε > 0 is arbitrary, the Caratheodory condition is proved. We conclude that A ⊂ M, hence σ(A) ⊂ M. We have successfully extended our measure on an algebra to a σ-algebra. The uniqueness part is a bit more difficult. From now on, we assume that µ is σ-finite. There is an obvious way of proving uniqueness. Assume there is another measure ν : σ(A) → [0,∞] such that ν|A = µ. The obvious thing to do is to show that the family of all sets on which the measures coincide is a σ-algebra; i.e., let E = {E ∈ σ(A) : ν(E) = µ∗(E)}. If we prove that E is a σ-algebra we are done; in fact, since E ⊃ A it must contain σ(A), hence coincide with σ(A). Unfortunately, this obvious way seems to have some serious execution problems. We use somewhat of a detour to avoid these problems.

Definition 37 A family E of subsets of X is said to be a monotone class (in X) iff whenever {En} is a monotone sequence of sets in E, then $\lim_{n\to\infty} E_n$ belongs to E. (A sequence of sets {En} is monotone if either E1 ⊂ E2 ⊂ · · · or E1 ⊃ E2 ⊃ · · · .) Recall that if E1 ⊂ E2 ⊂ · · · , then $\lim_{n\to\infty} E_n = \bigcup_{n=1}^{\infty} E_n$; if E1 ⊃ E2 ⊃ · · · , then $\lim_{n\to\infty} E_n = \bigcap_{n=1}^{\infty} E_n$.

It should be clear and trivial that σ-algebras are monotone classes. We’ll need a converse relation. But first a definition.

Definition 38 Let C be a family of subsets of X. The monotone class generatedby C is defined by

mon(C) = ⋂{E : E is a monotone class and C ⊂ E}.

In other words mon(C) is characterized by the following properties

1. mon(C) is a monotone class.

2. C ⊂ mon(C).

3. If E is a monotone class and C ⊂ E then mon(C) ⊂ E.


Lemma 9.3 Let A be an algebra in X. Then mon(A) = σ(A).

Proof. Let E = mon(A) and let Σ = σ(A). Since σ-algebras are monotone classes, it is clear that E ⊂ Σ. To get the converse inclusion we begin showing that if A ∈ E, then Ac ∈ E and A, B ∈ E implies A ∪ B ∈ E. (As it turns out, that’s really all that needs to be proved.) Showing that something is in this generated monotone class causes difficulties similar to proving things are in σ-algebras; similar techniques are needed. We begin showing E ∈ E implies Ec in E. For this we introduce a new family of sets, say

D = {E ∈ E : Ec ∈ E}.

Because A ⊂ E and A is an algebra, it is clear that A ⊂ D. We also see that D is a monotone class; in fact let {En} be a monotone sequence in D. It is easily verified that in this case $\{E_n^c\}$ is also monotone and

$$\Bigl(\lim_{n\to\infty} E_n\Bigr)^c = \lim_{n\to\infty} E_n^c \in \mathcal{E}.$$

It follows that D is a monotone class containing A, hence E ⊂ D, hence D = E. This means that if E ∈ E, then E ∈ D, hence Ec ∈ E. To prove that A, B ∈ E implies A ∪ B ∈ E we define more families of sets. First of all, for an arbitrary A ∈ E set

DA = {B ∈ E : A ∪ B ∈ E}.

It is easy to see that DA is always a monotone class (a priori perhaps a very small one). This is simply due to the fact that E is a monotone class and if {En} is a monotone sequence, then so is {A ∪ En} with limn→∞(A ∪ En) = A ∪ limn→∞ En.

Assume now A ∈ A. Then B ∈ A implies A ∪ B ∈ A ⊂ E so that B ∈ DA. It follows that A ⊂ DA; hence also E ⊂ DA, hence E = DA. We proved that DA = E for all A ∈ A. Let now

F = {A ∈ E : DA = E}.

We just proved F ⊃ A; let us see it is a monotone class. For this, assume {An} is a monotone sequence of sets in F; then An ∈ E and DAn = E for all n. If B ∈ E, the fact that DAn = E implies An ∪ B ∈ E; since {An ∪ B} is monotone we conclude that limn(An ∪ B) = (limn An) ∪ B ∈ E. Since B ∈ E is arbitrary, we just proved that $\mathcal{D}_{\lim_n A_n} = \mathcal{E}$; i.e., limn An ∈ F. Thus F is a monotone class containing A, hence F = E. It follows that if A, B ∈ E, then DA = E, hence B ∈ DA, hence A ∪ B ∈ E.

All that remains to be seen is that if An ∈ E for all n ∈ N, then $\bigcup_n A_n$ is in E. By induction, we have

$$B_m = \bigcup_{n=1}^{m} A_n \in \mathcal{E}$$

for all m ∈ N; since {Bm} is monotone and $\lim_m B_m = \bigcup_n A_n$, we proved that $\bigcup_n A_n$ is in E.

We are ready to tackle the uniqueness of the extended measure. Recalling that we are assuming that µ is σ-finite, let ν be a measure on σ(A) agreeing with µ on A. To see that ν must agree with µ∗ on σ(A), we introduce the family of sets mentioned before, namely

E = {A ∈ σ(A) : ν(A) = µ∗(A)}.

As mentioned above, it seems to be hard to prove that this family is a σ-algebra by a direct approach. However, it is not so hard to prove it is a monotone class. We do this in two steps. First we assume that µ(X) < ∞. In this case it is very easy to see E is a monotone class. Notice that because X ∈ A we also have ν(X) = µ(X) < ∞. If {En} is a monotone sequence of sets in E then

$$\mu^*\bigl(\lim_{n\to\infty} E_n\bigr) = \lim_{n\to\infty} \mu^*(E_n) = \lim_{n\to\infty} \nu(E_n) = \nu\bigl(\lim_{n\to\infty} E_n\bigr)$$

so that limn En ∈ E. Thus E is a monotone class. Since it contains the algebra A, E ⊃ mon(A), hence E ⊃ σ(A) by Lemma 9.3. Uniqueness has been established.

We now return to the general σ-finite case. The problem we encounter here is that if {En} is a decreasing monotone sequence in E and µ∗(En) = ∞ for all n (hence also ν(En) = ∞ for all n), then we can’t say anything about µ∗(limn En) or ν(limn En); in principle there is no reason to assume they have to be equal (and there are non-σ-finite counterexamples to equality). To get over this problem, we use σ-finiteness to write $X = \bigcup_{n\in\mathbb{N}} A_n$ where An ∈ A, µ(Am) = ν(Am) < ∞ for all m ∈ N. By the usual trick, we may and will assume that the Am’s are pairwise disjoint, Am ∩ An = ∅ if m ≠ n. Let m ∈ N and consider the algebra

$$\mathcal{A}_m = \{E \cap A_m : E \in \mathcal{A}\}$$

in Am. We see quite easily that one can also describe $\mathcal{A}_m$ by

$$\mathcal{A}_m = \{E \in \mathcal{A} : E \subset A_m\}$$

and that

$$\sigma(\mathcal{A}_m) = \{E \cap A_m : E \in \sigma(\mathcal{A})\} = \{E \in \sigma(\mathcal{A}) : E \subset A_m\}.$$

On $\sigma(\mathcal{A}_m)$ we consider the measures $\mu^*_m$, $\nu_m$, the restrictions of µ∗, ν, respectively, to $\sigma(\mathcal{A}_m)$. In other words, $\mu^*_m(E) = \mu^*(E)$ if E ∈ σ(A) and E ⊂ Am, $\nu_m(E) = \nu(E)$ if E ∈ σ(A) and E ⊂ Am. We are now in the “previous case”: the measures $\mu^*_m$, $\nu_m$ are finite, they coincide on $\mathcal{A}_m$, hence (by what we proved) they coincide on $\sigma(\mathcal{A}_m)$. That is, we proved that we have µ∗(E ∩ Am) = ν(E ∩ Am) for all E ∈ σ(A). This being true for every m ∈ N, we have for E ∈ σ(A),

$$\mu^*(E) = \mu^*\Bigl(\bigcup_{m\in\mathbb{N}} (E \cap A_m)\Bigr) = \sum_{m=1}^{\infty} \mu^*(E \cap A_m) = \sum_{m=1}^{\infty} \nu(E \cap A_m) = \nu\Bigl(\bigcup_{m\in\mathbb{N}} (E \cap A_m)\Bigr) = \nu(E).$$

Uniqueness in the general σ-finite case has been established. Part 2 of Theorem 9.1 has been proved.

We still need to know a bit more about M, the σ-algebra of µ∗-measurable sets. We know it contains the σ-algebra generated by the algebra A; how much more does it contain? To answer this question, let us begin with a finite measure set A ∈ M. By the definition of µ∗(A), for every m ∈ N there exist sets Em1, Em2, . . . in A such that

$$A \subset \bigcup_{n=1}^{\infty} E_{mn}, \qquad \sum_{n=1}^{\infty} \mu^*(E_{mn}) < \mu^*(A) + \frac{1}{m}.$$

Let $G_m = \bigcup_n E_{mn}$; then Gm ∈ σ(A) and

$$A \subset G_m, \qquad \mu^*(A) \le \mu^*(G_m) < \mu^*(A) + \frac{1}{m}.$$

Now consider the monotone decreasing sequence G1, G1 ∩ G2, G1 ∩ G2 ∩ G3, . . .. Every set in this sequence contains A; also

$$\mu^*(G_1 \cap \cdots \cap G_m) \le \mu^*(G_m) < \mu^*(A) + \frac{1}{m}.$$


It follows that $G = \bigcap_{m=1}^{\infty} G_m$ satisfies G ∈ σ(A), A ⊂ G and

$$\mu^*(G) = \lim_{m\to\infty} \mu^*\Bigl(\bigcap_{n=1}^{m} G_n\Bigr) \le \liminf_{m\to\infty} \Bigl(\mu^*(A) + \frac{1}{m}\Bigr) = \mu^*(A).$$

But A ⊂ G, thus µ∗(A) ≤ µ∗(G); i.e., µ∗(A) = µ∗(G). Because the measure of A was finite and because we are dealing with sets in M, we have µ∗(G\A) = 0. We proved that sets of finite measure satisfy the following regularity condition.

There exists G ∈ σ(A) such that A ⊂ G and µ∗(G\A) = 0.

Because of σ-finiteness, the same is true for all measurable sets A. In fact, if A ∈ M, because of σ-finiteness, we can write $A = \bigcup_{n\in\mathbb{N}} A_n$, where An ∈ M and µ∗(An) < ∞ for all n ∈ N. By what we just did, there exist sets Gn ∈ σ(A) such that An ⊂ Gn and µ∗(Gn\An) = 0 for n = 1, 2, . . .. Setting $G = \bigcup_{n\in\mathbb{N}} G_n$, we get G ∈ σ(A), A ⊂ G and

$$G \setminus A \subset \bigcup_{n\in\mathbb{N}} (G_n \setminus A_n)$$

so that µ∗(G\A) = 0.

We can prove a bit more. If A ∈ M, then Ac ∈ M, so there will exist H ∈ σ(A) such that Ac ⊂ H and µ∗(H\Ac) = 0. Letting K = Hc, we see that K ∈ σ(A), K ⊂ A, and since A\K = H\Ac, µ∗(A\K) = 0.

We proved that for every measurable A there exist sets K, G in σ(A) such that K ⊂ A ⊂ G and µ∗(G\K) = 0, so that the measure space (X, M, µ∗|M) is indeed the completion of the space (X, σ(A), µ∗|σ(A)). All statements of Theorem 9.1 have been proved.

Exercise 44 Let A ⊂ X. Prove that the following statements are equivalent.

1. A ∈ M.

2. There exists a set K ∈ σ(A) and a null set N ∈ M such that A = K ∪ N.

3. There exists a set K ∈ σ(A) and a null set N ∈M such that A = K∆N .

Exercise 45 (G. Folland) Let A be the family of all subsets of Q (the set of rational numbers) which are finite unions of sets of the form (a, b] ∩ Q = {x ∈ Q : a < x ≤ b} where a, b ∈ R, a ≤ b. If A ∈ A and A ≠ ∅, set µ(A) = ∞. Set µ(∅) = 0.

1. Show that A is an algebra in Q.

2. Show that σ(A) = P(Q).

3. Show that µ is a measure on A which has more than one extension to a measure in σ(A).

10 Lebesgue measure in Rn

10.1 Construction of Lebesgue measure in Rn.

This case is easily dealt with. We could simply apply Theorem 5.4 and then the extension theorem. But we prefer to give an independent approach based on the construction of Theorem 8.3 and, incidentally, we are going to finally prove the missing part of Theorem 8.3.

We begin recalling the construction of Theorem 8.4. The family En consists of all half-open/half-closed rectangles $\prod_{i=1}^{n} (a_i, b_i]$, with ai ≤ bi for i = 1, . . . , n. We allow the equality of ai with bi to deal smoothly with the empty rectangle. This family is not exactly an elementary class in the sense of the section on Lebesgue-Stieltjes measure (the complement of a finite rectangle cannot be expressed as a finite union of finite rectangles), but it is close. We have the following properties:

E1. If E, F ∈ En, then E ∩ F ∈ En. This is immediate; if $E = \prod_{i=1}^{n} (a_i, b_i] \ne \emptyset$, $F = \prod_{i=1}^{n} (c_i, d_i] \ne \emptyset$ and for some i we have max(ai, ci) ≥ min(bi, di), then E ∩ F is empty. Otherwise,

$$E \cap F = \prod_{i=1}^{n} (\max(a_i, c_i), \min(b_i, d_i)].$$

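E2. Let E, F ∈ En. Then E\F can be written as the union of 2n (or fewer) pairwise disjoint elements of En. This is again fairly immediate. Say $E = \prod_{i=1}^{n} (a_i, b_i]$, $F = \prod_{i=1}^{n} (c_i, d_i]$; then x = (x1, . . . , xn) ∈ E\F if and only if ai < xi ≤ bi for i = 1, . . . , n and, for at least one index i, either xi ≤ ci or xi > di. We classify the points of E\F according to the first index i at which this happens. Allowing for once an interval (a, b] in which a could be larger than b, and interpreting it as the empty set, write $I_j = (\max(a_j, c_j), \min(b_j, d_j)] = (a_j, b_j] \cap (c_j, d_j]$. Then

$$E \setminus F = \bigcup_{i=1}^{n} \bigl(G_i^- \cup G_i^+\bigr),$$

where we define, for i = 1, . . . , n,

$$G_i^- = I_1 \times \cdots \times I_{i-1} \times (a_i, \min(b_i, c_i)] \times (a_{i+1}, b_{i+1}] \times \cdots \times (a_n, b_n],$$

$$G_i^+ = I_1 \times \cdots \times I_{i-1} \times (\max(a_i, d_i), b_i] \times (a_{i+1}, b_{i+1}] \times \cdots \times (a_n, b_n].$$

Several of these 2n sets could be empty; it should be clear that they are pairwise disjoint, since a point of $G_i^{\pm}$ has its first i − 1 coordinates in the corresponding intervals (cj, dj] and its i-th coordinate outside of (ci, di].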

E3. Let E, E1, . . . , Er ∈ En. There exists a family of pairwise disjoint sets F1, . . . , Fs ∈ En such that

$$E \setminus \bigcup_{j=1}^{r} E_j = \bigcup_{j=1}^{s} F_j.$$

This can be done by induction on r, the case r = 1 being dealt with in E2. So assume it proved up to some r ≥ 1. Assuming E, E1, . . . , Er+1 ∈ En, we can write

$$E \setminus \bigcup_{j=1}^{r} E_j = \bigcup_{j=1}^{s} F_j$$

with F1, . . . , Fs pairwise disjoint elements of En. Now

$$E \setminus \bigcup_{j=1}^{r+1} E_j = \Bigl(E \setminus \bigcup_{j=1}^{r} E_j\Bigr) \cap E_{r+1}^c = \bigcup_{j=1}^{s} (F_j \cap E_{r+1}^c)$$

and the case r + 1 follows easily from this and E2.


E4. Let E1, E2, . . . be elements of En. For every k ∈ N there exists rk ∈ N and a family $\{F_{kj}\}_{j=1}^{r_k}$ of elements of En such that

1. All the Fkj’s are mutually disjoint; specifically if $F_{k_1 j_1} \cap F_{k_2 j_2} \ne \emptyset$, then k1 = k2 and j1 = j2.

2. $$E_k \setminus (E_1 \cup \cdots \cup E_{k-1}) = \bigcup_{j=1}^{r_k} F_{kj}, \qquad k = 1, 2, \ldots.$$

3. $$\bigcup_{k=1}^{\infty} E_k = \bigcup_{k=1}^{\infty} \bigcup_{j=1}^{r_k} F_{kj}.$$

The third property is an immediate consequence of the second one. We can take r1 = 1 and F1,1 = E1. If k ≥ 2, we use property E3 to write

$$E_k \setminus (E_1 \cup \cdots \cup E_{k-1}) = \bigcup_{j=1}^{r_k} F_{kj}$$

with pairwise disjoint elements $F_{k,1}, \ldots, F_{k,r_k}$ of En.
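Properties E1 and E2 are easy to try out on a computer. The sketch below represents a rectangle as a tuple of pairs (ai, bi) standing for the product of the intervals (ai, bi], and splits E\F into at most 2n disjoint rectangles by looking at the first coordinate in which a point leaves F. This is one possible decomposition, chosen by me for illustration; the function names are hypothetical.

```python
def is_empty(R):
    """A product of intervals (a_i, b_i] is empty iff some a_i >= b_i."""
    return any(a >= b for a, b in R)

def volume(R):
    """V(R): product of the side lengths, 0 for the empty rectangle."""
    if is_empty(R):
        return 0
    v = 1
    for a, b in R:
        v *= b - a
    return v

def intersect(E, F):
    """Property E1: coordinatewise (max(a_i, c_i), min(b_i, d_i)]."""
    return tuple((max(a, c), min(b, d)) for (a, b), (c, d) in zip(E, F))

def difference(E, F):
    """Split E minus F into at most 2n pairwise disjoint rectangles."""
    out = []
    inside = []  # earlier coordinates, already confined to (c_j, d_j]
    for i, ((a, b), (c, d)) in enumerate(zip(E, F)):
        rest = list(E[i + 1:])
        # coordinate i is the first one outside (c_i, d_i]: below or above it
        for piece in ((a, min(b, c)), (max(a, d), b)):
            R = tuple(inside + [piece] + rest)
            if not is_empty(R):
                out.append(R)
        inside.append((max(a, c), min(b, d)))
    return out

E = ((0, 4), (0, 4))
F = ((1, 2), (1, 2))
parts = difference(E, F)
print(parts)
print(sum(volume(R) for R in parts), volume(E) - volume(intersect(E, F)))
```

Since the pieces are disjoint and fill E\F, their volumes add up to V(E) − V(E ∩ F), which gives a quick sanity check.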

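We recall also that if $E = \prod_{i=1}^{n} (a_i, b_i] \in \mathcal{E}_n$, −∞ < ai ≤ bi < ∞ for i = 1, . . . , n, then $V(E) = \prod_{i=1}^{n} (b_i - a_i)$. The definition of the Lebesgue outer measure is

$$m(A) = \inf\Bigl\{\sum_{k=1}^{\infty} V(E_k) : A \subset \bigcup_{k=1}^{\infty} E_k,\ E_k \in \mathcal{E}_n\ \forall\, k \in \mathbb{N}\Bigr\}.$$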
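As a small illustration of how the infimum in this definition is used (in dimension 1), any countable set has outer measure 0: cover the k-th point xk by the interval (xk − ε/2^k, xk], and the total length stays below ε. The sketch below does this with exact rational arithmetic for finitely many rational points; the choice of points and the names `cover` and `total_length` are mine.

```python
from fractions import Fraction

def cover(points, eps):
    """Cover the k-th point x_k (k = 1, 2, ...) by (x_k - eps/2^k, x_k]."""
    return [(x - eps / 2**k, x) for k, x in enumerate(points, start=1)]

def total_length(intervals):
    """Sum of V((a, b]) = b - a over the covering intervals."""
    return sum(b - a for a, b in intervals)

eps = Fraction(1, 100)
# the rationals p/q in [0, 1] with denominator q <= 5 (with repetitions)
points = [Fraction(p, q) for q in range(1, 6) for p in range(0, q + 1)]
intervals = cover(points, eps)
print(total_length(intervals) < eps)
```

With K points the total length is ε(1 − 2^(−K)); letting K grow, the same covering shows m of a countable set is at most ε for every ε > 0.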

By Theorem 8.4, m is an outer measure in Rn such that m(A ∪ B) = m(A) + m(B) if dist(A,B) > 0. By Theorem 8.6, m is a Borel measure in Rn; i.e., the σ-algebra of m-measurable sets contains the σ-algebra of Borel sets. Turning to the proof of Theorem 8.3, all we need to see is that m(E) = V (E) for E ∈ En. The definition of m implies at once that m(E) ≤ V (E), so it suffices to see that m(E) ≥ V (E).

The proof will be by induction on the dimension n. We begin with the case n = 1 (where we can again use n as a general symbol for an integer). The main step in this case is the following result. It is actually the case φ(x) = x of Lemma 5.2, but we prove it again in this somewhat simpler context.

Lemma 10.1 Let a, b ∈ R, −∞ < a < b < ∞ and assume that

$$(a, b] = \bigcup_{n=1}^{\infty} (a_n, b_n]$$

where an ≤ bn for all n ∈ N and (an, bn] ∩ (am, bm] = ∅ if n, m ∈ N, n ≠ m. Then

$$b - a = \sum_{n=1}^{\infty} (b_n - a_n).$$
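A concrete instance of the lemma, chosen by me for illustration: (0, 1] is the disjoint union of the intervals (1/(n+1), 1/n], n = 1, 2, . . ., and the lengths telescope. The sketch below computes the partial sums exactly; the function name `partial_sum` is hypothetical.

```python
from fractions import Fraction

def partial_sum(N):
    """Sum of the lengths of (1/(n+1), 1/n] for n = 1, ..., N."""
    return sum(Fraction(1, n) - Fraction(1, n + 1) for n in range(1, N + 1))

for N in (1, 10, 100):
    print(N, partial_sum(N))
```

The N-th partial sum equals 1 − 1/(N+1), so the series of lengths converges to 1 = b − a, as the lemma asserts.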


Proof. Notice first that the finite case is trivial. In fact, assume

$$(a, b] = \bigcup_{n=1}^{N} (a_n, b_n]$$

for some N ∈ N, where an ≤ bn for n = 1, 2, . . . , N and (an, bn] ∩ (am, bm] = ∅ if n ≠ m. We may assume that an < bn for n = 1, . . . , N (if an = bn for some n, just throw it out; nothing changes). Second, we may relabel the an’s so that they are ordered; i.e., we may assume that a1 < a2 < . . . < aN. Then we must have

a = a1 < b1 = a2 < · · · < bN−1 = aN < bN = b.

Setting, for reasons of notation, aN+1 = bN = b, we have

$$\sum_{n=1}^{N} (b_n - a_n) = \sum_{n=1}^{N} (a_{n+1} - a_n) = a_{N+1} - a_1 = b - a.$$

Turning now to the infinite case, we may also assume that an < bn for all n ∈ N. In fact, throw out all intervals, if any, where an = bn. Either nothing changes, or it reduces to the finite case. From now on we assume an < bn for all n.

Let ε ∈ R, 0 < ε < b − a. Then

$$[a + \varepsilon, b] \subset (a, b] = \bigcup_{n=1}^{\infty} (a_n, b_n] \subset \bigcup_{n=1}^{\infty} \Bigl(a_n, b_n + \frac{\varepsilon}{2^n}\Bigr);$$

since [a + ε, b] is compact and each $(a_n, b_n + \varepsilon/2^n)$ open, there exists a finite number of indices n1, . . . , nN such that

$$[a + \varepsilon, b] \subset \bigcup_{k=1}^{N} \Bigl(a_{n_k}, b_{n_k} + \frac{\varepsilon}{2^{n_k}}\Bigr).$$

We may assume the indices n1, . . . , nN so chosen that $a_{n_1} < a_{n_2} < \ldots < a_{n_N}$. If for any k we had $b_{n_k} > a_{n_{k+1}}$, then $(a_{n_k}, b_{n_k}] \cap (a_{n_{k+1}}, b_{n_{k+1}}] \ne \emptyset$, contradicting the disjointness of the intervals. Thus $b_{n_k} \le a_{n_{k+1}}$ for k = 1, 2, . . . , N − 1. We will set, for notational reasons, $a_{n_{N+1}} = b$; then the last inequalities also hold for k = N. That is, we have

$$b_{n_k} \le a_{n_{k+1}}, \qquad k = 1, 2, \ldots, N.$$

We also must have $a_{n_1} < a + \varepsilon$, otherwise the point a + ε is not covered by the union of the intervals $(a_{n_k}, b_{n_k} + \varepsilon/2^{n_k})$. In addition, we must have $a_{n_{k+1}} \le b_{n_k} + \varepsilon/2^{n_k}$ for k = 1, . . . , N − 1; otherwise there is an uncovered gap between $b_{n_k} + \varepsilon/2^{n_k}$ and $a_{n_{k+1}}$. The same inequality holds for k = N if, as before, we identify $a_{n_{N+1}}$ with b; then it states that $b_{n_N} + \varepsilon/2^{n_N} \ge b$; if not, the portion of [a + ε, b] from $b_{n_N} + \varepsilon/2^{n_N}$ to b is not covered. That is, we have

$$b_{n_k} \ge a_{n_{k+1}} - \frac{\varepsilon}{2^{n_k}}, \qquad k = 1, 2, \ldots, N.$$

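We have:

$$\sum_{n=1}^{\infty} (b_n - a_n) \ge \sum_{k=1}^{N} (b_{n_k} - a_{n_k}) \ge \sum_{k=1}^{N} \Bigl((a_{n_{k+1}} - a_{n_k}) - \frac{\varepsilon}{2^{n_k}}\Bigr) \ge \sum_{k=1}^{N} (a_{n_{k+1}} - a_{n_k}) - \varepsilon$$

$$= b - a_{n_1} - \varepsilon \ge b - (a + \varepsilon) - \varepsilon = (b - a) - 2\varepsilon.$$

Since ε > 0 is arbitrary, we proved

$$b - a \le \sum_{n=1}^{\infty} (b_n - a_n).$$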


For the converse inequality, it suffices to prove that

$$\sum_{n=1}^{N} (b_n - a_n) \le b - a$$

for all N ∈ N. This is essentially done by the same argument that we used to prove the general result in the finite case. Relabelling so that a1 < · · · < aN, one concludes that

a ≤ a1 < b1 ≤ a2 < b2 ≤ · · · ≤ aN < bN ≤ b.

Then

$$(b_1 - a_1) + (b_2 - a_2) + \cdots + (b_N - a_N) \le (a_2 - a_1) + (a_3 - a_2) + \cdots + (a_N - a_{N-1}) + (b - a_N) = b - a_1 \le b - a.$$

The lemma follows.

At this point we almost proved Theorem 8.3 for the case n = 1. In the one dimensional case, E is the family of all half open intervals (a, b], V ((a, b]) is just b − a and m : P(R) → [0,∞] is defined by

$$m(A) = \inf\Bigl\{\sum_{k=1}^{\infty} V(E_k) : E_k \in \mathcal{E}\ \forall\, k \in \mathbb{N},\ A \subset \bigcup_{k=1}^{\infty} E_k\Bigr\}.$$

By Theorem 8.4, m is an outer measure in R such that m(A ∪ B) = m(A) + m(B) if dist(A, B) > 0. By Theorem 8.6, m restricted to the σ-algebra of measurable sets is a Borel measure. So let us see that m((a, b]) ≥ b − a for all (a, b] ∈ E. Assume thus that we have

$$(a, b] \subset \bigcup_{n=1}^{\infty} (a_n, b_n],$$

where for each n ∈ N, an, bn ∈ R, an < bn. The basic idea is that we can assume that the intervals (an, bn] are pairwise disjoint and the last inclusion can be assumed to be an equality. To do this, we first intersect each (an, bn] with (a, b]. If the intersection is empty, we throw it out. If not, it results in a new interval of the form (a′n, b′n] ⊂ (an, bn] with

b′n − a′n = min(b, bn) − max(a, an) ≤ bn − an.

Then

$$(a, b] = \bigcup_{n=1}^{\infty} (a'_n, b'_n].$$

By property E4, we can find elements $F_{nj} = (\alpha_{nj}, \beta_{nj}] \in \mathcal{E}_1$, j = 1, . . . , rn (some rn ∈ N), n = 1, 2, . . ., such that Fnj ∩ Fn′j′ = ∅ except if n = n′ and j = j′, and

$$(a'_n, b'_n] \setminus \bigcup_{1 \le k < n} (a'_k, b'_k] = \bigcup_{j=1}^{r_n} (\alpha_{nj}, \beta_{nj}]$$

for n = 1, 2, . . .; hence

$$(a, b] = \bigcup_{n=1}^{\infty} \bigcup_{j=1}^{r_n} (\alpha_{nj}, \beta_{nj}].$$

By Lemma 10.1,

$$b - a = \sum_{n=1}^{\infty} \sum_{j=1}^{r_n} (\beta_{nj} - \alpha_{nj});$$


by the finite case of Lemma 10.1 (or the last part of its proof) we have

$$\sum_{j=1}^{r_n} (\beta_{nj} - \alpha_{nj}) \le b'_n - a'_n.$$

(The intervals (αnj, βnj] for j = 1, . . . , rn constitute a finite number of pairwise disjoint intervals contained in (a′n, b′n], so we are in the same situation as toward the end of the proof of Lemma 10.1, in which we had pairwise disjoint intervals (a1, b1], . . . , (aN, bN] contained in (a, b].) It follows that

$$b - a = \sum_{n=1}^{\infty} \sum_{j=1}^{r_n} (\beta_{nj} - \alpha_{nj}) \le \sum_{n=1}^{\infty} (b'_n - a'_n) \le \sum_{n=1}^{\infty} (b_n - a_n).$$

We have proved that $b - a \le \sum_{n=1}^{\infty} (b_n - a_n)$ for an arbitrary sequence of intervals {(an, bn]} whose union contains (a, b]. By the definition of the measure m, it follows that b − a ≤ m((a, b]). Theorem 8.3 is proved in the one dimensional case.

The time for applying induction has arrived, and to do so it will help to have the following notation. If $E = \prod_{k=1}^{n} (a_k, b_k] \in \mathcal{E}_n$, and E is not empty, we will write E = E′ × (an, bn] where $E' = \prod_{k=1}^{n-1} (a_k, b_k]$. Notice that this writes E as the cartesian product of an element E′ of $\mathcal{E}_{n-1}$ and an interval in $\mathcal{E}_1$. If E = ∅, we will set E′ = ∅ and take the last factor to be the empty interval.

The assumption now is that we proved m(E) = V (E) if m is the Lebesgue outer measure in $\mathbb{R}^{n-1}$, V the volume function defined on “rectangles” of $\mathbb{R}^{n-1}$; n ≥ 2. Assume E, E1, E2, . . . ∈ En and $E \subset \bigcup_k E_k$. We would like to see that

$$V(E) \le \sum_{k=1}^{\infty} V(E_k).$$

Because E ∈ En, we can write E = E′ × (an, bn] for some E′ in $\mathcal{E}_{n-1}$, −∞ < an ≤ bn < ∞. Similarly, $E_k = E'_k \times (a_{kn}, b_{kn}]$ with $E'_k \in \mathcal{E}_{n-1}$, −∞ < akn ≤ bkn < ∞.

Now let z ∈ (an, bn], to be fixed for a short while. We can assume that E ≠ ∅, hence E′ ⊂ $\mathbb{R}^{n-1}$ is also non-empty and (x, z) ∈ E if and only if x ∈ E′. (We remark that by (x, z), if x = (x1, . . . , xn−1) ∈ $\mathbb{R}^{n-1}$, z ∈ R, we understand the point (x1, . . . , xn−1, z) ∈ Rn.)

We want to single out all the “rectangles” of the covering family that contain points of the form (x, z), for some x ∈ E′. One way of doing this is to single out their indices; we define

P (z) = {k ∈ N : akn < z ≤ bkn}.

We easily see that

$$E' \subset \bigcup_{k \in P(z)} E'_k;$$

in fact, if x ∈ E′, then (x, z) ∈ E, hence there exists k such that $(x, z) \in E_k = E'_k \times (a_{kn}, b_{kn}]$, hence x ∈ E′k, akn < z ≤ bkn, so that k ∈ P (z). We have

$$V(E') = m(E') \le \sum_{k \in P(z)} V(E'_k),$$

the equality above being due to the induction hypothesis, the inequality due to the definition of (n − 1)-dimensional outer Lebesgue measure. Since k ∈ P (z) iff akn < z ≤ bkn; that is, if and only if z ∈ (akn, bkn], and this happens if and only if

$$\chi_{(a_{kn}, b_{kn}]}(z) = 1,$$


we can add some zero terms to our last displayed inequality and write it in the form

$$V(E') \le \sum_{k=1}^{\infty} V(E'_k)\, \chi_{(a_{kn}, b_{kn}]}(z).$$

This inequality is valid for all z ∈ (an, bn]. Multiplying its left hand side by $\chi_{(a_n, b_n]}(z)$, which is 0 outside of (an, bn], we see that the inequality

$$V(E')\, \chi_{(a_n, b_n]}(z) \le \sum_{k=1}^{\infty} V(E'_k)\, \chi_{(a_{kn}, b_{kn}]}(z) \tag{22}$$

is valid for all z ∈ R. We remind ourselves and the world once more that we have a measure defined on Borel sets in every Rn; by what we hope is not a serious abuse of language we denote it by m regardless of the dimension (only occasionally by mn). Inequality (22) is a relation between non-negative, Borel measurable functions on R; we can integrate with respect to 1-dimensional Lebesgue measure m = m1 and, using the fact that

$$\int_{\mathbb{R}} \chi_{(a, b]}\, dm = m((a, b]) = b - a,$$

we get from (22) (and Beppo Levi’s Theorem)

$$V(E')(b_n - a_n) \le \sum_{k=1}^{\infty} V(E'_k)(b_{kn} - a_{kn});$$

i.e.,

$$V(E) \le \sum_{k=1}^{\infty} V(E_k).$$
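The slicing argument can be watched in a tiny example in R2. In the sketch below the unit square E = (0, 1] × (0, 1] is covered by three overlapping rectangles (my own choice), and for sample heights z we compare V (E′) with the sum of V (E′k) over the indices k ∈ P (z). The names `length`, `slice_sum` and `volume` are hypothetical helpers.

```python
from fractions import Fraction as Fr

# Rectangles in R^2 as ((a1, b1), (a2, b2)), standing for (a1, b1] x (a2, b2].
E = ((Fr(0), Fr(1)), (Fr(0), Fr(1)))
cover = [
    ((Fr(0), Fr(1)), (Fr(0), Fr(1, 2))),
    ((Fr(0), Fr(1, 2)), (Fr(1, 2), Fr(1))),
    ((Fr(1, 2), Fr(1)), (Fr(1, 4), Fr(1))),
]

def length(I):
    """Length of (a, b], with the convention that a >= b gives the empty set."""
    return max(I[1] - I[0], Fr(0))

def slice_sum(z):
    """Sum of V(E'_k) over the indices k in P(z), i.e. with a_k2 < z <= b_k2."""
    return sum(length(base) for base, (lo, hi) in cover if lo < z <= hi)

def volume(R):
    return length(R[0]) * length(R[1])

for z in (Fr(1, 8), Fr(3, 8), Fr(3, 4)):
    print(z, length(E[0]), slice_sum(z))
print(volume(E), sum(volume(R) for R in cover))
```

At every height z in (0, 1] the slices of the covering rectangles cover the base (0, 1], so the slice sum is at least 1; integrating over z gives V (E) ≤ Σ V (Ek), which here reads 1 ≤ 9/8.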

Since {Ek} was an arbitrary countable covering by elements of En of E, this shows that V (E) is a lower bound of the set of which m(E) is the infimum. It follows that V (E) ≤ m(E). This concludes the proof of Theorem 8.3.

What we know, so far, is that m is a Borel measure in Rn. It is clearly σ-finite. It comes from an outer measure which we also denote by m. The question arises: What if we now use m, restricted to Borel sets, to define a potentially new outer measure m∗ in Rn? That is, we define for A ⊂ Rn

$$m^*(A) = \inf\Bigl\{\sum_{k=1}^{\infty} m(E_k) : A \subset \bigcup_{k=1}^{\infty} E_k,\ E_k \in \mathcal{B}(\mathbb{R}^n)\ \forall\, k\Bigr\}.$$

Because m is an outer measure, if $A \subset \bigcup_{k=1}^{\infty} E_k$, then $m(A) \le \sum_{k=1}^{\infty} m(E_k)$. It follows that m(A) ≤ m∗(A). On the other hand if $A \subset \bigcup_{k=1}^{\infty} E_k$ and Ek ∈ En ⊂ B(Rn) for all k, then

$$m^*(A) \le \sum_{k=1}^{\infty} m(E_k) = \sum_{k=1}^{\infty} V(E_k)$$

from which m∗(A) ≤ m(A) follows. Thus m = m∗; we get nothing new. In fact, we didn’t even need to work with all of B(Rn). We have

Lemma 10.2 Let A be any algebra of subsets of Rn such that En ⊂ A ⊂ B(Rn). Then, for every A ⊂ Rn,

$$m(A) = \inf\Bigl\{\sum_{k=1}^{\infty} m(E_k) : A \subset \bigcup_{k=1}^{\infty} E_k,\ E_k \in \mathcal{A}\ \forall\, k\Bigr\}.$$


Exercise 46 Prove lemma 10.2.

Assume now ν is a Borel measure in Rn such that ν(E) = V (E) for all E ∈ En. We want to see that ν = m on B(Rn). We do this by extending En to an algebra, and then invoking the uniqueness part of Theorem 9.1. This extension can be done in several ways. One way is to just add the “infinite rectangles” to En, making it into a true elementary class. We opt for a different approach (also a standard technique) in which we restrict everything to a “finite rectangle” that then will have its “sides” going to ∞. That is, let $X = \prod_{i=1}^{n} (a_i, b_i]$ be a fixed element of En. Let AX consist of all finite unions of elements E ∈ En such that E ⊂ X. Because {E ∈ En : E ⊂ X} is an elementary class, it is easy to see that AX is an algebra in X. Every element of AX can be written as the union of a finite family of pairwise disjoint subsets of X in En, from which it follows at once that ν(E) = m(E) for every E ∈ AX. It being clear that σ(AX) = {E ∈ B(Rn) : E ⊂ X}, Theorem 9.1, part 2, implies that ν(E) = m(E) for every Borel subset E of X. Now let E be an arbitrary Borel subset of Rn. Setting $X_k = (-k, k]^n$ for k = 1, 2, . . ., we proved ν(E ∩ Xk) = m(E ∩ Xk) for all k; letting k → ∞ we get ν(E) = m(E).

We proved (and if there is any question remaining, it is an exercise to do all that needs to be done to remove that question):

Theorem 10.3 There exists a unique Borel measure m in Rn such that

$$m\Bigl(\prod_{i=1}^{n} (a_i, b_i]\Bigr) = \prod_{i=1}^{n} (b_i - a_i)$$

for all elements $\prod_{i=1}^{n} (a_i, b_i]$ of En. The σ-algebra of measurable sets of m consists of all sets of the form

B ∪ N,

where B is a Borel set and m(N) = 0.

The measure of Theorem 10.3 is, as we have said more than once, Lebesguemeasure in Rn. We will denote its σ-algebra of measurable sets by L or L(Rn),and refer to the elements of L as the Lebesgue sets. Since this is the mostcommon, important measure space structure one has on Rn, it is customaryto refer to a Lebesgue set simply as a measurable set. If E is a measurablesubset of Rn and f : E → C a function, we say f is measurable iff and onlyif it is measurable when E is endowed with the σ-algebra L(E) = {F ∈ L :FsubsetE}. Of course, (E,L(E),m) is a complete measure space. We will usethe same symbol for a measure on a σ-algebra in a set X and its restriction toσ-algebra elements in a subset E.

Further properties of Lebesgue measure are developed in the next few exercises.

Exercise 47 In this exercise, for reasons of clarity, we will denote by m_n Lebesgue measure in R^n. Let A ⊂ R^n, B ⊂ R^k. We identify A × B with a subset of R^{n+k} in the obvious way; if x = (x_1, . . . , x_n) ∈ A, y = (y_1, . . . , y_k) ∈ B, then we identify (x, y) with (x_1, . . . , x_n, y_1, . . . , y_k) ∈ R^{n+k}. Prove that if neither A nor B is empty, then A × B ∈ L(R^{n+k}) if and only if A ∈ L(R^n) and B ∈ L(R^k), in which case

m_{n+k}(A × B) = m_n(A) m_k(B).

In particular, if A or B is a null set, so is A × B. (This involves showing that A × B is a null set if m_n(A) = 0, even if m_k(B) = ∞.)


Exercise 48 Show that Lebesgue measure m satisfies the following regularity conditions:

1. If A ⊂ Rn, then

m(A) = inf{m(U) : U open in Rn, A ⊂ U},

2. If A is measurable (Lebesgue), then

m(A) = sup{m(K) : K compact in Rn, K ⊂ A}.

The last exercise can be restated by saying that Lebesgue measure is a regular Borel measure. Regular Borel measures are an important special case of Radon measures, which are measures defined on metric spaces and which play a very big role in the theory; they constitute a space of dual objects (in a sense we won’t explain here) to the space of continuous functions. Here is the definition, and more.

Definition 39 Let X be a metric space. A Borel measure µ is said to be inner regular on a measurable set U iff

µ(U) = sup{µ(K) : K a compact subset of U}.

It is said to be outer regular on a measurable set A iff

µ(A) = inf{µ(U) : U open, A ⊂ U}.

A Radon measure is a Borel measure µ such that µ(K) < ∞ for all compact subsets K of X and which is outer regular on all measurable sets and inner regular on all open sets.

Definition 40 Let X be a metric space. A Borel measure µ is said to be regular iff µ(K) < ∞ for all compact subsets K of X and µ is inner and outer regular on all measurable subsets of X.

Thus Lebesgue measure is not only a Radon measure, but it is also regular. This is actually the case with every Radon measure which is either σ-finite (as Lebesgue measure is) or defined on a σ-compact metric space (as Lebesgue measure is too).

Theorem 10.4 Let µ be a Radon measure in the metric space X. Then

µ(E) = sup{µ(K) : K compact ,K ⊂ E}

for all σ-finite measurable sets E.

Proof. Assume first E is a Borel set and µ(E) < ∞. Let ε > 0. By outer regularity, there is an open set U such that E ⊂ U and µ(U) < µ(E) + ε. By inner regularity on open sets, there is a compact set K ⊂ U such that µ(U) − ε < µ(K). Then K\E ⊂ U\E, hence

µ(K\E) ≤ µ(U\E) = µ(U) − µ(E) < ε.

Invoking outer regularity once again, there is an open set V with K\E ⊂ V and µ(V) < ε. Then J = K\V is compact and J ⊂ E. Moreover,

µ(K) = µ(J ∪ (K ∩ V)) = µ(J) + µ(K ∩ V) ≤ µ(J) + µ(V) < µ(J) + ε,


thus µ(E) ≤ µ(U) < µ(K) + ε < µ(J) + 2ε. This proves the result if E is a set of finite measure. The extension to the σ-finite case is as can be imagined. If E = ⋃_n E_n, where {E_n} is a sequence of Borel sets such that µ(E_n) < ∞ for all n, and µ(E) = ∞, then we must have that

$$\sum_{n=1}^{\infty} \mu(E_n) = \infty.$$

We may and will assume that E_n ∩ E_m = ∅ if n ≠ m. For each n we can find a compact subset K_n of E_n such that µ(K_n) > µ(E_n) − 2^{−n}. If for N ∈ N we set

$$J_N = \bigcup_{n=1}^{N} K_n,$$

then J_N is compact and, since the K_n’s are necessarily pairwise disjoint,

$$\mu(J_N) = \sum_{n=1}^{N} \mu(K_n) > \sum_{n=1}^{N} \mu(E_n) - \sum_{n=1}^{N} 2^{-n} \ge \sum_{n=1}^{N} \mu(E_n) - 2.$$

Letting N → ∞ it is thus clear that µ(J_N) → ∞, hence

sup{µ(J) : J ⊂ E, J compact} = ∞ = µ(E).

Corollary 10.5 Every σ-finite Radon measure is regular. If X is a σ-compact metric space, then every Radon measure in X is regular.

Exercise 49 Let X be a σ-compact metric space. Show that if µ is a Borel measure in X such that for every Borel set E and every positive real number ε there is an open set U in X such that E ⊂ U and µ(U\E) < ε, then µ is regular. (Hint or Solution: One may assume X = ⋃_n K_n where K_n is compact and K_n ⊂ K_{n+1} for n ∈ N. If E is a Borel set and ε > 0, applying the hypothesis to E^c there exists U open with U ⊃ E^c, µ(U\E^c) < ε. Then F = U^c is a closed subset of E and µ(E\F) < ε. Setting F_n = F ∩ K_n, F_n is a compact subset of E and {µ(F_n)} increases to µ(F).)

Exercise 50 Recall the definition of the Hausdorff outer measure H^s for s ∈ [0,∞). Let s ≥ 0.

1. Prove that all Borel sets are H^s-measurable. (This is basically quoting the appropriate result.)

2. Prove that H^0 is counting measure.

3. With m denoting Lebesgue measure in R^n, prove that there exists a constant c > 0 such that m = cH^n (and one usually adjusts γ_s so that it depends nicely on s and c = 1).

4. Let A ⊂ R^n. Prove there exists a (necessarily unique) s_0, 0 ≤ s_0 ≤ n, such that

$$H^s(A) = \begin{cases} \infty, & 0 \le s < s_0, \\ 0, & s_0 < s \le n. \end{cases}$$

The number s_0 is called the Hausdorff dimension of A and will be denoted by D_H(A).

5. Prove that if U is an open, non-empty subset of Rn, then DH(U) = n.


10.2 Lebesgue measure on the real line.

In this section we assume that m is Lebesgue measure in R; B, L denote the σ-algebras of Borel and Lebesgue sets in R, respectively. We saw already that B has the cardinality of the continuum (i.e., of R) and that the Cantor set (for one) is an uncountable Borel null set. All subsets of the Cantor set C are Lebesgue sets and, since the cardinality of the power set of C is strictly greater than that of C (which is the same as that of R), one sees that there are Lebesgue sets which are not Borel sets. One might wonder now if there are any subsets of R which are not Lebesgue sets. The answer is yes, with a proviso. To see this, we first need to see that m is translation invariant.

Theorem 10.6 (a) If E ∈ B and x ∈ R, then x+E ∈ B and m(x+E) = m(E).

(b) If E ∈ L and x ∈ R, then x + E ∈ L and m(x + E) = m(E).

Proof. Let x ∈ R. We first see that if E is a Borel set, so is x + E. For this we define

Σ = {E ⊂ R : x + E ∈ B}.

We leave as an easy exercise to prove that Σ is a σ-algebra in R. It is also immediate to see that it contains all open subsets of R. It follows that Σ ⊃ B, so that if E ∈ B, then E ∈ Σ, hence x + E ∈ B. Now define the map m_x : B → [0,∞] by m_x(E) = m(x + E). It is immediate that m_x is a measure on B. Since

m_x((a, b]) = m((a + x, b + x]) = (b + x) − (a + x) = b − a,

and m is the unique Borel measure with this property, it follows that m_x = m; i.e., m(x + E) = m(E) for all Borel sets E. This proves part (a) of the Theorem. To prove part (b), let A ∈ L. Then we can write A = E ∪ N where E ∈ B, N ⊂ F ∈ B, m(F) = 0, and m(A) = m(E). If x ∈ R, we have x + A = (x + E) ∪ (x + N); since x + N ⊂ x + F and F is a Borel null set, part (a) of this theorem implies that we also have m(x + F) = 0, so x + A ∈ L and m(x + A) = m(x + E) = m(E) = m(A), applying once more part (a) of this theorem.

We are ready to tackle the existence of non-measurable subsets of R.

Terminology. Given A ⊂ R, one says it is measurable iff it is a Lebesgue set.

We show now there exists a subset A of [0, 1] which is not measurable. For this purpose, we define an equivalence relation in R by x ∼ y iff x − y ∈ Q. Clearly ∼ is an equivalence relation in R. Let us denote the equivalence class of an element x ∈ R by x̄; thus, for every x ∈ R, x̄ = x + Q. We notice that x̄ ∩ [0, 1] ≠ ∅ for every x ∈ R; in fact, with [x] denoting the integer part of x, we have x ∼ (x − [x]), and x − [x] ∈ [0, 1). By the Axiom of Choice, we can form a set consisting of exactly one element from each one of these non-empty sets; i.e., we can form the set A ⊂ [0, 1] which contains one representative of each class. Or, let S = R/∼ = {x̄ : x ∈ R} be the family of all equivalence classes of this equivalence relation; for each s ∈ S let x_s ∈ s ∩ [0, 1]; set A = {x_s : s ∈ S}. We claim A is not measurable. Assume it is. We notice that

$$[0, 1] \subset \bigcup_{r \in \mathbb{Q} \cap [-1,1]} (r + A) \subset [-1, 2]. \tag{23}$$

In fact, for the first inclusion, let 0 ≤ x ≤ 1. There is y ∈ A with x ∼ y, thus x − y = r ∈ Q. But since x, y ∈ [0, 1] we see that −1 ≤ r ≤ 1. Thus x ∈ r + A, r ∈ Q ∩ [−1, 1]. This proves the first inclusion. For the second inclusion, let x ∈ ⋃_{r∈Q∩[−1,1]} (r + A). Then there is y ∈ A, r ∈ Q ∩ [−1, 1] such that x = r + y. Since y ∈ [0, 1], it follows that x ∈ [−1, 2]. We also notice that if r, r′ ∈ Q, r ≠ r′, then (r + A) ∩ (r′ + A) = ∅; in fact, if x ∈ (r + A) ∩ (r′ + A) there exist y, y′ ∈ A such that r + y = x = r′ + y′; then y − y′ ∈ Q so y ∼ y′. By the definition of A this implies y = y′, but then also r = r′, which is a contradiction.

Assume m(A) = 0. Then we also have m(r + A) = 0 for every r; because Q ∩ [−1, 1] is countable, we see that ⋃_{r∈Q∩[−1,1]} (r + A) is a null set, which is a contradiction since by the first inclusion in (23) we have

$$m\Big(\bigcup_{r \in \mathbb{Q} \cap [-1,1]} (r + A)\Big) \ge m([0, 1]) = 1.$$

Assume m(A) > 0. The second inclusion in (23) gives

$$m\Big(\bigcup_{r \in \mathbb{Q} \cap [-1,1]} (r + A)\Big) \le m([-1, 2]) = 3 < \infty.$$

Because Q ∩ [−1, 1] is countable, and all the sets r + A, r ∈ Q, are pairwise disjoint,

$$m\Big(\bigcup_{r \in \mathbb{Q} \cap [-1,1]} (r + A)\Big) = \sum_{r \in \mathbb{Q} \cap [-1,1]} m(r + A) = \sum_{r \in \mathbb{Q} \cap [-1,1]} m(A),$$

the last equality being due to translation invariance. But this contradicts m(⋃_{r∈Q∩[−1,1]} (r + A)) < ∞, since m(A) > 0 and Q ∩ [−1, 1] is an infinite set.

More generally, one proves:

Theorem 10.7 Let E be a measurable subset of R (i.e., a Lebesgue measurable subset of R) such that m(E) > 0. Then E contains a non-measurable set.

Exercise 51 Prove Theorem 10.7.

We will see now that integration with respect to Lebesgue measure is the same as Riemann integration, in case the Riemann integral exists. It is easy to see that this is so for continuous functions, since it is easy to prove that the Fundamental Theorem of Calculus holds for Lebesgue integration. But we want to have the result also for an arbitrary Riemann integrable function. We consider first “proper” Riemann integrals, in which the function is bounded in the interval of integration, and the interval itself is bounded, then briefly discuss “improper” ones. We recall first some basic definitions and properties concerning Riemann integration. For details, see (for example) Rudin [3]. Let f : [a, b] → R be bounded; −∞ < a < b < ∞. We denote by P_{ab} (do not confuse with a power set!) the family of all partitions of the interval [a, b]; a partition being any tuple of points P = (x_0, . . . , x_n) such that x_0 = a < x_1 < · · · < x_n = b. Usually one thinks of partitions as being finite subsets of [a, b] whose elements can be labelled x_0, . . . , x_n so that x_0 = a < · · · < x_n = b. This makes it easier to define the relation finer than and refinements. Otherwise, I prefer to see them as tuples. It is anyway a trivial distinction and one can let chacun à son goût apply. Given partitions P = (x_0, . . . , x_n), Q = (y_0, . . . , y_m) of [a, b], we say that Q is finer than P, and write P ≤ Q, iff there is a choice of numbers 0 = i_0 < i_1 < . . . < i_n = m such that y_{i_k} = x_k for k = 0, . . . , n (almost like saying the entries in the first partition are a subsequence of the entries of the “finer” partition; except that our sequences are finite). Equivalently, given a partition P = (x_0, . . . , x_n) of [a, b], denote its set of points by P*. Thus x ∈ P* if and only if there is i, 0 ≤ i ≤ n, such that x = x_i. Then define P ≤ Q iff P* ⊂ Q*. This order relation has the property that any two partitions have a common refinement: If P, Q ∈ P_{ab}, there exists R ∈ P_{ab} such that P ≤ R, Q ≤ R. In fact, just take R so that R* = P* ∪ Q*. We also need another notion of size of a partition; if P = (x_0, . . . , x_m) ∈ P_{ab}, we define its gauge to be |P| = max_{1≤i≤m}(x_i − x_{i−1}).

Let P = (x_0, . . . , x_n) ∈ P_{ab}. The upper Riemann sum of f with respect to P is the number U(P, f) defined by

$$U(P, f) = \sum_{k=1}^{n} \Big(\sup_{x_{k-1} \le x \le x_k} f(x)\Big)(x_k - x_{k-1}).$$

The lower Riemann sum of f with respect to P is the number L(P, f) defined by

$$L(P, f) = \sum_{k=1}^{n} \Big(\inf_{x_{k-1} \le x \le x_k} f(x)\Big)(x_k - x_{k-1}).$$

It is easy to see that if P ≤ Q, then U(P, f) ≥ U(Q, f) while L(P, f) ≤ L(Q, f). It is trivial, on the other hand, that L(P, f) ≤ U(P, f). Given P, Q ∈ P_{ab}, letting R be a common refinement, we see that

L(P, f) ≤ L(R, f) ≤ U(R, f) ≤ U(Q, f).

In other words, even when there is no relation between P and Q, every lower sum is less than or equal to every upper sum. It follows that every lower sum is a lower bound of the set of upper sums, and every upper sum is an upper bound of the set of lower sums. Thus defining the upper and lower integrals by

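$$\overline{\int_a^b} f(x)\,dx = \inf\{U(P, f) : P \in \mathcal{P}_{ab}\}, \qquad \underline{\int_a^b} f(x)\,dx = \sup\{L(P, f) : P \in \mathcal{P}_{ab}\},$$

we see that

$$L(P, f) \le \underline{\int_a^b} f(x)\,dx \quad \forall P \in \mathcal{P}_{ab}, \qquad U(P, f) \ge \overline{\int_a^b} f(x)\,dx \quad \forall P \in \mathcal{P}_{ab},$$

thus also

$$\underline{\int_a^b} f(x)\,dx \le \overline{\int_a^b} f(x)\,dx.$$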

Definition 41 Let f : [a, b] → R be bounded. We say f is Riemann integrable over [a, b] iff

$$\underline{\int_a^b} f(x)\,dx = \overline{\int_a^b} f(x)\,dx,$$

in which case the common value of the upper and lower integrals is called the Riemann integral of f over [a, b] and denoted by

$$\int_a^b f(x)\,dx.$$
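As a small numerical illustration of these definitions (not part of the original notes), the following Python sketch computes lower and upper sums exactly for an increasing function, for which the infimum and supremum on each subinterval are attained at the left and right endpoints. The function names and the choice f(x) = x² on [0, 1] are assumptions made only for the example.

```python
from fractions import Fraction


def sums_for_increasing(f, partition):
    """Return (L(P, f), U(P, f)) for an increasing function f.

    For increasing f the inf on [x_{k-1}, x_k] is f(x_{k-1}) and the sup
    is f(x_k), so both sums can be computed exactly.
    """
    pairs = list(zip(partition, partition[1:]))
    lower = sum(f(left) * (right - left) for left, right in pairs)
    upper = sum(f(right) * (right - left) for left, right in pairs)
    return lower, upper


def uniform_partition(a, b, n):
    """Partition of [a, b] into n subintervals of equal length."""
    return [a + (b - a) * Fraction(k, n) for k in range(n + 1)]


def square(x):
    return x * x


coarse = uniform_partition(Fraction(0), Fraction(1), 4)
fine = uniform_partition(Fraction(0), Fraction(1), 8)  # a refinement of coarse

L_coarse, U_coarse = sums_for_increasing(square, coarse)
L_fine, U_fine = sums_for_increasing(square, fine)

# Refining raises the lower sum and lowers the upper sum.
print(L_coarse, L_fine, U_fine, U_coarse)
```

With these choices the lower sums increase and the upper sums decrease under refinement, and both bracket the value 1/3 of the Riemann integral.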


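Assume again f : [a, b] → R is bounded. If P = (x_0, . . . , x_n) ∈ P_{ab}, we can associate with f two step functions; for lack of better names we’ll call them s^P_f and t^P_f and, if a ≤ x ≤ b, define them by

$$s^P_f(x) = \begin{cases} \inf_{a \le y \le x_1} f(y) & \text{if } a \le x \le x_1, \\ \inf_{x_{k-1} \le y \le x_k} f(y) & \text{if } x_{k-1} < x \le x_k,\ k = 2, \ldots, n; \end{cases}$$

$$t^P_f(x) = \begin{cases} \sup_{a \le y \le x_1} f(y) & \text{if } a \le x \le x_1, \\ \sup_{x_{k-1} \le y \le x_k} f(y) & \text{if } x_{k-1} < x \le x_k,\ k = 2, \ldots, n. \end{cases}$$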

The following properties are either immediate or quite easy to prove; occasional comments are provided.

a1. If P ∈ P_{ab}, then s^P_f(x) ≤ f(x) ≤ t^P_f(x) for all x ∈ [a, b]. (Immediate!)

a2. If P, Q ∈ P_{ab} and P ≤ Q, then

s^P_f(x) ≤ s^Q_f(x), t^P_f(x) ≥ t^Q_f(x)

for all x ∈ [a, b]. (Same proof as L(P, f) ≤ L(Q, f), U(P, f) ≥ U(Q, f).)

a3. If P ∈ P_{ab}, then s^P_f, t^P_f ∈ L^1(m) = L^1([a, b], m) and

$$\int_{[a,b]} s^P_f\,dm = L(P, f), \qquad \int_{[a,b]} t^P_f\,dm = U(P, f).$$

It is clear that s^P_f, t^P_f are simple measurable functions; in fact, setting µ_k = inf_{x_{k−1}≤x≤x_k} f(x),

$$s^P_f = \mu_1 \chi_{\{a\}} + \sum_{k=1}^{n} \mu_k \chi_{(x_{k-1}, x_k]};$$

since m({a}) = 0 and m((x_{k−1}, x_k]) = x_k − x_{k−1}, the result for s^P_f follows at once. Similarly for t^P_f.

Since the upper integral is the infimum of the set of upper sums, there exists a sequence {P_n} of partitions such that {U(P_n, f)} converges to the upper integral. Replacing P_2 by a common refinement of P_1, P_2, then P_3 by a common refinement of P_3 and the new P_2, and so forth, only decreases the upper sums. Thus we may assume P_1 ≤ P_2 ≤ · · · . We can get a similar result for the lower sums, go to a new common refinement and conclude there exists a sequence {P_n} of elements of P_{ab} such that P_n ≤ P_{n+1} for all n ∈ N and

$$\lim_{n\to\infty} U(P_n, f) = \overline{\int_a^b} f(x)\,dx, \qquad \lim_{n\to\infty} L(P_n, f) = \underline{\int_a^b} f(x)\,dx.$$

Refining further (if necessary) we can assume that |P_n| < 1/n for all n ∈ N. Let s_n = s^{P_n}_f, t_n = t^{P_n}_f for n ∈ N. The sequence {s_n} is increasing, {t_n} decreasing by property a2 above; thus

$$s(x) = \lim_{n\to\infty} s_n(x) = \lim_{n\to\infty} s^{P_n}_f(x), \qquad t(x) = \lim_{n\to\infty} t_n(x) = \lim_{n\to\infty} t^{P_n}_f(x)$$

exist for all x ∈ [a, b]; moreover, by a1, s(x) ≤ f(x) ≤ t(x) for all x ∈ [a, b]. The functions s, t are measurable, being limits of measurable functions. Being bounded on a bounded interval, they are also Lebesgue integrable. More to the point, the sequences {s_n}, {t_n} are uniformly bounded below by α = inf_{x∈[a,b]} f(x), above by β = sup_{x∈[a,b]} f(x). Because the constant function x ↦ max(|α|, |β|) is Lebesgue integrable on [a, b], the dominated convergence theorem implies that

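$$\lim_{n\to\infty} \int_{[a,b]} s_n\,dm = \int_{[a,b]} s\,dm, \qquad \lim_{n\to\infty} \int_{[a,b]} t_n\,dm = \int_{[a,b]} t\,dm. \tag{24}$$

By property a3,

$$\int_{[a,b]} t_n\,dm = U(P_n, f), \qquad \int_{[a,b]} s_n\,dm = L(P_n, f),$$

thus, in combination with (24), we get

$$\int_{[a,b]} s\,dm = \lim_{n\to\infty} L(P_n, f) = \underline{\int_a^b} f(x)\,dx, \qquad \int_{[a,b]} t\,dm = \lim_{n\to\infty} U(P_n, f) = \overline{\int_a^b} f(x)\,dx. \tag{25}$$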

Since t ≥ s, we see that

$$\int_{[a,b]} |t - s|\,dm = \int_{[a,b]} (t - s)\,dm = \overline{\int_a^b} f(x)\,dx - \underline{\int_a^b} f(x)\,dx, \tag{26}$$

thus f is Riemann integrable over [a, b] if and only if ∫_{[a,b]} |s − t| dm = 0; that is, if and only if s = t a.e. In this case, if f is Riemann integrable, since s ≤ f ≤ t, we conclude that s = t = f a.e., hence f ∈ L^1(m) and

$$\int_{[a,b]} f\,dm = \int_{[a,b]} t\,dm = \lim_{n\to\infty} \int_{[a,b]} t_n\,dm = \lim_{n\to\infty} U(P_n, f) = \overline{\int_a^b} f(x)\,dx = \int_a^b f(x)\,dx.$$

We proved, so far: If f is Riemann integrable over [a, b], then it is Lebesgue integrable and the Lebesgue integral coincides with the Riemann integral.

Let us return to our sequences of simple functions {s_n}, {t_n}. They converge, respectively, to s and t; s ≤ f ≤ t. There is a direct characterization of s, t in terms of f; in fact, if we let D be the set of all points belonging to at least one of the partitions P_n (a countable set), then for every x ∈ [a, b]\D,

$$s(x) = \lim_{\delta\to 0}\ \inf_{y \in [a,b],\,|y-x|<\delta} f(y), \qquad t(x) = \lim_{\delta\to 0}\ \sup_{y \in [a,b],\,|y-x|<\delta} f(y). \tag{27}$$

These limits exist for every x ∈ [a, b] because inf_{y∈[a,b],|y−x|<δ} f(y) increases with decreasing δ while sup_{y∈[a,b],|y−x|<δ} f(y) decreases with decreasing δ. To prove (27), let x ∈ [a, b] and let δ > 0. Let n ∈ N be such that 1/n < δ and consider s_n(x). Since n stays fixed for a moment, say P_n = (y_0, . . . , y_m) and y_{k−1} ≤ x ≤ y_k. Since

max(y_k − x, x − y_{k−1}) ≤ |P_n| < 1/n,

we have [y_{k−1}, y_k] ⊂ (x − δ, x + δ), hence

$$s_n(x) = \inf_{y \in [y_{k-1}, y_k]} f(y) \ge \inf_{y \in (x-\delta, x+\delta) \cap [a,b]} f(y).$$

Letting first n → ∞ and then δ → 0, we get

$$s(x) \ge \lim_{\delta\to 0}\ \inf_{y \in (x-\delta, x+\delta) \cap [a,b]} f(y).$$


Conversely, let n ∈ N and assume x ∈ [a, b]\D. As before, let P_n = (y_0, . . . , y_m) and assume y_{k−1} ≤ x ≤ y_k. Because x ∉ D, we have y_{k−1} < x < y_k. Taking δ > 0 sufficiently small, we can get (x − δ, x + δ) ⊂ [y_{k−1}, y_k], hence

$$s_n(x) = \inf_{y \in [y_{k-1}, y_k]} f(y) \le \inf_{y \in (x-\delta, x+\delta) \cap [a,b]} f(y).$$

Letting this time δ → 0 first, then n → ∞, we get

$$s(x) \le \lim_{\delta\to 0}\ \inf_{y \in (x-\delta, x+\delta) \cap [a,b]} f(y).$$

This proves the first equation (for x ∈ [a, b]\D) in (27); the second equation is similarly proved. Finally, we observe that if x ∈ [a, b], then

$$\lim_{\delta\to 0}\ \inf_{y \in (x-\delta, x+\delta) \cap [a,b]} f(y) = \lim_{\delta\to 0}\ \sup_{y \in (x-\delta, x+\delta) \cap [a,b]} f(y)$$

if and only if f is continuous at x. We leave the proof of this simple fact as an exercise. Since Riemann integrability was equivalent to s = t a.e., and D, being countable, is a null set, properties (27) imply that f is Riemann integrable if and only if the set of discontinuities of f is a Lebesgue null set. We proved:

Theorem 10.8 Let f : [a, b] → R be bounded. Then f is Riemann integrable if and only if f is measurable and the set of discontinuities of f in [a, b] is a Lebesgue null set.

It is important, perhaps, to understand that there is a difference between the following two statements about a (measurable) function f : R → C:

The function f is continuous a.e.

The function f is equal a.e. to a continuous function.

For example, the function H(x) defined by H(x) = 0 if x < 0, H(x) = 1 if x > 0 is continuous a.e., but there is no continuous function equal to H a.e. On the other hand, the Dirichlet function χ_Q is equal a.e. to the continuous identically 0 function, but is not continuous a.e.

Exercise 52 Let f : R → C and assume there exists a Borel null set E such that f is continuous at all x ∈ R\E. Prove: There exists a Borel measurable function g such that f = g a.e.

Exercise 53 Let E be a Borel subset of R and let f : E → C. Let y ∈ R. Define f_y : y + E → C by f_y(x) = f(x − y). Prove:

1. f is (Borel) measurable if and only if f_y is measurable.

2. If f ≥ 0 (i.e., if f is real valued and f(x) ≥ 0 for all x ∈ E) and is measurable, then

$$\int_{y+E} f_y\,dm = \int_E f\,dm.$$

3. f ∈ L^1(m) if and only if f_y ∈ L^1(m) (more precisely, f ∈ L^1(E, B_E, m) if and only if f_y ∈ L^1(y + E, B_{y+E}, m), with B_E denoting the family of Borel subsets of E) and in this case

$$\int_{y+E} f_y\,dm = \int_E f\,dm.$$


Exercise 54 Let I be an interval in R and let f : I → R be a Borel measurable function. Let

E = {x ∈ I : f is differentiable at x}.

Show that E is a Borel set and f′ : E → R is measurable. In particular, if f′ exists a.e., then we can consider f′ as a measurable function on E.

Hint: Consider g_n(x) = n[f(x + 1/n) − f(x)] if x, x + 1/n ∈ I; any value you wish if x ∈ I, x + 1/n ∉ I.

Exercise 55 Let f : [a, b] → R be continuous and non-decreasing; assume that f′ is defined a.e. in [a, b]. Show that f′ ≥ 0 and that

$$\int_a^b f'\,dm \le f(b) - f(a).$$

Comment: It is actually a fact that if a function is monotone (increasing or decreasing), its derivative exists a.e. An example in which the integral of the derivative is strictly less than f(b) − f(a) appears in the next series of exercises.

Hint: Consider the function g_n of the preceding exercise, but g_n(x) has to be defined now with a minimum of care when b − 1/n < x ≤ b (the case of x in [a, b] such that x + 1/n ∉ [a, b]) so that g_n ≥ 0 and one can apply one of the theorems valid for non-negative measurable functions. There is, of course, an obvious choice for g_n(x) if x ∈ (b − 1/n, b]; this choice works.

The Cantor Set. One possible inductive definition of the Cantor set is the following. We construct a sequence {E_n} of compact subsets of [0, 1] as follows: E_0 = [0, 1]. Assume that for some n ≥ 0 we have constructed

$$E_n = \bigcup_{i=1}^{2^n} [a_{n,i}, b_{n,i}]$$

where the intervals [a_{n,i}, b_{n,i}] are disjoint and, in fact, b_{n,i} < a_{n,i+1} for i = 1, . . . , 2^n − 1. Set

$$a_{n+1,2i-1} = a_{n,i}, \qquad b_{n+1,2i-1} = a_{n,i} + \frac{b_{n,i} - a_{n,i}}{3},$$

$$a_{n+1,2i} = a_{n,i} + 2\,\frac{b_{n,i} - a_{n,i}}{3}, \qquad b_{n+1,2i} = b_{n,i}$$

for i = 1, . . . , 2^n, and then

$$E_{n+1} = \bigcup_{i=1}^{2^{n+1}} [a_{n+1,i}, b_{n+1,i}].$$

We have, of course, a_{0,1} = 0, b_{0,1} = 1. Then a_{1,1} = a_{0,1} = 0, b_{1,1} = 1/3, a_{1,2} = 2/3, b_{1,2} = 1; and so forth. We are just cutting out the middle one third from each interval. The Cantor set is, by definition,

$$C = \bigcap_{n=1}^{\infty} E_n.$$
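The inductive step above can be sketched in a few lines of Python with exact rational arithmetic. This is only an illustration of the recursion for the endpoints a_{n,i}, b_{n,i}; the function names `next_stage` and `cantor_stage` are hypothetical and do not come from the notes.

```python
from fractions import Fraction


def next_stage(intervals):
    """Remove the open middle third of each closed interval [a, b]."""
    result = []
    for a, b in intervals:
        third = (b - a) / 3
        result.append((a, a + third))      # [a_{n+1,2i-1}, b_{n+1,2i-1}]
        result.append((a + 2 * third, b))  # [a_{n+1,2i},   b_{n+1,2i}]
    return result


def cantor_stage(n):
    """Return the 2**n closed intervals whose union is E_n, left to right."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = next_stage(intervals)
    return intervals


E2 = cantor_stage(2)
total_length = sum(b - a for a, b in cantor_stage(5))
print(E2)
print(total_length)  # total length of the intervals making up E_5
```

Each stage multiplies the total length by 2/3, which is the observation behind the next exercise.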

Exercise 56 Prove that m(C) = 0.

It is, or should be, very easy to see that C is a compact subset of R. It is also easy to see that C has no isolated points; i.e., every point of C is an accumulation point. One way of seeing this is to observe that the endpoints of the intervals whose union makes up E_n are in C for every n. In fact, once an endpoint, always an endpoint. Let x ∈ C, so x ∈ E_n for all n, and let δ > 0. Let n be such that 3^{−n} < δ. Since x ∈ E_n, there exists i with x ∈ [a_{n,i}, b_{n,i}] and, this interval having length 3^{−n}, we will have [a_{n,i}, b_{n,i}] ⊂ (x − δ, x + δ). Thus a_{n,i}, b_{n,i} ∈ (x − δ, x + δ); since both endpoints are in C and at least one of them differs from x, we proved that x is an accumulation point of C. Cantor called a closed set in which every point is an accumulation point perfect, and proved that every non-empty perfect set is uncountable. We will get this result by the somewhat different (though familiar) “base 3-base 2” argument. But before we do this, let us consider U = [0, 1]\C. This is an open subset of [0, 1], but because 0, 1 ∈ C, it is also an open subset of R. The construction of C shows at once that U is the union of a countable family of pairwise disjoint open intervals; in fact,

$$U = \Big(\tfrac{1}{3}, \tfrac{2}{3}\Big) \cup \Big(\tfrac{1}{9}, \tfrac{2}{9}\Big) \cup \Big(\tfrac{7}{9}, \tfrac{8}{9}\Big) \cup \Big(\tfrac{1}{27}, \tfrac{2}{27}\Big) \cup \Big(\tfrac{7}{27}, \tfrac{8}{27}\Big) \cup \Big(\tfrac{19}{27}, \tfrac{20}{27}\Big) \cup \Big(\tfrac{25}{27}, \tfrac{26}{27}\Big) \cup \cdots.$$

Since this is just a particular case of a general result, this might be a good moment to digress, assigning as exercises the proof of the following two results, which are useful in their own right.

Exercise 57 Let U be an open subset of R. Then U is the union of a countable number of pairwise disjoint open intervals. That is, either

$$U = \bigcup_{n \in \mathbb{N},\, n < N} J_n$$

for some N ∈ N, or

$$U = \bigcup_{n \in \mathbb{N}} J_n,$$

where in either case J_n is an open interval for each n and J_n ∩ J_m = ∅ if n ≠ m.

Exercise 58 Let F be a closed subset of R and let f : F → R be continuous. Then f extends to a continuous map from R to R.

Comments and Hints. Both of these exercises are standard Introductory Analysis exercises (and, as such, appear as exercises somewhere in Rudin’s Principles of Mathematical Analysis). The first one does not have any higher dimension analogue; there is no family of basic open sets in R^2 (for example) such that every open subset of R^2 is a countable union of pairwise disjoint sets in that family. The result of the second exercise is, however, true in every metric space (in fact, every normal topological space) as long as the function is bounded. It is then known as Tietze’s Extension Theorem: Let X be a normal topological space, let F be a closed subset of X and let f : F → R be a bounded continuous function. Then f extends to a continuous function on X. The hypothesis of f being bounded can be dropped in some circumstances, for example if X is σ-compact.

The proof of Tietze’s theorem depends heavily on Urysohn’s lemma, and we are not going to discuss it here. Exercise 58 is considerably more elementary, thanks to Exercise 57. In fact, assume f : F → R is continuous, F ⊂ R closed. We will call the function f̄ : R → R defined as follows the linearized extension of f; proving the result of Exercise 58 reduces merely to checking that f̄ is continuous (and a few other mostly immediate statements).

Since R\F is open, write

$$\mathbb{R} \setminus F = \bigcup_{n \in S} (a_n, b_n),$$

where the intervals (a_n, b_n) are pairwise disjoint and either S = {1, . . . , N − 1} for some N ∈ N or S = N. (The former case with N = 1 implies S = ∅ hence also R\F = ∅, hence F = R and there is nothing to prove. This allows us to assume, if we want to, that S ≠ ∅.) Let x ∈ R. If x ∈ F, set f̄(x) = f(x). If x ∉ F, there exists n ∈ S such that a_n < x < b_n. If this interval is bounded, the points a_n, b_n are in F, the domain of f; we extend f linearly into the interval (a_n, b_n), defining

$$\bar{f}(x) = \frac{x - a_n}{b_n - a_n}\, f(b_n) + \frac{b_n - x}{b_n - a_n}\, f(a_n).$$

(If a_n = −∞ or b_n = ∞, one can take f̄ to be constant on the interval, equal to the value of f at its finite endpoint.)
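To make the interpolation formula concrete, here is a minimal Python sketch for the special case where F is a finite set of points (which is closed in R). The function name `linearized_extension` is hypothetical, and the constant continuation on the two unbounded components of R\F is an assumption of this sketch rather than something stated in the notes.

```python
import bisect


def linearized_extension(points, values):
    """Extend f, given on a finite sorted set F = points, to all of R.

    On each bounded gap (a, b) of R minus F the extension interpolates
    linearly between f(a) and f(b). On the two unbounded gaps it is taken
    constant (an assumption of this sketch).
    """
    def f_bar(x):
        i = bisect.bisect_left(points, x)
        if i < len(points) and points[i] == x:
            return values[i]              # x belongs to F
        if i == 0:
            return values[0]              # x lies to the left of F
        if i == len(points):
            return values[-1]             # x lies to the right of F
        a, b = points[i - 1], points[i]
        return ((x - a) / (b - a)) * values[i] + ((b - x) / (b - a)) * values[i - 1]

    return f_bar


f_bar = linearized_extension([0.0, 1.0, 3.0], [0.0, 2.0, -2.0])
print(f_bar(0.5), f_bar(2.0), f_bar(-5.0), f_bar(10.0))
```

For a general closed F the same formula applies gap by gap; only the bookkeeping for locating the gap containing x changes.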

The Cantor set is perhaps best described working with numbers in base 3. In general, if b ∈ N, b ≥ 2 is a number base, if a_1, a_2, . . . are b-digits; i.e., integers in the set {0, 1, . . . , b − 1}, we shall write x = (0.a_1a_2 . . .)_b to indicate the base b expansion of x ∈ [0, 1]; in other words,

$$x = \sum_{n=1}^{\infty} \frac{a_n}{b^n}.$$

Of course, if a_n = 0 for n > N, we also write x = (0.a_1 . . . a_N)_b.

Returning to the Cantor set C and base 3, if S ⊂ N set

$$\alpha_S = \sum_{n \in S} \frac{2}{3^n},$$

equivalently

$$\alpha_S = (0.a_1a_2\ldots)_3, \qquad a_n = 2\chi_S(n) = \begin{cases} 2, & n \in S, \\ 0, & n \notin S. \end{cases}$$

Setting S_n = P({k ∈ N : k ≤ n}) (the power set) for n = 0, 1, . . ., we have the following characterizations of the sets E_n (and of C).

Exercise 59 Let n ∈ N ∪ {0}.

1. If S, T ∈ S_n and

[α_S, α_S + 3^{−n}] ∩ [α_T, α_T + 3^{−n}] ≠ ∅,

then S = T.

2.

$$E_n = \bigcup_{S \in \mathcal{S}_n} [\alpha_S, \alpha_S + 3^{-n}].$$

We can call the points α_S and α_S + 3^{−n} with S ∈ S_n for some n, the end-points of C. They are really endpoints of the intervals making up the complement of C. One should also notice that if S ∈ S_n then α_S + 3^{−n} = α_T, where T = S ∪ {k ∈ N : k > n}. Conversely, if a set T ⊂ N contains all but a finite number of the natural numbers, say T ⊃ {k ∈ N : k > n} for some (certainly not unique) n ∈ N, then setting S = T\{n + 1, n + 2, . . .} we have S ∈ S_n and α_S + 3^{−n} = α_T.
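A quick numerical sanity check of these identities (not a proof, and not part of the original notes) can be sketched in Python. The helper names are hypothetical; the check compares the numbers α_S for S ⊂ {1, . . . , n} with the left endpoints produced by the middle-third construction, and verifies a truncated form of α_S + 3^{−n} = α_T.

```python
from fractions import Fraction
from itertools import combinations


def alpha(S):
    """alpha_S = sum of 2 / 3**n over n in the finite set S."""
    return sum((Fraction(2, 3**n) for n in S), Fraction(0))


def left_endpoints(n):
    """All alpha_S with S a subset of {1, ..., n}, in increasing order."""
    indices = range(1, n + 1)
    subsets = [c for r in range(n + 1) for c in combinations(indices, r)]
    return sorted(alpha(S) for S in subsets)


def cantor_left_endpoints(n):
    """Left endpoints of the intervals of E_n from the middle-third recursion."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = [
            piece
            for a, b in intervals
            for piece in ((a, a + (b - a) / 3), (a + 2 * (b - a) / 3, b))
        ]
    return [a for a, _ in intervals]


print(left_endpoints(2))

# Truncated version of alpha_S + 3**-n = alpha_T for S = {1}, n = 2:
# adding the digits n+1, ..., n+m falls short of 3**-n by exactly 3**-(n+m).
partial = alpha((1,) + tuple(range(3, 8)))
print(alpha((1,)) + Fraction(1, 9) - partial)
```

The shortfall in the last line tends to 0 as more digits of the tail {k > n} are included, consistent with the identity in the text.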

Exercise 60 Show that

C = {α_S : S ⊂ N}.

Conclude that the cardinality of C equals the cardinality of R.


This is not too hard, I hope.

It is also easy to characterize the end-points of the intervals of R\C. Let (a, b) be such an interval. Then there should not exist S ⊂ N such that a < α_S < b. Write a = (0.a_1a_2 . . .)_3, b = (0.b_1b_2 . . .)_3; since a, b ∈ C we may assume that a_n, b_n ∈ {0, 2} for all n. Then we also have (because a < b) that there is n_0 such that a_n = b_n if n < n_0 and a_{n_0} < b_{n_0}. Then a_{n_0} = 0, b_{n_0} = 2. Consider

x = (0.a_1 . . . a_{n_0−1}100 . . .)_3 = (0.a_1 . . . a_{n_0−1}022 . . .)_3.

Then x ∈ C (because of the second representation) and a ≤ x < b. Thus x = a, and this implies that

a = (0.a_1 . . . a_{n_0−1}022 . . .)_3.

Now take x = (0.a_1 . . . a_{n_0−1}200 . . .)_3. Once more we have x ∈ C, but this time a < x ≤ b. It follows that x = b; i.e.,

b = (0.a_1 . . . a_{n_0−1}200 . . .)_3.

All in all we proved: Let (a, b) be one of the pairwise disjoint non-empty open intervals making up [0, 1]\C. Then there exists n ∈ N, S ∈ S_n such that n ∈ S and

$$b = \alpha_S, \qquad a = b - \frac{1}{3^n} = \alpha_{(S \setminus \{n\}) \cup \{i > n\}}.$$

It is time to get a bit more serious about all this. We define a map f : C → [0, 1] by

$$f(\alpha_S) = \sum_{j \in S} \frac{1}{2^j};$$

equivalently, if x = (0.a_1a_2 . . .)_3 ∈ C where a_1, a_2, . . . ∈ {0, 2}, let

$$f(x) = \Big(0.\tfrac{a_1}{2}\tfrac{a_2}{2}\ldots\Big)_2 = \sum_{j=1}^{\infty} \frac{a_j/2}{2^j}.$$

Exercise 61 Show that f is onto from C to [0, 1]. Moreover, show that if x, y ∈ C and x < y, then f(x) < f(y) except if x, y are the two endpoints of one of the intervals removed in the construction of C, in which case f(x) = f(y).

Exercise 62 Show that f : C → [0, 1] is continuous.

Because C is closed, we can extend f to a continuous function on R. Let us denote the linearized extension again by f. We restrict it to [0, 1], so that f : [0, 1] → [0, 1] is continuous and f|_C coincides with the original f. This function is known as the Cantor function and it has a whole bunch of strange, and not so strange, properties.
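The “base 3 to base 2” rule defining f on C is easy to experiment with. The Python sketch below evaluates f exactly on points of C given by finitely many ternary digits in {0, 2} (trailing zeros implied); the function name `cantor_function_on_C` is hypothetical, and points with infinitely many digits 2 are only approximated by truncation.

```python
from fractions import Fraction


def cantor_function_on_C(digits):
    """Value of f at x = (0.d_1 d_2 ... d_m)_3 with every d_j in {0, 2}.

    Each ternary digit d_j is halved and read as a binary digit.
    """
    if any(d not in (0, 2) for d in digits):
        raise ValueError("ternary digits of a point of C must be 0 or 2")
    return sum(
        (Fraction(d // 2, 2**j) for j, d in enumerate(digits, start=1)),
        Fraction(0),
    )


# 2/3 = (0.2)_3, and 1/3 = (0.0222...)_3 truncated here after 21 digits.
right = cantor_function_on_C([2])
left_truncated = cantor_function_on_C([0] + [2] * 20)
print(right, right - left_truncated)
```

The difference printed in the last line is 2^{−21}, illustrating that f takes the same value 1/2 at the two endpoints 1/3 and 2/3 of the first removed interval.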

Exercise 63 Show that f(x) ≤ f(y) for all x, y ∈ [0, 1] such that x < y.

Exercise 64 Show that the Cantor function f is differentiable a.e. and, in fact, f′(x) = 0 for almost every x ∈ [0, 1].

It follows now that

$$\int_{[0,1]} f'\,dm = 0 < 1 = f(1) - f(0).$$


10.3 Lebesgue-Stieltjes measure on the real line

Since we have all the ingredients, we may as well do it. We have, in fact, thefollowing result.

Theorem 10.9 Let φ : R → R be increasing and continuous from the right. There exists a unique Borel measure µ_φ in R such that µ_φ((a, b]) = φ(b) − φ(a) whenever −∞ < a ≤ b < ∞. This measure is regular.

Proof. Let A be the same algebra as in Theorem 5.4. From what we did in Section 5, where we discussed the Lebesgue-Stieltjes measure (first steps), it is easy to see that the measure µ_φ defined there is the only measure on the algebra A which satisfies µ_φ((a, b]) = φ(b) − φ(a); that it is indeed a measure on A is the content of Theorem 5.4. It being clearly a σ-finite measure, Theorem 9.1 implies that it extends uniquely to σ(A), which in this case is seen to be the σ-algebra of Borel sets. For regularity it suffices to prove: If E is a Borel set and ε a positive real number, there exists an open set U in R such that E ⊂ U and µ_φ(U\E) < ε (see Exercise 49).

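To prove this, we define a potentially new measure ν on all subsets of R by

$$\nu(E) = \inf\Big\{\sum_{n=1}^{\infty} \mu_\varphi(I_n) : I_1, I_2, \ldots \text{ open intervals},\ E \subset \bigcup_n I_n\Big\}.$$

We remark that if I = (a, b) is an open interval, then

$$\mu_\varphi(I) = \mu_\varphi\Big(\bigcup_{n=1}^{\infty} \big(a, b - \tfrac{1}{n}\big]\Big) = \lim_{n\to\infty} \mu_\varphi\big(\big(a, b - \tfrac{1}{n}\big]\big) = \varphi(b-) - \varphi(a).$$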

Of course, ν is not really a measure, but it is an outer measure on R. It also satisfies: If E, F ⊂ R and dist(E, F) > 0, then ν(E ∪ F) = ν(E) + ν(F). All this is immediate or quite easy. By Theorem 8.6, ν is (generates) a Borel measure in R. Let a, b ∈ R, a ≤ b. Then ν((a, b)) ≤ µ_φ((a, b)), by the definition of ν. The converse inequality is obtained by going through by now familiar arguments, though a bit simpler because we are dealing with measures. If (a, b) ⊂ ⋃_n I_n where I_n = (a_n, b_n) for all n, then µ_φ((a, b)) ≤ Σ_n µ_φ(I_n), so that (the covering {I_n} being an arbitrary covering by open intervals) µ_φ((a, b)) ≤ ν((a, b)). Thus

ν((a, b)) = µ_φ((a, b)) = φ(b−) − φ(a)

for all open intervals (a, b). Writing (a, b] = ⋂_n (a, b + 1/n), and using the right continuity of φ, one sees that ν((a, b]) = φ(b) − φ(a) = µ_φ((a, b]) for all a, b ∈ R, a ≤ b. But µ_φ was the unique Borel measure with this property, so that ν = µ_φ and we proved that

$$\mu_\varphi(E) = \inf\Big\{\sum_{n=1}^{\infty} \mu_\varphi(I_n) : I_1, I_2, \ldots \text{ open intervals},\ E \subset \bigcup_n I_n\Big\}.$$

Assume now E is a Borel set, ε > 0. Assume first µ_φ(E) < ∞. We can then find a sequence of open intervals {I_n} such that E ⊂ ⋃_n I_n and

$$\sum_{n=1}^{\infty} \mu_\varphi(I_n) < \mu_\varphi(E) + \varepsilon.$$

Setting U = ⋃_n I_n, we get E ⊂ U and µ_φ(U\E) < ε. If µ_φ(E) = ∞, we do the usual σ-finiteness argument. Writing R = ⋃_n F_n where each F_n is a Borel set of finite measure, we can find, given ε > 0, open sets U_n such that E ∩ F_n ⊂ U_n and µ_φ(U_n\(E ∩ F_n)) < ε2^{−n−1}. Setting U = ⋃_{n=1}^∞ U_n, we get E ⊂ U and µ_φ(U\E) < ε.

Integration with respect to this measure is called Lebesgue-Stieltjes integration.
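As a concrete illustration (an example chosen here, not taken from the notes), take φ(x) = x for x < 0 and φ(x) = x + 1 for x ≥ 0, which is increasing and right continuous with a unit jump at 0. The Python sketch below evaluates µ_φ on intervals using µ_φ((a, b]) = φ(b) − φ(a) and µ_φ((a, b)) = φ(b−) − φ(a); the left limit φ(b−) is coded by hand, and all function names are hypothetical.

```python
def phi(x):
    """Right-continuous increasing function with a unit jump at 0."""
    return x + (1.0 if x >= 0 else 0.0)


def phi_left(x):
    """Left limit phi(x-), written out by hand for this particular phi."""
    return x + (1.0 if x > 0 else 0.0)


def mu_half_open(a, b):
    """mu_phi((a, b]) = phi(b) - phi(a)."""
    return phi(b) - phi(a)


def mu_open(a, b):
    """mu_phi((a, b)) = phi(b-) - phi(a)."""
    return phi_left(b) - phi(a)


def mu_point(x):
    """mu_phi({x}) = phi(x) - phi(x-), the size of the jump at x."""
    return phi(x) - phi_left(x)


print(mu_half_open(-1.0, 1.0))  # length 2 plus the unit mass at 0
print(mu_open(-1.0, 0.0), mu_half_open(-1.0, 0.0), mu_point(0.0))
```

Here µ_φ is Lebesgue measure plus a unit point mass at 0, so the intervals (−1, 0) and (−1, 0] receive different measures.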

11 Lusin’s Theorem and related results

Lusin’s Theorem characterizes measurable functions with respect to a Radon measure on a metric space. It shows they are not so far away from continuous functions as one might imagine. It is actually valid also in normal spaces in general, since the main ingredient in its proof is Urysohn’s Lemma. Urysohn’s lemma is quite difficult to prove in a general normal topological space; fortunately enough its proof in the case of a metric space is quite easy.

Theorem 11.1 (Urysohn’s Lemma–metric space version). Let A, B be closed subsets of the metric space X such that A ∩ B = ∅. Then there exists a continuous h : X → [0, 1] such that h(x) = 0 for all x ∈ A, h(x) = 1 for all x ∈ B.

Proof. If A or B is empty, the theorem is trivial. Assume A ≠ ∅ ≠ B. Then

$$h(x) = \frac{d(x, A)}{d(x, A) + d(x, B)}$$

has the desired properties.
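The formula for h can be tried out directly. The following Python sketch takes X = R with the usual distance and, as an assumption made for the example, A = [0, 1] and B = [2, 3]; the helper names are hypothetical. The denominator never vanishes because A and B are closed and disjoint, so no point is at distance 0 from both.

```python
def dist_to_interval(x, interval):
    """Distance from the real number x to the closed interval [lo, hi]."""
    lo, hi = interval
    return max(lo - x, 0.0, x - hi)


def urysohn(A, B):
    """Return h = d(., A) / (d(., A) + d(., B)) for disjoint closed intervals."""
    def h(x):
        d_A = dist_to_interval(x, A)
        d_B = dist_to_interval(x, B)
        return d_A / (d_A + d_B)

    return h


h = urysohn((0.0, 1.0), (2.0, 3.0))
print(h(0.5), h(1.5), h(2.5), h(-1.0))
```

The function vanishes on A, equals 1 on B, and takes values strictly between 0 and 1 elsewhere.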

Corollary 11.2 Let F, U be subsets of the metric space X; assume F is closed, U is open, and F ⊂ U. There exists a continuous function h : X → [0, 1] such that χ_F ≤ h ≤ χ_U.

Proof. The corollary merely states that there exists h : X → [0, 1] continuous such that h(x) = 1 if x ∈ F, h(x) = 0 if x ∉ U. Just apply Urysohn’s lemma with A = X\U, B = F.

We are ready to deal with Lusin’s Theorem.

Theorem 11.3 (Lusin) Let X be a metric space, let µ be a Radon measure in X; i.e., µ is outer and inner regular and µ(K) < ∞ for all compact sets K. Let f : X → C be measurable and assume that µ({f ≠ 0}) < ∞. Then for every ε > 0 there exists a continuous g : X → C such that µ({f ≠ g}) < ε. Moreover, g can be chosen so that supX |g| ≤ supX |f|.

Proof. Step 1. Let A = {f ≠ 0}. We assume first that the measurable function f is real-valued, 0 ≤ f(x) < 1 for all x ∈ X. It may help to remember the way we proved that f was the uniform limit of an increasing sequence of non-negative simple measurable functions, because we reuse the argument. For each n ∈ N we let

\[ E_{n,k} = \Big\{ x \in X : \frac{k-1}{2^n} \le f(x) < \frac{k}{2^n} \Big\} \]

for k = 1, . . . , 2^n. These are, of course, measurable sets and obviously En,k ∩ En,j = ∅ if k ≠ j. Moreover, x ∈ X implies 0 ≤ f(x) < 1 = 2^n/2^n, so that

\[ \bigcup_{k=1}^{2^n} E_{n,k} = X. \]


Set

\[ s_n = \sum_{k=1}^{2^n} \frac{k-1}{2^n}\, \chi_{E_{n,k}} \]

for n = 1, 2, . . .. Given x ∈ X and n ∈ N, there exists a unique k ∈ N such that (k−1)2^{−n} ≤ f(x) < k2^{−n}; we have, of course, k ≤ 2^n, hence sn(x) = (k−1)2^{−n}, 0 ≤ f(x) − sn(x) < 2^{−n}, and since the last inequalities hold for all x ∈ X, uniform convergence of the sequence {sn} to f is clear. Define now the sequence of functions {tn} by t1 = s1, tn = sn − sn−1 if n ≥ 2. Then tn : X → R is a simple measurable function for each n ∈ N and

\[ \sum_{n=1}^{N} t_n = s_N \]

for each N ∈ N. It follows that

\[ \sum_{n=1}^{\infty} t_n = f, \]

convergence being uniform. Let us look at tn in a bit more detail. We claim that

\[ t_n = \frac{1}{2^n}\, \chi_{F_n} \]

where Fn is a measurable subset of A. The exact form of Fn is not terribly important. All we need to prove is that for every x ∈ X, one either has tn(x) = 2^{−n} or tn(x) = 0. So let x ∈ X. It is easy to see that t1(x) = s1(x) = 1/2 if 1/2 ≤ f(x) < 1, t1(x) = s1(x) = 0 otherwise, establishing the claim for n = 1. Assume now n ≥ 2. There exists a unique k, 1 ≤ k ≤ 2^{n−1}, such that (k−1)2^{−n+1} ≤ f(x) < k2^{−n+1}, and sn−1(x) = (k−1)2^{−n+1}. We can also write the interval in which f(x) can be found in the form (2k−2)2^{−n} ≤ f(x) < (2k)2^{−n}, implying that the unique j such that (j−1)2^{−n} ≤ f(x) < j2^{−n} is either j = 2k−1 or j = 2k. If the former, we have

\[ s_n(x) = \frac{2k-2}{2^n} = \frac{k-1}{2^{n-1}} = s_{n-1}(x), \quad \text{thus } t_n(x) = 0; \]

in the latter case

\[ s_n(x) = \frac{2k-1}{2^n}, \quad s_{n-1}(x) = \frac{k-1}{2^{n-1}}, \quad \text{thus } t_n(x) = \frac{2k-1}{2^n} - \frac{k-1}{2^{n-1}} = \frac{1}{2^n}. \]
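The dichotomy tn(x) ∈ {0, 2^{−n}} says that tn(x) is the n-th binary digit of f(x), scaled by 2^{−n}. The Python sketch below checks this with exact rational arithmetic at a single point where f takes the value y; the function names `s` and `t` mirror sn and tn, and the sample values of y are assumptions made for the illustration.

```python
import math
from fractions import Fraction


def s(n: int, y: Fraction) -> Fraction:
    """Value of s_n at a point where f equals y in [0, 1): (k-1)/2^n with (k-1)/2^n <= y < k/2^n."""
    return Fraction(math.floor(y * 2**n), 2**n)


def t(n: int, y: Fraction) -> Fraction:
    """Value of t_n at the same point: t_1 = s_1 and t_n = s_n - s_(n-1) for n >= 2."""
    return s(1, y) if n == 1 else s(n, y) - s(n - 1, y)


if __name__ == "__main__":
    y = Fraction(5, 8)  # binary expansion 0.101
    print([t(n, y) for n in range(1, 6)])
```

For y = 5/8 the non-zero terms are t1 = 1/2 and t3 = 1/8, matching the binary expansion 0.101.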

The claim is established. We have, of course, Fn = {tn = 2^{−n}} = {tn ≠ 0}. The claim we just established implies, very much in particular, that tn ≥ 0 for all n; the fact that f = ∑n tn then implies that f ≥ tn for all n, hence Fn ⊂ {f > 0} = A. Now let ε > 0. Because the Radon measure µ is regular when restricted to σ-finite sets, and because each set Fn has finite measure (being a subset of A), there exist for each n ∈ N an open set Un and a compact set Kn such that Kn ⊂ Fn ⊂ Un and µ(Un\Kn) < 2^{−n−1}ε. By the Corollary to Urysohn’s Lemma, there also exists a continuous function hn : X → [0, 1] such that hn(x) = 1 if x ∈ Kn, hn(x) = 0 if x ∈ X\Un. Let

\[ g = \sum_{n=1}^{\infty} 2^{-n} h_n. \]

We notice that for every x ∈ X, n ∈ N, we have 0 ≤ hn(x) ≤ 1. By the Weierstrass majorant principle, the series of functions ∑n 2^{−n}hn converges uniformly in X, hence g as defined is a continuous function on X. To see that g fits the bill, we notice that if 2^{−n}hn(x) = tn(x) for all n ∈ N, then g(x) = f(x). Thus g(x) ≠ f(x) implies there exists n ∈ N with 2^{−n}hn(x) ≠ tn(x). If x ∈ Kn, then 2^{−n}hn(x) = 2^{−n}, tn(x) = 2^{−n}, the first equality being due to hn|Kn = 1; the second to Kn ⊂ Fn. Thus, assuming 2^{−n}hn(x) ≠ tn(x), we get x ∉ Kn. If x ∉ Un, then x ∉ Fn and both hn(x) and tn(x) are equal to zero. Thus 2^{−n}hn(x) ≠ tn(x) implies that x ∈ Un\Kn, hence {g ≠ f} ⊂ ⋃n(Un\Kn) and

\[ \mu(\{g \ne f\}) \le \sum_{n=1}^{\infty} \mu(U_n \setminus K_n) < \sum_{n=1}^{\infty} \frac{\varepsilon}{2^{n+1}} = \frac{\varepsilon}{2} < \varepsilon. \]

This takes care of the case of a measurable f with f(X) ⊂ [0, 1), except for the assertion about the supremum of |g| being ≤ sup |f|. Assume now that f : X → [0,∞). Setting An = {f ≥ n} we have A ⊃ A1 ⊃ A2 ⊃ · · · and since µ(A) < ∞,

\[ \lim_{n\to\infty} \mu(A_n) = \mu\Big( \bigcap_{n=1}^{\infty} A_n \Big) = \mu(\{f = \infty\}) = 0 \]

so that given ε > 0 we can find n with µ(An) < ε/2. The function

\[ \tilde f = \frac{1}{n}\, \chi_{X \setminus A_n} f \]

satisfies 0 ≤ f̃ < 1 (and is zero outside of A) so that by what we proved there exists a continuous function h on X such that µ({h ≠ f̃}) < ε/2. Setting g = nh, we get that g : X → R is continuous and, since {g ≠ f} ⊂ An ∪ {h ≠ f̃}, that µ({g ≠ f}) < ε. The extension to the complex case is obvious; writing now f = u + iv, where u, v are real valued, we get, given ε > 0, continuous real valued (even non-negative) functions g1, g2, g3, g4 such that

\[ \mu(\{g_1 \ne u^+\}) < \frac{\varepsilon}{4}, \quad \mu(\{g_2 \ne u^-\}) < \frac{\varepsilon}{4}, \quad \mu(\{g_3 \ne v^+\}) < \frac{\varepsilon}{4}, \quad \mu(\{g_4 \ne v^-\}) < \frac{\varepsilon}{4}. \]

With g = g1 − g2 + i(g3 − g4), it follows that µ({g ≠ f}) < ε.

Finally, we address the question of sup |g| ≤ sup |f|. Let ε > 0 and assume we have found a continuous function g1 : X → C such that µ({g1 ≠ f}) < ε. Let M = supx∈X |f(x)|, and assume M < ∞ (if M = ∞, there is nothing to prove). The function ψ : C → C defined by

\[ \psi(z) = \begin{cases} z, & |z| \le M, \\ M \dfrac{z}{|z|}, & |z| > M, \end{cases} \]

is continuous, ψ(C) ⊂ {z ∈ C : |z| ≤ M}. Thus g = ψ ◦ g1 : X → C is continuous and supX |g| ≤ M = supX |f|. Moreover, if g1(x) = f(x), then |g1(x)| ≤ M, hence g(x) = g1(x) = f(x). Thus {g ≠ f} ⊂ {g1 ≠ f} and µ({g ≠ f}) < ε.

If X is a metric space, denote by C(X) the space of all continuous complex valued functions on X. As an easy, but important, corollary to Lusin’s Theorem, we get

Theorem 11.4 Let X be a metric space and let µ be a Radon measure in X. Then C(X) ∩ L1(µ) is dense in L1(µ).

Proof. Let f ∈ L1(µ) and assume ε > 0 has been given. By Theorem 7.25, there exists a simple, integrable function s such that ‖f − s‖1 < ε/2. The function s, being an integrable simple function, vanishes outside of a set A of finite measure. If s = 0 we take g = 0; otherwise, by Lusin’s Theorem, setting M = supX |s| > 0, we can find a continuous function g : X → C such that supX |g| ≤ M and µ({g ≠ s}) < ε/(4M). We have

\[ \begin{aligned} \|f - g\|_1 &\le \|f - s\|_1 + \|s - g\|_1 < \frac{\varepsilon}{2} + \int_{\{g \ne s\}} |s - g| \, d\mu \\ &\le \frac{\varepsilon}{2} + \int_{\{g \ne s\}} (|s| + |g|) \, d\mu \\ &\le \frac{\varepsilon}{2} + 2M \mu(\{g \ne s\}) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \end{aligned} \]

12 Other form(s) of convergence.

Let (X,M, µ) be a measure space and for each n ∈ N let fn : X → C be a measurable function. We already saw several ways in which the sequence {fn} can converge to a function f : X → C. Here are some measure theory relevant definitions of convergence which we either saw or are easy to relate to known forms.

Uniform convergence a.e. The sequence {fn} converges uniformly a.e. to f iff there exists E ∈ M, µ(E) = 0, such that {fn} converges uniformly to f in X\E.

Pointwise convergence a.e. The sequence {fn} converges a.e. to f iff the set E consisting of all x ∈ X such that the numerical sequence {fn(x)} does not converge to f(x) is contained in a null set.

L1 convergence. The sequence {fn} converges in L1 to f iff limn→∞ ‖f − fn‖1 = 0.

As is usual, we say that a sequence {fn} converges (in any of these forms, or any other form yet to come) iff there is some f to which it converges. Clearly, uniform convergence a.e. implies a.e. convergence; a.e. convergence plus the existence of a dominating L1-function implies L1 convergence; L1 convergence implies there is a subsequence converging a.e. (Theorem 7.21). To make all this much more fun and exciting, we introduce another form in which sequences of functions can converge.

Definition 42 Let (X,M, µ) be a measure space and let {fn} be a sequence of complex valued measurable functions defined on X. We say the sequence converges almost uniformly to f : X → C iff for every ε > 0 there exists a measurable subset D of X such that µ(D) < ε and the sequence converges uniformly to f on X\D.

It is clear that uniform convergence a.e. implies almost uniform convergence. On the other hand, we have:

Exercise 65 Let (X,M, µ) be a measure space and let {fn} be a sequence of complex valued measurable functions defined on X, converging almost uniformly to f. Show that the sequence converges a.e. to f.

The interesting thing is that the converse of the result of the last exercise is almost true. It is true if the measure space is finite; that is Egoroff’s Theorem.

Theorem 12.1 (Egoroff) Let (X,M, µ) be a finite measure space and let {fn} be a sequence of complex valued, measurable functions on X converging a.e. to a function f : X → C. Then {fn} converges almost uniformly to f.


Proof. We proceed in a somewhat informal way. Our hypothesis is that if we denote by E the set of all x ∈ X for which the sequence of complex numbers {fn(x)} either does not converge or converges but not to f(x), then µ(E) = 0. We have that x ∉ E if and only if for every ε > 0 we can find N ∈ N such that |fn(x) − f(x)| < ε if n ≥ N. This means that x ∈ E if and only if there is some ε > 0 for which no N works. To say that no N works means that we can keep finding larger and larger n’s for which |fn(x) − f(x)| < ε is false; in other words, for every N ∈ N, there is n ≥ N such that |fn(x) − f(x)| ≥ ε. It is clear that we can assume that this bad ε (which, of course, may depend on the point x) is of the form 1/m, with m ∈ N. So we see that x ∈ E if and only if there exists m ∈ N such that for every N ∈ N there is n ≥ N such that |fn(x) − f(x)| ≥ 1/m. We can write this as follows in set theoretic notation. For m, n ∈ N we introduce the measurable set

\[ E_{m,n} = \Big\{ x \in X : |f_n(x) - f(x)| \ge \frac{1}{m} \Big\}. \]

Then all that we have said can be compressed into the equation

\[ E = \bigcup_{m=1}^{\infty} \bigcap_{N=1}^{\infty} \bigcup_{n=N}^{\infty} E_{m,n}. \]

Since E is a null set, so is each one of the sets

\[ \bigcap_{N=1}^{\infty} \bigcup_{n=N}^{\infty} E_{m,n}, \]

since they are all subsets of E. Here is where the finiteness of the measure µ comes into play, because of it

\[ 0 = \mu\Big( \bigcap_{N=1}^{\infty} \bigcup_{n=N}^{\infty} E_{m,n} \Big) = \lim_{N\to\infty} \mu\Big( \bigcup_{n=N}^{\infty} E_{m,n} \Big). \tag{28} \]

A bit of meditating suggests how to conclude this proof. It is time to bring in ε > 0, so assume that ε > 0 has been given. By (28), which is valid for every m ∈ N, we can find for each m ∈ N a number Nm ∈ N such that

\[ \mu\Big( \bigcup_{n=N_m}^{\infty} E_{m,n} \Big) < \frac{\varepsilon}{2^{m+1}}. \]

Setting

\[ D = \bigcup_{m=1}^{\infty} \bigcup_{n=N_m}^{\infty} E_{m,n} \]

we see that µ(D) < ε. If η > 0 is given, then let m ∈ N, 1/m < η. If x ∉ D, then

\[ x \notin \bigcup_{n=N_m}^{\infty} E_{m,n}, \]

hence x ∉ Em,n for all n ≥ Nm, hence |fn(x) − f(x)| < 1/m < η if n ≥ Nm. Since Nm does not depend on x, this proves uniform convergence to f on X\D.
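A standard example makes the theorem concrete: on [0, 1] with Lebesgue measure, fn(x) = x^n converges to 0 at every x < 1, but not uniformly on [0, 1), where the supremum of x^n is 1 for every n. Removing the small set D = (1 − ε/2, 1], of measure ε/2 < ε, restores uniform convergence. The Python sketch below computes the supremum on the remaining set; the choice of example and the function names are assumptions for illustration.

```python
def sup_off_D(n: int, eps: float) -> float:
    """sup of |x**n - 0| over [0, 1 - eps/2]; x**n is increasing, so it is the right endpoint value."""
    return (1.0 - eps / 2.0) ** n


def first_index_below(eta: float, eps: float) -> int:
    """Smallest n with sup_off_D(n, eps) < eta; from this n on, |f_n - 0| < eta on X minus D."""
    n = 1
    while sup_off_D(n, eps) >= eta:
        n += 1
    return n


if __name__ == "__main__":
    eps = 0.1
    for eta in (0.1, 0.01, 0.001):
        print(eta, first_index_below(eta, eps))
```

The index returned depends on η and on ε but not on the point x, which is exactly what uniform convergence off D means.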

Exercise 66 Use Egoroff’s Theorem to give an alternative proof of Lebesgue’s Dominated Convergence Theorem. Do not use Fatou’s Lemma, but you may have to use that if g ∈ L1(µ) and ε > 0, there exists a measurable set A in X such that µ(A) < ∞ and

\[ \int_{X \setminus A} |g| \, d\mu < \varepsilon. \]

Prove this result, by the way, which can be used to reduce the proof of the theorem to the finite measure case.

We conclude this section with one further type of convergence.

Definition 43 Let (X,M, µ) be a measure space and let {fn} be a sequence of complex valued measurable functions defined on X. The sequence {fn} converges in measure to f iff for every real α > 0 we have

\[ \lim_{n\to\infty} \mu(\{x \in X : |f_n(x) - f(x)| > \alpha\}) = 0. \]

The next exercise explores these concepts a bit more.

Exercise 67 Let {fn} be a sequence of complex valued measurable functions defined on the measure space (X,M, µ). Prove:

1. If {fn} converges almost uniformly, it converges in measure.

2. If {fn} converges in measure, there is a subsequence converging a.e.

3. If {fn} converges in measure and there exists g ∈ L1(µ) such that |fn| ≤ g a.e. for every n ∈ N, then {fn} converges in L1(µ).
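Part 2 cannot be improved to a.e. convergence of the whole sequence. The usual counterexample is the "typewriter" sequence on [0, 1) with Lebesgue measure: writing n = 2^k + j with 0 ≤ j < 2^k, let fn be the indicator function of [j/2^k, (j+1)/2^k). The Python sketch below (an illustration with hypothetical helper names, not part of the notes) shows that the measure of {fn ≠ 0} tends to 0 while every point is hit once in each block, and that the subsequence n = 2^k does converge away from 0.

```python
from fractions import Fraction


def block(n: int) -> tuple[int, int]:
    """Write n >= 1 as n = 2**k + j with 0 <= j < 2**k and return (k, j)."""
    k = n.bit_length() - 1
    return k, n - 2**k


def f(n: int, x: Fraction) -> int:
    """Typewriter sequence: indicator of the dyadic interval [j/2**k, (j+1)/2**k)."""
    k, j = block(n)
    return 1 if Fraction(j, 2**k) <= x < Fraction(j + 1, 2**k) else 0


def measure_nonzero(n: int) -> Fraction:
    """Lebesgue measure of {f_n != 0}, i.e. the length 2**-k of the dyadic interval."""
    k, _ = block(n)
    return Fraction(1, 2**k)


if __name__ == "__main__":
    x = Fraction(1, 3)
    print([n for n in range(1, 64) if f(n, x) == 1])  # one hit in every block
    print(measure_nonzero(1024))                        # tends to 0 as n grows
```

So fn → 0 in measure, yet {fn(x)} takes the values 0 and 1 infinitely often at every x and converges nowhere.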

Exercise 68 Let (X,M, µ) be a measure space. If f, g : X → C are measurable, define

\[ d(f, g) = \min\Big( 1, \sup_{\varepsilon > 0} \mu(\{|f - g| > \varepsilon\}) \Big). \]

Show that d defines a metric in the space of all complex measurable functions on X (f = g being interpreted as f = g a.e.) and a sequence {fn} converges in this metric if and only if it converges in measure.

More exercises.

Exercise 69 Let µ be a Borel measure in Rn and assume that µ(K) < ∞ for all compact sets K and that µ is outer regular; i.e., µ(E) = inf{µ(U) : U open, E ⊂ U}. True or false: µ is regular? Prove or disprove.

Exercise 70 A well known theorem due to Steinhaus states: Let E be a measurable subset of the real line of positive measure (if no measure, or σ-algebra of measurable sets is specified, it is understood that measurable means Lebesgue measurable and the measure is Lebesgue measure). Then E − E = {x − y : x, y ∈ E} is a neighborhood of 0. That is, there is some δ > 0 such that (−δ, δ) ⊂ E − E. Prove this theorem by justifying and using the following remarks.

i. One can assume the measure of E is finite, so assume 0 < m(E) < ∞.

ii. Assuming E − E is not a neighborhood of 0, there exists a sequence {tn} of real numbers such that limn→∞ tn = 0 and (tn + E) ∩ E = ∅ for all n ∈ N.


iii. Let

\[ c = \int_{\mathbb{R}} \chi_E^2 \, dm = \int_{\mathbb{R}} \chi_E \, dm = m(E) > 0, \]

but

\[ \int_{\mathbb{R}} \chi_{t_n + E}\, \chi_E \, dm = 0 \]

for all n. If one could show that limn→∞ χtn+E = χE a.e., we’d have a contradiction.

Well, one can’t show this.

iv. Invoke Lusin’s theorem to replace χE by a continuous function ψ differing from χE on a set of measure < ε. By a judicious choice of ε one can get a contradiction.

Exercise 71 Now that you have proved Steinhaus’ Theorem, prove the following generalization: Let E, F be measurable subsets of R, assume that both E and F have positive measure. Then E − F = {x − y : x ∈ E, y ∈ F} has non-empty interior; i.e., there exist x ∈ R and δ > 0 such that (x − δ, x + δ) ⊂ E − F.

Because we are having so much fun with all this, here is another way of getting to Steinhaus’ theorem; perhaps a more elementary way than the one of Exercise 70. But we have to prepare the ground first.

Exercise 72 Let E be a measurable subset of R and assume m(E) > 0, where m denotes Lebesgue measure. Show that for every α ∈ (0, 1) there is an open interval I such that m(E ∩ I) > αm(I). (Because of the strict inequality, I is bounded and I ≠ ∅.)
Hint: Assume the result false and think outer regularity.

Remark Notice that the result of the previous exercise falls just short of fitting an interval inside a set of positive measure. If one could prove the existence of an interval such that m(E ∩ I) = m(I) (the case α = 1) (and assume now I not empty), then I\E would be a null set.

Exercise 73 Let E be a measurable subset of R and assume m(E) > 0, where m denotes Lebesgue measure. Show that if I = (a, b) is an open interval and 1/2 < α < 1 is such that m(E ∩ I) > αm(I), then

\[ E - E \supset (-\delta, \delta) \quad \text{for } \delta = \Big(\alpha - \frac{1}{2}\Big)(b - a). \]

In conjunction with Exercise 72 this proves, once more, Steinhaus’ Theorem.
Hints: One can replace E by E ∩ I so E ⊂ (a, b). If x ∈ (−δ, δ) then both E, x + E are contained in either (a + x, b) (if x < 0) or in (a, b + x) (if x ≥ 0).

Exercise 74 (cf. Rudin, R. & C. Analysis, Chapter 2, Exercise 17) Define the following strange metric on the plane: If (x1, y1), (x2, y2) ∈ R2, define

\[ d((x_1, y_1), (x_2, y_2)) = \begin{cases} |y_1 - y_2|, & \text{if } x_1 = x_2, \\ 1 + |y_1 - y_2|, & \text{if } x_1 \ne x_2. \end{cases} \]

Exercise 74, Part a. Show that d is indeed a metric for R2. Given E ⊂ R2, x, y ∈ R, define the vertical x-section of E and the horizontal y-section of E by

Ex = {t ∈ R : (x, t) ∈ E}, Ey = {s ∈ R : (s, y) ∈ E},

respectively. Then

\[ E = \bigcup_{x \in \mathbb{R}} \{x\} \times E_x = \bigcup_{y \in \mathbb{R}} E^y \times \{y\}. \]

Show that a subset U of R2 is open (all topological/metric terms in this exercise refer to the given metric, except if otherwise noted) if and only if Ux is open in the usual topology of R for each x ∈ R. Show that a subset K is compact if and only if the set

{x ∈ R : Kx ≠ ∅}

is finite and Kx is compact for every x in that set (hence also for every x ∈ R). End of Part a.

We will define a measure µ on R2 as follows. The family of measurable sets will be

M = {E ⊂ R2 : Ex ∈ Λ ∀x ∈ R},

where Λ denotes the σ-algebra of Lebesgue subsets of R. In M we consider two classes of sets; class one, sets E such that Ex = ∅ for all but a countable number of x ∈ R. For these sets we define

\[ \mu(E) = \sum_{x \in \mathbb{R}} m(E_x). \]

Class two consists of all sets for which {x ∈ R : Ex ≠ ∅} is uncountable. For such sets we set µ(E) = ∞.

Exercise 74, Part b. Show that (R2,M, µ) is a measure space. Show that µ is a Radon measure in R2 (of course, with the funny metric) which is NOT regular. So here you have to show

i. M is a sigma-algebra, which is very easy.

ii. µ as defined is a measure on M. This is easy, though some division into cases may have to be done.

iii. µ(E) = inf{µ(V) : E ⊂ V, V open} holds for all E ∈ M (or for all Borel sets E). This is again quite easy. Notice that if µ(E) = ∞ it is trivial. If µ(E) < ∞ you may have to use an ε/2^n argument.

iv. If U is open in R2 (in the funny metric), then

µ(U) = sup{µ(K) : K compact, K ⊂ U}.

Fairly easy, though one may have to divide into several cases. There is the case when µ(U) < ∞. The case µ(U) = ∞ divides into subcases, depending on whether {x : Ux ≠ ∅} is countable or not.

v. µ(K) < ∞ if K ⊂ R2 is compact. Easy.

vi. There exists E ∈ M such that

µ(E) > sup{µ(K) : K compact, K ⊂ E}.

A set that works (among many) is E = {(x, 0) : x ∈ R}. End of Part b.

Exercise 74, Part c. With µ as defined show that

1. Let Y be a metric space. Then f : R2 → Y is measurable if and only if fx : R → Y is measurable for every x ∈ R, where fx is defined by

fx(t) = f(x, t), t ∈ R.

2. Let f : R2 → [0,∞] be measurable, and let E(f) = {x ∈ R : fx ≠ 0}. To be precise, x ∈ E(f) if there exists y ∈ R such that f(x, y) ≠ 0 (even if it is only one single y). Show that

\[ \int_{\mathbb{R}^2} f \, d\mu = \begin{cases} \infty, & \text{if } E(f) \text{ is uncountable}, \\ \sum_{x \in E(f)} \int_{\mathbb{R}} f_x \, dm, & \text{if } E(f) \text{ is countable}. \end{cases} \]

3. Show that f ∈ L1(µ) if and only if E(f) is countable and

\[ \sum_{x \in E(f)} \int_{\mathbb{R}} |f_x| \, dm < \infty. \]

In this case, writing E(f) = {x1, x2, . . .} where xi ≠ xj if i ≠ j, we have

\[ \int_{\mathbb{R}^2} f \, d\mu = \sum_{i=1}^{\infty} \int_{-\infty}^{\infty} f(x_i, y) \, dy. \]

13 Product Measures

In this section we assume that (X,M, µ), (Y,N , ν) are σ-finite measure spaces. We want to provide the Cartesian product X × Y with a measure space structure in which all sets of the form A × B with A ∈ M, B ∈ N are measurable and have measure µ(A)ν(B). We then want to see how you integrate with respect to this product measure. Finally, we discuss the case of more than two factors.

We begin with some notation. We will denote (in this section) by E the family of all cartesian products of sets of M by sets of N . That is,

E = {A × B : A ∈ M, B ∈ N}.

The elements of E are sometimes called elementary rectangles.

Lemma 13.1 E is an elementary class in X × Y .

Proof. Everything is immediate, one just has to remember what has to be proved. The following rather immediate formulas and facts do the job.

• X × Y ∈ E .

• (A×B) ∩ (C ×D) = (A ∩ C)× (B ∩D).

• (X × Y )\(A×B) = ((X\A)× Y ) ∪ (A× (Y \B)).
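The two set identities above can be checked mechanically on small finite sets. The Python sketch below does so; the specific sets X, Y, A, B, C, D and the helper `rect` are assumptions chosen only for illustration.

```python
from itertools import product


def rect(A: set, B: set) -> set:
    """The rectangle A x B as a set of ordered pairs."""
    return set(product(A, B))


if __name__ == "__main__":
    X, Y = {0, 1, 2}, {"a", "b"}
    A, B = {0, 1}, {"a"}
    C, D = {1, 2}, {"a", "b"}

    # Intersection of two rectangles is a rectangle.
    print(rect(A, B) & rect(C, D) == rect(A & C, B & D))

    # Complement of a rectangle is a union of two disjoint rectangles.
    left, right = rect(X - A, Y), rect(A, Y - B)
    print(rect(X, Y) - rect(A, B) == left | right, left.isdisjoint(right))
```

Note that the two rectangles making up the complement are disjoint, which is what one needs for an elementary class.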

Because E is an elementary class, we know that finite unions of elements of E constitute an algebra and every element of this algebra can be written as a union of a pairwise disjoint finite number of elements of E . Let us call this algebra A;

\[ \begin{aligned} \mathcal{A} &= \Big\{ \bigcup_{i=1}^{n} E_i : E_1, \ldots, E_n \in \mathcal{E} \Big\} \\ &= \Big\{ \bigcup_{i=1}^{n} (A_i \times B_i) : A_1, \ldots, A_n \in \mathcal{M},\ B_1, \ldots, B_n \in \mathcal{N} \Big\}. \end{aligned} \]

It is time to begin defining the product measure. It is denoted by µ ⊗ ν. We begin defining it on E and then extend it step by step to a σ-algebra of measurable sets. We will keep on using the same symbol for the (presumed or putative) measure as we extend it, but one should remember that at each extension, one has to verify that the new definition still assigns the same measure to the sets on which the measure had been defined so far. Our first definition is

µ ⊗ ν(A × B) = µ(A)ν(B) if A ∈ M, B ∈ N ,

with the proviso that if one of µ(A), ν(B) is 0, the other one ∞, then µ ⊗ ν(A × B) = 0. This defines µ ⊗ ν : E → [0,∞].

Before proceeding we need to prove the following lemma, which (as in the definition of Lebesgue measure in Rn, which is a product measure) could be the hardest step. You may want to compare this proof with the proof of Theorem 8.3; obviously it is the same argument.

Lemma 13.2 Let {An} be a sequence of sets in M, {Bn} be a sequence of sets in N ; assume (An × Bn) ∩ (Am × Bm) = ∅ if n ≠ m and

\[ \bigcup_{n=1}^{\infty} (A_n \times B_n) = A \times B \]

for some A ∈ M, B ∈ N . Then

\[ \mu \otimes \nu(A \times B) = \sum_{n=1}^{\infty} \mu \otimes \nu(A_n \times B_n); \]

that is,

\[ \mu(A)\nu(B) = \sum_{n=1}^{\infty} \mu(A_n)\nu(B_n). \]

Proof. Let y ∈ B. (This is a statement that normally has to be preceded with an argument explaining why B is not empty; in our case, if B, or A, is empty one can see that the lemma reduces to 0 = 0 + 0 + · · · , so we can assume B not empty.) We now single out all “rectangles” An × Bn such that y ∈ Bn, which is done by singling out the indices n for which this happens. We define

P (y) = {n ∈ N : y ∈ Bn}.

The set P (y) is, of course, not empty (once we assume A ≠ ∅ ≠ B). We now claim:

1. If n, m ∈ P (y), n ≠ m, then An ∩ Am = ∅.

2. \[ A = \bigcup_{n \in P(y)} A_n. \]

In fact, let x ∈ An ∩ Am, n, m ∈ P (y). Then (x, y) ∈ An × Bn and (x, y) ∈ Am × Bm, hence n = m. This takes care of 1. Let x ∈ A. Then (x, y) ∈ A × B and hence there exists n ∈ N with (x, y) ∈ An × Bn; i.e., x ∈ An, y ∈ Bn. By definition of P (y) we have n ∈ P (y). This takes care of 2; the claim is established. By the claim, we have

\[ \mu(A) = \sum_{n \in P(y)} \mu(A_n). \]

Now n ∈ P (y) if and only if y ∈ Bn, if and only if χBn(y) = 1; we can thus write the expression we found for µ(A) in the form

\[ \mu(A) = \sum_{n=1}^{\infty} \mu(A_n)\chi_{B_n}(y). \]

Finally, multiplying by χB(y) = 1 we get

\[ \mu(A)\chi_B(y) = \sum_{n=1}^{\infty} \mu(A_n)\chi_{B_n}(y). \]

This last equality is true not only for y ∈ B but also, quite trivially, for y ∈ Y \B (where all terms are 0). In other words, it holds for all y ∈ Y and we can integrate with respect to ν to get (by Beppo-Levi’s or Lebesgue’s monotone convergence theorem),

\[ \mu(A) \int_Y \chi_B \, d\nu = \sum_{n=1}^{\infty} \mu(A_n) \int_Y \chi_{B_n} \, d\nu; \]

i.e.,

\[ \mu(A)\nu(B) = \sum_{n=1}^{\infty} \mu(A_n)\nu(B_n). \]
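The mechanism of the proof, slicing at a fixed y and collecting the indices P(y), can be watched on a finite example. In the Python sketch below the measures are given by point weights on {0, 1, 2}, and a 3 × 3 square is cut into five disjoint rectangles in a "pinwheel" pattern that is not a grid; the weights, the pieces and the function names are all assumptions for illustration.

```python
from fractions import Fraction

# Hypothetical point weights defining mu on X = {0, 1, 2} and nu on Y = {0, 1, 2}.
mu_w = {0: Fraction(1, 2), 1: Fraction(2), 2: Fraction(3)}
nu_w = {0: Fraction(1), 1: Fraction(1, 3), 2: Fraction(5)}


def mu(A: set) -> Fraction:
    return sum((mu_w[x] for x in A), Fraction(0))


def nu(B: set) -> Fraction:
    return sum((nu_w[y] for y in B), Fraction(0))


A, B = {0, 1, 2}, {0, 1, 2}
# Pairwise disjoint rectangles A_n x B_n whose union is A x B (a pinwheel, not a grid).
pieces = [({0, 1}, {0}), ({2}, {0, 1}), ({0}, {1, 2}), ({1, 2}, {2}), ({1}, {1})]


def P(y: int) -> list[int]:
    """Indices n of the pieces whose second factor B_n contains y."""
    return [n for n, (_, Bn) in enumerate(pieces) if y in Bn]


def sum_over_pieces() -> Fraction:
    return sum((mu(An) * nu(Bn) for An, Bn in pieces), Fraction(0))


if __name__ == "__main__":
    for y in sorted(B):
        print(y, P(y), sum(mu(pieces[n][0]) for n in P(y)) == mu(A))
    print(sum_over_pieces() == mu(A) * nu(B))
```

For each y the sets An with n ∈ P(y) partition A, and summing over y with the weights of ν reproduces µ(A)ν(B).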

We are ready to extend our measure to A. Let E ∈ A. Assume that we have

\[ E = \bigcup_{n=1}^{N} (A_n \times B_n) = \bigcup_{m=1}^{M} (C_m \times D_m), \]

where An, Cm ∈ M, Bn, Dm ∈ N for 1 ≤ n ≤ N , 1 ≤ m ≤ M , and (An × Bn) ∩ (Ak × Bk) = ∅ if n ≠ k, (Cm × Dm) ∩ (Cj × Dj) = ∅ if m ≠ j. In other words, we have two representations of E as a union of pairwise disjoint elements of E . We can then form a third such representation by just intersecting sets from one family with sets from the other. We have

\[ A_n \times B_n \subset \bigcup_{m=1}^{M} C_m \times D_m, \]

hence

\[ A_n \times B_n = \bigcup_{m=1}^{M} (A_n \times B_n) \cap (C_m \times D_m) = \bigcup_{m=1}^{M} (A_n \cap C_m) \times (B_n \cap D_m). \]

The sets (An ∩ Cm) × (Bn ∩ Dm), m = 1, . . . , M , are pairwise disjoint and we can apply Lemma 13.2 (if necessary add empty sets to this family so as to make a sequence of pairwise disjoint sets out of it) to get

\[ \mu(A_n)\nu(B_n) = \sum_{m=1}^{M} \mu(A_n \cap C_m)\nu(B_n \cap D_m). \]

Switching the roles of the families, we get similarly

\[ \mu(C_m)\nu(D_m) = \sum_{n=1}^{N} \mu(A_n \cap C_m)\nu(B_n \cap D_m). \]

It follows that

\[ \begin{aligned} \sum_{n=1}^{N} \mu(A_n)\nu(B_n) &= \sum_{n=1}^{N} \sum_{m=1}^{M} \mu(A_n \cap C_m)\nu(B_n \cap D_m) \\ &= \sum_{m=1}^{M} \sum_{n=1}^{N} \mu(A_n \cap C_m)\nu(B_n \cap D_m) = \sum_{m=1}^{M} \mu(C_m)\nu(D_m). \end{aligned} \]


This allows us to define µ ⊗ ν(E) if E ∈ A as follows: Write, as one can, E = (A1 × B1) ∪ · · · ∪ (AN × BN) where An ∈ M, Bn ∈ N for 1 ≤ n ≤ N and the sets An × Bn are pairwise disjoint. Define

\[ \mu \otimes \nu(E) = \sum_{n=1}^{N} \mu(A_n)\nu(B_n). \]

By the computations preceding this definition, it is clear that the definition does not depend on the particular decomposition of E into pairwise disjoint elements of E . It is also clear that if E = A × B ∈ E , using the decomposition E = A × B shows that µ ⊗ ν(E) = µ(A)ν(B), so the new definition coincides with the old one on E .

Lemma 13.3 µ⊗ ν is a measure on A.

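Proof. All we need to prove is that if E, E1, E2, . . . ∈ A, E = ⋃n En, and if En ∩ Em = ∅ if n ≠ m, then

\[ \mu \otimes \nu(E) = \sum_{n=1}^{\infty} \mu \otimes \nu(E_n). \]

So assume E, E1, E2, . . . as stated. We can write

\[ E = \bigcup_{k=1}^{K} A_k \times B_k \]

where A1, . . . , AK ∈ M, B1, . . . , BK ∈ N , (Ak × Bk) ∩ (Ak′ × Bk′) = ∅ if k ≠ k′. Similarly, for each n ∈ N we can write

\[ E_n = \bigcup_{j=1}^{J_n} A_{n,j} \times B_{n,j} \]

where An,1, . . . , An,Jn ∈ M, Bn,1, . . . , Bn,Jn ∈ N , (An,j × Bn,j) ∩ (An,j′ × Bn,j′) = ∅ if j ≠ j′. We now have, since Ak × Bk = (Ak × Bk) ∩ E,

\[ A_k \times B_k = \bigcup_{n=1}^{\infty} \bigcup_{j=1}^{J_n} (A_k \cap A_{n,j}) \times (B_k \cap B_{n,j}), \]

for k = 1, 2, . . . , K. The family of sets

\[ \{ (A_k \cap A_{n,j}) \times (B_k \cap B_{n,j}) \}_{n \in \mathbb{N},\ 1 \le j \le J_n} \]

consists of pairwise disjoint sets, so that by Lemma 13.2 we get

\[ \mu(A_k)\nu(B_k) = \sum_{n=1}^{\infty} \sum_{j=1}^{J_n} \mu(A_k \cap A_{n,j})\nu(B_k \cap B_{n,j}), \]

for k = 1, . . . , K. On the other hand, we also have for n ∈ N, 1 ≤ j ≤ Jn, An,j × Bn,j = (An,j × Bn,j) ∩ E, hence

\[ A_{n,j} \times B_{n,j} = \bigcup_{k=1}^{K} (A_k \cap A_{n,j}) \times (B_k \cap B_{n,j}). \]

Applying again Lemma 13.2 (any two sets in the union of the right hand side being disjoint), we get

\[ \mu(A_{n,j})\nu(B_{n,j}) = \sum_{k=1}^{K} \mu(A_k \cap A_{n,j})\nu(B_k \cap B_{n,j}). \]

Putting it all together:

\[ \begin{aligned} \mu \otimes \nu(E) = \sum_{k=1}^{K} \mu(A_k)\nu(B_k) &= \sum_{k=1}^{K} \sum_{n=1}^{\infty} \sum_{j=1}^{J_n} \mu(A_k \cap A_{n,j})\nu(B_k \cap B_{n,j}) \\ &= \sum_{n=1}^{\infty} \sum_{j=1}^{J_n} \sum_{k=1}^{K} \mu(A_k \cap A_{n,j})\nu(B_k \cap B_{n,j}) \\ &= \sum_{n=1}^{\infty} \sum_{j=1}^{J_n} \mu(A_{n,j})\nu(B_{n,j}) = \sum_{n=1}^{\infty} \mu \otimes \nu(E_n). \end{aligned} \]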

Thanks to our extension theory, we are basically done. We will denote by M ⊗ N the σ-algebra generated by E in X × Y (which is also the σ-algebra generated by A). We recall that we are assuming that µ, ν are σ-finite; then µ ⊗ ν is clearly σ-finite on A. By Lemma 13.3 and the extension theorem, Theorem 9.1, we have

Theorem 13.4 Let (X,M, µ), (Y,N , ν) be σ-finite measure spaces. There exists a unique measure µ ⊗ ν defined on the σ-algebra M ⊗ N generated by the family of “measurable rectangles” A × B, A ∈ M, B ∈ N , such that

µ ⊗ ν(A × B) = µ(A)ν(B)

if A ∈ M, B ∈ N .

We will denote by $\overline{\mathcal{M} \otimes \mathcal{N}}$ the σ-algebra one obtains when completing the measure space (X × Y,M ⊗ N , µ ⊗ ν); that is, for example, E belongs to $\overline{\mathcal{M} \otimes \mathcal{N}}$ if and only if E = F ∪ G where F ∈ M ⊗ N and G ⊂ N0 for some N0 ∈ M ⊗ N with µ ⊗ ν(N0) = 0. We will say that a subset E of X × Y is measurable iff E belongs to $\overline{\mathcal{M} \otimes \mathcal{N}}$.

Definition 44 Let E ⊂ X × Y . If x ∈ X, we define Ex ⊂ Y by

Ex = {y ∈ Y : (x, y) ∈ E}.

Similarly, if y ∈ Y , we define Ey ⊂ X by

Ey = {x ∈ X : (x, y) ∈ E}.

Lemma 13.5 Let E ∈ M ⊗ N . Then

1. Ex ∈ N for all x ∈ X, the map x ↦ ν(Ex) : X → [0,∞] is measurable and

\[ \int_X \nu(E_x) \, d\mu(x) = \mu \otimes \nu(E). \]

2. Ey ∈ M for all y ∈ Y , the map y ↦ µ(Ey) : Y → [0,∞] is measurable and

\[ \int_Y \mu(E^y) \, d\nu(y) = \mu \otimes \nu(E). \]
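In a finite setting the lemma is just the statement that a double sum can be organized by rows or by columns. The Python sketch below computes both "integrals of sections" for a subset of a finite product, with measures given by point weights; the weights, the set E and the function names are assumptions made for the illustration.

```python
from fractions import Fraction

# Hypothetical point weights defining mu on X = {"a", "b"} and nu on Y = {1, 2, 3}.
mu_w = {"a": Fraction(1, 2), "b": Fraction(2)}
nu_w = {1: Fraction(1), 2: Fraction(1, 3), 3: Fraction(5)}


def x_section(E: set, x) -> set:
    """E_x = {y : (x, y) in E}."""
    return {y for (u, y) in E if u == x}


def y_section(E: set, y) -> set:
    """E^y = {x : (x, y) in E}."""
    return {x for (x, v) in E if v == y}


def product_measure(E: set) -> Fraction:
    """mu (x) nu of E: each point (x, y) carries weight mu_w[x] * nu_w[y]."""
    return sum((mu_w[x] * nu_w[y] for (x, y) in E), Fraction(0))


def integral_of_x_sections(E: set) -> Fraction:
    """Integral over X of nu(E_x) with respect to mu."""
    return sum((mu_w[x] * sum((nu_w[y] for y in x_section(E, x)), Fraction(0))
                for x in mu_w), Fraction(0))


def integral_of_y_sections(E: set) -> Fraction:
    """Integral over Y of mu(E^y) with respect to nu."""
    return sum((nu_w[y] * sum((mu_w[x] for x in y_section(E, y)), Fraction(0))
                for y in nu_w), Fraction(0))


if __name__ == "__main__":
    E = {("a", 1), ("a", 3), ("b", 2)}  # not a rectangle
    print(product_measure(E), integral_of_x_sections(E), integral_of_y_sections(E))
```

The content of the lemma is that the same bookkeeping survives for all sets in M ⊗ N, where measurability of the sections and of the maps x ↦ ν(Ex), y ↦ µ(Ey) is no longer automatic.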


Proof. We are here once more having to prove something for all sets in a σ-algebra generated by some family of sets. In this case the σ-algebra is M ⊗ N = σ(E). We know very little about the sets in this σ-algebra, so the only way to proceed is what should be by now a familiar one. We let Σ (to give it a name) be the class of all sets satisfying the property. We show E ⊂ Σ. Next we try to show Σ is a σ-algebra. If we succeed, we are done, since then Σ ⊃ σ(E) = M ⊗ N . Well, we won’t succeed right away; in fact if we actually try to do this we’ll run into some serious though familiar difficulties. So we try something else, we try to show Σ is a monotone class. This turns out to work just fine, but once (or before) this is done we need a little bit more, we need to show all of the algebra A (not only E) is in Σ. There is no problem in doing this, so we finally have proved (once all this is done) that Σ is a monotone class containing A. But a monotone class containing an algebra also contains the σ-algebra generated by that algebra, so Σ ⊃ M ⊗ N = σ(A). Once we have checked all this out, it is easy to write a professional real proof, that avoids false starts and wrong moves. The real proof, where all the good stuff is implemented, starts now.

Actual starting point of the proof. For simplicity, we only prove Property 1; the proof of 2. is identical. Let Σ consist of all sets E ∈ M ⊗ N (so it makes sense to talk of their product measures) that satisfy Property 1; i.e., sets E such that Ex ∈ N for all x ∈ X, the map x ↦ ν(Ex) : X → [0,∞] is measurable and

\[ \int_X \nu(E_x) \, d\mu(x) = \mu \otimes \nu(E). \]

Step 1. E ⊂ Σ. From the immediately verified formulas

\[ (A \times B)_x = \begin{cases} B, & x \in A, \\ \emptyset, & x \notin A, \end{cases} \]

we see that if E = A × B ∈ E , then Ex ∈ N for all x ∈ X, Ey ∈ M for all y ∈ Y .

Aside. A × B ∈ E does NOT imply A ∈ M, B ∈ N . This is because if E = ∅ we have many representations of it as a measurable rectangle, for example A × ∅, where A can be any subset of X, even a non-measurable one, if non-measurable subsets exist. However you represent it, ∅x = ∅ for all x ∈ X. End of the aside.

If E ≠ ∅, we have ν(Ex) = ν(B)χA(x) for all x ∈ X. The same equality is valid if E = ∅ with the proviso that we represent it in the form A × B with B ∈ N , so ν(B) makes sense. The best is to agree that we’ll take A = ∅ = B if E = ∅. The equality just says 0 = 0 in this case. The map x ↦ ν(Ex) is the map ν(B)χA, clearly measurable since A is measurable and

\[ \int_X \nu(E_x) \, d\mu(x) = \nu(B) \int_X \chi_A \, d\mu = \nu(B)\mu(A) = \mu \otimes \nu(E). \]

It follows that E ∈ Σ. In particular X × Y ∈ Σ.

Step 2. We next show that Σ is closed under finite disjoint unions. By induction, it suffices to show that E, F ∈ Σ, E ∩ F = ∅, implies E ∪ F ∈ Σ. So let E, F be disjoint elements of Σ. If x ∈ X, we see that (E ∪ F )x = Ex ∪ Fx, so (E ∪ F )x ∈ N for all x ∈ X. Moreover, Ex ∩ Fx = ∅ for all x ∈ X so that

ν((E ∪ F )x) = ν(Ex) + ν(Fx)

from which measurability of the map x ↦ ν((E ∪ F )x) and

\[ \begin{aligned} \int_X \nu((E \cup F)_x) \, d\mu(x) &= \int_X \nu(E_x) \, d\mu(x) + \int_X \nu(F_x) \, d\mu(x) \\ &= \mu \otimes \nu(E) + \mu \otimes \nu(F) = \mu \otimes \nu(E \cup F) \end{aligned} \]

follows. This proves that E ∪ F ∈ Σ.

Step 3. A ⊂ Σ. Since E ⊂ Σ by Step 1, and every element of A can be written as a finite union of pairwise disjoint elements of E , this is an immediate consequence of Step 2.

Step 4. Σ is closed under limits of increasing sequences of sets. Assume En ∈ Σ for n ∈ N, E1 ⊂ E2 ⊂ E3 ⊂ · · · . We want to see that E = ⋃n En ∈ Σ. Since Ex = ⋃n(En)x it follows that Ex ∈ N for all x. From (E1)x ⊂ (E2)x ⊂ · · · we get

\[ \nu(E_x) = \lim_{n\to\infty} \nu((E_n)_x) \]

for all x ∈ X, hence the function x ↦ ν(Ex), being the pointwise limit of the sequence of measurable functions {x ↦ ν((En)x)}, is measurable. Moreover, by Lebesgue’s monotone convergence theorem,

\[ \int_X \nu(E_x) \, d\mu(x) = \lim_{n\to\infty} \int_X \nu((E_n)_x) \, d\mu(x) = \lim_{n\to\infty} \mu \otimes \nu(E_n) = \mu \otimes \nu(E). \]

This completes the proof of E ∈ Σ, so that Σ is closed under the limit of increasing sequences of sets.

Step 5. Σ is closed under countable disjoint unions: Let En ∈ Σ for n ∈ N and assume En ∩ Em = ∅ if n ≠ m. Then E = ⋃n En ∈ Σ. To see this, set FN = E1 ∪ · · · ∪ EN for N ∈ N. By Step 2, FN ∈ Σ for all N ∈ N. In addition, {FN} is an increasing family of sets converging to ⋃n En so the result follows from Step 4.

Step 6. Σ is closed under limits of decreasing sequences of sets contained in a rectangle of finite measure. Let En ∈ Σ for each n ∈ N, E1 ⊃ E2 ⊃ · · · , let E = ⋂n En and assume that there exist sets A ∈ M, B ∈ N such that µ(A) < ∞, ν(B) < ∞ and E1 ⊂ A × B. Then E ∈ Σ. In fact, since Ex = ⋂n(En)x it follows that Ex ∈ N for all x. We have, moreover,

(E1)x ⊃ (E2)x ⊃ · · ·

and because (E1)x ⊂ (A × B)x, which is either B or empty, in either case of finite ν measure, we can affirm that ν(Ex) = limn→∞ ν((En)x), hence x ↦ ν(Ex) is measurable on X. Since

ν((En)x) ≤ ν((A × B)x) = ν(B)χA(x)

and ν(B)χA ∈ L1(µ), the dominated convergence theorem now gives

\[ \lim_{n\to\infty} \mu \otimes \nu(E_n) = \lim_{n\to\infty} \int_X \nu((E_n)_x) \, d\mu(x) = \int_X \nu(E_x) \, d\mu(x); \]

since µ ⊗ ν(E1) ≤ µ ⊗ ν(A × B) = µ(A)ν(B) < ∞ we have that limn→∞ µ ⊗ ν(En) = µ ⊗ ν(E) and we proved that

\[ \mu \otimes \nu(E) = \int_X \nu(E_x) \, d\mu(x). \]

We proved E ∈ Σ.

We are almost done with the proof that Σ is a monotone class, but before we can complete it we have to take a short detour.

Step 7. Let A ∈ M, B ∈ N , assume µ(A) < ∞, ν(B) < ∞. Then E ∩ (A × B) ∈ Σ for every E ∈ M ⊗ N . To prove this, we set up a new family of sets, namely

C = {E ∈ M ⊗ N : E ∩ (A × B) ∈ Σ}.

We claim C is a monotone class containing A. In fact, if E ∈ A, then E ∩ (A × B) ∈ A, so that E ∩ (A × B) ∈ Σ by Step 3. Thus E ∈ C, proving A ⊂ C. Let {En} be an increasing sequence of sets in C. Then {En ∩ (A × B)} is an increasing sequence of sets in Σ, hence

\[ \bigcup_{n=1}^{\infty} (E_n \cap (A \times B)) = \Big( \bigcup_{n=1}^{\infty} E_n \Big) \cap (A \times B) \in \Sigma \]

by Step 4, proving ⋃n En ∈ C. Similarly, if {En} is a decreasing sequence of sets in C then {En ∩ (A × B)} is a decreasing sequence of sets in Σ that satisfies the hypothesis of Step 6. It follows that

\[ \Big( \bigcap_{n=1}^{\infty} E_n \Big) \cap (A \times B) \in \Sigma, \]

hence ⋂n En ∈ C. The claim is established; C is a monotone class containing A. By Lemma 9.3, C contains σ(A) = M ⊗ N , and we are done with this step.

We are back on the main road, ready to complete the proof that Σ is a monotone class.

Step 8. Σ is closed under limits of decreasing sequences of sets. We assume En ∈ Σ for each n ∈ N, E1 ⊃ E2 ⊃ · · · and let E = ⋂n En. Since µ, ν are σ-finite, we can find a sequence of pairwise disjoint sets {Ak} in M and a sequence of pairwise disjoint sets {Bj} in N , such that µ(Ak) < ∞ for all k ∈ N, ν(Bj) < ∞ for all j ∈ N and

\[ X = \bigcup_{k=1}^{\infty} A_k, \quad Y = \bigcup_{j=1}^{\infty} B_j. \]

By Step 7, En ∩ (Ak × Bj) ∈ Σ for all k, j, n, hence for all k, j, {En ∩ (Ak × Bj)} is a decreasing sequence of sets in Σ satisfying the hypothesis of Step 6. Thus, by Step 6, we conclude that

\[ E \cap (A_k \times B_j) = \bigcap_{n=1}^{\infty} (E_n \cap (A_k \times B_j)) \in \Sigma \]

for all k, j ∈ N. But any two different ones of these sets are disjoint; because Σ is closed under countable disjoint unions (Step 5), we conclude that

\[ E = \bigcup_{k,j \in \mathbb{N}} E \cap (A_k \times B_j) \in \Sigma. \]

Final Step. By Steps 4 and 8, Σ is a monotone class. By Step 3, it contains A. By Lemma 9.3, Σ ⊃ M ⊗ N .

Definition 45 If f : X × Y → S, S some set, then for every x ∈ X we define f_x : Y → S by f_x(y) = f(x, y). For y ∈ Y we define f^y : X → S by f^y(x) = f(x, y).

Theorem 13.6 (Tonelli) Let f : X × Y → [0,∞] be measurable with respect to the σ-algebra M ⊗ N. Then

1. f_x is measurable with respect to N for every x ∈ X and f^y is measurable with respect to M for every y ∈ Y.

2. The maps

$$x \mapsto \int_Y f_x\,d\nu \quad\text{and}\quad y \mapsto \int_X f^y\,d\mu$$

are measurable with respect to M and N, respectively, and

$$\int_X\Bigl(\int_Y f_x\,d\nu\Bigr)d\mu(x) = \int_{X\times Y} f\,d\mu\otimes\nu = \int_Y\Bigl(\int_X f^y\,d\mu\Bigr)d\nu(y).$$

Proof. We use a standard approach; to prove something holds for non-negative measurable functions, we first prove it for characteristic functions of measurable sets, then extend the result to non-negative, measurable simple functions by linearity, and finally use the fact that non-negative measurable functions are limits of increasing sequences of measurable simple functions to establish the result for non-negative measurable functions. This procedure does not always work; in this case it does. So here we go.

Let T denote the set of all M ⊗ N measurable functions f : X × Y → [0,∞] for which the conclusions of the theorem are true.

Step 1. Assume f = χ_E, E ∈ M ⊗ N. Since

$$(\chi_E)_x = \chi_{E_x}, \qquad (\chi_E)^y = \chi_{E^y},$$

the theorem reduces to Lemma 13.5 in this case. Thus χ_E ∈ T for all E ∈ M ⊗ N.

Step 2. If f, g ∈ T and a, b ∈ [0,∞), then af + bg ∈ T. Because

$$(af+bg)_x = af_x + bg_x, \qquad (af+bg)^y = af^y + bg^y$$

for all x ∈ X, y ∈ Y, this is clear.

Step 3. If {f_n} is a sequence of functions in T and f_1 ≤ f_2 ≤ f_3 ≤ · · ·, then f = lim_{n→∞} f_n ∈ T. Once more, this is immediate. All we need to observe is that we will have

$$(f_1)_x \le (f_2)_x \le \cdots, \qquad \lim_{n\to\infty}(f_n)_x = f_x$$

to conclude that f_x : Y → [0,∞] is measurable for every x ∈ X. Integrating we get

$$\int_Y (f_1)_x\,d\nu \le \int_Y (f_2)_x\,d\nu \le \cdots, \qquad \lim_{n\to\infty}\int_Y (f_n)_x\,d\nu = \int_Y f_x\,d\nu,$$

the last equality being a consequence of Lebesgue’s Monotone Convergence Theorem. The last equality also proves that x ↦ ∫_Y f_x dν is measurable on X and, because it is a limit of increasing, non-negative measurable functions,

$$\int_X\Bigl(\int_Y f_x\,d\nu\Bigr)d\mu(x) = \lim_{n\to\infty}\int_X\Bigl(\int_Y (f_n)_x\,d\nu\Bigr)d\mu(x) = \lim_{n\to\infty}\int_{X\times Y} f_n\,d\mu\otimes\nu = \int_{X\times Y} f\,d\mu\otimes\nu,$$

the last equality coming from yet another application of Lebesgue’s Monotone Convergence Theorem. One deals similarly with f^y.

We are done; by Steps 1 and 2, all non-negative measurable simple functions satisfy the conclusions of the theorem, thus by Step 3 (and Theorem 4.11) all non-negative measurable functions satisfy the conclusions of the theorem.

Tonelli’s theorem deals only with non-negative functions. However, since every measurable function can be decomposed into combinations of non-negative ones, we get at once


Theorem 13.7 Let f : X × Y → C be measurable with respect to M ⊗ N. Then f_x : Y → C is measurable for every x ∈ X, and f^y : X → C is measurable for every y ∈ Y.

It can be of interest to remember occasionally that every theorem about integration also says something about summations; one simply assumes that the measure (or some of the measures) are counting measures. For example, if X, Y are sets, if a_{x,y} ∈ R, a_{x,y} ≥ 0 for all (x, y) ∈ X × Y, and if we consider X, Y as measure spaces with counting measures (µ equal to counting measure on X, ν on Y, and all subsets measurable), we get

$$\sum_{x\in X}\Bigl(\sum_{y\in Y} a_{x,y}\Bigr) = \sum_{(x,y)\in X\times Y} a_{x,y} = \sum_{y\in Y}\Bigl(\sum_{x\in X} a_{x,y}\Bigr).$$
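As a small sanity check of the summation form of Tonelli, the following sketch compares the two iterated sums with the sum over all pairs for a finite non-negative array. The function name `iterated_sums` and the sample entries a_{x,y} = 1/((x+1)²(y+1)) are illustrative assumptions, not part of the notes; for finite index sets the identity is of course elementary, and the interesting content of the theorem is the infinite case.

```python
from fractions import Fraction

def iterated_sums(a):
    """Return (sum over x of sums over y, sum over all pairs, sum over y of sums over x)."""
    rows, cols = len(a), len(a[0])
    x_then_y = sum(sum(a[x][y] for y in range(cols)) for x in range(rows))
    all_pairs = sum(a[x][y] for x in range(rows) for y in range(cols))
    y_then_x = sum(sum(a[x][y] for x in range(rows)) for y in range(cols))
    return x_then_y, all_pairs, y_then_x

# Hypothetical non-negative entries, stored exactly as rationals.
a = [[Fraction(1, (x + 1) ** 2 * (y + 1)) for y in range(4)] for x in range(3)]
print(iterated_sums(a))
```

Exact rational arithmetic is used so that the three totals can be compared with `==` rather than up to rounding.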

If (X, M, µ) is an arbitrary measure space and for the second measure space we take (N, P(N), counting measure), then it is easy to verify that a function F : X × N → C is measurable if and only if the map x ↦ F(x, n) : X → C is measurable for every n ∈ N. Actually, that’s not too bad an exercise, so let’s make it into one.

Exercise 75 Prove the assertion that was just made; i.e., prove that F as defined is measurable if and only if F(·, n) : X → C is measurable for all n ∈ N.

Tonelli’s Theorem now implies: If F : X × N → [0,∞] and F(·, n) is measurable for each n ∈ N, then

$$\int_X \sum_{n=1}^{\infty} F(x,n)\,d\mu(x) = \sum_{n=1}^{\infty}\int_X F(x,n)\,d\mu(x).$$

I hope it is clear that this is precisely Beppo Levi’s Theorem. Of course, we would never have been able to obtain Tonelli without the aid of Beppo Levi (or its equivalent twin, Lebesgue’s Monotone Convergence Theorem).

Our next Theorem is Fubini’s Theorem. Fubini and Tonelli go so hand in hand that they are frequently stated as one, and then called the Fubini-Tonelli Theorem.

Theorem 13.8 (Fubini) Let (X, M, µ), (Y, N, ν) be σ-finite measure spaces and let f ∈ L1(X × Y, M ⊗ N, µ ⊗ ν). Then

1. f_x ∈ L1(ν) for a.e. x ∈ X and the a.e. defined map x ↦ ∫_Y f_x dν is in L1(µ).

2. f^y ∈ L1(µ) for a.e. y ∈ Y and the a.e. defined map y ↦ ∫_X f^y dµ is in L1(ν).

3.

$$\int_X\Bigl(\int_Y f_x\,d\nu\Bigr)d\mu(x) = \int_{X\times Y} f\,d\mu\otimes\nu = \int_Y\Bigl(\int_X f^y\,d\mu\Bigr)d\nu(y).$$

Proof. Most of the serious work has already been done. As is usual when working with complex valued functions, one breaks them up into real and imaginary parts, then into positive and negative parts. But a bit of care must be exercised. Our hypothesis is that f : X × Y → C is measurable with respect to the σ-algebra M ⊗ N and

$$\int_{X\times Y} |f|\,d\mu\otimes\nu < \infty.$$

By Theorem 13.7, f_x is measurable for every x ∈ X. By Tonelli’s Theorem,

$$\int_X\int_Y |f_x|\,d\nu\,d\mu(x) = \int_{X\times Y} |f|\,d\mu\otimes\nu$$

so that x ↦ ∫_Y |f_x| dν ∈ L1(µ). But a function in L1(µ) must be a.e. finite (Theorem 7.11), hence

$$\int_Y |f_x|\,d\nu < \infty$$

a.e., implying f_x ∈ L1(ν) for a.e. x ∈ X. Similarly we see that f^y ∈ L1(µ) for a.e. y ∈ Y. Now write f = u + iv, with u, v real valued. Tonelli’s theorem implies that the functions

$$x \mapsto \int_Y u_x^+\,d\nu, \qquad x \mapsto \int_Y u_x^-\,d\nu, \qquad x \mapsto \int_Y v_x^+\,d\nu, \qquad x \mapsto \int_Y v_x^-\,d\nu$$

are measurable from X to [0,∞]. They are all bounded by

$$x \mapsto \int_Y |f_x|\,d\nu,$$

so they are a.e. finite valued and the equality

$$\int_Y f_x\,d\nu = \int_Y u_x^+\,d\nu - \int_Y u_x^-\,d\nu + i\Bigl(\int_Y v_x^+\,d\nu - \int_Y v_x^-\,d\nu\Bigr) \tag{29}$$

makes sense a.e. and shows that x ↦ ∫_Y f_x dν is measurable and, in fact, in L1(µ), since each one of the four non-negative functions on the right hand side of (29) is in L1(µ) by Tonelli’s Theorem. Integrating both sides of (29) with respect to µ gives (once more due to Tonelli)

$$\begin{aligned}
\int_X\int_Y f_x\,d\nu\,d\mu
&= \int_X\int_Y u_x^+\,d\nu\,d\mu - \int_X\int_Y u_x^-\,d\nu\,d\mu + i\Bigl(\int_X\int_Y v_x^+\,d\nu\,d\mu - \int_X\int_Y v_x^-\,d\nu\,d\mu\Bigr)\\
&= \int_{X\times Y} u^+\,d\mu\otimes\nu - \int_{X\times Y} u^-\,d\mu\otimes\nu + i\Bigl(\int_{X\times Y} v^+\,d\mu\otimes\nu - \int_{X\times Y} v^-\,d\mu\otimes\nu\Bigr)\\
&= \int_{X\times Y} f\,d\mu\otimes\nu.
\end{aligned}$$

One deals with f^y in a similar fashion.

Before continuing, and seeing the version of Tonelli-Fubini for the completed measure space $(X\times Y, \overline{\mathcal{M}\otimes\mathcal{N}}, \overline{\mu\otimes\nu})$, we should do some exercises and applications. For integrals involving Lebesgue measure it is traditional to use the Riemann notation, and we’ll follow this tradition.

Application. Here is a nice application, involving a well known integral. We will compute

$$\int_0^{\infty} \frac{\sin x}{x}\,dx.$$

Actually, x ↦ sin x / x is not in L1(m):

Exercise 76 Prove that

$$\int_0^{\infty} \frac{|\sin x|}{x}\,dx = \infty.$$

So what we are going to compute is

$$\lim_{R\to\infty}\int_0^R \frac{\sin x}{x}\,dx.$$

There are many ways of getting this limit, one of them being by residues. In the one we will use we observe that

$$\frac{1}{x} = \int_0^{\infty} e^{-xy}\,dy$$

for all x > 0. This allows us to write

$$\int_0^R \frac{\sin x}{x}\,dx = \int_0^R\Bigl(\int_0^{\infty} e^{-xy}\,dy\Bigr)\sin x\,dx.$$

All functions involved are continuous, so measurability is no problem. We would like to change the order of integration. By Tonelli, that is always legal if the integrand is ≥ 0; unfortunately it isn’t. So we need Fubini, which tells us that the interchange is legal if the integrand is integrable with respect to the product measure. To show it is integrable, we have to show the absolute value has finite integral. But once we take the absolute value, Tonelli kicks in, allowing us to integrate in any order we wish. So we begin doing

$$\begin{aligned}
\int_{(0,R)\times(0,\infty)} \bigl|e^{-xy}\sin x\bigr|\,dx\,dy
&= \int_0^R\Bigl(\int_0^{\infty} \bigl|e^{-xy}\sin x\bigr|\,dy\Bigr)dx\\
&= \int_0^R |\sin x|\Bigl(\int_0^{\infty} e^{-xy}\,dy\Bigr)dx = \int_0^R \frac{|\sin x|}{x}\,dx\\
&\le \int_0^R dx = R < \infty,
\end{aligned}$$

where we used the fact that |sin x|/x ≤ 1 for all x > 0. These calculations, justified by Tonelli, show that the integrand is in L1 of the product measure. Thus, by Fubini,

$$\int_0^R \frac{\sin x}{x}\,dx = \int_0^R\Bigl(\int_0^{\infty} e^{-xy}\,dy\Bigr)\sin x\,dx = \int_0^{\infty}\Bigl(\int_0^R e^{-xy}\sin x\,dx\Bigr)dy.$$

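We might remember that

$$\int e^{-xy}\sin x\,dx = -\frac{1}{1+y^2}\,e^{-xy}(\cos x + y\sin x) + C$$

so that

$$\int_0^R e^{-xy}\sin x\,dx = \frac{1}{1+y^2}\Bigl(1 - e^{-Ry}(\cos R + y\sin R)\Bigr).$$

We thus get

$$\int_0^R \frac{\sin x}{x}\,dx = \int_0^{\infty} \frac{1}{1+y^2}\Bigl(1 - e^{-Ry}(\cos R + y\sin R)\Bigr)dy.$$

Taking lim_{R→∞}, the integrand on the right hand side converges pointwise to 1/(1 + y²) for every y > 0; for R ≥ 1 it is also bounded in absolute value by 2/(1 + y²), because e^{−Ry}|cos R + y sin R| ≤ e^{−y}(1 + y) ≤ 1, so that by the Dominated Convergence Theorem we get

$$\lim_{R\to\infty}\int_0^R \frac{\sin x}{x}\,dx = \int_0^{\infty} \frac{1}{1+y^2}\,dy = \arctan y\,\Big|_0^{\infty} = \frac{\pi}{2}.$$

This result that we just derived is usually written in the form

$$\int_0^{\infty} \frac{\sin x}{x}\,dx = \frac{\pi}{2}.$$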
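The interchange of the order of integration used in this application can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the left hand side ∫_0^R sin x / x dx is computed from its Maclaurin series, and the right hand side is computed by composite Simpson quadrature after the substitution y = tan θ (a numerical convenience that is not part of the argument in the notes).

```python
import math

def sine_integral(R, terms=40):
    """Integral of sin(x)/x over (0, R), summed from its Maclaurin series."""
    total = 0.0
    for k in range(terms):
        n = 2 * k + 1
        total += (-1) ** k * R ** n / (n * math.factorial(n))
    return total

def swapped_integral(R, panels=2000):
    """Integral over y in (0, inf) of (1 - e^{-Ry}(cos R + y sin R)) / (1 + y^2).

    With y = tan(theta), dy / (1 + y^2) becomes d(theta) on [0, pi/2];
    composite Simpson's rule (panels must be even) is applied in theta.
    """
    def g(theta):
        y = math.tan(theta)
        return 1.0 - math.exp(-R * y) * (math.cos(R) + y * math.sin(R))

    h = (math.pi / 2) / panels
    total = g(0.0) + g(math.pi / 2)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3

for R in (1.0, 5.0):
    print(R, sine_integral(R), swapped_integral(R))
```

The two columns should agree to within the quadrature error, and both approach π/2 as R grows. The series is only used for moderate R; for large R it suffers from cancellation and a different method would be needed.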

Exercise 77 Let f ∈ L1(µ), g ∈ L1(ν). We define the function f ⊗ g : X × Y → C by f ⊗ g(x, y) = f(x)g(y). Show that f ⊗ g ∈ L1(µ ⊗ ν) and

$$\int_{X\times Y} f\otimes g\,d\mu\otimes\nu = \Bigl(\int_X f\,d\mu\Bigr)\Bigl(\int_Y g\,d\nu\Bigr).$$

The connection with Lebesgue measure is important enough to deserve special mention. We will denote by B_n the σ-algebra B(R^n) of Borel subsets of R^n; m_n denotes Lebesgue measure in R^n. We consider the two measure spaces (R^n, B_n, m_n) and (R^k, B_k, m_k), and we identify R^n × R^k with R^{n+k} in the obvious way: If x = (x_1, . . . , x_n) ∈ R^n, y = (y_1, . . . , y_k) ∈ R^k, we identify (x, y) with the element ξ = (ξ_1, . . . , ξ_{n+k}) given by

ξ_j = x_j, j = 1, . . . , n; ξ_j = y_{j−n}, j = n + 1, . . . , n + k.

In other words

((x_1, . . . , x_n), (y_1, . . . , y_k)) = (x_1, . . . , x_n, y_1, . . . , y_k).

We then have

Theorem 13.9 With the identification described above,

$$(\mathbb{R}^n\times\mathbb{R}^k,\ \mathcal{B}_n\otimes\mathcal{B}_k,\ m_n\otimes m_k) = (\mathbb{R}^{n+k},\ \mathcal{B}_{n+k},\ m_{n+k}).$$

Proof. The main thing that needs to be proved is that B_{n+k} = B_n ⊗ B_k. To see B_{n+k} ⊂ B_n ⊗ B_k it suffices to prove (by Lemma 13.5) that if E ∈ B_{n+k}, then E_x ∈ B_k for all x ∈ R^n (and declare that the fact that E^y ∈ B_n for all y ∈ R^k is done the same way, or is just a matter of switching the roles of x and y, or is the same thing). To achieve this, we fix x ∈ R^n and then look at all subsets E of R^{n+k} such that E_x ∈ B_k. These sets constitute a σ-algebra containing all open subsets of R^{n+k}; it follows that this σ-algebra contains B_{n+k}.

To see that B_{n+k} ⊃ B_n ⊗ B_k requires a more multi-step approach. One way of doing it is as follows.

Step 1. A × R^k ∈ B_{n+k} for all A ∈ B_n. This is proved in the standard way; the family of all subsets A of R^n having this property is easily seen to be a σ-algebra containing all open sets. Similarly, R^n × B ∈ B_{n+k} for all B ∈ B_k.

Step 2. For A ⊂ R^n, define

M(A) = {B ∈ B_k : A × B ∈ B_{n+k}}.

For a general set A, all one can be sure that M(A) contains is the empty set. But, we claim: If A is open in R^n, then M(A) is a σ-algebra in R^k containing all open subsets of R^k, hence containing B_k.


In fact, assume A is open in R^n. Then A × R^k is open in R^{n+k}, hence a Borel set, hence R^k ∈ M(A). Let B ∈ M(A). Then A × B ∈ B_{n+k} and we have the equation

$$A\times B^c = (A\times B)^c \cap (A\times\mathbb{R}^k).$$

Now A × R^k is open, hence a Borel set, in R^{n+k}; (A × B)^c is the complement of a Borel set, so also a Borel set. It follows that A × B^c ∈ B_{n+k}, hence B^c ∈ M(A). Assume now B_ℓ ∈ M(A) for all ℓ ∈ N. Then

$$A\times\bigcup_{\ell=1}^{\infty} B_\ell = \bigcup_{\ell=1}^{\infty} (A\times B_\ell)$$

shows that ⋃_{ℓ=1}^∞ B_ℓ ∈ M(A). Thus M(A) is a σ-algebra in R^k. Since A is open, A × B is open, hence a Borel set, in R^{n+k} for every open subset B of R^k; thus M(A) contains all open subsets of R^k. The claim is established.

Step 3. Let

N = {A ∈ B_n : M(A) ⊃ B_k}.

We claim that N is a σ-algebra in R^n containing all open sets.

In fact, N contains all open sets by Step 2. In particular, R^n, ∅ ∈ N. Assume A ∈ N. To see that A^c ∈ N, we have to prove: If B ∈ B_k, then A^c × B ∈ B_{n+k}. This time we use

$$A^c\times B = (A\times B)^c \cap (\mathbb{R}^n\times B);$$

A × B is a Borel set because B ∈ B_k ⊂ M(A), hence (A × B)^c is a Borel set; R^n × B is a Borel set by Step 1. It follows that A^c × B is a Borel set. Thus A^c ∈ N if A ∈ N. Finally, if M(A_ℓ) ⊃ B_k for all ℓ ∈ N, let B ∈ B_k. Then A_ℓ × B ∈ B_{n+k} for every ℓ ∈ N, hence

$$\Bigl(\bigcup_{\ell=1}^{\infty} A_\ell\Bigr)\times B = \bigcup_{\ell=1}^{\infty} (A_\ell\times B) \in \mathcal{B}_{n+k}.$$

It follows that ⋃_{ℓ=1}^∞ A_ℓ ∈ N. This shows that N is a σ-algebra in R^n and establishes the claim. By Step 3, B_n ⊂ N. Thus A ∈ B_n implies A ∈ N, which implies A × B ∈ B_{n+k} for all B ∈ B_k. Thus B_{n+k} contains all measurable rectangles, hence also the σ-algebra B_n ⊗ B_k generated by them.

Having proved B_{n+k} = B_n ⊗ B_k, consider the set E_{n+k} consisting of all sets of the form ∏_{j=1}^{n+k} (a_j, b_j]; a_j, b_j ∈ R, a_j ≤ b_j for j = 1, . . . , n + k. We see that m_n ⊗ m_k and m_{n+k} coincide on E_{n+k}. By the uniqueness of Lebesgue measure, we proved m_n ⊗ m_k = m_{n+k}.

Let us return to the theory. One problem with this theory so far is posed by the null sets. Suppose, for example, that there is A ∈ M, A ≠ ∅, with µ(A) = 0, and assume there is B ⊂ Y such that B ∉ N. Then A × B ∉ M ⊗ N. In fact, let x ∈ A; then (A × B)_x = B ∉ N, and our assertion follows from Lemma 13.5. On the other hand A × B ⊂ A × Y ∈ M ⊗ N and µ ⊗ ν(A × Y) = µ(A)ν(Y) = 0. It follows that if (X, M, µ) contains non-empty null sets (as happens for Lebesgue measure), and Y contains non-measurable sets (as happens in the Lebesgue case), then (X × Y, M ⊗ N, µ ⊗ ν) is not a complete measure space. Since working in non-complete measure spaces tends to be somewhat of an annoyance, one usually works in the completion $(X\times Y, \overline{\mathcal{M}\otimes\mathcal{N}}, \overline{\mu\otimes\nu})$. This means that we must have the analogue of the Fubini-Tonelli theorems in this case. But first, let us do a warm-up exercise, something we could have done long ago. We formulate it as a lemma, so we can refer to it later on.


Lemma 13.10 Let (X, M, µ) be a measure space and let $(X, \overline{\mathcal{M}}, \overline{\mu})$ be its completion. Let f : X → C. Then f is measurable with respect to $\overline{\mathcal{M}}$ if and only if there exists g : X → C such that g is measurable with respect to M and g = f a.e. (to be precise: {g ≠ f} is a null set in $\overline{\mathcal{M}}$).

Exercise 78 Prove Lemma 13.10. Hints. One direction should be trivial: If f = g a.e. and g is M-measurable, then f is $\overline{\mathcal{M}}$-measurable. For the converse direction, one can reduce this to the case f : X → [0,∞]. Then one does the familiar division into cases: Case 1, f = χ_E, E ∈ $\overline{\mathcal{M}}$. Case 2, f is simple $\overline{\mathcal{M}}$-measurable. Case 3, f is the limit of an increasing sequence of simple $\overline{\mathcal{M}}$-measurable functions.

Back to our product spaces. From now on, and until further notice, if E ⊂ X × Y, we will say E is measurable if and only if it is an element of $\overline{\mathcal{M}\otimes\mathcal{N}}$. If E ∈ M ⊗ N then E is, of course, measurable in this sense. But if we want to emphasize that it is in the smaller σ-algebra, we’ll call it M ⊗ N-measurable. As usual, a subset of X (of Y) will be called measurable if and only if it is in M (in N). We go directly to the main results.

Theorem 13.11 Let f : X × Y → C be measurable.

1. f_x : Y → C is measurable for a.e. x ∈ X and f^y : X → C is measurable for a.e. y ∈ Y.

2. (Tonelli) If f is real valued, non-negative, then the almost everywhere defined functions

$$x \mapsto \int_Y f_x\,d\nu : X \to [0,\infty], \qquad y \mapsto \int_X f^y\,d\mu : Y \to [0,\infty],$$

are measurable and

$$\int_X\Bigl(\int_Y f_x\,d\nu\Bigr)d\mu(x) = \int_{X\times Y} f\,d\,\overline{\mu\otimes\nu} = \int_Y\Bigl(\int_X f^y\,d\mu\Bigr)d\nu(y).$$

3. (Fubini) If f ∈ L1($\overline{\mu\otimes\nu}$), then f_x ∈ L1(ν) for a.e. x ∈ X, f^y ∈ L1(µ) for a.e. y ∈ Y, and the a.e. defined maps

$$x \mapsto \int_Y f_x\,d\nu : X \to \mathbb{C}, \qquad y \mapsto \int_X f^y\,d\mu : Y \to \mathbb{C},$$

are in L1(µ), L1(ν), respectively; moreover

$$\int_X\Bigl(\int_Y f_x\,d\nu\Bigr)d\mu(x) = \int_{X\times Y} f\,d\,\overline{\mu\otimes\nu} = \int_Y\Bigl(\int_X f^y\,d\mu\Bigr)d\nu(y).$$

Proof. By Lemma 13.10 there is an M ⊗ N-measurable function g such that g = f a.e. This means that the set on which g and f differ is contained in a null set of M ⊗ N; i.e., there exists E ∈ M ⊗ N such that µ ⊗ ν(E) = 0 and f(x, y) = g(x, y) if (x, y) ∉ E. Now, by Lemma 13.5,

$$0 = \mu\otimes\nu(E) = \int_X \nu(E_x)\,d\mu(x)$$

so that ν(E_x) = 0 for a.e. x. Since y ∉ E_x implies (x, y) ∉ E, we see that f_x(y) = g_x(y) for all y ∉ E_x, so that whenever E_x is a null set, we have f_x = g_x a.e. [ν]. This proves that f_x = g_x a.e. with respect to ν, for a.e. x ∈ X. Similarly, f^y = g^y a.e. with respect to µ, for a.e. y ∈ Y. Since the statements are true for g instead of f, it is now easy to see that they also hold for f.

Many of the results that led to the original, non-complete, version of Fubini-Tonelli are actually particular cases of Fubini or Tonelli. We did not need to prove their analogues first to get the “complete space” version of Fubini-Tonelli, and we could derive them now as corollaries. Or as exercises.

Exercise 79 Let E be a measurable subset of X × Y. Then E_x is a measurable subset of Y for a.e. x ∈ X, and E^y is a measurable subset of X for a.e. y ∈ Y. Moreover, the (almost everywhere defined) maps

x ↦ ν(E_x) : X → [0,∞], y ↦ µ(E^y) : Y → [0,∞]

are measurable and

$$\int_X \nu(E_x)\,d\mu(x) = \overline{\mu\otimes\nu}(E) = \int_Y \mu(E^y)\,d\nu(y).$$

Hint: Apply Tonelli’s theorem to χE.

The case of several factors. The case of a finite number (X_1, M_1, µ_1), . . . , (X_r, M_r, µ_r) of σ-finite measure spaces can be treated similarly. One can begin with the elementary family {A_1 × · · · × A_r : A_i ∈ M_i, i = 1, . . . , r}, defining

µ_1 ⊗ · · · ⊗ µ_r(A_1 × · · · × A_r) = µ_1(A_1) · · · µ_r(A_r),

and then proceeding in an obvious fashion. Or, one can just take the products two at a time. In all events one gets a complete measure space

$$\Bigl(\prod_{i=1}^{r} X_i,\ \bigotimes_{i=1}^{r}\mathcal{M}_i,\ \bigotimes_{i=1}^{r}\mu_i\Bigr)$$

in which all sets of the form ∏_{i=1}^r A_i with A_i ∈ M_i for i = 1, . . . , r are measurable and

$$\bigotimes_{i=1}^{r}\mu_i\Bigl(\prod_{i=1}^{r} A_i\Bigr) = \prod_{i=1}^{r}\mu_i(A_i).$$

One gets a lot of Fubini-Tonelli theorems, but basically they all say that the order of integration doesn’t matter.

A different situation comes up when one has an infinite number of factors. Finding a suitable notion of a product measure plays an important role in several areas of mathematics, most especially in Probability Theory and applications of Probability Theory to Functional Analysis. We just give a very brief description of how one proceeds.

Assume (X_λ, M_λ, µ_λ) is a measure space for every λ in some index set Λ. There is very little hope to define a measure ν on some σ-algebra of Y = ∏_{λ∈Λ} X_λ such that

$$\nu\Bigl(\prod_{\lambda\in\Lambda} A_\lambda\Bigr) = \prod_{\lambda\in\Lambda}\mu_\lambda(A_\lambda).$$

In the first place infinite products, especially over uncountable index sets, are not as easy to define as infinite sums. So one has to explain first what is meant by ∏_{λ∈Λ} c_λ when c_λ ∈ [0,∞] for all λ ∈ Λ. There are, however, standard ways of doing this; for example here is one. Let c_λ ∈ [0,∞] for λ ∈ Λ. Let c ∈ [0,∞]. We write

$$\prod_{\lambda\in\Lambda} c_\lambda = c$$

if and only if for every neighborhood V of c in [0,∞] there exists a finite subset F_0 of Λ such that

$$\prod_{\lambda\in F} c_\lambda \in V$$

for all finite subsets F of Λ such that F ⊃ F_0. We recall that a neighborhood of ∞ in [0,∞] is any set V ⊂ [0,∞] such that there is a ∈ R with (a,∞] ⊂ V. A neighborhood of c ∈ (0,∞) is any set V such that there exists ε > 0 with (c − ε, c + ε) ⊂ V; a neighborhood of 0 is any set containing an interval [0, ε) for some ε > 0.

One thing we observe with this definition is that if all c_λ < ∞ and a single c_λ = 0, then the product of all is 0. This is perhaps as it should be, but this makes 0 a very special entry. For example, consider the case in which Λ = N and we have c_2 = c_3 = · · · = 2. Then

$$\prod_{n\in\mathbb{N}} c_n = \begin{cases} \infty, & \text{if } c_1 > 0,\\ 0, & \text{if } c_1 = 0.\end{cases}$$

Opposite to what happens with sums, having zero factors is sort of bad; having too many small factors is also sort of bad. Products of small factors result in even smaller objects; the more factors, the smaller. That is why one does not say a product converges to 0, but that it diverges to 0 if the limit as defined above is 0. The possibility of having some c_λ’s equal to 0, others equal to ∞, also has to be addressed. One can, of course, define once more 0 × ∞ = 0. But the problems keep on mounting. While there exists a well developed theory of infinite products, which plays a big role in several areas of mathematics, keeping on this track will not lead us to a coherent and useful product measure in the product space Y. Just to give one example of a serious problem in the countable case (and any countable problem tends to become an insurmountable monster of a problem when transported to the uncountable case), assume that Λ = N and (X_λ, M_λ, µ_λ) = (R, L, m) for λ = 1, 2, . . .. We take A_λ = [0, 1/λ]. Then, for any n ∈ N, if F is a finite subset of N which contains {1, 2, . . . , n}, we have

$$\prod_{\lambda\in F} m(A_\lambda) \le \frac{1}{n!}$$

so that ∏_{λ∈N} m(A_λ) diverges to 0. Should

$$\prod_{j=1}^{\infty}\Bigl[0, \frac{1}{j}\Bigr]$$

be a null set in R^N? Notice also that if we take A_λ = [0, 1] for all λ ∈ N, we get the product set to have measure 1; if we take A_λ = [0, 1 + ε], with ε > 0, it has infinite measure.
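The three behaviours just described (divergence to 0, the value 1, and divergence to ∞) can be seen in the finite partial products. The following is a minimal sketch; the helper names `partial_product` and `interval_lengths`, and the choice ε = 1/10, are assumptions made only for illustration.

```python
import math
from fractions import Fraction

def partial_product(factors):
    """Product of finitely many factors c_lambda, lambda in F, in exact arithmetic."""
    result = Fraction(1)
    for c in factors:
        result *= c
    return result

def interval_lengths(n):
    """m(A_lambda) for A_lambda = [0, 1/lambda], lambda = 1, ..., n."""
    return [Fraction(1, lam) for lam in range(1, n + 1)]

# A_lambda = [0, 1/lambda]: the partial products equal 1/n! and shrink to 0.
for n in (5, 10, 20):
    print(n, float(partial_product(interval_lengths(n))))

# A_lambda = [0, 1]: every partial product is 1.
print(partial_product([Fraction(1)] * 50))

# A_lambda = [0, 1 + eps] with eps = 1/10: the partial products grow without bound.
print(float(partial_product([Fraction(11, 10)] * 200)))
```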

In view of these and other problems, one follows a different tack. One assumes first that all the spaces (X_λ, M_λ, µ_λ) are probability spaces; i.e., one assumes µ_λ(X_λ) = 1 for each λ ∈ Λ. One now defines the family E of “measurable rectangles,” frequently called now “elementary measurable cylinders,” by: a set A ⊂ Y is in E iff A = ∏_{λ∈Λ} A_λ where A_λ ∈ M_λ for every λ ∈ Λ and A_λ = X_λ for all but a finite number of λ’s in Λ. One can then define

$$\nu(A) = \prod_{\lambda\in\Lambda}\mu_\lambda(A_\lambda)$$

and since all but a finite number of the factors on the right hand side of the equation defining ν(A) are equal to 1, the product is really a finite product. All problems have been removed. It is fairly easy to see that E is an elementary class, so that what needs to be done next is obvious. One has to first extend ν to the algebra of finite unions of elements of E; see that it is well defined on that algebra and that it is a measure. Then Caratheodory’s extension theory takes over. The main technical difficulty (and it is not major) is the analogue of Lemma 13.2.
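To make the definition of ν on elementary measurable cylinders concrete, here is a small sketch in which a cylinder is stored by listing only the finitely many indices λ with A_λ ≠ X_λ. Everything named here is a hypothetical illustration: `cylinder_measure` is not from the notes, and the factor spaces are assumed to be fair coin flips, X_λ = {0, 1} with µ_λ({0}) = µ_λ({1}) = 1/2.

```python
from fractions import Fraction

def cylinder_measure(constraints, mu):
    """nu(A) for an elementary cylinder A = product of the sets A_lambda.

    constraints: dict mapping finitely many indices lambda to the set A_lambda;
                 every index not listed has A_lambda = X_lambda, contributing 1.
    mu: function (lambda, A_lambda) -> mu_lambda(A_lambda), each mu_lambda
        being a probability measure.
    """
    result = Fraction(1)
    for lam, A_lam in constraints.items():
        result *= mu(lam, A_lam)
    return result

def fair_coin(lam, A):
    """Assumed factor measure: X_lambda = {0, 1}, each point of mass 1/2."""
    return Fraction(len(A & {0, 1}), 2)

# Coordinates 3 and 7 are constrained; coordinate 100 is listed but unconstrained.
print(cylinder_measure({3: {1}, 7: {0}, 100: {0, 1}}, fair_coin))
```

Because unlisted coordinates contribute the factor 1, the index set Λ itself never has to be represented, which is exactly why the product in the definition of ν(A) is a finite one.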

A cylinder set is a subset A of Y which in all but a finite number of directions coincides with the whole space. A formal definition is that there exists a finite subset F of Λ, say F = {λ_1, . . . , λ_n}, and a subset B ⊂ ∏_{j=1}^n X_{λ_j} such that

x = (x_λ)_{λ∈Λ} ∈ A if and only if (x_{λ_1}, . . . , x_{λ_n}) ∈ B.

(B is like the base of the cylinder.) Somewhat incorrectly, we could write A = B × ∏_{λ∉F} X_λ to indicate this. It should be clear that the product measure ν (assuming it has been constructed) coincides with µ_{λ_1} ⊗ · · · ⊗ µ_{λ_n} on the family of all cylinder sets A = B × ∏_{λ∉F} X_λ, F = {λ_1, . . . , λ_n}, B ∈ M_{λ_1} ⊗ · · · ⊗ M_{λ_n}.

A very important example of this construction is the case in which X_λ = R, M_λ = L for each λ ∈ Λ and µ_λ = γ is Gaussian measure for each λ ∈ Λ, defined by

$$\gamma(A) = \frac{1}{\sqrt{2\pi}}\int_A e^{-x^2/2}\,dx$$

for Lebesgue subsets A of R.
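For the construction above to apply, γ has to be a probability measure, which holds when the density is normalized by 1/√(2π). The sketch below checks this on intervals using the error function from the Python standard library; the function name `gaussian_measure` is an assumption for illustration, and only intervals (a, b) are handled, not general Lebesgue sets.

```python
import math

def gaussian_measure(a, b):
    """gamma((a, b)) = (1/sqrt(2*pi)) * integral from a to b of exp(-x^2/2) dx."""
    def cdf(t):
        # Standard normal distribution function expressed through erf.
        return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    return cdf(b) - cdf(a)

print(gaussian_measure(-math.inf, math.inf))  # total mass of gamma
print(gaussian_measure(-1.0, 1.0))            # mass of the interval (-1, 1)
```

With total mass 1, every elementary measurable cylinder in the infinite product has measure equal to a finite product of such interval masses.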

