A FRIENDLY INTRODUCTION TO ANALYSIS Maung Min-Oo April 3, 2004



Preface

Analysis is the rigorous and more advanced study of the techniques and results used in calculus. As is well known, although calculus was “invented” by Newton and Leibniz in the 17th century, a more precise and rigorous foundation was laid only about 200 years later by Cauchy, Weierstrass, Bolzano, Dedekind, Cantor and others. The main difficulty was to give a precise definition of the set of real numbers and to understand the notions of infinities, infinitesimals and limits. Of course, such attempts go back to ancient times, as can be seen in the paradoxes of Zeno of Elea (5th century B.C.).

The main difficulty of modern-day undergraduate students in understanding rigorous analysis is the lack of basic training in the nature and structure of logical arguments and mathematical proofs. This is partly due to the fact that disciplines like classical Euclidean Geometry, which generations of human beings have studied, are no longer taught in schools. It is claimed that after the Bible, Euclid’s Elements is the most widely read book of western civilization. In any case, proofs not only “scare” many students, but it is also sometimes hard for students to appreciate why mathematicians go through all that trouble to “prove” what seem like obvious facts. I do admit that the notation and style of most analysis textbooks are quite unappealing, especially to students whose only encounter with mathematics is a “standardized calculus textbook” where proofs are not rigorous, definitions are not clearly stated and the emphasis is more on a rather superficial (cook-book-style) mastery of certain techniques. It is therefore not the fault of the students that they are never exposed to some of the finer points and perhaps the true nature of mathematical reasoning in the first year at a University. Education can then degenerate into blindly following a set of technical instructions (given by instructors (sic)) rather than an awe-inspiring and mind-opening experience that we all strive for in life. I admit I’m also not so sure that the prescribed material that I will be teaching you in this course is really that mind-boggling either! (On a personal note, I should add that in the early 70’s in Germany, where I was a student, there were no credits for courses and also no tuition fees. I never had to take a written test or examination. All examinations I took were oral and one-on-one with my professors. It was challenging and a lot of fun!)

These notes are written in the informal and personal style of a “friendly” mathematician, whose motto is: “It’s better to be approximately right than exactly wrong”. The notes are by no means a substitute for my live lectures (which, by the way, are in general much more entertaining!). They are meant to give a somewhat more organized and abbreviated outline of the material to be discussed in the lectures. The notes (and also the lectures) are not supposed to be comprehensive. The art of teaching (and learning) consists in choosing carefully what is relevant and beautiful. My aim is to let you discover (McMaster motto!) rather than cover the maximum amount of material. I therefore strongly advise every student to read at least one other recommended book in Real Analysis. Some are available on reserve in the Thode library. I apologize for not including too many exercises in the notes, but I will be making up for that by giving you more exercises and problems during the term. I also plan to post extra supplementary material and interesting related links on the course web site. My last (in my opinion, unforgivable) mistake is the omission of any pictures in these notes. I am a geometer and I think visually, but I couldn’t bring myself to spend the time to make the appropriate ps files etc. I will certainly draw some pictures on the blackboard during the lectures and hopefully, if I am allowed to teach this course again and if these notes ever become publishable (you will then have to pay for them!), I will include some pictures, although Analysis is still not Geometry!

Yours truly,
Min-Oo


Contents

Preface

0 Preliminaries
0.1 Some elementary propositional logic
0.2 Some naive set theory
0.3 Natural Numbers and the Principle of Induction
0.4 Rational Numbers

1 Real Numbers
1.1 Field Axioms
1.2 Axioms of Order
1.3 The Completeness Axiom
1.4 Remarks on orders of Infinity
1.5 Exercises
1.5.1 Short solutions to some of the exercise problems

2 Sequences and Series
2.1 Sequences
2.2 Convergence
2.3 Monotonic sequences
2.4 Cauchy sequences
2.5 Series
2.6 Convergence criteria for series
2.7 Power Series
2.8 The Exponential Function
2.9 Exercises
2.9.1 Short solutions to some of the exercises

3 Topology of the real line R
3.1 Open sets and closed sets
3.2 Connected sets
3.3 Compact sets
3.4 Elementary topology of Rⁿ
3.5 Exercises
3.5.1 Solutions to some exercise problems

4 Continuity
4.1 Continuous functions
4.2 Continuity and Connectedness
4.3 Continuity and Compactness
4.3.1 The Contraction Mapping Principle
4.4 Uniform Continuity
4.5 Exercises
4.5.1 Short solutions to some of the exercise problems

5 Differentiation
5.1 Definition and basic properties
5.2 Local properties of the derivative
5.3 Some global properties of the derivative
5.3.1 The Logarithm
5.4 Differentiation in Rⁿ
5.4.1 Inverse Function Theorem
5.5 Exercises
5.5.1 Hints and short solutions

6 Integration
6.1 Definition of the Riemann Integral
6.2 Basic properties of the Integral
6.3 Fundamental Theorem of Calculus
6.4 Improper Integrals
6.4.1 The Gamma Function
6.4.2 Stirling’s Formula
6.4.3 The Euler-Maclaurin Summation Formula
6.5 Exercises
6.5.1 Hints and short solutions

7 Metric spaces and Uniform Convergence
7.1 Definitions and examples
7.2 Uniform Convergence
7.2.1 Uniform convergence of power series
7.3 Weierstrass approximation theorem
7.4 Exercises
7.4.1 Hints and short solutions to the exercises


Chapter 0

Preliminaries

0.1 Some elementary propositional logic

First we address some simple matters of the basic logic used in mathematics, especially for definitions and proofs.

A proposition is a statement which can (only) be either true or false. For example, the proposition “1 + 2 = 2 + 1” is true, while the proposition “There is a rational number x such that x² = 2” is false.

If p and q are propositions then the proposition p ∧ q (p AND q) is true if both p and q are true and false otherwise. The proposition p ∨ q (p OR q) is true if either p or q (or both) is (are) true and is false otherwise. The proposition ¬p (read NOT p) is true if p is false and is false if p is true. In some books this is also written as ∼p.

The proposition p ⇒ q (p implies q, or if p then q) is false only if p is true and q is false, and is true otherwise. In particular, if p is false then p ⇒ q is true. In other words, p ⇒ q has the same truth value as ¬p ∨ q. The proposition p ⇔ q (read p is equivalent to q, or p if and only if q) is true if either both p and q are true or both p and q are false, and is false otherwise. In other words, (p ⇔ q) ⇔ ((p ⇒ q) ∧ (q ⇒ p)). We sometimes write “iff” as an abbreviation for if and only if.
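The truth tables described above are small enough to check mechanically. The following Python sketch (the function names implies, iff and check_equivalences are illustrative choices, not notation from these notes) runs through all four combinations of truth values and confirms that p ⇒ q agrees with ¬p ∨ q, and that p ⇔ q agrees with (p ⇒ q) ∧ (q ⇒ p).

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Truth value of p => q: false only when p is true and q is false."""
    return not (p and not q)


def iff(p: bool, q: bool) -> bool:
    """Truth value of p <=> q: true exactly when p and q have the same value."""
    return p == q


def check_equivalences() -> bool:
    """Check both identities on all four assignments of truth values."""
    for p, q in product([False, True], repeat=2):
        if implies(p, q) != ((not p) or q):
            return False
        if iff(p, q) != (implies(p, q) and implies(q, p)):
            return False
    return True


if __name__ == "__main__":
    print("p      q      p=>q   p<=>q")
    for p, q in product([False, True], repeat=2):
        print(f"{p!s:6} {q!s:6} {implies(p, q)!s:6} {iff(p, q)!s:6}")
    print("identities hold:", check_equivalences())
```

A finite check like this is a complete proof here only because a proposition has just two possible truth values.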

Many of the propositions we will encounter in this course are constructed using the quantifiers ∀ and ∃. The symbol ∀ stands for “for all”, “for every” or “for each”, and the symbol ∃ stands for “there exists” or “there is”. To make the expressions with these quantifiers easier to read we also quite often use the phrase “such that” (often abbreviated to “s.t.”). For example:

∀m ∈ Z ∃n ∈ Z s.t. n > m


which we can read as: For every integer m there is some integer n such that n > m. This is obviously true. However, the statement:

∃n ∈ Z ∀m ∈ Z s.t. n > m

which we can read as: There is some integer n such that for every integer m we have n > m, is definitely false. So the negation:

∀n ∈ Z ∃m ∈ Z s.t. n ≤ m

is true. Note how we switched the quantifiers to negate a given proposition. The main message here is that the order in which the quantifiers are written is extremely important. It is easy to fall into logical blunders (even in discussions in everyday life!) if one is not careful with quantifiers.
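The effect of swapping quantifiers can be tried out on a finite set, where ∀ becomes Python's all and ∃ becomes any. This is only a sketch under assumptions: Z is infinite and cannot be enumerated, so the example uses the finite domain {0, 1, 2, 3, 4} and the stand-in relation “n is different from m” instead of n > m. The helper names are hypothetical.

```python
def forall_exists(domain, relation) -> bool:
    """Truth value of: for every m there exists n with relation(n, m)."""
    return all(any(relation(n, m) for n in domain) for m in domain)


def exists_forall(domain, relation) -> bool:
    """Truth value of: there exists n such that relation(n, m) for every m."""
    return any(all(relation(n, m) for m in domain) for n in domain)


def forall_exists_not(domain, relation) -> bool:
    """Negation of exists_forall, written with the quantifiers switched."""
    return all(any(not relation(n, m) for m in domain) for n in domain)


def differs(n: int, m: int) -> bool:
    """Stand-in relation for this finite illustration: n is not equal to m."""
    return n != m


if __name__ == "__main__":
    domain = range(5)
    # Every m has some n different from it.
    print(forall_exists(domain, differs))
    # No single n differs from every m (take m = n).
    print(exists_forall(domain, differs))
    # The negation of the previous statement.
    print(forall_exists_not(domain, differs))
```

The same relation gives a true statement in one order of quantifiers and a false one in the other, which is exactly the point made above.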

0.2 Some naive set theory

We will ignore all the subtle paradoxical difficulties in defining sets and think of a set naively as a collection of objects called its elements. We use the notation x ∈ S if x is an element of a set S (x belongs to S) and x ∉ S if x is not an element of S (x does not belong to S).

The simplest way to define a set is by listing all its elements. We use the notation S = {π, e, 2.71828, √2, 1.4142} to denote the set of five (distinct) real numbers π, e, 2.71828, √2 and 1.4142. This notation is, of course, not always applicable, especially if we have a set with an infinite number of elements. In such cases, sets are defined by naming the property which distinguishes elements of the set from objects which are not in the set. We use the notation {x | p(x)} (or sometimes {x : p(x)}), where p(x) is a statement about the object x which can only be true or false. The set consists of all objects (of a bigger set) for which p(x) is true. For example, {x ∈ Q | x² < 2} denotes the set of all rational numbers whose square is strictly less than 2. We will assume some familiarity with the following examples:

(i) the natural numbers N = {1, 2, 3, . . .},

(ii) the integers Z = {. . . , −2, −1, 0, 1, 2, . . .}, and

(iii) the rational numbers (or fractions) Q = {n/d | n, d ∈ Z, d > 0}.

All these sets lie inside the set of real numbers R, which is what this course is mainly about. Here are some more standard notations that we will use for sets:

If A, B are two sets, we say that A is a subset of B and write A ⊂ B if every element of A is also an element of B. For example, A ⊂ A and ∅ ⊂ A for any set A, where ∅ denotes the empty set, which does not contain any elements.

A ∪ B = {x | (x ∈ A) ∨ (x ∈ B)} and A ∩ B = {x | (x ∈ A) ∧ (x ∈ B)} will denote the union and intersection of two sets.

A × B denotes the Cartesian product, which is the set of all ordered pairs: {(a, b) | a ∈ A, b ∈ B}.

One of the important concepts in Mathematics is the idea of a function (or a map) from a set to another set. We will be mainly concerned with functions from intervals in R to R. It took a long time for mathematicians to clarify exactly what should be meant by a function. A simple operational definition is as follows:

A map or a function f from a set A to another set B, written f : A → B, is a rule that assigns to every element a ∈ A a unique element f(a) ∈ B. We write a ↦ b = f(a) and say that a is mapped to b. The set A is called the domain of f and the set of all b ∈ B with f(a) = b for some a ∈ A is called the range (or image) of f.

The graph of a function f is then defined to be the subset

{(a, f(a)) ∈ A × B | a ∈ A}

of the Cartesian product. (In fact, a more rigorous way to define a function is through its graph.)

f is said to be one-one (or injective) if two distinct elements of A never map to the same element of B. In other words: f : A → B is one-one iff (∀x, y ∈ A)(f(x) = f(y) ⇒ x = y). f is said to be onto (or surjective) if every element of B has some element of A mapped to it. In other words: f : A → B is onto iff ∀ b ∈ B ∃ a ∈ A s.t. f(a) = b. A map which is both one-one and onto is called a bijection (or a one-one correspondence).

If f : A → B is a map, the image of a set X ⊂ A is defined by f(X) = {b ∈ B | ∃x ∈ X s.t. b = f(x)} and the inverse image of a set Y ⊂ B is defined by f⁻¹(Y) = {a ∈ A | f(a) ∈ Y}.
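For finite sets, the definitions of image, inverse image, one-one and onto can be tried out directly. The sketch below is only an illustration under assumptions: a function f : A → B is stored as a Python dict whose keys are the elements of A, and the helper names image, inverse_image, is_injective and is_surjective are hypothetical, chosen for this example.

```python
def image(f: dict, X: set) -> set:
    """f(X): all values f(x) with x in X."""
    return {f[x] for x in X}


def inverse_image(f: dict, Y: set) -> set:
    """Inverse image of Y: all a in the domain with f(a) in Y."""
    return {a for a in f if f[a] in Y}


def is_injective(f: dict) -> bool:
    """One-one: distinct elements of the domain never share a value."""
    return len(set(f.values())) == len(f)


def is_surjective(f: dict, B: set) -> bool:
    """Onto: every element of B is the value of some element of the domain."""
    return set(f.values()) == B


# Example: squaring on A = {-2, ..., 2}, viewed as a map into B = {0, ..., 4}.
A = {-2, -1, 0, 1, 2}
B = {0, 1, 2, 3, 4}
square = {a: a * a for a in A}

if __name__ == "__main__":
    print(image(square, {1, 2}))         # the set {1, 4}
    print(inverse_image(square, {1}))    # the set {-1, 1}
    print(inverse_image(square, {3}))    # empty: nothing in A squares to 3
    print(is_injective(square))          # False, since (-1)^2 = 1^2
    print(is_surjective(square, B))      # False, since 2 and 3 are not squares
```

Note that the inverse image is defined for every map, whether or not the map has an inverse function.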


0.3 Natural Numbers and the Principle of Induction

Natural numbers are used for counting and ordering. The main property (or axiom, if you wish) about N that we will use is the Induction Principle.

Axiom 0.3.1 (Principle of Induction) In order to prove the proposition P(n) for all n ≥ n₀, it is sufficient to show the following:

N1 P(n₀) is true.

N2 (induction step) ∀ k ≥ n₀, we have P(k) ⇒ P(k + 1).

We also use the inductive property of the natural numbers for recursive definitions, such as in the following definition of the factorial.

0! = 1

∀ k ≥ 0 (k + 1)! = (k + 1)k!
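The recursive definition above translates almost word for word into a recursive function. The following Python sketch (the name factorial is simply the obvious choice; the error handling for negative input is an addition not discussed in the notes) mirrors the two lines of the definition.

```python
def factorial(k: int) -> int:
    """Factorial defined recursively: 0! = 1 and (k + 1)! = (k + 1) * k!."""
    if k < 0:
        raise ValueError("factorial is only defined here for k >= 0")
    if k == 0:
        return 1                      # base case: 0! = 1
    return k * factorial(k - 1)       # recursive step: k! = k * (k - 1)!


if __name__ == "__main__":
    for k in range(6):
        print(k, factorial(k))
```

The base case plays the role of N1 and the recursive call plays the role of the induction step N2.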

0.4 Rational Numbers

The rational numbers Q are all the fractions a/b with a ∈ Z and b ∈ N (so b ≠ 0), where we identify two expressions a/b and c/d as defining the same rational number if ad = bc (this is the usual way we cancel fractions). We add, subtract, multiply and divide rational numbers the same way as we did in elementary school (of course, we do not divide by zero!) and find that the set of rational numbers Q forms what is known as a field (which we will define in the next chapter). This just means that if r, s ∈ Q then r + s, r − s, r·s and also r/s (provided s ≠ 0) are all in Q, and the operations satisfy the usual commutative, associative and distributive rules that we are familiar with from elementary school. The numbers 0 and 1, of course, play a special role: 0 is the “identity element” for addition and 1 is the “identity element” for multiplication. “Identity” just means that it does not change any element that it operates on.

The Greeks believed a long time ago that fractions were sufficient to describe all “real” phenomena. However, the Pythagoreans, a philosophical school founded by Pythagoras of Samos (569 to 475 BC), discovered the following result.


Proposition 0.4.1 The real number √2 is not a rational number.

Proof (by contradiction) Suppose that √2 = a/b with a ∈ Z and b ∈ N. We may as well assume that a, b have no common factors, else we could cancel them out. Then 2b² = a² and so a is even. But then a², and hence 2b², is divisible by 4, and so b² is even. But then b is also even and so a and b do have a common factor, viz. 2. Thus we arrive at a contradiction. Hence √2 is not in Q. QED
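A computer search cannot replace the proof, but it makes a reassuring sanity check: for each denominator b up to some bound, look for an integer a with a² = 2b². The sketch below is hypothetical illustration code (the function name and the bound are arbitrary choices); it uses exact integer arithmetic, so there are no rounding issues.

```python
from math import isqrt


def rational_square_roots_of_two(max_denominator: int) -> list:
    """Return all pairs (a, b) with 1 <= b <= max_denominator and a*a == 2*b*b."""
    hits = []
    for b in range(1, max_denominator + 1):
        # isqrt gives the integer part of the square root, so this a is the
        # only non-negative integer that could possibly satisfy a*a == 2*b*b.
        a = isqrt(2 * b * b)
        if a * a == 2 * b * b:
            hits.append((a, b))
    return hits


if __name__ == "__main__":
    # The proposition says this list is empty for every bound.
    print(rational_square_roots_of_two(10_000))
```

However large the bound, the search only covers finitely many denominators; it is the argument by contradiction that settles all of them at once.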

The Pythagoreans realized that one could easily construct a line segment of length √2 by elementary geometry (for example, the diagonal of a unit square) and so they were forced to the conclusion that rational numbers were not sufficient to describe their geometric system. Mathematics needs real numbers to describe reality (or at least the shadow of it!).


Chapter 1

Real Numbers

1.1 Field Axioms

The basic algebraic fact about the real numbers is that they form a field. Of course, there are many other fields and the theory of fields (for example, Galois theory) is an elegant algebraic subject. However, this is a course in analysis, so we will simply assume that the student is familiar with the operations of addition, subtraction, multiplication and division for real numbers and the basic rules that they satisfy (commutativity, associativity, distributivity etc.). For the sake of completeness, we will give the formal algebraic definition of a field.

Definition 1.1.1 A field is a set F equipped with two operations + : F × F → F (addition) and · : F × F → F (multiplication) satisfying the following axioms:

F1 ∀x, y, z ∈ F; x + (y + z) = (x + y) + z and x · (y · z) = (x · y) · z

F2 ∀x, y ∈ F; x + y = y + x and x · y = y · x

F3 ∀x, y, z ∈ F; x · (y + z) = (x · y) + (x · z)

F4 ∃ 0 ∈ F s.t. ∀x ∈ F; 0 + x = x

F5 ∃ 1 ∈ F, 1 ≠ 0 s.t. ∀x ∈ F; 1 · x = x

F6 ∀x ∈ F ∃(−x) ∈ F s.t. (−x) + x = 0

F7 ∀x ∈ F with x ≠ 0, ∃ x⁻¹ ∈ F such that (x⁻¹) · x = 1

One normally omits the symbol · for multiplication and simply juxtaposes the factors. We also write x − y for x + (−y). Our main algebraic axiom about R is therefore:

Axiom 1.1.1 The real numbers form a field under the usual rules of addition and multiplication.

Other well-known examples of fields are:

1. Q is a field, but N and Z do not satisfy all the field axioms. (Can you see which of the axioms fail for N and for Z?)

2. However, the integers modulo a prime number p, denoted by Z/pZ, form a field. (The main axiom to check is the last one, about the existence of the multiplicative inverse. Can you see that this would fail if p is not a prime, say p = 12?)

3. The complex numbers C form a field. (Do you know any other fields?)
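The question about Z/pZ in example 2 can be explored by brute force. The following Python sketch (the names inverses_mod and satisfies_F7 are hypothetical, and residues are represented by the integers 0, ..., n − 1) searches for a multiplicative inverse of every nonzero residue and reports whether axiom F7 holds.

```python
def inverses_mod(n: int) -> dict:
    """Map each nonzero residue x mod n to some y with x*y = 1 (mod n), if one exists."""
    table = {}
    for x in range(1, n):
        for y in range(1, n):
            if (x * y) % n == 1:
                table[x] = y
                break
    return table


def satisfies_F7(n: int) -> bool:
    """True if every nonzero residue mod n has a multiplicative inverse."""
    return len(inverses_mod(n)) == n - 1


if __name__ == "__main__":
    print(inverses_mod(7))     # every nonzero residue mod 7 has an inverse
    print(inverses_mod(12))    # only 1, 5, 7 and 11 have inverses mod 12
    print(satisfies_F7(7), satisfies_F7(12))
```

For n = 12 the residues without an inverse are exactly those sharing a common factor with 12, which is a hint towards the general answer.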

The field axioms trivially imply some algebraic operational rules that we are all familiar with from elementary school. A few selected examples of such rules are the following:

(i) The commutative and the associative laws can be extended by induction to any finite number of elements. The same is true for the distributive law.

(ii) The “neutral” elements 0 and 1 are unique and the same is true for both “inverses” −x and x⁻¹.

(iii) 0x = 0 for all x.
Proof: 0x = (0 + 0)x = 0x + 0x. Now add the inverse −(0x) to both sides of the equation and use associativity of addition: 0 = −(0x) + 0x = −(0x) + (0x + 0x) = (−(0x) + 0x) + 0x = 0 + 0x = 0x.

This seems like splitting hairs but that is what we are learning here!

(iv) Simple rules such as −(−a) = a; (a⁻¹)⁻¹ = a for a ≠ 0; (−a)b = a(−b) = −(ab); (−a)(−b) = ab etc.
Sample proof: 0 = 0a = a0 = a(b + (−b)) = ab + a(−b), and so −(ab) = −(ab) + 0 = −(ab) + (ab + a(−b)) = (−(ab) + ab) + a(−b) = 0 + a(−b) = a(−b).

(v) One important fact is the cancellation rule (or in fancier words, the absence of zero divisors in a field!): ab = 0 if and only if a = 0 ∨ b = 0. This implies: ac = ab ∧ a ≠ 0 ⇒ c = b. (This is, for example, not true in Z/12Z.)

Proof: Suppose ab = 0 and a ≠ 0. Then ∃ a⁻¹ so that a⁻¹a = 1, but then 0 = a⁻¹0 = a⁻¹(ab) = (a⁻¹a)b = 1b = b. QED

(vi) Powers are defined inductively by: x⁰ = 1, xⁿ⁺¹ = x·xⁿ for n = 0, 1, 2, . . . . For x ≠ 0 we also define x⁻ⁿ = (x⁻¹)ⁿ = (xⁿ)⁻¹.
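The remark in (v) that cancellation fails in Z/12Z can be made concrete. The sketch below is an illustration only (the function name zero_divisor_pairs is hypothetical); it lists the pairs of nonzero residues whose product is 0 modulo n, and shows one instance where ac = ab with a ≠ 0 but c ≠ b.

```python
def zero_divisor_pairs(n: int) -> list:
    """All pairs (a, b) of nonzero residues mod n, with a <= b, such that a*b = 0 (mod n)."""
    return [(a, b)
            for a in range(1, n)
            for b in range(a, n)
            if (a * b) % n == 0]


if __name__ == "__main__":
    print(zero_divisor_pairs(7))     # empty list: Z/7Z is a field
    print(zero_divisor_pairs(12))    # contains (3, 4), since 3 * 4 = 12

    # Cancellation fails in Z/12Z: 3*4 and 3*8 are both 0 mod 12, yet 4 != 8.
    a, b, c = 3, 4, 8
    print((a * b) % 12 == (a * c) % 12, b % 12 == c % 12)
```

Comparing the output for 7 and for 12 shows the absence of zero divisors in the field case.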


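A more important result is the binomial formula:

Proposition 1.1.1

\[ (x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^k \]

for any n ∈ N, where the binomial coefficients are defined by

\[ \binom{n}{k} = \frac{n!}{k!\,(n-k)!} \]

Proof (by induction) For n = 1:

\[ (x + y)^1 = x + y = \binom{1}{0} x^1 y^0 + \binom{1}{1} x^0 y^1 \]

Assume:

\[ (x + y)^k = \sum_{j=0}^{k} \binom{k}{j} x^{k-j} y^j \]

Then

\[
\begin{aligned}
(x + y)^{k+1} &= (x + y) \sum_{j=0}^{k} \binom{k}{j} x^{k-j} y^j \\
&= \sum_{j=0}^{k} \binom{k}{j} \left( x^{k-j+1} y^j + x^{k-j} y^{j+1} \right) \\
&= \sum_{j=0}^{k} \binom{k}{j} x^{k-j+1} y^j + \sum_{j=1}^{k+1} \binom{k}{j-1} x^{k-j+1} y^j \\
&= \sum_{j=0}^{k+1} \binom{k+1}{j} x^{k+1-j} y^j
\end{aligned}
\]

where we use the Pascal Triangle relation (for 1 ≤ j ≤ k; the terms with j = 0 and j = k + 1 are simply $x^{k+1}$ and $y^{k+1}$):

\[ \binom{k}{j} + \binom{k}{j-1} = \binom{k+1}{j}. \]

This relation follows by a simple calculation with factorials from the definition of the binomial coefficients, but if we use the fact that $\binom{n}{k}$ represents the number of different ways of choosing k objects from n objects (that’s why we read “n choose k” for $\binom{n}{k}$), it can be seen immediately from the following combinatorial argument: If we have to choose a committee of j members from k students (that’s you) and 1 professor (that’s me), then there are only two mutually exclusive possibilities: either I’m on the committee or I’m not. In the first case the other j − 1 members are chosen from the k students, which can be done in $\binom{k}{j-1}$ ways; in the second case all j members are students, which gives $\binom{k}{j}$ ways.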
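Both the Pascal Triangle relation and the binomial formula are easy to spot-check numerically. The Python sketch below is a check for a range of small values, not a proof; the helper names are hypothetical, while math.comb is the standard library function for “n choose k”.

```python
from math import comb


def pascal_relation_holds(k: int, j: int) -> bool:
    """Check C(k, j) + C(k, j - 1) == C(k + 1, j) for given k >= 1 and j >= 1."""
    return comb(k, j) + comb(k, j - 1) == comb(k + 1, j)


def binomial_expansion(x: int, y: int, n: int) -> int:
    """Right-hand side of the binomial formula: sum of C(n, k) x^(n-k) y^k for k = 0..n."""
    return sum(comb(n, k) * x ** (n - k) * y ** k for k in range(n + 1))


if __name__ == "__main__":
    print(all(pascal_relation_holds(k, j)
              for k in range(1, 10) for j in range(1, k + 2)))
    # Compare both sides of the binomial formula for x = 3, y = 5, n = 4.
    print(binomial_expansion(3, 5, 4), (3 + 5) ** 4)
```

Integer arguments are used so that both sides of each identity are compared exactly.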


1.2 Axioms of Order

There is a non-empty subset of real numbers called the positive numbers (notation: a > 0) satisfying the following axioms:

O1 Trichotomy: For any a ∈ R exactly one of a > 0, a = 0, −a > 0 is true.

O2 If a, b > 0 then a + b > 0 and a · b > 0

A field with a relation called an ordering satisfying the two properties above is called an ordered field. The field Q of rationals is an ordered field (with the usual ordering) but the field C of complex numbers is not an ordered field (under any ordering). Our next basic axiom about R is therefore:

Axiom 1.2.1 The real numbers form an ordered field.

We will write a > b or b < a for a − b > 0. It follows from the axioms that the ordering > is transitive, i.e., if a > b and b > c, then a > c (because a − c = (a − b) + (b − c)). We also write a ≥ b to mean the same thing as a > b ∨ a = b. (a ≤ b is equivalent to −a ≥ −b because b − a = −a − (−b).)

The axioms trivially imply some simple and familiar properties such as:

(i) 1 > 0 .

(ii) x > 0, y < 0 ⇒ xy < 0 ; x < 0, y < 0 ⇒ xy > 0 .

(iii) a > b ⇒ a + c > b + c. If c > 0 and a > b, then ac > bc (because ac − bc = (a − b)c), but if c < 0, then a > b ⇒ bc > ac.

(iv) If 0 < a < b, then a⁻¹ > b⁻¹ > 0, but if a < b < 0, then 0 > a⁻¹ > b⁻¹.

(v) The square x² of a real number x ≠ 0 is always strictly positive: x² > 0 (because (−x)(−x) = x²). Of course 0² = 0.

The following notation will be used for intervals on the real line R: An open interval (a, b) = {x ∈ R | a < x < b} does not contain its end points. A closed interval [a, b] = {x ∈ R | a ≤ x ≤ b} contains both end points. We will also use some other types of intervals: [a, b), (a, b], (−∞, b], (a, ∞), etc. Their meaning should be clear from the notation, for example: [a, b) = {x ∈ R | a ≤ x < b} and (−∞, b] = {x ∈ R | x ≤ b}.

We define the absolute value of a real number to be:

Definition 1.2.1 |x| = x if x ≥ 0 and |x| = −x if x < 0


Then we have −|x| ≤ x ≤ |x|. (In fact, either x = |x| or x = −|x|. It can never happen that −|x| < x < |x|.) If we add the two inequalities for x and y, we obtain x + y ≤ |x| + |y| and also −x − y ≤ |x| + |y|. This proves the following basic inequality:

Proposition 1.2.1 (Triangle Inequality)

|x + y| ≤ |x|+ |y|

for all x, y ∈ R.

Of course, it can happen that the strict inequality holds, e.g., |2 + (−1)| = 1 is strictly less than |2| + |−1| = 2 + 1 = 3.

Other very useful inequalities are the following:

Proposition 1.2.2 (Bernoulli’s Inequality)

\[ (1 + x)^n \geq 1 + nx \]

for every $x \geq -1$ and for all $n \in \mathbb{N}$.

Proof (by induction): For n = 1, 1 + x = 1 + 1·x, so the inequality is true. Assuming that it is true for n = k, so that $(1 + x)^k \geq 1 + kx$, we have

\[ (1+x)^{k+1} = (1+x)(1+x)^k \geq (1+x)(1+kx) = 1 + (k+1)x + kx^2 \geq 1 + (k+1)x \]

since $1 + x \geq 0$ and $kx^2 \geq 0$. QED
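Bernoulli's inequality can be spot-checked with exact rational arithmetic. The following Python sketch (the function name bernoulli_gap is a hypothetical helper) computes the difference between the two sides; the inequality says this difference is non-negative whenever x ≥ −1. The last line shows that the hypothesis x ≥ −1 cannot simply be dropped.

```python
from fractions import Fraction


def bernoulli_gap(x: Fraction, n: int) -> Fraction:
    """Return (1 + x)^n - (1 + n*x); Bernoulli's inequality says this is >= 0 for x >= -1."""
    return (1 + x) ** n - (1 + n * x)


if __name__ == "__main__":
    # Sample x = -1, -3/4, ..., 3 and n = 1, ..., 7.
    samples = [Fraction(p, 4) for p in range(-4, 13)]
    print(all(bernoulli_gap(x, n) >= 0 for x in samples for n in range(1, 8)))

    # For x < -1 the inequality can fail: here (1 + x)^3 = -27 but 1 + 3x = -11.
    print(bernoulli_gap(Fraction(-4), 3))
```

Fractions are used instead of floating point numbers so that no rounding error can blur the comparison.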

Proposition 1.2.3 (Cauchy-Schwarz Inequality)

\[ (x_1 y_1 + \cdots + x_n y_n)^2 \leq (x_1^2 + \cdots + x_n^2)(y_1^2 + \cdots + y_n^2) \]

for every $x_1, \ldots, x_n, y_1, \ldots, y_n \in \mathbb{R}$ and for all $n \in \mathbb{N}$. Moreover, equality holds iff the $x_i$’s are proportional to the $y_i$’s.

Proof: The inequality is trivially true if $x_1 = \cdots = x_n = 0$ or if $y_1 = \cdots = y_n = 0$, so we can assume w.l.o.g. (without loss of generality), by a simple scaling, that $x_1^2 + \cdots + x_n^2 = y_1^2 + \cdots + y_n^2 = 1$. Then

\[ \sum_{i=1}^{n} (x_i - y_i)^2 \geq 0 \;\Rightarrow\; 2 - 2\sum_{i=1}^{n} x_i y_i \geq 0 \;\Rightarrow\; \sum_{i=1}^{n} x_i y_i \leq 1 \]

\[ \sum_{i=1}^{n} (x_i + y_i)^2 \geq 0 \;\Rightarrow\; 2 + 2\sum_{i=1}^{n} x_i y_i \geq 0 \;\Rightarrow\; -\sum_{i=1}^{n} x_i y_i \leq 1 \]

Therefore

\[ \Bigl| \sum_{i=1}^{n} x_i y_i \Bigr| \leq 1 = \Bigl( \sum_{i=1}^{n} x_i^2 \Bigr) \Bigl( \sum_{i=1}^{n} y_i^2 \Bigr) \]

Moreover, equality holds iff $x_i = y_i$ for all i, or $x_i = -y_i$ for all i.

QED
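As a numerical illustration of the Cauchy-Schwarz inequality, the Python sketch below (the function name cauchy_schwarz_gap is a hypothetical helper) computes the right-hand side minus the left-hand side for two lists of numbers. The gap should never be negative, and it should vanish when one list is a multiple of the other.

```python
def cauchy_schwarz_gap(xs, ys):
    """Return (sum x_i^2)(sum y_i^2) - (sum x_i y_i)^2 for two lists of equal length."""
    if len(xs) != len(ys):
        raise ValueError("both lists must have the same length")
    dot = sum(x * y for x, y in zip(xs, ys))
    return sum(x * x for x in xs) * sum(y * y for y in ys) - dot * dot


if __name__ == "__main__":
    # Not proportional: 14 * 77 - 32^2 = 54, a strictly positive gap.
    print(cauchy_schwarz_gap([1, 2, 3], [4, 5, 6]))
    # Proportional (ys = 2 * xs): the gap is exactly zero.
    print(cauchy_schwarz_gap([1, 2, 3], [2, 4, 6]))
```

With integer inputs the arithmetic is exact, so a zero gap really does mean equality.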

Proposition 1.2.4 (Geometric Mean / Arithmetic Mean Inequality) If $x_1, \ldots, x_n$ are positive real numbers then

\[ \left( \frac{x_1 + \cdots + x_n}{n} \right)^n \geq x_1 \cdots x_n \]

Moreover, strict inequality holds unless all the numbers are equal.

Proof (by induction): For n = 1 the inequality is trivially true. We assume the inequality is true for any n positive numbers, i.e. $(AM_n)^n \geq (GM_n)^n$, where the arithmetic mean of n positive numbers $x_1, \ldots, x_n$ is defined by $AM_n = \frac{x_1 + \cdots + x_n}{n}$ and the geometric mean is defined by $(GM_n)^n = x_1 \cdots x_n$. (I am just avoiding taking the nth root!) We want to show the inequality holds for n + 1 positive real numbers and we may assume w.l.o.g. that $x_1 \leq \ldots \leq x_n \leq x_{n+1}$. Now

\[ AM_{n+1} = \frac{x_1 + \cdots + x_{n+1}}{n+1} = \frac{n}{n+1} AM_n + \frac{1}{n+1} x_{n+1} = AM_n \left( 1 + \frac{1}{n+1} \left( \frac{x_{n+1}}{AM_n} - 1 \right) \right) \]

Therefore:

\[ \left( \frac{AM_{n+1}}{AM_n} \right)^{n+1} = \left( 1 + \frac{1}{n+1} \left( \frac{x_{n+1}}{AM_n} - 1 \right) \right)^{n+1} \geq 1 + \left( \frac{x_{n+1}}{AM_n} - 1 \right) = \frac{x_{n+1}}{AM_n} \]

by the Bernoulli inequality. Hence, by the induction hypothesis

\[ (AM_{n+1})^{n+1} \geq (AM_n)^n \, x_{n+1} \geq (GM_n)^n \, x_{n+1} = (GM_{n+1})^{n+1} \]

QED
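The inequality between the arithmetic and geometric means can be checked exactly in the nth-power form used in the proof, which avoids roots altogether. The sketch below is an illustration with a hypothetical helper name, am_gm_gap; it assumes the inputs are positive integers or Fractions so that the arithmetic is exact.

```python
from fractions import Fraction
from math import prod


def am_gm_gap(xs) -> Fraction:
    """Return (AM_n)^n - (GM_n)^n, i.e. (mean of xs)^n minus the product of xs."""
    n = len(xs)
    arithmetic_mean = Fraction(sum(xs), n)
    return arithmetic_mean ** n - prod(xs)


if __name__ == "__main__":
    print(am_gm_gap([1, 2, 3]))    # 2^3 - 6 = 2, strictly positive
    print(am_gm_gap([5, 5, 5]))    # all numbers equal, so the gap is 0
    print(am_gm_gap([1, 4]))       # (5/2)^2 - 4 = 9/4
```

The second example illustrates the equality case stated in the proposition.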


1.3 The Completeness Axiom

The main analytic property which distinguishes R from Q is the completeness axiom. There are different ways to introduce this axiom depending on whether one wants to assume the Archimedean property (see below) as an axiom or not, but for the sake of simplicity (remember this course is supposed to be a friendly introduction to analysis!), we will use the least upper bound property, which then automatically implies the Archimedean property.

Definition 1.3.1 An upper bound of a non-empty subset A of R is an element b ∈ R with b ≥ a for all a ∈ A. A is said to be bounded from above if A has an upper bound. A lower bound of a non-empty subset A of R is defined analogously as an element b ∈ R with b ≤ a for all a ∈ A. A is said to be bounded from below if A has a lower bound. A is said to be bounded if it has both upper and lower bounds.

Definition 1.3.2 An element M ∈ R is called a least upper bound or supremum of a non-empty set A, written lub(A) or sup(A), if M is an upper bound of A and if b is any upper bound of A then b ≥ M. M (if it exists!) is uniquely defined by this property.

In other words, M is the upper bound of A such that any other upper bound of A (different from M) is strictly larger than M. If A is not bounded from above we often write sup(A) = +∞.

Definition 1.3.3 An element m ∈ R is called a greatest lower bound or infimum of a non-empty set A, written glb(A) or inf(A), if m is a lower bound of A and if b is any lower bound of A then b ≤ m. m (if it exists!) is uniquely defined by this property.

In other words, m is the lower bound of A such that any other lower bound of A (different from m) is strictly smaller than m. If A is not bounded from below we often write inf(A) = −∞.

We can now state The Least Upper Bound Property:

Axiom 1.3.1 If a non-empty subset A of R has an upper bound, it has a least upper bound.


An ordered field that satisfies the least upper bound property is called a complete ordered field. So to sum up, all we assume about the real numbers R is that they form a complete ordered (Archimedean) field. (In fact, one can show that, up to “isomorphism of ordered fields”, R is the only complete (Archimedean) ordered field.)

Note that the ordered field Q is not complete. For example, the set A = {q ∈ Q | q² < 2} is bounded but does not have a least upper bound in Q. We will see why in a little while. We first list some easy consequences of the completeness axiom. First of all, by just changing signs (and flipping the inequalities) it is obvious that any non-empty subset A ⊂ R which has a lower bound has a greatest lower bound: if B = {x ∈ R | −x ∈ A}, then B is bounded from above iff A is bounded from below, and glb(A) = −lub(B).
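One standard way to see why the set A = {q ∈ Q | q² < 2} has no least upper bound in Q (offered here as a sketch, and not necessarily the argument these notes give later) is to show that every rational upper bound can be improved. A rational upper bound b of A must be positive with b² > 2, since b² = 2 is impossible by Proposition 0.4.1. Then b' = (b + 2/b)/2 is again rational, is strictly smaller than b, and still satisfies b'² > 2 because b'² − 2 = ((b − 2/b)/2)². The function name smaller_upper_bound below is a hypothetical helper.

```python
from fractions import Fraction


def smaller_upper_bound(b: Fraction) -> Fraction:
    """Given a rational b > 0 with b^2 > 2, return a smaller rational whose square also exceeds 2."""
    if b <= 0 or b * b <= 2:
        raise ValueError("b must be a positive rational with b^2 > 2")
    return (b + 2 / b) / 2


if __name__ == "__main__":
    b = Fraction(2)
    for _ in range(4):
        # Each step prints a rational upper bound of A and its (exact) square.
        print(b, b * b)
        b = smaller_upper_bound(b)
```

Starting from b = 2 the first improved bounds are 3/2 and 17/12; the process never stops, so no rational upper bound is the least one.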

Before we state the next property we need to describe more precisely how N, Z and Q are imbedded into the reals. First of all, N is identified with multiples of the unit element 1 in R, i.e. n = 1 + · · · + 1 where the sum is over n terms. The key fact here is that this is an injective (one to one) map. (This is, for example, not true for finite fields.) After that Z and Q can be imbedded in the obvious fashion, so p/q ∈ Q is identified with $p \cdot q^{-1}$, since R is a field. An important property of the real numbers which follows from our axioms is the following:

Proposition 1.3.1 ∀x ∈ R ∃n ∈ N such that n > x.

Equivalently: ∀a > 0 ∃n ∈ N such that 1/n < a.

Proof: This is equivalent to saying that N is not bounded above. This seems like a very obvious fact, but we will prove it from the axioms. Suppose N were bounded above. Then it would have a least upper bound, M say. But then M − 1 is not an upper bound and so there is an n ∈ N with n > M − 1. But then n + 1 > M, contradicting the fact that M is an upper bound for all of N. QED

The above property of the real numbers is called the Archimedean property of the Reals. It has been attributed to the famous Greek mathematician Archimedes (287 to 212 BC) and appears in Book V of The Elements of Euclid.

We also deduce the following important fact:

Proposition 1.3.2 Between any two distinct real numbers there is at least one (and hence infinitely many) rational numbers.

Proof: Let a, b ∈ R with (say) a < b. Choose n ∈ N so that 1/n < b − a. Then look at integer multiples of 1/n. Since these are unbounded, we may choose the first such multiple with m/n > a. We claim that m/n < b. If not, then since (m − 1)/n ≤ a and m/n ≥ b we would have 1/n ≥ b − a, contradicting the choice of n.

QED
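The construction in this proof is completely explicit, so it can be tried out on a computer. The following is only a small sketch in Python using exact fractions; the function name `rational_between` is made up for this illustration, and the inputs are assumed to be rational so that the arithmetic is exact.

```python
from fractions import Fraction
from math import floor

def rational_between(a, b):
    """Return a rational m/n with a < m/n < b, following the proof above."""
    if not a < b:
        raise ValueError("need a < b")
    # Archimedean property: pick n with 1/n < b - a.
    n = floor(1 / (b - a)) + 1
    # First integer multiple of 1/n that exceeds a.
    m = floor(a * n) + 1
    return Fraction(m, n)

q = rational_between(Fraction(1, 3), Fraction(2, 5))
print(q)  # 3/8
```

Here b − a = 1/15, so n = 16 and m = 6, which gives 6/16 = 3/8.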

A set A with the property that an element of A lies in every interval (a, b) of R is called dense in R. We have just proved that the rationals Q are dense in R. The irrationals (= $\mathbb{R} \cap \mathbb{Q}^c$) are also dense in R.

We now prove the result we stated earlier.

Proposition 1.3.3 The real number $\sqrt{2}$ exists.

Proof: We will get the existence of $\sqrt{2}$ as the least upper bound of the set $A = \{q \in \mathbb{Q} \mid q^2 < 2\}$. We know that A is bounded above: 2 is an upper bound for A. (Why?) Let b = lub(A). We now prove that $b^2 < 2$ and $b^2 > 2$ both lead to contradictions and so we must have $b^2 = 2$ (by the trichotomy axiom for the ordering).

So first suppose that $b^2 > 2$. By the Archimedean property we can choose an n ∈ N such that $n > \frac{2b}{b^2 - 2}$, so that $b^2 - \frac{2b}{n} > 2$. Then $(b - \frac{1}{n})^2 = b^2 - \frac{2b}{n} + (\frac{1}{n})^2 > 2$, since a square is always non-negative. Thus $b - \frac{1}{n}$ is an upper bound of A, contradicting the assumption that b is the least upper bound.

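Similarly, if $b^2 < 2$, then $a = b + \frac{2 - b^2}{b + 2} = \frac{2b + 2}{b + 2}$ is > b and satisfies $a^2 = \frac{4b^2 + 8b + 4}{b^2 + 4b + 4} = 2 - \frac{2(2 - b^2)}{(b + 2)^2} < 2$. By Proposition 1.3.2 there is a rational q with b < q < a. Since q > b ≥ 1 > 0 (note that 1 ∈ A), we get $q^2 < a^2 < 2$, so q is in A and q > b. This contradicts the fact that b is an upper bound of A.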

QED
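The map b ↦ (2b + 2)/(b + 2) from the second half of the proof can be watched in action. The sketch below (the name `step` is just a label chosen here) starts at the rational number 1 and applies the map a few times with exact fractions; each new value is still in A but strictly larger, which illustrates why A has no largest element.

```python
from fractions import Fraction

def step(b):
    """The map b -> (2b + 2)/(b + 2) used in the proof."""
    return (2 * b + 2) / (b + 2)

b = Fraction(1)
for _ in range(5):
    nxt = step(b)
    # Each step stays in A (square below 2) and strictly increases.
    assert b < nxt and nxt * nxt < 2
    b = nxt
print(b)  # 140/99
```

The values produced are 4/3, 7/5, 24/17, 41/29, 140/99, all rational and creeping up towards $\sqrt{2}$ from below.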

We end this section by reminding you that real numbers can be defined by decimal expansions. Given the decimal expansion $n + \sum_{i>0} a_i 10^{-i}$ of a positive real number, the set of rational approximations $q_k = n + \sum_{0 < i \le k} a_i 10^{-i}$ with k ∈ N forms a bounded set and so it has a least upper bound. This is by definition then the real number defined by the decimal expansion. Every real number has a unique decimal expansion, except that decimals that terminate in a sequence of 9's should be truncated. For example 1.29999... = 1.3. Rational numbers have decimal expansions which repeat periodically (or terminate). This follows from the Euclidean algorithm. Of course, the reason that we use the number 10 as our base for decimals probably stems from the “accidental” fact that humans have ten fingers! One could also use any other integer greater than 1 as a base. Computers use the binary system with base 2 (and also base 16). The ternary system (base 3) is useful in describing the Cantor set, which we will encounter in Chapter 3. The Cantor set consists of all points in [0, 1] that have a ternary expansion which does not contain the digit 1. For example 0.12 = 5/9 in base 3, so 5/9 is not in the Cantor set.
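Expansions in any base can be computed digit by digit: multiply by the base, read off the integer part, and repeat with the remainder. The following Python sketch (the helper name `digits` is an invention for this note) does this with exact fractions for a number in [0, 1).

```python
from fractions import Fraction

def digits(x, base, count):
    """First `count` digits after the point of x in [0, 1) in the given base."""
    out = []
    for _ in range(count):
        x *= base
        d = int(x)      # the integer part is the next digit
        out.append(d)
        x -= d
    return out

print(digits(Fraction(5, 9), 3, 4))   # [1, 2, 0, 0]
print(digits(Fraction(1, 7), 10, 8))  # [1, 4, 2, 8, 5, 7, 1, 4]
```

The first line recovers 5/9 = 0.12 in base 3, and the second shows the periodic decimal expansion of the rational number 1/7.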

1.4 Remarks on orders of Infinity

It was not until the 19th century that mathematicians realized, mainly through the pioneering work of Georg Cantor (1845-1918), that infinity comes in different orders.

Definition 1.4.1 A set is said to be countable if it can be put into one-one correspondence with N.

Intuitively this means that we can count off all the elements of a countable set. Examples and some basic properties of countable sets follow:

1. The sets Z and Q are countable.

2. A subset of a countable set is either finite or countable.

3. A countable union of finite or countable sets is finite or countable and also any (finite) Cartesian product of countable sets is countable.

However:

Theorem 1.4.1 (Cantor) The set of real numbers R is not countable.

Proof: We will show that the interval (0, 1) ⊂ R is not countable. The method of proof used below is the famous Cantor diagonalisation argument. Suppose we could write down the decimal expansions of all the real numbers in (0, 1) in a countable list: $0.a_1a_2a_3\ldots$, $0.b_1b_2b_3\ldots$, $0.c_1c_2c_3\ldots$, ... Now define a decimal number $x = 0.x_1x_2x_3\ldots$ by $x_1 = a_1 + 1$ if $a_1 \le 7$ and $x_1 = 0$ if $a_1 = 8$ or 9, $x_2 = b_2 + 1$ if $b_2 \le 7$ and $x_2 = 0$ if $b_2 = 8$ or 9, etc. Then the decimal expression of x differs from the nth element of the countable list in the nth decimal place (and does not end in recurring 9's). Hence it represents an element of the interval (0, 1) which is distinct from all members of the list and hence (0, 1) is uncountable.

QED
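The diagonal rule can be tried on a finite list of digit strings. This is only a toy illustration of the rule in the proof (a finite list proves nothing, of course); the function name `diagonal` and the sample rows are made up for the example.

```python
def diagonal(rows):
    """Given n digit strings of length >= n, build a string that differs
    from row k in place k, using the rule from the proof."""
    out = []
    for k, row in enumerate(rows):
        d = int(row[k])
        # Add 1 unless the digit is 8 or 9, in which case use 0.
        out.append(str(d + 1) if d <= 7 else "0")
    return "".join(out)

rows = ["14159", "71828", "41421", "73205", "99999"]
x = diagonal(rows)
print("0." + x)  # 0.22510
```

By construction the k-th digit of x differs from the k-th digit of the k-th row, so x cannot coincide with any row of the list.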

A real number is called algebraic if it is a root of a polynomial with rational (or integer) coefficients. All other real numbers are called transcendental.

Proposition 1.4.1 The set of algebraic numbers is countable. Hence there are uncountably many transcendental numbers.

Proof: Since a polynomial of degree n with rational coefficients has (n + 1) coefficients, such polynomials can be put into one-one correspondence with (a subset of) $\mathbb{Q}^{n+1}$. This is countable since Q is countable and so there are only countably many polynomials of all degrees. Such a polynomial can have at most n roots and so there are only countably many such roots. Now, the set of algebraic numbers is the countable union of all roots of polynomials of all degrees and hence is a countable set.

Remark: Although there are uncountably many transcendental numbers, Liouville was the first to give a simple way to write down a decimal expression of a real number that is not algebraic. For example a decimal like 0.110001000... (with a 1 in the n! place for each n and 0 elsewhere) is transcendental. This follows from the fact that algebraic numbers cannot be approximated too well by rational numbers whose denominators are not too large. There are of course two famous transcendental numbers: e and π. That e is transcendental was established in 1873 by Hermite, and in 1882 Lindemann finally proved that π is transcendental, ending the dreams of all circle-squarers.

Another famous uncountable set with rather unusual properties, which Cantor introduced and which is now named after him, is defined as follows:

Start with the unit interval $A_0 = [0, 1]$. Now remove the middle third (open) segment $(\frac{1}{3}, \frac{2}{3})$. So what remains is the union of 2 closed intervals $A_1 = [0, \frac{1}{3}] \cup [\frac{2}{3}, 1]$. Now remove the middle third segment from each of these two closed intervals so that we get $A_2 = [0, \frac{1}{9}] \cup [\frac{2}{9}, \frac{1}{3}] \cup [\frac{2}{3}, \frac{7}{9}] \cup [\frac{8}{9}, 1]$. Continuing in this manner ad infinitum, we obtain the Cantor set $C = \bigcap_n A_n$.

Some important facts about the Cantor set C :

(i) x ∈ C iff x has a ternary (base 3) expansion that does not contain the digit 1. (For example 1/3 = 0.1 = 0.0222... in base 3 belongs to C.)

(ii) C is uncountable. (This follows from (i), so please prove it!)

(iii) C is closed and bounded and hence compact (see Chapter 3 for definitions).
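The first few stages of the construction are easy to generate mechanically. Here is a small Python sketch with exact fractions; the function name `remove_middle_thirds` and the representation of each stage as a list of (left, right) endpoint pairs are choices made only for this illustration.

```python
from fractions import Fraction

def remove_middle_thirds(intervals):
    """One construction step: replace each closed interval [a, b]
    by its two outer thirds."""
    out = []
    for a, b in intervals:
        third = (b - a) / 3
        out.append((a, a + third))
        out.append((b - third, b))
    return out

A = [(Fraction(0), Fraction(1))]
for n in range(2):
    A = remove_middle_thirds(A)
print([(str(a), str(b)) for a, b in A])
# [('0', '1/9'), ('2/9', '1/3'), ('2/3', '7/9'), ('8/9', '1')]
```

After n steps there are 2^n intervals of length 3^(−n) each, so the total length (2/3)^n of the stages shrinks to 0 even though the Cantor set itself is uncountable.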

1.5 Exercises

1. Find a formula for $\sum_{k=1}^{n} (2k - 1)$ and prove your formula.

2. Show that: $n^2 < 2^n < n!$ for all natural numbers n ≥ 5.

3. Show that $\sqrt{2} + \sqrt{3}$ is irrational.

4. Show that $0 < a < b \Rightarrow a^{-1} > b^{-1} > 0$ and $a < b < 0 \Rightarrow 0 > a^{-1} > b^{-1}$ for all a, b ∈ F, where F is any ordered field.

5. Show that ||x| − |y|| ≤ |x− y| for all x, y ∈ R

6. Show that between any two distinct real numbers there is at least one (and hence infinitely many) irrational numbers.

7. Show that the number of subsets with exactly k elements of a set consisting of n elements (k ≤ n) is $\binom{n}{k}$. What are the odds of winning the lottery “6 out of 49”?

8. Show that
$$\sum_{k=0}^{n} x^k = \frac{1 - x^{n+1}}{1 - x}$$
for any x ∈ R with x ≠ 1 and for any n ∈ N.

9. Show that $k! \binom{n}{k} \le n^k$ for all n ≥ 1 and 0 ≤ k ≤ n.

10. Prove the following identities:

(i) $\sum_{k=0}^{n} \binom{n}{k} = 2^n$

(ii) $\sum_{k=0}^{l} \binom{n}{k} \binom{m}{l-k} = \binom{n+m}{l}$

11. Show that
$$\left(1 + \frac{1}{n}\right)^n \le \sum_{k=0}^{n} \frac{1}{k!} < 3$$
for all n ≥ 1.

12. Show that $(1 + x)^n \ge \frac{1}{4} n^2 x^2$ for all real numbers x ≥ 0 and for all natural numbers n ≥ 2.

13. Show from first principles that ∀n ∈ N and ∀ y > 0, ∃ a unique x > 0 such that $x^n = y$.

1.5.1 Short solutions to some of the exercise problems

1. $\sum_{k=1}^{n} (2k - 1) = n^2$.
Proof by induction. For n = 1: $\sum_{k=1}^{1} (2k - 1) = 1 = 1^2$.
Assume: $\sum_{k=1}^{n} (2k - 1) = n^2$. Then $\sum_{k=1}^{n+1} (2k - 1) = n^2 + (2n + 1) = (n + 1)^2$.

QED

2. By induction: For n = 5: $5^2 = 25 < 2^5 = 32$.
Assume: $k^2 < 2^k$. Then $(k + 1)^2 = (1 + \frac{1}{k})^2 k^2 < 2 k^2 < 2 \cdot 2^k = 2^{k+1}$, since $(\frac{k+1}{k})^2 = (1 + \frac{1}{k})^2 \le (\frac{6}{5})^2 < 2$ for all k ≥ 5.
Therefore: $n^2 < 2^n$ for all natural numbers n ≥ 5.

Similarly: $2^5 = 32 < 5! = 120$.
Assume: $2^k < k!$. Then $2^{k+1} = 2 \cdot 2^k < 2 \, k! < (k + 1) k! = (k + 1)!$ for all k ≥ 5.
Therefore: $2^n < n!$ for all natural numbers n ≥ 5.

3. Suppose that $r = \sqrt{2} + \sqrt{3} > 0$ is rational. Then $r^{-1} = \sqrt{3} - \sqrt{2} \in \mathbb{Q}$ and therefore $\frac{1}{2}(r - r^{-1}) = \sqrt{2} \in \mathbb{Q}$, which contradicts what was proved in the lecture (and by the Greeks!).

4. $\forall x \, (x > 0 \Rightarrow x^{-1} > 0)$, since $x^{-1} x = 1 = 1^2 > 0$. So $b > a > 0 \Rightarrow a^{-1} b^{-1} = b^{-1} a^{-1} > 0$. Therefore by the order axioms: $b > a > 0 \Rightarrow (a^{-1} b^{-1}) b = a^{-1} > (b^{-1} a^{-1}) a = b^{-1} > 0$.
The case a < b < 0 is handled in a similar manner by applying the above argument to −a > −b > 0.

5. x − y + y = x. Therefore, by the triangle inequality: |x − y| + |y| ≥ |x|. This implies: |x − y| ≥ |x| − |y|. By interchanging the roles of x and y we also have: |y − x| ≥ |y| − |x|. Combining these two inequalities, we have |x − y| ≥ ||x| − |y||.

6. Given a < b, we can choose $n > \frac{10}{b - a}$, so that we have at least two rational numbers of the form $\frac{m+1}{n}$, $\frac{m+2}{n}$ inside the open interval (a, b). Now $\frac{m + \sqrt{2}}{n}$ is irrational and lies between a and b.

9. $k! \binom{n}{k} = n(n - 1)(n - 2) \cdots (n - k + 1) \le n^k$.

10. (i) Just apply the binomial formula to $(1 + 1)^n$.

(ii) $(x + y)^n (x + y)^m = (x + y)^{n+m}$. Now apply the binomial formula to both sides:
$$\left( \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^k \right) \left( \sum_{j=0}^{m} \binom{m}{j} x^{m-j} y^j \right) = \sum_{l=0}^{n+m} \binom{n+m}{l} x^{n+m-l} y^l$$
and collect terms on the left hand side and look at the coefficient of $x^{n+m-l} y^l$.

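11. By the binomial formula:
$$\left(1 + \frac{1}{n}\right)^n = 1 + n \frac{1}{n} + \frac{n(n-1)}{2!} \frac{1}{n^2} + \cdots + \frac{n(n-1) \cdots 1}{n!} \frac{1}{n^n} \le 1 + 1 + \frac{1}{2!} + \cdots + \frac{1}{n!} < 1 + 1 + \frac{1}{2} + \cdots + \frac{1}{2^n} < 1 + 2 = 3$$
where we used $k! \ge 2^{k-1}$ and the formula for the sum of a geometric series:
$$1 + \frac{1}{2} + \frac{1}{4} + \cdots + \frac{1}{2^n} = \frac{1 - \frac{1}{2^{n+1}}}{1 - \frac{1}{2}} < 2$$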

12. $(1 + x)^n = 1 + n x + \frac{1}{2} n(n - 1) x^2 + \cdots \ge \frac{1}{2} n(n - 1) x^2 \ge \frac{1}{4} n^2 x^2$ for all real numbers x ≥ 0 and for all n ≥ 2.

Chapter 2

Sequences and Series

We now come to one of the most fundamental concepts in analysis: that of a limit of a sequence of real numbers. (It took mathematicians some time to settle on an appropriate definition!)

2.1 Sequences

Definition 2.1.1 A sequence of real numbers is a map a : N → R.

We normally write $a_i = a(i)$ and think of a sequence as an (ordered) set of real numbers $(a_i)_{i \in \mathbb{N}} = (a_1, a_2, \ldots, a_k, \ldots)$. Sometimes, for notational convenience, we begin a sequence with $a_0$ instead of $a_1$ or, for that matter, with any $a_{n_0}$.

Examples:

0. The simplest sequence is the constant sequence: (a, a, . . . , a, . . . ), where a(i) = a for all i.

1. $a_n = \frac{1}{n}$ defines the sequence $(1, \frac{1}{2}, \frac{1}{3}, \ldots)$.

2. $a_n = b^n$ defines the geometric sequence $(b, b^2, b^3, \ldots)$, where b is any real number.

3. $a_n = \frac{n}{2^n}$ defines $(\frac{1}{2}, \frac{1}{2}, \frac{3}{8}, \ldots)$.

4. $a_n = (1 + \frac{x}{n})^n$, where x is any real number, defines the well known formula for compound interest.

5. One may define a sequence inductively by a recursive formula such as: $a_{n+2} = a_{n+1} + a_n$ with initial terms given by $a_1 = a_2 = 1$. This gives the Fibonacci sequence (1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, . . . ), first introduced by Fibonacci of Pisa (1170 to 1250).

6. Define $a_{n+1} = \frac{1}{2}(a_n + \frac{2}{a_n})$ with initial term $a_1 = 1$. This gives $(1, \frac{3}{2}, \frac{17}{12}, \frac{577}{408}, \ldots)$. This sequence can be written as a “continued fraction” and gives a very good rational approximation for $\sqrt{2}$. It was known (at least the first few terms) to the ancient Sumerians.
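The recursively defined sequences in Examples 5 and 6 are easy to generate on a computer. The following Python sketch uses generators and exact fractions; the names `fibonacci` and `sqrt2_iteration` are simply labels chosen for this illustration.

```python
from fractions import Fraction
from itertools import islice

def fibonacci():
    """Example 5: a_{n+2} = a_{n+1} + a_n with a_1 = a_2 = 1."""
    a, b = 1, 1
    while True:
        yield a
        a, b = b, a + b

def sqrt2_iteration():
    """Example 6: a_{n+1} = (a_n + 2/a_n)/2 with a_1 = 1."""
    a = Fraction(1)
    while True:
        yield a
        a = (a + 2 / a) / 2

print(list(islice(fibonacci(), 10)))
# [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
print([str(t) for t in islice(sqrt2_iteration(), 4)])
# ['1', '3/2', '17/12', '577/408']
```

Because the terms are kept as exact fractions, the output reproduces the values listed in Example 6 rather than floating point approximations.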

2.2 Convergence

Informally, we say that a sequence of real numbers converges to a limiting number a if all the terms of the sequence become arbitrarily close to a for all sufficiently large n. The exact definition now follows:

Definition 2.2.1 We say that a sequence of real numbers $(a_n)_{n \in \mathbb{N}}$ converges to a limit a ∈ R (and write: $\lim_{n\to\infty} a_n = a$ or simply $\lim a_n = a$) if and only if the following is true:

∀ ε > 0, ∃N ∈ N such that $|a_n - a| < \varepsilon$ for all n ≥ N.

N, of course, would in general depend on ε, because the quantifier for N comes after that of ε.

An equivalent definition is as follows:

Definition 2.2.2 A sequence of real numbers $(a_n)_{n \in \mathbb{N}}$ converges to a limit a ∈ R if and only if every ε-neighbourhood of a contains all but a finite number of the terms of the sequence.

where we used the following:

Definition 2.2.3 An ε-neighbourhood of a real number a is the open interval $(a - \varepsilon, a + \varepsilon) = \{x \in \mathbb{R} \mid |a - x| < \varepsilon\}$.

A sequence which converges to a limit is said to be convergent. The limit of a convergent sequence is unique. A sequence that is not convergent is called divergent.

Examples:

0. The constant sequence (a, a, ..., a, ...) converges to a.

1. The sequence $a_n = \frac{1}{n}$ converges to 0.
This is because ∀ ε > 0, $|\frac{1}{n} - 0| = \frac{1}{n} < \varepsilon$ for all $n \ge N > \frac{1}{\varepsilon}$ and the existence of such an N ∈ N is guaranteed by the Archimedean property.

1 bis. More generally (as you know from first year calculus), the same argument proves that the sequence $a_n = \frac{P(n)}{Q(n)}$ converges to 0, where P(n) and Q(n) are polynomials, provided the degree of P is strictly less than the degree of Q. In case they have the same degree the sequence converges to the ratio of the coefficients of the terms of highest degree, and if deg(P) > deg(Q) then the sequence diverges.

2. The geometric sequence $(b, b^2, b^3, \ldots)$ converges to 0 if |b| < 1 and to 1 if b = 1. It is divergent otherwise. (This follows from Proposition 2.2.1 below.)

3. The sequence $a_n = \frac{n}{2^n}$ converges to 0. We know from Exercise 2 at the end of Chapter 1 that $n^2 < 2^n$ for n ≥ 5, so $\frac{n}{2^n} < \frac{1}{n}$ for these n. Now we can use the same argument that we used for the sequence $(\frac{1}{n})$ in Example 1 above.

4. $a_n = (1 + \frac{x}{n})^n$ converges to the exponential function $e^x$. (See the exercises at the end of this Chapter.)

5. The Fibonacci sequence is obviously divergent, but the sequence of ratios of consecutive terms ($r_n = \frac{a_n}{a_{n+1}}$) is convergent. (Can you guess the limit of these ratios?)

6. The sequence defined by $a_{n+1} = \frac{1}{2}(a_n + \frac{2}{a_n})$ and $a_1 = 1$ converges to $\sqrt{2}$. (This is one of the exercises at the end of this Chapter.)
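To see how N depends on ε in Example 1, one can compute it. The sketch below is only an illustration in Python with exact fractions: the function name `N_for` is made up here, and a computer can of course only spot-check finitely many n, so this is a sanity check and not a proof.

```python
from fractions import Fraction
from math import floor

def N_for(eps):
    """Smallest natural number N with 1/N < eps."""
    return floor(1 / eps) + 1

eps = Fraction(1, 100)
N = N_for(eps)
print(N)  # 101

# Example 1: |1/n - 0| < eps for a finite range of n >= N.
assert all(abs(Fraction(1, n) - 0) < eps for n in range(N, N + 1000))
# Example 3: n/2^n < 1/n once n >= 5, so the same N works if N >= 5.
assert all(Fraction(n, 2**n) < eps for n in range(max(N, 5), N + 1000))
```

Making ε smaller forces N to be larger, which is exactly the dependence of N on ε mentioned after Definition 2.2.1.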

Here is a very basic limit, which we will use repeatedly. We prove it from first principles (i.e., we don’t use logarithms and other stuff that we haven’t defined yet).

Proposition 2.2.1 Let 0 < b < 1. Then $\lim_{n\to\infty} b^n = 0$.

Proof: Since 0 < b < 1, $b^{-1} = 1 + h$ with h > 0. Therefore, by the Bernoulli inequality $(1 + h)^n \ge 1 + nh$ we have:
$$b^n = \frac{1}{(1 + h)^n} \le \frac{1}{1 + nh} < \frac{1}{nh} \to 0$$
as n → ∞, by example 1 above, since h is a positive constant.
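The estimate $b^n < \frac{1}{nh}$ gives an explicit (though far from optimal) N for any given ε: it suffices to take $N > \frac{1}{h\varepsilon}$. A small Python sketch, with the hypothetical helper name `N_for_power`, checks this on one example.

```python
from fractions import Fraction
from math import floor

def N_for_power(b, eps):
    """An N with b**n < eps for all n >= N, obtained from the bound
    b**n < 1/(n*h) with h = 1/b - 1 (not the smallest possible N)."""
    h = 1 / b - 1
    return floor(1 / (h * eps)) + 1

b, eps = Fraction(9, 10), Fraction(1, 100)
N = N_for_power(b, eps)
print(N)  # 901
assert all(b**n < eps for n in range(N, N + 50))
```

For b = 9/10 we get h = 1/9 and N = 901, whereas $b^n$ already drops below 1/100 much earlier. The proof only needs some N, not the best one.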

The following basic arithmetic rules about computing the limits of sums, differences, products and quotients of convergent sequences are rather obvious from the definitions and the trivial proofs will be left to the reader as an exercise (please do it!), but I will give ample hints below.

Proposition 2.2.2 If $a_n$ converges to a and $b_n$ converges to b, then:

(i) $a_n \pm b_n$ converges to a ± b respectively.

(ii) $a_n b_n$ converges to a b.

(iii) $\frac{a_n}{b_n}$ converges to $\frac{a}{b}$ provided b ≠ 0. (This implies that $b_n \ne 0$ for n sufficiently large.)

Proof (Sketch):
(i): Let ε > 0 be given. Then ∃ $n_1, n_2$ ∈ N such that $|a_n - a| < \frac{1}{3}\varepsilon$ for all $n \ge n_1$ and $|b_n - b| < \frac{1}{3}\varepsilon$ for all $n \ge n_2$, by the definition of convergence. Now choose any $n_0$ ∈ N that is greater than both $n_1$ and $n_2$ (for example $n_0 = 10(n_1 + n_2)$ would do), then $n \ge n_0 \Rightarrow |(a_n + b_n) - (a + b)| = |(a_n - a) + (b_n - b)| \le |a_n - a| + |b_n - b| < \frac{1}{3}\varepsilon + \frac{1}{3}\varepsilon < \varepsilon$ by the triangle inequality.

(ii) First we use the fact that any convergent sequence is bounded, so that for example: $|a_n| \le 1 + |a|$ for all sufficiently large n. The proof then follows the same pattern as above using the inequality:

$$|a_n b_n - ab| = |a_n (b_n - b) + (a_n - a) b| \le (1 + |a|) |b_n - b| + |a_n - a| (1 + |b|)$$

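(iii) We first prove that $\frac{1}{b_n}$ converges to $\frac{1}{b}$ and then apply (ii).
Since |b| > 0, ∃ $n_1$ ∈ N such that $|b_n - b| < \frac{|b|}{2}$ for all $n \ge n_1$. This implies (by the triangle inequality) that $|b_n| > \frac{|b|}{2}$ for all $n \ge n_1$.
Now let ε > 0 be given. Then ∃ $n_2$ ∈ N such that $|b_n - b| < \frac{|b|^2}{2} \varepsilon$ for all $n \ge n_2$. Now choose any $n_0$ that is greater than both $n_1$ and $n_2$. Then
$$n \ge n_0 \Rightarrow \left| \frac{1}{b_n} - \frac{1}{b} \right| = \frac{|b - b_n|}{|b_n| \, |b|} < \frac{2}{|b|} \cdot \frac{1}{|b|} \cdot \frac{|b|^2}{2} \varepsilon = \varepsilon$$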

QED

We will sometimes use the following notation:

Definition 2.2.4

$\lim_{n\to\infty} a_n = +\infty$ iff ∀R ∃N ∈ N s.t. n ≥ N ⇒ $a_n \ge R$.
We write $\lim_{n\to\infty} a_n = -\infty$ if $\lim_{n\to\infty} (-a_n) = +\infty$.

For example, $\lim_{n\to\infty} b^n = +\infty$ for all b > 1, but $\lim_{n\to\infty} b^n \ne -\infty$ for b < −1.

2.3 Monotonic sequences

Definition 2.3.1 A sequence $(a_n)$ is said to be bounded from above if the set of real numbers $A = \{a_1, a_2, \ldots, a_n, \ldots\}$ has an upper bound. It is said to be bounded from below if A has a lower bound and is said to be bounded if it has both upper and lower bounds.

It follows from the definition that any convergent sequence is bounded (from both above and below).

Quick proof: Let $(a_n)$ be a convergent sequence with $\lim a_n = a$. Since every ε-neighbourhood of a contains all but a finite number of points of the set $A = \{a_1, a_2, \ldots, a_n, \ldots\}$, A is contained in the union of a finite set and a bounded interval (a − ε, a + ε). A finite set of real numbers is obviously bounded from above by its largest member and bounded from below by its smallest member.

Note: That a sequence $(a_n)$ is not bounded from above does not necessarily imply that $\lim_{n\to\infty} a_n = +\infty$.

We now want to investigate what the completeness axiom tells us about the convergence of sequences. The example of the sequence (1, 0, 1, 0, . . . ) shows that bounded sequences do not necessarily have limits. We need the following.

Definition 2.3.2 A sequence $(a_n)$ is said to be monotonically increasing if $a_{n+1} \ge a_n$ for all n ∈ N. It is said to be monotonically decreasing if $a_{n+1} \le a_n$ for all n ∈ N. It is said to be monotonic if it is either monotonically increasing or monotonically decreasing.

Remark: We say the sequence is strictly monotonically increasing (respectively decreasing) if we have the strict inequality in the definition above.

The main result about monotonic sequences, which is an immediate consequence of the completeness axiom for real numbers, is the following:

Theorem 2.3.1 Every monotonically increasing sequence which is bounded from above is convergent. Similarly every monotonically decreasing sequence which is bounded from below is convergent.

Proof: Let $(a_n)$ be a monotonically increasing sequence which is bounded from above. Let a = lub(A), where $A = \{a_1, a_2, \ldots, a_n, \ldots\}$. We will prove that the sequence converges to a, its least upper bound. Given ε > 0, we’ll show that all except possibly a finite number of the terms of the sequence are in the interval (a − ε, a] ⊂ (a − ε, a + ε). Since a is an upper bound, we have $a_n \le a$ for all n, but since a is the least upper bound, a − ε is not an upper bound of the sequence and therefore there must be a term $a_{n_0}$ which is strictly larger than a − ε. Now since the sequence is monotonically increasing, $a_n \ge a_{n_0}$ for all $n \ge n_0$. Thus we have shown that ∀ ε > 0, ∃ $n_0$ ∈ N such that $a - \varepsilon < a_n \le a < a + \varepsilon$ for all $n \ge n_0$.

The statement about monotonically decreasing sequences follows from the above proof, since $(a_n)$ is a monotonically decreasing sequence bounded from below if and only if $(-a_n)$ is monotonically increasing and bounded from above.

QED
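The proof can be followed on a concrete example. The sketch below takes the increasing sequence $a_n = \frac{n}{n+1}$, whose least upper bound is 1 (this is an assumption of the example, though an easy one to verify), and searches for the index $n_0$ that the proof promises for a given ε.

```python
from fractions import Fraction

def a(n):
    """An increasing sequence bounded above, with least upper bound 1."""
    return Fraction(n, n + 1)

sup = Fraction(1)
eps = Fraction(1, 50)
# First index whose term exceeds sup - eps; it exists because
# sup - eps is not an upper bound.
n0 = next(n for n in range(1, 10**6) if a(n) > sup - eps)
print(n0)  # 50
# By monotonicity every later term stays in (sup - eps, sup].
assert all(sup - eps < a(n) <= sup for n in range(n0, n0 + 1000))
```

Only the existence of one term above a − ε has to be found; monotonicity then takes care of all the later terms.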

Of course, there are convergent sequences which are not monotonic. For example, $a_n = 1 + \frac{(-1)^n}{n}$ is not monotonic but converges to 1. Also there are bounded sequences which are not convergent. For example, the sequence (−1, 1, −1, 1, ...) defined by $a_n = (-1)^n$ is certainly not convergent even though it is bounded. We have seen some bounded sequences which do not converge. We can, however, say something about such sequences.

Definition 2.3.3 A subsequence is an infinite ordered subset of a sequence, i.e. a sequence of the form $(a_{n_k})_{k \in \mathbb{N}}$ with $n_1 < n_2 < n_3 < \ldots$.

Proposition 2.3.1 Any subsequence of a convergent sequence is convergent (to the same limit).

The proof of the above proposition is trivial. The next result however is not totally obvious!

Proposition 2.3.2 Every sequence (convergent or not) contains a monotonic subsequence.

Proof: Let $M = \{m \in \mathbb{N} \mid a_n < a_m \ \forall n > m\}$. There are two possibilities:

Case 1. M is infinite, say $M = \{m_1 < m_2 < \ldots\}$. Then $b_k = a_{m_k}$ is a monotonically (strictly) decreasing sequence.

Case 2. M is finite. Let $n_1$ be larger than the maximum number in M. Now since $n_1 \notin M$, ∃ $n_2 > n_1$ such that $a_{n_1} \le a_{n_2}$. Furthermore $n_2 \notin M$ since it is larger than $n_1$. Thus we can continue this procedure to obtain a monotonically increasing sequence $a_{n_1} \le a_{n_2} \le a_{n_3} \le \ldots$.

QED
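The two cases of this proof can be imitated on a finite list, although a finite list can only hint at what happens for an infinite sequence. In the Python sketch below, `peaks` computes the finite analogue of the set M, and `climb` follows the procedure of Case 2 starting from an index that is not in M; both names are invented for this illustration.

```python
def peaks(a):
    """Indices m with a[n] < a[m] for every later index n
    (the set M of the proof, for a finite list)."""
    out, best = [], None
    for m in range(len(a) - 1, -1, -1):
        if best is None or a[m] > best:
            out.append(m)
            best = a[m]
    return out[::-1]

def climb(a, start):
    """Case 2 procedure: repeatedly jump to the first later term
    that is >= the current one, until no such term exists."""
    chain = [start]
    while True:
        i = chain[-1]
        nxt = next((j for j in range(i + 1, len(a)) if a[j] >= a[i]), None)
        if nxt is None:
            return chain
        chain.append(nxt)

a = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
print([a[m] for m in peaks(a)])     # [9, 6, 5, 3]
print([a[i] for i in climb(a, 0)])  # [3, 4, 5, 9]
```

The terms at the indices in M are strictly decreasing, while climbing from an index outside M produces an increasing chain. In a finite list the climb must stop (at an element of M); in an infinite sequence with M finite it never does.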

Combining this last proposition with the basic theorem about monotonic sequences (Theorem 2.3.1), we obtain the famous result attributed to the Czech mathematician and philosopher Bernard Bolzano (1781 to 1848) and the German mathematician Karl Weierstrass (1815 to 1897):

Theorem 2.3.2 (Bolzano-Weierstrass Theorem) Every bounded sequence has a convergent subsequence.

Note: A bounded sequence may have many convergent subsequences (for example, a sequence consisting of a counting of the rationals in [0, 1] has subsequences converging to every real number in [0, 1]) or rather few (for example a convergent sequence has all its subsequences having the same limit). In fact we define:

Definition 2.3.4 A point a is called an accumulation point of a sequence $(a_n)$, if there is a subsequence that converges to a.

A point a is an accumulation point of $(a_n)$ iff every ε-neighbourhood of a contains infinitely many members of the sequence (not necessarily distinct points!). In other words, x is not an accumulation point of $(a_n)$ iff ∃ε > 0 such that $|a_n - x| \ge \varepsilon$ for all but finitely many n. Therefore the points that are not accumulation points of a sequence form an open set, in the sense that every such point has a neighbourhood consisting only of points that are not accumulation points. From this it follows that the complementary set, the set of accumulation points, is closed (for the definition of closed and open sets, see next chapter). If A denotes the set of all accumulation points of a sequence $(a_n)$, then glb(A) and lub(A) both belong to A (provided A is non-empty and bounded).

Definition 2.3.5 $\limsup_{n\to\infty} a_n = \mathrm{lub}(A)$ if the sequence is bounded from above. (Otherwise we set limsup = ∞.)
Similarly we define $\liminf_{n\to\infty} a_n = \mathrm{glb}(A)$ if the sequence is bounded from below. (Otherwise we set liminf = −∞.)

In simple language, limsup is the largest and liminf is the smallest accumulation point of a sequence. Moreover a sequence is convergent iff limsup is equal to liminf, so that there is exactly one (finite) accumulation point.

2.4 Cauchy sequences

One apparent problem with deciding whether a sequence is convergent or not, using the definition, is that one is supposed to know the limit first. An elegant way around this problem is to use a fundamental idea first introduced by the French mathematician Augustin Louis Cauchy (1789 to 1857).

Definition 2.4.1 We say that a sequence $(a_n)_{n \in \mathbb{N}}$ is a Cauchy sequence if and only if the following is true:

∀ ε > 0, ∃N ∈ N such that $|a_{n_1} - a_{n_2}| < \varepsilon$ for all $n_1, n_2 \ge N$.

Informally this means that any two terms (not just two consecutive terms!) of a Cauchy sequence can be made arbitrarily close to each other if we go far enough in the sequence. Note that this definition does not use the limit of the sequence. Here are some immediate properties of Cauchy sequences:

1. Any Cauchy sequence is bounded.
Proof: Let $(a_n)$ be a Cauchy sequence. Setting “ε = 1” in the definition we see that ∃N such that $|a_{n_1} - a_{n_2}| < 1$ for all $n_1, n_2 \ge N$. In particular $a_n \in (a_N - 1, a_N + 1)$ for all n ≥ N and hence the set $\{a_n : n \in \mathbb{N}\}$ is bounded.

2. Any convergent sequence is a Cauchy sequence.
Proof: Let $(a_n)$ be a convergent sequence with $\lim a_n = a$. Given ε > 0, ∃N so that every $a_n$ with n ≥ N lies in the $\frac{1}{3}\varepsilon$-neighbourhood of a. Any two points in that neighbourhood are obviously at a distance strictly less than ε apart.

The fundamental property about Cauchy sequences is the following

Theorem 2.4.1 A sequence of real numbers is convergent if and only if it is a Cauchy sequence.

Proof: By property 2 above, we only have to show that every Cauchy sequence is convergent, so let $(a_n)$ be a Cauchy sequence. Since a Cauchy sequence is bounded it has a convergent subsequence $b_k = a_{n_k}$ by the Bolzano-Weierstrass theorem. Let b be the limit of the subsequence. Then given ε > 0, ∃K such that all terms of the subsequence $b_k$ with k ≥ K lie in the $\frac{1}{3}\varepsilon$-neighbourhood of b. Now since $(a_n)$ is Cauchy ∃ $n_0$ such that all terms $a_n$ with $n \ge n_0$ are at a distance less than $\frac{1}{3}\varepsilon$ apart. Now choose $N = \max(n_0, n_K)$. Then all terms of the sequence $a_n$ with n ≥ N are at a distance less than ε from b: choosing k ≥ K with $n_k \ge N$, we get $|a_n - b| \le |a_n - a_{n_k}| + |a_{n_k} - b| < \frac{1}{3}\varepsilon + \frac{1}{3}\varepsilon < \varepsilon$. Hence $\lim a_n = b$.

Remarks: The fact that Cauchy sequences in R are the same as convergent sequences is called the Cauchy criterion for convergence. The use of the completeness axiom in the proof of the last result is crucial. For example, any sequence of rational numbers converging to an irrational number is a Cauchy sequence that is trying to converge but cannot converge in Q. In fact, Cantor (1845-1918) used the idea of a Cauchy sequence of rationals to give a constructive definition of the Real numbers. Spaces (not just R) where all Cauchy sequences converge are called complete. In fact, one can formulate the completeness axiom for R in terms of Cauchy sequences (provided we assume the Archimedean property). Here are some equivalent formulations of the axiom:

C1 Every non-empty subset which is bounded from above has a least upper bound.

C2 Every bounded sequence has a convergent subsequence.

C3 Every Cauchy sequence is convergent.

The last property is a useful way of generalizing the idea of completeness to more general spaces. For the purpose of the next section we need the field of complex numbers, so a quick reminder:

Definition 2.4.2 A complex number is simply a pair of real numbers written as z = x + i y where the magic “imaginary” number i has the “unreal” property: $i^2 = -1$.

We add and multiply two complex numbers as follows:

(x1 + i y1) + (x2 + i y2) = (x1 + x2) + i (y1 + y2)

(x1 + i y1).(x2 + i y2) = (x1x2 − y1y2) + i (x1y2 + y1x2)

For a complex number z = x + i y, x = Re(z) is called the real part and y = Im(z) is called the imaginary part. $\bar{z} = x - i y$ is the conjugate and $|z| = \sqrt{z \bar{z}}$ the absolute value (or modulus) of z. The complex numbers are denoted by C and they form a field. The multiplicative inverse of a non-zero complex number z is given by $z^{-1} = \frac{\bar{z}}{|z|^2}$. I assume that you are familiar with simple arithmetic and algebraic properties of these numbers.

Using the distance |z − w| between two complex numbers z and w, we can define a Cauchy sequence of complex numbers in exactly the same way as for real sequences. Using the Pythagorean identity $|x + i y|^2 = x^2 + y^2$ and the triangle inequality |w + z| ≤ |w| + |z| valid in C, one easily observes that a sequence of complex numbers $(z_k) = (x_k + i y_k)$ is convergent iff the two real sequences $(x_k)$ and $(y_k)$ are both convergent. So we arrive at one main property that we need about complex numbers:

Proposition 2.4.1 C is complete, i.e. every Cauchy sequence in C is convergent.

Remark: The one property for R that is different from C is that we do not have an ordering for C. There is no such thing as a positive complex number! On the other hand, there is an algebraic property that is crucial for C (but not true in R), namely that C is algebraically closed, i.e., every non-constant polynomial (with complex coefficients) has a zero in C. For example $z^2 + 1 = 0$ is solvable in C (but not in R) and the two solutions are ± i.
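The arithmetic rules for complex numbers recalled in this section can be checked directly by treating a complex number as a pair (x, y). The Python sketch below is only an illustration; the function names `mul` and `inverse` are chosen here, and the last line compares the result with Python's built-in `complex` type.

```python
from fractions import Fraction

def mul(z, w):
    """Product of complex numbers given as pairs (x, y) meaning x + i y."""
    (x1, y1), (x2, y2) = z, w
    return (x1 * x2 - y1 * y2, x1 * y2 + y1 * x2)

def inverse(z):
    """z^{-1} = conj(z) / |z|^2 for z != 0."""
    x, y = z
    r2 = x * x + y * y
    return (x / r2, -y / r2)

i = (0, 1)
print(mul(i, i))  # (-1, 0)

z = (Fraction(3), Fraction(4))
assert mul(z, inverse(z)) == (1, 0)
# Cross-check the multiplication rule against the built-in complex type.
assert complex(*mul((1, 2), (3, 4))) == complex(1, 2) * complex(3, 4)
```

The first output confirms $i^2 = -1$, and the first assertion confirms that the formula for the inverse really gives $z \cdot z^{-1} = 1$ for z = 3 + 4i.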

2.5 Series

Although computing with series of numbers (real or complex) is the bread and butter of everyday mathematics and is of great practical and theoretical importance, we will be brief in our treatment of series, since this is just a short one semester course.

Definition 2.5.1 A series is simply a sequence of the form sn =∑n

i=1 ak =a1 + · · ·+ an where the ai’s numbers (real or complex).The series is said to be convergent iff the sequence of partial sums sn isconvergent and we write

∑∞i=1 ai = s if limn→∞ sn = s . s is then called the

sum of the infinite series. By definition, a divergent series is a series thatis not convergent.

Examples:


1. The geometric series $a + ar + ar^2 + \cdots$ converges to the sum $\frac{a}{1-r}$ if $|r| < 1$. It is divergent otherwise (assuming $a \neq 0$). This follows from the identity (valid for $r \neq 1$):

$$\sum_{k=0}^{n} r^k = \frac{1 - r^{n+1}}{1 - r}$$

and the fact that $r^{n+1} \to 0$ as $n \to \infty$ for $|r| < 1$.

2. $\sum_{k=1}^{\infty} \frac{1}{k^p}$, where $p > 0$, is an important series in number theory and defines the zeta function $\zeta(p)$. It converges for $p > 1$ and diverges for $p \le 1$. For example $\zeta(2) = \frac{\pi^2}{6}$.
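The identity in the first example is easy to check numerically. The following is only a small sketch (the function name `geometric_partial_sum` and the sample values of $a$ and $r$ are my own choices for illustration): it sums the series term by term and compares the result with the closed form and with the limit $\frac{a}{1-r}$.

```python
def geometric_partial_sum(a, r, n):
    """Return a + a*r + ... + a*r**n by direct summation."""
    total, term = 0.0, a
    for _ in range(n + 1):
        total += term
        term *= r
    return total

a, r = 3.0, 0.5               # illustrative values with |r| < 1
limit = a / (1 - r)           # the sum of the infinite series
for n in (5, 10, 20, 40):
    s_n = geometric_partial_sum(a, r, n)
    closed = a * (1 - r ** (n + 1)) / (1 - r)   # identity from the text, times a
    print(n, s_n, closed, limit - s_n)
```

The last column is the remainder $a\,r^{n+1}/(1-r)$, which shrinks geometrically, exactly as the argument above predicts.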

Proposition 2.5.1 $\sum_{k=1}^{\infty} \frac{1}{k^p}$ converges for $p > 1$.

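Proof: We group the terms into blocks of lengths $1, 2, 4, \dots, 2^{n-1}$ and bound each term by the first (largest) term of its block:

$$
\begin{aligned}
s_{2^n - 1} &= 1 + \Big(\frac{1}{2^p} + \frac{1}{3^p}\Big) + \Big(\frac{1}{4^p} + \cdots + \frac{1}{7^p}\Big) + \cdots + \Big(\frac{1}{(2^{n-1})^p} + \cdots + \frac{1}{(2^n - 1)^p}\Big) \\
&\le 1 + 2\Big(\frac{1}{2^p}\Big) + 4\Big(\frac{1}{4^p}\Big) + \cdots + 2^{n-1}\Big(\frac{1}{(2^{n-1})^p}\Big) \\
&= 1 + \frac{1}{2^{p-1}} + \frac{1}{4^{p-1}} + \cdots + \frac{1}{(2^{n-1})^{p-1}} \\
&< \frac{1}{1 - 2^{1-p}}
\end{aligned}
$$

Therefore $(s_n)$ is bounded from above (for $p > 1$), and since it is a monotonically increasing sequence (being the sequence of partial sums of a positive series), it has to converge.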

QED
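To see the bound from the proof at work, here is a short numerical sketch (the helper name `p_series_partial_sum` is made up for this illustration). For $p = 2$ the bound $\frac{1}{1 - 2^{1-p}}$ equals $2$, and the partial sums should stay below it while approaching $\frac{\pi^2}{6}$.

```python
import math

def p_series_partial_sum(p, n):
    """Return s_n = sum_{k=1}^n 1/k**p."""
    return sum(1.0 / k ** p for k in range(1, n + 1))

p = 2.0
bound = 1.0 / (1.0 - 2.0 ** (1.0 - p))   # upper bound from the proof
for n in (10, 100, 10_000):
    print(n, p_series_partial_sum(p, n), bound)
print("pi^2/6 =", math.pi ** 2 / 6)
```

The bound is far from sharp; it only serves to show that the increasing sequence of partial sums cannot run off to infinity.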

Proposition 2.5.2 $\sum_{k=1}^{\infty} \frac{1}{k^p}$ diverges for $p \le 1$.


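Proof: We first look at the harmonic series $\sum \frac{1}{k}$, where $p = 1$.

$$
s_{2^n} = 1 + \frac{1}{2} + \Big(\frac{1}{3} + \frac{1}{4}\Big) + \cdots + \Big(\frac{1}{2^{n-1} + 1} + \cdots + \frac{1}{2^n}\Big) \ge \frac{3}{2} + 2 \cdot \frac{1}{4} + \cdots + 2^{n-1}\Big(\frac{1}{2^n}\Big) = 1 + \frac{n}{2}
$$

Therefore $(s_n)$ is unbounded and the harmonic series is divergent. For $0 < p < 1$, we have $\frac{1}{k^p} \ge \frac{1}{k}$ for all $k$, so $\sum_{k=1}^{n} \frac{1}{k^p} \ge \sum_{k=1}^{n} \frac{1}{k}$ for all $n$ and hence $\sum \frac{1}{k^p}$ diverges for $0 < p < 1$.

QED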
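The lower bound $s_{2^n} \ge 1 + \frac{n}{2}$ can be checked exactly with rational arithmetic. This is just an illustrative sketch (the function name is my own); it uses Python's `fractions.Fraction` so that no rounding error enters the comparison.

```python
from fractions import Fraction

def harmonic_partial_sum(m):
    """Return the exact partial sum 1 + 1/2 + ... + 1/m as a Fraction."""
    return sum(Fraction(1, k) for k in range(1, m + 1))

for n in range(0, 11):
    s = harmonic_partial_sum(2 ** n)
    # compare s_{2^n} with the lower bound 1 + n/2 from the proof
    print(n, float(s), 1 + n / 2, s >= 1 + Fraction(n, 2))
```

Note how slowly the sums grow: doubling the number of terms adds only about $\log 2$ to the total, which is still enough to make them unbounded.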

2.6 Convergence criteria for series

Proposition 2.6.1 (Cauchy criterion). $\sum a_i$ converges iff
$\forall \varepsilon > 0\ \exists n_0 \in \mathbb{N}$ such that $n > m > n_0 \Rightarrow |a_m + \cdots + a_n| < \varepsilon$.

This is just a restatement of the definition of a Cauchy sequence for the partial sums of a series.

Corollary 2.6.1 $\sum a_n$ converges $\Rightarrow \lim_{n\to\infty} a_n = 0$.

The converse is not true. A counterexample is the harmonic series.

Proposition 2.6.2 (Basic Comparison Test): If $\exists K$ such that $0 \le a_k \le b_k$ for all $k \ge K$, then $\sum b_k$ is convergent $\Rightarrow \sum a_k$ is convergent.

This follows from the fact that, from the index $K$ onwards, the partial sums of both series form monotonically increasing sequences and, up to the finitely many terms before $K$, the partial sums for $\sum b_k$ dominate those of $\sum a_k$.

For ease of language, we will say that a statement is true for all sufficiently large $k$ if $\exists K$ such that the statement is true for all $k \ge K$.

Definition 2.6.1 A series $\sum a_i$ is said to be absolutely convergent if $\sum |a_i|$ is convergent.

Since $|a_m + \cdots + a_n| \le |a_m| + \cdots + |a_n|$, it is easily seen, by the Cauchy criterion, that absolute convergence implies convergence, but the converse is not true.


Proposition 2.6.3 If, for all sufficiently large $k$, $|a_k| \le C r^k$ for some constants $C > 0$ and $0 \le r < 1$ ($C$ and $r$ are independent of $k$!), then the series $\sum a_k$ is absolutely convergent. (The $a_k$ can be complex numbers.)

Proof: Use the basic comparison test to compare the series $\sum |a_k|$ with the convergent geometric series $C \sum r^k$.

QED

Corollary 2.6.2 (Ratio Test): If $\lim_{k\to\infty} \left|\frac{a_{k+1}}{a_k}\right| = q < 1$, then $\sum a_k$ is absolutely convergent.

Proof: Choose $r = \frac{1}{2}(1 + q)$. Then for all sufficiently large $k$, say $k \ge K$, we have $\left|\frac{a_{k+1}}{a_k}\right| < r$ and hence $|a_k| \le C r^k$ for all $k \ge K$ with $C = |a_K|\, r^{-K}$.

QED

Corollary 2.6.3 (Root Test): If $\lim_{k\to\infty} |a_k|^{\frac{1}{k}} = q < 1$, then $\sum a_k$ is absolutely convergent.

Proof: Choose $r = \frac{1}{2}(1 + q)$. Then for sufficiently large $k$, say $k \ge K$, $|a_k|^{\frac{1}{k}} < r$ and so $|a_k| \le r^k$ for all $k \ge K$.

QED

Definition 2.6.2 An alternating series is a series of real numbers of the form $\pm \sum_{k=1}^{\infty} (-1)^k a_k$, where all the numbers $a_k \ge 0$.

Proposition 2.6.4 (Leibniz test for alternating series): An alternating series $\sum_{k=1}^{\infty} (-1)^{k-1} a_k$ satisfying

(i) $a_{k+1} \le a_k$ for all sufficiently large $k$

(ii) $\lim_{k\to\infty} a_k = 0$

is convergent.

For your pleasure, the simple proof, which can be found in any elementary calculus text book, is left as an exercise at the end of this chapter. For example $\sum \frac{(-1)^{k-1}}{k}$ is convergent (to $\log(2)$) although the harmonic series is divergent. Series which are convergent but not absolutely convergent are called conditionally convergent. These series are very sensitive to the order in which they are summed up. In fact the following rather surprising fact is true.


Proposition 2.6.5 Let $\sum a_k$ be a conditionally convergent series of real numbers and let $S$ be any real number. Then there is a rearrangement $\sum a'_n$ of the series such that $\sum a'_n = S$.

Proof: Let $\sum p_l$ be the series formed by the strictly positive terms of $\sum a_k$ and let $\sum q_m$ be the rest. Since $\sum a_k$ is convergent but not absolutely convergent, both series $\sum p_l$ and $\sum q_m$ are divergent (to $+\infty$ and $-\infty$ respectively). Let us assume (w.l.o.g.) that $S \ge 0$. Now we can always add enough terms of the positive series to get a smallest (first) partial sum that is strictly larger than $S$. After that we add up (or if you prefer subtract!) just enough terms from the other, negative series to make the total strictly less than $S$. Now we know that all the terms $\to 0$ (since the original series is convergent after all!), so by continuing this process, we arrive at a sequence of partial sums (with terms rearranged as above) converging to $S$.

QED
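The procedure in this proof is really an algorithm, and it can be tried out on the alternating harmonic series $\sum \frac{(-1)^{k-1}}{k}$. The sketch below is a slight variant of the proof, under my own naming (`rearranged_partial_sums` is hypothetical): at each step it takes the next unused positive term while the running total is $\le S$ and the next unused negative term otherwise.

```python
from itertools import count

def rearranged_partial_sums(target, n_terms):
    """Greedy rearrangement of sum (-1)**(k-1)/k aiming at `target`.

    Positive terms 1, 1/3, 1/5, ... are used while the running sum is
    <= target; otherwise the next negative term -1/2, -1/4, ... is used.
    Every term of the original series is used exactly once, in some order.
    """
    pos = (1.0 / k for k in count(1, 2))     # 1, 1/3, 1/5, ...
    neg = (-1.0 / k for k in count(2, 2))    # -1/2, -1/4, -1/6, ...
    total, sums = 0.0, []
    for _ in range(n_terms):
        total += next(pos) if total <= target else next(neg)
        sums.append(total)
    return sums

for target in (1.5, 0.0, -1.0):
    sums = rearranged_partial_sums(target, 20_000)
    print(target, sums[99], sums[-1])
```

In the usual order the same terms add up to $\log 2$; here the distance to the target is always at most the size of the term used at the last crossing, and these terms tend to $0$.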

We end this section with a product formula that we need in the next section for power series.

Theorem 2.6.1 (Cauchy product formula for series). Let $\sum_{n=0}^{\infty} a_n$ and $\sum_{n=0}^{\infty} b_n$ be two absolutely convergent series. Then their Cauchy product, defined to be the series $\sum c_n$ with $c_n = \sum_{k=0}^{n} a_k b_{n-k}$, is absolutely convergent and

$$\Big(\sum a_n\Big)\Big(\sum b_n\Big) = \sum c_n .$$

Proof: Since the two series are absolutely convergent, we can assume w.l.o.g. that all the terms are non-negative, so that all the partial sums are monotonically increasing. Since $s_n = \sum_{k=0}^{n} a_k \to S$ and $t_n = \sum_{k=0}^{n} b_k \to T$, the product sequence satisfies $s_n t_n \to ST$. Therefore, by the Cauchy criterion, $s_{2n} t_{2n} - s_n t_n \to 0$ as $n \to \infty$. Now $s_m t_m$ is the sum of all terms $a_i b_j$ with $i, j \le m$. Let $u_m = \sum_{k=0}^{m} c_k$. Then $u_m$ is the sum of all terms $a_i b_j$ with $i + j \le m$. So $s_n t_n \le u_{2n} \le s_{2n} t_{2n}$ and hence, since $(u_n)$ is monotonically increasing, $u_n$ converges to $ST$.

QED
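Here is a small numerical sketch of the product formula, using two geometric series as sample input (the function name `cauchy_product` and the choice of series are assumptions made for illustration). The truncated Cauchy product should approach the product of the two sums, here $2 \cdot \frac{3}{2} = 3$.

```python
def cauchy_product(a, b):
    """Return [c_0, ..., c_{N-1}] with c_n = sum_{k=0}^n a[k] * b[n-k]."""
    n_terms = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(n_terms)]

N = 60
a = [0.5 ** n for n in range(N)]          # sum of the full series is 2
b = [(1.0 / 3.0) ** n for n in range(N)]  # sum of the full series is 3/2
c = cauchy_product(a, b)
print(sum(a) * sum(b), sum(c))            # both should be close to 3
```

The sum of the $c_n$ collects the products $a_i b_j$ over the triangle $i + j < N$, while the product of the partial sums collects them over a square; this is precisely the comparison made in the proof.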

2.7 Power Series

Definition 2.7.1 A power series is simply a series of the form

$$\sum a_n (x - c)^n$$

where $c$ and the $a_n$'s are constants and $x$ is a variable (real or complex).


Given a power series, the principal task is to determine the values of the variable $x$ for which the series converges and to study the sum of the power series as a function of $x$. Note that a power series always converges at $x = c$, to the value $a_0$. The first main result states that a power series converges (absolutely) for all points inside a circle with center $c$ in the complex plane and that it diverges outside this circle. (The circle could degenerate to a point or to the whole plane.) We will mainly deal with the case $c = 0$, since the general case is just a simple shift.

Examples

1. The geometric series $a + ax + ax^2 + \cdots$ converges to the sum $\frac{a}{1-x}$ if $|x| < 1$. It is divergent otherwise (assuming $a \neq 0$).

2. The power series $\sum_{k=0}^{\infty} \frac{1}{k!} x^k$ is one of the most important power series, since it defines the exponential function $e^x$ for both real and complex values of $x$. Since $\frac{1}{(k+1)!} x^{k+1} \Big/ \frac{1}{k!} x^k = \frac{x}{k+1}$, this series converges for all values of $x$ (by the ratio test).

3. The power series $\sum k!\, x^k$ converges only at $x = 0$ (again by the ratio test).

4. The series $\sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} x^k$ is convergent inside the unit circle $|x| < 1$. It arises by integrating a geometric series and defines the logarithm (as we will see later in the course).

Proposition 2.7.1

(i) If the power series $\sum a_n x^n$ converges for $x = x_0$, then it converges absolutely for all $x$ such that $|x| < |x_0|$.

(ii) If the power series $\sum a_n x^n$ diverges for $x = x_0$, then it diverges for all $x$ such that $|x| > |x_0|$.

Proof: We compare with a geometric series. If $\sum a_n x_0^n$ is convergent, then $a_n x_0^n \to 0$ and so, in particular, all the terms are bounded: $|a_n x_0^n| \le C$ for some constant $C > 0$. Let $b = \frac{|x|}{|x_0|} < 1$. Then the geometric series $C \sum b^n$ is convergent and hence, by the basic comparison test, so is the series $\sum |a_n x^n|$, since $|a_n x^n| = |a_n x_0^n|\, b^n \le C b^n$. (ii) follows from (i): if the series converged at some $x$ with $|x| > |x_0|$, then by (i) it would also converge at $x_0$.

QED


The “largest” value $R$ such that the power series converges for all $|x| < R$ and diverges for all $|x| > R$ is called the radius of convergence. It can be computed by the formula $R^{-1} = \limsup_{n\to\infty} |a_n|^{\frac{1}{n}}$ (by the root test) or, when this limit exists, by the formula $R^{-1} = \lim_{n\to\infty} \frac{|a_{n+1}|}{|a_n|}$ (ratio test).
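Both formulas can be tried out on explicit coefficients. The sketch below (function names and sample series are my own, chosen only for illustration) evaluates the two expressions at a single large index $n$, which gives a rough estimate of $R$, not a proof.

```python
def ratio_estimate(coeff, n):
    """|a_n| / |a_{n+1}|, the reciprocal of the ratio |a_{n+1}| / |a_n|."""
    return abs(coeff(n)) / abs(coeff(n + 1))

def root_estimate(coeff, n):
    """|a_n| ** (-1/n), the reciprocal of the n-th root from the root test."""
    return abs(coeff(n)) ** (-1.0 / n)

examples = {
    "a_n = 1/n       (R = 1)": lambda n: 1.0 / n,
    "a_n = n^2 / 2^n (R = 2)": lambda n: n ** 2 / 2.0 ** n,
}
for name, coeff in examples.items():
    for n in (10, 100, 500):
        print(name, n, ratio_estimate(coeff, n), root_estimate(coeff, n))
```

In these examples the ratio estimate settles down faster than the root estimate, although the root formula is the one that always works.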

Remark: The ratio and the root tests are related by the following relation:

Proposition 2.7.2 For any sequence of positive real numbers $(a_n)$, we have

$$\liminf_{n\to\infty} \frac{a_{n+1}}{a_n} \le \liminf_{n\to\infty} \sqrt[n]{a_n}$$

$$\limsup_{n\to\infty} \frac{a_{n+1}}{a_n} \ge \limsup_{n\to\infty} \sqrt[n]{a_n}$$

I will leave the proof of this proposition as an exercise for you (but please do it!). We will continue with the more important properties of using power series to define functions (for example differentiating and integrating power series) when we deal with uniform convergence in Chapter 7, but I cannot resist introducing the most important “transcendental” function: the exponential function, using a power series, in the next section.

2.8 The Exponential Function

Definition 2.8.1 The exponential function is defined by:

$$\exp(z) = e^z = \sum_{n=0}^{\infty} \frac{1}{n!} z^n$$

for any $z \in \mathbb{C}$.

Definition 2.8.2 The number $e$ is defined by:

$$e = \exp(1) = \sum_{n=0}^{\infty} \frac{1}{n!}$$

Besides the obvious normalization $\exp(0) = 1$, the exponential function satisfies the following extremely important functional identity (in fact, it can be characterized by this property):


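Theorem 2.8.1

$$\exp(w + z) = \exp(w) \exp(z)$$

for all $z, w \in \mathbb{C}$.

Proof: Multiplying the two series $\sum_{n=0}^{\infty} \frac{1}{n!} w^n$ and $\sum_{n=0}^{\infty} \frac{1}{n!} z^n$ by the Cauchy product formula we obtain the series $\sum_{n=0}^{\infty} c_n$, with

$$c_n = \sum_{k=0}^{n} \frac{w^{n-k}}{(n-k)!}\, \frac{z^k}{k!} = \frac{1}{n!} \sum_{k=0}^{n} \binom{n}{k} w^{n-k} z^k = \frac{1}{n!} (w + z)^n$$

by the binomial formula of Chapter 1.

QED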
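The functional equation can be tested numerically straight from the series definition. What follows is only a sketch: `exp_series` is a hypothetical helper that sums the first 60 terms, and the two complex numbers are arbitrary sample values.

```python
import cmath

def exp_series(z, n_terms=60):
    """Partial sum of sum_{n>=0} z**n / n! with n_terms terms."""
    total, term = 0j, 1 + 0j
    for n in range(n_terms):
        total += term
        term *= z / (n + 1)      # turns z**n/n! into z**(n+1)/(n+1)!
    return total

w, z = 0.3 + 1.2j, -0.7 + 0.4j
lhs = exp_series(w + z)
rhs = exp_series(w) * exp_series(z)
print(abs(lhs - rhs))                          # tiny: rounding error only
print(abs(exp_series(w) - cmath.exp(w)))       # compare with the library exp
print(exp_series(1).real, cmath.e)             # the number e
```

For arguments of moderate size 60 terms are far more than needed, since the terms decay like $\frac{|z|^n}{n!}$.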

Corollary 2.8.1

(i) $\exp(-z) = (\exp(z))^{-1}$ for all $z \in \mathbb{C}$
(ii) $\exp(x) > 0$ for all $x \in \mathbb{R}$
(iii) $\exp(n) = e^n$ for all $n \in \mathbb{Z}$
(iv) $\exp(\bar{z}) = \overline{\exp(z)}$ for all $z \in \mathbb{C}$

Definition 2.8.3 The (real-valued) trigonometric functions are defined by: $\cos(y) = \mathrm{Re}(e^{iy})$ and $\sin(y) = \mathrm{Im}(e^{iy})$ for any $y \in \mathbb{R}$.

We therefore have Euler’s famous formula

$$e^{i\theta} = \cos(\theta) + i \sin(\theta)$$

and also the power series expressions:

(i) $\cos(x) = \frac{1}{2}\big(\exp(ix) + \exp(-ix)\big) = \sum_{k=0}^{\infty} (-1)^k \frac{1}{(2k)!}\, x^{2k}$

(ii) $\sin(x) = \frac{1}{2i}\big(\exp(ix) - \exp(-ix)\big) = \sum_{k=0}^{\infty} (-1)^k \frac{1}{(2k+1)!}\, x^{2k+1}$

valid for any $x \in \mathbb{R}$.

All the other trigonometric functions (tangent, cotangent, secant and cosecant) can be expressed in terms of cosine and sine.

Trigonometric identities such as $\cos(\alpha + \beta) = \cos(\alpha)\cos(\beta) - \sin(\alpha)\sin(\beta)$ and $\sin(\alpha + \beta) = \cos(\alpha)\sin(\beta) + \sin(\alpha)\cos(\beta)$ follow from the fundamental functional equation for the exponential function. The fundamental identity $(\cos\theta)^2 + (\sin\theta)^2 = 1$, which relates the trigonometric functions to the unit circle $x^2 + y^2 = 1$, follows from the fact that the conjugate of $e^{ix}$ is $e^{-ix}$ and so $|e^{ix}|^2 = e^{ix} e^{-ix} = 1$.
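As a sanity check, one can take the definition of cosine and sine via $e^{iy}$ literally and verify the identities numerically. The following sketch uses made-up helper names (`exp_series`, `cos_def`, `sin_def`) and arbitrary sample angles; the library functions `math.cos` and `math.sin` serve only as a comparison.

```python
import math

def exp_series(z, n_terms=60):
    """Partial sum of sum_{n>=0} z**n / n! with n_terms terms."""
    total, term = 0j, 1 + 0j
    for n in range(n_terms):
        total += term
        term *= z / (n + 1)
    return total

def cos_def(y):
    return exp_series(1j * y).real    # cos(y) = Re(e^{iy})

def sin_def(y):
    return exp_series(1j * y).imag    # sin(y) = Im(e^{iy})

alpha, beta = 0.9, 2.1
print(cos_def(alpha) - math.cos(alpha), sin_def(alpha) - math.sin(alpha))
print(cos_def(alpha) ** 2 + sin_def(alpha) ** 2)      # should be 1
print(cos_def(alpha + beta)
      - (cos_def(alpha) * cos_def(beta) - sin_def(alpha) * sin_def(beta)))
```

All printed differences should be at the level of floating-point rounding.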

Closely related and very useful functions are the hyperbolic functions defined by:

$$\cosh(x) = \tfrac{1}{2}(e^x + e^{-x})$$

$$\sinh(x) = \tfrac{1}{2}(e^x - e^{-x})$$

These functions play an important role in geometry and physics. The fundamental identity here is: $(\cosh t)^2 - (\sinh t)^2 = 1$. This relates the hyperbolic functions to the hyperbola $x^2 - y^2 = 1$ in the same way that the trigonometric functions came from the circle. There are also transcendental functions that are related to the ellipse, called elliptic functions!


2.9 Exercises

0. Complete the proof of Proposition 2.2.2 about the arithmetic properties of limits.

1. Assume that $\lim_{k\to\infty} a_k = l$. Show that:

(i) $\lim_{k\to\infty} |a_k| = |l|$
(ii) If $l > 0$ then $\exists N \in \mathbb{N}$ such that $n > N \Rightarrow a_n > \frac{9}{10}\, l$

2. Let $a_k = \sqrt{k + 10^3} - \sqrt{k}$; $b_k = \sqrt{k + \sqrt{k}} - \sqrt{k}$; and $c_k = \sqrt{k(1 + 10^{-3})} - \sqrt{k}$ for $k \in \mathbb{N}$. Show that $a_k > b_k > c_k$ for all $k < 10^6$, but $a_k \to 0$, $b_k \to 0.5$ and $c_k \to +\infty$ as $k \to \infty$.

3. Define a sequence recursively by the formulas: $a_1 = a$, $a_2 = b$ and $a_k = \frac{1}{2}(a_{k-1} + a_{k-2})$ for all $k > 2$. Show that $(a_k)$ converges and find the limit.

4. Compute:

$$\sqrt{1 + \sqrt{1 + \sqrt{1 + \cdots}}}$$

i.e., the limit of the sequence defined recursively by $a_{k+1} = \sqrt{a_k + 1}$ with $a_1 = 1$.

5. Compute:

$$1 + \cfrac{1}{1 + \cfrac{1}{1 + \cdots}}$$

i.e., the limit of the sequence defined recursively by $a_{k+1} = 1 + \frac{1}{a_k}$ with $a_1 = 1$.

6. Compute the limits of the following sequences if they converge. Do not use logarithms or l’Hospital’s rule and stuff like that! Verify your answers, i.e. prove that they are correct.

(i) $a_n = b^{\frac{1}{n}}$ with $b > 0$

(ii) $a_n = n^{\frac{1}{n}}$

(iii) $a_n = (b^n + c^n)^{\frac{1}{n}}$ with $b, c > 0$

(iv) $a_n = \frac{n!}{n^n}$


7. Cauchy condensation test. Suppose that $(a_n)$ is a monotonically decreasing sequence of positive real numbers such that the series $\sum a_n$ is convergent. Let $b_n = 2^n a_{2^n}$. Prove that the series $\sum b_n$ is also convergent.

8. Prove the following form of the Bolzano-Weierstrass theorem: Every bounded infinite set has at least one accumulation point. (An accumulation point of a set $X \subset \mathbb{R}$ is a point $a$ such that every $\varepsilon$-neighbourhood of $a$ contains at least one point of $X$ distinct from $a$.)

9. Prove Leibniz’ test for alternating series.

10. For each of the following power series determine all values of $x$ for which the series (i) converges absolutely; (ii) converges conditionally and (iii) diverges.

(i) $\sum \frac{x^n}{n^p}$ with $p > 0$

(ii) $\sum \frac{n^2}{3^n} (x + 2)^n$

(iii) $\sum \frac{(n!)^2}{(2n)!}\, x^n$

(iv) $\sum \frac{2^n\, n!}{n^n}\, x^n$

11. Prove proposition 2.7.2.

12. Suppose that $0 < r < 1$ and that a sequence $(a_n)$ satisfies the contraction property: $|a_{n+1} - a_n| \le r\, |a_n - a_{n-1}|$ for all $n \ge 10$. Prove that $(a_n)$ is a Cauchy sequence and hence converges.

13. Sum the series

$$1 + \frac{1}{3} + \frac{1}{5} + \frac{1}{9} + \frac{1}{15} + \frac{1}{25} + \frac{1}{27} + \cdots ,$$

where we sum over all reciprocals of integers whose only prime factors are 3 and 5. (This is a product of two geometric series.)

14. Show that $\cos(2\theta) = (\cos(\theta))^2 - (\sin(\theta))^2$ and $\sin(2\theta) = 2 \cos(\theta) \sin(\theta)$ from the definitions. Derive similar formulas for the hyperbolic functions.

15. Find a formula for the finite sum:

$$\frac{1}{2} + \cos\theta + \cos(2\theta) + \cdots + \cos(n\theta).$$


18. Show that e is irrational.

19. Show that $a_n = \big(1 + \frac{1}{n}\big)^n$ is a bounded monotonically increasing sequence. What is the limit?

20*. Show that

$$\lim_{n\to\infty} \Big(1 + \frac{x}{n}\Big)^n = e^x$$

for any $x \in \mathbb{R}$.


2.9.1 Short solutions to some of the exercises

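2. If $k < 10^6$, then $\sqrt{k} < 10^3$ and $10^{-3} k < \sqrt{k}$. So: $\sqrt{k + 10^{-3} k} < \sqrt{k + \sqrt{k}} < \sqrt{k + 10^3}$ and $a_k > b_k > c_k$.

$$a_k = \sqrt{k + 10^3} - \sqrt{k} = \frac{10^3}{\sqrt{k + 10^3} + \sqrt{k}} \to 0$$

$$b_k = \sqrt{k + \sqrt{k}} - \sqrt{k} = \frac{\sqrt{k}}{\sqrt{k + \sqrt{k}} + \sqrt{k}} = \frac{1}{\sqrt{1 + \frac{1}{\sqrt{k}}} + 1} \to \frac{1}{2}$$

$$c_k = \sqrt{k(1 + 10^{-3})} - \sqrt{k} = \frac{10^{-3}\, k}{\sqrt{k + 10^{-3} k} + \sqrt{k}} = \frac{10^{-3} \sqrt{k}}{\sqrt{1 + 10^{-3}} + 1} \to \infty$$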

3. Let us assume first that $a = 0$ and $b = 1$. The general case follows by a translation and dilation of this interval. The recursive definition has the geometric meaning that the sequence is obtained by taking the midpoint of the preceding two points (beginning with 0 and 1). So you go back and forth, halving your step size every time! The sequence is therefore:

$$\Big(0,\ 1,\ 1 - \frac{1}{2},\ 1 - \frac{1}{2} + \frac{1}{4},\ 1 - \frac{1}{2} + \frac{1}{4} - \frac{1}{8},\ \cdots\Big).$$

By the formula for the sum of a geometric series:

$$a_k = 1 - \frac{1}{2} + \frac{1}{4} - \frac{1}{8} + \cdots + \Big(-\frac{1}{2}\Big)^{k-2} = \frac{1 - (-\frac{1}{2})^{k-1}}{1 - (-\frac{1}{2})} = \frac{2}{3}\Big(1 - \Big(-\frac{1}{2}\Big)^{k-1}\Big)$$

Therefore $\lim a_k = \frac{2}{3}$, since $\lim (-\frac{1}{2})^{k-1} = 0$.

In the general case the limit is

$$a + \frac{2}{3}(b - a) = \frac{a + 2b}{3}$$
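The closed form can be confirmed by simply running the recursion. The following is a sketch with a hypothetical helper `averaged_sequence`; it works with exact fractions, so the comparison with $\frac{2}{3}\big(1 - (-\frac{1}{2})^{k-1}\big)$ involves no rounding.

```python
from fractions import Fraction

def averaged_sequence(a, b, n):
    """Return [a_1, ..., a_n] with a_1 = a, a_2 = b, a_k = (a_{k-1} + a_{k-2}) / 2."""
    seq = [Fraction(a), Fraction(b)]
    while len(seq) < n:
        seq.append((seq[-1] + seq[-2]) / 2)
    return seq[:n]

seq = averaged_sequence(0, 1, 12)
for k, value in enumerate(seq, start=1):
    closed = Fraction(2, 3) * (1 - Fraction(-1, 2) ** (k - 1))
    print(k, value, closed, value == closed)

# general starting values: the limit should be (a + 2b) / 3
print(float(averaged_sequence(2, 5, 60)[-1]), (2 + 2 * 5) / 3)
```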

4. We first prove that the given sequence converges by showing that it is increasing and bounded from above.

To prove $a_{n+1} > a_n$ for all $n$, we proceed by induction. For $n = 1$: $a_2 = \sqrt{2} > 1 = a_1$. Assume $a_{k+1} > a_k$. Then $a_{k+2} = \sqrt{1 + a_{k+1}} > \sqrt{1 + a_k} = a_{k+1}$. We now show that $a_n < 2$ for all $n$ by induction. For $n = 1$: $a_1 = 1 < 2$. Assume $a_k < 2$. Then $a_{k+1} = \sqrt{1 + a_k} < \sqrt{1 + 2} < 2$. Let $a = \lim a_n$. Then $0 \le a \le 2$ and $a$ satisfies $a = \sqrt{1 + a}$, i.e. $a^2 = a + 1$. Therefore $a$ is equal to the Golden Ratio:

$$\varphi = \frac{\sqrt{5} + 1}{2}$$

5. The limit here is also the Golden Ratio φ .
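Both recursions are easy to iterate on a computer. This is only a numerical illustration (the helper `iterate` and the number of steps are my own choices), and of course it does not replace the convergence proof.

```python
import math

def iterate(step, start, n_steps):
    """Apply `step` to `start` n_steps times and return the final value."""
    value = start
    for _ in range(n_steps):
        value = step(value)
    return value

phi = (math.sqrt(5) + 1) / 2
nested_root = iterate(lambda a: math.sqrt(1 + a), 1.0, 60)      # exercise 4
continued_fraction = iterate(lambda a: 1 + 1 / a, 1.0, 60)      # exercise 5
print(phi, nested_root, continued_fraction)
```

The first sequence increases towards $\varphi$, while the second one oscillates around it.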

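6.

(i) $\lim_{n\to\infty} b^{\frac{1}{n}} = 1$.

Proof: Assume first that $b > 1$. Then $a_n > 1$ and we put $a_n = 1 + h_n$ with $h_n > 0$. Then, by the Bernoulli inequality: $b = a_n^n = (1 + h_n)^n \ge 1 + n h_n$, so $0 < h_n \le \frac{b-1}{n}$ and hence $h_n \to 0$. If $b < 1$, then $\frac{1}{b} > 1$ and so $\lim_{n\to\infty} \big(\frac{1}{b}\big)^{\frac{1}{n}} = 1$, which gives $\lim_{n\to\infty} b^{\frac{1}{n}} = 1$ as well. If $b = 1$, there is nothing to prove!

(ii) We set $a_n = 1 + \varepsilon_n$. Now $n = (1 + \varepsilon_n)^n > \frac{1}{2} n(n-1)\, \varepsilon_n^2$ for all $n > 1$, by the binomial formula. So $\varepsilon_n < \sqrt{\frac{2}{n-1}}$ and hence $\varepsilon_n \to 0$. Therefore $\lim_{n\to\infty} n^{\frac{1}{n}} = 1$.

(iii) $\lim_{n\to\infty} (b^n + c^n)^{\frac{1}{n}} = \max(b, c)$.

Proof: Assume w.l.o.g. that $b \ge c > 0$. Then $(b^n + c^n)^{\frac{1}{n}} = b\,(1 + q^n)^{\frac{1}{n}}$, where $0 < q = \frac{c}{b} \le 1$. Now $1 \le (1 + q^n)^{\frac{1}{n}} \le (1 + q)^{\frac{1}{n}}$ for $n \ge 1$, and so $(1 + q^n)^{\frac{1}{n}} \to 1$ by (i).

(iv) $\lim_{n\to\infty} \frac{n!}{n^n} = 0$ since

$$\frac{n!}{n^n} = \frac{n}{n} \cdot \frac{n-1}{n} \cdots \frac{2}{n} \cdot \frac{1}{n} \le \frac{1}{n}$$

12. The contraction property is only assumed for indices $\ge 10$, so by induction, for all $n \ge 10$ and $l \ge 0$: $|a_{n+l+1} - a_{n+l}| \le r\, |a_{n+l} - a_{n+l-1}| \le \cdots \le r^{\,n+l-9}\, |a_{10} - a_9|$. By the triangle inequality:

$$|a_{n+k} - a_n| \le \sum_{l=0}^{k-1} |a_{n+l+1} - a_{n+l}| \le |a_{10} - a_9| \sum_{l=0}^{k-1} r^{\,n+l-9} \le \frac{r^{\,n-9}}{1 - r}\, |a_{10} - a_9|$$

for all $n \ge 10$ and for all $k$! Since $r^n \to 0$, this implies that $(a_n)$ is a Cauchy sequence.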


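13. By the Cauchy product formula for two absolutely convergent series we have:

$$\Big(\sum_{k=0}^{\infty} \frac{1}{3^k}\Big)\Big(\sum_{l=0}^{\infty} \frac{1}{5^l}\Big) = \sum_{k,l=0}^{\infty} \frac{1}{3^k\, 5^l}$$

which is the given series, and therefore its sum is $\frac{1}{1 - \frac{1}{3}} \cdot \frac{1}{1 - \frac{1}{5}} = \frac{15}{8}$.

Remark: The same argument shows that the number of primes is infinite: if there were only finitely many primes, the harmonic series would be a finite product of convergent geometric series, since each integer is a (unique) product of primes. But the harmonic series diverges!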
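The value $\frac{15}{8}$ is easily confirmed by summing over a finite square of exponents. In this sketch the function name `truncated_sum` and the cut-off are assumptions made for the illustration; exact fractions are used until the final conversion to a float.

```python
from fractions import Fraction

def truncated_sum(max_exp):
    """Exact sum of 1 / (3**k * 5**l) over 0 <= k, l <= max_exp."""
    return sum(Fraction(1, 3 ** k * 5 ** l)
               for k in range(max_exp + 1)
               for l in range(max_exp + 1))

for max_exp in (1, 5, 30):
    print(max_exp, float(truncated_sum(max_exp)), 15 / 8)
```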

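15. We first sum from $k = 0$:

$$
\begin{aligned}
\sum_{k=0}^{n} \cos(k\theta) &= \frac{1}{2} \sum_{k=0}^{n} \big(e^{ik\theta} + e^{-ik\theta}\big) \\
&= \frac{1}{2} \sum_{k=0}^{n} e^{ik\theta} + \frac{1}{2}\, e^{-in\theta} \sum_{k=0}^{n} e^{i(n-k)\theta} \\
&= \frac{1}{2}\big(1 + e^{-in\theta}\big)\, \frac{1 - e^{i(n+1)\theta}}{1 - e^{i\theta}} \\
&= \frac{1}{2}\, e^{-i\frac{n\theta}{2}} \big(e^{i\frac{n\theta}{2}} + e^{-i\frac{n\theta}{2}}\big)\, \frac{e^{i\frac{(n+1)\theta}{2}} \big(e^{-i\frac{(n+1)\theta}{2}} - e^{i\frac{(n+1)\theta}{2}}\big)}{e^{i\frac{\theta}{2}} \big(e^{-i\frac{\theta}{2}} - e^{i\frac{\theta}{2}}\big)} \\
&= \frac{\cos\big(\frac{n\theta}{2}\big) \sin\big(\frac{(n+1)\theta}{2}\big)}{\sin\big(\frac{\theta}{2}\big)}
\end{aligned}
$$

The term $k = 0$ contributes $1$ here, whereas the required sum starts with $\frac{1}{2}$. Subtracting $\frac{1}{2}$ therefore gives

$$\frac{1}{2} + \sum_{k=1}^{n} \cos(k\theta) = \frac{\cos\big(\frac{n\theta}{2}\big) \sin\big(\frac{(n+1)\theta}{2}\big)}{\sin\big(\frac{\theta}{2}\big)} - \frac{1}{2} = \frac{\sin\big((n + \frac{1}{2})\theta\big)}{2 \sin\big(\frac{\theta}{2}\big)}$$

for all $\theta$ that are not integer multiples of $2\pi$.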
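Since it is easy to lose a term $\frac{1}{2}$ in such a computation, a numerical check is worthwhile. The sketch below (function names are my own) compares the sum $\frac{1}{2} + \cos\theta + \cdots + \cos(n\theta)$ with the closed form $\frac{\sin((n + \frac{1}{2})\theta)}{2\sin(\frac{\theta}{2})}$, and the sum starting with the full term $k = 0$ with $\frac{\cos(\frac{n\theta}{2})\sin(\frac{(n+1)\theta}{2})}{\sin(\frac{\theta}{2})}$.

```python
import math

def cosine_sum(n, theta):
    """Return 1/2 + cos(theta) + cos(2*theta) + ... + cos(n*theta)."""
    return 0.5 + sum(math.cos(k * theta) for k in range(1, n + 1))

def closed_form(n, theta):
    """sin((n + 1/2)*theta) / (2*sin(theta/2)); theta not a multiple of 2*pi."""
    return math.sin((n + 0.5) * theta) / (2 * math.sin(theta / 2))

def full_sum_closed_form(n, theta):
    """cos(n*theta/2) * sin((n+1)*theta/2) / sin(theta/2), the sum from k = 0."""
    return math.cos(n * theta / 2) * math.sin((n + 1) * theta / 2) / math.sin(theta / 2)

for n, theta in [(1, 0.3), (5, 1.0), (12, 2.5)]:
    print(n, theta,
          cosine_sum(n, theta) - closed_form(n, theta),
          cosine_sum(n, theta) + 0.5 - full_sum_closed_form(n, theta))
```

Both differences should vanish up to rounding errors.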


Chapter 3

Topology of the real line R

3.1 Open sets and closed sets

Definition 3.1.1 A subset $U \subset \mathbb{R}$ is said to be open if $\forall a \in U$, $\exists\, \delta > 0$ such that $(a - \delta, a + \delta) \subset U$. A subset $A \subset \mathbb{R}$ is said to be closed if its complement $A^c$ is open.

We decree the empty set to be open (this actually follows logically from the definition). Open intervals on the real line are open and closed intervals are closed. It is trivial to see that any union of open sets is open and hence any intersection of closed sets is also closed. However, in general, only finite intersections of open sets are open and only finite unions of closed sets are closed. For example, the union of all the closed intervals $[\frac{1}{n}, 1 - \frac{1}{n}]$ is the open interval $(0, 1)$ and the intersection of all open intervals $(-\frac{1}{n}, 1 + \frac{1}{n})$ is the closed interval $[0, 1]$.

Definition 3.1.2 The interior of $A \subset \mathbb{R}$ is the union of all open sets contained in $A$. It is always open and is denoted by $\mathrm{int}(A)$ (sometimes also by $A^{\circ}$). The closure of $A \subset \mathbb{R}$ is the intersection of all closed sets that contain $A$. It is obviously closed and is denoted by $\bar{A}$ (sometimes also by $\mathrm{clo}(A)$).

If $U$ is open then $\mathrm{int}(U) = U$ and if $A$ is closed then $\bar{A} = A$. The closure of a set $A$ consists of $A$ and all its limit (or accumulation) points, where a limit point of $A$ is a point $p$ such that every neighbourhood of $p$ contains a point distinct from $p$ which is in $A$. The limit of any sequence $(a_n)$ contained in $A$, if it exists, belongs to the closure of $A$.


3.2 Connected sets

Definition 3.2.1 A subset $A \subset \mathbb{R}$ is called disconnected if there are two disjoint open sets $U$ and $V$ with $U \cap A \neq \emptyset$, $V \cap A \neq \emptyset$ such that $A \subset U \cup V$. A connected set is a set which is not disconnected.

The fundamental fact about connected sets on $\mathbb{R}$ is

Theorem 3.2.1 If $A$ is a connected set on the real line and if $a, b \in A$ with $a < b$, then the whole interval $[a, b] \subset A$. Any interval $[a, b] \subset \mathbb{R}$ is connected. In fact, any connected set of $\mathbb{R}$ is an interval (not necessarily bounded or closed).

Proof: If $\exists x \in (a, b)$ such that $x \notin A$, then $(-\infty, x)$ and $(x, +\infty)$ are two disjoint open sets, each meeting $A$, that would disconnect $A$, i.e., $A \subset (-\infty, x) \cup (x, +\infty)$. Now let $[a, b] \subset U \cup V$ with $U, V$ two disjoint open subsets of $\mathbb{R}$, both meeting $[a, b]$. Then $\exists x \in U \cap [a, b]$, $y \in V \cap [a, b]$, where we can assume that $x < y$. Let $z = \mathrm{lub}(U \cap [x, y])$. If $z \in U$, then $z < y$ (since $y \in V$) and so there is an open neighbourhood of $z$ contained in $U$, which therefore contains points of $U \cap [x, y]$ larger than $z$. This contradicts the fact that $z$ is the least upper bound. On the other hand, if $z \in V$ then $x < z$ and so there is an open neighbourhood of $z$ contained in $V$, so that some number smaller than $z$ is already an upper bound of $U \cap [x, y]$, again contradicting the fact that $z$ is the least upper bound.

QED

3.3 Compact sets

Definition 3.3.1 A subset of R is called compact if it is closed and bounded.

Let $K$ be a (non-empty) compact set in $\mathbb{R}$. Then since $K$ is bounded, by the completeness axiom, $K$ has a unique least upper bound $M = \sup(K)$ and a unique greatest lower bound $m = \inf(K)$. Suppose $M \notin K$. Now $K^c$ is open, so there exists an open neighbourhood of $M$: $(M - \varepsilon, M + \varepsilon) \subset K^c$, implying that $M - \frac{1}{2}\varepsilon$ is an upper bound for $K$, which contradicts the fact that $M$ is the least upper bound. Therefore $M \in K$. Similarly the greatest lower bound $m$ also belongs to $K$. We have thus proved:

Proposition 3.3.1 For any compact set K, sup(K), inf(K) ∈ K


Let $(a_n)$ be a sequence contained in a compact set $K$. Since $K$ is bounded, by the Bolzano-Weierstrass theorem, $(a_n)$ has a convergent subsequence whose limit point belongs to $\bar{K} = K$, since $K$ is closed. We have thus proved:

Proposition 3.3.2 Every sequence of points in a compact set $K$ has a subsequence that converges to a limit point in $K$.

This property is sometimes used as the definition of compact sets in $\mathbb{R}$. There is however a more general and useful (albeit rather abstract) characterization of compact sets which we would like to introduce:

Definition 3.3.2 An open cover of a subset $A \subset \mathbb{R}$ is a union of open sets $\bigcup \{U_\omega \mid \omega \in \Omega\}$ that contains $A$. A finite subcover of the cover $\bigcup \{U_\omega \mid \omega \in \Omega\}$ is a finite subset $\{\omega_1, \omega_2, \dots, \omega_n\} \subset \Omega$ such that $U_{\omega_1} \cup U_{\omega_2} \cup \dots \cup U_{\omega_n}$ still contains $A$.

Let us, for the moment, call sets with the property that every open cover has a finite subcover “kompakt” (German for compact!). First of all such a set is necessarily bounded, because $U_n = (-n, n)$, $n \in \mathbb{N}$, is an open cover for any subset of $\mathbb{R}$ and unless the set is bounded we will never be able to find a finite subcover. Secondly, a “kompakt” set $A$ has to be closed, because if $p$ is a limit point of the set which does not belong to the set, then we can form the open cover $\{A_n^c \mid n \in \mathbb{N}\}$, where $A_n$ is the closed interval $[p - \frac{1}{n}, p + \frac{1}{n}]$. This certainly covers $A$ since it covers everything in $\mathbb{R}$ except the point $p$. But a finite subfamily only covers the complement of some $A_N$, and $A_N$ contains points of $A$ because $p$ is a limit point of $A$; so there is no finite subcover. In fact we now prove the key result:

Theorem 3.3.1 A subset $K \subset \mathbb{R}$ is compact if and only if every open cover of $K$ has a finite subcover.

Proof: We have just seen that a set with the finite subcover property is closed and bounded, so it remains to prove the converse. Let $K$ be a compact set and let $\bigcup \{U_\omega \mid \omega \in \Omega\}$ be an open cover of $K$. As a preliminary step, we find a countable subcover $U_{\omega_1} \cup U_{\omega_2} \cup \dots \cup U_{\omega_n} \cup \dots$ that still covers $K$. This can be achieved by using the fact that the countable set $\mathbb{Q}$ is dense in $\mathbb{R}$, so that the intervals of the form $(r - \frac{1}{n}, r + \frac{1}{n})$, where $r \in \mathbb{Q}$ and $n \in \mathbb{N}$, form a countable family. Each point $x \in K$ lies in some $U_\omega$ and hence in such an interval $(r - \frac{1}{n}, r + \frac{1}{n})$ contained in $U_\omega$; for each interval arising in this way we choose a definite $U_\omega$ which contains it. So by a change of notation we can assume that $\bigcup \{U_k : k \in \mathbb{N}\}$ covers $K$. Assume, on the contrary, that there is no finite $n$ such that $K \subset U_1 \cup \dots \cup U_n$. Then $\forall n$, $\exists x_n \in K$ such that $x_n \notin U_1 \cup \dots \cup U_n$. The sequence $(x_n)$ has a subsequence that converges to a limit point $x \in K$. Since $x$ is in the open set $U_N$ for some $N$, the terms of this subsequence eventually lie in $U_N$; but $x_n \notin U_N$ for all $n \ge N$, and we obtain a contradiction.

QED

3.4 Elementary topology of Rn

As everyone knows, $\mathbb{R}^n$ denotes the Euclidean vector space of all $n$-tuples of real numbers: $\{x = (x_1, x_2, \dots, x_n) \mid x_i \in \mathbb{R}\}$, where $n \in \mathbb{N}$.

We denote the distance in $\mathbb{R}^n$ by $d(x, y) = \|x - y\|$, where the norm (= the length) $\|v\|$ of a vector $v$ is defined as usual by the scalar product $\|v\|^2 = \langle v, v \rangle$. (For simplicity of notation we will denote vectors by simple letters. $n = 1$ is a special case!)

The open ball (or disk) of radius $r > 0$ with centre $a$ in $\mathbb{R}^n$ is the set $B_r(a) = \{x \in \mathbb{R}^n \mid d(x, a) < r\}$. Sometimes we also use the closed ball defined by $\bar{B}_r(a) = \{x \in \mathbb{R}^n \mid d(x, a) \le r\}$.

Definition 3.4.1 A subset $A \subset \mathbb{R}^n$ is said to be open if $\forall a \in A$, $\exists\, \delta > 0$ such that $B_\delta(a) \subset A$. A subset $A \subset \mathbb{R}^n$ is said to be closed if its complement $A^c$ is open.

Any union of open sets is open and hence any intersection of closed sets is also closed, but in general, only finite intersections of open sets are open and only finite unions of closed sets are closed. This basic property of open sets leads to the following definition of an abstract topological space where every other “superficial” geometric property is stripped away until we are left with the bare bones of the topology of open subsets.

Definition 3.4.2 A topological space is a set $X$ with a distinguished collection $\mathcal{U}_\tau$ of subsets called open sets, containing the empty set and the set $X$ itself, with the property that any union and any finite intersection of open sets is open.

Definition 3.4.3 A subset $A$ in a topological space is said to be connected if it is not contained in the union of two disjoint open subsets, each of which meets $A$.

Definition 3.4.4 An open cover of a subset $A$ in a topological space is a union of open sets $\bigcup \{U_\omega \mid \omega \in \Omega\}$ that contains $A$. A finite subcover of the cover $\bigcup \{U_\omega \mid \omega \in \Omega\}$ is a finite subset $\{\omega_1, \omega_2, \dots, \omega_n\} \subset \Omega$ such that $U_{\omega_1} \cup U_{\omega_2} \cup \dots \cup U_{\omega_n}$ still contains $A$.


Definition 3.4.5 A subset $A$ in a topological space is said to be compact if every open cover of $A$ has a finite subcover.

The key theorem about compact sets in $\mathbb{R}^n$ is the following generalization of the Bolzano-Weierstrass property:

Theorem 3.4.1 A subset of $\mathbb{R}^n$ is compact if and only if it is closed and bounded.

3.5 Exercises

1. Determine for each of the following sets whether they are (i) open, (ii) closed, (iii) connected, (iv) bounded and (v) compact:

(i) $(-\infty, 1] \cup (0, 5] \subset \mathbb{R}$

(ii) $(-\infty, 1] \cap [0, 5) \subset \mathbb{R}$

(iii) $\{x : |x| \ge 2\} \subset \mathbb{R}$

(iv) $\{z : 0 < |z| \le 1\} \subset \mathbb{C}$

(v) $\{z : |z| > 2\} \subset \mathbb{C}$

2. Give an example of an open cover of the set $(-1, +1)$ which does not admit any finite subcover.

3. Show that a finite union and an arbitrary intersection of compact sets is again compact.

4. Let $(K_n)_{n \in \mathbb{N}}$ be a sequence of non-empty compact subsets of the real line satisfying the property: $K_{k+1} \subset K_k$. Such a sequence is called a nested sequence. Prove that the intersection $\bigcap \{K_n \mid n \in \mathbb{N}\}$ is non-empty. (For the sake of simplicity you can assume that each $K_n$ is a closed interval $[a_n, b_n]$.)

6*. Show that any (non-empty) open subset of $\mathbb{R}$ is a countable union of disjoint open intervals.


3.5.1 Solutions to some exercise problems

1.

(i) $(-\infty, 1] \cup (0, 5] = (-\infty, 5]$ is not open, closed, connected, not bounded and not compact.

(ii) $(-\infty, 1] \cap [0, 5) = [0, 1]$ is not open, closed, connected, bounded and compact.

(iii) $\{x : |x| \ge 2\} = (-\infty, -2] \cup [2, +\infty)$ is not open, closed, not connected, not bounded and not compact.

(iv) $\{z : 0 < |z| \le 1\} \subset \mathbb{C}$ is not open, not closed, connected, bounded and not compact.

(v) $\{z : |z| > 2\} \subset \mathbb{C}$ is open, not closed, connected, not bounded and not compact.

2. The intervals $\{(-1 + \frac{1}{n}, 1 - \frac{1}{n}) \mid n \in \mathbb{N}\}$ form an open cover of $(-1, +1)$ without a finite subcover. The union of any finite subcollection of this nested sequence of open intervals is of the form $(-1 + \frac{1}{N}, 1 - \frac{1}{N})$, which does not cover $(-1, +1)$.

4. Let $K_n = [a_n, b_n]$ be a nested sequence of compact intervals of the real line such that $K_{k+1} \subset K_k$ for all $k$. The sequence $(a_k)$ is an increasing sequence which is bounded from above by $b_1$ and hence it converges to a limit $a$. Similarly $(b_k)$ is a decreasing sequence which is bounded from below by $a_1$ and hence it converges to a limit $b$. Now $a \le b$ since $a_k \le b_k$ for all $k$, and hence the intersection is the non-empty closed interval $[a, b]$.


Chapter 4

Continuity

It took mathematicians some time to settle on an appropriate definition of this key concept in analysis. Intuitively continuity means that there are no sudden unexpected jumps. A continuous function is supposed to have a graph with no breaks. Continuity can be defined in several different ways, depending on the degree of abstraction and generality. The “easiest” definition is topological, and a continuous function between two topological spaces is just a map with the property that the inverse image of any open set is also open. Remember, in a topological space every other “superfluous” structure has been stripped away so that the only fundamental concept left is that of an open set, so everything has to be defined in terms of open sets.

4.1 Continuous functions

We will begin with the usual ε, δ-definition for functions on the real line andshow how it can be reformulated purely in terms of open sets.

Definition 4.1.1 A function f : A → R, where A ⊂ R, is said to be continuous at a point a ∈ A if and only if the following holds:
∀ ε > 0, ∃ δ > 0 such that x ∈ A ∩ (a − δ, a + δ) ⇒ |f(x) − f(a)| < ε

If f is continuous at each point of a subset B of A, then f is said to be continuous on B.

Definition 4.1.2 We say that a subset B of A is relatively open (or relatively closed) in A if it is the intersection of an open (respectively closed) set and A.


Proposition 4.1.1 A function f : A → R, where A ⊂ R, is continuous on A if and only if for every open subset U ⊂ R the inverse image $f^{-1}(U) = \{a \in A \mid f(a) \in U\}$ is relatively open in A.

Proof: Assume first that f is continuous and let U ⊂ R be open. If $f^{-1}(U)$ is empty it is open. Otherwise, let $a \in f^{-1}(U)$, so that f(a) ∈ U. Since U is open, ∃ ε > 0 such that $B_\varepsilon(f(a)) = (f(a) - \varepsilon, f(a) + \varepsilon) \subset U$. Therefore, by continuity at a, ∃ δ > 0 such that $f(A \cap B_\delta(a)) \subset B_\varepsilon(f(a)) \subset U$, so that $A \cap B_\delta(a) \subset f^{-1}(U)$ and hence $f^{-1}(U)$ is relatively open in A.
Now suppose that the inverse image (under f) of every open set is relatively open in A. Let a ∈ A and ε > 0 be given. Then since $B_\varepsilon(f(a))$ is open, its inverse image $V = f^{-1}(B_\varepsilon(f(a)))$, which contains the point a, is relatively open in A. Therefore ∃ δ > 0 such that $A \cap B_\delta(a) \subset V$, which implies that $f(A \cap B_\delta(a)) \subset f(V) \subset B_\varepsilon(f(a))$.

QED

Now you can see why the characterization of continuity by open sets is not only more general, but also by far the more elegant description. (The ε, δ definition has been the main reason why a lot of students dislike analysis and quit studying mathematics!) By using the definition that closed sets are nothing but complements of open sets one can now easily prove the following:

Proposition 4.1.2 A function f : A → R, where A ⊂ R, is continuous on A if and only if for every closed subset B ⊂ R the inverse image $f^{-1}(B)$ is relatively closed in A.

A simple corollary of the above proposition is the following useful fact:

Proposition 4.1.3 f : A → R, where A ⊂ R, is continuous at a point a ∈ A if and only if whenever $(a_n)$ is a sequence in A converging to a, the sequence $(f(a_n))$ converges to f(a).

It is also convenient to have a definition of a limit of a function

Definition 4.1.3 Let f : A → R be a function and $a \in \bar{A}$ (the closure of A). Then we write $\lim_{x \to a} f(x) = l$ if and only if, whenever $(a_n)$ is a sequence in A converging to a, the sequence $(f(a_n))$ converges to l. In other words:

∀ ε > 0, ∃ δ > 0 such that x ∈ A ∩ (a− δ, a + δ) ⇒ |f(x)− l| < ε

If f and g are real valued functions, we define the function f + g by (f + g)(x) = f(x) + g(x) for all x in the common domain of definition.


Similarly, we may define the difference, product and quotient of functions. (The quotient is not defined at points where the denominator is zero.) The following arithmetic properties for continuous functions then follow easily from the corresponding properties for limits:

Proposition 4.1.4 If f and g are continuous, then so are f ± g, f.g and f/g wherever they are defined.

Another basic property is the following:

Proposition 4.1.5 The composite f ∘ g of two continuous functions is continuous, wherever the composition is defined.

Proof: This follows from the fact that $(f \circ g)^{-1}(U) = g^{-1}(f^{-1}(U))$ and the elegant Proposition 4.1.1.

Examples

0. Any constant function f(x) = c is continuous everywhere.

1. The identity function defined by f(x) = x is continuous everywhere.

2. Hence by the above propositions, any polynomial function is continuous everywhere and any rational function (a ratio of two polynomial functions) is continuous at all points where the denominator is not zero.

3. We will see later that the elementary “transcendental” functions like sin, cos, exp, log, ... are all continuous on their domains of definition.

4. The function defined by f(r) = 1 for r ∈ Q and f(x) = 0 for x irrational is defined everywhere but is discontinuous at every point, because both Q and its complement are dense in R.

5. Let f be the function defined by f(x) = 0 if x is irrational and $f(\frac{m}{n}) = \frac{1}{n}$ for a rational number $\frac{m}{n}$ (in lowest terms). Then f is discontinuous at every rational point, but continuous at every irrational point.
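To get a feeling for Example 5, here is a minimal Python sketch that evaluates the function at rational points only. The name `thomae` is my own label (an assumption, not from the text), and irrational inputs cannot be represented exactly, so the sketch only illustrates why rationals close to an irrational number are forced to have large denominators and hence small values.

```python
from fractions import Fraction

def thomae(x: Fraction) -> Fraction:
    """Value of the function of Example 5 at a rational point x = m/n."""
    # Fraction stores m/n in lowest terms with n > 0, so f(m/n) = 1/n.
    return Fraction(1, x.denominator)

# Decimal approximations of sqrt(2): the values f(q) shrink towards 0 = f(sqrt(2)).
for q in (Fraction(14, 10), Fraction(141, 100), Fraction(1414, 1000), Fraction(14142, 10000)):
    print(q, thomae(q))
```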

4.2 Continuity and Connectedness

The main result of this section is the following important

Theorem 4.2.1 The image of a connected set under a continuous map is connected.


Proof: Let A be a connected set and let f be continuous on A. If f(A) were disconnected, then by definition, there are two disjoint non-empty open sets U and V such that f(A) ⊂ U ∪ V. Since f is continuous, $f^{-1}(U)$ and $f^{-1}(V)$ are two non-empty disjoint (relatively) open sets such that $A \subset f^{-1}(U) \cup f^{-1}(V)$, which contradicts the assumption that A is connected.

QED

Using the fact that intervals on the real line are connected we obtain the “classical” Intermediate Value Theorem of first year calculus.

Theorem 4.2.2 Let f be a real-valued function continuous on a (connected) interval I containing the two points a and b. Then for any value y which lies between the two real numbers f(a) and f(b) there exists a point x between a and b such that f(x) = y.

Since this is one of the key results, let me give you another “more elementary” proof of the above theorem.

Alternate Proof: By a simple translation (i.e., a vertical shift of the graph of the function), we can assume w.l.o.g. that I = [a, b], f(a) < 0, f(b) > 0 and that y = 0. We then define two monotonic sequences (one increasing and the other decreasing) $(a_k)$ and $(b_k)$ such that $f(a_k) \le 0$ and $f(b_k) \ge 0$ for all k ∈ N recursively as follows:
First put $a_1 = a$, $b_1 = b$. Now suppose $a_k, b_k$ are defined. Let $c_k = \frac{1}{2}(a_k + b_k)$ be the midpoint of $[a_k, b_k]$. If $f(c_k) \le 0$ set $a_{k+1} = c_k$ and $b_{k+1} = b_k$. If $f(c_k) > 0$ set $a_{k+1} = a_k$ and $b_{k+1} = c_k$.
We then have: $a = a_1 \le \cdots \le a_k \le a_{k+1} < b_{k+1} \le b_k \le \cdots \le b_1 = b$.
Moreover $b_{k+1} - a_{k+1} = \frac{1}{2}(b_k - a_k)$, so that by induction $b_{k+1} - a_{k+1} = \frac{1}{2^k}(b - a) \to 0$ as $k \to \infty$ and hence $\lim a_k = \lim b_k = x$, say. Since f is continuous, $\lim f(a_k) = f(x) \le 0$ and $\lim f(b_k) = f(x) \ge 0$ and therefore f(x) = 0.

QED

Remark: The method used in the above proof can be implemented numerically (say on a computer) to solve an equation of the form f(x) = 0. It is called the bisection method. The method of false position uses a similar idea, but this time instead of taking the midpoint of the interval at each iteration step, one uses the intersection point of the x-axis with the straight line joining $(a_k, f(a_k))$ and $(b_k, f(b_k))$, i.e.

$$c_k = \frac{b_k f(a_k) - a_k f(b_k)}{f(a_k) - f(b_k)}.$$
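The two methods in the remark can be sketched in a few lines of Python. This is only an illustration under the normalisation of the alternate proof, f(a) < 0 < f(b); the function names, the tolerance `tol` and the iteration cap `max_iter` are my own assumptions and not part of the text.

```python
def bisection(f, a, b, tol=1e-12, max_iter=200):
    """Bisection as in the alternate proof; assumes f continuous and f(a) < 0 < f(b)."""
    if not (f(a) < 0 < f(b)):
        raise ValueError("expected f(a) < 0 < f(b)")
    for _ in range(max_iter):
        if b - a <= tol:
            break
        c = 0.5 * (a + b)          # midpoint c_k
        if f(c) <= 0:
            a = c                  # a_{k+1} = c_k, b_{k+1} = b_k
        else:
            b = c                  # a_{k+1} = a_k, b_{k+1} = c_k
    return 0.5 * (a + b)


def false_position(f, a, b, tol=1e-12, max_iter=200):
    """Same bracketing idea, but c_k is the x-intercept of the chord."""
    fa, fb = f(a), f(b)
    if not (fa < 0 < fb):
        raise ValueError("expected f(a) < 0 < f(b)")
    c = a
    for _ in range(max_iter):
        c = (b * fa - a * fb) / (fa - fb)
        fc = f(c)
        if abs(fc) <= tol:         # stop on a small residual, not a small bracket
            break
        if fc <= 0:
            a, fa = c, fc
        else:
            b, fb = c, fc
    return c


f = lambda x: x * x - 2.0          # root at sqrt(2) inside [0, 2]
print(bisection(f, 0.0, 2.0))
print(false_position(f, 0.0, 2.0))
```

Note that false position may keep one endpoint fixed forever, so the bracket need not shrink to zero; this is why the sketch stops on the size of f(c) instead.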

Another simple consequence is the following:

Proposition 4.2.1 If f : [a, b] → R is continuous and injective then f is either strictly increasing (i.e. x < y ⇒ f(x) < f(y)) or strictly decreasing.

Proof: We assume w.l.o.g. that f(a) < f(b). Let x < y ; x, y ∈ [a, b]. Define g(t) = f(y(t)) − f(x(t)), where x(t) = a + t(x − a) and y(t) = b − t(b − y) for t ∈ [0, 1], so that ∀ t ∈ (0, 1): y(t) − x(t) = (1 − t)(b − a) + t(y − x) > 0. g : [0, 1] → R is a difference of compositions of continuous functions and hence is continuous. At t = 0: g(0) = f(b) − f(a) > 0. If g(1) = f(y) − f(x) < 0, then ∃ t∗ ∈ (0, 1) such that g(t∗) = 0, by the Intermediate Value Theorem applied to g. This contradicts the fact that f is injective since y(t∗) > x(t∗). Also g(1) ≠ 0, again because f is injective and x < y, so f(x) < f(y).

QED

An important consequence is the following:

Corollary 4.2.1 If f : [a, b] → [c, d] is continuous and bijective, then the inverse function $g = f^{-1}$ : [c, d] → [a, b] is continuous (and monotonic).

Proof: We may assume w.l.o.g. that f is strictly increasing. Let x ∈ (a, b), y = f(x) ∈ (c, d) and let ε > 0 be given (small enough that (x − ε, x + ε) ⊂ [a, b]). f maps the ε-neighbourhood of x (in [a, b]) bijectively onto the interval U = (f(x − ε), f(x + ε)) containing y in [c, d]. Pick $\delta = \frac{1}{10}\min\big(f(x + \varepsilon) - y,\; y - f(x - \varepsilon)\big)$ so that the δ-neighbourhood of y lies inside U. Then $g = f^{-1}$ will map this neighbourhood back into the given ε-neighbourhood of x = g(y).

QED

4.3 Continuity and Compactness

The main result of this section is the following extremely important

Theorem 4.3.1 The image of a compact set under a continuous map is compact.

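Proof: Let K be a compact set and let f be continuous on K. Let $\{U_\omega \mid \omega \in \Omega\}$ be an open cover of f(K). Then since f is continuous, $\{f^{-1}(U_\omega) \mid \omega \in \Omega\}$ is an open cover of the compact set K and so has a finite subcover: $K \subset f^{-1}(U_{\omega_1}) \cup f^{-1}(U_{\omega_2}) \cup \dots \cup f^{-1}(U_{\omega_N})$. It is now obvious that $f(K) \subset U_{\omega_1} \cup U_{\omega_2} \cup \dots \cup U_{\omega_N}$.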


QED

As you can see from the proof given above, the theorem is valid in any topological space, but for real-valued functions it boils down to the following “Extreme Value Theorem” of first year calculus:

Proposition 4.3.1 Let f be a real-valued function continuous on a closed and bounded set K. Then f attains its absolute minimum and maximum values on K.

Proof: By Theorem 4.3.1, f(K) is compact, i.e. closed and bounded. Since f(K) is a bounded set of R, both its least upper bound sup(f(K)) and its greatest lower bound inf(f(K)) exist, and since f(K) is closed they are both elements of f(K). QED

Combining the two basic theorems, we can now say that the continuous image of a closed, connected and bounded interval [a, b] is also a closed, connected and bounded interval [m, M] where m is the absolute minimum and M is the absolute maximum value of f that are attained on [a, b].

As an application of the above ideas and theorems let us state and prove the following beautiful result.

Theorem 4.3.2 Fixed Point Theorem
Let f : [a, b] → [a, b] be a continuous map. Then ∃ p ∈ [a, b] such that f(p) = p, i.e. p is a fixed point of the map f.

Proof: We assume that f(a) ≠ a and f(b) ≠ b because otherwise, either a or b is a fixed point. Since f maps [a, b] to [a, b] we then must have f(a) > a and f(b) < b. Let g be defined by g(x) = f(x) − x. g is then a continuous function such that g(a) > 0 and g(b) < 0, so by the Intermediate Value Theorem ∃ p ∈ (a, b) such that g(p) = 0, which means that f(p) = p.

QED
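The proof is constructive enough to be tried on a computer: apply bisection to g(x) = f(x) − x. The following Python sketch does this for the cosine on [0, 1] (which maps [0, 1] into itself); the helper name `fixed_point_by_ivt` and the tolerance are my own assumptions for illustration.

```python
import math

def fixed_point_by_ivt(f, a, b, tol=1e-12):
    """Locate p in [a, b] with f(p) = p, assuming f is continuous and maps [a, b] into itself."""
    g = lambda x: f(x) - x             # g(a) >= 0 and g(b) <= 0, as in the proof
    if g(a) == 0:
        return a
    if g(b) == 0:
        return b
    lo, hi = a, b                      # invariant: g(lo) > 0 >= g(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

p = fixed_point_by_ivt(math.cos, 0.0, 1.0)
print(p, math.cos(p))                  # the two printed numbers should agree closely
```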

The above is an example of an existence result that mathematicians find extremely useful. There is a generalization of the above theorem for certain domains in $\mathbb{R}^n$ (or even to more general spaces) called the Brouwer Fixed Point Theorem.

Continuing in the same vein, I would also like to mention another very important existence result that is amazingly simple to state and prove but its appearance is ubiquitous like dandelions.


4.3.1 The Contraction Mapping Principle

Theorem 4.3.3: Let f : R → R be a continuous map with the following contraction property:
∃ C ∈ (0, 1) such that |f(x) − f(y)| ≤ C|x − y| for all x, y ∈ R.
Then ∃ a unique fixed point p ∈ R such that f(p) = p.

Proof:

Step 0. We first show that f is continuous:
∀ ε > 0 ∃ $\delta = \frac{\varepsilon}{C}$ such that |x − y| < δ ⇒ |f(x) − f(y)| ≤ C|x − y| < ε for all x, y.

Step 1. Construction of a sequence converging to the fixed point:
Define $x_1 = 1$ and recursively $x_{k+1} = f(x_k)$.

Step 2. Proof that the sequence is Cauchy:
$|x_{k+2} - x_{k+1}| = |f(x_{k+1}) - f(x_k)| \le C\,|x_{k+1} - x_k|$ and hence by induction $|x_{k+2} - x_{k+1}| \le C^k |x_2 - x_1|$ for all k ∈ N. Therefore:

$$|x_{k+m+1} - x_{k+1}| \le \sum_{i=1}^{m} |x_{k+i+1} - x_{k+i}| \le |x_2 - x_1| \sum_{i=1}^{m} C^{k+i-1} \le |x_2 - x_1|\, \frac{C^k}{1 - C}$$

Since $\lim_{k \to \infty} C^k = 0$, $(x_k)$ is a Cauchy sequence and hence converges to a limit $\lim x_k = p$.

Step 3. Proof that $x_k$ converges to a fixed point:
Since f is continuous, $p = \lim x_{k+1} = \lim f(x_k) = f(\lim x_k) = f(p)$.

Step 4. Proof that the fixed point is unique:
If p and q are two fixed points, then |p − q| = |f(p) − f(q)| ≤ C|p − q| with 0 < C < 1, so p = q.

QED

Remark: From the proof we can see that the theorem holds in any complete metric space (see Chapter 7 for the definition of a metric space).
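Steps 1 and 2 of the proof translate directly into an algorithm with a guaranteed error bound: letting m → ∞ in the estimate of Step 2 gives $|p - x_{k+1}| \le |x_2 - x_1|\,C^k/(1 - C)$. The Python sketch below uses this bound as a stopping rule. The example map f(x) = ½ cos x (a contraction with C = ½, since |f′| ≤ ½) and the name `iterate_contraction` are my own choices for illustration.

```python
import math

def iterate_contraction(f, C, x1=1.0, tol=1e-12, max_iter=1000):
    """Iterate x_{k+1} = f(x_k) until the Step 2 bound guarantees |x_{k+1} - p| <= tol."""
    x = f(x1)                           # x holds x_{k+1}, starting with k = 1
    first_step = abs(x - x1)            # |x_2 - x_1|
    k = 1
    while first_step * C**k / (1 - C) > tol and k < max_iter:
        x = f(x)
        k += 1
    return x, k

f = lambda x: 0.5 * math.cos(x)         # assumed example: |f(x) - f(y)| <= 0.5 |x - y|
p, k = iterate_contraction(f, C=0.5)
print(p, f(p), k)                       # p and f(p) should agree closely
```

The bound is usually pessimistic; the iteration tends to converge faster than the worst case it guarantees.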

4.4 Uniform Continuity

Definition 4.4.1 A function f : A → R, where $A \subset \mathbb{R}^n$, is said to be uniformly continuous on A if and only if the following holds:
∀ ε > 0, ∃ δ > 0 such that |f(x) − f(y)| < ε for all x, y ∈ A satisfying ||x − y|| < δ.


The key thing to note here is that the choice of δ should depend only on ε, the function f and the set A; it has to be independent of the points x and y.

Examples:

1. The function $f(x) = x^2$ is not uniformly continuous on A = R, but it is on any bounded interval. It is not uniformly continuous on R, since ∃ ε = 2 such that for every δ > 0 we can find two points x = n and $y = n + \frac{1}{n}$, such that $|x - y| = \frac{1}{n} < \delta$, but with $|(n + \frac{1}{n})^2 - n^2| = 2 + \frac{1}{n^2} > 2$.

2. The function f(x) = sin(x) is uniformly continuous on all of R. (The easiest way to prove this is to use the mean value theorem from the next chapter.)

3. The function $f(x) = \frac{1}{x}$ is not uniformly continuous on the interval (0, 1] but is uniformly continuous on [1,∞). It is not uniformly continuous on (0, 1], since ∃ ε = 1 such that for every δ > 0 we can find two points $\frac{1}{n}$ and $\frac{1}{n+1}$ in (0, 1], satisfying $|\frac{1}{n} - \frac{1}{n+1}| = \frac{1}{n(n+1)} < \delta$, but with $|f(\frac{1}{n}) - f(\frac{1}{n+1})| = 1$. It is uniformly continuous on [1,∞), because $|\frac{1}{x} - \frac{1}{y}| = \frac{|x - y|}{|xy|} \le |x - y|$ for all x, y ∈ [1,∞), so for any ε > 0 we can choose δ = ε uniformly, independent of x and y.
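The pairs of points used in Examples 1 and 3 can be checked with exact rational arithmetic. In this small Python sketch (the function names are mine, chosen for illustration) the first component of each result is the distance between the two points and the second is the distance between their images: the first shrinks to 0 while the second does not.

```python
from fractions import Fraction

def gap_square(n):
    """Example 1: x = n, y = n + 1/n for f(x) = x^2."""
    x, y = Fraction(n), Fraction(n) + Fraction(1, n)
    return abs(x - y), abs(y**2 - x**2)

def gap_reciprocal(n):
    """Example 3: the points 1/n and 1/(n+1) for f(x) = 1/x."""
    x, y = Fraction(1, n), Fraction(1, n + 1)
    return abs(x - y), abs(1 / x - 1 / y)

for n in (1, 10, 100, 1000):
    print(n, gap_square(n), gap_reciprocal(n))
```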

The main result of this section is the following:

Theorem 4.4.1 A continuous function on a compact set is uniformly con-tinuous.

Proof: Let f be a continuous function defined on a compact set K. If f is not uniformly continuous then there exists an ε > 0 such that for each k, with $\delta = \frac{1}{k}$, we can find two points $x_k, y_k \in K$ so that $\|x_k - y_k\| < \delta = \frac{1}{k}$ but with $|f(x_k) - f(y_k)| \ge \varepsilon$. By the Bolzano-Weierstrass property there exist convergent subsequences $x_{k_l}$, $y_{k_l}$ converging to $x^* \in K$ and $y^* \in K$ respectively. $x^* = y^*$ since $\|x_k - y_k\| < \frac{1}{k}$ for all k. Since f is a continuous function, $f(x_{k_l}) \to f(x^*)$ and $f(y_{k_l}) \to f(y^*) = f(x^*)$, contradicting the fact that there exists a fixed ε > 0 with $|f(x_k) - f(y_k)| \ge \varepsilon$ for all k.

QED


4.5 Exercises

1. Prove that every polynomial of odd degree with real coefficients has at least one real root.

2. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a continuous function. Show that the zero-set of f, i.e., the set $\{x \in \mathbb{R}^n \mid f(x) = 0\}$, is closed in $\mathbb{R}^n$.

3. Let f, g : [a, b] → R be two continuous functions such that f(r) = g(r) for all rational r ∈ [a, b] ∩ Q. Show that f(x) = g(x) for all x ∈ [a, b].

4. Let f : R → R be defined by f(x) = 1 if x ∈ Q and f(x) = 0 otherwise. Show that $\lim_{x \to 0} f(x)$ does not exist, but that $\lim_{x \to 0} x f(x) = 0$.

5. Let f : K → R be continuous on a compact interval K. Suppose that ∀ x ∈ K ∃ y ∈ K such that $|f(y)| \le \frac{1}{2}|f(x)|$. Prove that ∃ a ∈ K such that f(a) = 0.

6. If f, g : [a, b] → R are continuous real-valued functions, show that max(f, g) is also continuous. (max(f, g) is obviously the function defined by: max(f, g)(x) = max(f(x), g(x)).)

7. Show that the function f : (0,∞) → R ; $f(x) = \sqrt{x}$ is uniformly continuous but that the function g : (0,∞) → R ; $g(x) = x^2$ is not uniformly continuous.

8*. Show that there does not exist a continuous function f : R → R which takes on each value (that it takes on) exactly twice. (In other words, the pre-image of a point under f is either empty or consists of exactly two points.)

9. Is the composition of two uniformly continuous functions uniformly continuous? Justify your answer.

10. Show that if f : R → R is uniformly continuous and if $(x_n)$ is a Cauchy sequence, then $f(x_n)$ is also a Cauchy sequence.

11. Show that a uniformly continuous real valued function defined on a bounded (but not necessarily compact!) set $B \subset \mathbb{R}^n$ is bounded on B.


4.5.1 Short solutions to some of the exercise problems

1. This follows from the Intermediate Value Theorem.

2. {0} is a closed set in R, because its complement is the union of two open intervals: (−∞, 0) ∪ (0,∞). Since the inverse image of a closed set under a continuous map is closed, $f^{-1}(\{0\}) = \{x \in \mathbb{R}^n \mid f(x) = 0\}$ is closed in $\mathbb{R}^n$.

3. The function h(x) = f(x) − g(x) is continuous and h(r) = 0 for all r ∈ Q ∩ [a, b]. Suppose that there exists an irrational number x ∈ [a, b] such that h(x) ≠ 0. Let $\varepsilon = \frac{1}{2}|h(x)| > 0$. Then ∃ δ > 0 such that $|y - x| < \delta \Rightarrow |h(y) - h(x)| < \frac{1}{2}|h(x)| \Rightarrow |h(y)| > \frac{1}{2}|h(x)| > 0$ for all y ∈ [a, b]. This is a contradiction, since every neighbourhood of x contains a rational point r with h(r) = 0.

Another way to see it is that the set of points where f − g = 0 is a closed set (since it is the inverse image of the closed set consisting of the single point 0 under the continuous map f − g). Now the smallest closed set containing Q ∩ [a, b] is its closure, which is all of [a, b].

4. Suppose $\lim_{x \to 0} f(x)$ exists and is equal to some real number a. Then for $\varepsilon = \frac{1}{2}$ (say), ∃ δ > 0 such that $|f(x) - a| < \frac{1}{2}$ for all x ∈ (−δ, +δ). Now each such neighbourhood contains both an irrational point s and a rational point r; but then $1 = |f(s) - f(r)| \le |f(s) - a| + |a - f(r)| < 1$. Contradiction!

Since f(x) is either zero or one, |x f(x)| ≤ |x| for all x. Therefore: ∀ ε > 0, we choose δ = ε to have |x − 0| < δ ⇒ |x f(x) − 0| ≤ |x| < ε. Therefore $\lim_{x \to 0} x f(x) = 0$.

5. Since |f| is a continuous function on the compact interval K, it attains its absolute minimum value at a certain point a ∈ K. If |f(a)| > 0, we obtain a contradiction from the assumption that there is another point y where $|f(y)| \le \frac{1}{2}|f(a)|$. Therefore f(a) = 0.

7. $\sqrt{x}$ is continuous on [0,∞) and hence it is uniformly continuous on any compact subset, for example on [0, 1]. Now, if x, y ≥ 1, then

$$|\sqrt{x} - \sqrt{y}| = \frac{|x - y|}{|\sqrt{x} + \sqrt{y}|} \le \frac{1}{2}|x - y|$$

This shows that $\sqrt{x}$ is uniformly continuous on [1,∞). (Choose δ = 2ε!)


However the function $g(x) = x^2$ is not uniformly continuous on (0,∞). Consider the two sequences $x_n = n$ and $y_n = n + \frac{1}{n}$. $|x_n - y_n| = \frac{1}{n}$, but $|(x_n)^2 - (y_n)^2| = 2 + \frac{1}{n^2} \ge 2$. So there exists “ε” = 2 > 0 such that for each δ > 0, we can find points $x_n$ and $y_n$ in (0,∞) with $|x_n - y_n| = \frac{1}{n} < \delta$ and $|(x_n)^2 - (y_n)^2| \ge 2$.

11. Let f be uniformly continuous on a bounded set B. Suppose, on the contrary, that f is not bounded. Then there exists a sequence $(x_n)$ in B such that $\lim_{n \to \infty} f(x_n) = \infty$ (or −∞). Since $(x_n)$ is a bounded sequence, by Bolzano-Weierstrass, it has a subsequence $y_k = x_{n_k}$ which is convergent and hence Cauchy. We want to show now that $f(y_k)$ is also a Cauchy sequence and hence convergent, contradicting the fact that $\lim_{n \to \infty} f(x_n) = \infty$ (or −∞).
So let ε > 0 be given. Then since f is uniformly continuous, ∃ δ > 0 such that $|y_{k+l} - y_k| < \delta \Rightarrow |f(y_{k+l}) - f(y_k)| < \varepsilon$ for any k, l ∈ N. Since $(y_k)$ is a Cauchy sequence, ∃ K ∈ N such that $k \ge K \Rightarrow |y_{k+l} - y_k| < \delta \Rightarrow |f(y_{k+l}) - f(y_k)| < \varepsilon$ for all l.


Chapter 5

Differentiation

5.1 Definition and basic properties

Definition 5.1.1 Let f be a real-valued function defined in an open neighbourhood U of a ∈ R. f is said to be differentiable at a if there exists a real number f′(a), called the derivative of f at a, such that the following holds: ∀ ε > 0, ∃ δ > 0 s. t. ∀ x ∈ U, |x − a| < δ ⇒ |f(x) − f(a) − f′(a)(x − a)| ≤ ε |x − a|.
If f is differentiable at each point of an open set A, then we say that f is differentiable in A.

In other words, f is differentiable at a point a if there is a number c, called the derivative of f at a, such that the error function r(x) = f(x) − (f(a) + c(x − a)) that remains after approximating f near a by the linear (affine) function l(x) = f(a) + c(x − a) satisfies:

$$\lim_{x \to a,\; x \ne a} \frac{r(x)}{x - a} = 0.$$

It follows that if f is differentiable at a, then f is continuous at a, since near a, f differs from a simple (linear) function l(x), which is definitely continuous, by a term r(x) that goes to 0 (even faster than |x − a|!) as x → a. This suggests the notation:

Definition 5.1.2 We say that a function φ(x) is o(|x − a|) (read: little-oh) as x → a if $\lim_{x \to a} \frac{\varphi(x)}{x - a} = 0$.

We say that φ(x) is O(|x − a|) (read: big-oh) as x → a if the ratio $\frac{|\varphi(x)|}{|x - a|}$ stays bounded as x → a.
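A quick numerical experiment shows what the little-oh condition on the remainder r(x) means in practice. The Python sketch below (the name `remainder_ratio` and the test function x³ at a = 2 are my own choices) computes r(a + h)/h for the correct slope c = f′(a) = 12 and for a wrong slope c = 11: only the correct slope makes the ratio go to 0.

```python
def remainder_ratio(f, a, c, h):
    """r(a + h)/h, where r(x) = f(x) - (f(a) + c (x - a))."""
    x = a + h
    return (f(x) - (f(a) + c * (x - a))) / h

cube = lambda x: x**3

for h in (1e-1, 1e-2, 1e-3, 1e-4):
    # correct slope 12: ratio behaves like 6h; wrong slope 11: ratio tends to 1
    print(h, remainder_ratio(cube, 2.0, 12.0, h), remainder_ratio(cube, 2.0, 11.0, h))
```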


The following arithmetic properties for derivatives then follow easily from the corresponding properties for limits:

Proposition 5.1.1 If f and g are differentiable at a, then so are f + g, f − g, f.g and f/g (provided g(a) ≠ 0) and we have the following formulas:

(i) (f ± g) ′(a) = f ′(a)± g ′(a)

(ii) (Product Rule) (f.g) ′(a) = f ′(a).g(a) + f(a).g ′(a)

(iii) (Quotient Rule) $\left(\frac{f}{g}\right)'(a) = \frac{f'(a)\,g(a) - f(a)\,g'(a)}{(g(a))^2}$

Proof:

(i) This is quite trivial.

(ii) $f(x)g(x) - f(a)g(a) = (f(x) - f(a))\,g(a) + f(x)\,(g(x) - g(a)) = (c_1(x - a) + r_1(x))\,g(a) + f(x)\,(c_2(x - a) + r_2(x))$, where $c_1 = f'(a)$, $c_2 = g'(a)$ and $r_1, r_2$ are the corresponding remainder terms. It follows that
$f(x)g(x) - \big(f(a)g(a) + (c_1 g(a) + f(a) c_2)(x - a)\big) = r_1(x)\,g(a) + f(x)\,r_2(x) + (f(x) - f(a))\,c_2\,(x - a)$ is o(|x − a|), since $r_1$ and $r_2$ are o(|x − a|) and f(x) → f(a) as x → a. Therefore, f.g is differentiable at a with derivative f′(a).g(a) + f(a).g′(a).

(iii) The special case where f is constant = 1 follows easily from the calculation:

$$\lim_{h \to 0} \frac{1}{h}\left(\frac{1}{g(x+h)} - \frac{1}{g(x)}\right) = \lim_{h \to 0} \frac{1}{g(x+h)\,g(x)} \cdot \frac{g(x) - g(x+h)}{h}$$

The general case then follows from the product rule.

QED

Another extremely important formula is the following:

Proposition 5.1.2 The Chain Rule
If g is differentiable at a and f is differentiable at g(a), then the composite f ∘ g is differentiable at a with derivative given by:

$$(f \circ g)'(a) = f'(g(a)) \cdot g'(a)$$


Proof:
Let b = g(a), y = g(x), $c_1 = f'(b)$, and $c_2 = g'(a)$.
$f(y) - f(b) = c_1(y - b) + r_1(y)$, where $r_1(y)$ is o(|y − b|), and $y - b = c_2(x - a) + r_2(x)$ with $r_2(x)$ in o(|x − a|).
Since $f(y) - f(b) = c_1(c_2(x - a) + r_2(x)) + r_1(y)$, the chain rule now follows because |y − b| is O(|x − a|).

QED
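As a sanity check (not a proof!), the chain rule can be compared with a symmetric difference quotient on a computer. In the Python sketch below the outer function sin, the inner function x², the point a = 0.7 and the helper `central_difference` are all my own choices for illustration.

```python
import math

def central_difference(F, a, h=1e-6):
    """Symmetric difference quotient, a standard numerical stand-in for F'(a)."""
    return (F(a + h) - F(a - h)) / (2 * h)

f, df = math.sin, math.cos                       # outer function and its derivative
g, dg = (lambda x: x * x), (lambda x: 2 * x)     # inner function and its derivative

a = 0.7
numeric = central_difference(lambda x: f(g(x)), a)
by_chain_rule = df(g(a)) * dg(a)                 # f'(g(a)) . g'(a)
print(numeric, by_chain_rule)
```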

Examples

0. The constant functions f(x) = c are all differentiable everywhere. (Their derivatives are dead zero everywhere!)

1. The “identity map” f(x) = x is differentiable. (f(x) − f(a) = x − a, so the derivative is dead constant = 1 everywhere!)

1(bis). By the elementary arithmetic for derivatives (the last two propositions) we see that all polynomials and rational functions (wherever they are defined) are differentiable and their derivatives can be computed easily and mechanically (by Maple for example!) by the above rules.

2. As you all know, the basic transcendental function $\exp(x) = e^x$ is also differentiable everywhere. However, this fact does not follow from the rational arithmetic rules above and has to be proved. The most natural place to do this is in Chapter 7, when we discuss uniform convergence of power series, but I will sketch an elementary proof that the derivative of $e^x$ at x = a is equal to $e^a$:
Proof: By the functional equation $e^{x+y} = e^x e^y$, we have:

$$\frac{e^{a+h} - e^a}{h} = e^a \, \frac{e^h - 1}{h}$$

but by the definition of the exponential function: $e^h - 1 = \sum_{k=1}^{\infty} \frac{h^k}{k!} = h + r(h)$, where $r(h) = h^2 \sum_{k=2}^{\infty} \frac{h^{k-2}}{k!}$ is obviously o(|h|), by the estimates we learned in Chapter 2.
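The size of the remainder r(h) can be observed numerically. In this Python sketch (the helper name `exp_quotient` is my own; `math.expm1` is the standard library routine for $e^h - 1$), the quotient tends to 1 and (quotient − 1)/h, which equals r(h)/h², settles near ½, the leading coefficient of the series for r(h).

```python
import math

def exp_quotient(h):
    """(e^h - 1)/h, computed with expm1 to limit cancellation for small h."""
    return math.expm1(h) / h

for h in (1e-1, 1e-2, 1e-3, 1e-4):
    q = exp_quotient(h)
    print(h, q, (q - 1) / h)    # last column approximates r(h)/h^2
```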


5.2 Local properties of the derivative

Suppose f is differentiable at a and that c = f′(a) > 0. Let l(x) = f(a) + c(x − a). By the definition of the derivative ∃ a δ-neighbourhood U of a, where we have $|f(x) - l(x)| < \frac{1}{10} c\,|x - a|$ for x ≠ a. For x ∈ U with x > a this implies $f(x) > l(x) - \frac{1}{10} c\,(x - a) = f(a) + \frac{9}{10} c\,(x - a) > f(a)$, and for x ∈ U with x < a it implies $f(x) < l(x) + \frac{1}{10} c\,(a - x) = f(a) + \frac{9}{10} c\,(x - a) < f(a)$. So there exists δ > 0 such that a − δ < x < a ⇒ f(x) < f(a) and a < x < a + δ ⇒ f(x) > f(a). It follows that f is strictly increasing in a neighbourhood where the derivative is strictly positive. We can apply the same argument to −f to show

Proposition 5.2.1 A differentiable function is strictly increasing in an open interval where the derivative is strictly positive (everywhere in that open interval) and is strictly decreasing if the derivative is strictly negative.

As you well remember from first year, a relative or local maximum of a function f is a point a such that f(x) ≤ f(a) for all x in a neighbourhood of a. Similarly, a relative or local minimum is a point b where f(x) ≥ f(b) for all x in a neighbourhood of b. We say a is a strict relative maximum (resp. minimum) if we have strict inequalities in the above definition (except of course at x = a!). A relative extremum is then the collective term used for either a relative maximum or a relative minimum. Since f can neither be strictly increasing nor strictly decreasing near a relative extremum, we obtain as a corollary of the above proposition the following familiar fact:

Corollary 5.2.1 The derivative of a differentiable function vanishes at relative extrema.

In fact we can say more if we assume that not only f but also its derivative f′ is differentiable near a relative extremum a.

Corollary 5.2.2 Suppose both f and f′ are differentiable in a neighbourhood of a and assume f′(a) = 0. Then

(i) f ′′(a) > 0 ⇒ a is a strict relative minimum.

(ii) f ′′(a) < 0 ⇒ a is a strict relative maximum.

Proof: Just apply the proposition to the derivative of f.
QED

Another related important result is the behaviour of a differentiable function at a point where the derivative does not vanish. We will prove the general form of this result, called the Inverse Function Theorem in $\mathbb{R}^n$, in the


next section, but for now, let me derive a simple formula for the derivative of the inverse function (not the reciprocal!) of a continuously differentiable function at a point where the derivative is non-zero (so it is either positive or negative).

Definition 5.2.1 f is said to be continuously differentiable or $C^1$ in an open interval if f is differentiable at all points of the interval and the derivative f′ is continuous in that interval.

Proposition 5.2.2 Suppose that f is continuously differentiable in a neighbourhood of a and that f′(a) ≠ 0. Then f is bijective in a neighbourhood of a and the inverse function $g = f^{-1}$ is differentiable at b = f(a) with derivative given by

$$g'(b) = \frac{1}{f'(a)}$$

Proof: If we can assume that the inverse function is differentiable at b then the formula in fact follows from the chain rule. We know that f is strictly increasing (let’s say) in a neighbourhood of a, so it maps a small interval U around a bijectively onto a small interval V around b = f(a). To prove that the inverse function is differentiable, let y = f(x) so that x = g(y) for x ∈ U, y ∈ V. Now

$$\frac{g(y) - g(b)}{y - b} = \frac{x - a}{f(x) - f(a)} = \left(\frac{f(x) - f(a)}{x - a}\right)^{-1}$$

provided x ≠ a so that f(x) ≠ f(a). (This is where we used the fact that f′(a) ≠ 0.) Now take the limit on both sides and use the fact that x → a ⇔ y → b since f is continuous and so is its inverse g (Corollary 4.2.1).

QED
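The formula can be tested numerically on a function whose inverse has no simple closed form. In this Python sketch I take f(x) = x³ + x (my own example, with f′(x) = 3x² + 1 > 0), compute the inverse g by bisection, and compare a difference quotient of g at b = f(1) = 2 with 1/f′(1) = 1/4.

```python
def f(x):
    return x**3 + x                  # f'(x) = 3x^2 + 1 > 0, so f is strictly increasing

def f_inverse(y, lo=-10.0, hi=10.0):
    """g(y) = f^{-1}(y) by bisection; valid for f(lo) <= y <= f(hi)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a = 1.0
b = f(a)                             # b = 2
h = 1e-5
quotient = (f_inverse(b + h) - f_inverse(b - h)) / (2 * h)
print(quotient, 1 / (3 * a**2 + 1))  # difference quotient of g at b versus 1/f'(a)
```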

5.3 Some global properties of the derivative

Proposition 5.3.1 Rolle’s Theorem
Suppose that f : [a, b] → R is a continuous function which is differentiable at all points of the open interval (a, b) and assume that f(a) = f(b). Then ∃ x ∈ (a, b) where f′(x) = 0.

Proof: Since [a, b] is compact and f is continuous on the compact interval [a, b], f attains both its absolute maximum and its absolute minimum value at some points in [a, b]. If both of these points are endpoints (a or b), then the function is


constant on the whole interval because of our assumption f(a) = f(b). A constant function has derivative zero everywhere. If f is not constant, either the absolute maximum or the absolute minimum occurs at an interior point in (a, b), but there the function is differentiable and by the corollary from the last section the derivative vanishes (i.e. it is zero) at that point.

QED

By applying Rolle’s Theorem to the function

$$g(x) = f(x) - \frac{f(b) - f(a)}{b - a}\,(x - a)$$

which obviously satisfies g(a) = g(b) (= f(a)) and the other continuity and differentiability requirements of Rolle’s theorem, provided f does, we easily derive the following major result of this section:

Theorem 5.3.1 Mean Value Theorem
Suppose that f : [a, b] → R is a continuous function which is differentiable at all points of the open interval (a, b). Then ∃ x ∈ (a, b) where

$$f'(x) = \frac{f(b) - f(a)}{b - a}.$$
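The theorem only asserts that such a point x exists, but in simple cases one can hunt for it numerically. The Python sketch below makes extra assumptions that the theorem does not need: the derivative `df` is supplied explicitly, is continuous, and f′ − slope has opposite signs at a and b, so that bisection applies. The name `mean_value_point` and the example f(x) = x³ on [0, 2] are mine.

```python
def mean_value_point(f, df, a, b, iterations=200):
    """Bisection on df(x) - slope, under the extra assumptions stated above."""
    slope = (f(b) - f(a)) / (b - a)
    g = lambda x: df(x) - slope
    lo, hi = (a, b) if g(a) < 0 else (b, a)     # arrange g(lo) < 0 <= g(hi)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x = mean_value_point(lambda t: t**3, lambda t: 3 * t**2, 0.0, 2.0)
print(x, 3 * x**2)     # the slope of the chord here is (8 - 0)/(2 - 0) = 4
```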

The following (very important) corollaries are immediate consequences of the Mean Value Theorem: (I remember telling you how they follow from the mean value theorem in Math 1A03!)

Corollary 5.3.1 Suppose f is continuous on [a, b] and differentiable in (a, b). If f′(x) = 0 for all x ∈ (a, b), then f is constant on [a, b].

Corollary 5.3.2 Suppose f is continuous on [a, b] and differentiable in (a, b). If f′(x) > 0 for all x ∈ (a, b), then f is strictly increasing on [a, b], i.e. $\forall x_1, x_2 \in [a, b]$: $x_1 < x_2 \Rightarrow f(x_1) < f(x_2)$.

Corollary 5.3.3 Suppose f is continuous on [a, b] and differentiable in (a, b). If f′(x) < 0 for all x ∈ (a, b), then f is strictly decreasing on [a, b].

Obviously if we weaken the assumption to f′(x) ≥ 0 (resp. f′(x) ≤ 0), we have to drop the word “strictly”. We can only claim that the function is non-decreasing (resp. non-increasing). We also note that in contrast to the last section, the statements here are global in the sense that they apply to the whole interval [a, b]. We of course also assume more. The condition


on the derivative is not just at a point! Applying this kind of argument to the derivative of f we have the following global result about convexity, but first the definition:

Definition 5.3.1 A real valued function f (not necessarily differentiable) is said to be convex on [a, b] if the following inequality holds:

$$f\big((1 - t)\,x + t\,y\big) \le (1 - t)\,f(x) + t\,f(y)$$

for all t ∈ [0, 1] and x, y ∈ [a, b].

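If z is the point (1 − t) x + t y with x < y, then $t = \frac{z - x}{y - x}$ and $1 - t = \frac{y - z}{y - x}$. So convexity implies

(i) f(z) − f(x) ≤ t (f(y) − f(x))

(ii) f(z) − f(y) ≤ (1 − t) (f(x) − f(y))

This proves

Proposition 5.3.2 If f is convex on [a, b], then for any three points x < z < y in the interval [a, b],

$$ \frac{f(z) - f(x)}{z - x} \le \frac{f(y) - f(z)}{y - z} $$

Proof: Just put $t = \frac{z - x}{y - x}$ and $1 - t = \frac{y - z}{y - x}$ in (i) and (ii) above: (i) bounds the left hand side from above by $\frac{f(y) - f(x)}{y - x}$, and (ii) bounds the right hand side from below by the same quantity.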

Now by the mean value theorem: $\frac{f(z) - f(x)}{z - x} = f'(p)$ and $\frac{f(y) - f(z)}{y - z} = f'(q)$ for some p ∈ (x, z) and q ∈ (z, y). So if f ′(p) ≤ f ′(q) for all p < q in (a, b), f would be convex. We have thus proved the following proposition and its corollary:

Proposition 5.3.3 Suppose f is continuous on [a, b] and differentiable in (a, b). If f ′ is monotonically increasing in (a, b), then f is convex on [a, b].

Corollary 5.3.4 Suppose f is continuous on [a, b] and twice differentiable in (a, b). If f ′′(ξ) ≥ 0 for all ξ ∈ (a, b), then f is convex on [a, b].


5.3.1 The Logarithm

The natural logarithm is the inverse function of the exponential function. As we have seen above, the exponential function $\exp(x) = e^x$ has derivative exp ′(x) = exp(x), which is always positive since it is a square ($e^x = e^{x/2}\, e^{x/2} = (e^{x/2})^2 > 0$). Moreover, since $\lim_{x \to +\infty} e^x = +\infty$ (because $e^x > 1 + x$ for all x > 0 by definition) and $\lim_{x \to -\infty} e^x = 0$ (because $e^{-x} = (e^x)^{-1}$), exp : R → (0, ∞) is a strictly increasing bijective map. Its inverse is called the natural logarithm.

Definition 5.3.2 The (natural) logarithm log : (0, ∞) → R is defined to be the inverse function of the exponential function. It therefore satisfies $\log(e^x) = x$ for all x ∈ R and $e^{\log(y)} = y$ for all y > 0.

By what we know about inverse functions in general, it now follows that log : (0, ∞) → R is a strictly increasing, bijective and differentiable map with derivative given by:

$$ \log'(y) = \frac{1}{y} $$

5.4 Differentiation in R^n

The derivative of a function of several variables is a linear map, so we will use the notation L(R^n; R^m) for the vector space of all linear maps from R^n to R^m.

Definition 5.4.1 Let f be an R^m-valued function defined in an open neighbourhood U of a ∈ R^n. f is said to be differentiable at a with derivative f ′(a) ∈ L(R^n; R^m) if and only if the following holds: ∀ ε > 0, ∃ δ > 0 such that ∀ x ∈ U, ‖x − a‖ < δ ⇒ ‖f(x) − f(a) − f ′(a)(x − a)‖ ≤ ε ‖x − a‖.

In other words, f is differentiable at a point a if there is a linear map A = f ′(a), called the derivative of f at a, such that the error function r(x) = f(x) − (f(a) + A(x − a)) that remains after approximating f near a by the linear (affine) function l(x) = f(a) + A(x − a) is in o(‖x − a‖), i.e., it satisfies:

$$ \lim_{x \to a,\ x \neq a} \frac{r(x)}{\|x - a\|} = 0 $$


Differentiability obviously implies continuity, as in the scalar case n = 1. If f is differentiable at each point of an open set U, then we say that f is differentiable in U. f is said to be C^1 on U if the map a ↦ f ′(a) is continuous in U.

We have the usual arithmetical rules for the derivative as in the case of n = 1:

Proposition 5.4.1 If f and g are differentiable at a, then so are f + g, f − g and ⟨f, g⟩, wherever they are defined. (Here ⟨ , ⟩ is the scalar product.) The following formulas hold:

(i) (f ± g) ′(a) = f ′(a) ± g ′(a)

(ii) (Product Rule) ⟨f, g⟩ ′(a)(v) = ⟨f ′(a)(v), g(a)⟩ + ⟨f(a), g ′(a)(v)⟩

Proposition 5.4.2 (The Chain Rule) If g is differentiable at a and f is differentiable at g(a), then the composite f ∘ g (if defined) is differentiable at a with derivative given by:

(f ∘ g) ′(a) = f ′(g(a)) ∘ g ′(a)

All the proofs are appropriate vectorial modifications (like using ‖ · ‖ instead of | · |) of the proofs I gave you for the scalar case and do not involve any fundamentally new ideas, so I will skip them. Instead I will now state and prove the Inverse Function Theorem and its corollary, the Implicit Function Theorem.

5.4.1 Inverse Function Theorem

Theorem 5.4.1 Let f : U → R^n be a C^1 map, where U is an open set in R^n containing a. If $A = f'(a) = df_a$ is invertible, then there exists a neighbourhood V of a such that f restricted to V maps V bijectively onto W = f(V). Moreover the inverse function $g = f^{-1}$ is continuously differentiable on W with derivative given by the Chain Rule (so that $dg_{f(x)} = (df_x)^{-1}$).


Proof:

Step 0. By two translations, we may assume that a = 0 and f(a) = 0. Furthermore, by composing f with the linear map $A^{-1} = (df_0)^{-1}$ we may also assume that $df_0 = I$, where I is the identity matrix. Now let h(x) = f(x) − x. Then $dh_0 = 0$.

Step 1. Since f and hence h are C^1, ∃ r > 0 s.t. $\|dh_x\| < \frac{1}{2}$ for all $x \in B_{3r}(0)$, where we define ‖A‖ to be the maximum length of all column vectors of A. Applying the Mean Value Theorem to the components of h we find that for every pair of distinct points $x_1, x_2 \in B_{3r}(0)$,

$$ |f(x_1) - x_1 - f(x_2) + x_2| = |h(x_1) - h(x_2)| < \tfrac{1}{2}\,|x_1 - x_2| . $$

So if $f(x_1) = f(x_2)$, then $x_1 = x_2$. Therefore f is injective on $B_{3r}(0)$.

Step 2. To prove surjectivity, we pick a point y in $B_r(0)$ and define (inductively and constructively!) a sequence as follows:

$$ x_0 = 0, \qquad x_1 = y, \qquad x_{k+1} = y - h(x_k) = y - f(x_k) + x_k $$

Step 3. As in Step 1, we have $|x_{k+1} - x_k| = |h(x_k) - h(x_{k-1})| < \frac{1}{2}\,|x_k - x_{k-1}|$, provided $x_k, x_{k-1} \in B_{2r}(0)$. By induction we show that $x_n \in B_{2r}(0)$, and hence that $|x_{n+1} - x_n| < \frac{1}{2}\,|x_n - x_{n-1}|$ for all n.
For n = 0: $x_0 = 0$, $x_1 = y \in B_r(0) \subset B_{2r}(0)$, and since ‖dh‖ < 1/2 in $B_{3r}(0)$, $|x_2 - x_1| = |h(x_1) - h(x_0)| < \frac{1}{2}\,|x_1 - x_0|$.
Assume $x_k \in B_{2r}(0)$ for all k ≤ n. Then $|x_{k+1} - x_k| < \frac{1}{2}\,|x_k - x_{k-1}| < \cdots < 2^{-k}\,|x_1 - x_0|$ for all k ≤ n, and hence by summing up the geometric series we obtain:

$$ |x_{n+1}| = |x_{n+1} - x_0| \le \sum_{k=0}^{n} |x_{k+1} - x_k| < 2\,|x_1 - x_0| < 2r . $$

Step 4. We claim now that $x_k$ is a Cauchy sequence and hence converges (this was actually proved in Problem #1 of your Assignment #2!). By the same argument as in Step 3, $|x_{n+k} - x_n| \le 2^{1-n}\,|x_1 - x_0| = 2^{1-n}\,|y|$, so $x_n$ is a Cauchy sequence and it converges to a limit x, which is in $B_{2r}(0)$, since $|x_n - x_0| < 2\,|x_1 - x_0| = 2|y|$ for all n and hence |x| ≤ 2|y| < 2r. Letting k → ∞ in $x_{k+1} = y - f(x_k) + x_k$ and using the continuity of f, we get f(x) = y.

Step 5. We have thus proved that f restricted to the inverse image of $B_r(0)$ is a bijective map. The differentiability of the inverse function follows in a similar fashion as in the case of a single variable, and the chain rule then gives the formula for the derivative of the inverse function.
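The iteration of Step 2 is constructive, so it can be tried out directly. The Python sketch below runs it for one sample map of the plane; the map f(x1, x2) = (x1 + x2², x2 + x1²), the target y and the helper names are assumptions chosen for this illustration (they satisfy f(0) = 0 and df_0 = I as arranged in Step 0).

```python
def f(x):
    """Sample C^1 map R^2 -> R^2 with f(0) = 0 and df_0 = I (demo assumption)."""
    x1, x2 = x
    return (x1 + x2**2, x2 + x1**2)

def h(x):
    """h(x) = f(x) - x, so that dh_0 = 0 as in Step 0."""
    fx = f(x)
    return (fx[0] - x[0], fx[1] - x[1])

def solve(y, steps=60):
    """Iterate x_{k+1} = y - h(x_k), starting from x_0 = 0 as in Step 2."""
    x = (0.0, 0.0)
    for _ in range(steps):
        hx = h(x)
        x = (y[0] - hx[0], y[1] - hx[1])
    return x

y = (0.1, -0.05)
x = solve(y)
fx = f(x)
residual = max(abs(fx[0] - y[0]), abs(fx[1] - y[1]))
print(x, residual)   # the residual |f(x) - y| should be at rounding level
```

The number of steps is generous on purpose: near the origin the contraction factor of this sample h is well below 1/2, so far fewer iterations would do.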


Corollary 5.4.1 (Implicit Function Theorem) Suppose f = (f_1, …, f_m) : R^n × R^m → R^m is continuously differentiable in an open neighbourhood of (a, b) and let f(a, b) = 0. Let A be the m × m matrix

$$ A = \left( \frac{\partial f_j}{\partial x_{n+k}}(a, b) \right)_{j,k = 1, \dots, m} . $$

If det(A) ≠ 0, then there exist an open neighbourhood U of a and an open neighbourhood V of b such that ∀ x ∈ U there exists a unique y = g(x) ∈ V satisfying the implicit equation f(x, g(x)) = 0. Moreover g is differentiable in U.

(Sketch of proof): Apply the inverse function theorem to the function F : R^n × R^m → R^n × R^m defined by F(x, y) = (x, f(x, y)). The inverse of F is of the form $F^{-1}(x, y) = (x, G(x, y))$. Then g(x) = G(x, 0) will satisfy the equation f(x, g(x)) = 0.


5.5 Exercises

1. Show that f : (0, ∞) → R defined by $f(x) = x^n e^{-x}$, n > 0, attains its absolute maximum value at the point x = n.

2. Let f : R → R satisfy the inequality:

$$ |f(x) - f(y)| \le C\,(x - y)^2 $$

for all x, y ∈ R, where C is a fixed positive constant. Prove that f is a constant function.

3. Suppose f is a differentiable real-valued function defined on R satisfying |f ′(x)| ≤ 10 for all x ∈ R. Show that the function $F(x) = x + \frac{1}{100} f(x)$ is an injective (one-to-one) map.

4. Let $f^{(n)}$ denote the nth iterate of f, i.e. $f^{(n)} = f \circ f \circ \cdots \circ f$ (n times). Express the derivative of $f^{(n)}$ in terms of f ′ and prove that if m ≤ |f ′| ≤ M, then $m^n \le |(f^{(n)})'| \le M^n$.

5. Let p_1, p_2, …, p_n be positive numbers satisfying $\sum_{k=1}^{n} p_k = 1$ and let f : R → R be a convex function. Prove Jensen’s inequality:

$$ f\Big( \sum_{k=1}^{n} p_k\, x_k \Big) \le \sum_{k=1}^{n} p_k\, f(x_k) $$

for all real numbers x_1, x_2, …, x_n.

6. Let f : R → R be a twice continuously differentiable function with f(0) = 0, f(1) = 1, and with f ′(0) = f ′(1) = 0. Prove that ∃ x ∈ [0, 1] with |f ′′(x)| ≥ 4. In more physical terms: a particle which travels a unit distance in unit time and starts and ends with zero velocity (like landing on Mars!) has at some time an acceleration ≥ 4 (in absolute value).

7. Let f : [0, 1] → [0, 1] be continuous on [0, 1] and differentiable in the interior (0, 1) with f(0) = 0. Assume that |f ′(x)| ≤ 10 |f(x)| for all x ∈ (0, 1). Prove that f(x) = 0 for all x ∈ [0, 1].


8. Let f : R → R be a strictly increasing and convex function which is three times differentiable. Assume that f(0) = 0. Starting with an initial value x_1 > 0, we now define inductively a sequence by the formula (Newton’s method):

$$ x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} $$

Prove that $\lim_{n \to \infty} x_n = 0$.

9. The Legendre polynomial of order n is defined by:

$$ P_n(x) = \frac{1}{2^n\, n!}\, \frac{d^n}{dx^n} \big( (x^2 - 1)^n \big) $$

Prove that:
(i) P_n has exactly n distinct zeros in (−1, +1)
(ii) P_n satisfies the differential equation:

$$ (1 - x^2)\, P_n''(x) - 2x\, P_n'(x) + n(n + 1)\, P_n(x) = 0 $$

5.5.1 Hints and short solutions

6. First we prove a little lemma:
If a twice differentiable function h satisfies h(0) = h ′(0) = 0 and h ′′(t) < 4 for all t ∈ [0, 1], then $h(\frac{1}{2}) < \frac{1}{2}$.

Proof of Lemma: Since h ′′(t) < 4, the function h ′(t) − 4t is strictly decreasing and hence h ′(t) − 4t < h ′(0) − 0 = 0 for 0 < t < 1. This implies that $h(t) - 2t^2$ is also strictly decreasing and hence $h(\frac{1}{2}) - \frac{1}{2} < h(0) - 0 = 0$. q.e.d.

If |f ′′(t)| < 4 for all t ∈ [0, 1], then both f(t) and g(t) = f(1) − f(1 − t) satisfy the assumptions of the lemma and hence $f(\frac{1}{2}) < \frac{1}{2}$ and $f(1) - f(\frac{1}{2}) < \frac{1}{2}$, contradicting the fact that f(1) = 1!


Chapter 6

Integration

6.1 Definition of the Riemann Integral

Let f be a bounded (not necessarily continuous) real valued function defined on a closed (and bounded) interval [a, b] ⊂ R or, more generally, on a multidimensional rectangle $R = \prod_{k=1}^{n} [a_k, b_k] \subset \mathbb{R}^n$. We denote the length (or more generally the n-dimensional volume) of R by |R|. (|R| is just the product of all the side lengths of R and is always positive.) A partition $\mathcal{R}$ of R is a finite collection of subintervals (subrectangles) $\{R_i \subset R\}_{i=1,\dots,N}$ whose interiors are all mutually disjoint and whose union is R. A refinement of a partition $\mathcal{R}$ is a partition $\mathcal{S}$ such that each subrectangle $S_j$ of $\mathcal{S}$ is contained in a subrectangle $R_i$ of $\mathcal{R}$. For each interval (or rectangle) $R_i$, we set $m(f, R_i) = \inf(f(R_i))$ and $M(f, R_i) = \sup(f(R_i))$. These numbers are finite since f is bounded. Now define:

$$ US(f, \mathcal{R}) = \sum_{i=1}^{N} M(f, R_i)\,|R_i| $$

$$ LS(f, \mathcal{R}) = \sum_{i=1}^{N} m(f, R_i)\,|R_i| $$

$$ UI(f) = \inf \{\, US(f, \mathcal{R}) \mid \mathcal{R} \text{ is a partition of } R \,\} $$

$$ LI(f) = \sup \{\, LS(f, \mathcal{R}) \mid \mathcal{R} \text{ is a partition of } R \,\} $$

These are all (finite) real numbers for any bounded function defined on a closed and bounded rectangle. US stands for upper sum, LS for lower sum, UI for upper integral and LI for lower integral. It is obvious from the definitions that US ≥ LS for any partition and that $LS(f, \mathcal{R}) \le LS(f, \mathcal{S})$ and $US(f, \mathcal{R}) \ge US(f, \mathcal{S})$ if $\mathcal{S}$ is a refinement of $\mathcal{R}$, since inf(A) ≤ inf(B) and sup(A) ≥ sup(B) if B ⊂ A and volume is an additive function on rectangles. Given two partitions $\mathcal{R}$ and $\mathcal{S}$, we can define their join (or common refinement) $\mathcal{R} \vee \mathcal{S}$ to be the partition consisting of all intersections $R_i \cap S_j$. $\mathcal{R} \vee \mathcal{S}$ is a refinement of both $\mathcal{R}$ and $\mathcal{S}$. By looking at the upper and lower sums of this common refinement we arrive at the following:

Proposition 6.1.1 UI(f) ≥ LI(f) for any bounded function f defined on a bounded rectangle.

Definition 6.1.1 We say that f is Riemann-integrable on R if UI(f) = LI(f), and the common value is defined to be the integral of f on R:

$$ \int_R f = UI(f) = LI(f) $$

The function defined by f(x) = 1 for x ∈ Q and f(x) = 0 for x ∉ Q is discontinuous at every point and is not Riemann-integrable on any closed interval, because the upper sums are always the length of the interval and all the lower sums are zero. Next year in Math 3A03 you will learn that this function is integrable in the more general Lebesgue sense and that the integral is zero, because although Q is dense in R, it is still a negligible set from the point of view of Lebesgue measure.

We will show that all continuous functions are integrable on a closed bounded interval.

Proposition 6.1.2 If f is continuous on a closed and bounded rectangle R ⊂ R^n, then f is integrable.

Proof: We know that f is bounded and uniformly continuous on the compact set R. Let ε > 0 be given. Then ∃ δ > 0 such that $\|x - y\| < \delta \Rightarrow |f(x) - f(y)| < \frac{\varepsilon}{|R|}$. Now let $\mathcal{P}$ be a partition where every subrectangle has all side lengths less than $\frac{\delta}{10\sqrt{n}}$, so that any two points in a subrectangle are at a distance less than δ apart. This means that the minimum and maximum values taken by f on each of these subrectangles are less than $\frac{\varepsilon}{|R|}$ apart, so that the difference between the upper sum and lower sum for this partition is less than ε, which was arbitrary, proving that the upper and the lower integrals coincide.

QED
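To see the squeeze between lower and upper sums at work, here is a small Python sketch for the one-dimensional case. It assumes an increasing integrand, so that the infimum and supremum on each subinterval sit at the end points; the function name `upper_lower_sums` and the choice f(x) = x² on [0, 1] are illustrative only.

```python
def upper_lower_sums(f, a, b, n):
    """Upper and lower sums of an increasing function f on [a, b]
    for the uniform partition into n subintervals.

    For an increasing function the infimum and supremum on each
    subinterval are attained at its left and right end points.
    """
    width = (b - a) / n
    points = [a + i * width for i in range(n + 1)]
    lower = sum(f(points[i]) * width for i in range(n))
    upper = sum(f(points[i + 1]) * width for i in range(n))
    return upper, lower

f = lambda x: x * x          # increasing on [0, 1]; the integral is 1/3
for n in (10, 100, 1000):
    us, ls = upper_lower_sums(f, 0.0, 1.0, n)
    print(n, ls, us, us - ls)
```

For this integrand the gap US − LS equals (f(1) − f(0))/n, so it shrinks like 1/n, which is the mechanism used in the proof of Proposition 6.1.2.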

Now let f be a bounded function defined on an arbitrary bounded set A ⊂ R^n. We extend the definition of f to a rectangle R which contains A by simply setting f(x) = 0 for all points x ∉ A. We then say that f is integrable on A if this extended function is integrable on R and define the integral to be the integral over R of the extended function. The integral, if it exists, does not depend on the extension. However, there are complicated bounded sets A for which even constant functions are not integrable.

6.2 Basic properties of the Integral

The three main arithmetic properties of the integral are the following:

Proposition 6.2.1 If f and g are (Riemann-)integrable functions on a bounded set A, then

(Linearity) $\int_A (c_1 f + c_2 g) = c_1 \int_A f + c_2 \int_A g$

(Monotonicity) If f(x) ≥ 0 for all x ∈ A, then $\int_A f \ge 0$. In fact if f(x) > 0 for all x ∈ A, then $\int_A f > 0$.

Proposition 6.2.2 (Additivity) If A_1 and A_2 are disjoint sets on which f is integrable, then

$$ \int_{A_1 \cup A_2} f = \int_{A_1} f + \int_{A_2} f $$

We will leave the proof of these simple properties as an exercise for you.

Monotonicity is an important property and we state some immediate consequences:

M1 If f(x) ≥ g(x) for all x ∈ A, then $\int_A f \ge \int_A g$.

M2 If m ≤ f(x) ≤ M for all x ∈ A, then $m\,|A| \le \int_A f \le M\,|A|$.

M3 $\left| \int_A f \right| \le \int_A |f|$ provided |f| is integrable.

Another property that follows immediately from the above properties is the following Mean Value Theorem for integrals.

Theorem 6.2.1 If f is continuous on [a, b] then there exists a value c ∈ (a, b) such that

$$ \int_a^b f(t)\,dt = f(c)\,(b - a) $$


Proof: Let m = f(p) be the absolute minimum value of f on [a, b] and M = f(q) be its absolute maximum value. Then by monotonicity: $m = f(p) \le \frac{1}{b - a} \int_a^b f(t)\,dt \le M = f(q)$. Now since f is continuous, by the Intermediate Value Theorem, there exists a point c (between p and q) such that $f(c) = \frac{1}{b - a} \int_a^b f(t)\,dt$.

QED

6.3 Fundamental Theorem of Calculus

We state and prove now the fundamental theorem of calculus, which relates integration and differentiation for functions of a single variable. The formulations of the corresponding theorems in higher dimensions (for example, Stokes’ theorem that you learned last term in Math 2A03) are a lot more elaborate, although fundamentally the proofs are based on the following one-dimensional fundamental theorem of calculus.

Theorem 6.3.1 If f is continuous on [a, b] then the function defined by

$$ F(x) = \int_a^x f(t)\,dt $$

for x ∈ [a, b] is continuous on [a, b] and differentiable in (a, b) with derivative given by F ′(x) = f(x).

Proof: Let x ∈ (a, b). For h ≠ 0 sufficiently small we have, by the additivity property and the mean value theorem for integrals:

$$ F(x + h) - F(x) = \int_x^{x+h} f = h\, f(c_h) $$

for some $c_h$, where $c_h \in (x, x + h)$ if h > 0 and $c_h \in (x + h, x)$ if h < 0. Now since f is continuous, $f(c_h) \to f(x)$ as h → 0. Therefore

$$ \lim_{h \to 0} \frac{1}{h} \big( F(x + h) - F(x) \big) = f(x) $$

It is easy to see from the estimate above that not only is F continuous on [a, b], but in fact, it is Lipschitz continuous.

QED
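A quick numerical sanity check of F ′(x) = f(x) can be sketched in Python. The integrand f = cos, the base point, the midpoint rule used for the integral and the step size of the difference quotient are all assumptions made for this illustration, not part of the theorem.

```python
import math

def integral(f, a, x, n=20000):
    """Midpoint Riemann sum approximating the integral of f over [a, x]."""
    width = (x - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

f = math.cos
a, x, step = 0.0, 1.0, 1e-4
F = lambda u: integral(f, a, u)
# Symmetric difference quotient of F at x, to be compared with f(x).
quotient = (F(x + step) - F(x - step)) / (2 * step)
print(quotient, f(x))
```

The two printed numbers should agree to several decimal places; the remaining gap comes from the quadrature and the finite step, not from the theorem.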


The function $F(x) = \int_a^x f(t)\,dt$ is then what is known as an “antiderivative” in first year calculus, since it satisfies F ′(x) = f(x). It is also sometimes called the indefinite integral. Two antiderivatives differ by a constant on any connected interval. This follows from the mean value theorem for derivatives, since two antiderivatives have the same derivative! Perhaps a more familiar form of the fundamental theorem that you remember from first year calculus is the following:

Corollary 6.3.1 Let F : [a, b] → R be an antiderivative for f : [a, b] → R, in the sense that F is continuous on [a, b], differentiable in (a, b) with F ′(x) = f(x). Then

$$ \int_a^b f(t)\,dt = F(b) - F(a) $$

6.4 Improper Integrals

Improper integrals are integrals for unbounded sets and/or unbounded functions. We will restrict ourselves in these notes to the real line.

Let (a, b) be an open interval in R. To deal with unbounded intervals, we will allow a to be −∞ and b to be ∞. Let f : (a, b) → R be a function, not necessarily bounded or continuous. Then:

Definition 6.4.1 f is said to be integrable on (a, b) if f is locally integrable, i.e. f is integrable on every closed and bounded subinterval of (a, b), and if

$$ \int_a^b f = \lim_{c \to a^+} \Big( \lim_{d \to b^-} \int_c^d f \Big) $$

exists (as a finite limit).

If the limit exists we say that the improper integral is convergent. The order of the limits in the definition actually does not matter.

Here are some familiar examples from first year calculus:

1. $\int_0^1 x^{-p}\,dx$ is convergent iff p < 1.

2. $\int_1^\infty x^{-p}\,dx$ is convergent iff p > 1.

3. $\int_0^\infty e^{-x}\,dx$ converges to 1.

4. $\int_{-\infty}^{\infty} e^{-x^2}\,dx$ converges to $\sqrt{\pi}$.

In order to test other improper integrals we often use the following:

Proposition 6.4.1 (Comparison Test) Suppose f and g are locally integrable on (a, b). If 0 ≤ f(x) ≤ g(x) for all x ∈ (a, b), and if g is integrable on (a, b), then f is also integrable on (a, b) and we have $\int_a^b f \le \int_a^b g$.

As a corollary, if |f| is integrable then f is integrable, but the converse is not true. Functions f with |f| integrable are called absolutely integrable. Otherwise (if f is integrable without |f| being integrable) we say that f is conditionally integrable.

6.4.1 The Gamma Function

The gamma function is the continuous extension of the familiar factorial function n!, which is defined for positive integers only.

Definition 6.4.2

$$ \Gamma(x) = \int_0^\infty t^{x-1}\, e^{-t}\,dt \qquad (x > 0) $$

The improper integral is easily seen to be convergent at both end points. At t = ∞, the integral converges by comparison with the convergent integral $\int_1^\infty t^{-2}\,dt$, since $\lim_{t \to \infty} t^2\, t^{x-1} e^{-t} = 0$ for all x > 0. At t = 0, the integral converges because $t^{x-1} e^{-t} \le t^{x-1}$ for t > 0 and the integral $\int_0^1 t^{x-1}\,dt$ is convergent since x > 0.

Moreover, the Gamma function is always positive and Γ(1) = 1, since $\int_0^\infty e^{-t}\,dt = 1$.

Since $\frac{d}{dt}\big( t^x e^{-t} \big) = x\, t^{x-1} e^{-t} - t^x e^{-t}$, it follows from the fundamental theorem of calculus that:

$$ x \int_a^b t^{x-1} e^{-t}\,dt - \int_a^b t^x e^{-t}\,dt = b^x e^{-b} - a^x e^{-a} $$

Letting a → 0+ and b → +∞ to compute the improper integral, we get $\lim_{a \to 0^+} a^x e^{-a} = 0$ and $\lim_{b \to +\infty} b^x e^{-b} = 0$ (for x > 0). This proves the following fundamental functional equation satisfied by the Gamma function, which makes it the continuous interpolation of the factorial!


Proposition 6.4.2 The gamma function Γ(x) is positive and satisfies the functional equation:

$$ \Gamma(x + 1) = x\,\Gamma(x) $$

We therefore have n! = Γ(n + 1) for n ∈ N. Another important fact is that

$$ \Gamma\big(\tfrac{1}{2}\big) = \int_0^\infty \frac{1}{\sqrt{t}}\, e^{-t}\,dt = 2 \int_0^{+\infty} e^{-x^2}\,dx = \sqrt{\pi} $$

which follows from the substitution $t = x^2$.
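These identities are easy to spot-check with the gamma function from Python's standard library. The sketch below is only a numerical illustration; the sample point x = 2.7 is an arbitrary choice.

```python
import math

# Gamma(n + 1) = n! for small positive integers n.
for n in range(1, 8):
    print(n, math.gamma(n + 1), math.factorial(n))

# Functional equation Gamma(x + 1) = x * Gamma(x) at a non-integer point.
x = 2.7
print(math.gamma(x + 1), x * math.gamma(x))

# Gamma(1/2) = sqrt(pi).
print(math.gamma(0.5), math.sqrt(math.pi))
```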

6.4.2 Stirling’s Formula

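If $a_n$ and $b_n$ are two sequences of real numbers we will use the notation $a_n \sim b_n$ to mean $\lim_{n \to \infty} \frac{a_n}{b_n} = 1$.

Theorem 6.4.1

$$ n! \sim \sqrt{2\pi}\; n^n\, n^{1/2}\, e^{-n} $$

Proof: Let

$$ d_n = \log(n!) + n - \big(n + \tfrac{1}{2}\big) \log(n) $$

Then

$$ \begin{aligned} d_n - d_{n+1} &= -\log(n + 1) - 1 - \big(n + \tfrac{1}{2}\big) \log(n) + \big(n + \tfrac{3}{2}\big) \log(n + 1) \\ &= \big(n + \tfrac{1}{2}\big) \log\Big( \frac{n + 1}{n} \Big) - 1 \\ &= \frac{1}{2x} \log\Big( \frac{1 + x}{1 - x} \Big) - 1 \end{aligned} $$

where $x = (2n + 1)^{-1} > 0$. By a power series expansion for the logarithm we obtain (for sufficiently small and positive x):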


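$$ \begin{aligned} \frac{1}{2x} \log\Big( \frac{1 + x}{1 - x} \Big) - 1 &= \frac{1}{x} \Big( x + \frac{x^3}{3} + \frac{x^5}{5} + \cdots \Big) - 1 \\ &= \frac{x^2}{3} + \frac{x^4}{5} + \cdots \\ &< \frac{x^2}{3} \big( 1 + x^2 + x^4 + \cdots \big) \\ &= \frac{x^2}{3(1 - x^2)} \\ &= \frac{1}{12} \Big( \frac{1}{n} - \frac{1}{n + 1} \Big) \qquad \text{since } x = \frac{1}{2n + 1} \end{aligned} $$

Therefore

$$ 0 < d_n - d_{n+1} < \frac{1}{12} \Big( \frac{1}{n} - \frac{1}{n + 1} \Big) $$

This simultaneously shows two important properties of the sequence $d_n$:

(i) $d_n$ is strictly decreasing

(ii) $\big( d_n - \frac{1}{12n} \big)$ is strictly increasing

It follows from (ii) that $d_n$ is bounded from below and so by (i) is a convergent sequence with limit lim $d_n$ = d.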

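We claim that $e^d = \sqrt{2\pi}$. The proof relies on the famous product formula of John Wallis (1616-1703):

$$ \frac{2 \cdot 2 \cdot 4 \cdot 4 \cdot 6 \cdot 6 \cdots (2n)(2n)}{1 \cdot 3 \cdot 3 \cdot 5 \cdot 5 \cdot 7 \cdots (2n - 1)(2n + 1)} \to \frac{\pi}{2} \quad \text{as } n \to \infty $$

(The derivation of this formula is one of the exercise problems at the end of this chapter.) Taking the square root we get

$$ \frac{2 \cdot 4 \cdot 6 \cdots 2n}{3 \cdot 5 \cdot 7 \cdots (2n - 1)} \, \frac{1}{\sqrt{2n + 1}} \to \sqrt{\frac{\pi}{2}} $$

but the left hand side is

$$ \frac{2^2 \cdot 4^2 \cdot 6^2 \cdots (2n)^2}{1 \cdot 2 \cdot 3 \cdot 4 \cdot 5 \cdot 6 \cdots (2n - 1)(2n)} \, \frac{1}{\sqrt{2n + 1}} = \frac{2^{2n}\, (n!)^2}{(2n)!} \, \frac{1}{\sqrt{2n}} \, \frac{1}{\sqrt{1 + (2n)^{-1}}} $$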


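Taking logarithms we find

$$ 2 \log(n!) - \log((2n)!) + 2n \log(2) - \tfrac{1}{2} \log(n) \to \log(\sqrt{\pi}) $$

On the other hand, by the definition of $d_n$:

$$ 2 d_n - d_{2n} = 2 \log(n!) - \log((2n)!) + 2n \log(2) - \tfrac{1}{2} \big( \log(n) - \log(2) \big) $$

Therefore $d = \lim (2 d_n - d_{2n}) = \log(\sqrt{\pi}) + \tfrac{1}{2} \log(2) = \log(\sqrt{2\pi})$.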
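The behaviour of the sequence d_n used in the proof can be observed numerically. The Python sketch below evaluates d_n through `math.lgamma` (so that log(n!) = lgamma(n + 1)) and works with logarithms throughout to avoid overflow; the sample values of n are arbitrary.

```python
import math

def d(n):
    """d_n = log(n!) + n - (n + 1/2) log(n), with log(n!) = lgamma(n + 1)."""
    return math.lgamma(n + 1) + n - (n + 0.5) * math.log(n)

limit = 0.5 * math.log(2 * math.pi)   # log(sqrt(2*pi))
for n in (1, 10, 100):
    gap = d(n) - limit                # positive, and below 1/(12 n)
    ratio = math.exp(gap)             # n! divided by Stirling's approximation
    print(n, d(n), gap, 1 / (12 * n), ratio)
```

The printed gap stays between 0 and 1/(12n), in line with properties (i) and (ii), and the ratio of n! to Stirling's approximation tends to 1.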

6.4.3 The Euler-Maclaurin Summation Formula

Let P(t) be a polynomial of degree n and f(x) be a function which is continuously differentiable (n+1)-times for all x in an open interval containing a point a. Then by differentiation we find for 0 ≤ t ≤ 1:

$$ \frac{d}{dt} \sum_{k=1}^{n} (-1)^k P^{(n-k)}(t)\, f^{(k)}(a + t(x - a))\,(x - a)^k = -P^{(n)}(t)\, f'(a + t(x - a))\,(x - a) + (-1)^n P(t)\, f^{(n+1)}(a + t(x - a))\,(x - a)^{n+1} $$

where for a function F, $F^{(k)}$ denotes the k-th derivative of F. Since $P^{(n)}(t)$ is a constant $c = P^{(n)}(0)$, we obtain by integrating from t = 0 to t = 1 the following identity due to Darboux:

Proposition 6.4.3 (Darboux Formula)

$$ P^{(n)}(0) \big( f(x) - f(a) \big) + \sum_{k=1}^{n} (-1)^k \Big( P^{(n-k)}(1)\, f^{(k)}(x) - P^{(n-k)}(0)\, f^{(k)}(a) \Big) (x - a)^k = (-1)^n (x - a)^{n+1} \int_0^1 P(t)\, f^{(n+1)}(a + t(x - a))\,dt $$

This innocuous looking little formula is quite powerful! For example, if we use the polynomials $P(t) = (t - 1)^n$, we obtain Taylor’s formula that you all learn in first year:

Proposition 6.4.4 (Taylor-Maclaurin’s Formula)

$$ f(x) - f(a) = \sum_{k=1}^{n} \frac{f^{(k)}(a)}{k!}\,(x - a)^k + R_{n+1} $$

where the remainder term is given by:

$$ R_{n+1} = \frac{(x - a)^{n+1}}{n!} \int_0^1 (1 - t)^n\, f^{(n+1)}(a + t(x - a))\,dt $$


Functions for which the Taylor-Maclaurin formula remains true (in a neighbourhood of a) when n → ∞ are called analytic.

To obtain the Euler-Maclaurin summation formula we first introduce a very important sequence of rational numbers named after Bernoulli. The Bernoulli numbers $B_{2j}$ are defined as coefficients in the power series expansion:

$$ \frac{x}{e^x - 1} = 1 - \frac{x}{2} + \sum_{j=1}^{\infty} (-1)^{j-1} B_{2j}\, \frac{x^{2j}}{(2j)!} $$

They can also be defined through:

$$ \frac{1}{2}\, x \cot\Big( \frac{x}{2} \Big) = 1 - \sum_{j=1}^{\infty} \frac{1}{(2j)!}\, B_{2j}\, x^{2j} $$

(Maple or some other software will compute them for you: $B_2 = \frac{1}{6}$, $B_4 = \frac{1}{30}$, $B_6 = \frac{1}{42}$, $B_8 = \frac{1}{30}$, $B_{10} = \frac{5}{66}$ etc.)
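As one way of letting software compute them, the following Python sketch uses exact rational arithmetic. It first computes the Bernoulli numbers in the common signed convention (written b_k here, with b_1 = −1/2) from the standard recurrence, and then converts to the positive numbers B_{2j} of these notes via B_{2j} = (−1)^{j−1} b_{2j}. The function name and the choice of recurrence are assumptions of this example.

```python
from fractions import Fraction
from math import comb

def bernoulli_standard(count):
    """Signed Bernoulli numbers b_0, ..., b_{count-1} (with b_1 = -1/2),
    from the recurrence sum_{k=0}^{m} C(m+1, k) b_k = 0 for m >= 1."""
    b = [Fraction(1)]
    for m in range(1, count):
        s = sum(comb(m + 1, k) * b[k] for k in range(m))
        b.append(-s / (m + 1))
    return b

b = bernoulli_standard(11)
# Convention of these notes: B_{2j} = (-1)^(j-1) * b_{2j} > 0.
B = {2 * j: (-1) ** (j - 1) * b[2 * j] for j in range(1, 6)}
for index, value in B.items():
    print(f"B_{index} = {value}")
```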

These numbers are intimately related to a family of polynomials also named after Bernoulli. They are defined as the coefficients which you obtain when the function $x\,\frac{e^{tx} - 1}{e^x - 1}$ is expanded in a Taylor-Maclaurin power series

$$ x\, \frac{e^{tx} - 1}{e^x - 1} = \sum_{n=1}^{\infty} \beta_n(t)\, \frac{x^n}{n!} $$

The Euler-Maclaurin summation formula is derived by using the Bernoulli polynomials in the Darboux formula. These polynomials satisfy the important recursive relation:

$$ \beta_n(t + 1) - \beta_n(t) = n\, t^{n-1} $$

This is easily checked from the definition of the polynomials by putting t + 1 for t and then subtracting.

Multiplying the two power series for $e^{xt} - 1$ and $\frac{x}{e^x - 1}$, we find that these polynomials are related to the Bernoulli numbers as follows:

$$ \beta_n(t) = t^n - \tfrac{1}{2}\, n\, t^{n-1} + \binom{n}{2} B_2\, t^{n-2} - \binom{n}{4} B_4\, t^{n-4} + \cdots $$

This implies: $\beta_n^{(n-2j-1)}(0) = 0$ (for j ≥ 1), $\beta_n^{(n-2j)}(0) = (-1)^{j-1}\, \frac{n!}{(2j)!}\, B_{2j}$, $\beta_n^{(n-1)}(0) = -\frac{1}{2}\, n!$, $\beta_n^{(n)}(0) = n!$.

Moreover, by differentiating the recurrence relation $\beta_n(t + 1) - \beta_n(t) = n\, t^{n-1}$, we have $\beta_n^{(n-k)}(1) = \beta_n^{(n-k)}(0)$ for all k ≥ 2, while $\beta_n^{(n-1)}(1) = \beta_n^{(n-1)}(0) + n! = \frac{1}{2}\, n!$. Putting these all together for $\beta_{2n}$ in the Darboux formula we obtain:

$$ f(x) - f(a) - \tfrac{1}{2} \big( f'(x) + f'(a) \big)(x - a) + \sum_{j=1}^{n-1} \frac{(-1)^{j-1}}{(2j)!}\, B_{2j} \big( f^{(2j)}(x) - f^{(2j)}(a) \big)(x - a)^{2j} = R_{n+1} = \frac{1}{(2n)!}\,(x - a)^{2n+1} \int_0^1 \beta_{2n}(t)\, f^{(2n+1)}(a + t(x - a))\,dt $$

Applying this formula to an antiderivative of f (that is, writing f in place of f ′, so that each $f^{(2j)}$ becomes $f^{(2j-1)}$) and assuming that we can neglect the error term $R_{n+1}$, we obtain (by using the fundamental theorem of calculus!):

$$ \int_a^{a+h} f(t)\,dt = \tfrac{1}{2}\, h \big( f(a) + f(a + h) \big) - \sum_{j=1}^{n-1} \frac{(-1)^{j-1}}{(2j)!}\, B_{2j}\, h^{2j} \big( f^{(2j-1)}(a + h) - f^{(2j-1)}(a) \big) $$

where we have replaced x by a + h. Now for the special case of a ∈ N and h = 1, we can add up all these formulas to obtain the famous:

Theorem 6.4.2 (Euler-Maclaurin Summation Formula)

$$ \sum_{k=m}^{n} f(k) - \int_m^n f(x)\,dx = \tfrac{1}{2} \big( f(m) + f(n) \big) - \sum_{j=1}^{\infty} \frac{(-1)^j}{(2j)!}\, B_{2j} \big( f^{(2j-1)}(n) - f^{(2j-1)}(m) \big) $$
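For a polynomial f the correction series stops after finitely many terms, so the summation formula can be checked exactly. The Python sketch below does this for f(x) = x⁴ with exact fractions. It assumes the sign convention in which the sum minus the integral equals ½(f(m) + f(n)) minus the sum over j of (−1)^j B_{2j}/(2j)! times (f^{(2j−1)}(n) − f^{(2j−1)}(m)), with the positive Bernoulli numbers B_2 = 1/6 and B_4 = 1/30; the helper names `lhs` and `rhs` are made up for the example.

```python
from fractions import Fraction

def lhs(m, n):
    """sum_{k=m}^{n} k^4 minus the integral of x^4 over [m, n]."""
    return sum(Fraction(k) ** 4 for k in range(m, n + 1)) - Fraction(n**5 - m**5, 5)

def rhs(m, n):
    """Boundary term plus the j = 1 and j = 2 corrections for f(x) = x^4.

    Uses B_2 = 1/6, B_4 = 1/30, f'(x) = 4x^3 and f'''(x) = 24x;
    all higher odd-order derivatives of x^4 vanish, so the series stops.
    """
    boundary = Fraction(m**4 + n**4, 2)
    j1 = Fraction(1, 6) / 2 * (4 * n**3 - 4 * m**3)    # -(-1)^1 B_2 / 2!
    j2 = -Fraction(1, 30) / 24 * (24 * n - 24 * m)     # -(-1)^2 B_4 / 4!
    return boundary + j1 + j2

print(lhs(2, 7), rhs(2, 7))
```

With m = 0 this reproduces the classical closed form for the sum of fourth powers, N⁵/5 + N⁴/2 + N³/3 − N/30.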


6.5 Exercises

1. Suppose f : [0, 1] → R is integrable. Show that

(i) $\exp\Big( \int_0^1 f(x)\,dx \Big) \le \int_0^1 \exp(f(x))\,dx$

(ii) $\Big( \int_0^1 |f(x)|^p\,dx \Big)^{1/p} \le \int_0^1 |f(x)|\,dx$ for 0 < p < 1

(iii) $\Big( \int_0^1 |f(x)|^p\,dx \Big)^{1/p} \ge \int_0^1 |f(x)|\,dx$ for p > 1

2. Suppose f : [0, 1] → R is continuous. Show that as N → ∞

$$ \frac{1}{N} \sum_{k=1}^{N} f\Big( \frac{k}{N} \Big) \to \int_0^1 f $$

3. By using the sequence of partitions $0 < a < aq < \cdots < a q^n = b$, where $q = \sqrt[n]{b/a}$, show that

$$ \lim_{n \to \infty} n \Big( \sqrt[n]{\frac{b}{a}} - 1 \Big) = \int_a^b \frac{1}{x}\,dx = \log\Big( \frac{b}{a} \Big) $$

4.

(i) Show that

$$ \int_0^1 \log(1 + x)\,dx = 2 \log 2 - 1 $$

(ii) Deduce that

$$ \frac{1}{n} \log\Big( \frac{(2n)!}{n^n\, n!} \Big) \to \log\Big( \frac{4}{e} \Big) \quad \text{as } n \to \infty $$

5. Prove that the following limit exists:

$$ \gamma = \lim_{n \to \infty} \Big( \sum_{k=1}^{n} \frac{1}{k} - \log(n) \Big) $$

(The limit, denoted by γ ≈ 0.57721…, is called Euler’s constant and is closely related to the Gamma function; for example Γ ′(1) = −γ. It is still unknown whether γ is irrational.)


6. (i) Compute $S_n = \int_0^{\pi/2} \sin^n(x)\,dx$. (Hint: $n\, S_n = (n - 1)\, S_{n-2}$.)

(ii) Deduce Wallis’ formula:

$$ \frac{2 \cdot 2 \cdot 4 \cdot 4 \cdot 6 \cdot 6 \cdots (2n)(2n)}{1 \cdot 3 \cdot 3 \cdot 5 \cdot 5 \cdot 7 \cdots (2n - 1)(2n + 1)} \to \frac{\pi}{2} $$

7. Using an appropriate change of variables, derive the following alternate expressions for the Gamma function:

$$ \Gamma(x) = 2 \int_0^\infty t^{2x-1}\, e^{-t^2}\,dt = \int_0^1 \Big( \log\Big( \frac{1}{t} \Big) \Big)^{x-1} dt $$

8. Prove that the logarithm of the Gamma function is convex.

9. Find the first four non-vanishing terms of the Taylor-Maclaurin series for the following functions about the point x = 0:

(i) $f(x) = \int_0^x \log(1 + t)\,dt$

(ii) f(x) = exp(sin(x))

(iii) $f(x) = \log\big( \frac{\sin(x)}{x} \big)$, f(0) = 0

(iv) f(x) = x cot(x), f(0) = 1


6.5.1 Hints and short solutions

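4. (i) $\frac{d}{dx} \big( (1 + x) \log(1 + x) - x \big) = \log(1 + x)$. Therefore by the fundamental theorem of calculus: $\int_0^1 \log(1 + x)\,dx = 2 \log 2 - 1 = \log\big( \frac{4}{e} \big)$, since 1 log 1 − 0 = 0.

(ii) Choose the partition $0 < \frac{1}{n} < \cdots < \frac{n-1}{n} < 1$ and form the Riemann sum for the integral $\int_0^1 \log(1 + x)\,dx$:

$$ \frac{1}{n} \sum_{k=1}^{n} \log\Big( 1 + \frac{k}{n} \Big) = \frac{1}{n} \log\Big( \prod_{k=1}^{n} \frac{k + n}{n} \Big), $$

but $\prod_{k=1}^{n} \frac{k + n}{n} = \frac{(2n)!}{n!\, n^n}$ and the Riemann sum converges to the integral as n → ∞. Therefore $\log\Big( \big( \frac{(2n)!}{n^n\, n!} \big)^{1/n} \Big) \to \log\big( \frac{4}{e} \big)$ as n → ∞.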

5. Let $\gamma_n = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \log(n)$. Then $\gamma_{n+1} - \gamma_n = \frac{1}{n+1} - \int_n^{n+1} \frac{dt}{t} < 0$, since $\frac{1}{t}$ is a strictly decreasing function. This shows that $\gamma_n$ is a monotonically decreasing sequence. Moreover the sequence is bounded from below by 0, because $\sum_{k=1}^{n-1} \frac{1}{k}$ is an upper Riemann sum for $\int_1^n \frac{dt}{t} = \log(n)$ and hence $\gamma_n \ge \frac{1}{n} > 0$. Therefore $\gamma_n$ converges to a limiting value γ, because it is decreasing and bounded from below by 0.

6. For n ≥ 2, we have:

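$$ \begin{aligned} \frac{d}{dx} \big( \sin^{n-1}(x) \cos(x) \big) &= (n - 1) \sin^{n-2}(x) \cos^2(x) - \sin^n(x) \\ &= (n - 1) \sin^{n-2}(x) \big( 1 - \sin^2(x) \big) - \sin^n(x) \\ &= (n - 1) \sin^{n-2}(x) - n \sin^n(x) \end{aligned} $$

Therefore by the fundamental theorem of calculus:

$$ (n - 1) \int_0^{\pi/2} \sin^{n-2}(x)\,dx = n \int_0^{\pi/2} \sin^n(x)\,dx $$

since $\sin(0) = \cos(\frac{\pi}{2}) = 0$. By induction, we obtain:

$$ S_{2n} = \frac{(2n - 1)(2n - 3) \cdots 3 \cdot 1}{(2n)(2n - 2) \cdots 4 \cdot 2}\, S_0 = \frac{(2n - 1)(2n - 3) \cdots 3 \cdot 1}{(2n)(2n - 2) \cdots 4 \cdot 2} \cdot \frac{\pi}{2} $$

$$ S_{2n+1} = \frac{(2n)(2n - 2) \cdots 4 \cdot 2}{(2n + 1)(2n - 1) \cdots 5 \cdot 3}\, S_1 = \frac{(2n)(2n - 2) \cdots 4 \cdot 2}{(2n + 1)(2n - 1) \cdots 5 \cdot 3} \cdot 1 $$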


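Since 0 ≤ sin x ≤ 1 for $x \in [0, \frac{\pi}{2}]$, we have $0 < S_{n+2} \le S_{n+1} \le S_n$, and because $n\, S_n = (n - 1)\, S_{n-2}$ gives $\frac{S_{n+2}}{S_n} = \frac{n + 1}{n + 2} \to 1$, it follows that $\lim_{n \to \infty} \frac{S_{2n+1}}{S_{2n}} = 1$. This proves Wallis’ formula:

$$ \lim_{n \to \infty} \frac{2 \cdot 2 \cdot 4 \cdot 4 \cdot 6 \cdot 6 \cdots 2n \cdot 2n}{1 \cdot 3 \cdot 3 \cdot 5 \cdot 5 \cdot 7 \cdots (2n - 1)(2n + 1)} = \frac{\pi}{2} $$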
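The convergence in Wallis' formula is slow, which a short Python experiment makes visible. The function name `wallis` and the sample values of n are choices made for this sketch.

```python
import math

def wallis(n):
    """Partial product (2*2*4*4*...*(2n)(2n)) / (1*3*3*5*...*(2n-1)(2n+1))."""
    p = 1.0
    for k in range(1, n + 1):
        p *= (2 * k) * (2 * k) / ((2 * k - 1) * (2 * k + 1))
    return p

for n in (10, 1000, 100000):
    print(n, wallis(n), math.pi / 2)
```

Each factor exceeds 1, so the partial products increase towards π/2; the error after n factors is roughly of size 1/n.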


Chapter 7

Metric spaces and Uniform Convergence

The central idea of convergence is fundamental in a much more general context. In particular, in this chapter we want to discuss convergence of functions. In order to do this in a systematic way, it is convenient to work in the framework of general metric spaces. Although topological spaces provide the most general framework for discussions of continuity and convergence, it is necessary for many problems in analysis to work with the more concrete spaces called metric spaces, where the notion of a distance is the key idea.

7.1 Definitions and examples

Definition 7.1.1 A metric on a set $X$ is an assignment of a non-negative number, called the distance $d(x, y) \in [0, \infty)$, to every pair of points $x, y$ in $X$ satisfying the following axioms:

M1. (Positivity) $\forall\, x, y \in X$, $d(x, y) \ge 0$ and $d(x, y) = 0$ iff $x = y$.

M2. (Symmetry) $\forall\, x, y \in X$, $d(x, y) = d(y, x)$.

M3. (Triangle inequality) $\forall\, x, y, z \in X$, $d(x, y) + d(y, z) \ge d(x, z)$.

A metric space is a set X, together with a metric.

Exactly as for $\mathbb{R}^n$, we define an open ball with centre $a$ and radius $r > 0$ in a metric space to be the set $B_r(a) = \{x \in X \mid d(x, a) < r\}$ and we call $U \subset X$ open if $\forall\, a \in U$ $\exists\, r > 0$ such that $B_r(a) \subset U$. This defines a topology for $X$ in the sense of Chapter 3, so metric spaces are special cases of topological spaces and all the general definitions and theorems, using open sets as the fundamental concept in Chapters 3 and 4 about connectedness, compactness and continuity hold for metric spaces (in particular!).

Definition 7.1.2 A sequence $(x_n)$ of points in a metric space $(X, d)$ is said to converge to a limit $x \in X$ if and only if the sequence of real numbers $d(x_n, x)$ converges to $0$.

Definition 7.1.3 A sequence $(x_n)$ of points in a metric space $(X, d)$ is said to be a Cauchy sequence if $\forall\, \varepsilon > 0$, $\exists N$ such that $n \ge N \Rightarrow d(x_{n+k}, x_n) < \varepsilon$ for all $k \in \mathbb{N}$.

As was the case for real numbers, it is easy to see that every convergent sequence is a Cauchy sequence, but the converse need not be true in a general metric space.

Definition 7.1.4 A metric space is said to be complete if every Cauchy sequence is convergent.

Examples

0. The Euclidean vector space $\mathbb{R}^n$ with the usual metric $d(x, y) = ||x - y||$ is the mother of all metric spaces. (Warning: the four dimensional Minkowski space-time $\mathbb{R}^{1,3}$ of special relativity, where the distance is defined by an inner product that is not positive-definite, is not a metric space.) Similarly, $\mathbb{C}^n$ is usually given the same metric as $\mathbb{R}^{2n}$. All these spaces are complete metric spaces.

1. The vector space $\mathbb{R}^n$ can also be equipped with other non-Euclidean metrics. Important examples are defined by the $L^p$ norms $||x||_p$, defined by $\left(||x||_p\right)^p = \sum_{i=1}^n |x_i|^p$ (for $p \ge 1$; for $0 < p < 1$ the triangle inequality fails). The distance is then given by $d(x, y) = ||x - y||_p$. (You will check all the axioms as an exercise at the end of this chapter. The first two properties of the metric are easy to check. The triangle inequality is a bit harder.) One can extend this definition to infinite sequences of real numbers provided we restrict to those sequences for which $\sum_{i=1}^\infty |x_i|^p$, the $p$-th power of the $L^p$-norm, is finite. All these spaces are also complete.
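The role of the exponent $p$ in the triangle inequality can be seen on a tiny example. The sketch below uses illustrative names (`p_distance`, `triangle_holds`) and three hand-picked points in $\mathbb{R}^2$; it is a numerical illustration, not a proof. It suggests why $p \ge 1$ is needed: for $p = \frac{1}{2}$ the inequality already fails for these points.

```python
def p_distance(x, y, p):
    """d(x, y) = (sum_i |x_i - y_i|^p)^(1/p) for finite sequences x, y."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def triangle_holds(x, y, z, p):
    """Check d(x, y) <= d(x, z) + d(z, y), up to a small rounding tolerance."""
    return p_distance(x, y, p) <= p_distance(x, z, p) + p_distance(z, y, p) + 1e-12

x, y, z = (1.0, 0.0), (0.0, 1.0), (0.0, 0.0)
for p in (0.5, 1, 2, 3):
    print(p, p_distance(x, y, p), triangle_holds(x, y, z, p))
```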


2. (The most important metric for the purposes of this section!) Let $X = C^0[a, b]$ be the set of all real-valued continuous functions on the compact interval $[a, b]$. Since all continuous functions on a compact set are bounded we can define a metric on this space by:

$$d(f, g) = \sup\{\,|f(x) - g(x)| : x \in [a, b]\,\}.$$

The corresponding norm $||f|| = d(f, 0)$ is called the supremum norm. We will see in the next section that this is a complete metric space.

3. Using integrals, we can also define $L^p$ norms on $C^0[a, b]$ as follows:

$$\left(||f||_p\right)^p = \int_a^b |f(x)|^p\,dx$$

but I will leave the study of these spaces to your next course in analysis.

Remarks

0. A sequence in $\mathbb{R}^n$ (or $\mathbb{C}^n$) converges iff each of its components converges. The situation for infinite-dimensional spaces is more delicate. For example each component of the shift sequence in "$\mathbb{R}^\infty$": $a_1 = (1, 0, 0, \dots)$, $a_2 = (0, 1, 0, \dots)$, $a_3 = (0, 0, 1, 0, \dots)$, ... converges to $0$, but it would not be wise to say that the whole sequence converges to $0$.

1. For the finite dimensional spaces $\mathbb{R}^n$, convergence with respect to the $L^p$-norms is the same for all $p$, but this is quite different when we consider infinite dimensional function spaces. You will learn all of this next year, hopefully!

2. Convergence in the metric space $X = C^0[a, b]$ equipped with the metric $d(f, g) = \sup\{\,|f(x) - g(x)| : x \in [a, b]\,\}$ is our main concern for this section and convergence in this space is given a special name:

Definition 7.1.5 A sequence of functions $(f_n)$ in $C^0[a, b]$ is said to converge uniformly to a function $f$ if and only if $\forall\, \varepsilon > 0$, $\exists N$ such that $n \ge N \Rightarrow |f_n(x) - f(x)| < \varepsilon$ for all $x \in [a, b]$.

7.2 Uniform Convergence

Definition 7.2.1 A sequence of functions $(f_n)$ defined on a set $A$ is said to converge uniformly to a function $f$ if and only if $\forall\, \varepsilon > 0$, $\exists N$ such that $n \ge N \Rightarrow |f_n(x) - f(x)| < \varepsilon$ for all $x \in A$.


This is in contrast to the following much weaker notion of convergence we could have used:

Definition 7.2.2 A sequence of functions $(f_n)$ defined on a set $A$ is said to converge pointwise to a function $f$ if for every point $x \in A$ and $\forall\, \varepsilon > 0$, $\exists N$ (depending possibly on $x$) such that $n \ge N \Rightarrow |f_n(x) - f(x)| < \varepsilon$.

Remarks

0. The sequence of functions $f_n(x) = x^n$ converges pointwise but not uniformly in $[0, 1]$. The limit function $f$ is obviously discontinuous at the end point $1$.

1. The sequence of functions $f_n(x) = \frac{nx}{1 + |nx|}$ converges pointwise but not uniformly in $\mathbb{R}$.
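The difference between the two notions can be made concrete for $f_n(x) = x^n$ on $[0, 1]$. The sketch below uses a hand-picked "witness point" $x_n = 2^{-1/n}$ (an illustrative choice, not taken from the text): at a fixed $x < 1$ the values tend to $0$, yet for every $n$ there is a point of $[0, 1)$ where $f_n$ still equals $\frac{1}{2}$, so $\sup |f_n - f|$ cannot go to $0$.

```python
def f(n, x):
    """f_n(x) = x**n on [0, 1]."""
    return x ** n

def witness_point(n):
    """A point x_n in [0, 1) with f_n(x_n) = 1/2, namely x_n = 2**(-1/n)."""
    return 0.5 ** (1.0 / n)

# Pointwise: at a fixed x < 1 the values f_n(x) tend to 0.
print([f(n, 0.9) for n in (1, 10, 100, 1000)])

# Not uniform: for every n some point is still 1/2 away from the limit value 0.
for n in (1, 10, 100, 1000):
    x_n = witness_point(n)
    print(n, x_n, f(n, x_n))
```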

Proposition 7.2.1 If a sequence of continuous functions $f_k : [a, b] \to \mathbb{R}$ converges uniformly to $f : [a, b] \to \mathbb{R}$ then $f$ is continuous on $[a, b]$.

Proof: The classic proof of this (which thousands of mathematicians had to learn in the last 100 years) is actually very simple and is based on the fact that $|f(x) - f(y)| \le |f(x) - f_k(x)| + |f_k(x) - f_k(y)| + |f_k(y) - f(y)|$ (triangle inequality). All three terms on the right-hand side are made to go to $0$ by using the assumptions of the theorem, so let's just do what others have done before us!

Let $x \in [a, b]$ and $\varepsilon > 0$ be given. Then because we have uniform convergence we can find a $K$ such that the first term $|f(x) - f_K(x)|$ and the last term $|f_K(y) - f(y)|$ are simultaneously less than (say) $\frac{1}{10}\varepsilon$ for all $y \in [a, b]$. Now the second term $|f_K(x) - f_K(y)|$ can also be made less than $\frac{1}{10}\varepsilon$ by choosing $y$ sufficiently close to $x$ (i.e. $\exists\, \delta > 0$ s.t. $|x - y| < \delta \Rightarrow |f_K(x) - f_K(y)| < \frac{1}{10}\varepsilon$), since $f_K$ is continuous at $x$. Adding the three terms together finishes the proof, since $\frac{3}{10} < 1$!

QED

One nice way to paraphrase what we just proved is to say that

$$\lim_{y \to x}\, \lim_{k \to \infty} f_k(y) = \lim_{k \to \infty}\, \lim_{y \to x} f_k(y),$$

so uniform convergence implies commutativity of two different limits.


Corollary 7.2.1 $C^0[a, b]$ is a complete metric space with respect to the metric defined by the supremum norm.

The proof given above can be strengthened to show that:

Proposition 7.2.2 If a sequence of uniformly continuous functions $f_k : [a, b] \to \mathbb{R}$ converges uniformly to $f : [a, b] \to \mathbb{R}$ then $f$ is uniformly continuous on $[a, b]$.

More importantly, we can relate uniform convergence to differentiation and integration:

Proposition 7.2.3 Let $f_k : A \to \mathbb{R}$ be a sequence of continuous functions converging uniformly to a function $f$ on a compact set $A$. Then

(i) $\int_a^b f_k \to \int_a^b f$ as $k \to \infty$ for any interval $[a, b] \subset A$.

(ii) If each $f_k$ is continuously differentiable and if $f_k' \to g$ uniformly on $A$, then $f' = g$.

Proof: (i) Let $\varepsilon > 0$ be given. Then $\exists K$ such that

$$k \ge K \Rightarrow \sup_{x \in A} |f_k(x) - f(x)| < \frac{\varepsilon}{b-a} \Rightarrow \left|\int_a^b (f_k - f)\right| \le \int_a^b |f_k - f| < \varepsilon.$$

(ii) By (i): $f_k' \to g$ uniformly on $A$ $\Rightarrow$ $\int_a^x f_k' \to \int_a^x g$ for any $x \in A$. By the fundamental theorem of calculus, $f_k(x) - f_k(a) = \int_a^x f_k'$, and since $f_k(x) \to f(x)$ and $f_k(a) \to f(a)$ we have $\int_a^x g = f(x) - f(a)$. Therefore, again by the fundamental theorem of calculus, $g(x) = f'(x)$.

QED

Remarks

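1. The functions $f_n$ defined on $[0, 1]$ (for $n \ge 2$), whose graphs consist of isosceles triangles with base $[0, \frac{2}{n}]$ and height $n$, all have area $1$, so that $\int_0^1 f_n = 1$ for all $n$, but the pointwise limit function is $0$ everywhere on $[0, 1]$. The integrals therefore do not converge to the integral of the limit, and the convergence is not uniform.

2. The sequence of functions

$$f_n(x) = \frac{x}{1 + n x^2}$$

converges uniformly to a limit function $f$, but

$$\lim_{n \to \infty} f_n'(0) \ne f'(0).$$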
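Remark 1 can be illustrated numerically. The sketch below assumes the triangle has its apex at $x = \frac{1}{n}$ (which is what "isosceles" over the base $[0, \frac{2}{n}]$ gives) and that $f_n$ vanishes outside the base; the helper names and the trapezoid rule are illustrative choices.

```python
def tent(n, x):
    """Isosceles triangle of height n over the base [0, 2/n], and 0 elsewhere."""
    if 0.0 <= x <= 1.0 / n:
        return n * n * x
    if 1.0 / n < x <= 2.0 / n:
        return n * n * (2.0 / n - x)
    return 0.0

def trapezoid(g, a, b, steps):
    """Composite trapezoid rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (g(a) + g(b))
    for i in range(1, steps):
        total += g(a + i * h)
    return total * h

for n in (2, 10, 100):
    area = trapezoid(lambda x: tent(n, x), 0.0, 1.0, 100000)
    print(n, area, tent(n, 0.0), tent(n, 0.25))
```

The area stays at $1$ for every $n$, while the value at any fixed point (here $0$ and $0.25$) is eventually $0$.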


7.2.1 Uniform convergence of power series

The main message of this section is that on any compact set which is strictly inside its radius of convergence a power series converges uniformly and therefore we can integrate and differentiate power series term by term on any compact set inside its radius of convergence. The standard way to establish this is to prove a general, but very useful comparison theorem, called the Weierstrass M-Test for a series of functions (not necessarily a power series).

Proposition 7.2.4 (Weierstrass M-test) Suppose $\sum M_n$ is a convergent series of strictly positive numbers. If a sequence of continuous functions $(f_n)$ defined on a compact set $A$ satisfies $|f_n(x)| \le M_n$ for all $x \in A$, then the series of functions $\sum f_n(x)$ is uniformly and absolutely convergent on $A$.

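Proof: The partial sums $S_n(x) = \sum_{k=1}^n f_k(x)$ form a Cauchy sequence in the complete metric space $C^0(A)$, since for any $n > m$:

$$|S_n(x) - S_m(x)| \le \sum_{k=m+1}^{n} M_k$$

for all $x \in A$ (and the partial sums of $\sum M_n$ form a Cauchy sequence). The same estimate applied to $\sum |f_k(x)|$ gives absolute convergence.

QED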

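Corollary 7.2.2 Suppose a power series $\sum a_n x^n$ converges for $|x| < R$ (with $R > 0$). Then it converges uniformly on $|x| \le \rho$ for each $\rho < R$.

Proof: Apply the Weierstrass M-Test: $|a_n x^n| \le M_n = |a_n \rho^n|$ for $|x| \le \rho$, and $\sum M_n$ converges because a power series converges absolutely inside its radius of convergence.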
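To see the M-test at work, take the geometric series $\sum x^k$ (so $a_k = 1$, $R = 1$) as an illustrative example that is not taken from the text. On $|x| \le \rho$ the choice $M_k = \rho^k$ gives the tail bound $\sum_{k > n} M_k = \frac{\rho^{n+1}}{1 - \rho}$, and the sketch below compares it with the observed worst-case error on a grid.

```python
def partial_sum(x, n):
    """S_n(x) = sum_{k=0}^n x**k, a partial sum of the geometric series."""
    return sum(x ** k for k in range(n + 1))

def sup_error(rho, n, grid=2001):
    """Largest |1/(1-x) - S_n(x)| over an evenly spaced grid on [-rho, rho]."""
    xs = [-rho + 2 * rho * i / (grid - 1) for i in range(grid)]
    return max(abs(1.0 / (1.0 - x) - partial_sum(x, n)) for x in xs)

def tail_bound(rho, n):
    """sum_{k>n} M_k with M_k = rho**k, i.e. rho**(n+1) / (1 - rho)."""
    return rho ** (n + 1) / (1.0 - rho)

rho = 0.9
for n in (10, 50, 100):
    print(n, sup_error(rho, n), tail_bound(rho, n))
```

The bound does not depend on $x$, which is exactly what uniform convergence on $|x| \le \rho$ requires.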

7.3 Weierstrass approximation theorem

Theorem 7.3.1 If $f$ is a continuous function on $[a, b]$, then there exists a sequence of polynomials $(P_n)$ which converges uniformly to $f$.

Proof: After applying a translation and a dilation, we may assume w.l.o.g. that $[a, b] = [0, 1]$ and also that $\max\{\,|f(x)| : x \in [0, 1]\,\} = 1$.

Define the Bernstein polynomials that approximate f by the formula:


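$$BP_n(f, x) = \sum_{k=0}^{n} f\left(\frac{k}{n}\right) \binom{n}{k} x^k (1-x)^{n-k}$$

For each $f$, $BP_n(f, x)$ is obviously a polynomial of degree at most $n$ in $x$. They satisfy the "binomial" identity:

$$BP_n(1, x) = \sum_{k=0}^{n} \binom{n}{k} x^k (1-x)^{n-k} = (1 - x + x)^n = 1$$

This implies:

$$BP_n(f, x) - f(x) = \sum_{k=0}^{n} \left(f\left(\frac{k}{n}\right) - f(x)\right) \binom{n}{k} x^k (1-x)^{n-k}$$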

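Let $\varepsilon > 0$ be given. Since $f$ is uniformly continuous on $[0, 1]$, $\exists\, \delta > 0$ such that $|x - y| < \delta \Rightarrow |f(x) - f(y)| < \frac{1}{2}\varepsilon$. We now split the sum into two pieces: a sum $\sum'$ where we sum over all $k$ which satisfy $|x - \frac{k}{n}| < \delta$ and $\sum''$ being the sum of the rest.

By uniform continuity the absolute value of the first sum is at most $\frac{1}{2}\varepsilon \sum' \binom{n}{k} x^k (1-x)^{n-k} \le \frac{1}{2}\varepsilon$.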

To estimate the second sum we borrow some ideas from probability theory! We think of it as the probability of a random variable to lie in a certain range. The random variable $X$ is given by the binomial distribution of $n$ Bernoulli trials with "success" probability $x$. The mean (expected value) of $X$ is $nx$ and its variance $E\left((X - E(X))^2\right)$ is $nx(1-x)$ (see an elementary text book on probability). Since $\max |f| \le 1$, we have $|f(\frac{k}{n}) - f(x)| \le 2$, so the absolute value of the second sum is bounded from above by $2\sum'' \binom{n}{k} x^k (1-x)^{n-k}$, where we sum only over those values of $k$ which satisfy $|x - \frac{k}{n}| \ge \delta$. We now interpret this sum as the probability that the random variable differs from its mean by at least $n\delta$ (i.e. a "tail estimate"). Using now the well known Tchebychev's (or Markov's) inequality:

$$\mathrm{Prob}(|X - \mu| \ge a) \le \frac{\sigma^2}{a^2}$$

where $\mu$ is the mean and $\sigma^2$ is the variance of the random variable $X$, we can estimate (with $a = n\delta$) the second sum to be at most $\frac{2\,x(1-x)}{n\delta^2} \le \frac{1}{2n\delta^2}$, which can be made less than $\frac{1}{2}\varepsilon$ for $n$ sufficiently large, uniformly in $x$.

QED
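The Bernstein polynomials from the proof are easy to evaluate directly. The following sketch uses an illustrative test function $|x - \frac{1}{2}|$ (continuous but not differentiable at $\frac{1}{2}$, and not normalised to maximum $1$) and measures the error only on a finite grid, so it indicates the uniform convergence rather than proving it.

```python
from math import comb

def bernstein(f, n, x):
    """BP_n(f, x) = sum_{k=0}^n f(k/n) * C(n, k) * x^k * (1 - x)^(n - k)."""
    return sum(f(k / n) * comb(n, k) * x ** k * (1 - x) ** (n - k)
               for k in range(n + 1))

def sup_error(f, n, grid=201):
    """Largest |BP_n(f, x) - f(x)| over an evenly spaced grid on [0, 1]."""
    xs = [i / (grid - 1) for i in range(grid)]
    return max(abs(bernstein(f, n, x) - f(x)) for x in xs)

def kink(x):
    """Illustrative test function: continuous on [0, 1], not differentiable at 1/2."""
    return abs(x - 0.5)

for n in (10, 50, 200):
    print(n, sup_error(kink, n))
```

The error shrinks as $n$ grows, though only slowly; Bernstein polynomials are valued for the proof they give, not for fast approximation.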


7.4 Exercises

# 1. Show that each of the following defines a complete metric on $\mathbb{R}^n$:

(i) $d(x, y) = \sum_{k=1}^{n} |x_k - y_k|$

(ii) $d(x, y) = \max_{1 \le k \le n} |x_k - y_k|$

# 2. Let $(X, d)$ be a metric space. Show that

$$\rho(x, y) = \frac{d(x, y)}{1 + d(x, y)}$$

is a metric on $X$. Is every subset of $X$ bounded in the new metric $\rho$?

# 3. For each of the following sequences of functions defined on the closed interval $[0, 1]$, determine whether the convergence is uniform or just pointwise and calculate the limit functions. Are the limit functions continuous?

(i) $f_n(x) = \frac{nx}{1 + |nx|}$

(ii) $f_n(x) = n x (x-1)^n$

(iii) $f_n(x) = \frac{x}{1 + n x^2}$

# 4. Let $[[x]]$ denote the distance from $x$ to the nearest integer (sketch the graph!). Show that the function:

$$f(x) = \sum_{k=1}^{\infty} \frac{[[3^k x]]}{3^k}$$

is continuous for all $x \in \mathbb{R}$, but is not differentiable at any point!


# 5. Give an example of a sequence of continuous functions $f_n$ defined on $[0, 1]$ that converges pointwise to zero but such that

$$\lim_{n \to \infty} \int_0^1 f_n(x)\,dx = 1$$

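# 6. Recall that the Fibonacci numbers are defined by: $a_{n+1} = a_n + a_{n-1}$ with $a_0 = a_1 = 1$.

(i) Show that

$$\sum_{n=0}^{\infty} a_n x^n = \frac{-1}{x^2 + x - 1}$$

for all $|x| < \frac{1}{2}$.

(ii) Deduce the following formula for the Fibonacci numbers:

$$a_{n-1} = \frac{1}{\sqrt{5}}\left(\left(\frac{1 + \sqrt{5}}{2}\right)^n - \left(\frac{1 - \sqrt{5}}{2}\right)^n\right)$$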

# 7*. Prove that:

$$\int_0^1 x^{-x}\,dx = \sum_{n=1}^{\infty} n^{-n}$$


7.4.1 Hints and short solutions to the exercises

2. Except for the triangle inequality, the other axioms are trivially satisfied, since $d$ satisfies them! The triangle inequality follows from the following fact: If $a, b, c$ are positive real numbers then

$$a \le b + c \Rightarrow \frac{a}{1 + a} \le \frac{b}{1 + b} + \frac{c}{1 + c}$$

which I leave as an easy exercise for you. All sets are bounded in the new metric, since $\rho(x, y) \le 1$ for all $x, y$.

3.

(i) $f_n(x) = \frac{nx}{1 + |nx|}$. This sequence converges pointwise to the function $f(x) = +1$ for $0 < x \le 1$ and $f(0) = 0$. The convergence is not uniform since the limit function is not continuous.

(ii) $f_n(x) = n x (x-1)^n$. This sequence converges pointwise to the constant function $f(x) = 0$ for every $x \in [0, 1]$. Although the limit function is continuous, the convergence is not uniform since the maximum value of $|f_n(x)|$ on the interval $[0, 1]$ (occurring at $x = \frac{1}{n+1}$) is $\frac{n}{n+1}\left(1 - \frac{1}{n+1}\right)^n$, which approaches $e^{-1} > 0$ as $n \to \infty$.

(iii) $f_n(x) = \frac{x}{1 + n x^2}$. This sequence converges pointwise to the constant function $f(x) = 0$ for every $x \in [0, 1]$. The convergence is uniform since the maximum value of $f_n(x)$ on the interval $[0, 1]$, which occurs at $x = \frac{1}{\sqrt{n}}$, is $\frac{1}{2\sqrt{n}}$, which approaches zero as $n \to \infty$. The limit function has to be continuous (as it obviously is!). Note that the derivative at zero of all the $f_n$'s is $f_n'(0) = 1$ and so they do not converge to $f'(0) = 0$!
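The two maxima quoted in (ii) and (iii) can be checked on a grid. This is only a rough sketch: the grid size and helper names are illustrative, and a grid maximum approximates the true supremum from below.

```python
import math

def sup_on_grid(g, points=10001):
    """Largest |g(x)| over an evenly spaced grid on [0, 1]."""
    return max(abs(g(i / (points - 1))) for i in range(points))

def f_ii(n):
    """f_n(x) = n x (x - 1)^n from part (ii)."""
    return lambda x: n * x * (x - 1) ** n

def f_iii(n):
    """f_n(x) = x / (1 + n x^2) from part (iii)."""
    return lambda x: x / (1 + n * x * x)

for n in (10, 100, 1000):
    print(n, sup_on_grid(f_ii(n)), 1 / math.e,
          sup_on_grid(f_iii(n)), 1 / (2 * math.sqrt(n)))
```

The first supremum stays near $e^{-1}$, the second follows $\frac{1}{2\sqrt{n}}$ down to zero.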

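6. Let $f(x) = \sum_{n=0}^{\infty} a_n x^n$. Then

$$\begin{aligned}
(1 - x - x^2) f(x) &= \sum_{n=0}^{\infty} a_n x^n - \sum_{n=0}^{\infty} a_n x^{n+1} - \sum_{n=0}^{\infty} a_n x^{n+2} \\
&= \sum_{n=0}^{\infty} a_n x^n - \sum_{n=1}^{\infty} a_{n-1} x^n - \sum_{n=2}^{\infty} a_{n-2} x^n \\
&= 1 + x - x + \sum_{n=2}^{\infty} \left(a_n - a_{n-1} - a_{n-2}\right) x^n \\
&= 1
\end{aligned}$$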


Convergence is guaranteed for all $|x| < \frac{1}{\alpha}$, where $\alpha = \lim \left|\frac{a_{n+1}}{a_n}\right| = \frac{1 + \sqrt{5}}{2}$ is the golden ratio and $\alpha < 2$.

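(ii) Since the roots of the polynomial $x^2 + x - 1$ are $-\alpha = \frac{-1 - \sqrt{5}}{2}$ and $\frac{1}{\alpha} = \frac{-1 + \sqrt{5}}{2}$, we have a partial fraction decomposition:

$$\sum_{n=0}^{\infty} a_n x^n = \frac{-1}{x^2 + x - 1} = \frac{1}{\sqrt{5}}\left(\frac{1}{x + \alpha} - \frac{1}{x - \frac{1}{\alpha}}\right)$$

Expanding the right hand side into two geometric series:

$$\frac{1}{x + \alpha} = \frac{1}{\alpha\left(1 + \frac{x}{\alpha}\right)} = \sum_{n=0}^{\infty} (-1)^n \alpha^{-n-1} x^n, \qquad -\frac{1}{x - \frac{1}{\alpha}} = \frac{\alpha}{1 - \alpha x} = \sum_{n=0}^{\infty} \alpha^{n+1} x^n$$

and comparing coefficients we get the formula for the Fibonacci numbers:

$$\sqrt{5}\, a_n = \alpha^{n+1} - (-1)^{n+1} \alpha^{-n-1} = \left(\frac{1 + \sqrt{5}}{2}\right)^{n+1} - \left(\frac{1 - \sqrt{5}}{2}\right)^{n+1}.$$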
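Both the closed formula and the generating function can be checked numerically. The sketch below uses illustrative names and a sample point $x = 0.3$ chosen inside the radius of convergence; floating point limits the closed form to moderately small $n$.

```python
import math

def fibonacci(count):
    """First `count` terms a_0, a_1, ... with a_0 = a_1 = 1, a_{n+1} = a_n + a_{n-1}."""
    terms = [1, 1]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])
    return terms[:count]

def closed_form(n):
    """(alpha^(n+1) - beta^(n+1)) / sqrt(5) with alpha, beta = (1 +/- sqrt(5)) / 2."""
    root5 = math.sqrt(5.0)
    alpha, beta = (1 + root5) / 2, (1 - root5) / 2
    return (alpha ** (n + 1) - beta ** (n + 1)) / root5

def generating_function(x):
    """The closed form -1 / (x^2 + x - 1) of the power series sum a_n x^n."""
    return -1.0 / (x * x + x - 1.0)

a = fibonacci(200)
x = 0.3  # any |x| < 1/alpha (about 0.618) lies inside the radius of convergence
series = sum(a_n * x ** n for n, a_n in enumerate(a))
print(series, generating_function(x))
print([round(closed_form(n)) for n in range(10)], a[:10])
```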

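7. Prove that:

$$\int_0^1 x^{-x}\,dx = \sum_{n=1}^{\infty} n^{-n}$$

$x^{-x} = e^{-x\log(x)} = \sum_{k=0}^{\infty} \frac{(-1)^k}{k!} x^k \left(\log(x)\right)^k$, so all we have to prove is that

$$\frac{(-1)^k}{k!} \int_0^1 x^k \left(\log(x)\right)^k dx = (k+1)^{-(k+1)}$$

The improper integral is convergent, because $x \log x \to 0$ as $x \to 0$. Moreover, since $\frac{d}{dx}\left(x^{k+1} \left(\log(x)\right)^l\right) = (k+1)\, x^k (\log(x))^l + l\, x^k (\log(x))^{l-1}$,

$$\int_0^1 x^k \left(\log(x)\right)^l dx = -\frac{l}{k+1} \int_0^1 x^k \left(\log(x)\right)^{l-1} dx$$

Inductively, we see that

$$\int_0^1 x^k \left(\log(x)\right)^k dx = \frac{(-1)^k\, k!}{(k+1)^{k+1}}$$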
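The identity can be checked numerically by comparing a partial sum of the series with a simple quadrature of the integral. This is a sketch under stated assumptions: the midpoint rule and the step count are illustrative choices, and the integrand is extended by its limit $1$ at $x = 0$.

```python
def series_value(terms=30):
    """Partial sum of sum_{n >= 1} n**(-n)."""
    return sum(float(n) ** (-n) for n in range(1, terms + 1))

def integrand(x):
    """x**(-x), extended by its limit 1 at x = 0."""
    return 1.0 if x == 0.0 else x ** (-x)

def midpoint_rule(g, a, b, steps):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

print(series_value(), midpoint_rule(integrand, 0.0, 1.0, 100000))
```

The series converges extremely fast, so a handful of terms already matches the quadrature to many digits.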
