
CCS Math 120: Calculus on Manifolds

Simon Rubinstein–Salzedo

Spring 2004

0.1 Introduction

These notes are based on a course on calculus on manifolds I took from Professor Martin Scharlemann in the Spring of 2004. The course was designed for first-year CCS math majors. The primary textbook was Michael Spivak's Calculus on Manifolds. A recommended supplementary text was Maxwell Rosenlicht's Introduction to Analysis.


Chapter 1

Basic Analysis and Topology

We define Rn = {(x1, . . . , xn) | xi ∈ R}. For example, (π, e, √2) ∈ R3.

Rn is a vector space. This means that we have two operations + : Rn × Rn → Rn and · : R × Rn → Rn satisfying the following properties:

1. x + y = y + x.

2. x + (y + z) = (x + y) + z.

3. There exists 0 ∈ Rn so that for all x, 0 + x = x.

4. For all x ∈ Rn, there exists −x ∈ Rn so that x + (−x) = 0.

5. α(x + y) = αx + αy for all α ∈ R.

6. (α + β)x = αx + βx.

7. α(βx) = (αβ)x.

8. 1x = x.

If x = (x1, . . . , xn) and y = (y1, . . . , yn), we define x + y = (x1 + y1, . . . , xn + yn).

We assume the following properties of R: For α, β ∈ R, |α| = |β| iff α2 = β2, and |α| ≥ |β| iff α2 ≥ β2.


There is a norm on Rn: | · | : Rn → R with useful properties. It is defined by |(x1, . . . , xn)| = √((x1)2 + · · · + (xn)2).

Proposition. | · | satisfies, for all α ∈ R and x, y ∈ Rn, the following properties:

1. |x| ≥ 0 and (|x| = 0 iff x = 0).

2. |∑ xiyi| ≤ |x| |y|, with equality iff x and y are linearly dependent.

3. |x + y| ≤ |x|+ |y|.

4. |αx| = |α| |x|.

Proof.

1. |x| = 0 iff |x|2 = 0 iff ∑(xi)2 = 0. If all xi = 0, then ∑(xi)2 = 0. If some xi ≠ 0, then (xi)2 > 0, so ∑_{j≠i}(xj)2 + (xi)2 ≥ (xi)2 > 0.

2. If x and y are linearly dependent, then |∑ xiyi| = |x| |y|. [Digression: Suppose v1, . . . , vm are vectors in a vector space. They are said to be linearly dependent iff there exist α1, . . . , αm, not all zero, so that ∑ αivi = 0. So here αx + βy = 0 for some α, β not both zero. Thus x and y are linearly dependent iff there exists a λ so that x = λy or y = λx. To see this, suppose α ≠ 0; then x + (β/α)y = 0, so take λ = −β/α and x = λy. If α = 0, then βy = 0 with β ≠ 0, so y = 0 = 0·x.]

Suppose without loss of generality that y = λx, i.e. yi = λxi for all i. Then

(|∑ xiyi|)2 = (∑ xiyi)2
            = (∑_{i=1}^n xi λxi)2
            = λ2 (∑(xi)2)2
            = λ2 (∑(xi)2)(∑(xi)2)
            = (∑(xi)2)(∑(λxi)2)
            = |x|2 |y|2,

so |∑ xiyi| = |x| |y|.

Now suppose that x and y are linearly independent. Then there is no λ for which y = λx, so for all λ, y − λx ≠ 0. Equivalently, for all λ, |y − λx| ≠ 0, i.e. for all λ, |y − λx|2 ≠ 0. We have

|y − λx|2 = ∑_{i=1}^n (yi − λxi)2
          = ∑_{i=1}^n ((yi)2 − 2λxiyi + λ2(xi)2)
          = ∑(yi)2 − 2λ∑ yixi + λ2∑(xi)2
          = |y|2 − 2λ∑ yixi + λ2|x|2
          ≠ 0

for all λ. Viewed as a quadratic in λ, this polynomial has no real root, so its discriminant must be negative: b2 − 4ac < 0, which happens if and only if

4(∑ yixi)2 − 4|x|2|y|2 < 0.

Finally, if (∑ yixi)2 < |x|2|y|2, then |∑ yixi| < |x| |y|.

3. |x + y| ≤ |x|+ |y| iff |x + y|2 ≤ (|x|+ |y|)2.

|x + y|2 = ∑(xi + yi)2
         = ∑((xi)2 + 2xiyi + (yi)2)
         = |x|2 + |y|2 + 2∑ xiyi
         ≤ |x|2 + |y|2 + 2|x| |y|
         = (|x| + |y|)2.

4. |αx| = √((αx1)2 + · · · + (αxn)2) = √(α2) √((x1)2 + · · · + (xn)2) = |α| |x|. □


Definition. For x, y ∈ Rn, 〈x, y〉 = ∑_{i=1}^n xiyi. (This is frequently denoted x · y.)

Properties. For all x, y, z ∈ Rn and α, β ∈ R,

1. 〈x, y〉 = 〈y, x〉 (commutative).

2. 〈x + y, z〉 = 〈x, z〉 + 〈y, z〉, 〈x, y + z〉 = 〈x, y〉 + 〈x, z〉, 〈αx, y〉 = 〈x, αy〉 = α〈x, y〉 (bilinear).

3. 〈x, x〉 ≥ 0, with equality iff x = 0 (positive definite).

4. |x| = √〈x, x〉.

5. 〈x, y〉 = (|x + y|2 − |x − y|2)/4.

Proof of (5).

∑(xi + yi)2 − ∑(xi − yi)2 = ∑((xi)2 + (yi)2 + 2xiyi) − ∑((xi)2 + (yi)2 − 2xiyi)
                           = 4∑ xiyi
                           = 4〈x, y〉,

as desired. □
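The polarization identity in (5) is easy to test numerically; the following short Python sketch (an illustration, not from the notes; the vectors are arbitrary samples) checks it on random data.

    import numpy as np

    rng = np.random.default_rng(1)
    x, y = rng.normal(size=4), rng.normal(size=4)
    lhs = np.dot(x, y)
    rhs = (np.linalg.norm(x + y)**2 - np.linalg.norm(x - y)**2) / 4
    print(lhs, rhs)          # the two values agree up to rounding error
    assert abs(lhs - rhs) < 1e-10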

1.1 Notation and Linear Algebra

Let ei = (0, . . . , 0, 1, 0, . . . , 0), where the 1 is in the ith position. Then {ei} is a natural basis for Rn. In fact, (x1, . . . , xn) = ∑_{i=1}^n xi ei.

Let T : Rn → Rm be a linear transformation (i.e. for all x, y ∈ Rn and α, β ∈ R, T(αx + βy) = αT(x) + βT(y)). Then for each 1 ≤ i ≤ n, T(ei) ∈ Rm, so T(ei) = ∑_j aji ej for suitable aji. Consider the m × n matrix

A = [ a11 · · · a1n ]
    [  ⋮    ⋱    ⋮  ]
    [ am1 · · · amn ].

Note that

T((x1, . . . , xn)) = T(∑ xi ei) = ∑ xi T(ei)
                   = ∑_j ∑_i xi aji ej
                   = ∑_j ∑_i aji xi ej
                   = ∑_{j=1}^m yj ej
                   = (y1, . . . , ym).

The entries yj of T(x) ∈ Rm are given by the matrix product A · (x1, . . . , xn)T, i.e. A applied to the column vector of the xi. If S : Rm → Rp is another linear transformation, then the matrix of S ◦ T is BA, where B is the matrix of S.

1.2 Topology of Rn

Suppose A ⊆ Rn and B ⊆ Rm. Then A × B = {(x, y) | x ∈ A, y ∈ B} ⊆ Rn × Rm = Rn+m. If a, b ∈ R, let [a, b] = {x ∈ R | a ≤ x ≤ b} be the closed interval and (a, b) = {x ∈ R | a < x < b} be the open interval.

Definition. If [a1, b1], . . . , [an, bn] are closed intervals, then [a1, b1] × · · · × [an, bn] is a closed rectangle in Rn. Similarly, (a1, b1) × · · · × (an, bn) is an open rectangle.

Definition. For U ⊆ Rn, U is an open set if for all x ∈ U, there exists an open rectangle A so that x ∈ A ⊆ U.

Examples.

1. An open rectangle A is an open set. (Just use A.)


2. ∅ is open.

3. Rn is open: Let x ∈ Rn. Then x ∈ (x1 − 1, x1 + 1)× · · · × (xn − 1, xn + 1).

4. If {Uα} is a collection of open sets, then ⋃ Uα is open.

Alternate Definition. U ⊆ Rn is open iff there are open rectangles {Aα} so that U = ⋃ Aα. The reverse direction is proven by applying (1) and (4) above. For the forward direction, suppose U ⊆ Rn is open. For each x ∈ U, there is an open rectangle Ax so that x ∈ Ax ⊆ U. Consider ⋃_{x∈U} Ax. Then

U = ⋃_{x∈U} {x} ⊆ ⋃_{x∈U} Ax ⊆ U,

so U = ⋃_{x∈U} Ax.

Definition. A ⊆ Rn is closed if Rn − A is open.

Example. A closed rectangle is a closed set.

Proof. Let A = [a1, b1] × · · · × [an, bn] be any closed rectangle. Suppose x = (x1, . . . , xn) ∈ Rn − A, i.e. x ∉ A. Then for some i, xi ∉ [ai, bi]. Without loss of generality, xi > bi. Let ε = (xi − bi)/2. Use (x1 − ε, x1 + ε) × · · · × (xn − ε, xn + ε) ⊆ Rn − A. □

Let A ⊆ Rn be arbitrary and x ∈ Rn. Then x has exactly one of these three properties:

(i) There is some open rectangle R so that x ∈ R ⊆ A.

(ii) There is some open rectangle R so that x ∈ R ⊆ Rn − A.

(iii) For every open rectangle R with x ∈ R, both A ∩ R ≠ ∅ and (Rn − A) ∩ R ≠ ∅.

In case (i), x is in the interior of A. Note: int(A) is open.

In case (ii), x is in the exterior of A. ext(A) = int(Rn − A) is open.

In case (iii), x is in the boundary of A. bnd(A) = Rn − int(A) − ext(A) is closed.

A collection O of open sets is a cover of A ⊆ Rn if

(i) for each a ∈ A, there is a U ∈ O so that a ∈ U,

(ii) A ⊆ ⋃_{U∈O} U.

Example. O = {(1/n, 1) ⊆ R | n ∈ N} is an open cover of (0, 1).

Problem. Find a surprising open cover of [0, 1].

Answer. You won’t.

If O is a cover of A and O′ ⊆ O so that O′ is also a cover, then O′ is a subcover.

Definition. A is compact if for every open cover O of A, there is a finite subcover.

Example. If A is finite, then A is compact.

Proof. Suppose O is an open cover of A = {a1, . . . , am} ⊆ Rn. Then for each ai, there is a Ui so that ai ∈ Ui ∈ O. Then {U1, . . . , Um} ⊆ O is a finite cover. □

Theorem. (Heine-Borel) Any closed interval ⊆ R is compact.


1.3 Digression into R Properties

R is complete, i.e. it has the least upper bound property.

Definition. Suppose ∅ ≠ S ⊆ R. x ∈ R is an upper bound if for all s ∈ S, s ≤ x. x ∈ R is a least upper bound if

(i) x is an upper bound for S.

(ii) For each upper bound y for S, x ≤ y.

Axiom. For each ∅ ≠ S ⊆ R that has an upper bound, there is a least upper bound.

Example. Q doesn’t have this property: Let S ⊆ Q = {x ∈ Q | x2 < 2}. Then1 ∈ S and 2 is an upper bound, but there is no least upper bound.

Proof of Theorem. Let [a, b] be a closed interval and O be any open cover of [a, b]. Let S ⊆ R be {x | a ≤ x ≤ b and [a, x] is covered by a finite number of the O}.

1. a ∈ S since [a, a] = {a}. Since O is an open cover, there is some U ∈ O so that a ∈ U, i.e. [a, a] ⊆ U.

2. b is an upper bound for S by definition. Then by the completeness axiom thereis a least upper bound α.

3. We claim that α ∈ S. If α = a, then this follows from (1), so assume α > a. Then α ∈ U ∈ O, and U is open, so there is some ε > 0 so that α ∈ [α − ε, α + ε] ⊆ U. We may as well assume (by taking ε small enough) that a < α − ε. Notice that [a, α − ε] is covered by a finite number of the O: otherwise no x greater than α − ε has the property that [a, x] is covered by a finite number of the O's, so no element of S is greater than or equal to α − ε, and α − ε would be an upper bound for S smaller than α, a contradiction. So there are U1, . . . , Un ∈ O so that [a, α − ε] ⊆ U1 ∪ · · · ∪ Un, so [a, α] is covered by U1, . . . , Un, U, so α ∈ S.

4. We now claim that α = b, i.e. b ∈ S. If α < b, then notice that [a, α + ε) is covered by a finite number of O's, so α + ε/2 ∈ S (if α + ε/2 ≤ b). This contradicts that α is an upper bound for S, since either α + ε/2 or b is in S. Hence b ∈ S, so [a, b] is covered by a finite number of the O's. □

Goal. Show that if A and B are compact, then A×B is too.

Recall. U ⊆ Rn is open if for all x ∈ U, there is an open rectangle R so that x ∈ R ⊆ U.

Corollary. Suppose W is open in Rm × Rn and a ∈ Rm. Then W ∩ ({a} × Rn) is open in Rn, i.e. Wa = {y ∈ Rn | (a, y) ∈ W} is open in Rn.

Proof. Let b ∈ Wa, i.e. (a, b) ∈ W. Since W is open, there are open rectangles U ⊆ Rm and V ⊆ Rn so that (a, b) ∈ U × V ⊆ W, so {a} × V ⊆ W, so V ⊆ Wa, so Wa is open. □

Lemma. Suppose B ⊆ Rn is compact, a ∈ Rm, and O is an open cover of {a} × B. Then there is an open set U ⊆ Rm so that a ∈ U and U × B is covered by a finite number of O.

Proof. Given y ∈ B, there is some Wy ∈ O so that (a, y) ∈ Wy. Wy is open, so there are open rectangles Uy ⊆ Rm and Vy ⊆ Rn so that (a, y) ∈ Uy × Vy ⊆ Wy. As y varies over B, {Vy | y ∈ B} is an open cover of B, so there is a finite subcover. That is, there exist Vy1, . . . , Vyk so that B ⊆ Vy1 ∪ · · · ∪ Vyk. Let U ⊆ Rm be Uy1 ∩ · · · ∩ Uyk, which is open. Notice that a ∈ Uyi for each i, so a ∈ U. Also, U × B ⊆ Wy1 ∪ · · · ∪ Wyk: for (u, b) ∈ U × B, b ∈ Vyi for some i. Since u ∈ U, u ∈ Uyi, so (u, b) ∈ Uyi × Vyi ⊆ Wyi. Therefore Wy1, . . . , Wyk is a finite subcover of U × B. □

Theorem. If A ⊆ Rm is compact and B ⊆ Rn is compact, then A × B ⊆ Rm × Rn is compact.

Proof. Let O be an open cover of A × B. By the Lemma, for every a ∈ A there is an open set Ua so that a ∈ Ua and Ua × B is covered by a finite number of O's. Then {Ua} is an open cover of A. Since A is compact, {Ua1, . . . , Uaℓ} covers A, i.e. A ⊆ Ua1 ∪ · · · ∪ Uaℓ. Thus A × B ⊆ (Ua1 × B) ∪ · · · ∪ (Uaℓ × B) and is therefore covered by finitely many O's. □

Corollary. Any closed rectangle [a1, b1]× · · · × [an, bn] is compact.

Lemma. Suppose A ⊆ Rm is compact and C ⊆ A and C is a closed set in Rm. Then C is compact.

Proof. Let O be an open cover of C (C ⊆ ⋃_{U∈O} U). C is closed, so Rm − C is open, so O ∪ {Rm − C} is an open cover of A. Compactness implies that there exist U1, . . . , Uk ∈ O so that A ⊆ U1 ∪ · · · ∪ Uk ∪ (Rm − C). Then C ⊆ U1 ∪ · · · ∪ Uk ∪ (Rm − C). But the last term is redundant, so C ⊆ U1 ∪ · · · ∪ Uk. □

Corollary. C ⊆ Rm is compact iff C is closed and bounded.

Proof. For the reverse direction, let C be closed and bounded in Rm.

1. Since C is bounded, there exists a really big closed rectangle R so that C ⊆ R.

2. R is a closed rectangle, so R is compact by the last corollary.

3. R is compact and C is closed, so C is compact by the above Lemma.

The forward direction is a homework problem. □

1.4 Continuous Functions from Rm to Rn

Recall. Let A ⊆ R, and let f : A → R be a function.

1. For a ∈ R, lim_{x→a} f(x) = b means that for all ε > 0, there is a δ > 0 so that for all x ∈ A, if |x − a| < δ, then |f(x) − b| < ε.

2. f is continuous at a if limx→a f(x) = f(a).


3. f is continuous if it is continuous at every point in its domain.

We can extend the above definition to functions f : Rm → Rn if we let | · | mean the norm, as follows.

Definition. Let A ⊆ Rm and f : A → Rn. Then f is continuous if for every a ∈ A and ε > 0, there is a δ > 0 so that whenever x ∈ A satisfies |x − a| < δ, then |f(x) − f(a)| < ε.

If f : Rm → Rn, then for each x ∈ Rm, f(x) = (y1, . . . , yn). Call yi = fi(x). Then each fi : Rm → R, and f(x) = (f1(x), . . . , fn(x)).

Given n functions g1, . . . , gn : Rm → R, we can define f = (g1, . . . , gn) : Rm → Rn by setting its ith coordinate to gi. Thus fi = gi. For each 1 ≤ i ≤ m, let πi : Rm → R be πi(x1, . . . , xm) = xi, the ith projection function.

1.5 Facts about Functions

Let f : X → Y be a function. For any subset S ⊆ Y , f−1(S) = {x ∈ X | f(x) ∈ S}.

Note. Unless f is bijective, f−1 does not define a function.

Let P(X) denote the power set of X, the set of subsets of X. Then f−1 : P(Y) → P(X).

Property. f−1(⋃_α Sα) = ⋃_α f−1(Sα) and f−1(⋂_α Sα) = ⋂_α f−1(Sα). The analogous property for unions is true for f, but not the analogous property for intersections.

Proof. We show only that f(⋃_α Sα) = ⋃_α f(Sα). Let y ∈ f(⋃_α Sα). Then y = f(x) for some x ∈ Sα for some α, so y ∈ ⋃_α f(Sα). Conversely, if y ∈ ⋃_α f(Sα), then y = f(x) for some x ∈ Sα for some α, so y ∈ f(⋃_α Sα). □


Theorem. (The birth of topology) f : Rm → Rn is continuous iff for every open set V ⊆ Rn, f−1(V) is open in Rm. More generally, f : A → Rn is continuous iff for every open V ⊆ Rn, there is an open set U ⊆ Rm so that f−1(V) = U ∩ A.

Proof. Suppose f : A ⊆ Rm → Rn is continuous and V is open in Rn. Let a ∈ f−1(V). Then f(a) ∈ V, so there is an open rectangle R so that f(a) ∈ R ⊆ V. There is an ε > 0 so that whenever |y − f(a)| < ε, we have y ∈ R, hence y ∈ V. Continuity implies that there is a δ > 0 so that for all x ∈ A with |x − a| < δ, |f(x) − f(a)| < ε, so f(x) ∈ V, so x ∈ f−1(V); i.e. the open set {x | |x − a| < δ} meets A only inside f−1(V). So for all a ∈ f−1(V), there is an open Ua so that

1. a ∈ Ua,

2. Ua ∩ A ⊆ f−1(V).

Finally, let U = ⋃_{a∈f−1(V)} Ua. This proves the forward direction.

Now suppose that f : A → Rn satisfies the following property: for all open V ⊆ Rn, there is an open set U in Rm so that f−1(V) = U ∩ A. We will show that f is continuous in the ε-δ sense. Suppose a ∈ A and ε > 0 is given. Let V = {y ∈ Rn | |y − f(a)| < ε}. V is an open set, so there is an open set U ⊆ Rm so that f−1(V) = U ∩ A; note that a ∈ f−1(V) ⊆ U. Since U is open, there is an open rectangle R so that a ∈ R ⊆ U. Then there is a δ > 0 so that {x | |x − a| < δ} ⊆ R ⊆ U, which implies that if x ∈ A and |x − a| < δ, then x ∈ U ∩ A = f−1(V) (so f(x) ∈ V, and |f(x) − f(a)| < ε). Hence f is continuous on A. □

Corollary. Suppose f : A → Rn is continuous and C ⊆ A is compact. Then f(C) is compact.

Corollary. If f : R → R is continuous, then f([a, b]) is a closed and bounded set.

Proof. [a, b] is compact, so f([a, b]) is compact and hence closed and bounded. □


Theorem. If f : A ⊆ Rm → Rn is continuous and C ⊆ A is compact, then f(C) is compact.

Proof. Suppose O is an open cover of f(C). That is, f(C) ⊆ ⋃_{V∈O} V, i.e. C ⊆ f−1(⋃_{V∈O} V) = ⋃_{V∈O} f−1(V). For each V ∈ O, there is a UV open in Rm with UV ∩ A = f−1(V). Note that

(⋃_{V∈O} UV) ∩ A = ⋃_{V∈O} (UV ∩ A) = ⋃_{V∈O} f−1(V) ⊇ C,

so {UV} is an open cover of C. Since C is compact, it is covered by a finite number of the UVi's, i.e. C ⊆ UV1 ∪ · · · ∪ UVk. Since C ⊆ A, this gives

f(C) ⊆ f(UV1 ∩ A) ∪ · · · ∪ f(UVk ∩ A) = f(f−1(V1)) ∪ · · · ∪ f(f−1(Vk)) ⊆ V1 ∪ · · · ∪ Vk,

i.e. O has a finite subcover. □

Corollary. If f : Rm → Rn is continuous and A ⊆ Rm is closed and bounded, then f(A) is closed and bounded.

Corollary. If f : R → R is continuous, then f([a, b]) is a closed interval [c, d].


Chapter 2

Differentiation

Recall. f : R → R is differentiable at a ∈ R if

lim_{h→0} (f(a + h) − f(a))/h = λ

for some λ ∈ R.

Note. This definition makes no sense for a, h ∈ Rm and f : Rm → Rn.

Alteration. f is differentiable at a if there is a λ ∈ R so that

lim_{h→0} |f(a + h) − f(a) − λh| / |h| = 0.

Definition. A function f : Rm → Rn is differentiable at a ∈ Rm if there is a linear transformation T : Rm → Rn so that

lim_{h→0} |f(a + h) − f(a) − T(h)| / |h| = 0.

Denote this T by Df(a) : Rm → Rn. Then Df : Rm → L(Rm, Rn), the space of linear maps Rm → Rn.

Lemma. If T exists, then it is unique.


Proof. Suppose also S : Rm → Rn has the property that

lim_{h→0} |f(a + h) − f(a) − S(h)| / |h| = 0.

Then

lim_{h→0} |S(h) − T(h)| / |h| = lim_{h→0} |S(h) − f(a + h) + f(a) + f(a + h) − f(a) − T(h)| / |h|
≤ lim_{h→0} ( |S(h) − f(a + h) + f(a)| / |h| + |f(a + h) − f(a) − T(h)| / |h| ) = 0.

Now let x ≠ 0 ∈ Rm. Then

lim_{t→0} |S(tx) − T(tx)| / |tx| = 0,

and since S and T are linear,

lim_{t→0} |t(S(x) − T(x))| / |tx| = lim_{t→0} |S(x) − T(x)| / |x| = 0,

so S(x) = T(x) for all x, so S = T. □

Definition. The matrix of Df(a) is called the Jacobian matrix; the determinant of the Jacobian matrix is sometimes called the “Jacobian.”

Definition. f : X ⊆ Rm → Rn is differentiable if for each a ∈ X there is an open set U ⊆ Rm with a ∈ U and a differentiable function g : U → Rn so that g|U∩X = f|U∩X.

Theorem. (Chain Rule) Suppose f : Rm → Rn, g : Rn → Rp, f is differentiable at a ∈ Rm, and g is differentiable at f(a) = b ∈ Rn. Then g ◦ f is differentiable at a, and D(g ◦ f)(a) = Dg(b) ◦ Df(a).

Proof. Let λ = Df(a) and µ = Dg(b). We have to prove that

lim_{x→a} |gf(x) − gf(a) − µλ(x − a)| / |x − a| = 0.


We first need a lemma.

Lemma. Let T : Rm → Rn be a linear transformation. Then there is an M > 0 so that for all x ∈ Rm, |T(x)| < M|x|.

First Proof. Figure out what M needs to be from the entries of the matrix of T .

Second Proof. Let Sm−1 = {x ∈ Rm | |x| = 1}. Let f : Rm → R be f(x) = |x|. Then Sm−1 = f−1(1). Sm−1 is closed and bounded and hence compact, so T(Sm−1) is compact and hence bounded, so there is an M so that for all y ∈ T(Sm−1), |y| < M. Then for any x ≠ 0, |T(x)| = |x| |T(x/|x|)| < |x| M. □

We now return to the proof of the theorem. We have

|gf(x) − gf(a) − µλ(x − a)| / |x − a|
≤ |gf(x) − gf(a) − µ(f(x) − f(a))| / |x − a| + |µ[f(x) − f(a) − λ(x − a)]| / |x − a|,

and

|µ[f(x) − f(a) − λ(x − a)]| / |x − a| ≤ M |f(x) − f(a) − λ(x − a)| / |x − a| → 0.

We know that

lim_{y→b} |g(y) − g(b) − µ(y − b)| / |y − b| = 0.

In particular, for all ε > 0, there is a δ > 0 so that if |y − b| < δ, then |g(y) − g(b) − µ(y − b)| < ε|y − b|. Since f is continuous at a, there is a δ′ so that if |x − a| < δ′, then |f(x) − b| < δ, so

|gf(x) − g(b) − µ(f(x) − b)| < ε|f(x) − b|
= ε|f(x) − f(a) − λ(x − a) + λ(x − a)|
≤ ε|f(x) − f(a) − λ(x − a)| + εN|x − a|

for some N (using the Lemma for λ). Since

lim_{x→a} |f(x) − f(a) − λ(x − a)| / |x − a| = 0,

there is a δ′′ ≤ δ′ so that if |x − a| < δ′′, then |f(x) − f(a) − λ(x − a)| < |x − a|, so

|gf(x) − gf(a) − µ(f(x) − f(a))| / |x − a| ≤ ε(1 + N),

i.e.

lim_{x→a} |gf(x) − gf(a) − µ(f(x) − f(a))| / |x − a| = 0. □

1-Dimensional Case. Let f, g : R → R be differentiable at a and f(a), respectively. Then D(g ◦ f)(a) = Dg(f(a)) ◦ Df(a), which is multiplication by g′(f(a)) · f′(a).

Rule. (g ◦ f)′(a) = g′(f(a))f ′(a).

Product Rule. Composition with ρ : R2 → R, given by ρ(x, y) = xy.

Important Examples.

1. If f : Rm → Rn is constant (i.e. there is a c ∈ Rn so that for all x ∈ Rm, f(x) = c), then Df(a) = 0.

Proof.

lim_{h→0} |f(a + h) − f(a) − 0| / |h| = lim_{h→0} |c − c − 0| / |h| = lim_{h→0} 0/|h| = 0.

2. If f : Rm → Rn is a linear transformation, then Df(a) = f .

Proof.

lim_{h→0} |f(a + h) − f(a) − f(h)| / |h| = lim_{h→0} |f(a + h) − f(a + h)| / |h| = lim_{h→0} 0/|h| = 0.

3. f = (f1, . . . , fn). Then Df(a) = (Df1(a), . . . , Dfn(a)); that is, f′(a) is the matrix whose rows are (f1)′(a), . . . , (fn)′(a). Moreover, f is differentiable at a iff each fi is differentiable at a.


Proof. Suppose f is differentiable at a. Then fi = πi ◦ f, so

D(fi)(a) = D(πi f)(a) = D(πi)(f(a)) ◦ Df(a) = πi Df(a),

which is the ith entry (row) of Df(a). We have shown that D(fi)(a) exists. Now suppose each Dfi(a) exists. Consider

lim_{h→0} |f(a + h) − f(a) − (Df1(a)(h), . . . , Dfn(a)(h))| / |h| ≤ lim_{h→0} ∑_{i=1}^n |fi(a + h) − fi(a) − Dfi(a)(h)| / |h| = 0.

4. Let s : R2 → R be given by (x, y) ↦ x + y. Then Ds(a, b) = s.

Proof. s is a linear transformation, so by (2), Ds(a, b) = s.

5. Let p : R2 → R be given by (x, y) ↦ xy. Then Dp(a, b)(x, y) = bx + ay.

Proof.

lim_{(h,k)→0} |p(a + h, b + k) − p(a, b) − bh − ak| / |(h, k)|
= lim_{(h,k)→0} |(a + h)(b + k) − ab − bh − ak| / |(h, k)|
= lim_{(h,k)→0} |hk| / |(h, k)|
≤ lim_{(h,k)→0} (h2 + k2) / √(h2 + k2)
= lim_{(h,k)→0} √(h2 + k2)
= 0.
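The limit in Example 5 can also be checked numerically. The following Python sketch (an illustration only, not part of the notes; the point (a, b) and the path (h, k) → 0 are arbitrary choices) evaluates the difference quotient |p(a + h, b + k) − p(a, b) − (bh + ak)| / |(h, k)| for shrinking (h, k) and watches it tend to 0.

    import math

    a, b = 2.0, -3.0
    p = lambda x, y: x * y
    for t in [1e-1, 1e-3, 1e-5]:
        h, k = t, 2 * t                       # a particular path (h, k) -> 0
        num = abs(p(a + h, b + k) - p(a, b) - (b * h + a * k))
        print(num / math.hypot(h, k))         # ratios shrink like |(h, k)|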

6. Let q : R2 − {x-axis} → R be given by (x, y) ↦ x/y. Then Dq(a, b)(x, y) = (bx − ay)/b2.

Application. Suppose g, f : Rm → R are differentiable at a. Then

(i) D(f + g)(a) = Df(a) + Dg(a).

(ii) D(fg)(a) = g(a)Df(a) + f(a)Dg(a).


(iii) D(f/g)(a) = (g(a)Df(a) − f(a)Dg(a)) / [g(a)]2 if g(a) ≠ 0.

Proof.

(i) f + g = s ◦ (f, g), so

D(f + g)(a) = Ds(f(a), g(a)) ◦ (Df(a), Dg(a)) = s ◦ (Df(a), Dg(a)) = Df(a) + Dg(a).

(ii) fg = p ◦ (f, g), so

D(fg)(a) = Dp(f(a), g(a)) ◦ (Df(a), Dg(a)) = g(a)Df(a) + f(a)Dg(a).

(iii) f/g = q ◦ (f, g), so

D(f/g)(a) = Dq(f(a), g(a)) ◦ (Df(a), Dg(a)) = (g(a)Df(a) − f(a)Dg(a)) / [g(a)]2. □

Observation. (iii) shows that f/g is differentiable. If we knew that already, the calculation would be much easier:

Df = D((f/g) · g) = g · D(f/g) + (f/g) · Dg,

so g Df = g2 D(f/g) + f Dg, so D(f/g) = (g Df − f Dg)/g2.

So far we have seen some examples of derivatives, but we have not yet come up with a method for computing them. Let us now see how it is done. If f = (f1, . . . , fn), with each fi : Rm → R, then f′(a) is the matrix with rows Df1(a), . . . , Dfn(a), where each Dfi(a) is a linear transformation Rm → R. So we have reduced to the case f : Rm → R, where f′(a) is a 1 × m matrix. We have (f ◦ g)′(0) = f′(g(0)) · g′(0). We seek a g : R → Rm so that g(0) = a ∈ Rm and g′(0) is the column vector with a 1 in the jth position and 0's elsewhere, e.g. g(t) = (a1, . . . , aj + t, . . . , am). Then g(0) = (a1, . . . , am) = a and g′(0) is as required. Then the jth entry of f′(a) is

(d/dt)(f ◦ g)(0) = (d/dt) f(a1, . . . , aj + t, . . . , am) |_{t=0}
= lim_{t→0} (f(a1, . . . , aj + t, . . . , am) − f(a)) / t
= ∂f/∂xj (a) = Djf(a) ∈ R.

Theorem. If f is differentiable at a and Djf and Dif are differentiable and continuous in an open set containing a, then Di,jf(a) = Dj,if(a).

Corollary. If f has a maximum (or minimum) at a, then Djf(a) = 0 for all j.

Suppose f : R3 → R2 is given by f(x, y, z) = (x sin y, y + cos z). Then

f′(a) = [ D1f1  D2f1  D3f1 ]
        [ D1f2  D2f2  D3f2 ],

where D1f1 = sin y, D1f2 = 0, D2f1 = x cos y, D2f2 = 1, D3f1 = 0, and D3f2 = −sin z. Thus

f′(a1, a2, a3) = [ sin a2   a1 cos a2   0        ]
                 [ 0        1           −sin a3  ].
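As a check of the Jacobian just computed, here is a small Python sketch (not part of the notes; the test point a = (1.0, 0.5, 2.0) and the step size are arbitrary) comparing the analytic matrix with a finite-difference approximation.

    import numpy as np

    def f(v):
        x, y, z = v
        return np.array([x * np.sin(y), y + np.cos(z)])

    def analytic_jacobian(v):
        x, y, z = v
        return np.array([[np.sin(y), x * np.cos(y), 0.0],
                         [0.0,       1.0,           -np.sin(z)]])

    def numeric_jacobian(f, a, eps=1e-6):
        # forward differences in each coordinate direction
        cols = [(f(a + eps * e) - f(a)) / eps for e in np.eye(len(a))]
        return np.column_stack(cols)

    a = np.array([1.0, 0.5, 2.0])
    assert np.allclose(analytic_jacobian(a), numeric_jacobian(f, a), atol=1e-4)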

Now suppose f : R3 → R, with Df(a) = (D1f, D2f, D3f). Suppose f(x, y, z) = x^(yz). Then

∂f/∂x = yz x^(yz−1),   ∂f/∂y = x^(yz) (ln x) z,   ∂f/∂z = x^(yz) (ln x) y.


So far we have the following result:

Theorem. If f : Rm → Rn is differentiable at a ∈ Rm, then each Djfi(a) exists, and f′(a) = (aij), where aij = Djfi(a).

Converse Theorem. If each Djfi exists in an open set around a and is continuous at a, then f is differentiable at a.

Digression to one variable. If f : R → R is differentiable on [a, b], then there is a c ∈ (a, b) so that f′(c) = (f(b) − f(a))/(b − a). (Mean value theorem.)

Proof of Converse Theorem. Without loss of generality, take n = 1, so that f : Rm → R and Djf(x) exists in an open set around a ∈ Rm. The goal is to show that

lim_{h→0} |f(a + h) − f(a) − Df(a)(h)| / |h| = 0.

We have Df(a)(h) = ∑ hi Dif(a). Writing f(a + h) − f(a) as a telescoping sum of one-variable increments and applying the mean value theorem in each coordinate gives points bi (between a and a + h) with the ith increment equal to hi Dif(bi). Since we are given that Dif is continuous near a, we have Dif(bi) → Dif(a), i.e. lim_{h→0} |Dif(bi) − Dif(a)| = 0, which gives the result. □

Chain Rule for Engineers. Let g1, . . . , gm : R → R be differentiable at a and f : Rm → R differentiable at (g1(a), . . . , gm(a)). Consider (f ◦ G)(x) = f(g1(x), . . . , gm(x)) : R → R. Then

d(f ◦ G)/dx = (∂f/∂x1, . . . , ∂f/∂xm) · (dg1/dx, . . . , dgm/dx)T = ∑_{i=1}^m (∂f/∂xi)(dgi/dx).
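The engineer's chain rule above is easy to test numerically. The following Python sketch is my own illustration (not from the notes), using the arbitrary choices f(u, v) = u²v, g1(x) = sin x, g2(x) = x³; it compares ∑ (∂f/∂xi)(dgi/dx) with a finite-difference derivative of f(g1(x), g2(x)).

    import math

    f = lambda u, v: u**2 * v
    g1, g2 = math.sin, lambda x: x**3

    def chain_rule(x):
        u, v = g1(x), g2(x)
        df_du, df_dv = 2 * u * v, u**2          # partials of f at (g1(x), g2(x))
        dg1, dg2 = math.cos(x), 3 * x**2        # derivatives of g1, g2
        return df_du * dg1 + df_dv * dg2        # sum_i (df/dx_i)(dg_i/dx)

    def finite_difference(x, eps=1e-6):
        F = lambda t: f(g1(t), g2(t))
        return (F(x + eps) - F(x - eps)) / (2 * eps)

    x = 1.3
    print(chain_rule(x), finite_difference(x))  # the two values agree closely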


2.1 The Birth of Differential Topology

2.1.1 The Inverse Function Theorem

Suppose f : R → R is a differentiable function.

Question. When does it have an inverse? That is, when does there exist a function f−1 : R → R so that f ◦ f−1 = 1 = f−1 ◦ f?

(a) Suppose f(x) = ax + b, i.e. y = ax + b. Then x = (y − b)/a. Thus f has an inverse iff a ≠ 0. Then (f−1)′ = 1/a = 1/f′(x).

(b) Suppose, for all x, f′(x) ≠ 0. Then (without loss of generality) f is monotonically increasing. This is not enough: f may not be onto. We do, however, have f−1 : im(f) → R. For example, take f(x) = ex > 0. Then im(f) = (0, ∞). We have f−1 : (0, ∞) → R, given by f−1(x) = ln(x).

(c) Suppose all we have is f′(a) ≠ 0 (say f′(a) > 0) on some open set U containing a. Then f|U : U → W = f(U) has an inverse f−1 : W → U, called a local inverse.

Example. Let f(x) = sin(x). Then f′(0) = 1, and indeed for −π/2 < x < π/2, f′(x) = cos(x) > 0. We can define f−1 for f|(−π/2, π/2), called sin−1 : (−1, 1) → (−π/2, π/2).

What about f : Rn → Rn?

Conjecture. If f ′(a) 6= 0, then there is a local inverse at a.

Think about a linear transformation T : Rn → Rn, e.g. T : R2 → R2 given by the matrix

T = [ 0 0 ]
    [ 0 1 ].

Here T′(a) = T ≠ 0 for every a, yet T has no local inverse anywhere, so the conjecture is false.


Theorem. T : Rn → Rn is invertible iff det(T) ≠ 0.

Inverse Function Theorem. Let f : Rn → Rn be continuously differentiable near a ∈ Rn, and suppose det(f′(a)) ≠ 0. Then there are open sets U and W so that a ∈ U, f(a) ∈ W, and f|U : U → W has a differentiable inverse. (Note that Df−1(f(x)) = (Df(x))−1.)

Example. Define f : R → (0, ∞) by x ↦ ex. Then f−1(y) = ln y, and (f−1)′(y) = 1/f′(f−1(y)) = 1/e^(ln y) = 1/y. Thus d(ln y)/dy = 1/y.
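The formula (f−1)′(y) = 1/f′(f−1(y)) can also be checked numerically; here is a brief Python sketch (illustration only, not from the notes; the point y = 3.0 and step size are arbitrary) for f(x) = eˣ.

    import math

    f, f_inv = math.exp, math.log
    y = 3.0
    eps = 1e-6
    numeric = (f_inv(y + eps) - f_inv(y - eps)) / (2 * eps)   # derivative of ln at y
    formula = 1.0 / f(f_inv(y))                               # 1 / f'(f^{-1}(y)) = 1/y
    print(numeric, formula, 1.0 / y)                          # all three agree closely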

Lemma. Suppose f : A → Rn is a continuously differentiable function on a rectangle A ⊆ Rn. Suppose that on A, each |Djfi(x)| < M. Then for any x, y ∈ A, |f(x) − f(y)| < n2 M |y − x|.

Proof. Fill this in! □

Let f : Rm × Rn → R be bilinear. Show that it’s continuous.

f satisfies

f(x + h, y + k) = f(x, y + k) + f(h, y + k) = f(x, y) + f(x, k) + f(h, y) + f(h, k),

so

f(∑_{i=1}^m ai ei, ∑_{j=1}^n bj ej) = ∑_{i,j} ai bj f(ei, ej).

Let M = max_{i,j} |f(ei, ej)|. Then for all (x, y), |f(x, y)| ≤ mnM |x| |y|.

Proof. Let x = ∑ ai ei and y = ∑ bj ej. Then

|f(x, y)| ≤ ∑_{i,j} |ai| |bj| M ≤ ∑_{i,j} |x| |y| M = nm |x| |y| M.


We must now show that lim_{(h,k)→0} f(x + h, y + k) = f(x, y). We have

lim_{(h,k)→0} f(x + h, y + k) = lim_{(h,k)→0} [f(x, y) + f(x, k) + f(h, y) + f(h, k)]
= f(x, y) + lim_{(h,k)→0} [f(x, k) + f(h, y) + f(h, k)],

and each remaining term goes to 0 (for instance |f(h, k)| ≤ nmM |h| |k| → 0), so f is continuous. □

Secret Motivation. Let aij = f(ei, ej). Then f(x, y) = yT (aij) x.

Lemma. Let f : Rn → Rn be continuously differentiable throughout a rectangle A so that |Djfi(x)| ≤ M on A. Then for any x1, x2 ∈ A, |f(x1) − f(x2)| ≤ n2 M |x1 − x2|.

Theorem. Suppose f : Rn → Rn is continuously differentiable near a and det(f′(a)) ≠ 0. Then there are open sets U, W ⊆ Rn so that a ∈ U, f(a) ∈ W, and f|U : U → W has a differentiable inverse f−1 : W → U. Moreover, (f−1)′(y) = [f′(f−1(y))]−1.

Proof. Our goal is to show that there is a U on which f is injective. We first show that it suffices to treat the case where Df(a) is the identity. To see this, let λ = Df(a) : Rn → Rn. Since det(f′(a)) ≠ 0, λ is invertible by linear algebra. Let λ−1 : Rn → Rn be the inverse, and set g = λ−1 ◦ f. Then

Dg(a) = D(λ−1 f)(a) = Dλ−1(f(a)) Df(a) = λ−1 Df(a) = λ−1 λ = id.

Since we're assuming the theorem is known in this case, there are open sets U ∋ a and V ∋ g(a) so that g|U : U → V has a differentiable inverse g−1. Then g−1(λ−1 f) : U → U is the identity, so (g−1 λ−1) f : U → U is the identity. Then if we let W = λ(V), we have λ−1 f(U) = V, so f(U) = λ(V) = W. Then f : U → W has inverse g−1 λ−1, as required.


We now show that there is a neighborhood A of a so that for any x ∈ A − {a}, f(x) ≠ f(a). We have

lim_{x→a} |f(x) − f(a) − (x − a)| / |x − a| = 0.

Let

A = {x | |f(x) − f(a) − (x − a)| / |x − a| < 1}.

Then f(x) ≠ f(a) for x ∈ A − {a}.

We now show that we can find a (possibly smaller) set A so that

(i) A is a rectangle with a ∈ int(A),

(ii) for all x ∈ A, det f′(x) ≠ 0,

(iii) for all x ∈ A, |Djfi(x) − δij| < 1/(2n2).

To see this, we note that det f′(x) is a continuous function of x, since the entries Djfi(x) are continuous with respect to x and the determinant is just a polynomial in its entries. Then if g : Rn → R is given by g(x) = |det f′(x)|, then g−1((0, ∞)) = {x | det f′(x) ≠ 0} ∋ a is open.

We now show that for x1, x2 ∈ A, |f(x1) − f(x2)| ≥ |x1 − x2|/2. In particular, f is injective on A. To see this, let g(x) = f(x) − x. Then |Djgi(x)| = |Djfi(x) − δij| < 1/(2n2) by (iii), so by the Lemma, for all x1, x2 ∈ A,

|g(x1) − g(x2)| ≤ (1/(2n2)) n2 |x1 − x2| = (1/2)|x1 − x2|.

Then we have

(1/2)|x1 − x2| ≥ |g(x1) − g(x2)| = |(f(x1) − f(x2)) − (x1 − x2)| ≥ |x1 − x2| − |f(x1) − f(x2)|.


We then have

|f(x1) − f(x2)| ≥ |x1 − x2| − (1/2)|x1 − x2| = (1/2)|x1 − x2|.

Denote the boundary of A by ∂A. Then ∂A is compact; let g : ∂A → R be defined by g(x) = |f(x) − f(a)| > 0. This has a nonzero minimum d > 0. Let W = {y ∈ Rn | |y − f(a)| < d/2}.

We now show that W ⊆ f(A). Let y0 ∈ W. Let g : A → R be given by g(x) = |f(x) − y0|. Since A is compact, there is an x0 so that g achieves its minimum at x0. Note that |f(a) − y0| < |f(x) − y0| for any x ∈ ∂A, so g(a) < g(x) as long as x ∈ ∂A. Thus x0 ∉ ∂A, so x0 ∈ int(A). If g has a minimum at x0, so does g2 = ∑_{i=1}^n (fi(x) − y0i)2, so for all 1 ≤ j ≤ n, Dj(g2)(x0) = 0. We have

Dj(g2)(x0) = ∑_{i=1}^n 2(fi(x0) − y0i) Djfi(x0) = 0.

Since the matrix (Djfi(x0)) is invertible by (ii), this forces fi(x0) = y0i for all i, i.e. f(x0) = y0.

Let U = f−1(W) ∩ int(A). Then f|U : U → W is bijective, so f−1 : W → U exists.

We now show that f−1 is continuous. For all x1, x2 ∈ A, |f(x1) − f(x2)| ≥ |x1 − x2|/2. Now let x1 = f−1(y1) and x2 = f−1(y2) for any y1, y2 ∈ W. Then |y1 − y2| ≥ |f−1(y1) − f−1(y2)|/2, so |f−1(y1) − f−1(y2)| ≤ 2|y1 − y2|.

Finally, we show that f−1 is continuously differentiable. Let y0 ∈ W; we will show that f−1 is differentiable at y0. Let x0 = f−1(y0) and µ = Df(x0). Then

lim_{x→x0} |f(x) − f(x0) − µ(x − x0)| / |x − x0| = 0,

so

lim_{x→x0} |µ−1(f(x) − f(x0) − µ(x − x0))| / |x − x0| = 0.

Recall that there is an M > 0 so that for all y, |µ−1(y)| ≤ M|y|; hence whenever a quantity y satisfies |y|/g → 0 for a positive quantity g, we also have |µ−1(y)|/g → 0. Thus

lim_{x→x0} |µ−1(f(x) − f(x0)) − (x − x0)| / |x − x0| = 0.


Since f−1 is continuous, if y → y0 in W, then f−1(y) → f−1(y0). Now let y = f(x), so that x = f−1(y). Then

lim_{f−1(y)→f−1(y0)} |µ−1(y − y0) − (f−1(y) − f−1(y0))| / |f−1(y) − f−1(y0)| = 0,

so

lim_{y→y0} |µ−1(y − y0) − (f−1(y) − f−1(y0))| / |f−1(y) − f−1(y0)| = 0.

But that wasn’t the question we were trying to answer. However, we have

lim_{y→y0} |f−1(y) − f−1(y0) − µ−1(y − y0)| / |y − y0|
= lim_{y→y0} ( |µ−1(y − y0) − (f−1(y) − f−1(y0))| / |f−1(y) − f−1(y0)| ) · ( |f−1(y) − f−1(y0)| / |y − y0| ).

The first factor goes to 0, and the second is at most 2, so the entire expression goes to 0. Hence D(f−1)(y0) = µ−1 = (Df(f−1(y0)))−1. We have therefore proven the Inverse Function Theorem. □

Addendum. If f is continuously differentiable, then f is C1. If Df is C1, then f is C2. If f is Cn, then we can differentiate n times, and the results are continuous. f is C∞ if f is Cn for all n.

Question. Is f−1 continuously differentiable, i.e. C1? If f is Cn, is f−1 also Cn? (This is, of course, assuming that det f′(a) ≠ 0.)

Answer. Yes. We have shown that f has a local inverse if det f′(a) ≠ 0. We can regard Df : U → GLn(R). For all x ∈ U, Df(x) is invertible. We have shown that D(f−1)(y) = (Df(f−1(y)))−1.

Thus D(f−1) is the composition

W --f−1--> U --Df--> GLn(R) --INV--> GLn(R),

where INV : GLn(R) → GLn(R) is given by A ↦ A−1.

We know:

1. Compositions of continuous functions are continuous.


2. The chain rule says that D(g ◦ f) = Dg ◦ Df, so g ◦ f is Cn, i.e. the composition of Cn functions is Cn.

3. INV : GLn(R) → GLn(R) is C∞. Why are the entries in the matrix A−1 C∞ functions of the entries in A? Cramer's rule says that an entry is (up to sign) a ratio of determinants. Then D(f−1) is the composition of continuous functions f−1, Df, INV, so D(f−1) is continuous.

Corollary. If f is Cn for some n ≥ 1, so is f−1. If f is C∞, then so is f−1.

Proof. Suppose f is Cn for some n ≥ 1. Then f−1 is C1 by the above. Suppose f−1 is Cm for some m < n. Then D(f−1) is the composition of f−1, Df, and INV, so D(f−1) is Cm, so f−1 is Cm+1. □

Question. When is f−1(0) the graph of a function g : Rm → Rn?

Example. Let f : R3 → R be given by (x1, x2, x3) ↦ (x1)2 + (x2)2 + (x3)2 − 1. Then f−1(0) = {(x1, x2, x3) | (x1)2 + (x2)2 + (x3)2 = 1} = S2.

We now note an analogy between global linear algebra and local differential calculus:

1. Let A be a matrix. If det(A) ≠ 0, then A is invertible. On the other hand, we have the Inverse Function Theorem: if det f′(a) ≠ 0, then there are open sets U ∋ a and W ∋ f(a) so that f|U : U → W is invertible, and the inverse is continuously differentiable.

2. Suppose a linear transformation T : Rm × Rn → Rn has matrix (B C) with det C ≠ 0. Then there is an S : Rm → Rn so that (x, y) ∈ ker T iff T(x, y) = 0 iff y = S(x) iff (x, y) is in the graph of S. On the other hand, we have the Implicit Function Theorem: Suppose that f : Rm × Rn → Rn is continuously differentiable and f′(a, b) = (B C) with det C ≠ 0. Then there are open sets a ∈ A ⊆ Rm and b ∈ B ⊆ Rn so that for some differentiable g : A → B, f(x, y) = 0 in A × B iff y = g(x). (Locally, f−1(0) looks like the graph of a function.)


To prove (2), just let S(v) = −C−1 B v. For a more abstract proof, let A : Rm × Rn → Rm × Rn be A(x, y) = (x, T(x, y)). Then the matrix of A is

[ Im  0 ]
[ B   C ],

so det A = det C ≠ 0. Hence A−1 exists; A−1(x, y) = (x, R(x, y)) for some R. Let S : Rm → Rn be given by S(x) = R(x, 0). Let π : Rm × Rn → Rn be the projection, so πA = T and πA−1 = R. Then

y = π(x, y) = πAA−1(x, y) = (πA)A−1(x, y) = T(x, R(x, y)).

Taking y = 0 gives 0 = T(x, R(x, 0)) = T(x, S(x)), so if y = S(x), then T(x, y) = 0. Also,

y = π(x, y) = πA−1A(x, y) = (πA−1)A(x, y) = R(x, T(x, y)).

Thus if T(x, y) = 0, then y = R(x, 0) = S(x). Hence y = S(x) iff T(x, y) = 0.

Proof of the Implicit Function Theorem. Given f with f′(a, b) = (B C) with det C ≠ 0, let F : Rm × Rn → Rm × Rn be given by F(x, y) = (x, f(x, y)). Then

det F′(a, b) = det [ Im 0 ; B C ] ≠ 0.

Hence the inverse function theorem applies near (a, b): there are open sets U ∋ (a, b) and W ∋ (a, f(a, b)) so that F|U : U → W has a differentiable inverse h : W → U. Restrict to an open set A × B ⊆ U, and let W = F(A × B). Then F(x, y) = (x, f(x, y)), and h is its inverse: h(x, y) = (h1(x, y), h2(x, y)). Then

(x, y) = hF(x, y) = h(x, f(x, y)) = (h1(x, f(x, y)), h2(x, f(x, y))),

so x = h1(x, f(x, y)), so h(x, y) = (x, h2(x, y)). Hence πF = f and πh = h2. Let g : Rm → Rn be given by g(x) = h2(x, 0). □

Note.

y = π(x, y) = πFh(x, y) = fh(x, y) = f(x, h2(x, y)),

so 0 = f(x, h2(x, 0)) = f(x, g(x)); hence y = g(x) implies f(x, y) = 0. Also

y = π(x, y) = πhF(x, y) = h2(x, f(x, y)),

so if f(x, y) = 0, then y = h2(x, 0) = g(x); hence f(x, y) = 0 implies y = g(x).

Example. Let h : R2 → R be the height of the earth at latitude and longitude (x, y). Then for any height c, consider h − c: (h − c)−1(0) = {(x, y) | h(x, y) = c}, which is a curve on a topographical map.
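To make the Implicit Function Theorem concrete, here is a small Python sketch (my own illustration, not from the notes) for f(x, y) = x² + y² − 1 near the arbitrarily chosen point (a, b) = (0.6, 0.8): it writes down the local solution y = g(x) explicitly and checks that g′(x) = −D1f/D2f, which is what differentiating f(x, g(x)) = 0 gives.

    import math

    f = lambda x, y: x**2 + y**2 - 1.0
    g = lambda x: math.sqrt(1.0 - x**2)      # local solution with g(0.6) = 0.8 > 0

    x = 0.6
    eps = 1e-6
    g_prime_numeric = (g(x + eps) - g(x - eps)) / (2 * eps)
    D1f, D2f = 2 * x, 2 * g(x)               # partials of f at (x, g(x))
    g_prime_formula = -D1f / D2f             # implicit differentiation
    print(g_prime_numeric, g_prime_formula)  # both are -0.75 up to rounding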

Proposition. Suppose A : Rm → Rp is a linear transformation and is surjective. Then there is an invertible linear transformation M : Rm → Rm so that AM : Rm → Rp is just the projection π to the last p coordinates: AM(x1, . . . , xm) = (xm−p+1, . . . , xm).

Proof. If A is surjective, then its row rank is p, so its column rank is also p, so m ≥ p. If we rearrange the columns, then the last p are linearly independent, so we have A = (B C), with det C ≠ 0. If E is an elementary matrix, then AE switches two columns of A, and EA switches two rows of A. Let M1 be the product of elementary matrices doing the column permutation. We then have AM1 = (B C) with det C ≠ 0, so C−1 exists. Now let

M2 = [ I       0   ]
     [ −C−1B  C−1 ].

Then (B C) M2 = (0 Ip) = π. Then AM1M2 = π. □

Corollary. Suppose f : Rm → Rp is continuously differentiable near a ∈ Rm, f(a) = 0, and Df(a) : Rm → Rp is surjective. Then there is an open set U ⊆ Rm containing a, an open set V ⊆ Rm containing 0, and a continuously differentiable map h : V → U with continuously differentiable inverse h−1 : U → V so that fh : V → Rp is a projection.

Proof. Suppose first that f′(a) = (B C), where det C ≠ 0. Apply the Implicit Function Theorem (thinking of Rm = Rm−p × Rp → Rp via f). Then there is a neighborhood W of 0 in Rm and a neighborhood U × V of a in Rm, together with maps h : W → U × V and h−1 : U × V → W, so that f ◦ h = π.

Now we do the general case. Suppose f′(a) is surjective. By the Proposition above, there is an invertible matrix M1 so that f′(a)M1 = (B C) with det C ≠ 0. Let g : Rm → Rp be the map fM1, and let b = M1−1(a), so that M1(b) = a. Then

g′(b) = f′(M1(b)) · M1 = f′(a) · M1 = (B C),

with det C ≠ 0. By the first case, there is an h so that gh = π, and so (fM1)h = π, i.e. f(M1h) = π. Since h and M1 are invertible, so is M1h. □


Chapter 3

Manifolds

Definition. Let U, V ⊆ Rm be open sets and h : U → V be a continuously differentiable map with continuously differentiable inverse. Then h is called a diffeomorphism, and U and V are said to be diffeomorphic.

“Diffeomorphic” is an equivalence relation: clearly U is diffeomorphic to U. If h : U → V is a diffeomorphism, then h−1 : V → U is also a diffeomorphism. Finally, if we have diffeomorphisms h : U → V and j : V → W (with inverses h−1 and j−1), then jh : U → W is a diffeomorphism with inverse h−1j−1 : W → U.

Definition. Let M ⊆ Rn. We call M a differentiable k-manifold iff for each x ∈ M, there is a neighborhood U of x in Rn, a neighborhood V of 0 in Rn, and a diffeomorphism h : U → V so that h(U ∩ M) = V ∩ Rk.

Recall the following corollary of the Implicit Function Theorem: Suppose f : Rn → Rp, f(a) = 0, and f′(a) has rank p (i.e. Df(a) is onto). Then there is an open neighborhood U of a in Rn, an open set V ⊆ Rn, and a diffeomorphism h : V → U so that the composition f ◦ h = π : Rn → Rp, i.e. f ◦ h(x1, . . . , xn) = (xn−p+1, . . . , xn).

Corollary. In the above case,

(f ◦ h)−1(0) = π−1(0) ∩ V = (Rn−p × {0}) ∩ V, and U ∩ f−1(0) = h((Rn−p × {0}) ∩ V).

Let M = f−1(0). Then U ∩ M is an (n − p)-manifold.

Definition. A continuously differentiable function f : Rn → Rp has c ∈ Rp as a regular value if for all x ∈ f−1(c), Df(x) is surjective.

Corollary. If c is a regular value, then f−1(c) is an (n− p)-dimensional manifold.

Example 1. Sn = {(x0, . . . , xn) ∈ Rn+1 | ∑_{i=0}^n (xi)2 = 1} is an n-manifold. Let f : Rn+1 → R be given by (x0, . . . , xn) ↦ ∑(xi)2. Then Sn = f−1(1). To show Sn is an n-manifold, it suffices to show that 1 is a regular value for f. Let a ∈ f−1(1) = Sn, i.e. ∑(ai)2 = 1. Then

f′(a) = (D0f(a) · · · Dnf(a)) = (2a0 · · · 2an) ≠ 0,

so this is of maximal rank 1.
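For n = 2 this regular-value check is easy to carry out numerically; the following Python sketch (illustration only, not from the notes; the sample points are arbitrary) samples points of S² and verifies that f′(a) = 2a is never the zero row vector there, so Df(a) : R³ → R is surjective.

    import numpy as np

    rng = np.random.default_rng(2)
    f_prime = lambda a: 2 * a                      # gradient of the sum of squares
    for _ in range(1000):
        a = rng.normal(size=3)
        a /= np.linalg.norm(a)                     # a point of S^2, so f(a) = 1
        assert np.linalg.norm(f_prime(a)) > 0      # Df(a) is onto R (rank 1)
    print("1 is a regular value at all sampled points of S^2")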

Example 2. Let Mn = {n × n real matrices} ≅ Rn². Let GLn = {A ∈ Mn | det A ≠ 0}. Then if θ : Mn → R is given by A ↦ det(A), then GLn = θ−1(R − {0}) is open in Rn², so it is an n²-manifold. Let SLn = {A ∈ Mn | det A = 1} = θ−1(1). If we can show that 1 is a regular value of θ, then SLn will be an (n² − 1)-manifold. To see that 1 is a regular value, we must show that if A ∈ SLn, then Dθ(A) ≠ 0. If A = In, then we can easily calculate D(1,1)θ(In):

lim_{t→0} [θ(diag(1 + t, 1, . . . , 1)) − θ(In)] / t = lim_{t→0} (1 + t − 1)/t = 1 ≠ 0.

Hence In is a regular point. Now suppose A ∈ SLn is arbitrary. Consider α : Mn → Mn given by α(B) = AB. Then α−1(B) = A−1B, α(B + C) = α(B) + α(C), and α(cB) = cα(B). Note that

θα(B) = θ(AB) = det(AB) = det(A) det(B) = det(B) = θ(B).

Hence θα = θ. By the chain rule, Dθ(α(B)) ◦ Dα(B) = D(θα)(B) = Dθ(B). Evaluating at B = A−1, and using α(A−1) = In and Dα(A−1) = α (since α is linear), we get

Dθ(In) ◦ α = Dθ(A−1),

which has maximal rank. We have shown that for all A ∈ SLn, Dθ(A−1) has maximal rank. Since A ∈ SLn iff A−1 ∈ SLn, Dθ(A) also has maximal rank. Hence SLn is an (n² − 1)-manifold.

Definition. Let On = {A ∈ Mn | ATA = In} denote the set of orthogonal matrices (so called because the columns are orthogonal, as are the rows).

We will show that On is an n(n−1)/2-manifold. Let θ : Mn → Mn be given by A ↦ ATA. Then θ−1(In) = On. Unfortunately, In is not a regular value.

Let Sn = {B ∈ Mn | B = BT}, i.e. for all i, j, bij = bji. Note that im(θ) ⊆ Sn.

Fact. (AB)T = BT AT .

Thus [θ(A)]T = [ATA]T = AT(AT)T = ATA = θ(A), so θ(A) ∈ Sn. We have Sn ≅ Rn(n+1)/2. We now show that θ : Mn → Sn has In for a regular value, so θ−1(In) = On is an (n² − n(n+1)/2) = n(n−1)/2-manifold.

We first calculate Dθ(In) : Rn² → Rn(n+1)/2, i.e. Dθ(In) : Mn → Sn. We must find a λ so that

lim_{B→0} |θ(I + B) − θ(I) − λ(B)| / |B| = 0.

We have θ(I + B) = (I + B)T(I + B) = I + BT + B + BTB. Then θ(I + B) − θ(I) = BT + B + BTB. Let λ : Mn → Sn be given by B ↦ B + BT. Since lim_{B→0} |BTB|/|B| = 0, we have Dθ(In)(B) = B + BT. For any C ∈ Sn, we have Dθ(In)((1/2)C) = (1/2)(C + CT) = C. Then Dθ(In) : Mn → Sn is surjective, as required.

Now let A ∈ On, so that ATA = In (or AT = A−1). Let α : Mn → Mn be given by α(B) = AB. Note that

θα(B) = θ(AB) = (AB)TAB = BTATAB = BTB = θ(B)

for all B ∈ Mn, i.e. θα = θ : Mn → Sn. By the chain rule, we have Dθ(α(In)) ◦ Dα(In) = Dθ(In), which is surjective. Furthermore, Dθ(α(In)) = Dθ(A). Thus for any A ∈ θ−1(In) = On, Dθ(A) is surjective, i.e. In is a regular value for θ. Hence On is an n(n−1)/2-manifold.

Theorem. (Sard) Almost all y ∈ Rp are regular values, i.e. {y | y is not a regular value} has measure zero.

Corollary. Suppose Mm and Nn are manifolds, and f : Mm → Nn is continuously differentiable. Then for almost all y ∈ N, f−1(y) is an (m − n)-manifold.

Corollary. (Brouwer Fixed Point Theorem) Given any continuously differentiable map f : Bn → Bn (where Bn = {(x1, . . . , xn) ∈ Rn | (x1)2 + · · · + (xn)2 ≤ 1}), there is a point x so that f(x) = x.

Proof. Suppose for all x, f(x) ≠ x. Define g : Bn → Sn−1 as follows: draw the ray beginning at f(x) and passing through x. This ray meets Sn−1 at a point. This point is g(x). Then g|Sn−1 is the identity map. By the previous corollary, this is impossible. □

More generally, if M is a compact m-manifold with boundary, there is no differentiable retraction M → ∂M.

Clunky Version. Imagine that there is a differentiable retraction ρ : M → ∂M. Let JM = M − ∂M, and define Jρ : JM → ∂M. For most points y ∈ ∂M, (Jρ)−1(y) is a 1-manifold. But it is easy to write a list of all compact 1-manifolds: one circle, two circles, three circles, etc. Then (Jρ)−1(y) is a set of circles, which we denote by C. The cardinality of C ∩ ∂M is even. But if x ∈ C ∩ ∂M, then x = Jρ(x) = y. Hence there is only one point in C ∩ ∂M. But one isn't even, so we have a contradiction.
