
AN INTRODUCTION TO DIFFERENTIAL GEOMETRY

EUGENE LERMAN

Contents

1. Introduction: why manifolds?
2. Smooth manifolds
   2.1. Digression: smooth maps from open subsets of $\mathbb{R}^n$ to $\mathbb{R}^m$
   2.2. Definitions and examples of manifolds
   2.3. Maps of manifolds
   2.4. Partitions of unity
3. Tangent vectors and tangent spaces
   3.1. Tangent vectors and tangent spaces
   3.2. Digression: vector spaces and their duals
   3.3. Differentials
   3.4. The tangent bundle
   3.5. The cotangent bundle
   3.6. Vector fields
4. Submanifolds and the implicit function theorem
   4.1. The inverse function theorem and a few of its consequences
   4.2. Transversality
   4.3. Embeddings, Immersions, and Rank
5. Vector fields and flows
   5.1. Definitions, examples, correspondence between vector fields and flows
   5.2. The geometry of the Lie bracket
   5.3. Map-related vector fields
6. (Multi)linear algebra
   6.1. Tensor products
   6.2. The Grassmann (exterior) algebra and alternating maps
   6.3. Pairings
7. Differential forms and integration
   7.1. Motivation
   7.2. Pullback of differential forms
   7.3. Integration
8. Vector bundles
   8.1. Sections
   8.2. Frames and local frames
   8.3. Vector bundles via transition maps
9. Exterior differentiation, contractions and Lie derivatives of forms
   9.1. Exterior differentiation
   9.2. Contractions of forms and vector fields
   9.3. Lie derivatives of differential forms
   9.4. de Rham cohomology
10. Stokes's theorem
11. Connections on vector bundles
   11.1. Connections
   11.2. Parallel Transport
12. Riemannian geometry
   12.1. Levi-Civita connection
         Fiber metrics
   12.2. Connections induced on submanifolds
   12.3. The second fundamental form of an embedding
13. Geodesics as critical points of the energy functional

typeset August 19, 2011.

1. Introduction: why manifolds?

There are many different ways to formulate mathematically the notion of a ‘space’ that occurs in different branches of science and engineering. For instance, one can talk about the space of configurations of a physical system. This, of course, requires a decision as to the level of detail one is trying to model. For example, we can regard the configuration space of a system consisting of a sun and a planet as $\mathbb{R}^3 \times \mathbb{R}^3$. We use three real numbers to describe the position of the center of mass of the sun and three real numbers to describe the position of the center of mass of the planet. In this model we assume that the sun and the planet are simply two points in space. We also allow collisions. If we exclude collisions (but still allow the sun and the planet to come arbitrarily close to each other), the configuration space is then

\[ Q = \{(x, y) \in \mathbb{R}^3 \times \mathbb{R}^3 \mid x \neq y\}. \]

Here is another idealized example: the configuration space of a penny tumbling through the air. Fix a frame of reference. We will need a triple of real numbers to describe the position of the penny’s center of gravity and three orthonormal vectors to describe the orientation of the penny. Thus the configuration space in question is

\[ Q = \mathbb{R}^3 \times O(3), \]

where $O(3)$ denotes the set of $3 \times 3$ orthogonal matrices (recall that an $n \times n$ matrix is orthogonal if (and only if) its columns form an orthonormal basis of $\mathbb{R}^n$).¹

Exercise 1.1. What is the configuration space of a penny rolling on a plane?

Manifolds constitute a particular way to formalize the notion of a configuration space. These are the spaces that “locally look like $\mathbb{R}^n$.” The reason we will limit ourselves to manifolds is that they are particularly suitable for generalizing the ideas of calculus: differentiation and integration. We will see that the two examples of configuration spaces given above, $Q = \{(x, y) \in \mathbb{R}^3 \times \mathbb{R}^3 \mid x \neq y\}$ and $Q = \mathbb{R}^3 \times O(3)$, are indeed manifolds.

Remark 1.1. There are, of course, many other notions of a “space.” In linear algebra one studies vector spaces and maps between them. In algebraic geometry one studies spaces of solutions of polynomial equations, which give rise to the notion of an algebraic variety. In metric topology/geometry one studies metric spaces, spaces with a notion of a distance. In point set topology and in algebraic topology one talks about topological spaces. In analysis one may study the space of solutions of a partial differential equation. In geometry and topology one may be forced to study spaces that have singularities, such as orbifolds and stratified spaces. Before we can discuss orbifolds and more complicated spaces we should first come to terms with manifolds, which are smooth.

2. Smooth manifolds

2.1. Digression: smooth maps from open subsets of $\mathbb{R}^n$ to $\mathbb{R}^m$. We start out by recalling the definition of a differentiable map.

Definition 2.1. Let $U \subset \mathbb{R}^n$ be an open subset. A map $f : U \to \mathbb{R}^m$ is differentiable at a point $x \in U$ if there is a linear map $L : \mathbb{R}^n \to \mathbb{R}^m$ so that

\[ \lim_{h \to 0} \frac{1}{\|h\|} \bigl( f(x + h) - f(x) - Lh \bigr) = 0. \]

It is not hard to show that if such a map $L$ exists, it is unique. The linear map $L$ is variously called the derivative of $f$ at $x$, the differential of $f$ at $x$, ... and is denoted by $df_x$ or by $Df_x$ or by $Df(x)$ or by a similar notation. Moreover, the matrix corresponding to $L$ with respect to the standard bases of $\mathbb{R}^n$ and $\mathbb{R}^m$ is the so-called Jacobian matrix. That is, if $f = (f_1, \ldots, f_m)$ then

\[
Df_x =
\begin{pmatrix}
\frac{\partial f_1}{\partial x_1}(x) & \cdots & \frac{\partial f_1}{\partial x_n}(x) \\
\vdots & & \vdots \\
\frac{\partial f_m}{\partial x_1}(x) & \cdots & \frac{\partial f_m}{\partial x_n}(x)
\end{pmatrix}.
\]
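Definition 2.1 can be illustrated numerically. The following is a minimal sketch, not part of the original notes: the map `f` and the base point are arbitrary illustrative choices, and the Jacobian is computed by hand. It evaluates the quotient $\|f(x+h) - f(x) - Lh\| / \|h\|$ for shrinking $h$, which should tend to $0$ when $L = Df_x$.

```python
import math

def f(x, y):
    """Sample smooth map f : R^2 -> R^2 (an illustrative choice, not from the notes)."""
    return (x * x * y, math.sin(x) + y)

def jacobian(x, y):
    """Jacobian matrix of f at (x, y), written out from the partial derivatives."""
    return ((2 * x * y, x * x),
            (math.cos(x), 1.0))

def remainder_ratio(point, h):
    """Return ||f(x + h) - f(x) - L h|| / ||h|| with L = Df_x."""
    x, y = point
    h1, h2 = h
    L = jacobian(x, y)
    fx = f(x, y)
    fxh = f(x + h1, y + h2)
    Lh = (L[0][0] * h1 + L[0][1] * h2, L[1][0] * h1 + L[1][1] * h2)
    err = math.hypot(fxh[0] - fx[0] - Lh[0], fxh[1] - fx[1] - Lh[1])
    return err / math.hypot(h1, h2)

# Ratios for h = (10^-k, -10^-k), k = 1, ..., 5; they should shrink with h.
ratios = [remainder_ratio((1.0, 2.0), (10.0 ** -k, -(10.0 ** -k))) for k in range(1, 6)]
```

The ratios decay roughly linearly in $\|h\|$, which is what one expects for a map with continuous second derivatives.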

¹ Strictly speaking the configuration space is $\mathbb{R}^3 \times SO(3)$, where $SO(3)$ denotes the set of orthogonal matrices with positive determinant. Why?


Definition 2.2. Let $U \subset \mathbb{R}^n$ be an open subset. A map $f : U \to \mathbb{R}^m$ is smooth (or $C^\infty$) on the set $U$ if all partial derivatives of $f$ to all orders exist at all points of $U$.

Here is a more “sophisticated” version of the definition above. Suppose $f : U \to \mathbb{R}^m$ is differentiable at all points of $U$. Then we have a map $g : U \to \mathbb{R}^{nm}$, $g(x) := Df_x$. We can require that $g$ is differentiable as a map from $U$ to $\mathbb{R}^{nm}$. The derivative of $g$ is a map from $U$ to a bigger vector space $\mathbb{R}^N$ for an appropriate $N$. We can require that this map is differentiable, and so on. In other words, if all derivatives of $f : U \to \mathbb{R}^m$ exist and are differentiable, we say that $f$ is smooth.

2.2. Definitions and examples of manifolds. A smooth manifold is a generalization of a smooth surface in $\mathbb{R}^3$. A smooth surface $S \subset \mathbb{R}^3$ has local parameterizations: for every point $p \in S$ there is an open set $V \subset \mathbb{R}^3$ with $p \in V$ and a map $x : U \to S \cap V$ (where $U \subset \mathbb{R}^2$ is an open set) such that

(1) $x$ is $C^\infty$. That is, $x(u_1, u_2) = (x_1(u_1, u_2), x_2(u_1, u_2), x_3(u_1, u_2))$ and each $x_i(u_1, u_2)$, $1 \le i \le 3$, is an infinitely differentiable function of $u = (u_1, u_2) \in U$;

(2) $x$ is 1-1 (injective) and onto.

The map $x$ is a local parameterization of $S$.

Example 2.3. The two sphere

\[ S^2 = \{x \in \mathbb{R}^3 \mid \|x\| = 1\} \]

is a smooth surface: if $p = (p_1, p_2, p_3) \in S^2$ and $p_3 > 0$ take $V = \{x \in \mathbb{R}^3 \mid x_3 > 0\}$, $U = \{(u_1, u_2) \mid \|u\| < 1\}$ and a local parameterization $x : U \to S^2 \cap V$ to be $x(u_1, u_2) = (u_1, u_2, \sqrt{1 - u_1^2 - u_2^2})$. It’s easy to check that this $x$ is 1-1, onto and $C^\infty$. If $p_3 < 0$ take the local parameterization $x(u) = (u_1, u_2, -\sqrt{1 - u_1^2 - u_2^2})$. If $p_3 = 0$ then either $p_1$ or $p_2$ is non-zero (or both) and there are formulas for local parameterizations similar to the ones above.

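Note that if $S$ is a smooth surface and $x_\alpha : \mathbb{R}^2 \supset U_\alpha \to S$ and $x_\beta : \mathbb{R}^2 \supset U_\beta \to S$ are two local parameterizations with

\[ W_{\alpha\beta} := x_\alpha(U_\alpha) \cap x_\beta(U_\beta) \neq \emptyset \]

then

\[ x_\beta^{-1} \circ x_\alpha : \mathbb{R}^2 \supset x_\alpha^{-1}(W_{\alpha\beta}) \to x_\beta^{-1}(W_{\alpha\beta}) \subset \mathbb{R}^2 \]

is $C^\infty$.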

This motivates:

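Definition 2.4. [of a $C^\infty$ manifold, first approximation, not quite right] A $C^\infty$ manifold of dimension $m$ is a set $M$ and a family of injective maps $\{x_\alpha : U_\alpha \to M\}$, where $U_\alpha \subset \mathbb{R}^m$ are open sets, such that

(1) $\bigcup_\alpha x_\alpha(U_\alpha) = M$;

(2) if for some pair of indices $\alpha$ and $\beta$ the set $W_{\alpha\beta} := x_\alpha(U_\alpha) \cap x_\beta(U_\beta) \neq \emptyset$, then $x_\alpha^{-1}(W_{\alpha\beta})$ and $x_\beta^{-1}(W_{\alpha\beta})$ are open in $\mathbb{R}^m$ and

\[ x_\beta^{-1} \circ x_\alpha : x_\alpha^{-1}(W_{\alpha\beta}) \to x_\beta^{-1}(W_{\alpha\beta}) \]

is $C^\infty$.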

One thing that is wrong with this definition is that there is no topology specified on $M$. The other is that instead of parameterizations one usually works with charts, which go the other way. Namely:

Definition 2.5 (Chart). Let $X$ be a topological space. An $\mathbb{R}^n$ (coordinate) chart on $X$ is a homeomorphism $\varphi : X \supset U \to U' \subset \mathbb{R}^n$.

Notation. We will often write $\varphi : U \to \mathbb{R}^n$ or even $(U, \varphi)$ for a coordinate chart $\varphi : X \supset U \to U' \subset \mathbb{R}^n$. Note that since $\varphi$ takes values in $\mathbb{R}^n$, it is an $n$-tuple of functions $\varphi = (x_1, \ldots, x_n)$ for some functions $x_i : U \to \mathbb{R}$, the coordinate functions on $U$ associated to the coordinate chart $\varphi : U \to \mathbb{R}^n$.

Notation. When dealing with charts it will be convenient to adopt the notation where the standard coordinate functions on $\mathbb{R}^n$ are denoted by $r_i$, $1 \le i \le n$. That is, $r_i$ assigns to a point $a = (a_1, \ldots, a_n) \in \mathbb{R}^n$ the number $a_i$. If $\varphi : U \to \mathbb{R}^n$ is a chart then

\[ x_i = r_i \circ \varphi : U \to \mathbb{R} \]

are the coordinate functions on $U$.

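Definition 2.6 (Atlas). A $C^\infty$ atlas on a topological space $X$ is a collection of charts $\{\varphi_\alpha : U_\alpha \to U'_\alpha\}$ (with all the $U'_\alpha$ being open subsets of one fixed $\mathbb{R}^n$) such that

(1) $\{U_\alpha\}$ is an open cover of $X$,² and

(2) if $U_\alpha \cap U_\beta \neq \emptyset$, then $\varphi_\beta \circ \varphi_\alpha^{-1} : \varphi_\alpha(U_\alpha \cap U_\beta) \to \varphi_\beta(U_\alpha \cap U_\beta)$ is $C^\infty$ as a map from an open subset of $\mathbb{R}^n$ to $\mathbb{R}^n$. That is, changes of coordinates are smooth.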

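Example 2.7. The identity map $f : \mathbb{R} \to \mathbb{R}$, $f(x) = x$, is the standard chart on $\mathbb{R}$. The set $\{(f, \mathbb{R})\}$ consisting of one chart is an atlas on $\mathbb{R}$. The map $g : \mathbb{R} \to \mathbb{R}$, $g(x) = x^3$, is also a chart on $\mathbb{R}$; it defines a different atlas on $\mathbb{R}$.

Here is a third atlas on $\mathbb{R}$. For each integer $n \in \mathbb{Z}$, $\varphi_n : (n, n + 2) \to \mathbb{R}$, $\varphi_n(x) = x$, is a chart. The set $\{(\varphi_n, (n, n + 2))\}_{n \in \mathbb{Z}}$ is an atlas on $\mathbb{R}$.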

Definition 2.8. We say that two atlases are equivalent if their union is also an atlas.

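The definition above amounts to: an atlas $\{x_\alpha : U_\alpha \to U'_\alpha\}$ is equivalent to an atlas $\{y_\beta : V_\beta \to V'_\beta\}$ if for any indices $\alpha, \beta$ with $U_\alpha \cap V_\beta \neq \emptyset$ the map $x_\alpha \circ y_\beta^{-1} : y_\beta(U_\alpha \cap V_\beta) \to x_\alpha(U_\alpha \cap V_\beta)$ and its inverse $y_\beta \circ x_\alpha^{-1}$ are both smooth. One can easily verify that this is indeed an equivalence relation.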

Exercise 2.1. Convince yourself that the first and the third atlases in Example 2.7 are equivalent. Show that the first and the second example of atlases are not equivalent.

Definition 2.9 (Manifold). An $n$-dimensional ($C^\infty$) manifold is a topological space $M$ together with an equivalence class of $C^\infty$ atlases.

Notation. We will denote the manifold and the underlying topological space by the same letter, with the equivalence class of atlases usually understood.

Example 2.10. Let $M = \mathbb{R}^n$. We cover $M$ by one open set and take the identity map as our chart. This is the standard manifold structure on $\mathbb{R}^n$.

Example 2.11. Let $M = \mathbb{C}^n$. Again we cover $\mathbb{C}^n$ by one open set $U = \mathbb{C}^n$, and take as our coordinate chart the map $\varphi : \mathbb{C}^n \to \mathbb{R}^{2n}$ which is given by

\[ \varphi(z_1, \ldots, z_n) = (\operatorname{Re} z_1, \operatorname{Im} z_1, \ldots). \]

Example 2.12. If $M$ is a manifold, and $V \subset M$ is an open subset, then $V$ is naturally a manifold. Check this!

Example 2.13. The set $M_n(\mathbb{R})$ of $n \times n$ matrices with real coefficients is a manifold, since it is $\mathbb{R}^{n^2}$.

The subset $GL(n, \mathbb{R}) \subset M_n(\mathbb{R})$ of invertible matrices is an open subset: a matrix $A$ is invertible if and only if its determinant is non-zero, and the determinant $\det : M_n(\mathbb{R}) \to \mathbb{R}$ is a polynomial map, hence continuous. Hence the subset $\{A \in M_n(\mathbb{R}) \mid \det A \neq 0\}$ is open. So by the previous example, $GL(n, \mathbb{R})$ is a manifold.

Example 2.14. The two-sphere $S^2 := \{x \in \mathbb{R}^3 \mid \|x\|^2 = 1\}$ is a manifold. To see this, we give $S^2$ the subspace topology that it inherits as a subset of $\mathbb{R}^3$. Next we define charts. To do this, let

\[ U_i^+ = \{x = (x_1, x_2, x_3) \in S^2 : x_i > 0\} \]

and

\[ U_i^- = \{x = (x_1, x_2, x_3) \in S^2 : x_i < 0\}, \]

$i = 1, 2, 3$ (6 charts altogether), which gives us an open cover of $S^2$. Define $\varphi_1^\pm(x) = (x_2, x_3)$, $\varphi_2^\pm(x) = (x_1, x_3)$, and $\varphi_3^\pm(x) = (x_1, x_2)$.

We need to verify that changes of coordinates are smooth. Consider, for example,

\[ \varphi_2^+ \circ (\varphi_1^+)^{-1}(u_1, u_2) = \Bigl(\sqrt{1 - u_1^2 - u_2^2},\; u_2\Bigr), \]

which is smooth in its region of definition. The other compositions yield similar results. It follows that $S^2$ is indeed a manifold.
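The change of coordinates in Example 2.14 can be checked on a sample point. This is only a sketch: the function names `phi`, `phi_inv`, and `transition_2_1` are illustrative helpers, not notation from the notes, and the test point is arbitrary.

```python
import math

def phi(i, x):
    """Chart phi_i^{+/-}: drop the i-th coordinate (i = 1, 2, 3) of a point x on S^2."""
    return tuple(c for k, c in enumerate(x, start=1) if k != i)

def phi_inv(i, sign, u):
    """Inverse of phi_i^{sign}: reinsert the dropped coordinate as sign * sqrt(1 - |u|^2)."""
    u1, u2 = u
    missing = sign * math.sqrt(1.0 - u1 * u1 - u2 * u2)
    coords = [u1, u2]
    coords.insert(i - 1, missing)
    return tuple(coords)

def transition_2_1(u):
    """phi_2^+ o (phi_1^+)^{-1}; meaningful only where the point also lies in U_2^+."""
    return phi(2, phi_inv(1, +1, u))

u = (0.3, 0.4)                 # u_1 > 0, so the point lies in U_1^+ and U_2^+
point = phi_inv(1, +1, u)      # the corresponding point on S^2
image = transition_2_1(u)      # should equal (sqrt(1 - u_1^2 - u_2^2), u_2)
```

Dropping and reinserting a coordinate is all that the six charts do, so the same two helpers cover every transition map of this atlas.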

² That is, each $U_\alpha \subset X$ is open and $\bigcup_\alpha U_\alpha = X$.


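Example 2.15. Now we consider a slightly more interesting example of a manifold, the real projective space $\mathbb{RP}^{n-1}$ which is, by definition, the space of lines through the origin in $\mathbb{R}^n$. To give $\mathbb{RP}^{n-1}$ a topology, we think of it as the set of equivalence classes of nonzero vectors in $\mathbb{R}^n$. That is,

\[ \mathbb{RP}^{n-1} = (\mathbb{R}^n \smallsetminus \{0\}) / \sim, \]

where two non-zero vectors $v$ and $v'$ are equivalent if and only if there is a constant $\lambda \neq 0$ such that $v = \lambda v'$. Note that this is an equivalence relation. We then have a surjective map

\[ \pi : \mathbb{R}^n \smallsetminus \{0\} \to \mathbb{RP}^{n-1}, \qquad \pi(v) = [v], \]

where $[v]$ denotes the equivalence class of $v$ ($[v]$ is the line through $v$).

We put on $\mathbb{RP}^{n-1}$ the quotient topology: $U \subset \mathbb{RP}^{n-1}$ is open if and only if $\pi^{-1}(U)$ is open in $\mathbb{R}^n \smallsetminus \{0\}$. I leave it to the reader to check that this topology is Hausdorff.

Charts here are given as follows: for each $1 \le i \le n$, let

\[ U_i = \{[x_1, \ldots, x_n] \in \mathbb{RP}^{n-1} : x_i \neq 0\} \]

and define $\varphi_i : U_i \to \mathbb{R}^{n-1}$ by

\[ [x_1, \ldots, x_n] \mapsto \Bigl(\frac{x_1}{x_i}, \cdots, \frac{x_{i-1}}{x_i}, \frac{x_{i+1}}{x_i}, \cdots, \frac{x_n}{x_i}\Bigr). \]

Note that the inverse $\varphi_i^{-1}$ is given by

\[ \varphi_i^{-1} : (u_1, \cdots, u_{n-1}) \mapsto [u_1, \cdots, u_{i-1}, 1, u_i, \cdots, u_{n-1}]. \]

We must check that the change of coordinates maps are smooth. If $j < i$, then on the intersection $U_i \cap U_j$

\[ \varphi_j \circ \varphi_i^{-1}(u_1, \cdots, u_{n-1}) = \varphi_j([u_1, \cdots, u_{i-1}, 1, u_i, \cdots, u_{n-1}]) = \Bigl(\frac{u_1}{u_j}, \cdots, \frac{u_{j-1}}{u_j}, \frac{u_{j+1}}{u_j}, \cdots, \frac{u_{i-1}}{u_j}, \frac{1}{u_j}, \frac{u_i}{u_j}, \cdots, \frac{u_{n-1}}{u_j}\Bigr), \]

which is smooth because $u_j \neq 0$ on $\varphi_i(U_i \cap U_j)$. Other computations are similar (and are left to the reader).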
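The charts of Example 2.15 are easy to experiment with in homogeneous coordinates. The sketch below is an illustration only: it represents a point $[x]$ by any nonzero tuple `x`, and the helper names `phi`, `phi_inv`, and `transition` are assumptions of this sketch rather than notation from the notes.

```python
def phi(i, x):
    """Chart phi_i on U_i = {[x] : x_i != 0}: divide by x_i and drop the i-th slot (1-based i)."""
    xi = x[i - 1]
    if xi == 0:
        raise ValueError("the point [x] does not lie in U_i")
    return tuple(c / xi for k, c in enumerate(x, start=1) if k != i)

def phi_inv(i, u):
    """A representative of phi_i^{-1}(u): insert a 1 in the i-th slot."""
    coords = list(u)
    coords.insert(i - 1, 1.0)
    return tuple(coords)

def transition(j, i, u):
    """Change of coordinates phi_j o phi_i^{-1}, defined where the j-th homogeneous coordinate is nonzero."""
    return phi(j, phi_inv(i, u))

# Sample data in RP^3 (n = 4) with i = 3 and j = 1.
u = (2.0, 4.0, 8.0)
x = phi_inv(3, u)                         # representative (2, 4, 1, 8)
scaled = tuple(-3.0 * c for c in x)       # another representative of the same line
result = transition(1, 3, u)              # (u_2/u_1, 1/u_1, u_3/u_1)
```

Because `phi` divides by $x_i$, it returns the same value on `x` and on `scaled`, which is exactly the statement that $\varphi_i$ is well defined on equivalence classes.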

Exercise 2.2. Define the complex projective space $\mathbb{CP}^{n-1}$ to be the set of complex lines through the origin in $\mathbb{C}^n$ and prove that it is a manifold.

Exercise 2.3. If M and N are manifolds, show that M ×N is also naturally a manifold.

Exercise 2.4. Let $V$ be a finite-dimensional vector space over $\mathbb{R}$. Then $V$ is a manifold: a choice of basis $v_1, \ldots, v_n$ ($n = \dim V$) of $V$ defines a linear bijection $\sigma : \mathbb{R}^n \to V$, $\sigma(r_1, \ldots, r_n) = \sum r_i v_i$. Define a topology on $V$ by requiring that $\sigma$ is a homeomorphism (that is, $U \subset V$ is open $\Leftrightarrow$ $\sigma^{-1}(U) \subset \mathbb{R}^n$ is open). Check that this is indeed a Hausdorff second countable topology. Define $\sigma^{-1} : V \to \mathbb{R}^n$ to be a chart and $\{\sigma^{-1} : V \to \mathbb{R}^n\}$ to be an atlas (one chart!). Prove that a different choice of basis of $V$ defines the same topology and an equivalent atlas.

Exercise 2.5. Let $M$ be a manifold. Show that for each point $x \in M$ there is a coordinate chart $\varphi : U \to \mathbb{R}^n$ with $x \in U$ such that $\varphi(x) = 0$ and $\varphi(U)$ is $B_1(0)$, the ball of radius 1 centered at 0.

Remark 2.16. In Definition 2.9 we have made no assumption on the topology of our manifolds. It is standard to assume that the manifolds are Hausdorff. Otherwise all sorts of pathologies turn up. Another set of standard assumptions guarantees the existence of partitions of unity (see subsection 2.4 below). For this the simplest assumption to make is that the manifold in question is second countable. However, this assumption is too stringent and paracompactness is much more reasonable. All of this will be discussed later on.


2.3. Maps of manifolds. In the Bourbakist view every area of mathematics has its collection of objects and its collection of maps between objects (or, more generally, morphisms). While it is enjoyable to make fun of Bourbaki and Bourbakists, there is some merit to this point of view. A map $f : M \to N$ between two manifolds is smooth if it is continuous and is smooth in coordinates. More precisely we have:

Definition 2.17 (smooth map). Let $M$ and $N$ be two smooth manifolds with atlases $\{(U_\alpha, \varphi_\alpha)\}$ and $\{(V_\beta, \psi_\beta)\}$, respectively. A continuous map $f : M \to N$ is a smooth map (or a morphism of $C^\infty$ manifolds) if for all $\alpha$ and $\beta$ with

\[ f^{-1}(V_\beta) \cap U_\alpha \neq \emptyset, \]

the composition

\[ \psi_\beta \circ f \circ \varphi_\alpha^{-1} : \varphi_\alpha(U_\alpha \cap f^{-1}(V_\beta)) \to \psi_\beta(V_\beta) \]

is $C^\infty$.

We will write $C^\infty(M, N)$ to denote the set of all smooth maps from $M$ to $N$. Note that this definition does not depend on which atlases on $M$ and $N$ we choose [check this].

Also note that a special case of this definition is that of a smooth function on a manifold, which is a map from $M$ to $\mathbb{R}$. To wit:

Definition 2.18. A function $f : M \to \mathbb{R}$ is smooth if $f$ is continuous and if for all coordinate charts $(U_\alpha, \varphi_\alpha)$, $f \circ \varphi_\alpha^{-1} : \varphi_\alpha(U_\alpha) \to \mathbb{R}$ is $C^\infty$. This is consistent with the previous definition: we think of the real line $\mathbb{R}$ as a manifold with the standard coordinate chart $\mathrm{id} : \mathbb{R} \to \mathbb{R}$. We denote the collection of all smooth functions on a manifold $M$ by $C^\infty(M) = C^\infty(M, \mathbb{R})$.

Exercise 2.6. Let $M$ be a manifold. Check that $C^\infty(M)$ is a vector space over the reals under the standard addition of functions and multiplication by scalars. Is it finite dimensional?

Exercise 2.7. Let M be a manifold. Check that a constant function on a manifold M is smooth.

Here are some examples of smooth maps.

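Example 2.19. Take $M = \mathbb{R}^n \smallsetminus \{0\}$, and let $N = \mathbb{RP}^{n-1}$. Let $\pi : \mathbb{R}^n \smallsetminus \{0\} \to \mathbb{RP}^{n-1}$ be the projection $\pi(v) = [v]$. I claim that $\pi$ is a smooth map. Let’s check it.

The atlas on $M$ is given by one chart, the inclusion $\varphi$ of $M$ into $\mathbb{R}^n$. The charts on $\mathbb{RP}^{n-1}$ are the same as in Example 2.15. Note that $\pi^{-1}(U_i) = \{v \in \mathbb{R}^n \smallsetminus \{0\} : v_i \neq 0\}$. To see that $\pi$ is smooth, we need to check that $\varphi_i \circ \pi \circ \varphi^{-1} : \pi^{-1}(U_i) \to \mathbb{R}^{n-1}$ is $C^\infty$. But note that

\[ (\varphi_i \circ \pi \circ \varphi^{-1})(v) = \varphi_i(\pi(v)) = \varphi_i([v]) = \Bigl(\frac{v_1}{v_i}, \cdots, \frac{v_{i-1}}{v_i}, \frac{v_{i+1}}{v_i}, \cdots, \frac{v_n}{v_i}\Bigr), \]

which is smooth on $\pi^{-1}(U_i)$ because $v_i \neq 0$ there.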

Example 2.20. Let $M = \mathbb{R}$ with the coordinate chart $\varphi(x) = x^3$. Let $N = \mathbb{R}$ with the coordinate chart $\psi(x) = x$. Let $f : M \to N$ be the map $x \mapsto x^3$. Is $f$ a $C^\infty$ map?

\[ (\psi \circ f \circ \varphi^{-1})(x) = \psi(f(x^{1/3})) = \psi(x) = x, \]

which is smooth. So $f$ is smooth.

Now let us see if the map $h : M \to N$, $h(x) = x$, is smooth. We have $(\psi \circ h \circ \varphi^{-1})(x) = x^{1/3}$, which is not differentiable at 0. So $h$ is not smooth.

Finally note that $f^{-1} : N \to M$ is smooth:

\[ (\varphi \circ f^{-1} \circ \psi^{-1})(x) = (x^{1/3})^3 = x. \]
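The coordinate expressions in Example 2.20 can be probed numerically. In this sketch (helper names such as `cbrt`, `f_local`, and `h_local` are hypothetical, chosen only for illustration) the local expression of $f$ is the identity, while the difference quotients of the local expression of $h$ at $0$ grow without bound, reflecting the failure of differentiability of $x^{1/3}$ there.

```python
import math

def cbrt(x):
    """Real cube root, defined for negative x as well."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

def phi(x):       # chart on M
    return x ** 3

def phi_inv(x):   # inverse of the chart on M
    return cbrt(x)

def psi(x):       # chart on N (the identity)
    return x

def f_map(x):     # f : M -> N, x -> x^3
    return x ** 3

def h_map(x):     # h : M -> N, x -> x
    return x

def f_local(x):
    """psi o f o phi^{-1}, the expression of f in coordinates."""
    return psi(f_map(phi_inv(x)))

def h_local(x):
    """psi o h o phi^{-1}, the expression of h in coordinates."""
    return psi(h_map(phi_inv(x)))

# Difference quotients of h_local at 0 for shrinking steps t.
quotients = [(h_local(t) - h_local(0.0)) / t for t in (1e-3, 1e-6, 1e-9)]
```

The quotients behave like $t^{-2/3}$, so no linear map can serve as a derivative of `h_local` at the origin.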

Example 2.21. Constant functions are smooth maps of manifolds.

The appropriate notion of “isomorphism” in differential geometry is the following one:

Definition 2.22 (Diffeomorphism). A $C^\infty$ map $f : M \to N$ between two smooth manifolds is a diffeomorphism if $f$ is a homeomorphism and both $f$ and $f^{-1}$ are $C^\infty$ maps.

Example 2.23. The map f : M → N of Example 2.20 is a diffeomorphism.

Exercise 2.8. If $M$ and $N$ are manifolds, prove that $M \times N$ is diffeomorphic to $N \times M$.

Exercise 2.9. Show that the composition of smooth maps is smooth.

Exercise 2.10. Let $L_A : GL(n, \mathbb{R}) \to GL(n, \mathbb{R})$ be left multiplication by $A \in GL(n, \mathbb{R})$. Prove that $L_A$ is a diffeomorphism. [Recall that $GL(n, \mathbb{R}) \subset \mathbb{R}^{n^2}$ is the set of all invertible $n \times n$ matrices and that it is open in $\mathbb{R}^{n^2}$.]

2.4. Partitions of unity. In this subsection we define partitions of unity (that is, writing the constant function 1 as a sum of bump functions with certain properties) and prove the existence of a partition of unity subordinate to a cover on a second countable manifold. The existence of such partitions of unity is very useful. The proof of the existence of the partition of unity is not terribly useful and should be skipped on the first (and second) reading. The reason for this advice is that the proof is technical and the techniques will never be used again in this course. We start with a string of definitions.

Definition 2.24 (second countable). A topological space $X$ is second countable if there is a countable collection of open subsets $\{U_i\}$ of $X$ such that any open set in $X$ is the union of some collection of $U_i$’s. In other words, the topology of $X$ has a countable basis.

Example 2.25. The real line $\mathbb{R}$ with the standard topology is second countable: the collection $\{U_i\}$ consists of open intervals $(a, b)$ where $a$ and $b$ are rational numbers.

Similarly $\mathbb{R}^n$ is second countable: the collection $\{U_i\}$ consists of open balls $B_r(x)$ of rational radius $r$ centered at points $x$ with rational coordinates.

Remark 2.26. Any (topological) subspace of a second countable space is second countable [prove it]. Hence any manifold that can be realized as a subspace of some $\mathbb{R}^n$ has to be second countable.

The condition of second countability is much more than necessary for the existence of the partition of unity. One can get away with assuming only paracompactness. Here, for the record, is its definition. It takes a paragraph to state because we have to define a few more things first.

Definition 2.27. Let $M$ be a topological space. A collection $\{U_\alpha\}$ of subsets of $M$ is a cover of a subset $W \subset M$ if $W \subset \bigcup U_\alpha$. It is an open cover if each $U_\alpha$ is open. A refinement $\{V_\beta\}$ of a cover $\{U_\alpha\}$ is a cover such that for each index $\beta$ there is an index $\alpha = \alpha(\beta)$ with $V_\beta \subset U_\alpha$.

A collection of subsets $\{U_\alpha\}$ of $M$ is locally finite if for every point $m \in M$ there is a neighborhood $W$ of $m$ with $W \cap U_\alpha \neq \emptyset$ for only finitely many $\alpha$.

Example 2.28. The cover $\{(n, n + 2)\}_{n \in \mathbb{Z}}$ is a locally finite cover of $\mathbb{R}$. The cover $\{[-\frac{1}{n}, \frac{1}{n}]\}_{n \ge 1}$ is a cover of $(-1, 1)$ which is not locally finite: there is a problem at 0.

Definition 2.29 (paracompact). A topological space is paracompact if every open cover has a locally finite refinement.

Example 2.30. Any compact space is paracompact. We will see shortly that second countable Hausdorff manifolds are paracompact.

Definition 2.31 (support). The support $\operatorname{supp} f$ of a continuous function $f : X \to \mathbb{R}$ is the closure of the set of points where $f$ is non-zero:

\[ \operatorname{supp} f = \overline{\{x \in X : f(x) \neq 0\}}. \]

Definition 2.32 (Partition of Unity). Let $\{U_\alpha\}$ be an open cover of a manifold $M$. A partition of unity subordinate to the cover $\{U_\alpha\}$ is a collection of smooth functions $\{\rho_\beta : M \to [0, 1]\}$ such that:

(1) For each index $\beta$ there is an index $\alpha$ with $\operatorname{supp}(\rho_\beta) \subset U_\alpha$.

(2) For each point $m \in M$, there is a neighborhood $W$ of $m$ such that $\rho_\beta|_W \neq 0$ for only finitely many $\beta$. That is, the collection of supports $\{\operatorname{supp} \rho_\beta\}$ is locally finite.

(3) $\sum_\beta \rho_\beta = 1$.

Remark 2.33. Note that we need condition (2) to make sense of the sum in (3): by (2), for each point $m \in M$ the sum $\sum \rho_\beta(m)$ is actually a finite sum. So there are no problems with convergence.

Theorem 2.34. Let $M$ be a second countable Hausdorff manifold. Then every open cover of $M$ has a partition of unity subordinate to it.


Proof. (You should not read this proof the first time around.)

Step 1. We first construct a collection $\{X_k\}_{k=1}^\infty$ of open subsets of $M$ such that their closures $\overline{X_k}$ are compact, $\overline{X_k} \subset X_{k+1}$ and $M = \bigcup_{k=1}^\infty X_k$. Since $M$ is second countable, there is a countable basis of the topology of $M$. Out of this collection of open sets choose those that have compact closure and denote them by $W_1, W_2, \ldots$ We claim that they cover $M$: $M = \bigcup W_i$. Indeed, a point $x \in M$ has a neighborhood homeomorphic to an open subset of $\mathbb{R}^n$ ($n = \dim M$, of course). For any point $y$ in an open set $U \subset \mathbb{R}^n$ there is a closed ball $\overline{B_r(y)}$ centered at $y$ with $\overline{B_r(y)} \subset U$. Closed balls in $\mathbb{R}^n$ are compact. Hence every point $x \in M$ has a neighborhood $U(x)$ whose closure $\overline{U(x)}$ is compact. Now $U(x)$ is a union of a certain number of elements of the countable basis of the topology of $M$. The closure of each of these elements is compact. Therefore $x \in W_i$ for some index $i$. This proves that $M = \bigcup W_i$.

Let $X_1 = W_1$. The whole collection $\{W_i\}_{i=1}^\infty$ covers $\overline{X_1}$. Since $\overline{X_1}$ is compact, $\overline{X_1} \subset W_{i_1} \cup W_{i_2} \cup \ldots \cup W_{i_p}$ for some $i_1 < i_2 < \cdots < i_p$. Let $X_2 = W_{i_1} \cup W_{i_2} \cup \ldots \cup W_{i_p}$. Then $\overline{X_2}$ is compact ... Continuing in this manner we get the desired collection $\{X_k\}_{k=1}^\infty$.

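Step 2. We construct three countable open covers $\{V_{\beta,1}\}$, $\{V_{\beta,2}\}$, $\{V_{\beta,3}\}$ with $V_{\beta,1} \subset V_{\beta,2} \subset V_{\beta,3}$, $\bigcup_\beta V_{\beta,1} = M$, and such that $\{V_{\beta,3}\}$ is locally finite and subordinate to $\{U_\alpha\}$, the cover we started out with. Note that this will prove that any Hausdorff second countable manifold is paracompact, as promised.

Fix an index $k$ (with the convention $X_0 = X_{-1} = \emptyset$). For each point $z \in \overline{X_k} \smallsetminus X_{k-1}$ choose an open set $V_{z,3}$ such that $V_{z,3} \subset U_\alpha$ for some $\alpha$, $V_{z,3} \subset X_{k+1}$ and $V_{z,3} \cap \overline{X_{k-2}} = \emptyset$. Additionally we require that there is a coordinate chart $\psi_z$ mapping $V_{z,3}$ homeomorphically onto

\[ B_3(0) := \{x \in \mathbb{R}^n \mid \|x\| < 3\}. \]

Let $V_{z,i} = \psi_z^{-1}(B_i(0))$ for $i = 1, 2$. The open sets $V_{z,1}$ cover the compact set $\overline{X_k} \smallsetminus X_{k-1}$ (and are contained in $X_{k+1} \smallsetminus \overline{X_{k-2}}$). Therefore, for each $k$, there is a finite collection of $V_{z,1}$’s covering $\overline{X_k} \smallsetminus X_{k-1}$. Take all of these finite collections. We get a cover $\{V_{\beta,1}\}$ of $M$. Similarly we get two more covers: $\{V_{\beta,2}\}$ and $\{V_{\beta,3}\}$. Note that by construction they are locally finite and are subordinate to $\{U_\alpha\}$: for each $\beta$ there is $\alpha(\beta)$ with $V_{\beta,i} \subset U_{\alpha(\beta)}$.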

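Step 3. Now we construct a partition of unity. The function

\[ f(t) = \begin{cases} e^{-1/t}, & \text{if } t > 0 \\ 0, & \text{if } t \le 0 \end{cases} \]

is smooth on all of $\mathbb{R}$ [this fact is not entirely trivial]. Hence

\[ \tilde f(t) = f(1 - t) = \begin{cases} e^{-\frac{1}{1-t}}, & \text{if } t < 1 \\ 0, & \text{if } t \ge 1 \end{cases} \]

is smooth on all of $\mathbb{R}$. Therefore $h : \mathbb{R}^n \to [0, \infty)$ given by

\[ h(x) = \tilde f(\|x\|^2 / 4) \]

is also smooth. Note that $h(x) > 0$ for all $x \in B_2(0)$ and $h(x) = 0$ for all $x \notin B_2(0)$. Therefore, for each index $\beta$,

\[ g_\beta(x) = \begin{cases} h(\psi_\beta(x)), & \text{if } x \in V_{\beta,3} \\ 0, & \text{if } x \notin V_{\beta,3}, \end{cases} \]

where $\psi_\beta : V_{\beta,3} \to B_3(0)$ is the corresponding coordinate chart, is a smooth function on $M$. Moreover, $g_\beta(x) > 0$ for $x \in V_{\beta,1}$. Since the cover $\{V_{\beta,3}\}$ is locally finite, the sum

\[ G(x) = \sum_\beta g_\beta(x) \]

makes sense [converges for each $x$] and defines a smooth function on $M$. Since $\{V_{\beta,1}\}$ covers $M$, $G(x) > 0$ for all $x \in M$. Let

\[ \rho_\beta(x) = g_\beta(x) / G(x). \]

Then $1 \ge \rho_\beta(x) \ge 0$, $\sum \rho_\beta = 1$ and $\operatorname{supp} \rho_\beta \subset V_{\beta,3} \subset U_{\alpha(\beta)}$. Thus the collection $\{\rho_\beta\}$ is the desired partition of 1.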
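Step 3 can be made concrete on the simplest manifold. The sketch below is an illustration under assumptions that are not in the notes: it takes $M = \mathbb{R}$, indexes the sets by integers $k$, and uses $V_{k,3} = (k-3, k+3)$ with chart $x \mapsto x - k$. The functions `f`, `f_tilde`, `h`, `g`, `G`, and `rho` mirror the symbols of the proof.

```python
import math

def f(t):
    """f(t) = exp(-1/t) for t > 0 and 0 otherwise."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def f_tilde(t):
    """f_tilde(t) = f(1 - t): positive for t < 1 and 0 for t >= 1."""
    return f(1.0 - t)

def h(x):
    """Bump on R: positive exactly on the open ball |x| < 2."""
    return f_tilde(x * x / 4.0)

def g(k, x):
    """Bump g_k centred at the integer k (the assumed chart is translation by -k)."""
    return h(x - k)

def G(x):
    """Locally finite sum of all g_k; only integers k with |x - k| < 2 contribute."""
    m = math.floor(x)
    return sum(g(k, x) for k in range(m - 2, m + 4))

def rho(k, x):
    """Member rho_k = g_k / G of the partition of unity."""
    return g(k, x) / G(x)

# At any sample point the values rho_k(x) sum to 1.
total = sum(rho(k, 0.5) for k in range(-10, 11))
```

Only four integers $k$ can satisfy $|x - k| < 2$ for a given $x$, which is the local finiteness that makes the sum in `G` finite.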

Corollary 2.34.1. Let $M$ be a second countable Hausdorff manifold and $\{U_i\}_{i=1}^\infty$ a countable open cover. Then there is a partition of unity $\{\rho_i\}$ with $\operatorname{supp} \rho_i \subset U_i$.

Proof. By Theorem 2.34 there is a partition of unity $\{\tau_\beta\}$ with $\operatorname{supp} \tau_\beta \subset U_i$ for some $i = i(\beta)$. Let

\[ I(i) = \{\beta \mid \operatorname{supp} \tau_\beta \subset U_i \text{ and } \operatorname{supp} \tau_\beta \not\subset U_j \text{ for } j < i\}. \]

Define

\[ \rho_i = \sum_{\beta \in I(i)} \tau_\beta. \]

The collection $\{\rho_i\}$ is the desired partition of 1.

Proposition 2.35. Suppose that $M$ is a second countable Hausdorff manifold, $K \subset M$ a closed subset and $U \subset M$ an open set with $K \subset U$. Then there is a smooth function $f : M \to [0, 1]$ such that

(1) $f|_K \equiv 1$ and

(2) $\operatorname{supp}(f) \subset U$.

Proof. Let $U_1 = U$ and $U_2 = M \smallsetminus K$. By Corollary 2.34.1 there exist smooth functions $\rho_1, \rho_2 : M \to [0, 1]$ with $\operatorname{supp} \rho_i \subset U_i$ and $\rho_1 + \rho_2 = 1$. Since $\operatorname{supp} \rho_2 \subset M \smallsetminus K$, $\rho_2|_K \equiv 0$. Hence $\rho_1|_K \equiv 1$. Now let $f = \rho_1$.

Corollary 2.35.1. Let $M$ be a (second countable Hausdorff) manifold. For any point $x \in M$ and any neighborhood $U$ of $x$ in $M$ there is a smooth function $f : M \to \mathbb{R}$ so that

(1) $f \equiv 1$ on a neighborhood $V$ of $x$ contained in $U$ and

(2) $\operatorname{supp}(f) \subset U$.

Proof. Exercise. You can use the proposition above. Alternatively prove it directly first in the case where $M = \mathbb{R}^n$ and then use a coordinate chart around $x$ to prove it for arbitrary $M$. Is the condition that $M$ is second countable really necessary?

3. Tangent vectors and tangent spaces

3.1. Tangent vectors and tangent spaces. We learn in physics that a vector is an arrow sticking out of a point in space and that a vector field assigns an arrow to each point in space. When we learn linear algebra, we are told to forget this point of view: all vectors are sticking out of one point, the origin. For the purposes of differential geometry the physics point of view is correct after all: all our vectors are anchored at various points in space.

There is another issue we need to deal with. If $S \subset \mathbb{R}^3$ is a smooth convex surface, one can imagine that for every point $p \in S$ there is a two-plane $T_pS$ touching $S$ at that point, a plane tangent to $S$ at $p$. (It is not entirely clear that such a plane is unique, but that’s another story.) A vector tangent to $S$ at $p$ would be an arrow anchored at $p$ and lying in $T_pS$. This raises a problem: our manifolds are defined abstractly and not as subsets of some $\mathbb{R}^n$. So what would a tangent plane be in this case, and what vector space would it lie in?

The solution is to think of vectors as directional derivatives. A directional derivative of a function on $\mathbb{R}^n$ depends on two things: a direction and the point at which the function is being differentiated. For a smooth function $f \in C^\infty(\mathbb{R}^n)$, we write

\[ D_v f(p) = \frac{d}{dt}\Big|_{0} f(p + tv) \]

for the directional derivative of $f$ at a point $p \in \mathbb{R}^n$ in the direction $v \in \mathbb{R}^n$. Observe that

(1) the directional derivatives are linear: for any $f, g \in C^\infty(\mathbb{R}^n)$ and any $\lambda, \mu \in \mathbb{R}$

\[ D_v(\lambda f + \mu g)(p) = \lambda D_v f(p) + \mu D_v g(p); \]

(2) the directional derivatives have a derivation property:

\[ D_v(fg)(p) = f(p)\, D_v g(p) + D_v f(p)\, g(p). \]

This motivates the following definition:

Definition 3.1 (Tangent vector). Let $M$ be a manifold and $a \in M$ a point. A tangent vector to $M$ at $a$ is an $\mathbb{R}$-linear map $v : C^\infty(M) \to \mathbb{R}$ such that

\[ v(fg) = f(a)\, v(g) + g(a)\, v(f) \tag{3.1} \]

for all functions $f, g \in C^\infty(M)$.

Linear maps $C^\infty(M) \to \mathbb{R}$ satisfying (3.1) are also said to have a derivation property and are called derivations (into $\mathbb{R}$).
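The two properties of directional derivatives that motivate Definition 3.1 can be checked numerically. This is a rough sketch, not part of the notes: the derivative is approximated by a central difference, and the functions `f`, `g`, the point `p`, and the direction `v` are arbitrary illustrative choices.

```python
import math

def directional_derivative(func, p, v, eps=1e-6):
    """Central-difference approximation of D_v func(p) = d/dt|_0 func(p + t v)."""
    forward = func(tuple(pi + eps * vi for pi, vi in zip(p, v)))
    backward = func(tuple(pi - eps * vi for pi, vi in zip(p, v)))
    return (forward - backward) / (2 * eps)

def f(x):
    return math.sin(x[0]) * x[1]

def g(x):
    return x[0] ** 2 + math.exp(x[1])

def fg(x):
    return f(x) * g(x)

p = (0.5, -1.0)
v = (2.0, 3.0)

# Derivation property (3.1) for the derivation D_v at the point p.
lhs = directional_derivative(fg, p, v)
rhs = f(p) * directional_derivative(g, p, v) + directional_derivative(f, p, v) * g(p)
```

The two sides agree up to the discretization error of the finite difference, which is far below the size of either side here.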

Definition 3.2 (Tangent space). The tangent space $T_aM$ to a manifold $M$ at a point $a$ is the collection of all tangent vectors to $M$ at $a$.

Exercise 3.1. The tangent space $T_aM$ is a vector space over the reals. [That’s why the elements of the tangent space are called “vectors”!] That is, if $v, w \in T_aM$ and $\lambda, \mu \in \mathbb{R}$ then the linear map $\lambda v + \mu w : C^\infty(M) \to \mathbb{R}$ is a derivation.

Note that by our definition every directional derivative at a point $p \in \mathbb{R}^n$ is a tangent vector at $p$ to $\mathbb{R}^n$. This raises a question: are there tangent vectors that are not directional derivatives? The answer is no, tangent vectors to points of $\mathbb{R}^n$ are directional derivatives and that’s all there is to it:

Proposition 3.3. Let $w \in T_a\mathbb{R}^n$ be a tangent vector. That is, suppose $w : C^\infty(\mathbb{R}^n) \to \mathbb{R}$ is a linear map satisfying (3.1). Then

\[ w(f) = D_v f(a) \]

for some $v \in \mathbb{R}^n$. The same result holds with $\mathbb{R}^n$ replaced by some open ball $B_r(a)$.

To prove the proposition we first “recall” a version of Taylor’s theorem.

Lemma 3.4. Let $f$ be a smooth function on $\mathbb{R}^n$. Fix a point $a \in \mathbb{R}^n$. Then for any $x \in \mathbb{R}^n$

\[ f(x) = f(a) + \sum (x_i - a_i)\, h_i(x) \tag{3.2} \]

where $h_i(x)$ are smooth functions with

\[ h_i(a) = \frac{\partial f}{\partial x_i}(a). \]

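Proof. Suppose first that $a = 0$. Then, by the fundamental theorem of calculus and the chain rule,

\[ f(x) - f(0) = \int_0^1 \frac{d}{dt} f(tx)\, dt = \int_0^1 \Bigl(\sum x_i \frac{\partial f}{\partial x_i}(tx)\Bigr) dt = \sum x_i \int_0^1 \frac{\partial f}{\partial x_i}(tx)\, dt. \]

Let $h_i(x) = \int_0^1 \frac{\partial f}{\partial x_i}(tx)\, dt$. These are the desired functions. If $a \neq 0$, apply the previous argument to $\tilde f(x) = f(x + a)$ and then replace $x$ by $x - a$.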
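As a small worked instance of the construction in the proof (the function below is an arbitrary choice made for illustration, not taken from the notes), take $n = 2$, $a = 0$ and $f(x_1, x_2) = 3x_1 + x_1 x_2$:

```latex
\begin{align*}
\frac{\partial f}{\partial x_1}(x) &= 3 + x_2, &
\frac{\partial f}{\partial x_2}(x) &= x_1, \\
h_1(x) &= \int_0^1 (3 + t x_2)\, dt = 3 + \tfrac{1}{2} x_2, &
h_2(x) &= \int_0^1 t x_1 \, dt = \tfrac{1}{2} x_1, \\
f(0) + x_1 h_1(x) + x_2 h_2(x) &= 3 x_1 + \tfrac{1}{2} x_1 x_2 + \tfrac{1}{2} x_1 x_2 = f(x), \\
h_1(0) &= 3 = \frac{\partial f}{\partial x_1}(0), &
h_2(0) &= 0 = \frac{\partial f}{\partial x_2}(0).
\end{align*}
```

Note that the functions $h_i$ in (3.2) are not unique: here $h_1 = 3 + x_2$, $h_2 = 0$ would work equally well.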

Remark 3.5. If $f$ is a smooth function on an open ball $B_r(a)$ then (3.2) still holds at all $x \in B_r(a)$, except now $h_i \in C^\infty(B_r(a))$. The proof is exactly the same.

Before proving the proposition we need one more simple lemma.

Lemma 3.6. Let $M$ be a manifold and $w \in T_aM$ a tangent vector. Then for any constant function $c$ we have $w(c) = 0$.

Proof. Apply the tangent vector $w$ to the constant function 1:

\[ w(1) = w(1 \cdot 1) = 1 \cdot w(1) + w(1) \cdot 1 = 2 w(1) \;\Rightarrow\; w(1) = 0. \]

Since $w$ is linear, for any constant function $c = c \cdot 1$

\[ w(c) = w(c \cdot 1) = c\, w(1) = 0. \]

Proof of Proposition 3.3. By Lemma 3.4, $f(x) = f(a) + \sum (x_i - a_i)\, h_i(x)$. Hence

\[ w(f) = w(f(a)) + \sum \bigl( w(x_i - a_i)\, h_i(a) + (a_i - a_i)\, w(h_i) \bigr) = 0 + \sum w(x_i)\, h_i(a) + 0 = \sum w(x_i) \frac{\partial f}{\partial x_i}(a). \]

Therefore $w(f) = D_v f(a)$, where $v = (w(x_1), \ldots, w(x_n))$. We leave the ball version of the proof as an exercise.


Remark 3.7. The proof above actually shows that the derivations $\frac{\partial}{\partial x_i}\big|_a$ form a basis of $T_a\mathbb{R}^n$.

For arbitrary manifolds a choice of coordinates near a point also defines a basis of the tangent space at the point. To express this precisely it will be convenient to slightly change our notation. To this end, denote the points of $\mathbb{R}^n$ by $r = (r_1, \ldots, r_n)$. We also think of $r_i$ as a function that assigns to a point its $i$-th coordinate. If $\varphi : U \to \mathbb{R}^n$ is a coordinate chart on a manifold $M$, then $\varphi = (r_1 \circ \varphi, \ldots, r_n \circ \varphi)$. We then think of $x_i = r_i \circ \varphi$ as coordinate functions on $U$.

The coordinates define tangent vectors at points of $U$: for any $a \in U$ and any $f \in C^\infty(M)$ we define $\frac{\partial}{\partial x_i}\big|_a$ by

\[ \frac{\partial}{\partial x_i}\Big|_a (f) := \frac{\partial}{\partial r_i}\Big|_{\varphi(a)} (f \circ \varphi^{-1}). \]

It is easy to see that these are, indeed, tangent vectors. It should come as no surprise that they form a basis of the tangent space $T_aM$. After all, manifolds locally look like $\mathbb{R}^n$ and in $\mathbb{R}^n$ the partial derivatives do form bases of tangent spaces. Now let’s prove this. We first observe that tangent vectors are local.

Lemma 3.8. Let $M$ be a manifold and $v \in T_aM$ a tangent vector. Then for any two functions $f, g \in C^\infty(M)$ with $f = g$ in a neighborhood $U$ of $a$, we have

\[ v(f) = v(g). \]

In particular, if $h$ is constant on a neighborhood $U$ of $a$, then $v(h) = 0$ (cf. Lemma 3.6).

Proof. As $v : C^\infty(M) \to \mathbb{R}$ is $\mathbb{R}$-linear, it is enough to show that $v(f - g) = 0$. Choose a smooth bump function $\rho : M \to [0, 1]$ with $\operatorname{supp} \rho \subset U$ which is identically 1 on a neighborhood $V$ of $a$. We then have that $\rho \cdot (f - g) = 0$ on all of $M$ by construction. Furthermore, because $v$ is linear, $v(0) = 0$, hence

\[ 0 = v(\rho\, (f - g)) = v(\rho)\, (f - g)(a) + \rho(a)\, v(f - g) = v(f - g). \]

What’s the point of the lemma, aside from its esthetic appeal? If $\varphi = (x_1, \ldots, x_n) : U \to \mathbb{R}^n$ is a coordinate chart on a manifold $M$ and $v \in T_aM$ is a tangent vector at some point $a \in U$, then we cannot apply $v$ to a coordinate function $x_i$. The function $x_i$ is only defined on $U$; it is not a smooth function on all of $M$. However, there is a way around this problem. Pick a smooth bump function $\rho : M \to [0, 1]$ with $\operatorname{supp} \rho \subset U$ which is identically 1 on some neighborhood of $a$. Then $x_i \rho$ is a smooth function on $M$ and so $v(x_i \rho)$ does make sense. Moreover, this number does not depend on the choice of the bump function: if $\tau : M \to [0, 1]$ is another choice of a bump function with the same properties, then $x_i \rho = x_i \tau$ on some (perhaps smaller) neighborhood of $a$. Therefore, by the preceding lemma, $v(x_i \rho) = v(x_i \tau)$. We therefore define

\[ v(x_i) := v(x_i \rho) \]

for some choice of the bump function $\rho$. Similarly, if $h \in C^\infty(U)$ we define

\[ v(h) := v(h \rho) \]

for some (any) choice of the appropriate bump function $\rho$.

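Lemma 3.9. Let $\phi = (x_1, \ldots, x_n) : U \to \mathbb{R}^n$ be a coordinate chart on a manifold $M$ and let $v \in T_aM$ be a tangent vector at some point $a \in U$. Then
\[
v = \sum_i v(x_i)\, \frac{\partial}{\partial x_i}\Big|_a. \tag{3.3}
\]
Moreover, the vectors $\frac{\partial}{\partial x_i}\big|_a$ form a basis of $T_aM$.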

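Proof. We evaluate both sides of (3.3) on a function $f \in C^\infty(M)$. It is no loss of generality to assume that $\phi(U)$ is a ball and that $\phi(a) = 0$. By Lemma 3.4,
\[
(f \circ \phi^{-1})(r) = (f \circ \phi^{-1})(0) + \sum_i r_i\, h_i(r),
\]
where $h_i(0) = \frac{\partial}{\partial r_i}(f \circ \phi^{-1})\big|_0$. Thus, setting $f_i := h_i \circ \phi$, we have
\[
f(x) = f(a) + \sum_i x_i(x)\, f_i(x)
\]
for all $x \in U$, where
\[
f_i(a) = \frac{\partial}{\partial r_i}(f \circ \phi^{-1})(0) = \frac{\partial}{\partial x_i}\Big|_a (f).
\]
Hence, for any $v \in T_aM$, using $x_i(a) = 0$ and the fact that $v$ kills constants, we have
\[
\begin{aligned}
v(f) &= v\Big(f(a) + \sum_i x_i f_i\Big) \\
     &= \sum_i x_i(a)\, v(f_i) + \sum_i v(x_i)\, f_i(a) \\
     &= \sum_i v(x_i)\, f_i(a) \\
     &= \sum_i v(x_i)\, \frac{\partial}{\partial x_i}\Big|_a (f).
\end{aligned}
\]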

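This shows that the vectors $\frac{\partial}{\partial x_i}\big|_a$ span $T_aM$. To check linear independence observe that
\[
\frac{\partial}{\partial x_i}\Big|_a (x_j) = \delta_{ij},
\]
where $\delta_{ij}$ denotes the Kronecker delta function: it's 1 if $i = j$ and zero otherwise. Consequently, applying a vanishing linear combination $\sum_i c_i \frac{\partial}{\partial x_i}\big|_a = 0$ to $x_j$ gives $c_j = 0$.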
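As an illustration of (3.3), here is a worked computation that is not part of the original argument. It assumes the polar chart $(\rho, \theta)$ on the open set $U = \{r_1 > 0\} \subset \mathbb{R}^2$ (the names $\rho$, $\theta$ and the choice of $U$ are ours) and expands the vector $v = \frac{\partial}{\partial r_1}\big|_a$ in the basis defined by that chart.

```latex
% Assumed chart (our choice): \phi = (\rho, \theta) on U = \{ r_1 > 0 \},
% with \rho = \sqrt{r_1^2 + r_2^2} and \theta = \arctan(r_2 / r_1).
% Coefficients of v = \partial/\partial r_1 |_a are v(\rho) and v(\theta), as in (3.3).
\begin{align*}
v(\rho)   &= \frac{\partial \rho}{\partial r_1}(a)
           = \frac{r_1}{\rho}\Big|_a = \cos\theta(a), \\
v(\theta) &= \frac{\partial \theta}{\partial r_1}(a)
           = -\frac{r_2}{\rho^2}\Big|_a = -\frac{\sin\theta(a)}{\rho(a)}, \\
\frac{\partial}{\partial r_1}\Big|_a
          &= \cos\theta(a)\,\frac{\partial}{\partial \rho}\Big|_a
           - \frac{\sin\theta(a)}{\rho(a)}\,\frac{\partial}{\partial \theta}\Big|_a .
\end{align*}
```

The same recipe (apply $v$ to each coordinate function) works in any chart; only the derivatives of the coordinate functions change.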

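Remark 3.10. We have seen in the preceding discussion that for any $p \in \mathbb{R}^n$ the tangent space $T_p\mathbb{R}^n$ is isomorphic to $\mathbb{R}^n$. Explicitly the isomorphism is given by taking a vector $v \in \mathbb{R}^n$ to the directional derivative at $p$ in the direction of $v$:
\[
\mathbb{R}^n \xrightarrow{\ \simeq\ } T_p\mathbb{R}^n, \qquad v \mapsto D_v(\cdot)(p).
\]
In particular
\[
\mathbb{R} \xrightarrow{\ \simeq\ } T_a\mathbb{R}, \qquad s \mapsto s\, \frac{d}{dr}\Big|_a.
\]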

3.2. Digression: vector spaces and their duals. Given two (finite dimensional) vector spaces $V$ and $W$ we denote the set of all linear maps from $V$ to $W$ by $\mathrm{Hom}(V, W)$. It is a vector space: any linear combination of two linear maps is again a linear map. Of special interest is the vector space $V^* := \mathrm{Hom}(V, \mathbb{R})$ of linear maps from a vector space $V$ to $\mathbb{R}$, the so-called dual vector space. If $\{v_i\}_{i=1}^n$ is a basis of $V$, the dual basis is a basis $\{v_i^*\}$ of $V^*$ defined by
\[
v_i^*(v_j) = \delta_{ij}
\]
for all $1 \le i, j \le n$. This is indeed a basis. If $\ell \in V^*$ is an arbitrary functional, then
\[
\ell = \sum_i \ell(v_i)\, v_i^*
\]
because both sides of the formula above agree on the basis vectors $v_j$ (I am tacitly using the fact that if two linear maps $\mu, \nu : V \to \mathbb{R}$ agree on basis vectors, then they agree). It follows that $\dim V^* = \dim V$. Finally observe that for any vector $u \in V$,
\[
u = \sum_i v_i^*(u)\, v_i.
\]
Why is the formula above true? Apply $v_j^*$ to both sides.
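The following small example is ours, not the author's; it is a sketch of the dual basis construction for one concrete (non-orthonormal) basis of $\mathbb{R}^2$, with the vectors $v_1, v_2, u$ chosen purely for illustration.

```latex
% Illustrative choice of basis of V = \mathbb{R}^2 (an assumption of this example):
%   v_1 = (1, 0),  v_2 = (1, 1).
% The dual basis is determined by v_i^*(v_j) = \delta_{ij}:
\begin{align*}
v_1^*(x, y) &= x - y, & v_2^*(x, y) &= y, \\
v_1^*(v_1) &= 1, \quad v_1^*(v_2) = 0, & v_2^*(v_1) &= 0, \quad v_2^*(v_2) = 1 .
\end{align*}
% Checking u = \sum_i v_i^*(u) v_i for u = (3, 5):
\begin{align*}
v_1^*(u) = -2, \quad v_2^*(u) = 5, \qquad
-2\,(1, 0) + 5\,(1, 1) = (3, 5) = u .
\end{align*}
```

Note that $v_1^*$ is not "the first coordinate": the dual basis depends on the whole basis, not on the individual vector $v_1$.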

Exercise 3.2. Show that a choice of bases of vector spaces $V$ and $W$ identifies $\mathrm{Hom}(V, W)$ with a space of matrices. Conclude that $\dim \mathrm{Hom}(V, W) = \dim V \cdot \dim W$.

3.3. Differentials.

Definition 3.11. Let $f : M \to N$ be a smooth map of manifolds and $a \in M$ a point. The differential of $f$ at $a$ is the linear map
\[
df_a : T_aM \to T_{f(a)}N
\]
defined by
\[
(df_a(v))(h) = v(h \circ f)
\]
for all $v \in T_aM$ and all $h \in C^\infty(N)$.

Exercise 3.3. Check that the definition above makes sense. That is, given $v \in T_aM$, check that the map
\[
C^\infty(N) \to \mathbb{R}, \qquad h \mapsto v(h \circ f)
\]
is a linear map satisfying (3.1).

We will check shortly that in the case of a smooth map $f : \mathbb{R}^n \to \mathbb{R}^m$, $df_a = Df_a$ under the natural identification $T_a\mathbb{R}^n \simeq \mathbb{R}^n$.

We next sort out what the definition of a differential amounts to in the case where $f : M \to \mathbb{R}$ is a smooth function (in other words the target manifold is $N = \mathbb{R}$). By Definition 3.11, $df_a$ is a map from $T_aM$ to $T_{f(a)}\mathbb{R} \simeq \mathbb{R}$. That is, if we compose $df_a$ with the isomorphism $T_{f(a)}\mathbb{R} \xrightarrow{\ \simeq\ } \mathbb{R}$ (see Remark 3.10), we get a linear map
\[
\widetilde{df}_a : T_aM \to \mathbb{R}.
\]
By definition, $\widetilde{df}_a$ is an element of the dual vector space $T_a^*M := \mathrm{Hom}(T_aM, \mathbb{R})$. I claim that the linear map $\widetilde{df}_a$ is given by
\[
\widetilde{df}_a(v) = v(f) \tag{3.4}
\]
for any tangent vector $v \in T_aM$.

Proof. Let $r : \mathbb{R} \to \mathbb{R}$ denote the identity map. We think of it as the standard coordinate on $\mathbb{R}$. Then for every point $x \in \mathbb{R}$ the vector $\frac{d}{dr}\big|_x$ is a basis vector of $T_x\mathbb{R}$, which gives us an isomorphism
\[
T_x\mathbb{R} \to \mathbb{R}, \qquad t\, \frac{d}{dr}\Big|_x \mapsto t.
\]
The map above has a "coordinate free" description as well. It is:
\[
T_x\mathbb{R} \ni v \mapsto v(r).
\]
Therefore
\[
\widetilde{df}_a(v) = (df_a(v))(r) = v(r \circ f) = v(f).
\]

Remark 3.12. It is customary not to distinguish between $\widetilde{df}_a$ and $df_a$. Thus, in the case of $f \in C^\infty(M)$, the differential $df_a$ denotes both the linear map $df_a : T_aM \to T_{f(a)}\mathbb{R}$ and the linear functional $\widetilde{df}_a : T_aM \to \mathbb{R}$. In other words, from now on we drop the notation $\widetilde{df}_a$ and write (3.4) as
\[
df_a(v) = v(f) \tag{3.5}
\]
for all $f \in C^\infty(M)$, $a \in M$, $v \in T_aM$.

Definition 3.13. The vector space
\[
T_a^*M := \mathrm{Hom}(T_aM, \mathbb{R})
\]
is called the cotangent space of $M$ at $a$.

The new concept of the differential allows us to re-interpret the formula (3.3). Recall that a choice of coordinates $\phi = (x_1, \ldots, x_n) : U \to \mathbb{R}^n$ on a manifold $M$ gives rise to a basis $\big\{\frac{\partial}{\partial x_i}\big|_a\big\}$ of $T_aM$ for any point $a \in U$. We claim that $\{(dx_i)_a\}$ form the dual basis of the cotangent space $T_a^*M$. Indeed, by (3.5),
\[
(dx_j)_a\Big(\frac{\partial}{\partial x_i}\Big|_a\Big) = \frac{\partial}{\partial x_i}\Big|_a (x_j) = \delta_{ij}.
\]
Since for $v \in T_aM$ we have $v(x_i) = (dx_i)_a(v)$, (3.3) becomes
\[
v = \sum_i (dx_i)_a(v)\, \frac{\partial}{\partial x_i}\Big|_a. \tag{3.6}
\]

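Let $f = (f_1, \ldots, f_m) : \mathbb{R}^n \to \mathbb{R}^m$ be a smooth map. We are now in the position to compare $df_a : T_a\mathbb{R}^n \to T_{f(a)}\mathbb{R}^m$ with $Df_a : \mathbb{R}^n \to \mathbb{R}^m$. Let $r_1, \ldots, r_n$ denote the standard coordinates on $\mathbb{R}^n$ and $s_1, \ldots, s_m$ the standard coordinates on $\mathbb{R}^m$. Using (3.6) we compute:
\[
\begin{aligned}
(ds_i)_{f(a)}\Big(df_a\Big(\frac{\partial}{\partial r_j}\Big|_a\Big)\Big)
  &= \Big(df_a\Big(\frac{\partial}{\partial r_j}\Big|_a\Big)\Big)(s_i)
   = \frac{\partial}{\partial r_j}\Big|_a (s_i \circ f) \\
  &= \frac{\partial}{\partial r_j}\Big|_a (f_i) \\
  &= \frac{\partial f_i}{\partial r_j}(a).
\end{aligned}
\]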

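Thus the matrix of the linear map $df_a : T_a\mathbb{R}^n \to T_{f(a)}\mathbb{R}^m$ with respect to the bases $\big\{\frac{\partial}{\partial r_j}\big|_a\big\}$ and $\big\{\frac{\partial}{\partial s_i}\big|_{f(a)}\big\}$ is the Jacobian matrix $\big(\frac{\partial f_i}{\partial r_j}(a)\big)$ of $f$ at $a$, that is, the matrix of $Df_a$.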
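To make the identification concrete, here is a short worked instance; the map $f$ and the point $a$ below are illustrative choices of ours and do not appear in the original notes.

```latex
% Illustrative map (assumption of this example): f : \mathbb{R}^2 \to \mathbb{R}^2,
%   f(r_1, r_2) = (r_1 r_2,\; r_1 + r_2^2),   a = (1, 2).
\begin{align*}
\Big(\frac{\partial f_i}{\partial r_j}\Big)
  &= \begin{pmatrix} r_2 & r_1 \\ 1 & 2 r_2 \end{pmatrix},
&
\Big(\frac{\partial f_i}{\partial r_j}(a)\Big)
  &= \begin{pmatrix} 2 & 1 \\ 1 & 4 \end{pmatrix}, \\
df_a\Big(\frac{\partial}{\partial r_1}\Big|_a\Big)
  &= 2\,\frac{\partial}{\partial s_1}\Big|_{f(a)} + \frac{\partial}{\partial s_2}\Big|_{f(a)},
&
df_a\Big(\frac{\partial}{\partial r_2}\Big|_a\Big)
  &= \frac{\partial}{\partial s_1}\Big|_{f(a)} + 4\,\frac{\partial}{\partial s_2}\Big|_{f(a)} .
\end{align*}
```

The columns of the Jacobian matrix are exactly the images of the basis vectors $\frac{\partial}{\partial r_j}\big|_a$.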

It is worth singling out another special case of the definition of a differential of a map: $M = \mathbb{R}$. In this case $f : \mathbb{R} \to N$ is a smooth curve. We define the tangent vector to $f$ at $t \in \mathbb{R}$ to be
\[
f'(t) := df_t\Big(\frac{d}{dr}\Big|_t\Big).
\]
Note that by definition $f'(t)$ is a tangent vector in $T_{f(t)}N$, the tangent space to $N$ at $f(t)$.

Exercise 3.4. Let $M$ be a manifold, $p \in M$ a point and $v \in T_pM$ a tangent vector at the point $p$. Show that there is a curve $\gamma : I \to M$ (where $I$ is an open interval containing 0) with $\gamma(0) = p$ and $\gamma'(0) = v$.

We next observe that the chain rule holds for the differentials of smooth maps.

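Theorem 3.14 (Chain Rule). If $F : X \to Y$ and $H : Y \to Z$ are smooth maps of manifolds, then
\[
d(H \circ F)_a = dH_{F(a)} \circ dF_a
\]
for any point $a \in X$.

Proof. Fix $a \in X$, $v \in T_aX$, and $f \in C^\infty(Z)$. Then
\[
\begin{aligned}
(d(H \circ F)_a(v))(f) &= v(f \circ (H \circ F)) \\
  &= v((f \circ H) \circ F) \\
  &= (dF_a(v))(f \circ H) \\
  &= (dH_{F(a)}(dF_a(v)))(f).
\end{aligned}
\]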

Remark 3.15. Theorem 3.14 and Exercise 3.4 give us a useful way of computing differentials $df_a : T_aM \to T_{f(a)}N$. By the exercise, for any $v \in T_aM$ we can find a curve $\gamma : I \to M$ with $\gamma(0) = a$ and $\gamma'(0) = v$. Then, by the chain rule,
\[
df_a(v) = df_a(\gamma'(0)) = df_a\Big(d\gamma_0\Big(\frac{d}{dr}\Big|_0\Big)\Big) = d(f \circ \gamma)_0\Big(\frac{d}{dr}\Big|_0\Big) = (f \circ \gamma)'(0).
\]
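A minimal worked instance of this recipe follows; the function $f$, the point $a$, the vector $v$ and the straight-line curve $\gamma$ are all illustrative choices of ours, and we identify $T_a\mathbb{R}^2$ with $\mathbb{R}^2$ as in Remark 3.10.

```latex
% Illustrative data (assumptions of this example):
%   f(x, y) = x y,   a = (1, 2),   v = (3, 4),   \gamma(t) = a + t v = (1 + 3t,\; 2 + 4t).
\begin{align*}
(f \circ \gamma)(t) &= (1 + 3t)(2 + 4t) = 2 + 10\,t + 12\,t^2, \\
df_a(v) &= (f \circ \gamma)'(0) = 10 .
\end{align*}
% Cross-check with the Jacobian of f at a, which is the row vector (y, x)|_a = (2, 1):
\begin{align*}
(2, 1) \cdot (3, 4) = 6 + 4 = 10 .
\end{align*}
```

Any other curve through $a$ with velocity $v$ at $t = 0$ would give the same answer, which is the content of the remark.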

Exercise 3.5. Prove that if $F : M \to N$ is a diffeomorphism then the differential $dF_a : T_aM \to T_{F(a)}N$ is an isomorphism.

Exercise 3.6. Let $M$ and $N$ be manifolds. Prove that for any $(a, b) \in M \times N$ the tangent space $T_{(a,b)}(M \times N)$ is isomorphic to $T_aM \times T_bN$.

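Exercise 3.7. Suppose that $\gamma : \mathbb{R} \to \mathbb{R}^n$, $\gamma(t) = (\gamma_1(t), \ldots, \gamma_n(t))$ is a smooth curve. Show that
\[
d\gamma_t\Big(\frac{d}{dt}\Big|_t\Big) = \sum_i \gamma_i'(t)\, \frac{\partial}{\partial r_i}\Big|_{\gamma(t)},
\]
where $\gamma_i'(t)$ are ordinary derivatives.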

3.4. The tangent bundle.

Definition 3.16 (provisional). The tangent bundle $TM$ of a manifold $M$ is (as a set)
\[
TM = \bigsqcup_{a \in M} T_aM.
\]
Note that there is a natural projection (the tangent bundle projection)
\[
\pi : TM \to M
\]
which sends a tangent vector $v \in T_aM$ to the corresponding point $a$ of $M$.

We want to show that the tangent bundle $TM$ itself is a manifold in a natural way and the projection map $\pi : TM \to M$ is smooth. Strictly speaking, we first should specify a topology on $TM$. However, our strategy will be different. We will first find candidates for coordinate charts on the tangent bundle $TM$. They will be constructed out of coordinate charts on $M$. We will check that the change of these candidate coordinates on $TM$ is smooth. We will then use these candidate coordinates to manufacture a topology on $TM$.

Let $\phi = (x_1, \ldots, x_n) : U \to \mathbb{R}^n$ be a coordinate chart on $M$. Out of it we construct a chart on $TU$. The first $n$ functions come for free: we take the functions $x_1 \circ \pi, \ldots, x_n \circ \pi$. Another set of $n$ functions comes for free also: by (3.6), given a vector $v \in T_aU$,
\[
v = \sum_i (dx_i)_a(v)\, \frac{\partial}{\partial x_i}\Big|_a.
\]
Hence, abusing the notation a bit, we get maps
\[
dx_i : TU \to \mathbb{R}, \qquad TU \ni v \mapsto (dx_i)_a(v), \quad \text{where } a = \pi(v).
\]
Thus we define a candidate coordinate chart
\[
\tilde\phi := (x_1 \circ \pi, \ldots, x_n \circ \pi, dx_1, \ldots, dx_n) : TU \to \mathbb{R}^n \times \mathbb{R}^n
\]
by
\[
\tilde\phi(v) = \big(x_1(\pi(v)), \ldots, x_n(\pi(v)), (dx_1)_{\pi(v)}(v), \ldots, (dx_n)_{\pi(v)}(v)\big).
\]

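If $\{(U_\alpha, \phi_\alpha)\}$ is an atlas on $M$, we get a candidate atlas $\{(TU_\alpha, \tilde\phi_\alpha)\}$ on $TM$. To see why this could possibly be an atlas, we need to check that the change of coordinates in this new purported atlas is smooth. To this end pick two coordinate charts $(U, \phi = (x_1, \ldots, x_n))$ and $(V, \psi = (y_1, \ldots, y_n))$ on $M$ with $U \cap V \neq \emptyset$. Then $T(U \cap V) = TU \cap TV \neq \emptyset$. Let
\[
\tilde\phi = (x_1, \ldots, x_n, dx_1, \ldots, dx_n) : TU \to \mathbb{R}^n \times \mathbb{R}^n
\]
and
\[
\tilde\psi = (y_1, \ldots, y_n, dy_1, \ldots, dy_n) : TV \to \mathbb{R}^n \times \mathbb{R}^n
\]
be the corresponding candidate charts on $TM$. Now let us compute the change of coordinates $\tilde\psi \circ \tilde\phi^{-1}$. First, note that
\[
\tilde\phi^{-1}(r_1, \ldots, r_n, u_1, \ldots, u_n) = \sum_i u_i\, \frac{\partial}{\partial x_i}\Big|_{\phi^{-1}(r_1, \ldots, r_n)} \in T_{\phi^{-1}(r_1, \ldots, r_n)}M.
\]
So
\[
\tilde\psi\Big(\sum_i u_i\, \frac{\partial}{\partial x_i}\Big|_{\phi^{-1}(r_1, \ldots, r_n)}\Big)
= \Big(\psi(\phi^{-1}(r_1, \ldots, r_n)),\ dy_1\Big(\sum_i u_i \frac{\partial}{\partial x_i}\Big), \ldots, dy_n\Big(\sum_i u_i \frac{\partial}{\partial x_i}\Big)\Big).
\]
But
\[
dy_j\Big(\sum_i u_i \frac{\partial}{\partial x_i}\Big)
= \sum_i u_i\, \frac{\partial}{\partial x_i}(y_j)
= \sum_i \frac{\partial y_j}{\partial x_i}\, u_i
= \sum_i \frac{\partial}{\partial r_i}\big(r_j \circ \psi \circ \phi^{-1}\big)\, u_i.
\]
Thus the change of the candidate coordinates is given by
\[
\begin{aligned}
\tilde\psi \circ \tilde\phi^{-1}(r_1, \ldots, r_n, u_1, \ldots, u_n)
 &= \Big(\psi \circ \phi^{-1}(r),\ \Big(\sum_i \frac{\partial y_1}{\partial x_i}(r)\, u_i, \ldots, \sum_i \frac{\partial y_n}{\partial x_i}(r)\, u_i\Big)\Big) \\
 &= \Big(\psi \circ \phi^{-1}(r),\ \Big(\frac{\partial y_j}{\partial x_i}(r)\Big) \begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix}\Big),
\end{aligned}
\tag{3.7}
\]
where $r = (r_1, \ldots, r_n)$. Clearly $\tilde\psi \circ \tilde\phi^{-1}$ is smooth wherever it is defined. It remains to define a topology on $TM$ so that the charts $\tilde\phi : TU \to \phi(U) \times \mathbb{R}^n$ are homeomorphisms. We declare a subset $O \subset TM$ to be open if for any coordinate chart $\phi : U \to \mathbb{R}^n$ on $M$, the set $\tilde\phi(O \cap TU) \subset \mathbb{R}^n \times \mathbb{R}^n$ is open.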

Proposition 3.17. The collection of open sets on $TM$ defined above does indeed form a topology. Moreover, if $M$ is Hausdorff and second countable, so is $TM$.

Proof. An exercise for the reader.

We conclude that if $M$ is an $n$-dimensional Hausdorff second countable manifold then its tangent bundle $TM$ is a $2n$-dimensional Hausdorff second countable manifold. Moreover, each coordinate chart $(x_1, \ldots, x_n) : U \to \mathbb{R}^n$ on $M$ gives rise to a coordinate chart $(x_1 \circ \pi, \ldots, x_n \circ \pi, dx_1, \ldots, dx_n) : TU \to \mathbb{R}^{2n}$.

Remark 3.18. The following notation is suggestive: we write $(m, v) \in TM$ for $v \in T_mM$. Strictly speaking, it is redundant since $m = \pi(v)$.

Remark 3.19. It is customary to simply write $x_i : TU \to \mathbb{R}$ for $x_i \circ \pi : TU \to \mathbb{R}$.

Exercise 3.8. Prove that the map $\pi : TM \to M$ is smooth and that the differential $d\pi_v : T_v(TM) \to T_{\pi(v)}M$ is surjective for all tangent vectors $v \in TM$. Hint: do it in (convenient) coordinates.

3.5. The cotangent bundle. As a set, the cotangent bundle $T^*M$ is the disjoint union of cotangent spaces:
\[
T^*M = \bigsqcup_{a \in M} T_a^*M.
\]
Note that there is a natural projection (the cotangent bundle projection)
\[
\pi : T^*M \to M
\]
which sends a cotangent vector (a covector for short) $\eta \in T_a^*M$ to the corresponding point $a$ of $M$. We make the cotangent bundle $T^*M$ into a manifold in more or less the same way we made the tangent bundle into a manifold. That is, we manufacture new coordinate charts on $T^*M$ out of coordinate charts on $M$ and check that the transition maps between the new coordinate charts are smooth.

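So let $\phi = (x_1, \ldots, x_n) : U \to \mathbb{R}^n$ be a coordinate chart on $M$. Then for each point $a \in U$ the covectors $\{(dx_i)_a\}$ form a basis of $T_a^*M$. The partials $\big\{\frac{\partial}{\partial x_i}\big|_a\big\}$ form the dual basis. Hence for any $\eta \in T_a^*M$,
\[
\eta = \sum_i \eta\Big(\frac{\partial}{\partial x_i}\Big|_a\Big)\, (dx_i)_a.
\]
Therefore the partials $\frac{\partial}{\partial x_i}$ give us coordinate functions on $T^*U$:
\[
\frac{\partial}{\partial x_i} : T^*U \to \mathbb{R}, \qquad T^*U \ni \eta \mapsto \eta\Big(\frac{\partial}{\partial x_i}\Big|_a\Big),
\]
where $a = \pi(\eta)$. We now define the candidate coordinates
\[
\tilde\phi : T^*U \to \mathbb{R}^n \times \mathbb{R}^n
\]
by
\[
\tilde\phi = \Big(x_1 \circ \pi, \ldots, x_n \circ \pi, \frac{\partial}{\partial x_1}, \ldots, \frac{\partial}{\partial x_n}\Big).
\]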

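Note that
\[
\tilde\phi^{-1}(r_1, \ldots, r_n, w_1, \ldots, w_n) = \sum_{i=1}^n w_i\, (dx_i)_{\phi^{-1}(r)} \in T^*_{\phi^{-1}(r)}M,
\]
where again we have abbreviated $(r_1, \ldots, r_n)$ as $r$. We now check the transition maps. Let $\psi = (y_1, \ldots, y_n) : V \to \mathbb{R}^n$ be a coordinate chart on $M$ with $V \cap U \neq \emptyset$, and let $\tilde\psi$ be the corresponding candidate chart on $T^*V$. Then
\[
\begin{aligned}
\tilde\psi \circ \tilde\phi^{-1}(r_1, \ldots, r_n, w_1, \ldots, w_n)
 &= \tilde\psi\Big(\sum_{i=1}^n w_i\, (dx_i)_{\phi^{-1}(r)}\Big) \\
 &= \Big((\psi \circ \phi^{-1})(r),\ \frac{\partial}{\partial y_1}\Big(\sum_{i=1}^n w_i\, dx_i\Big), \ldots, \frac{\partial}{\partial y_n}\Big(\sum_{i=1}^n w_i\, dx_i\Big)\Big) \\
 &= \Big((\psi \circ \phi^{-1})(r),\ \sum_i w_i \frac{\partial x_i}{\partial y_1}, \ldots, \sum_i w_i \frac{\partial x_i}{\partial y_n}\Big).
\end{aligned}
\]
We conclude that
\[
\tilde\psi \circ \tilde\phi^{-1}(r_1, \ldots, r_n, w_1, \ldots, w_n) = \Big(\psi \circ \phi^{-1}(r),\ \Big(\frac{\partial x_i}{\partial y_j}(r)\Big) \begin{pmatrix} w_1 \\ \vdots \\ w_n \end{pmatrix}\Big), \tag{3.8}
\]
which is smooth. The rest of the argument proceeds as in the case of the tangent bundle.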

Remark 3.20. Later on, when we look at general vector bundles, it will be instructive to compare the formulas for the change of coordinates in the tangent and the cotangent bundles. In particular note that the matrices
\[
\Big(\frac{\partial y_j}{\partial x_i}(r)\Big) \quad \text{and} \quad \Big(\frac{\partial x_i}{\partial y_j}(r)\Big)
\]
are inverse transposes of each other.
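The inverse-transpose relation can be checked by hand in a simple case. The sketch below is ours: it assumes $x = (\rho, \theta)$ are polar coordinates and $y = (r_1, r_2)$ the standard coordinates on a suitable open subset of $\mathbb{R}^2$, with $r_1 = \rho\cos\theta$, $r_2 = \rho\sin\theta$; the names $A$ and $B$ for the two matrices are also ours.

```latex
% Assumed charts: x = (\rho, \theta) (polar), y = (r_1, r_2) (standard),
% related by r_1 = \rho\cos\theta,  r_2 = \rho\sin\theta.
% A is the matrix in (3.7) (row j, column i: \partial y_j / \partial x_i);
% B is the matrix in (3.8) (row j, column i: \partial x_i / \partial y_j).
\begin{align*}
A &= \Big(\frac{\partial y_j}{\partial x_i}\Big)
   = \begin{pmatrix} \cos\theta & -\rho\sin\theta \\ \sin\theta & \rho\cos\theta \end{pmatrix},
&
B &= \Big(\frac{\partial x_i}{\partial y_j}\Big)
   = \begin{pmatrix} \cos\theta & -\dfrac{\sin\theta}{\rho} \\[1.5ex] \sin\theta & \dfrac{\cos\theta}{\rho} \end{pmatrix}, \\
A^{-1} &= \frac{1}{\rho}\begin{pmatrix} \rho\cos\theta & \rho\sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix},
&
(A^{-1})^{T} &= \begin{pmatrix} \cos\theta & -\dfrac{\sin\theta}{\rho} \\[1.5ex] \sin\theta & \dfrac{\cos\theta}{\rho} \end{pmatrix} = B .
\end{align*}
```

So tangent coordinates $(u_1, u_2)$ transform by $A$ while cotangent coordinates $(w_1, w_2)$ transform by $(A^{-1})^T$, as the remark asserts.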

3.6. Vector fields. A vector field $X$ on a manifold $M$ smoothly assigns to a point $a \in M$ a tangent vector $X(a) \in T_aM$.³ What does "smoothly" mean? If $X$ is a vector field on $\mathbb{R}^n$ then
\[
X(a) = \sum_i f_i(a)\, \frac{\partial}{\partial r_i}\Big|_a
\]
for certain functions $f_i(a) \in \mathbb{R}$ of the point $a \in \mathbb{R}^n$. So whatever we mean by "smooth" should amount to the functions $f_i$ being smooth. This suggests one definition of a smooth vector field:

Definition 3.21. A vector field $X$ on a manifold $M$ is smooth if for any coordinate chart $\phi = (x_1, \ldots, x_n) : U \to \mathbb{R}^n$ we have, for any point $a \in U$,
\[
X(a) = \sum_i f_i(a)\, \frac{\partial}{\partial x_i}\Big|_a \tag{3.9}
\]
for some smooth functions $f_i : U \to \mathbb{R}$.

There is something a bit unsatisfying about this definition: is it possible that the functions $f_i$ in (3.9) are smooth for one choice of coordinates and not smooth for another choice? So we will use it as a starting point for a better one. Note that the functions $f_i$ in (3.9) are given by:
\[
f_i(a) = (dx_i)_a(X(a))
\]
for any $a \in U$. Thus Definition 3.21 simply says that the composite $(x_1, \ldots, x_n, dx_1, \ldots, dx_n) \circ X : U \to \mathbb{R}^n \times \mathbb{R}^n$ is smooth. But this is the same thing as saying that the map $X : M \to TM$ is smooth. Not every map $Z : M \to TM$ is a vector field: we need to make sure that $Z(a) \in T_aM$. The condition is equivalent to
\[
\pi(Z(a)) = a
\]
for all $a \in M$. Here, as before, $\pi : TM \to M$ is the natural projection. This gives us a slightly more "sophisticated" definition of a vector field:

Definition 3.22. A (smooth) vector field $X$ on a manifold $M$ is a smooth map $X : M \to TM$ such that $\pi \circ X = \mathrm{id}$.
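As a sanity check on the worry about coordinate dependence, here is a worked example of ours (not from the notes): the rotation vector field on $\mathbb{R}^2$, written first in the standard coordinates and then in polar coordinates $(\rho, \theta)$ on the open set $\{r_1 > 0\}$. The coefficient functions are computed with the formula $f_i(a) = (dx_i)_a(X(a))$.

```latex
% Illustrative vector field (assumption of this example):
%   X = -r_2\,\partial/\partial r_1 + r_1\,\partial/\partial r_2  on \mathbb{R}^2.
% Polar chart on \{ r_1 > 0 \}: \rho = \sqrt{r_1^2 + r_2^2},  \theta = \arctan(r_2 / r_1).
\begin{align*}
d\rho(X)   &= X(\rho)
            = -r_2\,\frac{r_1}{\rho} + r_1\,\frac{r_2}{\rho} = 0, \\
d\theta(X) &= X(\theta)
            = -r_2\Big(-\frac{r_2}{\rho^2}\Big) + r_1\Big(\frac{r_1}{\rho^2}\Big)
            = \frac{r_1^2 + r_2^2}{\rho^2} = 1, \\
X &= 0 \cdot \frac{\partial}{\partial \rho} + 1 \cdot \frac{\partial}{\partial \theta}
   = \frac{\partial}{\partial \theta} .
\end{align*}
```

The coefficients $(-r_2, r_1)$ and $(0, 1)$ look very different, but both pairs are smooth on the common domain, consistent with smoothness of $X$ being independent of the chart.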

There is yet another definition of a vector field, which is quite useful from some points of view:

Definition 3.23. A smooth vector field $X$ on a manifold $M$ is a linear map $X : C^\infty(M) \to C^\infty(M)$ such that
\[
X(fg) = f X(g) + g X(f) \quad \text{for all } f, g \in C^\infty(M). \tag{3.10}
\]

Proposition 3.24. Definitions 3.22 and 3.23 are equivalent.

Proof. Exercise. Here are a few hints. Given a vector field $X : M \to TM$ define a map $\widetilde{X}$ from $C^\infty(M)$ to functions on $M$ by
\[
(\widetilde{X}(f))(a) = X_a(f)
\]
for all $f \in C^\infty(M)$ and all $a \in M$. Check that $\widetilde{X}(f)$ is a smooth function and that the map $\widetilde{X}$ so defined is a derivation. That is, show that (3.10) holds with $X$ replaced by $\widetilde{X}$.

Conversely, given a map $\widetilde{X} : C^\infty(M) \to C^\infty(M)$ with the derivation property as above, define $X : M \to TM$ by
\[
X_a(f) = (\widetilde{X}(f))(a)
\]
for all $f \in C^\infty(M)$ and all $a \in M$. Check that $X_a$ is indeed a tangent vector in $T_aM$ and that the map $X : M \to TM$, $a \mapsto X_a$ is smooth in $a$.

³ Sometimes this is also written $X_a$.


Remark 3.25. From now on we will not distinguish between the two definitions and will think of vector fields as either smooth maps $M \to TM$ satisfying certain conditions or as $\mathbb{R}$-linear maps $C^\infty(M) \to C^\infty(M)$ satisfying the appropriate conditions. We will make no notational distinction between the two ways of looking at vector fields. Thus $X(a)$ will stand for the value of a vector field at a point $a$ if $a$ is a point. On the other hand, if $f$ is a smooth function, $X(f)$ will stand for a new smooth function, the "derivative" of $f$ with respect to the vector field $X$.

Notation. There are several standard ways to denote the space of all smooth vector fields on a given manifold $M$. The two most common ones are $\Gamma(TM)$ [vector fields are sections of the tangent bundle, see below] and $\mathfrak{X}(M)$.

Remark 3.26.
1. The space of vector fields $\Gamma(TM)$ is a vector space over $\mathbb{R}$: if $X, Y \in \Gamma(TM)$ are (smooth) vector fields and $\lambda, \mu \in \mathbb{R}$ are scalars, then their linear combination $\lambda X + \mu Y$ is defined by
\[
(\lambda X + \mu Y)(a) := \lambda X(a) + \mu Y(a)
\]
for any $a \in M$. It is again a smooth vector field.
2. We can also multiply vector fields on $M$ by smooth functions: if $X \in \Gamma(TM)$ and $f \in C^\infty(M)$ then $fX$ is defined by
\[
(fX)(a) := f(a) X(a)
\]
for all $a \in M$.

A fancy way of describing 2. is to say that $\Gamma(TM)$ is a module over the ring of smooth functions $C^\infty(M)$. See if you can impress your date.

If $X, Y \in \Gamma(TM)$ are two vector fields on a manifold $M$ then it is not true that the $\mathbb{R}$-linear map
\[
C^\infty(M) \to C^\infty(M), \qquad f \mapsto X(Y(f))
\]
is a vector field: it does not have the correct derivation property. For example, if $M = \mathbb{R}$ and $X = Y = \frac{d}{dt}$, then $X(Y(f)) = f''$ and $(fg)'' = (f'g + fg')' = f''g + 2f'g' + fg'' \neq f''g + fg''$. However,

Lemma 3.27. Let X,Y ∈ Γ(TM) be two smooth vector fields on a manifold M . Then the map

(3.11) [X,Y ] : C∞(M)→ C∞(M), f 7→ X(Y (f))− Y (X(f))

is a vector field.

Proof. Clearly the map $[X, Y]$ is $\mathbb{R}$-linear. We need to check that it has the correct derivation property. This is a mindless computation. Pick two functions $f, g \in C^\infty(M)$. Then
\[
\begin{aligned}
[X, Y](fg) &= X(Y(fg)) - Y(X(fg)) \\
 &= X(Y(f)g + fY(g)) - Y(X(f)g + fX(g)) \\
 &= X(Y(f))g + Y(f)X(g) + X(f)Y(g) + fX(Y(g)) \\
 &\quad - Y(X(f))g - X(f)Y(g) - Y(f)X(g) - fY(X(g)) \\
 &= X(Y(f))g - Y(X(f))g + fX(Y(g)) - fY(X(g)) \\
 &= ([X, Y](f))g + f([X, Y](g)).
\end{aligned}
\]

Definition 3.28. The Lie bracket of two vector fields $X$ and $Y$ on a manifold $M$ is the vector field $[X, Y]$ defined by (3.11).
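A small worked example may help fix the definition. The two vector fields below are illustrative choices of ours on $\mathbb{R}^2$ with coordinates $(x, y)$; subscripts on $f$ denote partial derivatives.

```latex
% Illustrative vector fields on \mathbb{R}^2 (assumptions of this example):
%   X = \partial/\partial x,   Y = x\,\partial/\partial y.
\begin{align*}
X(Y(f)) &= \frac{\partial}{\partial x}\big(x f_y\big) = f_y + x f_{xy}, \\
Y(X(f)) &= x\,\frac{\partial}{\partial y}\big(f_x\big) = x f_{xy}, \\
[X, Y](f) &= X(Y(f)) - Y(X(f)) = f_y,
\qquad\text{so}\qquad [X, Y] = \frac{\partial}{\partial y} .
\end{align*}
```

The second-order terms $x f_{xy}$ cancel, which is exactly why the bracket is a first-order operator (a vector field) even though $X \circ Y$ alone is not.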

We now quickly recall the definitions of bilinear and skew-symmetric bilinear maps, the point being that the Lie bracket will turn out to be a skew-symmetric bilinear map.

Definition 3.29. Let $V$, $U$ and $W$ be three vector spaces over the reals. A map
\[
b : V \times U \to W
\]
is bilinear if it is ($\mathbb{R}$-)linear in each argument: for all $u_1, u_2 \in U$, $c_1, c_2 \in \mathbb{R}$ and all $v \in V$,
\[
b(v, c_1 u_1 + c_2 u_2) = c_1 b(v, u_1) + c_2 b(v, u_2);
\]
and for all $v_1, v_2 \in V$, $c_1, c_2 \in \mathbb{R}$ and all $u \in U$,
\[
b(c_1 v_1 + c_2 v_2, u) = c_1 b(v_1, u) + c_2 b(v_2, u).
\]

Definition 3.30. A bilinear map b : U × U → V is skew-symmetric if

b(u1, u2) = −b(u2, u1)

for all u1, u2 ∈ U .

It is easy to see that the Lie bracket on a manifold $M$ is $\mathbb{R}$-bilinear and skew-symmetric. Note that it is not $C^\infty(M)$-bilinear:
\[
[X, hY] = X(h)\,Y + h\,[X, Y]
\]
for any $X, Y \in \Gamma(TM)$, $h \in C^\infty(M)$ (prove this).

Somewhat surprisingly the Lie bracket has a kind of derivation property:

Lemma 3.31 (Jacobi identity). For any three vector fields X,Y, Z ∈ Γ(TM) on a manifold M

(3.12) [X, [Y, Z]] = [[X,Y ], Z] + [Y, [X,Z]].

Here is how one sees this as a derivation property: for a vector field $X \in \Gamma(TM)$ define
\[
L_X : \Gamma(TM) \to \Gamma(TM)
\]
by
\[
L_X(Y) = [X, Y].
\]
With this definition (3.12) becomes:
\[
L_X([Y, Z]) = [L_X(Y), Z] + [Y, L_X(Z)].
\]

Proof of Lemma 3.31. This is another computation that's easier to do yourself than watch someone else doing it. To keep the notation from getting out of hand, we will drop parentheses. Thus $XYZf$ stands for $X(Y(Z(f)))$ etc. We pick a function $f \in C^\infty(M)$ and compute:
\[
\begin{aligned}
([[X, Y], Z] + [Y, [X, Z]])f &= [X, Y]Zf - Z[X, Y]f + Y[X, Z]f - [X, Z]Yf \\
 &= XYZf - YXZf - ZXYf + ZYXf + YXZf - YZXf - XZYf + ZXYf \\
 &= XYZf + ZYXf - YZXf - XZYf \\
 &= X(YZf - ZYf) + (ZY - YZ)Xf = [X, [Y, Z]]f.
\end{aligned}
\]
This proves the Jacobi identity.

Equation (3.12) is called the Jacobi identity and is often written as
\[
[X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0
\]
(it is equivalent to (3.12) by skew-symmetry of $[\cdot, \cdot]$).

Definition 3.32. A (real) Lie algebra is a vector space $V$ over $\mathbb{R}$ (perhaps infinite dimensional) together with a map $[\cdot, \cdot] : V \times V \to V$, a Lie bracket, such that
(1) $[\cdot, \cdot]$ is bilinear,
(2) $[\cdot, \cdot]$ is skew-symmetric, and
(3) $[\cdot, \cdot]$ satisfies the Jacobi identity: for all $v, u, w \in V$
\[
[u, [v, w]] = [[u, v], w] + [v, [u, w]].
\]

Example 3.33. We have proved that the space of vector fields Γ(TM) on a manifold M forms a Lie algebra.

Example 3.34. R3 with the cross (vector) product is a Lie algebra.

Remark 3.35. The bracket on a Lie algebra can be thought of as a multiplication. Note that it is not associative in general because of the Jacobi identity.
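For a concrete instance of the failure of associativity, one can take the cross product on $\mathbb{R}^3$ from Example 3.34. The computation below is our illustration; $e_1, e_2, e_3$ denote the standard basis vectors, with $e_1 \times e_2 = e_3$, $e_2 \times e_3 = e_1$, $e_3 \times e_1 = e_2$.

```latex
% Lie bracket [u, v] := u \times v on \mathbb{R}^3 (Example 3.34).
\begin{align*}
[[e_1, e_1], e_2] &= 0 \times e_2 = 0, \\
[e_1, [e_1, e_2]] &= e_1 \times e_3 = -e_2 \neq 0 ,
\end{align*}
% so the bracket is not associative. The Jacobi identity (3.12) still holds, e.g.
\begin{align*}
[e_1, [e_2, e_3]] &= e_1 \times e_1 = 0, \\
[[e_1, e_2], e_3] + [e_2, [e_1, e_3]] &= e_3 \times e_3 + e_2 \times (-e_2) = 0 .
\end{align*}
```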

The geometric meaning of the Lie brackets of vector fields will be discussed later.

4. Submanifolds and the implicit function theorem

Given a smooth function $F : \mathbb{R}^m \to \mathbb{R}^n$ and a point $c \in \mathbb{R}^n$ the level set
\[
F^{-1}(c) := \{x \in \mathbb{R}^m \mid F(x) = c\}
\]
may or may not be a smooth manifold. For example, take $f(x, y) = x^2 - y^2$, a smooth function on $\mathbb{R}^2$. Then $f^{-1}(0)$ is the union of two lines: $y = \pm x$. It is not a manifold. However, for $c \neq 0$, $f^{-1}(c)$ is a union of two smooth curves, hence a 1-dimensional manifold. The goal of this section is to describe a sufficient condition for the level sets $F^{-1}(c)$ to be manifolds. We then generalize this to level sets of smooth maps between manifolds. The key technical result that makes it all possible is the inverse function theorem.

4.1. The inverse function theorem and a few of its consequences.

Theorem 4.1 (Inverse function theorem). Let $U, U' \subset \mathbb{R}^n$ be open sets and $F : U \to U'$ a smooth map. Suppose for some point $a \in U$ the differential
\[
dF_a : \mathbb{R}^n \to \mathbb{R}^n
\]
is invertible. Then there are open neighborhoods $U_0$ of $a$ in $U$ and $U_0'$ of $F(a)$ in $U'$ so that

F : U0 → U ′0

is a diffeomorphism.

We will assume this result without proof. It is not essential that $U$ and $U'$ are open subsets of $\mathbb{R}^n$; any finite dimensional vector space will do. It is even true with $\mathbb{R}^n$ replaced by a Banach space. We now discuss various consequences of the inverse function theorem. The most famous one is the implicit function theorem. But first we prove the manifold version.

Proposition 4.2. Let f : N →M be a smooth map of manifolds with f(p) = q (p ∈ N , q ∈M). Suppose

dfp : TpN → TqM

is an isomorphism (invertible linear map). Then there are neighborhoods $U$ of $p$ in $N$ and $V$ of $q$ in $M$ so that

f |U : U → V

is a diffeomorphism (invertible map with a smooth inverse).

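Proof. Note first that if $\phi : U' \to \mathbb{R}^n$ is a coordinate chart on $N$ then for any $z \in U'$ the map $d\phi_z : T_zN \to T_{\phi(z)}\mathbb{R}^n$ is an isomorphism (for instance, if $\phi = (x_1, \ldots, x_n)$, then $d\phi_z\big(\frac{\partial}{\partial x_i}\big|_z\big) = \frac{\partial}{\partial r_i}\big|_{\phi(z)}$).

So let $\phi : U' \to \mathbb{R}^n$ with $p \in U'$ and $\psi : V' \to \mathbb{R}^m$ with $q \in V'$ be two coordinate charts on $N$ and $M$ respectively; shrinking $U'$ if necessary, we may assume that $f(U') \subset V'$. Then the diagram
\[
\begin{array}{ccc}
U' & \xrightarrow{\quad f \quad} & V' \\
\Big\downarrow{\scriptstyle \phi} & & \Big\downarrow{\scriptstyle \psi} \\
\phi(U') & \xrightarrow{\ \psi \circ f \circ \phi^{-1}\ } & \psi(V')
\end{array}
\tag{4.1}
\]
commutes: $\psi \circ f = (\psi \circ f \circ \phi^{-1}) \circ \phi$. Hence the diagram of differentials
\[
\begin{array}{ccc}
T_pN & \xrightarrow{\quad df_p \quad} & T_qM \\
\Big\downarrow{\scriptstyle d\phi_p} & & \Big\downarrow{\scriptstyle d\psi_q} \\
T_{\phi(p)}\phi(U') & \xrightarrow{\ d(\psi \circ f \circ \phi^{-1})_{\phi(p)}\ } & T_{\psi(q)}\psi(V')
\end{array}
\tag{4.2}
\]
commutes as well. Since $d\phi_p$, $d\psi_q$ and $df_p$ are isomorphisms, so is $d(\psi \circ f \circ \phi^{-1})_{\phi(p)}$. By the inverse function theorem, there are neighborhoods $U$ of $\phi(p)$ and $V$ of $\psi(q)$ so that
\[
(\psi \circ f \circ \phi^{-1})|_U : U \to V
\]
is a diffeomorphism. Consequently,
\[
f : \phi^{-1}(U) \to \psi^{-1}(V)
\]
is a diffeomorphism.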

Next we turn to the implicit function theorem, the vector space version.

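Theorem 4.3 (Implicit function theorem). Let $F : \mathbb{R}^n \times \mathbb{R}^k \to \mathbb{R}^k$ be a smooth map, $(a, b) \in \mathbb{R}^n \times \mathbb{R}^k$ a point and $c = F(a, b)$. Suppose that the restriction of the differential
\[
dF_{(a,b)}|_{\{0\} \times \mathbb{R}^k} : \{0\} \times \mathbb{R}^k \to \mathbb{R}^k
\]
is onto. Then there are neighborhoods $U$ of $a$ in $\mathbb{R}^n$, $W$ of $(a, b)$ in $\mathbb{R}^n \times \mathbb{R}^k$ and a smooth map $g : U \to \mathbb{R}^k$ with $g(a) = b$ such that
\[
F^{-1}(c) \cap W = \operatorname{graph}(g : U \to \mathbb{R}^k).
\]
That is, for $(x, y) \in W$
\[
F(x, y) = c \iff y = g(x).
\]
In other words the function $g$ is implicitly defined by the equation $F(x, g(x)) = c$.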

Proof. We write suggestively $\frac{\partial F}{\partial x}(a, b)$ for the restriction $dF_{(a,b)}|_{\mathbb{R}^n \times \{0\}}$ and $\frac{\partial F}{\partial y}(a, b)$ for $dF_{(a,b)}|_{\{0\} \times \mathbb{R}^k}$. Consider the smooth map $H : \mathbb{R}^n \times \mathbb{R}^k \to \mathbb{R}^n \times \mathbb{R}^k$ defined by
\[
H(x, y) = (x, F(x, y))
\]
for all $(x, y) \in \mathbb{R}^n \times \mathbb{R}^k$. Then the differential of $H$ at $(a, b)$ is of the form
\[
dH_{(a,b)} = \begin{pmatrix} I & 0 \\[0.5ex] \dfrac{\partial F}{\partial x}(a, b) & \dfrac{\partial F}{\partial y}(a, b) \end{pmatrix},
\]
where $I : \mathbb{R}^n \to \mathbb{R}^n$ is the identity map. By assumption $\frac{\partial F}{\partial y}(a, b)$ is invertible. Hence $dH_{(a,b)}$ is invertible.

By the inverse function theorem the function $H$ is invertible on a neighborhood of $(a, b)$. Let $G(u, v) = (G_1(u, v), G_2(u, v))$ denote its inverse, which is defined on a neighborhood of $H(a, b) = (a, F(a, b)) = (a, c)$. We may take this neighborhood to be of the form $U \times V$, with $U \subset \mathbb{R}^n$ and $V \subset \mathbb{R}^k$ being open. Let $W = G(U \times V)$. Then
\[
(u, v) = H(G(u, v)) = \big(G_1(u, v), F(G_1(u, v), G_2(u, v))\big)
\]
for all $(u, v) \in U \times V$. Hence $G_1(u, v) = u$. Therefore
\[
F(u, G_2(u, v)) = v
\]
for all $(u, v) \in U \times V$. Conversely, if for some $(x, y) \in W$ we have $F(x, y) = v$ then
\[
(x, y) = G(H(x, y)) = G(x, F(x, y)) = G(x, v) = (G_1(x, v), G_2(x, v))
\]
and therefore $y = G_2(x, v)$.

Define the function $g : U \to \mathbb{R}^k$ by
\[
g(x) = G_2(x, c).
\]
It is a smooth function and, by the above discussion,
\[
F(x, y) = c \iff y = g(x)
\]
for any $(x, y) \in W$.
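The construction in the proof can be followed explicitly in the simplest case. The example below is ours: it takes $n = k = 1$, the function $F(x, y) = x^2 + y^2$, and the point $(a, b) = (0, 1)$, so that $c = 1$.

```latex
% Illustrative data (assumptions of this example):
%   F(x, y) = x^2 + y^2,   (a, b) = (0, 1),   c = F(a, b) = 1.
\begin{align*}
\frac{\partial F}{\partial y}(a, b) &= 2b = 2 \neq 0
   && \text{(the hypothesis of the theorem holds)}, \\
H(x, y) &= (x,\; x^2 + y^2), \\
G(u, v) &= \big(u,\; \sqrt{v - u^2}\,\big)
   && \text{(inverse of $H$ for $(u, v)$ near $(0, 1)$, $y > 0$)}, \\
g(x) &= G_2(x, c) = \sqrt{1 - x^2}, \\
F(x, g(x)) &= x^2 + (1 - x^2) = 1 = c .
\end{align*}
```

Near $(0, 1)$ the circle $F^{-1}(1)$ is the graph of $g$. At the point $(1, 0)$ the hypothesis fails, since $\frac{\partial F}{\partial y}(1, 0) = 0$, and indeed the circle is not a graph over the $x$-axis near that point.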

Remark 4.4. Here is a slightly different and ultimately more useful way to look at what we have proved. The argument above shows that there is a diffeomorphism
\[
H : W \to U \times V
\]
mapping bijectively the set
\[
\{F = c\} \cap W := \{(x, y) \in W \mid F(x, y) = c\}
\]
onto the set
\[
H(W) \cap (\mathbb{R}^n \times \{c\}).
\]
This motivates the following definition.

Definition 4.5 (Submanifold). Let $M$ be an $m$-dimensional manifold. A subset $N \subset M$ is an $n$-dimensional embedded submanifold if for every point $q \in N$, there is a coordinate chart $\phi = (x_1, \ldots, x_m) : U \to \mathbb{R}^m$ with $q \in U$ such that
\[
\phi(U \cap N) = \phi(U) \cap (\mathbb{R}^n \times \{0\}).
\]
That is, for all $a \in N \cap U$,
\[
\phi(a) = (x_1(a), \ldots, x_n(a), 0, \ldots, 0).
\]
Such charts are said to be adapted to $N$.

Example 4.6. The sphere $S^2$ is an embedded submanifold of $\mathbb{R}^3$. For example if $(x_1, x_2, x_3) \in S^2$ and $x_3 > 0$ then
\[
\phi(x_1, x_2, x_3) = \Big(x_1, x_2, x_3 - \sqrt{1 - x_1^2 - x_2^2}\Big)
\]
is a chart adapted to $S^2$ (and there are 5 more charts like this).

Thus the implicit function theorem says that, under certain conditions, portions of a level set of a map $F : \mathbb{R}^n \times \mathbb{R}^k \to \mathbb{R}^k$ are embedded submanifolds. Naturally the embedded submanifolds are manifolds in their own right.

Lemma 4.7. If $N \subset M$ is an $n$-dimensional embedded submanifold of an $m$-dimensional manifold $M$ then it is naturally an $n$-dimensional manifold in its own right, and the inclusion map $\iota : N \to M$, $\iota(a) = a$ is smooth.

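Proof. We make $N$ into a topological space by giving it the subspace topology. If $\phi : U \to \mathbb{R}^m$ is a chart on $M$ adapted to $N$, then $p \circ \phi|_{N \cap U} : N \cap U \to \mathbb{R}^n$ is a homeomorphism onto an open subset of $\mathbb{R}^n$. Here $p : \mathbb{R}^m \to \mathbb{R}^n$ is the projection $p(x_1, \ldots, x_n, \ldots, x_m) = (x_1, \ldots, x_n)$. If $\psi : V \to \mathbb{R}^m$ is another chart adapted to $N$, then $\psi \circ \phi^{-1} : \phi(U \cap V) \to \psi(U \cap V)$ maps $\phi(U \cap V) \cap (\mathbb{R}^n \times \{0\})$ diffeomorphically to $\psi(U \cap V) \cap (\mathbb{R}^n \times \{0\})$. Hence if $\{\phi_\alpha : U_\alpha \to \mathbb{R}^m\}$ is a collection of charts on $M$ adapted to $N$ with $N \subset \bigcup U_\alpha$ then $\{p \circ \phi_\alpha|_{U_\alpha \cap N} : U_\alpha \cap N \to \mathbb{R}^n\}$ is an atlas on $N$. Checking that the inclusion map $\iota$ is smooth is easy: in coordinates it's the inclusion $\mathbb{R}^n \to \mathbb{R}^m$, $(r_1, \ldots, r_n) \mapsto (r_1, \ldots, r_n, 0, \ldots, 0)$.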

We now generalize the implicit function theorem.

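Proposition 4.8. Let $F : \mathbb{R}^m \to \mathbb{R}^k$ be a smooth map and $c \in F(\mathbb{R}^m) \subset \mathbb{R}^k$ a point. Suppose that for all points $q \in F^{-1}(c)$ the differential
\[
dF_q : \mathbb{R}^m \to \mathbb{R}^k
\]
is onto. Then the level set $F^{-1}(c)$ is a submanifold of $\mathbb{R}^m$ and (if $F^{-1}(c)$ is nonempty)
\[
\dim F^{-1}(c) = \dim \mathbb{R}^m - \dim \mathbb{R}^k.
\]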

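Proof. Fix a point $q \in F^{-1}(c)$. Let $Z = \ker dF_q$. Let $X \subset \mathbb{R}^m$ be a vector space complement to $Z$ so that
\[
\mathbb{R}^m = Z \oplus X \simeq Z \times X.
\]
We can thus think of a point $p \in \mathbb{R}^m$ as a pair $(z, x) \in Z \times X$. By assumption on $dF_q$ and by construction of $X$, the restriction
\[
dF_q|_X : X \to \mathbb{R}^k
\]
is an isomorphism of vector spaces. We now proceed as in the proof of the implicit function theorem. Consider
\[
H : Z \times X \to Z \times \mathbb{R}^k, \qquad H(z, x) = (z, F(z, x)).
\]
Write $\frac{\partial F}{\partial z}$ for $dF|_Z$ and $\frac{\partial F}{\partial x}$ for $dF|_X$. Then the differential of $H$ is of the form
\[
dH_{(z,x)} = \begin{pmatrix} I & 0 \\[0.5ex] \dfrac{\partial F}{\partial z} & \dfrac{\partial F}{\partial x} \end{pmatrix}.
\]
By construction $\frac{\partial F}{\partial x}(q) : X \to \mathbb{R}^k$ is a bijection. Hence $dH_q$ is a bijection. By the inverse function theorem there exist neighborhoods $W$ of $q$ in $\mathbb{R}^m$ and $U \times V$ of $H(q)$ in $Z \times \mathbb{R}^k$ so that $H : W \to U \times V$ is a diffeomorphism. Moreover, as in the proof of the implicit function theorem, $H$ maps bijectively $\{F = c\} \cap W$ to $(U \times V) \cap (Z \times \{c\})$. Therefore $F^{-1}(c) = \{F = c\}$ is a submanifold of $\mathbb{R}^m$ of dimension
\[
\dim Z = \dim \mathbb{R}^m - \dim \mathbb{R}^k.
\]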

Example 4.9. Consider $F : \mathbb{R}^n \to \mathbb{R}$, $F(x) = \sum x_i^2$. Then $dF_x = (2x_1, \ldots, 2x_n)$. Hence $dF_x$ is surjective for all nonzero $x$. In particular $F^{-1}(1) = \{x \in \mathbb{R}^n \mid \sum x_i^2 = 1\}$ is a submanifold of $\mathbb{R}^n$ of dimension $n - 1$. This is, of course, the standard sphere of radius 1.
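Here is a second worked example, ours rather than the author's, with a vector-valued map. It shows one value where the surjectivity hypothesis of Proposition 4.8 holds and one where it fails; the map $F$ and the two values of $c$ are illustrative choices.

```latex
% Illustrative map (assumption of this example): F : \mathbb{R}^3 \to \mathbb{R}^2,
%   F(x, y, z) = (x^2 + y^2 + z^2,\; z).
\begin{align*}
dF_{(x,y,z)} &= \begin{pmatrix} 2x & 2y & 2z \\ 0 & 0 & 1 \end{pmatrix} . \\[1ex]
c = (1, 0):\quad F^{-1}(c) &= \{x^2 + y^2 = 1,\ z = 0\},
 \qquad dF = \begin{pmatrix} 2x & 2y & 0 \\ 0 & 0 & 1 \end{pmatrix}
 \text{ has rank } 2 \text{ since } (x, y) \neq (0, 0), \\
 \dim F^{-1}(c) &= 3 - 2 = 1 \quad \text{(a circle)} . \\[1ex]
c = (1, 1):\quad F^{-1}(c) &= \{x^2 + y^2 = 0,\ z = 1\} = \{(0, 0, 1)\},
 \qquad dF = \begin{pmatrix} 0 & 0 & 2 \\ 0 & 0 & 1 \end{pmatrix}
 \text{ has rank } 1 .
\end{align*}
```

In the second case the differential is not onto, the proposition does not apply, and the level set is a single point rather than a 1-dimensional submanifold.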

Definition 4.10 (Regular value). Suppose $f : M \to N$ is a smooth map of manifolds. A point $c \in N$ is a regular value of $f$ if for all $x \in f^{-1}(c)$ the differential

dfx : TxM → TcN

is surjective.

The previous proposition then simply states that non-empty preimages of regular values of a map $F : \mathbb{R}^m \to \mathbb{R}^k$ are submanifolds of $\mathbb{R}^m$.

Remark 4.11. Note that if $f^{-1}(c) = \emptyset$, then $c$ is a regular value of $f$. It seems silly to construct a definition this way. The reason for the peculiar phrasing is that it makes it easier to state Sard's theorem.

Theorem 4.12 (Sard's Theorem). Let $f : M \to N$ be a smooth map. Then the set of regular values of $f$ is dense in $N$ (and in fact its complement has measure 0).

Note that if $F : M \to N$ maps everything to one point $c$ (and $\dim N > 0$) then $c$ is not a regular value (the differential of $F$ is 0 everywhere), but $N \smallsetminus \{c\}$ does consist of regular values. So Sard's theorem does hold for constant maps, except that the preimage of every regular value of a constant map is empty. It would take us too far afield to prove Sard's theorem, so we won't do it. On the other hand Proposition 4.8 nicely generalizes to manifolds:

Theorem 4.13. If $c$ is a regular value of a smooth map of manifolds $f : M \to N$ and if $f^{-1}(c) \neq \emptyset$ then the level set $f^{-1}(c)$ is an embedded submanifold of $M$ of dimension

dim f−1(c) = dim(M)− dim(N).

Before we proceed with the proof of Theorem 4.13, we make two observations.

1. Let $\{\phi_\alpha : U_\alpha \to \mathbb{R}^m\}$ be an atlas on a manifold $M$. Suppose for some index $\beta$ there is a diffeomorphism $\sigma : \phi_\beta(U_\beta) \to W \subset \mathbb{R}^m$ ($W$ is some open set). Then
(i) $\sigma \circ \phi_\beta : U_\beta \to \mathbb{R}^m$ is a chart on $M$,
(ii) this chart is compatible with the atlas $\{\phi_\alpha : U_\alpha \to \mathbb{R}^m\}$ we started out with.

This implies that

2. If $Z$ is a submanifold of a manifold $M$ and $H : M \to M'$ is a diffeomorphism, then $H(Z)$ is a submanifold of $M'$.

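Proof of Theorem 4.13. Let $m = \dim M$ and $n = \dim N$. It is enough to show that for every point $a$ of $f^{-1}(c)$ there is a neighborhood $U$ of $a$ such that $U \cap f^{-1}(c)$ is a submanifold of $U$ of dimension $m - n$.

Let $a \in f^{-1}(c)$ be a point. Let $\phi : U \to \mathbb{R}^m$ be a chart on $M$ with $a \in U$ and $\psi : V \to \mathbb{R}^n$ be a chart on $N$ with $c \in V$; shrinking $U$ if necessary, we may assume that $f(U) \subset V$. Then
\[
\psi \circ f \circ \phi^{-1} : \phi(U) \to \psi(V)
\]
is a smooth map. Moreover, by the chain rule,
\[
d(\psi \circ f \circ \phi^{-1})_{\phi(a)} = d\psi_c \circ df_a \circ d(\phi^{-1})_{\phi(a)}.
\]
Since $d\psi_c$ and $d(\phi^{-1})_{\phi(a)}$ are isomorphisms and $df_a$ is onto for any $a \in f^{-1}(c)$ by assumption, $d(\psi \circ f \circ \phi^{-1})_{\phi(a)} : T_{\phi(a)}\mathbb{R}^m \to T_{\psi(c)}\mathbb{R}^n$ is onto for any $a \in f^{-1}(c) \cap U$. By Proposition 4.8, $(\psi \circ f \circ \phi^{-1})^{-1}(\psi(c)) = \phi(U \cap f^{-1}(c))$ is a submanifold of $\phi(U)$ of dimension $m - n$. Therefore $U \cap f^{-1}(c)$ is a submanifold of $U \subset M$ of dimension $m - n$. Since $a$ is arbitrary, $f^{-1}(c)$ is a submanifold of all of $M$ of the desired dimension.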

The next statement describes the tangent bundle of a regular level set f−1(c).

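Corollary 4.13.1. Suppose that $c$ is a regular value of $f : M \to N$ and $f^{-1}(c) \neq \emptyset$. Then for all $a \in f^{-1}(c)$,
\[
T_a f^{-1}(c) = \ker(df_a).
\]

Proof. Since $df_a$ is onto, $\dim T_a f^{-1}(c) = \dim f^{-1}(c) = \dim M - \dim N = \dim \ker df_a$, so it is enough to prove that $T_a f^{-1}(c) \subset \ker df_a$. Let $v \in T_a f^{-1}(c)$ be a vector.

By Exercise 3.4 there is a curve $\gamma : I \to f^{-1}(c)$ (where $I$ is an interval containing 0) such that $\gamma(0) = a$ and $d\gamma_0\big(\frac{d}{dt}\big|_0\big) = v$. Since $f \circ \gamma$ is a constant map, $d(f \circ \gamma)_0 = 0$. By the chain rule, $0 = d(f \circ \gamma)_0\big(\frac{d}{dt}\big|_0\big) = df_{\gamma(0)}\big(d\gamma_0\big(\frac{d}{dt}\big|_0\big)\big) = df_a(v)$. Therefore $T_a f^{-1}(c) \subset \ker df_a$ and we are done.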

Example 4.14. Let $f : \mathbb{R}^n \to \mathbb{R}$ be given by $f(x) = \sum x_i^2$. Then, as we have seen before, 1 is a regular value of $f$ and $df_x = (2x_1, \ldots, 2x_n)$ for all $x \in \mathbb{R}^n$. Therefore, for any $x \in f^{-1}(1) = S^{n-1}$ the tangent space $T_x S^{n-1}$ is naturally isomorphic to $\ker\{v \mapsto \sum 2 x_i v_i\}$, which is the $(n - 1)$-dimensional hyperplane in $\mathbb{R}^n \simeq T_x\mathbb{R}^n$ orthogonal to the vector $x$.

Exercise 4.1. Show that $O(n)$, the set of all $n \times n$ orthogonal matrices, is a submanifold of $GL(n, \mathbb{R})$.
Hint: Consider the map $f : GL(n, \mathbb{R}) \to \mathrm{Sym}(n, \mathbb{R})$ given by $A \mapsto AA^T$. Show that the identity matrix $I$ is a regular value of $f$.

4.2. Transversality. We now have enough tools to do a bit of differential topology.

Definition 4.15 (Transversality). A smooth map $F : M \to N$ of manifolds is transverse to a submanifold $Z$ of $N$ if for every $z \in Z$ and any $m \in F^{-1}(z)$, we have
\[
T_zZ + dF_m(T_mM) = T_zN
\]
(not necessarily as a direct sum!).

Notation. We write $F \pitchfork Z$ if a map $F$ is transverse to a submanifold $Z$.

Example 4.16.

Let N = R2, M = R3, Z = S2 ⊂M and f : N →M is given by f(x1, x2) = (x1, x2, 0). Then f t S2.

Remark 4.17. A map F : M → N is transverse to submanifold Z consisting of one point c if and only if cis a regular value of F .

Example 4.18. Take $M = N = \mathbb{R}^2$. Consider $F : M \to N$ given by $F(x, y) = (x, x^2)$. Then $F$ is transverse to $\{0\} \times \mathbb{R}$, but it is not transverse to $\mathbb{R} \times \{0\}$.
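The two claims of Example 4.18 reduce to a rank computation: $T_zZ + dF_m(T_mM) = T_zN$ holds exactly when the columns of $dF_m$ together with a basis of $T_zZ$ have rank $\dim N$. Below is a minimal sketch of that check, assuming NumPy; the function names are hypothetical and the sample points are arbitrary. Both preimages $F^{-1}(\{0\}\times\mathbb{R})$ and $F^{-1}(\mathbb{R}\times\{0\})$ equal the line $x = 0$, so only points $(0, y)$ need to be tested.

```python
import numpy as np

def dF(x, y):
    # Jacobian of F(x, y) = (x, x**2)
    return np.array([[1.0, 0.0],
                     [2.0 * x, 0.0]])

def is_transverse_at(jac, tangent_Z, dim_N):
    # T_zZ + dF(T_mM) = T_zN iff the stacked columns have rank dim N
    combined = np.hstack([jac, tangent_Z])
    return bool(np.linalg.matrix_rank(combined) == dim_N)

TZ_vertical = np.array([[0.0], [1.0]])    # tangent line of {0} x R
TZ_horizontal = np.array([[1.0], [0.0]])  # tangent line of R x {0}

vertical_ok = all(is_transverse_at(dF(0.0, y), TZ_vertical, 2)
                  for y in (-1.0, 0.0, 2.5))
horizontal_ok = is_transverse_at(dF(0.0, 0.0), TZ_horizontal, 2)
print(vertical_ok, horizontal_ok)
```

A finite sample of points can only illustrate transversality, not prove it; here the Jacobian at $(0, y)$ does not depend on $y$, so the sample is representative.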

Theorem 4.19. If a smooth map $F : M \to N$ of manifolds is transverse to a submanifold $Z$ of $N$, then $F^{-1}(Z)$ is a submanifold of $M$. Moreover,

\[ T_a(F^{-1}(Z)) = (dF_a)^{-1}(T_{F(a)}Z) \]

for all $a \in F^{-1}(Z)$, and

\[ \dim(M) - \dim(F^{-1}(Z)) = \dim(N) - \dim(Z). \]

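Proof. We first consider a special case: assume that $N = \mathbb{R}^n$, $Z = \mathbb{R}^k \times \{0\} \subset \mathbb{R}^k \times \mathbb{R}^{n-k} = \mathbb{R}^n$. Let $\pi : \mathbb{R}^k \times \mathbb{R}^{n-k} \to \mathbb{R}^{n-k}$ denote the canonical projection map. Then

\[ \pi^{-1}(0) = \mathbb{R}^k \times \{0\} = Z, \]

hence

\[ (\pi \circ F)^{-1}(0) = F^{-1}(Z). \]

Additionally, for all $a \in F^{-1}(Z)$,

\[ d(\pi \circ F)_a(T_aM) = d\pi_{F(a)}(dF_a(T_aM)) = d\pi_{F(a)}(dF_a(T_aM) + T_{F(a)}Z) = d\pi_{F(a)}(\mathbb{R}^n) = \mathbb{R}^{n-k}, \]

where for the second equality we used the fact that $d\pi_{F(a)}(T_{F(a)}Z) = 0$. Therefore 0 is a regular value of $\pi \circ F$ and consequently $(\pi \circ F)^{-1}(0) = F^{-1}(Z)$ is a submanifold of $M$. Moreover,

\[ T_aF^{-1}(Z) = T_a(\pi \circ F)^{-1}(0) = \ker d(\pi \circ F)_a = \ker(d\pi_{F(a)} \circ dF_a) = (dF_a)^{-1}(\ker d\pi_{F(a)}) = (dF_a)^{-1}(T_{F(a)}Z). \]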


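Finally, since $d(\pi \circ F)_a$ is surjective,

\[ \dim F^{-1}(Z) = \dim \ker d(\pi \circ F)_a = \dim M - \dim \mathbb{R}^{n-k}. \]

Therefore

\[ \dim M - \dim F^{-1}(Z) = \dim M - (\dim M - \dim \mathbb{R}^{n-k}) = n - k = \dim N - \dim Z. \]

What about the general case? Since $Z$ is an embedded submanifold, for every $z \in Z$ there is a coordinate chart $\psi = (x_1, \ldots, x_n) : V \to \mathbb{R}^n$ on $N$ adapted to $Z$ with $z \in V$. Hence $\psi(Z \cap V) = \psi(V) \cap (\mathbb{R}^k \times \{0\})$. Now apply the previous argument to $\psi \circ F : F^{-1}(V) \to \mathbb{R}^n$ and $\psi(V) \cap (\mathbb{R}^k \times \{0\})$.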

Example 4.20. Consider two surfaces $S_1$ and $S_2$ in $\mathbb{R}^3$ such that $T_xS_1 \neq T_xS_2$ for every $x \in S_1 \cap S_2$. Then $T_xS_1 + T_xS_2 = \mathbb{R}^3$ for all $x \in S_1 \cap S_2$.

Let $F : S_1 \to \mathbb{R}^3$ be the inclusion map. Then $dF_x(T_xS_1) = T_xS_1$. Thus, $F$ is transverse to $S_2$. By the theorem above, $F^{-1}(S_2) = S_1 \cap S_2$ is a submanifold of $S_1$ of dimension 1. In other words, if two surfaces are nowhere tangent then they intersect in a collection of curves.

4.3. Embeddings, Immersions, and Rank.

Definition 4.21 (Immersion). A smooth map of manifolds $f : Z \to M$ is an immersion if its differential is injective at every point of $Z$.

Immersions need not be injective: consider the map $f : S^1 \to S^1$, $f(e^{i\theta}) = e^{2i\theta}$. It is a 2-to-1 map, but its differential is a bijection at every point.

Example 4.22. The inclusion map of an embedded submanifold is a 1-1 immersion.

Definition 4.23 (Submersion). A map $f : M \to N$ is called a submersion if its differential at every point is surjective.

Exercise 4.2. Show that for any manifold $M$ the canonical projection $\pi : TM \to M$ is a submersion (compute in the appropriate coordinates).

Exercise 4.3. Show that if $Z \subset M$ is an embedded submanifold, then $\pi^{-1}(Z) \subset TM$ is an embedded submanifold of the tangent bundle $TM$ of $M$. Here again $\pi : TM \to M$ is the projection. Note that $\pi^{-1}(Z) = \bigcup_{a \in Z} T_aM$. It is often denoted by $TM|_Z$.

Definition 4.24 (Embedding). A smooth map of manifolds $f : Z \to M$ is an embedding if $f(Z) \subset M$ is an embedded submanifold and $f : Z \to f(Z)$ is a diffeomorphism.

This says, in particular, that every embedding is a 1-1 immersion. The converse is not true.

Example 4.25. Let $Z$ be an interval and consider a map $f$ that sends it to a figure 8 as in the picture. Then $f : Z \to \mathbb{R}^2$ is a 1-1 immersion which is not an embedding: the topology on $f(Z)$ as a subspace of $\mathbb{R}^2$ is coarser than the topology on $f(Z)$ that makes $f : Z \to f(Z)$ a homeomorphism. Or, if you prefer, $f^{-1} : f(Z) \to Z$ is not continuous if $f(Z)$ is given the subspace topology.

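Example 4.26. Consider the map $f : \mathbb{R} \to S^1 \times S^1$ given by

\[ f(t) = (e^{2\pi i t}, e^{2\pi i \sqrt{2}\, t}). \]

The image of $f$ is dense in $S^1 \times S^1$. Hence $f$ is a 1-1 immersion which is not an embedding: a one-dimensional embedded submanifold of the two-dimensional torus cannot be dense.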

Definition 4.27 (Rank). The rank of a smooth map $f : M \to N$ of manifolds at a point $a \in M$ is the rank of the linear map $df_a : T_aM \to T_{f(a)}N$.

Proposition 4.28. If $f : M \to N$ is smooth and $\mathrm{rank}(f) = k$ at some point $a \in M$, then for all $a'$ sufficiently close to $a$, the rank of $f$ at $a'$ is at least $k$.

Proof. The rank of $f$ at $a$ is the rank of the matrix $\left(\frac{\partial (y_i \circ f)}{\partial x_j}(a)\right)$, where $(x_1, \ldots, x_m)$ are coordinates on $M$ near $a$ and $(y_1, \ldots, y_n)$ are coordinates on $N$ near $f(a)$. Write $f_i = y_i \circ f$. By a suitable permutation of coordinates, we may assume that $\det\left(\frac{\partial f_i}{\partial x_j}(a)\right)_{i,j \le k} \neq 0$. Since the determinant is a continuous mapping, this determinant is also non-zero for points sufficiently close to $a$.


The following theorem, which is a generalization of the Implicit Function Theorem, applies in particular to immersions, but we state the more general version.

Theorem 4.29 (Rank Theorem). Suppose that a smooth map $f : M \to N$ has rank $k$ at all points $a \in M$. Then for any point $a \in M$ there are a coordinate chart $\varphi : U \to \mathbb{R}^m$ on $M$ about $a$ and a chart $\psi : V \to \mathbb{R}^n$ on $N$ about $f(a)$ such that

\[ (\psi \circ f \circ \varphi^{-1})(r_1, \ldots, r_m) = (r_1, \ldots, r_k, 0, \ldots, 0). \]

We will not prove this theorem since we don't have the time.

Exercise 4.4. Define $f : \mathbb{R}^3 \to \mathbb{R}^6$ by $f(x, y, z) = (x^2, y^2, z^2, yz, zx, xy)$. Is $f$ an immersion? Show that the restriction of $f$ to $S^2$ is an immersion of $S^2$ into $\mathbb{R}^6$.

Exercise 4.5. Show that there is no immersion $f : S^2 \to \mathbb{R}^2$.

Exercise 4.6. (a) Let $N$ be a manifold. Prove that the diagonal $\Delta_N = \{(n, n) \in N \times N : n \in N\}$ is an embedded submanifold of $N \times N$.
(b) Let $f : M \to N$ and $g : L \to N$ be smooth maps such that, for all $m \in M$ and $l \in L$ with $f(m) = g(l)$, we have

\[ df_m(T_mM) + dg_l(T_lL) = T_rN, \quad r = f(m) = g(l). \]

Show that

\[ Z = \{(m, l) \in M \times L : f(m) = g(l)\} \]

is a submanifold of $M \times L$.

Exercise 4.7. Let $f : \mathbb{R}^n \to \mathbb{R}^n$ be a smooth map such that for every $x$ with $\|x\| \ge 2$, we have $\|f(x)\| < 1/\|x\|$. Show that
(a) $\|f\|$ attains its maximum value at a point of $\mathbb{R}^n$;
(b) $f$ is not an immersion.

Exercise 4.8. Let $N$ be a closed embedded submanifold of $M$. Show that every vector field $X$ on $N$ can be extended to a vector field $Y$ on $M$.

Hint: First extend the vector field in adapted coordinates. Next, use a partition of unity to combine the locally defined extensions into a global vector field.

Exercise 4.9. Consider $f(x, y) = y^2 + \frac{1}{6}x^6 - \frac{1}{2}x^2$ on $\mathbb{R}^2$. For each $c \in \mathbb{R}$, determine whether or not $f^{-1}(c)$ is a submanifold of $\mathbb{R}^2$. Justify your answer.

5. Vector fields and flows

5.1. Definitions, examples, correspondence between vector fields and flows. We start with a few words about notation. In this section $I$ and $J$ will stand for an open connected subset of the reals containing the origin, such as an open interval $(a, b)$ or half-infinite intervals $(-\infty, b)$ and $(a, +\infty)$ or the whole of $\mathbb{R}$ (of course $a < 0 < b$).

Recall next that given a curve $\gamma : I \to M$ in a manifold $M$, the tangent vector $\dot\gamma(t)$ to the curve at $\gamma(t)$ is

\[ \dot\gamma(t) := d\gamma_t\left(\frac{d}{dt}\right). \]

As you have proved in the homework, if $\gamma(t) = (\gamma_1(t), \ldots, \gamma_n(t))$ is a curve in $\mathbb{R}^n$, then $\dot\gamma(t)$ is the vector $(\gamma_1'(t), \ldots, \gamma_n'(t))$ where $'$ denotes the ordinary derivative. Next, an important definition:

Definition 5.1. A curve $\gamma : I \to M$ is an integral curve of a vector field $X$ on a manifold $M$ through the point $q$ if

\[ \dot\gamma(t) = X_{\gamma(t)} \text{ for all } t \in I, \qquad \gamma(0) = q. \]

In other words the tangent vector to the curve $\gamma$ at $t$ is the value of the vector field $X$ at $\gamma(t)$.

We are now in a position to summarize the goals of this subsection. We will see that vector fields are the geometric version of ordinary differential equations (ODEs) and integral curves are the geometric version of the solutions of ODEs. Using this connection with ODEs we will show that integral curves of vector fields exist and that on Hausdorff manifolds integral curves are unique. We will then assume that all our manifolds are Hausdorff. With this assumption we will show that all integral curves of a given vector field can be put together to form a flow. Moreover, there is a bijective correspondence between vector fields and flows. We will then use flows to give the Lie bracket a geometric meaning.

We first interpret the problem of existence of integral curves in coordinates. Let $\varphi = (x_1, \ldots, x_m) : U \to \mathbb{R}^m$ be a coordinate chart on a manifold $M$. Suppose $\gamma : I \to U$ is an integral curve of a vector field $X$. Since $X$ is a smooth vector field, there are smooth functions $f_i : U \to \mathbb{R}$, $1 \le i \le m$, so that

\[ X_a = \sum f_i(a) \frac{\partial}{\partial x_i}\Big|_a \]

for all $a \in U$ (of course $f_i = dx_i(X)$). Similarly,

\[
\begin{aligned}
\dot\gamma(t) = d\gamma_t\left(\frac{d}{dt}\right)
&= \sum dx_i\left(d\gamma_t\left(\frac{d}{dt}\right)\right) \frac{\partial}{\partial x_i}\Big|_{\gamma(t)} \\
&= \sum \frac{d}{dt}\Big|_t (x_i \circ \gamma)\, \frac{\partial}{\partial x_i}\Big|_{\gamma(t)} \\
&= \sum (x_i \circ \gamma)'(t)\, \frac{\partial}{\partial x_i}\Big|_{\gamma(t)}.
\end{aligned}
\]

Therefore, the equation $\dot\gamma(t) = X_{\gamma(t)}$ is equivalent to

\[ \sum (x_i \circ \gamma)'(t)\, \frac{\partial}{\partial x_i}\Big|_{\gamma(t)} = \sum f_i(\gamma(t))\, \frac{\partial}{\partial x_i}\Big|_{\gamma(t)} \]

for all $t \in I$. Thus $\gamma$ is an integral curve of $X$ in $U$ if and only if

\[ (x_i \circ \gamma)'(t) = f_i(\gamma(t)), \quad t \in I, \ 1 \le i \le m. \]

This is a system of ordinary differential equations. Conversely, any solution to the above system defines an integral curve of the vector field $X$ inside the open set $U$. We now quote without proof the appropriate theorem from the theory of ODEs.

Theorem 5.2. Let $V \subset \mathbb{R}^m$ be an open set and $F = (F_1, \ldots, F_m) : V \to \mathbb{R}^m$ a smooth map. For any point $q_0 \in V$ there is an open neighborhood $V_0$ of $q_0$, $\varepsilon > 0$ and a smooth map

\[ \Phi : (-\varepsilon, \varepsilon) \times V_0 \to V \]

so that for each $q \in V_0$ the curve $\gamma_q(t) := \Phi(t, q)$ is the unique solution of the ODE

\[ \gamma_i'(t) = F_i(\gamma(t)), \quad t \in (-\varepsilon, \varepsilon), \ 1 \le i \le m, \]

subject to the initial condition

\[ \gamma_q(0) = q. \]

The proof uses a contraction mapping principle and is similar to the proof of the inverse function theorem. We will have no time for it.
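Although the existence theorem is quoted without proof, the map $\Phi(t, q)$ can be approximated numerically. The sketch below is an illustration only and is not the contraction-mapping argument: it uses the classical fourth-order Runge-Kutta scheme, and the function name `rk4_flow`, the step count, and the sample vector field $F(x, y) = (-y, x)$ are all assumptions made for the example. For this $F$ the exact solution through $(1, 0)$ is $(\cos t, \sin t)$, which gives something to compare against.

```python
import math

def rk4_flow(F, q, t, steps=1000):
    """Approximate Phi(t, q), the solution at time t of gamma' = F(gamma), gamma(0) = q."""
    h = t / steps
    y = list(q)
    for _ in range(steps):
        k1 = F(y)
        k2 = F([yi + 0.5 * h * ki for yi, ki in zip(y, k1)])
        k3 = F([yi + 0.5 * h * ki for yi, ki in zip(y, k2)])
        k4 = F([yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h * (a + 2 * b + 2 * c + d) / 6
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
    return y

def F(p):
    # sample smooth map F : R^2 -> R^2 (an assumption for this example)
    x, y = p
    return [-y, x]

approx = rk4_flow(F, [1.0, 0.0], 1.0)
exact = [math.cos(1.0), math.sin(1.0)]
print(approx, exact)
```

Varying the initial point `q` while keeping `t` fixed gives a numerical picture of the smooth dependence on initial conditions asserted in the theorem.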

Corollary 5.2.1. Suppose $X$ is a vector field on a manifold $M$. For every point $q_0 \in M$ there is a neighborhood $U$ of $q_0$, $\varepsilon > 0$ and a smooth map

\[ \Phi : (-\varepsilon, \varepsilon) \times U \to M \]

so that for any $q \in U$,

\[ \gamma_q(t) := \Phi(t, q) \]

is the unique integral curve of $X$ through $q$. In particular, if $\sigma : I \to U$ is another integral curve of $X$ with $\sigma(0) = q$ then $\sigma(t) = \gamma_q(t)$ for all $t \in I \cap (-\varepsilon, \varepsilon)$.


It is important to note that uniqueness of integral curves does depend on the fact that we keep track of the initial conditions.

Lemma 5.3. If $\gamma : I \to M$ is an integral curve of a vector field $X$ passing through $p$ then for any $s \in \mathbb{R}$, the curve

\[ \sigma(t) = \gamma(t + s) \]

is also an integral curve of $X$. However, at time 0 it passes through $q = \gamma(s)$.⁴ (The curve $\sigma$ is defined on $I' = \{t \in \mathbb{R} \mid t + s \in I\}$.)

Proof. This is an easy application of the chain rule. Here are the gory details. Define the translation $\tau_s : I' \to I$ by $\tau_s(t) = t + s$. Then $\sigma = \gamma \circ \tau_s$. Note that $d(\tau_s)_t : \mathbb{R} \to \mathbb{R}$ is the identity. Hence

\[
\begin{aligned}
\dot\sigma(t) = d\sigma_t\left(\frac{d}{dt}\right) = d(\gamma \circ \tau_s)_t\left(\frac{d}{dt}\right)
&= \left(d\gamma_{s+t} \circ d(\tau_s)_t\right)\left(\frac{d}{dt}\right) \\
&= d\gamma_{s+t}\left(\frac{d}{dt}\right) = \dot\gamma(t + s) = X_{\gamma(t+s)} = X_{\sigma(t)}.
\end{aligned}
\]

The open set $U$ in Corollary 5.2.1 above lies inside some coordinate chart on $M$. Therefore the general uniqueness of integral curves of $X$ doesn't quite follow from the corollary. Here is an example where the uniqueness fails.

Example 5.4. Consider first the real line $\mathbb{R}$ with the constant vector field $\frac{d}{dt}$. The corresponding differential equation is

\[ \gamma'(t) = 1. \]

The solutions are curves of the form $\gamma(t) = p + t$.

Now consider the non-Hausdorff manifold $M$ obtained by gluing two copies of $\mathbb{R}$ along $\mathbb{R} \setminus \{0\}$. More precisely, let $\tilde M = \mathbb{R} \times \{0, 1\}$. Define an equivalence relation $\sim$ on $\tilde M$ by $(x, 0) \sim (x, 1)$ for all $x \neq 0$. Let $M = \tilde M/\!\sim$. We write $[x, 0]$ and $[x, 1]$ for the equivalence classes of $(x, 0)$ and $(x, 1)$ respectively. Note that by design $[0, 0] \neq [0, 1]$. These are the "two origins" of the "line" $M$. For $x \neq 0$ we have $[x, 0] = [x, 1]$.

Note that $M$ comes with two natural coordinate charts: $\varphi([x, 0]) = x$ and $\psi([x, 1]) = x$ for all $x \in \mathbb{R}$. The change of coordinates $\varphi \circ \psi^{-1}$ is defined on all of $\mathbb{R} \setminus \{0\}$ and is the identity map. It follows that the constant vector field $\frac{d}{dt}$ defines a vector field $X$ on $M$. Moreover, $\gamma(t) = \varphi^{-1}(t + 1)$ and $\sigma(t) = \psi^{-1}(t + 1)$ are integral curves of $X$ with $\gamma(0) = [1, 0] = [1, 1] = \sigma(0)$. Additionally $\gamma(t) = \sigma(t)$ except for $t = -1$, where $\gamma(-1) = [0, 0] \neq [0, 1] = \sigma(-1)$.

Why do problems like these not occur on Hausdorff manifolds? The key point is: a manifold $M$ is Hausdorff if and only if the diagonal

\[ \Delta_M := \{(m, m) \in M \times M \mid m \in M\} \]

is closed in $M \times M$ [prove it]. Consequently, if $\gamma : I \to M$ and $\sigma : J \to M$ are two curves, then the set

\[ K := \{t \in I \cap J \mid \gamma(t) = \sigma(t)\}, \]

where the two curves agree, is closed in $I \cap J$. Indeed, $K$ is the preimage of $\Delta_M$ under the map $I \cap J \ni t \mapsto (\gamma(t), \sigma(t)) \in M \times M$.

Now suppose additionally that $\gamma$ and $\sigma$ are two integral curves of a vector field $X \in \Gamma(TM)$. Then, by Corollary 5.2.1 and Lemma 5.3, the set of points $K$ is also open in $I \cap J$. Since $I \cap J$ is an interval and $K$ is open and closed, it follows that the set $K$ has to be all of $I \cap J$. This gives us uniqueness: two integral curves of a given vector field passing through a given point at $t = 0$ agree for all $t$ in the intersection of their domains of definition.

Furthermore it makes sense to take the union of the integral curves $\gamma$ and $\sigma$:

\[ (\gamma \cup \sigma)(t) := \begin{cases} \gamma(t), & t \in I \\ \sigma(t), & t \in J. \end{cases} \]

Taking the union of all integral curves of a vector field $X$ passing through a given point $p$ we get a maximal integral curve $\gamma_p : I_p \to M$ of $X$ passing through $p$. It is maximal in the following sense: if $\gamma : I \to M$ is any other integral curve of $X$ passing through $p$ then $I \subset I_p$ and $\gamma(t) = \gamma_p(t)$ for all $t \in I$. We have proved the following lemma.

⁴ In general $\gamma(s) \neq \gamma(0)$. If $\gamma(t) = \gamma(0)$ for all $t$, then $\gamma(t) = \sigma(t)$.

Lemma 5.5. Let $M$ be a Hausdorff manifold and $X \in \Gamma(TM)$ a vector field. For any two integral curves $\gamma : I \to M$ and $\sigma : J \to M$ of $X$ with $\gamma(0) = \sigma(0)$,

\[ \{t \in I \cap J \mid \gamma(t) = \sigma(t)\} = I \cap J. \]

Consequently, for any point $p \in M$ there is a unique maximal integral curve $\gamma_p$ of $X$ passing through $p$.

From now on, unless noted otherwise, all manifolds are assumed to be Hausdorff.

Example 5.6. An integral curve of a vector field need not be defined for all time. Here is a simple example. Let $M = (-\infty, 0)$ and $X = \frac{d}{dt}$. Then the maximal integral curve $\gamma_p$ of $X$ passing through $p \in (-\infty, 0)$ is given by $\gamma_p(t) = p + t$, hence is defined only when $p + t < 0$, i.e., $t < -p$.

Corollary 5.2.1 has another important consequence: the maximal integral curve $\gamma_p$ of the vector field $X$ depends smoothly on the point $p$. We can therefore put the maximal integral curves together and obtain a map

\[ \Phi(t, p) = \gamma_p(t) \quad \text{for all } t \in I_p \text{ and all } p \in M. \tag{5.1} \]

We have to be a bit careful about the set where the map $\Phi$ is defined. It is defined on a subset $A$ of $\mathbb{R} \times M$ containing $\{0\} \times M$. Moreover, by Corollary 5.2.1, the subset $A$ is open.

Definition 5.7. We use the notation above: $\gamma_p$ denotes the maximal integral curve through the point $p$ of a vector field $X$ on a manifold $M$. The map

\[ \Phi : \mathbb{R} \times M \supset A \to M \]

defined by (5.1) is called the (local) flow of the vector field $X$.

The word "local" refers to the fact that the flow $\Phi$ need not be defined for all time $t$ but only for $t$ in some neighborhood of 0, a neighborhood that depends on the point $p$. If the set $A$ in the definition above is all of $\mathbb{R} \times M$, we say that $X$ has a global flow.

Lemma 5.8. Let $X$ be a vector field on a Hausdorff manifold $M$ and let $\Phi : \mathbb{R} \times M \supset A \to M$ denote its local flow. Then

\[ \Phi(t, \Phi(s, p)) = \Phi(s + t, p) \]

for all $p \in M$ and all $s, t \in \mathbb{R}$ for which both sides of the equation make sense.

Proof. Fix $p \in M$ and $s \in \mathbb{R}$. Let $\gamma(t) = \Phi(t, \Phi(s, p))$ and let $\sigma(t) = \Phi(s + t, p)$. Then $\gamma$ is the maximal integral curve of $X$ passing through $\Phi(s, p)$. By Lemma 5.3, $\sigma(t)$ is the maximal integral curve of $X$ passing through $\sigma(0) = \Phi(s + 0, p)$. Therefore, since maximal integral curves are unique on Hausdorff manifolds, $\gamma(t) = \sigma(t)$ for all $t$.

This motivates the following definition.

Definition 5.9 (abstract local flow). A local flow on a manifold $M$ is a map $\Psi : A \to M$, where $A$ is an open subset of $\mathbb{R} \times M$ containing $\{0\} \times M$, having the following two properties:

(1) $\Psi(0, p) = p$ for all $p \in M$;
(2) $\Psi(t, \Psi(s, p)) = \Psi(s + t, p)$ whenever both sides make sense.

Example 5.10. Let $M = \mathbb{R}^n$. The map $\Psi : \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}^n$, $\Psi(t, p) = e^t p$ is a flow.

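Example 5.11. Let $M = \mathbb{R}^2$. The map

\[ \Psi(t, (x, y)) = \begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \]

is a flow. The example can be described more succinctly in complex coordinates: let $M = \mathbb{C}$ and write $\Psi(t, z) = e^{it}z$.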
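The two conditions of Definition 5.9 can be spot-checked numerically for the rotation map of Example 5.11. This is only a sanity check at one sample point, not a proof; the function name `psi` and the sample values of $s$, $t$ and $p$ are arbitrary choices made for the illustration.

```python
import math

def psi(t, p):
    # the matrix of Example 5.11 applied to the column vector (x, y)
    x, y = p
    return (math.cos(t) * x + math.sin(t) * y,
            -math.sin(t) * x + math.cos(t) * y)

p = (0.3, -1.2)
s, t = 0.7, 1.9

identity_check = psi(0.0, p)   # condition (1): Psi(0, p) = p
lhs = psi(t, psi(s, p))        # Psi(t, Psi(s, p))
rhs = psi(s + t, p)            # Psi(s + t, p)
print(identity_check, lhs, rhs)
```

Condition (2) here is the angle-addition formula for sine and cosine in disguise.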


It may be a bit hard to see what the meaning of the two conditions of Definition 5.9 really is. It's easier to understand what's going on in the case where $\Psi$ is a global flow, that is, when the domain of definition $A$ of $\Psi$ is all of $\mathbb{R} \times M$.

Given a global flow $\Psi : \mathbb{R} \times M \to M$, we have, for each $t \in \mathbb{R}$, a map

\[ \Psi_t : M \to M, \quad \Psi_t(q) := \Psi(t, q). \]

Condition (1) in the definition of local flow then simply says that $\Psi_0$ is the identity map $\mathrm{id}_M$ on $M$. Condition (2) becomes

\[ \Psi_t(\Psi_s(q)) = \Psi_{t+s}(q) \tag{5.2} \]

for all $t, s \in \mathbb{R}$ and $q \in M$. Hence $\Psi_t \circ \Psi_{-t} = \mathrm{id}_M = \Psi_{-t} \circ \Psi_t$. Consequently $\Psi_t : M \to M$ is a diffeomorphism for each $t \in \mathbb{R}$. Moreover, we can interpret (5.2) as saying that we have a homomorphism of groups

\[ \mathbb{R} \ni t \mapsto \Psi_t \in \mathrm{Diff}(M), \]

where $\mathrm{Diff}(M)$ denotes the group of diffeomorphisms of $M$ (it's a group under composition). This is why global flows are also referred to as 1-parameter groups of diffeomorphisms.

Now let's return to vector fields. The point of much of the preceding discussion is that for a vector field $X$ on a (Hausdorff) manifold $M$ the collection of integral curves taken together forms a local flow. The converse is true as well.

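Lemma 5.12. Let $\Psi : \mathbb{R} \times M \supset A \to M$ be a local flow. Then the map $X : C^\infty(M) \to C^\infty(M)$ defined by

\[ Xf(p) = \frac{d}{dt}\Big|_0 f(\Psi(t, p)) \]

for all $f \in C^\infty(M)$ is a vector field. Moreover, $\Psi$ is the local flow of $X$.

Proof. Since $\Psi(t, p)$ is a smooth function of $t$ and $p$, $f(\Psi(t, p))$ is also a smooth function of $t$ and $p$, and its derivative $\frac{d}{dt}\big|_0 f(\Psi(t, p))$ is a smooth function of $p$. For any $f, g \in C^\infty(M)$, we have

\[ (fg)(\Psi(t, p)) = f(\Psi(t, p))\, g(\Psi(t, p)). \]

Hence, by the product rule, $X(fg) = X(f)g + fX(g)$, i.e., $X$ is a vector field.

It remains to check that $\Psi$ is the local flow of $X$. We need to show that for each $p \in M$,

\[ \gamma_p'(t) = X_{\gamma_p(t)} \]

where $\gamma_p(t) := \Psi(t, p)$. Let $f \in C^\infty(M)$ be a function. Then

\[
\begin{aligned}
(\gamma_p'(t)) f &= \frac{d}{ds}\Big|_{s=t} f(\gamma_p(s)) \\
&= \frac{d}{ds}\Big|_{s=t} f(\Psi(s, p)) \\
&= \frac{d}{ds}\Big|_{s=0} f(\Psi(s + t, p)) \\
&= \frac{d}{ds}\Big|_{s=0} f(\Psi(s, \Psi(t, p))) \\
&= (Xf)(\Psi(t, p)) = X_{\Psi(t,p)} f = X_{\gamma_p(t)} f.
\end{aligned}
\]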

Definition 5.13. A vector field is complete if its local flow is a global flow. That is, each integral curve is defined for all $t \in \mathbb{R}$.

Here are two examples of vector fields that are not complete.

Example 5.14. The vector field $\frac{d}{dt}$ on $(-\infty, 0)$ is not complete.

Example 5.15. The vector field $x^2 \frac{d}{dx}$ on $\mathbb{R}$ is not complete: $\Phi(t, x) = \frac{x}{1 - xt}$ is its local flow [check it]. The flow is defined for $t \in (-\infty, 1/x)$ if $x > 0$ and for $t \in (1/x, +\infty)$ if $x < 0$.
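One way to carry out the check suggested in Example 5.15 is a direct computation, sketched here for $t$ in the stated domain (so that $1 - xt \neq 0$). The first line verifies the differential equation $\gamma' = \gamma^2$ and the initial condition; the second verifies the flow property of Lemma 5.8 wherever both sides are defined.

```latex
\[
\frac{\partial}{\partial t}\,\frac{x}{1 - xt} = \frac{x^2}{(1 - xt)^2} = \Phi(t, x)^2,
\qquad \Phi(0, x) = x,
\]
\[
\Phi(t, \Phi(s, x)) = \frac{\dfrac{x}{1 - xs}}{1 - \dfrac{xt}{1 - xs}}
= \frac{x}{1 - xs - xt} = \Phi(s + t, x).
\]
```

For $x > 0$ the solution blows up as $t \to 1/x$ from below, which is why the flow cannot be extended to all of $\mathbb{R}$.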

It is nice to know when a vector field has a global flow. For this purpose we define:

Definition 5.16. The support of a vector field $X$ on a manifold $M$ is

\[ \mathrm{supp}(X) = \overline{\{p \in M \mid X_p \neq 0\}}, \]

the closure of the set of points where $X$ is non-zero.

Theorem 5.17. A vector field with compact support is complete. In particular any vector field on a compact manifold defines a global flow.

Recall that we are tacitly assuming throughout that all our manifolds are Hausdorff. Also, recall that any closed subset of a compact space is compact. Hence any vector field on a compact manifold has compact support. There is more than one way to prove the theorem above. For our proof we will need the following lemma.

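Lemma 5.18. Let $X \in \Gamma(TM)$ be a vector field with the flow $\Phi : \mathbb{R} \times M \supset A \to M$. Suppose that $\{\tau\} \times M \subset A$, that is, the flow of $X$ is defined for time $\tau$ at all points of $M$. Then

\[ d(\Phi_\tau)_m(X_m) = X_{\Phi_\tau(m)} \]

for all $m \in M$. Here, as before, $\Phi_\tau(m) := \Phi(\tau, m)$.

Proof. Let $\gamma_m(t)$ be the maximal integral curve of $X$ through $m$: $\gamma_m(t) = \Phi(t, m)$. Then $X_m = \gamma_m'(0)$. Also,

\[ \Phi_\tau(\gamma_m(t)) = \Phi(\tau, \Phi(t, m)) = \Phi(\tau + t, m) = \Phi(t, \Phi(\tau, m)) = \gamma_{\Phi_\tau(m)}(t). \]

Hence

\[
\begin{aligned}
d(\Phi_\tau)_m(X_m) = d(\Phi_\tau)_m(\gamma_m'(0)) &= \left(d(\Phi_\tau)_m \circ d(\gamma_m)_0\right)\left(\frac{d}{dt}\right) \\
&= d(\Phi_\tau \circ \gamma_m)_0\left(\frac{d}{dt}\right) = (\Phi_\tau \circ \gamma_m)'(0) = (\gamma_{\Phi_\tau(m)})'(0) = X_{\gamma_{\Phi_\tau(m)}(0)} = X_{\Phi_\tau(m)}.
\end{aligned}
\]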

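Proof of Theorem 5.17. We want to show that the domain of definition $A$ of the local flow $\Phi(t, p)$ of $X$ is all of $\mathbb{R} \times M$. If $X_m = 0$ then the constant curve $\gamma_m(t) = m$ is the integral curve of $X$ through $m$. It is defined for all $t$. Therefore on $M \setminus \mathrm{supp}\, X$ the flow is defined for all $t$: $\mathbb{R} \times (M \setminus \mathrm{supp}\, X) \subset A$. Also $\{0\} \times M \subset A$ by definition of the flow. In particular $\{0\} \times \mathrm{supp}\, X \subset A$. Since $\mathrm{supp}\, X$ is compact and $A$ is open, there is $\varepsilon > 0$ so that $[-\varepsilon, \varepsilon] \times \mathrm{supp}\, X \subset A$. Hence $[-\varepsilon, \varepsilon] \times M \subset A$. We now define $\tilde\Phi : [0, 2\varepsilon] \times M \to M$ by

\[ \tilde\Phi(t, p) = \Phi(\varepsilon, \Phi(t - \varepsilon, p)) = \Phi_\varepsilon(\Phi(t - \varepsilon, p)). \]

Here, as before, $\Phi_\varepsilon(q) = \Phi(\varepsilon, q)$. We claim that for any $p \in M$ the curve

\[ \tilde\gamma_p(t) = \begin{cases} \Phi(t, p), & t \in [-\varepsilon, \varepsilon] \\ \tilde\Phi(t, p), & t \in [0, 2\varepsilon] \end{cases} \]

is an integral curve of $X$. (On the overlap $[0, \varepsilon]$ the two expressions agree by Lemma 5.8.) Indeed, for $t \in [0, 2\varepsilon]$, by definition of $\tilde\gamma_p$ and $\tilde\Phi$,

\[ \tilde\gamma_p'(t) = d\Phi_\varepsilon(\gamma_p'(t - \varepsilon)) = d\Phi_\varepsilon(X_{\Phi(t - \varepsilon, p)}) = X_{\Phi_\varepsilon(\Phi(t - \varepsilon, p))} = X_{\tilde\Phi(t, p)} = X_{\tilde\gamma_p(t)}, \]

where the third equality holds by Lemma 5.18. It follows that the maximal integral curve $\gamma_p$ of $X$ through $p$ is defined for $t \in [-\varepsilon, 2\varepsilon]$. Hence $[-\varepsilon, 2\varepsilon] \times M \subset A$. Arguing inductively we get $[-k\varepsilon, n\varepsilon] \times M \subset A$ for all positive integers $k$ and $n$. Therefore $A = \mathbb{R} \times M$ and $X$ is complete.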


5.2. The geometry of the Lie bracket. As before, we continue to assume that all manifolds are Hausdorff. Additionally, in this subsection we will pretend that all flows are global, equivalently, that all vector fields are complete. Assuming completeness is not necessary. On the other hand carrying out the argument in full generality obscures the main simple ideas.

Definition 5.19. Let $X$ and $Y$ be two vector fields with $\Phi$ denoting the flow of $X$. The Lie derivative $L_XY$ of $Y$ with respect to $X$ is a vector field defined by

\[ (L_XY)_p := \lim_{t \to 0} \frac{1}{t}\left(d(\Phi_{-t})_{\Phi_t(p)}(Y_{\Phi_t(p)}) - Y_p\right) = \frac{d}{dt}\Big|_{t=0} d(\Phi_{-t})_{\Phi_t(p)}(Y_{\Phi_t(p)}) \]

for all $p \in M$.

Several remarks are in order. First, it is not entirely clear that the Lie derivative, as defined above, is a smooth vector field. We will prove this shortly. Second, $t \mapsto d(\Phi_{-t})_{\Phi_t(p)}(Y_{\Phi_t(p)})$ is a curve in a finite dimensional vector space $T_pM$. We have seen that for $\mathbb{R}^n$, we can always canonically identify $T_p\mathbb{R}^n$ with $\mathbb{R}^n$. A moment of reflection should convince you that the same identification works for any finite dimensional vector space. Hence it does make sense to think of the Lie derivative $(L_XY)_p$ as a vector in the tangent space $T_pM$. Finally, note that if $\gamma : I \to T_pM$ is any smooth curve, then

\[ \left(\frac{d}{dt}\Big|_{t=0} \gamma(t)\right) f = \frac{d}{dt}\Big|_{t=0} (\gamma(t) f) \quad \text{for any } f \in C^\infty(M). \tag{5.3} \]

We will need the equation in the proof of Theorem 5.20 below. There are many ways to prove (5.3). For example, pick a basis of $T_pM$ and compute both sides of (5.3) in coordinates defined by the basis. What makes the proof work is the fact that partials commute. With these preliminaries out of the way we are ready to state the main result of the subsection.

Theorem 5.20. The Lie derivative is the Lie bracket. That is, for any two vector fields $X$ and $Y$ on a manifold $M$,

\[ (L_XY)_p = ([X, Y])_p \]

for all points $p \in M$.

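Proof. Denote the flow of $Y$ by $\Psi$. We evaluate $(L_XY)_p$ on an arbitrary smooth function $f \in C^\infty(M)$:

\[
\begin{aligned}
(L_XY)_p f &= \left(\frac{d}{dt}\Big|_{t=0} d(\Phi_{-t})_{\Phi_t(p)}(Y_{\Phi_t(p)})\right) f \\
&= \frac{d}{dt}\Big|_{t=0} \left(d(\Phi_{-t})_{\Phi_t(p)}(Y_{\Phi_t(p)})\, f\right) \\
&= \frac{d}{dt}\Big|_{t=0} Y_{\Phi_t(p)}(f \circ \Phi_{-t}) \\
&= \frac{d}{dt}\Big|_{t=0} \left(\frac{\partial}{\partial s}\Big|_{s=0} (f \circ \Phi_{-t})(\Psi_s(\Phi_t(p)))\right) \\
&= \frac{\partial^2}{\partial t\,\partial s}\Big|_{(0,0)} (f \circ \Phi_{-t} \circ \Psi_s \circ \Phi_t)(p) \\
&= \frac{d}{ds}\Big|_{s=0} \left(\frac{\partial}{\partial t}\Big|_{t=0} (f \circ \Phi_{-t} \circ \Psi_s \circ \Phi_t)(p)\right) \\
&= \frac{d}{ds}\Big|_{s=0} \left(\frac{\partial}{\partial t}\Big|_{t=0} (f \circ \Phi_{-t})(\Psi_s(\Phi_0(p))) + \frac{\partial}{\partial t}\Big|_{t=0} (f \circ \Phi_{-0} \circ \Psi_s)(\Phi_t(p))\right) \\
&= \frac{d}{ds}\Big|_{s=0} \left(-X_{\Psi_s(p)} f + \frac{\partial}{\partial t}\Big|_{t=0} (f \circ \Psi_s \circ \Phi_t)(p)\right) \\
&= -\frac{d}{ds}\Big|_{s=0} (Xf)(\Psi_s(p)) + \frac{d}{dt}\Big|_{t=0} \frac{\partial}{\partial s}\Big|_{s=0} (f \circ \Psi_s)(\Phi_t(p)) \\
&= -Y(Xf)(p) + \frac{d}{dt}\Big|_{t=0} (Yf)(\Phi_t(p)) \\
&= -Y(Xf)(p) + X(Yf)(p) = ([X, Y]_p) f.
\end{aligned}
\]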

In particular this proves that the Lie derivative $L_XY$ is a vector field. As a corollary to the above proof, we get:

Corollary 5.20.1. Let $\Phi$ and $\Psi$ denote the flows of vector fields $X$ and $Y$ respectively. Then for any smooth function $f$,

\[ ([X, Y]f)(p) = \frac{\partial^2}{\partial t\,\partial s}\Big|_{(0,0)} (f \circ \Phi_{-t} \circ \Psi_s \circ \Phi_t)(p). \tag{5.4} \]

Note that if the flows $\Phi_t$ and $\Psi_s$ commute, that is,

\[ \Phi_t \circ \Psi_s = \Psi_s \circ \Phi_t \quad \text{for all } t \text{ and } s, \]

then $\Phi_{-t} \circ \Psi_s \circ \Phi_t = \Psi_s \circ \Phi_{-t} \circ \Phi_t = \Psi_s$. In particular, it's independent of $t$. Hence the right hand side of (5.4) is 0. Therefore $[X, Y] = 0$. The converse is true as well.
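As an illustration of formula (5.4), here is a small worked example; the choice of vector fields is ours and does not come from the notes. On $\mathbb{R}^2$ take $X = \frac{\partial}{\partial x}$ and $Y = x \frac{\partial}{\partial y}$, whose flows are global and explicit. The mixed partial derivative in (5.4) can then be computed by hand and compared with the bracket computed directly from the definition.

```latex
\[
\Phi_t(x, y) = (x + t,\, y), \qquad \Psi_s(x, y) = (x,\, y + s x),
\]
\[
(\Phi_{-t} \circ \Psi_s \circ \Phi_t)(x, y) = \Phi_{-t}\bigl(x + t,\; y + s(x + t)\bigr) = \bigl(x,\; y + s(x + t)\bigr),
\]
\[
\frac{\partial^2}{\partial t\,\partial s}\Big|_{(0,0)} f\bigl(x,\; y + s(x + t)\bigr)
= \frac{\partial}{\partial t}\Big|_{t=0} (x + t)\,\frac{\partial f}{\partial y}(x, y)
= \frac{\partial f}{\partial y}(x, y),
\]
\[
[X, Y]f = \frac{\partial}{\partial x}\Bigl(x \frac{\partial f}{\partial y}\Bigr) - x\,\frac{\partial}{\partial y}\Bigl(\frac{\partial f}{\partial x}\Bigr) = \frac{\partial f}{\partial y}.
\]
```

Both computations give $[X, Y] = \frac{\partial}{\partial y}$. The flows visibly fail to commute: $\Phi_t \circ \Psi_s$ and $\Psi_s \circ \Phi_t$ differ by $st$ in the second coordinate.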

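Lemma 5.21. Let $\Phi$ and $\Psi$ denote the flows on a manifold $M$ of vector fields $X$ and $Y$ respectively. Then

\[ [X, Y] = 0 \quad \text{if and only if} \quad \Phi_t \circ \Psi_s = \Psi_s \circ \Phi_t \text{ for all } t \text{ and } s. \]

Proof. We have just proved that if the flows commute the Lie bracket has to vanish. Now suppose $[X, Y] = 0$. Our proof will use the following observation. Let $V$ and $W$ be two finite dimensional vector spaces, $T : V \to W$ a linear map and $\gamma : I \to V$ a smooth curve. Then, since $dT = T$,

\[ (T \circ \gamma)'(t) = T(\gamma'(t)). \]

Here, again, we identify $\gamma'(t)$ with a vector in $V$ and similarly for $(T \circ \gamma)'(t)$. With the preliminaries out of the way, we proceed with the actual proof. Since $[X, Y] = 0$,

\[ 0 = \frac{d}{dh}\Big|_{h=0} (d\Phi_{-h})(Y_{\Phi_h(p)}) \]

for all points $p$. Hence, with $\bullet$ denoting the appropriate point,

\[
\begin{aligned}
\frac{d}{dt}\Big|_{t=s} d(\Phi_{-t})_\bullet(Y_{\Phi_t(p)}) &= \frac{d}{dh}\Big|_{h=0} (d\Phi_{-(s+h)})_\bullet(Y_{\Phi_{s+h}(p)}) \\
&= \frac{d}{dh}\Big|_{h=0} d(\Phi_{-s})_\bullet\left[(d\Phi_{-h})_\bullet(Y_{\Phi_h(\Phi_s(p))})\right] \\
&= d(\Phi_{-s})_\bullet\left[\frac{d}{dh}\Big|_{h=0} d(\Phi_{-h})_\bullet(Y_{\Phi_h(\Phi_s(p))})\right] = 0.
\end{aligned}
\]

Here, in the last equality we used the fact that $d(\Phi_{-s})_\bullet$ is a linear map between tangent spaces and $h \mapsto d(\Phi_{-h})_\bullet(Y_{\Phi_h(\Phi_s(p))})$ is a curve in the tangent space $T_{\Phi_s(p)}M$. Hence the curve $t \mapsto d(\Phi_{-t})_\bullet(Y_{\Phi_t(p)})$ is a constant curve. In particular,

\[ d(\Phi_{-t})_\bullet(Y_{\Phi_t(p)}) = d(\Phi_{-0})_\bullet(Y_{\Phi_0(p)}) = Y_p \]

for all $t$. Consequently

\[ Y_{\Phi_t(p)} = d(\Phi_t)_p Y_p \quad \text{for all } t. \tag{5.5} \]

We use the equation above to argue that $\sigma(s) = \Phi_t(\Psi_s(p))$ is an integral curve of $Y$ passing through $\Phi_t(p)$:

\[
\begin{aligned}
\sigma'(s) &= \frac{d}{d\tau}\Big|_{\tau=s} \left[\Phi_t(\Psi_\tau(p))\right] \\
&= (d\Phi_t)\left(\frac{d}{d\tau}\Big|_{\tau=s} \Psi_\tau(p)\right) \\
&= (d\Phi_t)(Y_{\Psi_s(p)}) = Y_{(\Phi_t \circ \Psi_s)(p)} \quad \text{by (5.5)} \\
&= Y(\sigma(s)).
\end{aligned}
\]

On the other hand $s \mapsto \Psi_s(\Phi_t(p))$ is also an integral curve of $Y$ passing through $\Phi_t(p)$. Therefore the two curves are equal:

\[ \Phi_t(\Psi_s(p)) = \Psi_s(\Phi_t(p)). \]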


We end the section with a somewhat technical subsection. The point of this subsection will not be apparent for some time.

5.3. Map-related vector fields. Recall that given a smooth map between manifolds $f : M \to N$, for each point $p \in M$ we get a map of tangent spaces $df_p : T_pM \to T_{f(p)}N$. Therefore, given a vector field $X : M \to TM$ we get for each $p \in M$ a vector $df_p(X_p) \in T_{f(p)}N$. In general this does not define a vector field on $N$. If additionally $f$ is a diffeomorphism we can make it into a vector field: define

\[ X'(q) = df_{f^{-1}(q)}\, X_{f^{-1}(q)}. \]

The new vector field $X'$ is related to the old vector field $X$ by

\[ df \circ X = X' \circ f \]

where we think of $X$, $X'$ and $df$ as maps $X : M \to TM$, $X' : N \to TN$ and $df : TM \to TN$ respectively. In other words, the diagram

\[
\begin{array}{ccc}
TM & \xrightarrow{\;df\;} & TN \\
{\scriptstyle X}\big\uparrow & & \big\uparrow{\scriptstyle X'} \\
M & \xrightarrow{\;f\;} & N
\end{array}
\]

commutes.

Definition 5.22. Let $X : M \to TM$, $Y : N \to TN$ be two vector fields and $f : M \to N$ a smooth map. The two vector fields $X$ and $Y$ are $f$-related if

\[ df \circ X = Y \circ f. \]

Example 5.23. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be the projection onto the first factor: $f(x, y) = x$. Then any vector field of the form $X = \frac{\partial}{\partial x} + g(x, y) \frac{\partial}{\partial y}$, where $g : \mathbb{R}^2 \to \mathbb{R}$ is a smooth function, is $f$-related to $Y = \frac{d}{dx}$.

The main fact worth remembering about related vector fields is that Lie brackets go to Lie brackets. More precisely,

Lemma 5.24. Let $X_1, X_2 : M \to TM$ and $Y_1, Y_2 : N \to TN$ be two pairs of vector fields related by a map $f : M \to N$:

\[ df \circ X_i = Y_i \circ f, \quad i = 1, 2. \]

Then

\[ df \circ [X_1, X_2] = [Y_1, Y_2] \circ f. \]

Proof. Note that two vector fields $X$ and $Y$ are $f$-related ($f : M \to N$) if and only if for any smooth function $h \in C^\infty(N)$,

\[ (Yh)(f(p)) = X_p(h \circ f) \]

for all $p \in M$. Or, more concisely,

\[ (Yh) \circ f = X(h \circ f). \]

We now compute:

\[
\begin{aligned}
[X_1, X_2](h \circ f) &= X_1(X_2(h \circ f)) - X_2(X_1(h \circ f)) \\
&= X_1((Y_2h) \circ f) - X_2((Y_1h) \circ f) \\
&= (Y_1(Y_2h)) \circ f - (Y_2(Y_1h)) \circ f = ([Y_1, Y_2]h) \circ f.
\end{aligned}
\]

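Exercise 5.1. Find the flows of the following vector fields on $\mathbb{R}^2$:

\[ X = x_1 \frac{\partial}{\partial x_1} + x_2 \frac{\partial}{\partial x_2} \]

and

\[ Y = x_1 \frac{\partial}{\partial x_2} - x_2 \frac{\partial}{\partial x_1}. \]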


Exercise 5.2. Prove that if a vector field $X$ on a manifold $M$ vanishes at a point $p$, $X(p) = 0$, then there is an open set $W$ containing $p$ such that the flow of $X$ on $W$ exists for all $t \in [0, 1]$.

Exercise 5.3. Let $M$ be a manifold. An isotopy on $M$ is a collection of diffeomorphisms $\{f_t : M \to M\}_{t \in (-\varepsilon, \varepsilon)}$ such that

(1) $f_0$ is the identity, and
(2) the map $(-\varepsilon, \varepsilon) \times M \to M$ given by $(t, m) \mapsto f_t(m)$ is smooth.

A time-dependent vector field $X_t$ is a smooth map $(-\varepsilon, \varepsilon) \times M \to TM$ of the form $(t, m) \mapsto (X_t)_m =: X_t(m)$. An isotopy $f_t$ defines a time-dependent vector field $X_t$ by

\[ X_s(f_s(m)) = \frac{d}{dt}\Big|_{t=s} f_t(m). \]

Prove that given a time-dependent vector field $X_t$, there is an isotopy $f_t$ such that the equation above holds.
Hint: Let $\tilde X(t, m) = (\frac{d}{dt}, X_t(m))$; it is a vector field on $\mathbb{R} \times M$. The local flow $\Phi_s(t, m)$ of $\tilde X$ is of the form $\Phi_s(t, m) = (\Phi^1_s(t, m), \Phi^2_s(t, m))$. Show that $\Phi^1_s(t, m) = s + t$.

Exercise 5.4. Consider a time-dependent vector field $X_t(m) = t \frac{d}{d\theta}$ on $S^1$. Compute the corresponding isotopy.

Exercise 5.5. Suppose that $M$ and $N$ are manifolds. If $X \in \Gamma(TM)$ is a vector field, show that $\tilde X : M \times N \to T(M \times N) \simeq TM \times TN$ given by $\tilde X(m, n) = (X_m, 0)$ is a well-defined vector field on $M \times N$. Similarly, given $Y \in \Gamma(TN)$ we get $\tilde Y \in \Gamma(T(M \times N))$. Show that $[\tilde X, \tilde Y] = 0$.

Exercise 5.6. Suppose that $X$ and $Y$ are vector fields on $M$. Compute an expression for $[X, Y]$ in local coordinates.

6. (Multi)linear algebra

The goal of this section is to define tensors, tensor algebra and Grassmann (exterior) algebra. We will use these constructions to define tensors and differential forms on manifolds. In this section, unless noted otherwise, all vector spaces are over the real numbers and are finite dimensional. There are two ways to think about tensors:

(1) tensors are multi-linear maps;
(2) tensors are elements of a "tensor product" of two or more vector spaces.

The first way is more concrete. The second is more abstract but also more powerful.

6.1. Tensor products. We start by reviewing multi-linear maps.

Definition 6.1. Let $V_1, \ldots, V_n$ and $U$ be vector spaces. A map

\[ f : \underbrace{V_1 \times \cdots \times V_n}_{n \text{ factors}} \to U, \quad (v_1, \ldots, v_n) \mapsto f(v_1, \ldots, v_n) \]

is multi-linear if for each fixed index $i$ and a fixed $(n-1)$-tuple of vectors $v_1, \ldots, v_{i-1}, v_{i+1}, \ldots, v_n$ the map

\[ V_i \to U, \quad w \mapsto f(v_1, \ldots, v_{i-1}, w, v_{i+1}, \ldots, v_n) \]

is linear. When the number of factors is $n$, as above, we will also say that $f$ is $n$-linear.

For example, if we identify $\mathbb{R}^{n^2} \simeq \underbrace{\mathbb{R}^n \times \cdots \times \mathbb{R}^n}_{n \text{ factors}}$ by thinking of an $n \times n$ matrix as an $n$-tuple of column vectors, then the determinant

\[ \det : \underbrace{\mathbb{R}^n \times \cdots \times \mathbb{R}^n}_{n \text{ factors}} \to \mathbb{R}, \quad (v_1, \ldots, v_n) \mapsto \det(v_1 | \ldots | v_n) \]

is an $n$-linear map. Here is an example of a bilinear map. Any inner product on a vector space $V$,

\[ V \times V \ni (v, w) \mapsto v \cdot w \in \mathbb{R}, \]

is bilinear. There is no standard notation for the space of $n$-linear maps from $V_1 \times \cdots \times V_n$ to $U$. We will denote it by

\[ \mathrm{Mult}(V_1 \times \cdots \times V_n, U) = \mathrm{Mult}_n(V_1 \times \cdots \times V_n, U) \]

($n$ is to indicate that these are $n$-linear maps). This space, $\mathrm{Mult}(V_1 \times \cdots \times V_n, U)$, is a vector space: any linear combination of two $n$-linear maps is $n$-linear. We now take a closer look at the space of bilinear maps $\mathrm{Mult}_2(V \times W, U)$. This case is complicated enough to understand what happens with multi-linear maps in general, but simple enough not to bog down in notation.

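Lemma 6.2. Let $\{v_i\}$, $\{w_j\}$ and $\{u_k\}$ denote bases of $V$, $W$ and $U$ respectively and $\{v_i^*\}$, $\{w_j^*\}$ and $\{u_k^*\}$ the corresponding dual bases. Then the maps

\[ \varphi^k_{ij} : V \times W \to U, \quad \varphi^k_{ij}(v, w) = v_i^*(v)\, w_j^*(w)\, u_k \]

are bilinear and form a basis of $\mathrm{Mult}_2(V \times W, U)$. Hence

\[ \dim \mathrm{Mult}_2(V \times W, U) = \dim V \, \dim W \, \dim U. \]

Proof. It is easy to see that the $\varphi^k_{ij}$ are bilinear. Next, for any $b \in \mathrm{Mult}_2(V \times W, U)$, any $w \in W$ and any $v \in V$,

\[
\begin{aligned}
b(v, w) &= b\left(\sum v_i^*(v) v_i, \sum w_j^*(w) w_j\right) \\
&= \sum_{i,j} v_i^*(v)\, w_j^*(w)\, b(v_i, w_j) \\
&= \sum_{i,j,k} v_i^*(v)\, w_j^*(w)\, u_k^*(b(v_i, w_j))\, u_k \\
&= \sum_{i,j,k} u_k^*(b(v_i, w_j))\, \varphi^k_{ij}(v, w).
\end{aligned}
\]

Hence the maps $\varphi^k_{ij}$ span $\mathrm{Mult}_2(V \times W, U)$. Also, the collection of numbers $u_k^*(b(v_i, w_j))$ uniquely determines the bilinear map $b$. Hence the $\varphi^k_{ij}$ are linearly independent.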
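The expansion in the proof of Lemma 6.2 can be made concrete. The sketch below assumes NumPy and uses the cross product on $\mathbb{R}^3$ as a sample bilinear map $b$ with $V = W = U = \mathbb{R}^3$ and the standard bases; these choices, and the names `coeff` and `b_from_basis`, are illustrative assumptions. It stores the coefficients $u_k^*(b(v_i, w_j))$ in a $3 \times 3 \times 3$ array and rebuilds $b$ from them.

```python
import numpy as np

def b(v, w):
    # sample bilinear map R^3 x R^3 -> R^3 (an assumption for this example)
    return np.cross(v, w)

n = 3
E = np.eye(n)  # rows are the standard basis vectors

# coeff[k, i, j] = u_k^*( b(v_i, w_j) ), the coordinates of b in the basis phi^k_ij
coeff = np.zeros((n, n, n))
for i in range(n):
    for j in range(n):
        coeff[:, i, j] = b(E[i], E[j])

def b_from_basis(v, w):
    # sum over i, j, k of coeff[k, i, j] * v_i^*(v) * w_j^*(w) * u_k
    return np.einsum('kij,i,j->k', coeff, v, w)

v = np.array([1.0, -2.0, 0.5])
w = np.array([0.0, 3.0, 4.0])
print(b(v, w), b_from_basis(v, w), coeff.size)
```

The array has $3 \cdot 3 \cdot 3 = 27$ entries, in agreement with the dimension formula of the lemma.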

We now turn to the definition of the tensor product $V \otimes W$ [pronounced "V tensor W"] of two vector spaces $V$ and $W$. Informally it consists of finite linear combinations of symbols $v \otimes w$, where $v \in V$ and $w \in W$. Additionally, these symbols are subject to the following identities:

\[
\begin{aligned}
(v_1 + v_2) \otimes w - v_1 \otimes w - v_2 \otimes w &= 0 \\
v \otimes (w_1 + w_2) - v \otimes w_1 - v \otimes w_2 &= 0 \\
\alpha (v \otimes w) - (\alpha v) \otimes w &= 0 \\
\alpha (v \otimes w) - v \otimes (\alpha w) &= 0,
\end{aligned}
\]

for all $v, v_1, v_2 \in V$, $w, w_1, w_2 \in W$ and $\alpha \in \mathbb{R}$. These identities simply say that the map $\otimes : V \times W \to V \otimes W$, $(v, w) \mapsto v \otimes w$, is a bilinear map. The fact that everything in $V \otimes W$ is a linear combination of symbols $v \otimes w$ means that the image of the map $\otimes : V \times W \to V \otimes W$ spans $V \otimes W$.⁵ Here is the formal definition of the tensor product of two vector spaces.

Definition 6.3. A tensor product of two finite dimensional vector spaces $V$ and $W$ is a vector space $V \otimes W$ together with a bilinear map $\otimes : V \times W \to V \otimes W$, $(v, w) \mapsto v \otimes w$,⁶ such that for any bilinear map $b : V \times W \to U$ there is a unique linear map $\bar b : V \otimes W \to U$ with $\bar b(v \otimes w) = b(v, w)$. That is, the diagram

\[
\begin{array}{ccc}
V \times W & \xrightarrow{\;b\;} & U \\
{\scriptstyle \otimes}\big\downarrow & \nearrow{\scriptstyle \bar b} & \\
V \otimes W & &
\end{array}
\]

commutes. The existence of the map $\bar b$ satisfying the above conditions is called the universal property of the tensor product.

⁵ But the image of $\otimes$ is not all of $V \otimes W$. The elements in the image are called decomposable tensors.
⁶ The symbol $v \otimes w$ stands for the value of the map $\otimes$ on the pair $(v, w)$.


This definition is quite abstract. It is not clear that such objects exist and, if they exist, that they areunique. Setting the question of existence and uniqueness of tensor products aside, let’s us sort out therelationship between V ⊗W and bilinear maps Mult(V ×W,U). Recall that Hom(X,Y ) is the space of alllinear maps from a vector space X to a vector space Y and is itself a vector space (see p. 13).

Lemma 6.4. Assume that V ⊗W exists. Then

\[ \operatorname{Hom}(V \otimes W, U) \xrightarrow{\ \simeq\ } \operatorname{Mult}(V \times W, U). \]

Proof. The isomorphism in question is built into the definition of the tensor product. Given a linear map A : V ⊗ W → U the composition A ∘ ⊗ : V × W → U is bilinear. And conversely, given a bilinear map b ∈ Mult(V × W, U) there is a unique linear map b̄ : V ⊗ W → U so that (b̄ ∘ ⊗)(v, w) = b(v, w) for all (v, w) ∈ V × W.

In other words the maps Hom(V ⊗ W, U) ∋ A ↦ A ∘ ⊗ ∈ Mult(V × W, U) and Mult(V × W, U) ∋ b ↦ b̄ ∈ Hom(V ⊗ W, U) are inverses of each other.

Next we observe that the uniqueness of the tensor product is also built into the definition of the tensor product.

Proposition 6.5. If tensor products exist, they are unique up to isomorphism.

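Proof. The proof is quite formal and uses nothing but the universal property. Suppose there are two vector spaces V ⊗_1 W and V ⊗_2 W with corresponding bilinear maps ⊗_1 : V × W → V ⊗_1 W and ⊗_2 : V × W → V ⊗_2 W which satisfy the conditions of Definition 6.3. We will argue that these vector spaces are isomorphic. By the universal property of V ⊗_2 W, applied to the bilinear map ⊗_1, there exists a unique linear map $\overline{\otimes}_1 : V \otimes_2 W \to V \otimes_1 W$ so that the diagram

\[
\begin{array}{ccc}
V \times W & \xrightarrow{\ \otimes_1\ } & V \otimes_1 W \\
{\scriptstyle \otimes_2}\big\downarrow & \nearrow{\scriptstyle\,\overline{\otimes}_1} & \\
V \otimes_2 W & &
\end{array}
\]

commutes. By the same argument, switching the roles of ⊗_1 and ⊗_2, there is a unique linear map $\overline{\otimes}_2 : V \otimes_1 W \to V \otimes_2 W$ making the diagram

\[
\begin{array}{ccc}
V \times W & \xrightarrow{\ \otimes_2\ } & V \otimes_2 W \\
{\scriptstyle \otimes_1}\big\downarrow & \nearrow{\scriptstyle\,\overline{\otimes}_2} & \\
V \otimes_1 W & &
\end{array}
\]

commute. Define

\[
\begin{aligned}
T_1 &= \overline{\otimes}_1 \circ \overline{\otimes}_2 : V \otimes_1 W \to V \otimes_1 W,\\
T_2 &= \overline{\otimes}_2 \circ \overline{\otimes}_1 : V \otimes_2 W \to V \otimes_2 W.
\end{aligned}
\]

These are linear maps making the diagrams

\[
\begin{array}{ccc}
V \times W & \xrightarrow{\ \otimes_1\ } & V \otimes_1 W \\
{\scriptstyle \otimes_1}\big\downarrow & \nearrow{\scriptstyle\,T_1} & \\
V \otimes_1 W & &
\end{array}
\qquad\text{and}\qquad
\begin{array}{ccc}
V \times W & \xrightarrow{\ \otimes_2\ } & V \otimes_2 W \\
{\scriptstyle \otimes_2}\big\downarrow & \nearrow{\scriptstyle\,T_2} & \\
V \otimes_2 W & &
\end{array}
\]

commute, that is, T_1 ∘ ⊗_1 = ⊗_1 and T_2 ∘ ⊗_2 = ⊗_2. But the identity maps id_i : V ⊗_i W → V ⊗_i W, i = 1, 2, are linear and also make the respective diagrams commute. By uniqueness T_i = id_i. Hence $\overline{\otimes}_1$ and $\overline{\otimes}_2$ are inverses of each other and provide the desired isomorphisms.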

Now we construct the tensor product as a quotient of an infinite dimensional vector space by an infinite dimensional subspace, thereby proving its existence.

Proposition 6.6. Tensor products exist.

Proof. Let V and W be two finite dimensional vector spaces. We want to construct a new vector space V ⊗ W and a bilinear map ⊗ : V × W → V ⊗ W satisfying the conditions of Definition 6.3. We start with a vector space F(V × W) made of formal finite linear combinations of ordered pairs (v, w), v ∈ V, w ∈ W. Its basis is the set {(v, w) | v ∈ V, w ∈ W} = V × W.

If you prefer you can think of F(V × W) as the set of functions

{f : V × W → R | f(v, w) ≠ 0 for only finitely many pairs (v, w)}.

This set of functions is an infinite dimensional vector space. Its basis consists of functions that take value 1 on a given pair (v_0, w_0) and 0 on all other pairs. It’s tempting to call this function (v_0, w_0). The vector space F(V × W) is called the free vector space generated by the set V × W.

Note that we have an inclusion map ι : V × W → F(V × W), ι(v, w) = (v, w). It is not bilinear since (v_1 + v_2, w) ≠ (v_1, w) + (v_2, w) in F(V × W).

Consider the smallest subspace K of F(V × W) containing the following collection of vectors:

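\[
S = \left\{
\begin{array}{l}
(v_1 + v_2, w) - (v_1, w) - (v_2, w),\\
(v, w_1 + w_2) - (v, w_1) - (v, w_2),\\
c\,(v, w) - (cv, w),\\
c\,(v, w) - (v, cw)
\end{array}
\;\middle|\;
v, v_1, v_2 \in V,\ w, w_1, w_2 \in W \text{ and } c \in \mathbb{R}
\right\}.
\]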

In other words, consider the subspace K of F(V × W) spanned by the set S. Define V ⊗ W to be the quotient of F(V × W) by K:

\[ V \otimes W := F(V \times W)/K. \]

Define the map ⊗ : V × W → V ⊗ W to be the composite of the inclusion ι : V × W → F(V × W) and the quotient map F(V × W) → F(V × W)/K. The definition of K is rigged precisely so that this composite is bilinear. We write v ⊗ w for the value of ⊗ on the pair (v, w). By construction the set {v ⊗ w | (v, w) ∈ V × W} spans V ⊗ W [but it’s much too big to be a basis].

We check that the map ⊗ : V × W → V ⊗ W has the required universal property. Suppose b : V × W → U is bilinear. Since V × W is a basis for F(V × W), b defines a unique linear map b̃ : F(V × W) → U given on the basis by b̃((v, w)) = b(v, w). As b is bilinear, b̃ is 0 on every element of S and hence on K. Thus we obtain a linear map b̄ : F(V × W)/K = V ⊗ W → U with b̄(v ⊗ w) = b̃((v, w)) = b(v, w). Since the vectors of the form v ⊗ w span V ⊗ W, b̄ is unique. This verifies the universal property and thereby proves the existence of the tensor product.

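Lemma 6.7. For any finite dimensional vector spaces V and W

\[ \dim(V \otimes W) = \dim V \cdot \dim W. \]

Proof.

\[
\begin{aligned}
\dim (V \otimes W) &= \dim (V \otimes W)^* = \dim \operatorname{Hom}(V \otimes W, \mathbb{R})\\
&= \dim \operatorname{Mult}(V \times W, \mathbb{R}) \quad \text{by Lemma 6.4}\\
&= \dim V \cdot \dim W \cdot \dim \mathbb{R} = \dim V \cdot \dim W.
\end{aligned}
\]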

We are now in position to quickly prove a number of results about tensor products.

Corollary 6.7.1. If {v_i} and {w_j} are bases of V and W respectively, then {v_i ⊗ w_j} is a basis of V ⊗ W.

Proof. Since the vectors of the form v ⊗ w, v ∈ V, w ∈ W, span V ⊗ W, the much smaller set {v_i ⊗ w_j} also spans V ⊗ W.⁷ But dim(V ⊗ W) = dim V · dim W is precisely the number of elements in the set {v_i ⊗ w_j}. Hence the set {v_i ⊗ w_j} is a basis.
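The corollary can be made concrete in coordinates. The following is a minimal sketch, assuming V = R^2 and W = R^3 with their standard bases and listing the coefficients of v ⊗ w on v_i ⊗ w_j row by row; the helper names `tensor`, `add` and `is_decomposable_2x2` are chosen for this illustration only.

```python
# Sketch: coordinates of v (x) w in the basis {v_i (x) w_j}, listed i-major.
# All function names here are illustrative, not notation from the text.

def tensor(v, w):
    """Coefficients of v (x) w on the basis vectors v_i (x) w_j."""
    return [vi * wj for vi in v for wj in w]

def add(x, y):
    """Componentwise sum of two coefficient lists."""
    return [a + b for a, b in zip(x, y)]

def is_decomposable_2x2(t):
    """For dim V = dim W = 2: t = [t11, t12, t21, t22] is of the form
    v (x) w exactly when the 2x2 coefficient matrix has determinant 0."""
    return t[0] * t[3] - t[1] * t[2] == 0

v1, v2 = [1, 2], [0, -1]   # vectors in V = R^2
w = [3, 0, 5]              # a vector in W = R^3

# dim(V (x) W) = dim V * dim W
assert len(tensor(v1, w)) == 2 * 3

# Bilinearity in the first slot: (v1 + v2) (x) w = v1 (x) w + v2 (x) w
assert tensor(add(v1, v2), w) == add(tensor(v1, w), tensor(v2, w))

# e1 (x) e1 + e2 (x) e2 lies in the span of the decomposable tensors
# but is not itself decomposable.
e1, e2 = [1, 0], [0, 1]
assert is_decomposable_2x2(tensor(e1, e2))
assert not is_decomposable_2x2(add(tensor(e1, e1), tensor(e2, e2)))
```

The last two assertions illustrate the remark that the image of ⊗ spans V ⊗ W without being all of it.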

Lemma 6.8. V ⊗W is isomorphic to W ⊗ V .

⁷ We are using here the fact that for any (v, w) ∈ V × W, the tensor v ⊗ w is a linear combination of the v_i ⊗ w_j’s.

Proof. Consider the map b : W × V → V ⊗ W defined by

b(w, v) = v ⊗ w.

Since b is bilinear, there is a unique linear map b̄ : W ⊗ V → V ⊗ W with b̄(w ⊗ v) = v ⊗ w. Since the set {v ⊗ w | v ∈ V, w ∈ W} generates V ⊗ W, the map b̄ is surjective. It is an isomorphism by dimension count.

Lemma 6.9. V ∗ ⊗W is isomorphic to Hom(V,W ).

Proof. Consider b : V^* × W → Hom(V, W) defined by

(b(v^*, w))(v) = v^*(v) w for all v^* ∈ V^*, v ∈ V, w ∈ W.

Since b is bilinear, it induces a linear map b̄ : V^* ⊗ W → Hom(V, W) with

(b̄(v^* ⊗ w))(v) = v^*(v) w for all v^* ∈ V^*, v ∈ V, w ∈ W.

Observe that linear maps of the form v ↦ v^*(v) w span Hom(V, W). (The proof of this fact is very similar to the proof of Lemma 6.2 and is left as an exercise.) Hence b̄ is surjective, and it is an isomorphism by dimension count.

Exercise 6.1. Show that if {v_i} is a basis of a vector space V, {v_i^*} the dual basis and {w_j} a basis of a vector space W, then {v_i^*(·) w_j} is a basis of Hom(V, W).

Lemma 6.10. If A : V → W and B : V′ → W′ are two linear maps, then there is a unique linear map A ⊗ B : V ⊗ V′ → W ⊗ W′ such that (A ⊗ B)(v ⊗ v′) = A(v) ⊗ B(v′) for all (v, v′) ∈ V × V′.

Proof. Consider b : V × V′ → W ⊗ W′ given by

b(v, v′) = Av ⊗ Bv′.

The map b is bilinear, whence the universal property gives us a unique linear map b̄ : V ⊗ V′ → W ⊗ W′ with

b̄(v ⊗ v′) = Av ⊗ Bv′

for all (v, v′) ∈ V × V′. Set A ⊗ B := b̄.

Exercise 6.2. Show that if A : V → W is represented by a matrix (a_{ij}) with respect to some bases of V and W and B : V′ → W′ is represented by a matrix (b_{kl}) with respect to bases of V′ and W′, then A ⊗ B is represented by the matrix (a_{ij} b_{kl}) with respect to the appropriate bases.
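A numerical sanity check of the statement of Exercise 6.2 (not a proof) can be sketched as follows. It assumes the basis of a tensor product is ordered row by row, so that the matrix (a_{ij} b_{kl}) becomes the Kronecker product; the helper names `kron`, `matvec` and `tensor` are illustrative.

```python
# Sketch: the matrix (a_ij * b_kl) of A (x) B, with rows indexed by (i, k)
# and columns by (j, l), both in row-major order. Names are illustrative.

def matvec(M, x):
    """Matrix times column vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def tensor(v, w):
    """Coordinates of v (x) w, listed in row-major order."""
    return [vi * wj for vi in v for wj in w]

def kron(A, B):
    """Kronecker product: entry at row (i, k), column (j, l) is a_ij * b_kl."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

A = [[1, 2], [3, 4]]          # A : R^2 -> R^2
B = [[0, 1, 2], [1, 0, 1]]    # B : R^3 -> R^2
v, vp = [1, -1], [2, 0, 1]    # v in R^2, vp in R^3

# Defining property from Lemma 6.10: (A (x) B)(v (x) v') = Av (x) Bv'
assert matvec(kron(A, B), tensor(v, vp)) == tensor(matvec(A, v), matvec(B, vp))
```

A different ordering of the basis vectors would permute the rows and columns of this matrix but not change the map it represents.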

Exercise 6.3. Show that there is a natural isomorphism φ : V^* ⊗ W^* → Mult(V × W, R) with

φ(v^* ⊗ w^*)(v, w) = v^*(v) w^*(w)

for all v^*, w^*, v, w.

Show that there is a natural isomorphism ψ : V^* ⊗ W^* → (V ⊗ W)^* with

ψ(v^* ⊗ w^*)(v ⊗ w) = v^*(v) w^*(w)

for all v^*, w^*, v, w.

Exercise 6.4. Show that the map R × V → V, (a, v) ↦ av gives rise to an isomorphism R ⊗ V → V which sends a ⊗ v to av for all a ∈ R and v ∈ V.

Exercise 6.5. Show that taking tensor product is associative:

V ⊗ (U ⊗ W) ≃ (V ⊗ U) ⊗ W

for any three vector spaces V, U and W.

From now on we write V ⊗ U ⊗ W for V ⊗ (U ⊗ W) since the order of taking tensor products doesn’t matter. Exercise 6.5 above also allows us to define recursively tensor powers of a vector space V. We define

V^{⊗0} := R, V^{⊗1} := V and

V^{⊗n} := V^{⊗(n−1)} ⊗ V for n > 1.

It is not hard to generalize the relationship between bilinear maps and tensor products to the relationship between n-linear maps and n-fold tensor products. For example:

Exercise 6.6. Prove that given an n-linear map

\[ f : \overbrace{V \times \cdots \times V}^{n} \to U, \]

there exists a unique linear map f̄ : V^{⊗n} → U with

\[ \bar f(v_1 \otimes \cdots \otimes v_n) = f(v_1, \ldots, v_n) \]

for all (v_1, …, v_n) ∈ V × ⋯ × V.

Moreover, given a ∈ V^{⊗n} and b ∈ V^{⊗m}, a ⊗ b is in V^{⊗n} ⊗ V^{⊗m} ≃ V^{⊗(n+m)}. This gives us an R-bilinear map,

\[ V^{\otimes n} \times V^{\otimes m} \to V^{\otimes (n+m)}, \qquad (a, b) \mapsto a \otimes b. \]

Note that if n = 0 the map above is simply

\[ \mathbb{R} \times V^{\otimes m} \to V^{\otimes m}, \qquad (a, t) \mapsto a t \]

(cf. Exercise 6.4).

Definition 6.11. An algebra over R is a vector space A together with a bilinear map A × A → A, (a, a′) ↦ aa′ (“multiplication”). An algebra A is said to be an algebra with unity if there is an element 1 ∈ A such that 1 · a = a for all a ∈ A. An algebra A is associative if the multiplication is associative.

Remark 6.12. Note that in any algebra A, 0a = a0 = 0 for all a ∈ A (this is because multiplication is required to be bilinear).

Remark 6.13. If A is an algebra with 1 then there is an injection R → A, x ↦ x1. We will always identify R with its image in A.

Example 6.14. A Lie algebra is an algebra. It is not associative and does not have 1 (why not?).

Example 6.15. The space M_n(R) of n × n matrices forms an algebra under matrix multiplication. It is an algebra with unity: the identity matrix I is the unity.

Definition 6.16. An algebra A is graded if

\[ A = \bigoplus_{i=0}^{\infty} A_i \qquad \text{(direct sum)} \]

and if for any a ∈ A_i and b ∈ A_j we have a · b ∈ A_{i+j}. We will refer to the elements of A_k as elements of degree k.

Given a vector space V we construct the corresponding tensor algebra T(V) as follows. As a vector space T(V) is the direct sum:

\[ T(V) = \mathbb{R} \oplus V \oplus V^{\otimes 2} \oplus \cdots \oplus V^{\otimes n} \oplus \cdots = \bigoplus_{i=0}^{\infty} V^{\otimes i}. \]

Thus the elements of T(V) are finite sums a_{i_1} + a_{i_2} + ⋯ + a_{i_k}, with a_{i_j} ∈ V^{⊗ i_j}. We define the multiplication on T(V) by extending the multiplication

\[ V^{\otimes n} \times V^{\otimes m} \to V^{\otimes (n+m)}, \qquad (a, b) \mapsto a \otimes b \]

bilinearly to all of T(V). The tensor algebra T(V) of a vector space V is a graded associative algebra with 1. Note that by construction the elements of T(V) are sums of products of elements of V, that is, T(V) is generated by V.
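One way to get a feel for T(V) is to model it on a computer. The following is a minimal sketch, assuming a chosen basis e_1, e_2, … of V: a basis tensor e_{i_1} ⊗ ⋯ ⊗ e_{i_k} is stored as the tuple (i_1, …, i_k), an element of T(V) as a dictionary from such tuples to coefficients, and multiplication is concatenation of tuples extended bilinearly. The names `t_mul` and `degrees` are illustrative.

```python
from collections import defaultdict

# Sketch: elements of T(V) as {word: coefficient}, where the word
# (i1, ..., ik) stands for e_i1 (x) ... (x) e_ik. Names are illustrative.

def t_mul(a, b):
    """Product in T(V): concatenate words, extend bilinearly."""
    out = defaultdict(int)
    for wa, ca in a.items():
        for wb, cb in b.items():
            out[wa + wb] += ca * cb
    return {w: c for w, c in out.items() if c != 0}

def degrees(a):
    """Set of degrees in which the element a has a nonzero component."""
    return {len(w) for w in a}

one = {(): 1}                    # the unit 1, living in degree 0
e1, e2 = {(1,): 1}, {(2,): 1}    # basis vectors of V, degree 1
x = {(1,): 2, (2,): -1}          # 2 e1 - e2, degree 1
y = {(1, 2): 1, (2, 2): 3}       # e1 (x) e2 + 3 e2 (x) e2, degree 2

assert t_mul(one, y) == y                                  # unity
assert degrees(t_mul(x, y)) == {3}                         # grading: 1 + 2 = 3
assert t_mul(t_mul(x, y), e1) == t_mul(x, t_mul(y, e1))    # associativity
assert t_mul(e1, e2) != t_mul(e2, e1)                      # not commutative
```

Every word is a product of length-one words, which is the computational shadow of the statement that T(V) is generated by V.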


6.2. The Grassmann (exterior) algebra and alternating maps. We have seen that tensor products are intimately related to multilinear maps. Exterior (Grassmann) algebras are just as intimately related to alternating multilinear maps. Recall that an n-linear map f : V × ⋯ × V → U is alternating if it changes sign whenever we switch two adjacent entries:

\[ f(v_1, \ldots, v_i, v_{i+1}, \ldots, v_n) = -f(v_1, \ldots, v_{i+1}, v_i, \ldots, v_n) \]

for all (v_1, …, v_n) ∈ V × ⋯ × V and any index i.

Example 6.17. The determinant

\[ \det : \overbrace{\mathbb{R}^n \times \cdots \times \mathbb{R}^n}^{n \text{ factors}} \to \mathbb{R}, \qquad (v_1, \ldots, v_n) \mapsto \det(v_1 | \ldots | v_n) \]

is an alternating map.

Example 6.18. Consider a vector space V and a, b ∈ V^*. Define the bilinear map a ∧ b by

\[ (a \wedge b)(v_1, v_2) := a(v_1) b(v_2) - a(v_2) b(v_1), \qquad v_1, v_2 \in V. \]

The map a ∧ b (“a wedge b”) is alternating.

Definition 6.19 (Grassmann (exterior) algebra). Let V be a finite dimensional vector space over R. The Grassmann (exterior) algebra Λ^*(V) is an algebra over R with unity together with an injective linear map i : V → Λ^*(V) called the structure map which has the following universal property: If A is an algebra over R with unity and j : V → A is a linear map such that j(v) · j(v) = 0 for all v ∈ V, then there is a unique algebra map⁸ j̄ : Λ^*(V) → A such that the following diagram commutes:

\[
\begin{array}{ccc}
V & & \\
{\scriptstyle i}\big\downarrow & \searrow{\scriptstyle\, j} & \\
\Lambda^*(V) & \xrightarrow{\ \bar{j}\ } & A.
\end{array}
\]

Proposition 6.20. If the exterior algebra Λ∗(V ) exists, it is unique (up to isomorphism).

Proof. This is a formal exercise and is left to the reader.

Proposition 6.21. For every vector space V the exterior algebra Λ∗(V ) exists.

Proof. Let I be the two-sided ideal in the tensor algebra T(V) generated by the set {v ⊗ v : v ∈ V}. Note that R ∩ I = 0 and V ∩ I = 0 for degree reasons. Define

\[ \Lambda^*(V) := T(V)/I, \]

the quotient of the tensor algebra by the ideal I. Then Λ^*(V) is an algebra: it inherits the multiplication from T(V). The induced multiplication in Λ^*(V) is denoted by ∧ (“wedge”). Since the tensor algebra is graded and I is generated by elements of degree 2, the ideal I is graded as well, and

\[ I = (I \cap V^{\otimes 2}) \oplus (I \cap V^{\otimes 3}) \oplus \cdots \]

Since V ∩ I = 0, the composite i : V → T(V) → T(V)/I = Λ^*(V) is an injection. Note that any element of Λ^*(V) is a finite linear combination of products of elements of V.

Now that we have constructed the exterior algebra Λ^*(V), let us prove the universal property. Suppose that A is an algebra and that we are given a linear map j : V → A with j(v) · j(v) = 0 for all v ∈ V. Consider the map b : V × V → A given by b(v, w) = j(v) · j(w). Since the map b is bilinear, there is a unique linear map j^{(2)} : V ⊗ V → A with j^{(2)}(v ⊗ w) = j(v) · j(w). Similarly, for all positive integers k, we have linear maps j^{(k)} : V^{⊗k} → A with

\[ j^{(k)}(v_1 \otimes \cdots \otimes v_k) = j(v_1) \cdots j(v_k). \]

In addition, we define j^{(0)}(a) = a · 1_A, for all a ∈ R. In this way, we obtain an algebra map j̃ : T(V) → A. By assumption, j̃(v ⊗ v) = 0 for all v ∈ V. Therefore j̃ vanishes on the ideal I. This implies that j̃ descends to an algebra map j̄ : Λ^*(V) = T(V)/I → A with j̄(v) = j(v) for all v ∈ V. Since an algebra map is uniquely determined by its values on generators, and since V generates Λ^*(V), the map j̄ is unique.

⁸ A map f : A → B between two algebras is an algebra map if f is linear and preserves multiplication: f(a_1 a_2) = f(a_1) f(a_2).


Remark 6.22. For any v ∈ V, we have v ∧ v = 0 in the exterior algebra Λ^*(V). Also,

\[ 0 = (v_1 + v_2) \wedge (v_1 + v_2) = v_1 \wedge v_1 + v_1 \wedge v_2 + v_2 \wedge v_1 + v_2 \wedge v_2 \]

gives that

\[ v_1 \wedge v_2 = -v_2 \wedge v_1. \]

That is, the wedge product is skew-commutative on elements of V.

Remark 6.23. Let Λ^k(V) = T^k(V)/(T^k(V) ∩ I), where T^k(V) = V^{⊗k}. The vector space Λ^k(V) is called the kth exterior power of V. Then

\[ \Lambda^*(V) = \bigoplus_{k=0}^{\infty} \Lambda^k(V), \]

where

\[ \Lambda^0(V) = \mathbb{R} \quad\text{and}\quad \Lambda^1(V) = V. \]

Also, if α ∈ Λ^k(V) and β ∈ Λ^l(V), then α ∧ β ∈ Λ^{k+l}(V). Thus, Λ^*(V) is a graded algebra with 1.

Remark 6.24. We know that if {v_1, …, v_n} is a basis for V, then {v_i ⊗ v_j} is a basis for V ⊗ V. By induction, {v_{i_1} ⊗ ⋯ ⊗ v_{i_k}} is a basis for V^{⊗k}. Thus, {v_{i_1} ∧ ⋯ ∧ v_{i_k}} generates Λ^k(V) = V^{⊗k}/(I ∩ V^{⊗k}). Since ∧ is skew-commutative, however, we can reduce this generating set to a smaller one:

\[ \{ v_{i_1} \wedge \cdots \wedge v_{i_k} \mid i_1 < \cdots < i_k \}. \tag{6.1} \]

This implies that

\[ \Lambda^l(V) = 0 \quad\text{whenever } l > \dim V. \]

We will see below that the set (6.1) is a basis of Λ^k(V).
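The reduction to increasing indices is purely mechanical, and can be sketched in code. The following assumes a chosen basis v_1, …, v_n and stores the monomial v_{i_1} ∧ ⋯ ∧ v_{i_k} as the tuple (i_1, …, i_k); sorting by adjacent swaps picks up one factor of −1 per swap, and a repeated index gives 0. The names `normal_form` and `wedge` are illustrative.

```python
# Sketch: wedge products of basis monomials, reduced to the generating
# set (6.1) of increasing index tuples. Names are illustrative.

def normal_form(word):
    """Return (sign, sorted_word) with v_word = sign * v_sorted_word,
    or (0, ()) if an index repeats (then the monomial is zero)."""
    if len(set(word)) < len(word):
        return 0, ()
    word = list(word)
    sign = 1
    # Bubble sort: every adjacent swap contributes a factor of -1.
    for i in range(len(word)):
        for j in range(len(word) - 1 - i):
            if word[j] > word[j + 1]:
                word[j], word[j + 1] = word[j + 1], word[j]
                sign = -sign
    return sign, tuple(word)

def wedge(a, b):
    """Wedge product of two elements given as {increasing tuple: coefficient}."""
    out = {}
    for wa, ca in a.items():
        for wb, cb in b.items():
            sign, w = normal_form(wa + wb)
            if sign:
                out[w] = out.get(w, 0) + sign * ca * cb
    return {w: c for w, c in out.items() if c != 0}

e1, e2, e3 = {(1,): 1}, {(2,): 1}, {(3,): 1}
v = {(1,): 2, (2,): -1, (3,): 5}

assert wedge(e2, e1) == {(1, 2): -1}     # skew-commutativity
assert wedge(v, v) == {}                 # v ^ v = 0
assert wedge(wedge(e3, e1), e2) == {(1, 2, 3): 1}
```

With n basis vectors, any tuple of length greater than n repeats an index, which is the statement Λ^l(V) = 0 for l > dim V.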

We now investigate the connection between the k-th exterior power Λ^k(V) of a vector space V and alternating maps.

Proposition 6.25 (Universal property of k-th exterior power of a vector space). Let U and V be vector spaces. If

\[ f : \overbrace{V \times \cdots \times V}^{k} \to U \]

is alternating then there is a unique linear map f̄ : Λ^k(V) → U with

\[ \bar f(v_1 \wedge \cdots \wedge v_k) = f(v_1, \ldots, v_k). \]

Proof. By the universal property of V^{⊗k}, there is a unique linear map f̃ : V^{⊗k} → U such that f̃(v_1 ⊗ ⋯ ⊗ v_k) = f(v_1, …, v_k). Since f is alternating, f̃ restricted to I ∩ V^{⊗k} is 0, where I is the ideal defined in the construction of Λ^*(V). This gives us the linear map f̄ : Λ^k(V) = V^{⊗k}/(I ∩ V^{⊗k}) → U with the desired property. It is unique since the elements v_1 ∧ ⋯ ∧ v_k span Λ^k(V).

Corollary 6.25.1. The space of k-linear alternating maps {f : V × ⋯ × V → U | f is alternating} is isomorphic to the space Hom(Λ^k(V), U).

Lemma 6.26. Let V be an n-dimensional vector space. Then Λn(V ) is 1-dimensional.

Proof. We may assume that V = R^n. Let e_1, …, e_n be the standard basis. Then e_1 ∧ ⋯ ∧ e_n spans Λ^n(V). We need to show that e_1 ∧ ⋯ ∧ e_n ≠ 0. The determinant det : R^n × ⋯ × R^n → R is 1 on the identity matrix I = (e_1 | … | e_n): det(e_1 | … | e_n) = 1. Hence the induced linear map $\overline{\det}$ : Λ^n(R^n) → R is 1 on e_1 ∧ ⋯ ∧ e_n. Therefore e_1 ∧ ⋯ ∧ e_n ≠ 0.

Corollary 6.26.1. If {f_1, …, f_n} is a basis for a vector space V, then {f_{i_1} ∧ ⋯ ∧ f_{i_k} : 1 ≤ i_1 < ⋯ < i_k ≤ n} is a basis for its k-th exterior power Λ^k(V).

Proof. By Remark 6.24 the above set generates Λ^k(V). So we only need to check independence. Suppose

\[ 0 = \sum_{i_1 < \cdots < i_k} a_{i_1, \ldots, i_k}\, f_{i_1} \wedge \cdots \wedge f_{i_k} \qquad\text{for some } a_{i_1, \ldots, i_k} \in \mathbb{R}. \]

Pick a sequence j_1 < j_2 < ⋯ < j_k. Let j_{k+1} < ⋯ < j_n be the remaining indices. Then

\[ 0 = \Bigl( \sum a_{i_1, \ldots, i_k}\, f_{i_1} \wedge \cdots \wedge f_{i_k} \Bigr) \wedge f_{j_{k+1}} \wedge \cdots \wedge f_{j_n} = a_{j_1, \ldots, j_k}\, f_{j_1} \wedge \cdots \wedge f_{j_k} \wedge f_{j_{k+1}} \wedge \cdots \wedge f_{j_n}, \]

since a_{i_1,…,i_k} f_{i_1} ∧ ⋯ ∧ f_{i_k} ∧ f_{j_{k+1}} ∧ ⋯ ∧ f_{j_n} = 0 whenever i_s = j_r for some s and some r > k. Now f_{j_1} ∧ ⋯ ∧ f_{j_k} ∧ f_{j_{k+1}} ∧ ⋯ ∧ f_{j_n} = ± f_1 ∧ ⋯ ∧ f_n ≠ 0 by Lemma 6.26. This gives a_{j_1,…,j_k} = 0. Since the sequence j_1 < ⋯ < j_k was arbitrary, all the coefficients vanish.

Corollary 6.26.2. For any finite dimensional vector space V

\[ \dim \Lambda^k(V) = \binom{\dim V}{k} = \frac{(\dim V)!}{k!\,(\dim V - k)!}. \]
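The dimension count amounts to counting increasing index tuples, which is easy to check by enumeration. A minimal sketch, using only the Python standard library; the function name `exterior_basis` is illustrative.

```python
from itertools import combinations
from math import comb

# Sketch: index tuples i1 < ... < ik labelling the basis of Lambda^k(V)
# for an n-dimensional V. The function name is illustrative.

def exterior_basis(n, k):
    """All increasing k-tuples of indices from 1..n."""
    return list(combinations(range(1, n + 1), k))

n = 4
for k in range(n + 2):
    # dim Lambda^k(V) = C(n, k); this is 0 for k > n and 1 for k = 0 and k = n
    assert len(exterior_basis(n, k)) == comb(n, k)

# Summing over k gives dim Lambda^*(V) = 2^n.
assert sum(len(exterior_basis(n, k)) for k in range(n + 1)) == 2 ** n
```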

Lemma 6.27. Let A : V → W be a linear map. Then there is a unique linear map Λ^k(A) : Λ^k(V) → Λ^k(W) such that

\[ (\Lambda^k(A))(v_1 \wedge \cdots \wedge v_k) = Av_1 \wedge \cdots \wedge Av_k \]

for all v_1, …, v_k ∈ V.

Proof. Consider the map b : V × ⋯ × V → Λ^k(W) given by

\[ b(v_1, \ldots, v_k) = Av_1 \wedge \cdots \wedge Av_k. \]

Since ∧ is skew-commutative, b is an alternating map. By Proposition 6.25 there exists a unique linear map Λ^k(A) : Λ^k(V) → Λ^k(W) with the required properties.

Exercise 6.7. Let A : V → W be a linear map as above. Choose bases of V and W and the corresponding bases of Λ^k(V) and of Λ^k(W). Show that the entries of the matrix representing Λ^k(A) are polynomial in the entries of the matrix representing A.
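For experimenting with Exercise 6.7, the matrix of Λ^k(A) can be computed directly. The sketch below assumes the bases {e_I} of increasing index tuples in lexicographic order; expanding Ae_{j_1} ∧ ⋯ ∧ Ae_{j_k} shows that the (I, J) entry is the k × k minor of A with rows I and columns J, which is visibly a polynomial in the entries of A. The names `det`, `compound` and `matmul` are illustrative.

```python
from itertools import combinations

# Sketch: the matrix of Lambda^k(A) with respect to the bases {e_I},
# I an increasing k-tuple, in lexicographic order. Names are illustrative.

def det(M):
    """Determinant by expansion along the first row (fine for small M)."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def compound(A, k):
    """Entry (I, J) is the minor of A with rows I and columns J."""
    rows = list(combinations(range(len(A)), k))
    cols = list(combinations(range(len(A[0])), k))
    return [[det([[A[i][j] for j in J] for i in I]) for J in cols] for I in rows]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
B = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]

assert compound(A, 1) == A                      # Lambda^1(A) = A
assert compound(A, 3) == [[det(A)]]             # top power: multiplication by det A
# Functoriality: Lambda^2(AB) = Lambda^2(A) Lambda^2(B)
assert compound(matmul(A, B), 2) == matmul(compound(A, 2), compound(B, 2))
```

The last assertion is the uniqueness statement of Lemma 6.27 applied to a composite, written out in matrices.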

6.3. Pairings.

Definition 6.28. Let V and W be two vector spaces. A pairing is a bilinear map 〈·, ·〉 : V ×W → R.

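Example 6.29. Let V be a vector space and V^* be its dual. The evaluation map

\[ V^* \times V \to \mathbb{R}, \qquad \langle \ell, v \rangle = \ell(v) \]

is a pairing.

Definition 6.30. A pairing ⟨·, ·⟩ : V × W → R is non-degenerate if

\[
\begin{aligned}
\langle v_0, w \rangle = 0 \ \ \forall w \in W \ &\Rightarrow\ v_0 = 0,\\
\langle v, w_0 \rangle = 0 \ \ \forall v \in V \ &\Rightarrow\ w_0 = 0.
\end{aligned}
\]

Example 6.31. The evaluation map

\[ V^* \times V \to \mathbb{R}, \qquad \langle \ell, v \rangle = \ell(v) \]

is a non-degenerate pairing. In a sense it is the only non-degenerate pairing: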

Proposition 6.32. If b : V × W → R is a non-degenerate pairing, then V ≃ W^* and W ≃ V^*.

Proof. Consider b^#_1 : V → W^* given by

\[ (b^\#_1(v))(w) = b(v, w). \]

The map b^#_1 is linear, and

\[ \ker b^\#_1 = \{ v_0 \in V : b^\#_1(v_0) = 0 \} = \{ v_0 \in V : b(v_0, w) = 0 \ \forall w \} = 0. \]

Thus dim V ≤ dim W^* = dim W. By the same argument, we have dim W ≤ dim V^* = dim V. Therefore dim V = dim W. Hence b^#_1 is an isomorphism.

By the same argument, b^#_2 : W → V^* given by w ↦ b(·, w) is an isomorphism as well.

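Proposition 6.33. There is a non-degenerate pairing

\[ \langle \cdot, \cdot \rangle : \Lambda^k(V^*) \times \Lambda^k(V) \to \mathbb{R} \]

with

\[ \langle v_1^* \wedge \cdots \wedge v_k^*,\ v_1 \wedge \cdots \wedge v_k \rangle = \det\bigl( v_i^*(v_j) \bigr). \]

Hence

\[ \Lambda^k(V^*) \simeq (\Lambda^k(V))^*. \]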


Proof. Consider

\[ b : \overbrace{V^* \times \cdots \times V^*}^{k} \times \overbrace{V \times \cdots \times V}^{k} \to \mathbb{R} \]

given by

\[ b(l_1, \ldots, l_k, v_1, \ldots, v_k) = \det\bigl( l_i(v_j) \bigr). \]

For a fixed (l_1, …, l_k) ∈ V^* × ⋯ × V^*, b is alternating in the v’s. So there is a map b̃ : (V^* × ⋯ × V^*) × Λ^k(V) → R with

\[ (l_1, \ldots, l_k, v_1 \wedge \cdots \wedge v_k) \mapsto \det\bigl( l_i(v_j) \bigr). \]

Similarly, for a fixed v_1 ∧ ⋯ ∧ v_k ∈ Λ^k(V), b̃ is alternating in the l’s, which means that there is a map b̄ : Λ^k(V^*) × Λ^k(V) → R with the desired property.

To check non-degeneracy evaluate the pairing on the respective bases.

Combining the proposition above with Corollary 6.25.1 we get:

Corollary 6.33.1. The space of k-linear alternating maps {f : V × ⋯ × V → R | f is alternating} is isomorphic to the k-th exterior power Λ^k(V^*).

Remark 6.34. Explicitly ℓ_1 ∧ ⋯ ∧ ℓ_k ∈ Λ^k(V^*) defines a k-linear alternating map by

\[ (\ell_1 \wedge \cdots \wedge \ell_k)(v_1, \ldots, v_k) = \det(\ell_i(v_j)) \]

for all v_1, …, v_k ∈ V. In particular

\[ (\ell_1 \wedge \ell_2)(v_1, v_2) = \ell_1(v_1)\ell_2(v_2) - \ell_1(v_2)\ell_2(v_1). \]
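The determinant formula is easy to evaluate by hand or by machine. The following sketch assumes V = R^3, with covectors and vectors both stored as coordinate lists so that ℓ(v) is a dot product; the names `pair`, `det` and `wedge_eval` are illustrative.

```python
# Sketch: evaluating l_1 ^ ... ^ l_k on (v_1, ..., v_k) as det(l_i(v_j)).
# Covectors and vectors are coordinate lists; names are illustrative.

def pair(l, v):
    """The evaluation pairing <l, v> = l(v) in coordinates."""
    return sum(a * b for a, b in zip(l, v))

def det(M):
    """Determinant by expansion along the first row (fine for small M)."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def wedge_eval(covectors, vectors):
    """(l_1 ^ ... ^ l_k)(v_1, ..., v_k) = det of the matrix (l_i(v_j))."""
    return det([[pair(l, v) for v in vectors] for l in covectors])

l1, l2 = [1, 0, 2], [0, 1, -1]
v1, v2 = [1, 1, 0], [2, 0, 1]

# The k = 2 case of the formula
assert wedge_eval([l1, l2], [v1, v2]) == pair(l1, v1) * pair(l2, v2) - pair(l1, v2) * pair(l2, v1)

# Alternating in the vectors and in the covectors
assert wedge_eval([l1, l2], [v2, v1]) == -wedge_eval([l1, l2], [v1, v2])
assert wedge_eval([l2, l1], [v1, v2]) == -wedge_eval([l1, l2], [v1, v2])
assert wedge_eval([l1, l2], [v1, v1]) == 0
```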

Exercise 6.8. Suppose that V is an n-dimensional vector space. Given a linear map A : V → V, we get a map Λ^n(A) : Λ^n(V) → Λ^n(V), and since dim Λ^n(V) = 1, the map Λ^n(A) is multiplication by a scalar. Show that this scalar is det A.

7. Differential forms and integration

7.1. Motivation. Suppose we want to integrate a function f over a manifold M. We start with the easiest case: the support of f is contained inside a coordinate chart φ : U → R^m. We could then try to define

\[ \int_M f = \int_U f := \int_{\varphi(U) \subset \mathbb{R}^m} (f \circ \varphi^{-1})(x)\, dx. \]

Right away we would then run into a problem when we try to compute this integral with respect to a different coordinate chart. Recall the change of variables formula for integrals:

Lemma 7.1. Let F : U → V be a diffeomorphism between two open subsets of R^m and f ∈ C^∞(V) an integrable function. Then x ↦ f(F(x)) |det dF_x| is an integrable function on U and

\[ \int_{V = F(U)} f(y)\, dy = \int_U f(F(x))\, |\det dF_x|\, dx. \tag{7.1} \]

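Now suppose ψ : U → R^m is another coordinate chart on M with the same domain. By our definition of ∫_M f we would want

\[ \int_M f = \int_{\psi(U)} (f \circ \psi^{-1})(y)\, dy. \]

But the change of variables formula (7.1) gives us

\[
\begin{aligned}
\int_{\psi(U)} (f \circ \psi^{-1})(y)\, dy
&= \int_{\varphi(U)} \bigl( (f \circ \psi^{-1}) \circ (\psi \circ \varphi^{-1}) \bigr)(x)\, |\det(d(\psi \circ \varphi^{-1})_x)|\, dx\\
&= \int_{\varphi(U)} (f \circ \varphi^{-1})(x)\, |\det(d(\psi \circ \varphi^{-1})_x)|\, dx.
\end{aligned}
\]

Since there is no reason for |det(d(ψ ∘ φ^{-1})_x)| to be the constant function 1, the integral of f over M is ill-defined.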
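A one-dimensional toy case makes the problem tangible. The sketch below assumes M = U = (0, 1), the function f(q) = q, and the two charts φ(q) = q and ψ(q) = q^2; these choices, the midpoint-rule helper `midpoint`, and the other names are made up for this illustration.

```python
# Sketch: the naive "integral of a function" depends on the chart.
# M = (0, 1), f(q) = q, charts phi(q) = q and psi(q) = q**2 (assumed example).

def midpoint(g, a, b, n=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

f = lambda q: q
phi_inv = lambda x: x            # inverse of phi(q) = q
psi_inv = lambda y: y ** 0.5     # inverse of psi(q) = q**2
jac = lambda x: 2 * x            # |d(psi o phi^{-1})/dx| at x

via_phi = midpoint(lambda x: f(phi_inv(x)), 0.0, 1.0)            # about 1/2
via_psi = midpoint(lambda y: f(psi_inv(y)), 0.0, 1.0)            # about 2/3
with_jac = midpoint(lambda x: f(phi_inv(x)) * jac(x), 0.0, 1.0)  # about 2/3

assert abs(via_phi - via_psi) > 0.1       # the two charts disagree
assert abs(with_jac - via_psi) < 1e-4     # they agree once the Jacobian is inserted
```

The two charts give different numbers for the same function, and the discrepancy is accounted for exactly by the Jacobian factor in (7.1).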


One solution is to integrate something other than functions. If µ is one of those somethings and F is a diffeomorphism, then µ should transform under F by the rule

\[ \mu \rightsquigarrow (\mu \circ F)\, \det(dF). \]

This will be made more precise shortly. Additionally we will need to confine ourselves to manifolds with atlases {φ_α} with the property that the differentials d(φ_α ∘ φ_β^{-1}) all have positive determinants. Such manifolds are called orientable. It turns out that what one integrates over manifolds are differential forms and we now proceed to define them.

Again, let M be a manifold. Recall that we made the disjoint union of its cotangent spaces ⨆_{q∈M} T^*_q M into a manifold, the cotangent bundle T^*M of M. Moreover, we defined the manifold structure on T^*M in such a way that the natural projection

\[ \pi : T^*M \to M, \qquad T^*_q M \ni \eta \mapsto q \in M \]

is smooth. Similarly one can make the disjoint union of kth exterior powers of the cotangent spaces of M into a manifold Λ^k(T^*M):

\[ \Lambda^k(T^*M) = \bigsqcup_{q \in M} \Lambda^k(T^*_q M). \]

Moreover, the natural projection

\[ \pi : \Lambda^k(T^*M) \to M, \qquad \Lambda^k(T^*_q M) \ni \nu \mapsto q \in M \]

can be arranged to be smooth. We defer the details of this construction to Section 8, where we will carry it out for arbitrary vector bundles and not just for the cotangent bundle.

The preimages of points π^{-1}(q) under π : Λ^k(T^*M) → M are called fibers of π. By design they are vector spaces Λ^k(T^*_q M). Recall that for any vector space V, the 0th exterior power Λ^0(V) is just the real numbers and the 1st exterior power Λ^1(V) is the vector space V itself. It will turn out that Λ^0(T^*M) = M × R and Λ^1(T^*M) = T^*M.

Definition 7.2. A smooth k-form µ (a.k.a. a differential form of degree k) is a smooth map µ : M → Λ^k(T^*M), q ↦ µ_q, so that

µ_q ∈ Λ^k(T^*_q M)

for all q ∈ M.

The last condition can be stated as: π ∘ µ : M → M is the identity map. The smoothness condition will be discussed a few paragraphs down.

Remark 7.3. By definition a 0-form on M is a smooth map µ : M → M × R such that µ_q = (q, f(q)) for all q ∈ M, where f(q) ∈ R depends on q. In other words, 0-forms are nothing but functions. And the smoothness of 0-forms is the smoothness of functions.

Notation. We denote the space of differential k-forms on a manifold M by Ω^k(M). We denote the space of all differential forms by Ω^*(M). Thus

\[ \Omega^*(M) = \Omega^0(M) \oplus \Omega^1(M) \oplus \cdots \oplus \Omega^k(M) \oplus \cdots \]

Let us try to get some feel for differential forms by considering the special case where the manifold M is an open subset of R^m. We denote the standard coordinates on M by x_1, …, x_m. Then for every q ∈ M, the differentials (dx_1)_q, …, (dx_m)_q form a natural basis of the cotangent space T^*_q M ≃ (R^m)^*. Hence the set

\[ \{ (dx_{i_1})_q \wedge \cdots \wedge (dx_{i_k})_q \mid 1 \le i_1 < \cdots < i_k \le m \} \]

is a basis of the kth exterior power Λ^k(T^*_q M). At this point it is convenient to have a bit more notation at our disposal: if I is an ordered k-tuple i_1 < ⋯ < i_k then

\[ dx_I := dx_{i_1} \wedge \cdots \wedge dx_{i_k}. \tag{7.2} \]

We write |I| to indicate the size of the tuple I: if I is a k-tuple, then |I| = k. With this notation, a typical k-form µ on an open subset of R^m has the following expression:

\[ \mu = \sum_{|I| = k} a_I\, dx_I \quad \Bigl( = \sum_{i_1 < \cdots < i_k} a_{i_1 \cdots i_k}\, dx_{i_1} \wedge \cdots \wedge dx_{i_k} \Bigr). \tag{7.3} \]


It makes sense to describe µ as smooth if the a_I’s are smooth functions on M.

This tells us what smoothness of a differential k-form µ on an arbitrary manifold M should mean. Namely, in a coordinate chart φ = (x_1, …, x_m) : U → R^m

\[ \mu|_U = \sum_{|I| = k} a_I\, dx_I \tag{7.4} \]

for some functions a_I : U → R. Then the form µ is smooth if and only if the functions a_I are all smooth. This raises the obvious question of whether smoothness depends on the choices of coordinate charts. In other words, it is not entirely clear whether this definition of smoothness is consistent. The answer is that there is no problem. The issue is closely tied up with the issue of making Λ^k(T^*M) into a smooth manifold, which we punt for the time being. But see subsection 8.2 below.

Since differential forms take values in vector spaces, two forms of the same degree can be added pointwise. It also makes sense to multiply a k-form by a function. This is completely analogous to vector fields being a module over the space of smooth functions.

One can also “multiply” differential forms. Namely, if µ ∈ Ω^k(M) and ν ∈ Ω^l(M) are two differential forms, then at every point q ∈ M, the wedge (exterior) product

µ_q ∧ ν_q

makes sense since µ_q ∈ Λ^k(T^*_q M) and ν_q ∈ Λ^l(T^*_q M). This defines the exterior product on differential forms:

\[ \wedge : \Omega^k(M) \times \Omega^l(M) \to \Omega^{k+l}(M), \qquad (\mu, \nu) \mapsto \mu \wedge \nu, \]

with

\[ (\mu \wedge \nu)_q := \mu_q \wedge \nu_q \quad\text{for all } q \in M. \]

Note that if µ ∈ Ω^0(M), that is, if µ is a function, then µ ∧ ν = µν. That is, wedging a function with a differential form is the same as multiplying the differential form by the function.

7.2. Pullback of differential forms. In order to discuss integration of differential forms we need to discuss their pullback under smooth maps. We start by discussing the underlying linear algebra. Recall that by Lemma 6.27 if A : W → V is a linear map, then there exists a unique linear map

\[ \Lambda^k(A) : \Lambda^k(W) \to \Lambda^k(V), \quad\text{with}\quad \Lambda^k(A)(w_1 \wedge \cdots \wedge w_k) = Aw_1 \wedge \cdots \wedge Aw_k \]

for all w_1, …, w_k ∈ W. In fact, by the universal property of the exterior algebra, we have more than just a collection of linear maps Λ^k(A), k = 0, 1, …. Namely, composing the linear map A : W → V with the structure map V → Λ^*(V) gives a linear map W → Λ^*(V), w ↦ Aw, with Aw ∧ Aw = 0 for all w ∈ W. Hence, by the universal property of Λ^*(W) there is a unique algebra map

\[ \Lambda^*(A) : \Lambda^*(W) \to \Lambda^*(V) \quad\text{with}\quad \Lambda^*(A)(w_1 \wedge \cdots \wedge w_k) = Aw_1 \wedge \cdots \wedge Aw_k \]

for all w_1, …, w_k ∈ W and for all k > 0. Note that Λ^0(A) : Λ^0(W) = R → R = Λ^0(V) is the identity map. This is the reason why the pullback of a 0-form, thought of as a function, is composition (see below).

If F : M → N is a smooth map between two manifolds, it defines a pullback map F^* : C^∞(N) → C^∞(M) on functions by

\[ F^* f := f \circ F \quad\text{for any } f \in C^\infty(N). \]

I claim that F^* extends to a map of algebras F^* : Ω^*(N) → Ω^*(M). Indeed, given F : M → N we have linear maps

\[ dF_q : T_q M \to T_{F(q)} N, \qquad q \in M, \]

and therefore dual maps

\[ dF_q^* : T^*_{F(q)} N \to T^*_q M, \]

which, in turn, induce maps on the exterior powers

\[ \Lambda^k(dF_q^*) : \Lambda^k(T^*_{F(q)} N) \to \Lambda^k(T^*_q M) \]

with

\[ \Lambda^k(dF_q^*)(\nu_1 \wedge \cdots \wedge \nu_k) = (dF_q^* \nu_1) \wedge \cdots \wedge (dF_q^* \nu_k) \]

for ν_1, …, ν_k ∈ T^*_{F(q)} N. Therefore if µ : N → Λ^k(T^*N) is a k-form we define its pullback F^*µ ∈ Ω^k(M) by

\[ (F^*\mu)_q := \Lambda^k(dF_q^*)(\mu_{F(q)}) \quad\text{for all } q \in M. \tag{7.5} \]

This looks a bit convoluted but has a simple (simpler?) interpretation. Recall that for any vector space V we have a canonical isomorphism between the kth exterior power Λ^k(V^*) of its dual and the space of alternating k-linear maps f : V × ⋯ × V → R. The identification in question is given by

\[ (\nu_1 \wedge \cdots \wedge \nu_k)(v_1, \ldots, v_k) := \det(\nu_i(v_j)) \]

for all ν_1, …, ν_k ∈ V^* and all v_1, …, v_k ∈ V (cf. Remark 6.34). Hence, if A : W → V is a linear map and A^* : V^* → W^* its dual, then

\[
\begin{aligned}
\bigl( \Lambda^k(A^*)(\nu_1 \wedge \cdots \wedge \nu_k) \bigr)(w_1, \ldots, w_k)
&= (A^*\nu_1 \wedge \cdots \wedge A^*\nu_k)(w_1, \ldots, w_k)\\
&= \det((A^*\nu_i)(w_j))\\
&= \det(\nu_i(Aw_j))\\
&= (\nu_1 \wedge \cdots \wedge \nu_k)(Aw_1, \ldots, Aw_k).
\end{aligned}
\]

Hence for any µ ∈ Λ^k(V^*)

\[ (\Lambda^k(A^*)\mu)(w_1, \ldots, w_k) = \mu(Aw_1, \ldots, Aw_k). \]

Therefore, the pullback of a differential form µ ∈ Ω^k(N) by F : M → N is given by

\[ (F^*\mu)_q(v_1, \ldots, v_k) := \mu_{F(q)}(dF_q v_1, \ldots, dF_q v_k) \tag{7.6} \]

for all q ∈ M, v_1, …, v_k ∈ T_q M. So why did we define the pullback by (7.5) and not by (7.6)? The reason is that the first definition tells us that pullback automatically respects exterior multiplication of forms:

\[ (F^*\mu) \wedge (F^*\nu) = F^*(\mu \wedge \nu) \tag{7.7} \]

for any two differential forms µ and ν on N. We will see later on that this is useful.

Remark 7.4. It is easy to see that if F : M → N and G : N → Z are two smooth maps then

\[ (G \circ F)^*\mu = F^*(G^*\mu) \]

for any form µ ∈ Ω^*(Z).

Remark 7.5. If N is a submanifold of a manifold M and µ ∈ Ω^*(M) is a differential form on M, the restriction µ|_N of µ to N is, by definition, the pullback of µ to N by the inclusion map ι : N → M.

Before we can get back to our original goal of integrating forms on manifolds, we need to take care of a preliminary observation and some definitions.

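Lemma 7.6. Let U, V ⊂ R^m be two open subsets and F : U → V a diffeomorphism. Then for any smooth function f ∈ C^∞(V)

\[ (F^*(f\, dx_1 \wedge \cdots \wedge dx_m))_q = f(F(q))\, (\det dF_q)\, (dx_1 \wedge \cdots \wedge dx_m)_q \]

for all q ∈ U.

Proof. Let e_1, …, e_m be the standard basis of R^m. Then for any point q, (dx_1)_q, …, (dx_m)_q is the dual basis. Hence

\[
\begin{aligned}
(F^*(f\, dx_1 \wedge \cdots \wedge dx_m))_q(e_1, \ldots, e_m)
&= (f\, dx_1 \wedge \cdots \wedge dx_m)_{F(q)}(dF_q e_1, \ldots, dF_q e_m)\\
&= f(F(q))\, \det\bigl( (dx_i)_{F(q)}(dF_q e_j) \bigr)\\
&= f(F(q)) \cdot \det(dF_q) \cdot 1\\
&= f(F(q))\, \det(dF_q)\, \bigl( (dx_1 \wedge \cdots \wedge dx_m)_q(e_1, \ldots, e_m) \bigr).
\end{aligned}
\]

Since Λ^m((R^m)^*) is 1-dimensional, an element of it is determined by its value on (e_1, …, e_m), and the lemma follows.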
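Lemma 7.6 can be checked numerically in a familiar case. The sketch below assumes m = 2 and the polar coordinate map F(r, θ) = (r cos θ, r sin θ), for which det dF = r, and evaluates the pullback on the frame (e_1, e_2) through formula (7.6) with dF approximated by central differences. The test function `f`, the step size and all helper names are made up for this illustration.

```python
import math

# Sketch: check (F^*(f dx ^ dy))_q(e1, e2) = f(F(q)) * det(dF_q) for the
# polar coordinate map. All names and the test function f are illustrative.

def F(r, t):
    return (r * math.cos(t), r * math.sin(t))

def dF(r, t, h=1e-6):
    """Columns dF_q e1 and dF_q e2, approximated by central differences."""
    c1 = [(a - b) / (2 * h) for a, b in zip(F(r + h, t), F(r - h, t))]
    c2 = [(a - b) / (2 * h) for a, b in zip(F(r, t + h), F(r, t - h))]
    return c1, c2

def f(x, y):
    return x * x + y

def pullback_on_frame(r, t):
    """(F^*(f dx ^ dy))_q(e1, e2), computed from (7.6):
    f(F(q)) * (dx ^ dy)(dF_q e1, dF_q e2)."""
    a, b = dF(r, t)
    return f(*F(r, t)) * (a[0] * b[1] - a[1] * b[0])

r, t = 1.5, 0.7
lhs = pullback_on_frame(r, t)
rhs = f(*F(r, t)) * r          # Lemma 7.6 with det dF_q = r
assert abs(lhs - rhs) < 1e-6
```

This is the familiar statement that dx ∧ dy pulls back to r dr ∧ dθ under polar coordinates.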


Definition 7.7. The support of a k-form µ ∈ Ω^k(M) on a manifold M is the closure of the set of points where µ is non-zero:

\[ \operatorname{supp} \mu = \overline{\{ q \in M \mid \mu_q \neq 0 \}}. \]

We denote the space of compactly supported k-forms on M by Ω^k_c(M).

Definition 7.8. A manifold M is orientable if there is an atlas {φ_α : U_α → R^m} so that for any two indices α and β

\[ \det(d(\varphi_\alpha \circ \varphi_\beta^{-1})_q) > 0 \tag{7.8} \]

for all q ∈ φ_β(U_α ∩ U_β). A choice of such an atlas is an orientation of M. Two atlases on M define the same orientation if their union is an atlas satisfying (7.8).

Example 7.9. The identity map id : Rn → Rn defines an orientation of Rn called the standard orientation.The map φ : Rn → Rn, φ(x1, x2, . . . , xn) = (−x1, x2, . . . , xn) defines a different orientation.

Remark 7.10. It is not at all obvious at this point, but a given connected orientable manifold can have only two orientations.

Example 7.11. It should not be too hard to see that an n-sphere S^n is orientable. Somewhat harder is the fact that the real projective space RP^n is orientable if and only if n is odd. The Klein bottle and the Möbius strip are not orientable.

7.3. Integration. We now proceed with defining integration of compactly supported m-forms over oriented manifolds of dimension m. Given an oriented manifold M of dimension m and a compactly supported form µ ∈ Ω^m_c(M) of top degree we want to define a number ∫_M µ in a reasonable way. For example we want the integration map

\[ \int_M : \Omega^m_c(M) \to \mathbb{R}, \qquad \mu \mapsto \int_M \mu \]

to be linear.

If µ ∈ Ω^m_c(R^m) then

\[ \mu = f\, dx_1 \wedge \cdots \wedge dx_m \]

for some compactly supported function f. We define

\[ \int_{\mathbb{R}^m} f(x)\, dx_1 \wedge \cdots \wedge dx_m := \int_{\mathbb{R}^m} f(x)\, dx \]

where the right hand side is the Riemann integral of the compactly supported function f over R^m. (Note that dx_2 ∧ dx_1 ∧ ⋯ ∧ dx_m = −dx_1 ∧ dx_2 ∧ ⋯ ∧ dx_m, hence ∫_{R^m} f(x) dx_2 ∧ dx_1 ∧ ⋯ ∧ dx_m = −∫_{R^m} f(x) dx and so on.) The definition naturally extends to arbitrary open subsets of R^m: if U ⊂ R^m is open and µ = f dx_1 ∧ ⋯ ∧ dx_m ∈ Ω^m_c(U) then

\[ \int_U \mu := \int_U f(x)\, dx. \]

Clearly the map ∫_U : Ω^m_c(U) → R is linear. In particular if µ = 0, then ∫_U µ = 0 as well.

Next we consider the change of variables formula for the integration of m forms over open subsets of Rm.

Definition 7.12. A diffeomorphism F : U → V between two open subsets of R^m is orientation-preserving if det(dF_x) > 0 for all x ∈ U.

Lemma 7.13 (change of variables formula for differential forms). Let F : U → V be an orientation-preserving diffeomorphism between two open subsets of R^m and let ω ∈ Ω^m_c(V) be a compactly supported form of top degree. Then

\[ \int_U F^*\omega = \int_{F(U)} \omega. \tag{7.9} \]


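Proof. We know that ω = f dx_1 ∧ ⋯ ∧ dx_m for some f ∈ C^∞_c(V) and that

\[ \int_V \omega = \int_V f(x)\, dx. \]

On the other hand, by Lemma 7.6,

\[ F^*\omega = (f \circ F) \cdot \det dF \cdot dx_1 \wedge \cdots \wedge dx_m. \]

Hence,

\[ \int_U F^*\omega = \int_U (f \circ F)(x)\, \det dF_x\, dx. \]

Since det(dF_x) = |det(dF_x)| for all x ∈ U by assumption, we have

\[
\begin{aligned}
\int_U F^*\omega &= \int_U (f \circ F)(x)\, |\det dF_x|\, dx\\
&= \int_{F(U)} f(y)\, dy \quad\text{by (7.1)}\\
&= \int_V \omega.
\end{aligned}
\]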
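The role of the orientation hypothesis shows up clearly in a one-dimensional numerical experiment. The sketch below assumes U = (0, 1), V = (0, 2), a bump function f supported in [0.5, 1.5], the orientation-preserving map F(x) = x + x^2 and the orientation-reversing map G(x) = 2 − F(x); all of these choices and the helper `midpoint` are made up for this illustration.

```python
import math

# Sketch: integral of F^* omega versus integral of omega for omega = f(y) dy.
# U = (0, 1), V = (0, 2); the maps F, G and the bump f are assumed examples.

def midpoint(g, a, b, n=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def f(y):
    """Smooth bump supported in [0.5, 1.5], hence compactly supported in V."""
    if 0.5 < y < 1.5:
        return math.exp(-1.0 / ((y - 0.5) * (1.5 - y)))
    return 0.0

F = lambda x: x + x * x       # F : U -> V, F'(x) = 1 + 2x > 0
dF = lambda x: 1 + 2 * x
G = lambda x: 2 - F(x)        # G : U -> V, G'(x) < 0
dG = lambda x: -dF(x)

int_V = midpoint(f, 0.0, 2.0)
# By Lemma 7.6 in dimension 1, F^*(f dy) = f(F(x)) F'(x) dx, with no absolute value.
int_U_F = midpoint(lambda x: f(F(x)) * dF(x), 0.0, 1.0)
int_U_G = midpoint(lambda x: f(G(x)) * dG(x), 0.0, 1.0)

assert abs(int_U_F - int_V) < 1e-6    # (7.9) for the orientation-preserving F
assert abs(int_U_G + int_V) < 1e-6    # the sign flips for the orientation-reversing G
```

For G the determinant equals minus its absolute value, so the same computation as in the proof produces the integral of ω with the opposite sign.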

Theorem 7.14. Let M be an oriented m-dimensional manifold. There exists a unique linear map (integration)

\[ \int_M : \Omega^m_c(M) \to \mathbb{R}, \qquad \mu \mapsto \int_M \mu \]

such that if φ : U → R^m is a coordinate chart (in an atlas defining the orientation of M) and ω ∈ Ω^m_c(U) a compactly supported form of top degree, then

\[ \int_M \omega = \int_{\varphi(U)} (\varphi^{-1})^*\omega. \]

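Proof. We need to check that integration of forms is well-defined and unique (linearity follows from familiar properties of the Riemann integrals). We do this in two steps.

Step I. We check that if the support of ω is in an open subset U ⊂ M and φ : U → R^m, ψ : U → R^m are two different charts on M defining the same orientation, then

\[ \int_{\varphi(U)} (\varphi^{-1})^*\omega = \int_{\psi(U)} (\psi^{-1})^*\omega. \]

Since φ^{-1} = ψ^{-1} ∘ (ψ ∘ φ^{-1}),

\[
\begin{aligned}
(\varphi^{-1})^*\omega &= (\psi^{-1} \circ (\psi \circ \varphi^{-1}))^*\omega\\
&= (\psi \circ \varphi^{-1})^*((\psi^{-1})^*\omega) \quad\text{by Remark 7.4.}
\end{aligned}
\]

Since the two charts define the same orientation, ψ ∘ φ^{-1} : φ(U) → ψ(U) is an orientation-preserving diffeomorphism. By the change of variables formula (7.9),

\[
\begin{aligned}
\int_{\varphi(U)} (\varphi^{-1})^*\omega &= \int_{\varphi(U)} (\psi \circ \varphi^{-1})^*((\psi^{-1})^*\omega)\\
&= \int_{\psi(U)} (\psi^{-1})^*\omega.
\end{aligned}
\]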

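Step II. We now deal with the general case. Let $\omega \in \Omega^m_c(M)$ be an arbitrary compactly supported form on the manifold $M$ of degree $m = \dim M$. Let $\{\varphi_\alpha : U_\alpha \to \mathbb{R}^m\}$ be an atlas on $M$ giving it its orientation. Since $\operatorname{supp} \omega$ is compact, there are finitely many sets $U_1, \ldots, U_n$ of the atlas with $U_1 \cup \cdots \cup U_n \supset \operatorname{supp} \omega$. Let $U_0 := M \setminus \operatorname{supp} \omega$. Then $\{U_0, U_1, \ldots, U_n\}$ is a cover of $M$. Let $\{\rho_i\}_{i=0}^n$ be a partition of unity subordinate to this cover. Note that $\rho_0 \omega \equiv 0$. Define the integral of $\omega$ over $M$ by

\[ \int_M \omega := \sum_{i=1}^n \int_{\varphi_i(U_i)} (\varphi_i^{-1})^*(\rho_i \omega). \tag{7.10} \]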

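We need to show that our definition of $\int_M \omega$ does not depend on the choices we made. Accordingly, suppose that $\{\psi_\beta : V_\beta \to \mathbb{R}^m\}$ is another atlas giving $M$ the same orientation, $V_1, \ldots, V_\ell$ is a cover of $\operatorname{supp} \omega$, $V_0 = M \setminus \operatorname{supp} \omega$, and $\{\tau_j\}_{j=0}^{\ell}$ is a partition of unity subordinate to the cover $\{V_0, V_1, \ldots, V_\ell\}$ of $M$. By Step I, for all indices $i > 0$ and $j > 0$,

\[ \int_{\psi_j(U_i \cap V_j)} (\psi_j^{-1})^*(\tau_j \rho_i \omega) = \int_{\varphi_i(U_i \cap V_j)} (\varphi_i^{-1})^*(\tau_j \rho_i \omega). \tag{7.11} \]

Therefore,

\begin{align*}
\sum_i \int_{\varphi_i(U_i)} (\varphi_i^{-1})^*(\rho_i \omega)
&= \sum_i \int_{\varphi_i(U_i)} (\varphi_i^{-1})^*\Bigl(\rho_i \Bigl(\sum_j \tau_j \omega\Bigr)\Bigr) \\
&= \sum_{i,j} \int_{\varphi_i(U_i \cap V_j)} (\varphi_i^{-1})^*(\rho_i \tau_j \omega) \\
&= \sum_{i,j} \int_{\psi_j(U_i \cap V_j)} (\psi_j^{-1})^*(\rho_i \tau_j \omega) && \text{by (7.11)} \\
&= \sum_j \int_{\psi_j(V_j)} (\psi_j^{-1})^*(\tau_j \omega).
\end{align*}

Therefore the integral of $\omega$ over $M$ is well-defined.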

The following lemma is very useful for carrying out integration.

Lemma 7.15. Let $M$ be an oriented manifold of dimension $m$, $\omega \in \Omega^m_c(M)$ a compactly supported form and $N \subset M$ an embedded submanifold of codimension 1 or greater. Then

\[ \int_M \omega = \int_{M \setminus N} \omega. \]

Proof. We may assume that $M$ is an open subset of $\mathbb{R}^m$ (why?). In this case the result follows easily from the properties of Riemann integrals of functions.

To compute any integrals of forms, we also need to have a good way of computing pull-backs of forms. We have already seen that if $f : N \to \mathbb{R}$ is a 0-form (i.e., a function) and $F : M \to N$ a smooth map of manifolds then $F^*f = f \circ F$.

Exercise 7.1. Let $F : M \to N$ be a smooth map of manifolds and $f : N \to \mathbb{R}$ a smooth function. Then $df$ is a 1-form on $N$ and

\[ F^*df = d(f \circ F). \]

Solution: for any point $q \in M$ and any $v \in T_qM$, $(F^*df)_q(v) = df_{F(q)}(dF_q(v)) = d(f \circ F)_q(v)$ by the chain rule.

Exercise 7.2. Compute the integral of (the restriction of) the one-form $x\,dy - y\,dx$ over the circle $S^1 = \{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 = 1\}$ (pick any orientation of the circle you want).

Solution: Consider the map $F : (0, 2\pi) \to S^1$ given by $F(t) = (\cos t, \sin t)$. Note that the image of $F$ is all of $S^1$ except for one point. Therefore, by Lemma 7.15, $\int_{S^1} x\,dy - y\,dx = \int_{F((0,2\pi))} x\,dy - y\,dx$. Also $F : (0, 2\pi) \to S^1$ is an open embedding, hence the inverse of $F$ is a coordinate chart on $S^1$. Therefore,

\[ \int_{F((0,2\pi))} x\,dy - y\,dx = \int_{(0,2\pi)} F^*(x\,dy - y\,dx). \]

Since pull-back respects exterior multiplication,

\[ F^*(x\,dy - y\,dx) = (F^*x)(F^*dy) - (F^*y)(F^*dx). \]

But $F^*x = \cos t$ and $F^*y = \sin t$, while by Exercise 7.1 $F^*dx = d(F^*x) = d\cos t = -\sin t\, dt$ and, similarly, $F^*dy = d\sin t = \cos t\, dt$. Therefore

\[ F^*(x\,dy - y\,dx) = \cos t\, d\sin t - \sin t\, d\cos t = \cos^2 t\, dt + \sin^2 t\, dt = dt. \]

We conclude that

\[ \int_{S^1} x\,dy - y\,dx = \int_{(0,2\pi)} dt = 2\pi. \]
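The computation above can be cross-checked numerically. The following is a minimal sketch in plain Python; the helper names `pulled_back_integrand` and `midpoint_rule` are made up for this illustration. It evaluates the coefficient of $dt$ in $F^*(x\,dy - y\,dx)$ and integrates it over $(0, 2\pi)$ with the midpoint rule.

```python
import math

def pulled_back_integrand(t):
    """Coefficient of dt in F^*(x dy - y dx) for F(t) = (cos t, sin t)."""
    x, y = math.cos(t), math.sin(t)
    dx_dt, dy_dt = -math.sin(t), math.cos(t)
    return x * dy_dt - y * dx_dt

def midpoint_rule(g, a, b, n=1000):
    """Approximate the Riemann integral of g over (a, b) with n midpoints."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# Hypothetical check: the value should agree with 2*pi up to rounding.
integral = midpoint_rule(pulled_back_integrand, 0.0, 2.0 * math.pi)
print(integral, 2.0 * math.pi)
```

Since the integrand is identically 1 up to rounding, any quadrature rule would do; the midpoint rule is used only because it never evaluates at the excluded endpoints.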

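Exercise 7.3. Compute the pull-back of $dx \wedge dy$ by the map $F : (0, \infty) \times \mathbb{R} \to \mathbb{R}^2$, $F(r, \theta) = (r\cos\theta, r\sin\theta)$.

Solution:

\begin{align*}
F^*(dx \wedge dy) &= F^*dx \wedge F^*dy = d(F^*x) \wedge d(F^*y) = d(r\cos\theta) \wedge d(r\sin\theta) \\
&= (\cos\theta\, dr - r\sin\theta\, d\theta) \wedge (\sin\theta\, dr + r\cos\theta\, d\theta) \\
&= r\cos^2\theta\, dr \wedge d\theta - r\sin^2\theta\, d\theta \wedge dr = r\, dr \wedge d\theta.
\end{align*}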
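By Lemma 7.6, the coefficient $r$ in this answer is the Jacobian determinant of $F$. A small numerical sketch (plain Python, central differences; the function names and the step size `h` are choices made for this illustration) confirms this at sample points.

```python
import math

def F(r, theta):
    """Polar coordinate map F(r, theta) = (r cos(theta), r sin(theta))."""
    return (r * math.cos(theta), r * math.sin(theta))

def jacobian_det(r, theta, h=1e-6):
    """Approximate det(dF) at (r, theta) using central differences."""
    x_r = (F(r + h, theta)[0] - F(r - h, theta)[0]) / (2 * h)
    y_r = (F(r + h, theta)[1] - F(r - h, theta)[1]) / (2 * h)
    x_t = (F(r, theta + h)[0] - F(r, theta - h)[0]) / (2 * h)
    y_t = (F(r, theta + h)[1] - F(r, theta - h)[1]) / (2 * h)
    return x_r * y_t - x_t * y_r

# The determinant should be close to r, the coefficient of dr ^ dtheta.
for r, theta in [(2.0, 0.7), (0.5, -1.2)]:
    print(r, jacobian_det(r, theta))
```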

The definition of orientability of a manifold that we used above is convenient for defining integration. It is inconvenient for everything else. The following criterion is useful.

Proposition 7.16. An $m$-dimensional manifold $M$ is orientable if and only if there is a form $\nu$ on $M$ of degree $m$ so that

\[ \nu_q \neq 0 \quad \text{for all } q \in M. \]

Remark 7.17. A nowhere vanishing form of top degree on a manifold $M$ as in Proposition 7.16 is called a volume form.

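Proof of Proposition 7.16. Suppose $M$ is orientable and $\{\varphi_\alpha : U_\alpha \to \mathbb{R}^m\}$ is an atlas giving $M$ an orientation. Let $\{\rho_\alpha\}$ be a partition of unity subordinate to the cover $\{U_\alpha\}$ of $M$. Define an $m$-form $\nu$ on $M$ by

\[ \nu = \sum_\alpha \rho_\alpha\, \varphi_\alpha^*(dx_1 \wedge \cdots \wedge dx_m). \]

We need to check that $\nu$ vanishes nowhere. Fix a point $q \in M$. Then $\rho_\alpha(q) \neq 0$ for finitely many (and at least one) $\alpha$, say $\alpha_1, \ldots, \alpha_k$. Therefore

\begin{align*}
((\varphi_{\alpha_1}^{-1})^*\nu)_{\varphi_{\alpha_1}(q)}
&= \sum_{i=1}^k \bigl((\varphi_{\alpha_1}^{-1})^*(\rho_{\alpha_i} \varphi_{\alpha_i}^*(dx_1 \wedge \cdots \wedge dx_m))\bigr)_{\varphi_{\alpha_1}(q)} \\
&= \sum_{i=1}^k \rho_{\alpha_i}(q) \bigl((\varphi_{\alpha_i} \circ \varphi_{\alpha_1}^{-1})^*(dx_1 \wedge \cdots \wedge dx_m)\bigr)_{\varphi_{\alpha_1}(q)} \\
&= \Bigl(\sum_{i=1}^k \rho_{\alpha_i}(q) \det\bigl(d(\varphi_{\alpha_i} \circ \varphi_{\alpha_1}^{-1})_{\varphi_{\alpha_1}(q)}\bigr)\Bigr) (dx_1 \wedge \cdots \wedge dx_m)_{\varphi_{\alpha_1}(q)} \neq 0,
\end{align*}

since $\det\bigl(d(\varphi_{\alpha_i} \circ \varphi_{\alpha_1}^{-1})_{\varphi_{\alpha_1}(q)}\bigr) > 0$ and $\rho_{\alpha_i}(q) > 0$.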

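Conversely suppose $\nu \in \Omega^m(M)$ is a volume form. We want to find an atlas $\{\varphi_\alpha : U_\alpha \to \mathbb{R}^m\}$ so that

\[ \det d(\varphi_\alpha \circ \varphi_\beta^{-1})_q > 0 \]

for all $q$ and all $\alpha, \beta$. Let $\{\psi_\beta : V_\beta \to \mathbb{R}^m\}$ be an arbitrary atlas on $M$. It is no loss of generality to assume that all the sets $V_\beta$ are connected. Then for each index $\beta$

\[ (\psi_\beta^{-1})^*\nu = f_\beta\, dx_1 \wedge \cdots \wedge dx_m, \]

with $f_\beta(x) \neq 0$ for all $x \in \psi_\beta(V_\beta)$. Since $V_\beta$ is connected, $f_\beta$ is either strictly positive or strictly negative. If $f_\beta > 0$, keep the chart $\psi_\beta$. Otherwise replace it by $T \circ \psi_\beta$, where $T : \mathbb{R}^m \to \mathbb{R}^m$ is the diffeomorphism given by

\[ T(x_1, x_2, \ldots, x_m) = (-x_1, x_2, \ldots, x_m). \]

In the resulting atlas every chart pulls $\nu$ back to a positive multiple of $dx_1 \wedge \cdots \wedge dx_m$, so by Lemma 7.6 all transition maps have positive Jacobian determinant.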

Exercise 7.4. Suppose that $M$ and $N$ are orientable manifolds. Prove that their product $M \times N$ is orientable.

Exercise 7.5. Show that the tangent bundle $TM$ is always orientable, regardless of whether or not the manifold $M$ is.

Exercise 7.6. Evaluate $\int_S \omega|_S$ where $S$ is the helicoid in $\mathbb{R}^3$ parameterized by $\varphi(s, t) = (s\cos t, s\sin t, t)$, $0 < s < 1$, $0 < t < 4\pi$, and $\omega = z\, dx \wedge dy + 3\, dz \wedge dx - x\, dy \wedge dz$. Use the orientation of $S$ defined by $\varphi$ (that is, $\varphi^{-1} : S \to \mathbb{R}^2$ is a coordinate chart on $S$).

8. Vector bundles

Informally, a vector bundle is a collection of vector spaces parameterized by points in a manifold. You have already seen two examples: the tangent bundle and the cotangent bundle. Here is the formal definition.

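Definition 8.1. A real vector bundle $E$ of rank $k$ over a manifold $M$ is a manifold $E$ together with a smooth map $\pi : E \to M$ so that

(1) for each $x \in M$ the fiber $E_x := \pi^{-1}(x)$ is a real vector space of dimension $k$, and

(2) for all $x \in M$, there is an open neighborhood $U$ of $x$ in $M$ and a diffeomorphism $\psi : \pi^{-1}(U) \to U \times \mathbb{R}^k$ such that $\mathrm{pr} \circ \psi = \pi$, where $\mathrm{pr}$ is the natural projection from $U \times \mathbb{R}^k$ to $U$, $\mathrm{pr}(q, v) = q$. That is, the diagram

\[
\begin{array}{ccc}
\pi^{-1}(U) & \xrightarrow{\ \psi\ } & U \times \mathbb{R}^k \\
 & {\scriptstyle \pi} \searrow & \downarrow {\scriptstyle \mathrm{pr}} \\
 & & U
\end{array}
\]

commutes. Hence $\psi$ maps the fiber $E_y$ to $\{y\} \times \mathbb{R}^k$ for all $y \in U$. Additionally we require that the restrictions

\[ \psi|_{E_y} : E_y \to \{y\} \times \mathbb{R}^k \]

are vector space isomorphisms for all $y \in U$.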

Definition 8.2.

• The manifold $E$ is called the total space of the bundle $\pi : E \to M$.
• The manifold $M$ is called the base of the bundle $\pi : E \to M$.
• The maps $\psi : \pi^{-1}(U) \to U \times \mathbb{R}^k$ are called local trivializations of the bundle $\pi : E \to M$.

Example 8.3. The projection π : M × Rk →M , π(m, v) = m is a vector bundle of rank k.

Example 8.4. I claim that the tangent bundle $\pi : TM \to M$ is a vector bundle of rank $\dim M$. Let us construct local trivializations. Given a point $q \in M$ choose a coordinate chart $\varphi = (x_1, \ldots, x_m) : U \to \mathbb{R}^m$ with $q \in U$. The map

\[ \psi : TM|_U \equiv \pi^{-1}(U) \to U \times \mathbb{R}^m, \qquad \psi(v) = \bigl(\pi(v), (dx_1(v), \ldots, dx_m(v))\bigr) \]

is a local trivialization. Note that its inverse $\psi^{-1} : U \times \mathbb{R}^m \to TM|_U$ is given by

\[ \psi^{-1}(p, (a_1, \ldots, a_m)) = \sum_i a_i \frac{\partial}{\partial x_i}\Big|_p. \]

Example 8.5. The cotangent bundle T ∗M →M is also a vector bundle over M of rank dimM .

It is useful to be able to say when two vector bundles are “the same.”

Definition 8.6. Let $\pi_E : E \to M$ and $\pi_F : F \to M$ be two vector bundles over a manifold $M$. A smooth map $f : E \to F$ is a vector bundle map if $f(E_x) \subset F_x$ for all $x$ and if the map $f|_{E_x} : E_x \to F_x$ is linear.

A vector bundle map $f : E \to F$ is an isomorphism of vector bundles if it is invertible and if $f^{-1} : F \to E$ is a vector bundle map.

Definition 8.7. A vector bundle $E \to M$ of rank $k$ is trivial if there is a vector bundle isomorphism $E \to M \times \mathbb{R}^k$.

Example 8.8. The vector field $X = x_1 \frac{\partial}{\partial x_2} - x_2 \frac{\partial}{\partial x_1}$ on $\mathbb{R}^2$ is tangent to the unit circle $S^1$ and is not zero anywhere on it. Therefore the map $f : S^1 \times \mathbb{R} \to TS^1$, $f(q, t) = tX_q$, is an invertible vector bundle map. Convince yourself that the inverse $f^{-1} : TS^1 \to S^1 \times \mathbb{R}$ is smooth.
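To make the example concrete, here is a small sketch in plain Python that treats tangent vectors to $S^1$ as vectors in $\mathbb{R}^2$. The function names are made up for this illustration, and the explicit inverse $v \mapsto \langle v, X_q\rangle / \langle X_q, X_q\rangle$ is one possible formula for $f^{-1}$ on the fiber over $q$, stated here as an assumption rather than taken from the text.

```python
import math

def X(q):
    """X = x1 d/dx2 - x2 d/dx1 at q = (x1, x2), as a vector in R^2."""
    x1, x2 = q
    return (-x2, x1)

def f(q, t):
    """The bundle map S^1 x R -> TS^1, (q, t) -> t * X_q."""
    Xq = X(q)
    return (q, (t * Xq[0], t * Xq[1]))

def f_inverse(q, v):
    """Recover t from v = t * X_q by projecting v onto X_q."""
    Xq = X(q)
    t = (v[0] * Xq[0] + v[1] * Xq[1]) / (Xq[0] ** 2 + Xq[1] ** 2)
    return (q, t)

q = (math.cos(0.3), math.sin(0.3))           # a point on the unit circle
tangency = q[0] * X(q)[0] + q[1] * X(q)[1]   # <q, X_q>; 0 means X_q is tangent
round_trip = f_inverse(*f(q, 2.5))[1]        # should recover t = 2.5
print(tangency, round_trip)
```

The projection formula is a rational expression in the coordinates of $q$ and $v$ with nonvanishing denominator on $S^1$, which is one way to see that the inverse is smooth.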

Exercise 8.1. Show that the tangent bundle of the $n$-torus $T^n := \underbrace{S^1 \times \cdots \times S^1}_{n}$ is trivial.

Exercise 8.2. Let $E \to M$, $F \to M$ be two vector bundles over a manifold $M$. Show that if $f : E \to F$ is a vector bundle map and $f$ has a set-theoretic inverse $f^{-1}$, then $f^{-1}$ is a vector bundle map.

Hints. Consider first the case where $E$ and $F$ are trivial bundles: $E = M \times \mathbb{R}^k$, $F = M \times \mathbb{R}^l$. Prove that the map $\mathrm{inv} : GL(\mathbb{R}, k) \to GL(\mathbb{R}, k)$ given by $A \mapsto A^{-1}$ is smooth. Use trivializations to reduce everything to the trivial bundles case.

Exercise 8.3. If $\pi : E \to M$ is a vector bundle and $N \subset M$ is an embedded submanifold, show that $\pi : E|_N := \pi^{-1}(N) \to N$ is a vector bundle over $N$, called the restriction of $E$ to $N$.

Hints: to show that $E|_N$ is a submanifold of $E$, prove that $\pi : E \to M$ is transverse to $N$. The local trivializations of $E|_N$ are the restrictions of the local trivializations of $E$.

Example 8.9 (tautological line bundle). (This is a sketch with no actual proofs.) Recall that the complex projective space $\mathbb{CP}^n$ is the set of all complex lines in $\mathbb{C}^{n+1}$. We identify lines $l \in \mathbb{CP}^n$ with equivalence classes $[v]$ of nonzero vectors. The equivalence relation is given by $v \sim v'$ if and only if $v$ and $v'$ are collinear: $v = \lambda v'$ for some $0 \neq \lambda \in \mathbb{C}$. We define the tautological complex line bundle $\pi : L \to \mathbb{CP}^n$ as follows. We let

\[ L = \{(l, v) \in \mathbb{CP}^n \times \mathbb{C}^{n+1} : v \in l\} \]

and define $\pi : L \to \mathbb{CP}^n$ by

\[ \pi(l, v) = l. \]

Thus the fiber $\pi^{-1}(l)$ consists of all vectors $v \in \mathbb{C}^{n+1}$ that lie on the line $l$, that is, of the complex line $l$ itself. Hence the name. I claim that $\pi : L \to \mathbb{CP}^n$ is indeed a real vector bundle of rank 2 (2 because $\mathbb{C}$ is a 2-dimensional vector space over $\mathbb{R}$).

Why is $L$ a manifold? The relationship $v \in l$ is really a collection of algebraic equations: if $v \in [w]$ then $(v_1, \ldots, v_{n+1}) = \lambda(w_1, \ldots, w_{n+1})$ for some $0 \neq \lambda \in \mathbb{C}$. Therefore $v_i/w_i = \lambda = v_j/w_j$ for all $i$ and $j$ and hence

\[ v_j w_i = v_i w_j \quad \text{for all } i, j. \]

From this one can deduce that $L$ is indeed a manifold (in other words I am not really giving you a proof). To construct local trivializations let

\[ U_i = \{[w] \in \mathbb{CP}^n : w_i \neq 0\}. \]

Define $\psi_i : \pi^{-1}(U_i) \to U_i \times \mathbb{C}$ by

\[ \psi_i([w], v) = ([w], v_i). \]

8.1. Sections.

Definition 8.10 (Section). A section $s : M \to E$ of a vector bundle $\pi : E \to M$ is a $C^\infty$ map such that $\pi \circ s = \mathrm{id}_M$. That is, a section is a smooth map from $M$ to $E$ such that $s(x) \in E_x$ for all $x$.

Notation. We denote the set of smooth sections of a bundle π : E →M by Γ(E).

Example 8.11.

• The space of sections $\Gamma(TM)$ of the tangent bundle of a manifold $M$ is the space of vector fields on $M$.

• The space of sections $\Gamma(T^*M)$ of the cotangent bundle of a manifold $M$ is the space of 1-forms on $M$. That is, $\Gamma(T^*M) = \Omega^1(M)$.

• The space of sections $\Gamma(\Lambda^k(T^*M))$ of the $k$th exterior power of the cotangent bundle (which we have not constructed yet) is the space of $k$-forms $\Omega^k(M)$.

• The space of sections $\Gamma(M \times \mathbb{R})$ of the trivial bundle $M \times \mathbb{R} \to M$ is the space of smooth maps of the form $m \mapsto (m, f(m))$, where $f : M \to \mathbb{R}$ is a smooth function. Thus $\Gamma(M \times \mathbb{R})$ “is” $C^\infty(M)$.

Lemma 8.12. Let $E \to M$ be a vector bundle over a manifold $M$. The space of sections $\Gamma(E)$ is a vector space over $\mathbb{R}$ under pointwise addition and multiplication by scalars. Moreover, if $s : M \to E$ is a section and $f \in C^\infty(M)$ is a function then we can define a new section $fs : M \to E$ by

\[ (fs)_q = f(q)\, s_q \]

for $q \in M$. Thus the space of sections $\Gamma(E)$ is a module over the space of functions $C^\infty(M)$.

Proof. The only possible worry is this: suppose $f, \tilde{f} \in C^\infty(M)$ are two smooth functions and $s, \tilde{s} \in \Gamma(E)$ are two smooth sections. Is the section $fs + \tilde{f}\tilde{s}$ smooth? Since every bundle is locally trivial and since smoothness is a local condition, we may assume that $E$ is the trivial bundle $M \times \mathbb{R}^k \to M$. In this case $\Gamma(E)$ “is” the space of smooth maps from $M$ to $\mathbb{R}^k$, and the lemma is clearly true for these maps: they do form a module over $C^\infty(M)$.

Remark 8.13. The map that assigns to every point $q \in M$ the origin $0_q$ in the fiber $E_q$ is smooth. It is called the zero section and is often denoted by 0.

Definition 8.14 (local section). A local section of a vector bundle $\pi : E \to M$ is a section of $\pi^{-1}(U) =: E|_U \to U$ for some open $U \subset M$. Equivalently, a local section is a smooth map $s : U \to E$ such that $\pi \circ s = \mathrm{id}_U$.

Example 8.15. If $(U, x_1, \ldots, x_n)$ is a coordinate chart on a manifold $M$, then for each index $i$, the map $q \mapsto \frac{\partial}{\partial x_i}\big|_q$ is a local section of the tangent bundle $TM \to M$. Similarly $dx_i : U \to T^*M$ is a local section of the cotangent bundle.

Exercise 8.4. Let $E \to M$ be a vector bundle, $x \in M$ a point and $v \in E_x$. Show that there is a global section $s$ with $s_x = v$.

8.2. Frames and local frames. We now address the issue raised in section 7: a $k$-form $\mu \in \Omega^k(M)$ on a manifold $M$ is smooth if for any coordinate chart $(x_1, \ldots, x_m) : U \to \mathbb{R}^m$ on $M$, we have

\[ \mu = \sum_{|I|=k} a_I\, dx_I \quad \Bigl( = \sum_{i_1 < \cdots < i_k} a_{i_1 \cdots i_k}\, dx_{i_1} \wedge \cdots \wedge dx_{i_k} \Bigr), \]

where the $a_I$'s are smooth functions on $U$. To this end we define frames on a vector bundle.

Definition 8.16. Let $E \to M$ be a vector bundle of rank $k$. A collection $s_1, \ldots, s_k \in \Gamma(E)$ of sections is a frame of $E$ if for each point $x \in M$ the vectors $\{s_1(x), \ldots, s_k(x)\}$ form a basis of the fiber $E_x$.

Similarly, a collection of local sections $s_1, \ldots, s_k : U \to E$ is a local frame of $E$ if for each point $x \in U$ the vectors $\{s_1(x), \ldots, s_k(x)\}$ form a basis of the fiber $E_x$.

Example 8.17. A nowhere zero vector field $X$ on a circle $S^1$ is a frame of the tangent bundle $TS^1 \to S^1$.

Proposition 8.18. A vector bundle $E \to M$ of rank $k$ is trivial if and only if it has a frame of $k$ sections $s_1, \ldots, s_k \in \Gamma(E)$.

Proof. Suppose that $E$ is a trivial vector bundle over a manifold $M$. Then we have a global trivialization $\psi : \pi^{-1}(M) \to M \times \mathbb{R}^k$. Define

\[ s_i(x) = \psi^{-1}(x, e_i), \]

where $\{e_1, \ldots, e_k\}$ is the canonical basis for $\mathbb{R}^k$. Then the collection $\{s_1, \ldots, s_k\}$ satisfies the desired properties.

Conversely, suppose that we have smooth sections $\{s_1, \ldots, s_k\}$ that form a basis of $E_x$ at every $x$. Then a global trivialization is given by the inverse of the vector bundle isomorphism $M \times \mathbb{R}^k \to E$,

\[ (q, v_1, \ldots, v_k) \mapsto \sum_i v_i s_i(q). \]

Exercise 8.5. A section $s$ of a vector bundle $E \to M$ is smooth if and only if for each point $q \in M$ there is a neighborhood $U$ of $q$, a local frame $s_1, \ldots, s_k : U \to E$ and smooth functions $f_1, \ldots, f_k \in C^\infty(U)$ so that

\[ s = f_1 s_1 + \cdots + f_k s_k. \]

Exercise 8.6. Show that the discussion of smoothness of $k$-forms following (7.4) is correct: if $(x_1, \ldots, x_m) : U \to \mathbb{R}^m$ is a coordinate chart on a manifold $M$, then $\{dx_1, \ldots, dx_m\}$ is a local frame of $T^*M$ over $U$. Hence

\[ \{dx_I \mid |I| = k\} \]

is a local frame of the $k$th exterior power $\Lambda^k(T^*M)$ of the cotangent bundle. By Exercise 8.5, a section $\omega \in \Gamma(\Lambda^k(T^*M))$ is smooth on $U$ if and only if there are smooth functions $a_I : U \to \mathbb{R}$ such that

\[ \omega|_U = \sum_I a_I\, dx_I. \]

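8.3. Vector bundles via transition maps. The goal of this section is to design a way of tearing vector bundles apart and then putting them back together in a new way. This will allow us to carry over the operations of direct sum $\oplus$, tensor product $\otimes$, exterior power $\Lambda^k$, taking duals and so on from vector spaces to vector bundles.

Suppose that $\pi : E \to M$ is a vector bundle of rank $k$ and $\{U_\alpha\}$ is a cover of $M$ such that the restrictions $E|_{U_\alpha}$ are trivial, and let $\psi_\alpha : \pi^{-1}(U_\alpha) \to U_\alpha \times \mathbb{R}^k$ denote the local trivializations. If $U_\alpha \cap U_\beta \neq \emptyset$, we have a map

\[ \psi_\beta \circ \psi_\alpha^{-1} : (U_\alpha \cap U_\beta) \times \mathbb{R}^k \to (U_\alpha \cap U_\beta) \times \mathbb{R}^k. \]

Since the trivializations $\psi_\alpha$ map fibers $E_y$ linearly to fibers $\{y\} \times \mathbb{R}^k$, the composition $\psi_\beta \circ \psi_\alpha^{-1}$ maps $\{y\} \times \mathbb{R}^k$ linearly to $\{y\} \times \mathbb{R}^k$ for all $y \in U_\alpha \cap U_\beta$. Hence the map $\psi_\beta \circ \psi_\alpha^{-1}$ has to be of the form

\[ \psi_\beta \circ \psi_\alpha^{-1}(y, v) = (y, \psi_{\beta\alpha}(y)v) \]

for some function $\psi_{\beta\alpha} : U_\alpha \cap U_\beta \to GL(\mathbb{R}^k)$. Note that $\psi_{\beta\alpha}$ is smooth, because for every basis vector $e_j$ of $\mathbb{R}^k$, the map

\[ q \mapsto \psi_{\beta\alpha}(q)e_j \]

is smooth. Such maps are called transition maps for the bundle $\pi : E \to M$. It is not hard to see that the set of transition maps $\{\psi_{\alpha\beta} : U_\alpha \cap U_\beta \to GL(\mathbb{R}^k)\}$ for the bundle $E \to M$ relative to the cover $\{U_\alpha\}$ satisfies the following three conditions, called the cocycle conditions:

(1) $\psi_{\alpha\alpha} = \mathrm{id}$ on $U_\alpha$ for all $\alpha$.
(2) $\psi_{\alpha\beta} \cdot \psi_{\beta\alpha} = \mathrm{id}$ on $U_\alpha \cap U_\beta$ for all pairs of indices $\alpha, \beta$ (the dot denotes multiplication in $GL(\mathbb{R}^k)$).
(3) $\psi_{\alpha\beta} \cdot \psi_{\beta\gamma} \cdot \psi_{\gamma\alpha} = \mathrm{id}$ on $U_\alpha \cap U_\beta \cap U_\gamma$ for all triples of indices $\alpha$, $\beta$ and $\gamma$.

Note that (2) implies that $\psi_{\beta\alpha} = \psi_{\alpha\beta}^{-1}$. The transition maps determine the vector bundle $E$.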
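The cocycle conditions can be seen in a toy computation. In the sketch below (plain Python, rank $k = 2$), the matrices `TRIVIALIZATIONS[name](q)` are hypothetical stand-ins for the fiberwise isomorphisms $\psi_\alpha|_{E_q} : E_q \to \mathbb{R}^2$; they are invented for illustration. With that modeling assumption the transition maps are $\psi_{\beta\alpha}(q) = A_\beta(q) A_\alpha(q)^{-1}$, and the three conditions follow from associativity of matrix multiplication.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(A):
    """Inverse of an invertible 2x2 matrix."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

# Hypothetical fiberwise trivializations A_name(q): E_q -> R^2, invented for
# illustration; each matrix is invertible for every real q.
TRIVIALIZATIONS = {
    "alpha": lambda q: [[1.0, q], [0.0, 1.0]],
    "beta": lambda q: [[2.0 + math.cos(q), 0.0], [q, 1.0]],
    "gamma": lambda q: [[1.0, 0.0], [math.sin(q), 1.0 + q * q]],
}

def transition(b, a, q):
    """psi_{ba}(q) = A_b(q) A_a(q)^{-1}: the matrix of psi_b o psi_a^{-1} over q."""
    return matmul(TRIVIALIZATIONS[b](q), inverse(TRIVIALIZATIONS[a](q)))

def is_identity(A, tol=1e-9):
    return all(abs(A[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(2) for j in range(2))

def cocycle_conditions_hold(q):
    names = list(TRIVIALIZATIONS)
    ok1 = all(is_identity(transition(a, a, q)) for a in names)
    ok2 = all(is_identity(matmul(transition(a, b, q), transition(b, a, q)))
              for a in names for b in names)
    ok3 = all(is_identity(matmul(matmul(transition(a, b, q), transition(b, c, q)),
                                 transition(c, a, q)))
              for a in names for b in names for c in names)
    return ok1 and ok2 and ok3

print(all(cocycle_conditions_hold(q) for q in (-1.0, 0.0, 0.5, 2.0)))
```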

Theorem 8.19. Let $M$ be a manifold, $\{U_\alpha\}$ an open cover, and $\{\psi_{\alpha\beta} : U_\alpha \cap U_\beta \to GL(\mathbb{R}^k)\}$ a collection of smooth maps satisfying the cocycle conditions. Then there is a vector bundle $E$ over $M$ of rank $k$ with transition maps $\{\psi_{\alpha\beta}\}$.

Sketch of proof. Consider the disjoint union $\tilde{E}$ of the trivial bundles $U_\alpha \times \mathbb{R}^k$:

\[ \tilde{E} = \bigsqcup_\alpha (U_\alpha \times \mathbb{R}^k). \]

Define a relation on $\tilde{E}$ by

\[ U_\alpha \times \mathbb{R}^k \ni (q, v) \sim (q', v') \in U_\beta \times \mathbb{R}^k \quad \text{if and only if } q = q' \text{ and } \psi_{\beta\alpha}(q)v = v'. \]

The cocycle conditions guarantee that $\sim$ is an equivalence relation. Let

\[ E = \tilde{E}/\sim, \]

and write $[q, v]$ for the equivalence class of $(q, v)$. Define the projection $\pi : E \to M$ by

\[ \pi([q, v]) = q. \]

Then

\[ \pi^{-1}(U_\alpha) = \{[q, v] \mid (q, v) \in U_\alpha \times \mathbb{R}^k\}. \]

Define the trivializations $\psi_\alpha : \pi^{-1}(U_\alpha) \to U_\alpha \times \mathbb{R}^k$ by

\[ \psi_\alpha([q, v]) = (q, v). \]

It is not hard to check that the maps $\psi_\alpha$ are well-defined and that the corresponding transition maps are the maps $\psi_{\alpha\beta}$ we started out with. It remains to check that $E$ can be given the structure of a manifold so that all the maps in sight are smooth. But this is not bad. Let's examine what we have.

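We have a topological space $E$ covered by open sets $\pi^{-1}(U_\alpha)$. For each $\alpha$ we have a homeomorphism $\psi_\alpha : \pi^{-1}(U_\alpha) \to U_\alpha \times \mathbb{R}^k$. This suggests a way to get coordinate charts on our topological space $E$: compose the homeomorphisms $\psi_\alpha$ with charts on $U_\alpha \times \mathbb{R}^k$. This gives us a cover of $E$ by open sets and a collection of homeomorphisms from these sets to open subsets of $\mathbb{R}^n$, where $n = \dim M + k$. This is an atlas because the maps

\[ \psi_\beta \circ \psi_\alpha^{-1} : (U_\alpha \cap U_\beta) \times \mathbb{R}^k \to (U_\alpha \cap U_\beta) \times \mathbb{R}^k, \qquad (q, v) \mapsto (q, \psi_{\beta\alpha}(q)v), \]

are smooth. Note that here we are given that the maps $\psi_{\beta\alpha} : U_\alpha \cap U_\beta \to GL(\mathbb{R}^k)$ are smooth and we are using this to conclude that the maps $\psi_\beta \circ \psi_\alpha^{-1} : (U_\alpha \cap U_\beta) \times \mathbb{R}^k \to (U_\alpha \cap U_\beta) \times \mathbb{R}^k$ are smooth.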

Remark 8.20. Naturally a different choice of a cover of $M$ and a different choice of trivializations gives rise to a different collection of transition maps. And we should worry whether two different sets of data (open cover, transition maps) give rise to the same bundle. But this would take us too far afield.

As a first application of Theorem 8.19 we construct the direct sum $E \oplus F \to M$ of two vector bundles $\pi_E : E \to M$ and $\pi_F : F \to M$ over a manifold $M$. The direct sum $E \oplus F$ (also known as the Whitney sum) should be a vector bundle with the fiber $(E \oplus F)_q = E_q \oplus F_q$ for $q \in M$. We define it by way of the transition maps.

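Pick an open cover $\{U_\alpha\}$ of $M$ such that $E|_{U_\alpha}$ and $F|_{U_\alpha}$ are trivial. Let $\psi^E_{\alpha\beta} : U_\alpha \cap U_\beta \to GL(\mathbb{R}^k)$ and $\psi^F_{\alpha\beta} : U_\alpha \cap U_\beta \to GL(\mathbb{R}^l)$ be the associated transition maps (thus $E$ is of rank $k$ and $F$ is of rank $l$). Define the maps

\[ \psi^{E \oplus F}_{\alpha\beta} : U_\alpha \cap U_\beta \to GL(\mathbb{R}^k \oplus \mathbb{R}^l) \]

by

\[ \psi^{E \oplus F}_{\alpha\beta}(q) = \begin{pmatrix} \psi^E_{\alpha\beta}(q) & 0 \\ 0 & \psi^F_{\alpha\beta}(q) \end{pmatrix}. \]

It is not hard to check that the maps $\psi^{E \oplus F}_{\alpha\beta}$ are smooth and satisfy the cocycle conditions. Therefore, by Theorem 8.19 there is a vector bundle $E \oplus F \to M$ with transition maps $\{\psi^{E \oplus F}_{\alpha\beta}\}$. Its fibers are isomorphic to the direct sum of the corresponding fibers of $E$ and $F$.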

It is worth reflecting on what made the construction above work. It is simply the fact that the map

\[ GL(\mathbb{R}^k) \times GL(\mathbb{R}^l) \to GL(\mathbb{R}^{k+l}), \qquad (A, B) \mapsto A \oplus B := \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix} \]

is smooth (as a map between open subsets of $\mathbb{R}^{k^2} \times \mathbb{R}^{l^2}$ and $\mathbb{R}^{(k+l)^2}$) and the fact that under this map compositions go to compositions:

\[ (A \circ A') \oplus (B \circ B') = (A \oplus B) \circ (A' \oplus B'). \]

There are many more examples of maps of this sort. For instance, consider the map that takes a matrix $A \in GL(\mathbb{R}^k)$ to its inverse transpose:

\[ (\,\cdot^{-1})^* : A \mapsto (A^{-1})^* \in GL((\mathbb{R}^k)^*). \]

The map is smooth (since the entries of the matrix $(A^{-1})^*$ are rational functions of the entries of the matrix $A$), and $((AB)^{-1})^* = (A^{-1})^*(B^{-1})^*$. Now let $E \to M$ be a vector bundle of rank $k$ and $\{U_\alpha\}$ an open cover of $M$ such that the restrictions $E|_{U_\alpha}$ are trivial. Let $\psi_{\beta\alpha} : U_\beta \cap U_\alpha \to GL(\mathbb{R}^k)$ be the corresponding transition maps. Then the maps $\psi^*_{\beta\alpha} : U_\beta \cap U_\alpha \to GL((\mathbb{R}^k)^*)$ defined by

\[ \psi^*_{\beta\alpha}(x) = \bigl((\psi_{\beta\alpha}(x))^{-1}\bigr)^* \]

are smooth and satisfy the cocycle conditions. By Theorem 8.19 there exists a vector bundle $E^* \to M$ whose transition maps are precisely $\{\psi^*_{\beta\alpha}\}$. The bundle $E^*$ is called the dual bundle of $E$. Its fibers $E^*_q$ are vector spaces dual to the fibers $E_q$ of $E$. We have seen this construction in one special case: the cotangent bundle $T^*M$ is the dual bundle of the tangent bundle.
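Both multiplicativity identities used above, for $\oplus$ and for the inverse transpose, can be checked exactly on sample matrices. The following sketch uses plain Python with `fractions.Fraction` for exact arithmetic; the sample matrices and helper names are arbitrary choices made for this illustration.

```python
from fractions import Fraction

def matmul(A, B):
    """Matrix product for matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def direct_sum(A, B):
    """Block-diagonal matrix with blocks A (k x k) and B (l x l)."""
    k, l = len(A), len(B)
    top = [row + [0] * l for row in A]
    bottom = [[0] * k + row for row in B]
    return top + bottom

def inverse_transpose(A):
    """(A^{-1})^T for an invertible 2x2 matrix, computed exactly."""
    det = Fraction(A[0][0] * A[1][1] - A[0][1] * A[1][0])
    inv = [[A[1][1] / det, -A[0][1] / det],
           [-A[1][0] / det, A[0][0] / det]]
    return [[inv[j][i] for j in range(2)] for i in range(2)]

# Arbitrary sample matrices (assumptions for illustration only).
A, A2 = [[2, 1], [1, 1]], [[1, 3], [0, 2]]
B, B2 = [[3]], [[5]]

# (A A2) (+) (B B2) == (A (+) B) (A2 (+) B2)
sum_respects_products = (
    direct_sum(matmul(A, A2), matmul(B, B2))
    == matmul(direct_sum(A, B), direct_sum(A2, B2))
)
# ((A A2)^{-1})^T == (A^{-1})^T (A2^{-1})^T
dual_respects_products = (
    inverse_transpose(matmul(A, A2))
    == matmul(inverse_transpose(A), inverse_transpose(A2))
)
print(sum_respects_products, dual_respects_products)
```

Note that taking the transpose of the inverse, rather than the transpose alone, is what keeps the order of the factors unchanged; this is why the maps $\psi^*_{\beta\alpha}$ satisfy the cocycle conditions with the same index pattern.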

The maps $(A, B) \mapsto A \oplus B$ and $A \mapsto (A^{-1})^*$ are what is known as smooth functors. They allowed us to define the direct sum of two bundles and the dual bundle, respectively. Here are a few more examples of functors that will be very useful for us. Let $V$ and $W$ be finite-dimensional vector spaces, $A \in GL(V)$ and $B \in GL(W)$. Then

• $(A, B) \mapsto A \otimes B \in GL(V \otimes W)$,
• $A \mapsto \Lambda^k(A) \in GL(\Lambda^k(V))$, and
• $(A, B) \mapsto \mathrm{Hom}(A, B) \in GL(\mathrm{Hom}(V, W))$, where $\mathrm{Hom}(A, B)T := B \circ T \circ A^{-1}$,

are smooth functors.[9] Indeed, pick bases of $V$ and $W$. Then the entries of the matrix representing $A \otimes B$ are products of entries of the matrices representing $A$ and $B$. The entries of the matrix representing $\Lambda^k(A)$ are polynomial in the entries of the matrix representing $A$. [If this is confusing, work out the following simple example and you'll see what I mean. Let $V = \mathbb{R}^3$, $k = 2$ and compute $\Lambda^2(A)(e_i \wedge e_j)$ in terms of the basis $\{e_1 \wedge e_2, e_1 \wedge e_3, e_2 \wedge e_3\}$.] Similarly the entries of the matrix representing $\mathrm{Hom}(A, B)$ are polynomial in the entries of $B$ and $A^{-1}$, hence rational functions of the entries of $A$ and $B$. This allows us, given two vector bundles $E \to M$ and $F \to M$, to construct the bundles

• $E \otimes F \to M$,
• $\Lambda^k(E) \to M$, and
• $\mathrm{Hom}(E, F) \to M$.
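The bracketed suggestion for $V = \mathbb{R}^3$, $k = 2$ can be carried out by machine. In the sketch below (plain Python; the function name `lambda2` and the sample matrices are choices made here), the matrix of $\Lambda^2(A)$ in the basis $e_1 \wedge e_2, e_1 \wedge e_3, e_2 \wedge e_3$ has as entries the $2 \times 2$ minors of $A$, which are visibly polynomial in the entries of $A$. The check that $\Lambda^2(AB) = \Lambda^2(A)\Lambda^2(B)$ is the Cauchy-Binet formula.

```python
from itertools import combinations

# Index pairs (i, j), i < j, labelling the basis e_i ^ e_j of Lambda^2(R^3)
# (0-based: (0, 1), (0, 2), (1, 2)).
PAIRS = list(combinations(range(3), 2))

def matmul(A, B):
    """Matrix product for matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def lambda2(A):
    """Matrix of Lambda^2(A): the entry in row (p, q), column (i, j) is the
    coefficient of e_p ^ e_q in A e_i ^ A e_j, i.e. a 2x2 minor of A."""
    return [[A[p][i] * A[q][j] - A[q][i] * A[p][j] for (i, j) in PAIRS]
            for (p, q) in PAIRS]

# Arbitrary integer sample matrices (assumptions for illustration only).
A = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
B = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

functorial = lambda2(matmul(A, B)) == matmul(lambda2(A), lambda2(B))
print(functorial, lambda2(IDENTITY) == IDENTITY)
```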

Exercise 8.7. Check that the bundles $E^*$ and $\mathrm{Hom}(E, M \times \mathbb{R})$ are isomorphic.

Exercise 8.8. Let $E \to M$ and $F \to M$ be two vector bundles. Convince yourself that a section of $\mathrm{Hom}(E, F)$ “is” a vector bundle map from $E$ to $F$.

Show that $E^* \otimes F$ is isomorphic to $\mathrm{Hom}(E, F)$.

Exercise 8.9. Compute transition maps for the tautological real line bundle $L \to \mathbb{RP}^n$:

\[ L = \{(l, v) \in \mathbb{RP}^n \times \mathbb{R}^{n+1} \mid v \in l\}. \]

Compute transition maps for $L \otimes L$. (Hint: write down the isomorphism $\mathbb{R} \otimes \mathbb{R} \to \mathbb{R}$.) Compute the transition maps for $L^{\otimes k} := \underbrace{L \otimes \cdots \otimes L}_{k}$, $k > 1$.

Exercise 8.10. Let $\pi_E : E \to M$ and $\pi_F : F \to M$ be two vector bundles over $M$.
(a) Show that $E \times F$ is a vector bundle over $M \times M$.
(b) Explain why $G = \{(e, f) \in E \times F : \pi_E(e) = \pi_F(f)\}$ can be considered a vector bundle over $M$.
(c) Show that, as a vector bundle over $M$, $G$ is isomorphic to the Whitney sum $E \oplus F$.

9. Exterior differentiation, contractions and Lie derivatives of forms

9.1. Exterior differentiation. In this section we first learn how to differentiate differential forms. We define an operator $d$ of exterior differentiation that raises the degree of the form by 1. It is a generalization of the div, grad and curl operators of vector calculus.

Theorem 9.1. For every manifold $M$, there is a unique $\mathbb{R}$-linear operator

\[ d_M : \Omega^*(M) \to \Omega^{*+1}(M) \]

with the following properties:

(1) $d_M$ raises the degrees by 1: $d_M(\Omega^k(M)) \subset \Omega^{k+1}(M)$;
(2) $d_M f = df$ for all $f \in C^\infty(M)$, that is, $d_M$ extends the operator $d$, which takes functions to 1-forms, to forms of arbitrary degree;
(3) the operator $d_M$ commutes with restrictions to open sets: for all open sets $U \subset M$ and all $\omega \in \Omega^*(M)$, $(d_M\omega)|_U = d_U(\omega|_U)$;
(4) the operator $d_M$ is a super-derivation: $d_M(\omega \wedge \eta) = (d_M\omega) \wedge \eta + (-1)^k \omega \wedge (d_M\eta)$ for $\omega \in \Omega^k(M)$, $\eta \in \Omega^l(M)$;
(5) $d_M \circ d_M = 0$.

Remark 9.2. Note that any open set $U \subset M$ is a manifold, so the theorem asserts that there is an operator $d_U : \Omega^*(U) \to \Omega^{*+1}(U)$ with properties (1) – (5); hence property (3) makes sense.

[9] Note that $\Lambda^0(A) = 1 \in GL(\Lambda^0(V)) = GL(\mathbb{R})$.

Proof of Theorem 9.1. We prove uniqueness of the operator $d_M$ first. We then construct the operator locally, on open sets. By uniqueness, these locally defined operators patch together into a global operator. This proves existence.

Suppose the operator $d_M$ with the desired properties exists. Fix a coordinate chart $(x_1, \ldots, x_m) : U \to \mathbb{R}^m$ on $M$. Then for all $\alpha \in \Omega^k(M)$, $\alpha|_U = \sum_{|I|=k} a_I\, dx_I$, where $a_I \in C^\infty(U)$ (cf. (7.2) and (7.3)). We claim that

\[ (d_M\alpha)|_U = \sum_{|I|=k} da_I \wedge dx_I. \tag{9.1} \]

This would prove uniqueness since the right hand side is defined independently of $d_M$. We prove (9.1) in four steps. By property (3) of $d_M$,

\[ (d_M\alpha)|_U = d_U(\alpha|_U). \]

By properties (2) and (5),

\[ d_U(dx_i) = d_U(d_U x_i) = (d_U \circ d_U)x_i = 0. \]

Hence, by property (4),

\[ d_U(dx_{i_1} \wedge dx_{i_2} \wedge \cdots \wedge dx_{i_k}) = d_U(dx_{i_1}) \wedge (dx_{i_2} \wedge \cdots \wedge dx_{i_k}) - dx_{i_1} \wedge d_U(dx_{i_2} \wedge \cdots \wedge dx_{i_k}). \]

Since $d_U(dx_i) = 0$, induction on $k$ then gives

\[ d_U(dx_I) = d_U(dx_{i_1} \wedge dx_{i_2} \wedge \cdots \wedge dx_{i_k}) = 0. \]

Hence, by properties (2) and (4), for any multi-index $I$,

\[ d_U(a_I\, dx_I) = da_I \wedge dx_I. \]

Linearity of $d_U$ finishes the proof of (9.1).

To prove existence of $d_M$ we run equation (9.1) backwards. Given a coordinate chart $(x_1, \ldots, x_m) : U \to \mathbb{R}^m$ on $M$ we define an operator $d_U : \Omega^k(U) \to \Omega^{k+1}(U)$ by

\[ d_U\Bigl(\sum_{|I|=k} a_I\, dx_I\Bigr) = \sum_{|I|=k} da_I \wedge dx_I \tag{9.2} \]

(in particular, if $k = 0$, then $d_U a = da$). Suppose, for the moment, that $d_U$ defined by (9.2) satisfies properties (1) – (5). Then by uniqueness, for any two coordinate charts $(x_1, \ldots, x_m) : U \to \mathbb{R}^m$ and $(y_1, \ldots, y_m) : V \to \mathbb{R}^m$ and any $k$-form $\alpha \in \Omega^k(M)$,

\[ (d_U(\alpha|_U))|_{U \cap V} = (d_V(\alpha|_V))|_{U \cap V}. \]

Consequently $d_M : \Omega^*(M) \to \Omega^{*+1}(M)$, given by

\[ (d_M\alpha)|_U = d_U(\alpha|_U) \]

for all coordinate charts $U$, is well-defined. Since the operators $d_U$ have properties (1) – (5), so does $d_M$ (check that).

It remains to prove that the map $d_U$ given by (9.2) has the desired properties. Clearly $d_U$ is $\mathbb{R}$-linear and raises degrees by 1. Property (2) holds by definition.

To prove (3) we want to show that for any open subset $W \subset U$ and any $k$-form $\alpha = \sum a_I\, dx_I \in \Omega^k(U)$,

\[ (d_U\alpha)|_W = d_W(\alpha|_W). \]

(Note that $(x_1, \ldots, x_m)|_W : W \to \mathbb{R}^m$ is also a coordinate chart.) Let $j : W \to U$ denote the inclusion. For any smooth function $f \in C^\infty(U)$, $j^*f = f|_W$. Hence, by Exercise 7.1,

\[ d(f|_W) = (df)|_W. \]

Therefore

\begin{align*}
(d_U\alpha)|_W &= \Bigl(\sum da_I \wedge dx_I\Bigr)\Big|_W = \sum da_I|_W \wedge dx_I|_W \\
&= \sum d(a_I|_W) \wedge dx_I|_W = d_W\Bigl(\sum a_I|_W\, dx_I|_W\Bigr) = d_W(\alpha|_W).
\end{align*}

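To prove (4) it's enough to show that for any $a_I\, dx_I \in \Omega^k(U)$ and any $b_J\, dx_J \in \Omega^*(U)$

\[ d_U(a_I\, dx_I \wedge b_J\, dx_J) = d_U(a_I\, dx_I) \wedge b_J\, dx_J + (-1)^k a_I\, dx_I \wedge d_U(b_J\, dx_J). \tag{9.3} \]

We compute:

\begin{align*}
d_U(a_I\, dx_I \wedge b_J\, dx_J) &= d_U(a_I b_J\, dx_I \wedge dx_J) \\
&= d(a_I b_J) \wedge dx_I \wedge dx_J \\
&= (b_J\, da_I + a_I\, db_J) \wedge dx_I \wedge dx_J \\
&= da_I \wedge dx_I \wedge b_J\, dx_J + (-1)^k a_I\, dx_I \wedge db_J \wedge dx_J \\
&= d_U(a_I\, dx_I) \wedge b_J\, dx_J + (-1)^k (a_I\, dx_I) \wedge d_U(b_J\, dx_J).
\end{align*}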

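This proves (4). Similarly, if $\alpha = a_I\, dx_I \in \Omega^k(U)$ then

\begin{align*}
d_U(d_U\alpha) &= d_U(da_I \wedge dx_I) \\
&= d_U\Bigl(\sum_{i=1}^m \frac{\partial a_I}{\partial x_i}\, dx_i \wedge dx_I\Bigr) \\
&= \sum_{i,j} \frac{\partial^2 a_I}{\partial x_j \partial x_i}\, dx_j \wedge dx_i \wedge dx_I.
\end{align*}

Now, for $i = j$, $dx_i \wedge dx_j = 0$, so we are only summing over indices $i$ and $j$ with $i \neq j$. Each unordered pair $\{i, j\}$ with $i \neq j$ contributes two terms to the sum: $\frac{\partial^2 a_I}{\partial x_j \partial x_i}\, dx_j \wedge dx_i$ and $\frac{\partial^2 a_I}{\partial x_i \partial x_j}\, dx_i \wedge dx_j$. These two terms cancel since $dx_j \wedge dx_i = -dx_i \wedge dx_j$ while the mixed partials commute. Therefore

\[ d_U(d_U\alpha) = 0. \]

By linearity this is true for all $k$-forms on the coordinate patch $U$. This proves property (5) and we are done.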

Notation. From now on we drop the subscript M from dM and simply write d instead.

Example 9.3. The exterior derivative of a form is easy to compute: let $\alpha = dz + x\,dy$ be a 1-form on $\mathbb{R}^3$. Then

\[ d\alpha = d(dz) + d(x\,dy) = 0 + dx \wedge dy = dx \wedge dy. \]
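Two more hand computations in the same spirit may help; the form $\beta$ and the function $f$ below are illustrative choices, not taken from the text. The first uses formula (9.2) directly, and the second checks property (5) on a concrete function.

```latex
% Illustrative 1-form on R^2: \beta = x\,dy - y\,dx.
\[
  d\beta = dx \wedge dy - dy \wedge dx = 2\, dx \wedge dy .
\]
% Property (5) checked by hand for the illustrative function f = xyz on R^3:
\begin{align*}
  df    &= yz\,dx + xz\,dy + xy\,dz, \\
  d(df) &= (z\,dy + y\,dz)\wedge dx + (z\,dx + x\,dz)\wedge dy + (y\,dx + x\,dy)\wedge dz \\
        &= z\,(dy\wedge dx + dx\wedge dy) + y\,(dz\wedge dx + dx\wedge dz)
           + x\,(dz\wedge dy + dy\wedge dz) = 0 .
\end{align*}
```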

9.2. Contractions of forms and vector fields. To relate the exterior derivative to the standard calculus operations of div, grad and curl we need to define contractions of forms with vector fields.

Let $u$ be a vector in a finite-dimensional vector space $V$. Then $u$ defines a linear map

\[ \iota(u) : \Lambda^k(V^*) \to \Lambda^{k-1}(V^*) \]

by

\[ (\iota(u)\eta)(v_1, \ldots, v_{k-1}) = \eta(u, v_1, \ldots, v_{k-1}) \]

for any $\eta \in \Lambda^k(V^*)$ and any $v_1, \ldots, v_{k-1} \in V$. Here, of course, we think of $\eta$ as a $k$-linear alternating map from $V \times \cdots \times V$ to $\mathbb{R}$. We refer to $\iota(u)\eta$ as the contraction of $u$ with $\eta$. Note that if $\eta \in \Lambda^1(V^*) = V^*$, then $\iota(u)\eta$ is simply the number $\eta(u)$. If $\eta \in \Lambda^0(V^*) = \mathbb{R}$, then we define $\iota(u)\eta := 0$ (and tacitly define $\Lambda^{-1}(V^*) = 0$).

Similarly, if $X \in \Gamma(TM)$ is a vector field on a manifold $M$ and $\alpha \in \Omega^k(M)$ is a $k$-form with $k > 0$, we define the contraction of $X$ with $\alpha$ to be the $(k-1)$-form $\iota(X)\alpha$ given, for any point $q \in M$, by

\[ (\iota(X)\alpha)_q = \iota(X_q)\alpha_q. \]

Here, on the right, we are contracting a vector $X_q \in T_qM$ with $\alpha_q \in \Lambda^k((T_qM)^*)$. In particular, if $\alpha$ is a 1-form, $\iota(X)\alpha = \alpha(X)$. And again, if $\alpha$ is a 0-form, then $\iota(X)\alpha = 0$ (and the space of $(-1)$-forms is 0).

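Example 9.4. Suppose $l_1, l_2 \in V^*$, so that $l_1 \wedge l_2 \in \Lambda^2(V^*)$. Let $u \in V$ be a vector. Then, for any $v \in V$,

\begin{align*}
(\iota(u)(l_1 \wedge l_2))(v) &= (l_1 \wedge l_2)(u, v) \\
&= l_1(u)l_2(v) - l_1(v)l_2(u) \\
&= (l_1(u)l_2 - l_2(u)l_1)(v).
\end{align*}

Hence

\[ \iota(u)(l_1 \wedge l_2) = l_1(u)l_2 - l_2(u)l_1. \]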

This example suggests a general way of computing contractions.

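Lemma 9.5. If $l_1, \ldots, l_k \in V^*$, $u \in V$, then

\[ \iota(u)(l_1 \wedge \cdots \wedge l_k) = \sum_{j=1}^k (-1)^{j-1} (\iota(u)l_j)\, (l_1 \wedge \cdots \wedge \hat{l}_j \wedge \cdots \wedge l_k), \]

where $\hat{l}_j$ means that $l_j$ is omitted from the expression.

Proof. For any $k - 1$ vectors $v_1, \ldots, v_{k-1} \in V$,

\begin{align*}
(\iota(u)(l_1 \wedge \cdots \wedge l_k))(v_1, \ldots, v_{k-1})
&= \det \begin{pmatrix} l_1(u) & l_1(v_1) & \cdots & l_1(v_{k-1}) \\ \vdots & \vdots & & \vdots \\ l_k(u) & l_k(v_1) & \cdots & l_k(v_{k-1}) \end{pmatrix} \\
&= \sum_{j=1}^k (-1)^{j-1} l_j(u) \det A_j \\
&= \sum_{j=1}^k (-1)^{j-1} l_j(u)\, (l_1 \wedge \cdots \wedge \hat{l}_j \wedge \cdots \wedge l_k)(v_1, \ldots, v_{k-1}),
\end{align*}

where $A_j$ is the matrix obtained from the matrix

\[ \begin{pmatrix} l_1(u) & l_1(v_1) & \cdots & l_1(v_{k-1}) \\ \vdots & \vdots & & \vdots \\ l_k(u) & l_k(v_1) & \cdots & l_k(v_{k-1}) \end{pmatrix} \]

by deleting the first column and the $j$th row.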
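Lemma 9.5 can be spot-checked with exact integer arithmetic. The sketch below (plain Python; covectors and vectors in $\mathbb{R}^3$ are represented as tuples, and all names and sample values are choices made for this illustration) evaluates $l_1 \wedge \cdots \wedge l_k$ on vectors as the determinant $\det[l_i(v_j)]$, as in the proof above, and compares both sides of the lemma for $k = 3$.

```python
def pair(l, v):
    """Evaluate the covector l on the vector v (both given as tuples)."""
    return sum(a * b for a, b in zip(l, v))

def det(M):
    """Determinant by Laplace expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def wedge(covectors, vectors):
    """(l_1 ^ ... ^ l_k)(v_1, ..., v_k) = det [l_i(v_j)]."""
    return det([[pair(l, v) for v in vectors] for l in covectors])

# Arbitrary sample data (assumptions for illustration only).
ls = [(1, 2, 0), (0, 1, 3), (4, 0, 1)]
u, v1, v2 = (1, -1, 2), (0, 3, 1), (2, 1, 1)

# Left side: contraction means inserting u into the first slot.
lhs = wedge(ls, [u, v1, v2])
# Right side: alternating sum with l_j omitted (j is 0-based here, so the
# sign (-1)**j matches (-1)^{j-1} for 1-based j).
rhs = sum((-1) ** j * pair(ls[j], u) * wedge(ls[:j] + ls[j + 1:], [v1, v2])
          for j in range(3))
print(lhs == rhs)
```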

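Corollary 9.5.1. Let $V$ be a vector space, $u \in V$ a vector and $\alpha \in \Lambda^r(V^*)$ and $\beta \in \Lambda^s(V^*)$ be two exterior forms. Then

\[ \iota(u)(\alpha \wedge \beta) = (\iota(u)\alpha) \wedge \beta + (-1)^r \alpha \wedge (\iota(u)\beta). \]

Proof. It's enough to consider the case of $\alpha = l_1 \wedge \cdots \wedge l_r$ and $\beta = l_{r+1} \wedge \cdots \wedge l_{r+s}$ for some $l_1, \ldots, l_{r+s} \in V^*$. Then

\begin{align*}
\iota(u)(\alpha \wedge \beta) &= \iota(u)(l_1 \wedge \cdots \wedge l_r \wedge l_{r+1} \wedge \cdots \wedge l_{r+s}) \\
&= \sum_{j=1}^{r+s} (-1)^{j-1} (\iota(u)l_j)\, (l_1 \wedge \cdots \wedge \hat{l}_j \wedge \cdots \wedge l_{r+s}) \\
&= \Bigl(\sum_{j=1}^{r} (-1)^{j-1} (\iota(u)l_j)\, (l_1 \wedge \cdots \wedge \hat{l}_j \wedge \cdots \wedge l_r)\Bigr) \wedge l_{r+1} \wedge \cdots \wedge l_{r+s} \\
&\qquad + l_1 \wedge \cdots \wedge l_r \wedge \sum_{j=r+1}^{r+s} (-1)^{j-1} (\iota(u)l_j)\, (l_{r+1} \wedge \cdots \wedge \hat{l}_j \wedge \cdots \wedge l_{r+s}) \\
&= (\iota(u)\alpha) \wedge \beta + \alpha \wedge \sum_{j'=1}^{s} (-1)^r (-1)^{j'-1} (\iota(u)l_{j'+r})\, (l_{r+1} \wedge \cdots \wedge \hat{l}_{j'+r} \wedge \cdots \wedge l_{r+s}) \\
&= (\iota(u)\alpha) \wedge \beta + (-1)^r \alpha \wedge (\iota(u)\beta).
\end{align*}

Corollary 9.5.2. Let $M$ be a manifold, $X \in \Gamma(TM)$ a vector field and $\alpha \in \Omega^r(M)$ and $\beta \in \Omega^s(M)$ be two differential forms. Then

\[ \iota(X)(\alpha \wedge \beta) = (\iota(X)\alpha) \wedge \beta + (-1)^r \alpha \wedge (\iota(X)\beta). \]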

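Example 9.6. Let $W = x \frac{\partial}{\partial x} + y \frac{\partial}{\partial y} + z \frac{\partial}{\partial z}$ be a vector field on $\mathbb{R}^3$ and let $\omega = dx \wedge dy \wedge dz$ ($\omega$ is the standard volume form on $\mathbb{R}^3$). Then

\begin{align*}
\iota(W)\omega &= \iota(W)(dx \wedge dy \wedge dz) \\
&= dx(W)\, dy \wedge dz - dy(W)\, dx \wedge dz + dz(W)\, dx \wedge dy \\
&= x\, dy \wedge dz - y\, dx \wedge dz + z\, dx \wedge dy.
\end{align*}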

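Exercise 9.1. In $\mathbb{R}^3$, the standard inner product $(\cdot, \cdot)$ defines an isomorphism $\mathbb{R}^3 \to (\mathbb{R}^3)^*$, $v \mapsto (v, \cdot)$, which in turn induces an isomorphism of spaces of sections

\[ A : \Gamma(T\mathbb{R}^3) \to \Omega^1(\mathbb{R}^3), \qquad A(X) = (X, \cdot). \]

The standard volume form $\mu = dx_1 \wedge dx_2 \wedge dx_3$ defines an isomorphism $\mathbb{R}^3 \to \Lambda^2((\mathbb{R}^3)^*)$ by $v \mapsto \iota(v)\mu$, which also induces an isomorphism

\[ B : \Gamma(T\mathbb{R}^3) \to \Omega^2(\mathbb{R}^3), \qquad B(X) = \iota(X)\mu. \]

Finally, the map

\[ C : C^\infty(\mathbb{R}^3) \to \Omega^3(\mathbb{R}^3), \qquad C(f) = f\mu \]

is also an isomorphism. (Check these facts!)

Show that the standard vector calculus notions of div, grad, and curl can be defined as
(1) $\mathrm{grad}(f) = A^{-1}(df)$ for any smooth function $f$ on $\mathbb{R}^3$.
(2) $\mathrm{curl}(X) = B^{-1}(d(A(X)))$ for any vector field $X$ on $\mathbb{R}^3$.
(3) $\mathrm{div}(X) = C^{-1}(d(B(X)))$ for any vector field $X$ on $\mathbb{R}^3$.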

9.3. Lie derivatives of differential forms. In order to understand the divergence of a vector field on a manifold we need to define Lie derivatives of differential forms. This is fairly easy to do, but then the definition is hard to compute with. Cartan's formula makes computation of Lie derivatives of forms easy, but it requires understanding of the interaction between exterior differentiation and pull-backs, which is why we address the pull-backs first.

Lemma 9.7. Exterior differentiation $d$ commutes with pull-backs. That is to say, let $F : M \to N$ be a smooth map between two manifolds and $\alpha \in \Omega^*(N)$ a differential form. Then

\[ d(F^*\alpha) = F^*(d\alpha). \tag{9.4} \]

Proof. We know that equation (9.4) holds if $\alpha$ is a zero-form, that is, a function (cf. Exercise 7.1).

We now argue that for any coordinate chart $(x_1, \ldots, x_n) : U \to \mathbb{R}^n$ on $N$ and for any $k$-form $\alpha$ on $N$, $k \geq 1$, we have

\[ (F^*(d\alpha))|_{F^{-1}(U)} = d\bigl((F^*\alpha)|_{F^{-1}(U)}\bigr). \tag{9.5} \]

Equation (9.5) is enough to prove the lemma. Now, $(F^*d\alpha)|_{F^{-1}(U)} = F^*(d\alpha|_U)$ and $\alpha|_U = \sum_{|I|=k} a_I\, dx_I$, where the sum is over multi-indices $I$ of size $k$, for some functions $a_I \in C^\infty(U)$. Therefore

\[ d\alpha|_U = \sum_I da_I \wedge dx_I, \]

and

\[ (F^*(d\alpha))|_{F^{-1}(U)} = \sum_I F^*da_I \wedge F^*dx_I. \]

Since (9.4) holds for functions,

\[ F^*da_I = d(F^*a_I). \]

Similarly,

\[ F^*dx_I = d(F^*x_{i_1}) \wedge \cdots \wedge d(F^*x_{i_k}) \]

for all $I = (i_1, \ldots, i_k)$. Therefore

\[ (F^*(d\alpha))|_{F^{-1}(U)} = \sum_I d(F^*a_I) \wedge d(F^*x_{i_1}) \wedge \cdots \wedge d(F^*x_{i_k}). \]

62

We now argue that the right hand side of the equation above is d((F ∗α)|F−1(U)

). Properties (4) and (5) of

the exterior derivative d and induction on k shows that for any k functions f1, . . . fk,

d(df1 ∧ . . . ∧ dfk) = 0.

Hence for any functions f0, f1, . . . , fk,

d(f0df1 ∧ . . . ∧ dfk) = df0 ∧ df1 ∧ . . . ∧ dfk.

In particular,

d(F ∗aI) ∧ d(F ∗xi1) ∧ . . . ∧ d(F ∗xik) = d(F ∗aI d(F ∗xi1) ∧ . . . ∧ d(F ∗xik)

)= d (F ∗(aIdxi1 ∧ . . . ∧ dxik)) .

Therefore,

(F ∗(dα))|F−1(U) =∑I

d(F ∗ (aIdxi1 ∧ . . . ∧ dxik)

)= d(F ∗

(∑I

aIdxI

)) = d (F ∗(α|U )) = d(F ∗α|F−1(U))

and we are done.
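
As a quick sanity check of (9.4), here is a worked example. The map $F$ and the form $\alpha$ below are chosen purely for illustration and do not come from the text: take $F : \mathbb{R}^2 \to \mathbb{R}^2$, $F(u,v) = (u+v,\, uv)$, and $\alpha = x\,dy$.

```latex
\begin{align*}
F^*\alpha &= (u+v)\,d(uv) = (u+v)v\,du + (u+v)u\,dv,\\
d(F^*\alpha) &= d\bigl((u+v)v\bigr)\wedge du + d\bigl((u+v)u\bigr)\wedge dv\\
             &= (u+2v)\,dv\wedge du + (2u+v)\,du\wedge dv
              = (u-v)\,du\wedge dv,\\
F^*(d\alpha) &= F^*(dx\wedge dy) = d(u+v)\wedge d(uv)\\
             &= (du+dv)\wedge(v\,du+u\,dv) = (u-v)\,du\wedge dv.
\end{align*}
```

Both sides agree, as the lemma predicts.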

Definition 9.8. Let $X$ be a vector field on a manifold $M$ and $\omega \in \Omega^k(M)$ a $k$-form. Let $\phi_t$ denote the local flow of $X$. The Lie derivative $L_X\omega$ of $\omega$ with respect to $X$ is defined by

\[ (L_X\omega)_q = \frac{d}{dt}\Big|_{t=0}(\phi_t^*\omega)_q \]

for any point $q \in M$.

Note that by definition of the flow $\phi_t$, the Lie derivative of a 0-form $f \in C^\infty(M)$ is

\[ (L_Xf)(q) = \frac{d}{dt}\Big|_{t=0}(\phi_t^*f)(q) = \frac{d}{dt}\Big|_{t=0}(f\circ\phi_t)(q) = X_q(f). \]

As was mentioned above, the goal of this subsection is to prove Cartan’s formula for Lie derivatives.

Theorem 9.9 (Cartan's Formula). Suppose that $X$ is a vector field on a manifold $M$ and $\omega \in \Omega^*(M)$ a differential form. Then

\[ L_X\omega = d(\iota(X)\omega) + \iota(X)\,d\omega. \]

We prove the theorem in a sequence of lemmas.

Lemma 9.10. Let $X$ be a vector field on a manifold $M$. The Lie derivative $L_X$ is a derivation on the space of forms $\Omega^*(M)$ which commutes with the exterior differentiation $d$. That is to say,

(1) $L_X : \Omega^*(M) \to \Omega^*(M)$ is $\mathbb{R}$-linear.
(2) $L_X(\omega\wedge\eta) = (L_X\omega)\wedge\eta + \omega\wedge(L_X\eta)$ for all $\omega,\eta \in \Omega^*(M)$.
(3) $L_X(d\omega) = d(L_X\omega)$ for all $\omega \in \Omega^*(M)$.

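Proof. The first property of the Lie derivative is easy to see: pull-backs and differentiation are both linear. Let us prove (2). Since pull-back respects exterior multiplication,

\[ \frac{d}{dt}\Big|_{t=0}\phi_t^*(\omega\wedge\eta) = \frac{d}{dt}\Big|_{t=0}(\phi_t^*\omega)\wedge(\phi_t^*\eta). \]

Since exterior multiplication is bilinear, the product rule gives

\begin{align*}
\frac{d}{dt}\Big|_{t=0}\bigl((\phi_t^*\omega)\wedge(\phi_t^*\eta)\bigr)
&= \Bigl(\frac{d}{dt}\Big|_{t=0}\phi_t^*\omega\Bigr)\wedge(\phi_0^*\eta) + (\phi_0^*\omega)\wedge\frac{d}{dt}\Big|_{t=0}(\phi_t^*\eta)\\
&= (L_X\omega)\wedge\eta + \omega\wedge(L_X\eta).
\end{align*}

This proves that the Lie derivative is a derivation. We now prove that it commutes with exterior differentiation. For any form $\omega$,

\begin{align*}
L_X(d\omega) &= \frac{d}{dt}\Big|_{t=0}\phi_t^*(d\omega)\\
&= \frac{d}{dt}\Big|_{t=0}d(\phi_t^*\omega) && \text{(by Lemma 9.7)}\\
&= d\Bigl(\frac{d}{dt}\Big|_{t=0}\phi_t^*\omega\Bigr) && \text{(since mixed partials commute)}\\
&= d(L_X\omega).
\end{align*}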

Lemma 9.11. Let $X$ be a vector field on a manifold $M$. Let

\[ Q = d\circ\iota(X) + \iota(X)\circ d : \Omega^*(M) \to \Omega^*(M). \]

The operator $Q$ is also a derivation on the space of forms $\Omega^*(M)$ which commutes with the exterior differentiation $d$.

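Proof. It's clear that $Q$ is $\mathbb{R}$-linear. We check that $Q$ commutes with $d$:

\begin{align*}
Q\circ d &= d\circ\iota(X)\circ d + \iota(X)\circ d\circ d\\
&= d\circ\iota(X)\circ d && \text{(since } d\circ d = 0\text{)}\\
&= d\circ d\circ\iota(X) + d\circ\iota(X)\circ d = d\circ Q.
\end{align*}

Now we need to check that $Q$ is a derivation. Accordingly, let $\omega \in \Omega^k(M)$, $\eta \in \Omega^l(M)$ be two forms on $M$. Then

\begin{align*}
Q(\omega\wedge\eta) &= d(\iota(X)(\omega\wedge\eta)) + \iota(X)(d(\omega\wedge\eta))\\
&= d\bigl[(\iota(X)\omega)\wedge\eta + (-1)^k\omega\wedge(\iota(X)\eta)\bigr] + \iota(X)\bigl[d\omega\wedge\eta + (-1)^k\omega\wedge d\eta\bigr]\\
&= d(\iota(X)\omega)\wedge\eta + (-1)^{k-1}(\iota(X)\omega)\wedge d\eta + (-1)^k d\omega\wedge\iota(X)\eta + (-1)^k(-1)^k\omega\wedge d\iota(X)\eta\\
&\quad + (\iota(X)d\omega)\wedge\eta + (-1)^{k+1}d\omega\wedge\iota(X)\eta + (-1)^k(\iota(X)\omega)\wedge d\eta + (-1)^k(-1)^k\omega\wedge\iota(X)d\eta\\
&= Q(\omega)\wedge\eta + \omega\wedge Q(\eta).
\end{align*}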

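Proof of Cartan's formula. If $f \in \Omega^0(M)$ is a function, then $\iota(X)f = 0$ by definition. Hence

\[ Q(f) = (\iota(X)\circ d)f = \iota(X)\,df = df(X), \]

while

\[ L_Xf = X(f) = df(X). \]

We conclude that $L_X$ and $Q$ agree on functions. To prove Cartan's formula it is enough to prove

\[ (L_X\omega)|_U = (Q\omega)|_U \]

for every coordinate chart $(x_1,\dots,x_m) : U \to \mathbb{R}^m$. But both $L_X$ and $Q$ commute with restrictions, so it's enough to prove that

\[ L_X(\omega|_U) = Q(\omega|_U). \]

Thus, by linearity, we may further assume that $\omega = a_I\,dx_I = a_I\,dx_{i_1}\wedge\dots\wedge dx_{i_k}$ for some function $a_I$ and multi-index $I$. Both the Lie derivative $L_X$ and $Q$ are derivations that commute with $d$ and agree on functions, so $L_X(dx_{i_j}) = d(L_Xx_{i_j}) = d(Qx_{i_j}) = Q(dx_{i_j})$ for each $j$, and

\begin{align*}
L_X(a_I\,dx_{i_1}\wedge\dots\wedge dx_{i_k})
&= (L_Xa_I)\,dx_{i_1}\wedge\dots\wedge dx_{i_k} + \sum_{j=1}^k a_I\,dx_{i_1}\wedge\dots\wedge d(L_Xx_{i_j})\wedge\dots\wedge dx_{i_k}\\
&= (Qa_I)\,dx_{i_1}\wedge\dots\wedge dx_{i_k} + \sum_{j=1}^k a_I\,dx_{i_1}\wedge\dots\wedge d(Qx_{i_j})\wedge\dots\wedge dx_{i_k}\\
&= Q(a_I\,dx_{i_1}\wedge\dots\wedge dx_{i_k}).
\end{align*}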
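
To see Cartan's formula at work, here is a small worked check. The vector field and the form are chosen only for illustration (they are not taken from the text): on $\mathbb{R}^2$ let $X = x\,\partial/\partial x + y\,\partial/\partial y$, whose flow is $\phi_t(x,y) = (e^tx, e^ty)$, and let $\omega = x\,dy$.

```latex
\begin{align*}
\text{Directly from the definition:}\quad
\phi_t^*\omega &= (e^t x)\,d(e^t y) = e^{2t}\,x\,dy,
&L_X\omega &= \frac{d}{dt}\Big|_{t=0} e^{2t}\,x\,dy = 2x\,dy.\\[4pt]
\text{Via Cartan's formula:}\quad
\iota(X)\omega &= x\,dy(X) = xy,
&d(\iota(X)\omega) &= y\,dx + x\,dy,\\
d\omega &= dx\wedge dy,
&\iota(X)\,d\omega &= x\,dy - y\,dx,\\
d(\iota(X)\omega) + \iota(X)\,d\omega &= 2x\,dy.
\end{align*}
```

The two computations give the same 1-form, and the second one never required knowing the flow.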

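Exercise 9.2. Let $M$ be an orientable $m$-dimensional manifold and $\mu \in \Omega^m(M)$ a nowhere zero form of top degree. Show that for any vector field $X$ on $M$ the Lie derivative $L_X\mu$ satisfies

\[ L_X\mu = f\mu \]

for some function $f \in C^\infty(M)$, which depends on $X$. We define the divergence of $X$ with respect to $\mu$ to be this function $f$ and denote it by $\operatorname{div}_\mu(X)$. Thus,

\[ L_X\mu = \operatorname{div}_\mu(X)\,\mu. \]

Show that for $M = \mathbb{R}^m$ and $\mu = dx_1\wedge\dots\wedge dx_m$

\[ \operatorname{div}_\mu\Bigl(\sum_i v_i\frac{\partial}{\partial x_i}\Bigr) = \sum_i\frac{\partial v_i}{\partial x_i}. \]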

Exercise 9.3. Consider polar coordinates $(r,\theta)$ on $\mathbb{R}^2$. The "function" $\theta$ is defined up to a constant. Show that $d\theta$ is a well-defined 1-form on $\mathbb{R}^2\setminus\{0\}$ and that

\[ d\theta = \frac{1}{x^2+y^2}(x\,dy - y\,dx). \]

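Exercise 9.4. (1) Consider the two-form $\omega = x_1\,dx_2\wedge dx_3 + x_2\,dx_3\wedge dx_1 + x_3\,dx_1\wedge dx_2$ in $\mathbb{R}^3$. Compute $d\omega$.
(2) Compute

\[ \iota\Bigl(\sum_{i=1}^3 x_i\frac{\partial}{\partial x_i}\Bigr)\,dx_1\wedge dx_2\wedge dx_3. \]

(3) Compute $L_X(dx_1\wedge dx_2\wedge dx_3)$, where $X = \sum_{i=1}^3 x_i\frac{\partial}{\partial x_i}$.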

Exercise 9.5. Consider $k : \mathbb{R}^2 \to \mathbb{R}^2$ given by $(u,v) \mapsto (u^2+1,\, uv)$. Compute $k^*\bigl((xy-y)\,dx\wedge dy\bigr)$.

Exercise 9.6. Let $X$ and $Y$ be vector fields and $\alpha$ a 1-form on a manifold $M$. Prove that
(1) $L_X(\iota(Y)\alpha) = \iota(Y)(L_X\alpha) + \alpha(L_XY)$.
(2) Using (1), show that $d\alpha(X,Y) = X(\alpha(Y)) - Y(\alpha(X)) - \alpha([X,Y])$.

9.4. de Rham cohomology. One of the most interesting applications of Cartan's formula is the proof of smooth homotopy invariance of de Rham cohomology. We start by defining de Rham cohomology.

Definition 9.12. Let $M$ be a manifold. A form $\alpha \in \Omega^k(M)$ is closed if $d\alpha = 0$. A form $\beta \in \Omega^k(M)$ is exact if there is a $(k-1)$-form $\gamma$ with $\beta = d\gamma$.

Note that since $d^2 = 0$, any exact form is closed. The converse need not be true. The difference between the spaces of closed and exact forms is measured by the de Rham cohomology.

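Definition 9.13. Let $M$ be a manifold. The $k$th de Rham cohomology $H^k(M)$ is defined by

\begin{align*}
H^k(M) &:= \{\text{closed } k\text{-forms}\}/\{\text{exact } k\text{-forms}\}\\
&= \ker\bigl(d : \Omega^k(M)\to\Omega^{k+1}(M)\bigr)\,/\,\operatorname{Im}\bigl(d : \Omega^{k-1}(M)\to\Omega^k(M)\bigr).
\end{align*}

$H^k(M)$ is a vector space over the reals. Thus $H^k(M)$ is the space of equivalence classes $[\alpha]$ of closed $k$-forms: two closed $k$-forms $\alpha$ and $\alpha'$ are equivalent if and only if $\alpha - \alpha' = d\gamma$ for some $(k-1)$-form $\gamma$.

Remark 9.14. By definition $\Omega^{-1}(M) = 0$, so

\begin{align*}
H^0(M) &= \{f \in C^\infty(M) \mid df = 0\}\\
&= \{\text{locally constant functions on } M\}\\
&= \mathbb{R}^k,
\end{align*}

where $k$ is the number of connected components of $M$. In particular $H^0(\text{point}) = \mathbb{R}$.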

Definition 9.15. We define the de Rham cohomology $H^*(M)$ to be the direct sum of the de Rham cohomology groups:

\[ H^*(M) := H^0(M)\oplus\dots\oplus H^k(M)\oplus\cdots \]

It takes a bit of work to compute the de Rham cohomology of just about anything. Here is an important, but not very exciting, example of a computation directly from the definition.

Example 9.16. Let $M$ be a connected zero-dimensional manifold, that is, a single point. Then $\Omega^k(M) = 0$ for $k > 0$. Hence $H^k(M) = 0$ for $k > 0$. On the other hand $H^0(M) = \mathbb{R}$ since a point has one connected component.

Lemma 9.17. The de Rham cohomology $H^*(M)$ has a well-defined multiplication given by

\[ [\alpha]\wedge[\beta] := [\alpha\wedge\beta], \]

which makes $H^*(M)$ into a ring.

Proof. We need to show that the space of exact forms is an ideal in the algebra of closed forms. That is, if $d\alpha = 0$ then $d\beta\wedge\alpha$ is exact for any $\beta$. But

\[ d(\beta\wedge\alpha) = d\beta\wedge\alpha \pm \beta\wedge d\alpha = d\beta\wedge\alpha + 0, \]

and we are done.

Lemma 9.18. Let $F : M \to N$ be a smooth map. Then the pull-back map

\[ F^* : \Omega^*(N) \to \Omega^*(M) \]

gives rise to a well-defined ring homomorphism

\[ F^* : H^*(N) \to H^*(M), \qquad F^*[\alpha] := [F^*\alpha]. \]

Moreover, if $\mathrm{id}_M : M \to M$ is the identity map then $\mathrm{id}_M^* : H^*(M) \to H^*(M)$ is also the identity map. Additionally, for any two maps $F : M \to N$, $G : N \to Z$ we have

\[ (G\circ F)^* = F^*\circ G^*. \]

Proof. If $d\alpha = 0$, then $dF^*\alpha = F^*d\alpha = F^*0 = 0$. Therefore $F^*$ maps closed forms to closed forms. For the same reason, $F^*$ maps exact forms to exact forms: $F^*(d\gamma) = d(F^*\gamma)$. Consequently the pullback on forms gives rise to a well-defined pullback of cohomology classes. Since

\[ F^*(\alpha\wedge\beta) = (F^*\alpha)\wedge(F^*\beta), \]

the map on cohomology is a ring homomorphism. The rest of the lemma is left as an exercise.

Definition 9.19 (Homotopy). Two maps $f_1, f_0 : M \to N$ of manifolds are (smoothly) homotopic if there is a smooth map

\[ F : (a,b)\times M \to N, \]

where $(a,b)$ is an open interval containing $[0,1]$, so that

\[ F(0,x) = f_0(x) \text{ for all } x \in M \quad\text{and}\quad F(1,x) = f_1(x) \text{ for all } x \in M. \]

Example 9.20. The maps $f_1 : \mathbb{R}^n \to \mathbb{R}^n$, $f_1(x) = x$ and $f_0 : \mathbb{R}^n \to \mathbb{R}^n$, $f_0(x) = 0$ are smoothly homotopic: let $F(t,x) = tx$.

Lemma 9.21 (homotopy invariance of de Rham cohomology). If two smooth maps $f_1, f_0 : M \to N$ are homotopic, then $f_0^*, f_1^* : H^*(N) \to H^*(M)$ are the same map:

\[ f_0^*[\alpha] = f_1^*[\alpha] \quad\text{for all } [\alpha] \in H^*(N). \]

To prove Lemma 9.21, we need the following simple observation.

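Lemma 9.22. Let $\phi_t$ denote the flow of a vector field $X$ on a manifold $M$. For any $k$-form $\alpha$ on $M$,

\[ \frac{d}{dt}\Big|_{t=\tau}\phi_t^*\alpha = \phi_\tau^*(L_X\alpha). \]

Proof. For any map $f : M \to M$,

\[ \frac{d}{dt}\Big|_{t=0}f^*\phi_t^*\alpha = f^*\Bigl(\frac{d}{dt}\Big|_{t=0}\phi_t^*\alpha\Bigr), \]

since for any point $q \in M$ the map $\Lambda^k(df_q^*) : \Lambda^k(T^*_{f(q)}M) \to \Lambda^k(T^*_qM)$ is linear. Therefore

\[ \frac{d}{dt}\Big|_{t=\tau}\phi_t^*\alpha = \frac{d}{dt}\Big|_{t=0}\phi_{\tau+t}^*\alpha = \frac{d}{dt}\Big|_{t=0}\phi_\tau^*(\phi_t^*\alpha) = \phi_\tau^*\Bigl(\frac{d}{dt}\Big|_{t=0}\phi_t^*\alpha\Bigr) = \phi_\tau^*(L_X\alpha). \]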

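Proof of Lemma 9.21. Let $F : (a,b)\times M \to N$ denote the homotopy between $f_1$ and $f_0$. It is no loss of generality to assume that the interval $(a,b)$ is all of $\mathbb{R}$. (If $(a,b)$ is not all of $\mathbb{R}$, let $\rho : \mathbb{R} \to [0,1]$ be a smooth function with $\operatorname{supp}\rho \subset (a,b)$ and $\rho|_{[0,1]} = 1$. Define the map $\tilde F : \mathbb{R}\times M \to N$ by

\[ \tilde F(t,x) = \begin{cases} F(\rho(t)t,\, x) & t \in (a,b),\\ F(0,x) & t \notin (a,b). \end{cases} \]

The map $\tilde F$ is a homotopy between $f_1$ and $f_0$.) Let $i_0 : M \to \mathbb{R}\times M$ denote the embedding given by

\[ i_0(x) = (0,x) \]

and let $\phi_t : \mathbb{R}\times M \to \mathbb{R}\times M$ be given by

\[ \phi_t(s,x) = (s+t,\, x). \]

Then $f_1 = F\circ\phi_1\circ i_0$ and $f_0 = F\circ\phi_0\circ i_0$. Therefore, since $f_t^* = i_0^*\circ\phi_t^*\circ F^*$, the maps $f_1^*$ and $f_0^*$ agree on cohomology provided that $\phi_1, \phi_0 : \mathbb{R}\times M \to \mathbb{R}\times M$ induce the same map in cohomology. The collection of maps $\{\phi_t\}$ is the flow of the vector field $X = \frac{\partial}{\partial s}$ on $\mathbb{R}\times M$, where $s$ denotes the coordinate on the $\mathbb{R}$ factor. Therefore, by Lemma 9.22 and Cartan's formula, for any $k$-form $\alpha \in \Omega^k(\mathbb{R}\times M)$,

\begin{align*}
\phi_1^*\alpha - \phi_0^*\alpha &= \int_0^1\frac{d}{dt}(\phi_t^*\alpha)\,dt\\
&= \int_0^1\phi_t^*(L_X\alpha)\,dt = \int_0^1\phi_t^*\bigl(d\iota(X) + \iota(X)d\bigr)\alpha\,dt\\
&= d\Bigl(\int_0^1\phi_t^*(\iota(X)\alpha)\,dt\Bigr) + \int_0^1\phi_t^*(\iota(X)\,d\alpha)\,dt\\
&= d\kappa(\alpha) + \kappa(d\alpha),
\end{align*}

where

\[ \kappa(\alpha) := \int_0^1\phi_t^*(\iota(X)\alpha)\,dt. \]

Therefore, for any $\alpha \in \Omega^k(\mathbb{R}\times M)$ with $d\alpha = 0$,

\[ \phi_1^*\alpha - \phi_0^*\alpha = d(\kappa(\alpha)). \]

Hence

\[ [\phi_1^*\alpha] = [\phi_0^*\alpha] \]

and we are done.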
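
The homotopy operator $\kappa$ can be made concrete in a small case. The following is only an illustrative sketch under assumptions that are not in the text: take $M = \mathbb{R}$ with coordinate $x$, write $s$ for the coordinate on the first factor of $\mathbb{R}\times M$ (so $X = \partial/\partial s$ and $\phi_t(s,x) = (s+t,x)$), and let $\alpha = g(s,x)\,ds\wedge dx$ for an arbitrary smooth function $g$.

```latex
\begin{align*}
d\alpha &= 0 \quad\text{(a top-degree form on } \mathbb{R}^2\text{)},\\
\iota(X)\alpha &= g(s,x)\,dx,
\qquad \phi_t^*(\iota(X)\alpha) = g(s+t,x)\,dx,\\
\kappa(\alpha) &= \Bigl(\int_0^1 g(s+t,x)\,dt\Bigr)dx,\\
d\kappa(\alpha) &= \Bigl(\int_0^1 \frac{\partial g}{\partial s}(s+t,x)\,dt\Bigr)ds\wedge dx
 = \bigl(g(s+1,x) - g(s,x)\bigr)\,ds\wedge dx\\
 &= \phi_1^*\alpha - \phi_0^*\alpha .
\end{align*}
```

So in this case $\kappa(\alpha)$ is an explicit primitive of the difference $\phi_1^*\alpha - \phi_0^*\alpha$, exactly as the general formula $\phi_1^*\alpha - \phi_0^*\alpha = d\kappa(\alpha) + \kappa(d\alpha)$ predicts.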

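Corollary 9.22.1 (Poincaré lemma).

\[ H^k(\mathbb{R}^n) = \begin{cases}\mathbb{R} & k = 0,\\ 0 & k > 0.\end{cases} \]

Proof. Let $\imath : \{0\} \to \mathbb{R}^n$ be the inclusion and $p : \mathbb{R}^n \to \{0\}$ be the map that sends every point to 0. We want to show that $p^* : H^*(\{0\}) \to H^*(\mathbb{R}^n)$ is an isomorphism. It's enough to show that $\imath^* : H^*(\mathbb{R}^n) \to H^*(\{0\})$ and $p^*$ are inverses of each other. Define $f_t : \mathbb{R}^n \to \mathbb{R}^n$ by

\[ f_t(x) = tx. \]

Then $F(t,x) = f_t(x)$ is a homotopy between $f_0$ and $f_1$. Hence $f_0^* = f_1^*$ as maps on $H^*(\mathbb{R}^n)$. Moreover,

\[ p\circ\imath = \mathrm{id}_{\{0\}} \quad\text{and}\quad \imath\circ p = f_0. \]

Therefore

\[ \mathrm{id}_{H^*(\{0\})} = (p\circ\imath)^* = \imath^*\circ p^* \]

and

\[ \mathrm{id}_{H^*(\mathbb{R}^n)} = f_1^* = f_0^* = (\imath\circ p)^* = p^*\circ\imath^*. \]

Therefore $p^*$ and $\imath^*$ are inverses of each other, and $H^*(\{0\})$ and $H^*(\mathbb{R}^n)$ are isomorphic. The claim now follows from Example 9.16.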
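
For 1-forms the Poincaré lemma can be made explicit. The radial integration formula used below is the standard consequence of the homotopy $f_t(x) = tx$; it is not derived in the text, so treat this as an illustrative sketch. For a closed 1-form $\omega$ on $\mathbb{R}^n$, a candidate primitive is $g(p) = \int_0^1\omega_{tp}(p)\,dt$. The form $\omega$ below is a hypothetical example.

```latex
\begin{align*}
\omega &= 2xy\,dx + x^2\,dy \quad\text{on } \mathbb{R}^2,
\qquad d\omega = (2x - 2x)\,dx\wedge dy = 0,\\
g(x,y) &= \int_0^1 \omega_{(tx,ty)}(x,y)\,dt
        = \int_0^1\bigl(2(tx)(ty)\,x + (tx)^2\,y\bigr)\,dt
        = \int_0^1 3t^2x^2y\,dt = x^2y,\\
dg &= 2xy\,dx + x^2\,dy = \omega .
\end{align*}
```

Thus $[\omega] = 0$ in $H^1(\mathbb{R}^2)$, consistent with the corollary.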

10. Stokes’s theorem

There are two slightly different (but equivalent) ways of stating Stokes's theorem: for manifolds with boundary and for regular domains. Recall that manifolds are locally homeomorphic to open subsets of $\mathbb{R}^n$. Manifolds with boundary are locally homeomorphic to open subsets of the half-space

\[ \mathbb{H}^n := \{x \in \mathbb{R}^n \mid x_1 \le 0\}. \]

Technically it is slightly easier to work with regular domains, which is what we will do. Any regular domain is a manifold with boundary. And conversely, any manifold with boundary is a regular domain in some larger manifold. We will not prove the last two statements.

Definition 10.1. Let $M$ be a manifold of dimension $m$. A closed subset $D \subset M$ is a regular domain (or alternatively, a domain with smooth boundary) if for any point $p \in D$, there is a coordinate chart $\phi = (x_1,\dots,x_m) : U \to \mathbb{R}^m$ on $M$ such that $p \in U$ and

\[ \phi(U\cap D) = \phi(U)\cap\{x \in \mathbb{R}^m \mid x_1 \le 0\} = \phi(U)\cap\mathbb{H}^m. \]

Such a chart $\phi$ is adapted to the domain $D$.

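Example 10.2. The unit disk

\[ D = \{(x,y) \in \mathbb{R}^2 \mid x^2 + y^2 \le 1\} \]

is a regular domain in $\mathbb{R}^2$. For example, if $p = (1,0)$, we may take $U = \{(x,y) \mid x > 0,\ |y| < 1\}$ and $\phi(x,y) = \bigl(x - \sqrt{1-y^2},\, y\bigr)$; the set $U$ is open and $\phi$ is smooth on it, and a point $(x,y) \in U$ lies in $D$ exactly when $x - \sqrt{1-y^2} \le 0$.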

Recall that the interior $\operatorname{int}(Y)$ of a subset $Y$ of a topological space $X$ is the union of all open subsets of $X$ which are contained in $Y$. We define the boundary $\partial D$ of a regular domain $D$ in a manifold $M$ to be the points in $D$ that are not in the interior of $D$. Alternatively, $q \in \partial D$ if and only if any open set containing $q$ contains points of $D$ and points of $M\setminus D$.

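Lemma 10.3. Let $D$ be a regular domain in a manifold $M$. The boundary $\partial D$ is a submanifold of $M$ of codimension 1.

Proof. Let $\phi : U \to \mathbb{R}^m$ be a chart adapted to $D$. Then $\phi(U\cap\operatorname{int}(D)) \subset \{x \in \mathbb{R}^m \mid x_1 < 0\}$ and $\phi(U\cap\partial D) \subset \{x \in \mathbb{R}^m \mid x_1 = 0\}$. If $\psi : V \to \mathbb{R}^m$ is another coordinate chart adapted to $D$, then $\psi$ also maps $V\cap\operatorname{int}(D)$ to an open subset of $\{x \in \mathbb{R}^m \mid x_1 < 0\}$ and $V\cap\partial D$ to an open subset of $\{x \in \mathbb{R}^m \mid x_1 = 0\}$. Therefore

\[ \psi\circ\phi^{-1} : \phi(U\cap V) \to \psi(U\cap V) \]

maps $\phi(U\cap V\cap\partial D) = \phi(U\cap V)\cap\{x_1 = 0\}$ smoothly to $\psi(U\cap V)\cap\{x_1 = 0\}$. It follows that the collection of charts $\{\phi_\alpha : U_\alpha \to \mathbb{R}^m\}$ of $M$ which are adapted to $D$ gives rise to an atlas $\{\phi_\alpha|_{\partial D} : \partial D\cap U_\alpha \to \{x_1 = 0\} \simeq \mathbb{R}^{m-1}\}$ on $\partial D$.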

Lemma 10.4. If $D$ is a regular domain in an orientable manifold $M$ then $\operatorname{int}(D)$ and $\partial D$ are orientable.

Proof. An open subset of an orientable manifold is orientable. Hence $\operatorname{int}(D)$ is orientable. We now address the orientability of $\partial D$. If $\phi : U \to \mathbb{R}^m$, $\psi : V \to \mathbb{R}^m$ are two charts adapted to $D$, then $\psi\circ\phi^{-1}$ maps $\phi(U\cap V)\cap\{x_1 < 0\}$ to $\psi(U\cap V)\cap\{x_1 < 0\}$. Therefore, at the points of $\phi(U\cap V)\cap\{x_1 = 0\}$ the differential $d(\psi\circ\phi^{-1})$ maps the vectors that point into $\{x_1 < 0\}$ to vectors that point into $\{x_1 < 0\}$. In other words, at a point $q = (0, x_2,\dots,x_m)$ the differential has the form

\[ d(\psi\circ\phi^{-1})_q = \left(\begin{array}{c|c} a & 0\ \cdots\ 0\\ \hline \begin{matrix} * \\ \vdots \\ * \end{matrix} & d(\psi\circ\phi^{-1})_q|_{\{0\}\times\mathbb{R}^{m-1}} \end{array}\right) \]

for some smooth function $a = a(x_2,\dots,x_m) > 0$. Hence if $\det d(\psi\circ\phi^{-1})_q > 0$ then $\det\bigl(d(\psi\circ\phi^{-1})_q|_{\{0\}\times\mathbb{R}^{m-1}}\bigr) > 0$ as well. Therefore, if $M$ is orientable, then so is the boundary $\partial D$ of a regular domain $D$ in $M$.

We rephrase the lemma above in terms of volume forms (cf. Proposition 7.16).

Proposition 10.5. Let $D$ be a regular domain in a manifold $M$ and $\mu$ a non-vanishing top form on $M$. Then there is a vector field $N$ defined on $M$ near $\partial D$ which points out of $D$. Moreover,

\[ \nu = (\iota(N)\mu)|_{\partial D} \]

is an orientation on $\partial D$.

Proof. If $D = \{x \in \mathbb{R}^m : x_1 \le 0\}$, take $N = \frac{\partial}{\partial x_1}$. In general, cover $\partial D$ by adapted charts $\phi_i : U_i \to \mathbb{R}^m$. (Since all our manifolds are second countable, by passing to a subcover we may assume that the cover is countable.) On each $U_i$ there is a vector field $N_i \in \Gamma(TU_i)$ such that $N_i$ points outward. Pick a partition of unity $\{\rho_i\}$ subordinate to $\{U_i\}$, and let $N = \sum\rho_iN_i$. The vector field $N$ defined on $\bigcup U_i$ is the desired vector field. Note that in adapted coordinates $(x_1,\dots,x_m) : U_i \to \mathbb{R}^m$ it has the form

\[ N = b_1\frac{\partial}{\partial x_1} + \dots + b_m\frac{\partial}{\partial x_m} \]

for some functions $b_1,\dots,b_m$ with $b_1 > 0$. We now argue that

\[ \nu := (\iota(N)\mu)|_{\partial D} \]

is a nowhere vanishing form on $\partial D$. We argue in adapted coordinates $(x_1,\dots,x_m)$. The form $\mu$ satisfies

\[ \mu = f(x_1,\dots,x_m)\,dx_1\wedge\dots\wedge dx_m \]

for some nowhere zero function $f$. Consequently

\[ \iota(N)\mu = \sum_{j=1}^m(-1)^{j-1}fb_j\,dx_1\wedge\dots\wedge\widehat{dx_j}\wedge\dots\wedge dx_m \]

(recall that $\widehat{dx_j}$ means that $dx_j$ is omitted). Since for $j > 1$,

\[ dx_1\wedge\dots\wedge\widehat{dx_j}\wedge\dots\wedge dx_m\big|_{x_1=0} = 0, \]

we have

\[ (\iota(N)\mu)|_{\partial D} = (fb_1)|_{x_1=0}\,dx_2\wedge\dots\wedge dx_m \]

with $fb_1 \ne 0$.

Definition 10.6. Let $M$ be an oriented manifold and $D \subset M$ a regular domain. We will refer to the orientation on the boundary $\partial D$ defined by the orientation of $M$ as in Proposition 10.5 above as the induced orientation.

Theorem 10.7 (Stokes's Theorem). Let $M$ be an oriented $m$-dimensional manifold, $D \subset M$ a regular domain, and $\omega \in \Omega^{m-1}_c(M)$ a compactly supported form of degree one less than the dimension of $M$. Then

\[ \int_{\operatorname{int}(D)}d\omega = \int_{\partial D}\omega|_{\partial D}. \]

Here $\operatorname{int}(D)$ and $\partial D$ are both given the orientation induced by the one on $M$.

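Proof. First, consider the case that $M = \mathbb{R}^m$ and $D = \{x \in \mathbb{R}^m \mid x_1 \le 0\}$. It doesn't matter what orientation we choose on $M$; we just have to be consistent in orienting $\operatorname{int}(D)$ and $\partial D$. Choose the orientation on $\mathbb{R}^m$ defined by the standard volume form $\mu = dx_1\wedge\dots\wedge dx_m$. Let $N = \frac{\partial}{\partial x_1}$, so that $\iota(N)\mu|_{\partial D} = dx_2\wedge\dots\wedge dx_m$.

Let $\omega \in \Omega^{m-1}_c(\mathbb{R}^m)$ be a compactly supported form. Then

\[ \omega = \sum_j(-1)^{j-1}f_j\,dx_1\wedge\dots\wedge\widehat{dx_j}\wedge\dots\wedge dx_m \]

for some compactly supported functions $f_j$. Note that

\[ \omega|_{\partial D} = \Bigl(\sum_j(-1)^{j-1}f_j\,dx_1\wedge\dots\wedge\widehat{dx_j}\wedge\dots\wedge dx_m\Bigr)\Big|_{x_1=0} = f_1(0,x_2,\dots,x_m)\,dx_2\wedge\dots\wedge dx_m. \]

On the other hand,

\begin{align*}
d\omega &= \sum_j(-1)^{j-1}\frac{\partial f_j}{\partial x_j}\,dx_j\wedge dx_1\wedge\dots\wedge\widehat{dx_j}\wedge\dots\wedge dx_m\\
&= \sum_j\frac{\partial f_j}{\partial x_j}\,dx_1\wedge\dots\wedge dx_m.
\end{align*}

Now,

\[ \int_Dd\omega = \sum_j\int_{x_1\le 0}\frac{\partial f_j}{\partial x_j}\,dx_1\wedge\dots\wedge dx_m = \sum_j\int_{x_1\le 0}\frac{\partial f_j}{\partial x_j}\,dx_1\dots dx_m. \]

Since the supports of the $f_j$'s are compact, there is an $R > 0$ such that

\[ \operatorname{supp}(f_j) \subset \{x \in \mathbb{R}^m \mid -R \le x_j \le R\} \]

for all $j$. For $j > 1$,

\begin{align*}
\int_{x_1<0}\frac{\partial f_j}{\partial x_j}\,dx &= \int_{x_1<0}\Bigl(\int_{\mathbb{R}}\frac{\partial f_j}{\partial x_j}\,dx_j\Bigr)dx_1\dots\widehat{dx_j}\dots dx_m\\
&= \int_{x_1<0}\Bigl(\int_{-R}^{R}\frac{\partial f_j}{\partial x_j}\,dx_j\Bigr)dx_1\dots\widehat{dx_j}\dots dx_m\\
&= 0,
\end{align*}

since

\[ \int_{-R}^{R}\frac{\partial f_j}{\partial x_j}\,dx_j = f_j(x_1,\dots,x_{j-1},R,x_{j+1},\dots,x_m) - f_j(x_1,\dots,x_{j-1},-R,x_{j+1},\dots,x_m) = 0 - 0 = 0. \]

For $j = 1$, we have

\begin{align*}
\int_{x_1<0}\frac{\partial f_1}{\partial x_1}\,dx &= \int_{\mathbb{R}^{m-1}}\Bigl(\int_{-\infty}^0\frac{\partial f_1}{\partial x_1}\,dx_1\Bigr)dx_2\dots dx_m = \int_{\mathbb{R}^{m-1}}\Bigl(\int_{-R}^0\frac{\partial f_1}{\partial x_1}\,dx_1\Bigr)dx_2\dots dx_m\\
&= \int_{\mathbb{R}^{m-1}}\bigl(f_1(0,x_2,\dots,x_m) - 0\bigr)\,dx_2\dots dx_m\\
&= \int_{\mathbb{R}^{m-1}}f_1(0,x_2,\dots,x_m)\,dx_2\wedge\dots\wedge dx_m = \int_{\partial D}\omega|_{\partial D}.
\end{align*}

Therefore

\[ \int_{\operatorname{int}(D)}d\omega = \int_{\partial D}\omega|_{\partial D} \]

in the special case of $M = \mathbb{R}^m$, $D = \{x_1 \le 0\}$.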

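We now consider a slightly more general case: $D$ is a regular domain in an oriented manifold $M$ of dimension $m$, $\phi : U \to \mathbb{R}^m$ a chart adapted to $D$ and $\omega \in \Omega^{m-1}_c(M)$ with $\operatorname{supp}\omega \subset U$. Then

\begin{align*}
\int_{\operatorname{int}(D)}d\omega &= \int_{\operatorname{int}(D)\cap U}d\omega = \int_{\phi(\operatorname{int}(D)\cap U)}(\phi^{-1})^*(d\omega) = \int_{x_1<0}d\bigl((\phi^{-1})^*\omega\bigr)\\
&= \int_{\partial\{x_1\le 0\}}(\phi^{-1})^*\omega\big|_{\partial\{x_1\le 0\}} \qquad\text{(here we used the special case above)}\\
&= \int_{\phi(U\cap\partial D)}(\phi^{-1})^*\omega\big|_{\phi(U\cap\partial D)} = \int_{U\cap\partial D}\omega|_{U\cap\partial D} = \int_{\partial D}\omega|_{\partial D}.
\end{align*}

Finally we remove the restriction on the support of $\omega$. Let $\omega \in \Omega^{m-1}_c(M)$ be an arbitrary compactly supported form. Cover $D\cap\operatorname{supp}\omega$ by finitely many charts $\phi_\alpha : U_\alpha \to \mathbb{R}^m$ adapted to the domain $D$ and giving $D$ its orientation (we now have to make sure that changes of coordinates between charts preserve orientation). It is no loss of generality to assume that $M = \bigcup_\alpha U_\alpha$ (after all, we are only interested in $\omega|_D$, and $\operatorname{supp}\omega|_D \subset \bigcup_\alpha U_\alpha$). Let $\{\rho_\alpha\}$ be a partition of unity subordinate to the cover. Then $\sum\rho_\alpha\omega = \omega$ and $\operatorname{supp}(\rho_\alpha\omega) \subset U_\alpha$. By the previous discussion

\[ \int_{\operatorname{int}(D)}d(\rho_\alpha\omega) = \int_{\partial D}\rho_\alpha\omega \]

for each index $\alpha$. Therefore

\[ \int_{\operatorname{int}(D)}d\omega = \int_D\sum_\alpha d(\rho_\alpha\omega) = \sum_\alpha\int_Dd(\rho_\alpha\omega) = \sum_\alpha\int_{\partial D}\rho_\alpha\omega = \int_{\partial D}\sum_\alpha\rho_\alpha\omega = \int_{\partial D}\omega. \]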
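
As an illustration of the theorem, here is a direct check on the unit disk $D \subset \mathbb{R}^2$ of Example 10.2 with the standard orientation. The form $\omega = -y^3\,dx$ is a hypothetical choice made for this sketch, and we use the fact that the induced orientation on the unit circle is the counterclockwise one, parametrized by $\theta \mapsto (\cos\theta, \sin\theta)$.

```latex
\begin{align*}
d\omega &= -3y^2\,dy\wedge dx = 3y^2\,dx\wedge dy,\\
\int_{\operatorname{int}(D)} d\omega
 &= \int_0^{2\pi}\!\!\int_0^1 3\,(r\sin\theta)^2\, r\,dr\,d\theta
  = 3\cdot\frac14\cdot\pi = \frac{3\pi}{4},\\
\omega|_{\partial D} &= -\sin^3\theta\,(-\sin\theta\,d\theta) = \sin^4\theta\,d\theta,\\
\int_{\partial D}\omega|_{\partial D}
 &= \int_0^{2\pi}\sin^4\theta\,d\theta = \frac{3\pi}{4}.
\end{align*}
```

Both integrals equal $3\pi/4$, as Stokes's theorem requires.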

Exercise 10.1. Let $M$ be an $m$-dimensional compact oriented manifold, $D \subset M$ a domain with smooth boundary, $f \in C^\infty(M)$, and $\omega \in \Omega^{m-1}(M)$. Show that

\[ \int_Df\,d\omega = \int_{\partial D}f\omega - \int_Ddf\wedge\omega. \]

Exercise 10.2. Let $M$ be an $m$-dimensional oriented manifold and $\mu \in \Omega^m(M)$ a nowhere vanishing form. Recall that for any vector field $X$ on $M$,

\[ L_X\mu = \operatorname{div}_\mu(X)\,\mu, \]

where $\operatorname{div}_\mu(X)$ is the divergence of $X$ with respect to $\mu$ (cf. Exercise 9.2). Show that if $D \subset M$ is a regular domain then

\[ \int_D\operatorname{div}_\mu(X)\,\mu = \int_{\partial D}\iota(X)\mu \]

for any vector field $X$ with compact support.

Exercise 10.3. What is the integral of $x\,dy - y\,dx$ over $\partial D$, where $D$ is the unit disk in $\mathbb{R}^2$ (and $\mathbb{R}^2$ is given the standard orientation)?

11. Connections on vector bundles

11.1. Connections. If $X$ is a vector field on an open subset $U$ of $\mathbb{R}^m$, then $X$ is determined by an $m$-tuple $(a_1,\dots,a_m)$ of functions:

\[ X = \sum_ia_i\frac{\partial}{\partial x_i}. \]

Therefore we know how to take directional derivatives of $X$ at a point $q \in U$ in the direction of a vector $v \in T_qU = \mathbb{R}^m$: we simply differentiate the coefficients,

\[ (D_vX)_q = \sum_i(D_va_i)_q\,\frac{\partial}{\partial x_i}\Big|_q, \]

where $D_va_i$ is the directional derivative of the function $a_i$ in the direction $v$. Consequently we know when a vector field does not change along a curve $\gamma$:

\[ D_{\dot\gamma}X = 0. \]

Covariant derivatives generalize the directional derivatives, allowing us to differentiate vector fields on arbitrary manifolds and, more generally, sections of arbitrary vector bundles.

Definition 11.1 (Covariant derivative of sections of a vector bundle). Let $\pi : E \to M$ be a vector bundle. A covariant derivative (also known as a connection) is an $\mathbb{R}$-bilinear map

\[ \nabla : \Gamma(TM)\times\Gamma(E) \to \Gamma(E), \qquad (X,s) \mapsto \nabla_Xs \]

such that

(1) $\nabla_{fX}s = f\nabla_Xs$
(2) $\nabla_X(fs) = X(f)\cdot s + f\nabla_Xs$

for all $f \in C^\infty(M)$, all $X \in \Gamma(TM)$, and all $s \in \Gamma(E)$.

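Example 11.2. Let $U \subset \mathbb{R}^m$ be an open set and $E = TU \to U$ the tangent bundle. Define a connection $D$ on $TU \to U$ by

\[ D_X\Bigl(\sum_ia_i\frac{\partial}{\partial x_i}\Bigr) = \sum_iX(a_i)\frac{\partial}{\partial x_i}. \]

I leave it to the reader to check that this is indeed a connection.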

Remark 11.3. Lie derivative (X,Y ) 7→ LXY is not a connection on the tangent bundle (why not?).

Example 11.4. Let $\pi : E \to M$ be a trivial bundle of rank $k$. Then there exist global sections $\{s_1,\dots,s_k\}$ of $E$ such that $\{s_j(x)\}$ is a basis for $E_x$ for all points $x \in M$ ($\{s_j\}$ is a global frame of $E$). So for any $s \in \Gamma(E)$, we have $s = \sum_jf_js_j$ for some $C^\infty$ functions $f_j$. We define a bilinear map $\nabla : \Gamma(TM)\times\Gamma(E) \to \Gamma(E)$ by

\[ \nabla_Xs = \nabla_X\Bigl(\sum_jf_js_j\Bigr) := \sum_jX(f_j)s_j. \]

It is easy to check that $\nabla$ is indeed a connection on $E$:

\[ \nabla_{fX}s = \nabla_{fX}\Bigl(\sum_jf_js_j\Bigr) = \sum_jfX(f_j)s_j = f\sum_jX(f_j)s_j = f\nabla_Xs; \]

and

\[ \nabla_X(fs) = \nabla_X\Bigl(f\sum_jf_js_j\Bigr) = \sum_jX(ff_j)s_j = X(f)\sum_jf_js_j + f\sum_jX(f_j)s_j = X(f)s + f\nabla_Xs. \]

Lemma 11.5. Any convex linear combination of two connections on a vector bundle $E \to M$ is a connection. More precisely, let $\nabla^1$, $\nabla^2$ be two connections on $E$ and $\rho_1, \rho_2 \in C^\infty(M)$ be two functions with $\rho_1 + \rho_2 = 1$. Then

\[ \Gamma(TM)\times\Gamma(E) \ni (X,s) \mapsto \nabla_Xs := \rho_1\nabla^1_Xs + \rho_2\nabla^2_Xs \in \Gamma(E) \]

is a connection.

Proof. Exercise. Check that the two properties of the connection hold.

As a corollary we get:

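Proposition 11.6. Any vector bundle $\pi : E \to M$ has a connection.

Proof. Choose a cover $\{U_\alpha\}$ of $M$ such that $E|_{U_\alpha}$ is trivial. Let $\nabla^\alpha$ be a connection on $E|_{U_\alpha}$, as in Example 11.4. Let $\{\rho_\beta\}$ be a partition of unity subordinate to $\{U_\alpha\}$. Then $\operatorname{supp}\rho_\beta \subset U_\alpha$ for some $\alpha = \alpha(\beta)$. Define a map $\nabla : \Gamma(TM)\times\Gamma(E) \to \Gamma(E)$ by

\[ \nabla_Xs = \sum_\beta\rho_\beta\,\nabla^{\alpha(\beta)}_{X|_{U_{\alpha(\beta)}}}\bigl(s|_{U_{\alpha(\beta)}}\bigr). \]

This is indeed a connection, since the sum is locally finite and a convex linear combination of any finite number of connections is a connection (see Lemma 11.5 above).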

Proposition 11.7. Let $\nabla$ be a connection on a vector bundle $\pi : E \to M$. Then $\nabla$ is local: for any open set $U$ and any vector fields $X$ and $Y$, and for any sections $s$ and $s'$ of $E$ such that $X|_U = Y|_U$ and $s|_U = s'|_U$, we have

\[ (\nabla_Xs)|_U = (\nabla_Ys')|_U. \]

Proof. Since $\nabla$ is bilinear, it is enough to show two things:
(a) if $X|_U = 0$, then $(\nabla_Xs)|_U = 0$ for any $s \in \Gamma(E)$; and
(b) if $s|_U = 0$, then $(\nabla_Xs)|_U = 0$ for any $X \in \Gamma(TM)$.
Fix a point $x_0 \in U$. Then there is a smooth function $\rho : M \to [0,1]$ with $\operatorname{supp}\rho \subset U$ and $\rho|_V = 1$ for some open neighborhood $V$ of $x_0$. If $X|_U = 0$ then $\rho X = 0$ on $M$, and hence for any section $s$ of $E$,

\[ 0 = (\nabla_{\rho X}s)(x_0) = \rho(x_0)(\nabla_Xs)(x_0) = (\nabla_Xs)(x_0). \]

Since $x_0 \in U$ is arbitrary, (a) follows. If $s|_U = 0$ then $\rho s = 0$ on $M$. This in turn implies that

\[ 0 = (\nabla_X(\rho s))(x_0) = (X(\rho)s + \rho\nabla_Xs)(x_0) = 0 + \rho(x_0)(\nabla_Xs)(x_0) = (\nabla_Xs)(x_0), \]

and (b) follows.

Remark 11.8. It follows that if $\nabla$ is a connection on a vector bundle $E \to M$ then $\nabla$ induces a connection

\[ \nabla^U : \Gamma(TU)\times\Gamma(E|_U) \to \Gamma(E|_U) \]

on the restriction $E|_U$ for any open set $U \subset M$. Namely, for any $x_0 \in U$ let $\rho : M \to [0,1]$ be a bump function as in the proof above. Then for any $X \in \Gamma(TU)$ and any $s \in \Gamma(E|_U)$ we have $\rho X \in \Gamma(TM)$ and $\rho s \in \Gamma(E)$ (with $\rho X$ and $\rho s$ extended to all of $M$ by 0). We define:

\[ (\nabla^U_Xs)(x_0) = (\nabla_{\rho X}\,\rho s)(x_0). \]

By Proposition 11.7, the right hand side does not depend on the choice of the function $\rho$. We leave it to the reader to check that $\nabla^U$ is a connection.

Definition 11.9 (Christoffel symbols). Let $E \to M$ be a vector bundle with a connection $\nabla$. Let $(x_1,\dots,x_m) : U \to \mathbb{R}^m$ be a coordinate chart on $M$ small enough so that $E|_U$ is trivial. Let $\{s_\alpha\}$ be a frame of $E|_U$: for each $x \in U$ we require that $\{s_\alpha(x)\}$ is a basis of the fiber $E_x$. Then any local section $s \in \Gamma(E|_U)$ can be written as a linear combination of the $s_\alpha$'s. In particular, for each index $i$ and $\beta$

\[ \nabla^U_{\frac{\partial}{\partial x_i}}s_\beta = \sum_\alpha\Gamma^\alpha_{i\beta}\,s_\alpha \]

for some functions $\Gamma^\alpha_{i\beta} \in C^\infty(U)$. These functions are the Christoffel symbols of the connection $\nabla$ relative to the coordinates $(x_1,\dots,x_m)$ and the frame $\{s_\alpha\}$.

It follows easily that the Christoffel symbols determine the connection $\nabla^U$ on the coordinate chart $U$. It is customary not to distinguish between $\nabla$ and its restriction $\nabla^U$.
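
As an illustration of the definition, consider the connection $D$ of Example 11.2 on an open set $U \subset \mathbb{R}^2\setminus\{0\}$ on which polar coordinates $(r,\theta)$ are defined, together with the coordinate frame $\{\partial/\partial r, \partial/\partial\theta\}$ of $TU$. The choice of chart and frame is an assumption made for this sketch; the text does not work this example.

```latex
\begin{align*}
\frac{\partial}{\partial r} &= \cos\theta\,\frac{\partial}{\partial x} + \sin\theta\,\frac{\partial}{\partial y},
&\frac{\partial}{\partial\theta} &= -r\sin\theta\,\frac{\partial}{\partial x} + r\cos\theta\,\frac{\partial}{\partial y},\\[4pt]
D_{\frac{\partial}{\partial r}}\frac{\partial}{\partial r} &= 0,
&D_{\frac{\partial}{\partial r}}\frac{\partial}{\partial\theta} &= -\sin\theta\,\frac{\partial}{\partial x} + \cos\theta\,\frac{\partial}{\partial y} = \frac1r\,\frac{\partial}{\partial\theta},\\
D_{\frac{\partial}{\partial\theta}}\frac{\partial}{\partial r} &= \frac1r\,\frac{\partial}{\partial\theta},
&D_{\frac{\partial}{\partial\theta}}\frac{\partial}{\partial\theta} &= -r\cos\theta\,\frac{\partial}{\partial x} - r\sin\theta\,\frac{\partial}{\partial y} = -r\,\frac{\partial}{\partial r}.
\end{align*}
% Nonzero Christoffel symbols, in the convention
% \nabla_{\partial/\partial x_i} s_\beta = \sum_\alpha \Gamma^\alpha_{i\beta} s_\alpha:
\[
\Gamma^{\theta}_{r\theta} = \Gamma^{\theta}_{\theta r} = \frac1r,
\qquad
\Gamma^{r}_{\theta\theta} = -r .
\]
```

All other Christoffel symbols vanish. The same connection has all Christoffel symbols equal to zero relative to the Cartesian coordinates and the frame $\{\partial/\partial x, \partial/\partial y\}$, which shows that the symbols depend on the chart and the frame and not only on $\nabla$.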

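Proposition 11.10. Let $\nabla$ be a connection on a vector bundle $\pi : E \to M$. For any $X \in \Gamma(TM)$, any $s \in \Gamma(E)$ and any point $q \in M$, the value $(\nabla_Xs)(q)$ depends only on the vector $X_q$ (and not on the values of $X$ near $q$).

Proof. It's enough to show that if $X_q = 0$ then $(\nabla_Xs)(q) = 0$. Since connections are local we can argue in coordinates. Choose a coordinate chart $(x_1,\dots,x_m) : U \to \mathbb{R}^m$ on $M$ with $q \in U$ such that $E|_U$ is trivial. Pick a local frame $\{s_j\}$ of $E|_U$. Then, if $X = \sum_iX^i\frac{\partial}{\partial x_i}$, $s = \sum_jf_js_j$, and $\Gamma^k_{ij}$ denote the associated Christoffel symbols,

\begin{align*}
\nabla_Xs &= \nabla_{\sum_iX^i\frac{\partial}{\partial x_i}}\Bigl(\sum_jf_js_j\Bigr) = \sum_iX^i\,\nabla_{\frac{\partial}{\partial x_i}}\Bigl(\sum_jf_js_j\Bigr)\\
&= \sum_{i,j}X^i\frac{\partial f_j}{\partial x_i}\,s_j + \sum_{i,j}X^if_j\,\nabla_{\frac{\partial}{\partial x_i}}s_j\\
&= \sum_iX^i\Bigl(\sum_j\frac{\partial f_j}{\partial x_i}\,s_j + \sum_{j,k}f_j\Gamma^k_{ij}\,s_k\Bigr).
\end{align*}

If $X_q = 0$ then $X^i(q) = 0$ for all $i$. Hence $(\nabla_Xs)(q) = 0$ and we are done.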

As a corollary of the computation in the proof above we get an expression for the connection in terms of the Christoffel symbols.

Corollary 11.10.1. Let $\nabla$ be a connection on a vector bundle $\pi : E \to M$ and $(x_1,\dots,x_m) : U \to \mathbb{R}^m$ a coordinate chart on $M$ with $E|_U$ being trivial. Let $\{s_j\}$ be a frame of $E|_U$. Then

(11.1) \[ \nabla_{\sum_iX^i\frac{\partial}{\partial x_i}}\Bigl(\sum_jf_js_j\Bigr) = \sum_{i,k}X^i\Bigl(\frac{\partial f_k}{\partial x_i} + \sum_jf_j\Gamma^k_{ij}\Bigr)s_k. \]

We note one more corollary that will be useful when we try to define connections induced on submanifolds.

Corollary 11.10.2. Let $\nabla$ be a connection on a vector bundle $\pi : E \to M$. For any $X \in \Gamma(TM)$, any $s \in \Gamma(E)$ and any point $q \in M$, the value $(\nabla_Xs)(q)$ depends only on the values of $s$ along the integral curve of $X$ through $q$.

Proof. By the previous corollary, for $X = \sum_iX^i\frac{\partial}{\partial x_i}$ and $s = \sum_jf_js_j$,

\[ (\nabla_Xs)(q) = \sum_k(Xf_k)(q)\,s_k(q) + \sum_{i,k,j}X^i(q)f_j(q)\Gamma^k_{ij}(q)\,s_k(q). \]

And $(Xf_k)(q)$ depends only on the values of $f_k$ along the integral curve of $X$ through $q$.

The proof that connections are local has an important generalization to maps of sections of vector bundles.

Definition 11.11. Let $E \to M$ and $F \to M$ be two vector bundles. We say that a map $T : \Gamma(E) \to \Gamma(F)$ is tensorial if $T$ is $\mathbb{R}$-linear and for any $f \in C^\infty(M)$

\[ T(fs) = fT(s) \]

for all sections $s \in \Gamma(E)$.

Lemma 11.12. Let $E \to M$ and $F \to M$ be two vector bundles. If $T : \Gamma(E) \to \Gamma(F)$ is tensorial then there is a vector bundle map $\phi : E \to F$ so that

\[ [T(s)](x) = \phi(s(x)) \]

for all $s \in \Gamma(E)$ and $x \in M$. And conversely, any vector bundle map $\phi : E \to F$ defines a tensorial map on sections $T_\phi : \Gamma(E) \to \Gamma(F)$ by $T_\phi(s) = \phi\circ s$.

Proof. The proof is in two steps. We first argue that $T$ is local: if $s \in \Gamma(E)$ vanishes on an open set $U \subset M$ then $T(s)$ vanishes on $U$ as well. Pick a point $x \in U$ and a smooth function $\rho \in C^\infty(M)$ with $\operatorname{supp}\rho \subset U$ and $\rho \equiv 1$ on a neighborhood $V$ of $x$ ($V \subset U$, of course). Then $\rho s$ is identically zero everywhere. Hence

\[ 0 = T(\rho s)(x) = \rho(x)\,T(s)(x) = T(s)(x). \]

Since $x \in U$ is arbitrary, $T(s)|_U = 0$.

Since $T$ is local and $E$, $F$ are locally trivial, we may assume that $E$ and $F$ are, in fact, trivial. That is, $E = M\times\mathbb{R}^k$ and $F = M\times\mathbb{R}^l$. Moreover the sections of $E$ and $F$ are simply $k$- and $l$-tuples of functions. We want to define a vector bundle map $\phi : E \to F$. Then $\phi : M\times\mathbb{R}^k \to M\times\mathbb{R}^l$ has to be of the form

\[ \phi(x,v) = (x, A(x)v), \]

where $A : M \to \operatorname{Hom}(\mathbb{R}^k,\mathbb{R}^l)$ is smooth, with the property that

\[ T(f_1,\dots,f_k)(x) = A(x)\begin{pmatrix}f_1(x)\\ \vdots\\ f_k(x)\end{pmatrix} \]

for all $x \in M$. But this is easy: define the $j$th column of $A(x)$ to be the value at $x$ of the $l$-tuple of functions $T(e_j)$, where $e_j$ is the section of $E$ that assigns to every point the $j$th basis vector $(0,\dots,0,1,0,\dots,0)$ (1 in the $j$th slot). Or, if you prefer, $e_j$ is the $k$-tuple of functions with the $j$th function being identically 1 and all the others being zero.

Remark 11.13. Lemma 11.12 above generalizes further: let $E_1, E_2,\dots,E_k$ and $F$ be vector bundles over a manifold $M$ and

\[ T : \Gamma(E_1)\times\dots\times\Gamma(E_k) \to \Gamma(F) \]

a $k$-linear map which is tensorial in each slot:

\[ T(f_1s_1,\dots,f_ks_k) = f_1\cdots f_k\,T(s_1,\dots,s_k) \]

for all $s_i \in \Gamma(E_i)$ and $f_j \in C^\infty(M)$. Then for every $x \in M$ there is a unique $k$-linear map

\[ T_x : (E_1)_x\times\dots\times(E_k)_x \to F_x \]

with

\[ T_x(s_1(x),\dots,s_k(x)) = [T(s_1,\dots,s_k)](x). \]

Globally this means that there is a vector bundle map

\[ \phi : E_1\otimes\dots\otimes E_k \to F \]

so that

\[ T(s_1,\dots,s_k)(x) = \phi(s_1(x)\otimes\dots\otimes s_k(x)) \]

for all $x \in M$ and all sections $s_i \in \Gamma(E_i)$.

Remark 11.14. We add one more layer of abstraction to the remark above: there is a bijection between vector bundle maps $\phi : E \to F$ and sections of the bundle $\operatorname{Hom}(E,F) \simeq E^*\otimes F$. Namely, if $\phi : E \to F$ is a vector bundle map, then $\phi|_{E_x} : E_x \to F_x$ is an element of $\operatorname{Hom}(E_x,F_x) = \operatorname{Hom}(E,F)_x$ for each point $x \in M$. Thus $x \mapsto \phi|_{E_x}$ is a section of the bundle $\operatorname{Hom}(E,F) \to M$.

We summarize the preceding discussion as a proposition.

Proposition 11.15. Let $E_1, E_2,\dots,E_k$ and $F$ be vector bundles over a manifold $M$. There is a bijection between $k$-linear tensorial maps

\[ T : \Gamma(E_1)\times\dots\times\Gamma(E_k) \to \Gamma(F) \]

and the sections of the bundle $E_1^*\otimes\dots\otimes E_k^*\otimes F \to M$.

Here are a few instances where the above point of view is useful.

Lemma 11.16. Let $\nabla^1$ and $\nabla^2$ be two connections on a vector bundle $E \to M$. Their difference $\nabla^1 - \nabla^2$ "is" a section of the bundle $T^*M\otimes E^*\otimes E \simeq \operatorname{Hom}(TM\otimes E, E)$. Conversely, given a connection $\nabla$ on $E \to M$ and a section $A$ of the bundle $\operatorname{Hom}(TM\otimes E, E)$, the map $\nabla^A : \Gamma(TM)\times\Gamma(E) \to \Gamma(E)$ given by

\[ (\nabla^A_Xs)(x) := (\nabla_Xs)(x) + A_x(X_x\otimes s(x)) \]

is again a connection on $E$. Here, of course, $x \in M$ is a point, $X$ a vector field on $M$ and $s$ a section of $E$. Thus a choice of a connection on $E \to M$ defines a bijection

\[ \{\text{space of all connections on } E \to M\} \leftrightarrow \Gamma(T^*M\otimes E^*\otimes E) = \Gamma(\operatorname{Hom}(TM\otimes E, E)) = \Gamma(T^*M\otimes\operatorname{Hom}(E,E)). \]

Proof. In one direction it's enough to prove that $\nabla^1 - \nabla^2$ is tensorial in both slots. It's obviously tensorial in the vector field slot. The tensoriality in the second slot is an easy computation.

We also leave it to the reader to check that $\nabla^A$ as defined above is a connection.
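
The tensoriality computation mentioned in the proof can be sketched as follows. The contrast with a single connection, which is not tensorial in the section slot, is added here only for illustration; $f$ denotes an arbitrary smooth function on $M$.

```latex
\begin{align*}
% A single connection fails tensoriality in the section slot:
\nabla_X(fs) &= X(f)\,s + f\,\nabla_Xs \;\neq\; f\,\nabla_Xs \quad\text{in general},\\[4pt]
% but the offending term X(f)s cancels in a difference of two connections:
(\nabla^1 - \nabla^2)_X(fs)
 &= \bigl(X(f)\,s + f\,\nabla^1_Xs\bigr) - \bigl(X(f)\,s + f\,\nabla^2_Xs\bigr)\\
 &= f\,(\nabla^1 - \nabla^2)_Xs .
\end{align*}
```

The term $X(f)s$ does not depend on the connection, which is why it drops out of the difference.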

Definition 11.17. A connection on a manifold M is a connection on its tangent bundle TM →M .

Definition 11.18. The torsion T∇ of a connection ∇ on a manifold M is a bilinear map

T∇ : Γ(TM)× Γ(TM)→ Γ(TM), T∇(X,Y ) := ∇XY −∇YX − [X,Y ].

If T∇ = 0, the connection ∇ is called torsion-free.
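
For example, the connection $D$ of Example 11.2 on an open set $U \subset \mathbb{R}^m$ is torsion-free. The check below is a short sketch; it uses the standard coordinate formula for the Lie bracket, with $X = \sum_ia_i\,\partial/\partial x_i$ and $Y = \sum_ib_i\,\partial/\partial x_i$.

```latex
\begin{align*}
D_XY - D_YX
 &= \sum_i X(b_i)\,\frac{\partial}{\partial x_i} - \sum_i Y(a_i)\,\frac{\partial}{\partial x_i}
  = \sum_i\bigl(X(b_i) - Y(a_i)\bigr)\frac{\partial}{\partial x_i}
  = [X,Y],\\
T^D(X,Y) &= D_XY - D_YX - [X,Y] = 0 .
\end{align*}
```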

Lemma 11.19. The torsion of a connection is tensorial, hence corresponds to a section of the bundle $T^*M\otimes T^*M\otimes TM$.

Proof. This is yet another computation left to the reader.

Definition 11.20. The curvature $R$ of a connection $\nabla$ on a vector bundle $E \to M$ is a tri-linear map $\Gamma(TM)\times\Gamma(TM)\times\Gamma(E) \to \Gamma(E)$ defined by

\[ R(X,Y)s = \nabla_X(\nabla_Ys) - \nabla_Y(\nabla_Xs) - \nabla_{[X,Y]}s. \]

Lemma 11.21. Curvature is tensorial, hence corresponds to a section of $T^*M\otimes T^*M\otimes\operatorname{Hom}(E,E) \to M$. Moreover, since $R(X,Y)s = -R(Y,X)s$, it actually corresponds to a section of $\Lambda^2(T^*M)\otimes\operatorname{Hom}(E,E)$.

Proof. Once again this is a computation. We check tensoriality in one slot and leave the rest to the reader. For all vector fields $X, Y$, sections $s$ and functions $f$,

\begin{align*}
R(X,Y)(fs) &= \nabla_X(\nabla_Y(fs)) - \nabla_Y(\nabla_X(fs)) - \nabla_{[X,Y]}(fs)\\
&= \nabla_X(Y(f)s + f\nabla_Ys) - \nabla_Y(X(f)s + f\nabla_Xs) - ([X,Y]f)s - f\nabla_{[X,Y]}s\\
&= X(Y(f))s + Y(f)\nabla_Xs + X(f)\nabla_Ys + f\nabla_X(\nabla_Ys)\\
&\quad - Y(X(f))s - X(f)\nabla_Ys - Y(f)\nabla_Xs - f\nabla_Y(\nabla_Xs) - ([X,Y]f)s - f\nabla_{[X,Y]}s\\
&= fR(X,Y)s.
\end{align*}
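
As a simple instance of the definition, the connection of Example 11.4 on a trivial bundle with global frame $\{s_j\}$ has zero curvature. The computation below is a sketch that uses only the formula $\nabla_X(\sum_jf_js_j) = \sum_jX(f_j)s_j$ from that example.

```latex
\begin{align*}
R(X,Y)\Bigl(\sum_j f_js_j\Bigr)
 &= \nabla_X\Bigl(\sum_j Y(f_j)\,s_j\Bigr) - \nabla_Y\Bigl(\sum_j X(f_j)\,s_j\Bigr) - \sum_j\bigl([X,Y]f_j\bigr)s_j\\
 &= \sum_j\bigl(X(Y(f_j)) - Y(X(f_j)) - [X,Y]f_j\bigr)\,s_j = 0 .
\end{align*}
```

The last equality is just the definition of the Lie bracket acting on functions.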

11.2. Parallel Transport. In general there is no consistent way of identifying vectors in tangent spaces at different points of a manifold. More generally there is no consistent way of identifying vectors in fibers of a vector bundle above different points of a manifold. However we will see that given a connection $\nabla$ on a vector bundle $\pi : E \to M$, for any curve $\gamma : [a,b] \to M$ there is a family of vector space isomorphisms

\[ P^{t_2}_{t_1}(\gamma) = P^{t_2}_{t_1} : E_{\gamma(t_1)} \to E_{\gamma(t_2)}, \]

depending smoothly on $t_1, t_2 \in [a,b]$. These isomorphisms $P^{t_2}_{t_1}$ are called parallel transport along $\gamma$. The connection can then be recovered from parallel transport. We now proceed with the construction.

Definition 11.22. Let $\pi : E \to M$ be a vector bundle and $\gamma : [a,b] \to M$ a curve. A section $\sigma$ of $E \to M$ along $\gamma$ is a smooth map $\sigma : [a,b] \to E$ so that $\pi(\sigma(t)) = \gamma(t)$ for all $t \in [a,b]$. We denote the space of sections of $E$ along the map $\gamma$ by $\Gamma(\gamma^*E)$.

Example 11.23. If $s : M \to E$ is a section of $E$, then $s\circ\gamma$ is a section along $\gamma$.

Example 11.24. The derivative $\dot\gamma(t) := d\gamma_t\bigl(\frac{d}{dt}\big|_t\bigr)$ is a section of the tangent bundle $TM \to M$ along $\gamma$.

Remark 11.25. If $E = TM$ then a section along a curve $\gamma$ is also known as a vector field along $\gamma$. It's not true that every section $\sigma$ along $\gamma$ is of the form $\sigma = s\circ\gamma$ for some $s \in \Gamma(E)$: if the curve $\gamma$ crosses itself with two different velocities at the crossing point, then $\dot\gamma$ cannot be of the form $X\circ\gamma$ for any vector field $X$ on $M$.

Remark 11.26. Here's another way to consider sections along a curve $\gamma$. Suppose $f : N \to M$ is a smooth map of manifolds and that $\pi : E \to M$ is a vector bundle. Define the pullback of the bundle $E$ along $f$ to be the set

\[ f^*E = \{(n,e) \in N\times E \mid f(n) = \pi(e)\}, \]

together with the projection $\pi' : f^*E \to N$, $f^*E \ni (n,e) \mapsto n$. A transversality argument shows that $f^*E$ is a submanifold of $N\times E$, so $\pi'$ is smooth. It's not hard to see that $f^*E$ is a vector bundle of the same rank as $E$. The point of this construction is that a section of a bundle $E \to M$ along a curve $\gamma : (a,b) \to M$ is simply a section of the pullback bundle $\gamma^*E \to (a,b)$.

Strictly speaking the construction above doesn't apply to maps from closed intervals, since a closed interval is not a manifold. However, a smooth map from a closed interval $[a,b]$ is, by definition, the restriction of a smooth curve defined on a slightly larger open interval $(a',b') \supset [a,b]$, and pulling back $E$ to a bundle over $(a',b')$ does make sense.

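Definition 11.27. Let π : E → M be a vector bundle and γ : [a, b] → M a smooth curve. A covariant derivative $\frac{\nabla}{dt}$ along γ is an R-linear map

\[ \frac{\nabla}{dt} : \Gamma(\gamma^*E) \to \Gamma(\gamma^*E), \qquad \sigma \mapsto \frac{\nabla}{dt}\sigma \]

such that for all functions $f \in C^\infty([a, b])$ and all sections $\sigma \in \Gamma(\gamma^*E)$

\[ \frac{\nabla}{dt}(f\sigma) = \frac{df}{dt}\,\sigma + f\,\frac{\nabla}{dt}\sigma. \tag{11.2} \]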


Proposition 11.28. Given a connection ∇ on a vector bundle π : E → M and a curve γ : [a, b] → M, there is a unique covariant derivative $\frac{\nabla}{dt} : \Gamma(\gamma^*E) \to \Gamma(\gamma^*E)$ along γ such that

\[ \frac{\nabla}{dt}(s \circ \gamma)(t) = (\nabla_{\dot\gamma(t)} s)(\gamma(t)) \tag{11.3} \]

for all sections s of the bundle E.

Proof. (Uniqueness) Arguing as in Proposition 11.7, it is not hard to show that $\frac{\nabla}{dt}$ is local: for a section σ of E along γ the value $(\frac{\nabla}{dt}\sigma)(t)$ at a point t depends only on the values of σ near t. Therefore, in order to prove uniqueness it is no loss of generality to assume that the image γ([a, b]) of γ is contained in an open set U in M with $E|_U$ trivial. Pick a frame $\{s_j\}$ of $E|_U$. Then for any $\sigma \in \Gamma(\gamma^*E)$ there are smooth functions $f_j \in C^\infty([a, b])$ so that

\[ \sigma(t) = \sum_j f_j(t)\, s_j(\gamma(t)) \]

for all t ∈ [a, b]. Then, using (11.2) and (11.3), we get

\[ \frac{\nabla}{dt}\sigma(t) = \frac{\nabla}{dt}\Big(\sum_j f_j\,(s_j \circ \gamma)\Big)(t) = \sum_j \frac{df_j}{dt}(t)\, s_j(\gamma(t)) + \sum_j f_j(t)\,(\nabla_{\dot\gamma(t)} s_j)(\gamma(t)). \tag{11.4} \]

Since the right hand side of (11.4) depends only on ∇, $\frac{\nabla}{dt}$ is unique.

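(Existence) Cover γ([a, b]) with open sets $U_j$ such that $E|_{U_j}$ is trivial. It is enough to construct $\frac{\nabla}{dt}$ on each $\Gamma(\gamma^*E|_{\gamma^{-1}(U_j)})$, for by uniqueness the operators on each $\Gamma(\gamma^*E|_{\gamma^{-1}(U_j)})$ will patch together to a map $\frac{\nabla}{dt} : \Gamma(\gamma^*E) \to \Gamma(\gamma^*E)$. Pick a frame $\{s^{(j)}_k\}$ on $E|_{U_j}$ and define $\frac{\nabla}{dt}$ on $\gamma^*(E|_{U_j})$ by (11.4).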

Definition 11.29. We will refer to the covariant derivative $\frac{\nabla}{dt}$ along γ of Proposition 11.28 above as being induced by the connection ∇.

Definition 11.30. Let E → M be a vector bundle with a connection ∇, and γ : [a, b] → M a curve. A section $\sigma \in \Gamma(\gamma^*E)$ is parallel if

\[ \frac{\nabla}{dt}\sigma = 0, \]

where $\frac{\nabla}{dt}$ is the covariant derivative along γ induced by ∇.

To define parallel transport along a curve γ : [a, b] → M, we want, for every vector $v \in E_{\gamma(a)}$, a section $\sigma^v \in \Gamma(\gamma^*E)$ such that $\sigma^v(a) = v$ and $\frac{\nabla}{dt}\sigma^v = 0$. We also want the map $v \mapsto \sigma^v$ to be linear. The existence of such sections and linearity in v is the result of the next two lemmas. The first one is the standard result for linear time-dependent ODEs.

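Lemma 11.31. Suppose that $B = (B_{jk}(t)) : [c, d] \to \mathbb{R}^{k^2}$ is a smooth curve in the space of k × k real matrices. Then there is a smooth curve $R : [c, d] \to GL(k, \mathbb{R})$ such that, for every $f_0 \in \mathbb{R}^k$, the curve $f(t) := R(t) f_0$ is a solution of the ODE

\[ \begin{pmatrix} f_1'(t) \\ \vdots \\ f_k'(t) \end{pmatrix} = B(t) \begin{pmatrix} f_1(t) \\ \vdots \\ f_k(t) \end{pmatrix} \tag{11.5} \]

with initial condition $f(c) = f_0$.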
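The curve R of Lemma 11.31 can be approximated numerically by integrating R′(t) = B(t)R(t) with R(c) = I. The following is a minimal sketch, not part of the original notes; the function names (`fundamental_matrix`, `B_rot`) and the use of a classical Runge-Kutta scheme are illustrative assumptions.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_axpy(A, c, K):
    """Return the matrix A + c*K (entrywise)."""
    return [[a + c * k for a, k in zip(ra, rk)] for ra, rk in zip(A, K)]

def fundamental_matrix(B, c, d, steps=1000):
    """Integrate R'(t) = B(t) R(t), R(c) = I, with classical RK4; return R(d)."""
    n = len(B(c))
    R = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    h = (d - c) / steps
    t = c
    for _ in range(steps):
        k1 = mat_mul(B(t), R)
        k2 = mat_mul(B(t + h / 2), mat_axpy(R, h / 2, k1))
        k3 = mat_mul(B(t + h / 2), mat_axpy(R, h / 2, k2))
        k4 = mat_mul(B(t + h), mat_axpy(R, h, k3))
        for i in range(n):
            for j in range(n):
                R[i][j] += h / 6 * (k1[i][j] + 2 * k2[i][j] + 2 * k3[i][j] + k4[i][j])
        t += h
    return R

# Hypothetical test case: a constant generator of rotations of the plane.
def B_rot(t):
    return [[0.0, -1.0], [1.0, 0.0]]

# For this B the exact solution is rotation by the angle t.
R_end = fundamental_matrix(B_rot, 0.0, math.pi / 2)
```

For this choice of B the matrix R(t) is the rotation by angle t, so `R_end` should be close to the rotation by a quarter turn; in particular it is invertible, as the lemma asserts.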

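Lemma 11.32. Let E → M be a vector bundle with a connection ∇ and γ : [a, b] → M a smooth curve. For any vector $v \in E_{\gamma(a)}$ there is a section $\sigma^v \in \Gamma(\gamma^*E)$ such that $\sigma^v(a) = v$ and $\frac{\nabla}{dt}\sigma^v = 0$. Moreover, the map

\[ E_{\gamma(a)} \to \Gamma(\gamma^*E), \qquad v \mapsto \sigma^v \]

is linear and injective, hence a linear isomorphism onto the space of parallel sections along γ.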

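Proof. As before, it is no loss of generality to assume the image of γ is contained in a coordinate chart $(x_1, \ldots, x_m) : U \to \mathbb{R}^m$ with $E|_U$ trivial. Let $\{s_j\}$ be a frame of $E|_U$ and $\Gamma^k_{ij}$ the corresponding Christoffel symbols, so that $\nabla_{\partial/\partial x_i} s_j = \sum_k \Gamma^k_{ij} s_k$. Suppose σ is a section of E along γ which is parallel and satisfies σ(a) = v. Then there are smooth functions $f_j \in C^\infty([a, b])$ so that $\sigma = \sum_j f_j\,(s_j \circ \gamma)$. We argue that the $f_j$'s satisfy a linear ODE as in Lemma 11.31 for some curve B. By (11.4), since $\frac{\nabla}{dt}\sigma = 0$,

\[ \sum_j \frac{df_j}{dt}(t)\, s_j(\gamma(t)) = -\sum_j f_j(t)\,(\nabla_{\dot\gamma(t)} s_j)(\gamma(t)). \]

We also have $\dot\gamma = \sum_i \dot\gamma_i\, \frac{\partial}{\partial x_i}$, where $\gamma_i := x_i \circ \gamma$ and $\dot\gamma_i = \frac{d}{dt}\gamma_i$. Therefore

\[ \nabla_{\dot\gamma} s_j = \sum_i \dot\gamma_i\,\big(\nabla_{\partial/\partial x_i} s_j\big) \circ \gamma = \sum_{i,k} \dot\gamma_i\,(\Gamma^k_{ij} s_k) \circ \gamma = \sum_k \Big(\sum_i \dot\gamma_i\,(\Gamma^k_{ij} \circ \gamma)\Big)(s_k \circ \gamma). \]

We conclude that $\sigma = \sum_j f_j\,(s_j \circ \gamma)$ is parallel if and only if

\[ \frac{df_k}{dt}(t) = -\sum_{i,j} f_j(t)\,\dot\gamma_i(t)\,\Gamma^k_{ij}(\gamma(t)). \tag{11.6} \]

That is, $f = (f_1, \ldots, f_k)$ satisfies the ODE (11.5) with

\[ B_{kj}(t) = -\sum_i \dot\gamma_i(t)\,\Gamma^k_{ij}(\gamma(t)). \]

By Lemma 11.31 the system of linear equations (11.6) has a solution defined for all time t ∈ [a, b] which depends linearly on the initial conditions. Therefore the desired parallel sections $\sigma^v$ exist and depend linearly on v, and so the desired parallel transport exists.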
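The ODE (11.6) can be integrated numerically. The sketch below is only an illustration under stated assumptions: it takes E = TM for the unit sphere in spherical coordinates (θ, φ), and it assumes the Christoffel symbols Γ^θ_{φφ} = −sin θ cos θ and Γ^φ_{θφ} = Γ^φ_{φθ} = cot θ of the round metric, which are quoted here and not derived in the text. All function names are hypothetical.

```python
import math

def christoffel_sphere(point):
    """Assumed symbols G[k][i][j] = Gamma^k_{ij} of the unit round sphere in (theta, phi)."""
    theta, _phi = point
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    G[0][1][1] = -math.sin(theta) * math.cos(theta)
    G[1][0][1] = G[1][1][0] = math.cos(theta) / math.sin(theta)
    return G

def transport_rhs(t, f, curve, velocity, christoffel):
    """Right hand side of (11.6): f_k' = -sum_{i,j} f_j * gamma_i' * Gamma^k_{ij}(gamma)."""
    G = christoffel(curve(t))
    v = velocity(t)
    n = len(f)
    return [-sum(f[j] * v[i] * G[k][i][j] for i in range(n) for j in range(n))
            for k in range(n)]

def parallel_transport(f0, curve, velocity, christoffel, a, b, steps=2000):
    """Transport the vector with components f0 from gamma(a) to gamma(b) using RK4."""
    h = (b - a) / steps
    f, t = list(f0), a
    args = (curve, velocity, christoffel)
    for _ in range(steps):
        k1 = transport_rhs(t, f, *args)
        k2 = transport_rhs(t + h / 2, [x + h / 2 * k for x, k in zip(f, k1)], *args)
        k3 = transport_rhs(t + h / 2, [x + h / 2 * k for x, k in zip(f, k2)], *args)
        k4 = transport_rhs(t + h, [x + h * k for x, k in zip(f, k3)], *args)
        f = [x + h / 6 * (p + 2 * q + 2 * r + s)
             for x, p, q, r, s in zip(f, k1, k2, k3, k4)]
        t += h
    return f

# Hypothetical example: the latitude circle theta = pi/3, traversed once.
THETA0 = math.pi / 3

def latitude(t):
    return (THETA0, t)

def latitude_velocity(t):
    return (0.0, 1.0)

f_end = parallel_transport([1.0, 0.0], latitude, latitude_velocity,
                           christoffel_sphere, 0.0, 2 * math.pi)
```

For this latitude the equations reduce to a rotation of the orthonormal components (f_θ, sin θ₀ · f_φ) with angular speed cos θ₀ = 1/2, so after one full loop the vector ∂/∂θ should come back as approximately −∂/∂θ. This illustrates that parallel transport around a closed curve need not be the identity.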

Parallel transport leads to one definition of geodesics.

Definition 11.33. Let ∇ be a connection on the tangent bundle TM → M of a manifold M. A curve γ : [a, b] → M is a geodesic if its velocity field $\dot\gamma(t)$ is parallel:

\[ \frac{\nabla}{dt}\dot\gamma = 0. \tag{11.7} \]

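Remark 11.34. It will be useful to know what (11.7) means in coordinates. Let $(x_1, \ldots, x_m) : U \to \mathbb{R}^m$ be a coordinate chart on our manifold. Define $\gamma_i = x_i \circ \gamma$, $\dot\gamma_i = \frac{d}{dt}\gamma_i$ and $\ddot\gamma_i = \frac{d}{dt}\dot\gamma_i$. Then $\dot\gamma = \sum_i \dot\gamma_i\, \frac{\partial}{\partial x_i}$. Hence the functions $f_k$ in (11.6) are the $\dot\gamma_k$'s. Therefore, in this case, (11.6) reads

\[ \ddot\gamma_k = -\sum_{i,j} \dot\gamma_i\,\dot\gamma_j\,\Gamma^k_{ij}(\gamma). \tag{11.8} \]

We conclude that a curve γ is a geodesic for a connection ∇ if and only if (11.8) holds in every coordinate chart.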
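Equation (11.8) is a second order ODE and can be integrated numerically once the Christoffel symbols are known. The following sketch is an illustration only. It assumes the flat plane in polar coordinates (r, θ) with Γ^r_{θθ} = −r and Γ^θ_{rθ} = Γ^θ_{θr} = 1/r (quoted, not derived in the text); the function names are hypothetical.

```python
import math

def christoffel_polar(point):
    """Assumed symbols G[k][i][j] = Gamma^k_{ij} of the flat plane in polar coordinates."""
    r, _theta = point
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    G[0][1][1] = -r
    G[1][0][1] = G[1][1][0] = 1.0 / r
    return G

def geodesic_rhs(state, christoffel):
    """First order form of (11.8): (x, v)' = (v, -Gamma(x)(v, v))."""
    n = len(state) // 2
    x, v = state[:n], state[n:]
    G = christoffel(x)
    acc = [-sum(G[k][i][j] * v[i] * v[j] for i in range(n) for j in range(n))
           for k in range(n)]
    return v + acc

def integrate_geodesic(x0, v0, christoffel, t_end, steps=1000):
    """Integrate the geodesic with initial position x0 and velocity v0 by RK4."""
    h = t_end / steps
    state = list(x0) + list(v0)

    def shifted(s, k, c):
        return [a + c * b for a, b in zip(s, k)]

    for _ in range(steps):
        k1 = geodesic_rhs(state, christoffel)
        k2 = geodesic_rhs(shifted(state, k1, h / 2), christoffel)
        k3 = geodesic_rhs(shifted(state, k2, h / 2), christoffel)
        k4 = geodesic_rhs(shifted(state, k3, h), christoffel)
        state = [s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
    n = len(x0)
    return state[:n], state[n:]

# Start at the Cartesian point (1, 0) with Cartesian velocity (0, 1).
(r_end, theta_end), _ = integrate_geodesic((1.0, 0.0), (0.0, 1.0),
                                           christoffel_polar, 1.0)
x_end = r_end * math.cos(theta_end)
y_end = r_end * math.sin(theta_end)
```

In Cartesian terms the computed curve should be the straight line x = 1, y = t, so the endpoint at t = 1 is expected near (1, 1), even though neither r nor θ is a linear function of t.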

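Exercise 11.1. Consider the manifold $\mathbb{R}^n$. We have seen that $D_X Y = \sum_i X(Y_i)\, \frac{\partial}{\partial x_i}$ is a connection. Suppose that $\gamma : \mathbb{R} \to \mathbb{R}^n$ is a curve. Let $\frac{D}{dt}$ denote the covariant derivative along γ induced by the connection D on $\mathbb{R}^n$. Show that

\[ \frac{D}{dt}\dot\gamma = \ddot\gamma \quad \Big(= \frac{d^2\gamma}{dt^2}\Big). \]

Conclude that the geodesics in $\mathbb{R}^n$ with respect to D are straight lines.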

12. Riemannian geometry

12.1. Levi-Civita connection. We now specialize the discussion of connections and parallel transport to the case of manifolds with a choice of an inner product on each tangent space.

Definition 12.1 (Riemannian metric). A Riemannian metric g on a manifold M assigns smoothly to each point x ∈ M a positive definite inner product $g_x$ on $T_xM$.

A Riemannian manifold is a manifold M together with a choice of a Riemannian metric g. In other words, it is a pair (M, g).

Remark 12.2. An inner product h on a vector space V is a bilinear map h : V × V → R. Hence it is an element of the tensor product $V^* \otimes V^*$. Therefore a Riemannian metric on a manifold M is nothing but a smooth section of the bundle $(T^*M)^{\otimes 2} := T^*M \otimes T^*M \to M$. (Not all sections of $(T^*M)^{\otimes 2} \to M$ are Riemannian metrics. For instance, the zero section is not. But all symmetric and positive definite sections of $(T^*M)^{\otimes 2} \to M$ are Riemannian metrics.)


Theorem 12.3. Any second countable manifold M has a Riemannian metric.

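Proof. Let $\varphi_i = (x^{(i)}_1, \ldots, x^{(i)}_m) : U_i \to \mathbb{R}^m$ be a countable collection of coordinate charts that cover M. On each chart $U_i$ define a metric $g^{(i)} = \sum_j dx^{(i)}_j \otimes dx^{(i)}_j$. Let $\{\rho_i\}$ be a partition of unity subordinate to this cover. Define a section g of $T^*M \otimes T^*M \to M$ by

\[ g = \sum_i \rho_i\, g^{(i)}. \]

Then g is a Riemannian metric: it is symmetric, and it is positive definite because at every point all $\rho_i$ are nonnegative and at least one of them is positive.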

Fiber metrics. The notion of a Riemannian metric generalizes to arbitrary vector bundles.

Definition 12.4. A fiber metric on the vector bundle E → M assigns smoothly to each point x ∈ M a positive definite symmetric bilinear form $g_x : E_x \times E_x \to \mathbb{R}$. In particular a fiber metric is a section of $E^* \otimes E^* \to M$.

Proposition 12.5. Every vector bundle E →M over a paracompact manifold M has a fiber metric.

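Proof. If $\{s_\alpha\}$, $s_\alpha : U \to E$, is a local frame, then

\[ g_x\Big(\sum_\alpha a_\alpha s_\alpha(x), \sum_\beta b_\beta s_\beta(x)\Big) = \sum_{\alpha,\beta} a_\alpha b_\beta\, \delta_{\alpha\beta} \]

is a fiber metric on $E|_U$. Patch these local fiber metrics together using a partition of unity.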

The next theorem is the fundamental theorem of Riemannian geometry. It says that for every Riemannian manifold (M, g) there is a connection ∇ (which depends on the metric g) with two important properties. Such a connection is called the Levi-Civita connection.

Theorem 12.6 (existence and uniqueness of the Levi-Civita connection). On every Riemannian manifold (M, g) there is a unique connection ∇ : Γ(TM) × Γ(TM) → Γ(TM) which is
(1) torsion-free: $\nabla_X Y - \nabla_Y X = [X, Y]$ for all X, Y ∈ Γ(TM);
(2) metric (i.e. compatible with g): $X(g(Y, Z)) = g(\nabla_X Y, Z) + g(Y, \nabla_X Z)$ for all X, Y, Z ∈ Γ(TM).

Proof. (Uniqueness) The proof is a trick. Suppose that ∇ exists. Then for any X, Y, Z ∈ Γ(TM),

\begin{align*}
X\, g(Y, Z) &= g(\nabla_X Y, Z) + g(Y, \nabla_X Z), \\
Y\, g(Z, X) &= g(\nabla_Y Z, X) + g(Z, \nabla_Y X), \\
-Z\, g(X, Y) &= -g(\nabla_Z X, Y) - g(X, \nabla_Z Y),
\end{align*}

since the connection is compatible with the metric. Adding up the three equations and using the fact that the connection is torsion-free, we get

\begin{align*}
X\, g(Y, Z) + Y\, g(Z, X) - Z\, g(X, Y) &= g(\nabla_X Y, Z) + g(\nabla_Y X, Z) + g(Y, \nabla_X Z - \nabla_Z X) + g(X, \nabla_Y Z - \nabla_Z Y) \\
&= g(\nabla_X Y, Z) + g(\nabla_X Y - [X, Y], Z) + g(Y, [X, Z]) + g(X, [Y, Z]) \\
&= 2 g(\nabla_X Y, Z) - g([X, Y], Z) + g(Y, [X, Z]) + g(X, [Y, Z]).
\end{align*}

Thus, we have

\[ 2 g(\nabla_X Y, Z) = X(g(Y, Z)) + Y(g(Z, X)) - Z(g(X, Y)) + g([X, Y], Z) - g(Y, [X, Z]) - g(X, [Y, Z]). \tag{12.1} \]

Since Z is arbitrary and g is nondegenerate, the formula above uniquely determines $\nabla_X Y$. This proves uniqueness of a Levi-Civita connection.

It remains to prove existence. The proof is very simple, if one is willing to skip all the details. Define an R-trilinear map

\[ \Gamma(TM) \times \Gamma(TM) \times \Gamma(TM) \to C^\infty(M) \]

by sending a triple of vector fields (X, Y, Z) to 1/2 of the right hand side of (12.1). Since g is nondegenerate this defines an R-bilinear map

\[ \Gamma(TM) \times \Gamma(TM) \to \Gamma(TM), \qquad (X, Y) \mapsto \text{“}\nabla\text{”}_X Y. \]

It remains to verify that “∇” so defined is a connection, and that it is metric and torsion-free. These minor details are traditionally left to the reader. We will provide a different and more detailed proof below after a brief detour.

Equation (12.1) has the following interesting consequence:

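Lemma 12.7. The Christoffel symbols of the Levi-Civita connection depend only on the metric and its first partials.

Proof. Given a coordinate chart $(x_1, \ldots, x_m) : U \to \mathbb{R}^m$ on M, the Christoffel symbols $\Gamma^k_{ij}$ of the Levi-Civita connection ∇ are defined by

\[ \nabla_{\partial_i}\partial_j = \sum_k \Gamma^k_{ij}\,\partial_k, \]

where $\partial_i = \frac{\partial}{\partial x_i}$. Plugging $X = \partial_i$, $Y = \partial_j$ and $Z = \partial_k$ into (12.1) we get

\[ 2 g(\nabla_{\partial_i}\partial_j, \partial_k) = \partial_i(g(\partial_j, \partial_k)) + \partial_j(g(\partial_k, \partial_i)) - \partial_k(g(\partial_i, \partial_j)) \]

since $[\partial_i, \partial_j] = [\partial_j, \partial_k] = [\partial_i, \partial_k] = 0$. Writing $g_{ij} = g(\partial_i, \partial_j)$ etc., we obtain

\[ 2 \sum_l \Gamma^l_{ij}\, g_{lk} = \partial_i g_{jk} + \partial_j g_{ki} - \partial_k g_{ij}. \tag{12.2} \]

Since g is a metric, the matrix $(g_{ij})$ is nondegenerate. Let $(g^{rs})$ denote its inverse, so that $\sum_s g^{rs} g_{sk} = \delta^r_k$. Multiplying both sides of (12.2) by $g^{sk}$ and summing over k we get

\[ \sum_l \delta^s_l\, \Gamma^l_{ij} = \frac{1}{2} \sum_k g^{sk}\,(\partial_i g_{jk} + \partial_j g_{ki} - \partial_k g_{ij}), \]

and simplifying

\[ \Gamma^s_{ij} = \frac{1}{2} \sum_k g^{sk}\,(\partial_i g_{jk} + \partial_j g_{ki} - \partial_k g_{ij}). \tag{12.3} \]

This proves that the Christoffel symbols depend only on the metric and its first order partials.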
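The formula for the Christoffel symbols in terms of the metric, Γ^s_{ij} = ½ Σ_k g^{sk}(∂_i g_{jk} + ∂_j g_{ki} − ∂_k g_{ij}), is easy to evaluate numerically. The sketch below is an illustration under assumptions: it is restricted to two-dimensional charts, it replaces exact partial derivatives by central finite differences, and the test metric dθ² + sin²θ dφ² of the unit sphere is a choice made here, not taken from the text. The function names are hypothetical.

```python
import math

def metric_partials(metric, x, eps=1e-5):
    """Return dg with dg[i][a][b] ~ partial_i g_ab at x (central differences)."""
    dg = []
    for i in range(2):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[a][b] - gm[a][b]) / (2 * eps) for b in range(2)]
                   for a in range(2)])
    return dg

def christoffel_from_metric(metric, x):
    """Evaluate G[s][i][j] = Gamma^s_{ij} from the metric and its first partials."""
    g = metric(x)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    # Inverse of a 2 x 2 matrix.
    ginv = [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]
    dg = metric_partials(metric, x)
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    for s in range(2):
        for i in range(2):
            for j in range(2):
                G[s][i][j] = 0.5 * sum(
                    ginv[s][k] * (dg[i][j][k] + dg[j][k][i] - dg[k][i][j])
                    for k in range(2))
    return G

def sphere_metric(x):
    """Assumed example: the round metric of the unit sphere in (theta, phi)."""
    theta, _phi = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

G = christoffel_from_metric(sphere_metric, (1.0, 0.3))
```

For the sphere one expects Γ^θ_{φφ} = −sin θ cos θ and Γ^φ_{θφ} = Γ^φ_{φθ} = cot θ, with all other symbols zero, and the numerical values agree with these up to the finite difference error. The symmetry of the result in the lower indices i, j reflects the symmetry of the formula.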

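Proof of Theorem 12.6 continued. It remains to (re)prove the existence of the Levi-Civita connection. By uniqueness, it is enough to construct a Levi-Civita connection ∇ in each coordinate chart. For then, by uniqueness, these coordinate chart connections patch together into a Levi-Civita connection on the whole manifold M. We have shown that if the Levi-Civita connection exists then its Christoffel symbols have to be given by (12.3). Therefore on a chart $(x_1, \ldots, x_m) : U \to \mathbb{R}^m$ we define a connection ∇ by

\[ \nabla_{X_i\partial_i}(Y_j\partial_j) = X_i(\partial_i Y_j)\,\partial_j + X_i Y_j\,\Gamma^k_{ij}\,\partial_k \]

with Christoffel symbols $\Gamma^k_{ij}$ given by (12.3). In the equation above we finally resorted to the Einstein summation convention: we sum on repeated indices and omit the symbol Σ. We now check that ∇ is a Levi-Civita connection. Since $\Gamma^k_{ij} = \Gamma^k_{ji}$ (cf. (12.3)),

\[ \nabla_{\partial_i}\partial_j - \nabla_{\partial_j}\partial_i = \Gamma^k_{ij}\,\partial_k - \Gamma^k_{ji}\,\partial_k = 0. \]

Thus, for two vector fields $X = X_i\partial_i$ and $Y = Y_j\partial_j$, we have

\begin{align*}
\nabla_X Y - \nabla_Y X &= \nabla_{X_i\partial_i}(Y_j\partial_j) - \nabla_{Y_j\partial_j}(X_i\partial_i) \\
&= X_i(\partial_i Y_j)\,\partial_j + X_i Y_j\,\nabla_{\partial_i}\partial_j - Y_j(\partial_j X_i)\,\partial_i - Y_j X_i\,\nabla_{\partial_j}\partial_i \\
&= X_i(\partial_i Y_j)\,\partial_j - Y_j(\partial_j X_i)\,\partial_i \\
&= [X_i\partial_i, Y_j\partial_j].
\end{align*}

Thus, ∇ is torsion-free. Compatibility with g is a somewhat longer computation. First, note that

\begin{align*}
g(\nabla_{\partial_i}\partial_j, \partial_k) + g(\partial_j, \nabla_{\partial_i}\partial_k) &= g(\Gamma^l_{ij}\,\partial_l, \partial_k) + g(\partial_j, \Gamma^m_{ik}\,\partial_m) \\
&= \Gamma^l_{ij}\, g_{lk} + \Gamma^m_{ik}\, g_{jm} \\
&= \partial_i g_{jk},
\end{align*}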

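where the last equality follows from (12.3). Thus, we have for vector fields $X = X_j\partial_j$, $Y = Y_i\partial_i$ and $Z = Z_k\partial_k$,

\begin{align*}
(X_j\partial_j)\, g(Y_i\partial_i, Z_k\partial_k) &= X_j\,\partial_j(Y_i Z_k\, g_{ik}) \\
&= X_j(\partial_j Y_i) Z_k\, g_{ik} + X_j Y_i (\partial_j Z_k)\, g_{ik} + X_j Y_i Z_k\,(\partial_j g_{ik}) \\
&= g(X_j(\partial_j Y_i)\,\partial_i, Z_k\partial_k) + g(Y_i\partial_i, X_j(\partial_j Z_k)\,\partial_k) \\
&\quad + X_j Y_i Z_k\,\big(g(\nabla_{\partial_j}\partial_i, \partial_k) + g(\partial_i, \nabla_{\partial_j}\partial_k)\big) \\
&= g\big(((X_j\partial_j) Y_i)\,\partial_i, Z_k\partial_k\big) + g(Y_i\,\nabla_{X_j\partial_j}\partial_i, Z_k\partial_k) \\
&\quad + g\big(Y_i\partial_i, ((X_j\partial_j) Z_k)\,\partial_k\big) + g(Y_i\partial_i, Z_k\,\nabla_{X_j\partial_j}\partial_k) \\
&= g(\nabla_{X_j\partial_j}(Y_i\partial_i), Z_k\partial_k) + g(Y_i\partial_i, \nabla_{X_j\partial_j}(Z_k\partial_k)).
\end{align*}

That is, the connection ∇ is compatible with the metric g. Therefore the connection with Christoffel symbols defined by (12.3) is a Levi-Civita connection. This finishes the proof of existence and uniqueness of the Levi-Civita connection.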

Example 12.8. Consider the manifold $\mathbb{R}^n$. We have seen that $D_X Y = \sum_i X(Y_i)\, \frac{\partial}{\partial x_i}$ is a connection. An easy computation shows D is the Levi-Civita connection on $\mathbb{R}^n$ with respect to the standard inner product on $\mathbb{R}^n$.
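The easy computation can be spelled out as follows. This is a sketch added for illustration; it writes $g = \sum_i dx_i \otimes dx_i$ for the standard metric, so that $g(Y, Z) = \sum_i Y_i Z_i$, and uses only the defining formula for D and the two conditions of Theorem 12.6.

```latex
% Torsion-free: the coefficients of D_X Y - D_Y X are those of the Lie bracket.
\[
  D_X Y - D_Y X
  = \sum_i \bigl( X(Y_i) - Y(X_i) \bigr) \frac{\partial}{\partial x_i}
  = [X, Y].
\]
% Metric: the Leibniz rule for the derivation X applied to \sum_i Y_i Z_i.
\[
  X\bigl(g(Y, Z)\bigr)
  = X\Bigl( \sum_i Y_i Z_i \Bigr)
  = \sum_i X(Y_i)\, Z_i + \sum_i Y_i\, X(Z_i)
  = g(D_X Y, Z) + g(Y, D_X Z).
\]
```

By the uniqueness part of Theorem 12.6, D is therefore the Levi-Civita connection of the standard metric. Equivalently, all $g_{ij} = \delta_{ij}$ are constant, so every Christoffel symbol of D vanishes.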

We end this section with a brief discussion of the geometric meaning of a connection being metric.

Definition 12.9. Let E →M be a vector bundle with a fiber metric g. A connection ∇ on E is metric if

X(g(s, s′)) = g(∇Xs, s′) + g(s,∇Xs′)

for all vector fields X and sections s, s′ ∈ Γ(E).

Definition 12.10. Let $V_1$, $V_2$ be two vector spaces with inner products $g_1$, $g_2$ respectively. A linear map $A : V_1 \to V_2$ is an isometry if

g2(Av,Aw) = g1(v, w)

for all v, w ∈ V1.

Lemma 12.11. If a connection ∇ is metric then the associated parallel transport is an isometry.

Proof. We will only prove the lemma for embedded curves and leave the general case as an exercise. If γ : [a, b] → M is an embedded curve, then locally any section σ : [a, b] → E along γ is of the form s ∘ γ. Let $v, w \in E_{\gamma(a)}$ be two vectors and $\sigma^v, \sigma^w : [a, b] \to E$ two parallel sections with $\sigma^v(a) = v$ and $\sigma^w(a) = w$. We want to prove that the function $t \mapsto g_{\gamma(t)}(\sigma^v(t), \sigma^w(t))$ is constant. For this it is enough to prove that its derivative is zero for all t. This condition is local in t, so we may assume, by the above remark, that $\sigma^v = s^v \circ \gamma$ and $\sigma^w = s^w \circ \gamma$ for some (local) sections $s^v, s^w$ of E. Then

\[ g_{\gamma(t)}(\sigma^v(t), \sigma^w(t)) = [g(s^v, s^w)](\gamma(t)). \]

Hence

\begin{align*}
\frac{d}{dt}\Big|_t\, g_{\gamma(t)}(\sigma^v(t), \sigma^w(t)) &= \dot\gamma\,(g(s^v, s^w)) \\
&= g(\nabla_{\dot\gamma} s^v, s^w) + g(s^v, \nabla_{\dot\gamma} s^w) \\
&= g(0, s^w) + g(s^v, 0) = 0.
\end{align*}

12.2. Connections induced on submanifolds. Let (M, g) be a Riemannian manifold and N ↪ M an embedded submanifold (think of a surface in $\mathbb{R}^3$). We will see that the embedding induces a Levi-Civita connection on N in two ways that turn out to be equivalent. It will also turn out that for surfaces in $\mathbb{R}^3$ the curvature of the induced connection is intimately related to Gauss curvature.

Suppose f : N → M is a map of manifolds. Then we can use f to pull back a metric g on M to a positive semi-definite symmetric bilinear form on N:

\[ (f^*g)_x(v, w) = g_{f(x)}(df_x v, df_x w) \]

for all x ∈ N, $v, w \in T_xN$. Moreover, if $df_x$ is injective then $(f^*g)_x$ is non-degenerate. Therefore if f : N → M is an immersion then $g_N := f^*g$ is a metric on N. The metric $g_N$ defines a Levi-Civita connection $\nabla^N$ on N.

Suppose now that f : N → M is an embedding. Then there is another way to induce a connection on N from a connection on M. First of all, for every point x ∈ N the tangent space $T_xM$ splits as an orthogonal direct sum with respect to $g_x$:

\[ T_xM = T_xN \oplus (T_xN)^\perp. \]

Hence there is an orthogonal projection

\[ \Pi_x : T_xM \to T_xN. \]

Globally $\nu := \bigsqcup_{x \in N} (T_xN)^\perp$ is a vector bundle, the normal bundle of the embedding of N into M. Hence globally the first equation says that the restriction $TM|_N$ is a direct sum of two bundles:

\[ TM|_N = TN \oplus \nu \]

and the second equation says that we have a bundle map

\[ \Pi : TM|_N \to TN. \]

Here is how one can see that $\Pi_x$ depends smoothly on x: Choose coordinates $\varphi = (x_1, \ldots, x_n, \ldots, x_m) : U \to \mathbb{R}^m$ on M near a point x ∈ N that are adapted to N. That is, $\varphi(N \cap U) = \varphi(U) \cap \{x_{n+1} = 0, \ldots, x_m = 0\}$. Apply Gram-Schmidt to the basis vectors $\frac{\partial}{\partial x_1}, \ldots, \frac{\partial}{\partial x_n}, \ldots, \frac{\partial}{\partial x_m}$ to obtain an orthonormal frame $e_1(x), \ldots, e_n(x), \ldots, e_m(x)$ on TU. Remember that every tangent space $T_xM$ has an inner product $g_x$ that depends smoothly on x. The Gram-Schmidt process is smooth in the inner product. Define the projection Π by

\[ \Pi_x(v) = \sum_{i=1}^n g_x(v, e_i(x))\, e_i(x). \]

Definition 12.12. Let N ⊂ M be an embedded submanifold. A vector field $\tilde X \in \Gamma(TM)$ is an extension of a vector field X ∈ Γ(TN) if

\[ \tilde X_x = X_x \]

for all x ∈ N. We will also say that $\tilde X$ is tangent to N.

Lemma 12.13. Let N ⊂ M be an embedded submanifold and X ∈ Γ(TN) a vector field. Then for any x ∈ N there is a neighborhood U ⊂ M of x and an extension $\tilde X \in \Gamma(TM|_U)$ of $X|_{N \cap U}$.

Proof. Let $(x_1, \ldots, x_n, \ldots, x_m) : U \to \mathbb{R}^m$ be coordinates on M adapted to N. Then $X = \sum_{i=1}^n X_i\, \frac{\partial}{\partial x_i}$, with $X_i$ being smooth functions on U ∩ N. Extend the $X_i$ to all of U by making them constant in $x_{n+1}, \ldots, x_m$. This extends X to all of U.

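Lemma 12.14. Let N ⊂ M be an embedded submanifold, X, Y ∈ Γ(TN) be two vector fields and $\tilde X, \tilde Y \in \Gamma(TM)$ their extensions. Then their Lie bracket $[\tilde X, \tilde Y]$ is tangent to N, hence is an extension of [X, Y].

Proof. We give two proofs. The first is computational. In coordinates $(x_1, \ldots, x_n, \ldots, x_m)$ on M adapted to N, $\tilde X = \sum_{i=1}^m \tilde X_i\, \frac{\partial}{\partial x_i}$ with $\tilde X_i(x) = 0$ for i > n for all x ∈ N. Similarly $\tilde Y = \sum_{i=1}^m \tilde Y_i\, \frac{\partial}{\partial x_i}$ with $\tilde Y_i(x) = 0$ for i > n for all x ∈ N. Since

\[ [\tilde X, \tilde Y] = \sum_{i,j} \tilde X_i\, \frac{\partial \tilde Y_j}{\partial x_i}\, \frac{\partial}{\partial x_j} - \sum_{i,j} \tilde Y_j\, \frac{\partial \tilde X_i}{\partial x_j}\, \frac{\partial}{\partial x_i}, \]

for i > n the coefficient in front of $\frac{\partial}{\partial x_i}$ vanishes at the points of N: at such points only derivatives in the directions $x_1, \ldots, x_n$ tangent to N occur, and they are applied to functions that vanish identically on N.

Here is a geometric proof. If $\tilde X$ is tangent to N, its flow $\varphi_t$ preserves N (maps it into itself). Hence its differential $d\varphi_t$ maps vectors tangent to N to vectors tangent to N. But $\tilde Y$ is tangent to N. Hence for any x ∈ N

\[ (d(\varphi_{-t})\tilde Y)_x \in T_xN \]

for all t. Differentiating with respect to t we get

\[ [\tilde X, \tilde Y]_x \in T_xN. \]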


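We now define a connection $\bar\nabla$ on a manifold N induced by its embedding into a Riemannian manifold (M, g) by

\[ \bar\nabla_X Y(x) := \Pi_x\big(\nabla_{\tilde X}\tilde Y(x)\big), \]

where x ∈ N is a point, X, Y ∈ Γ(TN) are two vector fields, $\tilde X, \tilde Y$ their (local) extensions to M, $\Pi_x : T_xM \to T_xN$ is the orthogonal projection and ∇ is the Levi-Civita connection on (M, g).

We need to make sure that $\bar\nabla$ is well-defined, that is, that $\bar\nabla_X Y(x)$ does not depend on the choice of the local extensions $\tilde X, \tilde Y$. By Corollary 11.10.2, $\nabla_{\tilde X}\tilde Y(x)$ depends only on $\tilde X_x = X_x$ and the values of $\tilde Y$ along the integral curve of $\tilde X$ through x. Since $\tilde X$ is tangent to N, this integral curve stays in N, where it is the integral curve of X and where $\tilde Y = Y$. Therefore $\nabla_{\tilde X}\tilde Y(x)$ depends only on $X_x$ and the values of Y along the integral curve of X through x. Hence $\bar\nabla$ is well-defined. Moreover, $\bar\nabla_X Y$ is clearly tensorial in the X slot. To see that it is a connection, let $f \in C^\infty(N)$ be a function and $\tilde f$ its (local) extension to M. Then, at the points of N,

\begin{align*}
\bar\nabla_X(fY) &= \Pi\big(\nabla_{\tilde X}(\tilde f\,\tilde Y)\big) = \Pi\big((\tilde X\tilde f)\,\tilde Y + \tilde f\,\nabla_{\tilde X}\tilde Y\big) \\
&= (Xf)\,\Pi(\tilde Y) + f\,\Pi(\nabla_{\tilde X}\tilde Y) = (Xf)\,Y + f\,\bar\nabla_X Y.
\end{align*}

We conclude that the induced connection $\bar\nabla$ is indeed a connection.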

Remark 12.15. The projection Π is really necessary in the definition of the induced connection. This is because even if vector fields $\tilde X$ and $\tilde Y$ are tangent to a submanifold N there is no reason for their covariant derivative $\nabla_{\tilde X}\tilde Y$ to be tangent to N. Here is an example:

Let $W = Z = x_2\, \frac{\partial}{\partial x_1} - x_1\, \frac{\partial}{\partial x_2}$ be two vector fields on $M = \mathbb{R}^2$. Let D denote the Levi-Civita connection on $\mathbb{R}^2$ for the standard metric $dx_1 \otimes dx_1 + dx_2 \otimes dx_2$. Then

\[ D_W Z = (W x_2)\, \frac{\partial}{\partial x_1} + (W(-x_1))\, \frac{\partial}{\partial x_2} = -x_1\, \frac{\partial}{\partial x_1} - x_2\, \frac{\partial}{\partial x_2}. \]

Let $N = S^1$. Then W and Z are tangent to N, hence are extensions of a vector field on N. But $D_W Z$ is orthogonal to $S^1$.
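The computation in this example can be checked numerically. The sketch below is illustrative only; it assumes the flat connection in the form D_X Y = Σ_i X(Y_i) ∂/∂x_i, approximates the directional derivative by a central difference, and uses hypothetical function names.

```python
import math

def W(p):
    """The rotation field W = x2 d/dx1 - x1 d/dx2, as a pair of components."""
    x1, x2 = p
    return (x2, -x1)

def flat_covariant_derivative(X, Y, p, eps=1e-6):
    """(D_X Y)(p): differentiate each component of Y in the direction X(p)."""
    v = X(p)
    plus = Y((p[0] + eps * v[0], p[1] + eps * v[1]))
    minus = Y((p[0] - eps * v[0], p[1] - eps * v[1]))
    return tuple((a - b) / (2 * eps) for a, b in zip(plus, minus))

# A point on the unit circle S^1.
p = (math.cos(0.7), math.sin(0.7))
DWW = flat_covariant_derivative(W, W, p)

# Component of D_W W along the tangent direction W(p) of the circle.
tangential = DWW[0] * W(p)[0] + DWW[1] * W(p)[1]
```

At a point p of the circle the result is −p, the inward pointing normal, and its tangential component vanishes. So the orthogonal projection of D_W W to the tangent line of the circle is zero, although D_W W itself is not.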

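Lemma 12.16. Let (M, g) be a Riemannian manifold and i : N → M an embedded submanifold. Then the connection $\bar\nabla$ induced on N by the Levi-Civita connection ∇ on M is the Levi-Civita connection for the pullback metric $g_N := i^*g$.

Proof. It is enough to check that

(1) $\bar\nabla$ is torsion-free and that
(2) $\bar\nabla$ is metric.

For all X, Y ∈ Γ(TN) and their local extensions $\tilde X, \tilde Y \in \Gamma(TM)$, Lemma 12.14 gives

\[ \bar\nabla_X Y - \bar\nabla_Y X = \Pi(\nabla_{\tilde X}\tilde Y - \nabla_{\tilde Y}\tilde X) = \Pi([\tilde X, \tilde Y]) = \Pi([X, Y]) = [X, Y]. \]

To show that $\bar\nabla$ is metric we need to check that

\[ Z(g_N(X, Y)) = g_N(\bar\nabla_Z X, Y) + g_N(X, \bar\nabla_Z Y) \]

for any vector fields X, Y, Z on N. At any point of N,

\begin{align*}
Z(g_N(X, Y)) &= \tilde Z(g(\tilde X, \tilde Y)) \\
&= g(\nabla_{\tilde Z}\tilde X, \tilde Y) + g(\tilde X, \nabla_{\tilde Z}\tilde Y) \\
&= g\big(\bar\nabla_Z X + (\nabla_{\tilde Z}\tilde X - \bar\nabla_Z X), Y\big) + g\big(X, \bar\nabla_Z Y + (\nabla_{\tilde Z}\tilde Y - \bar\nabla_Z Y)\big) \\
&= g(\bar\nabla_Z X, Y) + g(X, \bar\nabla_Z Y),
\end{align*}

since $\nabla_{\tilde Z}\tilde Y - \bar\nabla_Z Y$ and $\nabla_{\tilde Z}\tilde X - \bar\nabla_Z X$ are perpendicular to N.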


12.3. The second fundamental form of an embedding. As before let N ↪ M be an embedded submanifold of a Riemannian manifold (M, g). We want to understand how much N curves in M. We define a tensor, the second fundamental form $II_x : T_xN \times T_xN \to (T_xN)^\perp$, to measure the extrinsic geometry of N in M. We first define

\[ II : \Gamma(TN) \times \Gamma(TN) \to \Gamma(TN^\perp) \]

by

\[ II(X, Y) = \nabla_{\tilde X}\tilde Y - \bar\nabla_X Y, \]

where, as before, ∇ is the Levi-Civita connection on M, $\bar\nabla$ is the induced Levi-Civita connection on N, and $\tilde X, \tilde Y \in \Gamma(TM)$ are local extensions of the vector fields X, Y ∈ Γ(TN).

Proposition 12.17. The map II defined above is symmetric and tensorial.

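Proof. We first argue that II is symmetric:

\begin{align*}
II(X, Y) - II(Y, X) &= (\nabla_{\tilde X}\tilde Y - \bar\nabla_X Y) - (\nabla_{\tilde Y}\tilde X - \bar\nabla_Y X) \\
&= (\nabla_{\tilde X}\tilde Y - \nabla_{\tilde Y}\tilde X) - (\bar\nabla_X Y - \bar\nabla_Y X) \\
&= [\tilde X, \tilde Y] - [X, Y] = 0.
\end{align*}

Next we argue that II is tensorial in the first slot. Let $\tilde f$ be a local extension of a function f on N. Then at the points of N,

\[ II(fX, Y) = \nabla_{\tilde f\tilde X}\tilde Y - \bar\nabla_{fX} Y = \tilde f\,\nabla_{\tilde X}\tilde Y - f\,\bar\nabla_X Y = f\, II(X, Y). \]

By symmetry, II is tensorial in the second slot as well. It follows that for all points x ∈ N there is a symmetric bilinear map

\[ II_x : T_xN \times T_xN \to (T_xN)^\perp. \]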

Remark 12.18. In classical terminology the first fundamental form of an embedding is the induced metric.

Next suppose that the embedded submanifold N is a hypersurface, that is, that dim M − dim N = 1. Then the normal bundle $TN^\perp$ has 1-dimensional fibers, hence, locally, a frame of $TN^\perp$ is defined by one nowhere zero vector field. By rescaling, if necessary, we may assume that this vector field n has length 1 everywhere:

\[ g_x(n_x, n_x) = 1 \]

for all points x ∈ N. We furthermore make an extra assumption that the unit vector field n normal to N is defined on all of N. That is, N is orientable inside M. This is true for the sphere embedded in $\mathbb{R}^3$ but false for the central circle of the Möbius band inside the band. If N ⊂ M has a globally defined unit normal n, we can write

\[ II_x(v, w) = h_x(v, w)\, n_x \]

for a symmetric bilinear map $h_x : T_xN \times T_xN \to \mathbb{R}$. Unwinding the definitions we see that for any vector fields X, Y on N

\[ h(X, Y) = g(\nabla_{\tilde X}\tilde Y, n). \]

We will refer to $h \in \Gamma(T^*N \otimes T^*N)$ also as the second fundamental form. The second fundamental form h allows us to relate the curvature tensor R of the Levi-Civita connection on M, the Riemann curvature of M, and the curvature $\bar R$ of the induced connection on N:

Theorem 12.19. Let N ↪ M be an embedded orientable hypersurface of a Riemannian manifold (M, g). Let $h \in \Gamma((T^*N)^{\otimes 2})$ be the second fundamental form of the embedding. Then for any vector fields X, Y, Z, W ∈ Γ(TN)

\[ g(R(X, Y)Z, W) = g_N(\bar R(X, Y)Z, W) - h(Y, Z)\,h(X, W) + h(X, Z)\,h(Y, W), \tag{12.4} \]

where R is the Riemann curvature tensor of M and $\bar R$ is the induced Riemann curvature tensor of N.

We prove an easy lemma before tackling the computations involved in the proof of the theorem.

Lemma 12.20. Let (M, g), ∇, N, n and h be as above. Then

\[ h(X, W) = -g(\nabla_X n, W) \]

for any vector fields X, W ∈ Γ(TN) (here we didn’t bother with putting tildes on the extensions).

Proof. The function g(n, W) is identically 0 on N. Hence

\[ 0 = X(g(n, W)) = g(\nabla_X n, W) + g(n, \nabla_X W) \]

since ∇ is a metric connection. The second term on the right is h(X, W), which proves the claim.

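Proof of Theorem 12.19. Recall that

\[ R(X, Y)Z = \nabla_X(\nabla_Y Z) - \nabla_Y(\nabla_X Z) - \nabla_{[X,Y]}Z. \]

Since $\nabla_Y Z = \bar\nabla_Y Z + h(Y, Z)\,n$, where $\bar\nabla$ denotes the induced connection on N, we have

\begin{align*}
\nabla_X(\nabla_Y Z) &= \nabla_X\big(\bar\nabla_Y Z + h(Y, Z)\,n\big) \\
&= \bar\nabla_X(\bar\nabla_Y Z) + h(X, \bar\nabla_Y Z)\,n + (X h(Y, Z))\,n + h(Y, Z)\,\nabla_X n.
\end{align*}

Hence, by Lemma 12.20,

\[ g(\nabla_X(\nabla_Y Z), W) = g(\bar\nabla_X(\bar\nabla_Y Z), W) + h(Y, Z)\,g(\nabla_X n, W) = g(\bar\nabla_X(\bar\nabla_Y Z), W) - h(Y, Z)\,h(X, W). \tag{12.5} \]

Similarly,

\[ g(\nabla_Y(\nabla_X Z), W) = g(\bar\nabla_Y(\bar\nabla_X Z), W) - h(X, Z)\,h(Y, W), \tag{12.6} \]

while

\[ g(\nabla_{[X,Y]}Z, W) = g(\bar\nabla_{[X,Y]}Z, W). \tag{12.7} \]

Subtracting (12.6) and (12.7) from (12.5) we get (12.4).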

Let us see what the theorem tells us about the curvature of oriented surfaces in $\mathbb{R}^3$. If $N \subset \mathbb{R}^3$ is an oriented embedded surface, then the unit normal field n assigns to every point in N a unit vector in $\mathbb{R}^3$. Hence we can think of n as a map to the unit sphere,

\[ n : N \to S^2. \]

This is the Gauss map. Since $T_xN$ and $T_{n_x}S^2$ are two planes perpendicular to the same vector $n_x$, they are the same plane in $\mathbb{R}^3$. Therefore we may think of the differential $dn_x$ of the Gauss map as a map

\[ dn_x : T_xN \to T_xN. \]

Definition 12.21. The Gauss curvature κ of an oriented surface N in $\mathbb{R}^3$ is the determinant of the differential of the Gauss map:

\[ \kappa(x) = \det dn_x. \]

We compute a few examples of Gauss curvature by brute force.

Example 12.22. Consider

\[ N = \{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid x_3 = 0\}, \]

a plane. The normal vector field n(x) is constant, and so the Gauss curvature κ(x) is 0.

Example 12.23. Now let N be a round cylinder:

\[ N = \{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid x_2^2 + x_3^2 = R^2\}. \]

Here the unit normal n(x) is constant in the $x_1$ direction. Hence, $dn_x(e_1) = 0$, and so the Gauss curvature is again zero.

Example 12.24. Let N be the standard round sphere of radius R:

\[ N = \{(x_1, x_2, x_3) : x_1^2 + x_2^2 + x_3^2 = R^2\}. \]

Then the normal vector field n is given by $n(x) = \frac{1}{R}x$, hence

\[ dn = \frac{1}{R} \cdot \mathrm{id}. \]

Therefore

\[ \kappa(x) = \frac{1}{R^2}. \]

Note that the Gauss curvature is constant and positive. Also, the bigger the radius of the sphere the smaller the Gauss curvature. This makes sense since the sphere gets flatter as its radius increases.


In general one computes the Gauss curvature from the first and second fundamental form. Once again we denote the Levi-Civita connection on $\mathbb{R}^3$ by D. Then for any vector v and vector field $Y : \mathbb{R}^3 \to \mathbb{R}^3$

\[ D_v Y = dY(v). \]

Hence for any two vector fields X, Y on a surface N,

\[ h_x(X_x, Y_x) = -g_x((D_X n)(x), Y_x) = -g_x(dn_x(X_x), Y_x). \tag{12.8} \]

In particular the differential of the Gauss map is completely determined by the induced metric and the second fundamental form. We will see shortly that the Gauss curvature depends only on the metric g and its first and second partials. But first we extract Gauss curvature from the above equation.

Lemma 12.25. Let g be a positive definite inner product on a vector space V, h : V × V → R a symmetric bilinear map and S : V → V the linear map uniquely defined by

\[ h(v, w) = g(Sv, w). \]

Let $\{e_i\}$ be a basis of V. Then

\[ \det(h(e_i, e_j)) = \det(g(e_i, e_j))\, \det S. \]

Proof. The matrix $(s_{ki})$ of S with respect to the basis $\{e_i\}$ is defined by

\[ S e_i = \sum_k s_{ki}\, e_k. \]

Therefore

\[ h(e_i, e_j) = g(S e_i, e_j) = g\Big(\sum_k s_{ki}\, e_k, e_j\Big) = \sum_k s_{ki}\, g(e_k, e_j). \]

Therefore the matrix $(h(e_i, e_j))$ is the product of the transpose of the matrix $(s_{ki})$ and the matrix $(g(e_k, e_j))$. Since a matrix and its transpose have the same determinant,

\[ \det(h(e_i, e_j)) = \det(g(e_k, e_j))\, \det(s_{ki}). \]

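Together Lemma 12.25 above and (12.8) tell us how to compute the Gauss curvature. By (12.8) the map S of Lemma 12.25 is $-dn_x$, and since $T_xN$ is 2-dimensional, $\det(-dn_x) = \det dn_x$. Pick a basis $e_1, e_2$ of the tangent space $T_xN$. Then

\[ \kappa(x) = \frac{\det(h(e_i, e_j))}{\det(g(e_i, e_j))}. \]

In particular, if the basis $e_1, e_2$ is orthonormal with respect to the induced metric g,

\[ \kappa(x) = \det(h(e_i, e_j)). \]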
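The recipe κ = det(h(e_i, e_j)) / det(g(e_i, e_j)) can be tried out numerically for a parametrized surface r(u, v), using the basis e_1 = ∂r/∂u, e_2 = ∂r/∂v. The sketch below is an illustration under assumptions: derivatives are approximated by finite differences, h(e_i, e_j) is computed as the normal component of the second derivative of r (which is what h(X, Y) = g(D_X Y, n) gives for the flat connection D), and the parametrizations and function names are choices made here.

```python
import math

RADIUS = 2.0  # assumed radius for both test surfaces

def sphere(u, v):
    return (RADIUS * math.sin(u) * math.cos(v),
            RADIUS * math.sin(u) * math.sin(v),
            RADIUS * math.cos(u))

def cylinder(u, v):
    # The cylinder x2^2 + x3^2 = R^2, with u running along the x1 axis.
    return (u, RADIUS * math.cos(v), RADIUS * math.sin(v))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def gauss_curvature(r, u, v, eps=1e-3):
    """kappa = det(h)/det(g) in the coordinate basis r_u, r_v (finite differences)."""
    r0 = r(u, v)
    ru = tuple((p - m) / (2 * eps) for p, m in zip(r(u + eps, v), r(u - eps, v)))
    rv = tuple((p - m) / (2 * eps) for p, m in zip(r(u, v + eps), r(u, v - eps)))
    ruu = tuple((p - 2 * c + m) / eps ** 2
                for p, c, m in zip(r(u + eps, v), r0, r(u - eps, v)))
    rvv = tuple((p - 2 * c + m) / eps ** 2
                for p, c, m in zip(r(u, v + eps), r0, r(u, v - eps)))
    ruv = tuple((pp - pm - mp + mm) / (4 * eps ** 2)
                for pp, pm, mp, mm in zip(r(u + eps, v + eps), r(u + eps, v - eps),
                                          r(u - eps, v + eps), r(u - eps, v - eps)))
    normal = cross(ru, rv)
    length = math.sqrt(dot(normal, normal))
    n = tuple(c / length for c in normal)
    det_g = dot(ru, ru) * dot(rv, rv) - dot(ru, rv) ** 2
    det_h = dot(ruu, n) * dot(rvv, n) - dot(ruv, n) ** 2
    return det_h / det_g

kappa_sphere = gauss_curvature(sphere, 1.0, 0.5)
kappa_cylinder = gauss_curvature(cylinder, 0.3, 0.5)
```

The values agree, up to discretization error, with the sphere and cylinder examples: 1/R² for the sphere of radius R and 0 for the cylinder. The choice of sign of the unit normal does not matter, since det h is unchanged when h is replaced by −h on a 2-dimensional space.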

We are now ready to prove Gauss’ theorema egregium (“remarkable theorem”) from 1828!

Theorem 12.26. Let $N \hookrightarrow \mathbb{R}^3$ be an oriented embedded surface. Let $\bar R$ denote the Riemann curvature on N. Then the Gauss curvature κ is given by

\[ \kappa(x) = -g^N_x(\bar R_x(e_1, e_2)e_1, e_2) \]

where $e_1, e_2$ is a basis of $T_xN$ orthonormal with respect to the induced metric $g^N$. Hence the Gauss curvature depends only on the induced metric and its first and second partials and not on the embedding.

Proof. The Riemann curvature of the standard Levi-Civita connection D on $\mathbb{R}^3$ is 0. Hence, by Theorem 12.19,

\[ g^N_x(\bar R_x(e_1, e_2)e_1, e_2) = h_x(e_2, e_1)\,h_x(e_1, e_2) - h_x(e_1, e_1)\,h_x(e_2, e_2) = -\det(h_x(e_i, e_j)) = -\kappa(x). \]

The curvature of a connection depends on the Christoffel symbols and their first partials. The Christoffel symbols of a Levi-Civita connection are functions of the metric and its first partials. Hence κ depends only on the induced metric and its first and second partials.


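Exercise 12.1. Let f(x, y) be a smooth function on $\mathbb{R}^2$ and N its graph in $\mathbb{R}^3$:

\[ N = \{(x, y, f(x, y)) \mid (x, y) \in \mathbb{R}^2\}. \]

Show that the Gauss curvature κ is given by

\[ \kappa = \frac{f_{xx} f_{yy} - f_{xy}^2}{(1 + f_x^2 + f_y^2)^2}, \]

where $f_{xy} = \frac{\partial^2 f}{\partial x\, \partial y}$ and so on.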

13. Geodesics as critical points of the energy functional

This section is a brief excursion into the calculus of variations. The basic setup is this. Let M be a manifold. Consider the set P of all smooth paths from a fixed interval [a, b] to M with fixed end points:

\[ \mathcal{P} = \mathcal{P}([a, b], q_1, q_2) = \{\gamma : [a, b] \to M \mid \gamma(a) = q_1,\ \gamma(b) = q_2\}, \]

where $q_1, q_2 \in M$ are two points. Every path γ ∈ P gives rise to a path $\dot\gamma : [a, b] \to TM$. Therefore, a smooth function L : TM → R on the tangent bundle of M (a “Lagrangian”) defines a map (“action”)

\[ A : \mathcal{P} \to \mathbb{R}, \qquad A(\gamma) = \int_a^b L(\dot\gamma(t))\, dt. \]

For example, if g is a Riemannian metric on a manifold M then

\[ L(x, v) = \frac{1}{2}\, g_x(v, v), \qquad x \in M,\ v \in T_xM, \]

is a Lagrangian and the corresponding action

\[ A_L(\gamma) = \int_a^b \frac{1}{2}\, g_{\gamma(t)}(\dot\gamma(t), \dot\gamma(t))\, dt \]

is the “energy” of the path. The term “energy” comes from the fact that for a particle of mass m moving in $\mathbb{R}^3$ the quantity $\frac{1}{2}m(v_1^2 + v_2^2 + v_3^2) = \frac{1}{2}m\|v\|^2$ is the kinetic energy.

We want to make sense of a path γ ∈ P being critical for an action $A_L : \mathcal{P} \to \mathbb{R}$. This is a bit delicate since we have been careless with the topology on P and since P is infinite dimensional. The cheapest way to do it is by analogy with the finite dimensional case: a point is critical for a function f if and only if for every path σ(s) through the point (at s = 0), we have $\frac{d}{ds}\big|_{s=0} f(\sigma(s)) = 0$. Now, a path in the space P through $\gamma_0 \in \mathcal{P}$ is a family of curves $\gamma_s \in \mathcal{P}$ with $\gamma_s|_{s=0} = \gamma_0$, where s varies in some open interval (−ε, ε). We say that $\gamma_s$ depends smoothly on s if the map

\[ (-\varepsilon, \varepsilon) \times [a, b] \to M, \qquad (s, t) \mapsto \gamma_s(t) \]

is smooth.

Definition 13.1. Let $\mathcal{P} = \mathcal{P}([a, b], q_1, q_2)$ be a space of paths in a manifold M and L : TM → R a Lagrangian. A path $\gamma_0 \in \mathcal{P}$ is L-critical if for any smooth family $\gamma_s$ of paths through $\gamma_0$ we have

\[ \frac{d}{ds}\Big|_{s=0} \big(A_L(\gamma_s)\big) = 0, \]

where $A_L$ is the associated action.

A connection between variational problems and Riemannian geometry is provided by the following theorem.

Theorem 13.2. Let (M, g) be a Riemannian manifold and $L(x, v) = \frac{1}{2} g_x(v, v)$ the associated Lagrangian. A path γ is L-critical if and only if γ is a geodesic of the Levi-Civita connection.

We will first prove the theorem above locally, when the image of the path is contained in a coordinate chart. We will then show that any L-critical path is a geodesic. We will not have time to prove the converse. We start by examining what critical paths for an arbitrary Lagrangian look like locally.


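Theorem 13.3. Let $L : \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}$, $(x, v) \mapsto L(x, v)$, be a Lagrangian. A path $\gamma^0(t) = (\gamma^0_1(t), \ldots, \gamma^0_m(t)) : [a, b] \to \mathbb{R}^m$ is L-critical if and only if $\gamma = \gamma^0$ satisfies the Euler-Lagrange equations:

\[ \frac{d}{dt}\Big(\frac{\partial L}{\partial v_i}(\gamma(t), \dot\gamma(t))\Big) - \frac{\partial L}{\partial x_i}(\gamma(t), \dot\gamma(t)) = 0, \qquad 1 \le i \le m. \tag{13.1} \]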

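Proof. Let $\gamma_s(t) = \gamma(s, t) = (\gamma_1(s, t), \ldots, \gamma_m(s, t))$ be a variation of $\gamma^0$. Then $\gamma(0, t) = \gamma^0(t)$ for all t, and $\gamma(s, a) = \gamma^0(a)$, $\gamma(s, b) = \gamma^0(b)$ for all s. Hence

\[ h(t) := \frac{\partial}{\partial s}\Big|_{s=0} \gamma(s, t), \qquad h : [a, b] \to \mathbb{R}^m, \]

has to vanish at t = a and at t = b. It is important that there are no other restrictions on h: given an arbitrary smooth curve $h : [a, b] \to \mathbb{R}^m$ which vanishes at the endpoints,

\[ \gamma(s, t) := \gamma^0(t) + s\,h(t) \]

is a variation of $\gamma^0$. Note further that $\dot\gamma_s(t) = \frac{\partial}{\partial t}\gamma(s, t)$ and consequently

\[ \frac{\partial}{\partial s}\Big|_{s=0} \dot\gamma_s(t) = \frac{\partial^2\gamma}{\partial s\,\partial t}\Big|_{(0,t)} = \frac{d}{dt}\Big(\frac{\partial}{\partial s}\Big|_{s=0} \gamma(s, t)\Big) = \dot h(t). \]

Since $\gamma^0$ is L-critical,

\begin{align*}
0 &= \frac{d}{ds}\Big|_{s=0} \int_a^b L(\gamma(s, t), \dot\gamma(s, t))\, dt \\
&= \int_a^b \frac{\partial}{\partial s}\Big|_{s=0} L(\gamma(s, t), \dot\gamma(s, t))\, dt \\
&= \int_a^b \sum_i \Big(\frac{\partial L}{\partial x_i}(\gamma^0, \dot\gamma^0)\, \frac{\partial\gamma_i}{\partial s}\Big|_{s=0} + \frac{\partial L}{\partial v_i}(\gamma^0, \dot\gamma^0)\, \frac{\partial\dot\gamma_i}{\partial s}\Big|_{s=0}\Big)\, dt \\
&= \sum_i \int_a^b \Big(\frac{\partial L}{\partial x_i}(\gamma^0, \dot\gamma^0)\, h_i + \frac{\partial L}{\partial v_i}(\gamma^0, \dot\gamma^0)\, \dot h_i\Big)\, dt.
\end{align*}

Integration by parts gives

\[ \int_a^b \frac{\partial L}{\partial v_i}(\gamma^0, \dot\gamma^0)\, \dot h_i\, dt = \frac{\partial L}{\partial v_i}(\gamma^0, \dot\gamma^0)\, h_i\Big|_a^b - \int_a^b \frac{d}{dt}\Big(\frac{\partial L}{\partial v_i}(\gamma^0(t), \dot\gamma^0(t))\Big)\, h_i(t)\, dt, \]

and the boundary term vanishes since h(a) = h(b) = 0. Therefore

\[ 0 = \sum_i \int_a^b \Big(\frac{\partial L}{\partial x_i}(\gamma^0, \dot\gamma^0) - \frac{d}{dt}\Big(\frac{\partial L}{\partial v_i}(\gamma^0(t), \dot\gamma^0(t))\Big)\Big)\, h_i(t)\, dt. \]

Since the $h_i(t)$ are arbitrary, the equation above forces (13.1): see Lemma 13.4 below. Running the computations backwards we see that if $\gamma^0$ satisfies the Euler-Lagrange equations then $\gamma^0$ is L-critical.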

Lemma 13.4. If $f \in C^\infty([a, b])$ is a smooth function and if for any $h \in C^\infty([a, b])$ with h(a) = h(b) = 0 we have $\int_a^b f(t)h(t)\, dt = 0$, then f(t) ≡ 0.

Proof. Exercise.

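Proposition 13.5. Let g be a metric on $\mathbb{R}^m$ and $L(x, v) = \frac{1}{2} g_x(v, v)$ the associated Lagrangian. Then γ is L-critical if and only if it is a geodesic for the Levi-Civita connection defined by the metric g.

Proof. We have

\[ 2L(x, v) = \sum_{k,l} g_{kl}(x)\, v_k v_l. \]

Therefore, for each index i,

\[ 2\frac{\partial L}{\partial x_i} = \sum_{k,l} \frac{\partial g_{kl}}{\partial x_i}\, v_k v_l \]

and

\[ 2\frac{\partial L}{\partial v_i} = \sum_l g_{il}\, v_l + \sum_k g_{ki}\, v_k. \]

The Euler-Lagrange equations in this case then are

\[ \sum_{k,l} \frac{\partial g_{kl}}{\partial x_i}\, \dot\gamma_k \dot\gamma_l = \frac{d}{dt}\Big(\sum_l g_{il}\, \dot\gamma_l + \sum_k g_{ki}\, \dot\gamma_k\Big). \]

Differentiating and gathering the $\ddot\gamma_s$ terms on one side, we get:

\[ \sum_s g_{is}\, \ddot\gamma_s = -\frac{1}{2} \sum_{k,l} \Big(\frac{\partial g_{ki}}{\partial x_l} + \frac{\partial g_{il}}{\partial x_k} - \frac{\partial g_{kl}}{\partial x_i}\Big)\, \dot\gamma_l \dot\gamma_k. \tag{13.2} \]

Here we used the fact that $g_{is} = g_{si}$; this is where the $\frac{1}{2}$ comes from. As before we denote the entries of the inverse of the matrix $(g_{\alpha\beta})$ by $g^{\alpha\beta}$, so that $\sum_\beta g^{\alpha\beta} g_{\beta\gamma} = \delta^\alpha_\gamma$. Therefore if we multiply both sides of (13.2) by $g^{ji}$ and sum on i we get

\[ \ddot\gamma_j = -\frac{1}{2} \sum_{i,k,l} g^{ji} \Big(\frac{\partial g_{ki}}{\partial x_l} + \frac{\partial g_{il}}{\partial x_k} - \frac{\partial g_{kl}}{\partial x_i}\Big)\, \dot\gamma_l \dot\gamma_k = -\sum_{k,l} \Gamma^j_{kl}\, \dot\gamma_k \dot\gamma_l, \]

where $\Gamma^j_{kl}$ are the Christoffel symbols for the Levi-Civita connection (cf. (12.3)). We now see that this is the geodesic equation (11.8). Thus, L-critical curves are geodesics and vice versa.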
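The statement that geodesics are critical for the energy can be probed numerically by discretizing the action. The sketch below is an illustration only and makes several assumptions: the metric is the flat metric in polar coordinates, dr² + r² dθ², the integral is replaced by a midpoint rule on a uniform grid of [0, 1], and the derivative in s is replaced by a central difference. The names `energy`, `first_variation`, `line`, `detour` and `bump` are hypothetical.

```python
import math

def polar_metric(p):
    """Assumed example metric: the flat plane in polar coordinates (r, theta)."""
    r, _theta = p
    return [[1.0, 0.0], [0.0, r * r]]

def energy(samples, metric):
    """Midpoint-rule approximation of the energy of a sampled path [(t, point), ...]."""
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        dt = t1 - t0
        mid = [(a + b) / 2 for a, b in zip(p0, p1)]
        vel = [(b - a) / dt for a, b in zip(p0, p1)]
        g = metric(mid)
        n = len(vel)
        total += 0.5 * sum(g[i][j] * vel[i] * vel[j]
                           for i in range(n) for j in range(n)) * dt
    return total

def first_variation(curve, bump, metric, n=2000, s=1e-4):
    """Central difference in s of the energy of the variation curve + s * bump."""
    ts = [k / n for k in range(n + 1)]

    def sampled(sign):
        return [(t, [c + sign * s * h for c, h in zip(curve(t), bump(t))])
                for t in ts]

    return (energy(sampled(+1), metric) - energy(sampled(-1), metric)) / (2 * s)

def line(t):
    # The straight segment x = 1, y = t written in polar coordinates.
    return (math.sqrt(1 + t * t), math.atan(t))

def detour(t):
    # Another path with the same endpoints that is not a straight line.
    return (1 + (math.sqrt(2) - 1) * t, math.pi / 4 * t)

def bump(t):
    # A variation direction vanishing at both endpoints t = 0 and t = 1.
    return (math.sin(math.pi * t), 0.0)

d_line = first_variation(line, bump, polar_metric)
d_detour = first_variation(detour, bump, polar_metric)
```

Up to discretization error the derivative along the straight segment vanishes, while for the detour it is clearly nonzero. A single variation direction can of course only detect failure of criticality; it does not prove that the segment is critical, which is what the proposition does.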

The result for Lagrangians on Rn, Theorem 13.3, and the corresponding result for geodesics, Propo-sition 13.5, generalize to the manifold setting. To be precise, recall that if (x1, . . . , xn) : U → Rm is acoordinate chart on a manifold M , then it defines an associated coordinate chart (x1, . . . , xm, v1, . . . , vm) :TU → Rm × Rm on the tangent bundle of M . Namely, if q ∈ U is a point and w ∈ TqU = TqM is a vector,then there are unique numbers v1 = v1(w), . . . , vm = vm(w) so that

\[
  w = \sum_i v_i(w)\, \frac{\partial}{\partial x_i}\Big|_q ,
\]
since $\Big\{ \frac{\partial}{\partial x_i}\Big|_q \Big\}$ is a basis of $T_qM$. Of course, $v_i(w) = (dx_i)_q(w)$.

Proposition 13.6. Let $M$ be a manifold and $L : TM \to \mathbb{R}$ a Lagrangian. If a path $\gamma_0 : [a, b] \to M$ lies entirely inside a coordinate chart $(x_1, \dots, x_m) : U \to \mathbb{R}^m$ (i.e., $\gamma_0([a, b]) \subset U$), then $\gamma_0$ is $L$-critical if and only if
\[
  (\gamma^0_1(t), \dots, \gamma^0_m(t), \dot\gamma^0_1(t), \dots, \dot\gamma^0_m(t))
  := (x_1 \circ \gamma_0(t), \dots, x_m \circ \gamma_0(t), v_1 \circ \dot\gamma_0(t), \dots, v_m \circ \dot\gamma_0(t))
\]
satisfies the Euler-Lagrange equations. Here, as above, $(x_1, \dots, x_m, v_1, \dots, v_m) : TU \to \mathbb{R}^m \times \mathbb{R}^m$ is the coordinate chart on the tangent bundle $TM$ associated with the chart $(x_1, \dots, x_m) : U \to \mathbb{R}^m$ on the manifold $M$.

Proof. The only possible concern is that the image of a variation $\gamma_s$ of our curve $\gamma_0$ lies outside the domain $U$ of our coordinate chart. But we only care about $\gamma_s$ for $s$ small, and for small values of the parameter $s$ the variation $\gamma_s(t)$ is close to $\gamma_0(t)$, hence lies in $U$.

From Propositions 13.5 and 13.6 we deduce:

Corollary 13.6.1. Let $M$ be a manifold with a Lagrangian $L$. A path $\gamma_0 : [a, b] \to M$ lying inside a coordinate chart on $M$ is a geodesic for a Riemannian metric $g$ if and only if $\gamma_0$ is critical for the energy Lagrangian $L(x, v) = \frac{1}{2} g_x(v, v)$.
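The corollary can be illustrated numerically in a single chart. The sketch below is an illustration under assumptions that are not in the notes: the metric is the flat metric in polar coordinates, so $L = \frac{1}{2}(\dot r^2 + r^2 \dot\theta^2)$; the paths, the variation field `bump`, and all function names are hypothetical. It compares the first variation of the energy along a unit-speed straight segment (a geodesic) with that along a circular arc joining the same endpoints (not a geodesic).

```python
import math

# Sketch: first variation of the energy functional in polar coordinates.
# Assumptions (not from the notes): flat metric g = dr^2 + r^2 dtheta^2,
# midpoint-rule integration, central difference in the variation parameter s,
# and all names below are made up for illustration.

def energy(path, a, b, n=4000):
    """Midpoint-rule approximation of int_a^b 1/2 (r'^2 + r^2 theta'^2) dt.

    path(t) returns the tuple (r, theta, r_dot, theta_dot).
    """
    dt = (b - a) / n
    total = 0.0
    for j in range(n):
        t = a + (j + 0.5) * dt
        r, _theta, r_dot, theta_dot = path(t)
        total += 0.5 * (r_dot ** 2 + (r * theta_dot) ** 2) * dt
    return total

def line(t):
    """The straight segment x = 1, y = t, -1 <= t <= 1, in polar coordinates."""
    r = math.sqrt(1.0 + t * t)
    return r, math.atan(t), t / r, 1.0 / (1.0 + t * t)

def arc(t):
    """Circular arc of radius sqrt(2) with the same endpoints as `line`."""
    return math.sqrt(2.0), math.pi * t / 4.0, 0.0, math.pi / 4.0

def bump(t):
    """Variation field (h_r, h_theta, h_r', h_theta'); h vanishes at t = -1, 1."""
    return (1.0 - t * t, math.sin(math.pi * t),
            -2.0 * t, math.pi * math.cos(math.pi * t))

def first_variation(path, delta=1e-4):
    """Central-difference estimate of d/ds E(path + s * bump) at s = 0."""
    def varied(s):
        return lambda t: tuple(p + s * h for p, h in zip(path(t), bump(t)))
    e_plus = energy(varied(delta), -1.0, 1.0)
    e_minus = energy(varied(-delta), -1.0, 1.0)
    return (e_plus - e_minus) / (2 * delta)

dE_line = first_variation(line)  # expected to be numerically close to zero
dE_arc = first_variation(arc)    # expected to be clearly nonzero
```

Testing against a single variation field only gives evidence of criticality, not a proof, since $L$-criticality requires the first variation to vanish for every variation with fixed endpoints.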

What about $L$-critical paths whose images cannot be covered by a single coordinate chart? Suppose $\gamma : [a, b] \to M$ is $L$-critical and for some time $t_0$ the point $\gamma(t_0)$ lies in a coordinate chart $(x_1, \dots, x_m) : U \to \mathbb{R}^m$. Then $\gamma([a', b']) \subset U$ for some subinterval $[a', b'] \subset [a, b]$ containing $t_0$. Any variation of $\gamma|_{[a',b']}$ is a variation of $\gamma$. Hence $\gamma|_{[a',b']}$ is also $L$-critical. Therefore it satisfies the Euler-Lagrange equations in the chart $U$. In particular, if $\gamma$ is critical for the energy Lagrangian, then $\gamma$ is a geodesic in every coordinate chart, hence a geodesic. This proves one direction of Theorem 13.2 globally, as promised.

The converse is true as well, but this requires a coordinate-free description of $L$-critical curves which we don’t have time for.
