stochastic analysis on manifolds - math.bu.edumath.bu.edu/people/prakashb/math/stochmani.pdf ·...

Stochastic Analysis on Manifolds

Prakash Balachandran

Department of Mathematics

Duke University

September 21, 2008

These notes are based on Hsu’s Stochastic Analysis on Manifolds, Kobayahi and Nomizu’s Foundations

of Differential Geometry Volume I, and Lee’s Introduction to Smooth Manifolds and Riemannian Mani-

folds.

1 Chapter 1

1.1 SDE’s on Euclidean Space

1.1.1 A Review of Basic Results

This section reviews the theory of SDE’s in Euclidean space. All theorems are presented without proofs.

See Hsu chapter 1.1 for details.

Throughout this section, we concern ourselves with the SDE:

Xt = X0 +∫ t

0σ(Xs)dZt (1)

where Zt is an Rn valued F∗-semimartingale [henceforth, this always means adapted to the filtration

F∗ = Ftt≥0].

A classical result of the existence and uniqueness of a solution to this SDE is given by

Theorem 1 Suppose that σ is globally Lipschitz and X0 is square integrable. Then, the SDE (1) has a

unique solution X = X0 : t ≥ 0

For a proof, see Hsu, page 9.

1

In general, when σ is locally Lipschitz, we have to allow the possibility of explosion.

Example 1: Consider the SDE:

dXt = X2t dt, X0 = 1.

Clearly, this has the solution Xt = 11−t which explodes at time t = 1.

To take care of this case and transfer the setting to manifolds, we establish some notation and introduce

some definitions.

Definition: Let M be a locally compact metric space, and let M = M ∪ ∂M be the one-point com-

pactification of M .

An M -valued path Xt with exploding time e = e > 0 is a continuous map X : [0,∞) → M such that

Xt ∈M for 0 ≤ t < e and Xt = ∂M for all t ≥ e if e <∞.

The space ofM -valued paths with explosion time is called the path space ofM and is denoted byW (M).

When we consider processes Xt(ω), we can define e(ω) pathwise, but it’s not immediately clear that e

is a stopping time.

For our purposes, recall that an exhaustion of a locally compact metric spaceM is a sequence of relatively

compact open sets UN such that UN ⊆ UN+1 and M =⋃∞N=1 UN .

Theorem 2 1. Let UN be an exhaustion, and τUN be the first exit time from UN . Then, τUN ↑ e as

N ↑ ∞.

2. Suppose that d : M ×M → R+ is a metric on M with the property that every bounded closed

set is compact. Fix a point o ∈ M . Let τR be the first exit time from the ball B(R) = x ∈ M :

d(x, o) ≤ R. Then, τR ↑ e as R ↑ ∞.

Part 1 of theorem 2 guarantees us that the explosion time of a process taking values in a Riemannian

manifold is indeed a stopping time (use the distance function on the manifold to define the exhaustion

UN).

We will use part 2 of theorem 2 in two situations:

1. M is a complete Riemannian manifold and d is the Riemannian distance function

2. M is embedded as a closed submanifold in another complete Riemannian manifold, and d is the

Riemannian metric of the ambient space.

2

Finally, we define what it means to be a semimartingale up to a stopping time, what it means for a process

to solve an SDE up to a stopping time, and give another existence and uniqueness result.

Definition: Let (Ω,F∗, P ) be a filtered probability space and τ be an F∗-stopping time. A continuous

process X defined on the stochastic time interval [0, τ) is called an F∗-semimartingale up to τ if there

exists a sequence of F∗-stopping times τn ↑ τ such that for each n, the stopped processes Xt∧τn are

semimartingales.

Definition: A semimartingaleX up to a stopping time τ is a solution of the SDE (1) if there is a sequence

of stopping times τn ↑ τ such that for each n the stopped process Xτn = Xt∧τn is a semimartingale, and

Xt∧τn = X0 +∫ t∧τn

0σ(Xs)sZs, llt ≥ 0.

Theorem 3 Suppose that we are given:

1. A locally Lipschitz coefficient matrix σ : RN →M(N,n)

2. An Rn valued F∗-semimartingale Z = Zt : t ≥ 0 on a filtered probability space (Ω,F∗, P )

3. An Rn valued F0-measurable random variable X0.

Then, there is a unique W(RN)-valued random variable X which is a solution to (1) up to its explosion

time e.

Theorem 4 Suppose that σ is locally Lipschitz. Let X and Y be two solutions of the SDE (1) up to

stopping times τ and η respectively. Then, Xt = Yt for 0 ≤ t < τ ∧ η. In particular, if X is a solution

up to its explosion time e, then η ≤ e and Xt = Yt for 0 ≤ t < η.

Theorem 5 Suppose that σ and b are locally Lipschitz. Then the weak uniqueness property holds for the

SDE

Xt = X0 +∫ t

0σ(Xs)dWs +

∫ t

0b(Xs)ds;

That is, if X is the solution of

Xt = X0 +∫ t

0σ(Xs

)dWs +

∫ t

0b(Xs

)ds,

where Ws is a Brownian motion defined on another filtered probability space (Ω, F∗, P ) and X0 ∈ F0

has the same law as X0, then X and X have the same law.

Computationally, the following result is useful:

Proposition 1 Suppose that Z is defined on [0,∞). If σ is locally Lipschitz, and there is a constant C

such that

|σ(x)| ≤ C(1 + |x|)

3

then the solution of (1) does not explode.

Finally, we give the existence and uniqueness result of a solution to (1) when Z is a semimartingale

defined up to a stopping time τ :

Theorem 6 Let Z be a semimartingale defined up to a stopping time τ . Then, there is a unique solution

X to the SDE (1) up to the stopping time eX ∧ τ . If Y is another solution up to a stopping time η ≤ τ

then η ≤ eX ∧ τ and Xt = Yt for 0 ≤ t < eX ∧ η.

1.1.2 Stratonovich vs. Ito

In this section, we briefly review the difference between the Ito formalism and the Stratonovich formalism

of the stochastic integral, and proceed to determine how they are equivalent. Throughout, we take as

given the Ito machinery, and proceed to derive the equivalent Stratonovich machinery.

The Ito integral of a left continuous predictable function H w.r.t. a continuous semimartingale Z =

M + A is roughly defined to be a left-endpointed integral of the paths of H over those of Z (of course

care must be taken when considering the stopping times defining the continuous local martingale M ).

The Stratonovich integral is like the Ito integral except that the integral is taken w.r.t. midpoints. More

precisely, the two are related by:∫ t

0Hs(ω) dZs(ω) =

∫ t

0Hs(ω)dZs(ω) +

12〈H,Z〉s

where 〈H,Z〉s denotes the covariance process (otherwise known as the bracket of H and Z).

Now, consider the Stratonovich SDE (Note: we employ Einstein summation notation throughout; that is,

repeated indices are always summed over):

Xt = X0 +∫ t

0Vα(Xs) dZαs (2)

where X0 is an RN -valued random variable, Zs = (Zαs )nα=1 is an Rn-valued semimartingale, and V =

Vαnα=1 are n smooth vector fields on RN .

How do we recast this SDE in terms of the Ito integral? Well, in components:

Xit = Xi

0 +∫ t

0V iα(Xs) dZαs = Xi

0 +∫ t

0V iα(Xs)dZαs +

12⟨V iα(X), Zα

⟩s.

Using Ito’s formula:

V iα(Xt) = V i

α(X0) +∫ t

0

∂V iα

∂xk(Xs)dXk

s +∫ t

0

∂2V iα

∂xk∂xl(Xs)d

⟨Xk, X l

⟩s

4

so that:

Xit = Xi

0+∫ t

0V iα(Xs)dZαs +

12

⟨V iα(X0) +

∫ t

0

∂V iα

∂xk(Xs)dXk

s +∫ t

0

∂2V iα

∂xk∂xl(Xs)d

⟨Xk, X l

⟩s, Zα

⟩s

= Xi0 +

∫ t

0V iα(Xs)dZαs +

12

⟨∫ t

0

∂V iα

∂xk(Xs)dXk

s , Zα

⟩s

= Xi0 +

∫ t

0V iα(Xs)dZαs +

12

⟨∫ t

0

∂V iα

∂xk(Xs)V k

β (Xs) dZβs , Zα⟩s

= Xi0 +

∫ t

0V iα(Xs)dZαs +

12

⟨∫ t

0

∂V iα

∂xk(Xs)V k

β (Xs)dZβs ,∫ t

0dZα

⟩s

= Xi0 +

∫ t

0V iα(Xs)dZαs +

12

∫ t

0

∂V iα

∂xk(Xs)V k

β (Xs)d⟨Zβ, Zα

⟩s

Recalling the definition of the Euclidean connection:(∇VβVα

)i

= Vβ(V iα) = V k

β∂V iα∂xk

, we have finally:

Xit = Xi

0 +∫ t

0V iα(Xs)dZαs +

12

∫ t

0

(∇VβVα

)i(Xs)d

⟨Zα, Zβ

⟩s

(3)

⇔ Xt = X0 +∫ t

0Vα(Xs)dZα +

12

∫ t

0∇VβVαd

⟨Zα, Zβ

⟩s

(4)

Thus, the equations (2) and (4) are equivalent.

Proposition 2 Let X be a solution to the equation (2), and f ∈ C2(Rd). Then:

f(Xt) = f(X0) +∫ t

0Vαf(Xs) dZαs , 0 ≤ t < eX .

Proof of Proposition 2: From the preceding argument, we may assume the equivalence of the SDE’s (2)

and (4), and hence may assume (3).

Ito’s formula implies:

df(Xs) =∂f

∂xi(Xs)dXi

s +12

∂2f

∂xi∂xj(Xs)d

⟨Xi, Xj

⟩s

=∂f

∂xi(Xs)

[V iα(Xs)dZαs +

12(∇VβVα

)i(Xs)d

⟨Zα, Zβ

⟩s

]+

12

∂2f

∂xi∂xj(Xs)V i

α(Xs)Vjβ (Xs)d

⟨Zα, Zβ

⟩s

= V iα(Xs)

∂f

∂xi(Xs)dZαs +

12V jβ (Xs)

∂V iα

∂xj(Xs)

∂f

∂xi(Xs)d

⟨Zα, Zβ

⟩s+

12

∂2f

∂xi∂xj(Xs)V i

α(Xs)Vjβ (Xs)d

⟨Zα, Zβ

⟩s

= V iα(Xs)

∂f

∂xi(Xs)dZαs +

12V jβ (Xs)

(∂

∂xj

(V iα(Xs)

∂f

∂xi(Xs)

)− V i

α(Xs)∂2f

∂xi∂xj(Xs)

)d⟨Zα, Zβ

⟩s

+12

∂2f

∂xi∂xj(Xs)V i

α(Xs)Vjβ (Xs)d

⟨Zα, Zβ

⟩s

5

= V iα(Xs)

∂f

∂xi(Xs)dZαs +

12V jβ (Xs)

∂

∂xj

(V iα(Xs)

∂f

∂xi(Xs)

)d⟨Zα, Zβ

⟩s

where in the fourth line, we have used the product rule on V iα∂f∂xi

. Comparing this final expression with

(4), and backtracking through the proof of it, we find that upon integrating from 0 to t:

f(Xt) = f(X0) +∫ t

0Vαf(Xs) dZαs , 0 ≤ t < eX .

6

1.2 SDE’s on manifolds

Recall that in geometry, we always define objects on a manifold through coordinate charts.

1. LetM be a smooth, connected, manifold of dimension m. We define a function f : M → R to

be smooth at p ∈ M if there exists a coordinate chart (Uα, φα) such that p ∈ Uα and f φ−1α :

Rm → R is smooth at φ(p).

f is smooth on an open set U ⊆M if f is smooth at every point p ∈ U .

Clearly, if this definition holds in any one coordinate chart, it holds in all coordinate charts within

the given differentiable structure of the manifold.

2. In addition to the notation above, let N is an n dimensional, connected, smooth manifold. We say

that a map Φ :M→ N is smooth at p ∈ M is there exists coordinate charts (Uα, φα), (Vβ, ψβ)

ofM and N respectively, such that Φ(p) ∈ Vβ and the map: ψβ f φ−1α : Rm → Rn is smooth

at φα(p).

Again, it’s clear that this definition is invariant of the charts chosen [within respective differentiable

structures].

Thus, it’s clear that we should make the definition:

Definition 1 Let M be a differentiable manifold and (Ω,F∗, P ) be a filtered probability space with

filtration F∗. Let τ be a F∗-stopping time. A continuous,M-valued process X defined on [0, τ) is called

anM-valued semimartingale if f(X) is a real-valued semimartingale on [0, τ) for all f ∈ C∞(M).

It’s easy to see that when M = Rn, this definition gives the usual definition of an Rm valued semi-

martingale.

A stochastic differential equation on a manifoldM is defined by n vector fields V1, . . . , Vn onM, a Rn-

valued semimartingale Z, and anM-valued random variable X0 ∈ F0 serving as an initial condition.

We write the equation symbolically as:

dXt = Vα(Xt) dZαs . (5)

In light of Proposition 2, we make the following definition:

Definition: AnM-valued semimartingale X defined up to a stopping time τ is a solution of the SDE (5)

up to τ if for all f ∈ C∞(M),

f(Xt) = f(X0) +∫ t

0Vαf(Xs) dZαs , 0 ≤ t < τ.

7

The advantage of the Stratonovich formulation is that stochastic differential equations on manifolds in

this formalism transform consistently under diffeomorphisms between manifolds.

Proposition 3 Suppose that Φ :M→N is a diffeomorphism and X is a solution of the SDE (5). Then,

Φ(X) is a solution of the SDE:

dYt = Φ∗Vα dZαs , Y0 = Φ(X0). (6)

Proof: Let Y = Φ(X), and f ∈ C∞(N ). Then, Φ∗f = f Φ ∈ C∞(M). Since X solves (5), we have

that

(f Φ)(Xt) = (f Φ)(X0) +∫ t

0Vα(f Φ)(Xs) dZαs

⇒ f(Yt) = f(Y0) +∫ t

0(Φ∗Vα)f(Ys) dZαs .

Hence, since f was arbitrary, Y = Φ(X) is a solution to the SDE (6).

[Insert remark about Ito integral here]

Now, we want a unique solution to exist for the SDE (5). To see that this is in fact true, we reduce the

equation to an SDE in Euclidean space by embedding the manifold into RN for some N . Then, we will

use the results stated in section 1 about the existence and uniqueness of SDE’s in Euclidean space.

Towards this end, we have:

Theorem 7 (Whitney’s embedding theorem) Suppose thatM is a differentiable manifold. Then, there

exists an embedding i :M→ RN for some N such that the image i(M) is a closed subset of RN .

It is well-known that N = 2 dimM+ 1 will suffice.

Now, fix an embedding ofM into RN and regardM as a closed submanifold of RN . By the smooth

Urysohn lemma:

Proposition 4 Let U ⊆ M be an open subset of a smooth manifold and K ⊆ U a subset that is closed

inM. Then, there is a smooth function f :M→ R such that f |K ≡ 1 and supp(f) ⊂ U .

We identifyM with its embedding i(M) ⊆ RN .

Now, we can ”fatten” the manifold M slightly so that each Vα is smoothly defined on the same open

subset U containingM. Applying proposition 4, we obtain a smooth function f : RN → R such that

f |M ≡ 1 and supp(f) ⊆ U , and hence obtain vector fields Vα = fVα defined on all of RN .

8

From section 1, we know that the equation

Xt = X0 +∫ t

0Vα(Xs) dZαs (7)

on RN has a unique solution X up to its exploding time eX .

We need to know that ifX starts onM then it never leavesM, which makes sense since the vector fields

Vα are tangent toM onM. This is given by:

Proposition 5 Let X be the solution of the extended equation up to its explosion time eX , and let X0 ∈M. Then, Xt ∈M for 0 ≤ t < eX .

See Hsu page 22-23 for the proof.

Finally, we proceed to prove the existence and uniqueness result:

Theorem 8 There is a unique solution of the SDE (5) up to its explosion time.

Proof of theorem 8: By proposition 5, the solution to the extended SDE (7) stays in M up to its

exploding time.

To show that this solves SDE (5), we prove the following

Lemma 1 Suppose thatM is a closed submanifold of RN . Let f1, . . . , fN be the coordinate functions.

Let X be anM-valued continuous process.

1. X is a semimartingale onM iff it is an RN -valued semimartingale, or equivalently, iff f i(X) is a

real-valued semimartingale for each i = 1, . . . , N .

2. X is a solution of the SDE (5) up to a sotpping time σ iff for each i = 1, . . . , N ,

f i(Xt) = f i(X0) +∫ t

0Vαf

i(Xs) dZαs , 0 ≤ t < σ (8)

Proof of Lemma 1:

1) Suppose thatX is anM-valued semimartingale. Each f i is a smooth function onM, so that f i(X) =

Xi is a real-valued semimartingale. Thus, X is a RN -valued semimartingale.

Conversely, suppose that X lives onM and is a RN -valued semimartingale. SinceM ⊆ RN is closed,

we may use Urysohn’s lemma to extend f ∈ C∞(M) to f ∈ C∞(RN). Hence, f(X) = f(X) is a

real-value semimartingale, and so by definition, X is anM-valued semimartingale.

2) If X is a solution to SDE (5), then (8) must hold since f i ∈ C∞(M.

If (8) holds, and f ∈ C∞(M), take an extension f ∈ C∞(RN)

of f . Then,

f(Xt) = f(f1 (Xt) , . . . , fN (Xt)

).

9

Thus, by Ito’s formula:

df(Xt) =∂f

∂xi(f1(Xt), . . . , fN (Xt)) df i(Xt)

=∂f

∂xi(f1(Xt), . . . , fN (Xt)) Vαf i(Xt) dZαt

=∂f

∂xi(X1

t , . . . , XNt )Vαf i(Xt) dZαt

= Vαf(Xt) dZα

where in the last step, we have used the chain rule for differentiating composite functions. 4

Since (7) is a rewriting of (8), lemma 1 implies that X is a solution of SDE (5).

If Y is anoher solution up to a stopping time τ , then as a RN -valued semimartingale, it also solves SDE

(7) up to τ . By the uniqueness statement of theorem 4, Y must coincide with X on [0, τ).

Existence and uniqueness of solutions to SDE (5) will be very important in the stochastic lifting process

in Chapter 2 of Hsu.

10

1.3 Diffusion Processes

This section briefly discusses diffusion processes, which will take a prominent role in chapter 3 of Hsu

when we discuss Brownian motion on manifolds.

Throughout this section L denotes a smooth second order elliptic, but not necessarily nondegenerate,

differential operator on a differentiable manifoldM. If f ∈ C2(M) and ω ∈W (M), we let:

Mf (ω)t = f(ωt)− f(ω0)−∫ t

0Lf(ωs)ds, 0 ≤ t < eω.

Definition:

1. An F∗-adapted stochastic process X : Ω → W (M) defined on a filtered probability space

(Ω,F∗, P ) is called a diffusion process generated by L (or simple an L-diffusion) if X is an

M-valued F∗-semimartingale up to eX and

M t(X)t = f(Xt)− f(X0)−∫ t

0Lf(Xs)ds, 0 ≤ t < eX

is a local F∗-martingale for all f ∈ C∞(M).

2. A probability measure µ on the standard filtered space (W (M),B(W (M))∗) is called a diffusionmeasure generated by L (or simply an L-diffusion measure) if

Mf (ω)t = f(ωt)− f(ω0)−∫ t

0Lf(ωs)ds, 0 ≤ t < eω

is a local B(W (M))∗-martingale for all f ∈ C∞(M).

Remark: While at first it appears that the measure µ doesn’t appear anywhere in this definition,

notice that that the property of being a B(W (M))∗-martingale depends on the measure, since

different measure might be defined on different σ-algebras.

For a given L, the relation between L-diffusion measures and L-diffusion processes is as follows. If

X is an L-diffusion, then its law µX =) X−1 is an L-diffusion measure. Conversely, if µ is an L-

diffusion measure on W (M), then the coordinate process X(ω)t = ωt on (W (M),B(W (M))∗, µ) is

an L-diffusion process.

The main result of use is that given a smooth second order elliptic operator and a probability distribution

µ0 onM, there exists a unique L-diffusion measure whose initial distribution if µ0.

As a preview of chapter 3, we note that this will be used in the following case:

Let M be a Riemannian manifold equipped with the Levi-Civita connection ∇ and Laplace-Beltrami

operator 4M onM (which is a second order elliptic operator). Given a probability measure µ onM,

the above implies that there exists a unique 4M2 -diffusion measure Pµ on the filtered measure space

11

(W (M),B(W (M))∗) (the path space over M). Any 4M2 -diffusion measure on W (M) is called a

Wiener measure on W (M).

If we define anM-valued stochastic process to be a measurable map X : Ω → W (M) where (Ω,F)

is some probability space, Brownian motion on M is defined to be any M-valued stochastic process

X whose law is a Wiener measure on the space space W (M). In particular, it can be shown that any4M

2 -diffusion process is a Brownian motion.

12

2 Chapter 2

2.1 Frame Bundles and Connections

First, recall the definition of a vector bundle:

Definition: LetM be a smoothm-manifold, E a smooth manifold of dimensionm+n, and π : E →M

a smooth map. This will be called an n-plane bundle over M (or a vector bundle over M of fiber

dimension n) if the following properties hold:

1) For each x ∈M , Ex = π−1(x) has the structure of a real, n-dimensional vector space.

2) There is an open cover Wjj∈J of M together with the commutative diagram

π−1(Wj)ψj //

π

Wj × Rn

p1

Wj

IdM// Wj

such that ψj is a diffeomorphism ∀j ∈ J , where p1 is the projection onto the first factor.

3) For each j ∈ J and x ∈ Wj , ψjx = ψj |Ex maps the vector space Ex isomorphically onto the vector

space x × Rn.

Now consider the definition:

Definition: Let M be a manifold, and G be a Lie group. A (differentiable) principal fiber bundle overM with group G consists of a manifold P and an action G on P satisfying the following conditions:

1) G acts freely on P on the right: (u, a) ∈ P × G → ua = Rau ∈ P. (freely just means that ua = u

implies that a = I; i.e. only the identity element fixes any u ∈ P )

2) M is the quotient space of P be the equivalence relation induced by G, M = P/G, and the canonical

projection π : P →M is differentiable.

3) P is locally trivial. That is, every point x ∈M has a neighborhood U such that π−1(U) is isomorphic

with U × G in the sense that there is a diffeomorphism ψ : π−1(U) → U × G such that ψ(u) =

(π(u), φ(u)) where φ is a mapping of π−1(U) into G satisfying φ(ua) = φ(u)a for all u ∈ π−1(U) and

a ∈ G; i.e. the following diagram commutes:

π−1(U)ψ //

π

U ×Gp1

U

IdM// U

13

(1) and (2) imply that the fibers are just the orbits of the action of G on P . (3) implies that the fiber over

a point x ∈ M is identified with x ×G through a local trivialization, and more importantly, says that

the action of G on P maps the orbits onto themselves. It also provides a differentiable structure for P .

Thus, a principal fiber bundle is just a generalization of a vector bundle.

We call a principal fiber bundle over M with group G a G-bundle.

Now, letM be an m-dimensional manifold, and consider the GL(m)-bundle overM.

Definition: A frame at x ∈M is an R-linear isomorphism u : Rm → TxM. We denote the space of all

frames at x ∈M by F(M)x.

Since everyM ∈ GL(m) gives such an isomorphism to the tangent space TxM through its columns, we

find that a GL(m)-bundle overM is actually a frame bundle. That is, if P denotes the GL(m)-bundle,

P = F(M) =⋃x∈M

F(M)x.

Now, the determinant function is a continuous function from M(m) → R being a polynomial in the

entries of a matrix. Since GL(m) = det−1(R−0), we have that GL(m) ⊆M(m) is open, and hence

is an open submanifold of M(m).

Since this space has dimension m2, we find that the local trivializing property of the frame bundle makes

the frame bundle into a manifold of dimension m+m2.

Before we can proceed further, we recall the definition of a connection ∇ on a vector bundle E. Let

Γ(E) denote smooth sections of the vector bundle E; that is, all smooth maps f : M → E such that

π f = id|M.

Definition: A connection∇ is a map

∇ : Γ(E)× Γ(E)→ Γ(E)

such that for all X,Y, Z ∈ Γ(E) and f, g ∈ C∞(M):

1. ∇fX+gY Z = f∇XZ + g∇Y Z

2. ∇X(Y + Z) = ∇XY +∇XZ.

3. ∇X(fY ) = X(f)Y + f∇XY.

In particular, when E = TM , ∇XY is called the covariant differentiation of Y along X .

In local coordinates, we may write X = Xi∂i, Y = Y i∂i. We write

∇∂i∂j = Γkij∂k

14

and refer to Γkij as the Christoffel symbols.

Furthermore, it can be shown that (∇XY )p depends only on the value X(p) and the values of Y along a

curve γ through p tangent to X(p).

Proposition 6 Let∇ be a linear connection onM. For each curve γ : I →M,∇ determined a unique

operator

Dt : Γ(γ(I))→ Γ(γ(I))

satisfying:

1) Dt(aV + bW ) = aDtV + bDtW .

2) Dt(fV ) = fV + fDtV ∀f ∈ C∞(I).

3) If V is extendible, then for any extension V of V :

DtV = ∇γ(t)V .

For any V ∈ Γ(γ(I)), DtV is called the covariant derivative of V along γ.

The unique vector field guaranteed by the operator in a coordinate chart is specified by

DtV (t0) = V j(t0)∂j + V j(t0)∇γ(t0)∂j = (V k(t0) + V j(t0)γi(t0)Γkij(γ(t0)))∂k.

Definiton: A vector field V along γ is said to be parallel if DtV = 0. A curve γ onM is said to be a

geodesic if Dtγ = 0.

Now, letM be a manifold equipped with a connection∇ on the tangent bundle TM.

Definition: A vector X ∈ TuF(M) is called vertical if X is tangent to the fiber F(M))π(u). The space

of all vertical vectors at u is denoted by VuF(M).

The local trivializing property of the frame bundle implies that for any trivializing neighborhood U of

x ∈M:

x ×GL(m) ⊆ U ×GL(m) ∼= π−1(U).

The latter is a submanifold of F(M) since we use U ×GL(m) to give F(M) a differentiable structure.

Thus, the above inclusion is in fact an inclusion of a submanifold, so that VuF(M) ⊆ TuF(M) is a

subspace of dimension m2, and we have the direct sum decomposition:

TuF(M) = VuF(M)⊕HuF(M)

with the space HuF(M) yet to be determined.

15

To determine what HuF(M) is, note that a curve u(t) in F(M) is just a smooth choice of frames at

each point of the curve π(u(t)) onM.

Definition: The curve u(t) above is called horizontal if for each r ∈ Rm, the vector field u(t)e is

parallel along π(u(t)). A tangent vector X ∈ TuF(M) is called horizontal if it is the tangent vector of

a horizontal curve.

It can be shown that the space of horizontal vectors at u is precisely HuF(M), which is isomorphic to

Tπ(u)M:

The smooth maps: p1 : U × GL(m) → U , ψ : π−1(U) → U × GL(m) given by a locally trivializ-

ing neighborhood compose to give the linear surjection (p1 ψ)|u : TuF(M) → Tπ(u)U . Since the

kernal of this map is clearly VuF(M) the first isomorphism theorem and the direct sum decomposition

TuF(M) = VuF(M)⊕HuF(M) give HuF(M) ∼= Tπ(u)M.

Via this isomorphism, for each X ∈ TxM and from u at x, there is a unique horizontal vector X∗, the

horizontal lift of X to u, such that π∗X∗ = X . It’s easily shown that horizontal vectors are contained

in HuF(M), and hence form a subspace of dimension less than or equal to m, so that he above linear

injection of TxM into the space of horizontal vectors is in fact an isomorphism.

Given a curve γ(t) and a frame u0 at γ(0), there is a unique horizontal curve u(t) such that π(u(t)) =

γ(t). One obtains this curve by parallel translating the frame at γ(0) along γ. This is called the horizon-tal lift of γ(t) from u0.

Now, for each e ∈ Rm, ue ∈ Tπ(u)M. Thus, we may take the geodesic γ(t) through π(u) determined

by ue and obtain the horizontal lift u(t) of γ(t) from u. Defining:

He(u) = u(0)

we find that He is a horizontal vector field on F(M). Let e1, . . . , em be the coordinate unit vectors

in Rm. Then, Hi = Hei i = 1, . . . ,m are the fundamental horizontal fields of F(M). They span

HuF(M) at each u ∈ F(M).

The action of GL(m) on F(M) preserves the fundamental horizontal fields as determined by the

Proposition 7 Let e ∈ Rm and g ∈ GL(m). Then,

g∗He(u) = Hge(u), u ∈ F(M)

where g∗ : TuF(M) → TugF(M) is the action of g on the tangent bundle TF(M) induced by the

canonical action of g on F(M)x defined by u 7→ ug.

16

Now, a local chart x = xi on a neighborhood O ⊆M induces a local chart on O = π−1(O) in F(M)

as follows:

Let Xi = ∂∂xi

, 1 ≤ i ≤ m be the moving frame onM determined by the local chart. For a frame u ∈ Owe have that uei = ejiXj for some matrix e = (eij) ∈ GL(m). Then, (x, e) = (xi, eij) ∈ Rm+m2

is

a local chart for O. In terms of this chart, the vertical subspace VuF(M) is spanned by Xkj = ∂∂ekj

1 ≤ j, k ≤ d and th vetor fields

Xi, Xij : 1 ≤ i, j ≤ d

span TuF(M) for every u ∈ O. In terms of this representation, we have:

Proposition 8 In terms of the local coordinate chart on F(M) described above, at u = (x, e) =

(xi, ekj ) ∈ F(M) we have

Hi(u) = ejiXj − ejielmΓkjl(x)Xkm (9)

where

Xi =∂

∂xi, Xkm =

∂

∂ekm.

Definition: let u(t) be a horizontal lift of a differentiable curve γ(t) onM. Since γ(t) ∈ Tγ(t)M, we

have that u−1t ( ˙γ(t)) ∈ Rm.

The anti-development of γ(t) (or of the horizontal curve u(t) is a curve w(t) in Rm determined by

w(t) =∫ t

0u−1s γ(t)ds.

Note thatw in the above definition depends on the choice of the initial frame u0 at x0 but in a simple way:

if v(t) is another horizontal lift of γ(t) and u0 = v0g for some g ∈ GL(m), then the anti-development

of v(t) is gw(t). Since u(t)w(t) = x(t) we have

Hw(t)(u(t)) = ˜u(t)w(t) = ˜x(t) = u(t).

Thus, the anti-development w(t) and the horizontal lift u(t) of a curve γ(t) onM are connected by the

ODE:

u(t) = Hi(u(t))wi(t)

on F(M).

IfM is a Riemannian manifold with a Riemannian metric, then we can restrict ourselves to the O(m)-

bundle consisting of orthonormal frames. In the case we choose the Levi-Civita connection, or any

connection that’s compatible with the metric, proposition 8 still holds except that the coordinates (x, e)

discussed prior to proposition 8 now give local coordinates of the O(m) bundle.

17

2.2 Tensor Fields

We can realize covariant differentiation in the frame bundleFM . At each frame u, the vectorsXi = uei,

i = 1, . . . ,m form a basis at TxM with x = π(u). Let Xi denote the dual frame on TxM∗. Then, an

(r, s)-tensor θ can be expressed uniquely as:

θ = θi1,···irj1,··· ,jsXi1 ⊗ · · · ⊗Xir ⊗Xj1 ⊗ · · · ⊗Xjs .

Definition: The scalarization of θ at u is defined by

θ(u) = θi1,···irj1,··· ,jsei1 ⊗ · · · ⊗ eir ⊗ ej1 ⊗ · · · ⊗ ejs

where ei is the canonical basis for Rm and ei is the corresponding dual basis.

Thus, if θ is an (r, s)-tensor field on M , then its scalarization

θ : F(M)→ R⊗r ⊗ R∗⊗s

is a vector space-valued function on F(M).

Definition: The function θ defined above is O(m)-equivariant, in the sense that θ(ug) = gθ(u) where

the right action of g is the action of O(m) on the tensor space R⊗r ⊗ R∗⊗s.

The following theorem shows covariant differentiation can be realized on the frame bundle:

Theorem 9 Let X ∈ Γ(TM) and θ ∈ Γ(T r,sM). Then, the scalarization of the covariant derivative

∇Xθ is given by

∇Xθ = X∗θ

where X∗ is the horizontal lift of X .

Proof of Theorem: We begin with the case where θ = Y is a vector field.

Let γ(t) be s mooth curve inM such that γ(0) = Xγ(0) and u(t) a horizontal lift of γ(t). Let τt = utu−10

be the parallel transport along the curve (that is to say, given a vector Y ∈ Tγ(0)M , τt(Y ) is the parallel

transport of Y along γ, which is a vector field along γ).

Lemma 2 ∇XY |γ(0) = ddtτ−1t Yγ(t)|t=0.

Proof of Lemma: Let ei(t) = utei where ei is the ith coordinate unit vector of Rm. Then, ei(t) is

parallel along γ(t).

18

Now let Yγ(t) = ai(t)ei(t), which is just the coordinate representation of Y along γ in the basis ei(t).Then:

∇XY |γ(0) = ∇·γ(t)Y |γ(0) = DtYγ(t)|t=0 = ai(t)ei(t)|t=0 = ai(0)ei(0)

since ei(t) are parallel along γ, and Dt has the Leibnitz property.

Now, Yγ(t) = τt(ai(t)ei(0))⇒ τ−1t Yγ(t) = ai(t)ei(0). Upon differentiation:

d

dtτ−1t Yγ(t)|t=0 = ai(0)ei(0).

Thus:

∇XY |γ(0) =d

dtτ−1t Yγ(t)|t=0. 4

By definition of our horizontal lift, X∗ = u(0). Now, as in Lemma 2, Yγ(t) = ai(t)ei(t) implies that

Yu(t) = ai(t)ei. Thus, u(t)Yu(t) = Yγ(t), so that Y (ut) = u−1t Yγ(t).

Thus,

∇XY |u(0) = u−1(0)∇XY |γ(0) = u−10

d

dtτ−1t Yγ(t)|t=0 = u−1(0)ai(0)ei(0) =

d

dtYu(t) = X∗Y |u(0).

In the general case, do the same above for one forms ω. The extension to the tensor case is straightfor-

ward.

19

3 Chapter 3

3.4: The Distance Function

Using exponential coordinates at a point o ∈ M, we can introduce polar coordinates (r, θ) in a neigh-

borhood of o.

3.4.1: A review of exponential coordinates (Taken from Peterson, 5.4)

For a tangent vector v ∈ TpM, let γv denote the unique geodesic with γv(0) = p and γv(0) = v, and let

[0, Lv) be the nonnegative part of the maximal interval containing 0 on which γ is defined. (recall that

this is a result from second order ODE theory:

A curve γ is called a geodesic if DDt γ = 0, where D

Dt is the covariant derivative along γ. In local

coordinates at p, letting γ(t) =(γ1(t), . . . , γm(t)

), with γ(0) = p, γ(t) = v, we have:

0 =D

Dtγ =

D

Dtγj(t)∂j = γj(t)∂j + γj(t)γk(t)Γijk∂i =

(γj(t) + γl(t)γk(t)Γjlk

)∂j

⇒ γj(t) + γl(t)γk(t)Γjlk = 0, ∀j

The ”patching” necessary to obtain the coordinate-free existence is straightforward).

Noticing that γαv(t) = γv(αt) for all α > 0 and t < Lαv, let Op ⊂ TpM be the set of vectors v such

that 1 < Lv, so that γv(t) is defined on [0, 1]. Then the exponential map at p, exp : TpM → M is

defined by

expp(v) = γv(1) = γ v|v|

(|v|) , v ∈ Op.

A basic result from differential geometry is that the exponential map is locally a diffeomorphism, so that

via the above structure, we obtain a coordinate chart onM at p. It’s clear from the above then, how we

obtain polar coordinates (r, θ) at p ∈M via the exponential map: pick an orthonormal frame Eα on the

unit sphere Sm−1 and define E1 = ∂r where ∂r is the unit vector field such that

∇f = ∂r

where f is the Riemannian distance function f(x) = d(p, x). It can be shown that integral curves to

∂r are precisely geodesics. Furthermore, we have the important result that for any x ∈ M such that

expp(v) = x and x is in the domain for which expp is invertible,

x = γe(d(x, p))

for some unit vector e ∈ TpM. To see this, note that since x is within the domain for which expp is

invertible, γv is the unique geodesic such that ||γv|| = d(x, p). Thus:

d(x, p) = d(γv(1), p) =∫ 1

0

√〈γv(s), γv(s)〉ds.

20

Now, since γv is a geodesic:

d

dt〈γv(t), γv(t)〉 =

⟨γv(t),

D

Dtγv(t)

⟩= 0⇒ 〈γv(t), γv(t)〉 = 〈γv(0), γv(0)〉 = |v|2.

Thus, d(x, p) =∫ 10 |v|ds = |v| so that

x = γv(1) = γ v|v|

(|v|) = γe(d(x, p))

where e = v|v| . Thus, the coordinates determined by the exponential map are isometric ones.

Now, let t(e) be the largest t such that the geodesic γe ([0, t]) is distance minimizing from o to γe(t).

Define

Co = t(e)e : e ∈ ToM, |e| = 1.

Definition: The cutlocus of o is the set Co = expo Co.

The set within the cutlocus is the star-shaped domain

Eo = te ∈ ToM : e ∈ ToM, 0 ≤ t < t(e), |e| = 1.

OnM the set within the cutlocus is Eo = expo Eo.

We have the following results concerning the cutlocus [See Peterson chapter 5].

Theorem 3.4.1

1. The map exp : Eo → Eo is a diffeomorphism.

2. If x ∈ Cy then y ∈ Cx.

3. Eo and Co are disjoint andM = Eo ∪ Co.

Definition: io = inft(e) : e ∈ ToM, |e| = 1 is called the injectivity radius at o.

We have that expoB0(io) is the largest geodesic ball on which the exponential map is a diffeomorphism

onto its image.

Now, this is why we need all this geometry.

Last time, we showed that if X is Brownian motion on M starting within Eo, then before it hits the

cutlocus Co,

r(Xt) = r(X0) + βt +12

∫ t

04Mr(Xs)ds, t < TCo

where βt is one dimensional Brownian motion.

21

Thus, if we can control 4Mr, then we can control the radial process r(Xt) within the cutlocus by

comparing it to a one-dimensional diffusion process.

It turns out that the curvature of the manifoldM influences4Mr within the cutlocus. Thus, we need to

review the notions of curvature on a manifoldM.

3.4.2: A review of curvature on a Riemannian manifold Given a Riemannian manifold M, let ∇denote the Levi-Civita connection onM (A connection ”connects” different tangent spaces in the sense

that it enables one to compare vectors in different tangent spaces. The Levi-Civita connection is a special

kind of connection in that parallel transport along a curve doesn’t change lengths or angles (metric

compatible), and doesn’t ”rotate” vectors (torsion-free)).

Definition: The Riemann curvature tensor is defined by

R(X,Y )Z = ∇X∇Y Z −∇Y∇XZ −∇[X,Y ]Z

where X,Y, Z are vector fields on M (while this is a (1,3) tensor, we can define a (0,4) tensor by

R(X,Y, Z,W ) = 〈R(X,Y )Z,W 〉, which we shall also refer to as the RIemann curvature tensor).

Definition: The sectional curvature is the quadratic form

K(X,Y ) = 〈R(X,Y )Y,X〉

whereX,Y are vector fields onM. IfX and Y are orthogonal unit vectors spanning the two dimensional

plane σ in the tangent space, then we define the sectional curvature of the plane σ to be K(σ) =

K(X,Y ).

Definition: Let X,Y ∈ TpM. Then, the Ricci curvature tensor is defined by

Ric(X,Y ) =m∑j=1

〈R(X,Xj)Xj , Y 〉

where Ximi=1 is an orthonormal basis of TpM. The Ricci curvature alongX is defined byRic(X,X).

Definition: The scalar curvature is defined by

S = 2∑i<j

K(Xi, Xj) =m∑i=1

Ric(Xi, Xi).

Definition: We denote the set of sectional curvatures at x by

KM(x) = K(σ) : σ ⊆ TxM is a 2 plane.

Definition: We denote the set of Ricci curvatures at x by

RicM(x) = Ric(X,X) : X ∈ TxM, |X| = 1.

22

Now, let

κ1(r) ≥ supKM(x) : r(x) = r

κ2(r) ≤ infRicM(x) : r(x) = rd− 1

.

and let Gi be solutions to

G′′i (r) + κi(r)Gi(r) = 0, G(0) = 0, G′(0) = 1.

Theorem 3.4.2 (Laplacian comparison theorem) With the notation established above, the following in-

equalities hold for x within the cutlocus:

(d− 1)G′1(r(x))G1(r(x))

≤ 4Mr(x) ≤ (d− 1)G′2(r(x))G2(r(x))

Proof: Let x ∈M be within the cutlocus of o and let γ be the unique geodesic from o to x. Let Xi(x)be an orthonormal frame at x such that X1(x) = Xr(x). Let Xi be the orthonormal frame along γ

obtained by parallel translation of Xi(x). Then, X1 is the tangent vector field of the geodesic, which

will also be denoted by T .

Claim 1 For any orthonormal basis Xi of TxM we have

4Mf =m∑i=1

∇2f(Xi, Xi)

where∇2f is the Hessian defined by

∇2f(X,Y ) = X(Y f)− (∇XY )f.

Proof of Claim 1: By definition

divX =m∑i=1

〈∇XiX,Xi〉 .

So, letting X = gradf and using metric compatibility:

4Mf = div(gradf) =m∑i=1

〈∇Xigradf,Xi〉 =m∑i=1

[∇Xi 〈gradf,Xi〉 − 〈gradf,∇XiXi〉]

=m∑i=1

[Xi(Xif)−∇XiXi(f)] =m∑i=1

∇2f(Xi, Xi) 4.

By Claim 1,

4Mr(γ(t)) =m∑i=1

∇2r(Xi(γ(t)), Xi(γ(t))).

23

Claim 2 ∇2r(Xi(γ(t)), Xi(γ(t))) = d2

ds2r(ηt(s))|s=0 where ηt(s) is the geodesic such that ηt(0) =

γ(t) and ηt(0) = Xi(γ(t)).

Proof of Claim 1:

24

stochastic analysis on manifolds - math.bu.edumath.bu.edu/people/prakashb/math/stochmani.pdf ·...

Documents