DRAFT

Chapter 1

INTRODUCTION

1.1 Historical perspective

Classical dynamics was developed and advanced without the use of geometry by Lagrange

(1736-1813) and Hamilton (1805-1865). A geometric view of classical dynamics is based on

the work of Riemann¹ (1826-1866), who conceived geometry in n dimensions². His work

was subsequently used by Darboux (1842-1917) who treated a dynamical system as a point

in n-dimensional space. This was followed by the development of tensor calculus by Ricci

(1853-1925) and Levi-Civita (1873-1941).

Tensor calculus was not received with great enthusiasm until the advent of the general

relativity theory of Einstein (1879-1955), which made heavy use of geometric methods.

The geometric treatment of dynamics has led to great new insights in the works of

Kolmogorov, Moser, Arnold and others. Other branches of physics (e.g., electromagnetism,

thermodynamics, continuum mechanics) have made use of geometric methods with success.

1.2 Euclidean and Riemannian geometry

Euclid (c. 325-c. 265 BC) founded geometry by stating and admitting five postulates in his famous work, the Elements.

I. It is possible to draw a straight line from A to B.

II. It is possible to continue a finite straight line in a straight line.

¹ Student of Gauss.

² Time as a 4th dimension was explored by D'Alembert (1717-1783) and Lagrange.

III. It is possible to describe a circle with any center and radius.

IV. All right angles are equal.

V. If a straight line falling on two other straight lines makes the interior angles on the same side less than two right angles, then the two lines intersect, if extended indefinitely along that side.

insert figure

An alternative (more well-known) version of the fifth postulate is due to Proclus: given

a point and a straight line that does not contain the point, there is only one straight line

through the point that never intersects the original line.

insert figure

Many centuries of frustration and little progress ensued until Gauss started thinking about the consequences of doing away with the fifth postulate (he never published his thoughts). Those who did publish (albeit with limited success) were Lobachevski (1793-1856) and Bolyai (1802-1860). Both of their works came in the 1820s but did not gain much attention until 1867.

insert figure

In Euclidean geometry, the notions of

• distance

• angle

• parallelism of lines

• straightness of lines

ME281 Version: April 5, 2010, 14:18

• parallel translation along a curve

make excellent sense, say on a flat plane. Not all of these notions make sense on a curved

surface, e.g.,

insert figure

Why is parallel transport important? Recall the definition of the derivative of a vector function $v(t)$ at $t = t_0$:

$$ v'(t_0) = \lim_{\Delta t \to 0} \frac{v(t_0 + \Delta t) - v(t_0)}{\Delta t} \tag{1.1} $$

In that definition, it is implicit that we can compare the vectors v(t0) and v(t0 + ∆t)

associated with two distinct points t0, t0 + ∆t.

insert figure

This requires that we parallel transport one to the other–which is trivial for Euclidean

spaces, but less trivial for Riemannian spaces.

Where do we encounter non-flat spaces in mechanics?

Some typical examples are:

(a) Planar pendulum. The configuration space is a circle, $S^1$.

(b) Planar double pendulum. The configuration space is a two-torus, $T^2$.

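(c) Take a time-independent, n-dimensional Hamiltonian system associated with the function $H(q_i, p_i)$, $i = 1, 2, \ldots, n$, where

$$ \dot{p}_i = -\frac{\partial H}{\partial q_i}, \qquad \dot{q}_i = \frac{\partial H}{\partial p_i} \tag{1.2} $$

($q_i$: generalized coordinates in $\mathbb{R}^n$, $p_i$: generalized momenta in $\mathbb{R}^n$). Clearly,

$$ \dot{H} = \sum_{i=1}^{n} \left( \frac{\partial H}{\partial q_i}\,\dot{q}_i + \frac{\partial H}{\partial p_i}\,\dot{p}_i \right) = 0 \tag{1.3} $$

therefore $H$ = constant, i.e., the motion takes place on a $(2n-1)$-dimensional surface embedded in the 2n-dimensional phase space $(q_i, p_i)$.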

insert figure

insert figure

(d) Shell structures. One needs to determine the deformation, strain, and stress at every point P.

(e) Curved space-time in general relativity. The 4-dimensional spacetime was studied for relativity by Minkowski (1864-1909). In special relativity, spacetime is flat; in general relativity, spacetime is curved (due to gravitation).
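As a rough numerical illustration of example (c), the sketch below integrates Hamilton's equations (1.2) for a planar pendulum and monitors $H$ along the trajectory. The Hamiltonian $H(q, p) = p^2/2 - \cos q$ (unit mass, length and gravity), the Runge-Kutta integrator, and all function names are assumptions made here for illustration; they are not taken from the text.

```python
import math

def hamiltonian(q, p):
    """Assumed pendulum Hamiltonian (unit mass, length, gravity)."""
    return 0.5 * p * p - math.cos(q)

def rhs(q, p):
    """Hamilton's equations (1.2): q_dot = dH/dp, p_dot = -dH/dq."""
    return p, -math.sin(q)

def rk4_step(q, p, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1q, k1p = rhs(q, p)
    k2q, k2p = rhs(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
    k3q, k3p = rhs(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
    k4q, k4p = rhs(q + dt * k3q, p + dt * k3p)
    q_new = q + dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6
    p_new = p + dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
    return q_new, p_new

def energy_drift(q0=1.0, p0=0.0, dt=1e-3, steps=5000):
    """Largest deviation of H from its initial value along the trajectory."""
    q, p = q0, p0
    h0 = hamiltonian(q, p)
    worst = 0.0
    for _ in range(steps):
        q, p = rk4_step(q, p, dt)
        worst = max(worst, abs(hamiltonian(q, p) - h0))
    return worst
```

Calling `energy_drift()` should return a value that is tiny compared with the energy scale of the problem, consistent with the trajectory staying (up to integration error) on the level set $H$ = constant in the $(q, p)$ plane.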

1.3 Classical view of tensors

Start with the classical view: a tensor is a mathematical object, whose representation with

respect to one coordinate system is related to its representation with respect to any other

coordinate system by means of a definite transformation law.

The independence from coordinate system choice is very important in formulating "intrinsically meaningful" mathematical statements of physical laws. To fix our thoughts, take a point P in n-dimensional Euclidean point space $E^n$ and parametrize it by a set of coordinates $\{(x^1, x^2, \ldots, x^n),\ x^i \in \mathbb{R},\ i = 1, 2, \ldots, n\}$ which "live" in the associated Euclidean vector space $E^n$.

insert figure

Also, take any other set of coordinates $\{(y^1, y^2, \ldots, y^n),\ y^i \in \mathbb{R},\ i = 1, 2, \ldots, n\}$ such that there exists a $C^1$-diffeomorphism (i.e. a one-to-one $C^1$ function with $C^1$ inverse) $y : E^n \to E^n$,

$$ y^i = y^i(x^1, x^2, \ldots, x^n) \tag{1.4} $$

where $y^i$ is the i-th component of the function $y$. There should be open subsets of $E^n$ containing $(x^1, x^2, \ldots, x^n)$ and $(y^1, y^2, \ldots, y^n)$. Also, recall that, by the inverse function theorem, the existence of a $C^1$ inverse of $y$ is guaranteed if

$$ \det\left( \frac{\partial y^i}{\partial x^j} \right) \neq 0 \tag{1.5} $$

in a neighborhood of P. In such a case, $x^i = x^i(y^1, y^2, \ldots, y^n)$, where $x^i$ is the i-th component of $x = y^{-1}$. Now, take a scalar function $F : E^n \to \mathbb{R}$ and notice that its value depends only on the particular point P in $E^n$ and not on the coordinates used to identify this point.

Therefore, there is a function $f : E^n \to \mathbb{R}$ such that

$$ F(P) = f(x^1, x^2, \ldots, x^n) = (f \circ x)(y^1, y^2, \ldots, y^n) = g(y^1, y^2, \ldots, y^n) \tag{1.6} $$

where $f \circ x$ is the composition of $f$ and $x$, defined by

$$ (f \circ x)(y^1, y^2, \ldots, y^n) = f\big(x(y^1, y^2, \ldots, y^n)\big) = f(x^1, x^2, \ldots, x^n) \tag{1.7} $$

Examples:

(a) Fix Q and define F1(P ) = |PQ| = the Euclidean distance between P and Q

(b) ρ(P ): mass density of a particle that at a fixed time occupies point P.

We can identify $F$ by the totality of "components" $f$, $g$, etc. relative to different coordinate systems $(x^1, x^2, \ldots, x^n)$, $(y^1, y^2, \ldots, y^n)$, etc.,

$$ f(x^1, x^2, \ldots, x^n) = g(y^1, y^2, \ldots, y^n) = \cdots \tag{1.8} $$

related by the transformation laws

$$ g = f \circ x = \cdots \tag{1.9} $$

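Classically, $F$ is a 0th-order tensor in $E^n$ at a point. Likewise, start with

$$ F(P) = f(x^1, x^2, \ldots, x^n) = g(y^1, y^2, \ldots, y^n) \tag{1.10} $$

where $f$ is assumed to be in $C^1$, and take the partial derivatives $\left( \frac{\partial f}{\partial x^1}, \frac{\partial f}{\partial x^2}, \ldots, \frac{\partial f}{\partial x^n} \right)$. Using the chain rule and the summation convention,

$$
\begin{aligned}
\frac{\partial g}{\partial y^1} &= \sum_{j=1}^{n} \frac{\partial f}{\partial x^j}\frac{\partial x^j}{\partial y^1} = \frac{\partial f}{\partial x^j}\frac{\partial x^j}{\partial y^1} \\
\frac{\partial g}{\partial y^2} &= \sum_{j=1}^{n} \frac{\partial f}{\partial x^j}\frac{\partial x^j}{\partial y^2} = \frac{\partial f}{\partial x^j}\frac{\partial x^j}{\partial y^2} \\
&\ \ \vdots \\
\frac{\partial g}{\partial y^n} &= \sum_{j=1}^{n} \frac{\partial f}{\partial x^j}\frac{\partial x^j}{\partial y^n} = \frac{\partial f}{\partial x^j}\frac{\partial x^j}{\partial y^n}
\end{aligned}
\tag{1.11}
$$

or, in more compact notation,

$$ \frac{\partial g}{\partial y^i} = \frac{\partial x^j}{\partial y^i}\frac{\partial f}{\partial x^j} \tag{1.12} $$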

The above furnishes a transformation law between the components $\left( \frac{\partial f}{\partial x^1}, \frac{\partial f}{\partial x^2}, \ldots, \frac{\partial f}{\partial x^n} \right)$ and $\left( \frac{\partial g}{\partial y^1}, \frac{\partial g}{\partial y^2}, \ldots, \frac{\partial g}{\partial y^n} \right)$ of "vectors" (or 1st-order "covariant" tensors) in $E^n$. Again, this tensor can be identified by the totality of its components related by the transformation law (1.12).

Since the choice of function $F$ is arbitrary, the law (1.12) can be more descriptively expressed as

$$ \frac{\partial}{\partial y^i} = \frac{\partial x^j}{\partial y^i}\frac{\partial}{\partial x^j} \tag{1.13} $$

In addition, starting from

$$ y^i = y^i(x^1, x^2, \ldots, x^n) \tag{1.14} $$

take the total differential to obtain

$$ dy^i = \frac{\partial y^i}{\partial x^j}\,dx^j \tag{1.15} $$

The above furnishes another transformation law between the components $(dx^1, dx^2, \ldots, dx^n)$ and $(dy^1, dy^2, \ldots, dy^n)$ of a different type of vector (or 1st-order "contravariant" tensor) at a point in $E^n$. Again, this special tensor can be identified by the totality of its components related by (1.15).

• (1.13): covariant transformation

• (1.15): contravariant transformation
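The two laws can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: it takes $x$ to be Cartesian coordinates in the plane and $y = (r, \theta)$ to be polar coordinates, and it uses an arbitrarily chosen scalar field $f$; none of these choices, nor the function names, come from the text. It applies (1.12) to the gradient components, applies (1.15) to a small displacement, and exposes a finite-difference gradient of $g = f \circ x$ for comparison.

```python
import math

def x_of_y(r, theta):
    """Assumed inverse map x = x(y): polar (r, theta) to Cartesian (x1, x2)."""
    return (r * math.cos(theta), r * math.sin(theta))

def f(x1, x2):
    """Arbitrary scalar field expressed in the x coordinates."""
    return x1 * x1 * x2 + x2

def g(r, theta):
    """The same scalar field expressed in the y coordinates, g = f o x."""
    return f(*x_of_y(r, theta))

def grad_f(x1, x2):
    """Analytic partial derivatives (df/dx1, df/dx2)."""
    return (2 * x1 * x2, x1 * x1 + 1.0)

def dx_dy(r, theta):
    """Jacobian entries J[j][i] = d x^j / d y^i."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -r * s), (s, r * c))

def dy_dx(r, theta):
    """Inverse Jacobian entries K[i][j] = d y^i / d x^j."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, s), (-s / r, c / r))

def covariant_components(r, theta):
    """Law (1.12): dg/dy^i = (dx^j/dy^i) df/dx^j."""
    J = dx_dy(r, theta)
    df = grad_f(*x_of_y(r, theta))
    return tuple(sum(J[j][i] * df[j] for j in range(2)) for i in range(2))

def contravariant_components(r, theta, dx):
    """Law (1.15): dy^i = (dy^i/dx^j) dx^j."""
    K = dy_dx(r, theta)
    return tuple(sum(K[i][j] * dx[j] for j in range(2)) for i in range(2))

def numerical_grad_g(r, theta, h=1e-6):
    """Central-difference estimate of (dg/dr, dg/dtheta)."""
    return ((g(r + h, theta) - g(r - h, theta)) / (2 * h),
            (g(r, theta + h) - g(r, theta - h)) / (2 * h))
```

At any point with $r > 0$, `covariant_components` should agree with `numerical_grad_g` up to finite-difference error. The pairing of covariant with contravariant components, $\frac{\partial g}{\partial y^i} dy^i$, should also equal $\frac{\partial f}{\partial x^j} dx^j$, since the two Jacobians are inverses of each other.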


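We can define covariant, contravariant and mixed tensors of order greater than one. For instance, a covariant tensor of order r is such that its components relative to the coordinate systems $(x^1, x^2, \ldots, x^n)$ and $(y^1, y^2, \ldots, y^n)$ are related by

$$ T_{i_1 i_2 \ldots i_r}(y^1, y^2, \ldots, y^n) = \frac{\partial x^{j_1}}{\partial y^{i_1}} \frac{\partial x^{j_2}}{\partial y^{i_2}} \cdots \frac{\partial x^{j_r}}{\partial y^{i_r}}\, S_{j_1 j_2 \ldots j_r}(x^1, x^2, \ldots, x^n) \tag{1.16} $$

Such tensors are often denoted as $\binom{0}{r}$-type. Likewise, for a contravariant tensor of order s, or a $\binom{s}{0}$-type tensor,

$$ T^{i_1 i_2 \ldots i_s}(y^1, y^2, \ldots, y^n) = \frac{\partial y^{i_1}}{\partial x^{j_1}} \frac{\partial y^{i_2}}{\partial x^{j_2}} \cdots \frac{\partial y^{i_s}}{\partial x^{j_s}}\, S^{j_1 j_2 \ldots j_s}(x^1, x^2, \ldots, x^n) \tag{1.17} $$

and for a mixed tensor of type $\binom{s}{r}$, which is covariant to order r and contravariant to order s,

$$ T^{i_1 i_2 \ldots i_s}_{j_1 j_2 \ldots j_r}(y^1, y^2, \ldots, y^n) = \frac{\partial y^{i_1}}{\partial x^{k_1}} \frac{\partial y^{i_2}}{\partial x^{k_2}} \cdots \frac{\partial y^{i_s}}{\partial x^{k_s}} \frac{\partial x^{\ell_1}}{\partial y^{j_1}} \frac{\partial x^{\ell_2}}{\partial y^{j_2}} \cdots \frac{\partial x^{\ell_r}}{\partial y^{j_r}}\, S^{k_1 k_2 \ldots k_s}_{\ell_1 \ell_2 \ldots \ell_r}(x^1, x^2, \ldots, x^n) \tag{1.18} $$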

Remarks:

• The above equations involve two types of transformation occurring simultaneously: (a) the covariant/contravariant transformations of the components, and (b) the coordinate system transformation. As noted in Sokolnikoff, whatever the nature of the covariant/contravariant transformation, it always depends on the coordinate system transformation.

• Assume that $x^i = y^i$, $i = 1, 2, \ldots, n$, i.e. $x$ is the identity transformation. Then

$$ x^i = x^i(y^1, y^2, \ldots, y^n) = y^i, \qquad y^j = y^j(x^1, x^2, \ldots, x^n) = x^j \tag{1.19} $$

hence

$$ \frac{\partial x^i}{\partial y^j} = \begin{cases} 1 & : i = j \\ 0 & : i \neq j \end{cases} = \delta^i_j \tag{1.20} $$

and, also,

$$ \frac{\partial y^i}{\partial x^j} = \delta^i_j \tag{1.21} $$

Hence

$$
\begin{aligned}
T^{i_1 i_2 \ldots i_s}_{j_1 j_2 \ldots j_r}(y^1, y^2, \ldots, y^n) &= \delta^{i_1}_{k_1} \delta^{i_2}_{k_2} \cdots \delta^{i_s}_{k_s}\, \delta^{\ell_1}_{j_1} \delta^{\ell_2}_{j_2} \cdots \delta^{\ell_r}_{j_r}\, S^{k_1 k_2 \ldots k_s}_{\ell_1 \ell_2 \ldots \ell_r}(x^1, x^2, \ldots, x^n) \\
&= S^{i_1 i_2 \ldots i_s}_{j_1 j_2 \ldots j_r}(x^1, x^2, \ldots, x^n) \\
&= S^{i_1 i_2 \ldots i_s}_{j_1 j_2 \ldots j_r}(y^1, y^2, \ldots, y^n)
\end{aligned}
\tag{1.22}
$$

i.e., if the coordinate system transformation is the identity, then so is the tensorial transformation. The above proof makes use of the substitution property of $\delta^i_j$, namely that

$$ \delta^i_j S^j = \delta^i_1 S^1 + \delta^i_2 S^2 + \ldots + \delta^i_n S^n = S^i \tag{1.23} $$

• Let

$$ y^i = y^i(x^1, x^2, \ldots, x^n), \qquad z^j = z^j(y^1, y^2, \ldots, y^n) \tag{1.24} $$

be component forms of two coordinate transformation maps $y$ and $z$

insert figure

such that

$$
\begin{aligned}
z^i &= z^i(y^1, y^2, \ldots, y^n) \\
&= z^i\big(y^1(x^1, x^2, \ldots, x^n),\, y^2(x^1, x^2, \ldots, x^n),\, \ldots,\, y^n(x^1, x^2, \ldots, x^n)\big) \\
&= \bar{z}^i(x^1, x^2, \ldots, x^n)
\end{aligned}
\tag{1.25}
$$

Said differently,

$$ \left. \begin{aligned} z^i &= (z^i \circ y)(x^1, x^2, \ldots, x^n) \\ &= \bar{z}^i(x^1, x^2, \ldots, x^n) \end{aligned} \right\} \quad \bar{z}^i = z^i \circ y \tag{1.26} $$

It is easy, but tedious, to show that, if

$$ S^{i_1 i_2 \ldots i_s}_{j_1 j_2 \ldots j_r}(x^1, x^2, \ldots, x^n) \xrightarrow{\binom{s}{r}} T^{k_1 k_2 \ldots k_s}_{\ell_1 \ell_2 \ldots \ell_r}(y^1, y^2, \ldots, y^n) \xrightarrow{\binom{s}{r}} U^{\ell_1 \ell_2 \ldots \ell_s}_{m_1 m_2 \ldots m_r}(z^1, z^2, \ldots, z^n) \tag{1.27} $$

then the components $S$ are carried directly to the components $U$ by the $\binom{s}{r}$ transformation associated with the composite map $\bar{z} = z \circ y$. In effect, this means that the composition of two coordinate system transformations induces the associated composite tensorial transformation. This property is referred to as an isomorphism between the coordinate transformation and the tensorial transformation.

If all components $S^{i_1 i_2 \ldots i_s}_{j_1 j_2 \ldots j_r}(x^1, x^2, \ldots, x^n) = 0$, then so are all components relative to any other coordinate system.

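Example: The deformation gradient $F$ in continuum mechanics, $dx = F\,dX$ (in components $dx^i = F^i_A\, dX^A$). It maps a $\binom{1}{0}$-tensor to another $\binom{1}{0}$-tensor. Indeed, under coordinate changes $x^i \to y^j$ and $X^A \to Y^B$,

$$ dy^j = \frac{\partial y^j}{\partial x^i}\,dx^i = \frac{\partial y^j}{\partial x^i}\,F^i_A\,dX^A = \frac{\partial y^j}{\partial x^i}\,F^i_A\,\frac{\partial X^A}{\partial Y^B}\,dY^B \tag{1.28} $$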

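Tensor algebra can be naturally defined for $\binom{s}{r}$ tensors. In particular, if

$$ T^{i_1 i_2 \ldots i_s}_{j_1 j_2 \ldots j_r}(y^1, y^2, \ldots, y^n) = \frac{\partial y^{i_1}}{\partial x^{k_1}} \frac{\partial y^{i_2}}{\partial x^{k_2}} \cdots \frac{\partial y^{i_s}}{\partial x^{k_s}} \frac{\partial x^{\ell_1}}{\partial y^{j_1}} \frac{\partial x^{\ell_2}}{\partial y^{j_2}} \cdots \frac{\partial x^{\ell_r}}{\partial y^{j_r}}\, S^{k_1 k_2 \ldots k_s}_{\ell_1 \ell_2 \ldots \ell_r}(x^1, x^2, \ldots, x^n) \tag{1.29} $$

and

$$ Q^{i_1 i_2 \ldots i_s}_{j_1 j_2 \ldots j_r}(y^1, y^2, \ldots, y^n) = \frac{\partial y^{i_1}}{\partial x^{k_1}} \frac{\partial y^{i_2}}{\partial x^{k_2}} \cdots \frac{\partial y^{i_s}}{\partial x^{k_s}} \frac{\partial x^{\ell_1}}{\partial y^{j_1}} \frac{\partial x^{\ell_2}}{\partial y^{j_2}} \cdots \frac{\partial x^{\ell_r}}{\partial y^{j_r}}\, P^{k_1 k_2 \ldots k_s}_{\ell_1 \ell_2 \ldots \ell_r}(x^1, x^2, \ldots, x^n) \tag{1.30} $$

then the sum

$$ T^{i_1 i_2 \ldots i_s}_{j_1 j_2 \ldots j_r}(y^1, y^2, \ldots, y^n) + Q^{i_1 i_2 \ldots i_s}_{j_1 j_2 \ldots j_r}(y^1, y^2, \ldots, y^n) \tag{1.31} $$

satisfies the tensorial transformation rule. The zero tensor of type $\binom{s}{r}$ is defined as a $\binom{s}{r}$-type tensor whose components are zero relative to any coordinate system.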

Likewise, scalar multiplication preserves the tensorial transformation rule. It is also easy, but tedious, to establish that the outer product of an $\binom{s}{r}$ and a $\binom{q}{p}$ tensor yields a $\binom{s+q}{r+p}$-type tensor. Instead of pursuing the full derivation, take a simple example of a $\binom{1}{0}$ tensor

$$ B^i(y^1, y^2, \ldots, y^n) = \frac{\partial y^i}{\partial x^j}\, A^j(x^1, x^2, \ldots, x^n) \tag{1.32} $$

outer-multiplied by a $\binom{0}{1}$ tensor

$$ D_k(y^1, y^2, \ldots, y^n) = \frac{\partial x^\ell}{\partial y^k}\, C_\ell(x^1, x^2, \ldots, x^n) \tag{1.33} $$

Dropping the explicit mention of dependence on coordinate system, take

$$ B^i D_k = \frac{\partial y^i}{\partial x^j} \frac{\partial x^\ell}{\partial y^k}\, A^j C_\ell \tag{1.34} $$

which confirms that the result of the multiplication is a $\binom{1}{1}$-tensor. The multiplication rule generalizes readily to products of $\binom{s}{r}$ and $\binom{q}{p}$ tensors.

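The contraction of a $\binom{s}{r}$ to a $\binom{s-1}{r-1}$ tensor is accomplished by equating one covariant and one contravariant index and summing with respect to it. Resorting, for simplicity, to an example, let the $\binom{2}{1}$ tensor

$$ T^{i_1 i_2}_{j} = \frac{\partial y^{i_1}}{\partial x^{k_1}} \frac{\partial y^{i_2}}{\partial x^{k_2}} \frac{\partial x^{\ell}}{\partial y^{j}}\, S^{k_1 k_2}_{\ell} \tag{1.35} $$

be contracted to a $\binom{1}{0}$ tensor as

$$ T^{j i_2}_{j} = \frac{\partial y^{j}}{\partial x^{k_1}} \frac{\partial y^{i_2}}{\partial x^{k_2}} \frac{\partial x^{\ell}}{\partial y^{j}}\, S^{k_1 k_2}_{\ell} = \frac{\partial y^{i_2}}{\partial x^{k_2}}\, \delta^{\ell}_{k_1}\, S^{k_1 k_2}_{\ell} = \frac{\partial y^{i_2}}{\partial x^{k_2}}\, S^{\ell k_2}_{\ell} \tag{1.36} $$

or, writing $T^{i_2} = T^{j i_2}_{j}$ and $S^{k_2} = S^{\ell k_2}_{\ell}$,

$$ T^{i_2} = \frac{\partial y^{i_2}}{\partial x^{k_2}}\, S^{k_2} \tag{1.37} $$

Another example: a $\binom{1}{1}$ tensor is contracted as $T^i_j \to T^i_i$ to a scalar (i.e. a $\binom{0}{0}$ tensor).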
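The last statement, that contracting a $\binom{1}{1}$ tensor yields a scalar, can be spot-checked numerically. The sketch below is an illustration only: it assumes a two-dimensional setting with $x$ Cartesian and $y = (r, \theta)$ polar, uses an arbitrary matrix of components $S^k_\ell$, and introduces hypothetical helper names. It transforms the components by the $\binom{1}{1}$ law and exposes the contraction $T^i_i$.

```python
import math

def jacobians(r, theta):
    """Return (dy/dx, dx/dy) for the assumed map y = (r, theta), x Cartesian."""
    c, s = math.cos(theta), math.sin(theta)
    dy_dx = ((c, s), (-s / r, c / r))    # dy_dx[i][k] = d y^i / d x^k
    dx_dy = ((c, -r * s), (s, r * c))    # dx_dy[l][j] = d x^l / d y^j
    return dy_dx, dx_dy

def transform_mixed(S, dy_dx, dx_dy):
    """T^i_j = (dy^i/dx^k) (dx^l/dy^j) S^k_l, with S[k][l] holding S^k_l."""
    n = len(S)
    return [[sum(dy_dx[i][k] * dx_dy[l][j] * S[k][l]
                 for k in range(n) for l in range(n))
             for j in range(n)]
            for i in range(n)]

def contract(T):
    """Contraction T^i_i of a (1,1) tensor, i.e. the trace of its components."""
    return sum(T[i][i] for i in range(len(T)))
```

For example, with `S = [[1.0, 2.0], [-0.5, 3.0]]` and the Jacobians evaluated at any point with $r > 0$, the individual components of `transform_mixed(S, ...)` generally differ from those of `S`, while `contract` should return the same value for both up to rounding.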

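The inner product between two tensors involves the contraction of their outer product by equating one (or more) covariant indices of one tensor with contravariant indices of the other. By example, if we have

$$ T^{i_1 i_2}_{j} = \frac{\partial y^{i_1}}{\partial x^{k_1}} \frac{\partial y^{i_2}}{\partial x^{k_2}} \frac{\partial x^{\ell}}{\partial y^{j}}\, S^{k_1 k_2}_{\ell} \qquad \binom{2}{1}\text{-tensor} \tag{1.38} $$

$$ Q^{o}_{\ell_1 \ell_2} = \frac{\partial y^{o}}{\partial x^{m}} \frac{\partial x^{n_1}}{\partial y^{\ell_1}} \frac{\partial x^{n_2}}{\partial y^{\ell_2}}\, P^{m}_{n_1 n_2} \qquad \binom{1}{2}\text{-tensor} \tag{1.39} $$

then

$$
\begin{aligned}
T^{i_1 i_2}_{j}\, Q^{j}_{\ell_1 \ell_2} &= \frac{\partial y^{i_1}}{\partial x^{k_1}} \frac{\partial y^{i_2}}{\partial x^{k_2}} \underbrace{\frac{\partial x^{\ell}}{\partial y^{j}} \frac{\partial y^{j}}{\partial x^{m}}}_{\delta^{\ell}_{m}} \frac{\partial x^{n_1}}{\partial y^{\ell_1}} \frac{\partial x^{n_2}}{\partial y^{\ell_2}}\, S^{k_1 k_2}_{\ell}\, P^{m}_{n_1 n_2} \\
&= \frac{\partial y^{i_1}}{\partial x^{k_1}} \frac{\partial y^{i_2}}{\partial x^{k_2}} \frac{\partial x^{n_1}}{\partial y^{\ell_1}} \frac{\partial x^{n_2}}{\partial y^{\ell_2}}\, S^{k_1 k_2}_{m}\, P^{m}_{n_1 n_2}
\end{aligned}
\tag{1.40}
$$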

A tensor is symmetric in a pair of covariant or contravariant indices if the values of the components remain the same when the indices are interchanged. For instance,

$T^{i_1 i_2}_{j_1} = T^{i_2 i_1}_{j_1}$ (symmetry in the two contravariant indices)

$T^{i_1}_{j_1 j_2 j_3} = T^{i_1}_{j_3 j_2 j_1}$ (symmetry in the first and third covariant indices)

Symmetry is obviously preserved under coordinate transformation (take the difference, which is zero, and transform). Likewise, a tensor is skew-symmetric in a pair of covariant or contravariant indices if the values of the components reverse sign when the indices are interchanged. For example,

$T^{i_1 i_2}_{j_1} = -T^{i_2 i_1}_{j_1}$ (skew-symmetry in the two contravariant indices)

$T^{i_1}_{j_1 j_2 j_3} = -T^{i_1}_{j_3 j_2 j_1}$ (skew-symmetry in the first and third covariant indices)

In all of the discussion of tensors up to this point, no explicit use is made of the Euclidean structure of the space. In fact, the preceding analysis requires only that, for a point $P \in E^n$, one can locally find a coordinate system that describes the point by way of an n-tuple of reals $(x^1, x^2, \ldots, x^n)$.

When one considers $E^n$, the notion of distance between two points P and P′, which are infinitesimally close to each other, can be given precise mathematical meaning in connection with the rectangular Cartesian coordinate system and Pythagoras' formula: take any coordinate system blanketing P and P′ and assigning to them coordinates $(x^1, x^2, \ldots, x^n)$ and $(x^1 + dx^1, x^2 + dx^2, \ldots, x^n + dx^n)$. Also, let $z$ be a $C^1$ diffeomorphism from $\{x^i\}$ to the right-handed rectangular Cartesian coordinate (RCC) system $\{z^i\}$, namely

$$ z^i = z^i(x^1, x^2, \ldots, x^n) \tag{1.41} $$

with $C^1$ inverse $x$, such that

$$ x^i = x^i(z^1, z^2, \ldots, z^n) \tag{1.42} $$

insert figure

Pythagoras' formula implies that the distance $ds$ between P and P′ in $E^n$ can be expressed as

$$ ds = \sqrt{dz^i\, dz^i} \tag{1.43} $$

Recalling that

$$ dz^i = \frac{\partial z^i}{\partial x^j}\,dx^j \tag{1.44} $$

the same distance can be also expressed as

$$ ds^2 = \frac{\partial z^i}{\partial x^j} \frac{\partial z^i}{\partial x^\ell}\, dx^j dx^\ell = g_{j\ell}\, dx^j dx^\ell \tag{1.45} $$

where

$$ g_{j\ell}(x^1, x^2, \ldots, x^n) = \frac{\partial z^i}{\partial x^j} \frac{\partial z^i}{\partial x^\ell} \tag{1.46} $$

are the components of a symmetric covariant tensor of order 2, called the metric tensor.
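As a concrete instance of (1.46), the sketch below evaluates the metric components for plane polar coordinates, assuming $x = (r, \theta)$ and the RCC map $z^1 = r\cos\theta$, $z^2 = r\sin\theta$. This coordinate choice, the finite-difference Jacobian, and the function names are assumptions made for illustration rather than part of the text.

```python
import math

def z_of_x(r, theta):
    """Assumed map from polar coordinates x = (r, theta) to RCC coordinates z."""
    return (r * math.cos(theta), r * math.sin(theta))

def dz_dx(r, theta, h=1e-6):
    """Central-difference Jacobian with J[i][j] approximating d z^i / d x^j."""
    x = (r, theta)
    cols = []
    for j in range(2):
        xp = list(x)
        xm = list(x)
        xp[j] += h
        xm[j] -= h
        zp, zm = z_of_x(*xp), z_of_x(*xm)
        cols.append([(zp[i] - zm[i]) / (2 * h) for i in range(2)])
    # cols[j][i] holds d z^i / d x^j; transpose so that J[i][j] does.
    return [[cols[j][i] for j in range(2)] for i in range(2)]

def metric(r, theta):
    """Formula (1.46): g_jl = (dz^i/dx^j)(dz^i/dx^l), summed over i."""
    J = dz_dx(r, theta)
    return [[sum(J[i][j] * J[i][l] for i in range(2)) for l in range(2)]
            for j in range(2)]
```

Up to finite-difference error, `metric(r, theta)` should reproduce the familiar result $g_{11} = 1$, $g_{12} = g_{21} = 0$, $g_{22} = r^2$, i.e. $ds^2 = dr^2 + r^2\, d\theta^2$.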

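Why does the set of all $g_{j\ell}$ constitute a tensor? In a Euclidean space, the distance $ds$ is invariant (i.e. unchanged by the choice of coordinate system). Take a third system $\{y^i\}$, such that

$$ z^i = \bar{z}^i(y^1, y^2, \ldots, y^n) = z^i(x^1, x^2, \ldots, x^n) \tag{1.47} $$

and

$$ y^i = y^i(z^1, z^2, \ldots, z^n) \tag{1.48} $$

Now,

$$ \left. \begin{aligned} ds^2 &= \frac{\partial z^i}{\partial x^j} \frac{\partial z^i}{\partial x^\ell}\, dx^j dx^\ell \\ &= \frac{\partial z^i}{\partial y^j} \frac{\partial z^i}{\partial y^\ell}\, dy^j dy^\ell \end{aligned} \right\} \quad ds \text{ is invariant} \tag{1.49} $$

Therefore, take

$$
\begin{aligned}
y^i &= y^i(z^1, z^2, \ldots, z^n) \\
&= y^i\big(z^1(x^1, x^2, \ldots, x^n),\, z^2(x^1, x^2, \ldots, x^n),\, \ldots,\, z^n(x^1, x^2, \ldots, x^n)\big) \\
&= \bar{y}^i(x^1, x^2, \ldots, x^n)
\end{aligned}
\tag{1.50}
$$

and

$$ x^i = x^i(y^1, y^2, \ldots, y^n) \tag{1.51} $$

Now, one merely needs to establish that

$$ \left( \frac{\partial z^i}{\partial y^j} \frac{\partial z^i}{\partial y^\ell} \right) = \frac{\partial x^k}{\partial y^j} \frac{\partial x^m}{\partial y^\ell} \left( \frac{\partial z^i}{\partial x^k} \frac{\partial z^i}{\partial x^m} \right) \tag{1.52} $$

i.e., the tensorial transformation relation for $\binom{0}{2}$ tensors. Indeed,

$$
\begin{aligned}
ds^2 &= \frac{\partial z^i}{\partial y^j} \frac{\partial z^i}{\partial y^\ell}\, dy^j dy^\ell \\
&= \frac{\partial z^i}{\partial x^k} \frac{\partial z^i}{\partial x^m}\, dx^k dx^m \\
&= \frac{\partial z^i}{\partial x^k} \frac{\partial z^i}{\partial x^m} \left( \frac{\partial x^k}{\partial y^j}\, dy^j \right) \left( \frac{\partial x^m}{\partial y^\ell}\, dy^\ell \right)
\end{aligned}
\tag{1.53}
$$

hence

$$ \left( \frac{\partial z^i}{\partial y^j} \frac{\partial z^i}{\partial y^\ell} - \frac{\partial z^i}{\partial x^k} \frac{\partial z^i}{\partial x^m} \frac{\partial x^k}{\partial y^j} \frac{\partial x^m}{\partial y^\ell} \right) dy^j dy^\ell = 0 \tag{1.54} $$

which produces the desired result, since the $dy^j$ are arbitrary and the quantity in parentheses is symmetric in $j$ and $\ell$.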

Return to our initial finding,

$$ ds^2 = dz^i\, dz^i = g_{j\ell}\, dx^j dx^\ell \tag{1.55} $$

for the Euclidean space $E^n$, where

$$ g_{j\ell}(x^1, x^2, \ldots, x^n) = \frac{\partial z^i}{\partial x^j} \frac{\partial z^i}{\partial x^\ell} \tag{1.56} $$

and note that the metric tensor is defined for the coordinate system $\{x^i\}$ always in connection with the existing rectangular Cartesian coordinate system in $E^n$. The following question arises: what if the "background" $E^n$ is unavailable? Said differently: what if we do not know a priori that the space is Euclidean? Here, the distance between neighboring points can be defined again by

$$ ds^2 = g_{ij}(x^1, x^2, \ldots, x^n)\, dx^i dx^j \tag{1.57} $$

where $g_{ij}(x^1, x^2, \ldots, x^n)$ is a given symmetric tensor function of $(x^1, x^2, \ldots, x^n)$. In this case,

$$ \frac{\partial z^k}{\partial x^i} \frac{\partial z^k}{\partial x^j} = g_{ij}(x^1, x^2, \ldots, x^n) \tag{1.58} $$

is a system of $1 + 2 + \ldots + n = \frac{1}{2} n (n + 1)$ non-linear partial differential equations for n unknowns $z^k(x^1, x^2, \ldots, x^n)$. If the system possesses a solution, then the space is Euclidean (specifically, it is $E^n$). Otherwise, the space is Riemannian and $g_{ij}$ (taken to be symmetric and positive-definite) is called a Riemannian metric.

Note that a $\binom{0}{2}$-type tensor is positive-definite if its components form a positive-definite $n \times n$ matrix.
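To make (1.57) and the positive-definiteness condition concrete, the sketch below uses the unit sphere as an assumed example of a space specified directly by its metric: coordinates $(\theta, \phi)$ with $g_{11} = 1$, $g_{12} = g_{21} = 0$, $g_{22} = \sin^2\theta$. The sphere, the midpoint-rule length approximation, and the helper names are illustrative assumptions and do not appear in the text.

```python
import math

def sphere_metric(theta, phi):
    """Assumed example: unit-sphere metric components in (theta, phi)."""
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def is_positive_definite_2x2(g):
    """Sylvester's criterion for a symmetric 2x2 matrix of components."""
    return g[0][0] > 0 and g[0][0] * g[1][1] - g[0][1] * g[1][0] > 0

def curve_length(curve, metric, t0, t1, steps=2000):
    """Approximate the integral of ds along curve(t), ds^2 = g_ij dx^i dx^j."""
    total = 0.0
    dt = (t1 - t0) / steps
    for k in range(steps):
        a = curve(t0 + k * dt)
        b = curve(t0 + (k + 1) * dt)
        mid = [(a[i] + b[i]) / 2 for i in range(2)]  # evaluate g at the midpoint
        dx = [b[i] - a[i] for i in range(2)]
        g = metric(*mid)
        ds2 = sum(g[i][j] * dx[i] * dx[j] for i in range(2) for j in range(2))
        total += math.sqrt(ds2)
    return total
```

For the circle of latitude `lambda t: (math.pi / 3, t)` traversed over $0 \le t \le 2\pi$, `curve_length` should return a value close to $2\pi \sin(\pi/3)$. The positive-definiteness check holds at that latitude but fails at $\theta = 0$, where this particular coordinate system is singular.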

1.4 Suggestions for further reading

Section 1.1

[1] F. Cajori. History of Mathematics. Third edition, Chelsea, New York, 1980. [This

book contains an excellent account of the history of pure and applied mathematics up

to the beginning of the twentieth century].

Section 1.2

[1] I.S. Sokolnikoff. Tensor Analysis: Theory and Applications. John Wiley, New

York, 1951. [A most useful reference to the classical treatment of tensors].

