
A. Elementary Tensor Analysis

This appendix is intended to provide the mathematical preliminaries needed for a clear and rigorous presentation of the basic principles in continuum mechanics. The reader is expected to be familiar with some notions of vector spaces or matrix algebra. In the first part, we shall review some basic notions of vector spaces and linear transformations. At the same time, elementary properties of tensors as well as tensor notations will be introduced. All mathematical symbols are set in italic. Generally, scalars are represented by letters (a, b, α, ···) of normal typeface, vectors by lowercase boldface letters (u, v, ···), and tensors by uppercase letters (T, F, ···). The identity tensor is represented by 1 (italic "one").

A.1 Linear Algebra

We shall consider finite-dimensional real vector spaces only. The field of real numbers is denoted by ℝ.

Definition. A vector space V is a set equipped with two operations:

1) v + u ∈ V, called addition of v and u in V,
2) αv ∈ V, called scalar multiplication of v ∈ V by α ∈ ℝ,

which satisfy the following rules: for any v, u, w ∈ V and any α, β ∈ ℝ,

1) v + u = u + v.
2) v + (u + w) = (v + u) + w.
3) There exists a null vector 0 ∈ V such that v + 0 = v.
4) For any v ∈ V, there exists −v ∈ V such that v + (−v) = 0.
5) α(βv) = (αβ)v.
6) (α + β)v = αv + βv.
7) α(v + u) = αv + αu.
8) 1v = v.

Definition. A set of vectors $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is said to be a basis of V if

1) it is a linearly independent set, i.e., for any $a_1, \dots, a_n \in \mathbb{R}$, if $a_1\mathbf{v}_1 + \cdots + a_n\mathbf{v}_n = \mathbf{0}$ then $a_1 = \cdots = a_n = 0$;
2) it spans the space V, i.e., any vector $\mathbf{u} \in V$ can be expressed as a linear combination of $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$.


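Let $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ be a basis of V. Then any vector $\mathbf{u} \in V$ can be expressed as

\[ \mathbf{u} = \sum_{i=1}^{n} u^i\,\mathbf{e}_i, \]

where the $u^i$, called the components of $\mathbf{u}$, are uniquely determined relative to the basis $\{\mathbf{e}_i\}$.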

A vector space can have many different bases, but all of them have the same number of elements. The number of elements in a basis is called the dimension of the space; in this case, dim V = n.

A.1.1 Inner Product

We may think of a vector as a geometric object that has a length and points in a certain direction. To incorporate this notion we introduce an additional structure, the inner product, into the vector space.

Definition. An inner product is a map

g : V × V → ℝ

with the following properties: for any u, v, w ∈ V and α ∈ ℝ,

1) g(u + αv, w) = g(u, w) + α g(v, w),
2) g(u, v) = g(v, u),
3) g(u, u) > 0 if u ≠ 0.

An inner product is a positive-definite symmetric bilinear function on V. We call g(u, v) the inner product of u and v. A vector space equipped with an inner product is called an inner product space. Hereafter, all vector spaces considered are inner product spaces.

Notation. g(u,v) = u · v, if g is given and fixed.

Definition. The norm of a vector v ∈ V is defined as

\[ |\mathbf{v}| = \sqrt{\mathbf{v} \cdot \mathbf{v}}. \]

A vector space equipped with such a norm is called a Euclidean vector space.

The notion of angle between two vectors can be defined based on the following Schwarz inequality:

|u · v| ≤ |u| |v| . (A.1)

Definition. For any non-zero u, v ∈ V, the angle between u and v, θ(u, v) ∈ [0, π], is defined by

\[ \cos\theta(\mathbf{u}, \mathbf{v}) = \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}|\,|\mathbf{v}|}. \]


The vectors u and v are said to be orthogonal if θ(u, v) = π/2. Obviously, u and v are orthogonal if and only if u · v = 0.

A vector v is called a unit vector if |v| = 1. The projection of a vector u on the vector v can be defined as |u| cos θ(u, v), or as (u · e), where e = v/|v| is the unit vector in the direction of v. The vector (u · e) e is called the projection vector of u in the direction of v.
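As a small numerical illustration of the angle and the projection, the following sketch assumes the standard inner product on ℝ² in Cartesian coordinates. The function names (`dot`, `norm`, `angle`, `projection_vector`) and the sample vectors are illustrative choices, not part of the text.

```python
import math

def dot(u, v):
    """Standard inner product on R^n (an assumption of this sketch)."""
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    """|v| = sqrt(v . v)."""
    return math.sqrt(dot(v, v))

def angle(u, v):
    """Angle in [0, pi] between two non-zero vectors."""
    c = dot(u, v) / (norm(u) * norm(v))
    # Clamp to [-1, 1] to guard against rounding just outside the range.
    return math.acos(max(-1.0, min(1.0, c)))

def projection_vector(u, v):
    """(u . e) e with e = v / |v|, written without an explicit square root."""
    s = dot(u, v) / dot(v, v)
    return [s * a for a in v]

u, v = [3.0, 4.0], [1.0, 0.0]   # sample vectors
theta = angle(u, v)
p = projection_vector(u, v)
```

Here `norm(u) * math.cos(theta)` and `dot(u, e)` with `e = v/|v|` both give the scalar projection of u on v.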

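Let $\{\mathbf{e}_i,\ i = 1, \dots, n\}$ be a basis of V. Denote the inner product of $\mathbf{e}_i$ and $\mathbf{e}_j$ by $g_{ij}$,

\[ g_{ij} = \mathbf{e}_i \cdot \mathbf{e}_j. \]

Clearly, $g_{ij}$ is symmetric, $g_{ij} = g_{ji}$. Let $\mathbf{u} = u^i\mathbf{e}_i$, $\mathbf{v} = v^j\mathbf{e}_j$ be arbitrary vectors in V expressed in terms of the basis $\{\mathbf{e}_i\}$. Then

\[ \mathbf{u} \cdot \mathbf{v} = (u^i\mathbf{e}_i) \cdot (v^j\mathbf{e}_j) = u^i v^j\,(\mathbf{e}_i \cdot \mathbf{e}_j) = u^i v^j g_{ij}, \]

or

\[ \mathbf{u} \cdot \mathbf{v} = g_{ij}\,u^i v^j. \tag{A.2} \]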

Here we have used the following summation convention.

Notation. (Summation convention) In the expression of a term, if an index is repeated once (and only once), a summation over the range of this index is assumed.

For example,

\[ u^i\mathbf{e}_i = \sum_{i=1}^{n} u^i\,\mathbf{e}_i, \qquad g_{ij}u^iv^j = \sum_{i=1}^{n}\sum_{j=1}^{n} g_{ij}\,u^i v^j. \]

Note that in these expressions, we purposely write the indices at two different levels so that the repeated summation indices are always one superindex and one subindex. The reason for doing so will become clear in the next section.
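The double sum in (A.2) can be checked numerically. The sketch below assumes ℝ² with the standard inner product and uses a hypothetical non-orthogonal basis e_1 = (1, 1), e_2 = (0, 2); the component values are arbitrary sample data.

```python
def dot(u, v):
    """Standard inner product in Cartesian coordinates."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical non-orthogonal basis of R^2.
basis = [(1, 1), (0, 2)]
n = len(basis)

# Metric components g_ij = e_i . e_j
g = [[dot(basis[i], basis[j]) for j in range(n)] for i in range(n)]

def from_components(comp):
    """Assemble the Cartesian vector u = u^i e_i from its components u^i."""
    return tuple(sum(comp[i] * basis[i][k] for i in range(n)) for k in range(n))

u_comp = [2, -1]   # u^i
v_comp = [1, 3]    # v^j

# Explicit double sum implied by the summation convention in (A.2).
lhs = sum(g[i][j] * u_comp[i] * v_comp[j] for i in range(n) for j in range(n))
# Inner product computed directly from the assembled vectors.
rhs = dot(from_components(u_comp), from_components(v_comp))
```

Both `lhs` and `rhs` evaluate the same inner product, once through the components and the metric, once through Cartesian coordinates.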

A.1.2 Dual Bases

Let $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ be a basis of V. There exists a non-zero vector orthogonal to the plane spanned by the n − 1 vectors $\{\mathbf{e}_2, \dots, \mathbf{e}_n\}$, and if, in addition, the projection of this vector on $\mathbf{e}_1$ is prescribed, then this vector is uniquely determined. In this manner, for any given basis $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$, we can construct a set of vectors $\{\mathbf{e}^1, \dots, \mathbf{e}^n\}$ such that

\[ \mathbf{e}^i \cdot \mathbf{e}_j = \delta^i_j, \]

where $\delta^i_j$ is called the Kronecker delta, defined by

\[ \delta^i_j = \begin{cases} 0, & \text{if } i \ne j, \\ 1, & \text{if } i = j. \end{cases} \]

From this construction, if $\mathbf{v} = v^i\mathbf{e}_i$ is a vector in V, then by taking the inner product with $\mathbf{e}^i$ we have

\[ \mathbf{e}^i \cdot \mathbf{v} = \mathbf{e}^i \cdot (v^j\mathbf{e}_j) = v^j\,\delta^i_j = v^i. \]

Hence, the i-th component of $\mathbf{v}$ relative to the basis $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ is its inner product with the vector $\mathbf{e}^i$. Therefore, this set of vectors $\{\mathbf{e}^i\}$ associated with the basis $\{\mathbf{e}_i\}$ can be regarded as linear functions that map a vector to its components.¹

We can easily show that this new set of vectors is linearly independent. Indeed, if a linear combination $a_j\mathbf{e}^j = \mathbf{0}$, then it follows that $(a_j\mathbf{e}^j) \cdot \mathbf{e}_i = a_j\,\delta^j_i = a_i = 0$ for all i. Furthermore, it also spans the space V, for if $\mathbf{u} = u^i\mathbf{e}_i$ is a vector in V, then for any vector $\mathbf{v} = v^i\mathbf{e}_i$, from (A.2) and $v^j = \mathbf{e}^j \cdot \mathbf{v}$,

\[ \mathbf{u} \cdot \mathbf{v} = g_{ij}\,u^i v^j = (g_{ij}\,u^i\,\mathbf{e}^j) \cdot \mathbf{v}, \]

which implies that $\mathbf{u}$ can be expressed as $\mathbf{u} = u_i\mathbf{e}^i$ with

\[ u_i = g_{ij}\,u^j. \]

Therefore we have proved that this set of vectors $\{\mathbf{e}^i\}$ is also a basis of V.

Definition. Let $\beta = \{\mathbf{e}_i\}$ and $\beta^* = \{\mathbf{e}^i\}$ be two bases of V related by the property

\[ \mathbf{e}^i \cdot \mathbf{e}_j = \delta^i_j. \]

They are said to be a pair of dual bases for V, or β* is the dual basis of β.
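For a concrete feel of the duality condition, the following sketch computes the dual basis of a hypothetical basis of ℝ² (standard inner product assumed). It relies on the observation that if the basis vectors form the columns of a matrix E, the rows of E⁻¹ satisfy the defining property of the dual basis. The function name `dual_basis_2d` is an illustrative choice.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def dual_basis_2d(e1, e2):
    """Dual basis {e^1, e^2} of a basis {e_1, e_2} of R^2.

    With E = [e_1 | e_2] as columns, the rows of E^{-1} are the dual
    vectors, since (E^{-1} E)_{ij} = e^i . e_j = delta^i_j.
    """
    (a, c), (b, d) = e1, e2          # E = [[a, b], [c, d]]
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("vectors are linearly dependent")
    return (d / det, -b / det), (-c / det, a / det)

e = [(1, 1), (0, 2)]                 # hypothetical basis e_1, e_2
e_dual = dual_basis_2d(*e)           # (e^1, e^2)
```

Exact rational arithmetic (`Fraction`) is used so that the orthogonality relations hold without rounding error.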

The dual bases are uniquely determined from each other. For this reason, we have used the same notation for their elements, with only the level of the indices distinguishing them. Clearly, if $\mathbf{u}$ is a vector in V, then we can express $\mathbf{u}$ in terms of components in two different ways relative to the dual bases,

\[ \mathbf{u} = u^i\,\mathbf{e}_i = u_j\,\mathbf{e}^j, \]

where we have also employed different levels of component indices in order to be consistent with our summation convention, which sums over repeated indices at different levels. We call

$u^i$ the i-th contravariant component of $\mathbf{u}$,
$u_j$ the j-th covariant component of $\mathbf{u}$.

¹ In general, the space of all linear functions on V is called the dual space of V and is denoted by V*. In this note, for simplicity, we shall not distinguish vectors in V* and V when the space V is equipped with an inner product.

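From the definition, it follows that

\[ u^i = \mathbf{e}^i \cdot \mathbf{u}, \qquad u_j = \mathbf{e}_j \cdot \mathbf{u}, \tag{A.3} \]

and they are related by

\[ u_i = g_{ij}\,u^j, \qquad u^i = g^{ij}\,u_j, \]

where we have denoted

\[ g^{ij} = \mathbf{e}^i \cdot \mathbf{e}^j. \]

The two operations

\[ g_{ij} : u^j \mapsto u_i, \qquad g^{ij} : u_j \mapsto u^i, \]

enable us to lower and raise the component index. One can also show that

\[ \mathbf{e}_j = g_{ij}\,\mathbf{e}^i, \qquad \mathbf{e}^i = g^{ij}\,\mathbf{e}_j. \]

Therefore, lowering or raising the index for dual bases can be made in the same manner. It is easy to verify that

\[ g^{ij}g_{jk} = \delta^i_k. \]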
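Raising and lowering an index can be tried out numerically. The sketch below again assumes ℝ² with the standard inner product and a hypothetical basis; it uses the last relation above, which says that the matrix $[g^{ij}]$ is the inverse of $[g_{ij}]$. The helper `inverse_2x2` is an illustrative utility.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def inverse_2x2(m):
    """Exact inverse of a 2x2 matrix given as nested sequences."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

basis = [(1, 1), (0, 2)]                                  # hypothetical basis e_i
g_lo = [[dot(a, b) for b in basis] for a in basis]        # g_ij = e_i . e_j
g_up = inverse_2x2(g_lo)                                  # g^ij, inverse of [g_ij]

u = (3, 1)                                                # a vector in Cartesian form
u_lo = [dot(e, u) for e in basis]                         # covariant: u_i = e_i . u
u_up = [sum(g_up[i][j] * u_lo[j] for j in range(2)) for i in range(2)]    # raise index
u_back = [sum(g_lo[i][j] * u_up[j] for j in range(2)) for i in range(2)]  # lower again

# Reassemble u = u^i e_i as a consistency check.
u_rebuilt = tuple(sum(u_up[i] * basis[i][k] for i in range(2)) for k in range(2))
```

Lowering after raising returns the original covariant components, and the contravariant components reproduce u when combined with the basis vectors.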

A basis $\{\mathbf{e}_i\}$ is called an orthogonal basis if all the elements of the basis are mutually orthogonal, i.e.,

\[ \mathbf{e}_i \cdot \mathbf{e}_j = 0 \quad \text{if } i \ne j. \]

If, in addition, $|\mathbf{e}_i| = 1$ for all i, it is called an orthonormal basis. Although in general we carefully do our bookkeeping of super- and subindices, this becomes unnecessary if $\beta = \{\mathbf{e}_i\}$ is an orthonormal basis, since then $g_{ij} = \delta_{ij}$ and

\[ \mathbf{e}_i = g_{ij}\,\mathbf{e}^j = \delta_{ij}\,\mathbf{e}^j = \mathbf{e}^i. \]

That is, the basis β is identical to its dual basis β*. Hence, we do not have to distinguish contravariant and covariant components. In this case, we can write all the indices at the same level, for example,

\[ \mathbf{v} = v_i\,\mathbf{e}_i. \]

Of course, according to our summation convention, we still sum over the repeated indices (now at the same level) in this situation.


Exercise A.1.1 Let $\beta' = \{\mathbf{e}_1 = (1, 0),\ \mathbf{e}_2 = (2, 1)\}$ be a basis of $\mathbb{R}^2$, and $\mathbf{v} = (1, -1)$ be a vector in $\mathbb{R}^2$.

1) Find the dual basis $\{\mathbf{e}^1, \mathbf{e}^2\}$ of β′.
2) Determine the matrix representations $[g_{ij}]$ and $[g^{ij}]$ relative to β′.
3) Determine the contravariant and covariant components of $\mathbf{v}$ relative to the bases and make a graphic representation of the results.

A.1.3 Tensor Product

The notion of matrix is related to linear functions on vector spaces. Let U and V be two vector spaces with inner product. A function T : U → V is called a linear transformation from U to V if, for any u, v ∈ U and α ∈ ℝ,

T (u + αv) = T (u) + αT (v).

Notation. L(U, V ) = {T : U → V | T is linear}.

If T and S are two linear transformations in L(U, V), we can define the addition T + S and the scalar multiplication αT, as transformations in L(U, V), in the following manner: for all v ∈ U,

(T + S)(v) = T (v) + S(v),

(αT )(v) = αT (v).

With these operations the set L(U, V ) becomes a vector space.

Definition. For any vectors v ∈ V and u ∈ U, the tensor product of v and u, denoted by v ⊗ u, is defined as a linear transformation from U to V such that

(v ⊗ u)(w) = (u ·w)v, (A.4)

for any w ∈ U .

The tensor product of two vectors is a linear transformation. We call such a linear transformation a simple tensor. Of course, not every linear transformation can be obtained as a tensor product of two vectors. However, we can show that, indeed, it can always be expressed as a linear combination of simple tensors.
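The definition (A.4) can be mirrored directly in code. This sketch assumes Cartesian coordinates with the standard inner product, V = ℝ² and U = ℝ³; the names `tensor_product` and `outer_matrix` are illustrative.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def tensor_product(v, u):
    """Return the simple tensor v (x) u as the map w -> (u . w) v, as in (A.4)."""
    def apply(w):
        s = dot(u, w)
        return [s * a for a in v]
    return apply

def outer_matrix(v, u):
    """Matrix of v (x) u relative to the standard bases: entries v_i u_j."""
    return [[a * b for b in u] for a in v]

v, u, w = [1, 2], [3, 4, 5], [1, 0, -1]    # v in R^2; u, w in R^3 (sample data)
T = tensor_product(v, u)                   # a linear map from R^3 to R^2
M = outer_matrix(v, u)                     # its 2x3 matrix
Mw = [dot(row, w) for row in M]            # matrix-vector product M w
```

Applying the map `T` to `w` and multiplying the matrix `M` by `w` give the same vector. Every row of `M` is a multiple of `u`, which is one way to see why a general linear transformation need not be a simple tensor.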

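Proposition. Let $\{\mathbf{e}_i\}$, i = 1, ···, n, and $\{\mathbf{d}_\alpha\}$, α = 1, ···, m, be bases of V and U, respectively. Then the set $\{\mathbf{e}_i \otimes \mathbf{d}^\alpha\}$, i = 1, ···, n, α = 1, ···, m, forms a basis of L(U, V).

Proof. Let $\{\mathbf{e}^i\}$ be the dual basis of $\{\mathbf{e}_i\}$ and $\{\mathbf{d}^\alpha\}$ the dual basis of $\{\mathbf{d}_\alpha\}$. If $a^i{}_\alpha\,\mathbf{e}_i \otimes \mathbf{d}^\alpha = 0$, then

\[ a^i{}_\alpha\,(\mathbf{e}_i \otimes \mathbf{d}^\alpha)(\mathbf{d}_\beta) = a^i{}_\alpha\,(\mathbf{d}^\alpha \cdot \mathbf{d}_\beta)\,\mathbf{e}_i = a^i{}_\alpha\,\delta^\alpha_\beta\,\mathbf{e}_i = a^i{}_\beta\,\mathbf{e}_i = \mathbf{0}, \]

which implies that $a^i{}_\beta = 0$, since $\{\mathbf{e}_i\}$ is a basis. Therefore, $\{\mathbf{e}_i \otimes \mathbf{d}^\alpha\}$ is a linearly independent set. Moreover, for any T ∈ L(U, V), let

\[ \mathbf{e}^i \cdot T(\mathbf{d}_\alpha) = T^i{}_\alpha. \]

Then, for any $\mathbf{v} \in V$ and any $\mathbf{u} \in U$,

\[ \mathbf{v} \cdot T(\mathbf{u}) = v_i\,\mathbf{e}^i \cdot T(u^\alpha\mathbf{d}_\alpha) = v_i u^\alpha\,\mathbf{e}^i \cdot T(\mathbf{d}_\alpha) = T^i{}_\alpha\,v_i u^\alpha. \]

On the other hand,

\[ \mathbf{v} \cdot (\mathbf{e}_i \otimes \mathbf{d}^\alpha)(\mathbf{u}) = v_j\,\mathbf{e}^j \cdot (\mathbf{e}_i \otimes \mathbf{d}^\alpha)(u^\beta\mathbf{d}_\beta) = v_j u^\beta\,(\mathbf{e}^j \cdot \mathbf{e}_i)(\mathbf{d}^\alpha \cdot \mathbf{d}_\beta) = v_i u^\alpha. \]

Therefore, we have

\[ \mathbf{v} \cdot T(\mathbf{u}) = T^i{}_\alpha\,\mathbf{v} \cdot (\mathbf{e}_i \otimes \mathbf{d}^\alpha)(\mathbf{u}) \]

for any $\mathbf{v}$ and any $\mathbf{u}$, which leads to

\[ T = T^i{}_\alpha\,\mathbf{e}_i \otimes \mathbf{d}^\alpha. \]

That is, $\{\mathbf{e}_i \otimes \mathbf{d}^\alpha\}$ spans the space L(U, V). □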

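We may call L(U, V) the tensor product space of V and U and denote it by V ⊗ U. Obviously, from this result, we have

\[ \dim V \otimes U = (\dim V)(\dim U). \]

The basis $\{\mathbf{e}_i \otimes \mathbf{d}^\alpha\}$ is called a product basis of V ⊗ U. Similarly, the sets $\{\mathbf{e}_i \otimes \mathbf{d}_\alpha\}$, $\{\mathbf{e}^i \otimes \mathbf{d}_\alpha\}$, and $\{\mathbf{e}^i \otimes \mathbf{d}^\alpha\}$ are also product bases of V ⊗ U.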

Notation. V ⊗ V = L(V ) = L(V, V ).

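We shall call linear transformations in L(V) second-order tensors. Let $\{\mathbf{e}_i\}$ and $\{\mathbf{e}^j\}$ be dual bases of V. A second-order tensor T then has different component forms relative to the different product bases,

\[ T = T^{ij}\,\mathbf{e}_i \otimes \mathbf{e}_j = T^i{}_j\,\mathbf{e}_i \otimes \mathbf{e}^j = T_i{}^j\,\mathbf{e}^i \otimes \mathbf{e}_j = T_{ij}\,\mathbf{e}^i \otimes \mathbf{e}^j, \]

where the various components are given by

\[ \begin{aligned} T^{ij} &= \mathbf{e}^i \cdot T\mathbf{e}^j, & T_{ij} &= \mathbf{e}_i \cdot T\mathbf{e}_j, \\ T^i{}_j &= \mathbf{e}^i \cdot T\mathbf{e}_j, & T_i{}^j &= \mathbf{e}_i \cdot T\mathbf{e}^j. \end{aligned} \tag{A.5} \]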


These components are called the associated components of the second-order tensor T. In classical tensor analysis, they are also called

$T^{ij}$: contravariant tensor of order 2,
$T_{ij}$: covariant tensor of order 2,
$T^i{}_j$, $T_i{}^j$: mixed tensors of order 2.

They are related by

\[ T^i{}_j = g_{kj}\,T^{ik} = g^{ik}\,T_{kj}, \quad \text{etc.}, \tag{A.6} \]

with the operations of raising or lowering the indices discussed in the previous section.

The matrices $[T^{ij}]$, $[T^i{}_j]$, $[T_i{}^j]$, $[T_{ij}]$ are called the matrix representations of T relative to the corresponding product bases. Note that the first index refers to the row and the second index refers to the column of the matrix. It is important to distinguish the level as well as the position order of the component indices. In general, $T^i{}_j \ne T_j{}^i$; therefore it may cause some confusion to write $T^i_j$ with i and j at the same position, one on top of the other. The relation (A.6) can be written in terms of matrix multiplication, in which i is the row-index of the first matrix and j is the column-index of the last matrix, while summing over the column-index of the first matrix with the row-index of the subsequent one,

\[ [T^i{}_j] = [T^{ik}]\,[g_{kj}] = [g^{ik}]\,[T_{kj}]. \]
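The matrix form of (A.6) lends itself to a quick numerical check. The sketch below assumes ℝ² with the standard inner product; the tensor `T_cart` (given by its Cartesian matrix) and the basis stored in the columns of `E` are hypothetical sample data, and the helpers are illustrative.

```python
from fractions import Fraction

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(r) for r in zip(*a)]

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

T_cart = [[1, 2], [0, 3]]       # hypothetical tensor, Cartesian matrix
E = [[1, 0], [1, 2]]            # columns: e_1 = (1, 1), e_2 = (0, 2)
E_dual = inverse_2x2(E)         # rows: dual vectors e^1, e^2

g_lo = matmul(transpose(E), E)                              # g_ij  = e_i . e_j
g_up = matmul(E_dual, transpose(E_dual))                    # g^ij  = e^i . e^j

T_lo = matmul(matmul(transpose(E), T_cart), E)              # T_ij  = e_i . T e_j
T_up = matmul(matmul(E_dual, T_cart), transpose(E_dual))    # T^ij  = e^i . T e^j
T_mixed = matmul(matmul(E_dual, T_cart), E)                 # T^i_j = e^i . T e_j
```

With these definitions, `matmul(T_up, g_lo)` and `matmul(g_up, T_lo)` both reproduce `T_mixed`, which is the content of (A.6) in matrix form.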

Note that if S, T ∈ L(V), then the composition S ∘ T, defined by (S ∘ T)(v) = S(T(v)) for all v ∈ V, is also in L(V). The composition S ∘ T will be more conveniently denoted by ST. In terms of components and matrix operation, we have

\[ [(ST)^i{}_j] = [S^i{}_k\,T^k{}_j] = [S^i{}_k]\,[T^k{}_j]. \]

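Example A.1.1 The identity transformation, $\mathit{1}\mathbf{v} = \mathbf{v}$ for any $\mathbf{v}$ in V, has the components

\[ \mathit{1} = \delta^i_j\,\mathbf{e}_i \otimes \mathbf{e}^j = \delta_i^j\,\mathbf{e}^i \otimes \mathbf{e}_j = g^{ij}\,\mathbf{e}_i \otimes \mathbf{e}_j = g_{ij}\,\mathbf{e}^i \otimes \mathbf{e}^j, \tag{A.7} \]

since from (A.5) we have

\[ \begin{aligned} \mathit{1}^i{}_j &= \mathbf{e}^i \cdot \mathit{1}\mathbf{e}_j = \mathbf{e}^i \cdot \mathbf{e}_j = \delta^i_j, \\ \mathit{1}_{ij} &= \mathbf{e}_i \cdot \mathit{1}\mathbf{e}_j = \mathbf{e}_i \cdot \mathbf{e}_j = g_{ij}. \end{aligned} \]

Therefore, the Kronecker deltas are the mixed components of the identity tensor, while $g_{ij}$ and $g^{ij}$ are just its covariant and contravariant components.

Note that the identity transformation is denoted by 1 (italic "one"). That is, 1 ∈ L(V), while the number 1 ∈ ℝ is a scalar quantity. □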


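Example A.1.2 For $\mathbf{v} = v^i\mathbf{e}_i$ and $\mathbf{u} = u^i\mathbf{e}_i$ in V, their tensor product has the component form

\[ \mathbf{v} \otimes \mathbf{u} = v^i u^j\,\mathbf{e}_i \otimes \mathbf{e}_j. \]

Let $\mathbf{v} = (v_1, v_2)$ and $\mathbf{u} = (u_1, u_2)$ be two vectors in $\mathbb{R}^2$. Then, relative to the standard basis of $\mathbb{R}^2$, the matrix of $\mathbf{v} \otimes \mathbf{u}$ is given by

\[ [\mathbf{v} \otimes \mathbf{u}] = \begin{bmatrix} v_1u_1 & v_1u_2 \\ v_2u_1 & v_2u_2 \end{bmatrix}. \]

This product is sometimes referred to as the dyadic product of vectors $\mathbf{v}$ and $\mathbf{u}$. □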

In general, the tensor products v ⊗ u and u ⊗ v belong to two different spaces, namely V ⊗ U and U ⊗ V, respectively. Even in the case V = U, by definition, v ⊗ u and u ⊗ v are different, i.e., the tensor product is not symmetric.

Definition. For A ∈ V ⊗ U, the transpose of A, denoted by $A^T$, is defined as a tensor in U ⊗ V such that

\[ \mathbf{v} \cdot A\mathbf{u} = \mathbf{u} \cdot A^T\mathbf{v}, \tag{A.8} \]

for any $\mathbf{v} \in V$ and any $\mathbf{u} \in U$.

For A ∈ V ⊗ U and B ∈ U ⊗ W, it follows immediately that

\[ (AB)^T = B^T A^T. \]

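Example A.1.3 For simple tensors, it follows that

\[ (\mathbf{v} \otimes \mathbf{u})^T = \mathbf{u} \otimes \mathbf{v}, \]

because for any $\mathbf{w}_1 \in U$, $\mathbf{w}_2 \in V$, we have

\[ \mathbf{w}_1 \cdot (\mathbf{v} \otimes \mathbf{u})^T\mathbf{w}_2 = \mathbf{w}_2 \cdot (\mathbf{v} \otimes \mathbf{u})\mathbf{w}_1 = (\mathbf{w}_2 \cdot \mathbf{v})(\mathbf{u} \cdot \mathbf{w}_1) = \mathbf{w}_1 \cdot (\mathbf{u} \otimes \mathbf{v})\mathbf{w}_2. \]

□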

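Example A.1.4 We have

\[ A(\mathbf{u} \otimes \mathbf{v}) = A\mathbf{u} \otimes \mathbf{v}, \qquad (\mathbf{u} \otimes \mathbf{v})A = \mathbf{u} \otimes A^T\mathbf{v}. \]

Indeed, for any vector $\mathbf{w} \in V$, we obtain

\[ A(\mathbf{u} \otimes \mathbf{v})\mathbf{w} = A\mathbf{u}\,(\mathbf{v} \cdot \mathbf{w}) = (A\mathbf{u} \otimes \mathbf{v})\mathbf{w}, \]

and

\[ (\mathbf{u} \otimes \mathbf{v})A\mathbf{w} = \mathbf{u}\,(\mathbf{v} \cdot A\mathbf{w}) = \mathbf{u}\,(A^T\mathbf{v} \cdot \mathbf{w}) = (\mathbf{u} \otimes A^T\mathbf{v})\mathbf{w}. \]

□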


If A is a second-order tensor in L(V), then the components of the transpose $A^T$ satisfy the following relations:

\[ \begin{aligned} (A^T)_{ij} &= A_{ji}, & (A^T)^i{}_j &= A_j{}^i, \\ (A^T)^{ij} &= A^{ji}, & (A^T)_i{}^j &= A^j{}_i. \end{aligned} \tag{A.9} \]

We see from these relations that for contravariant or covariant tensors the matrix of $A^T$ is simply the transpose of the matrix of A. However, from the second group of the relations in (A.9) for mixed tensors, this is not valid in general, since the matrix transpose of $[A^i{}_j]$, obtained by exchanging rows and columns, is $[A^j{}_i]$ instead of $[A_j{}^i]$.

A tensor S ∈ L(V) is called symmetric if $S^T = S$, and is called skew-symmetric if $S^T = -S$. In other words, S is symmetric if v · Su = u · Sv and S is skew-symmetric if v · Su = −u · Sv, for all u, v ∈ V.

Notation. Sym(V) = {S ∈ L(V) | $S^T = S$} and Skw(V) = {S ∈ L(V) | $S^T = -S$}.

Note that both Sym(V) and Skw(V) are subspaces of L(V). If S ∈ Sym(V), then its components satisfy

\[ S_{ij} = S_{ji}, \qquad S^{ij} = S^{ji}, \qquad S^i{}_j = S_j{}^i = g_{jk}\,g^{im}\,S^k{}_m. \]

In terms of matrix representation we have

\[ [S_{ij}] = [S_{ij}]^T, \qquad [S^{ij}] = [S^{ij}]^T. \]

Note that although S is symmetric, the matrix $[S^i{}_j]$ is not symmetric in general,

\[ [S^i{}_j] \ne [S^i{}_j]^T. \]
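A small numerical case makes the last remark concrete. The sketch assumes ℝ² with the standard inner product; the symmetric tensor `S` (Cartesian matrix) and the non-orthogonal basis in the columns of `E` are hypothetical sample data, and `E_inv` is the hand-computed inverse of `E`.

```python
from fractions import Fraction

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(r) for r in zip(*a)]

S = [[2, 1], [1, 0]]            # symmetric tensor, Cartesian matrix (sample data)
E = [[1, 0], [1, 2]]            # columns: e_1 = (1, 1), e_2 = (0, 2)
E_inv = [[Fraction(1), Fraction(0)],
         [Fraction(-1, 2), Fraction(1, 2)]]   # inverse of E; rows are e^1, e^2

S_lo = matmul(matmul(transpose(E), S), E)     # covariant components S_ij
S_mixed = matmul(matmul(E_inv, S), E)         # mixed components S^i_j
```

For this data the covariant matrix `S_lo` is symmetric, while the mixed matrix `S_mixed` is not, even though both represent the same symmetric tensor.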

A second-order tensor can also be regarded as a bilinear function in the following manner: for any A ∈ L(V), define the function on V × V, also denoted by A,

A(u,v) = u ·Av,

for any vectors u and v in V . In particular, for simple tensors, we have

(u⊗ v)(u′,v′) = u′ · (u⊗ v)v′ = (u · u′)(v · v′).

By employing the notion of multilinear functions, we can generalize tensor products to higher orders. For example, we can define a tensor product of three vectors u, v, and w as a trilinear function on V by

\[ (\mathbf{u} \otimes \mathbf{v} \otimes \mathbf{w})(\mathbf{u}', \mathbf{v}', \mathbf{w}') = (\mathbf{u} \cdot \mathbf{u}')(\mathbf{v} \cdot \mathbf{v}')(\mathbf{w} \cdot \mathbf{w}') \]

for any vectors u′, v′, and w′ in V. One can show as before that if $\{\mathbf{e}_i\}$ is a basis of V, then $\{\mathbf{e}_i \otimes \mathbf{e}_j \otimes \mathbf{e}_k\}$ is a product basis for the space of all trilinear functions on V. We shall denote this space as V ⊗ V ⊗ V and call it the space of third-order tensors. If S is a third-order tensor, then

\[ S = S^{ijk}\,\mathbf{e}_i \otimes \mathbf{e}_j \otimes \mathbf{e}_k = S^{ij}{}_k\,\mathbf{e}_i \otimes \mathbf{e}_j \otimes \mathbf{e}^k = \text{etc.} \]

There are several different component forms relative to the different product bases. In a similar manner, tensor products of higher orders can be defined. We write

\[ \otimes^k V = \underbrace{V \otimes \cdots \otimes V}_{k\ \text{times}} \]

for the space of tensors of order k. Clearly, $\dim \otimes^k V = (\dim V)^k$.

Exercise A.1.2 Let $\beta' = \{\mathbf{e}_1 = (1, 0),\ \mathbf{e}_2 = (2, 1)\}$ be a basis of $\mathbb{R}^2$, and $T \in L(\mathbb{R}^2)$ be defined by

\[ T(x_1, x_2) = (3x_1 + x_2,\ x_1 + 2x_2), \quad \forall\,(x_1, x_2) \in \mathbb{R}^2. \tag{A.10} \]

1) Show that T is a symmetric transformation.
2) Determine the matrices of the associated components of T relative to β′: $[T_{ij}]$, $[T^{ij}]$, $[T_i{}^j]$, $[T^i{}_j]$.

Note that the last two matrices are not symmetric.

A.1.4 Transformation Rules for Components

The components of a tensor relative to a basis are uniquely determined and their values depend on the basis. Therefore, if we make a change of basis, they must change accordingly. In this section, we shall establish the transformation rules for components of tensors under a change of basis.

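Consider a change of basis from $\beta = \{\mathbf{e}_i\}$ to another basis $\bar\beta = \{\bar{\mathbf{e}}_i\}$ given by

\[ \bar{\mathbf{e}}_k = M_k{}^j\,\mathbf{e}_j. \tag{A.11} \]

We call $[M_k{}^j]$ the transformation matrix for the change of basis from β to $\bar\beta$. By the use of (A.3), we have

\[ M_k{}^j = \bar{\mathbf{e}}_k \cdot \mathbf{e}^j, \]

from which we can also obtain the relation between the dual bases $\beta^* = \{\mathbf{e}^i\}$ and $\bar\beta^* = \{\bar{\mathbf{e}}^i\}$,

\[ \mathbf{e}^j = M_k{}^j\,\bar{\mathbf{e}}^k. \tag{A.12} \]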


The above two transformation relations (A.11) and (A.12) can be schematically represented by

\[ \beta \xrightarrow{\ M\ } \bar\beta, \qquad \beta^* \xleftarrow{\ M^T\ } \bar\beta^*. \]

In other words, if M changes a basis β to another basis $\bar\beta$, their corresponding dual bases β* and $\bar\beta^*$ are changed in the opposite direction through $M^T$.

The components of a vector transform in a similar manner. Indeed, let $\mathbf{v}$ be a vector in V, and

\[ \mathbf{v} = v^i\,\mathbf{e}_i = v_i\,\mathbf{e}^i = \bar v^j\,\bar{\mathbf{e}}_j = \bar v_j\,\bar{\mathbf{e}}^j. \]

One can easily verify that the transformation rules for the components are

\[ \bar v_k = M_k{}^j\,v_j, \qquad v^j = M_k{}^j\,\bar v^k, \tag{A.13} \]

which look exactly like those for the change of basis (A.11) and (A.12). In matrix notations, we have

\[ [\bar v_k] = [M_k{}^j]\,[v_j], \qquad [v^j] = [M_k{}^j]^T\,[\bar v^k], \]

or schematically

\[ [v_i] \xrightarrow{\ M\ } [\bar v_i], \qquad [v^i] \xleftarrow{\ M^T\ } [\bar v^i]. \]

That is, the covariant components transform in the same direction as the change of basis by M, while the contravariant components transform in the opposite direction by $M^T$. This is why such components are called covariant and contravariant in classical tensor analysis, in which tensors are defined through their transformation properties.

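For a second-order tensor A in L(V),

\[ A = A_{ij}\,\mathbf{e}^i \otimes \mathbf{e}^j = \bar A_{ij}\,\bar{\mathbf{e}}^i \otimes \bar{\mathbf{e}}^j = A_i{}^j\,\mathbf{e}^i \otimes \mathbf{e}_j = \bar A_i{}^j\,\bar{\mathbf{e}}^i \otimes \bar{\mathbf{e}}_j. \]

We have the following transformation rules,

\[ \begin{aligned} \bar A_{ij} &= A_{mn}\,M_i{}^m\,M_j{}^n, \\ \bar A_i{}^j &= A_m{}^n\,M_i{}^m\,(M^{-1})_n{}^j, \end{aligned} \tag{A.14} \]

where the matrix $[(M^{-1})_i{}^j]$ is the inverse matrix of $[M_i{}^j]$. In matrix notations, the transformation rules can be written as

\[ \begin{aligned} [\bar A_{ij}] &= [M_i{}^m]\,[A_{mn}]\,[M_j{}^n]^T, \\ [\bar A_i{}^j] &= [M_i{}^m]\,[A_m{}^n]\,[M_n{}^j]^{-1}. \end{aligned} \]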


Transformation rules for other components and for tensors of higher orders are similar. The general rule can easily be obtained by composing the transformation rules for covariant and contravariant components, as shown in (A.13) or (A.14).
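The covariant rules in (A.13) and (A.14) can be verified on a small case. The sketch assumes that the old basis β is the standard basis of ℝ² (so that the rows of M are simply the new basis vectors) and takes a hypothetical new basis and tensor; all names are illustrative.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def apply(A, v):
    """Matrix-vector product A v."""
    return [dot(row, v) for row in A]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(r) for r in zip(*a)]

A = [[1, 2], [3, 4]]             # components relative to the standard basis (sample data)
bar_e = [(1, 1), (0, 2)]         # hypothetical new basis vectors
M = [list(b) for b in bar_e]     # M_k^j = bar_e_k . e^j; rows are the new basis vectors

# Direct definition: bar_A_ij = bar_e_i . A bar_e_j
direct = [[dot(bi, apply(A, bj)) for bj in bar_e] for bi in bar_e]
# Transformation rule (A.14) in matrix form: [M][A][M]^T
by_rule = matmul(matmul(M, A), transpose(M))

# Covariant vector components, rule (A.13): bar_v_k = M_k^j v_j
v = (3, 1)
v_bar_lo = [dot(b, v) for b in bar_e]
```

The matrix `direct`, computed from the definition of the components, agrees with `by_rule`, computed from the transformation rule, and `v_bar_lo` agrees with `apply(M, v)`.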

Exercise A.1.3 Let $\beta = \{\mathbf{e}_1 = (1, 0),\ \mathbf{e}_2 = (0, 1)\}$ and $\bar\beta = \{\bar{\mathbf{e}}_1 = (1, 0),\ \bar{\mathbf{e}}_2 = (2, 1)\}$ be two bases of $\mathbb{R}^2$. Determine the transformation matrix of the change of basis from β to $\bar\beta$ and also the transformation matrix from β* to $\bar\beta^*$. Let $T \in L(\mathbb{R}^2)$ be defined in (A.10). Determine the various components of T relative to the two different bases and verify the transformation rules.

Exercise A.1.4 For any two bases $\beta = \{\mathbf{e}_i\}$ and $\bar\beta = \{\bar{\mathbf{e}}_i\}$ of V, there exists a linear transformation A ∈ L(V) such that $\bar{\mathbf{e}}_k = A\mathbf{e}_k$. Show that the transformation matrix M for the change of basis from β to $\bar\beta$ is given by $M_k{}^j = \mathbf{e}^j \cdot A\mathbf{e}_k$, that is, $[M_k{}^j] = [A^j{}_k]$.

A.1.5 Determinant and Trace

In matrix algebra, the definition of the determinant of a square matrix is based on the notion of permutation. Let (1, ···, n) be an ordered set of natural numbers. A reordering of the elements in (1, ···, n) is called a permutation. More precisely, a permutation is a one-to-one mapping σ : {1, ···, n} → {1, ···, n} resulting in the ordered set (σ(1), ···, σ(n)). There are exactly n! permutations of (1, ···, n). A permutation that exchanges the order of two adjacent elements is called a transposition. It is known that any permutation can be obtained by a sequence of transpositions, and although the number of such transpositions is not unique for a given permutation, the parity of this number is. Hence, a permutation is called even or odd according to the parity of the number of transpositions needed to restore the natural order, and one can define the sign of a permutation, denoted sign σ, as +1 if σ is even and −1 if σ is odd.

Let $[M_{ij}]$ be a square matrix. The first index denotes the row and the second the column (it does not matter whether they are superindices or subindices). The determinant of the matrix can be calculated by

\[ \det[M_{ij}] = \sum_{\sigma} (\operatorname{sign}\sigma)\,M_{\sigma(1)1} \cdots M_{\sigma(n)n}, \tag{A.15} \]

where the summation is taken over all permutations of (1, ···, n).

On the other hand, since the matrix representation of a linear transformation depends on the choice of basis, the question arises of whether it is meaningful to define the determinant of a linear transformation as the determinant of its matrix representation. In the following, we shall see that the notion of the determinant of a linear transformation can be defined in a natural way, independent of the choice of basis, and we shall see how it is related to its matrix representations.
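The permutation formula (A.15) translates almost literally into code. The sketch below computes the sign by counting inversions, whose parity equals the parity of the number of adjacent transpositions needed to sort the permutation; indices are 0-based, and the function names are illustrative. The cost grows like n·n!, so this is meant only for small matrices.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation given as a tuple of 0-based indices."""
    inversions = sum(
        1
        for a in range(len(perm))
        for b in range(a + 1, len(perm))
        if perm[a] > perm[b]
    )
    return -1 if inversions % 2 else 1

def det(m):
    """Determinant of a square matrix by the permutation sum (A.15)."""
    n = len(m)
    total = 0
    for sigma in permutations(range(n)):
        term = sign(sigma)
        for col in range(n):
            term *= m[sigma[col]][col]   # factor M_{sigma(col), col}
        total += term
    return total
```

For example, `det([[1, 2], [3, 4]])` sums the two terms 1·4 and −(3·2).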

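Definition. Let V be a vector space of dimension n. A function $\omega : \underbrace{V \times \cdots \times V}_{n} \to \mathbb{R}$ is said to be an alternating n-linear form if it is n-linear and, for all $\mathbf{v}_1, \dots, \mathbf{v}_n \in V$ and every permutation σ,

\[ \omega(\mathbf{v}_{\sigma(1)}, \dots, \mathbf{v}_{\sigma(n)}) = (\operatorname{sign}\sigma)\,\omega(\mathbf{v}_1, \dots, \mathbf{v}_n). \tag{A.16} \]

ω is called non-trivial if there exist $\mathbf{u}_1, \dots, \mathbf{u}_n \in V$ such that $\omega(\mathbf{u}_1, \dots, \mathbf{u}_n) \ne 0$.

It is obvious that if ω is alternating then

\[ \omega(\cdots, \mathbf{u}, \cdots, \mathbf{v}, \cdots) = 0, \quad \text{if } \mathbf{u} = \mathbf{v}. \tag{A.17} \]

More generally, if $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is linearly dependent then $\omega(\mathbf{v}_1, \dots, \mathbf{v}_n) = 0$. In other words, if $\omega(\mathbf{v}_1, \dots, \mathbf{v}_n) \ne 0$ then $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is a linearly independent set, and since the number of vectors in this set equals dim V, $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is also a basis of V.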

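Theorem. (uniqueness) Let ω and ω′ be two alternating n-linear forms and ω be non-trivial. Then there exists a unique λ ∈ ℝ such that ω′ = λω, i.e., for all $\mathbf{v}_1, \dots, \mathbf{v}_n \in V$,

\[ \omega'(\mathbf{v}_1, \dots, \mathbf{v}_n) = \lambda\,\omega(\mathbf{v}_1, \dots, \mathbf{v}_n). \]

Proof. Since ω is non-trivial, there exists a set of vectors, say $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$, such that $\omega(\mathbf{e}_1, \dots, \mathbf{e}_n) \ne 0$, and hence it is a basis of V. Let λ be the number defined by

\[ \lambda = \frac{\omega'(\mathbf{e}_1, \dots, \mathbf{e}_n)}{\omega(\mathbf{e}_1, \dots, \mathbf{e}_n)}. \]

Suppose that $\mathbf{v}_1, \dots, \mathbf{v}_n \in V$, and

\[ \mathbf{v}_a = v^i{}_a\,\mathbf{e}_i, \quad a = 1, \dots, n. \]

Then, using (A.16) and (A.17) one can easily obtain

\[ \omega(\mathbf{v}_1, \dots, \mathbf{v}_n) = \alpha\,\omega(\mathbf{e}_1, \dots, \mathbf{e}_n), \qquad \omega'(\mathbf{v}_1, \dots, \mathbf{v}_n) = \alpha\,\omega'(\mathbf{e}_1, \dots, \mathbf{e}_n), \]

where

\[ \alpha = \sum_{\sigma} (\operatorname{sign}\sigma)\,v^{\sigma(1)}{}_1 \cdots v^{\sigma(n)}{}_n. \]

Therefore, we have

\[ \omega'(\mathbf{v}_1, \dots, \mathbf{v}_n) = \lambda\,\omega(\mathbf{v}_1, \dots, \mathbf{v}_n). \]

Moreover, this relation also shows that λ does not depend on the choice of basis. □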


Let T ∈ L(V) be a linear transformation on V, and ω be a non-trivial alternating n-linear form on V. Define a map Tω : V × ··· × V → ℝ by

\[ T\omega(\mathbf{v}_1, \dots, \mathbf{v}_n) = \omega(T\mathbf{v}_1, \dots, T\mathbf{v}_n). \tag{A.18} \]

Clearly it is alternating and n-linear; hence by the uniqueness theorem, there exists a unique λ ∈ ℝ such that

\[ T\omega = \lambda\,\omega. \]

We can easily see that the scalar λ so defined does not depend on the choice of ω. For, if ω′ is another non-trivial alternating n-linear form, then by the uniqueness theorem,

\[ \omega' = \mu\,\omega, \quad \mu \ne 0. \]

Therefore, with λ′ denoting the scalar associated with ω′, we have

\[ T\omega' = \lambda'\omega' = \lambda'\mu\,\omega. \]

On the other hand, we have

\[ T\omega'(\mathbf{v}_1, \dots, \mathbf{v}_n) = \omega'(T\mathbf{v}_1, \dots, T\mathbf{v}_n) = \mu\,\omega(T\mathbf{v}_1, \dots, T\mathbf{v}_n) = \mu\,T\omega(\mathbf{v}_1, \dots, \mathbf{v}_n) = \mu\lambda\,\omega(\mathbf{v}_1, \dots, \mathbf{v}_n), \]

which implies that

\[ T\omega' = \mu\lambda\,\omega. \]

Consequently, λ = λ′. Therefore, λ is uniquely determined by T alone and we can lay down the following definition.

Definition. For T ∈ L(V), the determinant of T, det T ∈ ℝ, is defined by the following relation,

\[ (\det T)\,\omega(\mathbf{v}_1, \dots, \mathbf{v}_n) = \omega(T\mathbf{v}_1, \dots, T\mathbf{v}_n), \tag{A.19} \]

for any non-trivial alternating n-linear form ω and for any $\mathbf{v}_1, \dots, \mathbf{v}_n \in V$.

The function det : L(V) → ℝ has the following properties:

1) $\det(\mathbf{u} \otimes \mathbf{v}) = 0$ (for n ≥ 2).
2) $\det(\alpha\mathit{1}) = \alpha^n$.
3) $\det(ST) = (\det S)(\det T)$.
4) $\det S^T = \det S$. (A.20)

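The first two properties are almost trivial. Here, let us verify property (3). By definition,

\[ \begin{aligned} \det(ST)\,\omega(\mathbf{v}_1, \dots, \mathbf{v}_n) &= \omega(ST\mathbf{v}_1, \dots, ST\mathbf{v}_n) = \omega(S(T\mathbf{v}_1), \dots, S(T\mathbf{v}_n)) \\ &= (\det S)\,\omega(T\mathbf{v}_1, \dots, T\mathbf{v}_n) = (\det S)(\det T)\,\omega(\mathbf{v}_1, \dots, \mathbf{v}_n). \end{aligned} \]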


Since it holds for any $\omega(\mathbf{v}_1, \dots, \mathbf{v}_n)$, relation (3) follows.

We can calculate the determinant of a linear transformation in terms of its component matrix. Let $\{\mathbf{e}_i\}$ be a basis of V, and $T = T^i{}_j\,\mathbf{e}_i \otimes \mathbf{e}^j$. Then by definition,

\[ \begin{aligned} (\det T)\,\omega(\mathbf{e}_1, \dots, \mathbf{e}_n) &= \omega(T\mathbf{e}_1, \dots, T\mathbf{e}_n) = \omega(T^{i_1}{}_1\,\mathbf{e}_{i_1}, \dots, T^{i_n}{}_n\,\mathbf{e}_{i_n}) \\ &= \sum_{\sigma} (\operatorname{sign}\sigma)\,T^{\sigma(1)}{}_1 \cdots T^{\sigma(n)}{}_n\ \omega(\mathbf{e}_1, \dots, \mathbf{e}_n). \end{aligned} \]

Hence, we obtain

\[ \det T = \sum_{\sigma} (\operatorname{sign}\sigma)\,T^{\sigma(1)}{}_1 \cdots T^{\sigma(n)}{}_n, \]

which assures that

\[ \det T = \det[T^i{}_j], \]

i.e., det T is equal to the determinant of the component matrix $[T^i{}_j]$ according to the definition (A.15). Similarly, one can show that it is also equal to the determinant of $[T_i{}^j]$. Therefore, we have

\[ \det T = \det[T^i{}_j] = \det[T_i{}^j] = \det[g^{ik}T_{kj}] = \det[g_{ik}T^{kj}]. \]

Note that det T is not equal to $\det[T_{ij}]$ nor to $\det[T^{ij}]$, unless $\det[g_{ij}] = 1$.

Similar to the determinant, another scalar can be associated with a linear transformation. Let T ∈ L(V), and ω be a non-trivial alternating n-linear form. Define a map Tω : V × ··· × V → ℝ by

\[ T\omega(\mathbf{v}_1, \dots, \mathbf{v}_n) = \sum_{i=1}^{n} \omega(\mathbf{v}_1, \dots, T\mathbf{v}_i, \dots, \mathbf{v}_n). \]

One can easily check that Tω is alternating and n-linear; hence by the uniqueness theorem, there exists a µ ∈ ℝ such that

\[ T\omega = \mu\,\omega. \]

Moreover, µ does not depend on the choice of ω. Therefore, we can make the following definition.

Definition. For T ∈ L(V), the trace of T, tr T ∈ ℝ, is defined by the following relation

\[ (\operatorname{tr} T)\,\omega(\mathbf{v}_1, \dots, \mathbf{v}_n) = \sum_{i=1}^{n} \omega(\mathbf{v}_1, \dots, T\mathbf{v}_i, \dots, \mathbf{v}_n), \tag{A.21} \]

for any non-trivial alternating n-linear form ω and for any $\mathbf{v}_1, \dots, \mathbf{v}_n \in V$.


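The function tr : L(V) → ℝ has the following properties:

1) $\operatorname{tr}(\alpha S + T) = \alpha \operatorname{tr} S + \operatorname{tr} T$.
2) $\operatorname{tr} \mathit{1} = n$.
3) $\operatorname{tr}(\mathbf{v} \otimes \mathbf{u}) = \mathbf{v} \cdot \mathbf{u}$.
4) $\operatorname{tr} S^T = \operatorname{tr} S$.
5) $\operatorname{tr}(ST) = \operatorname{tr}(TS)$. (A.22)

Property (1) states that the trace is a linear function on L(V). Here, let us prove property (3). Suppose that $\mathbf{v} = v^i\mathbf{e}_i$, then

\[ \begin{aligned} \operatorname{tr}(\mathbf{v} \otimes \mathbf{u})\,\omega(\mathbf{e}_1, \dots, \mathbf{e}_n) &= \sum_{i=1}^{n} \omega(\mathbf{e}_1, \dots, (\mathbf{v} \otimes \mathbf{u})\mathbf{e}_i, \dots, \mathbf{e}_n) \\ &= \sum_{i=1}^{n} (\mathbf{u} \cdot \mathbf{e}_i)\,\omega(\mathbf{e}_1, \dots, \mathbf{v}, \dots, \mathbf{e}_n) = \sum_{i=1}^{n} (\mathbf{u} \cdot \mathbf{e}_i)\,v^i\,\omega(\mathbf{e}_1, \dots, \mathbf{e}_n), \end{aligned} \]

which implies that

\[ \operatorname{tr}(\mathbf{v} \otimes \mathbf{u}) = \sum_{i=1}^{n} \mathbf{u} \cdot (v^i\mathbf{e}_i) = \mathbf{u} \cdot \mathbf{v}. \]

Hence (3) is proved.

In terms of components, let $T = T^i{}_j\,\mathbf{e}_i \otimes \mathbf{e}^j = T_i{}^j\,\mathbf{e}^i \otimes \mathbf{e}_j$, then

\[ \operatorname{tr} T = T^i{}_j \operatorname{tr}(\mathbf{e}_i \otimes \mathbf{e}^j) = T^j{}_j = T_j{}^j = g_{ij}T^{ij} = g^{ij}T_{ij}. \]

That is, tr T is equal to the sum of the diagonal elements of the matrix $[T^i{}_j]$ or $[T_i{}^j]$, but, in general, is not equal to that of the matrix $[T_{ij}]$ or $[T^{ij}]$.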
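These statements about which component matrices carry the determinant and the trace can be checked on a small case. The sketch assumes ℝ² with the standard inner product; the tensor `T` (Cartesian matrix), the basis in the columns of `E`, and the hand-computed inverse `E_inv` are hypothetical sample data.

```python
from fractions import Fraction

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(r) for r in zip(*a)]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def diag_sum(m):
    return m[0][0] + m[1][1]

T = [[1, 2], [3, 4]]             # Cartesian matrix: det T = -2, tr T = 5
E = [[1, 0], [1, 2]]             # columns: e_1 = (1, 1), e_2 = (0, 2)
E_inv = [[1, 0], [Fraction(-1, 2), Fraction(1, 2)]]   # rows: e^1, e^2

T_mixed = matmul(matmul(E_inv, T), E)          # [T^i_j]
T_lo = matmul(matmul(transpose(E), T), E)      # [T_ij]
g_lo = matmul(transpose(E), E)                 # [g_ij]
g_up = matmul(E_inv, transpose(E_inv))         # [g^ij]

# tr T = g^ij T_ij, written as an explicit double sum.
trace_via_metric = sum(g_up[i][j] * T_lo[i][j] for i in range(2) for j in range(2))
```

The mixed matrix reproduces both invariants, whereas the covariant matrix `T_lo` has a determinant scaled by `det2(g_lo)` and a different diagonal sum; contracting `T_lo` with `g_up` recovers the trace.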

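Example A.1.5 Show that $\det(\mathit{1} + \mathbf{u} \otimes \mathbf{v}) = 1 + \mathbf{u} \cdot \mathbf{v}$.

By definition, we have

\[ \begin{aligned} \det(\mathit{1} + \mathbf{u} \otimes \mathbf{v})\,\omega(\mathbf{e}_1, \dots, \mathbf{e}_n) &= \omega((\mathit{1} + \mathbf{u} \otimes \mathbf{v})\mathbf{e}_1, \dots, (\mathit{1} + \mathbf{u} \otimes \mathbf{v})\mathbf{e}_n) \\ &= \omega(\mathbf{e}_1, \dots, \mathbf{e}_n) + \sum_{i=1}^{n} \omega(\mathbf{e}_1, \dots, (\mathbf{u} \otimes \mathbf{v})\mathbf{e}_i, \dots, \mathbf{e}_n) + \cdots \\ &= \omega(\mathbf{e}_1, \dots, \mathbf{e}_n) + \operatorname{tr}(\mathbf{u} \otimes \mathbf{v})\,\omega(\mathbf{e}_1, \dots, \mathbf{e}_n), \end{aligned} \]

where the dots represent terms involving more than one factor of $(\mathbf{u} \otimes \mathbf{v})\mathbf{e}_i$ in ω. Since $(\mathbf{u} \otimes \mathbf{v})\mathbf{e}_i = (\mathbf{v} \cdot \mathbf{e}_i)\,\mathbf{u}$, which is a vector in the direction of $\mathbf{u}$ for any index i, those terms must all equal zero because ω is an alternating form. The result then follows from $\operatorname{tr}(\mathbf{u} \otimes \mathbf{v}) = \mathbf{u} \cdot \mathbf{v}$. □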
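The identity of Example A.1.5 can be spot-checked in ℝ³. The sketch assumes an orthonormal (Cartesian) basis, so that the matrix of u ⊗ v has entries u_i v_j; the vectors are arbitrary sample data and `det3` is a plain cofactor expansion.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

u = [1, 2, 3]
v = [4, 5, 6]

# Matrix of 1 + u (x) v in Cartesian components: delta_ij + u_i v_j
M = [[(1 if i == j else 0) + u[i] * v[j] for j in range(3)] for i in range(3)]

lhs = det3(M)
rhs = 1 + dot(u, v)
```

Both sides are integers here, so the comparison `lhs == rhs` is exact.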


Two non-trivial alternating n-linear forms $\omega_1$ and $\omega_2$ are said to be equivalent if $\omega_1 = \lambda\omega_2$ for some λ > 0. Clearly, this is an equivalence relation that decomposes the set of non-trivial alternating n-linear forms into two equivalence classes. Each of these classes is called an orientation of V. We call one of them, say ∆, the positive orientation. A basis $\{\mathbf{e}_i\}$ of V is called positively oriented if for any ω ∈ ∆,

\[ \omega(\mathbf{e}_1, \dots, \mathbf{e}_n) > 0, \]

and A ∈ L(V) is said to be orientation preserving if Aω ∈ ∆ for any ω ∈ ∆. Here, Aω is defined by (A.18). Since Aω = (det A)ω, A preserves the orientation if and only if det A > 0.

Let $\{\mathbf{e}_i\}$ and $\{\bar{\mathbf{e}}_i\}$ be two bases such that $A(\mathbf{e}_i) = \bar{\mathbf{e}}_i$. If det A > 0 (or < 0), then $\{\mathbf{e}_i\}$ and $\{\bar{\mathbf{e}}_i\}$ are said to have the same (or the opposite) orientation.

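Suppose that V is a three-dimensional vector space and let $\{\mathbf{i}_1, \mathbf{i}_2, \mathbf{i}_3\}$ be a positively oriented orthonormal basis of V. Then there exists a unique e ∈ ∆, called the volume element, such that

\[ e(\mathbf{i}_1, \mathbf{i}_2, \mathbf{i}_3) = 1. \]

Since e ∈ L(V × V × V, ℝ), it is a third-order tensor and can be represented as

\[ e = \varepsilon_{ijk}\,\mathbf{i}_i \otimes \mathbf{i}_j \otimes \mathbf{i}_k, \]

where $\varepsilon_{ijk} = e(\mathbf{i}_i, \mathbf{i}_j, \mathbf{i}_k)$ are the components of e relative to the basis $\{\mathbf{i}_k\}$. Obviously we have

\[ \varepsilon_{ijk} = \begin{cases} 1 & \text{if } (i, j, k) \text{ is an even permutation of } (1, 2, 3), \\ -1 & \text{if } (i, j, k) \text{ is an odd permutation of } (1, 2, 3), \\ 0 & \text{otherwise}, \end{cases} \]

and we call it the permutation symbol. One can easily check the following identities:

\[ \begin{aligned} \varepsilon_{ijk}\,\varepsilon_{imn} &= \delta_{jm}\delta_{kn} - \delta_{jn}\delta_{km}, \\ \varepsilon_{ijk}\,\varepsilon_{ijn} &= 2\,\delta_{kn}, \\ \varepsilon_{ijk}\,\varepsilon_{ijk} &= 6, \end{aligned} \tag{A.23} \]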

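where $\delta_{mn}$ is the Kronecker delta.

Let $\{\mathbf{e}_k\}$ be a basis and A ∈ L(V) be a change of basis from $\{\mathbf{i}_k\}$ to $\{\mathbf{e}_k\}$, i.e., $A\,\mathbf{i}_k = \mathbf{e}_k$. Then the covariant components of the volume element relative to $\{\mathbf{e}_k\}$ are

\[ e_{ijk} = e(\mathbf{e}_i, \mathbf{e}_j, \mathbf{e}_k), \qquad e = e_{ijk}\,\mathbf{e}^i \otimes \mathbf{e}^j \otimes \mathbf{e}^k. \]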
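The identities (A.23) for the permutation symbol can be confirmed by brute force over all index values. In this sketch indices run over 0, 1, 2 instead of 1, 2, 3, and the closed-form expression used for `eps` is one convenient way to encode the symbol, not a formula from the text.

```python
def eps(i, j, k):
    """Permutation symbol for indices in {0, 1, 2}.

    The product (i-j)(j-k)(k-i) is +2 for even permutations, -2 for odd
    ones, and 0 if two indices coincide, so integer division by 2 is exact.
    """
    return (i - j) * (j - k) * (k - i) // 2

def delta(a, b):
    """Kronecker delta."""
    return 1 if a == b else 0

R = range(3)

# eps_ijk eps_imn = delta_jm delta_kn - delta_jn delta_km
identity_1 = all(
    sum(eps(i, j, k) * eps(i, m, n) for i in R)
    == delta(j, m) * delta(k, n) - delta(j, n) * delta(k, m)
    for j in R for k in R for m in R for n in R
)
# eps_ijk eps_ijn = 2 delta_kn
identity_2 = all(
    sum(eps(i, j, k) * eps(i, j, n) for i in R for j in R) == 2 * delta(k, n)
    for k in R for n in R
)
# eps_ijk eps_ijk = 6
identity_3 = sum(eps(i, j, k) ** 2 for i in R for j in R for k in R) == 6
```

Each of the three flags records whether the corresponding identity holds for every choice of the free indices.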


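From (A.19), it follows that

\[ e_{ijk} = (\det A)\,\varepsilon_{ijk}, \]

and also,

\[ g_{ij} = \mathbf{e}_i \cdot \mathbf{e}_j = A\,\mathbf{i}_i \cdot A\,\mathbf{i}_j = (A^TA)_{ij}, \]

which yields $g = (\det A)^2$, where $g = \det[g_{ij}]$. Therefore, we have

\[ e_{ijk} = \sqrt{g}\,\varepsilon_{ijk}, \tag{A.24} \]

if A preserves the orientation. Similarly, the contravariant components of the volume element are

\[ e^{ijk} = e(\mathbf{e}^i, \mathbf{e}^j, \mathbf{e}^k), \qquad e = e^{ijk}\,\mathbf{e}_i \otimes \mathbf{e}_j \otimes \mathbf{e}_k, \]

and

\[ e^{ijk} = (\sqrt{g})^{-1}\,\varepsilon^{ijk}, \]

where $\varepsilon^{ijk} = \varepsilon_{ijk}$. Moreover, the identities (A.23) can be written as

\[ \begin{aligned} e^{ijk}\,e_{imn} &= \delta^j_m\delta^k_n - \delta^j_n\delta^k_m, \\ e^{ijk}\,e_{ijn} &= 2\,\delta^k_n, \\ e^{ijk}\,e_{ijk} &= 6. \end{aligned} \tag{A.25} \]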

If T ∈ L(V) and $T = T^i{}_j\,\mathbf{e}_i \otimes \mathbf{e}^j$, then (A.19) leads to the following formula for the determinant of T,

\[ e_{lmn}\,(\det T) = e_{ijk}\,T^i{}_l\,T^j{}_m\,T^k{}_n. \tag{A.26} \]

Multiplying by $e^{lmn}$ and using the last identity of (A.25), we obtain another formula for the determinant,

\[ \det T = \frac{1}{6}\,e^{lmn}\,e_{ijk}\,T^i{}_l\,T^j{}_m\,T^k{}_n. \]
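In an orthonormal basis, where $e_{ijk} = e^{ijk} = \varepsilon_{ijk}$ and index levels need not be distinguished, the last formula can be evaluated directly. The sketch below does this for 3×3 matrices with 0-based indices; `det_via_eps` is an illustrative name, and exact results are obtained for integer or `Fraction` entries.

```python
from fractions import Fraction

def eps(i, j, k):
    """Permutation symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def det_via_eps(T):
    """det T = (1/6) eps_lmn eps_ijk T_il T_jm T_kn (orthonormal basis assumed)."""
    R = range(3)
    total = sum(
        eps(l, m, n) * eps(i, j, k) * T[i][l] * T[j][m] * T[k][n]
        for l in R for m in R for n in R
        for i in R for j in R for k in R
    )
    return Fraction(total, 6)

T = [[2, 0, 1], [1, 3, 2], [0, 1, 1]]   # sample matrix
d = det_via_eps(T)
```

The six-fold sum visits 729 index combinations, of which only 36 are non-zero, so this form is useful for manipulation in index notation rather than for efficient computation.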

Exercise A.1.5 Consider the tensor defined by (A.10) in the previous exercise. Calculate det T and tr T by means of the definition and also by the use of component matrices relative to β′.

A.1.6 Exterior Product and Vector Product

The usual vector product on a three-dimensional vector space cannot be generalized directly to vector spaces in general. However, it can be associated with the skew-symmetric tensor product in a trivial manner.


Definition. For any v, u ∈ V, the exterior product of v and u, denoted v ∧ u, is defined by

\[ \mathbf{v} \wedge \mathbf{u} = \mathbf{v} \otimes \mathbf{u} - \mathbf{u} \otimes \mathbf{v}. \]

It is obvious that the operation ∧ : V × V → V ⊗ V is bilinear and skew-symmetric, i.e.,

\[ \mathbf{v} \wedge \mathbf{u} = -\,\mathbf{u} \wedge \mathbf{v}. \]

The exterior product of two vectors v ∧ u is a skew-symmetric tensor.

Suppose that $\{\mathbf{e}_i \otimes \mathbf{e}_j\}$, i, j = 1, ···, n, is a product basis of V ⊗ V. Then it is easy to verify that $\{\mathbf{e}_i \wedge \mathbf{e}_j\}$, 1 ≤ i < j ≤ n, is a basis for Skw(V). Therefore, we have the following proposition.

Proposition. If dim V = n, then dim Skw(V) = n(n − 1)/2. In particular, if n = 3, then dim Skw(V) = 3.

Now, suppose that V is an oriented Euclidean three-dimensional vector space. Since the space of skew-symmetric tensors is also three-dimensional, we can define a linear map

τ : Skw(V) → V

by the condition: for all u, v, w ∈ V,

τ(u ∧ v) · w = e(u, v, w). (A.27)

Here, e is the volume element of V. This linear map, called the duality map, is one-to-one and onto and hence establishes a one-to-one correspondence between a skew-symmetric tensor and a vector. It is easy to verify that

\[
\tau(e_i \wedge e_j) = e_{ijk}\, e^k. \tag{A.28}
\]

For an orthonormal basis $\{i_k\}$ the duality map τ establishes the following correspondence,

\[
i_1 \wedge i_2 \longmapsto i_3, \qquad
i_2 \wedge i_3 \longmapsto i_1, \qquad
i_3 \wedge i_1 \longmapsto i_2.
\]

For a skew-symmetric tensor W, let w = τ(W) be the associated vector, which shall be denoted more conveniently by

w = ⟨W⟩. (A.29)

In component form, if

\[
W = W^{ij}\, e_i \otimes e_j, \qquad W^{ij} = -W^{ji},
\]

or

\[
W = \tfrac{1}{2}\, W^{ij}\, e_i \wedge e_j,
\]

then it follows from (A.28) that

\[
w = \tfrac{1}{2}\, e_{ijk}\, W^{ij}\, e^k.
\]

If the basis is orthonormal, it becomes

\[
w_i = \tfrac{1}{2}\,\varepsilon_{ijk} W_{jk}, \qquad W_{ij} = \varepsilon_{ijk}\, w_k, \tag{A.30}
\]

or, in matrix form,

\[
[W_{ij}] =
\begin{bmatrix}
0 & w_3 & -w_2 \\
-w_3 & 0 & w_1 \\
w_2 & -w_1 & 0
\end{bmatrix}.
\]

Remark. It is worthwhile to point out that the vector associated with a skew-symmetric tensor behaves differently from usual vectors under linear transformations. To see this, let u, v ∈ V, then for any w ∈ V and any orthogonal transformation Q (with the property $Q^T = Q^{-1}$, see p. 24), it follows from the definition that

\[
\langle Qu \wedge Qv\rangle \cdot Qw = e(Qu, Qv, Qw) = (\det Q)\, e(u, v, w) = (\det Q)\,\langle u \wedge v\rangle \cdot w,
\]

which implies that

\[
\langle Qu \wedge Qv\rangle = (\det Q)\, Q\,\langle u \wedge v\rangle.
\]

In other words, as the vectors u, v are transformed into Qu, Qv, respectively, the vector ⟨u ∧ v⟩ is transformed into Q⟨u ∧ v⟩ only to within a scalar constant, or, into a vector that may point in one or the opposite sense of the same axial direction of ⟨Qu ∧ Qv⟩. For this reason, a vector associated with a skew-symmetric tensor is usually called an axial vector. □

The usual vector product, in the three-dimensional vector space, can now be defined from the exterior product in a similar manner.

Definition. For any u, v ∈ V, the vector product of u and v, denoted u × v, is defined by

u × v = ⟨u ∧ v⟩. (A.31)

Clearly the operation × : V × V → V is bilinear and skew-symmetric. In components, (A.31) gives

\[
u \times v = e^{ijk}\, u_j v_k\, e_i.
\]

If the basis is orthonormal, say $\{i_k\}$, then it becomes

\[
u \times v = \varepsilon_{ijk}\, u_j v_k\, i_i,
\]

which is the usual definition of the vector product. The relations (A.27) and (A.31) imply that

e(u, v, w) = (u × v) · w.

This is usually called the triple product of u, v, and w. For convenience, we shall also use the notation,

[u, v, w] = (u × v) · w.
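The correspondence between skew-symmetric tensors, axial vectors, and the vector product can be illustrated with components relative to an orthonormal basis. The sketch below assumes the conventions of (A.30) and (A.31); the helper names and sample vectors are hypothetical.

```python
def levi_civita(i, j, k):
    """Permutation symbol on the indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def wedge(u, v):
    """Components of u ∧ v = u ⊗ v − v ⊗ u."""
    return [[u[i] * v[j] - v[i] * u[j] for j in range(3)] for i in range(3)]

def axial(W):
    """Axial vector w_i = (1/2) eps_ijk W_jk, first relation of (A.30)."""
    return [0.5 * sum(levi_civita(i, j, k) * W[j][k]
                      for j in range(3) for k in range(3))
            for i in range(3)]

def skew(w):
    """Skew tensor W_ij = eps_ijk w_k, second relation of (A.30)."""
    return [[sum(levi_civita(i, j, k) * w[k] for k in range(3))
             for j in range(3)] for i in range(3)]

def cross(u, v):
    """Usual component formula for the vector product."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

u = [1.0, 2.0, 3.0]
v = [-1.0, 0.5, 2.0]
x = [0.3, -1.2, 2.0]

W = wedge(u, v)
w = axial(W)                      # should coincide with u × v
W_back = skew(w)                  # should reproduce W
Wx = [sum(W[i][j] * x[j] for j in range(3)) for i in range(3)]
minus_w_cross_x = [-c for c in cross(w, x)]   # W x = −w × x
print(w, cross(u, v))
```

With these sign conventions the action of W on a vector equals −w × x, which is the first relation of Exercise A.1.6.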

With this notation, we can rewrite the definitions (A.19) and (A.21) of the determinant and the trace in the following form

\[
\det A = \frac{[Ae_1, Ae_2, Ae_3]}{[e_1, e_2, e_3]}, \qquad
\operatorname{tr} A = \frac{[Ae_1, e_2, e_3] + [e_1, Ae_2, e_3] + [e_1, e_2, Ae_3]}{[e_1, e_2, e_3]}. \tag{A.32}
\]
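The formulas (A.32) hold for any basis, not only an orthonormal one. As a rough numerical illustration, the sketch below evaluates both quotients for a non-orthogonal basis; the matrix A (components of the tensor relative to the standard basis of IR³) and the basis vectors are arbitrary choices made for this example.

```python
def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def triple(u, v, w):
    """Triple product [u, v, w] = (u × v) · w."""
    return dot(cross(u, v), w)

def apply(A, v):
    """Action of the tensor with matrix A on the vector v."""
    return [dot(row, v) for row in A]

A = [[2.0, 1.0, 0.0],
     [0.5, 3.0, -1.0],
     [1.0, 0.0, 4.0]]

# A basis that is neither orthogonal nor normalized.
e1, e2, e3 = [1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [1.0, 1.0, 1.0]
Ae1, Ae2, Ae3 = apply(A, e1), apply(A, e2), apply(A, e3)

vol = triple(e1, e2, e3)
det_A = triple(Ae1, Ae2, Ae3) / vol
tr_A = (triple(Ae1, e2, e3) + triple(e1, Ae2, e3) + triple(e1, e2, Ae3)) / vol
print(det_A, tr_A)
```

The results agree with the determinant and the sum of the diagonal entries of the matrix, as expected from the basis independence of det and tr.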

One may use the duality map to identify a skew-symmetric tensor with an axial vector, as well as the exterior product with the vector product. In other words, one may interpret the duality in either way, as far as the context requires.

Exercise A.1.6 Verify the following relations, using index notations:
1) Wv = −w × v.
2) (u × v) × w = (u · w)v − (v · w)u.
3) |u × v|² = |u|²|v|² − |u · v|².
4) |u × v| = |u| |v| sin θ(u, v).

A.1.7 Second-Order Tensors

We shall review some of the important properties of linear transformations, i.e., the second-order tensors, mostly without proofs in this section. The proofs can be found in most standard books in linear algebra.

First, let us introduce an inner product of two second-order tensors. Let A, B ∈ L(V); we can define the inner product of A and B by

\[
A \cdot B = \operatorname{tr}(AB^T),
\]

which is obviously a bilinear, symmetric and positive-definite operation. We have

1 · A = tr A,

where 1 is the identity tensor, and for any A, B, C ∈ L(V),

\[
AB \cdot C = B \cdot A^T C = A \cdot CB^T.
\]

The norm of a tensor A ∈ L(V) can then be defined as

\[
|A| = \sqrt{A \cdot A} = \sqrt{\operatorname{tr} AA^T}.
\]

Note that if $A_{ij}$ are the components of A relative to an orthonormal basis, then the norm of A is simply

\[
|A| = \left(A_{11}^2 + A_{12}^2 + \cdots + A_{nn}^2\right)^{1/2}.
\]

Now, suppose that A ∈ L(V) is one-to-one (therefore, onto), then there is a unique $A^{-1}$ ∈ L(V), called the inverse of A, such that

\[
AA^{-1} = A^{-1}A = 1.
\]

If $A^{-1}$ exists, A is said to be invertible or non-singular, otherwise, it is said to be singular. It can be proved that A is invertible if and only if det A ≠ 0, and for any non-singular A and B,

\[
(AB)^{-1} = B^{-1}A^{-1}, \qquad (A^{-1})^T = (A^T)^{-1} = A^{-T}.
\]

Notation. Inv(V) = {F ∈ L(V) | F is invertible}.

Recall that a set G is called a group if it has the following properties:

1) If A, B ∈ G then AB ∈ G.
2) If A, B, C ∈ G then A(BC) = (AB)C.
3) There exists an identity element 1 ∈ G such that 1A = A1 = A, for any A ∈ G.
4) For any A ∈ G, there exists $A^{-1}$ ∈ G, such that $AA^{-1} = A^{-1}A = 1$.

It is easy to verify that Inv(V) forms a group under the operation of composition. It is usually known as the general linear group of V, denoted by GL(V).


Definition. Q ∈ L(V) is called an orthogonal transformation if it preserves the inner product of V, i.e., for all u, v ∈ V,

Qu · Qv = u · v.

Notation. O(V) = {Q ∈ L(V) | Q is orthogonal}.

The set O(V) forms a group and is called the orthogonal group of V. Orthogonal transformations have the following properties:

1) $Q^T = Q^{-1}$.
2) |det Q| = 1.
3) |Qv| = |v|.
4) θ(Qv, Qu) = θ(v, u).

The last two relations assert that orthogonal transformations also preserve norms and angles. An orthogonal transformation Q is said to be proper if det Q = 1, and improper if det Q = −1.

Notation. $O^+(V)$ = {Q ∈ O(V) | det Q = 1}.

The set $O^+(V)$ also forms a group, called the proper orthogonal group of V. It is also called the rotation group since its elements are rotations. Note that the subset of O(V) with determinant equal to −1 does not form a group since it does not have an identity element.

Notation. U(V) = {T ∈ L(V) | |det T| = 1} and SL(V) = {T ∈ L(V) | det T = 1}.

Elements of U(V) are called unimodular transformations and U(V) forms a group, called the unimodular group of V. SL(V) also forms a group, called the special linear group of V. Clearly, we have the following relations:

\[
O^+(V) \subset O(V) \subset U(V) \subset GL(V), \qquad
O^+(V) \subset SL(V) \subset U(V) \subset GL(V).
\]

A.1.8 Some Theorems of Linear Algebra

We shall mention some important theorems of linear algebra relevant to the study of mechanics. They are all related to the concept of eigenvalues and eigenvectors.

Definition. Let A ∈ L(V). A scalar λ ∈ IR is called an eigenvalue of A, if there exists a non-zero vector v ∈ V, such that

Av = λv. (A.33)

v is called the eigenvector of A associated with the eigenvalue λ.

It follows from the definition that λ is an eigenvalue if and only if

det(A − λ1) = 0. (A.34)

The left-hand side of (A.34) is a polynomial of degree n in λ, where n is the dimension of V. We may write it in the form

\[
(-\lambda)^n + I_1(-\lambda)^{n-1} + \cdots + I_{n-1}(-\lambda) + I_n = 0.
\]

It is called the characteristic equation of A. Its real roots are the eigenvalues of A. The coefficients $I_1, \cdots, I_n$ are scalar functions of A and are called the principal invariants of A.

It can be shown that the characteristic equation is also satisfied by the tensor A itself. We have the following

Cayley–Hamilton Theorem. A second-order tensor A ∈ L(V) satisfies its own characteristic equation,

\[
(-A)^n + I_1(-A)^{n-1} + \cdots + I_{n-1}(-A) + I_n 1 = 0.
\]

Example A.1.6 For dim V = 3 and A ∈ L(V), we have

\[
\det(A - \lambda 1) = -\lambda^3 + I_A\lambda^2 - II_A\lambda + III_A. \tag{A.35}
\]

The three principal invariants of A, more specifically denoted by $I_A$, $II_A$, and $III_A$, can be obtained from the following relations:

\[
I_A = \operatorname{tr} A, \qquad II_A = \operatorname{tr}(A^{-1})\,\det A, \qquad III_A = \det A. \tag{A.36}
\]

Of course, the second relation is valid only when A is non-singular.

Proof. From (A.32) we can write

\[
\begin{aligned}
\det(A - \lambda 1)\,[e_1, e_2, e_3] &= [(A - \lambda 1)e_1, (A - \lambda 1)e_2, (A - \lambda 1)e_3] \\
&= -\lambda^3\,[e_1, e_2, e_3] \\
&\quad + \lambda^2\,\big([Ae_1, e_2, e_3] + [e_1, Ae_2, e_3] + [e_1, e_2, Ae_3]\big) \\
&\quad - \lambda\,\big([e_1, Ae_2, Ae_3] + [Ae_1, e_2, Ae_3] + [Ae_1, Ae_2, e_3]\big) \\
&\quad + [Ae_1, Ae_2, Ae_3].
\end{aligned}
\]

Comparing this with the right-hand side of (A.35), we obtain $(A.36)_{1,3}$ by the use of (A.32), as well as the following relation for the second invariant $II_A$,

\[
II_A = \frac{[e_1, Ae_2, Ae_3] + [Ae_1, e_2, Ae_3] + [Ae_1, Ae_2, e_3]}{[e_1, e_2, e_3]}.
\]

If A ∈ Inv(V), then it implies the second relation of (A.36). In particular, if det A = 1, we have $II_A = I_{A^{-1}}$. □
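The relations (A.36) and the Cayley–Hamilton theorem for n = 3 can be verified on a concrete matrix. The sketch below is only an illustration: the sample matrix is arbitrary, and the alternative expression ½[(tr A)² − tr A²] for the second invariant is a standard identity that is not derived in the text.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def trace(A):
    return sum(A[i][i] for i in range(3))

def cofactors(A):
    """Matrix of cofactors C, so that det A = sum_j A[0][j] C[0][j]."""
    C = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            r = [k for k in range(3) if k != i]
            c = [k for k in range(3) if k != j]
            minor = A[r[0]][c[0]] * A[r[1]][c[1]] - A[r[0]][c[1]] * A[r[1]][c[0]]
            C[i][j] = (-1) ** (i + j) * minor
    return C

A = [[2.0, 1.0, 0.0],
     [0.5, 3.0, -1.0],
     [1.0, 0.0, 4.0]]

C = cofactors(A)
det_A = sum(A[0][j] * C[0][j] for j in range(3))
A_inv = [[C[j][i] / det_A for j in range(3)] for i in range(3)]

I1 = trace(A)                       # I_A
I2 = trace(A_inv) * det_A           # II_A, second relation of (A.36)
I3 = det_A                          # III_A
I2_alt = 0.5 * (I1 ** 2 - trace(matmul(A, A)))

# Cayley-Hamilton for n = 3: -A^3 + I_A A^2 - II_A A + III_A 1 = 0.
A2 = matmul(A, A)
A3 = matmul(A2, A)
residual = [[-A3[i][j] + I1 * A2[i][j] - I2 * A[i][j] + (I3 if i == j else 0.0)
             for j in range(3)] for i in range(3)]
max_residual = max(abs(x) for row in residual for x in row)
print(I1, I2, I3, max_residual)
```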


In general, the characteristic equation may not have real roots. However, it is known that if A is symmetric all the roots are real and there exists a basis of V consisting entirely of eigenvectors.

Spectral Theorem. Let S ∈ Sym(V), then there exists an orthonormal basis $\{e_i\}$ of V, such that S can be written in the form

\[
S = \sum_{i=1}^{n} s_i\, e_i \otimes e_i. \tag{A.37}
\]

Such a basis is called a principal basis for S. Relative to this basis, the component matrix of S is a diagonal matrix and the diagonal elements $s_i$ are the eigenvalues of S associated with the eigenvectors $e_i$, respectively. The eigenvalues $s_i$, i = 1, ···, n, may or may not be distinct.

Definition. Let λ be an eigenvalue of S ∈ L(V). We call $V_\lambda$ = {v ∈ V | Sv = λv} the characteristic space of S associated with λ.

If S is a symmetric tensor and suppose that v ∈ $V_\lambda$, u ∈ $V_\mu$, where λ and µ are two distinct eigenvalues of S, then one can easily show that v · u = 0, i.e., they are mutually orthogonal. Moreover, by the spectral theorem any vector v can be written in the form

\[
v = \sum_{\lambda} v_\lambda, \qquad v_\lambda \in V_\lambda, \tag{A.38}
\]

where the summation is extended over all characteristic spaces of S.

Commutation Theorem. Let T ∈ L(V) and S ∈ Sym(V). Then

ST = TS

if and only if T preserves all characteristic spaces of S, i.e., T maps each characteristic space of S into itself.

Proof. Suppose that S and T commute, and Sv = λv. Then

S(Tv) = T(Sv) = λ(Tv),

so that both v and Tv belong to the characteristic space $V_\lambda$.

To prove the converse, since S is symmetric, for any v ∈ V, let $v = \sum_\lambda v_\lambda$ be the decomposition relative to the characteristic spaces of S as given in (A.38). If T leaves each characteristic space $V_\lambda$ invariant, then $Tv_\lambda \in V_\lambda$ and

\[
S(Tv_\lambda) = \lambda(Tv_\lambda) = T(\lambda v_\lambda) = T(Sv_\lambda).
\]

Therefore, from (A.38), we have

\[
STv = \sum_\lambda STv_\lambda = \sum_\lambda TSv_\lambda = TSv,
\]

which shows that ST = TS. □


There is only one nonzero subspace of V that is preserved by every rotation, namely V itself. Therefore, we have the following

Corollary. A symmetric S ∈ L(V) commutes with every orthogonal transformation if and only if S = λ1, for some λ ∈ IR.

Definition. S ∈ L(V) is said to be positive definite (positive semi-definite) if for any v ∈ V and v ≠ 0, v · Sv > 0 (≥ 0). Similarly, S is said to be negative definite (negative semi-definite) if v · Sv < 0 (≤ 0).

One can easily see that if S is symmetric, then it is positive definite if and only if all of its eigenvalues are positive. Consequently, for any symmetric positive definite transformation S, there is a unique symmetric positive definite transformation T such that $T^2 = S$ and the eigenvalues of T are the positive square roots of those of S associated with the same eigenvectors. We denote $T = \sqrt{S}$ and call T the square root of S. In other words, if S is expressed by (A.37) in terms of the principal basis, then

\[
T = \sqrt{S} = \sum_{i=1}^{n} \sqrt{s_i}\; e_i \otimes e_i.
\]

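Example A.1.7 Let S ∈ L(IR²) be given by $S(x, y) = (3x + \sqrt{2}\,y,\ \sqrt{2}\,x + 2y)$. Relative to the standard basis of IR², the matrix of S is

\[
[S_{ij}] = \begin{bmatrix} 3 & \sqrt{2} \\ \sqrt{2} & 2 \end{bmatrix},
\]

which has the eigenvalues $s_1 = 4$ and $s_2 = 1$ and the corresponding principal basis $e_1 = (\sqrt{2/3}, \sqrt{1/3})$ and $e_2 = (-\sqrt{1/3}, \sqrt{2/3})$. Therefore, we have

\[
T = \sqrt{S} = 2\, e_1 \otimes e_1 + e_2 \otimes e_2,
\]

whose matrix, relative to the standard basis, becomes

\[
[T_{ij}] = \frac{2}{3}\begin{bmatrix} 2 & \sqrt{2} \\ \sqrt{2} & 1 \end{bmatrix}
+ \frac{1}{3}\begin{bmatrix} 1 & -\sqrt{2} \\ -\sqrt{2} & 2 \end{bmatrix}
= \frac{1}{3}\begin{bmatrix} 5 & \sqrt{2} \\ \sqrt{2} & 4 \end{bmatrix}.
\]

One can easily verify that $[T_{ij}]^2 = [S_{ij}]$.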

Example A.1.8 Let S be a positive definite symmetric tensor in a two-dimensional space, then

\[
\sqrt{S} = \frac{1}{b}\,(S + a 1),
\]

where $a = \sqrt{\det S}$ and $b = \sqrt{2a + \operatorname{tr} S}$.

Proof. Let $A = \sqrt{S}$. By the Cayley–Hamilton theorem in the two-dimensional space, we have the identity

\[
A^2 - (\operatorname{tr} A)\,A + (\det A)\,1 = 0.
\]

Since $A^2 = S$, if we let the eigenvalues of A be $a_1$ and $a_2$, then $\det S = a_1^2 a_2^2$ and $\operatorname{tr} S = a_1^2 + a_2^2$. Therefore

\[
a = \sqrt{a_1^2 a_2^2} = a_1 a_2 = \det A, \qquad
b = \sqrt{2a_1 a_2 + a_1^2 + a_2^2} = a_1 + a_2 = \operatorname{tr} A,
\]

which, together with the above identity, prove the result. □
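The closed-form square root of Example A.1.8 is easy to try on the matrix of Example A.1.7. The following sketch assumes a symmetric positive definite 2 × 2 matrix as input; the function name is an illustrative choice.

```python
import math

def sqrt_spd_2x2(S):
    """Square root of a symmetric positive definite 2x2 matrix,
    using sqrt(S) = (S + a 1) / b with a = sqrt(det S), b = sqrt(2a + tr S)."""
    a = math.sqrt(S[0][0] * S[1][1] - S[0][1] * S[1][0])
    b = math.sqrt(2.0 * a + S[0][0] + S[1][1])
    return [[(S[i][j] + (a if i == j else 0.0)) / b for j in range(2)]
            for i in range(2)]

r2 = math.sqrt(2.0)
S = [[3.0, r2],
     [r2, 2.0]]

T = sqrt_spd_2x2(S)

# Square T again to compare with S.
T_squared = [[sum(T[i][k] * T[k][j] for k in range(2)) for j in range(2)]
             for i in range(2)]
print(T)
```

For this S one finds a = 2 and b = 3, so the result agrees with the matrix (1/3)[[5, √2], [√2, 4]] obtained from the principal basis in Example A.1.7.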

Polar Decomposition Theorem. For any F ∈ Inv(V), there exist symmetric positive definite transformations V and U and an orthogonal transformation R such that

F = RU = VR.

Moreover, the transformations U, V and R are uniquely determined in the above decompositions.

Proof. We can easily verify that $FF^T$ and $F^TF$ are symmetric positive definite. Indeed, for any v ≠ 0, we have

\[
v \cdot F^TFv = Fv \cdot Fv > 0,
\]

since F is non-singular.

To prove the theorem, let us define

\[
U = \sqrt{F^TF}, \qquad R = FU^{-1}, \qquad V = RUR^T. \tag{A.39}
\]

By definition, U is symmetric positive definite and R is orthogonal since

\[
RR^T = FU^{-1}(FU^{-1})^T = FU^{-1}U^{-T}F^T = FU^{-2}F^T = F(F^TF)^{-1}F^T = 1.
\]

Moreover, from the definition (A.39) we also have

\[
V^2 = RUR^T(RUR^T) = (RU)(RU)^T = FF^T.
\]

Therefore, V is the square root of $FF^T$ and hence is itself a symmetric positive definite transformation. Furthermore, the uniqueness follows from the definition of a square root. □
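The construction (A.39) translates directly into a computation. The sketch below works in two dimensions only, so that the square root can be taken with the closed form of Example A.1.8; the sample F and all helper names are assumptions made for illustration.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def inverse(A):
    d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / d, -A[0][1] / d],
            [-A[1][0] / d, A[0][0] / d]]

def sqrt_spd(S):
    """Closed-form square root of a symmetric positive definite 2x2 matrix."""
    a = math.sqrt(S[0][0] * S[1][1] - S[0][1] * S[1][0])
    b = math.sqrt(2.0 * a + S[0][0] + S[1][1])
    return [[(S[i][j] + (a if i == j else 0.0)) / b for j in range(2)]
            for i in range(2)]

F = [[2.0, 1.0],
     [0.0, 1.0]]                             # any non-singular matrix will do

U = sqrt_spd(matmul(transpose(F), F))        # U = sqrt(F^T F)
R = matmul(F, inverse(U))                    # R = F U^{-1}
V = matmul(matmul(R, U), transpose(R))       # V = R U R^T

RRt = matmul(R, transpose(R))
RU = matmul(R, U)
VR = matmul(V, R)
print(R, U, V)
```

Both products RU and VR reproduce F, and R Rᵀ is the identity up to rounding.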

The polar decomposition theorem, which decomposes a non-singular transformation into an orthogonal transformation (a rotation when det F > 0) and a positive definite tensor, is crucial in the development of continuum mechanics. The following decomposition of a tensor into its symmetric and skew-symmetric parts is also important in mechanics.

For any T ∈ L(V), let

\[
A = \tfrac{1}{2}\,(T + T^T), \qquad B = \tfrac{1}{2}\,(T - T^T),
\]

then

T = A + B, A ∈ Sym(V), B ∈ Skw(V).

This is sometimes called the Cartesian decomposition of a tensor. Such a decomposition is also unique.

Exercise A.1.7 Let A ∈ L(V) be such that (1 + A) is non-singular. Verify that

1) $(1 + A)^{-1} = 1 - A(1 + A)^{-1}$.

2) $(1 + A)^{-1} = 1 - A + A^2 - \cdots + (-1)^n A^n + o(A^n)$ if $\lim_{|A| \to 0} \dfrac{o(A^n)}{|A|^n} = 0$.

Exercise A.1.8 Let u, v ∈ V. Show that if 1 + u · v ≠ 0 then

\[
(1 + u \otimes v)^{-1} = 1 - \frac{u \otimes v}{1 + u \cdot v}.
\]

Exercise A.1.9 For dim V = 3, let A ∈ L(V) and B = 1 + A. Show that

\[
\begin{aligned}
I_B &= 3 + I_A, \\
II_B &= 3 + 2I_A + II_A, \\
III_B &= 1 + I_A + II_A + III_A,
\end{aligned}
\]

and if a = det B ≠ 0, verify that

\[
(1 + A)^{-1} = \frac{1}{a}\Big((1 + I_A + II_A)\,1 - (1 + I_A)\,A + A^2\Big).
\]

Exercise A.1.10 Prove the Cayley–Hamilton theorem for the special case that A ∈ L(V) is symmetric, by employing the spectral theorem.

Exercise A.1.11 Let β = {(1, 0, 0), (0, 1, 0), (0, 0, 1)} be the standard basis of IR³ and the matrix representation of F ∈ L(IR³) relative to β be given by

\[
F = \begin{bmatrix} \sqrt{3} & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
\]

Suppose that F = RU = VR is the polar decomposition of F. Find the matrix representation of U, V, and R relative to the standard basis β.


A.2 Tensor Calculus

In the second part of this appendix, we shall discuss some basic notions of calculus on Euclidean spaces: gradients and other differential operators of tensor functions.

A.2.1 Euclidean Point Space

Let E be a set of points and V be a Euclidean vector space of dimension n.

Definition. E is called a Euclidean point space of dimension n, and V is called the translation space of E if, for any pair of points x, y ∈ E, there is a vector v ∈ V, called the difference vector of x and y, written as

v = y − x, (A.40)

with the following properties:
1) ∀ x ∈ E, x − x = 0 ∈ V.
2) ∀ x ∈ E, ∀ v ∈ V, there exists a unique point y ∈ E, such that (A.40) is satisfied. We write y = x + v.
3) ∀ x, y, z ∈ E, (x − y) + (y − z) = (x − z).

Obviously, with (A.40) we can define the distance between x and y in E, denoted d(x, y), by

d(x, y) = |v|,

or equivalently

\[
d(x, y) = \sqrt{(x - y) \cdot (x - y)},
\]

where the dot denotes the inner product on V.

Notation. $E_x = \{v_x = (x, v) \mid v = y - x,\ \forall\, y \in E\}$.

$E_x$ denotes the set of all difference vectors at x. It can be made into a Euclidean vector space in an obvious way, with the addition and scalar multiplication defined as

\[
v_x + u_x = (v + u)_x, \qquad \alpha v_x = (\alpha v)_x.
\]

We call $E_x$ the tangent space of E at x.


Clearly $E_x$ is a copy of V, i.e., it is isomorphic to V. In other words, for any x ∈ E, the map $i_x : V \to E_x$, called the Euclidean parallelism, taking v to $v_x$, trivially establishes a one-to-one correspondence between $E_x$ and V. The composite map

\[
\tau_{xy} = i_y \circ i_x^{-1} : E_x \longrightarrow E_y
\]

taking

\[
v_x = (x, v) \longmapsto v_y = (y, v)
\]

defines the parallel translation of vectors at x to vectors at y (Fig. A.1).

Therefore, although $E_x$ and $E_y$, for x ≠ y, are two different tangent spaces, they can be identified through V in an obvious manner,

\[
E_x \cong E_y \cong V, \qquad \forall\, x, y \in E.
\]

In other words, $v_x = (x, v) \in E_x$ and $u_y = (y, u) \in E_y$ are regarded as the same vector if and only if v = u. In this manner, vectors at different tangent spaces can be added or subtracted as if they were in the same vector space.

[Figure: a vector v attached at the points x and y, shown as $v_x \in E_x$ and $v_y \in E_y$]

Fig. A.1. Parallel translation

A.2.2 Differentiation

Before we define the derivative of tensor functions on Euclidean space in general, let us recall the definition of derivative of a real-valued function of a real variable. Let f : (a, b) → IR be a function on the interval (a, b) ⊂ IR. The derivative of f at t ∈ (a, b) is defined as

\[
\frac{df(t)}{dt} = \lim_{h \to 0} \frac{1}{h}\big(f(t + h) - f(t)\big),
\]

if the limit exists.

This definition can easily be extended to tensor-valued functions of a real variable. Let W be a space equipped with a norm or a distance function. As examples, we have

\[
\begin{aligned}
&I\!R: && d(x, y) = |x - y|, \\
&E: && d(x, y) = \sqrt{(x - y) \cdot (x - y)}, \\
&V: && |u| = \sqrt{u \cdot u}, \\
&L(V),\ \mathrm{Sym}(V),\ \mathrm{Skw}(V): && |A| = \sqrt{\operatorname{tr} AA^T}.
\end{aligned} \tag{A.41}
\]

It makes sense to talk about limit and convergence in the space W when a norm or a distance function is defined.

Let f : (a, b) → W be a function defined on an interval (a, b) ⊂ IR. The derivative of f at t ∈ (a, b) is defined as

\[
\frac{df(t)}{dt} = \lim_{h \to 0} \frac{1}{h}\big(f(t + h) - f(t)\big). \tag{A.42}
\]

The derivative of f at t will also be denoted by $\dot{f}(t)$. Obviously for any t ∈ (a, b) we have $\dot{f}(t) \in W$.

Note that if f is defined on a more general space, the expression on the right-hand side of the definition (A.42) may not make sense at all. However, we can rewrite the relation (A.42) in a different form.

For fixed t, let Df(t) : IR → W be the linear transformation defined by

\[
Df(t)[h] = \dot{f}(t)\,h.
\]

Then (A.42) is equivalent to

\[
\lim_{h \to 0} \frac{1}{|h|}\,\big|f(t + h) - f(t) - Df(t)[h]\big| = 0.
\]

In this form, the definition of the derivative can easily be generalized to other functions.

Tensor Fields

Now we shall consider functions on a Euclidean point space E. Let D be an open set in E, and f be a tensor-valued function, f : D → W. Such functions are usually called tensor fields, more specifically,

1) W = IR, f is called a scalar field on D,

f : x ∈ D ↦ f(x) ∈ IR.

2) W = V, f is called a vector field on D,

\[
f : x \in D \longmapsto f(x) \in E_x \cong V.
\]

3) W = L(V), f is called a second-order tensor field on D,

\[
f : x \in D \longmapsto f(x) \in E_x \otimes E_x \cong L(V).
\]

4) W = E, f is called a point field on D or a deformation of D,

f : x ∈ D ↦ f(x) ∈ E.

Definition. A function f : D → W is said to be differentiable at x ∈ D ⊂ E, if there exists a linear transformation Df(x) ∈ L(V, W) at x, such that for any v ∈ V,

\[
\lim_{|v| \to 0} \frac{1}{|v|}\,\big|f(x + v) - f(x) - Df(x)[v]\big| = 0. \tag{A.43}
\]

The linear transformation Df(x) is uniquely determined by the above relation, and it is called the gradient (or derivative) of f at x, denoted by grad f, or $\nabla_x f$, or simply ∇f. By definition, ∇f(x) is a tensor in W ⊗ V, or is a vector in V if W = IR.

The condition (A.43) is equivalent to

f(x + v) − f(x) = ∇f(x)[v] + o(v),

where o(v) is a quantity containing terms such that

\[
\lim_{|v| \to 0} \frac{o(v)}{|v|} = 0.
\]

Moreover, if we substitute tv for v for some fixed v in V, (A.43) is also equivalent to

\[
\nabla f(x)[v] = \lim_{t \to 0} \frac{1}{t}\big(f(x + tv) - f(x)\big) = \frac{d}{dt} f(x + tv)\Big|_{t=0}. \tag{A.44}
\]

The right-hand side of the above relation is usually known as the directional derivative of f relative to the vector v. Note that for fixed x and v, f(x + tv) is a tensor-valued function of a real variable and its derivative can easily be determined from (A.42).

Functions on Tensor Spaces

Let $W_1$ and $W_2$ be two spaces on which a norm or a distance function is defined, such as the spaces mentioned in (A.41), and let D ⊂ $W_1$ be an open subset. The gradient of tensor functions on D can be defined in a similar manner.

Definition. A function F : D → $W_2$ is said to be differentiable at X ∈ D ⊂ $W_1$, if there exists a linear transformation DF(X) ∈ $L(W_1, W_2)$ at X, such that ∀ Y ∈ D,

\[
\lim_{|Y| \to 0} \frac{1}{|Y|}\,\big|F(X + Y) - F(X) - DF(X)[Y]\big| = 0.
\]

The linear transformation DF(X) is uniquely determined by the above relation, and it is called the gradient of F with respect to X, denoted by $\partial_X F$. We have $\partial_X F \in W_2 \otimes W_1$. The definition is equivalent to the condition: for any Y ∈ D, we have

\[
F(X + Y) - F(X) = \partial_X F(X)[Y] + o(Y), \tag{A.45}
\]

or

\[
\partial_X F(X)[Y] = \frac{d}{dt} F(X + tY)\Big|_{t=0}. \tag{A.46}
\]

For φ ∈ $W_2 \otimes W_1$ and Y ∈ $W_1$, the notation φ[Y] used in the above relations is self-evident: for φ = K ⊗ X,

\[
(K \otimes X)[Y] = (X \cdot Y)\,K, \qquad \forall\, K \in W_2,\ X, Y \in W_1,
\]

and for all v, u ∈ V and A, S ∈ L(V), we have

\[
\begin{aligned}
v[u] &= v \cdot u, \\
A[u] &= Au, \\
A[S] &= A \cdot S = \operatorname{tr} AS^T, \\
(v \otimes u)[S] &= v \cdot Su.
\end{aligned}
\]

Gradients can easily be computed directly from the definition (A.45) or (A.46). We demonstrate this procedure with some examples.

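Example A.2.1 Let φ : L(V) × V → IR be defined by

φ(A, v) = v · Av.

Then, from (A.45),

\[
\begin{aligned}
\varphi(A, v + u) &= (v + u) \cdot A(v + u) \\
&= v \cdot Av + v \cdot Au + u \cdot Av + u \cdot Au \\
&= \varphi(A, v) + \partial_v\varphi[u] + o(u),
\end{aligned}
\]

so that

\[
\partial_v\varphi[u] = v \cdot Au + u \cdot Av = A^T v \cdot u + Av \cdot u = (A^T + A)v\,[u].
\]

Therefore, we obtain

\[
\partial_v\varphi = (A + A^T)\,v.
\]

Moreover, we have

φ(A + S, v) = v · (A + S)v = v · Av + v · Sv,

which implies

\[
\partial_A\varphi[S] = v \cdot Sv = (v \otimes v)[S],
\]

so that

\[
\partial_A\varphi = v \otimes v.
\]

□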
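Both gradients of Example A.2.1 can be compared with the directional derivative (A.46) evaluated by a central difference. This is only a numerical sanity check under assumed sample data; A, v, u, S and the step size h are arbitrary illustrative choices.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def phi(A, v):
    """phi(A, v) = v · A v."""
    return dot(v, matvec(A, v))

A = [[1.0, 2.0, 0.0],
     [-1.0, 3.0, 1.0],
     [0.5, 0.0, 2.0]]
v = [1.0, -2.0, 0.5]
u = [0.3, 0.1, -0.7]                 # direction in V
S = [[0.2, -1.0, 0.4],
     [0.0, 0.5, 0.3],
     [1.0, 0.1, -0.6]]               # direction in L(V)
h = 1e-5

# Gradient with respect to v: (A + A^T) v, applied to u.
grad_v = [sum((A[i][j] + A[j][i]) * v[j] for j in range(3)) for i in range(3)]
exact_v = dot(grad_v, u)
v_plus = [a + h * b for a, b in zip(v, u)]
v_minus = [a - h * b for a, b in zip(v, u)]
numeric_v = (phi(A, v_plus) - phi(A, v_minus)) / (2.0 * h)

# Gradient with respect to A: v ⊗ v, applied to S, i.e. v · S v.
exact_A = dot(v, matvec(S, v))
A_plus = [[A[i][j] + h * S[i][j] for j in range(3)] for i in range(3)]
A_minus = [[A[i][j] - h * S[i][j] for j in range(3)] for i in range(3)]
numeric_A = (phi(A_plus, v) - phi(A_minus, v)) / (2.0 * h)

print(exact_v, numeric_v, exact_A, numeric_A)
```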

Example A.2.2 Let u, v ∈ V be constant vectors, and let φ : L(V) → IR be defined by

φ(A) = u · Av.

From (A.46) we have

\[
\partial_A\varphi[S] = \frac{d}{dt}\big(u \cdot (A + tS)v\big)\Big|_{t=0} = u \cdot Sv = (u \otimes v)[S],
\]

for all S ∈ L(V), and we obtain

\[
\partial_A\varphi = u \otimes v.
\]

Now, suppose that A is a symmetric tensor, hence the function φ is defined on the subspace Sym(V) only,

φ : Sym(V) → IR,

and by definition $\partial_A\varphi$ ∈ Sym(V) also. In this case, we have the same relation,

\[
\partial_A\varphi[S] = (u \otimes v)[S],
\]

but it holds only for all S ∈ Sym(V). Therefore, we conclude that

\[
\partial_A\varphi = \tfrac{1}{2}\,(u \otimes v + v \otimes u),
\]

after symmetrization.

Similarly, if A is a skew-symmetric tensor, then $\partial_A\varphi$ ∈ Skw(V) and the result must be skew-symmetrized,

\[
\partial_A\varphi = \tfrac{1}{2}\,(u \otimes v - v \otimes u).
\]

□


Example A.2.3 We consider trace and determinant functions. Since

tr(A + S) = tr A + tr S = tr A + 1 · S,

the gradient of the trace is, trivially, the identity transformation,

\[
\partial_A(\operatorname{tr} A) = 1. \tag{A.47}
\]

For the gradient of the determinant at an invertible A, we have

\[
(\partial_A \det A)[S] = \det(A + S) - \det(A) + o(S).
\]

Let ω be a non-trivial alternating n-linear form, then

\[
\omega(v_1, \cdots, v_n)\,(\partial_A \det A)[S]
= \omega((A + S)v_1, \cdots, (A + S)v_n) - \omega(Av_1, \cdots, Av_n) + o(S).
\]

By the linearity of ω, after moving all the higher-order terms into o(S), the right-hand side becomes

\[
\begin{aligned}
&= \sum_{i=1}^{n} \omega(Av_1, \cdots, Sv_i, \cdots, Av_n) + o(S) \\
&= \sum_{i=1}^{n} \omega(Av_1, \cdots, AA^{-1}Sv_i, \cdots, Av_n) + o(S) \\
&= (\det A) \sum_{i=1}^{n} \omega(v_1, \cdots, A^{-1}Sv_i, \cdots, v_n) + o(S) \\
&= (\det A)\,(\operatorname{tr} A^{-1}S)\,\omega(v_1, \cdots, v_n) + o(S),
\end{aligned}
\]

from (A.21). Therefore, we have

\[
(\partial_A \det A)[S] = (\det A)\,(\operatorname{tr} SA^{-1}) = (\det A)\,A^{-T}[S],
\]

which implies the following formula,

\[
\partial_A \det A = (\det A)\,A^{-T}. \tag{A.48}
\]

□
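Formula (A.48) can likewise be checked against the directional derivative (A.46). The sketch below uses 2 × 2 matrices to keep the inverse explicit; the sample A, the direction S, and the step size are assumptions for illustration only.

```python
def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def inverse(A):
    d = det(A)
    return [[A[1][1] / d, -A[0][1] / d],
            [-A[1][0] / d, A[0][0] / d]]

def inner(A, B):
    """Tensor inner product A · B = tr(A B^T) = sum_ij A_ij B_ij."""
    return sum(A[i][j] * B[i][j] for i in range(2) for j in range(2))

A = [[2.0, 1.0],
     [0.5, 3.0]]
S = [[0.2, -1.0],
     [0.7, 0.4]]
h = 1e-5

# Gradient from (A.48): (det A) A^{-T}.
A_inv = inverse(A)
grad = [[det(A) * A_inv[j][i] for j in range(2)] for i in range(2)]
exact = inner(grad, S)

# Directional derivative d/dt det(A + tS) at t = 0, by central difference.
A_plus = [[A[i][j] + h * S[i][j] for j in range(2)] for i in range(2)]
A_minus = [[A[i][j] - h * S[i][j] for j in range(2)] for i in range(2)]
numeric = (det(A_plus) - det(A_minus)) / (2.0 * h)

print(exact, numeric)
```

In two dimensions det(A + tS) is quadratic in t, so the central difference agrees with the exact derivative up to rounding.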

In differential calculus, we frequently differentiate a composite function by the chain rule. This rule can be stated for composite tensor functions in general. Let $W_1$, $W_2$, $W_3$ be normed spaces of the type (A.41) and $D_1 \subset W_1$, $D_2 \subset W_2$ be open subsets, and let

\[
\varphi : D_1 \to W_2, \qquad \psi : D_2 \to W_3,
\]

with $\varphi(D_1) \subset D_2$. Then we have the following

Chain Rule. Let φ be differentiable at X ∈ $D_1$, and ψ be differentiable at Y = φ(X) ∈ $D_2$. Then the composition f = ψ ◦ φ is differentiable at X and

\[
Df(X)[Z] = D\psi(\varphi(X))\big[D\varphi(X)[Z]\big], \tag{A.49}
\]

for any Z ∈ $W_1$, or simply

Df(X) = Dψ(Y) ◦ Dφ(X).

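Example A.2.4 If φ is a scalar-valued function of a vector variable, g(x) is a vector field on E, and h(v) is a vector-valued function of a vector variable, then

\[
\begin{aligned}
\nabla h(g(x)) &= \partial_v h\big|_{v=g(x)}\,\big(\nabla g(x)\big), \\
\nabla \varphi(g(x)) &= \big(\nabla g(x)\big)^T\,\partial_v\varphi\big|_{v=g(x)}.
\end{aligned}
\]

Let us verify the last one in the above formulae. For any u ∈ V, from (A.49),

\[
\begin{aligned}
\nabla\varphi(g(x))[u] &= \partial_v\varphi\big|_{v=g(x)}\big[\nabla g(x)[u]\big]
= \partial_v\varphi\big|_{v=g(x)} \cdot \big(\nabla g(x)\big)u \\
&= \big(\nabla g(x)\big)^T \partial_v\varphi\big|_{v=g(x)} \cdot u
= \big(\nabla g(x)\big)^T \partial_v\varphi\big|_{v=g(x)}\,[u],
\end{aligned}
\]

where in the third step we have used the definition of transpose (A.8). Note that ∇h, ∇g, and $\partial_v h$ are all second-order tensors, while $\partial_v\varphi$ is a vector quantity. □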

Another important result in differentiation is the product rule. For tensor functions, in general, there are many different products available, for example, the product of a scalar and a vector, the inner product, the tensor product, the action of a tensor on a vector. These products have one property in common, namely, bilinearity. Therefore, in order to establish a product rule valid for all cases of interest, we consider the bilinear operation

\[
\pi : W_1 \times W_2 \longrightarrow W_3,
\]

which assigns to each φ ∈ $W_1$, ψ ∈ $W_2$, the product π(φ, ψ) ∈ $W_3$. If φ, ψ are two functions,

\[
\varphi : D \to W_1, \qquad \psi : D \to W_2,
\]

where D is an open subset of some normed space W, then the product f = π(φ, ψ) is the function defined by

\[
f : D \longrightarrow W_3, \qquad f(X) = \pi(\varphi(X), \psi(X)), \quad \forall\, X \in D.
\]

We then have the following

Product Rule. Suppose that φ and ψ are differentiable at X ∈ D ⊂ W, then their product f = π(φ, ψ) is differentiable at X and

\[
Df(X)[V] = \pi\big(D\varphi(X)[V], \psi(X)\big) + \pi\big(\varphi(X), D\psi(X)[V]\big), \tag{A.50}
\]

for all V ∈ W.

In other words, the derivative of the product π(φ, ψ) is the derivative of π holding ψ fixed plus the derivative of π holding φ fixed.

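Example A.2.5 Let f be a scalar-valued, and h, q be vector-valued functions on D ⊂ W. For W = IR, we have

\[
\begin{aligned}
(fh)^{\boldsymbol{\cdot}} &= \dot{f}\,h + f\,\dot{h}, \\
(q \cdot h)^{\boldsymbol{\cdot}} &= \dot{q} \cdot h + q \cdot \dot{h}.
\end{aligned} \tag{A.51}
\]

For W = E, we have

\[
\begin{aligned}
\nabla(fh) &= h \otimes \nabla f + f\,\nabla h, \\
\nabla(q \cdot h) &= (\nabla q)^T h + (\nabla h)^T q.
\end{aligned} \tag{A.52}
\]

For W = V, we have

\[
\begin{aligned}
\partial_v(fh) &= h \otimes \partial_v f + f\,\partial_v h, \\
\partial_v(q \cdot h) &= (\partial_v q)^T h + (\partial_v h)^T q.
\end{aligned} \tag{A.53}
\]

Unlike the simple formulae in (A.51), the relations in (A.52) and (A.53) do not look like the familiar product rules, because they have to be consistent with our notation conventions.

Let us demonstrate the first relation of (A.52). By the product rule (A.50), for any w ∈ V, we have

\[
\begin{aligned}
\nabla(fh)[w] &= (\nabla f[w])\,h + f\,(\nabla h[w]) = (\nabla f \cdot w)\,h + f\,(\nabla h)w \\
&= (h \otimes \nabla f)w + f\,(\nabla h)w = \big(h \otimes \nabla f + f\,\nabla h\big)[w],
\end{aligned}
\]

where, in the third step, we have used the definition (A.4). □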

If f : D ⊂ U → W is differentiable and its derivative Df is continuous in D, we say that f is of class $C^1$. The derivative is again a function, Df : D → W ⊗ U, for which we can talk about the differentiability and continuity. We say that f is of class $C^2$, if Df is of class $C^1$, and so forth. Frequently, we say a function is smooth to mean that it is of class $C^k$ for some k ≥ 1. We mention the following

Inverse Function Theorem. Let D ⊂ W be an open set and f : D → W be a one-to-one function of class $C^k$ (k ≥ 1). Assume that the linear transformation Df(X) : W → W is invertible at each X ∈ D, then $f^{-1}$ exists and is of class $C^k$.


Example A.2.6 Let D ⊂ E and φ : D → IR be of class $C^2$. Then the second gradient of φ is a symmetric tensor, that is, ∇(∇φ) ∈ Sym(V).

Indeed, from the definition, we have

∇φ(x + u) − ∇φ(x) = ∇(∇φ)[u] + o(u).

Taking the inner product with v, we obtain

∇φ(x + u)[v] − ∇φ(x)[v] = v · ∇(∇φ)u + o(u),

which implies that

\[
v \cdot \nabla(\nabla\varphi)u = \big(\varphi(x + u + v) - \varphi(x + u)\big) - \big(\varphi(x + v) - \varphi(x)\big) + o(u) + o(v).
\]

Since the right-hand side of the last relation is symmetric in u and v, it follows that

v · ∇(∇φ)u = u · ∇(∇φ)v,

which proves that the second gradient of φ is symmetric. □

Exercise A.2.1 Show that if Q : IR → O(V) is differentiable, then $\dot{Q}Q^T$ is skew-symmetric.

Exercise A.2.2 Let $h(v, A) = (v \cdot Av)\,A^2 v$ be a vector function of a vector v and a second-order tensor A. Compute $\partial_v h$ and $(\partial_A h)[S]$ for any S ∈ L(V).

Exercise A.2.3 If A ∈ L(V) is invertible, show that
1) $(\partial_A A^{-1})[S] = -A^{-1}SA^{-1}$, for any S ∈ L(V),
2) $\partial_A \operatorname{tr}(A^{-1}) = -(A^{-2})^T$.

Exercise A.2.4 Let A be a second-order tensor. Show that
1) For any positive integer k,

\[
\partial_A \operatorname{tr} A^k = k\,(A^{k-1})^T.
\]

2) For principal invariants $I_A$, $II_A$, $III_A$,

\[
\begin{aligned}
\partial_A I_A &= 1, \\
\partial_A II_A &= (I_A 1 - A)^T, \\
\partial_A III_A &= (II_A 1 - I_A A + A^2)^T.
\end{aligned} \tag{A.54}
\]

Hint: Calculate $\partial_A \det(A + \lambda 1) = \partial_A(\lambda^3 + I_A\lambda^2 + II_A\lambda + III_A)$.


A.2.3 Coordinate System

Tensor functions can be expressed in terms of components relative to smooth fields of bases in the Euclidean point space E associated with a coordinate system.

Definition. Let D ⊂ E be an open set. A coordinate system on D is a smooth one-to-one mapping

ψ : D → U,

where U is an open set in IRⁿ, such that $\psi^{-1}$ is also smooth.

Let x ∈ D,

\[
\psi : x \longmapsto (x^1, \cdots, x^n) = \psi(x).
\]

$(x^1, \cdots, x^n)$ is called the (curvilinear) coordinate of x, and the functions

\[
\chi^i : D \longrightarrow I\!R, \qquad \chi^i(x) = x^i, \quad i = 1, \cdots, n, \tag{A.55}
\]

are called the i-th coordinate functions of ψ. For convenience, we call $(x^i)$ a coordinate system on D.

[Figure: the i-th coordinate curve $\lambda_i(t)$ through x in D with tangent $e_i(x)$, and the corresponding coordinate line through $(x^1, \cdots, x^n)$ in IRⁿ]

Fig. A.2. Coordinate curve

Let $\chi = \psi^{-1}$, then

\[
x = \chi(x^1, \cdots, x^n). \tag{A.56}
\]

For $x^1, \cdots, x^n$ fixed, the mapping (Fig. A.2)

\[
\lambda_i : I\!R \longrightarrow D, \qquad \lambda_i(t) = \chi(x^1, \cdots, x^i + t, \cdots, x^n), \tag{A.57}
\]

is a curve in D passing through x at t = 0, called the i-th coordinate curve at x. We denote the tangent of this curve at x by $e_i(x)$,

\[
e_i(x) = \dot{\lambda}_i(t)\Big|_{t=0} = \frac{\partial \chi}{\partial x^i}\Big|_{(x^1, \cdots, x^n)}. \tag{A.58}
\]

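Proposition. The set $\{e_i(x),\ i = 1, \cdots, n\}$ forms a basis for the tangent space $E_x$.

Proof. For any vector v ∈ $E_x$, we can define a curve through x by

λ(t) = x + tv.

Let

\[
\lambda(t) = \chi(\lambda^1(t), \cdots, \lambda^n(t)),
\]

where $\lambda^i(t)$ are the coordinates of λ(t) given by

\[
\lambda^i(t) = \chi^i(x + tv). \tag{A.59}
\]

Then, from (A.58), the tangent vector becomes

\[
v = \dot{\lambda}(t)\Big|_{t=0} = \frac{\partial \chi}{\partial x^i}\Big|_x\,\frac{d\lambda^i}{dt}\Big|_{t=0} = \frac{d\lambda^i}{dt}\Big|_{t=0}\,e_i(x).
\]

In other words, $\{e_i(x)\}$ spans the space $E_x$. Since $E_x$ has dimension n, these n vectors form a basis. □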

The set $\{e_i(x)\}$ is a basis of $E_x$ for each x. This field of bases is called the natural basis of the coordinate system $(x^i)$ for V, the translation space of E. The corresponding dual basis of this natural basis is denoted by $\{e^i(x)\}$.

Combining (A.55) and (A.56), we have

\[
x^i = \chi^i(\chi(x^1, \cdots, x^n)),
\]

which implies

\[
\frac{\partial x^i}{\partial x^j} = \delta^i_j = (\nabla\chi^i) \cdot \frac{\partial \chi}{\partial x^j} = (\nabla\chi^i) \cdot e_j(x),
\]

by the use of (A.58). Therefore, the two natural bases of the coordinate system $(x^i)$ are given by the following relations:

\[
e_i(x) = \frac{\partial \chi}{\partial x^i}\Big|_x, \qquad e^i(x) = \nabla\chi^i(x). \tag{A.60}
\]

The inner products,

\[
g_{ij}(x) = e_i(x) \cdot e_j(x), \qquad g^{ij}(x) = e^i(x) \cdot e^j(x),
\]

are called the metric tensors of the coordinate system.
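As a concrete illustration of (A.60), the natural bases and metric tensors of plane polar coordinates can be computed numerically. The sketch below takes (x¹, x²) = (r, θ) on a region of IR² away from the origin and the branch cut of the angle; the choice of coordinate system, the sample point, and the finite-difference step are all assumptions made for this example.

```python
import math

H = 1e-6  # finite-difference step (illustrative choice)

def chi(r, theta):
    """Inverse coordinate map: (r, theta) -> Cartesian point."""
    return (r * math.cos(theta), r * math.sin(theta))

def coords(x, y):
    """Coordinate functions chi^i: Cartesian point -> (r, theta)."""
    return (math.hypot(x, y), math.atan2(y, x))

def natural_basis(r, theta):
    """e_i = d(chi)/d(x^i), by central differences."""
    e_r = [(a - b) / (2 * H) for a, b in zip(chi(r + H, theta), chi(r - H, theta))]
    e_t = [(a - b) / (2 * H) for a, b in zip(chi(r, theta + H), chi(r, theta - H))]
    return [e_r, e_t]

def dual_basis(r, theta):
    """e^i = grad(chi^i), by central differences in the Cartesian variables."""
    x, y = chi(r, theta)
    basis = []
    for i in range(2):
        gx = (coords(x + H, y)[i] - coords(x - H, y)[i]) / (2 * H)
        gy = (coords(x, y + H)[i] - coords(x, y - H)[i]) / (2 * H)
        basis.append([gx, gy])
    return basis

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

r, theta = 2.0, 0.7
e = natural_basis(r, theta)
E = dual_basis(r, theta)

delta = [[dot(E[i], e[j]) for j in range(2)] for i in range(2)]    # e^i · e_j
g_lower = [[dot(e[i], e[j]) for j in range(2)] for i in range(2)]  # g_ij
g_upper = [[dot(E[i], E[j]) for j in range(2)] for i in range(2)]  # g^ij
print(delta, g_lower, g_upper)
```

Up to discretization error, e^i · e_j is the Kronecker delta, g_ij = diag(1, r²), and g^ij = diag(1, 1/r²).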


Now let us consider a change of coordinate systems. Let $(x^i)$ and $(\bar{x}^i)$ be two coordinate systems on D, and $\{e_i(x)\}$, $\{\bar{e}_i(x)\}$ be the corresponding natural bases. Suppose that the coordinate transformations are given by

\[
x^i = x^i(\bar{x}^1, \cdots, \bar{x}^n), \qquad \bar{x}^k = \bar{x}^k(x^1, \cdots, x^n).
\]

Then, by taking the gradients, one immediately obtains the change of the corresponding natural bases given by

\[
e^i(x) = \frac{\partial x^i}{\partial \bar{x}^k}\,\bar{e}^k(x), \qquad
e_i(x) = \frac{\partial \bar{x}^k}{\partial x^i}\,\bar{e}_k(x). \tag{A.61}
\]

Comparing with the change of bases considered in Sect. A.1.4, $[\partial x^i/\partial \bar{x}^k]$ plays the role of the transformation matrix $[M^i{}_k]$ in (A.12), and hence the transformation rule (A.14) for the components of an arbitrary tensor under the change of coordinate system becomes

\[
A^i{}_j = \bar{A}^k{}_l\,\frac{\partial x^i}{\partial \bar{x}^k}\,\frac{\partial \bar{x}^l}{\partial x^j}. \tag{A.62}
\]

For other components of tensors, in general, the transformation rules are similar.

Example A.2.7 Let us consider a deformation κ : D → E,

\[
\kappa(x) = \bar{x}.
\]

Let $(x^i)$ be a coordinate system on D, and $(\bar{x}^\alpha)$ be a coordinate system on κ(D),

\[
x = \chi(x^1, \cdots, x^n), \qquad \bar{x} = \bar{\chi}(\bar{x}^1, \cdots, \bar{x}^n).
\]

The deformation κ is usually expressed explicitly in the form,

\[
\bar{x}^\alpha = \kappa^\alpha(x^1, \cdots, x^n), \qquad \alpha = 1, \cdots, n. \tag{A.63}
\]

By the use of the chain rule, we obtain, from (A.55) and (A.60),

\[
\nabla\kappa(x)[v] = \frac{\partial \bar{\chi}}{\partial \bar{x}^\alpha}\Big|_{\bar{x}}\Big[\frac{\partial \kappa^\alpha}{\partial x^i}\Big|_x\,\nabla\chi^i(x)[v]\Big]
= \frac{\partial \kappa^\alpha}{\partial x^i}\Big|_x\,\bar{e}_\alpha(\bar{x})\big[e^i(x)[v]\big]
\]

for any vector v ∈ $E_x$. Therefore, it follows that

\[
\nabla\kappa(x) = \frac{\partial \kappa^\alpha}{\partial x^i}\Big|_x\,\bar{e}_\alpha(\kappa(x)) \otimes e^i(x).
\]

This is the component form of the deformation gradient ∇κ(x) in terms of two different coordinate systems $(x^i)$ and $(\bar{x}^\alpha)$. With respect to these two natural bases at two different points, namely, x and κ(x), the components of the deformation gradient are just the partial derivatives of the deformation function (A.63), which can most easily be calculated. Other component forms of ∇κ can be obtained through the metric tensors and by the change of bases relative to the coordinate systems. □


A.2.4 Covariant Derivatives

We shall now consider the component form of the gradient of a tensor field,in general, relative to the natural basis of a coordinate system. Let (xi) be acoordinate system on D ⊂ E , and {ei(x)}, {ei(x)} be its natural bases.

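To begin with, let us consider a scalar field, $f : \mathcal{D} \to \mathbb{R}$,

$$f(\mathbf{x}) = f(\chi(x^1, \cdots, x^n)) = \hat{f}(x^1, \cdots, x^n).$$

Using the chain rule, we obtain, by the use of (A.60),

$$\nabla f(\mathbf{x})[\mathbf{v}] = \frac{\partial \hat{f}}{\partial x^i}\bigg|_{(x^1,\cdots,x^n)} \big[\nabla\chi^i(\mathbf{x})[\mathbf{v}]\big] = \frac{\partial \hat{f}}{\partial x^i}\bigg|_{(x^1,\cdots,x^n)} \mathbf{e}^i(\mathbf{x})[\mathbf{v}]$$

for any vector $\mathbf{v} \in \mathcal{E}_{\mathbf{x}}$. Therefore,

$$\nabla f(\mathbf{x}) = \frac{\partial \hat{f}}{\partial x^i}\bigg|_{(x^1,\cdots,x^n)} \mathbf{e}^i(\mathbf{x}).$$

Usually, we shall write $f(\chi(x^1, \cdots, x^n))$ as $f(x^1, \cdots, x^n)$ for simplicity. Therefore, the component form of the gradient of $f(\mathbf{x})$ can be written as

$$\nabla f(\mathbf{x}) = \frac{\partial f}{\partial x^i}\bigg|_{\mathbf{x}}\, \mathbf{e}^i(\mathbf{x}). \tag{A.64}$$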

In other words, for the gradient of a scalar field $f$, its covariant component relative to the natural basis, $(\nabla f)_i$, is just the partial derivative relative to the coordinate $x^i$.

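Now, let us consider the gradients of natural bases themselves. For each $i$ fixed, $\mathbf{e}_i$ and $\mathbf{e}^i$ can be regarded as vector fields on $\mathcal{D}$,

$$\mathbf{e}_i : \mathbf{x} \in \mathcal{D} \longmapsto \mathbf{e}_i(\mathbf{x}) \in \mathcal{E}_{\mathbf{x}}.$$

Let us denote the gradients of natural bases by

$$\Gamma_i(\mathbf{x}) = \nabla\mathbf{e}_i(\mathbf{x}) \in \mathcal{E}_{\mathbf{x}} \otimes \mathcal{E}_{\mathbf{x}}, \qquad \Gamma^i(\mathbf{x}) = \nabla\mathbf{e}^i(\mathbf{x}) \in \mathcal{E}_{\mathbf{x}} \otimes \mathcal{E}_{\mathbf{x}}. \tag{A.65}$$

We write

$$\Gamma_i = \Gamma_i{}^j{}_k\, \mathbf{e}_j \otimes \mathbf{e}^k, \qquad \Gamma^i = \Gamma^i{}_{jk}\, \mathbf{e}^j \otimes \mathbf{e}^k. \tag{A.66}$$

The components $\Gamma_i{}^j{}_k$ and $\Gamma^i{}_{jk}$ are called the Christoffel symbols. Note that $\Gamma_i{}^j{}_k$ and $\Gamma^i{}_{jk}$ are not the associated components of a third-order tensor.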

By taking the gradient of $(\mathbf{e}^i(\mathbf{x}) \cdot \mathbf{e}_j(\mathbf{x}))$, one can obtain the relation

$$\Gamma_j{}^i{}_k = -\Gamma^i{}_{jk}. \tag{A.67}$$

Moreover, since $\Gamma^i = \nabla(\nabla\chi^i(\mathbf{x}))$ by (A.60)$_1$ and the second gradient is a symmetric tensor, we have the following symmetry conditions,

$$\Gamma^i{}_{jk} = \Gamma^i{}_{kj}, \qquad \Gamma_j{}^i{}_k = \Gamma_k{}^i{}_j. \tag{A.68}$$

Since both Christoffel symbols are related in such a simple manner, usually only one is in use, namely, $\Gamma_j{}^i{}_k$, and it is called the Christoffel symbol of the second kind in classical tensor analysis.

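Now let us calculate the gradient of a vector field in terms of the coordinate system. Suppose that $\mathbf{v}(\mathbf{x})$ is a vector field and

$$\mathbf{v}(\mathbf{x}) = v^i(\mathbf{x})\,\mathbf{e}_i(\mathbf{x}) = v_i(\mathbf{x})\,\mathbf{e}^i(\mathbf{x}).$$

Then from (A.52)$_1$, (A.64), (A.65), and (A.66), we have

$$\begin{aligned}
\nabla\mathbf{v} &= \nabla(v^i\mathbf{e}_i) \\
&= \mathbf{e}_i \otimes \nabla v^i + v^i\,\nabla\mathbf{e}_i \\
&= \mathbf{e}_i \otimes \frac{\partial v^i}{\partial x^k}\,\mathbf{e}^k + v^i\,\Gamma_i{}^j{}_k\, \mathbf{e}_j \otimes \mathbf{e}^k \\
&= \Big(\frac{\partial v^j}{\partial x^k} + v^i\,\Gamma_i{}^j{}_k\Big)\, \mathbf{e}_j \otimes \mathbf{e}^k.
\end{aligned}$$

Hence, the gradient of $\mathbf{v}(\mathbf{x})$ has the component form

$$\nabla\mathbf{v} = v^j{}_{,k}\, \mathbf{e}_j \otimes \mathbf{e}^k,$$

where

$$v^j{}_{,k} = \frac{\partial v^j}{\partial x^k} + v^i\,\Gamma_i{}^j{}_k. \tag{A.69}$$

Similarly, we also have

$$\nabla\mathbf{v} = v_{j,k}\, \mathbf{e}^j \otimes \mathbf{e}^k,$$

where

$$v_{j,k} = \frac{\partial v_j}{\partial x^k} - v_i\,\Gamma_j{}^i{}_k. \tag{A.70}$$

Here, the relation (A.67) has been used. The quantities $v^j{}_{,k}$ and $v_{j,k}$ are the mixed and the covariant components of $\nabla\mathbf{v}$. The comma stands for the operation called the covariant derivative, since it increases the order of the covariant components by one.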
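The formula (A.69) can be checked numerically on a field whose gradient is known to vanish. The sketch below takes the constant Cartesian vector $\mathbf{i}_1$, writes its contravariant components in cylindrical coordinates, and evaluates $v^j{}_{,k}$ with central differences. The Christoffel symbols are those listed for the cylindrical system in Section A.2.7; the function names and the sample point are assumptions of this example.

```python
import math


def gamma(x):
    """Christoffel symbols G[i][j][k] = Gamma_i^j_k of cylindrical coordinates."""
    r = x[0]
    G = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
    G[0][1][1] = G[1][1][0] = 1.0 / r  # Gamma_r^theta_theta = Gamma_theta^theta_r
    G[1][0][1] = -r                    # Gamma_theta^r_theta
    return G


def covariant_derivative(v, x, h=1e-6):
    """Return D[j][k] ~ v^j_{,k} = dv^j/dx^k + v^i Gamma_i^j_k, as in (A.69)."""
    G, vx = gamma(x), v(x)
    D = [[0.0] * 3 for _ in range(3)]
    for k in range(3):
        xp = list(x)
        xm = list(x)
        xp[k] += h
        xm[k] -= h
        vp, vm = v(xp), v(xm)
        for j in range(3):
            partial = (vp[j] - vm[j]) / (2 * h)
            D[j][k] = partial + sum(vx[i] * G[i][j][k] for i in range(3))
    return D


def uniform_field(x):
    # Contravariant components (v^r, v^theta, v^z) of the constant vector i_1.
    r, theta, _ = x
    return [math.cos(theta), -math.sin(theta) / r, 0.0]


D = covariant_derivative(uniform_field, [2.0, 0.7, 0.0])
max_entry = max(abs(D[j][k]) for j in range(3) for k in range(3))
```

The partial derivatives of the components alone do not vanish here; only after adding the Christoffel terms does every entry of `D` drop to the level of the finite-difference error.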

More generally, suppose that $A$ is a second-order tensor field; then $\nabla A$ is a third-order tensor field that has the following component form,

$$\nabla A = A^i{}_{j,k}\, \mathbf{e}_i \otimes \mathbf{e}^j \otimes \mathbf{e}^k,$$

where

$$A^i{}_{j,k} = \frac{\partial A^i{}_j}{\partial x^k} + A^l{}_j\,\Gamma_l{}^i{}_k - A^i{}_l\,\Gamma_j{}^l{}_k. \tag{A.71}$$

Covariant derivatives of other components can easily be written using the same recipes for covariant and contravariant components, respectively.

We have seen in (A.7) that the components of the metric tensor, $g_{ij}(\mathbf{x})$ and $g^{ij}(\mathbf{x})$, are also the components of the identity tensor; therefore their covariant derivatives must vanish,

$$g_{ij,k} = 0, \qquad g^{ij}{}_{,k} = 0. \tag{A.72}$$

Consequently, from (A.24), the covariant derivatives of the volume tensor also vanish,

$$e_{ijk,l} = 0, \qquad e^{ijk}{}_{,l} = 0.$$

In other words, the components of the metric tensor and the volume tensor behave like constant tensors in covariant differentiation, although they are, in general, functions of $\mathbf{x}$.

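From (A.72)$_1$, we can derive a formula for the determination of the Christoffel symbols in terms of the metric tensor. From (A.71) we have

$$\frac{\partial g_{ij}}{\partial x^k} = g_{lj}\,\Gamma_i{}^l{}_k + g_{il}\,\Gamma_j{}^l{}_k.$$

Rotating the indices $(i, j, k)$ of this relation, then adding two of the three resulting equations and subtracting the remaining one, we get

$$2\,g_{lj}\,\Gamma_i{}^l{}_k = \Big(\frac{\partial g_{jk}}{\partial x^i} + \frac{\partial g_{ij}}{\partial x^k} - \frac{\partial g_{ik}}{\partial x^j}\Big).$$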
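For readers who want the omitted step, the index rotation can be written out as follows. The labels (a), (b), (c) are introduced only for this sketch; the three equations are the relation above and its two cyclic rotations of $(i, j, k)$:

$$\begin{aligned}
\text{(a)}\quad \frac{\partial g_{ij}}{\partial x^k} &= g_{lj}\,\Gamma_i{}^l{}_k + g_{il}\,\Gamma_j{}^l{}_k, \\
\text{(b)}\quad \frac{\partial g_{jk}}{\partial x^i} &= g_{lk}\,\Gamma_j{}^l{}_i + g_{jl}\,\Gamma_k{}^l{}_i, \\
\text{(c)}\quad \frac{\partial g_{ki}}{\partial x^j} &= g_{li}\,\Gamma_k{}^l{}_j + g_{kl}\,\Gamma_i{}^l{}_j.
\end{aligned}$$

Form (a) + (b) − (c) and use the symmetry (A.68)$_2$ together with $g_{ij} = g_{ji}$. The second term of (a) cancels the first term of (c), and the first term of (b) cancels the second term of (c). The two surviving terms, the first of (a) and the second of (b), are both equal to $g_{lj}\,\Gamma_i{}^l{}_k$, which gives the factor 2 on the left-hand side.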

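Hence, we have the following formula:

$$\Gamma_i{}^j{}_k = \frac{1}{2}\,g^{jl}\Big(\frac{\partial g_{li}}{\partial x^k} + \frac{\partial g_{lk}}{\partial x^i} - \frac{\partial g_{ik}}{\partial x^l}\Big). \tag{A.73}$$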
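A minimal numerical sketch of (A.73), restricted for simplicity to a diagonal metric so that $g^{jl}$ reduces to $1/g_{jj}$ on the diagonal. The metric derivatives are approximated by central differences. The function names, the cylindrical test metric $\mathrm{diag}(1, r^2, 1)$, and the sample point are assumptions of this example.

```python
def christoffel_diag(metric_diag, x, h=1e-5):
    """Return G[i][j][k] ~ Gamma_i^j_k from (A.73) for a diagonal metric.

    metric_diag(x) returns the diagonal entries [g_11, ..., g_nn] at x.
    """
    n = len(x)
    g = metric_diag(x)

    def dg(l, m, k):
        # Partial derivative of g_lm with respect to x^k (zero off the diagonal).
        if l != m:
            return 0.0
        xp = list(x)
        xm = list(x)
        xp[k] += h
        xm[k] -= h
        return (metric_diag(xp)[l] - metric_diag(xm)[l]) / (2 * h)

    G = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                # (A.73) with g^{jl} = 1/g_jj for l = j and zero otherwise.
                G[i][j][k] = 0.5 / g[j] * (dg(j, i, k) + dg(j, k, i) - dg(i, k, j))
    return G


def cylindrical_metric(x):
    r, theta, z = x
    return [1.0, r * r, 1.0]


# Index order (r, theta, z) = (0, 1, 2); evaluated at r = 2.
G = christoffel_diag(cylindrical_metric, [2.0, 0.3, 1.0])
```

At $r = 2$ the non-zero entries come out as `G[0][1][1]` and `G[1][1][0]` close to $1/r = 0.5$ and `G[1][0][1]` close to $-r = -2$. A non-diagonal metric would need a full matrix inverse for $g^{jl}$, which this sketch deliberately leaves out.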

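The Christoffel symbols are not components of a third-order tensor. For two coordinate systems $(x^i)$ and $(\bar{x}^i)$, they have the following transformation rules:

$$\bar{\Gamma}_i{}^j{}_k = \Gamma_r{}^s{}_t\, \frac{\partial x^r}{\partial \bar{x}^i}\, \frac{\partial \bar{x}^j}{\partial x^s}\, \frac{\partial x^t}{\partial \bar{x}^k} + \frac{\partial^2 x^r}{\partial \bar{x}^i\, \partial \bar{x}^k}\, \frac{\partial \bar{x}^j}{\partial x^r}.$$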

A.2.5 Other Differential Operators

Divergence and curl of a vector field can be defined in the usual way and their definitions can be adopted also for tensor fields.

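Definition. The divergence of a vector field $\mathbf{u}$ is a scalar field defined by

$$\operatorname{div}\mathbf{u} = \operatorname{tr}(\nabla\mathbf{u}). \tag{A.74}$$

In component form,

$$\operatorname{div}\mathbf{u} = u^i{}_{,i}.$$

Definition. The curl (or rotation) of $\mathbf{u}$ is a vector field defined by

$$\operatorname{curl}\mathbf{u} = \langle \nabla\mathbf{u}^T - \nabla\mathbf{u} \rangle.$$

In component form,

$$\operatorname{curl}\mathbf{u} = e^{ijk}\, u_{k,j}\, \mathbf{e}_i.$$

Here, the duality map defined in (A.29) is employed and, according to (A.30), $\operatorname{curl}\mathbf{u}$ is the axial vector of the skew-symmetric part of the gradient of $(-2\mathbf{u})$. One can easily verify the following condition:

$$\mathbf{v} \cdot \operatorname{curl}\mathbf{u} = \operatorname{div}(\mathbf{u} \times \mathbf{v}),$$

for any constant vector field $\mathbf{v}$. This condition can be used as the definition for the curl operator. In a similar manner, we can define the divergence of a second-order tensor in terms of the divergence of a vector.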

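Definition. The divergence of a second-order tensor field $S$ is a vector field defined by the condition: for any constant vector field $\mathbf{v}$,

$$\mathbf{v} \cdot \operatorname{div} S = \operatorname{div}(S^T\mathbf{v}). \tag{A.75}$$

In component form, we have

$$\operatorname{div} S = S^{ij}{}_{,j}\, \mathbf{e}_i.$$

Definition. The Laplacian of a scalar (or vector) field $\phi$, denoted by $\nabla^2\phi$, is a scalar (or vector) field defined by

$$\nabla^2\phi = \operatorname{div}(\nabla\phi).$$

In component form, if $\phi$ is a scalar field,

$$\nabla^2\phi = g^{jk}(\phi_{,j})_{,k} = g^{jk}\phi_{,jk}.$$

If $\phi = \mathbf{h}$ is a vector field,

$$\nabla^2\mathbf{h} = g^{jk}\, h^i{}_{,jk}\, \mathbf{e}_i.$$

In the above expressions, the comma denotes the covariant derivative.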

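Example A.2.8 Let $f$, $S$, and $\mathbf{u}$, $\mathbf{v}$ be scalar, tensor, and vector fields, respectively. Then we can show the following relations:

$$\begin{aligned}
\operatorname{div}(f\mathbf{u}) &= \mathbf{u} \cdot \nabla f + f \operatorname{div}\mathbf{u}, \\
\operatorname{div}(S\mathbf{u}) &= \mathbf{u} \cdot \operatorname{div} S^T + \operatorname{tr}(S\nabla\mathbf{u}), \\
\operatorname{div}(\mathbf{u} \times \mathbf{v}) &= \mathbf{v} \cdot \operatorname{curl}\mathbf{u} - \mathbf{u} \cdot \operatorname{curl}\mathbf{v}, \\
\nabla^2(\mathbf{u} \cdot \mathbf{v}) &= \nabla^2\mathbf{u} \cdot \mathbf{v} + 2\,\nabla\mathbf{u} \cdot \nabla\mathbf{v} + \mathbf{u} \cdot \nabla^2\mathbf{v}.
\end{aligned} \tag{A.76}$$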


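Let us verify the first relation.

$$\begin{aligned}
\operatorname{div}(f\mathbf{u}) &= \operatorname{tr}(\nabla(f\mathbf{u})) \\
&= \operatorname{tr}\big(\mathbf{u} \otimes \nabla f + f(\nabla\mathbf{u})\big) \\
&= \operatorname{tr}(\mathbf{u} \otimes \nabla f) + f \operatorname{tr}(\nabla\mathbf{u}),
\end{aligned}$$

which gives (A.76)$_1$. In this calculation, we have used the definition (A.74), the relation (A.52)$_1$, and the linearity of the trace operator.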

Verification of the other relations in (A.76) may not be so straightforward in direct notation. And more annoyingly, these relations, as well as the relations (A.52) and (A.53), are not easy to memorize. Nevertheless, if we express all of these relations in index notation, they all become trivially simple. Indeed, (A.76) may be written out directly as:

$$\begin{aligned}
(f u^i)_{,i} &= f_{,i}\, u^i + f\, u^i{}_{,i}, \\
(S^{ij} u_j)_{,i} &= S^{ij}{}_{,i}\, u_j + S^{ij}\, u_{j,i}, \\
(g^{il} e_{ljk}\, u^j v^k)_{,i} &= g^{il} e_{ljk}\, u^j{}_{,i}\, v^k + g^{il} e_{ljk}\, u^j\, v^k{}_{,i}, \\
g^{jk}(u^i v_i)_{,jk} &= g^{jk}\, u^i{}_{,jk}\, v_i + 2\, g^{jk}\, u^i{}_{,k}\, v_{i,j} + g^{jk}\, u^i\, v_{i,jk},
\end{aligned}$$

which are merely the usual product rules of differentiating scalar functions and the symmetry of the second gradient. The only difference here is that the comma denotes the covariant derivative instead of the usual partial derivative. □
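The first relation of (A.76) is easy to spot-check numerically in Cartesian coordinates, where the covariant derivative reduces to the partial derivative. The sketch below uses central differences; the particular fields `f` and `u`, the evaluation point, and the helper names are assumptions chosen only for illustration.

```python
def grad(f, x, h=1e-5):
    """Central-difference gradient of a scalar field f at the point x."""
    g = []
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g


def div(u, x, h=1e-5):
    """Central-difference divergence of a vector field u at the point x."""
    total = 0.0
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        total += (u(xp)[i] - u(xm)[i]) / (2 * h)
    return total


def f(x):
    return x[0] * x[1] + x[2] ** 2


def u(x):
    return [x[1] * x[2], x[0] ** 2, x[0] + x[2]]


def fu(x):
    return [f(x) * c for c in u(x)]


x0 = [0.5, -1.2, 2.0]
lhs = div(fu, x0)                                       # div(f u)
rhs = sum(a * b for a, b in zip(u(x0), grad(f, x0))) + f(x0) * div(u, x0)
```

The two sides agree up to the finite-difference error, which is what the index form $(f u^i)_{,i} = f_{,i}u^i + f u^i{}_{,i}$ predicts.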

Remark. From the observation made in the above example, the use of index notation is often encouraged, especially when complicated calculations are involved. In arbitrary curvilinear coordinate systems, contravariant and covariant indices must be carefully distinguished, and a pair of repeated indices, to which the summation convention is applied, must always appear at different levels. An index can be raised or lowered to its proper level with the metric tensor $g^{ij}$ or $g_{ij}$. Moreover, since the gradients of the metric tensor and the volume tensor vanish, in covariant differentiation the metric tensor $g_{ij}$ as well as the components of the volume element $e_{ijk}$ can be treated as constants. Furthermore, if the Cartesian coordinate system is used, there is no difference between contravariant and covariant components, and hence all the indices can be written at the same level; more conveniently, the covariant derivative becomes the partial derivative, and $g_{ij} = \delta_{ij}$, $e_{ijk} = \varepsilon_{ijk}$ are constants.

It is important to note that, given an expression in index notation, one can always turn it into an expression in direct notation, or vice versa. Therefore, in handling calculations, the choice of using direct notation, index notation, or even Cartesian index notation is entirely a matter of taste and convenience.


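We shall also mention some important theorems of integral calculus often used in mechanics.

Divergence Theorem. Let $\mathcal{R}$ be a bounded regular region$^2$ in $\mathcal{E}$, and let $\phi : \mathcal{R} \to \mathbb{R}$, $\mathbf{h} : \mathcal{R} \to V$, $S : \mathcal{R} \to L(V)$ be smooth fields. Then

$$\int_{\partial\mathcal{R}} \phi\,\mathbf{n}\, da = \int_{\mathcal{R}} \nabla\phi\, dv, \qquad
\int_{\partial\mathcal{R}} \mathbf{h} \cdot \mathbf{n}\, da = \int_{\mathcal{R}} \operatorname{div}\mathbf{h}\, dv, \qquad
\int_{\partial\mathcal{R}} S\mathbf{n}\, da = \int_{\mathcal{R}} \operatorname{div} S\, dv, \tag{A.77}$$

where $\mathbf{n}$ is the outward unit normal field on $\partial\mathcal{R}$.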

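Proof. The relations (A.77)$_{1,2}$ are well-known classical results. To show (A.77)$_3$, let $\mathbf{v}$ be an arbitrary constant vector. Then

$$\begin{aligned}
\mathbf{v} \cdot \int_{\partial\mathcal{R}} S\mathbf{n}\, da &= \int_{\partial\mathcal{R}} \mathbf{v} \cdot S\mathbf{n}\, da = \int_{\partial\mathcal{R}} S^T\mathbf{v} \cdot \mathbf{n}\, da \\
&= \int_{\mathcal{R}} \operatorname{div}(S^T\mathbf{v})\, dv = \int_{\mathcal{R}} \mathbf{v} \cdot \operatorname{div} S\, dv \\
&= \mathbf{v} \cdot \int_{\mathcal{R}} \operatorname{div} S\, dv,
\end{aligned}$$

where we have used (A.77)$_2$ and the definition (A.75). □

Proposition. Let $\phi : \mathcal{D} \to W$ be a continuous function on an open set $\mathcal{D}$ in $\mathcal{E}$. If

$$\int_{\mathcal{N}} \phi\, dv = 0$$

for any $\mathcal{N} \subset \mathcal{D}$, then $\phi$ is identically zero in $\mathcal{D}$, i.e.,

$$\phi(\mathbf{x}) = 0, \qquad \forall\, \mathbf{x} \in \mathcal{D}.$$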

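Proof. Suppose that $\phi(\mathbf{x}_\circ) \neq 0$ for some $\mathbf{x}_\circ \in \mathcal{D}$; then, since $\phi$ is continuous, there exists a small neighborhood $\mathcal{N} \subset \mathcal{D}$ containing $\mathbf{x}_\circ$ such that $\phi(\mathbf{x}) \neq 0$ for all $\mathbf{x} \in \mathcal{N}$. Therefore, by the mean value theorem of integral calculus,

$$\int_{\mathcal{N}} \phi\, dv = K\,\phi(\mathbf{x}) \neq 0$$

for some $\mathbf{x} \in \mathcal{N}$, where $K$ denotes the volume of $\mathcal{N}$. This contradicts the hypothesis. □

$^2$ A regular region, roughly speaking, is a closed region with a piecewise smooth boundary.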

This proposition and the divergence theorem enable us to deduce localfield equations from the integral balance laws.
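The vector form of the divergence theorem can be illustrated on the unit cube with a simple midpoint quadrature. In the sketch below, the field $\mathbf{h} = (x^2, xy, z)$, its divergence $3x + 1$ (computed by hand), the unit cube as the region, and the function names are all assumptions made for this example.

```python
def midpoints(n):
    """Midpoints of n equal subintervals of [0, 1]."""
    return [(i + 0.5) / n for i in range(n)]


def volume_integral(div_h, n=20):
    """Midpoint-rule integral of div h over the unit cube."""
    pts = midpoints(n)
    w = 1.0 / n ** 3
    return sum(div_h(x, y, z) for x in pts for y in pts for z in pts) * w


def outward_flux(h, n=20):
    """Midpoint-rule integral of h . n over the six faces of the unit cube."""
    pts = midpoints(n)
    w = 1.0 / n ** 2
    total = 0.0
    for a in pts:
        for b in pts:
            total += h(1.0, a, b)[0] - h(0.0, a, b)[0]  # faces x = 1 and x = 0
            total += h(a, 1.0, b)[1] - h(a, 0.0, b)[1]  # faces y = 1 and y = 0
            total += h(a, b, 1.0)[2] - h(a, b, 0.0)[2]  # faces z = 1 and z = 0
    return total * w


def h(x, y, z):
    return (x * x, x * y, z)


def div_h(x, y, z):
    return 3.0 * x + 1.0  # d(x^2)/dx + d(xy)/dy + dz/dz


flux = outward_flux(h)
vol = volume_integral(div_h)
```

Both integrands are at most linear on each face and in the interior, so the midpoint rule is exact here up to rounding, and both `flux` and `vol` equal 2.5.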

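Exercise A.2.5 Let $f$, $\mathbf{u}$, $\mathbf{v}$, and $S$ be smooth scalar, vector, and second-order tensor fields. Verify the following identities:
1) $\operatorname{div}(S\mathbf{u}) = \mathbf{u} \cdot \operatorname{div} S^T + \operatorname{tr}(S\nabla\mathbf{u})$,
2) $\operatorname{div}(fS) = S\nabla f + f \operatorname{div} S$,
3) $\operatorname{div}(\mathbf{u} \otimes \mathbf{v}) = (\nabla\mathbf{u})\mathbf{v} + \mathbf{u} \operatorname{div}\mathbf{v}$,
4) $\operatorname{div}(\nabla\mathbf{u}^T) = \nabla(\operatorname{div}\mathbf{u})$.

Exercise A.2.6 Let $f$ and $\mathbf{v}$ be smooth scalar and vector fields, respectively. Show that
1) $\operatorname{curl}\nabla f = 0$,
2) $\operatorname{div}\operatorname{curl}\mathbf{v} = 0$,
3) If $\operatorname{div}\mathbf{v} = 0$ and $\operatorname{curl}\mathbf{v} = 0$, then $\nabla^2\mathbf{v} = 0$.

Exercise A.2.7 Let $\mathbf{v}$ and $S$ be smooth vector and tensor fields, respectively, on a bounded regular region $\mathcal{R}$. Show that

1) $\displaystyle\int_{\partial\mathcal{R}} \mathbf{v} \otimes \mathbf{n}\, da = \int_{\mathcal{R}} \nabla\mathbf{v}\, dv$,

2) $\displaystyle\int_{\partial\mathcal{R}} \mathbf{v} \otimes S\mathbf{n}\, da = \int_{\mathcal{R}} \big(\mathbf{v} \otimes \operatorname{div} S + (\nabla\mathbf{v})S^T\big)\, dv$.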

A.2.6 Physical Components

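Let $(x^i)$ be a coordinate system on $\mathcal{E}$ and $\{\mathbf{e}_i(\mathbf{x})\}$ and $\{\mathbf{e}^i(\mathbf{x})\}$ be its natural bases. The system $(x^i)$ is called an orthogonal coordinate system if the metric tensor satisfies

$$g_{ij}(\mathbf{x}) = 0, \qquad \text{for } i \neq j,\ \forall\, \mathbf{x} \in \mathcal{E}.$$

For an orthogonal coordinate system, we can define a field of orthonormal basis, denoted by $\{\mathbf{e}_{\langle i\rangle}(\mathbf{x})\}$, by normalizing the natural basis,

$$\mathbf{e}_{\langle i\rangle} = \frac{\mathbf{e}_i}{|\mathbf{e}_i|} \quad \text{(no sum)}.$$

In this expression, the summation convention is not invoked, as indicated explicitly. Since

$$|\mathbf{e}_i| = \sqrt{\mathbf{e}_i \cdot \mathbf{e}_i} = \sqrt{g_{ii}} \quad \text{(no sum)},$$

we have

$$\mathbf{e}_{\langle i\rangle} = \frac{\mathbf{e}_i}{\sqrt{g_{ii}}} = \frac{\mathbf{e}^i}{\sqrt{g^{ii}}} = \sqrt{g^{ii}}\,\mathbf{e}_i = \sqrt{g_{ii}}\,\mathbf{e}^i \quad \text{(no sum)}.$$

Here, we have noted that normalization of the two dual natural bases of an orthogonal coordinate system gives rise to the same orthonormal basis.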

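The components of a tensor field relative to the orthonormal basis $\{\mathbf{e}_{\langle i\rangle}(\mathbf{x})\}$ are called the physical components in the coordinate system $(x^i)$. For a vector field $\mathbf{v}$,

$$\mathbf{v} = v^i\,\mathbf{e}_i = v_i\,\mathbf{e}^i = v_{\langle i\rangle}\,\mathbf{e}_{\langle i\rangle}.$$

The physical components $v_{\langle i\rangle}$ are given by

$$v_{\langle i\rangle} = \sqrt{g_{ii}}\; v^i = \frac{v_i}{\sqrt{g_{ii}}} \quad \text{(no sum)}. \tag{A.78}$$

For a second-order tensor field $T$,

$$T = T^{ij}\,\mathbf{e}_i \otimes \mathbf{e}_j = T_{ij}\,\mathbf{e}^i \otimes \mathbf{e}^j = T^i{}_j\,\mathbf{e}_i \otimes \mathbf{e}^j = T_{\langle ij\rangle}\,\mathbf{e}_{\langle i\rangle} \otimes \mathbf{e}_{\langle j\rangle}.$$

The physical components $T_{\langle ij\rangle}$ are given by

$$T_{\langle ij\rangle} = \sqrt{g_{ii}}\,\sqrt{g_{jj}}\; T^{ij} = \frac{T_{ij}}{\sqrt{g_{ii}}\,\sqrt{g_{jj}}} = \frac{\sqrt{g_{ii}}}{\sqrt{g_{jj}}}\; T^i{}_j \quad \text{(no sum)}. \tag{A.79}$$

In particular, we have $g_{\langle ij\rangle} = \delta_{ij}$.

The advantage of using physical components is obvious in practical applications. Since the norms of the basis vectors of the natural basis, in general, vary from point to point in $\mathcal{E}$, it is inconvenient to measure physical quantities relative to this basis.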
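The conversion (A.78) is a one-line scaling per component. The sketch below applies it in cylindrical coordinates at $r = 2$, where the diagonal metric entries are $(1, r^2, 1)$; the sample components and the function names are assumptions of this example.

```python
import math


def physical_from_contravariant(v_contra, g_diag):
    """v_<i> = sqrt(g_ii) * v^i (no sum), as in (A.78)."""
    return [math.sqrt(g) * v for v, g in zip(v_contra, g_diag)]


def physical_from_covariant(v_cov, g_diag):
    """v_<i> = v_i / sqrt(g_ii) (no sum), as in (A.78)."""
    return [v / math.sqrt(g) for v, g in zip(v_cov, g_diag)]


r = 2.0
g_diag = [1.0, r * r, 1.0]                          # g_rr, g_theta_theta, g_zz
v_contra = [1.0, 0.5, -3.0]                         # (v^r, v^theta, v^z)
v_cov = [g * v for g, v in zip(g_diag, v_contra)]   # lower the index with g_ij

a = physical_from_contravariant(v_contra, g_diag)   # [1.0, 1.0, -3.0]
b = physical_from_covariant(v_cov, g_diag)          # [1.0, 1.0, -3.0]
```

Both routes give the same physical components, as (A.78) requires. Note that $v^\theta = 0.5$ and $v_\theta = 2$ differ by the factor $g_{\theta\theta} = 4$, while the physical component sits between them and carries the same units as the radial and axial components.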

A.2.7 Orthogonal Coordinate Systems

We now consider the three most commonly used orthogonal coordinate systems, namely the Cartesian, the cylindrical, and the spherical coordinate systems, and derive their basic characteristics.

a) Cartesian Coordinate System

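Fix a point $\mathbf{o}$ in $\mathcal{E}$. Let $\{\mathbf{i}_1, \mathbf{i}_2, \mathbf{i}_3\}$ be an orthonormal basis of $V$. For any $\mathbf{x} \in \mathcal{E}$, we have $\mathbf{x} - \mathbf{o} \in V$. We write

$$\mathbf{x} - \mathbf{o} = x^i\,\mathbf{i}_i.$$

Clearly, this defines a coordinate system

$$\mathbf{x} \longmapsto (x^1, x^2, x^3),$$

with $\{\mathbf{i}_1, \mathbf{i}_2, \mathbf{i}_3\}$ as its natural basis, which is, of course, independent of $\mathbf{x} \in \mathcal{E}$. We call such a system a Cartesian coordinate system.

For a Cartesian coordinate system, we have

$$g_{ij}(\mathbf{x}) = \delta_{ij}, \qquad \forall\, \mathbf{x} \in \mathcal{E},$$

and hence, from (A.73),

$$\Gamma_j{}^i{}_k(\mathbf{x}) = 0.$$

It is also customary to write the basis $\{\mathbf{i}_1, \mathbf{i}_2, \mathbf{i}_3\}$ as $\{\mathbf{e}_x, \mathbf{e}_y, \mathbf{e}_z\}$ and the coordinates $(x^1, x^2, x^3)$ as $(x, y, z)$ for a Cartesian coordinate system.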

b) Cylindrical Coordinate System

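The cylindrical coordinate system $(r, \theta, z)$ is defined as

$$\mathbf{x} = \chi(r, \theta, z),$$

by the following coordinate transformation (see Fig. A.3 (a)),

$$\begin{aligned}
x^1 &= r\cos\theta, &\qquad r &> 0, \\
x^2 &= r\sin\theta, &\qquad 0 &< \theta < 2\pi, \\
x^3 &= z,
\end{aligned} \tag{A.80}$$

where $\mathbf{x} = (x^1, x^2, x^3)$ is the Cartesian coordinate system.

The natural bases are denoted by $\{\mathbf{e}_r, \mathbf{e}_\theta, \mathbf{e}_z\}$ and $\{\mathbf{e}^r, \mathbf{e}^\theta, \mathbf{e}^z\}$. From (A.80) and (A.60)$_2$, we can determine the basis in terms of the Cartesian components,

$$\begin{aligned}
\mathbf{e}_r &= \frac{\partial\chi}{\partial r} = \cos\theta\,\mathbf{i}_1 + \sin\theta\,\mathbf{i}_2, \\
\mathbf{e}_\theta &= \frac{\partial\chi}{\partial\theta} = -r\sin\theta\,\mathbf{i}_1 + r\cos\theta\,\mathbf{i}_2, \\
\mathbf{e}_z &= \frac{\partial\chi}{\partial z} = \mathbf{i}_3.
\end{aligned}$$

Therefore, we obtain the matrix representations of the metric tensor in the cylindrical coordinate system,

$$[g_{ij}] = \begin{bmatrix} 1 & & \\ & r^2 & \\ & & 1 \end{bmatrix}, \qquad
[g^{ij}] = \begin{bmatrix} 1 & & \\ & r^{-2} & \\ & & 1 \end{bmatrix},$$

and the Christoffel symbols from (A.73),

$$\Gamma_r{}^\theta{}_\theta = \Gamma_\theta{}^\theta{}_r = \frac{1}{r}, \qquad
\Gamma_\theta{}^r{}_\theta = -r, \qquad
\text{others} = 0.$$

Moreover, we have

$$\mathbf{e}_r = \mathbf{e}^r, \qquad \mathbf{e}_\theta = r^2\,\mathbf{e}^\theta, \qquad \mathbf{e}_z = \mathbf{e}^z,$$

and

$$\begin{aligned}
\mathbf{e}_{\langle r\rangle} &= \cos\theta\,\mathbf{i}_1 + \sin\theta\,\mathbf{i}_2, \\
\mathbf{e}_{\langle\theta\rangle} &= -\sin\theta\,\mathbf{i}_1 + \cos\theta\,\mathbf{i}_2, \\
\mathbf{e}_{\langle z\rangle} &= \mathbf{i}_3.
\end{aligned}$$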

[Figure: (a) cylindrical coordinates $(r, \theta, z)$ and (b) spherical coordinates $(r, \theta, \phi)$ of a point, drawn relative to the Cartesian axes $x^1$, $x^2$, $x^3$.]

Fig. A.3. Coordinate systems

c) Spherical Coordinate System

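The spherical coordinate system $(r, \theta, \phi)$ is defined as

$$\mathbf{x} = \chi(r, \theta, \phi),$$

by the following coordinate transformation (see Fig. A.3 (b)),

$$\begin{aligned}
x^1 &= r\sin\theta\cos\phi, &\qquad r &> 0, \\
x^2 &= r\sin\theta\sin\phi, &\qquad 0 &< \theta < \pi, \\
x^3 &= r\cos\theta, &\qquad 0 &< \phi < 2\pi,
\end{aligned}$$

where $\mathbf{x} = (x^1, x^2, x^3)$ is the Cartesian coordinate system.

The natural bases are denoted by $\{\mathbf{e}_r, \mathbf{e}_\theta, \mathbf{e}_\phi\}$ and $\{\mathbf{e}^r, \mathbf{e}^\theta, \mathbf{e}^\phi\}$. We have

$$\begin{aligned}
\mathbf{e}_r &= \sin\theta\cos\phi\,\mathbf{i}_1 + \sin\theta\sin\phi\,\mathbf{i}_2 + \cos\theta\,\mathbf{i}_3, \\
\mathbf{e}_\theta &= r\cos\theta\cos\phi\,\mathbf{i}_1 + r\cos\theta\sin\phi\,\mathbf{i}_2 - r\sin\theta\,\mathbf{i}_3, \\
\mathbf{e}_\phi &= -r\sin\theta\sin\phi\,\mathbf{i}_1 + r\sin\theta\cos\phi\,\mathbf{i}_2,
\end{aligned}$$

and

$$\mathbf{e}_r = \mathbf{e}^r, \qquad \mathbf{e}_\theta = r^2\,\mathbf{e}^\theta, \qquad \mathbf{e}_\phi = r^2\sin^2\theta\;\mathbf{e}^\phi.$$

The matrix representations of the metric tensor have the forms

$$[g_{ij}] = \begin{bmatrix} 1 & & \\ & r^2 & \\ & & r^2\sin^2\theta \end{bmatrix}, \qquad
[g^{ij}] = \begin{bmatrix} 1 & & \\ & r^{-2} & \\ & & (r\sin\theta)^{-2} \end{bmatrix},$$

and the Christoffel symbols are

$$\begin{aligned}
&\Gamma_r{}^\theta{}_\theta = \Gamma_\theta{}^\theta{}_r = \Gamma_r{}^\phi{}_\phi = \Gamma_\phi{}^\phi{}_r = \frac{1}{r}, \\
&\Gamma_\theta{}^r{}_\theta = -r, \\
&\Gamma_\phi{}^r{}_\phi = -r\sin^2\theta, \\
&\Gamma_\theta{}^\phi{}_\phi = \Gamma_\phi{}^\phi{}_\theta = \cot\theta, \\
&\Gamma_\phi{}^\theta{}_\phi = -\sin\theta\cos\theta, \\
&\text{others} = 0.
\end{aligned}$$

Moreover, the orthonormal basis for the physical components is

$$\begin{aligned}
\mathbf{e}_{\langle r\rangle} &= \sin\theta\cos\phi\,\mathbf{i}_1 + \sin\theta\sin\phi\,\mathbf{i}_2 + \cos\theta\,\mathbf{i}_3, \\
\mathbf{e}_{\langle\theta\rangle} &= \cos\theta\cos\phi\,\mathbf{i}_1 + \cos\theta\sin\phi\,\mathbf{i}_2 - \sin\theta\,\mathbf{i}_3, \\
\mathbf{e}_{\langle\phi\rangle} &= -\sin\phi\,\mathbf{i}_1 + \cos\phi\,\mathbf{i}_2.
\end{aligned}$$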
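The natural basis and the metric of the spherical system can be recovered numerically from the coordinate map alone, which gives a quick check of the expressions above. The sketch below differentiates $\chi$ by central differences and forms $g_{ij} = \mathbf{e}_i \cdot \mathbf{e}_j$; the function names and the sample point are assumptions of this example.

```python
import math


def chi(r, theta, phi):
    """Spherical coordinate map into Cartesian components."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))


def natural_basis(q, h=1e-6):
    """Rows are e_i = d chi / d q^i, approximated by central differences."""
    basis = []
    for i in range(3):
        qp = list(q)
        qm = list(q)
        qp[i] += h
        qm[i] -= h
        xp, xm = chi(*qp), chi(*qm)
        basis.append([(a - b) / (2 * h) for a, b in zip(xp, xm)])
    return basis


def metric(basis):
    """g_ij = e_i . e_j."""
    return [[sum(a * b for a, b in zip(ei, ej)) for ej in basis] for ei in basis]


q = [2.0, 0.8, 1.1]          # (r, theta, phi)
e = natural_basis(q)
g = metric(e)

# Normalizing e_phi should reproduce e_<phi> = -sin(phi) i1 + cos(phi) i2.
e_phi_hat = [c / math.sqrt(g[2][2]) for c in e[2]]
```

At this point the diagonal of `g` is close to $(1,\ r^2,\ r^2\sin^2\theta)$ and the off-diagonal entries vanish to within the differencing error, confirming that the system is orthogonal.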

Remark. More frequently, we would like to express quantities in these coordinate systems in terms of their physical components. A simple way to do this is to derive the expressions first in terms of contravariant or covariant components and then convert them into physical components using relations like (A.78) and (A.79).

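Example A.2.9 Let us calculate the Laplacian of a scalar field $\Phi$ in the spherical coordinate system. We have

$$\Phi_{,j} = \frac{\partial\Phi}{\partial x^j}, \qquad
\Phi_{,jk} = \frac{\partial^2\Phi}{\partial x^j\,\partial x^k} - \frac{\partial\Phi}{\partial x^i}\,\Gamma_j{}^i{}_k,$$

from which we obtain the following covariant components:

$$\begin{aligned}
\Phi_{,rr} &= \frac{\partial^2\Phi}{\partial r^2}, \\
\Phi_{,\theta\theta} &= \frac{\partial^2\Phi}{\partial\theta^2} - \frac{\partial\Phi}{\partial r}\,\Gamma_\theta{}^r{}_\theta = \frac{\partial^2\Phi}{\partial\theta^2} + r\,\frac{\partial\Phi}{\partial r}, \\
\Phi_{,\phi\phi} &= \frac{\partial^2\Phi}{\partial\phi^2} - \frac{\partial\Phi}{\partial r}\,\Gamma_\phi{}^r{}_\phi - \frac{\partial\Phi}{\partial\theta}\,\Gamma_\phi{}^\theta{}_\phi = \frac{\partial^2\Phi}{\partial\phi^2} + r\sin^2\theta\,\frac{\partial\Phi}{\partial r} + \sin\theta\cos\theta\,\frac{\partial\Phi}{\partial\theta}.
\end{aligned}$$

We have $\Phi_{,rr} = \Phi_{,\langle rr\rangle}$, $\Phi_{,\theta\theta} = r^2\,\Phi_{,\langle\theta\theta\rangle}$, $\Phi_{,\phi\phi} = r^2\sin^2\theta\;\Phi_{,\langle\phi\phi\rangle}$ in terms of physical components. That is,

$$\begin{aligned}
\Phi_{,\langle rr\rangle} &= \frac{\partial^2\Phi}{\partial r^2}, \\
\Phi_{,\langle\theta\theta\rangle} &= \frac{1}{r^2}\frac{\partial^2\Phi}{\partial\theta^2} + \frac{1}{r}\frac{\partial\Phi}{\partial r}, \\
\Phi_{,\langle\phi\phi\rangle} &= \frac{1}{r^2\sin^2\theta}\frac{\partial^2\Phi}{\partial\phi^2} + \frac{1}{r}\frac{\partial\Phi}{\partial r} + \frac{\cot\theta}{r^2}\frac{\partial\Phi}{\partial\theta}.
\end{aligned}$$

Therefore, the Laplacian $\nabla^2\Phi$, which is the sum $\Phi_{,\langle rr\rangle} + \Phi_{,\langle\theta\theta\rangle} + \Phi_{,\langle\phi\phi\rangle}$ in physical components, becomes

$$\nabla^2\Phi = \frac{\partial^2\Phi}{\partial r^2} + \frac{2}{r}\frac{\partial\Phi}{\partial r} + \frac{1}{r^2}\frac{\partial^2\Phi}{\partial\theta^2} + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\Phi}{\partial\phi^2} + \frac{\cot\theta}{r^2}\frac{\partial\Phi}{\partial\theta}.$$

□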
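The spherical Laplacian obtained in Example A.2.9 can be compared against a Cartesian result for a test function. The sketch below uses $\Phi = (x^1)^2 x^3 + x^2$, whose Cartesian Laplacian is $2x^3$, and evaluates the spherical formula with central differences. The test function, the step size, the sample point, and the helper names are assumptions of this example.

```python
import math


def phi_cartesian(x1, x2, x3):
    return x1 * x1 * x3 + x2  # Cartesian Laplacian is 2 * x3


def phi_spherical(r, theta, phi):
    """The same scalar field expressed through the spherical coordinates."""
    return phi_cartesian(r * math.sin(theta) * math.cos(phi),
                         r * math.sin(theta) * math.sin(phi),
                         r * math.cos(theta))


def d1(f, q, i, h=1e-3):
    """Central first derivative of f with respect to q[i]."""
    qp = list(q)
    qm = list(q)
    qp[i] += h
    qm[i] -= h
    return (f(*qp) - f(*qm)) / (2 * h)


def d2(f, q, i, h=1e-3):
    """Central second derivative of f with respect to q[i]."""
    qp = list(q)
    qm = list(q)
    qp[i] += h
    qm[i] -= h
    return (f(*qp) - 2 * f(*q) + f(*qm)) / (h * h)


def laplacian_spherical(f, q):
    """Sum of the five terms of the spherical Laplacian formula."""
    r, theta, _ = q
    return (d2(f, q, 0) + 2.0 / r * d1(f, q, 0)
            + d2(f, q, 1) / r ** 2
            + d2(f, q, 2) / (r * math.sin(theta)) ** 2
            + d1(f, q, 1) / (math.tan(theta) * r ** 2))


q = [1.5, 0.9, 0.4]                       # (r, theta, phi)
lap = laplacian_spherical(phi_spherical, q)
expected = 2.0 * q[0] * math.cos(q[1])    # 2 * x3 at the same point
```

The values `lap` and `expected` agree to within the finite-difference error, which is a useful sanity check when transcribing the formula into code.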

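Example A.2.10 We give the physical components of the divergence of a symmetric tensor field $T$ in the following coordinate systems:

a) Cartesian coordinate system $(x, y, z)$:

$$\begin{aligned}
(\operatorname{div} T)_{\langle x\rangle} &= \frac{\partial T_{\langle xx\rangle}}{\partial x} + \frac{\partial T_{\langle xy\rangle}}{\partial y} + \frac{\partial T_{\langle xz\rangle}}{\partial z}, \\
(\operatorname{div} T)_{\langle y\rangle} &= \frac{\partial T_{\langle xy\rangle}}{\partial x} + \frac{\partial T_{\langle yy\rangle}}{\partial y} + \frac{\partial T_{\langle yz\rangle}}{\partial z}, \\
(\operatorname{div} T)_{\langle z\rangle} &= \frac{\partial T_{\langle xz\rangle}}{\partial x} + \frac{\partial T_{\langle yz\rangle}}{\partial y} + \frac{\partial T_{\langle zz\rangle}}{\partial z}.
\end{aligned} \tag{A.81}$$

b) Cylindrical coordinate system $(r, \theta, z)$:

$$\begin{aligned}
(\operatorname{div} T)_{\langle r\rangle} &= \frac{\partial T_{\langle rr\rangle}}{\partial r} + \frac{1}{r}\frac{\partial T_{\langle r\theta\rangle}}{\partial\theta} + \frac{\partial T_{\langle rz\rangle}}{\partial z} + \frac{T_{\langle rr\rangle} - T_{\langle\theta\theta\rangle}}{r}, \\
(\operatorname{div} T)_{\langle\theta\rangle} &= \frac{\partial T_{\langle r\theta\rangle}}{\partial r} + \frac{1}{r}\frac{\partial T_{\langle\theta\theta\rangle}}{\partial\theta} + \frac{\partial T_{\langle\theta z\rangle}}{\partial z} + \frac{2}{r}\,T_{\langle r\theta\rangle}, \\
(\operatorname{div} T)_{\langle z\rangle} &= \frac{\partial T_{\langle rz\rangle}}{\partial r} + \frac{1}{r}\frac{\partial T_{\langle\theta z\rangle}}{\partial\theta} + \frac{\partial T_{\langle zz\rangle}}{\partial z} + \frac{1}{r}\,T_{\langle rz\rangle}.
\end{aligned} \tag{A.82}$$

c) Spherical coordinate system $(r, \theta, \phi)$:

$$\begin{aligned}
(\operatorname{div} T)_{\langle r\rangle} &= \frac{\partial T_{\langle rr\rangle}}{\partial r} + \frac{1}{r}\frac{\partial T_{\langle r\theta\rangle}}{\partial\theta} + \frac{1}{r\sin\theta}\frac{\partial T_{\langle r\phi\rangle}}{\partial\phi} \\
&\quad + \frac{1}{r}\big(2T_{\langle rr\rangle} - T_{\langle\theta\theta\rangle} - T_{\langle\phi\phi\rangle} + \cot\theta\; T_{\langle r\theta\rangle}\big), \\
(\operatorname{div} T)_{\langle\theta\rangle} &= \frac{\partial T_{\langle r\theta\rangle}}{\partial r} + \frac{1}{r}\frac{\partial T_{\langle\theta\theta\rangle}}{\partial\theta} + \frac{1}{r\sin\theta}\frac{\partial T_{\langle\theta\phi\rangle}}{\partial\phi} \\
&\quad + \frac{1}{r}\big(3T_{\langle r\theta\rangle} + \cot\theta\,(T_{\langle\theta\theta\rangle} - T_{\langle\phi\phi\rangle})\big), \\
(\operatorname{div} T)_{\langle\phi\rangle} &= \frac{\partial T_{\langle r\phi\rangle}}{\partial r} + \frac{1}{r}\frac{\partial T_{\langle\theta\phi\rangle}}{\partial\theta} + \frac{1}{r\sin\theta}\frac{\partial T_{\langle\phi\phi\rangle}}{\partial\phi} \\
&\quad + \frac{1}{r}\big(3T_{\langle r\phi\rangle} + 2\cot\theta\; T_{\langle\theta\phi\rangle}\big).
\end{aligned} \tag{A.83}$$

□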

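Exercise A.2.8 Let $\mathbf{u}$ be a vector field. Show that

1) in the cylindrical coordinate system,

$$\operatorname{div}\mathbf{u} = \frac{\partial u_{\langle r\rangle}}{\partial r} + \frac{1}{r}\frac{\partial u_{\langle\theta\rangle}}{\partial\theta} + \frac{\partial u_{\langle z\rangle}}{\partial z} + \frac{1}{r}\,u_{\langle r\rangle};$$

2) in the spherical coordinate system,

$$\operatorname{div}\mathbf{u} = \frac{\partial u_{\langle r\rangle}}{\partial r} + \frac{1}{r}\frac{\partial u_{\langle\theta\rangle}}{\partial\theta} + \frac{1}{r\sin\theta}\frac{\partial u_{\langle\phi\rangle}}{\partial\phi} + \frac{2}{r}\,u_{\langle r\rangle} + \frac{\cot\theta}{r}\,u_{\langle\theta\rangle}.$$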

Exercise A.2.9 Let $\mathbf{u}$ be a vector field and $E = \frac{1}{2}(\nabla\mathbf{u} + \nabla\mathbf{u}^T)$. Express $E$ in cylindrical and spherical coordinate systems,
1) relative to the natural basis,
2) in terms of physical components.

Exercise A.2.10 Let $T$ be a symmetric tensor field. Compute $\operatorname{div} T$ in cylindrical and spherical coordinate systems,
1) relative to the natural basis,
2) in terms of physical components. (Verify (A.82) and (A.83).)

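Exercise A.2.11 Let $\Phi : \mathbb{R} \to \mathcal{E}$ be a curve. Suppose that $\{\mathbf{e}_i(\mathbf{x})\}$ is the natural basis and $\phi^i(t)$ is the coordinate of $\Phi(t)$ in the coordinate system $(x^i)$. Show that

1) $\dot{\Phi}(t) = \dot{\phi}^i(t)\; \mathbf{e}_i\big(\Phi(t)\big)$,

2) $\ddot{\Phi}(t) = \Big(\ddot{\phi}^i(t) + \dot{\phi}^j(t)\,\dot{\phi}^k(t)\;\Gamma_j{}^i{}_k\big(\Phi(t)\big)\Big)\, \mathbf{e}_i\big(\Phi(t)\big)$.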
