
A Companion to

Elementary Linear Algebra

for Physics Students

Gol Mohammad Nafisi∗

University of Tehran

Fall 2018

gmnafisi@ut.ac.ir


Contents

1 Prelude

2 Preliminaries
   2.1 Operations on Sets
   2.2 On Algebraic Structures

3 Vector Spaces
   3.1 Vector Space
   3.2 Linear Maps and Dual Space

4 Inner Product Space
   4.1 Inner Product, Norm and Inequalities
   4.2 Orthonormal Basis and Gram-Schmidt Procedure
   4.3 Orthogonal Polynomials

5 Ode to Hilbert Space
   5.1 Hilbert Space and Operator Algebra

6 Cadenza: Algebra Over a Field
   6.1 On Homomorphism and Isomorphism

7 Coda


1 Prelude

If others are glad to gaze upon your beauty, we are glad, night and day, with the single promise of seeing you. (Persian verse)

This outline is intended to serve as supplementary material for the Linear Algebra part of the undergraduate "Mathematical Physics" course in the Physics department curriculum. I prepared it to be used in weekly recitation sessions, and it is NOT a replacement for the class lectures or the course reference textbooks. With this goal in mind, I am not going to treat Linear Algebra rigorously (as is done in Math departments). Rather, my approach will be practical, more of a "survival kit for Physics undergrads". Just as "One does not simply walk into Mordor", one does not simply learn Mathematics by memorizing stuff and watching others do it. Therefore we will try many examples and exercises. Problems marked as exercises are going to be solved during the recitations, so I highly encourage you to ponder them and try to arrive at a solution on your own or by discussing them with your peers. I do not claim any originality for the material whatsoever. To quote the wonderful David Tong, "My primary contribution has been to borrow, steal and assimilate the best discussions and explanations I could find from the vast literature on the subject". I will list a couple of my favorite resources at the end of this section in case you would like to check them out. Last but not least, all comments are most welcome.

• Some Useful Resources:

. Linear Algebra Done Right, Sheldon Axler, Springer

. Mathematical Physics: A Modern Introduction to Its Foundations, Sadri Hassani, 2nd ed., Springer

. Essential Linear Algebra, Joel G. Broida

. Mathematics of Classical and Quantum Physics, F. W. Byron and R. W. Fuller, Dover

. Principles of Quantum Mechanics (Chapter 1), R. Shankar, Springer


2 Preliminaries

“The beginner should not be discouraged if he finds he does not have the prerequisites for reading the prerequisites.”

- Paul Halmos

In this note we are going to dive into the realm of Linear Algebra, but you might ask "What is Algebra in the first place?". Literally, Algebra means 'bone-setting' or 'the reunion of broken parts'! If this is not helpful, you can consider Algebra as a branch of Mathematics in which people study objects (like Sets), the structures that these objects build (like Categories, Groups, Rings, Modules, ...) and the relations between them. Familiar Set-like objects that you have encountered during your education are the number systems: Natural (N), Integer (Z), Rational (Q), Real (R) and Complex (C) numbers, for which we have:

N ⊂ Z ⊂ Q ⊂ R ⊂ C

Now if we want to build algebraic structures from Sets, first we need to define operations on them.

2.1 Operations on Sets

Since the foundation of Mathematics (up until 1945) is based on Sets, we begin by recalling some of the concepts that we need. Fortunately we all learned a thing or two about Sets during our high school education, so I will spare you the conceptual details.

Definition 1 (Map). Let X and Y be sets. A map f is a rule which assigns some y ∈ Y to each x ∈ X, and is denoted by

f : X → Y

Definition 2 (n-ary Cartesian Product). Let X1, ..., Xn be n sets. Then the n-ary Cartesian product of these n sets is the set of n-tuples defined as:

X1 × · · · × Xn = {(x1, ..., xn) | ∀i ∈ {1, ..., n} : xi ∈ Xi}


Definition 3 (n-ary Cartesian Power). Let X be a set. Then the n-ary Cartesian power of X is the set of n-tuples defined as:

Xⁿ := X × X × · · · × X (n factors) = {(x1, ..., xn) | ∀i ∈ {1, ..., n} : xi ∈ X}

A familiar example is the 2-dimensional plane R² = R × R, which is the set of all points (x, y) where x, y ∈ R. Now, an n-ary operation on X takes n elements of X and combines them into a single element of X. Examples of such operations are Unary (1-ary), Binary (2-ary), Ternary (3-ary), ... . We will restrict ourselves to Binary operations throughout this note.

Definition 4 (Binary Operation). Let X be a non-empty set. A map

⋆ : X × X → X

is called a Binary operation on X. It takes x1, x2 ∈ X and combines them into a single element x1 ⋆ x2 ∈ X.

Now we say that ⋆ is

• associative if ∀x1, x2, x3 ∈ X : x1 ⋆ (x2 ⋆ x3) = (x1 ⋆ x2) ⋆ x3

• commutative if ∀x1, x2 ∈ X : x1 ⋆ x2 = x2 ⋆ x1

Familiar examples of Binary operations are Addition (+) and Multiplication (·), which can be defined on many algebraic structures.
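If you like to see this concretely, here is a small Python sketch (my own illustration, not part of any reference) that spot-checks associativity and commutativity of a binary operation on a finite sample of integers; a passing check over finitely many cases is only evidence, never a proof:

    from itertools import product

    def is_associative(op, samples):
        return all(op(a, op(b, c)) == op(op(a, b), c)
                   for a, b, c in product(samples, repeat=3))

    def is_commutative(op, samples):
        return all(op(a, b) == op(b, a) for a, b in product(samples, repeat=2))

    samples = range(-5, 6)
    print(is_associative(lambda a, b: a + b, samples))  # True: addition is associative
    print(is_commutative(lambda a, b: a - b, samples))  # False: subtraction is not commutative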

2.2 On Algebraic Structures

One can classify Algebraic structures based on the number of sets and operations involved. Examples of one-set structures are:

• Group-like structures, which involve one Binary operation (e.g. Monoid, Group, ...)

• Ring-like structures, which involve two Binary operations (e.g. Ring, Field, ...)

Examples of two-set structures are:

• Module-like structures, which involve at least two Binary operations (e.g. Module, Vector Space, ...)


• Algebra-like structures which are defined over a ring and a module and involve two operations on the ring,

two on the module and one involving both the ring and the module, making up to five Binary operations

(e.g. Lie algebra, Inner Product Space, ...)

Note that we can construct other classes of structures by combining these Algebraic structures with structures which are not Algebraic themselves (like Topological ones). One important example of such a class is the "Normed Vector Space", which is the Mathematical foundation of Quantum Mechanics, and we will soon meet her majesty. By the way, you do not need to worry about these fancy terms here. Let me tell you a story:

Suppose that a good friend of yours suggests a piece of music. Let's say it is Joe Satriani's "Jumpin' In". You hear it and your mind is blown. You want to play it! But you can't do it just by hearing it repeatedly (which was your method of playing so far), since there are sounds that you don't know how to produce. So you grab its music sheet, but you don't understand a thing! What does that piece of music have to do with these alien symbols? Later you find that every pitch of that track can be represented by those strange symbols, and the relations between them produce the different parts of the solos and riffs. You might even dig deeper and find out that in scales and octaves one can find traces of group theory or the Riemann zeta function. Now, in my naive view, I believe that is the case with Physics, us and Mathematics. If one wants to understand Nature, one could at least do it the courtesy of learning its alphabet, looking for its grammar and hearing its music. The term "universe" that we love to use so much comes from the Latin "universum", meaning "one verse, or one song", after all.


3 Vector Spaces

“Young man, in mathematics you don’t understand things. You just get used to them.”

- John von Neumann

3.1 Vector Space

In order to define a vector space, we first need to define a Field, since, as I mentioned earlier, a vector space is a Module-like structure, which means it consists of two sets. You might note that when we use the term vector, we do not mean a Geometrical arrow-like object per se, but rather an abstract object. Later we will see that we can represent this abstract object by an interesting and familiar class of objects called Matrices.

Definition 5 (Field). A structure (F, +, ·) is a field F if F is a non-empty set with at least two elements, with "+" (addition) and "·" (multiplication) as two Binary operations defined on F such that ∀a, b, c ∈ F we have the following axioms:

A.1. a + b = b + a (commutativity of addition)

A.2. a + (b + c) = (a + b) + c (associativity of addition)

A.3. ∃0 ∈ F s.t. a + 0 = a (existence of an additive identity)

A.4. ∃ −a ∈ F s.t. a + (−a) = 0 (existence of additive inverse)

A.5. a · b = b · a (commutativity of multiplication)

A.6. a · (b · c) = (a · b) · c (associativity of multiplication)

A.7. (a + b) · c = a · c + b · c (distributivity)

A.8. ∃1 ∈ F s.t. 1 ≠ 0 and a · 1 = a (existence of multiplicative identity)

A.9. if a ≠ 0 then ∃ a⁻¹ ∈ F (i.e. 1/a) s.t. a · a⁻¹ = 1 (existence of multiplicative inverse)
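As a quick illustration of axiom (A.9) (and only an illustration, since a finite check proves nothing about infinite sets), the integers modulo a prime p form a field, and we can verify the multiplicative inverses by brute force in Python:

    # Every nonzero element of Z_p (p prime) has a multiplicative inverse mod p.
    p = 5
    nonzero = range(1, p)
    print(all(any(a * b % p == 1 for b in nonzero) for a in nonzero))  # True
    # Contrast with plain integer arithmetic, where e.g. 2*b == 1 has no integer solution.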

A exercise 1. One can show that Q, R and C are fields by checking that they satisfy the field axioms. Now state an argument about whether Z is a field or not.¹

¹The symbol A amounts to nothing but fun! It's just a label that I like to use.


Definition 6 (Vector Space). Let V be a non-empty set (whose elements we call vectors) and F be a field

(whose elements we call scalars) and define two binary operations as:

• + : V × V → V s.t. ∀u,v ∈ V : u+ v ∈ V (vector addition)

• · : F × V → V s.t. ∀u ∈ V, a ∈ F : a · u ∈ V (scalar multiplication)

Now (V,F ,+, ·) is a vector space V over F , if ∀u,v,w ∈ V and ∀ a, b ∈ F the following eight axioms hold:

B.1. u+ v = v + u (commutativity of addition)

B.2. u+ (v +w) = (u+ v) +w (associativity of addition)

B.3. ∃0 ∈ V s.t. u+ 0 = u (identity element of addition)

B.4. ∃ − u ∈ V s.t. u+ (−u) = 0 (inverse elements of addition)

B.5. a · (u+ v) = a · u+ a · v (distributivity of scalar multiplication with respect to vector addition)

B.6. (a+ b) · u = a · u+ b · u (distributivity of scalar multiplication with respect to field addition)

B.7. ∃1 ∈ F s.t. 1 · u = u (identity element of scalar multiplication)

B.8. a · (b · u) = (a · b) · u (associativity of scalar multiplication with field multiplication)

Remark 1. From now on we will drop “·” from scalar multiplication a · v and write av instead. Also the bold

letters and numbers denote vectors while the plain ones will denote scalars. For example 0 is scalar while 0 is the

additive identity vector.

Example 1. Show that ∀a ∈ F and ∀v ∈ V we have:

i. 0v = 0   ii. a0 = 0   iii. (−1)v = −v

Solution 1.

i. We know that

0 + 1 = 1 ⇒ (0 + 1)v = v ⇒ (0 + 1)v + (−v) = 0

Now, using axiom (B.6) of def.6, (0 + 1)v = 0v + 1v = 0v + v, so the equation above becomes 0v + v + (−v) = 0, hence:

0v = 0


ii. We have:

a0 = a(0 + 0) = a0 + a0

Now, adding the additive inverse of a0, namely −a0, to both sides and using axiom (B.4), we get:

0 = a0

iii. We know that:

1 + (−1) = 0 ⇒ (1 + (−1))v = 0v ⇒ v + (−1)v = 0

Therefore from axiom (B.4), (−1)v must be the additive inverse of v, i.e. (−1)v = −v

Remark 2. Throughout this note we restrict F to be either R or C and the vector spaces defined on them are

called real or complex vector spaces respectively.

Some examples of vector spaces include:

• ∀xi ∈ R and ∀i ∈ {1, ..., n}, the set of n-component column vectors (x1, ..., xn)ᵀ forms a real vector space.

• ∀xij ∈ C, ∀i ∈ {1, ..., m} and ∀j ∈ {1, ..., n}, the set of m × n matrices

( x11 · · · x1n ; ... ; xm1 · · · xmn )

forms a complex vector space.

• Let S be a non-empty set and F be a field, and let F^S denote the set of functions from S to F. Now if:

i. ∀f, g ∈ F^S, x ∈ S we have (f + g)(x) = f(x) + g(x), so that f + g ∈ F^S

ii. ∀λ ∈ F we have (λf)(x) = λ f(x), so that λf ∈ F^S


then F^S is a vector space over F. For example, R^[0,1], which is the set of real-valued functions on the interval [0, 1], is a vector space over R.

• The set Pn[F] of all polynomials p(x) of degree at most n (n a positive integer), i.e. of the form

p(x) = a0 + a1 x + · · · + an xⁿ , with x, ai ∈ F

forms a vector space, in which vector addition and scalar multiplication are defined respectively as:

∑_{i=0}^{n} ai xⁱ + ∑_{i=0}^{n} bi xⁱ = ∑_{i=0}^{n} (ai + bi) xⁱ

c ∑_{i=0}^{n} ai xⁱ = ∑_{i=0}^{n} (c ai) xⁱ

A exercise 2. If V = C and F = R, then C is a vector space over R. Now let V = R and F = C. Does V form a vector space over F?

Definition 7 (Subspace). Let V be a vector space over F. A subset W of V is a subspace of V if W itself forms a vector space over F (with the same operations).

Example 2. Let M be the set of all 2 × 2 real matrices, which forms a real vector space. Consider the subset N = {A ∈ M | det(A) = 0}. Is N a subspace of M?

Solution 2. Consider the matrices A = ( 1 0 ; 0 0 ), B = ( 0 0 ; 0 1 ) ∈ N. From def.6 we see that if N is a vector space then it must be closed under vector addition, i.e. A + B should be in N. But A + B = ( 1 0 ; 0 1 ) ∉ N, since det(A + B) = 1. Hence from def.7, N is not a vector space, ergo it is not a subspace of M.


Definition 8 (Linear Combination). Let V be a vector space over F and let S = {v1, ..., vn} be a list of n vectors in V. Given any scalars a1, ..., an ∈ F, the vector

∑_{i=1}^{n} ai vi = a1 v1 + · · · + an vn   (1)

is called a linear combination of the n vectors, and the set of all such linear combinations of elements of S is called the subspace spanned by S, denoted span(v1, ..., vn).

Remark 3. A vector space V is said to be finite dimensional if it is spanned by some list of vectors in V. Otherwise

it is infinite dimensional.

Definition 9 (Linear Independence). A list of vectors {v1, ..., vn} in V is linearly independent if the equation

a1 v1 + · · · + an vn = 0   (2)

has only the trivial solution a1 = a2 = · · · = an = 0. Otherwise it is linearly dependent. If a list of vectors is linearly independent, it means that we cannot write any member of the list in terms of the others.

Example 3. Check the linear dependency of these vectors:

i. x1 = (1, 0, 0) , x2 = (0, 1, 0) , x3 = (0, 0, 1) are three vectors in R3.

ii. y1 = (1, 0, 0) , y2 = (0, 1, 2) , y3 = (1, 3, 6) are three vectors in R3.

Solution 3.

i. Since a1x1 + a2x2 + a3x3 = (a1, a2, a3), the equation (a1, a2, a3) = 0 holds only for a1 = a2 = a3 = 0. Hence the three vectors are linearly independent.

ii. Since y3 = y1 + 3y2, we can write one vector in terms of the others, hence the three vectors are linearly dependent.
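A handy numerical cross-check (an illustration of mine using NumPy, not a replacement for the argument above): a list of vectors is linearly independent iff the matrix having them as rows has full rank.

    import numpy as np

    X = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])   # the x_i of part i
    Y = np.array([[1, 0, 0], [0, 1, 2], [1, 3, 6]])   # the y_i of part ii
    print(np.linalg.matrix_rank(X))  # 3 -> linearly independent
    print(np.linalg.matrix_rank(Y))  # 2 -> linearly dependent (y3 = y1 + 3 y2)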

Now, if the vector space is the space of r-times continuously differentiable functions on an open interval I, denoted Cʳ(I), then there is a simple way to check the linear dependency of a list of its elements. First we need to define the Wronskian and use the theorem that follows.


Definition 10 (Wronskian). If f1, ..., fn are n real- or complex-valued (n − 1)-times differentiable functions on an interval I, then the Wronskian W(f1, ..., fn) is a function on I, defined as the determinant:

W(f1, ..., fn)(x) := | f1(x)       f2(x)       · · ·  fn(x)       ;
                       f1′(x)      f2′(x)      · · ·  fn′(x)      ;
                       ...                            ...         ;
                       f1⁽ⁿ⁻¹⁾(x)  f2⁽ⁿ⁻¹⁾(x)  · · ·  fn⁽ⁿ⁻¹⁾(x) |   (3)

where x ∈ I.

Theorem 1. Let f1, ..., fn be differentiable functions on the interval I. If for some x0 ∈ I the Wronskian W(f1, ..., fn)(x0) ≠ 0, then f1, ..., fn are linearly independent. If f1, ..., fn are linearly dependent, then

∀x ∈ I : W(f1, ..., fn)(x) = 0   (4)

Example 4. Let {eˣ, e²ˣ, e³ˣ} be a list of functions in the vector space C∞(−1, 1). Check their linear dependency.

Solution 4. Computing the Wronskian:

W(x) = | eˣ eˣ eˣ over the first column, etc., i.e. | eˣ e²ˣ e³ˣ ; eˣ 2e²ˣ 3e³ˣ ; eˣ 4e²ˣ 9e³ˣ | = eˣ e²ˣ e³ˣ | 1 1 1 ; 1 2 3 ; 1 4 9 | = 2e⁶ˣ

Now since W(x) ≠ 0 on (−1, 1), by theorem (1) these vectors are linearly independent.
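SymPy can reproduce this computation symbolically; the following is just an illustrative check of Example 4 using its built-in wronskian:

    import sympy as sp

    x = sp.symbols('x')
    W = sp.wronskian([sp.exp(x), sp.exp(2*x), sp.exp(3*x)], x)
    print(sp.simplify(W))   # 2*exp(6*x): nonzero everywhere -> independent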

Another important definition is the basis of a vector space.

Definition 11 (Basis). A basis of a vector space V over F is a list of vectors {v1, ..., vn} which is linearly independent and spans V, so that every v ∈ V can be written uniquely in the form:

v = ∑_{i=1}^{n} ai vi , ai ∈ F

The dimension of V, denoted dim(V), is the number of its basis vectors.


One important example of a vector space and its basis is the real vector space of 2 × 2 Hermitian matrices, which appears in Quantum Mechanics. I just mention that the basis of this space is the list {1, σ1, σ2, σ3}, in which 1 is the identity matrix and the σi are the Pauli matrices:

σ1 = ( 0 1 ; 1 0 ) , σ2 = ( 0 −i ; i 0 ) , σ3 = ( 1 0 ; 0 −1 )

so this vector space is 4-dimensional. Now we turn ourselves to the concepts of linear map and dual space.

3.2 Linear Maps and Dual Space

Definition 12 (Linear Map). A linear map (also called a linear transformation) from a vector space V to a vector space W is a map T : V → W that satisfies

T (v1 + v2) = Tv1 + Tv2 (additivity)

T (λv) = λ(Tv) (homogeneity)

The set of all linear maps from V to W, denoted L(V, W), is itself a vector space, with the operations defined for S, T ∈ L(V, W) and λ ∈ F by:

(S + T )v = Sv + Tv

(λT )v = λ(Tv)

A exercise 3. Let C(R) be the vector space of real functions. Define the map T as T (f(x)) = (f(x))2 for

f ∈ C(R). Determine whether T is a linear map or not.

Example 5. Let V = C∞(R) be the vector space of all C∞ real-valued functions (smooth functions, differentiable to all orders). Let L(V) be the vector space of all linear transformations from V to V. Prove that:

i. Differentiation, the map D defined by D(f(x)) = d/dx f(x), and integration, the map S defined by S(f(x)) = ∫ f(x) dx, are linear maps for all f ∈ V.

ii. Let T1, T2, T3 ∈ L(V) be defined as:

T1(f(x)) = d/dx f(x) , T2(f(x)) = d²/dx² f(x) , T3(f(x)) = ∫₀ˣ f(t) dt

Then determine whether the list {T1, T2, T3} is linearly independent or not.


Solution 5.

i. For f, g ∈ C∞(R) and λ, µ ∈ R we have:

D(λf(x) + µg(x)) = d/dx (λf(x) + µg(x)) = λ d/dx f(x) + µ d/dx g(x) = λD(f(x)) + µD(g(x))

The same argument goes for integration as well:

S(λf(x) + µg(x)) = ∫ (λf(x) + µg(x)) dx = λ ∫ f(x) dx + µ ∫ g(x) dx = λS(f(x)) + µS(g(x))

ii. We need to show that a1T1 + a2T2 + a3T3 = 0 (for a1, a2, a3 ∈ R, acting on all f ∈ C∞(R)) has only the trivial solution a1 = a2 = a3 = 0. First let f(x) = 1. Then:

T1(1) = T2(1) = 0 , T3(1) = ∫₀ˣ 1 dt = x ⇒ a3 x = 0 ⇒ a3 = 0

Now let f(x) = x. Then:

T1(x) = 1 , T2(x) = 0 ⇒ a1 = 0

Finally let f(x) = x². Then:

T2(x²) = 2 ⇒ 2a2 = 0 ⇒ a2 = 0

Therefore only the trivial solution exists, ergo the list is linearly independent.
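If you want to replay part ii by machine, here is an illustrative SymPy sketch of mine: apply a1 T1 + a2 T2 + a3 T3 to the test functions 1, x, x² and solve the resulting equations for the coefficients.

    import sympy as sp

    x, t, a1, a2, a3 = sp.symbols('x t a1 a2 a3')

    def combo(f):
        # (a1*T1 + a2*T2 + a3*T3) applied to f
        return (a1 * sp.diff(f, x) + a2 * sp.diff(f, x, 2)
                + a3 * sp.integrate(f.subs(x, t), (t, 0, x)))

    eqs = [combo(sp.Integer(1)), combo(x), combo(x**2)]
    print(sp.solve(eqs, [a1, a2, a3]))   # {a1: 0, a2: 0, a3: 0}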

One can safely say that linear algebra deals with vector spaces, linear maps and the objects that can represent them. If a linear map is from a vector space to itself, it is called an Operator, and the vector space of operators is denoted by L(V). These are the maps Physicists use in Quantum Mechanics, and to quote Axler, "The deepest and most important parts of linear algebra deal with operators." Now, since L(V, W) is a vector space, it comes with one nice feature: the ability to multiply vectors (maps or operators), which will come in quite handy, for example when applying multiple operators to a state ket for measuring an observable in Quantum Mechanics.


Definition 13 (Product of Linear Maps). Let T ∈ L(U ,V) and S ∈ L(V,W), then the product ST ∈ L(U ,W)

is defined by:

(ST )u = S(Tu) ∀u ∈ U

with the properties:

• associativity: (T1T2)T3 = T1(T2T3)

• identity: TI = IT = T

• distributivity: (S1 + S2)T = S1T + S2T and S(T1 + T2) = ST1 + ST2

There is a special type of linear map which maps a vector space into its field F. Such maps are called linear functionals, and the vector space they form defines the Dual Space.

Definition 14 (Linear Functional). A linear functional on a vector space V is a linear map ϕ : V → F.

For example, these maps are linear functionals:

• ϕ : R³ → R s.t. ϕ(x, y, z) = 4x − 5y + 2z

• For fixed (a1, ..., an) ∈ Fⁿ, ϕ : Fⁿ → F s.t. ϕ(x1, ..., xn) = ∑_{i=1}^{n} ai xi

• ϕ : P[R] → R s.t. ϕ(p) = ∫₀¹ p(x) dx

Definition 15 (Dual Space). The dual space of V, denoted V∗, is the vector space of all linear functionals on V; for finite-dimensional V it satisfies dim(V) = dim(V∗). Its elements are called "dual vectors", "covectors" or "one-forms".

Now, like V, the dual space V∗ also has a basis, called the dual basis. In Dirac notation (i.e. bra-ket notation), each vector v is called a "ket" and denoted |v⟩, and each dual vector ϕ is called a "bra", denoted ⟨ϕ|. We should note that the kets and bras are elements of a special type of vector space called a complex Hilbert space, denoted H, which will be introduced later. Hence the action of a linear functional ϕ : H → C is written neatly as ⟨ϕ|v⟩, which is an element of C.

Definition 16 (Dual Basis). If {v1, ..., vn} is a basis for V, then its dual basis is the list {v1∗, ..., vn∗} of linear functionals on V such that:

vi∗(vj) = δij , where δij = 1 if i = j and 0 if i ≠ j   (5)


Warning 1. Sometimes people use the convention vi∗ ≡ vⁱ to denote the dual vectors. Moreover, sometimes people (especially in General Relativity texts) choose to write a vector by its components as V = Vⁱ eᵢ, with Vⁱ the components (scalars) and eᵢ the basis of the vector space; they then write a dual vector as V∗ = Vᵢ e∗ⁱ, with Vᵢ the dual components and e∗ⁱ the dual basis. Frustrating, right?! The important thing to do while reading a book or a paper is to determine which conventions the author is using.

Remark 4 (Dual Basis in Dirac Notation). In our bra-ket notation, if {|e1⟩, ..., |en⟩} is a basis of H, then its dual basis is the list {⟨ε1|, ..., ⟨εn|} of elements of H∗, where each ⟨εi| is the linear functional on H such that

⟨εi|ej⟩ = δij , where δij = 1 if i = j and 0 if i ≠ j

Remark 5. Any vector |ψ⟩ can be expressed in terms of basis vectors as |ψ⟩ = ∑_j αj |ej⟩ with αj ∈ C. Any linear functional ⟨ϕ| can be expressed in terms of the dual basis as ⟨ϕ| = ∑_i βi ⟨εi| with βi ∈ C. Therefore the action of a one-form on a vector can be expressed as:

⟨ϕ|ψ⟩ = ∑_{ij} αj βi ⟨εi|ej⟩ = ∑_{ij} αj βi δij = ∑_i αi βi   (6)

Remark 6. One can set up a one-to-one correspondence between elements of H and H∗ for fixed bases of both spaces. In this case, corresponding to a ket |ψ⟩ = ∑_k αk |ek⟩ ∈ H, there exists a bra ⟨ψ| = ∑_k ᾱk ⟨εk| ∈ H∗, where ᾱk is the complex conjugate of αk. Note that from now on I will use the standard "bar" notation for complex conjugation, to distinguish it from the "asterisk (∗)" used for dual vectors. The reason for this conjugation is that it enables one to define a norm. We will discuss these issues later.

Remark 7. One useful way of representing bras and kets is to consider |ψ⟩ as a column vector and ⟨ϕ| as a row vector:

|ψ⟩ := (α1, ..., αn)ᵀ and ⟨ϕ| := (β1 · · · βn)

so that ⟨ϕ|ψ⟩ is regarded as just the matrix multiplication of a row vector and a column vector, which yields the scalar

∑_{i=1}^{n} βi αi ∈ C


Example 6. Let's consider R² with the familiar basis {i, j}. Now suppose that a linear functional f : R² → R is defined such that we know f(i − j) = 2 and f(i + j) = 0. What is the action of f on the individual basis vectors, i.e. f(i) and f(j)?

a. f(i) = 1, f(j) = 0   b. f(i) = −1, f(j) = 1   c. f(i) = 1, f(j) = −1   d. f(i) = 2, f(j) = 0

Solution 6. Using the fact that a linear functional is indeed linear, i.e. f(αi ± βj) = αf(i) ± βf(j), we have here the system of linear equations:

f(i) − f(j) = 2 , f(i) + f(j) = 0

Solving it by the methods you are familiar with, we get:

f(i) = 1 , f(j) = −1
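For what it's worth, the same system can be handed to NumPy (purely illustrative):

    import numpy as np

    A = np.array([[1.0, -1.0],   # f(i) - f(j) = 2
                  [1.0,  1.0]])  # f(i) + f(j) = 0
    b = np.array([2.0, 0.0])
    print(np.linalg.solve(A, b))  # [ 1. -1.]  i.e. option c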


4 Inner Product Space

“You know, you remind me of a poem I can’t remember,

and a song that may never have existed, and a place I’m not sure I’ve ever been to.”

- Grampa Simpson

One can define additional structure on a given algebraic or topological structure. This enables one to introduce and explore extra rich structures and, in the case of a Physicist, to apply them to the Physical phenomena he/she is studying. Now, for vector spaces, one can define an additional structure called an inner product, which is a generalization of the usual dot product and of the geometric intuition we have about vectors, like the length of a vector and the angle between vectors. In fact, one can think of an inner product space as a generalization of Euclidean space, in which the inner product is the scalar or dot product; we will begin this section by reviewing it. Vector spaces endowed with an inner product (inner product spaces) are truly important in Physics. Among these types of spaces there is a very special one upon which all of Quantum Mechanics is built. It is entitled the "Hilbert Space", after the influential 19th and early 20th century Mathematician David Hilbert. Although the treatment of Hilbert space is done in a branch of Analysis called "Functional Analysis", which is beyond the scope of this note, we will indeed scratch its surface to hopefully prepare some essentials for your future journey into the heart of Quantum Mechanics.

4.1 Inner Product, Norm and Inequalities

First let us recall the definition of dot product on Euclidean space.

Definition 17 (Dot Product). ∀x, y ∈ Rⁿ, the dot product of x and y, denoted x · y, is defined as:

x · y := x1y1 + x2y2 + · · · + xnyn = ∑_{i=1}^{n} xi yi   (7)

where x = (x1, x2, ..., xn) and y = (y1, y2, ..., yn). It satisfies these conditions:

1. x · x ≥ 0   2. x · x = 0 iff x = 0   3. x · y = y · x

Definition 18 (Inner Product). Let V be a vector space over F (where F is either R or C). An inner product on V is a map ⟨· , ·⟩ : V × V → F such that ∀u, v, w ∈ V and ∀λ ∈ F we have the following axioms:


C.1. ⟨v, v⟩ ≥ 0 (positivity)

C.2. ⟨v, v⟩ = 0 iff v = 0 (definiteness)

C.3. ⟨u + v, w⟩ = ⟨u, w⟩ + ⟨v, w⟩ (additivity in the first slot)

C.4. ⟨λu, v⟩ = λ ⟨u, v⟩ (homogeneity in the first slot)

C.5. ⟨u, v⟩ = ⟨v, u⟩‾ (conjugate symmetry; the bar denotes complex conjugation)

Remark 8. Following the standard notation for conjugation, i.e. "bar" instead of "asterisk", for z = x + iy ∈ C we have:

1. z̄ = x − iy

2. (z1 z2)‾ = z̄1 z̄2

3. (z1 ± z2)‾ = z̄1 ± z̄2

4. |z̄| = |z|

5. z z̄ = |z|²

6. (z1/z2)‾ = z̄1/z̄2

Warning 2. You should note that in axiom (C.4) I have chosen a convention used in the Mathematics literature, since I learned a great deal of this stuff from it and, to be honest, I am more comfortable with those conventions. You might encounter a case (most often in the Physics literature) where the author chooses the alternative convention and writes ⟨λu, v⟩ = λ̄ ⟨u, v⟩ instead. Also, they might demand additivity in the second slot, ⟨u, v + w⟩ = ⟨u, v⟩ + ⟨u, w⟩, instead of the first slot that we chose. Although it is OK to choose any convention (as long as the author keeps it to the end, as I mentioned earlier), you should know that the convention of linearity in the second slot comes from defining the inner product through the bilinear form and its generalization, the sesquilinear form, where the inner product is denoted (· , ·) instead of our ⟨· , ·⟩. Please take my warning.1 together with this one seriously to avoid any confusion from now on.

Example 7. Axiom (C.3) states that the inner product is additive in the first slot. Does this property hold in the second slot as well?

Solution 7. Using axioms (C.4) and (C.5) we have:

⟨u, λv + βw⟩ = ⟨λv + βw, u⟩‾ = λ̄ ⟨v, u⟩‾ + β̄ ⟨w, u⟩‾ = λ̄ ⟨u, v⟩ + β̄ ⟨u, w⟩

Hence the inner product is anti-linear, or conjugate linear, in the second slot. Note that a map f on a complex vector space is anti-linear if f(av + bw) = ā f(v) + b̄ f(w).


Some examples of inner products are:

• The inner product on Cⁿ defined as:

⟨x, y⟩ := ∑_{i=1}^{n} xi ȳi   (8)

for x = (x1, ..., xn), y = (y1, ..., yn) ∈ Cⁿ

• The most general form of an inner product on Cⁿ is called a Hermitian form and is defined as:

⟨x, y⟩ := y† M x   (9)

where "†" (dagger) is the conjugate transpose (also called Hermitian conjugate or Hermitian adjoint) and M is any positive-definite Hermitian matrix, i.e. a Hermitian matrix all of whose eigenvalues are positive.

• Let C[a,b] be the vector space of all continuous complex-valued functions on the interval [a, b]. Then the inner product on C[a,b] is defined as:

⟨f, g⟩ := ∫_a^b f(x) ḡ(x) dx   (10)

• The inner product on P[R] is defined as:

⟨p, q⟩ := ∫₀^∞ p(x) q(x) e⁻ˣ dx   (11)

Definition 19 (Inner Product Space). An inner product space is a vector space V with an inner product defined on V.

Definition 20 (Norm). The norm of a vector v ∈ V, denoted ∥v∥, is given by:

∥v∥ := √⟨v, v⟩   (12)

with the properties:

i. ∥v∥ = 0 iff v = 0   ii. ∥λv∥ = |λ| ∥v∥

A exercise 4. Let x = (x1, ..., xn) ∈ Rⁿ and z = (z1, ..., zn) ∈ Cⁿ. Write down the explicit formulas for ∥x∥ and ∥z∥.


Figure 1: going from A to B in a taxicab world ©Wikipedia

Note that the norm defined by def.(20) is called the Euclidean norm if v ∈ Rⁿ, and the 2-norm (or L²-norm) in general. And guess what?! We have other kinds of norms for a given vector as well. They are called "p-norms".

Definition 21 (p-norm). Let x = (x1, ..., xn) be a vector in Fⁿ. For a real number p ≥ 1, the p-norm (or Lᵖ-norm) of x is defined as:

∥x∥p := (|x1|ᵖ + |x2|ᵖ + · · · + |xn|ᵖ)^(1/p) = (∑_{i=1}^{n} |xi|ᵖ)^(1/p)   (13)

When p → ∞ we have the L∞-norm, called the maximum norm or uniform norm, given by:

∥x∥∞ = max{|x1|, |x2|, ..., |xn|}   (14)

A exercise 5. Let v = (1, 2, 3) be a vector in R3. Calculate ∥v∥p for p = 1, 2, 3 and its L∞-norm.

It is fun to dig into the p = 1 or L¹-norm a little, since people have given it an interesting name: the "Taxicab" or "Manhattan" norm! Imagine that we are living in a city laid out in a grid system, where the measure of distance is an edge of a square cell being one unit (fig.1). Let's say that your home is at A and you want to go to your friend's at B, and you can't get there by flying of course, or by jumping rooftops, since you're not Batman/Catwoman. Are you? Now you either have to walk or take a cab (or any other vehicle). How many units of distance do you need to travel to get there? If you look again at fig.1, you can see that it's 12 units. In fact, some possible paths that you might take are shown by the red, blue and yellow lines. What about the green line? That's for when you can fly straight from A to B. How many units for the green line? Yes, approximately 8.49 units, by the ancient Pythagorean theorem.


More formally, the metric (or p-distance) dp between two vectors x = (x1, ..., xn) and y = (y1, ..., yn) is defined (the notion of metric comes from a more general space called a metric space) as:

dp(x, y) = ∥x − y∥p = (∑_{i=1}^{n} |xi − yi|ᵖ)^(1/p)   (15)

So for our example, if we set the coordinates of your home (x1, y1) at the origin (0, 0) and your friend's at (x2, y2) = (6, 6), then the taxicab distance will be d1 = |x1 − x2| + |y1 − y2| = 12.
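You can let NumPy do the walking; the following sketch (mine, purely illustrative) recomputes the distances of fig.1:

    import numpy as np

    home   = np.array([0.0, 0.0])
    friend = np.array([6.0, 6.0])
    d = friend - home
    print(np.linalg.norm(d, ord=1))       # 12.0      taxicab / Manhattan
    print(np.linalg.norm(d, ord=2))       # 8.4852... straight-line ("flying")
    print(np.linalg.norm(d, ord=np.inf))  # 6.0       maximum norm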

Definition 22 (Matrix Norm). For matrices one can define different types of norms as well. The formal definition comes from the operator norm or induced norm. Let K^{m×n} be the vector space of all m × n matrices over K. Let ∥·∥p be a norm on Kⁿ and ∥·∥q be a norm on Kᵐ. Then the matrix norm induced by these norms is defined by:

∥A∥p,q = max{ ∥Ax∥q / ∥x∥p : x ∈ Kⁿ and x ≠ 0 }   (16)

If we set p = q (which we will assume from now on), then:

∥A∥p = max{ ∥Ax∥p / ∥x∥p : x ∈ Kⁿ and x ≠ 0 }   (17)

The intuitive idea behind the concept of a matrix norm is that it measures how much the matrix A amplifies a given input vector. Why? First, in functional analysis there is a special class of linear operators called bounded operators, for which:

∀v ∈ V : ∥Lv∥ ≤ ∥L∥op ∥v∥   (18)

The action of the operator on the elements of a vector space forms a set whose elements are the ratios ∥Lv∥/∥v∥ of the norm of the output to the norm of the vector being acted on. The norm of L is then defined as the supremum of this set:

∥L∥op := sup{ ∥Lv∥/∥v∥ : v ∈ V with v ≠ 0 }   (19)

From this definition one can see that for any bounded operator, the norm is the largest scaling of an input. Now, since any bounded operator on a Hilbert space admits a matrix representation (we can represent it by a specific matrix), our argument for the operator norm carries over to the matrix norm as well. Second, as we will soon see below, one can define a norm of a matrix by its largest eigenvalue. But what is the significance of eigenvalues in the first place?


Although we will come back to the related issues on Matrices in detail later in this note, for now just think of eigenvalues as magnitudes of scaling for certain vectors under the action of a given matrix. This comes from the classic eigenvalue problem, in which λ ∈ C being an eigenvalue of a matrix A means satisfying the equation:

Av = λv

where the vector v is called an eigenvector. So the largest λ means the most scaling of v.

Remark 9. Calculating a matrix norm using the definition above is not an easy thing to do. However, there are some special cases in which the computation becomes simpler. These are the cases p = 1, 2, ∞, together with the Frobenius norm:

i. (maximum absolute column sum norm): For p = 1 we have

∥A∥1 = max_j ∑_i |aij|   (20)

ii. (spectral norm): Let A† be the conjugate transpose of A and {λi(A†A)} be the eigenvalues of A†A. Then for p = 2 we have

∥A∥2 = max_i √(λi(A†A))   (21)

iii. (maximum absolute row sum norm): The ∞-norm is

∥A∥∞ = max_i ∑_j |aij|   (22)

iv. (Frobenius norm):

∥A∥F = √(∑_{i=1}^{m} ∑_{j=1}^{n} |aij|²) = √(tr(A†A))   (23)

If A is an operator on a Hilbert space, the Frobenius norm is called the Hilbert-Schmidt norm, denoted ∥A∥HS.

Example 8. Let A = ( 1 2 ; 3 4 ). Calculate the norms ∥A∥1, ∥A∥2, ∥A∥∞, ∥A∥F.


Solution 8.

a. ∥A∥1 = max{|1| + |3|, |2| + |4|} = max{4, 6} = 6

b. ∥A∥∞ = max{|1| + |2|, |3| + |4|} = max{3, 7} = 7

c. ∥A∥F = √(|1|² + |2|² + |3|² + |4|²) = √30 ≈ 5.4772

d. A†A = ( 10 14 ; 14 20 ). To find its eigenvalues, we need the roots of its characteristic equation:

det( A†A − λ1 ) = 0 ⇒ λ² − 30λ + 4 = 0 ⇒ λ = 15 ± √221

Hence ∥A∥2 = max{ √(15 + √221), √(15 − √221) } = √(15 + √221) ≈ 5.4649
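All four norms of Example 8 are built into NumPy's np.linalg.norm, so we can cross-check the hand computation (illustrative):

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [3.0, 4.0]])
    print(np.linalg.norm(A, 1))       # 6.0        max absolute column sum
    print(np.linalg.norm(A, np.inf))  # 7.0        max absolute row sum
    print(np.linalg.norm(A, 'fro'))   # 5.4772...  Frobenius
    print(np.linalg.norm(A, 2))       # 5.4649...  spectral norm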

A exercise 6. Calculate the ∥A∥1, ∥A∥2, ∥A∥∞, ∥A∥F norms of A = ( 2 −2 1 ; −1 3 −1 ; 2 −4 1 ).

Definition 23 (Orthogonality). Two vectors u,v are called orthogonal if ⟨u,v⟩ = 0 .

Example 9 (Pythagorean Theorem). Suppose that u, v are orthogonal vectors in V. Prove that ∥u + v∥² = ∥u∥² + ∥v∥².

Solution 9. ∥u + v∥² = ⟨u + v, u + v⟩ = ⟨u, u⟩ + ⟨u, v⟩ + ⟨v, u⟩ + ⟨v, v⟩ = ∥u∥² + ∥v∥²

Theorem 2 (Cauchy-Schwarz Inequality). Let V be an inner product space. Then for u, v ∈ V the following inequality holds:

|⟨u, v⟩| ≤ ∥u∥ ∥v∥   (24)


The equality holds iff one of the vectors is a scalar multiple of the other. In Rⁿ this inequality becomes:

(∑_{i=1}^{n} ui vi)² ≤ (∑_{i=1}^{n} ui²)(∑_{i=1}^{n} vi²)   (25)

Here we are not going to prove this theorem or the following one; you can find the proofs in almost every related textbook. Instead, we will use the inequality to obtain a relation between the L¹ and L² norms.

Example 10. Prove that ∀x ∈ Rⁿ : ∥x∥1 ≤ √n ∥x∥2.

Solution 10. From the definition we have:

∥x∥1 = ∑_{i=1}^{n} |xi| = ∑_{i=1}^{n} |xi| · 1

The right hand side is the inner product of the vectors (|x1|, ..., |xn|) and (1, ..., 1), and therefore satisfies the Cauchy-Schwarz inequality. Hence

∥x∥1 ≤ (∑_{i=1}^{n} |xi|²)^(1/2) (∑_{i=1}^{n} 1²)^(1/2) = √n ∥x∥2

Theorem 3 (Triangle Inequality). ∥u + v∥ ≤ ∥u∥ + ∥v∥. Again, the equality holds iff one vector is a non-negative multiple of the other.

A exercise 7. Prove the parallelogram law: ∥u + v∥² + ∥u − v∥² = 2(∥u∥² + ∥v∥²).

4.2 Orthonormal Basis and Gram-Schmidt Procedure

Definition 24 (Orthonormal Basis). A list of vectors {e1, ..., en} is called orthonormal if each vector in the list has norm 1 and is orthogonal to all the other vectors in the list. In other words:

⟨ei, ej⟩ = δij   (26)

Such a list is called complete if it is not a subset of any larger orthonormal list of vectors in the given vector space. Any complete orthonormal list is a basis. An orthonormal list of vectors which is also a basis is called an orthonormal basis.

A exercise 8. Do the vectors (1/√3, 1/√3, 1/√3), (−1/√2, 1/√2, 0) and (1/√6, 1/√6, −2/√6) form an orthonormal list?


An orthonormal basis is especially important in Quantum Mechanics. For example, for spin-1/2 systems we can choose either |Sz; ±⟩ or |Sx; ±⟩ as the basis kets. Then we want to know how these two descriptions are related to each other. That is done by a unitary transformation, and the process is called a change of basis or change of representation. If we are talking about matrices, a change of basis will be a rotation. The condition for performing such a process is to have at least two orthonormal bases. Another important example is the case of function spaces, as in Fourier Analysis. The reason why a function can be represented by its Fourier decomposition is that the set of exponential functions {e^{int}} forms an orthonormal basis:

(1/2π) ∫_{−π}^{π} e^{int} e^{−imt} dt = 1 if n = m, and 0 if n ≠ m   (27)

Now, if we are given a basis for some inner product space V, it is possible to construct an orthonormal basis for the given space. This can be done by a process called the "Gram-Schmidt procedure".

Definition 25 (Gram-Schmidt Procedure). Suppose {v1, ..., vm} is a linearly independent list of vectors in V. Let

e1 = v1 / ∥v1∥

Now for j = 2, 3, ..., m, define ej by:

ej = (vj − ⟨vj, e1⟩ e1 − · · · − ⟨vj, ej−1⟩ ej−1) / ∥vj − ⟨vj, e1⟩ e1 − · · · − ⟨vj, ej−1⟩ ej−1∥   (28)

Then {e1, ..., em} is an orthonormal list of vectors in V.
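Eq. (28) translates almost line by line into code. Here is a minimal NumPy implementation (written by me for clarity, not numerical robustness), which we can feed the vectors of Example 11 below:

    import numpy as np

    def gram_schmidt(vectors):
        basis = []
        for v in vectors:
            w = v.astype(float)
            for e in basis:                 # subtract the projections <v, e> e
                w = w - np.dot(v, e) * e
            basis.append(w / np.linalg.norm(w))
        return basis

    vs = [np.array([1, 1, -2]), np.array([1, 2, -3]), np.array([0, 1, 1])]
    for e in gram_schmidt(vs):
        print(np.round(e, 4))   # matches e1, e2, e3 found in Solution 11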

Example 11. For the given vectors in R3:

v1 = (1, 1,−2) , v2 = (1, 2,−3) , v3 = (0, 1, 1)

find an orthonormal set of vectors.

Solution 11. By the Gram-Schmidt procedure, letting e1 = v1/∥v1∥, we have:

∥v1∥ = √⟨v1, v1⟩ = √(1² + 1² + (−2)²) = √6 ⇒ e1 = (1/√6)(1, 1, −2)

For the next vector we have e2 = (v2 − ⟨v2, e1⟩ e1) / ∥v2 − ⟨v2, e1⟩ e1∥. Hence:

v2 − ⟨v2, e1⟩ e1 = (1, 2, −3) − (9/√6)(1/√6)(1, 1, −2) = (−1/2, 1/2, 0) ⇒ ∥v2 − ⟨v2, e1⟩ e1∥ = 1/√2

⇒ e2 = (1/√2)(−1, 1, 0)

Now for the last vector we have e3 = (v3 − ⟨v3, e1⟩ e1 − ⟨v3, e2⟩ e2) / ∥v3 − ⟨v3, e1⟩ e1 − ⟨v3, e2⟩ e2∥, so that:

v3 − ⟨v3, e1⟩ e1 − ⟨v3, e2⟩ e2 = (0, 1, 1) − (−1/√6)(1/√6)(1, 1, −2) − (1/√2)(1/√2)(−1, 1, 0) = (2/3)(1, 1, 1)

∥v3 − ⟨v3, e1⟩ e1 − ⟨v3, e2⟩ e2∥ = 2/√3 ⇒ e3 = (1/√3)(1, 1, 1)

Therefore the obtained {e1, e2, e3} is the desired orthonormal list.

Example 12. Find an orthonormal basis of P(R) with the basis {1, x, x²}, where the inner product is given by ⟨p, q⟩ = ∫_{−1}^{1} p(x) q(x) dx.

Solution 12. By the Gram-Schmidt procedure, we first evaluate e1 from v1 ≡ 1 via e1 = 1/∥1∥, where ∥1∥ is obtained from the given inner product:

∥1∥ = √⟨1, 1⟩ = √(∫_{−1}^{1} 1² dx) = √2 ⇒ e1 = 1/√2

Now for the next basis vector we have:

e2 = (v2 − ⟨v2, e1⟩ e1) / ∥v2 − ⟨v2, e1⟩ e1∥ = (x − ⟨x, e1⟩ e1) / ∥x − ⟨x, e1⟩ e1∥

So:

x − ⟨x, e1⟩ e1 = x − (∫_{−1}^{1} x (1/√2) dx)(1/√2) = x and ∥x∥ = √(∫_{−1}^{1} x² dx) = √(2/3) ⇒ e2 = √(3/2) x

Now for the third basis vector we have:

e3 = (x² − ⟨x², e1⟩ e1 − ⟨x², e2⟩ e2) / ∥x² − ⟨x², e1⟩ e1 − ⟨x², e2⟩ e2∥

Now:

x² − ⟨x², e1⟩ e1 − ⟨x², e2⟩ e2 = x² − (∫_{−1}^{1} x² (1/√2) dx)(1/√2) − (∫_{−1}^{1} x² (√(3/2) x) dx) √(3/2) x = x² − 1/3

And:

∥x² − 1/3∥ = √(∫_{−1}^{1} (x⁴ − (2/3)x² + 1/9) dx) = √(8/45) ⇒ e3 = √(45/8)(x² − 1/3)

Therefore {1/√2, √(3/2) x, √(45/8)(x² − 1/3)} is an orthonormal list in P(R).
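The same procedure can be run symbolically; this illustrative SymPy sketch of mine redoes Example 12 with the integral inner product:

    import sympy as sp

    x = sp.symbols('x')

    def inner(p, q):
        return sp.integrate(p * q, (x, -1, 1))

    basis = []
    for v in (sp.Integer(1), x, x**2):
        w = v - sum(inner(v, e) * e for e in basis)
        basis.append(sp.simplify(w / sp.sqrt(inner(w, w))))
    print(basis)   # [sqrt(2)/2, sqrt(6)*x/2, 3*sqrt(10)*(x**2 - 1/3)/4], up to form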

A exercise 9. Find e4 and e5 for example 12 by continuing the Gram-Schmidt procedure.

4.3 Orthogonal Polynomials

The orthonormal basis {ei} that we found in the previous example is among the important special polynomials used in Physics. They are called "Legendre Polynomials", and they appear, for example, in the solutions of the Laplace equation ∇²Φ = 0 in Electrostatics and of the Schrodinger equation in Quantum Mechanics, both separated in spherical coordinates. For the Laplace equation we have:

d/dx [ (1 − x²) dP(x)/dx ] + [ l(l + 1) − m²/(1 − x²) ] P(x) = 0 , with x = cos θ and l ∈ Z⁺ ∪ {0}

d²U(r)/dr² − [ l(l + 1)/r² ] U(r) = 0   (29)

Now if we require the angular equation to have no azimuthal dependence (i.e. m = 0), we get the standard Legendre differential equation for the angular part, together with an ordinary differential equation for the radial part:

d/dx [ (1 − x²) dP(x)/dx ] + l(l + 1) P(x) = 0

d²U(r)/dr² − [ l(l + 1)/r² ] U(r) = 0   (30)

which has the general power series solution (describing an electrostatic potential inside or outside a spherical region):

Φ(r, θ) = ∑_{l=0}^{∞} [ Al r^l + Bl r^{−(l+1)} ] Pl(cos θ)   (31)

where the Pl(cos θ) are the Legendre polynomials that we obtained in the example:

P0(x) = 1 , P1(x) = x = cos θ , P2(x) = (1/2)(3x² − 1) = (1/4)(3 cos 2θ + 1)   (32)


You should note that the difference in the coefficients comes from the orthonormality condition that we imposed. The orthogonality condition for the Legendre polynomials is expressed as:

∫_{−1}^{1} Pn(x) Pm(x) dx = 2/(2n + 1) if n = m, and 0 if n ≠ m   (33)

Remark 10. The Legendre polynomials are among the special class of functions called "orthogonal polynomials" or "orthogonal functions", which play an important role in describing various Physical systems. Given this importance and their relation to inner product spaces, I will give you their definition with examples.

Definition 26 (System of Orthogonal Polynomials). Let {Φn(x)} be a list of polynomials defined on the interval a < x < b such that Φn is of degree n, and let w(x) > 0 be a function defined on the same interval, called the weight function. We call the positive numbers ∥Φn∥² the squared norms, defined by:

∥Φn∥² = ∫_a^b [Φn(x)]² w(x) dx   (34)

Then {Φn(x)} is said to be orthogonal over a < x < b with respect to the weight function if:

∫_a^b Φm(x) Φn(x) w(x) dx = ∥Φn∥² if m = n, and 0 if m ≠ n , for m, n ∈ Z⁺ ∪ {0}   (35)

The normalized system of polynomials {ϕn(x)}, where

ϕn(x) = Φn(x) / ∥Φn(x)∥   (36)

is said to be orthonormal if:

∫_a^b ϕm(x) ϕn(x) w(x) dx = 1 if m = n, and 0 if m ≠ n , for m, n ∈ Z⁺ ∪ {0}   (37)

Remark 11. One useful thing that you can do with orthogonal polynomials is to approximate (expand) polynomials in terms of them. This comes from the properties of orthonormal bases in a vector space, as you may recall.


Definition 27 (Orthogonal Expansion). Let's assume that we have a system {ϕn(x)} of orthogonal polynomials over a < x < b with respect to w(x). Then any polynomial f(x) of degree n can be expanded as:

f(x) = ∑_{k=0}^{m} ak ϕk(x) , with ak = ⟨f, ϕk⟩ = ∫_a^b f(x) ϕk(x) w(x) dx   (38)

where m indicates the degree of the approximation (i.e. an expansion in terms of the first m + 1 orthogonal polynomials).
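As an illustration of eq. (38) (my own sketch, using SymPy's built-in Legendre polynomials), we can expand a polynomial in the orthonormalized Legendre basis on (−1, 1) with weight w(x) = 1 and recover it exactly:

    import sympy as sp

    x = sp.symbols('x')
    # orthonormalized Legendre polynomials on (-1, 1), w(x) = 1
    phis = [sp.sqrt(sp.Rational(2*n + 1, 2)) * sp.legendre(n, x) for n in range(4)]

    f = x**3 + x
    coeffs = [sp.integrate(f * phi, (x, -1, 1)) for phi in phis]   # a_k = <f, phi_k>
    g = sum(a * phi for a, phi in zip(coeffs, phis))
    print(sp.simplify(g - f))   # 0: the cubic is reproduced exactly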

A exercise 10. Using the definition of the norm for orthogonal polynomials, find an orthonormal basis of P(R) with the basis {1, x, x²} over 0 < x < ∞ with respect to the weight function w(x) = e⁻ˣ. These orthogonal polynomials will be the first three Laguerre polynomials. Then, using them, expand the function f(x) = e⁻²ˣ with approximation degree m = 3 by finding the coefficients ak.


5 Ode to Hilbert Space

“Dr. von Neumann, I would very much like to know, what after all is a Hilbert space?”

- David Hilbert

In this section we'll finally introduce a space that we have mentioned on many occasions in this note, i.e. the Hilbert space. First, note that here we will only deal with finite-dimensional Hilbert spaces (which suffice for learning basic Quantum Mechanics, for example when discussing spin, and are of course essential to Quantum Information and Quantum Computation as well), since the infinite-dimensional case requires many techniques from Analysis, which are studied in Functional Analysis rather than in Linear Algebra, which deals with finite-dimensional vector spaces. But we need to learn a bit about Metric Spaces first.

Definition 28 (Metric Space). A metric space (X, d) is a set X together with a real-valued function d : X × X → R≥0, called a metric or distance function, such that ∀x, y, z ∈ X the following axioms hold:

D.1. d(x, y) = 0 iff x = y (definiteness)

D.2. d(x, y) = d(y, x) (symmetry)

D.3. d(x, z) ≤ d(x, y) + d(y, z) (sub-additivity or triangle inequality)

Now, the important thing about metric spaces in relation to our vector spaces is the following proposition, which relates the metric to the norm:

Proposition 1. If ∥·∥p is a norm on a vector space V, then the lᵖ-metric defined by:

d_{lᵖ}(x, y) = ∥x − y∥p = (∑_{i=1}^{n} |xi − yi|ᵖ)^(1/p)   (39)

is a distance function on V, turning it into a metric space.

Remark 12. From proposition 1, one can see that every normed vector space is indeed a metric space, i.e. every norm induces a metric. But the reverse does not always hold. The useful sketch in fig.2 (which I have borrowed from our beloved Wikipedia) shows the relationship between different Mathematical spaces. Two familiar examples of metric spaces are the set of real numbers with the metric d(x, y) = |x − y|, and the set of continuous real-valued functions on the interval [a, b] with the metric d(f, g) = ∫_a^b |f(x) − g(x)| dx. Now, in order to proceed, we need to define a Cauchy sequence.


Figure 2: Hierarchy of Mathematical Spaces ©Wikipedia

Definition 29 (Cauchy sequence). A sequence {xn} of elements of a metric space (X, d) is called a Cauchy sequence if for every ϵ > 0 there exists a number N such that n, m ≥ N implies d(xn, xm) < ϵ.

Example 13. Consider the sequence {xj = ∑_{n=1}^{j} 1/n²} of real numbers. Let's see whether this is a Cauchy sequence or not. To do so, we need to find a convergence bound for the elements xj and xk with j, k > N. Since the sum of the terms between xk and xj is certainly smaller than the tail sum from N to infinity:

|xj − xk| ≤ ∑_{n=N}^{∞} 1/n² ≤ ∑_{n=N}^{∞} 1/(n(n − 1)) = ∑_{n=N}^{∞} [ 1/(n − 1) − 1/n ] = 1/(N − 1) ⇒ |xj − xk| ≤ 1/(N − 1)

Now, since we can make |xj − xk| arbitrarily small by taking N arbitrarily large, {xn} is a Cauchy sequence. In fact, any convergent sequence is Cauchy. Fig.3 shows a sketch of a Cauchy sequence.
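Numerically (purely illustrative), the bound 1/(N − 1) is easy to watch in action:

    # Partial sums of 1/n^2: terms beyond N stay within the 1/(N-1) bound.
    N, M = 100, 10_000
    x_N = sum(1.0 / n**2 for n in range(1, N + 1))
    x_M = sum(1.0 / n**2 for n in range(1, M + 1))
    print(abs(x_M - x_N), "<=", 1.0 / (N - 1))   # ~0.0099 <= ~0.0101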

Definition 30. A metric space in which every Cauchy sequence converges is called a complete metric space

or Cauchy space.

Now we have a powerful proposition regarding our finite-dimensional spaces which states:

Proposition 2. Every Cauchy sequence in a finite-dimensional inner product space over R or C converges. That

is every finite-dimensional real or complex inner product space is complete with respect to the norm induced by its

inner product.


Figure 3: Example of a Cauchy sequence in X © Sadri Hassani

5.1 Hilbert Space and Operator Algebra

Definition 31 (Hilbert Space). A complete inner product space (with respect to the norm induced by its inner

product) is a Hilbert Space which is denoted by H .

Remark 13. From proposition 2, we see that all finite-dimensional real or complex inner product spaces are real

or complex Hilbert spaces, like Cn and Rn.

Now, if you recall, we said some things about Dirac notation and dual spaces when we were discussing linear functionals. You went through all those abstract concepts for a reason: so that we can now adopt Dirac notation without worrying about its Mathematical exactness. Often when Physicists speak of bra-kets, they mean an inner product. But so far we have seen that inner products and bra-kets (which are in fact linear functionals) are two completely different concepts. Fear not! There is a powerful theorem, called the Riesz lemma or Riesz Representation Theorem, that connects them. I will present it here in terms of Dirac notation.

Theorem 4 (Riesz lemma). Let H be a complex vector space and let H∗ be its dual. Then for |v⟩ , |w⟩ ∈ H and

⟨v| ∈ H∗ we have:

⟨v|w⟩ = ⟨ |w⟩ , |v⟩ ⟩ (40)

Remark 14. From now on we will use Dirac notation and work in a finite-dimensional Hilbert space (unless otherwise stated), denoting vectors by kets. For all |v⟩, |w⟩ ∈ H, with

|v⟩ := (v1, v2, ..., vn)ᵀ , |w⟩ := (w1, w2, ..., wn)ᵀ and ⟨v| := (v̄1 · · · v̄n) , ⟨w| := (w̄1 · · · w̄n)


we have:

⟨v|w⟩ = (v̄1 v̄2 · · · v̄n)(w1, w2, ..., wn)ᵀ = ∑_{i=1}^{n} v̄i wi   (41)

This way we can define some useful concepts.

Definition 32 (Adjoint). Let α ∈ C, |v⟩, |w⟩ ∈ H and let T, U ∈ L(V) be linear operators. Then the adjoint or Hermitian conjugate, denoted "†", is defined such that:

a. (|v⟩)† = ⟨v|   b. (T|v⟩)† = ⟨v|T†   c. (⟨v|T|w⟩)† = ⟨w|T†|v⟩

with the properties:

1. (αT)† = ᾱ T†   2. (T + U)† = T† + U†   3. (T†)† = T   4. (TU)† = U† T†

Example 14. Suppose T : C³ → C³ is defined by

T (α1, α2, α3)ᵀ = (α1 − iα2 + α3 , iα1 − α3 , α1 − α2 + iα3)ᵀ

Find its adjoint T†.

Solution 13. Since from the definition we have (⟨v|T|w⟩)† = ⟨w|T†|v⟩, we first define

|v⟩ = (α1, α2, α3)ᵀ , |w⟩ = (β1, β2, β3)ᵀ

then we evaluate the left hand side, letting T act on |w⟩, and compare with the right hand side to read off the adjoint:

(⟨v|T|w⟩)† = [ (ᾱ1 ᾱ2 ᾱ3) (β1 − iβ2 + β3 , iβ1 − β3 , β1 − β2 + iβ3)ᵀ ]†

= (ᾱ1β1 − iᾱ1β2 + ᾱ1β3 + iᾱ2β1 − ᾱ2β3 + ᾱ3β1 − ᾱ3β2 + iᾱ3β3)†

= β̄1(α1 − iα2 + α3) + β̄2(iα1 − α3) + β̄3(α1 − α2 − iα3)


The last expression is nothing but the usual row-times-column matrix product. Hence:

⟨w|T†|v⟩ = (β̄1 β̄2 β̄3) (α1 − iα2 + α3 , iα1 − α3 , α1 − α2 − iα3)ᵀ

Therefore:

T† (α1, α2, α3)ᵀ = (α1 − iα2 + α3 , iα1 − α3 , α1 − α2 − iα3)ᵀ
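Once a basis is fixed, all of this collapses to conjugate-transposing a matrix; here is an illustrative NumPy check of Example 14:

    import numpy as np

    T = np.array([[1, -1j,  1],    # matrix of T in the standard basis
                  [1j,  0, -1],
                  [1,  -1, 1j]])
    T_dag = T.conj().T             # adjoint = conjugate transpose
    print(T_dag)
    # rows: [1, -1j, 1], [1j, 0, -1], [1, -1, -1j] -- matching T† above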

A exercise 11. Consider the operator A = x d/dx defined on the Hilbert space with the inner product given by:

⟨f|g⟩ = ∫_{−∞}^{+∞} f̄(x) g(x) dx

If both f, g vanish at ±∞, i.e. f(±∞) = g(±∞) = 0, then find A†.

Definition 33 (Expectation Value). The expectation value of an operator T in (for) the ket |v⟩, denoted ⟨T⟩v, is the number defined by:

⟨T⟩v = ⟨v|T|v⟩   (42)

A exercise 12. Using the definition of the inner product in exercise 11 and the condition within, find the expectation value of the operator p = −i d/dx for functions of the form ψ(x) = e^{iθ} f(x), where f(x) is a real-valued function and θ ∈ R.

Definition 34 (Hermitian Operator). An operator (matrix) H is Hermitian if H† = H. It is called anti-Hermitian if H† = −H. The expectation value of a Hermitian operator is real.

A exercise 13. Check whether these matrices are Hermitian or anti-Hermitian:

a) ( 0 −i ; i 0 )   b) (1/√3) ( 1 0 0 ; 0 1 0 ; 0 0 −2 )   c) ( 0 0 1 0 ; 0 0 0 −1 ; −1 0 0 0 ; 0 1 0 0 )

Definition 35 (Positive Definite Operator). A Hermitian operator H on an inner product space is called positive definite, denoted H ≥ 0, if ∀|v⟩ ≠ |0⟩ : ⟨v|H|v⟩ ≥ 0. It is called strictly positive if ⟨v|H|v⟩ > 0. A strictly positive operator is invertible, and its inverse is denoted H⁻¹. Unitary operators preserve the inner product, i.e.

⟨ U|w⟩ , U|v⟩ ⟩ = ⟨v|U†U|w⟩ = ⟨v|1|w⟩ = ⟨v|w⟩   (43)

Definition 36 (Unitary Operator). An operator U is called a unitary operator if U † = U−1. Hence the

unitary operators satisfy U † U = 1 . If U is defined on a real vector space, it is called an orthogonal operator.

A exercise 14. Let the operator U : C² → C² be given by:

U (α1, α2)ᵀ = ( iα1/√2 − iα2/√2 , α1/√2 + α2/√2 )ᵀ

Find U† and check whether it is unitary.

A exercise 15. Show that the product of two unitary operators is always unitary.

Definition 37 (Outer Product). The outer product of two kets |v⟩, |w⟩ ∈ H is defined by:

|v⟩ ⊗ |w⟩ ≡ |v⟩⟨w| = (v1, v2, ..., vn)ᵀ (w̄1 w̄2 · · · w̄n) = the n × n matrix with entries vi w̄j   (44)

which can act on any ket |u⟩ ∈ H as:

(|v⟩⟨w|)(|u⟩) = |v⟩⟨w|u⟩ = ⟨w|u⟩ |v⟩   (45)

Definition 38 (Completeness Relation). Let {|e_i⟩} be an orthonormal basis for V, so that any vector can be written in terms of its elements. Then:

\sum_{i=1}^{n} |e_i⟩⟨e_i| = 1    (46)

which is known as the completeness relation. This relation is very useful in Quantum Mechanics, and operators like P = |e_i⟩⟨e_i| satisfying P² = P are called projectors.
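Here is a small NumPy sketch of Eqs. (44)–(46) — outer products summing to the identity and a single term acting as a projector. This is an illustration of mine using the standard basis of C³:

import numpy as np

n = 3
basis = [np.eye(n)[:, i].astype(complex) for i in range(n)]

# Completeness: the sum of the outer products |e_i><e_i| gives the identity.
S = sum(np.outer(e, e.conj()) for e in basis)
assert np.allclose(S, np.eye(n))

# Each single term is a projector: P^2 = P.
P = np.outer(basis[0], basis[0].conj())
assert np.allclose(P @ P, P)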


Remark 15. One useful application of the completeness relation is to represent a linear operator in outer product notation. Let A : V → W be a linear operator, {|v_i⟩} an orthonormal basis of V and {|w_j⟩} an orthonormal basis of W. Then we can represent A by:

A = 1_W A 1_V = \sum_{ij} ⟨w_j|A|v_i⟩\, |w_j⟩⟨v_i|    (47)

such that ⟨w_j|A|v_i⟩ is the matrix element of A in the ith column and jth row with respect to the input basis {|v_i⟩} and output basis {|w_j⟩}.
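As a quick check of Eq. (47), the following sketch (my own, taking V = W = C² with the standard basis) rebuilds a random operator from its matrix elements:

import numpy as np

rng = np.random.default_rng(4)
A = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
e = [np.eye(2)[:, i].astype(complex) for i in range(2)]

# A = sum_ij <e_j|A|e_i> |e_j><e_i|, exactly as in Eq. (47).
A_rebuilt = sum(np.vdot(e[j], A @ e[i]) * np.outer(e[j], e[i].conj())
                for i in range(2) for j in range(2))
assert np.allclose(A_rebuilt, A)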

Now we turn to an important use of unitary operators: unitary transformations.

Definition 39 (Unitary Transformation). Suppose that we perform a unitary transformation on the Hilbert space by acting with a unitary operator U on all of its vectors, i.e. |v′⟩ = U|v⟩. Under this transformation we have:

i. (Basis Change): Let {|e_i⟩} be a basis such that any vector can be written as |v⟩ = \sum_{i=1}^{n} v_i |e_i⟩, and let {|e′_i⟩} be the new basis under this transformation, so that any vector is represented in it by |v′⟩ = \sum_{i=1}^{n} v′_i |e′_i⟩. The operator U that performs this basis change can be written as a sum of outer products:

U = \sum_k |e′_k⟩⟨e_k|    (48)

and its matrix elements will be:

U = \begin{pmatrix} ⟨e_1|e′_1⟩ & ⟨e_1|e′_2⟩ & \cdots & ⟨e_1|e′_n⟩ \\ ⟨e_2|e′_1⟩ & ⟨e_2|e′_2⟩ & \cdots & ⟨e_2|e′_n⟩ \\ \vdots & \vdots & \ddots & \vdots \\ ⟨e_n|e′_1⟩ & ⟨e_n|e′_2⟩ & \cdots & ⟨e_n|e′_n⟩ \end{pmatrix}    (49)

Given the components of a vector in the old basis {|e_i⟩}, we obtain its components in the new basis {|e′_i⟩} as:

\begin{pmatrix} v′_1 \\ v′_2 \\ \vdots \\ v′_n \end{pmatrix} = U† \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix}    (50)

ii. (Similarity Transformation): A linear operator X changes under this transformation into X′, which we call a similarity transformation, such that:

X′ = U†X U    (51)
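Eqs. (50) and (51) are easy to exercise numerically. In this sketch of mine, the unitary comes from a QR decomposition (its columns play the role of the new basis vectors), and the two transformation rules are checked for mutual consistency:

import numpy as np

rng = np.random.default_rng(5)
M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
U, _ = np.linalg.qr(M)              # Q of a QR decomposition is unitary

v = rng.normal(size=3) + 1j * rng.normal(size=3)
v_new = U.conj().T @ v              # Eq. (50): components in the primed basis

X = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
X_new = U.conj().T @ X @ U          # Eq. (51): similarity transformation

# Consistency: X|v> expressed in the new basis equals X' acting on v'.
assert np.allclose(U.conj().T @ (X @ v), X_new @ v_new)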


Example 15. In Quantum Mechanics, a single spin-1/2 particle (like the electron) has a two-dimensional Hilbert space with the orthonormal basis {|+⟩, |−⟩} with respect to the spin operator S_z, i.e. spin along the z-axis. In this basis, the spin basis along the x-axis can be represented by |S_x;±⟩ = \frac{1}{\sqrt{2}}(|+⟩ ± |−⟩). Now assume that {|+⟩, |−⟩} is the old basis and {|S_x;±⟩} is the new basis. Find the representation of the old basis in the new one. Use the fact that

|+⟩ = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad |−⟩ = \begin{pmatrix} 0 \\ 1 \end{pmatrix}

Solution 14. First we need to find the unitary matrix that takes us to the new basis. We have:

U = \begin{pmatrix} ⟨+|S_x;+⟩ & ⟨+|S_x;−⟩ \\ ⟨−|S_x;+⟩ & ⟨−|S_x;−⟩ \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} ⟨+|+⟩ + ⟨+|−⟩ & ⟨+|+⟩ - ⟨+|−⟩ \\ ⟨−|+⟩ + ⟨−|−⟩ & ⟨−|+⟩ - ⟨−|−⟩ \end{pmatrix}

Now since {|+⟩, |−⟩} are orthonormal, i.e. ⟨+|+⟩ = ⟨−|−⟩ = 1 and ⟨+|−⟩ = ⟨−|+⟩ = 0, we have:

U = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \quad \Rightarrow \quad U† = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}

Therefore, by Eq. (50), the old basis vectors are represented in the new basis by:

|+⟩ \;\to\; \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix} \quad \text{and} \quad |−⟩ \;\to\; \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}
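The same computation takes a few lines of NumPy — a check of mine (here U† = U since U is real and symmetric):

import numpy as np

U = np.array([[1,  1],
              [1, -1]]) / np.sqrt(2)
plus, minus = np.array([1, 0]), np.array([0, 1])

print(U.conj().T @ plus)    # -> [0.7071..., 0.7071...],  i.e. (1,1)/sqrt(2)
print(U.conj().T @ minus)   # -> [0.7071..., -0.7071...], i.e. (1,-1)/sqrt(2)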

Exercise 16. A Hilbert space has a basis of five functions defined on the surface of the unit sphere and expressed in spherical coordinates by:

χ_1 = \sqrt{\frac{15}{4π}} \sin θ \cos θ \cos φ, \quad χ_2 = \sqrt{\frac{15}{4π}} \sin θ \cos θ \sin φ, \quad χ_3 = \sqrt{\frac{15}{4π}} \sin^2 θ \sin φ \cos φ

χ_4 = \sqrt{\frac{15}{16π}} \sin^2 θ \left( \cos^2 φ - \sin^2 φ \right), \quad χ_5 = \sqrt{\frac{5}{16π}} \left( 3\cos^2 θ - 1 \right)

They are orthonormal with respect to the inner product ⟨f|g⟩ = \int_0^π \sin θ\, dθ \int_0^{2π} dφ\; \overline{f(θ, φ)}\, g(θ, φ). This Hilbert space can have another orthonormal basis:

χ′_1 = -\sqrt{\frac{15}{8π}} \sin θ \cos θ\, e^{iφ}, \quad χ′_2 = \sqrt{\frac{15}{8π}} \sin θ \cos θ\, e^{-iφ}, \quad χ′_3 = \sqrt{\frac{15}{32π}} \sin^2 θ\, e^{2iφ}

χ′_4 = \sqrt{\frac{15}{32π}} \sin^2 θ\, e^{-2iφ}, \quad χ′_5 = χ_5

Find the unitary matrix that describes the transformation from the unprimed to the primed basis.


6 Cadenza: Algebra Over a Field

“If a black cat crosses your path, it signifies that the animal is going somewhere.”

- Groucho Marx

Before ending this note I want to say a few words about 'algebras'². For those of you who might consider choosing a High Energy Physics major, this name will come up quite often right from the beginning, when you see authors stating that the Standard Model of Particle Physics is based on the gauge group SU(3)⊗SU(2)⊗U(1), whose Lie algebra is su(3) ⊕ su(2) ⊕ u(1). If you recall, we mentioned in section 2.2 that there is something called an 'algebra-like structure'. What we are going to do here is talk a little bit about it, focusing in particular on one of its subsystems called 'algebra over a field'. Remember that in vector spaces, a product of vectors yielding another vector was not defined. So what if we equip a vector space (over the field K) with some special kind of vector product satisfying some additional axioms? You guessed correctly: we get an algebra over the field K, or simply a K-algebra!

Definition 40 (Algebra Over a Field). An algebra A over field K (K-algebra) is a vector space over K equipped

with a binary operation · : A × A → A called multiplication, for which these identities hold for all elements

x,y, z ∈ A and all elements a, b ∈ K :

• Left distributivity: x · (y + z) = x · y + x · z

• Right distributivity: (x + y) · z = x · z + y · z

• Compatibility with scalars: (ax) · (by) = (ab)(x · y)

Often algebra A over a field is denoted by (A,+, ·) where + is the vector space addition.

Some remarks are in order:

(a) The binary operation '·' need not be associative per se, nor does it need an identity element.

(b) It is tempting to confuse this product with the one you saw earlier in vector spaces which acted like an inner product (or rather a bilinear form). The output of '·' is a vector, while a bilinear form yields an element of the field.

(c) dim(algebra) = dim(vector space): the dimension of an algebra is, by definition, the dimension of its underlying vector space.

Now by imposing some restrictions we’ll get some interesting types of algebras:

² This section was added much later, in the revision.


• associative algebra: if '·' satisfies x · (y · z) = (x · y) · z

• commutative algebra: if '·' satisfies x · y = y · x; if instead x · y = −y · x, it is an anticommutative algebra

• unital algebra: if ∃ e ∈ A s.t. x · e = e · x = x

– Note that we call x a left inverse of y if x · y = e, and a right inverse of y if y · x = e.

• division algebra: loosely speaking, an algebra in which division (except by the zero element) is possible; for instance, an associative unital algebra in which every nonzero element has a multiplicative inverse. There is an important class of such algebras heavily used in Physics (historically the first two, then the third, while the fourth has come into the spotlight only recently), called normed division algebras, whose vector space is a normed vector space in which the norm satisfies ∥x · y∥ = ∥x∥ ∥y∥. Interestingly, there are only four normed division algebras: the real numbers (R), the complex numbers (C), the quaternions (H) and the octonions (O). In all of them, the norm is defined by the absolute value³.
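To make the norm property tangible, here is a small Python sketch of mine, assuming the standard quaternion product with a quaternion a + bi + cj + dk stored as the tuple (a, b, c, d):

import numpy as np

def qmul(p, q):
    # Quaternion product, using i^2 = j^2 = k^2 = ijk = -1.
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return np.array([
        a1*a2 - b1*b2 - c1*c2 - d1*d2,
        a1*b2 + b1*a2 + c1*d2 - d1*c2,
        a1*c2 - b1*d2 + c1*a2 + d1*b2,
        a1*d2 + b1*c2 - c1*b2 + d1*a2,
    ])

rng = np.random.default_rng(6)
x, y = rng.normal(size=4), rng.normal(size=4)
# Normed division algebra property: ||x . y|| = ||x|| ||y||.
assert np.isclose(np.linalg.norm(qmul(x, y)),
                  np.linalg.norm(x) * np.linalg.norm(y))

Swapping x and y generally changes the product — H is noncommutative — yet the norm identity survives.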

Now let's see what 'subalgebra', 'center' and 'ideal' mean:

Definition 41 (Subalgebra). Let A be an algebra and B a linear subspace of A . If B is closed under ‘ · ’ i.e.

∀x,y ∈ B we have x · y ∈ B , then B is a subalgebra of A .

Definition 42 (Center). The subset of an algebra A whose elements commute with all the elements of A is called the center of the algebra:

Z(A) = { z ∈ A | a · z = z · a, ∀ a ∈ A }

Definition 43 (Ideal). Let A be a K-algebra and I ⊂ A. Then I is a left (right) ideal if ∀ x, y ∈ I, ∀ z ∈ A, ∀ k ∈ K the following conditions are satisfied:

1. x + y ∈ I

2. k x ∈ I

3. z · x ∈ I for a left ideal, and x · z ∈ I for a right ideal.

Remark 16. If A is commutative, then left and right ideals are equivalent and simply called an ideal or two-sided

ideal.

To give an example, first define c ∈ (a, b) to be a fixed point of the function f defined on (a, b) if f(c) = c. Now consider Cʳ(a, b), the algebra of all r-times differentiable real-valued functions defined on the interval (a, b). Then the set of all functions that vanish at a given fixed point c ∈ (a, b) forms an ideal in Cʳ(a, b).

³ R and C are associative and commutative, H is associative but noncommutative, while O is nonassociative and noncommutative.


Roughly speaking (as Sasha Patotski put it nicely⁴), "ideals are like zeros with several elements" with regard to addition and the multiplicative property of a zero element. The importance of ideals becomes significant in Lie algebras, especially in the Standard Model of Particle Physics! Let's quickly see what this Lie algebra⁵ that HEP people are so fond of actually is.

Definition 44 (Lie Algebra). L is a Lie algebra over a field K if it is equipped with a bilinear map [· , ·] : L × L → L (the Lie bracket) which satisfies:

(1) [x, x] = 0, ∀ x ∈ L (alternativity; together with bilinearity this implies the antisymmetry [x, y] = −[y, x])

(2) [x, [y, z]] + [y, [z, x]] + [z, [x, y]] = 0 (Jacobi identity)

Lie algebras are at the heart of Quantum Mechanics and Particle Physics. They also gave birth to even more interesting algebras, like the 'supersymmetry algebra' (relating bosons and fermions) or the 'Virasoro algebra', which is used in something beautiful called 'two-dimensional conformal field theory'!

OK, enough of these fancy terms. Let's continue our journey and meet the 'structure constants' (or 'structure coefficients'). They are simply the numbers that determine the product of any two basis vectors; by bilinearity, they then determine the product of any two vectors expanded in that basis. More precisely:

Definition 45 (Structure Constants). Let {e_i}_{i=1}^{N} be a basis for A. Then the structure constants c^k_{ij} are the N³ numbers appearing in the expansion of the products of basis vectors:

e_i · e_j = \sum_{k=1}^{N} c^k_{ij}\, e_k

For example, our familiar vector cross product (×) on R³ turns it into a nonassociative, noncommutative algebra whose structure constants are given by the Levi-Civita symbol ε_{ijk}. Hence you can find the cross product of any two given vectors.
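Here is a short Python sketch of mine of exactly this statement, checking e_i × e_j = Σ_k ε_{ijk} e_k against NumPy's cross product (the eps helper is my own closed form for the Levi-Civita symbol with indices in {0, 1, 2}):

import numpy as np

def eps(i, j, k):
    # Levi-Civita symbol: +1/-1 for even/odd permutations of (0,1,2), else 0.
    return (i - j) * (j - k) * (k - i) // 2

e = np.eye(3)
for i in range(3):
    for j in range(3):
        rhs = sum(eps(i, j, k) * e[k] for k in range(3))
        assert np.allclose(np.cross(e[i], e[j]), rhs)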

Exercise 17. Find the structure constants of C with basis {1, i}.

Exercise 18. The Pauli matrices can be used as a basis for the Lie algebra of something called 'su(2)' (the special unitary Lie algebra of order 2) under the Lie product

A · B ≡ [A, B] = AB − BA

What are the structure constants of this algebra?

⁴ https://math.stackexchange.com/questions/690187/can-someone-explain-ideals-to-me
⁵ Named after the 19th-century Norwegian mathematician Marius Sophus Lie.


6.1 On Homomorphism and Isomorphism

We often use the notion of 'equality' in our everyday mathematics with utmost confidence, as though it were one of the most obvious fundamental concepts. But, to be honest, that is not the case. As Jonathan Campbell said (he is of course talking in a different context than ours, but I highly recommend checking the reference article linked in the footnotes), "We came up with this notion of equality, it should have been equivalence all along."⁶

When we look at two algebraic structures, we often try to find out whether they are equivalent in some sense. (Homo/Iso)morphism is all about that kind of equivalence. Now I want to introduce it with a great example I found while surfing the net⁷. Assume that we have two sets, namely {0, 1} and {happy, sad} (I couldn't insert the emojis in math mode, so just identify them with the words), and we somehow built their multiplication tables as shown below:

  ×  |  0   1              ×    | happy  sad
  0  |  0   0            happy  | happy  happy
  1  |  0   1            sad    | happy  sad

Now do you see any similarity between those two tables? Of course! If we relabel the emojis with the map

f : {happy, sad} → {0, 1} s.t. f(happy) = 0 and f(sad) = 1

we'll see that they share the same structure with respect to the × operation, even though the two sets are not equal to each other. In the language of algebra, the map f is called an 'isomorphism', and the two sets are said to be 'isomorphic' to each other. Now, before a precise definition, let's get to know some types of maps.

Definition 46. Let X and Y be two sets. Then a map ϕ : X → Y is

• injective (or one-to-one) if ϕ(x) = ϕ(x′) implies x = x′, for x, x′ ∈ X

• surjective (or onto) if for each y ∈ Y ∃ x ∈ X s.t. ϕ(x) = y

• bijective if it is both injective and surjective.

⁶ https://www.quantamagazine.org/with-category-theory-mathematics-escapes-from-equality-20191010/
⁷ http://mathwiki.cs.ut.ee/finite_fields/04_isomorphisms


Definition 47 (Homomorphism). Suppose two algebraic structures, with underlying sets X and Y respectively, are both equipped with a '·' operation. Then a map ϕ : X → Y is a homomorphism if it preserves the '·' operation, i.e.

∀ x, x′ ∈ X : ϕ(x · x′) = ϕ(x) · ϕ(x′)

If a homomorphism ϕ is bijective, it is called an isomorphism and X is said to be isomorphic to Y, denoted by X ≅ Y.

Remark 17. Note that a K-algebra homomorphism must also preserve the '+' operation and scalar multiplication (homogeneity), in addition to the vector product.

In Quantum Mechanics you will often hear about the isomorphism between the su(2) and so(3) Lie algebras when grasping the concept of spin in terms of angular momentum, or about sl(2, C) ≅ so(3, 1) in Quantum Field Theory when talking about the Lorentz group. Now let's end our discussion with an example:

Example 16. Show that C with basis {1, i} is isomorphic to the subalgebra of 2 × 2 real matrices with basis {I, J}, where

I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}

under the map

ϕ : C → M_{2×2} \quad s.t. \quad ϕ(x + iy) = xI + yJ = \begin{pmatrix} x & y \\ -y & x \end{pmatrix}.

To show this, we need to show that ϕ is bijective, i.e. both injective and surjective. For injectivity we have:

ϕ(x + iy) = ϕ(x′ + iy′) \;\Rightarrow\; \begin{pmatrix} x & y \\ -y & x \end{pmatrix} = \begin{pmatrix} x′ & y′ \\ -y′ & x′ \end{pmatrix} \;\Rightarrow\; x = x′,\; y = y′ \;\Rightarrow\; x + iy = x′ + iy′.

For surjectivity, note that an arbitrary element of the target, \begin{pmatrix} α & β \\ -β & α \end{pmatrix}, is exactly ϕ(α + iβ), so every such matrix lies in the range of ϕ for any α, β ∈ R. Finally, one checks directly that ϕ preserves addition and multiplication, e.g. ϕ((x + iy)(u + iv)) = ϕ(x + iy) ϕ(u + iv), so ϕ is indeed an isomorphism.
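If you want a tiny numerical confirmation of that homomorphism property, here is a sketch of mine:

import numpy as np

def phi(z):
    # phi(x + iy) = x*I + y*J as a 2x2 real matrix.
    return np.array([[z.real,  z.imag],
                     [-z.imag, z.real]])

z, w = 1.0 + 2.0j, -0.5 + 3.0j
assert np.allclose(phi(z * w), phi(z) @ phi(w))   # multiplication preserved
assert np.allclose(phi(z + w), phi(z) + phi(w))   # addition preserved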


7 Coda

Alas, we have reached the end. They say every end is a beginning, though, and I hope that will be the case. I'd like to share with you one of the greatest passages of 'Steppenwolf' by Hermann Hesse. I hold this book very dear to my heart.

“You have a picture of life within you, a faith, a challenge, and you were ready for deeds and sufferings and

sacrifices, and then you became aware by degrees that the world asked no deeds and no sacrifices of you whatever,

and that life is no poem of heroism with heroic parts to play and so on, but a comfortable room where people are

quite content with eating and drinking, coffee and knitting, cards and wireless. And whoever wants more and has

got it in him−the heroic and the beautiful, and the reverence for the great poets or for the saints−is a fool and a

Don Quixote. Good. And it has been just the same for me, my friend. I was a gifted girl. I was meant to live up

to a high standard, to expect much of myself and do great things. I could have played a great part. I could have

been the wife of a king, the beloved of a revolutionary, the sister of a genius, the mother of a martyr. And life

has allowed me just this, to be a courtesan of fairly good taste, and even that has been hard enough. That is how

things have gone with me. For a while I was inconsolable and for a long time I put the blame on myself. Life,

thought I, must in the end be in the right, and if life scorned my beautiful dreams, so I argued, it was my dreams

that were stupid and wrong-headed. But that did not help me at all. And as I had good eyes and ears and was

a little inquisitive too, I took a good look at this so-called life and at my neighbors and acquaintances, fifty or so

of them and their destinies, and then I saw you. And I knew that my dreams had been right a thousand times

over, just as yours had been. It was life and reality that were wrong. It was as little right that a woman like me

should have no other choice than to grow old in poverty and in a senseless way at a typewriter in the pay of a

money-maker, or to marry such a man for his money’s sake, or to become some kind of drudge, as for a man like

you to be forced in his loneliness and despair to have recourse to a razor. Perhaps the trouble with me was more

material and moral and with you more spiritual−but it was the same road. Do you think I can’t understand your

horror of the fox trot, your dislike of bars and dancing floors, your loathing of jazz and the rest of it? I understand

it only too well, and your dislike of politics as well, your despondence over the chatter and irresponsible antics of

the parties and the press, your despair over the war, the one that has been and the one that is to be, over all that

people nowadays think, read and build, over the music they play, the celebrations they hold, the education they

carry on. You are right, Steppenwolf, right a thousand times over, and yet you must go to the wall. You are much

too exacting and hungry for this simple, easygoing and easily contented world of today. You have a dimension too

many. Whoever wants to live and enjoy his life today must not be like you and me. Whoever wants music instead

of noise, joy instead of pleasure, soul instead of gold, creative work instead of business, passion instead of foolery,

finds no home in this trivial world of ours.”
