NOTES ON LINEAR ALGEBRA
CLASS HANDOUT

ANTHONY S. MAIDA

CONTENTS

1. Introduction
2. Basis Vectors
3. Linear Transformations
3.1. Example: Rotation Transformation
4. Matrix Multiplication and Function Composition
4.1. Example: Rotation Transformation Revisited
5. Identity Matrices, Inverses, and Determinants
5.1. Example: Inverse of Rotation Transformation
6. Eigenvectors and Eigenvalues of a Matrix
6.1. Example: Finding Eigenvalues and Eigenvectors
6.2. Example: Eigenvalues of Rotation Matrix
7. Significance of Eigenvectors and Eigenvalues
7.1. Example: Raising M to a Power
7.2. Example: Stability of Discrete System of Linear Equations
7.3. Example: Dominant Mode of a Discrete Time System of Linear Equations
8. Vector Spaces
8.1. Distance metrics
8.2. Inner Product or Dot Product
8.2.1. Properties of the Inner Product
9. Vector Geometry
9.1. Perpendicular Vectors
9.2. Cosine of the Angle Between Vectors
9.2.1. Method 1
9.2.2. Method 2
10. Matlab

Date: Version February 13, 2015. Copyright © 2007-2015.


1. INTRODUCTION

This write-up explains some concepts in linear algebra using the intuitive case of $2 \times 2$ matrices. Once the reader envisions the concepts for these simple matrices, it is hoped that his or her intuition will extend to the more general case of $n \times n$ matrices, and make more advanced treatments of the topic accessible. In the following, we assume that a matrix $M$ is a $2 \times 2$ matrix with elements as shown below.

\[ M = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \tag{1} \]

A key idea will be that a matrix represents a linear transformation. We will also have need to represent two-component vectors such as $\vec{x} = (x_1, x_2)$. Since we are working in a linear algebra context, we will represent these as $2 \times 1$ matrices. These are also known as column vectors, denoted

\[ \vec{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}. \tag{2} \]

If we write $[x_1, x_2]^T$, this denotes a column vector and is an alternative to using expression (2) above. This convention saves vertical space in written documents.

2. BASIS VECTORS

We will denote a continuous, two-dimensional plane of points by $\mathbb{R}^2$, which is shorthand for $\mathbb{R} \times \mathbb{R}$. Points in the plane are denoted by vectors. When we talk about $\mathbb{R}^2$ combined with the rules of vector algebra, we are using a vector space. Vectors can be decomposed into a canonical representation which is just a (linear) combination of so-called basis vectors. Any pair of vectors can serve as a basis for $\mathbb{R}^2$ as long as they are nonzero and not collinear. When discussing the vector space $\mathbb{R}^2$, we normally use the standard basis, which consists of the vectors $[1, 0]^T$ and $[0, 1]^T$. An arbitrary point $[x_1, x_2]^T$ in two-dimensional space can be decomposed into a linear combination of these basis vectors. For instance, the point $[2, 3]^T$ can be represented as

\[ \begin{bmatrix} 2 \\ 3 \end{bmatrix} = 2 \begin{bmatrix} 1 \\ 0 \end{bmatrix} + 3 \begin{bmatrix} 0 \\ 1 \end{bmatrix}. \tag{3} \]

The right-hand side of the above formula is an example of a linear combination. As noted above, any pair of vectors that are linearly independent could be used as a basis. A pair of vectors is linearly independent if they are both nonzero and their directions are not aligned. Given a linear transformation, we may choose a basis set of vectors that is convenient for understanding the structure of the transformation. We now define a linear transformation.

3. LINEAR TRANSFORMATIONS

The first thing to learn is that a $2 \times 2$ matrix of real numbers is not just a table of numbers. It represents a linear transformation from $\mathbb{R}^2$ to $\mathbb{R}^2$. That is, it represents a mapping from points in the plane to other points in the plane. Algebraically, a linear transformation is a function, $f(\cdot)$, which has the following property.

\[ f(a\vec{v} + b\vec{w}) = a f(\vec{v}) + b f(\vec{w}) \tag{4} \]


In the above, $a$ and $b$ are scalars, and $\vec{v}$ and $\vec{w}$ are two-dimensional vectors. If a mapping has the above property, then it is a linear transformation.

A linear transformation maps the standard basis vectors into some other (possibly the same) points. Let us call these points $[a, c]^T$ and $[b, d]^T$. Specifically, we have

\[ f\!\left( \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right) = \begin{bmatrix} a \\ c \end{bmatrix} \tag{5} \]

\[ f\!\left( \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right) = \begin{bmatrix} b \\ d \end{bmatrix}. \tag{6} \]

Keeping this in mind, let us see what a linear transformation, $f$, does to an arbitrary value $[x_1, x_2]^T$.

\begin{align}
f\!\left( \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \right)
&= f\!\left( x_1 \begin{bmatrix} 1 \\ 0 \end{bmatrix} + x_2 \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right)
= x_1 f\!\left( \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right) + x_2 f\!\left( \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right) \notag \\
&= x_1 \begin{bmatrix} a \\ c \end{bmatrix} + x_2 \begin{bmatrix} b \\ d \end{bmatrix}
= \begin{bmatrix} ax_1 \\ cx_1 \end{bmatrix} + \begin{bmatrix} bx_2 \\ dx_2 \end{bmatrix}
= \begin{bmatrix} ax_1 + bx_2 \\ cx_1 + dx_2 \end{bmatrix}
= \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
= M\vec{x} \tag{7}
\end{align}

Because $f([x_1, x_2]^T)$ is shown to equal $M\vec{x}$, multiplying a matrix with a vector is the same as applying a linear transformation to the vector. The matrix $M$ represents a linear transformation and is defined by what the linear transformation does to the basis vectors.
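For concreteness, applying a linear transformation in MATLAB (the commands are introduced in Section 10) is just matrix-vector multiplication. A minimal sketch with an arbitrary matrix and vector:

M = [1 2; 3 4];    % arbitrary 2 x 2 matrix; its columns are the images of the basis vectors
x = [5; 6];        % an arbitrary column vector
M * x              % the transformed point: [1*5 + 2*6; 3*5 + 4*6] = [17; 39]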

3.1. Example: Rotation Transformation. A counterclockwise rotation of a point in the plane about the origin is an example of a linear transformation and can be represented by a $2 \times 2$ matrix. The $a$ and $c$ values of the matrix are determined by specifying how the basis vector $[1, 0]^T$ should be transformed. Similarly, the $b$ and $d$ values of the matrix are determined by specifying how the basis vector $[0, 1]^T$ should be transformed (see Figure 1). The resulting matrix is given below.

\[ M = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \tag{8} \]
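As a quick check of Equation 8, the following minimal MATLAB sketch builds the rotation matrix for $\theta = \pi/2$ and applies it to the basis vector $[1, 0]^T$, which should rotate to $[0, 1]^T$.

theta = pi/2;                          % rotate counterclockwise by 90 degrees
M = [cos(theta) -sin(theta);
     sin(theta)  cos(theta)];          % the rotation matrix of Equation 8
M * [1; 0]                             % yields [0; 1], up to rounding error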

4. MATRIX MULTIPLICATION AND FUNCTION COMPOSITION

Let $f(\cdot)$, $g(\cdot)$, and $h(\cdot)$ be arbitrary functions that map from values in $\mathbb{R}^2$ to values in $\mathbb{R}^2$. Let $\vec{x}$ denote a vector in $\mathbb{R}^2$. Let $(g \circ f)(\cdot)$ denote the function that results from applying $g(\cdot)$ to the output of $f(\cdot)$. In other words, $(g \circ f)(\vec{x})$ means the same thing as $g(f(\vec{x}))$, which is depicted in Figure 2. The former notation is the mathematician's way of creating a name for a large procedure, $(g \circ f)(\cdot)$, that is built of two subprocedures $f(\cdot)$ and $g(\cdot)$ that are 'executed' in sequence.


FIGURE 1. Illustration of the trigonometry for rotating the basis vectors: panel (A) shows $[1, 0]^T$ rotating to $[a, c]^T$ through angle $\theta$, and panel (B) shows $[0, 1]^T$ rotating to $[b, d]^T$ through angle $\theta$.

The operation of assembling functions in this fashion is called function composition, and the operator $\circ$ is the function composition operator. Function composition is associative. Specifically,

\[ (h \circ (g \circ f))(\cdot) = ((h \circ g) \circ f)(\cdot). \tag{9} \]

Although function composition is associative, it is not commutative. Commutativity would correspond to swapping the order of subprocedures within a procedure.

Now let us suppose that the above-mentioned functions $f(\cdot)$, $g(\cdot)$, and $h(\cdot)$ are linear transformations. Then they can be represented by $2 \times 2$ matrices $F$, $G$, and $H$. When we write

\[ H(G(F\vec{x})) \tag{10} \]

it means to first multiply $F$ with $\vec{x}$. This yields a point in $\mathbb{R}^2$ that is represented as a column vector. Multiplying $G$ with this result yields another column vector, and $H$ can be multiplied with that result. Thus, matrix multiplication corresponds to function composition. Since function composition is associative, it does not matter how we parenthesize the matrices, as long as we do not change the order of the matrices. In fact, we can leave the parentheses out completely, as shown below.

\[ HGF\vec{x} \tag{11} \]

In this vein, it is worth noting that matrix multiplication, like function composition, is associative but not commutative. If this were not true, matrix multiplication would be unable to represent function composition.
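These claims are easy to test numerically. A minimal MATLAB sketch with arbitrary example matrices:

F = [1 2; 3 4];  G = [0 1; 1 0];  H = [2 0; 0 3];    % arbitrary example matrices
x = [1; 1];
H*(G*(F*x))                 % parenthesized as in Expression 10
(H*G*F)*x                   % same result, by associativity
isequal(G*F, F*G)           % returns 0 (false): matrix multiplication is not commutative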

4.1. Example: Rotation Transformation Revisited. Suppose one wants to apply a rotation transformation by an amount $\theta_1$ and then, after that, apply another rotation transformation by an amount $\theta_2$. This is shown below.

\[ \begin{bmatrix} \cos\theta_2 & -\sin\theta_2 \\ \sin\theta_2 & \cos\theta_2 \end{bmatrix} \begin{bmatrix} \cos\theta_1 & -\sin\theta_1 \\ \sin\theta_1 & \cos\theta_1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \tag{12} \]


FIGURE 2. Procedural representation of the effect of composing functions $g(\cdot)$ and $f(\cdot)$ to obtain $\vec{z} = g(f(\vec{x}))$.

Of course, we can compose the matrix transformations by multiplying the matrices together. This gives the following matrix.

\[ \begin{bmatrix} (\cos\theta_1 \cos\theta_2 - \sin\theta_1 \sin\theta_2) & -(\sin\theta_1 \cos\theta_2 + \cos\theta_1 \sin\theta_2) \\ (\sin\theta_1 \cos\theta_2 + \cos\theta_1 \sin\theta_2) & (\cos\theta_1 \cos\theta_2 - \sin\theta_1 \sin\theta_2) \end{bmatrix} \tag{13} \]

If $\theta = \theta_1 + \theta_2$, then this matrix is equivalent to that in Equation 8. Since the corresponding matrix elements are equal, we have proved the two trigonometric identities below.

\begin{align}
\sin(\theta_1 + \theta_2) &= \sin\theta_1 \cos\theta_2 + \cos\theta_1 \sin\theta_2 \tag{14} \\
\cos(\theta_1 + \theta_2) &= \cos\theta_1 \cos\theta_2 - \sin\theta_1 \sin\theta_2 \tag{15}
\end{align}

Later, we will use the latter identity to obtain a formula for the cosine of the angle between two vectors.

5. IDENTITY MATRICES, INVERSES, AND DETERMINANTS

If $f(\cdot)$ is the identity function, then $f(\vec{x}) = \vec{x}$ for all $\vec{x}$. This function is a linear transformation and can be represented by the identity matrix, shown below.

\[ I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \tag{16} \]

If a function $f(\cdot)$ has an inverse, denoted $f^{-1}(\cdot)$, then $f \circ f^{-1}(\cdot) = f^{-1} \circ f(\cdot)$, which equals the identity function.

If $M$ is a matrix that has an inverse $M^{-1}$, then

\[ MM^{-1} = M^{-1}M = I. \tag{17} \]

For a $2 \times 2$ matrix $M$, define the determinant to be the quantity $ad - bc$. Specifically, $\det(M) = ad - bc$. Note that sometimes the notation $|M|$ is used to denote the determinant of matrix $M$. The formula for the inverse of a $2 \times 2$ matrix is given below.

\[ M^{-1} = \frac{1}{\det(M)} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} \tag{18} \]

Note that this formula is defined only if $\det(M) \neq 0$. In particular, a matrix has an inverse if and only if its determinant is not equal to 0. This is convenient because it allows one to easily determine whether a linear transformation has an inverse.
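A minimal MATLAB sketch comparing Equation 18 against the built-in inverse, for an arbitrary invertible matrix:

M = [1 2; 3 4];                                        % arbitrary matrix; det(M) = -2
Minv = (1/det(M)) * [M(2,2) -M(1,2); -M(2,1) M(1,1)];  % Equation 18
norm(Minv - inv(M))                                    % ~0: matches the built-in inverse
M * Minv                                               % recovers the identity (Equation 17)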


FIGURE 3. Image of the unit square generated by the matrix transformation, a parallelogram with sides along $(a, c)$ and $(b, d)$. The area is given by $ad - bc$. The area is nonzero unless the parallelogram degenerates to a line or a point.

A linear transformation maps the points falling within the unit square into a parallelogram. The unit square consists of the points in the region of $\mathbb{R}^2$ where $0 \leq x_1 \leq 1$ and $0 \leq x_2 \leq 1$. The corners of the unit square map to the corners of the parallelogram (see Figure 3). The determinant of a matrix gives the (signed) area of the transformation's image when applied to the unit square. If the transformation maps the square into a line or point (both of which are degenerate parallelograms), then the value of the determinant is zero. Otherwise, it is nonzero.

If the parallelogram is not degenerate, then the mapping that is specified by the matrix is non-singular. Specifically, unique points on the unit square map to unique points on the parallelogram, and the reverse is also true. If the image is a line or a point, then many points on the square map to single points on the degenerate parallelogram. In this case, the function does not (for obvious reasons) have an inverse.

5.1. Example: Inverse of Rotation Transformation. The matrix in Formula 8 gives a counterclockwise rotation by an angle $\theta$. The inverse transformation would instead give a clockwise rotation. Using the formula for the inverse, we can obtain the clockwise rotation matrix shown below.

\[ M^{-1} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \tag{19} \]

When deriving this, remember that $\sin^2\theta + \cos^2\theta = 1$. Also note that, for the case of rotation, the inverse of a rotation matrix is its transpose. That is, $M^{-1} = M^T$. When this happens we have an orthonormal transformation. This corresponds to a (possibly flipped) rigid rotation. By rigid rotation, we mean that vector lengths and angles between vectors do not change when the transformation is applied.
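A minimal MATLAB check that the transpose undoes a rotation, for an arbitrary angle:

theta = 0.7;                                           % arbitrary angle
M = [cos(theta) -sin(theta); sin(theta) cos(theta)];
norm(M' * M - eye(2))                                  % ~0: M^T M = I, so inv(M) = M'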


6. EIGENVECTORS AND EIGENVALUES OF A MATRIX

There is an effective way to perform a structural analysis of a linear transformation. This involves the use of eigenvectors and eigenvalues of the matrix representing the transformation.

Consider the situation of multiplying a matrix with a nonzero vector, as in $M\vec{x}$. If the vector $\vec{x}$ is chosen correctly, this operation has the effect of shrinking or stretching the vector, but it does not change the direction of the vector other than perhaps reversing it. This can be written as

\[ M\vec{x} = \lambda\vec{x}. \tag{20} \]

In the above, $\lambda$ is a scalar. Given a matrix $M$, if a nonzero vector $\vec{x}$ has this property, then $\vec{x}$ is said to be an eigenvector of $M$, and $\lambda$ is its associated eigenvalue. We shall now solve for the eigenvectors and eigenvalues of a $2 \times 2$ matrix $M$. Consider the following steps.

\begin{align}
M\vec{x} &= \lambda I \vec{x} \tag{21} \\
M\vec{x} - \lambda I \vec{x} &= 0 \tag{22} \\
(M - \lambda I)\vec{x} &= 0 \tag{23}
\end{align}

In the above, note that $(M - \lambda I)$ denotes a matrix. Since we have assumed that $\vec{x}$ is nonzero, the only way that this matrix can map $\vec{x}$ into zero is if it maps the unit square into a degenerate parallelogram. Thus the determinant of this matrix is zero; this condition, $\det(M - \lambda I) = 0$, is called the characteristic equation for matrix $M$. This gives us some leverage to find the value of $\lambda$. Matrix $(M - \lambda I)$ expands as shown below.

\begin{align}
M - \lambda I &= \begin{bmatrix} a & b \\ c & d \end{bmatrix} - \begin{bmatrix} \lambda & 0 \\ 0 & \lambda \end{bmatrix} \tag{24} \\
&= \begin{bmatrix} a - \lambda & b \\ c & d - \lambda \end{bmatrix} \tag{25}
\end{align}

However, we are actually interested in the determinant of this matrix, rather than the matrix itself. The determinant expands to

\begin{align}
|M - \lambda I| &= (a - \lambda)(d - \lambda) - bc \tag{26} \\
&= \lambda^2 - (a + d)\lambda + ad - bc = 0. \tag{27}
\end{align}

The above is a quadratic equation where $\lambda$ is the unknown, and it can be solved using the quadratic formula. Its left-hand side is called the characteristic polynomial for matrix $M$. When this is solved, we have obtained up to two eigenvalues for the matrix $M$. Once we know the eigenvalues, we can use Formula 20 to solve for the eigenvectors that go with the eigenvalues. For a $2 \times 2$ matrix, we will have two eigenvalues and one eigenvector to go with each eigenvalue. Notice that, in Equation 27, the quantity $a + d$ is the sum of the diagonal elements of matrix $M$. This is called the trace of $M$. Also note that $ad - bc$ is the determinant of $M$.


Let us denote the trace of $M$ by $\tau$ and the determinant of $M$ by $\Delta$. Then, using the quadratic formula, we can write concise formulas for the eigenvalues.

\begin{align}
\lambda_1 &= \frac{\tau + \sqrt{\tau^2 - 4\Delta}}{2} \tag{28} \\
\lambda_2 &= \frac{\tau - \sqrt{\tau^2 - 4\Delta}}{2} \tag{29}
\end{align}
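A minimal MATLAB sketch of Equations 28 and 29, using an arbitrary example matrix:

M = [2 1; 1 2];                               % arbitrary example matrix
tau = trace(M);  Delta = det(M);
lambda1 = (tau + sqrt(tau^2 - 4*Delta))/2     % 3
lambda2 = (tau - sqrt(tau^2 - 4*Delta))/2     % 1
eig(M)                                        % agrees with the two formulas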

6.1. Example: Finding Eigenvalues and Eigenvectors. Compute the eigenvalues and eigenvectors of the matrix

\[ M = \begin{bmatrix} 4 & 0 \\ 0 & 1 \end{bmatrix}. \tag{30} \]

Solution.

\[ \left| \begin{bmatrix} 4 - \lambda & 0 \\ 0 & 1 - \lambda \end{bmatrix} \right| = (4 - \lambda)(1 - \lambda) = \lambda^2 - 5\lambda + 4 = 0 \tag{31} \]

The quadratic equation on the right has two solutions: $\lambda_1 = 4$ and $\lambda_2 = 1$. These are the two eigenvalues, listed in order of numerical magnitude. The corresponding eigenvectors can be obtained by substituting the value of $\lambda$ back into Equation 23. Specifically, using $\lambda_1$ gives the first eigenvector.

\begin{align}
(M - \lambda_1 I)\vec{x} &= \begin{bmatrix} 4 - \lambda_1 & 0 \\ 0 & 1 - \lambda_1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \tag{32} \\
&= \begin{bmatrix} 0 & 0 \\ 0 & -3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \tag{33} \\
&= \begin{bmatrix} 0 \\ -3x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \tag{34}
\end{align}

The above equations imply that $x_2 = 0$ but place no constraint on the value of $x_1$. For convenience, we set $x_1 = 1$ so that the length of the vector is one. We shall denote the eigenvector that corresponds to $\lambda_1$ by $\xi_1$. Thus, $\xi_1 = [1, 0]^T$. A similar analysis shows that $\xi_2 = [0, 1]^T$.
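The same answer can be checked in MATLAB:

M = [4 0; 0 1];
[V, D] = eig(M)    % the columns of V are the eigenvectors (here the standard basis);
                   % diag(D) holds the corresponding eigenvalues 1 and 4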

6.2. Example: Eigenvalues of Rotation Matrix. If one computes the eigenvalues of the rotation matrix, one finds that they are complex numbers (assuming $\theta$ is not a multiple of $\pi$). This makes sense because the transformation rotates all vectors in $\mathbb{R}^2$, so no transformed vector points along its original direction, and hence no real eigenvector can exist.

7. SIGNIFICANCE OF EIGENVECTORS AND EIGENVALUES

For this section, we will assume that the eigenvalues of the matrix under discussion are distinct and that we are working with an $n \times n$ matrix. Given a matrix $M$, there is an alternative way to represent it using its eigenvectors and eigenvalues. This yields a canonical representation that makes the structure of the underlying linear transformation explicit.

Define the matrix $\Lambda$ as shown below.

\[ \Lambda = \begin{bmatrix} \lambda_1 & 0 & \dots & 0 \\ 0 & \lambda_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \lambda_n \end{bmatrix} \tag{35} \]


This is a diagonal matrix of eigenvalues of $M$, where the $\lambda_i$ are the eigenvalues, ordered according to their magnitude. We will see that this matrix represents the same linear transformation as $M$ but using a more convenient basis set. To go further, we have to change the representation of the vectors that we have been using so that they use the basis set assumed by the matrix $\Lambda$.

Define the matrix $V$ as shown below.

\[ V = \begin{bmatrix} \xi_1 & \xi_2 & \dots & \xi_n \end{bmatrix} \tag{36} \]

$V$ is an $n \times n$ matrix whose first column is the first eigenvector of $M$, and so forth. The eigenvectors are ordered according to the corresponding eigenvalues of $\Lambda$ (which are in turn ordered according to their magnitudes).

The original matrix $M$ can be factored into the product of $V$, $\Lambda$, and $V^{-1}$, as shown below. (Why?)

\[ M = V \Lambda V^{-1} \tag{37} \]

In other words, the factorized transformation can be applied to $\vec{x}$, as shown below.

\[ M\vec{x} = V \Lambda V^{-1} \vec{x} \tag{38} \]

How do we interpret this? First notice that $V$ and $V^{-1}$ are inverses. $V^{-1}$ represents a transformation that converts $\vec{x}$ to a representation that $\Lambda$ understands. $\Lambda$ applies the transformation of interest. Finally, $V$ converts the result of the transformation back to the representation that we started with.

Why is the diagonal matrix representation $\Lambda$ desirable? Let us look in more detail at the eigenvector matrix $V$. The columns of this matrix (eigenvectors) serve as an alternate basis set for points in the underlying space. Specifically, an arbitrary point $\vec{x}$ in the space can be represented as a linear combination of the eigenvectors $\vec{e}_1, \dots, \vec{e}_n$, as shown below.

\[ \vec{x} = \alpha_1 \vec{e}_1 + \alpha_2 \vec{e}_2 + \dots + \alpha_n \vec{e}_n \tag{39} \]

Put another way, the vector $\vec{x}$, which is represented using the standard basis, is represented as $\vec{\alpha}$ when using the alternate basis consisting of the eigenvectors of $M$. This allows us to create a very simple representation of the transformation $M$ using this new basis. The derivation below shows this.

\begin{align}
M\vec{x} &= V \Lambda V^{-1} (\alpha_1 \vec{e}_1 + \alpha_2 \vec{e}_2 + \dots + \alpha_n \vec{e}_n) \tag{40} \\
&= V \Lambda V^{-1} V [\alpha_1 \dots \alpha_n]^T \tag{41} \\
&= V \Lambda [\alpha_1 \dots \alpha_n]^T \tag{42} \\
&= \lambda_1 \alpha_1 \vec{e}_1 + \lambda_2 \alpha_2 \vec{e}_2 + \dots + \lambda_n \alpha_n \vec{e}_n \tag{43}
\end{align}
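The factorization of Equation 37 is easy to verify numerically. A minimal MATLAB sketch with an arbitrary matrix that has distinct eigenvalues:

M = [4 1; 2 3];          % arbitrary matrix with distinct eigenvalues 5 and 2
[V, D] = eig(M);         % D plays the role of Lambda
norm(M - V*D*inv(V))     % ~0: confirms M = V Lambda V^{-1}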

7.1. Example: Raising M to a Power. In the next section, we will need to consider the quantity $M^k$, where $k$ is a positive integer. It is very useful to represent this in terms of the matrix factorization. Specifically,

\[ M^k = \left( V \Lambda V^{-1} \right)^k = V \Lambda^k V^{-1} \tag{44} \]

because the interior $V^{-1}V$ pairs cancel when the factored form is multiplied out $k$ times.


Further, note that $\Lambda$ is a diagonal matrix. Raising a diagonal matrix to a power involves raising the elements on the diagonal to that power. Thus,

\[ \Lambda^k = \begin{bmatrix} \lambda_1^k & 0 & \dots & 0 \\ 0 & \lambda_2^k & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \lambda_n^k \end{bmatrix} \tag{45} \]

This representation of $M^k$ will be used in the next example.
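A minimal MATLAB check of Equation 44, reusing the arbitrary matrix from the previous sketch:

M = [4 1; 2 3];  k = 5;
[V, D] = eig(M);
norm(M^k - V*D^k*inv(V))    % ~0 (up to rounding): M^k = V Lambda^k V^{-1}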

7.2. Example: Stability of Discrete System of Linear Equations. This example shows how to use eigenvalues to study the stability of a discrete system of linear equations. The discrete system may be a set of coupled equations. The same system can be alternately represented as a set of uncoupled equations. This makes the system much easier to analyze.

Let us represent a discrete system of linear equations with constant coefficients as shown below.

\[ \vec{x}(k+1) = M\vec{x}(k) + \vec{b} \tag{46} \]

The vector $\vec{x}$ holds the values of the state variables, which take on values at discrete time steps $k = 0, 1, 2, \dots$ Matrix $M$ is a matrix of constant coefficients. Finally, $\vec{b}$ is a vector of inputs that are constant over the life of the system.

Let us pose the question: Is this system stable? If the system is stable, there is a vector $\vec{x}^*$ such that when the state vector $\vec{x}(k)$ is sufficiently close to $\vec{x}^*$, then $\vec{x}(k')$ tends to evolve toward $\vec{x}^*$ for all $k' \geq k$. That is, the difference between the current state and the stable state, represented as $\vec{x}(k') - \vec{x}^*$, will approach zero as $k'$ approaches infinity. Furthermore, the relation below also holds.

\[ \vec{x}^* = M\vec{x}^* + \vec{b} \tag{47} \]

The above equation holds because the system is stable at the point $\vec{x}^*$. If the system state ever reaches $\vec{x}^*$, it stays at $\vec{x}^*$ for all subsequent $k$.

With this in mind, let us expand the expression representing the difference between the current state and the stable state, as shown below.

\begin{align}
\vec{x}(k+1) - \vec{x}^* &= M\vec{x}(k) + \vec{b} - M\vec{x}^* - \vec{b} \tag{48} \\
&= M(\vec{x}(k) - \vec{x}^*) \tag{49}
\end{align}

We can perform a change of variable to simplify the above expression. We shall let $\vec{z}(k) = \vec{x}(k) - \vec{x}^*$. This allows us to rewrite the above equation as

\[ \vec{z}(k+1) = M\vec{z}(k). \tag{50} \]

Since $\vec{z}(k)$ now represents the difference between the current state vector and the stable state, $\vec{z}(k)$ approaches zero as $k$ approaches infinity for a system with stable state $\vec{x}^*$. Note also that the initial value $\vec{z}(0)$ need not be zero. These two facts imply that the matrix $M^k$ approaches the zero matrix as $k$ approaches infinity. This is because

\[ \vec{z}(k) = M^k \vec{z}(0). \tag{51} \]


From the previous example, we know that $M^k$ can be factored as

\[ M^k = V \begin{bmatrix} \lambda_1^k & 0 & \dots & 0 \\ 0 & \lambda_2^k & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \lambda_n^k \end{bmatrix} V^{-1}. \tag{52} \]

Thus $M^k$ approaches zero as $k$ approaches infinity if and only if $\Lambda^k$ approaches zero as $k$ approaches infinity. $\Lambda^k$ approaches zero if and only if $|\lambda_i| < 1$ for all $i \in \{1, \dots, n\}$.
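A minimal MATLAB simulation of Equation 46 illustrates this. The matrix below is an arbitrary example whose eigenvalues (0.6 and 0.3) lie inside the unit circle, so the state converges to the fixed point $\vec{x}^*$:

M = [0.5 0.1; 0.2 0.4];  b = [1; 1];    % arbitrary system with |eig(M)| < 1
xstar = (eye(2) - M) \ b;               % fixed point: solves x* = M x* + b (Equation 47)
x = [10; -10];                          % arbitrary initial state
for k = 1:50
    x = M*x + b;
end
norm(x - xstar)                         % ~0: the state has converged to x*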

7.3. Example: Dominant Mode of a Discrete Time System of Linear Equations. Consider the discrete time system

\[ \vec{x}(k+1) = A\vec{x}(k). \tag{53} \]

Suppose that matrix $A$ has $n$ distinct eigenvalues. The eigenvalue $\lambda_i$ with the largest magnitude $|\lambda_i|$ is the dominant eigenvalue. As $k$ approaches infinity, the state vector $\vec{x}(k+1)$ evolves to align with the eigenvector corresponding to the dominant eigenvalue.

The initial state vector can be represented as

\[ \vec{x}(0) = \alpha_1 \vec{e}_1 + \dots + \alpha_n \vec{e}_n. \tag{54} \]

Thus, the solution to the system for any time step $k \geq 1$ is

\[ \vec{x}(k) = \alpha_1 \lambda_1^k \vec{e}_1 + \dots + \alpha_n \lambda_n^k \vec{e}_n. \tag{55} \]

Without loss of generality, assume that $\lambda_1$ is the dominant eigenvalue. Then $|\lambda_1|^k$ grows faster than $|\lambda_i|^k$ for any other eigenvalue. Therefore, the following holds

\[ |\alpha_1 \lambda_1^k| \gg |\alpha_i \lambda_i^k| \tag{56} \]

for sufficiently large $k$, as long as $\alpha_1 \neq 0$. Therefore, for sufficiently large $k$, the state vector $\vec{x}(k)$ is essentially aligned with $\vec{e}_1$.
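This alignment is the idea behind the power method. A minimal MATLAB sketch with an arbitrary matrix:

A = [4 1; 2 3];             % dominant eigenvalue 5, with eigenvector [1; 1]
x = [1; 0];                 % arbitrary start with alpha_1 ~= 0
for k = 1:30
    x = A*x;
    x = x / norm(x);        % normalize so the iterate stays finite
end
x                           % ~[0.7071; 0.7071]: aligned with [1; 1]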

8. VECTOR SPACES

In Section 2, we referred to a vector space but did not define it. We need to delve into this so that we can say more about vector geometry.

A vector is a quantity consisting of a direction and a magnitude, often drawn as an arrow with a head and a tail. If we assume that vectors always have their tails at the origin in Euclidean space, then we can specify a vector by listing the coordinates of its head. In three-dimensional space, the vector $\vec{x}$ can be specified using the coordinates $(x_1, x_2, x_3)$. Such an expression is called a tuple, and $x_1$, $x_2$, and $x_3$ are called its components. For our purposes, we shall assume that the components are always real numbers and that the vector space is defined over the real numbers. That is, whenever a scalar is encountered, it is a real number. Intuitively, by vector space, we mean a set of vectors which is closed under a set of operations (addition and multiplication by a scalar). For instance, if you add any two vectors, the result is another vector. Vectors in Euclidean space over the field of real numbers have the following properties.


1. The sum of two vectors is a vector (closure). You add two vectors by adding their corresponding components. Both vectors must have the same number of components. Vector addition is commutative and associative. The following example shows how to add two vectors and also shows that vector addition is commutative.

\begin{align*}
\vec{x} + \vec{y} &= (x_1, x_2, \dots, x_n) + (y_1, y_2, \dots, y_n) \\
&= (x_1 + y_1, x_2 + y_2, \dots, x_n + y_n) \\
&= (y_1 + x_1, y_2 + x_2, \dots, y_n + x_n) = \vec{y} + \vec{x}
\end{align*}

2. The zero vector has length zero and has zero for each of its components. Adding the zero vector to a vector doesn't change the vector. The zero vector is the only vector that has this property.

3. To multiply a vector by a scalar, multiply each component of the vector by the scalar. This gives you another vector. If you multiply by the scalar +1, you don't change the vector. The scalar +1 is the only scalar that has this property. If you multiply each component of a vector $\vec{x}$ by the scalar -1, you get $-\vec{x}$. The sum of $\vec{x}$ and $-\vec{x}$ equals the zero vector. For a given vector $\vec{x}$, $-\vec{x}$ is the only vector which has this property.

4. Vectors have the following algebraic properties. Let $a$ and $b$ be scalars.
   4.1. $a(b\vec{x}) = (ab)\vec{x}$
   4.2. $(a + b)\vec{x} = a\vec{x} + b\vec{x}$
   4.3. $a(\vec{x} + \vec{y}) = a\vec{x} + a\vec{y}$

We shall prove property 4.1 because it gives us an opportunity to provide an example of scalar multiplication.

\begin{align}
a(b\vec{x}) &= a(bx_1, bx_2, \dots, bx_n) \notag \\
&= (abx_1, abx_2, \dots, abx_n) = (ab)\vec{x} \tag{57}
\end{align}

8.1. Distance metrics. A distance metric is a convention for defining the distance between two points in an $n$-dimensional space. A point in $n$-dimensional space is specified as a vector with $n$ components. A legal distance metric must satisfy the following properties for any points $a$, $b$, and $c$.

1. distance$(a, a) = 0$.
2. distance$(a, b) > 0$ if $a \neq b$.
3. distance$(a, b) = $ distance$(b, a)$.
4. distance$(a, c) \leq $ distance$(a, b) + $ distance$(b, c)$.

Both the Euclidean distance metric, based on the Pythagorean theorem, and the city block distance metric satisfy these properties. The third property is known as symmetry and the fourth property is known as the triangle inequality. In Euclidean space, the triangle inequality is a corollary of the fact that the shortest distance between two points is a straight line.

8.2. Inner Product or Dot Product. The inner product or dot product of two $n$-dimensional vectors is computed by multiplying the corresponding components together and then summing the results. The inner product of vectors $\vec{x}$ and $\vec{y}$, written $\vec{x} \cdot \vec{y}$, is defined below in expression (58).


FIGURE 4. If vectors $\vec{x}$ and $\vec{y}$ are perpendicular, then $\|\vec{y} - \vec{x}\| = \|\vec{y} + \vec{x}\|$.

Expression (58) also shows that the inner product is commutative.

\[ \vec{x} \cdot \vec{y} \equiv \sum_{i=1}^{n} x_i y_i = \sum_{i=1}^{n} y_i x_i = \vec{y} \cdot \vec{x} \tag{58} \]

Using matrix notation, if vectors are written as column vectors, then the inner product between two vectors $\vec{x}$ and $\vec{y}$ is written $\vec{x}^T \vec{y}$.

The inner product is a measure of the degree of overlap between the vectors. If the vectors both point in the same direction, the inner product is positive and as large as possible for the given vector lengths. If they point in opposite directions, the inner product is negative and as small as possible. If the inner product is 0, the vectors are said to be orthogonal (perpendicular). It is useful to look at the inner product of a vector with itself, as shown below.

\[ \vec{x} \cdot \vec{x} = \sum_{i=1}^{n} x_i^2 \tag{59} \]

Since a vector points in the same direction as itself, this quantity will be positive (unless the vector is the zero vector). The square root of this quantity is known as the Euclidean norm of the vector $\vec{x}$, written $\|\vec{x}\|$. This is the length of the vector as determined by the Pythagorean theorem (Euclidean distance metric) generalized to $n$ dimensions. If $\|\vec{x}\| = 1$, then we say $\vec{x}$ is a unit vector.
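In MATLAB, with hypothetical example vectors:

x = [1; 2; 2];  y = [2; 0; 1];
dot(x, y)          % inner product: 1*2 + 2*0 + 2*1 = 4
x' * y             % the same value, written in matrix notation
norm(x)            % Euclidean norm: sqrt(1 + 4 + 4) = 3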

8.2.1. Properties of the Inner Product. Some basic properties of the inner product are the following.

commutativity: If $\vec{x}$ and $\vec{y}$ are vectors, then
\[ \vec{x} \cdot \vec{y} = \vec{y} \cdot \vec{x}. \]

distributivity: If $\vec{x}$, $\vec{y}$, and $\vec{z}$ are vectors, then
\[ \vec{x} \cdot (\vec{y} + \vec{z}) = \vec{x} \cdot \vec{y} + \vec{x} \cdot \vec{z}. \]

multiplication by scalar: If $c$ is a scalar and $\vec{x}$ and $\vec{y}$ are vectors, then
\[ (c\vec{x}) \cdot \vec{y} = c(\vec{x} \cdot \vec{y}). \]

magnitude: $\vec{x} \cdot \vec{x} = 0$ if and only if $\vec{x}$ is the zero vector. Otherwise, $\vec{x} \cdot \vec{x} > 0$.

9. VECTOR GEOMETRY

This section develops intuitions about geometric interpretations of linear algebra in two dimensions.


FIGURE 5. Angle $\theta = \beta - \alpha$ is the angle between vectors $\vec{x} = (x_1, x_2)^T$ (at angle $\alpha$) and $\vec{y} = (y_1, y_2)^T$ (at angle $\beta$).

9.1. Perpendicular Vectors. If two vectors are perpendicular, then their dot product is zero. To see this, take a look at Figure 4. If vectors $\vec{x}$ and $\vec{y}$ are perpendicular, then $\|\vec{y} - \vec{x}\| = \|\vec{y} + \vec{x}\|$. From this we obtain

\begin{align}
(\vec{x} + \vec{y}) \cdot (\vec{x} + \vec{y}) &= (\vec{x} - \vec{y}) \cdot (\vec{x} - \vec{y}) \tag{60} \\
\vec{x} \cdot \vec{x} + 2\vec{x} \cdot \vec{y} + \vec{y} \cdot \vec{y} &= \vec{x} \cdot \vec{x} - 2\vec{x} \cdot \vec{y} + \vec{y} \cdot \vec{y} \tag{61} \\
\vec{x} \cdot \vec{y} &= -\vec{x} \cdot \vec{y} \tag{62}
\end{align}

The last line above can only be true if $\vec{x} \cdot \vec{y} = 0$.

9.2. Cosine of the Angle Between Vectors. The cosine of the angle between two vectors $\vec{x}$ and $\vec{y}$ is defined below.

\[ \cos\theta \equiv \frac{\vec{x} \cdot \vec{y}}{\|\vec{x}\| \, \|\vec{y}\|} \tag{63} \]

This definition holds for $n$ dimensions. To strengthen our intuitions, let us see why this definition corresponds to the cosine of an angle when the number of dimensions is two.

9.2.1. Method 1. Consider vectors $\vec{x}$ and $\vec{y}$ on the plane shown in Figure 5. They are separated by the angle $\theta = \beta - \alpha$. From Equation 15, we obtain Formula 65.

\begin{align}
\cos\theta &= \cos(\beta - \alpha) \tag{64} \\
&= \cos\beta \cos\alpha + \sin\beta \sin\alpha \tag{65} \\
&= \frac{x_1}{\|\vec{x}\|} \frac{y_1}{\|\vec{y}\|} + \frac{x_2}{\|\vec{x}\|} \frac{y_2}{\|\vec{y}\|} \tag{66} \\
&= \frac{\vec{x}^T \vec{y}}{\|\vec{x}\| \, \|\vec{y}\|} \tag{67}
\end{align}

From applying trigonometry to Figure 5, we obtain Formula 66. We obtain Formula 67 by simplifying.
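A minimal MATLAB sketch computing the angle between two hypothetical vectors via Equation 63:

x = [1; 0];  y = [1; 1];                      % hypothetical vectors 45 degrees apart
costheta = (x' * y) / (norm(x) * norm(y));    % Equation 63
theta = acos(costheta)                        % pi/4 radians, i.e., 45 degrees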

9.2.2. Method 2. Consider the angle $\theta$ between vectors $\vec{x}$ and $\vec{y}$ in Figure 6. Consider the projection of $\vec{x}$ onto $\vec{y}$ at the point $c\vec{y}$, so that the vector $\vec{x} - c\vec{y}$ is perpendicular to $\vec{y}$. From the figure, it follows that

\[ \cos\theta = \frac{c\|\vec{y}\|}{\|\vec{x}\|}. \tag{68} \]


FIGURE 6. The dot product between $\vec{x} - c\vec{y}$ and $c\vec{y}$ is zero.

It remains to obtain the value of $c$. Since the vectors $\vec{x} - c\vec{y}$ and $c\vec{y}$ are perpendicular, it follows that $(\vec{x} - c\vec{y}) \cdot c\vec{y} = 0$. Solving for $c$, we obtain

\[ c = \frac{\vec{x} \cdot \vec{y}}{\|\vec{y}\| \, \|\vec{y}\|}. \tag{69} \]

If we plug the value of $c$ back into Equation 68, we obtain

\begin{align}
\cos\theta &= \frac{c\|\vec{y}\|}{\|\vec{x}\|} \tag{70} \\
&= \frac{\vec{x}^T \vec{y}}{\|\vec{x}\| \, \|\vec{y}\|}. \tag{71}
\end{align}

10. MATLAB

When manipulating matrices whose dimensions are larger than $2 \times 2$, use MATLAB. Here are some commands. Given a square matrix M, the expression inv(M) computes its inverse, the expression det(M) computes its determinant, the expression trace(M) computes the trace, and diag(M) extracts the diagonal and represents it as a column vector.

The expression below obtains the eigenvectors and eigenvalues.

[V, D] = eig(M)

The above is a component-wise assignment statement, as indicated by the brackets on the left-hand side. Both of the variables V and D are assigned new values because the function eig() returns two values. The variable V holds the eigenvectors of M. Each column of the matrix V stores an eigenvector, normalized to length 1. The variable D is a diagonal matrix holding the corresponding eigenvalues, which fall on the diagonal of the matrix. The first eigenvalue corresponds to the first eigenvector, and so forth for the second, third, etcetera. Note that eig() does not guarantee that the eigenvalues are sorted by magnitude, so sort them yourself if the ordering matters.