Review of Basic Linear Algebra


Upload: mnbvqwerty

Post on 14-Dec-2015


ESI 6314 Deterministic Methods in Operations Research

Review of Basic Linear Algebra

Solving linear programs will require repeatedly solving systems of linear equations. We consider a system with $m$ linear equations and $n$ variables, or unknowns. We let $x_j$ denote the $j$th variable, with $a_{ij}$ denoting the coefficient of $x_j$ in the $i$th equation and $b_i$ denoting the right-hand side value of the $i$th equation, i.e., we consider the following system:

$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned}
$$

It will be convenient to use vector and matrix notation, where

$$
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix},
\quad
x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix},
\quad
b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}.
$$

We say that $A$ is an $m \times n$ matrix, that $x$ is an $n$-dimensional vector, and that $b$ is an $m$-dimensional vector. An $n$-dimensional vector is a point in $n$-space and corresponds to a matrix with either one row or one column. We have written $x$ and $b$ above as column vectors, i.e., matrices containing a single column. We could also write $x$, for example, as a row vector. When we define $x$ as a column vector and then write it as a row vector, we call this row vector the transpose of $x$, i.e., $x^T = [x_1 \; x_2 \; \cdots \; x_n]$. If we had originally defined $x$ as a row vector, then its transpose would correspond to a column vector. The transpose of the $m \times n$ matrix $A$ above is an $n \times m$ matrix $A^T$ written as

$$
A^T = \begin{bmatrix}
a_{11} & a_{21} & \cdots & a_{m1} \\
a_{12} & a_{22} & \cdots & a_{m2} \\
\vdots & \vdots & & \vdots \\
a_{1n} & a_{2n} & \cdots & a_{mn}
\end{bmatrix}.
$$

Note that $(A^T)^T = A$.
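As a small illustration of the transpose, the following Python sketch stores a matrix as a list of rows and swaps rows with columns. The helper name `transpose` and the numerical values are assumptions made for this example; they do not come from the notes.

```python
def transpose(matrix):
    """Return the transpose of a matrix stored as a list of rows."""
    # zip(*matrix) groups the first entries of every row, then the second, ...
    return [list(column) for column in zip(*matrix)]


A = [[1, 2, 3],
     [4, 5, 6]]      # an illustrative 2 x 3 matrix
A_T = transpose(A)   # its 3 x 2 transpose
```

Applying `transpose` twice returns the original matrix, which mirrors the identity $(A^T)^T = A$.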

Vector and matrix addition. We can add matrices or vectors of the same dimensions. For example, consider the $n$-dimensional vectors $u$ and $v$:

$$
u = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix},
\quad
v = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}.
$$

Then the addition of $u$ and $v$ simply involves the component-wise addition of the vector elements:

$$
u + v = \begin{bmatrix} u_1 + v_1 \\ u_2 + v_2 \\ \vdots \\ u_n + v_n \end{bmatrix}.
$$

In two dimensions, we can visualize this addition as shown in Figure 1. Similarly, we can add two $m \times n$ matrices $A^1$ and $A^2$, with the result being an $m \times n$ matrix whose $(i, j)$th element equals $a^1_{ij} + a^2_{ij}$.

Figure 1: Addition of two vectors u and v.

Scalar multiple of a vector or matrix. If $k$ is a scalar (a number), then the product of $k$ and the $n$-vector $u$ is given by

$$
ku = \begin{bmatrix} ku_1 \\ ku_2 \\ \vdots \\ ku_n \end{bmatrix}.
$$

Figure 2 shows a positive scalar multiple of the vector $u$; the result is a vector in the same direction as $u$, but scaled by the multiple 2 in this case. The figure also shows a negative scalar multiple of the vector $v$; the result is a vector in the opposite direction of $v$, scaled by the multiple 1/2 in this case. Given a vector, it should be clear that a scalar multiple of that vector can represent any point on the infinite line that overlaps the vector arrow.

Figure 2: Scalar multiples of the vectors u and v.

Similarly, the product of $k$ and the matrix $A$ is written as

$$
kA = \begin{bmatrix}
ka_{11} & ka_{12} & \cdots & ka_{1n} \\
ka_{21} & ka_{22} & \cdots & ka_{2n} \\
\vdots & \vdots & & \vdots \\
ka_{m1} & ka_{m2} & \cdots & ka_{mn}
\end{bmatrix}.
$$
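Component-wise addition and scalar multiplication can be sketched in a few lines of Python. The function names and the two-dimensional vectors below are illustrative assumptions, chosen to echo a doubling and a negative one-half multiple.

```python
def add_vectors(u, v):
    """Component-wise sum of two vectors of the same dimension."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return [ui + vi for ui, vi in zip(u, v)]


def scale_vector(k, u):
    """Scalar multiple k * u."""
    return [k * ui for ui in u]


u = [2, 1]
v = [1, 3]
total = add_vectors(u, v)              # [3, 4]
doubled = scale_vector(2, u)           # same direction as u, twice as long
half_reversed = scale_vector(-0.5, v)  # opposite direction to v, half as long
```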

Scalar or dot product of vectors. The scalar product of two $n$-vectors $u$ and $v$, written as $u^T \cdot v$ (assuming $u$ and $v$ are both column vectors), is simply the sum of the componentwise products of the vectors, i.e., $u_1 v_1 + u_2 v_2 + \cdots + u_n v_n$. Note that if $u$ and $v$ are both $n$-dimensional column vectors, then $u \cdot v$ is undefined. The inner dimensions must be the same, and the result has dimensions equal to the outer dimensions. That is, $u^T \cdot v$ involves the product of a $(1 \times n)$ vector and an $(n \times 1)$ vector, and this is allowed because both inner dimensions equal $n$, with the result being $(1 \times 1)$ (which is a scalar). On the other hand, $u \cdot v$ tries to take the product of an $(n \times 1)$ vector and an $(n \times 1)$ vector, which is not defined because the inner dimensions are 1 and $n$, respectively.

The magnitude of an $n$-dimensional vector $u$, denoted by $\|u\|$, is the Euclidean norm, i.e., $\|u\| = \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2}$. It turns out that $u^T \cdot v = \|u\| \|v\| \cos\theta$, where $\theta$ is the measure of the angle between $u$ and $v$. Thus, the scalar product of vectors that are perpendicular to one another equals zero.
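The relationship between the scalar product, the norm, and the angle can be checked numerically. The sketch below uses only the Python standard library; the function names and the two perpendicular vectors are assumptions made for illustration.

```python
import math


def dot(u, v):
    """Scalar product u^T v of two vectors of the same dimension."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return sum(ui * vi for ui, vi in zip(u, v))


def norm(u):
    """Euclidean norm ||u||."""
    return math.sqrt(dot(u, u))


def angle_between(u, v):
    """Angle theta (in radians) from u^T v = ||u|| ||v|| cos(theta)."""
    cos_theta = dot(u, v) / (norm(u) * norm(v))
    # Guard against tiny floating-point overshoot outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)


u = [3, 4]
v = [-4, 3]   # perpendicular to u, so the scalar product is zero
```

Here `dot(u, v)` is zero and `angle_between(u, v)` is a right angle, in line with the remark about perpendicular vectors.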

We can view the vectors $u$ and $v$ as arrows (as in Figure 1) or as the points at the end of these respective arrows. Figure 3 shows that any point on the line segment joining these points can be expressed as $\lambda u + (1 - \lambda)v$ for some $0 \le \lambda \le 1$. For a given $0 \le \lambda \le 1$ we call $\lambda u + (1 - \lambda)v$ a convex combination of the vectors $u$ and $v$. Then the line segment connecting $u$ and $v$ represents all convex combinations of $u$ and $v$.

Figure 3: Convex combinations of two vectors u and v ($0 \le \lambda \le 1$).
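A convex combination is easy to compute directly. In the following sketch the function name, the parameter name `lam`, and the two points are assumptions for illustration only.

```python
def convex_combination(u, v, lam):
    """Return lam * u + (1 - lam) * v for 0 <= lam <= 1."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    if not 0 <= lam <= 1:
        raise ValueError("lam must lie between 0 and 1")
    return [lam * ui + (1 - lam) * vi for ui, vi in zip(u, v)]


u = [4, 0]
v = [0, 2]
midpoint = convex_combination(u, v, 0.5)   # halfway along the segment
```

Choosing `lam = 1` returns $u$ and `lam = 0` returns $v$, the two endpoints of the segment.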

Matrix multiplication. We can multiply an $m \times n$ matrix $A$ by some matrix $D$, provided that $D$ has dimensions $n \times p$ for some positive integer $p$. We write this multiplication as $AD$, and the result is a matrix of dimensions $m \times p$. (Note that $DA \neq AD$ in general, and $DA$ is only defined in this example if $p = m$.) Element $(i, j)$ of $AD$ is equal to the scalar product of the $i$th row of $A$ and the $j$th column of $D$.

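Example. Given the matrices $A$ and $D$ below, compute $AD$.

$$
A = \begin{bmatrix} 1 & 1 & 2 \\ 2 & 1 & 3 \end{bmatrix},
\quad
D = \begin{bmatrix} 1 & 1 \\ 2 & 3 \\ 1 & 2 \end{bmatrix}.
$$

Because $A$ is $2 \times 3$ and $D$ is $3 \times 2$, this product is defined and the result will be a $2 \times 2$ matrix. Element $(1, 1)$ of this matrix is obtained by taking the scalar product of row 1 of $A$ and column 1 of $D$, i.e.,

$$
\begin{bmatrix} 1 & 1 & 2 \end{bmatrix}
\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} = 5.
$$

The resulting matrix is

$$
AD = \begin{bmatrix} 5 & 8 \\ 7 & 11 \end{bmatrix}.
$$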

Matrix multiplication is associative and distributive, i.e., assuming compatible dimensions for multiplication of matrices $A$, $B$, $C$, and $D$, $A(BC) = (AB)C$, $A(B + C) = AB + AC$, and $(B + C)D = BD + CD$. If the matrices $A$ and $D$ are compatible for multiplication, then $(AD)^T = D^T A^T$.
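The worked product and the transpose rule can be reproduced with a short Python sketch. The helper names `matmul` and `transpose` are assumptions for this example; the matrices are the $A$ and $D$ from the example above.

```python
def matmul(A, D):
    """Product AD, where element (i, j) is row i of A dotted with column j of D."""
    n = len(D)
    if any(len(row) != n for row in A):
        raise ValueError("inner dimensions must agree")
    p = len(D[0])
    return [[sum(A[i][k] * D[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]


def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


A = [[1, 1, 2],
     [2, 1, 3]]
D = [[1, 1],
     [2, 3],
     [1, 2]]
AD = matmul(A, D)   # 2 x 2 result
DA = matmul(D, A)   # 3 x 3 result, so DA and AD cannot be equal
```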

Systems of Equations

We now return to the system of equations we started with, now writing this system in matrix form, i.e.,

$$
Ax = b,
$$

where $A$ is an $m \times n$ matrix, $x$ is an $n$-dimensional column vector, and $b$ is an $m$-dimensional column vector (note that $Ax$ takes the product of an $m \times n$ matrix and an $n \times 1$ vector, and the result is thus an $m \times 1$ vector). We do not know a priori whether the above system of equations admits a feasible solution. However, exactly one of three situations must occur:

1. The system has no feasible solution.

2. The system has exactly one solution.

3. The system has an infinite number of solutions.

Given a system of equations, we would like to be able to characterize which of the above conditions applies. Doing so requires understanding the concepts of linearly independent vectors and the rank of a matrix.

It is perhaps useful to think of our system of equations in a slightly different way. We can rewrite our system of equations as follows:

$$
\sum_{i=1}^{n}
\begin{bmatrix} a_{1i} \\ a_{2i} \\ \vdots \\ a_{mi} \end{bmatrix} x_i
=
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}.
$$

On the left-hand side, the vector $a^i = [a_{1i} \; a_{2i} \; \cdots \; a_{mi}]^T$ corresponds to the $i$th column of the $A$ matrix. When viewed this way, it should be clear that in solving this system of equations, we are trying to express the $m$-dimensional vector $b$ as a linear combination of the $n$ columns of the $A$ matrix, with multipliers $x_1, x_2, \ldots, x_n$. Thus, we would like to know when it is possible to express an $m$-dimensional vector as a linear combination of a set of $n$ vectors in $m$ dimensions.
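The column view can be checked numerically: summing the columns of $A$ weighted by the entries of $x$ gives the same vector as computing each row's scalar product with $x$. The $3 \times 4$ matrix, the multipliers, and the helper names below are illustrative assumptions.

```python
def column(A, i):
    """Return column i of A (the vector a^i) as a list."""
    return [row[i] for row in A]


def combine_columns(A, x):
    """Sum over i of a^i * x_i, i.e., a linear combination of the columns of A."""
    result = [0] * len(A)
    for i, xi in enumerate(x):
        result = [r + xi * c for r, c in zip(result, column(A, i))]
    return result


A = [[1, 2, 3, 4],
     [1, 3, 5, 6],
     [0, 1, 2, 3]]
x = [-18, 2, 0, 6]

b_from_columns = combine_columns(A, x)
b_from_rows = [sum(a * xj for a, xj in zip(row, x)) for row in A]
```

Both computations give the same right-hand side vector, which is exactly the statement that $Ax = b$ expresses $b$ as a linear combination of the columns of $A$.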

To understand this, we need the concept of linear independence. A set of $n$ vectors $a^1, a^2, \ldots, a^n$ in $m$-space is linearly independent if the only solution to the system $\sum_{i=1}^{n} a^i k_i = 0$ requires setting $k_i = 0$ for $i = 1, \ldots, n$. The vectors $u$ and $v$ in each of the figures we have used so far are linearly independent because the only way to obtain the origin as a linear combination of these vectors requires multiplying each vector by zero. If a non-zero set of $k_i$’s exists such that the associated linear combination of the vectors equals zero, then the vectors are not linearly independent.

A set of vectors such that a linear combination of these vectors can represent any point in $m$-space is said to span $m$-space. We require at least $m$ vectors in order to span $m$-space, and some subset of $m$ of these vectors must be linearly independent in order for the vectors to span $m$-space. A set of $m$ vectors that span $m$-space is called a basis for $m$-space, and this set of $m$ vectors must be linearly independent by definition (moreover, removing any one of these vectors from the set will result in a set of vectors that no longer spans $m$-space). Given a basis for $m$-space, the representation of any point in $m$-space in terms of this basis is unique. That is, if we have $m$ linearly independent columns of $A$ (e.g., columns 1 through $m$), then the solution to $\sum_{i=1}^{m} a^i x_i = b$ is unique. Note that a set of more than $m$ vectors may span $m$-space, provided that a subset of $m$ of these vectors is linearly independent. As a result, we immediately know that if a subset of the $n$ column vectors $a^1, a^2, \ldots, a^n$ spans $m$-space, then our system of equations has at least one solution, as the $m$-dimensional vector $b$ can be expressed as a linear combination of these vectors.

Observe that if $n < m$, then we cannot have a subset of $m$ columns that span $m$-space. However, this does not immediately imply that our system of equations has no solution. It is entirely possible that one or more of the equations in our system is a multiple of another equation (or, more generally, a linear combination of other equations), implying that we have redundant, or unnecessary, equations and should be working in a lower-dimensional subspace of $m$-space.

In order to characterize which of the three situations applies (one solution, no solution, infinite number of solutions) for a given system of equations $Ax = b$, it is useful to understand the concept of the rank of a matrix. The rank of a matrix $A$ corresponds to the maximum number of linearly independent columns (or, equivalently, rows) of the matrix. Given an $m \times n$ matrix $A$, this immediately implies that Rank $A \le \min\{m, n\}$. The rank of a matrix tells us the dimension of the subspace spanned by the set of columns of the matrix. In order to identify the rank of a matrix, we can use a set of elementary row operations (ERO’s). The following are ERO’s:

1. Interchanging rows $i$ and $j$.

2. Multiplying a row by a nonzero scalar.

3. Multiplying a row by a scalar and adding it to another row.

The application of any set of ERO’s to a system of equations does not affect the set of solutions for the system.

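If an $m \times n$ matrix $A$ has rank $k \le \min\{m, n\}$, then it is possible to use a set of ERO’s to obtain a $k \times k$ identity matrix $I_k$, where

$$
I_k = \begin{bmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix}.
$$

(Note that if $A$ is $m \times n$, then $I_m A = A I_n = A$; generally we will write $I$ when the dimension is understood or implied.) Thus, in order to identify the rank of a matrix $A$, we perform a set of ERO’s to obtain the largest possible identity matrix. To illustrate this, consider the $3 \times 4$ matrix below, followed by a series of ERO’s applied to the original matrix:

$$
A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 1 & 3 & 5 & 6 \\ 0 & 1 & 2 & 3 \end{bmatrix}.
$$

$$
A_1 = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 2 & 2 \\ 0 & 1 & 2 & 3 \end{bmatrix},
\quad
A_2 = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 2 & 2 \\ 0 & 0 & 0 & 1 \end{bmatrix},
$$

$$
A_3 = \begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & 2 & 2 \\ 0 & 0 & 0 & 1 \end{bmatrix},
\quad
A_4 = \begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.
$$

Here $A_1$ subtracts row 1 from row 2, $A_2$ subtracts row 2 from row 3, $A_3$ subtracts twice row 2 from row 1, and $A_4$ subtracts twice row 3 from row 2. Because we have found a $3 \times 3$ identity matrix (in columns 1, 2, and 4) and Rank $A \le \min\{3, 4\} = 3$, we know that this matrix has rank 3. Clearly the first, second, and fourth columns are linearly independent and thus span and provide a basis for 3-space. Next, consider a slightly different $A$ matrix: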

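$$
A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 1 & 3 & 4 & 6 \\ 0 & 1 & 1 & 2 \end{bmatrix}.
$$

$$
A_1 = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 1 & 2 \\ 0 & 1 & 1 & 2 \end{bmatrix},
\quad
A_2 = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 1 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix},
\quad
A_3 = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix}.
$$

Because the largest identity matrix we can find is $2 \times 2$, this matrix has rank 2.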
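The row-reduction procedure used in the two rank examples can be sketched in Python with exact rational arithmetic. This is a minimal sketch, not a production routine: the function name `rref`, its return convention, and the use of `fractions.Fraction` are assumptions made for this illustration.

```python
from fractions import Fraction


def rref(matrix):
    """Row-reduce a matrix with ERO's; return (reduced rows, pivot column indices)."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    m = len(rows)
    n = len(rows[0]) if rows else 0
    pivots = []
    pivot_row = 0
    for col in range(n):
        if pivot_row == m:
            break
        # Look for a row at or below pivot_row with a nonzero entry in this column.
        found = next((r for r in range(pivot_row, m) if rows[r][col] != 0), None)
        if found is None:
            continue
        # ERO 1: interchange rows.
        rows[pivot_row], rows[found] = rows[found], rows[pivot_row]
        # ERO 2: scale the pivot row so the pivot equals 1.
        pv = rows[pivot_row][col]
        rows[pivot_row] = [v / pv for v in rows[pivot_row]]
        # ERO 3: subtract multiples of the pivot row from every other row.
        for r in range(m):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivot_row])]
        pivots.append(col)
        pivot_row += 1
    return rows, pivots


first = [[1, 2, 3, 4], [1, 3, 5, 6], [0, 1, 2, 3]]
second = [[1, 2, 3, 4], [1, 3, 4, 6], [0, 1, 1, 2]]
reduced_first, pivots_first = rref(first)
reduced_second, pivots_second = rref(second)
```

The number of pivot columns equals the rank, and the pivot columns are where the identity matrix appears. Indices are zero-based, so `[0, 1, 3]` corresponds to the first, second, and fourth columns.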


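Determining a solution to $Ax = b$ (or showing that none exists) can be done through a series of ERO’s applied to the extended matrix $[A|b]$. Consider the following system of equations:

$$
\begin{aligned}
x_1 + 2x_2 + 3x_3 + 4x_4 &= 10 \\
x_1 + 3x_2 + 5x_3 + 6x_4 &= 24 \\
x_2 + 2x_3 + 3x_4 &= 20
\end{aligned}
$$

The extended matrix $[A|b]$ can be represented as

$$
\left[\begin{array}{cccc|c}
1 & 2 & 3 & 4 & 10 \\
1 & 3 & 5 & 6 & 24 \\
0 & 1 & 2 & 3 & 20
\end{array}\right].
$$

We next perform a series of ERO’s:

$$
\left[\begin{array}{cccc|c}
1 & 2 & 3 & 4 & 10 \\
0 & 1 & 2 & 2 & 14 \\
0 & 1 & 2 & 3 & 20
\end{array}\right],
\quad
\left[\begin{array}{cccc|c}
1 & 2 & 3 & 4 & 10 \\
0 & 1 & 2 & 2 & 14 \\
0 & 0 & 0 & 1 & 6
\end{array}\right],
$$

$$
\left[\begin{array}{cccc|c}
1 & 0 & -1 & 0 & -18 \\
0 & 1 & 2 & 2 & 14 \\
0 & 0 & 0 & 1 & 6
\end{array}\right],
\quad
\left[\begin{array}{cccc|c}
1 & 0 & -1 & 0 & -18 \\
0 & 1 & 2 & 0 & 2 \\
0 & 0 & 0 & 1 & 6
\end{array}\right].
$$

In this system, we have Rank $[A|b]$ = Rank $A = 3 < n = 4$. When this is the case, i.e., Rank $[A|b]$ = Rank $A < n$, we have an infinite number of solutions. One such solution is $x_1 = -18$, $x_2 = 2$, $x_3 = 0$, $x_4 = 6$. Another solution is $x_1 = -17$, $x_2 = 0$, $x_3 = 1$, $x_4 = 6$.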
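As a sketch of where the infinitely many solutions come from, the final reduced extended matrix can be read row by row, treating $x_3$ (the one column of $A$ without a pivot) as a free parameter. The parameter name $t$ is introduced here only for illustration.

$$
\begin{aligned}
x_1 - x_3 &= -18 \\
x_2 + 2x_3 &= 2 \\
x_4 &= 6
\end{aligned}
\qquad\Longrightarrow\qquad
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}
=
\begin{bmatrix} -18 \\ 2 \\ 0 \\ 6 \end{bmatrix}
+ t \begin{bmatrix} 1 \\ -2 \\ 1 \\ 0 \end{bmatrix}
\quad \text{for any real } t.
$$

Setting $t = 0$ recovers the solution $x_1 = -18$, $x_2 = 2$, $x_3 = 0$, $x_4 = 6$; every other value of $t$ gives a different solution.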

If Rank $[A|b]$ = Rank $A = n$, note that this can only occur if $n \le m$, i.e., if the number of equations in the system is at least as great as the number of variables. If strict inequality holds, then a solution can only exist if $m - n$ equations are redundant. When these are removed, we are left with an $n \times n$ system of equations such that Rank $A' = n$ (where $A'$ is the remaining matrix after removing the redundant equations), in which case the columns of $A'$ serve as a basis for $n$-space, and the solution must be unique.

Finally, suppose Rank $[A|b]$ > Rank $A = k$. In this case, the columns of $A$ span only a $k$-dimensional subspace, and the vector $b$ lies outside that subspace. As a result we cannot express $b$ as a linear combination of the columns of $A$, and no solution exists. The following example illustrates how we can identify this:

$$
\begin{aligned}
x_1 + 2x_2 + 3x_3 + 4x_4 &= 10 \\
x_1 + 3x_2 + 4x_3 + 6x_4 &= 24 \\
x_2 + x_3 + 2x_4 &= 20
\end{aligned}
$$

The extended matrix $[A|b]$ can be represented as

$$
\left[\begin{array}{cccc|c}
1 & 2 & 3 & 4 & 10 \\
1 & 3 & 4 & 6 & 24 \\
0 & 1 & 1 & 2 & 20
\end{array}\right].
$$

We next perform a series of ERO’s:

$$
\left[\begin{array}{cccc|c}
1 & 2 & 3 & 4 & 10 \\
0 & 1 & 1 & 2 & 14 \\
0 & 1 & 1 & 2 & 20
\end{array}\right],
\quad
\left[\begin{array}{cccc|c}
1 & 2 & 3 & 4 & 10 \\
0 & 1 & 1 & 2 & 14 \\
0 & 0 & 0 & 0 & 6
\end{array}\right],
\quad
\left[\begin{array}{cccc|c}
1 & 0 & 1 & 0 & -18 \\
0 & 1 & 1 & 2 & 14 \\
0 & 0 & 0 & 0 & 6
\end{array}\right].
$$

At this point, we can see that Rank $A = 2$. If we can obtain a $[0 \; 0 \; 1]^T$ in the column associated with $b$ using a series of ERO’s, this will imply that Rank $[A|b] = 3$ > Rank $A$. We continue our ERO’s:

$$
\left[\begin{array}{cccc|c}
1 & 0 & 1 & 0 & -18 \\
0 & 1 & 1 & 2 & 14 \\
0 & 0 & 0 & 0 & 1
\end{array}\right],
\quad
\left[\begin{array}{cccc|c}
1 & 0 & 1 & 0 & -18 \\
0 & 1 & 1 & 2 & 0 \\
0 & 0 & 0 & 0 & 1
\end{array}\right],
\quad
\left[\begin{array}{cccc|c}
1 & 0 & 1 & 0 & 0 \\
0 & 1 & 1 & 2 & 0 \\
0 & 0 & 0 & 0 & 1
\end{array}\right].
$$

In fact, whenever we see all zeroes in the same row of the columns associated with the $A$ matrix and a non-zero in the same row of the column associated with the $b$ vector, we can conclude that we have an infeasible system. To summarize, we have the following result:

1. If Rank $[A|b]$ = Rank $A < n$, then the system has an infinite number of solutions.

2. If Rank $[A|b]$ = Rank $A = n$, then the system has a single unique solution.

3. If Rank $[A|b]$ > Rank $A$, then the system has no solution.

The use of the extended matrix $[A|b]$ and ERO’s to obtain a solution to a system of equations is known as Gauss-Jordan Elimination.
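The three-way rank test summarized above can be sketched as a small Python routine. The function names `rank` and `classify` and the returned strings are assumptions for this illustration; the rank is computed by row reduction with exact fractions.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix, counted as the number of pivots found by row reduction."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    m = len(rows)
    n = len(rows[0]) if rows else 0
    pivot_row = 0
    for col in range(n):
        if pivot_row == m:
            break
        found = next((r for r in range(pivot_row, m) if rows[r][col] != 0), None)
        if found is None:
            continue
        rows[pivot_row], rows[found] = rows[found], rows[pivot_row]
        pv = rows[pivot_row][col]
        rows[pivot_row] = [v / pv for v in rows[pivot_row]]
        for r in range(m):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    return pivot_row


def classify(A, b):
    """Compare Rank [A|b], Rank A, and n to decide how many solutions Ax = b has."""
    n = len(A[0])
    extended = [row + [bi] for row, bi in zip(A, b)]
    rank_A = rank(A)
    if rank(extended) > rank_A:
        return "no solution"
    if rank_A == n:
        return "unique solution"
    return "infinitely many solutions"


b = [10, 24, 20]
case_infinite = classify([[1, 2, 3, 4], [1, 3, 5, 6], [0, 1, 2, 3]], b)
case_none = classify([[1, 2, 3, 4], [1, 3, 4, 6], [0, 1, 1, 2]], b)
case_unique = classify([[1, 2, 4], [1, 3, 6], [0, 1, 3]], b)
```

The first two calls use the extended matrices from the worked examples; the third uses a square, full-rank matrix so that the unique-solution case also appears.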

Bases. For convenience, we will henceforth assume that the system we are working with has $m$ equations and $n \ge m$ variables, with Rank $A = m$. We say that the $A$ matrix in such cases has full row rank. Recall that any choice of $m$ linearly independent column vectors provides a basis for $m$-space (and spans $m$-space). Let us suppose we have $m$ such linearly independent columns, which form a basis for $m$-space, and let $B$ denote the square submatrix associated with these columns. Then we know that if we throw out all other variables (by equivalently setting their values to zero) and solve the system

$$
Bx = b
$$

(where $x$ now contains only the $m$ retained variables), we will obtain a unique solution (because Rank $[B|b]$ = Rank $B = m$, and we effectively now have $m$ equal to the number of variables). The variables corresponding to the $m$ linearly independent columns selected are called basic variables, while those that were set to zero are called non-basic variables. This unique solution is called a basic solution, and we can obtain this solution by performing a set of ERO’s that leads to an identity matrix in place of $B$ in the extended matrix $[B|b]$. Alternatively, we can identify a related matrix, called the inverse of the matrix $B$. An $m \times m$ matrix with full row rank (Rank $m$) has an associated inverse matrix, denoted as $B^{-1}$, such that $B^{-1}B = I$. Because of this, such a matrix is called non-singular. (Not every square $m \times m$ matrix has an inverse; when such a matrix does not have full row rank, it is a singular matrix and does not have an inverse.) The inverse of a matrix is convenient to have, because we can premultiply our system of equations by this inverse to obtain

$$
B^{-1}Bx = B^{-1}b,
$$

which is equivalent to

$$
x = B^{-1}b.
$$

That is, if we knew $B^{-1}$, we could simply premultiply $b$ by this matrix to obtain the unique basic solution. One way to identify $B^{-1}$ is as follows. We create an appended matrix $[B|I]$ and perform a set of ERO’s until we obtain an identity matrix in place of the matrix $B$. What we obtain when we do this, as a final result, is $[I|B^{-1}]$. We illustrate this using the following $B$ matrix:

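$$
B = \begin{bmatrix} 1 & 2 & 4 \\ 1 & 3 & 6 \\ 0 & 1 & 3 \end{bmatrix}.
$$

The appended matrix is:

$$
[B|I] = \left[\begin{array}{ccc|ccc}
1 & 2 & 4 & 1 & 0 & 0 \\
1 & 3 & 6 & 0 & 1 & 0 \\
0 & 1 & 3 & 0 & 0 & 1
\end{array}\right].
$$

We next perform a series of ERO’s on the matrix $[B|I]$:

$$
\left[\begin{array}{ccc|ccc}
1 & 2 & 4 & 1 & 0 & 0 \\
0 & 1 & 2 & -1 & 1 & 0 \\
0 & 1 & 3 & 0 & 0 & 1
\end{array}\right],
\quad
\left[\begin{array}{ccc|ccc}
1 & 2 & 4 & 1 & 0 & 0 \\
0 & 1 & 2 & -1 & 1 & 0 \\
0 & 0 & 1 & 1 & -1 & 1
\end{array}\right],
$$

$$
\left[\begin{array}{ccc|ccc}
1 & 0 & 0 & 3 & -2 & 0 \\
0 & 1 & 2 & -1 & 1 & 0 \\
0 & 0 & 1 & 1 & -1 & 1
\end{array}\right],
\quad
\left[\begin{array}{ccc|ccc}
1 & 0 & 0 & 3 & -2 & 0 \\
0 & 1 & 0 & -3 & 3 & -2 \\
0 & 0 & 1 & 1 & -1 & 1
\end{array}\right].
$$

We therefore have

$$
B^{-1} = \begin{bmatrix} 3 & -2 & 0 \\ -3 & 3 & -2 \\ 1 & -1 & 1 \end{bmatrix}.
$$

The reader can verify that $B^{-1}B = I$. Recall the system of equations

$$
\begin{aligned}
x_1 + 2x_2 + 3x_3 + 4x_4 &= 10 \\
x_1 + 3x_2 + 5x_3 + 6x_4 &= 24 \\
x_2 + 2x_3 + 3x_4 &= 20
\end{aligned}
$$

Suppose we set $x_3 = 0$. Then what remains is the system

$$
\begin{aligned}
x_1 + 2x_2 + 4x_4 &= 10 \\
x_1 + 3x_2 + 6x_4 &= 24 \\
x_2 + 3x_4 &= 20
\end{aligned}
$$

The $A$ matrix for this system corresponds to the $B$ matrix we just inverted. Thus, the unique solution can be obtained by premultiplying the right-hand side vector $b$ by $B^{-1}$. When we do this we obtain:

$$
\begin{bmatrix} x_1 \\ x_2 \\ x_4 \end{bmatrix}
= B^{-1}b
= \begin{bmatrix} 3 & -2 & 0 \\ -3 & 3 & -2 \\ 1 & -1 & 1 \end{bmatrix}
\begin{bmatrix} 10 \\ 24 \\ 20 \end{bmatrix}
= \begin{bmatrix} -18 \\ 2 \\ 6 \end{bmatrix}.
$$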
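The $[B|I]$ procedure and the final premultiplication can be sketched in Python as follows. The function names `invert` and `matvec` are assumptions for this illustration, and exact fractions are used so the result matches the hand computation without rounding.

```python
from fractions import Fraction


def invert(B):
    """Apply ERO's to the appended matrix [B | I] until it reads [I | B^-1]."""
    m = len(B)
    # Build [B | I] with exact rational entries.
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(m)]
           for i, row in enumerate(B)]
    for col in range(m):
        # Find a usable pivot at or below the diagonal; none means B is singular.
        found = next((r for r in range(col, m) if aug[r][col] != 0), None)
        if found is None:
            raise ValueError("matrix is singular and has no inverse")
        aug[col], aug[found] = aug[found], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(m):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    # The right-hand block now holds the inverse.
    return [row[m:] for row in aug]


def matvec(M, x):
    """Matrix-vector product Mx."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in M]


B = [[1, 2, 4],
     [1, 3, 6],
     [0, 1, 3]]
b = [10, 24, 20]
B_inv = invert(B)
x_basic = matvec(B_inv, b)   # values of the basic variables x1, x2, x4
```

Multiplying `B_inv` by `b` reproduces the basic solution with $x_3 = 0$. For a singular matrix the routine raises an error rather than returning a result.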
