review of basic linear algebra
Post on 14-Dec-2015
ESI 6314 Deterministic Methods in Operations Research
Review of Basic Linear Algebra
Solving linear programs will require repeatedly solving systems of linear equations. We consider a system with $m$ linear equations and $n$ variables, or unknowns. We let $x_j$ denote the $j$th variable, with $a_{ij}$ corresponding to the coefficient of $x_j$ in the $i$th equation and $b_i$ denoting the right-hand side value of the $i$th equation, i.e., we consider the following system:

$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned}
$$
It will be convenient to use vector and matrix notation, where

$$
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix},
$$

and

$$
x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \qquad
b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}.
$$

We say that A is an m × n matrix, that x is an n-dimensional vector, and that b is an m-dimensional vector. An n-dimensional vector is a point in n-space and corresponds to a matrix with either one row or one column. We have written x and b above as column vectors, i.e., matrices containing a single column. We could also write x, for example, as a row vector. When we define x as a column vector and then write it as a row vector, we call this row vector the transpose of x, i.e., $x^T = [x_1 \; x_2 \; \cdots \; x_n]$. If we had originally defined x as a row vector, then its transpose would correspond to a column vector. The transpose of the m × n matrix A above is an n × m matrix $A^T$ written as

$$
A^T = \begin{bmatrix}
a_{11} & a_{21} & \cdots & a_{m1} \\
a_{12} & a_{22} & \cdots & a_{m2} \\
\vdots & \vdots & & \vdots \\
a_{1n} & a_{2n} & \cdots & a_{mn}
\end{bmatrix}.
$$

Note that $(A^T)^T = A$.
Vector and matrix addition. We can add matrices or vectors of the same dimensions. For example, consider the n-dimensional vectors u and v:

$$
u = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}, \qquad
v = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}.
$$

Then the addition of u and v simply involves the component-wise addition of the vector elements:

$$
u + v = \begin{bmatrix} u_1 + v_1 \\ u_2 + v_2 \\ \vdots \\ u_n + v_n \end{bmatrix}.
$$

In two dimensions, we can visualize this addition as shown in Figure 1. Similarly, we can add two m × n matrices $A^1$ and $A^2$, with the result being an m × n matrix such that the (i, j)th element equals $a^1_{ij} + a^2_{ij}$.
Scalar multiple of a vector or matrix. If k is a scalar (a number), then the product of k and the n-vector u is given by

$$
ku = \begin{bmatrix} ku_1 \\ ku_2 \\ \vdots \\ ku_n \end{bmatrix}.
$$

Figure 2 shows a positive scalar multiple of the vector u; the result is a vector in the same direction as u, but scaled by the multiple 2 in this case. The figure also shows a negative scalar multiple of the vector v; the result is a vector in the opposite direction of v, scaled by the multiple 1/2 in this case. Given a vector, it should be clear that a scalar multiple of that vector can represent any point on the infinite line that overlaps the vector arrow.

Figure 1: Addition of two vectors u and v.

Figure 2: Scalar multiples of the vectors u and v.

Similarly, the product of k and the matrix A is written as

$$
kA = \begin{bmatrix}
ka_{11} & ka_{12} & \cdots & ka_{1n} \\
ka_{21} & ka_{22} & \cdots & ka_{2n} \\
\vdots & \vdots & & \vdots \\
ka_{m1} & ka_{m2} & \cdots & ka_{mn}
\end{bmatrix}.
$$
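The component-wise definitions of addition and scalar multiplication can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the sample vectors are assumptions made for the example, and vectors are represented as plain lists.

```python
def add_vectors(u, v):
    """Component-wise sum of two vectors of the same dimension."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return [ui + vi for ui, vi in zip(u, v)]


def scale_vector(k, u):
    """Multiply every component of the vector u by the scalar k."""
    return [k * ui for ui in u]


def scale_matrix(k, A):
    """Multiply every entry of the matrix A (a list of rows) by the scalar k."""
    return [[k * entry for entry in row] for row in A]


# Hypothetical sample vectors, not the ones drawn in the figures.
u = [1, 2]
v = [3, -1]
print(add_vectors(u, v))      # [4, 1]
print(scale_vector(2, u))     # [2, 4]
print(scale_vector(-0.5, v))  # [-1.5, 0.5]
```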
Scalar or dot product of vectors. The scalar product of two n-vectors u and v, written as $u^T \cdot v$ (assuming u and v are both column vectors), is simply the sum of the componentwise products of the vectors, i.e., $u_1v_1 + u_2v_2 + \cdots + u_nv_n$. Note that if u and v are both n-dimensional column vectors, then $u \cdot v$ is undefined. The inner dimensions must be the same, and the result has dimensions equal to the outer dimensions. That is, $u^T \cdot v$ involves the product of a (1 × n) vector and an (n × 1) vector, and this is allowed because both inner dimensions equal n, with the result being (1 × 1) (which is a scalar). On the other hand, $u \cdot v$ tries to take the product of an (n × 1) vector and an (n × 1) vector, which is not defined because the inner dimensions are 1 and n, respectively.

The magnitude of an n-dimensional vector u, denoted by $\|u\|$, is the Euclidean norm, i.e., $\|u\| = \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2}$. It turns out that $u^T \cdot v = \|u\|\|v\| \cos\theta$, where $\theta$ is the measure of the angle between u and v. Thus, the scalar product of vectors that are perpendicular to one another equals zero.

We can view the vectors u and v as arrows (as in Figure 1) or as the points at the end of these respective arrows. Figure 3 shows that any point on the line segment joining these points can be expressed as $\lambda u + (1 - \lambda)v$ for some $0 \le \lambda \le 1$. For a given $0 \le \lambda \le 1$ we call $\lambda u + (1 - \lambda)v$ a convex combination of the vectors u and v. Then the line segment connecting u and v represents all convex combinations of u and v.

Figure 3: Convex combinations of two vectors u and v ($0 \le \lambda \le 1$).
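The scalar product, the Euclidean norm, the angle formula, and convex combinations can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names and the sample vectors are assumptions chosen for illustration.

```python
import math


def dot(u, v):
    """Scalar product: the sum of the componentwise products of u and v."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return sum(ui * vi for ui, vi in zip(u, v))


def norm(u):
    """Euclidean norm (magnitude) of u."""
    return math.sqrt(dot(u, u))


def angle_between(u, v):
    """Angle theta (in radians) recovered from u.v = |u| |v| cos(theta)."""
    return math.acos(dot(u, v) / (norm(u) * norm(v)))


def convex_combination(u, v, lam):
    """Return lam*u + (1 - lam)*v for a weight lam between 0 and 1."""
    if not 0 <= lam <= 1:
        raise ValueError("lam must lie between 0 and 1")
    return [lam * ui + (1 - lam) * vi for ui, vi in zip(u, v)]


print(dot([1, 2, 3], [4, 5, 6]))                 # 32
print(norm([3, 4]))                              # 5.0
print(dot([1, 0], [0, 1]))                       # 0, the vectors are perpendicular
print(convex_combination([2, 4], [0, 0], 0.5))   # [1.0, 2.0], the midpoint
```

Setting lam to 1 returns u and setting it to 0 returns v, so the two endpoints of the segment are themselves convex combinations.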
Matrix multiplication. We can multiply an m × n matrix A by some matrix D, provided that D has dimensions n × p for some positive integer p. We write this multiplication as AD, and the result is a matrix of dimensions m × p. (Note that $DA \ne AD$ in general, and DA is only defined in this example if p = m.) Element (i, j) of AD is equal to the scalar product of the ith row of A and the jth column of D.

Example. Given the matrices A and D below, compute AD.

$$
A = \begin{bmatrix} 1 & 1 & 2 \\ 2 & 1 & 3 \end{bmatrix}, \qquad
D = \begin{bmatrix} 1 & 1 \\ 2 & 3 \\ 1 & 2 \end{bmatrix}.
$$

Because A is 2 × 3 and D is 3 × 2, this product is defined and the result will be a 2 × 2 matrix. Element (1, 1) of this matrix is obtained by taking the scalar product of row 1 of A and column 1 of D, i.e.,

$$
\begin{bmatrix} 1 & 1 & 2 \end{bmatrix}
\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} = 5.
$$

The resulting matrix is

$$
AD = \begin{bmatrix} 5 & 8 \\ 7 & 11 \end{bmatrix}.
$$

Matrix multiplication is associative and distributive, i.e., assuming compatible dimensions for multiplication of matrices A, B, C, and D, A(BC) = (AB)C, A(B + C) = AB + AC, and (B + C)D = BD + CD. If the matrices A and D are compatible for multiplication, then $(AD)^T = D^T A^T$.
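The row-by-column rule can be written out directly. The sketch below reproduces the example product AD and checks the transpose identity on the same matrices; the function names are assumptions, and matrices are stored as lists of rows.

```python
def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(column) for column in zip(*A)]


def matmul(A, D):
    """Multiply A (m x n) by D (n x p); entry (i, j) is row i of A dotted with column j of D."""
    if len(A[0]) != len(D):
        raise ValueError("inner dimensions must agree")
    columns_of_D = transpose(D)
    return [[sum(a * d for a, d in zip(row, column)) for column in columns_of_D]
            for row in A]


A = [[1, 1, 2],
     [2, 1, 3]]
D = [[1, 1],
     [2, 3],
     [1, 2]]

AD = matmul(A, D)
print(AD)  # [[5, 8], [7, 11]]

# (AD)^T equals D^T A^T for these matrices.
print(transpose(AD) == matmul(transpose(D), transpose(A)))  # True

# DA is defined here because p = m = 2, but it is 3 x 3 and so cannot equal AD.
print(len(matmul(D, A)), len(matmul(D, A)[0]))  # 3 3
```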
Systems of Equations
We now return to the system of equations we started with, now writing this system in matrix form, i.e.,

$$
Ax = b,
$$

where A is an m × n matrix, x is an n-dimensional column vector, and b is an m-dimensional column vector (note that Ax takes the product of an m × n matrix and an n × 1 vector, and the result is thus an m × 1 vector). We do not know a priori whether the above system of equations admits a feasible solution. However, exactly one of three situations must occur:

1. The system contains no feasible solution.
2. The system contains exactly one solution.
3. The system contains an infinite number of solutions.

Given a system of equations, we would like to be able to characterize which of the above conditions applies. Doing so requires understanding the concepts of linearly independent vectors and the rank of a matrix.
It is perhaps useful to think of our system of equations in a slightly different way. We can rewrite our system of equations as follows:

$$
\sum_{i=1}^{n}
\begin{bmatrix} a_{1i} \\ a_{2i} \\ \vdots \\ a_{mi} \end{bmatrix} x_i
=
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}.
$$

On the left-hand side, the vector $a^i = [a_{1i} \; a_{2i} \; \cdots \; a_{mi}]^T$ corresponds to the ith column of the A matrix. When viewed this way, it should be clear that in solving this system of equations, we are trying to express the m-dimensional vector b as a linear combination of the n columns of the A matrix, with multipliers $x_1, x_2, \ldots, x_n$. Thus, we would like to know when it is possible to express an m-dimensional vector as a linear combination of a set of n vectors in m dimensions.
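The column view can be checked on a small case. The sketch below computes Ax once row by row and once as a weighted sum of the columns of A; the function names, the matrix, and the multipliers are assumptions chosen for illustration.

```python
def matvec(A, x):
    """Row view: entry i of Ax is row i of A dotted with x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def column_combination(A, x):
    """Column view: Ax is the sum over i of (column i of A) times x_i."""
    m = len(A)
    result = [0] * m
    for i, multiplier in enumerate(x):
        column = [A[r][i] for r in range(m)]
        result = [acc + multiplier * entry for acc, entry in zip(result, column)]
    return result


# Hypothetical 2 x 3 matrix and multipliers.
A = [[1, 1, 2],
     [2, 1, 3]]
x = [1, 2, 1]

print(matvec(A, x))              # [5, 7]
print(column_combination(A, x))  # [5, 7]
```

Both views give the same vector, so asking whether Ax = b has a solution is the same as asking whether b can be built from the columns of A.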
To understand this, we need the concept of linear independence. A set of n vectors $a^1, a^2, \ldots, a^n$ in m-space is linearly independent if the only solution to the system $\sum_{i=1}^{n} a^i k_i = 0$ requires setting $k_i = 0$ for $i = 1, \ldots, n$. The vectors u and v in each of the figures we have used so far are linearly independent because the only way to obtain the origin as a linear combination of these vectors requires multiplying each vector by zero. If a set of $k_i$'s that are not all zero exists such that the associated linear combination of the vectors equals zero, then the vectors are not linearly independent.
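As a small worked illustration of the definition (the vectors below are chosen for this example and are not the ones drawn in the figures), consider three vectors in 3-space:

$$
a^1 = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \qquad
a^2 = \begin{bmatrix} 2 \\ 3 \\ 1 \end{bmatrix}, \qquad
a^3 = \begin{bmatrix} 3 \\ 4 \\ 1 \end{bmatrix}.
$$

Because $a^1 + a^2 = a^3$, the multipliers $k_1 = 1$, $k_2 = 1$, $k_3 = -1$ are not all zero and give

$$
1 \cdot a^1 + 1 \cdot a^2 + (-1) \cdot a^3 = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix},
$$

so the three vectors are not linearly independent. The pair $a^1, a^2$, on the other hand, is linearly independent: the system $k_1 a^1 + k_2 a^2 = 0$ reads

$$
k_1 + 2k_2 = 0, \qquad k_1 + 3k_2 = 0, \qquad k_2 = 0,
$$

and the third equation forces $k_2 = 0$, after which the first forces $k_1 = 0$.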
A set of vectors such that a linear combination of these vectors can represent any point in m-space is said to span m-space. We require at least m vectors in order to span m-space, and some subset of m of these vectors must be linearly independent in order for the vectors to span m-space. A set of m vectors that span m-space is called a basis for m-space, and this set of m vectors must be linearly independent by definition (moreover, removing any one of these vectors from the set will result in a set of vectors that no longer span m-space). Given a basis for m-space, the representation of any point in m-space in terms of this basis is unique. That is, if we have m linearly independent columns of A (e.g., columns 1 through m), then the solution to $\sum_{i=1}^{m} a^i x_i = b$ is unique. Note that a set of more than m vectors may span m-space, provided that a subset of m of these vectors is linearly independent. As a result, we immediately know that if a subset of the n column vectors $a^1, a^2, \ldots, a^n$ spans m-space, then our system of equations has at least one solution, as the m-dimensional vector b can be expressed as a linear combination of these vectors.

Observe that if n < m, then we cannot have a subset of m columns that span m-space. However, this does not immediately imply that our system of equations has no solution. It is entirely possible that one or more of the equations in our system is a multiple of another equation (or, more generally, a linear combination of other equations), implying that we have redundant, or unnecessary, equations and should be working in a smaller subspace of m-space.
In order to characterize which of the three situations applies (one solution, no solution, infinite number of solutions) for a given system of equations Ax = b, it is useful to understand the concept of the rank of a matrix. The rank of a matrix A corresponds to the maximum number of linearly independent columns (or rows) of the matrix. Given an m × n matrix A, this immediately implies that Rank A ≤ min{m, n}. The rank of a matrix tells us the dimension of the subspace spanned by the set of columns of the matrix. In order to identify the rank of a matrix, we can use a set of elementary row operations (ERO’s). The following are ERO’s:
1. Interchanging rows i and j.
2. Multiplying a row by a nonzero scalar.
3. Multiplying a row by a scalar and adding it to another row.
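The application of any set of ERO’s to a system of equations does not affect the set of solutions for the system.

If an m × n matrix A has rank k ≤ min{m, n}, then it is possible to use a set of ERO’s to obtain a k × k identity matrix $I_k$, where

$$
I_k = \begin{bmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix}.
$$

(Note that if A is m × n, then $I_m A = A I_n = A$; generally we will write I when the dimension is understood or implied.) Thus, in order to identify the rank of a matrix A, we perform a set of ERO’s to obtain the largest possible identity matrix. To illustrate this, consider the 3 × 4 matrix below, followed by a series of ERO’s applied to the original matrix:

$$
A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 1 & 3 & 5 & 6 \\ 0 & 1 & 2 & 3 \end{bmatrix}.
$$

Subtracting row 1 from row 2 gives

$$
A_1 = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 2 & 2 \\ 0 & 1 & 2 & 3 \end{bmatrix}.
$$

Subtracting row 2 from row 3 gives

$$
A_2 = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 2 & 2 \\ 0 & 0 & 0 & 1 \end{bmatrix}.
$$

Subtracting 2 times row 2 from row 1 gives

$$
A_3 = \begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & 2 & 2 \\ 0 & 0 & 0 & 1 \end{bmatrix}.
$$

Subtracting 2 times row 3 from row 2 gives

$$
A_4 = \begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.
$$

Because we have found a 3 × 3 identity matrix (in columns 1, 2, and 4) and Rank A ≤ min{3, 4} = 3, we know that this matrix has rank 3. Clearly the first, second, and fourth columns are linearly independent and thus span and provide a basis for 3-space. Next, consider a slightly different A matrix: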
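$$
A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 1 & 3 & 4 & 6 \\ 0 & 1 & 1 & 2 \end{bmatrix}.
$$

Subtracting row 1 from row 2 gives

$$
A_1 = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 1 & 2 \\ 0 & 1 & 1 & 2 \end{bmatrix}.
$$

Subtracting row 2 from row 3 gives

$$
A_2 = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 1 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix}.
$$

Subtracting 2 times row 2 from row 1 gives

$$
A_3 = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix}.
$$

Because the largest identity matrix we can find is 2 × 2, this matrix has rank 2.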
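The rank computation above can be automated. The following is a minimal sketch, not a production routine: it applies the three ERO’s column by column and counts the pivots it finds. The function name is an assumption, and exact arithmetic with `fractions.Fraction` is a choice made here to avoid rounding issues.

```python
from fractions import Fraction


def rank(matrix):
    """Return the rank of a matrix (list of rows) by reducing it with ERO's."""
    rows = [[Fraction(value) for value in row] for row in matrix]
    num_rows = len(rows)
    num_cols = len(rows[0]) if rows else 0
    pivot_row = 0
    for col in range(num_cols):
        # Look for a row at or below pivot_row with a nonzero entry in this column.
        candidate = next(
            (r for r in range(pivot_row, num_rows) if rows[r][col] != 0), None
        )
        if candidate is None:
            continue  # no pivot in this column
        # ERO 1: interchange rows so the pivot sits in pivot_row.
        rows[pivot_row], rows[candidate] = rows[candidate], rows[pivot_row]
        # ERO 2: multiply the pivot row by a nonzero scalar to make the pivot 1.
        pivot = rows[pivot_row][col]
        rows[pivot_row] = [value / pivot for value in rows[pivot_row]]
        # ERO 3: add a multiple of the pivot row to every other row to clear the column.
        for r in range(num_rows):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
        if pivot_row == num_rows:
            break
    return pivot_row  # number of pivots = size of the largest identity matrix found


first = [[1, 2, 3, 4], [1, 3, 5, 6], [0, 1, 2, 3]]
second = [[1, 2, 3, 4], [1, 3, 4, 6], [0, 1, 1, 2]]
print(rank(first))   # 3
print(rank(second))  # 2
```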
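Determining a solution to Ax = b (or showing that none exists) can be done through a series of ERO’s applied to the extended matrix [A|b]. Consider the following system of equations:

$$
\begin{aligned}
x_1 + 2x_2 + 3x_3 + 4x_4 &= 10 \\
x_1 + 3x_2 + 5x_3 + 6x_4 &= 24 \\
x_2 + 2x_3 + 3x_4 &= 20
\end{aligned}
$$

The extended matrix [A|b] can be represented as

$$
\left[\begin{array}{cccc|c}
1 & 2 & 3 & 4 & 10 \\
1 & 3 & 5 & 6 & 24 \\
0 & 1 & 2 & 3 & 20
\end{array}\right].
$$

We next perform a series of ERO’s. Subtracting row 1 from row 2 gives

$$
\left[\begin{array}{cccc|c}
1 & 2 & 3 & 4 & 10 \\
0 & 1 & 2 & 2 & 14 \\
0 & 1 & 2 & 3 & 20
\end{array}\right].
$$

Subtracting row 2 from row 3 gives

$$
\left[\begin{array}{cccc|c}
1 & 2 & 3 & 4 & 10 \\
0 & 1 & 2 & 2 & 14 \\
0 & 0 & 0 & 1 & 6
\end{array}\right].
$$

Subtracting 2 times row 2 from row 1 gives

$$
\left[\begin{array}{cccc|c}
1 & 0 & -1 & 0 & -18 \\
0 & 1 & 2 & 2 & 14 \\
0 & 0 & 0 & 1 & 6
\end{array}\right].
$$

Subtracting 2 times row 3 from row 2 gives

$$
\left[\begin{array}{cccc|c}
1 & 0 & -1 & 0 & -18 \\
0 & 1 & 2 & 0 & 2 \\
0 & 0 & 0 & 1 & 6
\end{array}\right].
$$

In this system, we have Rank [A|b] = Rank A = 3 < n = 4. When this is the case, i.e., Rank [A|b] = Rank A < n, we have an infinite number of solutions. The reduced system reads $x_1 - x_3 = -18$, $x_2 + 2x_3 = 2$, and $x_4 = 6$, so $x_3$ can be chosen freely. One such solution is $x_1 = -18$, $x_2 = 2$, $x_3 = 0$, $x_4 = 6$. Another solution is $x_1 = -17$, $x_2 = 0$, $x_3 = 1$, $x_4 = 6$.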
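The infinite family of solutions can be checked numerically. The sketch below assumes the coefficient matrix shown in the extended matrix [A|b] above (second row 1, 3, 5, 6) and reads the solutions off the final reduced matrix, treating $x_3$ as a free parameter; the function names are assumptions made for this example.

```python
# Coefficient matrix and right-hand side taken from the extended matrix [A|b].
A = [[1, 2, 3, 4],
     [1, 3, 5, 6],
     [0, 1, 2, 3]]
b = [10, 24, 20]


def solution_for(x3):
    """Read a solution off the reduced matrix: x1 - x3 = -18, x2 + 2*x3 = 2, x4 = 6."""
    return [-18 + x3, 2 - 2 * x3, x3, 6]


def apply(A, x):
    """Compute the product Ax for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


# Every choice of the free variable x3 yields a solution of Ax = b.
for x3 in (0, 1, -2, 5):
    x = solution_for(x3)
    print(x, apply(A, x) == b)
```

Each line printed ends in True, which is what having one free variable (n minus Rank A = 1) leads us to expect.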
If Rank [A|b] = Rank A = n, note that this can only occur if n ≤ m, i.e., if the number of equations in the system is at least as great as the number of variables. If strict inequality holds, then a solution can only exist if m − n equations are redundant. When these are removed, we are left with an n × n system of equations such that Rank A′ = n (where A′ is the remaining matrix after removing the redundant equations), in which case the columns of A′ serve as a basis for n-space, and the solution must be unique.

Finally, suppose Rank [A|b] > Rank A = k. In this case, the columns of A span a k-dimensional subspace, but the vector b lies outside that subspace. As a result we cannot express b as a linear combination of the columns of A, and no solution exists. The following example illustrates how we can identify this:
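$$
\begin{aligned}
x_1 + 2x_2 + 3x_3 + 4x_4 &= 10 \\
x_1 + 3x_2 + 4x_3 + 6x_4 &= 24 \\
x_2 + x_3 + 2x_4 &= 20
\end{aligned}
$$

The extended matrix [A|b] can be represented as

$$
\left[\begin{array}{cccc|c}
1 & 2 & 3 & 4 & 10 \\
1 & 3 & 4 & 6 & 24 \\
0 & 1 & 1 & 2 & 20
\end{array}\right].
$$

We next perform a series of ERO’s. Subtracting row 1 from row 2 gives

$$
\left[\begin{array}{cccc|c}
1 & 2 & 3 & 4 & 10 \\
0 & 1 & 1 & 2 & 14 \\
0 & 1 & 1 & 2 & 20
\end{array}\right].
$$

Subtracting row 2 from row 3 gives

$$
\left[\begin{array}{cccc|c}
1 & 2 & 3 & 4 & 10 \\
0 & 1 & 1 & 2 & 14 \\
0 & 0 & 0 & 0 & 6
\end{array}\right].
$$

Subtracting 2 times row 2 from row 1 gives

$$
\left[\begin{array}{cccc|c}
1 & 0 & 1 & 0 & -18 \\
0 & 1 & 1 & 2 & 14 \\
0 & 0 & 0 & 0 & 6
\end{array}\right].
$$

At this point, we can see that Rank A = 2. If we can obtain a $[0 \; 0 \; 1]^T$ in the column associated with b using a series of ERO’s, this will imply that Rank [A|b] = 3 > Rank A. We continue our ERO’s. Dividing row 3 by 6 gives

$$
\left[\begin{array}{cccc|c}
1 & 0 & 1 & 0 & -18 \\
0 & 1 & 1 & 2 & 14 \\
0 & 0 & 0 & 0 & 1
\end{array}\right].
$$

Subtracting 14 times row 3 from row 2 gives

$$
\left[\begin{array}{cccc|c}
1 & 0 & 1 & 0 & -18 \\
0 & 1 & 1 & 2 & 0 \\
0 & 0 & 0 & 0 & 1
\end{array}\right].
$$

Adding 18 times row 3 to row 1 gives

$$
\left[\begin{array}{cccc|c}
1 & 0 & 1 & 0 & 0 \\
0 & 1 & 1 & 2 & 0 \\
0 & 0 & 0 & 0 & 1
\end{array}\right].
$$

In fact, whenever we see all zeroes in the same row of the columns associated with the A matrix and a non-zero in the same row of the column associated with the b vector, we can conclude that we have an infeasible system. To summarize, we have the following result: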
1. If Rank [A|b] = Rank A < n, then the system has an infinite number of solutions.
2. If Rank [A|b] = Rank A = n, then the system has a single unique solution.
3. If Rank [A|b] > Rank A, then the system has no solution.

The use of the extended matrix [A|b] and ERO’s to obtain a solution to a system of equations is known as Gauss-Jordan Elimination.
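The three-way rank test can be turned into a short routine. The sketch below compares Rank A with Rank [A|b] and with n; the names `rank` and `classify_system` are assumptions made for this example, and exact `Fraction` arithmetic is used so that zero tests are reliable.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix (list of rows), found by Gauss-Jordan style ERO's."""
    rows = [[Fraction(value) for value in row] for row in matrix]
    num_rows, num_cols = len(rows), len(rows[0])
    pivot_row = 0
    for col in range(num_cols):
        candidate = next(
            (r for r in range(pivot_row, num_rows) if rows[r][col] != 0), None
        )
        if candidate is None:
            continue
        rows[pivot_row], rows[candidate] = rows[candidate], rows[pivot_row]
        pivot = rows[pivot_row][col]
        rows[pivot_row] = [value / pivot for value in rows[pivot_row]]
        for r in range(num_rows):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
        if pivot_row == num_rows:
            break
    return pivot_row


def classify_system(A, b):
    """Apply the rank conditions to Ax = b, where A has n columns."""
    n = len(A[0])
    extended = [list(row) + [rhs] for row, rhs in zip(A, b)]  # the matrix [A|b]
    rank_A = rank(A)
    rank_Ab = rank(extended)
    if rank_Ab > rank_A:
        return "no solution"
    if rank_A == n:
        return "unique solution"
    return "infinite solutions"


b = [10, 24, 20]
print(classify_system([[1, 2, 3, 4], [1, 3, 5, 6], [0, 1, 2, 3]], b))  # infinite solutions
print(classify_system([[1, 2, 3, 4], [1, 3, 4, 6], [0, 1, 1, 2]], b))  # no solution
print(classify_system([[1, 2, 4], [1, 3, 6], [0, 1, 3]], b))           # unique solution
```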
Bases. For convenience, we will henceforth assume that the system we are working with has m equations and n ≥ m variables, with Rank A = m. We say that the A matrix in such cases has full row rank. Recall that any choice of m linearly independent column vectors provides a basis for m-space (and spans m-space). Let us suppose we have m such linearly independent columns, which form a basis for m-space, and let B denote the square submatrix associated with these columns. Then we know that if we throw out all other variables (by equivalently setting their values to zero) and solve the system

$$
Bx = b
$$

we will obtain a unique solution (because Rank [B|b] = Rank B = m, and we effectively now have m equal to the number of variables). The variables corresponding to the m linearly independent columns selected are called basic variables, while those that were set to zero are called non-basic variables. This unique solution is called a basic solution, and we can obtain this solution by performing a set of ERO’s that leads to an identity matrix in place of B in the extended matrix [B|b]. Alternatively, we can identify a related matrix, called the inverse of the matrix B. An m × m matrix with full row rank (Rank m) has an associated inverse matrix, denoted as $B^{-1}$, such that $B^{-1}B = I$. Because of this, such a matrix is called non-singular. (Not every square m × m matrix has an inverse; when such a matrix does not have full row rank, it is a singular matrix and does not have an inverse.) The inverse of a matrix is convenient to have, because we can premultiply our system of equations by this inverse to obtain

$$
B^{-1}Bx = B^{-1}b,
$$

which is equivalent to

$$
x = B^{-1}b.
$$

That is, if we knew $B^{-1}$, we could simply premultiply b by this matrix to obtain the unique basic solution. One way to identify $B^{-1}$ is as follows. We create an appended matrix [B|I] and perform a set of ERO’s until we obtain an identity matrix in place of the matrix B. What we obtain when we do this, as a final result, is $[I|B^{-1}]$. We illustrate this using the following B matrix:
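$$
B = \begin{bmatrix} 1 & 2 & 4 \\ 1 & 3 & 6 \\ 0 & 1 & 3 \end{bmatrix}.
$$

The appended matrix is:

$$
[B|I] = \left[\begin{array}{ccc|ccc}
1 & 2 & 4 & 1 & 0 & 0 \\
1 & 3 & 6 & 0 & 1 & 0 \\
0 & 1 & 3 & 0 & 0 & 1
\end{array}\right].
$$

We next perform a series of ERO’s on the matrix [B|I]. Subtracting row 1 from row 2 gives

$$
\left[\begin{array}{ccc|ccc}
1 & 2 & 4 & 1 & 0 & 0 \\
0 & 1 & 2 & -1 & 1 & 0 \\
0 & 1 & 3 & 0 & 0 & 1
\end{array}\right].
$$

Subtracting row 2 from row 3 gives

$$
\left[\begin{array}{ccc|ccc}
1 & 2 & 4 & 1 & 0 & 0 \\
0 & 1 & 2 & -1 & 1 & 0 \\
0 & 0 & 1 & 1 & -1 & 1
\end{array}\right].
$$

Subtracting 2 times row 2 from row 1 gives

$$
\left[\begin{array}{ccc|ccc}
1 & 0 & 0 & 3 & -2 & 0 \\
0 & 1 & 2 & -1 & 1 & 0 \\
0 & 0 & 1 & 1 & -1 & 1
\end{array}\right].
$$

Subtracting 2 times row 3 from row 2 gives

$$
\left[\begin{array}{ccc|ccc}
1 & 0 & 0 & 3 & -2 & 0 \\
0 & 1 & 0 & -3 & 3 & -2 \\
0 & 0 & 1 & 1 & -1 & 1
\end{array}\right].
$$

We therefore have

$$
B^{-1} = \begin{bmatrix} 3 & -2 & 0 \\ -3 & 3 & -2 \\ 1 & -1 & 1 \end{bmatrix}.
$$

The reader can verify that $B^{-1}B = I$. Recall the system of equations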
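$$
\begin{aligned}
x_1 + 2x_2 + 3x_3 + 4x_4 &= 10 \\
x_1 + 3x_2 + 5x_3 + 6x_4 &= 24 \\
x_2 + 2x_3 + 3x_4 &= 20
\end{aligned}
$$

Suppose we set $x_3 = 0$. Then what remains is the system

$$
\begin{aligned}
x_1 + 2x_2 + 4x_4 &= 10 \\
x_1 + 3x_2 + 6x_4 &= 24 \\
x_2 + 3x_4 &= 20
\end{aligned}
$$

The A matrix for this system corresponds to the B matrix we just inverted. Thus, the unique solution can be obtained by premultiplying the right-hand side vector b by $B^{-1}$. When we do this we obtain:

$$
\begin{bmatrix} x_1 \\ x_2 \\ x_4 \end{bmatrix}
= B^{-1}b =
\begin{bmatrix} 3 & -2 & 0 \\ -3 & 3 & -2 \\ 1 & -1 & 1 \end{bmatrix}
\begin{bmatrix} 10 \\ 24 \\ 20 \end{bmatrix}
=
\begin{bmatrix} -18 \\ 2 \\ 6 \end{bmatrix}.
$$

This agrees with the solution $x_1 = -18$, $x_2 = 2$, $x_3 = 0$, $x_4 = 6$ identified earlier.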
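The [B|I] procedure and the resulting basic solution can be reproduced with a short routine. This is a minimal sketch under stated assumptions: the function names are hypothetical, the matrix is assumed square, and exact `Fraction` arithmetic is used so the result matches the hand computation.

```python
from fractions import Fraction


def invert(B):
    """Invert a square matrix by applying ERO's to [B|I] until it becomes [I|B^-1]."""
    m = len(B)
    # Build the appended matrix [B|I].
    rows = [
        [Fraction(value) for value in row] + [Fraction(int(i == j)) for j in range(m)]
        for i, row in enumerate(B)
    ]
    for col in range(m):
        # Find a usable pivot at or below the diagonal; none means B is singular.
        pivot_row = next((r for r in range(col, m) if rows[r][col] != 0), None)
        if pivot_row is None:
            raise ValueError("matrix is singular and has no inverse")
        rows[col], rows[pivot_row] = rows[pivot_row], rows[col]
        pivot = rows[col][col]
        rows[col] = [value / pivot for value in rows[col]]
        # Clear the rest of the column using multiples of the pivot row.
        for r in range(m):
            if r != col:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [row[m:] for row in rows]  # the right-hand block is B^-1


def matvec(M, x):
    """Compute the product Mx for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in M]


B = [[1, 2, 4],
     [1, 3, 6],
     [0, 1, 3]]
b = [10, 24, 20]

B_inv = invert(B)
print([[int(value) for value in row] for row in B_inv])  # [[3, -2, 0], [-3, 3, -2], [1, -1, 1]]
print([int(value) for value in matvec(B_inv, b)])        # [-18, 2, 6], i.e. x1, x2, x4
```

The integer conversion in the print statements is only for display; it is exact here because every entry of the inverse happens to be an integer.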