
INTRODUCTION TO LINEAR ALGEBRA, Second Edition
by Gilbert Strang

SOLUTIONS TO SELECTED EXERCISES
Christopher Heil

Spring 2000

CHAPTER 1

Introduction to Vectors

1.2 #13. Find two vectors v and w that are perpendicular to (1, 1, 1) and to each other.

Solution
There are many ways to go about this. One way would be to write v = (v1, v2, v3) and w = (w1, w2, w3) and then to write down the equations that v and w must satisfy. These are: v · (1, 1, 1) = 0, w · (1, 1, 1) = 0, and v · w = 0. This gives a system of equations with the vi and wi as unknowns, that you could then try to solve to find the set of all possible v and w satisfying these requirements.

On the other hand, the problem just asks you to find one specific choice of v and w, not all possible choices. So I think it is easiest to proceed first by inspection: v = (1, 0, −1) is clearly perpendicular to (1, 1, 1) since their dot product is v · (1, 1, 1) = 1 · 1 + 0 · 1 − 1 · 1 = 0. So, we just have to find a w = (w1, w2, w3) that is perpendicular to both of these vectors. This w must satisfy

w · (1, 1, 1) = w1 + w2 + w3 = 0 and v · w = (1, 0,−1) · w = w1 − w3 = 0.

There are infinitely many solutions to this system of equations; one particular solution is w = (1, −2, 1). There are many others. □
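As a quick numerical sanity check (my addition, not part of the original solutions), the three dot products can be verified with NumPy:

```python
# Verify v and w are perpendicular to (1, 1, 1) and to each other.
import numpy as np

u = np.array([1, 1, 1])
v = np.array([1, 0, -1])   # found by inspection
w = np.array([1, -2, 1])   # one particular solution

print(v @ u, w @ u, v @ w)  # prints: 0 0 0
```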



CHAPTER 2

Solving Linear Equations

2.1 #15. (a) What is the 2 × 2 identity matrix? I times [x; y] equals [x; y].

Solution
I = [1 0; 0 1]. Of course you know this, but why does it work? Because we compute Ix by forming linear combinations of the columns of I, and those columns can be used to form any vector using exactly the components of that vector as the scalars in the linear combination:

I [x; y] = [1 0; 0 1][x; y] = x[1; 0] + y[0; 1] = [x; 0] + [0; y] = [x; y]. □

(b) What is the 2 × 2 exchange matrix? P times [x; y] equals [y; x].

Solution
P = [0 1; 1 0]. The reasoning is similar:

P [x; y] = [0 1; 1 0][x; y] = x[0; 1] + y[1; 0] = [0; x] + [y; 0] = [y; x].

Of course you can think about this from an "entrywise" view of matrix/vector multiplication, but often it is very advantageous to think about general problems from a linear combination point of view. □
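To make the column point of view concrete, here is a small NumPy sketch (my addition): A times a vector equals the combination of A's columns weighted by that vector's components.

```python
import numpy as np

I = np.array([[1, 0], [0, 1]])
P = np.array([[0, 1], [1, 0]])
xy = np.array([3, 7])

# A @ xy equals the sum over j of xy[j] * (column j of A)
for A in (I, P):
    combo = sum(xy[j] * A[:, j] for j in range(2))
    print(A @ xy, combo)   # [3 7] [3 7], then [7 3] [7 3]
```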

2.2 #19. (Recommended) It is impossible for a system of linear equations to have exactly two solutions. Explain why.

NOTE: The wording of this problem is a little ambiguous: If you just write out the answer to parts (a) and (b) of the problem, then you have done the explanation that the problem asks for!

(a) If (x, y, z) and (X, Y, Z) are two solutions of Ax = b, what is another one?

Solution
Let u = (x, y, z) and v = (X, Y, Z). The fact that these are SOLUTIONS means that they solve the system of equations; in other words, Au = b and Av = b, with A, b coming from the system of equations. Intuitively, we would think that any vector w which lies on the line ℓ passing through u and v would also be a solution. The equation for this line is

ℓ = {cu + dv : c, d ∈ R and c + d = 1}.

Let's check that any vector on this line actually is a solution. Suppose that w = cu + dv where c + d = 1; then

Aw = A(cu + dv) = c(Au) + d(Av) = cb + db = (c + d)b = b.

Thus, any w on this line is in fact a solution. As one example, note that the average of u and v should lie on this line, since the average is the vector exactly halfway between u and v. That is, the average is w = (1/2)u + (1/2)v. This is on the line, and it is a solution.

Moral: If there are two solutions, then there is an entire line of solutions, and it is the line passing through the original two solutions. There might even be more, but we can't know that without additional information. IMPORTANT: It is NOT true that EVERY linear combination of u and v is a solution of Ax = b!! WHY NOT? □

(b) If three planes meet at two points, where else do they meet?

Solution
They meet at least on the line passing through those two points. To prove this, note that each plane is given by one equation. The intersection of the three planes is the set of points which solves the three equations simultaneously. In other words, it is the set of all solutions of a system of linear equations. But as shown in part (a), if there are two solutions, then there is an entire line of solutions. □

2.3 #12. Explain these facts. If the third column of B is all zero, the third column of EB is all zero (for any E). If the third row of B is all zero, the third row of EB might not be zero.

Solution
For the first part, use columnwise multiplication. Let v1, v2, . . . , vn be the columns of B. Then to compute EB you simply multiply E times each individual column of B:

EB = E [v1 v2 v3 · · · vn] = [Ev1 Ev2 Ev3 · · · Evn].

So, if v3 (the third column of B) equals 0 (the zero vector) then Ev3 (the third column of EB) must equal 0.

On the other hand, the third row may or may not be zero. Here is an example where it is not:

[1 0 0; 0 1 0; 1 0 1] [1 2 3; 4 5 6; 0 0 0] = [1 2 3; 4 5 6; 1 2 3]. □
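A quick check of both claims (my addition), using the example above:

```python
import numpy as np

E = np.array([[1, 0, 0], [0, 1, 0], [1, 0, 1]])
B = np.array([[1, 2, 3], [4, 5, 6], [0, 0, 0]])   # third row zero
print(E @ B)   # third row of EB is [1 2 3], not zero

B2 = np.array([[1, 2, 0], [4, 5, 0], [7, 8, 0]])  # third column zero
print(E @ B2)  # third column of E @ B2 is still zero
```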


2.4 #7. True or False.

(a) If columns 1 and 3 of B are the same, so are columns 1 and 3 of AB.

Solution
Use columnwise multiplication. Let v1, v2, . . . , vn be the columns of B. Then

AB = A [v1 v2 v3 · · · vn] = [Av1 Av2 Av3 · · · Avn].

So, if v1 (the first column of B) equals v3 (the third column of B), then Av1 (the first column of AB) equals Av3 (the third column of AB). □

(b) If rows 1 and 3 of B are the same, so are rows 1 and 3 of AB.

Solution
This is false in general (meaning that it can sometimes be true but usually isn't). For example, here are two matrices A and B such that rows 1 and 3 of B are equal but rows 1 and 3 of AB are different:

[0 1 0; 0 0 1; 1 0 0] [1 2 3; 4 5 6; 1 2 3] = [4 5 6; 1 2 3; 1 2 3].

In this example, A is an example of a permutation matrix, a matrix whose rows are the rows of the identity matrix rearranged in order. In this case, AB is B with the rows rearranged. So, even if the first and third rows of B start out the same, there's no reason why the first and third rows would have to still be equal after we have changed their order around.

On the other hand, we can find other examples where rows 1 and 3 of AB are equal. For example, if A = I, the identity matrix, then AB = IB = B, so if rows 1 and 3 of B are equal, then rows 1 and 3 of AB will be equal. □

(c) If rows 1 and 3 of A are the same, so are rows 1 and 3 of AB.

Solution
True. Use rowwise multiplication. Let u1, u2, . . . , un be the rows of A. Then

AB = [u1; u2; u3; . . . ; un] B = [u1B; u2B; u3B; . . . ; unB].

Hence, if u1 = u3 (rows 1 and 3 of A are equal), then u1B = u3B (rows 1 and 3 of AB are equal). □


(d) (AB)² = A²B².

Solution
False in general. Here's a counterexample: if

A = [0 1; 0 0] and B = [0 0; 1 0]

then

(AB)² = [1 0; 0 0] but A²B² = [0 0; 0 0].

But for other examples it might be true; for example, if A = B then (AB)² = A⁴ = A²B². □
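The counterexample is easy to confirm numerically (my addition):

```python
import numpy as np

A = np.array([[0, 1], [0, 0]])
B = np.array([[0, 0], [1, 0]])
AB = A @ B
print(AB @ AB)             # [[1 0], [0 0]]
print((A @ A) @ (B @ B))   # [[0 0], [0 0]], since A @ A = 0 here
```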

2.4 #14. True or False.

(a) If A² is defined then A is square.

Solution
True. Just check the sizes: you can't multiply an m × n matrix times another m × n matrix unless m = n. □

(b) If AB and BA are defined then A and B are square.

Solution
Not necessarily. If A is m × n and B is j × k then the only requirement we need to have in order that both AB and BA exist is that n = j and m = k. For example, if A is 3 × 4 and B is 4 × 3, then AB is defined and is a 3 × 3 matrix, and BA is defined and is a 4 × 4 matrix, but neither A nor B is square. □

(c) If AB and BA are defined then AB and BA are square.

Solution
From part (b), A must be m × n and B must be n × m. Then AB is m × m and BA is n × n, so both are square. □

(d) If AB = B then A = I.

Solution
IMPORTANT: This is false in general (meaning it can sometimes be true but it can also be false; in fact, it usually is false). For example, if A = B = [1 0; 0 0] then A² = A but A ≠ I!

The problem is that you CANNOT just "cancel" matrices! You CAN multiply both sides by an inverse, IF one exists! IF you knew that B had an inverse THEN you could multiply both sides of AB = B on the right by B^{-1} to get

A = ABB^{-1} = BB^{-1} = I,

but you have to know that B^{-1} exists before you can do that! □
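Here is the failure of cancellation in action (my addition): AB = B holds while A ≠ I, because this B has no inverse:

```python
import numpy as np

A = np.array([[1, 0], [0, 0]])
B = A                                  # B is singular
print(np.array_equal(A @ B, B))        # True:  AB = B
print(np.array_equal(A, np.eye(2)))    # False: A is not I
```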


2.5 #7. (Important) If the 3 × 3 matrix A has row 1 + row 2 = row 3, show that A is not invertible:

(a) Explain why Ax = (1, 0, 0) cannot have a solution.

Solution
Let u1, u2, u1 + u2 be the rows of A, i.e., A = [u1; u2; u1 + u2]. Then

Ax = [u1x; u2x; (u1 + u2)x].

(Why?) Therefore, row 3 of the vector Ax is always the sum of rows 1 and 2 of Ax. Since this isn't true for the vector (1, 0, 0), it's not possible for Ax to ever equal (1, 0, 0). □

(b) Which right sides (b1, b2, b3) might allow solutions to Ax = b?

Solution
We at least require b3 = b1 + b2. But this may or may not be enough, depending on what those first two rows of A look like. □

(c) What happens to row 3 in elimination?

Solution
Two steps in the elimination are sure to be: 1. Subtract row 1 from row 3; 2. Subtract row 2 from row 3. The result is that row 3 becomes a row of zeros. □

2.5 #10. (a) Find invertible matrices A and B such that A + B is not invertible.

Solution
Let A be any invertible matrix. Then B = −A is also invertible. But A + B = 0, the zero matrix, which is not invertible. □

(b) Find singular matrices A and B such that A + B is invertible.

Solution
Try A = [1 0; 0 0] and B = [0 0; 0 1]. □


2.5 #14. Prove that a matrix with a column of zeros cannot have an inverse.

Solution
Here's one proof; there are many others. Suppose that column j of A is all zeros. Then Aej = 0, where ej is the jth standard basis vector, the vector that has a 1 in row j and 0's elsewhere. This is because Aej is just the jth column of A (why?). But we also have A0 = 0, i.e., A times the zero vector is the zero vector. Thus, the problem Ax = 0 doesn't have a unique solution, because both x = ej and x = 0 are solutions. Hence A can't be invertible, because for invertible matrices the problem Ax = b has a unique solution for every b (namely, x = A^{-1}b). □

2.6 #14. If A = LDU and A = L1D1U1 with all factors invertible, then L = L1, D = D1, and U = U1. "The factors are unique."

NOTE: The wording of this problem is ambiguous: just answering parts (a) and (b) shows why the factors are unique. There is no other explanation required besides answering parts (a) and (b). But do be careful to write out a valid explanation for each of those parts. In particular, you are NOT given that L = L1, D = D1, U = U1. That is the whole point of this problem: you are going to SHOW that L = L1, D = D1, U = U1, just from knowing that A = LDU = L1D1U1 with all factors invertible.

(a) Derive the equation L1^{-1}LD = D1U1U^{-1}. Are the two sides upper or lower triangular?

Solution
Start from LDU = L1D1U1. Multiply both sides on the left by L1^{-1} (remember, order is important!), to get L1^{-1}LDU = D1U1. Then multiply both sides on the right by U^{-1} to get

L1^{-1}LD = D1U1U^{-1}.

Recall that L and L1 are lower triangular. The inverse of a lower triangular matrix is lower triangular, so L1^{-1} is lower triangular. The product of lower triangular matrices is still lower triangular, so L1^{-1}LD is lower triangular (D is diagonal, which is both lower and upper triangular). Similarly, D1U1U^{-1} is upper triangular. □

(b) Show that the main diagonals give D = D1. Why does L = L1?

Solution
We've shown that the lower triangular matrix L1^{-1}LD is equal to the upper triangular matrix D1U1U^{-1}. This can only happen if both matrices are actually diagonal, with exactly the same diagonal entries. Now, what are the diagonal entries? Both L and L1^{-1} are lower triangular with 1's on the diagonal. So when you multiply L1^{-1} times L times D you just get the entries of D on the diagonal. Similarly, D1U1U^{-1} has the entries of D1 on the diagonal. Therefore, these entries are equal, i.e., D = D1.

Now that we've got D = D1, we can work on L and L1. Our equation has become L1^{-1}LD = DU1U^{-1}. Multiplying on the right by D^{-1}, we get L1^{-1}L = DU1U^{-1}D^{-1}. The left hand side is lower triangular and the right hand side is upper triangular. Therefore, both sides must actually be diagonal: L1^{-1}L is a diagonal matrix. In addition, the diagonal entries of L1^{-1}L are all ones, because the diagonal entries of L1^{-1} and L are all ones. This says that L1^{-1}L = I, so L1 = L. A similar argument gives U1 = U. □

2.7 #4. Show that A² = 0 is possible but A^TA = 0 is not possible (unless A is the zero matrix).

NOTE: When you show that A^TA = 0 is not possible, you cannot assume that A is 2 × 2 (or any other specific size). This is a general problem about arbitrary matrices.

Solution
To show that A² = 0 is possible, we just have to give one example: try A = [0 1; 0 0].

However, to show that A^TA = 0 is never possible, we need more than just an example, because A could be anything at all. In particular, A could be rectangular.

Here's one approach. Let v1, . . . , vn be the columns of A. Then v1^T, . . . , vn^T are the rows of A^T. Now multiply these together, and just pay attention to what's on the diagonal of the product:

A^TA = [v1^T; . . . ; vn^T][v1 · · · vn], whose diagonal entries are v1^Tv1, . . . , vn^Tvn.

So, if A^TA = 0 then we must have v1^Tv1 = · · · = vn^Tvn = 0. But for any vector u = (u1, . . . , un), the dot product is related to its length by the formula u^Tu = ‖u‖² = u1² + · · · + un². Therefore, we have that ‖v1‖² = · · · = ‖vn‖² = 0, i.e., all these vectors have length zero. But the only vector with length zero is the zero vector, so v1 = · · · = vn = 0. Thus, every column v1, . . . , vn of A is zero, so A is the zero matrix.

Here's a second approach. Suppose we did have A^TA = 0. Multiply on the right by x and on the left by x^T to get

(Ax)^T(Ax) = x^TA^TAx = 0.

Again, the length of a vector is related to the dot product by ‖u‖² = u^Tu. Therefore, the equation above says that ‖Ax‖² = 0, i.e., Ax has zero length. But then Ax = 0 for every x. Since Ax is a linear combination of the columns of A, this says that every linear combination of the columns of A is zero. But then each column of A must be zero: just use x = (0, . . . , 0, 1, 0, . . . , 0) to show that any particular column of A is zero. Therefore A must be the zero matrix. □
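The key fact, that the diagonal of A^TA lists the squared column lengths, is easy to see numerically (my addition):

```python
import numpy as np

A = np.array([[1.0, 0.0], [2.0, -1.0], [0.0, 3.0]])  # any rectangular A
G = A.T @ A
print(np.diag(G))               # [ 5. 10.]
print(np.sum(A**2, axis=0))     # the same numbers: squared column lengths
```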


2.7 #12. (a) Explain why the dot product of x and y equals the dot product of Px and Py.

Solution
Note that P in this problem is a permutation matrix, not just any matrix.

First, note that x · y = x^Ty and Px · Py = (Px)^T(Py) = x^TP^TPy. So, it would certainly be nice if P^TP = I, because then we would surely have x · y = Px · Py. Now, why would P^TP = I? Just recall that the columns of P are the standard basis vectors in some order. That is, if v1, . . . , vn are the actual columns of P, then these vectors are just the standard basis vectors e1, . . . , en in some rearranged order. Then if we write out P^TP, we get

P^TP = [v1^T; . . . ; vn^T][v1 · · · vn] = [v1^Tv1 · · · v1^Tvn; . . . ; vn^Tv1 · · · vn^Tvn] = [1 · · · 0; . . . ; 0 · · · 1] = I.

The dot products come out to be 1's and 0's because the vectors v1, . . . , vn are orthonormal vectors. □

(b) With x = (1, 2, 3) and y = (1, 1, 2), show that Px · y isn’t always equal to x · Py.

Solution
Try it using a permutation matrix P that is not symmetric, like P = [0 0 1; 1 0 0; 0 1 0]. □
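Carrying out the suggested computation (my addition):

```python
import numpy as np

P = np.array([[0, 0, 1], [1, 0, 0], [0, 1, 0]])  # not symmetric
x = np.array([1, 2, 3])
y = np.array([1, 1, 2])
print((P @ x) @ y, x @ (P @ y))    # 8 7  -- not equal
print((P @ x) @ (P @ y), x @ y)    # 9 9  -- part (a) holds
```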


CHAPTER 3

Vector Spaces and Subspaces

3.1 #5. (a) Describe a subspace of M that contains A = [1 0; 0 0] but not B = [0 0; 0 −1].

Solution
Simply take the "line" through the vector A in the space M, i.e., all multiples of A. In other words, the following subspace contains A but not B:

S = {cA : c ∈ R} = {[c 0; 0 0] : c ∈ R}. □

(b) If a subspace of M contains A and B, must it contain I?

Solution
The A and B mentioned here are the two matrices given in part (a). For those two particular choices of A and B, it is true that any subspace that contains both A and B must also contain I. To see this, suppose that S is some subspace of M that contains both A and B. Then S must be closed under vector addition and under scalar multiplication. So, S must contain A + (−1)B = [1 0; 0 1] = I.

On the other hand, if A and B were some other matrices, it might not follow that S would have to contain I. For example, if A = [1 0; 0 0] and B = [0 1; 0 0], then the subspace

S = {[a b; 0 0] : a, b ∈ R}

contains both A and B, but does not contain I. □

(c) Describe a subspace of M that contains no nonzero diagonal matrices.

Solution

Let S = {[a a; a a] : a ∈ R}, i.e., all multiples of the vector A = [1 1; 1 1]. The only diagonal matrix in S is the zero matrix. □


3.1 #8. If the sum of the "vectors" f(x) and g(x) is defined to be the function f(g(x)), then the "zero vector" is g(x) = x. Keep the usual scalar multiplication cf(x) and find two rules that are broken.

Solution
We're saying that f + g is defined to be the composition f(g(x)). If this is confusing because you think + should always mean addition, try just thinking of these as abstract objects with an abstract rule for combining, or 'adding,' them. If you like, you could substitute a different symbol for 'plus.' For example, you could use the regular symbol for compositions, f ◦ g, instead of f + g. Using that notation, I'll show three rules that fail.

First, commutativity fails: f ◦ g ≠ g ◦ f in general. To see this, just take almost any two functions you like, e.g., f(x) = x + 1 and g(x) = sin x, and compute the two compositions.

Second, there need not be an inverse element. For example, if g(x) = 1 for all x, then there's no f such that f(g(x)) = x for all x. (We want it to equal x on the right-hand side of the equality because that's the zero vector in this space.)

Third, one of the distributive laws fails: c(f ◦ g) ≠ cf ◦ cg in general. This is simply saying that c f(g(x)) ≠ c f(c g(x)) for some functions f and g. Almost any example will work. For example, consider f(x) = sin x and g(x) = x: it's NOT true that c sin x = c sin(cx). □

3.1 #9. One requirement can be met while the other fails. Show this with

(a) A set of vectors in R² so that x + y stays in the set but (1/2)x may be outside.

Solution
Try S = {[m; n] : m, n are integers}. □

(b) A set of vectors in R² (other than two quarter-planes) for which every cx stays in the set but x + y may be outside.

Solution
Let S be the coordinate axes:

S = {[x1; x2] : x1 = 0 or x2 = 0}.

Then e1 = [1; 0] and e2 = [0; 1] are both in S, but their sum e1 + e2 = [1; 1] isn't in S. □


3.1 #18. True or false.

(a) The symmetric matrices in M (with A^T = A) form a subspace.

Solution
True. The subspace is S = {A ∈ M : A^T = A}. Suppose that A and B are both in S. This means that they are both symmetric, i.e., A^T = A and B^T = B. Then A + B is also symmetric, since (A + B)^T = A^T + B^T = A + B. Hence A + B ∈ S, so S is closed under vector addition. Similarly, (cA)^T = cA^T = cA, so cA ∈ S for every scalar c. Thus S is closed under scalar multiplication as well. Hence S is a subspace of M. □

(b) The skew-symmetric matrices in M (with A^T = −A) form a subspace.

Solution
True. The subspace is S = {A ∈ M : A^T = −A}. Suppose that A and B are both in S. Then A^T = −A and B^T = −B, so (A + B)^T = A^T + B^T = (−A) + (−B) = −(A + B). Hence A + B ∈ S, so S is closed under vector addition. Similarly, (cA)^T = cA^T = c(−A) = −(cA) for every scalar c, so S is closed under scalar multiplication. Hence S is a subspace of M. □

(c) The unsymmetric matrices in M (with A^T ≠ A) form a subspace.

Solution
False. For one thing, the zero matrix (the matrix with all zero entries) is symmetric, so the subset S = {A ∈ M : A^T ≠ A} does not contain the zero vector and therefore cannot be a subspace. Alternatively, if A is an unsymmetric matrix then so is −A, so both A and −A are in S. Yet their sum is A + (−A) = 0, which is not in S, so S is not closed under vector addition and therefore cannot be a subspace. □

3.1 #27. True or False.

(a) The vectors b that are not in the column space C(A) form a subspace.

Solution
False. The zero vector is in C(A), so the set of vectors not in C(A) doesn't contain the zero vector and therefore can't be a subspace.

NOTE: A subspace must contain the zero vector, but that's not enough by itself to guarantee that a given subset is a subspace. If you wanted to show that something was a subspace you would have to show that it is closed under both vector addition and scalar multiplication. It is a consequence of those two properties that the zero vector must be in the subspace. □

(b) If C(A) contains only the zero vector then A is the zero matrix.

Solution
True. But can you give a careful proof? Note that each individual column of A is a vector in the column space of A! To see this, let v1, . . . , vn be the columns of A. Then the jth column vj is a linear combination of all the columns, namely

vj = 0·v1 + · · · + 0·vj−1 + 1·vj + 0·vj+1 + · · · + 0·vn,

and therefore vj ∈ C(A). Another way to see this is to note that vj = Aej where ej = (0, . . . , 0, 1, 0, . . . , 0) is the jth standard basis vector. Therefore, since each column is in the column space and we're told that the only thing in this matrix's column space is the zero vector, each individual column vj of A must be zero. Therefore A is the zero matrix. □

(c) The column space of 2A equals the column space of A.

Solution
True. Ax = b is solvable exactly when (2A)x = b is solvable. The solutions themselves are different, but the set of b's for which there is a solution is exactly the same. □

(d) The column space of A − I equals the column space of A.

Solution
False in general. For example, if A = I then A − I = 0, so in this case C(A) = C(I) = Rⁿ but C(A − I) = {0}. On the other hand, for some matrices A it can be true. For example, if A = −I then C(A) = C(−I) = Rⁿ and C(A − I) = C(−2I) = Rⁿ. □

3.2 #9. True or False.

(a) A square matrix has no free variables.

Solution
False; consider A = [1 0; 0 0]. A square matrix may or may not have free variables. □

(b) An invertible matrix has no free variables.

Solution
True. If there is a free variable then there is a nonzero solution to Ax = 0. But if A is invertible then x = 0 is the only solution to Ax = 0. □

(c) An m × n matrix has no more than n pivot variables.

Solution
True. Look at the echelon form U: there can't be any more pivots than there are columns. □


(d) An m × n matrix has no more than m pivot variables.

Solution
True. From the echelon form U there can't be any more pivots than there are rows. □

3.2 #19. If x is in the nullspace of B, prove that x is in the nullspace of AB. This means: If Bx = 0 then ___. Give an example in which these nullspaces are different.

Solution
Suppose that x ∈ N(B). This means that Bx = 0. Therefore (AB)x = A(Bx) = A0 = 0. Hence x ∈ N(AB) by definition. To fill in the blank: If Bx = 0 then ABx = 0.

This does NOT imply that N(AB) = N(B)! All that we have done is to show that every element of N(B) is also in N(AB). This means that N(B) is contained in N(AB), i.e., it is some part of N(AB). It might be the case that N(B) is equal to all of N(AB), or it might be the case that N(B) is a strictly smaller set than N(AB). You can't tell from what we have done above. In fact, here is an example where the two nullspaces N(B) and N(AB) are different: take

A = [1 0; 0 0] and B = [0 0; 1 0].

Since Bx = [0 0; 1 0][x1; x2] = [0; x1], we have Bx = 0 when x1 = 0, i.e., the nullspace of B is

N(B) = {x : Bx = 0} = {[0; x2] : x2 ∈ R}.

However, AB = [0 0; 0 0] is the zero matrix, so every vector satisfies ABx = 0, i.e.,

N(AB) = R².

This problem is continued in Problem 3.2 #20, which gives a condition on A under which it is true that N(AB) = N(B). □
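SymPy's exact nullspace computation confirms the example (my addition):

```python
from sympy import Matrix

A = Matrix([[1, 0], [0, 0]])
B = Matrix([[0, 0], [1, 0]])
print(B.nullspace())        # one basis vector (0, 1): a line
print((A * B).nullspace())  # two basis vectors: N(AB) is all of R^2
```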

3.2 #20. If A is invertible then N(AB) = N(B).

Solution
We've already seen in Problem 3.2 #19 that N(B) is PART of N(AB) no matter what A and B are. Now we're supposed to show that if A is invertible then N(B) is in fact ALL of N(AB), i.e., every vector in N(AB) is really a vector in N(B). So, suppose that x is some vector in N(AB). This means that ABx = 0. But A is invertible, so we can multiply both sides of that equation by A^{-1} to get Bx = A^{-1}ABx = A^{-1}0 = 0. Thus x ∈ N(B). Hence every vector in N(AB) is also a vector in N(B), so N(B) really is all of N(AB). □


3.2 #21. Construct a matrix whose nullspace consists of all combinations of (2, 2, 1, 0) and (3, 1, 0, 1).

Solution
You are being asked to construct a matrix whose nullspace is exactly EQUAL to the set of all linear combinations of those two vectors, not just a matrix whose nullspace CONTAINS all linear combinations of those two vectors. For example, the nullspace of the zero matrix includes ALL vectors, so it would include all combinations of (2, 2, 1, 0) and (3, 1, 0, 1), but it would not CONSIST of (equal) the set of combinations of (2, 2, 1, 0) and (3, 1, 0, 1).

There are many ways to approach this problem. I'll give two solutions. The direct, brute-force method is simply to realize that we are looking for a matrix A and that the entries of this matrix are the unknowns in this problem. In order for the two specified vectors v1 = (2, 2, 1, 0) and v2 = (3, 1, 0, 1) to be in the nullspace, A must have four columns, otherwise the multiplication of A times v1 or v2 won't make sense. How many rows do we need? Since we have four columns and we want two independent vectors in the nullspace, we need our matrix to have two free variables, and hence two nonzero pivots. The simplest way to get just two pivots would be to have only two rows. So let's look for a 2 × 4 matrix A.

The entries of A are the unknowns; let's label these

A = [a1 a2 a3 a4; b1 b2 b3 b4].

We must have Av1 = 0 and Av2 = 0. This gives the following system of equations:

2a1 + 2a2 + a3 = 0,
3a1 + a2 + a4 = 0,
2b1 + 2b2 + b3 = 0,
3b1 + b2 + b4 = 0.

There are four equations and eight unknowns, so there are infinitely many solutions, which you can find by Gaussian elimination. One solution is

A = [1/4 −3/4 1 0; −1/2 1/2 0 1].

Here is another way to solve this problem. As above, let

A = [a1 a2 a3 a4; b1 b2 b3 b4].

The fact that v1 = (2, 2, 1, 0) and v2 = (3, 1, 0, 1) must be in N(A) means that if we let B be the matrix with v1, v2 as columns, then

AB = [a1 a2 a3 a4; b1 b2 b3 b4] [2 3; 2 1; 1 0; 0 1] = [0 0; 0 0].


Transpose both sides to get

B^TA^T = [2 2 1 0; 3 1 0 1] [a1 b1; a2 b2; a3 b3; a4 b4] = [0 0; 0 0].

Hence the two vectors (a1, a2, a3, a4) and (b1, b2, b3, b4) have to be in the nullspace of B^T! Thus, all we have to do is find two vectors that span the nullspace of B^T. For this, apply Gaussian elimination:

B^T = [2 2 1 0; 3 1 0 1] → [2 2 1 0; 0 −2 −3/2 1].

Therefore x3 and x4 are free variables, and back-substitution leads to the general solution

x = [(1/4)x3 − (1/2)x4; −(3/4)x3 + (1/2)x4; x3; x4] = x3 [1/4; −3/4; 1; 0] + x4 [−1/2; 1/2; 0; 1].

Hence, this method again yields the matrix

A = [1/4 −3/4 1 0; −1/2 1/2 0 1].

However, there are infinitely many other matrices that will also work. □
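A check with exact fractions (my addition) that this A kills both required vectors:

```python
from fractions import Fraction as F

A = [[F(1, 4), F(-3, 4), F(1), F(0)],
     [F(-1, 2), F(1, 2), F(0), F(1)]]
for v in [(2, 2, 1, 0), (3, 1, 0, 1)]:
    print([str(sum(row[j] * v[j] for j in range(4))) for row in A])  # ['0', '0']
```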

3.2 #26. Construct a 2 × 2 matrix whose nullspace equals its column space.

Solution
Try A = [1 1; −1 −1]. The nullspace and column space are both equal to the line through the vector [1; −1]. Another example is A = [0 0; 1 0]. □

3.2 #27. Why does no 3 × 3 matrix have a nullspace that equals its column space?

Solution
Because the dimensions are wrong. The dimension of the nullspace is the number of free variables. The dimension of the column space is the number of pivot variables. Together, these add to n, the number of columns. So if C(A) = N(A), then the nullspace and the column space have the same dimension, and therefore n must be an even number. Since n = 3 is odd, this is impossible. □

3.3 #14. (Important) Suppose A and B are n × n matrices, and AB = I. Prove from rank(AB) ≤ rank(A) that the rank of A is n. So A is invertible and B must be its two-sided inverse (Section 2.5). Therefore BA = I (which is not so obvious!).

Remarks on rank. Before I solve the problem, let me explain why rank(AB) ≤ rank(A) is always true. This is a useful fact worth remembering, but understanding the explanation behind it is even more useful.

In order to prove that rank(AB) ≤ rank(A), I'm not going to try to compute the number of pivots in AB or A. Instead, I'm going to show that the column space of AB is contained in the column space of A, i.e.,

C(AB) ⊂ C(A). (1)

This doesn't mean that the two column spaces are equal, just that A's column space includes all of AB's column space and possibly more. Since C(A) is at least as big as C(AB), its dimension therefore has to be at least as large. But rank equals the dimension of the column space, so from equation (1) we get

rank(AB) = dim(C(AB)) ≤ dim(C(A)) = rank(A).

Thus, all we have to do is show why equation (1) is true.

Here's one explanation. Let v1, v2, . . . , vn be the columns of B. Then when computing AB, the matrix A just multiplies each column of B:

AB = A [v1 v2 · · · vn] = [Av1 Av2 · · · Avn].

Thus each column of the product AB has the form Avi for some vector vi. But Avi is just a linear combination of the columns of A! Hence each column Avi of the product AB is a linear combination of the columns of A, and therefore is a vector in the column space of A. That is, each column of AB is in C(A). Since C(A) contains each column of AB, it then contains the span of those columns, which is C(AB).

Here's a second explanation. The column space is the set of all vectors that you end up with after multiplying by your matrix. Suppose A is an m × n matrix and B is an n × k matrix. Then

C(A) = {Ax : x ∈ Rⁿ}

and

C(AB) = {ABy : y ∈ Rᵏ} = {A(By) : y ∈ Rᵏ}.

Now, every vector By is one vector in Rⁿ. Thus, A times a vector By is one possible A times x, with x = By. So, all the possible vectors ABy are just some of the possible Ax's, namely, the ones that come from x = By for some y. Hence C(A), which contains ALL POSSIBLE Ax's, includes the vectors A(By) in particular. Thus C(A) contains all of C(AB) and possibly more.

Note that this actually tells us more: it tells us when C(AB) and C(A) will be equal. They'll be equal if every vector x ∈ Rⁿ equals By for some y ∈ Rᵏ. In other words, this will happen if x = By is solvable for every x. But this means that every vector x must be in B's column space, or in other words, that C(B) = Rⁿ. That is, while we always have that C(AB) is a part of C(A), they will only be equal if B's column space is all of Rⁿ, i.e.,

C(AB) = C(A) ⟺ C(B) = Rⁿ.

Solution of the problem
Now back to the problem at hand. If A and B are square matrices such that AB = I, then we must have rank(AB) = rank(I) = n. But on the other hand, rank(AB) ≤ rank(A), so this tells us that rank(A) ≥ n. But A can't have rank any bigger than n since it's an n × n matrix. Hence rank(A) = n. Thus A has n nonzero pivots, and therefore is an invertible matrix. By definition, invertible means that it has a two-sided inverse A^{-1}, i.e., there's a matrix A^{-1} such that AA^{-1} = I = A^{-1}A. This matrix must equal B because if we multiply both sides of AB = I by A^{-1} we get

B = A^{-1}AB = A^{-1}I = A^{-1}.

Hence, just from knowing that A is square and there is a matrix B such that AB = I, we've shown that B = A^{-1} and therefore that BA = I as well. □
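An empirical illustration of rank(AB) ≤ rank(A) (my addition); NumPy's matrix_rank computes the rank numerically via the SVD:

```python
import numpy as np

rng = np.random.default_rng(0)
for _ in range(5):
    A = rng.integers(-2, 3, size=(4, 3))
    B = rng.integers(-2, 3, size=(3, 5))
    assert np.linalg.matrix_rank(A @ B) <= np.linalg.matrix_rank(A)
print("rank(AB) <= rank(A) held in every trial")
```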

3.4 #10. Explain why all are false.

(a) The complete solution is any linear combination of xp and xn.

Solution
A linear combination of xp and xn has the form cxp + dxn for some scalars c and d. Then A(cxp + dxn) = cAxp + dAxn = cb + d0 = cb. So to get a solution to Ax = b you must take c = 1. That means that certain linear combinations of xp and xn are solutions, but not all combinations are solutions. □

(b) A system Ax = b has at most one particular solution.

Solution
Any solution to Ax = b is a particular solution. So if there are infinitely many solutions then you can choose any one of them as a particular solution and then get all the others by adding anything in the nullspace to that particular solution. But it doesn't matter which particular solution you choose to start from: adding all the vectors in the nullspace to that particular solution will give you all the solutions to Ax = b, no matter which one you start from. □


(c) The solution xp with all free variables zero is the shortest solution (minimum length ‖x‖). Find a 2 by 2 counterexample.

Solution
Here, "shortest" is referring to length: you want the solution xp with the smallest length ‖xp‖. For example, consider A = [1 1; 0 0] and b = [1; 0]. Writing x = [x1; x2], there is one free variable, x2. The general solution to Ax = b for this A and b is

x = [1; 0] + x2 [−1; 1].

This is a line in R², and if you draw a picture of that line, you will see that x2 = 0 does not correspond to the point on the line that is closest to the origin. In fact, when x2 = 0 we have ‖x‖ = 1, but when x2 = 1/2 we have ‖x‖ = 1/√2 ≈ 0.707. □
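Scanning along the solution line confirms this (my addition):

```python
import numpy as np

for x2 in (0.0, 0.25, 0.5, 0.75, 1.0):
    x = np.array([1.0, 0.0]) + x2 * np.array([-1.0, 1.0])
    print(x2, np.linalg.norm(x))   # minimum length occurs at x2 = 0.5
```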

(d) If A is invertible there is no homogeneous solution xn.

Solution
There always is a solution to the homogeneous equation Ax = 0, namely x = 0. So there always is at least one homogeneous solution xn. If A is invertible then there aren't any other solutions; if A is not invertible then there are infinitely many other solutions. □

3.4 #22. If A is an m by n matrix with r = 1, its columns are multiples of one column and its rows are multiples of one row. The column space is a ___ in Rᵐ. The nullspace is a ___ in Rⁿ. Also the column space of A^T is a ___ in Rⁿ.

Solution
The answers are: line, (n − 1)-dimensional subspace, line. Let's try to see why.

First, why are the columns of A all multiples of each other? Look at the echelon form U: there is only one pivot, so there is only one nonzero row. That means that U looks like this:

U = [c1 c2 · · · cn; 0 0 · · · 0; . . . ; 0 0 · · · 0].

If c1 ≠ 0 then each column of U is a multiple of the first column. If c1 = 0 but c2 ≠ 0 then each column of U is a multiple of the second column, etc. In any case, each column of U is a multiple of any one of the nonzero columns of U. Since A and U have exactly the same dependency relationships among their columns, the columns of A must also be multiples of one column. The column space is what you get by forming linear combinations of the columns, so this will simply be the line through the one independent column of A. Because A and A^T have the same rank, the column space of A^T must also be spanned by a single vector, i.e., it must also be a line. And since the dimensions of the column space and nullspace add to n, the nullspace must be an (n − 1)-dimensional subspace of Rⁿ. □


3.5 #7. If w1, w2, w3 are independent vectors, show that the differences v1 = w2 − w3 and v2 = w1 − w3 and v3 = w1 − w2 are dependent.

Solution
Here is a linear combination of v1, v2, and v3 that equals the zero vector:

v1 − v2 + v3 = (w2 − w3) − (w1 − w3) + (w1 − w2) = 0.

Hence v1, v2, v3 are dependent. □

3.5 #17. Find a basis for each of these subspaces of R4.

(a) All vectors whose components are equal.

Solution
The subspace here is

S = {(a, a, a, a) : a ∈ R}.

Each vector in S is a multiple of the vector (1, 1, 1, 1). Hence the vector (1, 1, 1, 1) spans the space. A single nonzero vector alone is an independent set, so {(1, 1, 1, 1)} forms a basis for S. □

(b) All vectors whose components add to zero.

Solution
The subspace is

S = {(a, b, c, d) : a + b + c + d = 0}
  = {(a, b, c, −a − b − c) : a, b, c ∈ R}
  = {a(1, 0, 0, −1) + b(0, 1, 0, −1) + c(0, 0, 1, −1) : a, b, c ∈ R}
  = span{(1, 0, 0, −1), (0, 1, 0, −1), (0, 0, 1, −1)}.

Hence S is spanned by three vectors. If those three vectors are independent, then they are a basis for S. So let's check to see if they are independent.


Here is a direct proof that they are independent. We have to show that

a(1, 0, 0, −1) + b(0, 1, 0, −1) + c(0, 0, 1, −1) = (0, 0, 0, 0) (1)

is only possible when a = b = c = 0. But if equation (1) is true, then

(a, b, c, −a − b − c) = (0, 0, 0, 0),

which can only happen if a = b = c = 0. Therefore these three vectors are indeed independent. □

(c) All vectors that are perpendicular to (1, 1, 0, 0) and (1, 0, 1, 1).

Solution
Let u = (1, 1, 0, 0) and v = (1, 0, 1, 1). Then a vector x ∈ R⁴ is perpendicular to both u and v if u · x = v · x = 0, or, in other words, if

x1 + x2 = 0 (from u · x = 0),
x1 + x3 + x4 = 0 (from v · x = 0).

We're looking for the set of all solutions to this system of equations. The matrix form of this system is

[1 1 0 0; 1 0 1 1] [x1; x2; x3; x4] = [0; 0].

Thus we're simply looking for the nullspace of the matrix

A = [1 1 0 0; 1 0 1 1].

We know how to find a basis for a nullspace: apply Gaussian elimination. The echelon matrix is

U = [1 1 0 0; 0 −1 1 1].

There are two free variables, namely x3 and x4. By back-substitution, the general solution to Ax = 0 is

x = [−x3 − x4; x3 + x4; x3; x4] = x3 [−1; 1; 1; 0] + x4 [−1; 1; 0; 1].

Therefore

{(−1, 1, 1, 0), (−1, 1, 0, 1)}

forms a basis for S = N(A). □

(d) The column space and nullspace of U = [1 0 1 0 1; 0 1 0 1 0].

Solution
The matrix U is already in echelon form. The first two columns contain pivots and therefore will span the column space of U. But these two vectors are the two standard basis vectors [1; 0] and [0; 1]. Together they span all of R², so the column space of U is all of R²:

C(U) = R².

Note that since the column space of U is all of R², ANY two independent vectors in R² will form a basis for C(U).

For the nullspace, since the matrix is already in echelon form, we just apply back-substitution to find the general solution to Ux = 0. The variables x3, x4, and x5 are free. The bottom row of U gives the equation x2 = −x4, and the top equation gives x1 = −x3 − x5. Hence the general solution to Ux = 0 is

x = [−x3 − x5; −x4; x3; x4; x5] = x3 [−1; 0; 1; 0; 0] + x4 [0; −1; 0; 1; 0] + x5 [−1; 0; 0; 0; 1]

and the nullspace is

N(U) = span{(−1, 0, 1, 0, 0), (0, −1, 0, 1, 0), (−1, 0, 0, 0, 1)}.


3.5 #23. Suppose S is a 5-dimensional subspace of R⁶. True or False. NOTE: Give a specific example when false, and give a proof if true.

NOTE: This problem is a lot easier, and you get the same insights, if you let S be a 2-dimensional subspace of R³ instead. Try it. Doing examples and simple cases first often helps you see how to attack a more complicated problem.

(a) Every basis for S can be extended to a basis for R⁶ by adding one more vector.

Solution
True. Suppose you have five vectors that form a basis for S, say v1, v2, v3, v4, v5. We want to find another vector to add to these five so that we get a basis for the whole space. Well, just choose any vector v that's not in S. Why then are the six vectors v1, v2, v3, v4, v5, v independent? Suppose that they weren't. Then there would be a linear combination that is zero, say:

c1 v1 + c2 v2 + c3 v3 + c4 v4 + c5 v5 + c v = 0. (1)

We want to show that all the coefficients must be zero. We do this in two steps. Suppose first that c ≠ 0. Then

v = −(1/c)(c1 v1 + c2 v2 + c3 v3 + c4 v4 + c5 v5).

But then since S is a subspace and v1, v2, v3, v4, v5 form a basis for S, this implies that v ∈ S! But we know that v ∉ S, so this is impossible. Therefore the case c ≠ 0 is impossible.

So, we can move on to the case c = 0. But if this is true, then equation (1) says that c1 v1 + c2 v2 + c3 v3 + c4 v4 + c5 v5 = 0, i.e., a linear combination of v1, v2, v3, v4, v5 is zero. But these vectors are independent, so this is only possible when c1 = c2 = c3 = c4 = c5 = 0.

This shows that the six vectors v1, v2, v3, v4, v5, v are independent. And since R⁶ is six-dimensional, they must form a basis for R⁶. □

(b) Every basis for R⁶ can be reduced to a basis for S by removing one vector.

Solution
False. It depends on whether the basis vectors you chose for R⁶ happen to be in S or not! The simpler case of S being a plane in R³ is much easier to think about. To get a basis for R³ you just have to have 3 independent vectors in space. If none of those 3 vectors happens to lie in S, then there is no way that you can use any of those vectors to get a basis for S. You can only make a basis for S out of vectors that actually are in S. □


3.5 #34. The cosine space F3 consists of all combinations y(x) = A cos x + B cos 2x + C cos 3x. Find a basis for the subspace with y(0) = 0.

Solution
Since y(0) = A + B + C = 0, the subspace we are dealing with is

S = {A cos x + B cos 2x + C cos 3x : A, B, C ∈ R and A + B + C = 0}
  = {A cos x + B cos 2x − (A + B) cos 3x : A, B ∈ R}.

There are only "two degrees of freedom" in the choice of vectors in S, so we expect that S is two-dimensional. Therefore we look for two vectors IN S that are independent and span S. We simply decide to take two "independent" choices of A and B, say A = 1, B = 0 for one choice, and A = 0, B = 1 for the other. Of course, we must then PROVE that this leads to a basis for S.

For these choices of A, B, we obtain two functions y1 and y2 defined by

y1(x) = cos x − cos 3x and y2(x) = cos 2x − cos 3x.

We hope that {y1, y2} will form a basis for S. Now, y1 and y2 do span S because if we choose an arbitrary function A cos x + B cos 2x − (A + B) cos 3x in S then

A cos x + B cos 2x − (A + B) cos 3x = A(cos x − cos 3x) + B(cos 2x − cos 3x) = Ay1(x) + By2(x).

To see that y1 and y2 are independent, suppose that some linear combination satisfied Ay1 + By2 = 0, where this 0 means the zero function. That is, suppose that Ay1(x) + By2(x) = 0 for every x. Then for x = π/2 in particular, we have

0 = Ay1(π/2) + By2(π/2) = A cos(π/2) − A cos(3π/2) + B cos π − B cos(3π/2) = −B.

Hence B = 0. Considering next x = π/4 we must have

0 = Ay1(π/4) = A cos(π/4) − A cos(3π/4) = A(√2/2) + A(√2/2) = A√2.

Hence A = 0 as well, so y1 and y2 are independent and span S. Therefore {y1, y2} forms a basis for S. There are other bases as well, of course, but in any case you must show that your choice of vectors from S both spans S and is independent. □


3.5 #37. (a) Find a basis for the space of polynomials p(x) of degree 3 or less.

Solution
The vector space is

V = {p(x) = a + bx + cx² + dx³ : a, b, c, d ∈ R}.

One basis is {1, x, x², x³}. To show that it is a basis you must show that it spans and that the vectors are independent. Spanning is easy: by definition any p(x) ∈ V is a linear combination of 1, x, x², and x³. To see that these vectors are independent, show that the only linear combination that is the zero vector is the one with all scalars zero. That is, suppose some linear combination a + bx + cx² + dx³ = 0. Here "0" means the zero function: the function which is zero for every x. That means that the equation a + bx + cx² + dx³ = 0 holds for every x. However, the Fundamental Theorem of Algebra says that a nontrivial polynomial of degree n can have at most n roots. But our polynomial a + bx + cx² + dx³ has infinitely many roots, so it must be the trivial polynomial, meaning a = b = c = d = 0. Therefore, the vectors 1, x, x², and x³ are independent.

(b) Find a basis for the subspace with p(1) = 0.

Solution
The vector space here is the following subspace of V:

U = {p(x) = a + bx + cx² + dx³ : a, b, c, d ∈ R and p(1) = 0}
  = {p(x) = a + bx + cx² + dx³ : a + b + c + d = 0}.

One basis is {x − 1, x² − 1, x³ − 1}. As above, it's easy to show that these vectors are independent: a linear combination is b(x − 1) + c(x² − 1) + d(x³ − 1) = dx³ + cx² + bx − (b + c + d), and this can only equal the zero function if b = c = d = 0. To see that they span, you must show that any p(x) ∈ U is a linear combination of these three vectors. This follows because if p(x) = a + bx + cx² + dx³ ∈ U then a = −(b + c + d), so p(x) = b(x − 1) + c(x² − 1) + d(x³ − 1). □

3.6 #1. (a) If a 7 by 9 matrix has rank 5, what are the dimensions of the four subspaces?

Solution
The rank is the dimension of the column space, which also equals the dimension of the row space. Hence

dim(C(A)) = dim(C(A^T)) = rank(A) = 5.

The dimension of the nullspace is the number of free variables. Since there are 9 columns and 5 pivots, there are four free variables. Thus

dim(N(A)) = 9 − rank(A) = 9 − 5 = 4.


The dimension of the left nullspace is the number of free variables when you do Gaussian elimination on A^T. Since A^T has 7 columns and there are 5 pivots, this leaves two free variables for A^T. Thus

dim(N(A^T)) = 7 − rank(A^T) = 7 − 5 = 2. □

(b) If a 3 by 4 matrix has rank 3, what are its column space and left nullspace?

Solution
The rank equals the dimension of the column space, so C(A) is 3-dimensional. On the other hand, C(A) is a subspace of R³ because each column of A contains three numbers. Hence C(A) is a three-dimensional subspace of R³ and therefore must be all of R³, i.e.,

C(A) = R³.

The dimension of the left nullspace is the number of free variables when you do Gaussian elimination on A^T. Since A and A^T always have the same rank, we know that there will be 3 pivots when you do Gaussian elimination on A^T. Since A^T is a 4 × 3 matrix, this means that there are no free variables left for A^T. Hence

N(A^T) = {0}. □

3.6 #13. True or False.

(a) If m = n then the row space of A equals the column space.

Solution
False. Easy counterexample: A = [1 1; 0 0]. □

(b) The matrices A and −A share the same four fundamental subspaces.

Solution
True. For example, Ax = 0 has the same solutions as −Ax = 0, so the nullspaces of A and −A are the same. The column spaces are the same because linear combinations of the columns of A are the same as linear combinations of the columns of −A by using the negatives of the coefficients. And so forth. □

(c) If A and B share the same four subspaces then A is a multiple of B.

Solution
False. For example, suppose A and B are both square (n × n) and invertible. Then C(A) = C(B) = Rⁿ and N(A) = N(B) = {0}, and similarly for the row space and left nullspace. This is true no matter what two invertible matrices you choose, regardless of whether A is a multiple of B or not. □


3.6 #16. Explain why v = (1, 2, 3) cannot be a row of A and also be in the nullspace of A.

Solution
A little misdirection here: it doesn't matter what v actually is. As long as v is nonzero, it can't be both a row of A and be in the nullspace of A. To see why, suppose that v^T was the kth row of A. If v is also in the nullspace, then we would have Av = 0. But then, looking at the kth entry of Av:

0 = Av = [ . . . ; v^T ; . . . ] v = [ . . . ; v^Tv ; . . . ].

Therefore, we must have ‖v‖² = v^Tv = 0. But the only vector with zero length is the zero vector, so v = 0. □

3.6 #21. Suppose A is the sum of two matrices of rank one: A = uv^T + wz^T.

(a) Which vectors span the column space of A?

Solution
Before I solve the problem, please note that uv^T is a matrix and not a scalar. The reversed-order product, v^Tu, is a scalar; in fact, it is simply the dot product of v and u. But if u is a vector in Rᵐ and v is a vector in Rⁿ, then uv^T is the product of an m × 1 matrix with a 1 × n matrix, and therefore is an m × n matrix. Further, note that uv^T has rank one because its column space is C(uv^T) = {(uv^T)x : x ∈ Rⁿ} = {u(v^Tx) : x ∈ Rⁿ}, which contains only multiples of the vector u (because v^Tx is a scalar!) and hence is one-dimensional.

Proceeding now to the matrix A, recall that the column space of A is C(A) = {Ax : x ∈ Rⁿ}. For this A, we have Ax = u(v^Tx) + w(z^Tx). Again, v^Tx and z^Tx are scalars, since they are just the dot products of v and z with x. Therefore Ax = (v^Tx)u + (z^Tx)w is simply a linear combination of the vectors u and w, with coefficients v^Tx and z^Tx. Hence every vector in the column space is a linear combination of u and w, and therefore u and w span the column space of A. If it so happens that u and w are nonzero multiples of each other, then either one alone will also span the column space, but in any case we are sure that both of them together will span C(A). □

(b) Which vectors span the row space of A?

Solution
The row space of A is the column space of A^T. Since

A^T = (uv^T + wz^T)^T = (uv^T)^T + (wz^T)^T = vu^T + zw^T,

the matrix A^T is also the sum of two rank-one matrices. Hence part (a) tells us that the column space of A^T = vu^T + zw^T is spanned by v and z. □


(c) The rank is less than 2 if ___ or if ___.

Solution
We have already seen in part (a) that the column space of A is spanned by the vectors u and w. Therefore the column space of A will be 2-dimensional if u and w are independent, and its dimension will be less than 2 if they are dependent. For two vectors to be dependent, one must be a multiple of the other. Hence, if u is nonzero and w is any multiple of u (including the zero multiple), then the column space will be spanned by u alone and will have dimension 1. Similarly, if w is nonzero and u is a multiple of w then the rank will be 1. Finally, if both u and w are zero, then C(A) = {0} and the rank will be zero in this case. □

(d) Compute A and its rank if u = z = (1, 0, 0) and v = w = (0, 0, 1).

Solution
In this case u and w are independent, so we know that the rank will be 2. We compute

A = uv^T + wz^T = [1; 0; 0][0 0 1] + [0; 0; 1][1 0 0]
  = [0 0 1; 0 0 0; 0 0 0] + [0 0 0; 0 0 0; 1 0 0] = [0 0 1; 0 0 0; 1 0 0].

By inspection, we confirm that C(A) is spanned by u = [1; 0; 0] and w = [0; 0; 1]. □
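The same computation with NumPy's outer products (my addition):

```python
import numpy as np

u = z = np.array([1, 0, 0])
v = w = np.array([0, 0, 1])
A = np.outer(u, v) + np.outer(w, z)
print(A)                           # [[0 0 1], [0 0 0], [1 0 0]]
print(np.linalg.matrix_rank(A))    # 2
```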

3.6 #25. True or false (with a reason or a counterexample).

(a) A and AT have the same number of pivots.

Solution
True. They have the same rank, and the rank is the number of pivots. □

(b) A and A^T have the same left nullspace.

Solution
False in general. The left nullspace of A is the nullspace of A^T, and the left nullspace of A^T is the nullspace of A. There's no reason these should be the same. In fact, for a non-square matrix they can't possibly be the same, as N(A) is a subspace of Rⁿ while N(A^T) is a subspace of Rᵐ. Even if m = n, these need not be the same. Just think about A = [1 1; 0 0]. The nullspace is spanned by the vector [1; −1], while the left nullspace is spanned by [0; 1]. □

(c) If the row space equals the column space then A = AT.

Solution
False. Every square invertible matrix has a row space which equals its column space; in fact, C(A) = C(A^T) = Rⁿ in this case. But an invertible matrix doesn't have to be symmetric. □

(d) If A^T = −A then the row space of A equals its column space.

Solution
True. The columns of A are just −1 times the rows of A. So when you form linear combinations of the columns you're going to get the same possible vectors as when you form linear combinations of the rows. □


CHAPTER 4

Orthogonality

4.1 #7. If Ax is in the nullspace of A^T then Ax must be zero. Why? Which other subspace is Ax in? This is important: A^TA has the same nullspace as A.

Solution
First, we know that Ax is in the column space of A, by definition. That is, Ax ∈ C(A). So, if we also have Ax ∈ N(A^T), then it follows from the fact that C(A) ⊥ N(A^T) that Ax must be perpendicular to itself. This implies that Ax = 0. NOTE: The easy way to do this problem is to use the fact that Ax is in the column space. Just knowing that Ax ∈ N(A^T) is not enough by itself to give you Ax = 0. And it is not true that x = 0, or that A is the zero matrix. □

4.1 #12. The Fundamental Theorem is also stated in the form of Fredholm's alternative: For any A and b, exactly one of these two problems has a solution:

(1) Ax = b,
(2) A^Ty = 0 and b^Ty ≠ 0.

Either b is in the column space of A or else b is not orthogonal to the nullspace of A^T. Choose A and b so that (1) has no solution. Find a solution to (2).

NOTE: The problem as stated tells you what Fredholm's Alternative is and then asks you to do something with it. It does not ask you to prove Fredholm's Alternative. However, let me prove Fredholm's Alternative, in addition to answering the other part of the problem. Let me also state the theorem a little more explicitly, as follows.

Fredholm's Alternative. Let A be an m × n matrix and let b ∈ Rᵐ. Then, EITHER

(1) There is a vector x ∈ Rⁿ such that Ax = b, i.e., b is in the column space of A,

OR

(2) There is a vector y ∈ Rᵐ such that A^Ty = 0 and b^Ty ≠ 0, i.e., b is not orthogonal to the left nullspace of A,

but NOT BOTH.

NOTE: To prove "either but not both," we must give two proofs. First, we must show that if (1) is true then (2) is false; then we must show that if (1) is false then (2) is true.

NOTE: Part (2) does NOT say that b^Ty ≠ 0 for EACH y that satisfies A^Ty = 0, only that there is AT LEAST ONE y which satisfies A^Ty = 0 and b^Ty ≠ 0. Hint: To show there is at least one of something, suppose there were none, and then show that leads to a contradiction.

Proof of Fredholm's Alternative. We have to show that if (1) is true then (2) is false, and that if (1) is false then (2) is true. This implies that either (1) or (2) is always true, but not both.

So, suppose that (1) is true, i.e., Ax = b has a solution. This says that b ∈ C(A). We have to show that (2) is false. Since b ∈ C(A) and C(A) ⊥ N(A^T), we know that b is orthogonal to every vector in N(A^T). Therefore there is NO vector y ∈ N(A^T) with b^Ty ≠ 0, so (2) is indeed false.

On the other hand, suppose that (1) is false, i.e., that b ∉ C(A). We have to show that (2) must be true. Since (2) is saying something about the left nullspace of A, let's consider N(A^T). We know that N(A^T) is the orthogonal complement of C(A). Since C(A) can't be all of Rᵐ (since b isn't in it!), we know that N(A^T) isn't just the zero subspace. Another way to see that is to remember that the dimension of N(A^T) is m − r, and since r is the dimension of C(A) and C(A) isn't all of Rᵐ, we must have r < m and therefore m − r > 0. In any case, there must be some nonzero vector y ∈ N(A^T). This means that A^Ty = 0 for that y. If we could show that b^Ty ≠ 0 for at least ONE of those y, then we would have shown that (2) is true and we would be done. We proceed by contradiction: suppose that there was no such y, i.e., suppose that b^Ty = 0 for every y ∈ N(A^T). Then b is orthogonal to EVERY vector in the left nullspace, hence must be in the orthogonal complement of the left nullspace, which is exactly the column space. Thus b ∈ C(A). But this is impossible because we know that b ∉ C(A)! Therefore we must have b^Ty ≠ 0 for at least one y ∈ N(A^T), so (2) is true. □

4.1 #12 (continued). Choose A and b so that (1) has no solution. Find a solution to (2).

Solution
We are to find a specific matrix A and a specific vector b so that Ax = b has no solution. Then, by Fredholm's alternative, we know that (2) MUST have a solution, i.e., there must be at least one vector y such that A^Ty = 0 and b^Ty ≠ 0.

Now, if Ax = b has no solution, then this means that b is not in the column space of A. So, we need a matrix A whose column space is not all of Rᵐ. It's always a good idea to choose something as simple as possible. Let's take m = n = 2, i.e., consider only 2 × 2 matrices. The simplest singular matrix I can think of is A = [1 0; 0 0]. Its column space consists of multiples of the vector [1; 0]. So we just need to choose a b that is not one of those multiples. A simple choice is b = [0; 1]. Then certainly b ∉ C(A), so Ax = b has no solution.

Now we have to show that there is a vector y ∈ R² such that Aᵀy = 0 and bᵀy ≠ 0. Since Aᵀ = A, the general solution to Aᵀy = 0 is $y = \begin{bmatrix} 0 \\ y_2 \end{bmatrix}$. Then
$$b^T y = \begin{bmatrix} 0 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ y_2 \end{bmatrix} = y_2,$$
so if y₂ ≠ 0 then we're guaranteed bᵀy ≠ 0. For example, one solution is $y = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$. □
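NOTE: Here is a quick numerical check of this example (a sketch of mine using NumPy, not part of the original solution; A, b, and y are the ones chosen above).

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 0.0]])
b = np.array([0.0, 1.0])

# (1) fails: the least-squares solution of Ax = b leaves a nonzero
# residual, so b is not in the column space of A.
x_hat = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.allclose(A @ x_hat, b))   # False: Ax = b has no solution

# (2) holds: y = (0, 1) satisfies A^T y = 0 and b^T y != 0.
y = np.array([0.0, 1.0])
print(A.T @ y)   # [0. 0.]
print(b @ y)     # 1.0, nonzero as required
```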

4.1 #19. If a subspace S is contained in a subspace V, prove that the subspace S⊥ contains V⊥.

Solution
We are given that S ⊂ V, and we must show that V⊥ ⊂ S⊥. So, suppose that x ∈ V⊥. Our goal is to show that x ∈ S⊥. To do this we must show that x is orthogonal to every vector in S. So, suppose that u is some vector in S. Since S ⊂ V, we know that u ∈ V. But we also know that x ∈ V⊥! Therefore x and u must be orthogonal. Hence x is orthogonal to each vector in S, and therefore x ∈ S⊥, as desired. □

4.1 #21. Suppose the columns of A are unit vectors, all mutually perpendicular. What is AᵀA?

Solution
Let v1, . . . , vn be the columns of A. Then we know that vᵢᵀvⱼ = 0 when i ≠ j, and that vᵢᵀvᵢ = ‖vᵢ‖² = 1. Therefore,
$$A^T A = \begin{bmatrix} \text{---}\, v_1^T\, \text{---} \\ \vdots \\ \text{---}\, v_n^T\, \text{---} \end{bmatrix} \begin{bmatrix} | & & | \\ v_1 & \cdots & v_n \\ | & & | \end{bmatrix} = \begin{bmatrix} v_1^T v_1 & \cdots & v_1^T v_n \\ \vdots & \ddots & \vdots \\ v_n^T v_1 & \cdots & v_n^T v_n \end{bmatrix} = \begin{bmatrix} 1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 1 \end{bmatrix} = I. \quad □$$

4.1 #22. Find a 3 × 3 matrix A with no zero entries whose columns are mutually perpendicular. Compute AᵀA. Why is it a diagonal matrix?

Solution
Here's one matrix whose columns are mutually perpendicular:
$$A = \begin{bmatrix} 1 & 1 & 5 \\ 2 & 1 & -4 \\ 3 & -1 & 1 \end{bmatrix}.$$
There are lots of others. Just pick a first column and then work to fill out the second and third columns. You can verify that for my A, we have
$$A^T A = \begin{bmatrix} 14 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 42 \end{bmatrix}.$$
It is diagonal because of the fact that the columns of A are mutually perpendicular.

In fact, if A is any n × n matrix whose columns v1, . . . , vn are mutually orthogonal, then vᵢᵀvⱼ = 0 when i ≠ j, so
$$A^T A = \begin{bmatrix} \text{---}\, v_1^T\, \text{---} \\ \vdots \\ \text{---}\, v_n^T\, \text{---} \end{bmatrix} \begin{bmatrix} | & & | \\ v_1 & \cdots & v_n \\ | & & | \end{bmatrix} = \begin{bmatrix} v_1^T v_1 & \cdots & v_1^T v_n \\ \vdots & \ddots & \vdots \\ v_n^T v_1 & \cdots & v_n^T v_n \end{bmatrix} = \begin{bmatrix} v_1^T v_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & v_n^T v_n \end{bmatrix},$$
a diagonal matrix. The diagonal entries are the dot products of vᵢ with itself. If the columns were normalized so that they each had length 1, then AᵀA would actually be the identity matrix I. □
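NOTE: A one-line numerical check (my sketch, using NumPy; A is the matrix chosen above) makes the diagonal structure visible:

```python
import numpy as np

A = np.array([[1.0,  1.0,  5.0],
              [2.0,  1.0, -4.0],
              [3.0, -1.0,  1.0]])

G = A.T @ A
print(G)
# The off-diagonal entries are dot products of distinct columns, which
# vanish because the columns are mutually perpendicular.
print(np.allclose(G, np.diag([14.0, 3.0, 42.0])))  # True
```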

4.2 #17. (Important) If P² = P show that (I − P)² = I − P. When P projects onto the column space of A, I − P projects onto the ________.

NOTE: If I − P is a projection, then it will project onto its column space. Therefore, to answer this problem you should do the following things: (a) Show that (I − P)² = I − P, (b) Figure out what C(I − P) is in terms of the given matrix A, and (c) Prove that I − P is in fact a projection, i.e., that the error vector e = b − (I − P)b is perpendicular to C(I − P) for every b ∈ Rᵐ.

Solution
(a) This is easy: (I − P)² = I² − IP − PI + P² = I − P − P + P = I − P. □

(b) This is a bit harder. The back of the book tells us that C(I − P) = N(Aᵀ). This does make sense: it seems like C(I − P) = C(A)⊥ = N(Aᵀ), but you have to explain why. Not why C(A)⊥ = N(Aᵀ) (we know that's true!), but why these sets equal C(I − P)!

First, we will show that C(I − P) ⊂ N(Aᵀ). Suppose that y ∈ C(I − P). Then, by definition, y = (I − P)b for some b ∈ Rᵐ. Therefore y = (I − P)b = b − Pb = b − p. However, this is, by definition, the error vector when b is projected orthogonally onto C(A). Therefore this error vector y is orthogonal to C(A). Hence y ∈ C(A)⊥ = N(Aᵀ). This proves that C(I − P) ⊂ N(Aᵀ).

Second, we must show the opposite inclusion, that N(Aᵀ) ⊂ C(I − P). So, suppose that b ∈ N(Aᵀ). Since N(Aᵀ) ⊥ C(A) and P projects vectors orthogonally onto C(A), we therefore have that Pb = 0. Hence (I − P)b = b − Pb = b − 0 = b. But certainly (I − P)b ∈ C(I − P), so b itself is in C(I − P). This shows that N(Aᵀ) ⊂ C(I − P).

(c) This part is again easy. Suppose that b ∈ Rᵐ. Then the error vector is e = b − (I − P)b = Pb ∈ C(A). Since C(A) ⊥ N(Aᵀ), we certainly have that e ⊥ N(Aᵀ). □
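NOTE: Here is a small numerical illustration (my sketch; the matrix A below is my own choice, and P = A(AᵀA)⁻¹Aᵀ is the standard projection onto C(A)):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
P = A @ np.linalg.inv(A.T @ A) @ A.T
I = np.eye(3)

print(np.allclose(P @ P, P))                  # P^2 = P
print(np.allclose((I - P) @ (I - P), I - P))  # (I - P)^2 = I - P

# (I - P)b lands in N(A^T): A^T (I - P) b = 0 for any b.
b = np.array([1.0, 2.0, 7.0])
print(np.allclose(A.T @ ((I - P) @ b), 0))    # True
```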

4.2 #25. The projection matrix P onto an n-dimensional subspace has rank r = n. Reason: The projections Pb fill the subspace S. So S is the ________ of P.

Solution
S is the COLUMN SPACE of P. Here's why.


The column space of P is the set of all vectors Px for all possible x. Now, if P projects onto an n-dimensional subspace S, then certainly Px ∈ S for every x. So the column space must be at least part of S. In fact it must be all of S, because if x ∈ S then Px = x. This means that not only is every Px in S, but every vector in S is a Px for some x (namely, itself). So C(P) = S. And since rank is defined to be the dimension of the column space, we have r = rank(P) = dim(S) = n. □


4.3 #9. For the closest parabola b = C + Dt + Et² to the same four points, write down the unsolvable equation Ax = b. Set up the three normal equations AᵀA x̂ = Aᵀb (solution not required). In Figure 4.9a you are now fitting a parabola—what is happening in Figure 4.9b?

NOTE: The “four points” are given in Problem 4.3 #1.

Solution
The four points were given in Problem 4.3 #1; they are b = 0, 8, 8, 20 at times t = 0, 1, 3, 4. If these values actually lay on the parabola b = C + Dt + Et², then the following equations would be satisfied:

C + D · 0 + E · 0² = 0,
C + D · 1 + E · 1² = 8,
C + D · 3 + E · 3² = 8,
C + D · 4 + E · 4² = 20.

Converting these to a matrix–vector equation, we have
$$\begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 3 & 9 \\ 1 & 4 & 16 \end{bmatrix} \begin{bmatrix} C \\ D \\ E \end{bmatrix} = \begin{bmatrix} 0 \\ 8 \\ 8 \\ 20 \end{bmatrix}.$$
This is the unsolvable equation Ax = b. The normal equations are AᵀA x̂ = Aᵀb. Since
$$A^T A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 3 & 4 \\ 0 & 1 & 9 & 16 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 3 & 9 \\ 1 & 4 & 16 \end{bmatrix} = \begin{bmatrix} 4 & 8 & 26 \\ 8 & 26 & 92 \\ 26 & 92 & 338 \end{bmatrix}$$
and
$$A^T b = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 3 & 4 \\ 0 & 1 & 9 & 16 \end{bmatrix} \begin{bmatrix} 0 \\ 8 \\ 8 \\ 20 \end{bmatrix} = \begin{bmatrix} 36 \\ 112 \\ 400 \end{bmatrix},$$
the normal equation is
$$\begin{bmatrix} 4 & 8 & 26 \\ 8 & 26 & 92 \\ 26 & 92 & 338 \end{bmatrix} \begin{bmatrix} C \\ D \\ E \end{bmatrix} = \begin{bmatrix} 36 \\ 112 \\ 400 \end{bmatrix}.$$

This problem can be viewed in two ways. On the one hand, we are finding the parabola C + Dt + Et² that best fits these four data points, in the sense of minimizing the sum of the squared errors between the actual values of bᵢ and the values C + Dtᵢ + Etᵢ² that lie on the parabola. On the other hand, in the form Ax = b we are trying to find the best approximate solution to this unsolvable problem. That means that we are finding the value of x̂ so that p = Ax̂ is as close to b as possible. This occurs when p is the orthogonal projection of b onto the column space of A. In this case, A has three columns. Hence, the corresponding picture is that we are projecting the vector b ∈ R⁴ onto the 3-D subspace of R⁴ spanned by the three vectors
$$a_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}, \quad a_2 = \begin{bmatrix} 0 \\ 1 \\ 3 \\ 4 \end{bmatrix}, \quad a_3 = \begin{bmatrix} 0 \\ 1 \\ 9 \\ 16 \end{bmatrix}. \quad □$$
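NOTE: For comparison, here is a sketch (mine, using NumPy) that sets up and solves these normal equations numerically:

```python
import numpy as np

# Data from the problem: b = 0, 8, 8, 20 at times t = 0, 1, 3, 4.
t = np.array([0.0, 1.0, 3.0, 4.0])
b = np.array([0.0, 8.0, 8.0, 20.0])
A = np.column_stack([np.ones_like(t), t, t**2])   # columns 1, t, t^2

print(A.T @ A)   # [[4, 8, 26], [8, 26, 92], [26, 92, 338]]
print(A.T @ b)   # [36, 112, 400]

x_hat = np.linalg.solve(A.T @ A, A.T @ b)         # best C, D, E
print(x_hat)
# Same answer from the built-in least-squares routine:
print(np.linalg.lstsq(A, b, rcond=None)[0])
```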

4.3 #12. (Recommended) This problem projects b = (b1, . . . , bm) onto the line through a = (1, . . . , 1).

(a) Solve aᵀa x̂ = aᵀb to show that x̂ is the mean of the b's.

Solution
aᵀa = m and aᵀb = b1 + · · · + bm, so x̂ = (b1 + · · · + bm)/m. □

(b) Find the error vector e and the variance ‖e‖² and the standard deviation ‖e‖.

Solution

Since p = ax̂, we have e = b − ax̂ and

‖e‖² = ‖b − p‖² = ‖(b1 − x̂, . . . , bm − x̂)‖² = (b1 − x̂)² + · · · + (bm − x̂)². □

(c) Draw a graph with b = (1, 2, 6) fitted with a horizontal line. What are p and e on the graph? Check that p is perpendicular to e and find the matrix P.

NOTE: You are finding the best fit horizontal line to the data. To fit b with a horizontal line, you are trying to fit a curve with equation B = bᵢ to the data points (t1, b1), (t2, b2), (t3, b3). There is only one unknown (the number B). Note that because the horizontal line has no dependence on t, it doesn't even matter what the tᵢ actually are!

Solution
I'll leave the graph for you. To fit b with a horizontal line, you are trying to fit a curve with equation B = bᵢ to the data points (t1, b1), (t2, b2), (t3, b3). Because a horizontal line has no dependence on t, it doesn't even matter what the tᵢ are. The matrix form of the equation is
$$\begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix} B = \begin{bmatrix} b_1 \\ \vdots \\ b_m \end{bmatrix}.$$

This equation is simply aB = b (the unknown is B). There is no solution unless b1, . . . , bm are already on a horizontal line. Least squares gives the best fit line: the best value for B is x̂. And we know from part (a) what x̂ is. To work everything out, just compute as usual for least squares:
$$P = \frac{aa^T}{a^T a} = \frac{1}{3} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}, \quad p = Pb = \begin{bmatrix} 3 \\ 3 \\ 3 \end{bmatrix}, \quad e = b - p = \begin{bmatrix} -2 \\ -1 \\ 3 \end{bmatrix}. \quad □$$
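NOTE: A quick numerical check of part (c) (my sketch; b = (1, 2, 6) is from the problem):

```python
import numpy as np

a = np.ones(3)
b = np.array([1.0, 2.0, 6.0])

P = np.outer(a, a) / (a @ a)   # the rank-one projection a a^T / a^T a
p = P @ b                      # (3, 3, 3): the mean of b, repeated
e = b - p                      # (-2, -1, 3)

print(p, e)
print(np.isclose(p @ e, 0))    # True: p is perpendicular to e
```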

4.4 #12. If a1, a2, a3 is a basis for R³, any vector b can be written as
$$b = x_1 a_1 + x_2 a_2 + x_3 a_3 \quad \text{or} \quad \begin{bmatrix} | & | & | \\ a_1 & a_2 & a_3 \\ | & | & | \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = b.$$

(a) Suppose the a's are orthonormal. Show that x1 = a1ᵀb.

Solution
Use the fact that b = x1 a1 + x2 a2 + x3 a3 and that the vectors a1, a2, a3 are orthonormal to compute
$$a_1^T b = a_1^T (x_1 a_1 + x_2 a_2 + x_3 a_3) = a_1^T (x_1 a_1) + a_1^T (x_2 a_2) + a_1^T (x_3 a_3) = x_1 (a_1^T a_1) + x_2 (a_1^T a_2) + x_3 (a_1^T a_3) = x_1 \cdot 1 + x_2 \cdot 0 + x_3 \cdot 0 = x_1. \quad □$$

(b) Suppose the a's are orthogonal. Show that x1 = a1ᵀb / a1ᵀa1.

Solution
This is the same calculation as in part (a), except that we only know now that the a's are orthogonal, and we don't know the value of aᵢᵀaᵢ. As before,
$$a_1^T b = a_1^T (x_1 a_1 + x_2 a_2 + x_3 a_3) = x_1 (a_1^T a_1) + x_2 (a_1^T a_2) + x_3 (a_1^T a_3) = x_1 (a_1^T a_1) + x_2 \cdot 0 + x_3 \cdot 0 = x_1 (a_1^T a_1),$$
so x1 = a1ᵀb / a1ᵀa1 as desired. □
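NOTE: Here is a numerical check of this formula (my sketch; the orthogonal vectors a1, a2, a3 and the vector b are my own choices, not from the text):

```python
import numpy as np

# Mutually orthogonal (but not unit-length) basis of R^3.
a1 = np.array([1.0,  1.0,  1.0])
a2 = np.array([1.0, -1.0,  0.0])
a3 = np.array([1.0,  1.0, -2.0])

b = np.array([3.0, 5.0, -4.0])
x = np.linalg.solve(np.column_stack([a1, a2, a3]), b)  # exact coefficients

print(np.isclose(x[0], (a1 @ b) / (a1 @ a1)))  # True
print(np.isclose(x[1], (a2 @ b) / (a2 @ a2)))  # True
print(np.isclose(x[2], (a3 @ b) / (a3 @ a3)))  # True
```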


(c) If the a's are independent, x1 is the first component of ________ times b.

Solution
If a1, a2, a3 are three independent vectors in R³, then they form a basis for R³ and hence the matrix A is invertible. In this case, Ax = b has the solution x = A⁻¹b. Hence x1, the first component of x, is the first component of A⁻¹ times b. The point here is that even if the vectors are nonorthogonal, then the problem of expressing b as a linear combination of a1, a2, a3 is still solvable, although with more work than in the orthogonal case. □

4.4 #15. (a) Find orthonormal vectors q1, q2, q3 such that q1 and q2 span the column space of
$$A = \begin{bmatrix} 1 & 1 \\ 2 & -1 \\ -2 & 4 \end{bmatrix}.$$
(b) Which of the four fundamental subspaces contains q3?

Solution
Normalize the first column a = (1, 2, −2) of A to get q1 = (1/3)(1, 2, −2). Now apply Gram–Schmidt to the second column b = (1, −1, 4) to get q2 = (1/3)(2, 1, 2). Gram–Schmidt guarantees that q1, q2 span the same space as a, b, i.e., span{q1, q2} = span{a, b} = C(A). There are many ways to get the third vector q3, including:

(1) Gram–Schmidt. Choose any vector c not in the plane spanned by a and b, and apply Gram–Schmidt to it to get q3.

(2) The Direct Approach. We know that q3 has to be orthogonal to both a and b (or q1 and q2, if you prefer). This means that aᵀq3 = bᵀq3 = 0. Let q3 = (x, y, z), work out what these equations mean in terms of x, y, and z, and solve them.

(3) Outside Knowledge (also known as reading ahead in the book). By definition, the cross product of a and b is a vector c which is orthogonal to both a and b. So, just compute a × b, then normalize to get q3. (This wouldn't work if we had four vectors instead of three.)

(4) Fundamental Subspaces. Since q1 and q2 span the column space of A and q3 must be orthogonal to both q1 and q2, it follows that q3 must be in the orthogonal complement of the column space, which is the left nullspace. That is, q3 ∈ C(A)⊥ = N(Aᵀ). So, compute the left nullspace of A. You'll find that it is one dimensional in this case, so we can simply choose any nonzero vector c ∈ N(Aᵀ) and normalize to get q3.


Whichever approach you take, you end up with q3 = ±(1/3)(−2, 2, 1). □
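NOTE: Here is a sketch (mine, using NumPy) that reproduces q1, q2, q3 numerically; the third vector is found by the cross-product approach (3) above, so it may come out with the opposite sign.

```python
import numpy as np

a = np.array([1.0,  2.0, -2.0])   # first column of A
b = np.array([1.0, -1.0,  4.0])   # second column of A

q1 = a / np.linalg.norm(a)        # (1/3)(1, 2, -2)
v2 = b - (q1 @ b) * q1            # subtract the projection onto q1
q2 = v2 / np.linalg.norm(v2)      # (1/3)(2, 1, 2)

q3 = np.cross(a, b)               # orthogonal to both columns
q3 = q3 / np.linalg.norm(q3)      # one of the two choices +/-(1/3)(-2, 2, 1)

Q = np.column_stack([q1, q2, q3])
print(np.allclose(Q.T @ Q, np.eye(3)))   # True: the q's are orthonormal
print(q1 * 3, q2 * 3, q3 * 3)
```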


(c) Solve Ax = (1, 2, 7) by least squares.

Solution
Again there are several approaches. You can solve this directly using A, or use the A = QR factorization to help out. Set b = (1, 2, 7). The direct approach is then to solve
$$A^T A \hat{x} = A^T b$$
to get x̂ = (1, 2).

NOTE: It's a lot easier to solve the equation AᵀA x̂ = Aᵀb by Gaussian elimination than to compute (AᵀA)⁻¹!

To use the QR factorization, just note that the equation AᵀA x̂ = Aᵀb becomes
$$R^T Q^T Q R \hat{x} = R^T Q^T b.$$
But QᵀQ = I and Rᵀ is invertible, so this becomes simply
$$R \hat{x} = Q^T b.$$
We know Q already because we found q1, q2, q3, so just set up R and solve away. You get the same solution. □


CHAPTER 5

Determinants

5.1 #3. True or false, with a reason or counterexample.

(a) |I + A| = 1 + |A|.

Solution
False. Try $A = \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix}$. Then |A| = 0, but |I + A| = 2. □

(b) |ABC| = |A| |B| |C|.

Solution
True by basic properties of determinants. □

(c) |A⁴| = |A|⁴.

Solution
True, basic properties of determinants again, since A⁴ = AAAA. □

(d) |4A| = 4|A|.

Solution
False. The determinant is linear in each row. Therefore, if A is an n × n matrix then |4A| = 4ⁿ|A|. □

5.1 #9. Prove that |Aᵀ| = |A| by transposing A = QR. (R is triangular and Q is orthogonal; note Problem 7.) Why does |Rᵀ| = |R|?

Solution

Let's do the last part first. We know that R is an upper-triangular matrix, hence its determinant is simply the product of its diagonal entries. But Rᵀ is lower-triangular with the same diagonal entries, so its determinant is exactly the same, the product of the diagonal entries of R. Therefore,
$$|A^T| = |(QR)^T| = |R^T Q^T| = |R^T|\,|Q^T| = |R|\,|Q^T|.$$
Now, Q is an orthogonal matrix, so Qᵀ = Q⁻¹. Further, the determinant of an orthogonal matrix is either 1 or −1, so |Qᵀ| = |Q⁻¹| = 1/|Q| = |Q|, because 1/1 = 1 and 1/(−1) = −1.


Therefore
$$|A^T| = |R|\,|Q^T| = |R|\,|Q| = |Q|\,|R| = |QR| = |A|. \quad □$$

5.1 #10. (a) If the entries in every row of A add to zero, prove that det(A) = 0.

Solution
There are several ways to do this. I think the easiest is simply to observe that if x = (1, 1, . . . , 1) then Ax = 0, so A is singular. □

(b) If every row of A adds to one, prove that det(A − I) = 0. Does this guarantee that det(A) = 1?

Solution
If every row of A adds to one, then every row of A − I sums to zero. Hence det(A − I) = 0 by part (a). This doesn't imply that det(A) = 1. For example, if $A = \begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix}$ then every row of A adds to one and A − I is singular, so has determinant zero, yet det(A) = 0, not 1. □

5.2 #14. The n by n determinant Cn has 1’s above and below the main diagonal.

(a) What are these determinants C1, C2, C3, C4?

(b) By cofactors find the relation between Cn and Cn−1 and Cn−2. Find C10.

Solution
You can find C1 = 0, C2 = −1, C3 = 0, C4 = 1 by direct calculation. For the relationship, expand by cofactors. The first row of Cn has only a one in row 1, column 2, zeros elsewhere. Use the cofactor formula, then expand AGAIN, again on the first row:

$$C_n = (-1)^{1+2}\, 1 \begin{vmatrix} 1 & 1 & 0 & 0 & 0 & 0 & \cdots \\ 0 & 0 & 1 & 0 & 0 & 0 & \cdots \\ 0 & 1 & 0 & 1 & 0 & 0 & \cdots \\ 0 & 0 & 1 & 0 & 1 & 0 & \cdots \\ 0 & 0 & 0 & 1 & 0 & 1 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \end{vmatrix}$$
$$= (-1)^{1+2} \left( (-1)^{1+1}\, 1 \begin{vmatrix} 0 & 1 & 0 & 0 & 0 & \cdots \\ 1 & 0 & 1 & 0 & 0 & \cdots \\ 0 & 1 & 0 & 1 & 0 & \cdots \\ 0 & 0 & 1 & 0 & 1 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \end{vmatrix} + (-1)^{1+2}\, 1 \begin{vmatrix} 0 & 1 & 0 & 0 & 0 & \cdots \\ 0 & 0 & 1 & 0 & 0 & \cdots \\ 0 & 1 & 0 & 1 & 0 & \cdots \\ 0 & 0 & 1 & 0 & 1 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \end{vmatrix} \right)$$
$$= -C_{n-2} + 0.$$
The zero comes because that last matrix is singular—it has a column of zeros. So Cn = −Cn−2. Since C1 = 0, this means that Cn = 0 for all odd n, and since C2 = −1, this means Cn for n even alternates between −1 and 1. In particular, C10 = −1. □
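NOTE: A sketch (mine) that checks C1, . . . , C10 by computing the determinants directly:

```python
import numpy as np

def C(n):
    # Tridiagonal matrix with 0 on the diagonal, 1 just above and below.
    M = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
    return round(np.linalg.det(M))

print([C(n) for n in range(1, 11)])
# [0, -1, 0, 1, 0, -1, 0, 1, 0, -1]  ->  C_10 = -1
```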


5.3 #21. (a) Suppose the column vectors in a 3 by 3 matrix have lengths L1, L2, and L3. What is the largest possible value for the determinant?

Solution
The determinant is the volume of the parallelepiped determined by the three columns. The volume is largest when the parallelepiped is a rectangular box, i.e., when the columns are orthogonal (why?). In this case the volume is just the product of the lengths, L1 L2 L3. □

(b) If all entries have |aij | = 1, can the determinant (six terms) equal 6?

Solution
No, the lengths of the three columns are L1 = L2 = L3 = √3, so the largest possible volume is (√3)³ = 3√3 ≈ 5.196 < 6. Can the volume equal 5? No—at least, not if the aᵢⱼ are real numbers. Because then each aᵢⱼ is either 1 or −1, so the determinant is an integer, given by a sum of six numbers, each of which is either 1 or −1. All such sums are even, so the determinant must be even. Question: what if the aᵢⱼ are complex? □


CHAPTER 6

Eigenvalues and Eigenvectors

6.1 #6. This problem asks you to compute the eigenvalues of A, B, AB, and BA for some specific A and B, and then to decide whether the eigenvalues of AB are equal to those of BA. I'll let you work out the details of that yourself. You'll see that for this particular case, the eigenvalues of AB and the eigenvalues of BA are indeed the same. This brings up the question: Is this a fluke, or do the eigenvalues of AB and BA always coincide? The answer is: yes, AB and BA must always have the same eigenvalues. Let me try to show you why.

Theorem. If A and B are any two n × n matrices then AB and BA have the same n (complex) eigenvalues. However, they DO NOT have to have the same eigenvectors!

Proof. The eigenvalues of a matrix are the roots of the characteristic polynomial. Now, if the characteristic polynomials of AB and BA have exactly the same roots then they must be exactly the same polynomial. This is because if λ1, . . . , λn are the roots of a polynomial p(λ), then p(λ) = (λ − λ1) · · · (λ − λn), i.e., p(λ) is completely determined by its roots.

So, we just have to show that det(AB − λI) = det(BA − λI). Try thinking about this for a while: it isn't very clear why this should have to be true. But here is a trick to prove it. Suppose we knew that B was invertible. Then det(B) ≠ 0, so we can multiply and divide by the number det(B) without changing anything. Further, we know that 1/det(B) = det(B⁻¹). And further still, det(B) is just a scalar, and the order of multiplying scalars doesn't matter (scalars commute, matrices don't). Therefore:

det(AB − λI) = det(B) det(B)⁻¹ det(AB − λI)
             = det(B) det(B⁻¹) det(AB − λI)
             = det(B) det(AB − λI) det(B⁻¹)
             = det(B(AB − λI)B⁻¹)
             = det(BABB⁻¹ − BλIB⁻¹)
             = det(BA − λBB⁻¹)
             = det(BA − λI).

This is exactly what we wanted: AB and BA have the same characteristic polynomial, hence the same roots.

There's just one problem: what if B isn't invertible? I won't give the precise details, but here's the idea on how to handle this. If you "perturb" the entries of B ever so slightly (meaning that you add a very small number to one or more entries), then the new "perturbed matrix" C will be invertible. Plus, the eigenvalues of B and C will be close. Because C is invertible, you know that AC and CA have the same eigenvalues. Now you let the perturbation get smaller and smaller, and in the limit you find that AB and BA have the same eigenvalues. The idea is clear, but the details are messy. □
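NOTE: Here is a quick numerical illustration of the theorem (my sketch; the random matrices are my own choice, and B is deliberately made singular to exercise the harder case):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))
B[:, 0] = 0.0   # make B singular on purpose

eig_AB = np.sort_complex(np.linalg.eigvals(A @ B))
eig_BA = np.sort_complex(np.linalg.eigvals(B @ A))
print(np.allclose(eig_AB, eig_BA))   # True (up to roundoff)
```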

6.1 #9. What do you do to Ax = λx in order to prove (a), (b), and (c)?

(a) λ² is an eigenvalue of A², as in Problem 4.

Solution
Multiply both sides of Ax = λx by A. □

(b) λ⁻¹ is an eigenvalue of A⁻¹, as in Problem 3.

Solution
Multiply both sides of Ax = λx by A⁻¹, then divide both sides by λ. □

(c) λ + 1 is an eigenvalue of A + I, as in Problem 2.

Solution
Add Ix = x to both sides of Ax = λx. □

6.1 #18. A 3 × 3 matrix B is known to have eigenvalues 0, 1, 2. This information is enough to find three of these:

(a) The rank of B.

Solution
Yes. The nullspace is one-dimensional, therefore the rank of B is 2. □

(b) The determinant of BᵀB.

Solution
Yes. Since the determinant of B is the product of its eigenvalues, we know that det(B) = 0 · 1 · 2 = 0. Therefore det(BᵀB) = det(Bᵀ) det(B) = det(B)² = 0² = 0. □

(c) The eigenvalues of BᵀB.

Solution
No. For example, the two matrices
$$B_1 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix} \quad \text{and} \quad B_2 = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}$$
both have eigenvalues 0, 1, 2, but B1ᵀB1 has eigenvalues 0, 1, 4 while B2ᵀB2 has eigenvalues 0, 1, 5. □
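NOTE: A numerical check of this counterexample (my sketch, using NumPy):

```python
import numpy as np

B1 = np.diag([0.0, 1.0, 2.0])
B2 = np.array([[0.0, 0.0, 1.0],
               [0.0, 1.0, 0.0],
               [0.0, 0.0, 2.0]])

print(np.sort(np.linalg.eigvals(B1).real))     # [0. 1. 2.]
print(np.sort(np.linalg.eigvals(B2).real))     # [0. 1. 2.]
print(np.sort(np.linalg.eigvalsh(B1.T @ B1)))  # [0. 1. 4.]
print(np.sort(np.linalg.eigvalsh(B2.T @ B2)))  # [0. 1. 5.]
```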


(d) The eigenvalues of (B + I)⁻¹.

Solution
Yes. B + I has eigenvalues 1, 2, 3 (why?). Therefore (B + I)⁻¹ has eigenvalues 1, 1/2, 1/3 (why?). □

6.1 #25. Suppose A and B have the same eigenvalues λ1, . . . , λn with the same independent eigenvectors x1, . . . , xn. Then A = B. Reason: Any vector x is a combination c1x1 + · · · + cnxn. What is Ax? What is Bx?

Solution
Because we are given that x1, . . . , xn are n independent vectors, we know that they form a basis for Rⁿ. Therefore, if x is any vector in Rⁿ then there exist unique scalars c1, . . . , cn such that x = c1x1 + · · · + cnxn. Since the xₖ are eigenvectors, we therefore have that

Ax = A(c1x1 + · · · + cnxn)
   = c1 Ax1 + · · · + cn Axn
   = c1 λ1 x1 + · · · + cn λn xn
   = c1 Bx1 + · · · + cn Bxn
   = B(c1x1 + · · · + cnxn)
   = Bx.

That is, Ax = Bx for every possible vector x. In particular, Aeₖ = Beₖ where eₖ is the kth standard basis vector. But Aeₖ is just the kth column of A and likewise Beₖ is the kth column of B, so this says that A and B have the same columns and therefore are the same matrix. □

6.2 #6. If the eigenvectors of A are the columns of I then A is a ________ matrix. If the eigenvector matrix S is triangular, then S⁻¹ is triangular. Prove that A is also triangular.

Solution
The word to fill in the blank is DIAGONAL. Here is the reason. The columns of I are the standard basis vectors e1, . . . , en. We are given that these are eigenvectors of A. Therefore A has a full set of eigenvectors (i.e., there is a basis for Rⁿ made out of eigenvectors of A). Hence A is diagonalizable with eigenvector matrix S = I. This means Λ = S⁻¹AS = IAI = A. But Λ is diagonal, so A must be also.

Here is another proof that A is diagonal. Since each eₖ is an eigenvector, we have that Aeₖ = λₖ eₖ for some number λₖ. But Aeₖ is the kth column of A, so this says that
$$A = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}.$$

Now to the second part of the question. If we know that S is triangular, let's say upper triangular, then S⁻¹ will be upper triangular as well. Since we are assuming that there is an eigenvector matrix S which is invertible, we know that A must be diagonalizable. Hence Λ = S⁻¹AS is diagonal. Therefore A = SΛS⁻¹ with S, Λ, and S⁻¹ all upper triangular. The product, which is A, must therefore be upper triangular. The same argument works if S is lower triangular to show that A was lower triangular. □

6.2 #19. Aᵏ = SΛᵏS⁻¹ approaches the zero matrix as k → ∞ if and only if every λ has absolute value less than ________. Which of these matrices has Aᵏ → 0?
$$A = \begin{bmatrix} .6 & .4 \\ .4 & .6 \end{bmatrix}, \qquad B = \begin{bmatrix} .6 & .9 \\ .1 & .6 \end{bmatrix}.$$

Solution
The word to fill in the blank is ONE. This is an if and only if problem, so requires two proofs.

For the first direction, assume that the eigenvalues λ1, . . . , λn all satisfy |λⱼ| < 1. Let x1, . . . , xn be the columns of S; these are eigenvectors of A and form a basis for Rⁿ. Therefore, if v is any vector in Rⁿ then v = c1 x1 + · · · + cn xn for some unique scalars c1, . . . , cn. Hence
$$A^k v = c_1 \lambda_1^k x_1 + \cdots + c_n \lambda_n^k x_n.$$
Applying the triangle inequality, we therefore have
$$\|A^k v\| \le |c_1|\,|\lambda_1|^k \|x_1\| + \cdots + |c_n|\,|\lambda_n|^k \|x_n\|.$$
Since |λⱼ| < 1, we have |λⱼ|ᵏ → 0. Hence ‖Aᵏv‖ → 0 for each vector v. In particular, the jth column of Aᵏ is Aᵏeⱼ, so each column of Aᵏ is converging to zero, and therefore Aᵏ itself is converging to the zero matrix.

For the reverse direction, suppose that Aᵏ converges to the zero matrix. Then Aᵏv converges to the zero vector for every vector v. But if v is an eigenvector for the eigenvalue λ, then Aᵏv = λᵏv. This can only converge to zero if |λ| < 1. Hence every eigenvalue of A must have absolute value less than 1.

For the last part of the problem, the eigenvalues of A are 1 and 1/5, while the eigenvalues of B are 3/10 and 9/10. Therefore Bᵏ converges to the zero matrix, but Aᵏ doesn't. □
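NOTE: A sketch (mine) that watches Aᵏ and Bᵏ numerically for the two matrices in the problem:

```python
import numpy as np

A = np.array([[0.6, 0.4],
              [0.4, 0.6]])
B = np.array([[0.6, 0.9],
              [0.1, 0.6]])

print(np.sort(np.linalg.eigvals(A).real))   # [0.2 1. ]
print(np.sort(np.linalg.eigvals(B).real))   # [0.3 0.9]

# 0.9^200 is about 7e-10, so B^200 is essentially the zero matrix,
# while A^200 approaches [[0.5, 0.5], [0.5, 0.5]].
print(np.linalg.matrix_power(B, 200))
print(np.linalg.matrix_power(A, 200))
```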


6.2 #20. (Recommended) Find Λ and S to diagonalize A in Problem 19. What is the limit of Λᵏ as k → ∞? What is the limit of SΛᵏS⁻¹? In the columns of this limiting matrix you see the ________.

Solution
Two independent eigenvectors of A are x1 = (1, 1) (for the eigenvalue 1) and x2 = (1, −1) (for the eigenvalue 1/5). Therefore A is diagonalizable using
$$S = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \quad \text{and} \quad \Lambda = \begin{bmatrix} 1 & 0 \\ 0 & 1/5 \end{bmatrix}.$$
Therefore
$$\Lambda^k = \begin{bmatrix} 1^k & 0 \\ 0 & (1/5)^k \end{bmatrix} \quad \text{converges to} \quad \Lambda^\infty = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}.$$
I use the term "Λ∞" loosely, meaning the limit of Λᵏ as k → ∞. The limit of Aᵏ = SΛᵏS⁻¹ is therefore
$$A^\infty = S \Lambda^\infty S^{-1} = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1/2 & 1/2 \\ 1/2 & -1/2 \end{bmatrix} = \begin{bmatrix} 1/2 & 1/2 \\ 1/2 & 1/2 \end{bmatrix}.$$
In the columns of A∞ you see: THE 1-EIGENVECTORS OF A. Aᵏ converges to a matrix in which every column is a 1-eigenvector of A. □

6.2 #34. Suppose that the same S diagonalizes both A and B, so that A = SΛ₁S⁻¹ and B = SΛ₂S⁻¹. Prove that AB = BA.

Solution
The key fact is that DIAGONAL MATRICES COMMUTE, i.e., Λ₁Λ₂ = Λ₂Λ₁. Therefore
$$AB = S\Lambda_1 S^{-1} S\Lambda_2 S^{-1} = S\Lambda_1\Lambda_2 S^{-1} = S\Lambda_2\Lambda_1 S^{-1} = S\Lambda_2 S^{-1} S\Lambda_1 S^{-1} = BA. \quad □$$


CHAPTER 7

Linear Transformations

7.1 #3. Which of these transformations is not linear? The input is v = (v1, v2).

(a) T (v) = (v2, v1).

Solution
This is linear. We are given that T(v1, v2) = (v2, v1) for every v = (v1, v2). So we check for linearity as follows:
$$T(u + w) = T(u_1 + w_1,\; u_2 + w_2) = (u_2 + w_2,\; u_1 + w_1) = (u_2, u_1) + (w_2, w_1) = T(u) + T(w),$$
and
$$T(cu) = T(cu_1, cu_2) = (cu_2, cu_1) = c\,(u_2, u_1) = c\,T(u).$$

Therefore T is linear. □

(b) T (v) = (v1, v1).

Solution
This is linear. □

(c) T (v) = (0, v1).

Solution
This is linear. □

(d) T (v) = (0, 1).

Solution
This is not linear, because T(u + w) = (0, 1) but T(u) + T(w) = (0, 1) + (0, 1) = (0, 2). □


7.1 #12. Suppose a linear transformation T transforms (1, 1) to (2, 2) and (2, 0) to (0, 0). Find T(v) when:

(a) v = (2, 2) (b) v = (3, 1) (c) v = (−1, 1) (d) v = (a, b).

Solution
(a) Since T(1, 1) = (2, 2) and we know that T is linear, we have

T (2, 2) = 2T (1, 1) = 2 (2, 2) = (4, 4).

(b) Since (3, 1) = (1, 1) + (2, 0), we have

T (3, 1) = T (1, 1) + T (2, 0) = (2, 2) + (0, 0) = (2, 2).

(c) Since (−1, 1) = (1, 1) − (2, 0), we have

T (−1, 1) = T (1, 1) − T (2, 0) = (2, 2) − (0, 0) = (2, 2).

(d) If we express (a, b) as a linear combination of (1, 1) and (2, 0), we will know how to compute T(a, b). Now, by doing a little Gaussian elimination, you can check that (a, b) = b (1, 1) + ((a − b)/2) (2, 0). Therefore,

T(a, b) = b T(1, 1) + ((a − b)/2) T(2, 0) = b (2, 2) + ((a − b)/2) (0, 0) = (2b, 2b). □
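NOTE: The formula T(a, b) = (2b, 2b) says that T is multiplication by the matrix with rows (0, 2) and (0, 2). Here is a quick check (my sketch, using NumPy):

```python
import numpy as np

# The matrix of T implied by T(a, b) = (2b, 2b).
M = np.array([[0.0, 2.0],
              [0.0, 2.0]])

print(M @ np.array([1.0, 1.0]))   # [2. 2.] = T(1, 1)
print(M @ np.array([2.0, 0.0]))   # [0. 0.] = T(2, 0)
print(M @ np.array([3.0, 1.0]))   # [2. 2.] = T(3, 1), matching part (b)
```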

7.1 #13. M is any 2 × 2 matrix and $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$. The transformation T is defined by T(M) = AM. What rules of matrix multiplication show that T is linear?

Solution
We know that

T(M + N) = A(M + N) = AM + AN = T(M) + T(N)

because of the fact that matrix multiplication is distributive over addition (that's what makes the key step A(M + N) = AM + AN work). Also,

T(cM) = A(cM) = (Ac)M = (cA)M = c(AM) = c T(M)

because matrix multiplication is associative (think of the scalar c as being a 1 × 1 matrix) and because multiplication by a scalar is commutative (i.e., Ac = cA, which works because c is a scalar). □