    Linear Algebra: MAT 217

    Lecture notes, Spring 2013

    Michael Damron

    Princeton University


Contents

1 Vector spaces
1.1 Vector spaces and fields
1.2 Subspaces
1.3 Spanning and linear dependence
1.4 Bases
1.5 Exercises

2 Linear transformations
2.1 Definitions
2.2 Range and nullspace
2.3 Isomorphisms
2.4 Matrices and coordinates
2.5 Exercises

3 Dual spaces
3.1 Definitions
3.2 Annihilators
3.3 Transpose
3.4 Double dual
3.5 Exercises

4 Determinants
4.1 Permutations
4.2 Determinants: existence and uniqueness
4.3 Properties of the determinant
4.4 Exercises
4.5 Exercises on polynomials

5 Eigenvalues
5.1 Diagonalizability
5.2 Eigenspaces
5.3 Exercises

6 Jordan form
6.1 Primary decomposition theorem
6.2 Nilpotent operators
6.3 Existence and uniqueness of Jordan form, Cayley-Hamilton
6.4 Exercises

7 Bilinear forms
7.1 Definition and matrix representation
7.2 Symmetric bilinear forms
7.3 Sesquilinear and Hermitian forms
7.4 Exercises

8 Inner product spaces
8.1 Definitions
8.2 Orthogonality
8.3 Adjoint
8.4 Spectral theory in inner product spaces
8.5 Appendix: proof of Cauchy-Schwarz by P. Sosoe
8.6 Exercises


    1 Vector spaces

    1.1 Vector spaces and fields

Linear algebra is the study of linear functions. In R^n these are functions f satisfying

f(x + y) = f(x) + f(y) and f(cx) = cf(x) for all x, y ∈ R^n, c ∈ R.

We will generalize this immediately, taking from R^n only what we absolutely need. We start by looking at the c value above: it is called a scalar. Generally scalars do not need to come from R. There is only some amount of structure we need for the set of scalars.

Definition 1.1.1. A set F is called a field if for each a, b ∈ F, there is an element ab ∈ F and another a + b ∈ F such that

1. for all a, b, c ∈ F, (ab)c = a(bc) and (a + b) + c = a + (b + c),

2. for all a, b ∈ F, ab = ba and a + b = b + a,

3. there exist elements 0, 1 ∈ F such that for all a ∈ F, a + 0 = a and 1a = a,

4. for all a ∈ F, there is an element −a ∈ F such that a + (−a) = 0, and if a ≠ 0 there is an element a⁻¹ ∈ F such that aa⁻¹ = 1, and

5. for all a, b, c ∈ F, a(b + c) = ab + ac.

This is our generalization of R. Note one interesting point: there is nothing that asserts that F must be infinite, and indeed there are finite fields. Take any prime p and consider the set Z_p given by

Z_p = {0, . . . , p − 1} with modular arithmetic.

That is, a + b is defined as (a + b) mod p (for instance (2 + 5) mod 3 = 1). Then this is a field; you will verify this in the exercises (see also the sketch below). Another neat fact: if F is a finite field then it must have pⁿ elements for some prime p and n ∈ N. You will prove this too.
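Though the exercise asks for a proof, the field axioms for Z_p are also easy to confirm by brute force for a small prime. A minimal Python sketch (the choice p = 7 is arbitrary; swapping in a composite such as 6 makes the inverse check fail, since 2 has no multiplicative inverse mod 6):

```python
# Minimal sketch (not a proof): brute-force check of the field axioms for Z_p.
p = 7
F = range(p)

add = lambda a, b: (a + b) % p
mul = lambda a, b: (a * b) % p

# associativity, commutativity, distributivity
assert all(mul(mul(a, b), c) == mul(a, mul(b, c)) for a in F for b in F for c in F)
assert all(add(add(a, b), c) == add(a, add(b, c)) for a in F for b in F for c in F)
assert all(mul(a, b) == mul(b, a) and add(a, b) == add(b, a) for a in F for b in F)
assert all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c)) for a in F for b in F for c in F)

# identities and inverses: every a has -a, and every a != 0 has a^(-1)
assert all(add(a, 0) == a and mul(a, 1) == a for a in F)
assert all(any(add(a, b) == 0 for b in F) for a in F)
assert all(any(mul(a, b) == 1 for b in F) for a in F if a != 0)
print("Z_%d passes all field axioms" % p)
```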

Other examples are R and C. Given our field of scalars, we are ready to generalize the idea of R^n; we will call the generalization a vector space.

Definition 1.1.2. A collection (V, F) of a set V and a field F is called a vector space (the elements of V called vectors and those of F called scalars) if the following hold. For each v, w ∈ V there is a vector sum v + w ∈ V such that

1. there is one (and only one) vector called 0 such that v + 0 = v for all v ∈ V,

2. for each v ∈ V there is one (and only one) vector −v such that v + (−v) = 0,

3. for all v, w ∈ V, v + w = w + v,


4. for all v, w, z ∈ V, v + (w + z) = (v + w) + z.

Furthermore, for all v ∈ V and c ∈ F there is a scalar product cv ∈ V such that

1. for all v ∈ V, 1v = v,

2. for all v ∈ V and c, d ∈ F, (cd)v = c(dv),

3. for all v, w ∈ V and c ∈ F, c(v + w) = cv + cw and

4. for all v ∈ V and c, d ∈ F, (c + d)v = cv + dv.

This is really a ton of rules, but they have to be verified! In case (V, F) is a vector space, we will typically say V is a vector space over F or V is an F-vector space. Let's look at some examples.

1. Take V = R^n and F = R. We define addition as you would imagine:

(v1, . . . , vn) + (w1, . . . , wn) = (v1 + w1, . . . , vn + wn)

and scalar multiplication by

c(v1, . . . , vn) = (cv1, . . . , cvn).

2. Let F be any field, let n ∈ N, and write

F^n = {(a1, . . . , an) : ai ∈ F for i = 1, . . . , n};

define addition and scalar multiplication as above. This is a vector space. Note in particular that F is a vector space over itself.

3. If F1 ⊆ F2 are fields (with the same 0, 1 and operations) then F2 is a vector space over F1. This situation is called a field extension.

4. Let S be any nonempty set and F a field. Then define

V = {f : S → F : f a function}.

Then V is an F-vector space using the operations

(f1 + f2)(s) = f1(s) + f2(s) and (cf1)(s) = c(f1(s)).

    Facts everyone should see once.

1. For all c ∈ F, c0 = 0.


Proof. c0 = c(0 + 0) = c0 + c0. Therefore

0 = c0 + (−c0) = (c0 + c0) + (−c0) = c0 + (c0 + (−c0)) = c0.

2. For all v ∈ V, 0v = 0.

Proof. 0v = (0 + 0)v = 0v + 0v. Adding −(0v) to both sides gives the result.

3. For all v ∈ V, (−1)v = −v.

Proof.

v + (−1)v = 1v + (−1)v = (1 + (−1))v = 0v = 0,

so (−1)v is the additive inverse of v, by uniqueness of additive inverses.

    1.2 Subspaces

Definition 1.2.1. Let V be a vector space over F. Then W ⊆ V is called a subspace of V if W is a vector space over F using the same operations as in V.

Suppose we are given a vector space V. To check that W ⊆ V is a subspace we need to verify eight properties! Do not worry: many of them follow immediately, by inheritance. That is, they are true simply because they were true in V. For example, if V is a vector space over F and v, w ∈ W then clearly v + w = w + v, since these are also vectors in V and addition is commutative in V. We only need to check the following.

1. 0 ∈ W.

2. (closed under addition) For all v, w ∈ W, v + w ∈ W.

3. (closed under scalar multiplication) For all v ∈ W and c ∈ F, cv ∈ W.

4. (closed under inverses) For all v ∈ W, −v ∈ W.

Proposition 1.2.2. Let (V, F) be a vector space. Then W ⊆ V is a subspace if and only if it is nonempty and for all v, w ∈ W and c ∈ F, cv + w ∈ W.

Proof. Suppose that W satisfies the property in the proposition. Then let v ∈ W. Taking w = v and c = −1, we get 0 = (−1)v + v ∈ W. Next, if c ∈ F then cv = cv + 0 ∈ W; in particular −v = (−1)v ∈ W. If w ∈ W then v + w = 1v + w ∈ W, giving W as a subspace. Conversely, if W is a subspace then for all v ∈ W and c ∈ F, cv ∈ W, so if w ∈ W, we get cv + w ∈ W. Furthermore W is nonempty since it contains 0.
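A quick numerical illustration of the one-condition criterion (a sketch, not a proof): sampling cv + w for the plane z = 0 in R^3, an arbitrary choice, never leaves the plane, while the same test fails for the shifted plane z = 1, which is not a subspace since it misses 0.

```python
# Minimal sketch: sample the criterion "cv + w in W" for planes in R^3.
import numpy as np

rng = np.random.default_rng(0)

def in_plane(u, z0):
    # membership test for the plane z = z0
    return np.isclose(u[2], z0)

def sample(z0):
    v = np.append(rng.normal(size=2), z0)   # random vector with third entry z0
    w = np.append(rng.normal(size=2), z0)
    c = rng.normal()
    return in_plane(c * v + w, z0)

print(all(sample(0.0) for _ in range(1000)))  # True: plane through the origin
print(all(sample(1.0) for _ in range(1000)))  # False: shifted plane is not closed
```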


If V is a vector space over F with W1, W2 subspaces, we can generate a new space. We define

W1 + W2 = {w1 + w2 : w1 ∈ W1, w2 ∈ W2}.

Generally we define

W1 + · · · + Wn = (W1 + · · · + Wn−1) + Wn.

Claim 1.2.3. W1 + W2 is a subspace.

Proof. First, it is nonempty. Next, if v, w ∈ W1 + W2 and c ∈ F, we can write v = w1 + w2 and w = w1′ + w2′ for w1, w1′ ∈ W1 and w2, w2′ ∈ W2. Then

cv + w = c(w1 + w2) + (w1′ + w2′) = (cw1 + w1′) + (cw2 + w2′).

Since W1 and W2 are subspaces, the first element is in W1 and the second in W2, giving cv + w ∈ W1 + W2, so it is a subspace.

Question from class. If V is a vector space over F and W is a subset of V that is a vector space using the same operations of addition and scalar multiplication, can the zero element of W be different from the zero element of V? No. Let 0_W be the zero element of W. Then 0_W + 0_W = 0_W. However, denoting by v the additive inverse in V of 0_W, we have

0 = 0_W + v = (0_W + 0_W) + v = 0_W + (0_W + v) = 0_W.

    Examples.

1. For all n1 ≤ n2, C^{n1} is a subspace of C^{n2} (as C-vector spaces). Here we identify

C^{n1} = {(z1, . . . , z_{n2}) : z_{n1+1} = · · · = z_{n2} = 0}.

2. Given a vector space V over F, {0} is a subspace.

3. In R^2, any subspace is either (a) R^2, (b) {0} or (c) a line through the origin. Why? If W is a subspace and contains some w ≠ 0, it must contain the entire line spanned by w; that is, the set {cw : c ∈ R}. This is a line through the origin. If it contains anything outside this line, we can use this new vector along with w to generate all of R^2.

4. Generally, a subspace of R^n is a flat through the origin: {0}, a line, a plane, or a higher-dimensional analogue.

Last time we saw that if W1 and W2 are subspaces of a vector space V then

W1 + W2 = {w1 + w2 : w1 ∈ W1, w2 ∈ W2}

is also a subspace. This is actually the smallest subspace containing both W1 and W2. You might think this would be W1 ∪ W2, but in general the union does not need to be a subspace. Consider V = R^2 over R and

W1 = {(x, 0) : x ∈ R}, W2 = {(0, y) : y ∈ R}.


Then both of these are subspaces but their union is not, since it is not closed under addition ((1, 1) = (1, 0) + (0, 1) ∉ W1 ∪ W2).

In the case that W1 ∩ W2 = {0}, we say that W1 + W2 is a direct sum, and we write it W1 ⊕ W2.

    1.3 Spanning and linear dependence

Given a subset S (not necessarily a subspace) of a vector space V, we want to generate the smallest subspace containing S.

Definition 1.3.1. Let V be a vector space and S ⊆ V. The span of S is defined as

Span(S) = ∩_{W ∈ C_S} W,

where C_S is the collection of subspaces of V containing S.

Note that the span is the smallest subspace containing S, in the sense that if W is another subspace containing S then Span(S) ⊆ W. The fact that Span(S) is a subspace follows from:

Proposition 1.3.2. Let C be a collection of subspaces of a vector space V. Then ∩_{W ∈ C} W is a subspace.

Proof. First, each W in C contains 0, so ∩_{W ∈ C} W is nonempty. If v, w ∈ ∩_{W ∈ C} W and c ∈ F, then v, w ∈ W for all W ∈ C. Since each W is a subspace, cv + w ∈ W for all W ∈ C, meaning that cv + w ∈ ∩_{W ∈ C} W, completing the proof.

    Examples.

1. Span(∅) = {0}.

2. If W is a subspace of V then Span(W) = W.

3. Span(Span(S)) = Span(S).

4. If S ⊆ T ⊆ V then Span(S) ⊆ Span(T).

There is a different way to generate the span of a set. We can imagine that our initial definition of span is from the outside in: we are intersecting spaces containing S. The second is from the inside out: it builds the span from within, using the elements of S. To define it, we introduce some notation.

Definition 1.3.3. If S ⊆ V then v ∈ V is said to be a linear combination of elements of S if there are finitely many elements v1, . . . , vn ∈ S and scalars a1, . . . , an ∈ F such that v = a1v1 + · · · + anvn.

Theorem 1.3.4. Let S ⊆ V be nonempty. Then Span(S) is the set of all linear combinations of elements of S.


Proof. Let S′ be the set of all linear combinations of elements of S. We first prove S′ ⊆ Span(S), so let a1v1 + · · · + anvn ∈ S′. Each of the vi's is in S and therefore in Span(S). By closure of Span(S) under addition and scalar multiplication, we find a1v1 + · · · + anvn ∈ Span(S).

To show that Span(S) ⊆ S′, it suffices to show that S′ is a subspace of V; then it is one of the spaces we are intersecting to get Span(S) and we will be done. Because S ≠ ∅ we can find s ∈ S, and then 1s is a linear combination of elements of S, making S′ nonempty. So let v, w ∈ S′ and c ∈ F. We can write v = a1v1 + · · · + anvn and w = b1w1 + · · · + bkwk for vi, wi ∈ S. Then

cv + w = (ca1)v1 + · · · + (can)vn + b1w1 + · · · + bkwk ∈ S′.
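Numerically, membership in the span of finitely many vectors of R^n can be tested by comparing matrix ranks: v ∈ Span(S) exactly when appending v to the vectors of S does not increase the rank. A minimal numpy sketch with arbitrarily chosen vectors:

```python
# Minimal sketch: test v in Span(S) in R^n via a rank comparison.
import numpy as np

def in_span(S, v):
    # v is in the span iff appending it as a column keeps the rank equal
    A = np.column_stack(S)
    return np.linalg.matrix_rank(np.column_stack([A, v])) == np.linalg.matrix_rank(A)

S = [np.array([1.0, 0.0, 1.0]), np.array([0.0, 1.0, 1.0])]
print(in_span(S, np.array([2.0, 3.0, 5.0])))   # True:  2*(1,0,1) + 3*(0,1,1)
print(in_span(S, np.array([0.0, 0.0, 1.0])))   # False: not a combination of S
```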

Corollary 1.3.5. If W1, W2 are subspaces of V then Span(W1 ∪ W2) = W1 + W2.

Proof. Because 0 is in W1 and in W2, we have W1 ∪ W2 ⊆ W1 + W2. Therefore W1 + W2 is one of the subspaces we intersect to get the span, and Span(W1 ∪ W2) ⊆ W1 + W2. Conversely, any element of W1 + W2 is in Span(W1 ∪ W2), as it is already a linear combination of elements of W1 ∪ W2.

Definition 1.3.6. A vector space V is finitely generated if there is a finite set S ⊆ V such that V = Span(S). Such an S is called a generating set.

The space R^n is finitely generated: we can choose

S = {(1, 0, . . . , 0), (0, 1, 0, . . . , 0), . . . , (0, . . . , 0, 1)}.

The space

R_c = {(x1, x2, . . .) : xi ∈ R, finitely many nonzero terms}

with coordinate-wise addition and scalar multiplication is not finitely generated.

    Generating sets are closely linked to linear independence.

Definition 1.3.7. A set S ⊆ V is called linearly dependent if there exists v ∈ S such that v ∈ Span(S \ {v}). We decree that ∅ is linearly independent; that is, not linearly dependent.

The intuition is that a set is linearly dependent if it contains elements that are unnecessary for spanning Span(S). Indeed, we can restate this condition for S ≠ ∅ as

S linearly dependent iff there exists v ∈ S such that Span(S) = Span(S \ {v}).

Exercise: prove this!

    Examples.


1. {0} is linearly dependent in any vector space.

2. In C^2, {(1, 0), (0, 1), (1, 1)} is linearly dependent, since (1, 1) ∈ Span({(1, 0), (0, 1)}).

3. In C^n,

{(1, 0, . . . , 0), . . . , (0, . . . , 0, 1)}

is linearly independent. Indeed, suppose we remove any element from this set. For simplicity let us take the first. Then every element in the span of the others must have zero first coordinate, and so cannot be (1, 0, . . . , 0).

    There is a very simple condition we can check to see if a set is linearly independent.

Proposition 1.3.8. Let V be a vector space and S ⊆ V. Then S is linearly independent if and only if whenever a1, . . . , an ∈ F and v1, . . . , vn ∈ S satisfy

a1v1 + · · · + anvn = 0

we must have a1 = · · · = an = 0.

Proof. If S = ∅ then S is linearly independent. Furthermore, it satisfies the condition of the proposition vacuously: it is true because we cannot ever find a linear combination of elements of S equal to 0.

Otherwise, suppose that S ≠ ∅ and S is linearly dependent. Then we can find v ∈ S such that v ∈ Span(S \ {v}). Therefore v is a linear combination of elements of S \ {v}: we can find w1, . . . , wn ∈ S \ {v} and scalars a1, . . . , an such that v = a1w1 + · · · + anwn. Then

(−a1)w1 + · · · + (−an)wn + v = 0.

This is a linear combination of elements of S equal to 0 with not all coefficients equal to 0, proving that if the condition of the proposition holds, then S must be linearly independent.

Conversely, if S is linearly independent, suppose that

a1v1 + · · · + anvn = 0

for some v1, . . . , vn ∈ S and a1, . . . , an ∈ F. If the coefficients are not all 0, we can find one, say a1, which is nonzero. Then we solve:

v1 = (−a1⁻¹)[a2v2 + · · · + anvn],

giving v1 ∈ Span(S \ {v1}). (Note here that a1⁻¹ is defined since a1 ≠ 0 and all nonzero field elements are invertible.)

Corollary 1.3.9. Let S1 ⊆ S2 ⊆ V, an F-vector space.

1. If S1 is linearly dependent, so is S2.

2. If S2 is linearly independent, so is S1.


Proof. The first item follows from the second, so we prove the second. Suppose that S2 is linearly independent and that v1, . . . , vn ∈ S1 and a1, . . . , an ∈ F are such that

a1v1 + · · · + anvn = 0.

Since these vectors are also in S2 and S2 is linearly independent, a1 = · · · = an = 0. Thus S1 is linearly independent.

Recall the intuition that a set is linearly independent if each vector in it is truly needed to represent vectors in the span. Not only are they all needed; linear independence implies that there is exactly one way to represent each vector of the span.

Proposition 1.3.10. Let S ⊆ V be linearly independent. Then for each nonzero vector v ∈ Span(S) there exists exactly one choice of v1, . . . , vn ∈ S and nonzero coefficients a1, . . . , an ∈ F such that

v = a1v1 + · · · + anvn.

Proof. Let v ∈ Span(S) be nonzero. By the characterization of the span as the set of linear combinations of elements of S, there is at least one representation as above. To show it is unique, suppose that v = a1v1 + · · · + anvn and v = b1w1 + · · · + bkwk, and write S1 = {v1, . . . , vn}, S2 = {w1, . . . , wk}. We can rearrange the Si's so that the elements v1 = w1, . . . , vm = wm are the common ones; that is, the ones in S1 ∩ S2. Then

0 = v − v = Σ_{j=1}^{m} (aj − bj)vj + Σ_{l=m+1}^{n} al vl − Σ_{p=m+1}^{k} bp wp.

This is just a linear combination of elements of S, so by linear independence all coefficients are zero, implying that aj = bj for j = 1, . . . , m, and that all other al's and bp's are zero. Thus all nonzero coefficients are the same in the two linear combinations and we are done.

    1.4 Bases

We are now interested in maximal linearly independent sets. It turns out that these must generate V as well, and we will work toward proving that.

Definition 1.4.1. Let V be an F-vector space and S ⊆ V. If S generates V and is linearly independent then we call S a basis for V.

Note that the above proposition says that any vector in V has a unique representation as a linear combination of elements from the basis.

We will soon see that any two bases of V must have the same number of elements. To prove that, we need a famous lemma. It says that if we have a linearly independent set T and a spanning set S, we can add #S − #T vectors from S to T to make it spanning.

Theorem 1.4.2 (Steinitz exchange lemma). Let S = {v1, . . . , vm} satisfy Span(S) = V and let T = {w1, . . . , wk} be linearly independent. Then


1. k ≤ m and

2. after possibly reordering the set S, we have

Span({w1, . . . , wk, vk+1, . . . , vm}) = V.

Proof. The proof is by induction on k, the size of T. If k = 0 then T is empty and thus linearly independent. In this case, we do not exchange any elements of T with elements of S, and the lemma simply states that 0 ≤ m and Span(S) = V, which is true.

Suppose that for some k ≥ 0 and all linearly independent sets T of size k, the lemma holds; we will prove it holds for k + 1, so let T = {w1, . . . , wk+1} be a linearly independent set of size k + 1. By last lecture, {w1, . . . , wk} is linearly independent, and by induction, k ≤ m and we can reorder S so that

Span({w1, . . . , wk, vk+1, . . . , vm}) = V.

Because of this we can find scalars a1, . . . , am such that

a1w1 + · · · + akwk + ak+1vk+1 + · · · + amvm = wk+1. (1)

If k = m, or if k ≤ m − 1 but all the coefficients ak+1, . . . , am are zero, then we have wk+1 ∈ Span({w1, . . . , wk}), a contradiction since T is linearly independent. Therefore we must have k + 1 ≤ m, and at least one of ak+1, . . . , am must be nonzero. Reorder the set S so that ak+1 ≠ 0. Then we can solve for vk+1 in (1) to find

vk+1 ∈ Span({w1, . . . , wk+1, vk+2, . . . , vm}).

Therefore each element of {w1, . . . , wk, vk+1, . . . , vm} can be represented as a linear combination of elements from {w1, . . . , wk+1, vk+2, . . . , vm}, and since the former set spans V, we see that

Span({w1, . . . , wk+1, vk+2, . . . , vm}) = V.

This completes the proof.

    We can now give all the consequences of this theorem.

Corollary 1.4.3. Let V be an F-vector space. If B1 and B2 are both bases for V then they have the same number of elements.

Proof. Suppose B1 is finite, with n elements, and suppose that B2 has at least n + 1 elements. Choosing any subset of B2 of size n + 1 as T, and B1 as the spanning set in the previous theorem, we see that n + 1 ≤ n, a contradiction. This means #B2 ≤ #B1. If on the other hand B1 is infinite, then if B2 were finite, we could reverse the roles of B1 and B2, apply Steinitz again, and see #B1 ≤ #B2, a contradiction. Therefore in all cases we have #B2 ≤ #B1. Applying this same logic with B1 and B2 reversed, we get #B1 ≤ #B2, proving the corollary.


Definition 1.4.4. A vector space with a basis of size n is called n-dimensional, and we write dim(V) = n. If this is true for some n we say the vector space is finite dimensional. Otherwise we say that V is infinite dimensional and write dim(V) = ∞.

Note that {0} is zero dimensional, since ∅ is a basis for it.

Corollary 1.4.5. Let V be an n-dimensional vector space (n ≥ 1) and S = {v1, . . . , vm}.

1. If m < n then S cannot span V.

2. If m > n then S cannot be linearly independent.

3. If m = n then S is linearly independent if and only if it spans V.

Proof. Let B be a basis for V. Using Steinitz with B as the linearly independent set and S as the spanning set, we see that if S spans V then S has at least n elements, proving the first part. Similarly, using Steinitz with B as the spanning set and S as the linearly independent set, we get part two.

If m = n and S is linearly independent, then Steinitz implies that we can add 0 vectors from B to S to make S span V. This means S itself spans V. Conversely, suppose S spans V but is not linearly independent. Then we can find v ∈ S such that v ∈ Span(S \ {v}), so V = Span(S) = Span(S \ {v}). Therefore S \ {v} is a smaller spanning set, contradicting the first part.

Corollary 1.4.6. If W is a subspace of V then dim(W) ≤ dim(V). In particular, if V has a finite basis, so does W.

Proof. If V is infinite dimensional there is nothing to prove, so let B be a finite basis for V of size n. Consider all subsets of W that are linearly independent. By the previous corollary, none of these have more than n elements (they cannot be infinite either, since we could then extract a linearly independent subset of size n + 1). Choose any one with the largest number of elements and call it B_W. It must be a basis: the reason is that it is a maximal linearly independent subset of W (this is an exercise on this week's homework). Because it has no more than dim(V) elements, we are done.

Now we have the first of the two main subspace theorems. It says we can extend a basis for a subspace to a basis for the full space.

Theorem 1.4.7 (One subspace theorem). Let W be a subspace of a finite-dimensional vector space V. If B_W is a basis for W, there exists a basis B of V containing B_W.

Proof. Consider all linearly independent subsets of V that contain B_W (there is at least one, B_W itself!) and choose one, S, of maximal size. We know that #S ≤ dim V, and if #S = dim V it must be a basis and we are done, so assume that #S = k < dim V. We must then have Span(S) ≠ V, so choose a vector v ∈ V \ Span(S). We claim that S ∪ {v} is linearly independent, contradicting maximality of S. To see this, write S = {v1, . . . , vk} and suppose

a1v1 + · · · + akvk + bv = 0.


If b ≠ 0 then we can solve for v, getting v ∈ Span(S), a contradiction; so we must have b = 0. But then a1v1 + · · · + akvk = 0, and linear independence of S gives ai = 0 for all i. Thus S ∪ {v} is linearly independent, which is the desired contradiction.

    The second subspace theorem will follow from a dimension theorem.

Theorem 1.4.8. Let W1, W2 be subspaces of V, a finite-dimensional vector space. Then

dim(W1 + W2) + dim(W1 ∩ W2) = dim(W1) + dim(W2).

Proof. Let B be a basis for the intersection W1 ∩ W2. By the one subspace theorem we can find bases B1 and B2 of W1 and W2 respectively that both contain B. Write

B = {v1, . . . , vk},
B1 = {v1, . . . , vk, vk+1, . . . , vl},
B2 = {v1, . . . , vk, wk+1, . . . , wm}.

We will now show that B′ = B1 ∪ B2 is a basis for W1 + W2. This will prove the theorem, since then dim(W1 + W2) + dim(W1 ∩ W2) = (l + m − k) + k = l + m.

To show that B′ is a basis for W1 + W2 we first must prove Span(B′) = W1 + W2. Since B′ ⊆ W1 + W2, we have Span(B′) ⊆ Span(W1 + W2) = W1 + W2. On the other hand, each vector in W1 + W2 can be written as w1 + w2 for w1 ∈ W1 and w2 ∈ W2. Because B′ contains a basis for each of W1 and W2, these vectors w1 and w2 can be written in terms of vectors in B′, so w1 + w2 ∈ Span(B′).

Next we show that B′ is linearly independent. We set a linear combination equal to zero:

a1v1 + · · · + akvk + ak+1vk+1 + · · · + alvl + bk+1wk+1 + · · · + bmwm = 0. (2)

Moving the w terms to one side, we find that bk+1wk+1 + · · · + bmwm ∈ W1. But this sum is already in W2, so it must be in the intersection. As B is a basis for the intersection we can write

bk+1wk+1 + · · · + bmwm = c1v1 + · · · + ckvk

for some ci's in F. Moving the w's to one side and using linear independence of B2 gives bk+1 = · · · = bm = 0. Therefore (2) reads

a1v1 + · · · + alvl = 0.

Using linear independence of B1 gives ai = 0 for all i, and thus B′ is linearly independent.
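As a sanity check over R, the dimension formula can be verified for random subspaces, computing dim(W1 + W2) as a rank and dim(W1 ∩ W2) from the null space of [B1 | −B2], since pairs (a, b) with B1a = B2b parametrize the intersection. A minimal numpy sketch with arbitrary dimensions:

```python
# Minimal sketch: check dim(W1+W2) + dim(W1 ∩ W2) = dim W1 + dim W2 numerically.
import numpy as np

rng = np.random.default_rng(1)
B1 = rng.normal(size=(5, 3))            # columns span W1 in R^5 (generically rank 3)
B2 = rng.normal(size=(5, 3))            # columns span W2

rank = np.linalg.matrix_rank
dim_sum = rank(np.hstack([B1, B2]))     # dim(W1 + W2)

# dim(W1 ∩ W2) = nullity of [B1 | -B2] when B1, B2 have independent columns
nullity = B1.shape[1] + B2.shape[1] - rank(np.hstack([B1, -B2]))

print(dim_sum + nullity == rank(B1) + rank(B2))   # True
```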

    The proof of this theorem gives:

Theorem 1.4.9 (Two subspace theorem). If W1, W2 are subspaces of a finite-dimensional vector space V, there exists a basis of V that contains bases of W1 and W2.

Proof. Use the proof of the last theorem to get a basis for W1 + W2 containing bases of W1 and W2. Then use the one subspace theorem to extend it to a basis of V.


Note the difference from the one subspace theorem. We are not claiming that you can extend any given bases of W1 and W2 to a basis of V. We are just claiming there exists at least one basis of V such that part of this basis is a basis for W1 and part is a basis for W2.

In fact, given bases of W1 and W2 we cannot generally find a basis of V containing these bases. Take

V = R^3, W1 = {(x, y, 0) : x, y ∈ R}, W2 = {(x, 0, z) : x, z ∈ R}.

If we take bases B1 = {(1, 0, 0), (1, 1, 0)} and B2 = {(1, 0, 1), (0, 0, 1)}, there is no basis of V = R^3 containing both B1 and B2, since B1 ∪ B2 has four elements and V is 3-dimensional.

    1.5 Exercises

We will write N = {1, 2, . . .} and Z = {. . . , −1, 0, 1, . . .} for the natural numbers and integers, respectively. Let N₀ = N ∪ {0}. The rationals are Q = {m/n : m, n ∈ Z, n ≠ 0}, and R stands for the real numbers.

1. If a, b ∈ N, we say that a divides b, written a|b, if there is another natural number c such that b = ac. Fix m, n ∈ N and define

S = {mp + nq : p, q ∈ Z} ∩ N.

(a) Let d be the smallest element of S. Show that d|m and d|n.

Hint. You can use the statement of the division algorithm without proof; that is, if a, b ∈ N then there exist r, s ∈ N₀ such that r < b and a = bs + r.

(b) Show that if e is another element of N that divides both m and n then e|d. This number d is called the greatest common divisor of m and n, written d = gcd(m, n).

(c) For any nonzero integers m, n define gcd(m, n) = gcd(|m|, |n|). Show there exist p, q ∈ Z such that mp + nq = gcd(m, n).

2. Let p be a prime and Z_p be the set {0, . . . , p − 1}. Show that Z_p is a field using the operations

a · b = (ab) mod p and a + b = (a + b) mod p.

Here we have defined a mod p, for a ∈ N₀, as the unique r ∈ N₀ with r < p such that a = ps + r for some s ∈ N₀.

3. Let S be a nonempty set and F a field. Let V be the set of functions from S to F, and define addition and scalar multiplication on (V, F) by

(f + g)(s) = f(s) + g(s) and (cf)(s) = c(f(s)).

Show V is a vector space over F.


4. Let W be a subspace of an F-vector space V and define the set

V/W = {v + W : v ∈ V}.

Here the notation v + W means the set {v + w : w ∈ W}, so V/W is a set whose elements are sets.

(a) Show that two elements v1 + W and v2 + W of V/W are equal if and only if v1 − v2 ∈ W. In this case we say that v1 and v2 are equivalent modulo W.

(b) Show that the elements of V/W form a partition of V. That is, their union is V and distinct elements must have empty intersection.

(c) In the case of V = R^2 and W = {(x, y) : x + y = 0}, with F = R, give a geometric description of the elements of V/W.

(d) Define addition and scalar multiplication on V/W as follows. For C1, C2 ∈ V/W, select v1, v2 ∈ V such that C1 = v1 + W and C2 = v2 + W. Define

C1 + C2 = (v1 + v2) + W and cC1 = (cv1) + W for c ∈ F.

Show that these definitions do not depend on the choice of v1, v2.

(e) Prove that the above operations turn V/W into an F-vector space. It is called the quotient space of V over W.

5. If V is an F-vector space, recall that V is finitely generated if there is a finite set S ⊆ V such that V = Span(S).

(a) Is R finitely generated as a vector space over Q?

(b) Is the space of functions from R to R finitely generated as a vector space over R?

6. Show that if S ⊆ V is a finite generating set then S contains a basis for V. Deduce that V is finitely generated if and only if it has a finite basis.

7. Let S ⊆ V.

(a) Suppose that S generates V but no proper subset of S generates V (that is, S is a minimal spanning set). Show that S is a basis.

(b) Suppose that S is linearly independent and is not a proper subset of any linearly independent set in V (that is, S is a maximal linearly independent set). Show that S is a basis.

8. Let W be a subspace of V.

(a) We say that S ⊆ V is linearly independent modulo W if whenever v1, . . . , vk ∈ S and a1, . . . , ak ∈ F are such that

a1v1 + · · · + akvk ∈ W

then a1 = · · · = ak = 0. Show that S is linearly independent modulo W if and only if the set {v + W : v ∈ S} is linearly independent as a subset of V/W.


(b) Assume now that V has dimension n ≥ 2. [...]

9. [...] (a) Show that for k ≥ 3 this is not true. That is, if W1, . . . , Wk satisfy Wi ∩ Wj = {0} for all i ≠ j, then these spaces need not be independent.

(b) Prove that the following are equivalent.

1. W1, . . . , Wk are independent.

2. Whenever w1 + · · · + wk = 0 with wi ∈ Wi for all i, then wi = 0 for all i.

3. Whenever Bi is a basis for Wi for all i, the Bi's are disjoint and B := ∪_{i=1}^{k} Bi is a basis for W1 + · · · + Wk.

10. Give an example to show that there is no three subspace theorem. That is, if W1, W2, W3 are subspaces of V, then there need not exist a basis of V containing a basis for Wi for all i = 1, 2, 3.

11. Let F be a finite field. Define a sequence (sn) of elements of F by s1 = 1 and s_{n+1} = sn + 1 for n ∈ N. Last, define the characteristic of F as

char(F) = min{n ∈ N : sn = 0}.

(If the set on the right is empty, we set char(F) = 0.)

(a) Show that because F is finite, its characteristic is a prime number p.

(b) Show that the set {0, s1, . . . , s_{p−1}}, with the same addition and multiplication as in F, is itself a field, called the prime subfield of F.

(c) Using the fact that F can be viewed as a vector space over its prime subfield, show that F has pⁿ elements for some n ∈ N.


    2 Linear transformations

    We now move on to the main subject of the course, linear transformations.

    2.1 Definitions

Definition 2.1.1. Let V and W be vector spaces over the same field F. A function T : V → W is called a linear transformation if

T(v1 + v2) = T(v1) + T(v2) and T(cv1) = cT(v1) for all v1, v2 ∈ V and c ∈ F.

As usual, we only need to check the single condition

T(cv1 + v2) = cT(v1) + T(v2) for v1, v2 ∈ V and c ∈ F.

    Examples

1. Consider C as a vector space over itself. Then if T : C → C is linear, we can write

T(z) = zT(1),

so T is completely determined by its value at 1.

2. Let V be finite dimensional and B = {v1, . . . , vn} a basis for V. Each v ∈ V can be written uniquely as

v = a1v1 + · · · + anvn for ai ∈ F.

So define T : V → F^n by T(v) = (a1, . . . , an). This is called the coordinate map relative to B. It is linear because if v = a1v1 + · · · + anvn, w = b1v1 + · · · + bnvn and c ∈ F, then

cv + w = (ca1 + b1)v1 + · · · + (can + bn)vn

is one representation of cv + w in terms of the basis. But this representation is unique, so we get

T(cv + w) = (ca1 + b1, . . . , can + bn) = c(a1, . . . , an) + (b1, . . . , bn) = cT(v) + T(w).

3. Given any m × n matrix A with entries from F (the notation from the homework is A ∈ M_{m,n}(F)), we can define linear transformations L_A : F^n → F^m and R_A : F^m → F^n by

L_A(v) = A · v and R_A(v) = v · A.

Here we are using matrix multiplication; in the first case we represent v as a column vector, and in the second as a row vector.


4. In fact, the set of linear transformations from V to W, written L(V, W), forms a vector space! Since the space of functions from V to W is a vector space, it suffices to check that L(V, W) is a subspace. So given T, U ∈ L(V, W) and c ∈ F, we must show that cT + U is a linear transformation. So let v, w ∈ V and c′ ∈ F:

(cT + U)(c′v + w) = (cT)(c′v + w) + U(c′v + w)
= c(T(c′v + w)) + U(c′v + w)
= c(c′T(v) + T(w)) + c′U(v) + U(w)
= c′(cT(v) + U(v)) + cT(w) + U(w)
= c′(cT + U)(v) + (cT + U)(w).

    Another obvious way to build linear transformations is composition.

Proposition 2.1.2. Let T : V → W and U : W → Z be linear (with all spaces over the same field F). Then the composition U ∘ T is a linear transformation from V to Z.

Proof. Let v1, v2 ∈ V and c ∈ F. Then

(U ∘ T)(cv1 + v2) = U(T(cv1 + v2)) = U(cT(v1) + T(v2)) = cU(T(v1)) + U(T(v2)) = c(U ∘ T)(v1) + (U ∘ T)(v2).

Recall that each linear T : C → C is completely determined by its value at 1. Note that {1} is a basis. This fact holds true for all linear transformations and is one of the most important theorems of the course: in the words of Conway, each linear transformation is completely determined by its values on a basis, and any values will do!

Theorem 2.1.3 (The slogan). Let V and W be vector spaces over F. If {v1, . . . , vn} is a basis for V and w1, . . . , wn are any vectors in W (with possible duplicates), then there is exactly one T ∈ L(V, W) such that T(vi) = wi for all i = 1, . . . , n.

Proof. This is an existence and uniqueness statement, so let's first prove uniqueness. Suppose that T, U ∈ L(V, W) both map vi to wi for all i. Then write an arbitrary v ∈ V uniquely as v = a1v1 + · · · + anvn. We have

T(v) = T(a1v1 + · · · + anvn) = a1T(v1) + · · · + anT(vn) = a1w1 + · · · + anwn
= a1U(v1) + · · · + anU(vn) = U(a1v1 + · · · + anvn) = U(v).

To prove existence we must construct one such linear map. Each v ∈ V can be written uniquely as v = a1v1 + · · · + anvn, so define T : V → W by

T(v) = a1w1 + · · · + anwn.

The fact that T is a function (that is, for each v ∈ V there is exactly one w ∈ W such that T(v) = w) follows from uniqueness of the representation of v in terms of the basis. So we


must show linearity. If v, v′ ∈ V, write v = a1v1 + · · · + anvn and v′ = b1v1 + · · · + bnvn. We have, for c ∈ F,

T(cv + v′) = T((ca1 + b1)v1 + · · · + (can + bn)vn)
= (ca1 + b1)w1 + · · · + (can + bn)wn
= c(a1w1 + · · · + anwn) + (b1w1 + · · · + bnwn)
= cT(v) + T(v′).

    2.2 Range and nullspace

Next we define two very important subspaces related to a linear transformation T.

Definition 2.2.1. Let T : V → W be linear. The nullspace, or kernel, of T is the set N(T) ⊆ V defined by

N(T) = {v ∈ V : T(v) = 0}.

The range, or image, of T is the set R(T) ⊆ W defined by

R(T) = {w ∈ W : T(v) = w for some v ∈ V}.

In the definition of N(T) above, 0 is the zero vector in the space W.

Proposition 2.2.2. Let T : V → W be linear. Then N(T) is a subspace of V and R(T) is a subspace of W.

Proof. First, N(T) is nonempty, since each linear transformation must map 0 to 0: T(0) = T(0 · 0) = 0 · T(0) = 0. If v1, v2 ∈ N(T) and c ∈ F,

T(cv1 + v2) = cT(v1) + T(v2) = c0 + 0 = 0,

so cv1 + v2 ∈ N(T), showing that N(T) is a subspace of V. R(T) is also nonempty, since 0 is mapped to by 0. If w1, w2 ∈ R(T) and c ∈ F, choose v1, v2 ∈ V such that T(v1) = w1 and T(v2) = w2. Then

cw1 + w2 = cT(v1) + T(v2) = T(cv1 + v2),

so cw1 + w2 is mapped to by cv1 + v2, a vector in V, and we are done.

In the finite-dimensional case, the dimensions of these spaces are so important that they get their own names: the rank of T is the dimension of R(T) and the nullity of T is the dimension of N(T). The next theorem relates these dimensions to each other.

Theorem 2.2.3 (Rank-nullity). Let T : V → W be linear with dim(V) < ∞. Then

rank(T) + nullity(T) = dim(V).

Proof. In a way, this theorem is best proved using quotient spaces, and you will do this in the homework. We will prove it the more standard way, by counting and using bases. Let {v1, . . . , vk} be a basis for the nullspace of T and extend it to a basis {v1, . . . , vk, vk+1, . . . , vn} for V. We claim that T(vk+1), . . . , T(vn) are distinct and form a basis for R(T); this will complete the proof. If T(vi) = T(vj) for some i ≠ j in {k+1, . . . , n}, we then have T(vi − vj) = 0, implying that vi − vj ∈ N(T). But we have a basis for N(T): we can write

vi − vj = a1v1 + · · · + akvk,

and subtracting vi − vj to the other side, we have a linear combination of elements of a basis equal to zero with some nonzero coefficients, a contradiction.

Now we show B = {T(vk+1), . . . , T(vn)} is a basis for R(T). These vectors are clearly contained in the range, so Span(B) ⊆ R(T). Conversely, if w ∈ R(T) we can write w = T(v) for some v ∈ V and, using the basis, find coefficients bi such that

w = T(v) = T(b1v1 + · · · + bnvn).

Expanding the inside, we get b1T(v1) + · · · + bnT(vn). The first k vectors are zero, since v1, . . . , vk ∈ N(T), so

w = bk+1T(vk+1) + · · · + bnT(vn),

proving w ∈ Span(B), and therefore B spans R(T).

For linear independence, let bk+1T(vk+1) + · · · + bnT(vn) = 0. Then

0 = T(bk+1vk+1 + · · · + bnvn),

so bk+1vk+1 + · · · + bnvn ∈ N(T). As before, we can then write this vector in terms of v1, . . . , vk and use linear independence of {v1, . . . , vn} to get bi = 0 for all i.
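For matrices over R the theorem is easy to watch numerically: numpy reports the rank, and scipy returns a basis of the nullspace. A minimal sketch with an arbitrary random 4 × 6 matrix:

```python
# Minimal sketch: rank-nullity for the map L_A : R^6 -> R^4, v |-> A v.
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(2)
A = rng.normal(size=(4, 6))

rank = np.linalg.matrix_rank(A)          # dim R(L_A)
nullity = null_space(A).shape[1]         # dim N(L_A), via the SVD under the hood

print(rank + nullity == A.shape[1])      # True: rank + nullity = dim(V) = 6
```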

One reason the range and nullspace are important is that they tell us when a transformation is one-to-one (injective) or onto (surjective). Recall these definitions:

Definition 2.2.4. If X and Y are sets and f : X → Y is a function, then we say that f is one-to-one (injective) if f maps distinct points to distinct points; that is, if x1, x2 ∈ X with x1 ≠ x2 then f(x1) ≠ f(x2). We say that f is onto (surjective) if each point of Y is mapped to by some x; that is, for each y ∈ Y there exists x ∈ X such that f(x) = y.

Proposition 2.2.5. Let T : V → W be linear. Then

1. T is injective if and only if N(T) = {0}.

2. T is surjective if and only if R(T) = W.

Proof. The second is just the definition of surjective, so we prove the first. Suppose that T is injective and let v ∈ N(T). Then T(v) = 0 = T(0), and because T is injective, v = 0, proving that N(T) ⊆ {0}. As N(T) is a subspace, we have {0} ⊆ N(T), giving equality.

Conversely, suppose that N(T) = {0}; we will prove that T is injective. Assume that T(v1) = T(v2). By linearity, T(v1 − v2) = 0, so v1 − v2 ∈ N(T). But the only vector in N(T) is the zero vector, so v1 − v2 = 0, giving v1 = v2, and T is injective.


In the previous proposition, the second part holds for all functions T, regardless of whether they are linear. The first, however, need not be true if T is not linear. (Think of an example!)

    We can give an alternative characterization of one-to-one and onto:

Proposition 2.2.6. Let T : V → W be linear.

1. T is injective if and only if it maps linearly independent sets of V to linearly independent sets of W.

2. T is surjective if and only if it maps spanning sets of V to spanning sets of W.

3. T is bijective if and only if it maps bases of V to bases of W.

Proof. The third part follows from the first two. For the first, assume that T is injective and let S ⊆ V be linearly independent. We will show that T(S) = {T(v) : v ∈ S} is linearly independent. So let

a1T(v1) + · · · + anT(vn) = 0.

This implies that T(a1v1 + · · · + anvn) = 0, implying that a1v1 + · · · + anvn = 0 by injectivity. But this is a linear combination of vectors in S, a linearly independent set, giving ai = 0 for all i. Thus T(S) is linearly independent.

Conversely, suppose that T maps linearly independent sets to linearly independent sets, and let v ∈ N(T). If v ≠ 0 then {v} is linearly independent, so {T(v)} is linearly independent. But if T(v) = 0 this is impossible, since {0} is linearly dependent. Thus v = 0 and N(T) = {0}, implying T is injective.

For item two, suppose that T is surjective and let S be a spanning set for V. Then if w ∈ W we can find v ∈ V such that T(v) = w, and a linear combination of vectors of S equal to v: v = a1v1 + · · · + anvn for vi ∈ S. Therefore

w = T(v) = a1T(v1) + · · · + anT(vn),

meaning that w ∈ Span(T(S)), so T(S) spans W. Conversely, if T maps spanning sets to spanning sets, then T(V) = R(T) must span W. But since R(T) is a subspace of W, this means R(T) = W and T is onto.

    2.3 Isomorphisms

Definition 2.3.1. A linear transformation T : V → W that is bijective (that is, injective and surjective) is called an isomorphism.

Generally speaking, we can view a bijection between sets X and Y as a relabeling of the elements of X (to get those of Y). In the case of an isomorphism, this relabeling also respects the vector space structure, being linear.

Proposition 2.3.2. Let T : V → W be an isomorphism. Then T⁻¹ : W → V is an isomorphism. Here, as always, the inverse function is defined by

T⁻¹(w) = v if and only if T(v) = w.


Proof. It is an exercise to see that any bijection has a well-defined inverse function and that this inverse function is a bijection. (This was done, for example, in the 215 notes in the first chapter.) So we must only show that T⁻¹ is linear. To this end, let w1, w2 ∈ W and c ∈ F. Then

T(T⁻¹(cw1 + w2)) = cw1 + w2,

whereas

T(cT⁻¹(w1) + T⁻¹(w2)) = cT(T⁻¹(w1)) + T(T⁻¹(w2)) = cw1 + w2.

Since T is injective, this implies that T⁻¹(cw1 + w2) = cT⁻¹(w1) + T⁻¹(w2).

Using the notion of isomorphism, we can see that any n-dimensional vector space V over F is essentially just F^n.

Theorem 2.3.3. Let V be an n-dimensional vector space over F. Then V is isomorphic to F^n.

Proof. Let B = {v1, . . . , vn} be a basis for V. We will think of B as being ordered. Define the coordinate map T_B : V → F^n as before: each v ∈ V has a unique representation v = a1v1 + · · · + anvn, so set T_B(v) = (a1, . . . , an). This was shown before to be a linear transformation, so we must just show it is an isomorphism.

Since the dimension of V is equal to that of F^n, we need only show that T_B is onto. Then by the rank-nullity theorem, we will find

dim N(T_B) = dim(V) − dim(R(T_B)) = dim(V) − dim(F^n) = 0,

implying that N(T_B) = {0}, and that T_B is one-to-one. So to show onto, let (a1, . . . , an) ∈ F^n. The element v = a1v1 + · · · + anvn maps to it:

T_B(v) = T_B(a1v1 + · · · + anvn) = (a1, . . . , an),

so T_B is an isomorphism.
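Concretely, for V = R^n with a chosen basis, the coordinate map T_B is computed by solving a linear system against the matrix whose columns are the basis vectors. A minimal numpy sketch (the basis of R^2 is an arbitrary choice):

```python
# Minimal sketch: the coordinate map T_B(v) = [v]_B as a linear solve.
import numpy as np

B = np.column_stack([[1.0, 1.0], [1.0, -1.0]])   # columns: basis v1, v2 of R^2

def coords(v):
    # the unique a with v = a1 v1 + a2 v2
    return np.linalg.solve(B, v)

v = np.array([3.0, 1.0])
a = coords(v)
print(a)        # (2, 1): indeed v = 2*v1 + 1*v2
print(B @ a)    # reconstructs v, so T_B is invertible
```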

    2.4 Matrices and coordinates

We will now see that, just as an n-dimensional V looks just like F^n, all linear maps from V to W look just like matrices with entries from F.

Suppose that T : V → W is linear, and these are finite-dimensional vector spaces of dimensions n and m respectively. Fix B = {v1, . . . , vn} and C = {w1, . . . , wm} to be bases of V and W respectively. We know that T is completely determined by its values on B, and each of these values lies in W, so we can write

T(v1) = a_{1,1}w1 + · · · + a_{m,1}wm,
T(v2) = a_{1,2}w1 + · · · + a_{m,2}wm,


and so on, up to

T(vn) = a_{1,n}w1 + · · · + a_{m,n}wm.

Now we take some arbitrary v ∈ V and express it in terms of coordinates using B. This time we write it as a column vector and use the notation [v]_B:

[v]_B = (a1, . . . , an)ᵗ, where v = a1v1 + · · · + anvn.

Let us compute T(v) and write it in terms of C:

T(v) = a1T(v1) + · · · + anT(vn)
= a1(a_{1,1}w1 + · · · + a_{m,1}wm) + · · · + an(a_{1,n}w1 + · · · + a_{m,n}wm)
= (a1a_{1,1} + · · · + ana_{1,n})w1 + · · · + (a1a_{m,1} + · · · + ana_{m,n})wm.

Therefore we can write T(v) in coordinates using C as

[T(v)]_C = (a1a_{1,1} + · · · + ana_{1,n}, . . . , a1a_{m,1} + · · · + ana_{m,n})ᵗ = (a_{i,j})[v]_B,

where (a_{i,j}) is the m × n matrix with (i, j)-th entry a_{i,j}. Therefore we have found one half of:

Theorem 2.4.1 (Matrix representation). Let T : V → W be linear and B = {v1, . . . , vn} and C = {w1, . . . , wm} be (ordered) bases of V and W respectively. There exists a unique matrix, written [T]_B^C, such that for all v ∈ V,

[T(v)]_C = [T]_B^C [v]_B.

Proof. We have already shown existence. To show uniqueness, suppose that A is any m × n matrix with entries from F such that for all v ∈ V, A[v]_B = [T(v)]_C. Choose v = vi for some i = 1, . . . , n (one of the basis vectors in B). Then the coordinate representation of v is [v]_B = e_i, the vector with all 0's but a 1 in the i-th spot. Now the product A[v]_B actually gives the i-th column of A. We can see this by using the matrix multiplication formula: if M is an m × n matrix and N is an n × p matrix, then the matrix MN is m × p and its (i, j)-th coordinate is given by

(MN)_{i,j} = Σ_{k=1}^{n} M_{i,k} N_{k,j}.

Therefore, as A is m × n and [v]_B is n × 1, the matrix A[v]_B is m × 1 and its (j, 1)-th coordinate is

(A[v]_B)_{j,1} = Σ_{k=1}^{n} A_{j,k} ([v]_B)_{k,1} = Σ_{k=1}^{n} A_{j,k} (e_i)_{k,1} = A_{j,i}.

This means the entries of A[v]_B are A_{1,i}, A_{2,i}, . . . , A_{m,i}, the i-th column of A. However, this also equals [T(vi)]_C, which is the i-th column of [T]_B^C by construction. Thus A and [T]_B^C have the same columns and are thus equal.
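The uniqueness proof doubles as an algorithm: the j-th column of [T]_B^C is [T(vj)]_C. A minimal numpy sketch that builds the matrix of an arbitrary map L_A of R^2 relative to non-standard bases and checks the defining property:

```python
# Minimal sketch: build [T]_B^C column by column as [T(v_j)]_C.
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])             # T = L_A in standard coordinates
B = np.column_stack([[1.0, 0.0], [1.0, 1.0]])      # basis of the domain
C = np.column_stack([[2.0, 0.0], [0.0, 1.0]])      # basis of the codomain

# column j of [T]_B^C solves C x = T(v_j)
M = np.column_stack([np.linalg.solve(C, A @ B[:, j]) for j in range(B.shape[1])])

v = np.array([5.0, -1.0])
lhs = np.linalg.solve(C, A @ v)          # [T(v)]_C
rhs = M @ np.linalg.solve(B, v)          # [T]_B^C [v]_B
print(np.allclose(lhs, rhs))             # True
```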


In fact, much more is true. What we have done so far is to define a mapping Φ : L(V, W) → M_{m,n}(F) in the following manner: given fixed bases B and C of sizes n and m respectively, we set

Φ(T) = [T]_B^C.

This function is actually an isomorphism, meaning that the space of linear transformations is just a relabeling of the space of matrices (after choosing coordinates B and C):

Theorem 2.4.2. Given bases B and C of V and W of sizes n and m, the spaces L(V, W) and M_{m,n}(F) are isomorphic via the mapping Φ.

Proof. We must show that Φ is a bijection and linear. First off, if Φ(T) = Φ(U) then for all v ∈ V, we have

[T(v)]_C = Φ(T)[v]_B = Φ(U)[v]_B = [U(v)]_C.

But the map sending vectors in W to their coordinates relative to C is also a bijection, so T(v) = U(v). Since this is true for all v, we get T = U, meaning Φ is injective. To show surjective, let A be any m × n matrix with (i, j)-th entry A_{i,j}. Then we can define a linear transformation T : V → W by its action on the basis B: set

T(vi) = A_{1,i}w1 + · · · + A_{m,i}wm.

By the slogan, there is a unique linear transformation satisfying this, and you can then check that [T]_B^C = A, meaning Φ is surjective and therefore a bijection.

To see that Φ is linear, let T, U ∈ L(V, W) and c ∈ F. Then the i-th column of [cT + U]_B^C is simply the coefficients of (cT + U)(vi) expressed relative to the basis C. This coordinate map is linear, so

[(cT + U)(vi)]_C = [cT(vi) + U(vi)]_C = c[T(vi)]_C + [U(vi)]_C,

which is c times the i-th column of Φ(T) plus the i-th column of Φ(U). Thus

[cT + U]_B^C = c[T]_B^C + [U]_B^C.

Last time we saw that if V and W have dimensions n and m and we fix bases B of V and C of W, then there is an isomorphism Φ : L(V, W) → M_{m,n}(F) given by

Φ(T) = [T]_B^C.

A simple corollary follows. Because the image of any basis under an isomorphism is a basis, these spaces have the same dimension:

Corollary 2.4.3. The dimension of L(V, W) is mn, where V has dimension n and W has dimension m. Given bases B of V and C of W, a basis of L(V, W) is given by the set of size mn

{T_{i,j} : 1 ≤ i ≤ n, 1 ≤ j ≤ m},

where T_{i,j} is the unique linear transformation sending vi to wj and all other elements of B to 0.


Proof. Since L(V, W) and M_{m,n}(F) are isomorphic, they have the same dimension, which in the latter case is mn (that was a homework problem). Further, the basis of M_{m,n}(F) of size mn given by the matrices with a 1 in the (i, j)-th entry and 0 everywhere else maps by Φ⁻¹ to a basis for L(V, W), and it is exactly the set listed in the corollary.

We can now give many nice properties of the matrix representation.

1. Let T : V → W and U : W → Z be linear, with B, C, D bases for V, W, Z. For any v ∈ V,

[(U ∘ T)v]_D = [U(T(v))]_D = [U]_C^D [T(v)]_C = [U]_C^D [T]_B^C [v]_B.

However, [U ∘ T]_B^D is the unique matrix with this property, so we find

[U ∘ T]_B^D = [U]_C^D [T]_B^C.

In other words, composition of transformations corresponds to matrix multiplication. A good way to remember this is that the C's cancel out on the right.

2. If T : V → W is an isomorphism, setting Id_V : V → V and Id_W : W → W as the identity maps and I as the identity matrix,

I = [Id_V]_B^B = [T⁻¹]_C^B [T]_B^C,
I = [Id_W]_C^C = [T]_B^C [T⁻¹]_C^B.

In other words, [T]_B^C is an invertible matrix.

Definition 2.4.4. We say that A ∈ M_{n,n}(F) is invertible if there is a B ∈ M_{n,n}(F) such that AB = BA = I.

You will show in the homework that if A is invertible, there is exactly one (invertible) B that satisfies AB = BA = I. Therefore we write A⁻¹ = B. This gives

([T]_B^C)⁻¹ = [T⁻¹]_C^B.

Exercise: if A is an invertible n × n matrix and B is a basis for V, then there is an isomorphism T : V → V such that [T]_B^B = A.

We summarize the relation between linear transformations and matrices in the following table. Fix V, W, T : V → W and bases B, C of V, W.

Linear transformations        Matrices
v ∈ V                         the n × 1 column vector [v]_B
w ∈ W                         the m × 1 column vector [w]_C
T                             the m × n matrix [T]_B^C
U ∘ T (composition)           [U]_C^D [T]_B^C (matrix multiplication)
isomorphisms                  invertible matrices


3. Change of basis. Suppose we have T : V → W with B, C bases of V, W. We would like to relate [T]_B^C to [T]_{B′}^{C′}, the matrix relative to other bases B′, C′ of V, W. How do we do this? Consider the matrices [Id_V]_{B′}^B and [Id_W]_C^{C′}:

[Id_W]_C^{C′} [T]_B^C [Id_V]_{B′}^B = [Id_W ∘ T ∘ Id_V]_{B′}^{C′} = [T]_{B′}^{C′}.

Note that [Id_W]_C^{C′} and [Id_V]_{B′}^B are invertible. Therefore:

If T : V → W is linear and B, B′ are bases of V with C, C′ bases of W, there exist invertible matrices P = [Id_W]_C^{C′} ∈ M_{m,m}(F) and Q = [Id_V]_{B′}^B ∈ M_{n,n}(F) such that

[T]_{B′}^{C′} = P [T]_B^C Q.

Not only is each [Id_V]_{B′}^B invertible; each invertible matrix can be seen as a change of basis matrix: given a basis B of V and an invertible matrix P ∈ M_{n,n}(F), there exists a basis B′ of V such that P = [Id_V]_{B′}^B.

Proof. By the exercise above, there is an isomorphism T_P : V → V such that [T_P]_B^B = P. Writing B = {v1, . . . , vn}, define B′ = {T_P(v1), . . . , T_P(vn)}. Then the j-th column of [Id_V]_{B′}^B is computed by evaluating

[Id_V(T_P(vj))]_B = [T_P(vj)]_B = j-th column of [T_P]_B^B = j-th column of P.

So [Id_V]_{B′}^B and P have the same columns and are thus equal.

In one case we have a simpler form for P and Q. Suppose that T : V → V is linear and B, B′ are bases for V. Then

[T]_{B′}^{B′} = [Id_V]_B^{B′} [T]_B^B [Id_V]_{B′}^B.

That is, we have [T]_{B′}^{B′} = P⁻¹ [T]_B^B P, where P = [Id_V]_{B′}^B is an invertible n × n matrix. This motivates the following definition.

Definition 2.4.5. Two n × n matrices A and B are said to be similar if there is an invertible n × n matrix P such that B = P⁻¹AP.

The message is that similar matrices represent the same transformation, but relative to different bases. Therefore if some property of matrices is the same for all similar matrices, we are right to say it is a property of the underlying transformation. For instance, we define the trace of an n × n matrix A by

Tr(A) = Σ_{i=1}^{n} A_{i,i}.

We can show easily that Tr(AB) = Tr(BA):

Tr(AB) = Σ_{i=1}^{n} (AB)_{i,i} = Σ_{i=1}^{n} Σ_{k=1}^{n} A_{i,k} B_{k,i} = Σ_{k=1}^{n} Σ_{i=1}^{n} B_{k,i} A_{i,k} = Σ_{k=1}^{n} (BA)_{k,k} = Tr(BA).


Therefore if P is invertible, Tr(P⁻¹AP) = Tr(AP P⁻¹) = Tr(A). This means that if T : V → V is linear, we can define its trace as Tr(T) = Tr([T]_B^B) for any basis B (and it will not depend on our choice of B!).
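A quick numerical confirmation that the trace is a similarity invariant, using an arbitrary random matrix and a random (generically invertible) change of basis; a minimal sketch:

```python
# Minimal sketch: Tr(P^{-1} A P) = Tr(A) for any invertible P.
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(4, 4))
P = rng.normal(size=(4, 4))          # generically invertible

similar = np.linalg.inv(P) @ A @ P   # the "same transformation" in another basis
print(np.isclose(np.trace(similar), np.trace(A)))   # True
```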

    2.5 Exercises

1. Let T : V → V be linear with dim V < ∞. [...]

2. Let T : V → W be linear, where dim(V) = n and dim(W) = m.

(a) Prove that if n > m then T cannot be injective.

(b) Prove that if n < m then T cannot be surjective.

(c) Prove that if n = m then T is injective if and only if it is surjective.

3. Let V, W and Z be finite-dimensional vector spaces over F. If T : V → W and U : W → Z are linear, show that

rank(U ∘ T) ≤ min{rank(U), rank(T)}.

Prove also that if either of U or T is invertible, the rank of U ∘ T is equal to the rank of the other one. Deduce that if P : V → V and Q : W → W are isomorphisms, then the rank of Q ∘ T ∘ P equals the rank of T.

4. Given an angle θ ∈ [0, 2π), let T_θ : R^2 → R^2 be the function that rotates a vector clockwise about the origin by the angle θ. Find [T_θ]_B^B, where B = {(1, 0), (0, 1)}.

5. Let V and W be finite-dimensional vector spaces over F and T : V → W linear. Show there exist ordered bases B of V and C of W such that

([T]_B^C)_{i,j} = 0 if i ≠ j, and 0 or 1 if i = j.

6. Let F be a field and consider the vector space of polynomials of degree at most n:

F_n[x] = {a_n xⁿ + · · · + a_0 : a_i ∈ F for i = 0, . . . , n}.

(a) Show that B = {1, x, x², . . . , xⁿ} is a basis for this space.

(b) Fix an element b ∈ F and define the evaluation map T_b : F_n[x] → F by T_b(p) = p(b). Show this is linear. Find the range and nullspace of T_b.

(c) Give the representation of T_b in terms of the basis B for F_n[x] and the basis {1} for F.


(d) For distinct b1, . . . , b_{n+2} in F, show that the functions T_{b1}, . . . , T_{b_{n+2}} are linearly dependent in L(F_n[x], F). Deduce that any polynomial p in F_n[x] with at least n + 1 zeros must have p(x) = 0 for all x ∈ F.

7. Here you will give an alternative proof of the rank-nullity theorem. Let T : V → W be linear and suppose that dim(V) < ∞. [...]

(e) Use exercise 9 on the previous homework and parts (a)-(d) above to show that the row rank of A equals the column rank of A.

11. Given m ∈ R, define the line

L_m = {(x, y) ∈ R^2 : y = mx}.

(a) Let T_m be the function which maps a point in R^2 to its closest point in L_m. Find the matrix of T_m relative to the standard basis.

(b) Let R_m be the function which maps a point in R^2 to the reflection of this point about the line L_m. Find the matrix of R_m relative to the standard basis.

Hint for both. First find the matrix relative to a carefully chosen basis.


    3 Dual spaces

    3.1 Definitions

We have been talking about coordinates, so let's examine them more closely. Let V be an n-dimensional vector space and fix a basis B = {v1, . . . , vn} of V. We can write any vector v ∈ V in coordinates relative to B as

[v]_B = (a1, . . . , an)ᵗ, where v = a1v1 + · · · + anvn.

For any i = 1, . . . , n we can define the i-th coordinate map vi* : V → F by vi*(v) = ai, where ai is the i-th entry of [v]_B. These elements vi* are linear and are thus in the space L(V, F). This space comes up so much that we give it a name:

Definition 3.1.1. We write V* = L(V, F) and call it the dual space to V. Elements of V* will be written f and called linear functionals.

Given any basis B = {v1, . . . , vn} we call B* = {v1*, . . . , vn*} the basis of V* dual to B.

Proposition 3.1.2. If B is a basis of V then B* is a basis of V*.

Proof. The dimension of V* is n, the dimension of V, so we must only show B* is linearly independent or spanning. We show linear independence: suppose that

a1v1* + · · · + anvn* = 0,

where 0 on the right is the zero transformation from V to F. Apply both sides to vi. For j ≠ i we get vj*(vi) = 0, since the j-th coordinate of vi is 0, while vi*(vi) = 1. So

ai = (a1v1* + · · · + anvn*)(vi) = 0(vi) = 0.

This is true for all i, so B* is linearly independent and we are done.

It is not surprising that $B^*$ is a basis of $V^*$. The reason is that each element $f \in V^*$ can be written in its matrix form using the basis $B$ of $V$ and $\{1\}$ of $F$. The matrix for $v_i^*$ is
$$[v_i^*]_{B\{1\}} = (0 \; \cdots \; 0 \; 1 \; 0 \; \cdots \; 0),$$
where the $1$ is in the $i$-th spot. Clearly these form a basis for $M_{1,n}(F)$, and since the map sending linear transformations to their matrices relative to these bases is an isomorphism, $B^*$ should indeed be a basis of $V^*$.

There is an alternate characterization: each $v_i^*$ is in $L(V, F)$ so can be identified by its action on the basis $B$:
$$v_i^*(v_j) = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{otherwise.} \end{cases}$$
One nice thing about considering the dual basis $B^*$ is that we can write an arbitrary $f \in V^*$ in terms of the basis $B^*$ quite easily.


Proposition 3.1.3. Let $B$ be a basis for $V$ and $B^*$ the dual basis for $V^*$. Then if $f \in V^*$,
$$f = f(v_1) v_1^* + \cdots + f(v_n) v_n^*.$$
Proof. We simply need to check that both sides give the same answer when evaluated at the basis of $V$. So apply each to $v_i$: the left side gives $f(v_i)$ and the right gives
$$(f(v_1) v_1^* + \cdots + f(v_n) v_n^*)(v_i) = f(v_1) v_1^*(v_i) + \cdots + f(v_n) v_n^*(v_i) = f(v_i) v_i^*(v_i) = f(v_i).$$
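
As a concrete sketch (Python/numpy over $F = \mathbb{R}$, our own illustration rather than part of the formal development): if the basis vectors are the columns of a matrix $P$, then $[v]_B = P^{-1}v$, so the dual basis functionals are represented by the rows of $P^{-1}$.

```python
# Sketch: dual basis as rows of P^{-1}, and a check of Proposition 3.1.3.
import numpy as np

P = np.array([[1.0, 1.0],
              [0.0, 1.0]])               # basis v1 = (1,0), v2 = (1,1)
dual = np.linalg.inv(P)                  # row i represents v_i^*

print(dual @ P)                          # v_i^*(v_j): the identity matrix

# f = f(v1) v1^* + f(v2) v2^* for f(x, y) = 3x - 2y, written as a row vector.
f = np.array([3.0, -2.0])
recombined = sum((f @ P[:, i]) * dual[i] for i in range(2))
print(f, recombined)                     # the same row vector twice
```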

A nice way to think about linear functionals involves their nullspaces. By the rank-nullity theorem, if $f \in V^*$,
$$\dim(N(f)) + \dim(R(f)) = \dim(V).$$
Because $R(f) \subseteq F$, it is at most one-dimensional. Therefore $N(f) = V$ or $N(f)$ is $(n-1)$-dimensional, where $n = \dim(V)$. This gives:

If $f$ is not the zero functional, $\mathrm{nullity}(f) = \dim(V) - 1$. A subspace of this dimension is called a hyperspace.

    Because of the simple structure of the nullspace, we can characterize linear functionals easily.

Proposition 3.1.4. Two nonzero elements $f, g \in V^*$ are equal if and only if they have the same nullspace $N = N(f) = N(g)$ and they agree at one vector outside $N$.

Proof. One direction is clear, so suppose that $N = N(f) = N(g)$ and $v \in V \setminus N$ satisfies $f(v) = g(v)$. You can check that if $B_N$ is a basis for $N$ then $B_N \cup \{v\}$ is a basis for $V$. (The proof is similar to how we proved the one-subspace theorem.) But then $f$ and $g$ agree on $B_N \cup \{v\}$ and must agree everywhere, giving $f = g$.

    3.2 Annihilators

As we have seen before, one useful tool for the study of linear transformations is the nullspace. We will consider the dual version of this now: given $S \subseteq V$, we give a name to those $f \in V^*$ such that $S \subseteq N(f)$.

Definition 3.2.1. If $S \subseteq V$ then the annihilator of $S$ is
$$S^\circ = \{f \in V^* : f(s) = 0 \text{ for all } s \in S\}.$$
Note that if $S \subseteq T$ then $S^\circ \supseteq T^\circ$.

Proposition 3.2.2. Let $S \subseteq V$ (not necessarily a subspace).

1. $S^\circ$ is a subspace of $V^*$.

2. $S^\circ = (\mathrm{Span}(S))^\circ$.


3. Let $V$ be finite-dimensional with $U$ a subspace. Let $\{v_1, \ldots, v_k\}$ be a basis for $U$ and extend it to a basis $\{v_1, \ldots, v_n\}$ for $V$. If $\{v_1^*, \ldots, v_n^*\}$ is the dual basis then $\{v_{k+1}^*, \ldots, v_n^*\}$ is a basis for $U^\circ$.

Proof. For the first item, $S^\circ$ contains the zero linear functional, so it is nonempty. If $f, g \in S^\circ$ and $c \in F$ then for any $s \in S$,
$$(cf + g)(s) = cf(s) + g(s) = c \cdot 0 + 0 = 0,$$
so $cf + g \in S^\circ$. Thus $S^\circ$ is a subspace of $V^*$.

Next, since $S \subseteq \mathrm{Span}(S)$, we have $S^\circ \supseteq (\mathrm{Span}(S))^\circ$. Conversely, if $f(s) = 0$ for all $s \in S$ then let $a_1 s_1 + \cdots + a_k s_k \in \mathrm{Span}(S)$. Then
$$f(a_1 s_1 + \cdots + a_k s_k) = a_1 f(s_1) + \cdots + a_k f(s_k) = 0,$$
so $f \in (\mathrm{Span}(S))^\circ$.

For the third item, each of $v_{k+1}^*, \ldots, v_n^*$ annihilates $v_1, \ldots, v_k$, so they annihilate everything in the span, that is, $U$. In other words, they are in $U^\circ$, and we already know they are linearly independent since they are part of the dual basis. To show they span $U^\circ$, let $f \in U^\circ$ and write $f$ in terms of the dual basis using the previous proposition:
$$f = f(v_1) v_1^* + \cdots + f(v_k) v_k^* + f(v_{k+1}) v_{k+1}^* + \cdots + f(v_n) v_n^* = f(v_{k+1}) v_{k+1}^* + \cdots + f(v_n) v_n^* \in \mathrm{Span}(\{v_{k+1}^*, \ldots, v_n^*\}).$$

Corollary 3.2.3. If $V$ is finite dimensional and $W$ is a subspace,
$$\dim(V) = \dim(W) + \dim(W^\circ).$$
Proof. This follows from item 3 above.
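
A quick numerical sketch of the corollary (Python/numpy over $F = \mathbb{R}$; the setup is ours): identify $(\mathbb{R}^n)^*$ with row vectors, so that $f$ annihilates $W = \mathrm{Span}(\text{columns of } A)$ exactly when $fA = 0$.

```python
# Sketch: dim(W) + dim(W°) = dim(V) for a subspace W of R^4.
import numpy as np

A = np.array([[1.0, 0.0],
              [2.0, 1.0],
              [0.0, 1.0],
              [1.0, 1.0]])               # columns span a 2-dimensional W

n = A.shape[0]
dim_W = np.linalg.matrix_rank(A)
dim_ann = n - np.linalg.matrix_rank(A.T)  # W° = null space of A^t, as rows
print(dim_W, dim_ann, dim_W + dim_ann == n)   # 2 2 True
```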

    3.3 Transpose

Given $T : V \to W$ that is linear, we will define a corresponding transformation $T^t$ on the dual spaces, but it will act in the other direction. We will have $T^t : W^* \to V^*$.

Definition 3.3.1. If $T : V \to W$ is linear, we define the function $T^t : W^* \to V^*$ by the following. Given $g \in W^*$, set $T^t(g) \in V^*$ as the linear functional such that
$$(T^t(g))(v) = g(T(v)) \text{ for all } v \in V.$$
$T^t$ is called the transpose of $T$.

Note that the definition here is $T^t(g) = g \circ T$. Since both maps on the right are linear, so is their composition. So $T^t(g)$ is in fact a linear functional (it is in $V^*$).


Proposition 3.3.2. If $T : V \to W$ is linear then $T^t : W^* \to V^*$ is linear.

Proof. Let $g_1, g_2 \in W^*$ and $c \in F$. We want to show that $T^t(cg_1 + g_2) = cT^t(g_1) + T^t(g_2)$, and these are both elements of $V^*$, so we want to show they act the same on each element of $V$. So let $v \in V$ and compute
$$(T^t(cg_1 + g_2))(v) = (cg_1 + g_2)(T(v)) = cg_1(T(v)) + g_2(T(v)) = c(T^t(g_1))(v) + (T^t(g_2))(v) = (cT^t(g_1) + T^t(g_2))(v).$$

The matrix of $T^t$ can be written in a very simple way using dual bases.

Theorem 3.3.3. Let $T : V \to W$ be linear and $B, C$ bases for $V$ and $W$. Writing $B^*$ and $C^*$ for the dual bases,
$$[T^t]_{C^* B^*} = ([T]_{BC})^t.$$
The matrix on the right is the transpose matrix; that is, if $A$ is a matrix then the transposed matrix $A^t$ is defined by $(A^t)_{i,j} = A_{j,i}$.

Proof. Let $B = \{v_1, \ldots, v_n\}$ and $C = \{w_1, \ldots, w_m\}$. When we build the matrix $[T]_{BC}$, we make the $j$-th column by expanding $T(v_j)$ in terms of the basis $C$. So our matrix can be rewritten as
$$[T]_{BC} = \begin{pmatrix} w_1^*(T(v_1)) & w_1^*(T(v_2)) & \cdots & w_1^*(T(v_n)) \\ w_2^*(T(v_1)) & w_2^*(T(v_2)) & \cdots & w_2^*(T(v_n)) \\ \vdots & & & \vdots \\ w_m^*(T(v_1)) & w_m^*(T(v_2)) & \cdots & w_m^*(T(v_n)) \end{pmatrix}.$$
To build the matrix $[T^t]_{C^* B^*}$, we begin with the first vector of $C^*$ and express it in terms of $B^*$. We write
$$T^t(w_1^*) = a_1 v_1^* + \cdots + a_n v_n^*.$$
The coefficients have a simple form:
$$T^t(w_1^*) = (T^t(w_1^*))(v_1) v_1^* + \cdots + (T^t(w_1^*))(v_n) v_n^* = w_1^*(T(v_1)) v_1^* + \cdots + w_1^*(T(v_n)) v_n^*.$$
This means the first column of our matrix is
$$\begin{pmatrix} w_1^*(T(v_1)) \\ \vdots \\ w_1^*(T(v_n)) \end{pmatrix}.$$
This is just the first row of $[T]_{BC}$. Similarly, the $j$-th column is the $j$-th row of $[T]_{BC}$, and this completes the proof.
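
Concretely (a Python/numpy sketch with standard bases, our own illustration): representing $g \in W^*$ as a row vector, $T^t(g) = g \circ T$ corresponds to multiplying by the transposed matrix.

```python
# Sketch: (T^t(g))(v) = g(T(v)), with [T^t] = [T]^t in standard bases.
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])          # T : R^3 -> R^2
g = np.array([5.0, -1.0])                # g in (R^2)^*, as a row vector
v = np.array([1.0, 2.0, 3.0])

print(g @ (A @ v))                       # g(T(v)) = 20.0
print((A.T @ g) @ v)                     # (T^t(g))(v): the same number
```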

Proposition 3.3.4. Let $T : V \to W$ be linear with $V, W$ finite-dimensional. Then


1. $N(T^t) = R(T)^\circ$,

2. $R(T^t) = N(T)^\circ$,

3. $\mathrm{rank}(T^t) = \mathrm{rank}(T)$ (and so, when $\dim V = \dim W$, also $\mathrm{nullity}(T^t) = \mathrm{nullity}(T)$).

Proof. For the first item, let $g \in R(T)^\circ$. Then we would like to show that $g \in N(T^t)$, or that $T^t(g) = 0$. Since $T^t(g) \in V^*$ this amounts to showing that $(T^t(g))(v) = 0$ for all $v \in V$. So let $v \in V$ and compute
$$(T^t(g))(v) = g(T(v)) = 0,$$
since $g$ annihilates the range of $T$. This shows that $R(T)^\circ \subseteq N(T^t)$. For the other direction, let $g \in N(T^t)$ and $w \in R(T)$. Then we can find $v \in V$ such that $w = T(v)$ and so
$$g(w) = g(T(v)) = (T^t(g))(v) = 0,$$
since $T^t(g) = 0$. This completes the proof of the first item.

Next, if $f \in R(T^t)$ we can find $g \in W^*$ such that $f = T^t(g)$. If $v \in N(T)$ then
$$f(v) = (T^t(g))(v) = g(T(v)) = g(0) = 0,$$
so $f \in N(T)^\circ$. This shows that $R(T^t) \subseteq N(T)^\circ$. To show the other direction, we count dimensions:
$$\dim R(T^t) = \dim W^* - \dim N(T^t) = \dim W - \dim R(T)^\circ = \dim R(T) = \dim V - \dim N(T) = \dim N(T)^\circ.$$
Since these spaces have the same dimension and one is contained in the other, they must be equal.

The last item follows from dimension counting as well.
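
The rank equality in item 3 is easy to observe numerically (a one-off numpy check, not a proof):

```python
# Sketch: rank(T^t) = rank(T) for a sample matrix.
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [0.0, 1.0, 1.0]])          # rank 2: row 2 is twice row 1
print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(A.T))   # 2 2
```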

    3.4 Double dual

    We now move one level up, to look at the dual of the dual, the double dual.

Definition 3.4.1. If $V$ is a vector space, we define the double dual $V^{**}$ as the dual of $V^*$. It is the space $L(V^*, F)$ of linear functionals on $V^*$.

As before, when $V$ is finite-dimensional, since $\dim L(V^*, F) = \dim(V^*) \cdot \dim(F)$, we find $\dim V^{**} = \dim V$ when $\dim V < \infty$.


Given $v \in V$, define $\mathrm{eval}_v : V^* \to F$ by $\mathrm{eval}_v(f) = f(v)$. Then $\mathrm{eval}_v$ is a linear functional on $V^*$. To see this, note first that it certainly maps $V^*$ to $F$, so we must only show it is linear. The proof is the same as in the previous homework: let $f, g \in V^*$ and $c \in F$. Then
$$\mathrm{eval}_v(cf + g) = (cf + g)(v) = cf(v) + g(v) = c \, \mathrm{eval}_v(f) + \mathrm{eval}_v(g).$$

In fact the map $\phi : V \to V^{**}$ given by $\phi(v) = \mathrm{eval}_v$ is an isomorphism when $\dim V < \infty$.

If $\sigma^{-1}(k) > \sigma^{-1}(k+1)$ then this is an inversion pair for $\sigma$, since $\sigma(\sigma^{-1}(k)) = k < k+1 = \sigma(\sigma^{-1}(k+1))$. However then $(\tau\sigma)(\sigma^{-1}(k)) = k+1 > k = (\tau\sigma)(\sigma^{-1}(k+1))$, so it is not an inversion pair for $\tau\sigma$ and therefore is in $\mathrm{Inv}(\sigma) \setminus \mathrm{Inv}(\tau\sigma)$. In the case that $\sigma^{-1}(k) < \sigma^{-1}(k+1)$ a similar argument shows that $\{\sigma^{-1}(k), \sigma^{-1}(k+1)\}$ is not an inversion pair for $\sigma$ but it is one for $\tau\sigma$ and therefore is in $\mathrm{Inv}(\tau\sigma) \setminus \mathrm{Inv}(\sigma)$.

Now we must show that if $\{a, b\} \ne \{\sigma^{-1}(k), \sigma^{-1}(k+1)\}$ then $\{a, b\}$ is an inversion pair for $\sigma$ if and only if it is an inversion pair for $\tau\sigma$. We will just show one direction; the other is similar. This will prove that $\mathrm{Inv}(\tau\sigma) \,\triangle\, \mathrm{Inv}(\sigma)$ does not contain any other elements, and we will be done with the lemma.

So suppose that $\{a, b\}$ is an inversion pair for $\sigma$ but it is not equal to $\{\sigma^{-1}(k), \sigma^{-1}(k+1)\}$. If neither of $a, b$ is equal to $\sigma^{-1}(k)$ or $\sigma^{-1}(k+1)$ then $\tau$ fixes both $\sigma(a)$ and $\sigma(b)$, so
$$(\tau\sigma)(a) = \sigma(a) \quad \text{and} \quad (\tau\sigma)(b) = \sigma(b),$$


so $\{a, b\}$ is an inversion pair for $\tau\sigma$. Otherwise exactly one of $a, b$ is equal to $\sigma^{-1}(k)$ or $\sigma^{-1}(k+1)$; let us suppose that $a < b$ (else we can just switch the roles of $a$ and $b$). Then because $\{a, b\}$ is an inversion pair for $\sigma$ we have $\sigma(b) < \sigma(a)$. If $a = \sigma^{-1}(k)$, we must have $\sigma(b) < k = \sigma(a)$, so $(\tau\sigma)(b) = \sigma(b) < \sigma(a) < k+1 = (\tau\sigma)(a)$, and $\{a, b\}$ is still an inversion pair for $\tau\sigma$. If instead $a = \sigma^{-1}(k+1)$ we cannot have $b = \sigma^{-1}(k)$, so $\sigma(b) < k$, giving $(\tau\sigma)(b) = \sigma(b) < k = (\tau\sigma)(a)$, and $\{a, b\}$ is an inversion pair for $\tau\sigma$. Last, if $\sigma(a) \notin \{k, k+1\}$ we must have $\sigma(b) \in \{k, k+1\}$. If $\sigma(b) = k$ then $\sigma(a) > k+1$, giving $(\tau\sigma)(b) = k+1 < \sigma(a) = (\tau\sigma)(a)$, so $\{a, b\}$ is an inversion pair for $\tau\sigma$. If $\sigma(b) = k+1$ then $\sigma(a) > k+1$ and $(\tau\sigma)(b) = k < k+1 = \sigma(b) < \sigma(a) = (\tau\sigma)(a)$, so $\{a, b\}$ is an inversion pair for $\tau\sigma$. This completes the proof.
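
Since $\mathrm{sgn}(\sigma) = (-1)^{\#\mathrm{Inv}(\sigma)}$, the sign of a permutation can be computed by brute-force inversion counting; here is a small Python sketch (our own, for experimentation):

```python
# Sketch: sgn(sigma) via counting inversion pairs.
from itertools import combinations

def sign_by_inversions(perm):
    """perm lists sigma(1), ..., sigma(n); positions are compared pairwise."""
    inversions = sum(1 for a, b in combinations(range(len(perm)), 2)
                     if perm[a] > perm[b])
    return -1 if inversions % 2 else 1

print(sign_by_inversions((1, 2, 3)))     # +1 (identity, no inversions)
print(sign_by_inversions((2, 1, 3)))     # -1 (one transposition)
print(sign_by_inversions((3, 1, 2)))     # +1 (a 3-cycle, two inversions)
```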

    4.2 Determinants: existence and uniqueness

Given $n$ vectors $v_1, \ldots, v_n$ in $\mathbb{R}^n$ we want to define something like the volume of the parallelepiped spanned by these vectors. What properties would we expect of a volume?

1. $\mathrm{vol}(e_1, \ldots, e_n) = 1$.

2. If two of the vectors $v_i$ are equal the volume should be zero.

3. For each $c > 0$, $\mathrm{vol}(cv_1, v_2, \ldots, v_n) = c \, \mathrm{vol}(v_1, \ldots, v_n)$. Same in other arguments.

4. For each $v_1'$, $\mathrm{vol}(v_1 + v_1', v_2, \ldots, v_n) = \mathrm{vol}(v_1, \ldots, v_n) + \mathrm{vol}(v_1', v_2, \ldots, v_n)$. Same in other arguments.

    Using the motivating example of the volume, we define a multilinear function as follows.

Definition 4.2.1. If $V$ is an $n$-dimensional vector space over $F$ then define
$$V^n = \{(v_1, \ldots, v_n) : v_i \in V \text{ for all } i = 1, \ldots, n\}.$$
A function $f : V^n \to F$ is called multilinear if for each $i$ and vectors $v_1, \ldots, v_{i-1}, v_{i+1}, \ldots, v_n \in V$, the function $f_i : V \to F$ is linear, where
$$f_i(v) = f(v_1, \ldots, v_{i-1}, v, v_{i+1}, \ldots, v_n).$$
A multilinear function $f$ is called alternating if $f(v_1, \ldots, v_n) = 0$ whenever $v_i = v_j$ for some $i \ne j$.

Proposition 4.2.2. Let $f : V^n \to F$ be a multilinear function. If $F$ does not have characteristic two then $f$ is alternating if and only if for all $v_1, \ldots, v_n$ and $i < j$,
$$f(v_1, \ldots, v_i, \ldots, v_j, \ldots, v_n) = -f(v_1, \ldots, v_j, \ldots, v_i, \ldots, v_n).$$


Proof. Suppose that $f$ is alternating. Then
$$0 = f(v_1, \ldots, v_i + v_j, \ldots, v_i + v_j, \ldots, v_n) = f(v_1, \ldots, v_i, \ldots, v_i + v_j, \ldots, v_n) + f(v_1, \ldots, v_j, \ldots, v_i + v_j, \ldots, v_n) = f(v_1, \ldots, v_i, \ldots, v_j, \ldots, v_n) + f(v_1, \ldots, v_j, \ldots, v_i, \ldots, v_n).$$
Conversely suppose that $f$ has the property above. Then if $v_i = v_j$,
$$f(v_1, \ldots, v_i, \ldots, v_j, \ldots, v_n) = -f(v_1, \ldots, v_j, \ldots, v_i, \ldots, v_n) = -f(v_1, \ldots, v_i, \ldots, v_j, \ldots, v_n).$$
Since $F$ does not have characteristic two, this means this is zero.

Corollary 4.2.3. Let $f : V^n \to F$ be an $n$-linear alternating function. Then for each $\sigma \in S_n$,
$$f(v_{\sigma(1)}, \ldots, v_{\sigma(n)}) = \mathrm{sgn}(\sigma) \, f(v_1, \ldots, v_n).$$
Proof. Write $\sigma = \tau_1 \cdots \tau_k$ where the $\tau_i$ are transpositions and $(-1)^k = \mathrm{sgn}(\sigma)$. Then
$$f(v_{\sigma(1)}, \ldots, v_{\sigma(n)}) = -f(v_{(\tau_1 \cdots \tau_{k-1})(1)}, \ldots, v_{(\tau_1 \cdots \tau_{k-1})(n)}).$$
Applying this $k-1$ more times gives the corollary.

Theorem 4.2.4. Let $V$ be an $F$-vector space and $\{e_1, \ldots, e_n\}$ a basis. There exists a unique $n$-linear alternating function $f$ on $V^n$ such that $f(e_1, \ldots, e_n) = 1$.

Proof. We will first prove uniqueness, so assume that $f$ is an $n$-linear alternating function on $V^n$ such that $f(e_1, \ldots, e_n) = 1$. We will show that $f$ must have a certain form. Let $v_1, \ldots, v_n \in V$ and write them as
$$v_k = a_{1,k} e_1 + \cdots + a_{n,k} e_n.$$
We can then expand using $n$-linearity:
$$f(v_1, \ldots, v_n) = f(a_{1,1} e_1 + \cdots + a_{n,1} e_n, v_2, \ldots, v_n) = \sum_{i_1=1}^n a_{i_1,1} f(e_{i_1}, v_2, \ldots, v_n) = \cdots = \sum_{i_1=1}^n \cdots \sum_{i_n=1}^n a_{i_1,1} \cdots a_{i_n,n} f(e_{i_1}, \ldots, e_{i_n}).$$
Since $f$ is alternating, all choices of $i_1, \ldots, i_n$ that are not distinct have $f(e_{i_1}, \ldots, e_{i_n}) = 0$. So we can write this as
$$\sum_{i_1, \ldots, i_n \text{ distinct}} a_{i_1,1} \cdots a_{i_n,n} f(e_{i_1}, \ldots, e_{i_n}).$$


The choices of distinct $i_1, \ldots, i_n$ can be made using permutations. Each permutation $\sigma \in S_n$ gives exactly one such choice. So we can yet again rewrite this as
$$\sum_{\sigma \in S_n} a_{\sigma(1),1} \cdots a_{\sigma(n),n} f(e_{\sigma(1)}, \ldots, e_{\sigma(n)}).$$
Using the lemma from last time, $f(e_{\sigma(1)}, \ldots, e_{\sigma(n)}) = \mathrm{sgn}(\sigma) f(e_1, \ldots, e_n) = \mathrm{sgn}(\sigma)$, so
$$f(v_1, \ldots, v_n) = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma) \, a_{\sigma(1),1} \cdots a_{\sigma(n),n}.$$
If $g$ is any other $n$-linear alternating function with $g(e_1, \ldots, e_n) = 1$ then the same computation as above gives the same formula for $g(v_1, \ldots, v_n)$, so $f = g$. This shows uniqueness.
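
The uniqueness formula can be implemented directly; the following Python sketch (our own, exponential in $n$ and meant only for small examples) computes a determinant as the sum over permutations.

```python
# Sketch: det A as a sum over permutations (the formula proved above).
from itertools import permutations

def sign(perm):
    inv = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
              if perm[i] > perm[j])
    return -1 if inv % 2 else 1

def det_permutation_sum(a):
    n = len(a)
    total = 0
    for perm in permutations(range(n)):   # perm plays the role of sigma
        prod = 1
        for col in range(n):
            prod *= a[perm[col]][col]     # the factor a_{sigma(col), col}
        total += sign(perm) * prod
    return total

A = [[2, 1, 0],
     [1, 3, 1],
     [0, 1, 4]]
print(det_permutation_sum(A))             # 18
```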

For existence, we need to show that the formula above actually gives an $n$-linear alternating function with $f(e_1, \ldots, e_n) = 1$.

1. We first show $f(e_1, \ldots, e_n) = 1$. To do this, we write
$$e_k = a_{1,k} e_1 + \cdots + a_{n,k} e_n,$$
where $a_{j,k} = 0$ unless $j = k$, in which case it is $1$. If $\sigma \in S_n$ is not the identity, we can find $k \ne j$ such that $\sigma(k) = j$. This means that $a_{\sigma(k),k} = a_{j,k} = 0$ and so $\mathrm{sgn}(\sigma) a_{\sigma(1),1} \cdots a_{\sigma(n),n} = 0$. Therefore
$$f(e_1, \ldots, e_n) = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma) a_{\sigma(1),1} \cdots a_{\sigma(n),n} = \mathrm{sgn}(\mathrm{id}) a_{1,1} \cdots a_{n,n} = 1.$$

2. Next we show alternating. Suppose that $v_i = v_j$ for some $i \ne j$ and let $\tau_{i,j}$ be the transposition $(i \; j)$. Split all permutations into $A$, those which invert $i$ and $j$, and $S_n \setminus A$, those which do not. Then if $v_k = a_{1,k} e_1 + \cdots + a_{n,k} e_n$, we can write
$$f(v_1, \ldots, v_n) = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma) a_{\sigma(1),1} \cdots a_{\sigma(n),n} = \sum_{\sigma \in A} \mathrm{sgn}(\sigma) a_{\sigma(1),1} \cdots a_{\sigma(n),n} + \sum_{\sigma \in A} \mathrm{sgn}(\sigma \tau_{i,j}) a_{\sigma\tau_{i,j}(1),1} \cdots a_{\sigma\tau_{i,j}(n),n} = \sum_{\sigma \in A} \mathrm{sgn}(\sigma) \left[ a_{\sigma(1),1} \cdots a_{\sigma(n),n} - a_{\sigma\tau_{i,j}(1),1} \cdots a_{\sigma\tau_{i,j}(n),n} \right].$$
Note however that $a_{\sigma\tau_{i,j}(i),i} = a_{\sigma(j),i} = a_{\sigma(j),j}$, since $v_i = v_j$. Similarly $a_{\sigma\tau_{i,j}(j),j} = a_{\sigma(i),i}$. Therefore
$$a_{\sigma(1),1} \cdots a_{\sigma(n),n} = a_{\sigma\tau_{i,j}(1),1} \cdots a_{\sigma\tau_{i,j}(n),n}.$$
So the above sum is zero and we are done.


3. For $n$-linearity, we will just show it in the first coordinate. So let $v, v_1, \ldots, v_n \in V$ and $c \in F$. Writing
$$v_k = a_{1,k} e_1 + \cdots + a_{n,k} e_n \quad \text{and} \quad v = a_1 e_1 + \cdots + a_n e_n,$$
then $cv + v_1 = (ca_1 + a_{1,1}) e_1 + \cdots + (ca_n + a_{n,1}) e_n$, so
$$f(cv + v_1, v_2, \ldots, v_n) = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)(ca_{\sigma(1)} + a_{\sigma(1),1}) a_{\sigma(2),2} \cdots a_{\sigma(n),n} = c \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma) a_{\sigma(1)} a_{\sigma(2),2} \cdots a_{\sigma(n),n} + \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma) a_{\sigma(1),1} \cdots a_{\sigma(n),n} = c f(v, v_2, \ldots, v_n) + f(v_1, \ldots, v_n).$$

One nice property of $n$-linear alternating functions is that they can determine when vectors are linearly independent.

Theorem 4.2.5. Let $V$ be an $n$-dimensional $F$-vector space and $f$ a nonzero $n$-linear alternating function on $V^n$. Then $\{v_1, \ldots, v_n\}$ is linearly independent if and only if $f(v_1, \ldots, v_n) \ne 0$.

Proof. If $n = 1$ then the proof is an exercise, so take $n \ge 2$ and first assume that the vectors are linearly dependent. Then we can write one as a linear combination of the others. Suppose for example that $v_1 = b_2 v_2 + \cdots + b_n v_n$. Then
$$f(v_1, \ldots, v_n) = b_2 f(v_2, v_2, \ldots, v_n) + \cdots + b_n f(v_n, v_2, \ldots, v_n) = 0.$$
Here we have used that $f$ is alternating.

Conversely suppose that $\{v_1, \ldots, v_n\}$ is linearly independent. Then it must be a basis. We can then proceed exactly along the development given above: if $u_1, \ldots, u_n$ are vectors written as
$$u_k = a_{1,k} v_1 + \cdots + a_{n,k} v_n,$$
then if $f(v_1, \ldots, v_n) = 0$, we find
$$f(u_1, \ldots, u_n) = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma) a_{\sigma(1),1} \cdots a_{\sigma(n),n} f(v_1, \ldots, v_n) = 0.$$
Therefore $f$ is zero. This is a contradiction, so $f(v_1, \ldots, v_n) \ne 0$.

Definition 4.2.6. Choosing $V = F^n$ and $e_1, \ldots, e_n$ the standard basis, we write $\det$ (the determinant) for the unique $n$-linear alternating function such that $\det(e_1, \ldots, e_n) = 1$. If $A \in M_{n,n}(F)$ we define $\det(A) = \det(a_1, \ldots, a_n)$, where $a_i$ is the $i$-th column of $A$.

Corollary 4.2.7. Let $A \in M_{n,n}(F)$. Then $\det(A) \ne 0$ if and only if $A$ is invertible.

Proof. By the previous theorem, $\det(A) \ne 0$ if and only if the columns of $A$ are linearly independent. This is equivalent to saying that the column rank of $A$ is $n$, or that $A$ is invertible.


    4.3 Properties of the determinant

One of the most important properties of the determinant is that it factors through products (compositions).

Theorem 4.3.1. Let $A, B \in M_{n,n}(F)$. We have the following factorization:
$$\det AB = \det A \cdot \det B.$$
Proof. First, if $\det A = 0$ the matrix $A$ cannot be invertible and therefore neither is $AB$, so $\det AB = 0$, proving the formula in that case. Otherwise we have $\det A \ne 0$. In this case we will use a method of proof that is very common when dealing with determinants. We will define a function on matrices that is $n$-linear and alternating as a function of the columns, mapping the identity to $1$, and use the uniqueness of the determinant to see that it is just the determinant function. So define $f : M_{n,n}(F) \to F$ by
$$f(B) = \frac{\det AB}{\det A}.$$
First note that if $I_n$ is the $n \times n$ identity matrix, $f(I_n) = (\det AI_n)/\det A = 1$. Next, if $B$ has two equal columns, its column rank is strictly less than $n$ and so is the column rank of $AB$, meaning that $AB$ is non-invertible. This gives $f(B) = 0/\det A = 0$.

Last, to show $n$-linearity of $f$ as a function of the columns of $B$, write $B$ in terms of its columns as $(b_1, \ldots, b_n)$. Note that if $e_i$ is the $i$-th standard basis vector, then we can write $b_i = Be_i$. Therefore the $i$-th column of $AB$ is $(AB)e_i = Ab_i$ and $AB = (Ab_1, \ldots, Ab_n)$. Thus if $b_1, b_1'$ are column vectors and $c \in F$,
$$\det(A(cb_1 + b_1', b_2, \ldots, b_n)) = \det(A(cb_1 + b_1'), Ab_2, \ldots, Ab_n) = \det(cAb_1 + Ab_1', Ab_2, \ldots, Ab_n) = c \det(Ab_1, Ab_2, \ldots, Ab_n) + \det(Ab_1', Ab_2, \ldots, Ab_n).$$
This means that $\det AB$ is $n$-linear (at least in the first column; the same argument works for all columns), and so is $f$.

There is exactly one $n$-linear alternating function $f$ with $f(I_n) = 1$, so $f(B) = \det B$.
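
A quick numerical check of the theorem (numpy, random matrices; agreement is up to floating-point rounding):

```python
# Sketch: det(AB) = det(A) det(B).
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))
print(np.linalg.det(A @ B))
print(np.linalg.det(A) * np.linalg.det(B))   # the same up to rounding
```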

Here are some consequences.

Similar matrices have the same determinant.

Proof.
$$\det P^{-1}AP = \det P^{-1} \cdot \det A \cdot \det P = \det A \cdot \det P^{-1} \cdot \det P = \det A \cdot \det I_n = \det A.$$

If $A$ is invertible then $\det(A^{-1}) = \frac{1}{\det A}$.


The definition of the determinant is probably different from what you may have seen before. So now we will relate the definition to the Laplace (cofactor) expansion.

Definition 4.3.2. Given $A \in M_{n,n}(F)$, the $(i,j)$-th minor of $A$, written $A(i|j)$, is the $(n-1) \times (n-1)$ matrix formed by removing the $i$-th row and the $j$-th column from $A$.

The Laplace expansion is a recursive formula for the determinant. We can write $\det A$ in terms of the determinants of smaller matrices, the minors of $A$.

Theorem 4.3.3. Let $A \in M_{n,n}(F)$ with entries $(a_{i,j})$. Then
$$\det A = \sum_{i=1}^n (-1)^{i+1} a_{i,1} \det A(i|1).$$

Proof. Write the columns of $A$ as $a_1, \ldots, a_n$ with $a_1 = a_{1,1} e_1 + \cdots + a_{n,1} e_n$ (where $e_1, \ldots, e_n$ are the standard basis vectors) and use $n$-linearity on the matrix $A = (a_1, \ldots, a_n)$ to get
$$\det A = \det(a_{1,1} e_1, a_2, \ldots, a_n) + \cdots + \det(a_{n,1} e_n, a_2, \ldots, a_n) = \sum_{i=1}^n a_{i,1} \det(e_i, a_2, \ldots, a_n).$$
Now we must only show that $\det(e_i, a_2, \ldots, a_n) = (-1)^{i+1} \det A(i|1)$.

We will need to use two facts from the homework.

1. For any matrix $B$, $\det B = \det B^t$. As a consequence of this, $\det$ is $n$-linear and alternating when viewed as a function of the rows of $B$.

2. If $B$ is any block upper triangular matrix, that is, of the form
$$B = \begin{pmatrix} C & D \\ 0 & E \end{pmatrix}$$
for square matrices $C$ and $E$, then $\det B = \det C \cdot \det E$.

So now use $i - 1$ adjacent row swaps to turn the matrix $(e_i, a_2, \ldots, a_n)$ into
$$\begin{pmatrix} 1 & a_{i,2} & a_{i,3} & \cdots & a_{i,n} \\ 0 & a_{1,2} & a_{1,3} & \cdots & a_{1,n} \\ \vdots & & & & \vdots \\ 0 & a_{i-1,2} & a_{i-1,3} & \cdots & a_{i-1,n} \\ 0 & a_{i+1,2} & a_{i+1,3} & \cdots & a_{i+1,n} \\ \vdots & & & & \vdots \\ 0 & a_{n,2} & a_{n,3} & \cdots & a_{n,n} \end{pmatrix}.$$
Since we applied $i - 1$ transpositions, the determinant of this matrix equals $(-1)^{i-1}$ times $\det(e_i, a_2, \ldots, a_n)$. Now we apply the block upper-triangular result, noting that this matrix is of the form
$$\begin{pmatrix} 1 & D \\ 0 & A(i|1) \end{pmatrix}.$$
Therefore $\det(e_i, a_2, \ldots, a_n) = (-1)^{i-1} \det A(i|1)$ and we are done.


There is a more general version of this result. The above we call expanding along the first column. We can expand along the $j$-th column by first applying $j - 1$ adjacent column swaps to get
$$\det A = \sum_{i=1}^n (-1)^{i+j} a_{i,j} \det A(i|j).$$
By taking the transpose initially we can expand along any row too.
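
The expansion along the first column translates directly into a recursive algorithm; here is a Python sketch (ours, exact for integer entries, though far slower than Gaussian elimination):

```python
# Sketch: recursive Laplace expansion along the first column.
def minor(a, i, j):
    """The matrix A(i|j): remove row i and column j (0-indexed here)."""
    return [row[:j] + row[j+1:] for k, row in enumerate(a) if k != i]

def det_laplace(a):
    n = len(a)
    if n == 1:
        return a[0][0]
    # (-1)**i here is (-1)^{i+1} in the 1-indexed notation of the theorem.
    return sum((-1) ** i * a[i][0] * det_laplace(minor(a, i, 0))
               for i in range(n))

A = [[2, 1, 0],
     [1, 3, 1],
     [0, 1, 4]]
print(det_laplace(A))    # 18, matching the permutation-sum formula above
```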

    4.4 Exercises

1. Prove that $S_n$, the set of permutations on $n$ letters, is a group under composition. Show that $S_n$ is abelian (its multiplication is commutative) if and only if $n \le 2$.

Show that $f$ is identically zero.

7. Suppose that $A \in M_{n,n}(F)$ is upper-triangular; that is, $a_{i,j} = 0$ if $i > j$. Show that $\det A = a_{1,1} a_{2,2} \cdots a_{n,n}$. (Don't use the next exercise though!)

8. This exercise is a generalization of the previous one to block upper-triangular matrices. For $n \ge 2$ we say that $M \in M_{n,n}(F)$ is block upper-triangular if there exists $k$ with $1 \le k \le n-1$ and matrices $A \in M_{k,k}(F)$, $B \in M_{k,n-k}(F)$ and $C \in M_{n-k,n-k}(F)$ such that $M$ has the form
$$\begin{pmatrix} A & B \\ 0 & C \end{pmatrix}.$$
That is, the elements of $M$ are given by
$$M_{i,j} = \begin{cases} A_{i,j} & 1 \le i \le k, \; 1 \le j \le k \\ B_{i,j-k} & 1 \le i \le k, \; k < j \le n \\ 0 & k < i \le n, \; 1 \le j \le k \\ C_{i-k,j-k} & k < i \le n, \; k < j \le n \end{cases}.$$


We will show in this exercise that
$$\det M = \det A \cdot \det C.$$
(a) Show that if $\det C = 0$ then the above formula holds.

(b) Suppose that $\det C \ne 0$ and define a function $\phi : M_{k,k}(F) \to F$ by
$$\phi(A) = [\det C]^{-1} \det \begin{pmatrix} A & B \\ 0 & C \end{pmatrix}.$$
That is, $\phi(A)$ is a scalar multiple of the determinant of the block upper-triangular matrix we get when we vary $A$ and keep $B$ and $C$ fixed.

i. Show that $\phi$ is $k$-linear as a function of the columns of $A$.

ii. Show that $\phi$ is alternating and satisfies $\phi(I_k) = 1$, where $I_k$ is the $k \times k$ identity matrix.

iii. Conclude that the above formula holds when $\det C \ne 0$.

9. Let $a_0, \ldots, a_n$ be distinct complex numbers. Write $M_n(a_0, \ldots, a_n)$ for the Vandermonde

matrix
$$\begin{pmatrix} 1 & a_0 & a_0^2 & \cdots & a_0^n \\ 1 & a_1 & a_1^2 & \cdots & a_1^n \\ \vdots & & & & \vdots \\ 1 & a_n & a_n^2 & \cdots & a_n^n \end{pmatrix}.$$
The goal of this exercise is to prove the Vandermonde determinant formula
$$\det M_n(a_0, \ldots, a_n) = \prod_{0 \le i < j \le n} (a_j - a_i).$$
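
A numerical check of the formula (numpy; the nodes are our own choice):

```python
# Sketch: Vandermonde determinant versus the product of differences.
import numpy as np
from itertools import combinations

a = np.array([2.0, 3.0, 5.0, 7.0])                # a_0, ..., a_3
M = np.vander(a, increasing=True)                 # rows (1, a_i, a_i^2, a_i^3)
product = np.prod([a[j] - a[i]
                   for i, j in combinations(range(len(a)), 2)])
print(np.linalg.det(M), product)                  # both 240.0
```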


11. Let $A \in M_{7,7}(\mathbb{C})$ be anti-symmetric; that is, $A = -A^t$. What is $\det A$?

12. Let $T : V \to V$ be linear and $B$ a finite basis for $V$. We define
$$\det T = \det [T]_{BB}.$$
(a) Show that the above definition does not depend on the choice of $B$.

(b) Show that if $f$ is any nonzero $n$-linear alternating function on $V^n$ then
$$\det T = \frac{f(T(v_1), \ldots, T(v_n))}{f(v_1, \ldots, v_n)},$$

where we have written $B = \{v_1, \ldots, v_n\}$. (This is an alternate definition of $\det T$.)

13. Let $A \in M_{n,n}(F)$ for some field $F$. Recall that if $1 \le i, j \le n$ then the $(i,j)$-th minor of $A$, written $A(i|j)$, is the $(n-1) \times (n-1)$ matrix obtained by removing the $i$-th row and $j$-th column from $A$. Define the cofactor
$$C_{i,j} = (-1)^{i+j} \det A(i|j).$$
Note that the Laplace expansion for the determinant can be written
$$\det A = \sum_{i=1}^n A_{i,j} C_{i,j}.$$

(a) Show that if $1 \le j, k \le n$ with $j \ne k$ then
$$\sum_{i=1}^n A_{i,k} C_{i,j} = 0.$$
(b) Define the classical adjoint of $A$, written $\mathrm{adj}\, A$, by
$$(\mathrm{adj}\, A)_{i,j} = C_{j,i}.$$
Show that $(\mathrm{adj}\, A) A = (\det A) I$.

(c) Show that $A (\mathrm{adj}\, A) = (\det A) I$ and deduce that if $A$ is invertible then
$$A^{-1} = (\det A)^{-1} \mathrm{adj}\, A.$$
Hint: begin by applying the result of the previous part to $A^t$.

(d) Use the formula in the last part to find the inverses of the following matrices:
$$\begin{pmatrix} 1 & 2 & 4 \\ 1 & 3 & 9 \\ 1 & 4 & 16 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 \\ 6 & 0 & 1 & 1 \end{pmatrix}.$$


14. Consider a system of equations in $n$ variables with coefficients from a field $F$. We can write this as $AX = Y$ for an $n \times n$ matrix $A$, an $n \times 1$ matrix $X$ (with entries $x_1, \ldots, x_n$) and an $n \times 1$ matrix $Y$ (with entries $y_1, \ldots, y_n$). Given the matrices $A$ and $Y$ we would like to solve for $X$.

(a) Show that
$$(\det A) x_j = \sum_{i=1}^n (-1)^{i+j} y_i \det A(i|j).$$
(b) Show that if $\det A \ne 0$ then we have
$$x_j = (\det A)^{-1} \det B_j,$$
where $B_j$ is an $n \times n$ matrix obtained from $A$ by replacing the $j$-th column of $A$ by $Y$. This is known as Cramer's rule.

(c) Solve the following systems of equations using Cramer's rule:
$$\begin{cases} 2x - y + z = 3 \\ 2y - z = 1 \\ y - x = 1 \end{cases} \qquad \begin{cases} 2x - y + z - 2t = 5 \\ 2x + 2y - 3z + t = 1 \\ x + y - z = 1 \\ 4x - 3y + 2z - 3t = 8 \end{cases}$$
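
A sketch of Cramer's rule in numpy, applied to the first system as transcribed above (so the signs follow our reading of it):

```python
# Sketch: Cramer's rule, x_j = det(B_j) / det(A).
import numpy as np

A = np.array([[2.0, -1.0, 1.0],           # 2x - y + z = 3
              [0.0, 2.0, -1.0],           #     2y - z = 1
              [-1.0, 1.0, 0.0]])          #     y - x  = 1
Y = np.array([3.0, 1.0, 1.0])

d = np.linalg.det(A)
X = []
for j in range(3):
    Bj = A.copy()
    Bj[:, j] = Y                          # replace the j-th column of A by Y
    X.append(np.linalg.det(Bj) / d)
print(X)                                  # [1.0, 2.0, 3.0]
print(A @ np.array(X))                    # reproduces Y
```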

    4.5 Exercises on polynomials

1. Let $F$ be a field and write $F[x]$ for the set of polynomials with coefficients in $F$. Define $\deg(p)$ for the degree of $p \in F[x]$: the largest $k$ such that the coefficient of $x^k$ in $p$ is nonzero. The degree of the zero polynomial is defined to be $-\infty$.

(a) Show that for $p, q \in F[x]$, the product $pq$ has degree $\deg(pq) = \deg(p) + \deg(q)$.

(b) Show that for $p, d \in F[x]$ such that $d$ is nonzero, there exist $q, r \in F[x]$ such that
$$p = qd + r \quad \text{and} \quad \deg(r) < \deg(d).$$
(This result is called the division algorithm.)

Hint. We may assume that $\deg(p) \ge 0$, for otherwise we can choose $r = q = 0$. Also we can assume $\deg(d) \le \deg(p)$, or else we choose $q = 0$ and $r = p$. So use induction on $\deg(p)$, starting with $\deg(p) = 0$, meaning that $p(x) = c$ for some nonzero $c \in F$. For the inductive step, if $\deg(p) > 0$, find some $q_1 \in F[x]$ such that $\deg(p - q_1 d) < \deg(p)$ and continue.

2. Show that if $p \in F[x]$ and $c \in F$ then $p(c) = 0$ if and only if the polynomial $x - c$ divides $p$ (that is, we can find $d \in F[x]$ such that $(x-c)d = p$).

3. Let $p, q \in F[x]$ be nonzero and define the subset $S$ of $F[x]$ as
$$S = \{ap + bq : a, b \in F[x]\}.$$


(a) Let $d \in S$ be nonzero of minimal degree. Show that $d$ divides both $p$ and $q$ (see the definition of divides in exercise 2).

(b) Show that if $s \in F[x]$ divides both $p$ and $q$ then $s$ divides $d$.

(c) Conclude that there exists a unique monic polynomial (that is, with leading coefficient $1$) $d \in F[x]$ satisfying:

i. $d$ divides both $p$ and $q$, and

ii. if $s \in F[x]$ divides both $p$ and $q$ then $s$ divides $d$.

(This $d$ is called the greatest common divisor of $p$ and $q$.)

4. A field $F$ is called algebraically closed if every $p \in F[x]$ with $\deg(p) \ge 1$ has a zero in $F$. Prove that if $F$ is algebraically closed then for any nonzero $p \in F[x]$, we can find $a, \lambda_1, \ldots, \lambda_k \in F$ and natural numbers $n_1, \ldots, n_k$ with $n_1 + \cdots + n_k = \deg(p)$ such that
$$p(x) = a(x - \lambda_1)^{n_1} \cdots (x - \lambda_k)^{n_k}.$$
Here we say that $\lambda_1, \ldots, \lambda_k$ are the roots of $p$ and $n_1, \ldots, n_k$ are their multiplicities.

Hint. Use induction on the degree of $p$.

5. Let $F$ be algebraically closed. Show that for nonzero $p, q \in F[x]$, the greatest common divisor of $p$ and $q$ is $1$ if and only if $p$ and $q$ have no common root. Is this true for $F = \mathbb{R}$?


    5 Eigenvalues

    5.1 Diagonalizability

Our goal for most of the rest of the semester is to classify all linear transformations $T : V \to V$ when $\dim V < \infty$.


coefficients. We may assume that there are at least two nonzero coefficients, or else we would have $a_1 v_1 = 0$ and since $v_1 \ne 0$ we would have $a_1 = 0$, meaning all coefficients are zero and $\{v_1, \ldots, v_k\}$ is linearly independent.

So apply $T$ to both sides:
$$a_1 T(v_1) + \cdots + a_k T(v_k) = 0.$$
Since these are eigenvectors, we can rewrite this as
$$a_1 \lambda_1 v_1 + \cdots + a_k \lambda_k v_k = 0.$$
However, multiplying the linear combination by $\lambda_1$ we get
$$a_1 \lambda_1 v_1 + \cdots + a_k \lambda_1 v_k = 0.$$
Subtracting these two,
$$a_2(\lambda_1 - \lambda_2) v_2 + \cdots + a_k(\lambda_1 - \lambda_k) v_k = 0.$$
All $\lambda_i$'s were distinct and all $a_i$'s were nonzero, so this is a linear combination of the $v_i$'s equal to zero with fewer nonzero coefficients than in the original one, a contradiction.

For a matrix $A \in M_{n,n}(F)$, we define its eigenvalues and eigenvectors similarly: $\lambda$ is an eigenvalue of $A$ if there is a nonzero $v \in F^n$ such that $Av = \lambda v$.

To find the eigenvalues, we make the following observation.

$\lambda$ is an eigenvalue for $A$ if and only if there exists a nonzero $v$ such that $(\lambda I - A)(v) = 0$. This is true if and only if $\lambda I - A$ is not invertible. Therefore
$$\lambda \text{ is an eigenvalue of } A \iff (\lambda I - A) \text{ not invertible} \iff \det(\lambda I - A) = 0.$$
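
A quick numerical illustration of this equivalence (numpy; the matrix is our own example):

```python
# Sketch: each eigenvalue lambda of A satisfies det(lambda I - A) = 0.
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
for lam in np.linalg.eigvals(A):          # eigenvalues 5 and 2
    print(lam, np.linalg.det(lam * np.eye(2) - A))   # det is ~0 for each
```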

This leads us to define

Definition 5.1.5. The characteristic polynomial of a matrix $A \in M_{n,n}(F)$ is the function $c_A : F \to F$ given by
$$c_A(x) = \det(xI - A).$$
The definition is similar for a linear transformation. The characteristic polynomial of $T : V \to V$ is $c_T(x) = \det [xI - T]_{BB}$, where $B$ is any finite basis of $V$. (You will show on homework that this definition does not depend on the choice of basis.)

Facts about the characteristic polynomial.

1. $c_A$ is a monic polynomial of degree $n$.


Proof. We simply write out the definition of the determinant, using the notation that $A_{i,j}(x)$ is the $(i,j)$-th entry of $xI - A$:
$$c_A(x) = \det(xI - A) = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma) \, A_{\sigma(1),1}(x) \cdots A_{\sigma(n),n}(x).$$
Each term in this sum is a product of $n$ polynomial