
  • Basic Algebra

  • Digital Second Editions
    By Anthony W. Knapp

    Basic Algebra

    Advanced Algebra

    Basic Real Analysis, with an appendix "Elementary Complex Analysis"

    Advanced Real Analysis

  • Anthony W. Knapp

    Basic Algebra

    Along with a Companion Volume Advanced Algebra

    Digital Second Edition, 2016

    Published by the Author
    East Setauket, New York

  • Anthony W. Knapp
    81 Upper Sheep Pasture Road
    East Setauket, N.Y. 11733-1729, U.S.A.
    Email to: [email protected]
    Homepage: www.math.stonybrook.edu/aknapp

    Title: Basic Algebra
    Cover: Construction of a regular heptadecagon, the steps shown in color sequence; see page 505.

    Mathematics Subject Classification (2010): 15-01, 20-01, 13-01, 12-01, 16-01, 08-01, 18A05, 68P30.

    First Edition, ISBN-13 978-0-8176-3248-9
    ©2006 Anthony W. Knapp
    Published by Birkhäuser Boston

    Digital Second Edition, not to be sold, no ISBN
    ©2016 Anthony W. Knapp
    Published by the Author

    All rights reserved. This file is a digital second edition of the above named book. The text, images, and other data contained in this file, which is in portable document format (PDF), are proprietary to the author, and the author retains all rights, including copyright, in them. The use in this file of trade names, trademarks, service marks, and similar items, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

    All rights to print media for the first edition of this book have been licensed to Birkhäuser Boston, c/o Springer Science+Business Media Inc., 233 Spring Street, New York, NY 10013, USA, and this organization and its successor licensees may have certain rights concerning print media for the digital second edition. The author has retained all rights worldwide concerning digital media for both the first edition and the digital second edition.

    The file is made available for limited noncommercial use for purposes of education, scholarship, and research, and for these purposes only, or for fair use as understood in the United States copyright law. Users may freely download this file for their own use and may store it, post it online, and transmit it digitally for purposes of education, scholarship, and research. They may not convert it from PDF to any other format (e.g., EPUB), they may not edit it, and they may not do reverse engineering with it. In transmitting the file to others or posting it online, users must charge no fee, nor may they include the file in any collection of files for which a fee is charged. Any exception to these rules requires written permission from the author.

    Except as provided by fair use provisions of the United States copyright law, no extracts or quotations from this file may be used that do not consist of whole pages unless permission has been granted by the author (and by Birkhäuser Boston if appropriate).

    The permission granted for use of the whole file and the prohibition against charging fees extend to any partial file that contains only whole pages from this file, except that the copyright notice on this page must be included in any partial file that does not consist exclusively of the front cover page. Such a partial file shall not be included in any derivative work unless permission has been granted by the author (and by Birkhäuser Boston if appropriate).

    Inquiries concerning print copies of either edition should be directed to Springer Science+Business Media Inc.


  • To Susan

    and

    To My Children, Sarah and William,

    and

    To My Algebra Teachers:

    Ralph Fox, John Fraleigh, Robert Gunning,

    John Kemeny, Bertram Kostant, Robert Langlands,

    Goro Shimura, Hale Trotter, Richard Williamson

  • CONTENTS

    Contents of Advanced Algebra x
    Preface to the Second Edition xi
    Preface to the First Edition xiii
    List of Figures xvii
    Dependence Among Chapters xix
    Standard Notation xx
    Guide for the Reader xxi

    I. PRELIMINARIES ABOUT THE INTEGERS, POLYNOMIALS, AND MATRICES 1
      1. Division and Euclidean Algorithms 1
      2. Unique Factorization of Integers 4
      3. Unique Factorization of Polynomials 9
      4. Permutations and Their Signs 15
      5. Row Reduction 19
      6. Matrix Operations 24
      7. Problems 30

    II. VECTOR SPACES OVER Q, R, AND C 33
      1. Spanning, Linear Independence, and Bases 33
      2. Vector Spaces Defined by Matrices 38
      3. Linear Maps 42
      4. Dual Spaces 50
      5. Quotients of Vector Spaces 54
      6. Direct Sums and Direct Products of Vector Spaces 58
      7. Determinants 65
      8. Eigenvectors and Characteristic Polynomials 73
      9. Bases in the Infinite-Dimensional Case 78
      10. Problems 82

    III. INNER-PRODUCT SPACES 89
      1. Inner Products and Orthonormal Sets 89
      2. Adjoints 99
      3. Spectral Theorem 105
      4. Problems 112


    IV. GROUPS AND GROUP ACTIONS 117
      1. Groups and Subgroups 118
      2. Quotient Spaces and Homomorphisms 129
      3. Direct Products and Direct Sums 135
      4. Rings and Fields 141
      5. Polynomials and Vector Spaces 148
      6. Group Actions and Examples 159
      7. Semidirect Products 167
      8. Simple Groups and Composition Series 171
      9. Structure of Finitely Generated Abelian Groups 176
      10. Sylow Theorems 185
      11. Categories and Functors 189
      12. Problems 200

    V. THEORY OF A SINGLE LINEAR TRANSFORMATION 211
      1. Introduction 211
      2. Determinants over Commutative Rings with Identity 215
      3. Characteristic and Minimal Polynomials 218
      4. Projection Operators 226
      5. Primary Decomposition 228
      6. Jordan Canonical Form 231
      7. Computations with Jordan Form 238
      8. Problems 241

    VI. MULTILINEAR ALGEBRA 248
      1. Bilinear Forms and Matrices 249
      2. Symmetric Bilinear Forms 253
      3. Alternating Bilinear Forms 256
      4. Hermitian Forms 258
      5. Groups Leaving a Bilinear Form Invariant 260
      6. Tensor Product of Two Vector Spaces 263
      7. Tensor Algebra 277
      8. Symmetric Algebra 283
      9. Exterior Algebra 291
      10. Problems 295

    VII. ADVANCED GROUP THEORY 306
      1. Free Groups 306
      2. Subgroups of Free Groups 317
      3. Free Products 322
      4. Group Representations 329
      5. Burnside's Theorem 345
      6. Extensions of Groups 347
      7. Problems 360

    VIII. COMMUTATIVE RINGS AND THEIR MODULES 370
      1. Examples of Rings and Modules 370
      2. Integral Domains and Fields of Fractions 381
      3. Prime and Maximal Ideals 384
      4. Unique Factorization 387
      5. Gauss's Lemma 393
      6. Finitely Generated Modules 399
      7. Orientation for Algebraic Number Theory and Algebraic Geometry 411
      8. Noetherian Rings and the Hilbert Basis Theorem 417
      9. Integral Closure 420
      10. Localization and Local Rings 428
      11. Dedekind Domains 437
      12. Problems 443

    IX. FIELDS AND GALOIS THEORY 452
      1. Algebraic Elements 453
      2. Construction of Field Extensions 457
      3. Finite Fields 461
      4. Algebraic Closure 464
      5. Geometric Constructions by Straightedge and Compass 468
      6. Separable Extensions 474
      7. Normal Extensions 481
      8. Fundamental Theorem of Galois Theory 484
      9. Application to Constructibility of Regular Polygons 489
      10. Application to Proving the Fundamental Theorem of Algebra 492
      11. Application to Unsolvability of Polynomial Equations with Nonsolvable Galois Group 493
      12. Construction of Regular Polygons 499
      13. Solution of Certain Polynomial Equations with Solvable Galois Group 506
      14. Proof That π Is Transcendental 515
      15. Norm and Trace 519
      16. Splitting of Prime Ideals in Extensions 526
      17. Two Tools for Computing Galois Groups 532
      18. Problems 539


    X. MODULES OVER NONCOMMUTATIVE RINGS 553
      1. Simple and Semisimple Modules 553
      2. Composition Series 560
      3. Chain Conditions 565
      4. Hom and End for Modules 567
      5. Tensor Product for Modules 574
      6. Exact Sequences 583
      7. Problems 587

    APPENDIX 593
      A1. Sets and Functions 593
      A2. Equivalence Relations 599
      A3. Real Numbers 601
      A4. Complex Numbers 604
      A5. Partial Orderings and Zorn's Lemma 605
      A6. Cardinality 610

    Hints for Solutions of Problems 615
    Selected References 715
    Index of Notation 717
    Index 721

    CONTENTS OF ADVANCED ALGEBRA

    I. Transition to Modern Number Theory
    II. Wedderburn–Artin Ring Theory
    III. Brauer Group
    IV. Homological Algebra
    V. Three Theorems in Algebraic Number Theory
    VI. Reinterpretation with Adeles and Ideles
    VII. Infinite Field Extensions
    VIII. Background for Algebraic Geometry
    IX. The Number Theory of Algebraic Curves
    X. Methods of Algebraic Geometry

  • PREFACE TO THE SECOND EDITION

    In the years since publication of the first edition of Basic Algebra, many readers have reacted to the book by sending comments, suggestions, and corrections. People especially approved of the inclusion of some linear algebra before any group theory, and they liked the ideas of proceeding from the particular to the general and of giving examples of computational techniques right from the start. They appreciated the overall comprehensive nature of the book, associating this feature with the large number of problems that develop so many sidelights and applications of the theory.

    Along with the general comments and specific suggestions were corrections, and there were enough corrections, perhaps a hundred in all, so that a second edition now seems to be in order. Many of the corrections were of minor matters, yet readers should not have to cope with errors along with new material. Fortunately no results in the first edition needed to be deleted or seriously modified, and additional results and problems could be included without renumbering.

    For the first edition, the author granted a publishing license to Birkhäuser Boston that was limited to print media, leaving the question of electronic publication unresolved. The main change with the second edition is that the question of electronic publication has now been resolved, and a PDF file, called the "digital second edition," is being made freely available to everyone worldwide for personal use. This file may be downloaded from the author's own Web page and from elsewhere.

    The main changes to the text of the first edition of Basic Algebra are as follows:

    • The corrections sent by readers and by reviewers have been made. The most significant such correction was a revision to the proof of Zorn's Lemma, the earlier proof having had a gap.

    • A number of problems have been added at the ends of the chapters, most of them with partial or full solutions added to the section of Hints at the back of the book. Of particular note are problems on the following topics:
      (a) (Chapter II) the relationship in two and three dimensions between determinants and areas or volumes,
      (b) (Chapters V and IX) further aspects of canonical forms for matrices and linear mappings,
      (c) (Chapter VIII) amplification of uses of the Fundamental Theorem of Finitely Generated Modules over principal ideal domains,


      (d) (Chapter IX) the interplay of extension of scalars and Galois theory,
      (e) (Chapter IX) properties and examples of ordered fields and real closed fields.

    • Some revisions have been made to the chapter on field theory (Chapter IX). It was originally expected, and it continues to be expected, that a reader who wants a fuller treatment of fields will look also at the chapter on infinite field extensions in Advanced Algebra. However, the original placement of the break between volumes left some possible confusion about the role of normal extensions in field theory, and that matter has now been resolved.

    • Characteristic polynomials initially have a variable λ as a reminder of how they arise from eigenvalues. But it soon becomes important to think of them as abstract polynomials, not as polynomial functions. The indeterminate had been left as λ throughout most of the book in the original edition, and some confusion resulted. The indeterminate is now called X rather than λ from Chapter V on, and characteristic polynomials have been treated unambiguously thereafter as abstract polynomials.

    • Occasional paragraphs have been added that point ahead to material in Advanced Algebra.

    The preface to the first edition mentioned three themes that recur throughout and blend together at times: the analogy between integers and polynomials in one variable over a field, the interplay between linear algebra and group theory, and the relationship between number theory and geometry. A fourth is the gentle mention of notions in category theory to tie together phenomena that occur in different areas of algebra; an example of such a notion is "universal mapping property." Readers will benefit from looking for these and other such themes, since recognizing them helps one get a view of the whole subject at once.

    It was Benjamin Levitt, Birkhäuser mathematics editor in New York, who encouraged the writing of a second edition, who made a number of suggestions about pursuing it, and who passed along comments from several anonymous referees about the strengths and weaknesses of the book. I am especially grateful to those readers who have sent me comments over the years. Many corrections and suggestions were kindly pointed out to the author by Skip Garibaldi of Emory University and Ario Contact of Shiraz, Iran. The long correction concerning Zorn's Lemma resulted from a discussion with Qiu Ruyue. The typesetting was done by the program Textures using AMS-TEX, and the figures were drawn with Mathematica.

    Just as with the first edition, I invite corrections and other comments from readers. For as long as I am able, I plan to point to a list of known corrections from my own Web page, www.math.stonybrook.edu/aknapp.

    A. W. KNAPP
    January 2016

  • PREFACE TO THE FIRST EDITION

    Basic Algebra and its companion volume Advanced Algebra systematically develop concepts and tools in algebra that are vital to every mathematician, whether pure or applied, aspiring or established. These two books together aim to give the reader a global view of algebra, its use, and its role in mathematics as a whole. The idea is to explain what the young mathematician needs to know about algebra in order to communicate well with colleagues in all branches of mathematics.

    The books are written as textbooks, and their primary audience is students who are learning the material for the first time and who are planning a career in which they will use advanced mathematics professionally. Much of the material in the books, particularly in Basic Algebra but also in some of the chapters of Advanced Algebra, corresponds to normal course work. The books include further topics that may be skipped in required courses but that the professional mathematician will ultimately want to learn by self-study. The test of each topic for inclusion is whether it is something that a plenary lecturer at a broad international or national meeting is likely to take as known by the audience.

    The key topics and features of Basic Algebra are as follows:

    • Linear algebra and group theory build on each other throughout the book. A small amount of linear algebra is introduced first, as the topic likely to be better known by the reader ahead of time, and then a little group theory is introduced, with linear algebra providing important examples.

    • Chapters on linear algebra develop notions related to vector spaces, the theory of linear transformations, bilinear forms, classical linear groups, and multilinear algebra.

    • Chapters on modern algebra treat groups, rings, fields, modules, and Galois groups, including many uses of Galois groups and methods of computation.

    • Three prominent themes recur throughout and blend together at times: the analogy between integers and polynomials in one variable over a field, the interplay between linear algebra and group theory, and the relationship between number theory and geometry.

    • The development proceeds from the particular to the general, often introducing examples well before a theory that incorporates them.

    • More than 400 problems at the ends of chapters illuminate aspects of the text, develop related topics, and point to additional applications. A separate


    90-page section "Hints for Solutions of Problems" at the end of the book gives detailed hints for most of the problems, complete solutions for many.

    • Applications such as the fast Fourier transform, the theory of linear error-correcting codes, the use of Jordan canonical form in solving linear systems of ordinary differential equations, and constructions of interest in mathematical physics arise naturally in sequences of problems at the ends of chapters and illustrate the power of the theory for use in science and engineering.

    Basic Algebra endeavors to show some of the interconnections between different areas of mathematics, beyond those listed above. Here are examples: Systems of orthogonal functions make an appearance with inner-product spaces. Covering spaces naturally play a role in the examination of subgroups of free groups. Cohomology of groups arises from considering group extensions. Use of the power-series expansion of the exponential function combines with algebraic numbers to prove that π is transcendental. Harmonic analysis on a cyclic group explains the mysterious method of Lagrange resolvents in the theory of Galois groups.

    Algebra plays a singular role in mathematics by having been developed so extensively at such an early date. Indeed, the major discoveries of algebra even from the days of Hilbert are well beyond the knowledge of most nonalgebraists today. Correspondingly most of the subject matter of the present book is at least 100 years old. What has changed over the intervening years concerning algebra books at this level is not so much the mathematics as the point of view toward the subject matter and the relative emphasis on and generality of various topics. For example, in the 1920s Emmy Noether introduced vector spaces and linear mappings to reinterpret coordinate spaces and matrices, and she defined the ingredients of what was then called "modern algebra": the axiomatically defined rings, fields, and modules, and their homomorphisms. The introduction of categories and functors in the 1940s shifted the emphasis even more toward the homomorphisms and away from the objects themselves. The creation of homological algebra in the 1950s gave a unity to algebraic topics cutting across many fields of mathematics. Category theory underwent a period of great expansion in the 1950s and 1960s, followed by a contraction and a return more to a supporting role. The emphasis in topics shifted. Linear algebra had earlier been viewed as a separate subject, with many applications, while group theory and the other topics had been viewed as having few applications. Coding theory, cryptography, and advances in physics and chemistry have changed all that, and now linear algebra and group theory together permeate mathematics and its applications. The other subjects build on them, and they too have extensive applications in science and engineering, as well as in the rest of mathematics.

    Basic Algebra presents its subject matter in a forward-looking way that takes this evolution into account. It is suitable as a text in a two-semester advanced


    undergraduate or first-year graduate sequence in algebra. Depending on the graduate school, it may be appropriate to include also some material from Advanced Algebra. Briefly the topics in Basic Algebra are linear algebra and group theory, rings, fields, and modules. A full list of the topics in Advanced Algebra appears on page x; of these, the Wedderburn theory of semisimple algebras, homological algebra, and foundational material for algebraic geometry are the ones that most commonly appear in syllabi of first-year graduate courses.

    A chart on page xix tells the dependence among chapters and can help with preparing a syllabus. Chapters I–VII treat linear algebra and group theory at various levels, except that three sections of Chapter IV and one of Chapter V introduce rings and fields, polynomials, categories and functors, and determinants over commutative rings with identity. Chapter VIII concerns rings, with emphasis on unique factorization; Chapter IX concerns field extensions and Galois theory, with emphasis on applications of Galois theory; and Chapter X concerns modules and constructions with modules.

    For a graduate-level sequence the syllabus is likely to include all of Chapters I–V and parts of Chapters VIII and IX, at a minimum. Depending on the knowledge of the students ahead of time, it may be possible to skim much of the first three chapters and some of the beginning of the fourth; then time may allow for some of Chapters VI and VII, or additional material from Chapters VIII and IX, or some of the topics in Advanced Algebra. For many of the topics in Advanced Algebra, parts of Chapter X of Basic Algebra are prerequisite.

    For an advanced undergraduate sequence the first semester can include Chapters I through III except Section II.9, plus the first six sections of Chapter IV and as much as reasonable from Chapter V; the notion of category does not appear in this material. The second semester will involve categories very gently; the course will perhaps treat the remainder of Chapter IV, the first five or six sections of Chapter VIII, and at least Sections 1–3 and 5 of Chapter IX.

    More detailed information about how the book can be used with courses can be deduced by using the chart on page xix in conjunction with the section Guide for the Reader on pages xxi–xxiv. In my own graduate teaching, I have built one course around Chapters I–III, Sections 1–6 of Chapter IV, all of Chapter V, and about half of Chapter VI. A second course dealt with the remainder of Chapter IV, a little of Chapter VII, Sections 1–6 of Chapter VIII, and Sections 1–11 of Chapter IX.

    The problems at the ends of chapters are intended to play a more important role than is normal for problems in a mathematics book. Almost all problems are solved in the section of hints at the end of the book. This being so, some blocks of problems form additional topics that could have been included in the text but were not; these blocks may either be regarded as optional topics, or they may be treated as challenges for the reader. The optional topics of this kind


    usually either carry out further development of the theory or introduce significant applications. For example one block of problems at the end of Chapter VII carries the theory of representations of finite groups a little further by developing the Poisson summation formula and the fast Fourier transform. For a second example blocks of problems at the ends of Chapters IV, VII, and IX introduce linear error-correcting codes as an application of the theory in those chapters.

    Not all problems are of this kind, of course. Some of the problems are really pure or applied theorems, some are examples showing the degree to which hypotheses can be stretched, and a few are just exercises. The reader gets no indication which problems are of which type, nor of which ones are relatively easy. Each problem can be solved with tools developed up to that point in the book, plus any additional prerequisites that are noted.

    Beyond a standard one-variable calculus course, the most important prerequisite for using Basic Algebra is that the reader already know what a proof is, how to read a proof, and how to write a proof. This knowledge typically is obtained from honors calculus courses, or from a course in linear algebra, or from a first junior–senior course in real variables. In addition, it is assumed that the reader is comfortable with a small amount of linear algebra, including matrix computations, row reduction of matrices, solutions of systems of linear equations, and the associated geometry. Some prior exposure to groups is helpful but not really necessary.

    The theorems, propositions, lemmas, and corollaries within each chapter are indexed by a single number stream. Figures have their own number stream, and one can find the page reference for each figure from the table on pages xvii–xviii. Labels on displayed lines occur only within proofs and examples, and they are local to the particular proof or example in progress. Some readers like to skim or skip proofs on first reading; to facilitate this procedure, each occurrence of the word PROOF is matched by an occurrence at the right margin of the symbol □ to mark the end of that proof.

    I am grateful to Ann Kostant and Steven Krantz for encouraging this project and for making many suggestions about pursuing it. I am especially indebted to an anonymous referee, who made detailed comments about many aspects of a preliminary version of the book, and to David Kramer, who did the copyediting. The typesetting was by AMS-TEX, and the figures were drawn with Mathematica.

    I invite corrections and other comments from readers. I plan to maintain a list of known corrections on my own Web page.

    A. W. KNAPP
    August 2006

  • LIST OF FIGURES

    2.1. The vector space of lines v + U in R2 parallel to a given line U through the origin 55
    2.2. Factorization of linear maps via a quotient of vector spaces 56
    2.3. Three 1-dimensional vector subspaces of R2 such that each pair has intersection 0 62
    2.4. Universal mapping property of a direct product of vector spaces 64
    2.5. Universal mapping property of a direct sum of vector spaces 65
    2.6. Area of a parallelogram as a difference of areas 88
    3.1. Geometric interpretation of the parallelogram law 92
    3.2. Resolution of a vector into a parallel component and an orthogonal component 94
    4.1. Factorization of homomorphisms of groups via the quotient of a group by a normal subgroup 133
    4.2. Universal mapping property of an external direct product of groups 137
    4.3. Universal mapping property of a direct product of groups 137
    4.4. Universal mapping property of an external direct sum of abelian groups 139
    4.5. Universal mapping property of a direct sum of abelian groups 140
    4.6. Factorization of homomorphisms of rings via the quotient of a ring by an ideal 147
    4.7. Substitution homomorphism for polynomials in one indeterminate 151
    4.8. Substitution homomorphism for polynomials in n indeterminates 157
    4.9. A square diagram 194
    4.10. Diagrams obtained by applying a covariant functor and a contravariant functor 195
    4.11. Universal mapping property of a product in a category 196
    4.12. Universal mapping property of a coproduct in a category 198
    5.1. Example of a nilpotent matrix in Jordan form 234
    5.2. Powers of the nilpotent matrix in Figure 5.1 234
    6.1. Universal mapping property of a tensor product 264
    6.2. Diagrams for uniqueness of a tensor product 264


    6.3. Commutative diagram of a natural transformation {TX} 268
    6.4. Commutative diagram of a triple tensor product 277
    6.5. Universal mapping property of a tensor algebra 282
    7.1. Universal mapping property of a free group 308
    7.2. Universal mapping property of a free product 323
    7.3. An intertwining operator for two representations 333
    7.4. Equivalent group extensions 352
    8.1. Universal mapping property of the integral group ring of G 374
    8.2. Universal mapping property of a free left R module 377
    8.3. Factorization of R homomorphisms via a quotient of R modules 379
    8.4. Universal mapping property of the group algebra RG 381
    8.5. Universal mapping property of the field of fractions of R 383
    8.6. Real points of the curve y² = (x − 1)x(x + 1) 412
    8.7. Universal mapping property of the localization of R at S 431
    9.1. Closure of positive constructible x coordinates under multiplication and division 470
    9.2. Closure of positive constructible x coordinates under square roots 470
    9.3. Construction of a regular pentagon 501
    9.4. Construction of a regular 17-gon 505
    10.1. Universal mapping property of a tensor product of a right R module and a left R module 575

  • DEPENDENCE AMONG CHAPTERS

    Below is a chart of the main lines of dependence of chapters on prior chapters. The dashed lines indicate helpful motivation but no logical dependence. Apart from that, particular examples may make use of information from earlier chapters that is not indicated by the chart.

    I, II

    III

    IV.1–IV.6

    IV.7–IV.11        V

    VII        VI        VIII.1–VIII.6

    X        IX.1–IX.13        VIII.7–VIII.11

    IX.14–IX.17


  • STANDARD NOTATION

    See the Index of Notation, pp. 717–719, for symbols defined starting on page 1.

    Item                                Meaning
    #S or |S|                           number of elements in S
    ∅                                   empty set
    {x ∈ E | P}                         the set of x in E such that P holds
    Eᶜ                                  complement of the set E
    E ∪ F, E ∩ F, E − F                 union, intersection, difference of sets
    ⋃α Eα, ⋂α Eα                        union, intersection of the sets Eα
    E ⊆ F, E ⊇ F                        E is contained in F, E contains F
    E ⊊ F, E ⊋ F                        E properly contained in F, properly contains F
    E × F, ⨉s∈S Xs                      products of sets
    (a1, . . . , an), {a1, . . . , an}  ordered n-tuple, unordered n-tuple
    f : E → F, x ↦ f(x)                 function, effect of function
    f ∘ g or fg, f|E                    composition of g followed by f, restriction to E
    f( · , y)                           the function x ↦ f(x, y)
    f(E), f⁻¹(E)                        direct and inverse image of a set
    δij                                 Kronecker delta: 1 if i = j, 0 if i ≠ j
    (n k)                               binomial coefficient
    n positive, n negative              n > 0, n < 0
    Z, Q, R, C                          integers, rationals, reals, complex numbers
    max (and similarly min)             maximum of a finite subset of a totally ordered set
    ∑ or ∏                              sum or product, possibly with a limit operation
    countable                           finite or in one-one correspondence with Z
    [x]                                 greatest integer ≤ x if x is real
    Re z, Im z                          real and imaginary parts of complex z
    z̄                                   complex conjugate of z
    |z|                                 absolute value of z
    1                                   multiplicative identity
    1 or I                              identity matrix or operator
    1X                                  identity function on X
    Qⁿ, Rⁿ, Cⁿ                          spaces of column vectors
    diag(a1, . . . , an)                diagonal matrix
    ≅                                   is isomorphic to, is equivalent to


  • GUIDE FOR THE READER

    This section is intended to help the reader find out what parts of each chapter are most important and how the chapters are interrelated. Further information of this kind is contained in the abstracts that begin each of the chapters.

    The book pays attention to at least three recurring themes in algebra, allowing a person to see how these themes arise in increasingly sophisticated ways. These are the analogy between integers and polynomials in one indeterminate over a field, the interplay between linear algebra and group theory, and the relationship between number theory and geometry. Keeping track of how these themes evolve will help the reader understand the mathematics better and anticipate where it is headed.

    In Chapter I the analogy between integers and polynomials in one indeterminate over the rationals, reals, or complex numbers appears already in the first three sections. The main results of these sections are theorems about unique factorization in each of the two settings. The relevant parts of the underlying structures for the two settings are the same, and unique factorization can therefore be proved in both settings by the same argument. Many readers will already know this unique factorization, but it is worth examining the parallel structure and proof at least quickly before turning to the chapters that follow.

    Before proceeding very far into the book, it is worth looking also at the appendix to see whether all its topics are familiar. Readers will find Section A1 useful at least for its summary of set-theoretic notation and for its emphasis on the distinction between range and image for a function. This distinction is usually unimportant in analysis but becomes increasingly important as one studies more advanced topics in algebra. Readers who have not specifically learned about equivalence relations and partial orderings can learn about them from Sections A2 and A5. Sections A3 and A4 concern the real and complex numbers; the emphasis is on notation and the Intermediate Value Theorem, which plays a role in proving the Fundamental Theorem of Algebra. Zorn's Lemma and cardinality in Sections A5 and A6 are usually unnecessary in an undergraduate course. They arise most importantly in Sections II.9 and IX.4, which are normally omitted in an undergraduate course, and in Proposition 8.8, which is invoked only in the last few sections of Chapter VIII.

    The remainder of this section is an overview of individual chapters and pairs of chapters.


Guide for the Reader

Chapter I is in three parts. The first part, as mentioned above, establishes unique factorization for the integers and for polynomials in one indeterminate over the rationals, reals, or complex numbers. The second part defines permutations and shows that they have signs such that the sign of any composition is the product of the signs; this result is essential for defining general determinants in Section II.7. The third part will likely be a review for all readers. It establishes notation for row reduction of matrices and for operations on matrices, and it uses row reduction to show that a one-sided inverse for a square matrix is a two-sided inverse.

Chapters II–III treat the fundamentals of linear algebra. Whereas the matrix

computations in Chapter I were concrete, Chapters II–III are relatively abstract. Much of this material is likely to be a review for graduate students. The geometric interpretation of vector spaces, subspaces, and linear mappings is not included in the chapter, being taken as known previously. The fundamental idea that a newly constructed object might be characterized by a "universal mapping property" appears for the first time in Chapter II, and it appears more and more frequently throughout the book. One aspect of this idea is that it is sometimes not so important what certain constructed objects are, but what they do. A related idea being emphasized is that the mappings associated with a newly constructed object are likely to be as important as the object, if not more so; at the least, one needs to stop and find what those mappings are. Section II.9 uses Zorn's Lemma and can be deferred until Chapter IX if one wants. Chapter III discusses special features of real and complex vector spaces endowed with inner products. The main result is the Spectral Theorem in Section 3. Many of the problems at the end of the chapter make contact with real analysis. The subject of linear algebra continues in Chapter V.

Chapter IV is the primary chapter on group theory and may be viewed as in

three parts. Sections 1–6 form the first part, which is essential for all later chapters in the book. Sections 1–3 introduce groups and some associated constructions, along with a number of examples. Many of the examples will be seen to be related to specific or general vector spaces, and thus the theme of the interplay between group theory and linear algebra is appearing concretely for the first time. In practice, many examples of groups arise in the context of group actions, and abstract group actions are defined in Section 6. Of particular interest are group representations, which are group actions on a vector space by linear mappings. Sections 4–5 are a digression to define rings, fields, and ring homomorphisms, and to extend the theories concerning polynomials and vector spaces as presented in Chapters I–II. The immediate purpose of the digression is to make prime fields, their associated multiplicative groups, and the notion of characteristic available for the remainder of the chapter. The definition of vector space is extended to allow scalars from any field. The definition of polynomial is extended to allow coefficients from any commutative ring with identity, rather than just the


rationals or reals or complex numbers, and to allow more than one indeterminate. Universal mapping properties for polynomial rings are proved. Sections 7–10 form the second part of the chapter and are a continuation of group theory. The main result is the Fundamental Theorem of Finitely Generated Abelian Groups, which is in Section 9. Section 11 forms the third part of the chapter. This section is a gentle introduction to categories and functors, which are useful for working with parallel structures in different settings within algebra. As S. Mac Lane says in his book, "Category theory asks of every type of Mathematical object: 'What are the morphisms?'; it suggests that these morphisms should be described at the same time as the objects. . . . This emphasis on (homo)morphisms is largely due to Emmy Noether, who emphasized the use of homomorphisms of groups and rings." The simplest parallel structure reflected in categories is that of an isomorphism. The section also discusses general notions of product and coproduct functors. Examples of products are direct products in linear algebra and in group theory. Examples of coproducts are direct sums in linear algebra and in abelian group theory, as well as disjoint unions in set theory. The theory in this section helps in unifying the mathematics that is to come in Chapters VI–VIII and X. The subject of group theory is continued in Chapter VII, which assumes knowledge of the material on category theory.

Chapters V and VI continue the development of linear algebra. Chapter VI uses

categories, but Chapter V does not. Most of Chapter V concerns the analysis of a linear transformation carrying a finite-dimensional vector space over a field into itself. The questions are to find invariants of such transformations and to classify the transformations up to similarity. Section 2 at the start extends the theory of determinants so that the matrices are allowed to have entries in a commutative ring with identity; this extension is necessary in order to be able to work easily with characteristic polynomials. The extension of this theory is carried out by an important principle known as the "permanence of identities." Chapter VI largely concerns bilinear forms and tensor products, again in the context that the coefficients are from a field. This material is necessary in many applications to geometry and physics, but it is not needed in Chapters VII–IX. Many objects in the chapter are constructed in such a way that they are uniquely determined by a universal mapping property. Problems 18–22 at the end of the chapter discuss universal mapping properties in the general context of category theory, and they show that a uniqueness theorem is automatic in all cases.

Chapter VII continues the development of group theory, making use of category

theory. It is in two parts. Sections 1–3 concern free groups and the topic of generators and relations; they are essential for abstract descriptions of groups and for work in topology involving fundamental groups. Section 3 constructs a notion of free product and shows that it is the coproduct functor for the category of groups. Sections 4–6 continue the theme of the interplay of group theory and


linear algebra. Section 4 analyzes group representations of a finite group when the underlying field is the complex numbers, and Section 5 applies this theory to obtain a conclusion about the structure of finite groups. Section 6 studies extensions of groups and uses them to motivate the subject of cohomology of groups.

Chapter VIII introduces modules, giving many examples in Section 1, and

then goes on to discuss questions of unique factorization in integral domains. Section 6 obtains a generalization for principal ideal domains of the Fundamental Theorem of Finitely Generated Abelian Groups, once again illustrating the first theme, similarities between the integers and certain polynomial rings. Section 7 introduces the third theme, the relationship between number theory and geometry, as a more sophisticated version of the first theme. The section compares a certain polynomial ring in two variables with a certain ring of algebraic integers that extends the ordinary integers. Unique factorization of elements fails for both, but the geometric setting has a more geometrically meaningful factorization in terms of ideals that is evidently unique. This kind of unique factorization turns out to work for the ring of algebraic integers as well. Sections 8–11 expand the examples in Section 7 into a theory of unique factorization of ideals in any integrally closed Noetherian domain whose nonzero prime ideals are all maximal.

Chapter IX analyzes algebraic extensions of fields. The first 13 sections

make use only of Sections 1–6 in Chapter VIII. Sections 1–5 of Chapter IX give the foundational theory, which is sufficient to exhibit all the finite fields and to prove that certain classically proposed constructions in Euclidean geometry are impossible. Sections 6–8 introduce Galois theory, but Theorem 9.28 and its three corollaries may be skipped if Sections 14–17 are to be omitted. Sections 9–11 give a first round of applications of Galois theory: Gauss's theorem about which regular n-gons are in principle constructible with straightedge and compass, the Fundamental Theorem of Algebra, and the Abel–Galois theorem that solvability of a polynomial equation with rational coefficients in terms of radicals implies solvability of the Galois group. Sections 12–13 give a second round of applications: Gauss's method in principle for actually constructing the constructible regular n-gons and a converse to the Abel–Galois theorem. Sections 14–17 make use of Sections 7–11 of Chapter VIII, proving that π is transcendental and obtaining two methods for computing Galois groups.

Chapter X is a relatively short chapter developing further tools for dealing

with modules over a ring with identity. The main construction is that of the tensor product over a ring of a unital right module and a unital left module, the result being an abelian group. The chapter makes use of material from Chapters VI and VIII, but not from Chapter IX.


CHAPTER I

Preliminaries about the Integers, Polynomials, and Matrices

Abstract. This chapter is mostly a review, discussing unique factorization of positive integers, unique factorization of polynomials whose coefficients are rational or real or complex, signs of permutations, and matrix algebra.

Sections 1–2 concern unique factorization of positive integers. Section 1 proves the division and Euclidean algorithms, used to compute greatest common divisors. Section 2 establishes unique factorization as a consequence and gives several number-theoretic consequences, including the Chinese Remainder Theorem and the evaluation of the Euler φ function.

Section 3 develops unique factorization of rational and real and complex polynomials in one indeterminate completely analogously, and it derives the complete factorization of complex polynomials from the Fundamental Theorem of Algebra. The proof of the fundamental theorem is postponed to Chapter IX.

Section 4 discusses permutations of a finite set, establishing the decomposition of each permutation as a disjoint product of cycles. The sign of a permutation is introduced, and it is proved that the sign of a product is the product of the signs.

Sections 5–6 concern matrix algebra. Section 5 reviews row reduction and its role in the solution of simultaneous linear equations. Section 6 defines the arithmetic operations of addition, scalar multiplication, and multiplication of matrices. The process of matrix inversion is related to the method of row reduction, and it is shown that a square matrix with a one-sided inverse automatically has a two-sided inverse that is computable via row reduction.

    1. Division and Euclidean Algorithms

The first three sections give a careful proof of unique factorization for integers and for polynomials with rational or real or complex coefficients, and they give an indication of some first consequences of this factorization. For the moment let us restrict attention to the set Z of integers. We take addition, subtraction, and multiplication within Z as established, as well as the properties of the usual ordering in Z.

A factor of an integer n is a nonzero integer k such that n = kl for some

integer l. In this case we say also that k divides n, that k is a divisor of n, and that n is a multiple of k. We write k | n for this relationship. If n is nonzero, any product formula n = kl1 · · · lr is a factorization of n. A unit in Z is a divisor



of 1, hence is either +1 or −1. The factorization n = kl of n ≠ 0 is called nontrivial if neither k nor l is a unit. An integer p > 1 is said to be prime if it has no nontrivial factorization p = kl.

The statement of unique factorization for positive integers, which will be given

precisely in Section 2, says roughly that each positive integer is the product of primes and that this decomposition is unique apart from the order of the factors.¹ Existence will follow by an easy induction. The difficulty is in the uniqueness. We shall prove uniqueness by a sequence of steps based on the Euclidean algorithm, which we discuss in a moment. In turn, the Euclidean algorithm relies on the following.

Proposition 1.1 (division algorithm). If a and b are integers with b ≠ 0, then there exist unique integers q and r such that a = bq + r and 0 ≤ r < |b|.

PROOF. Possibly replacing q by −q, we may assume that b > 0. The integers n with bn ≤ a are bounded above by |a|, and there exists such an n, namely n = −|a|. Therefore there is a largest such integer, say n = q. Set r = a − bq. Then 0 ≤ r and a = bq + r. If r ≥ b, then r − b ≥ 0 says that a = b(q + 1) + (r − b) ≥ b(q + 1). The inequality q + 1 > q contradicts the maximality of q, and we conclude that r < b. This proves existence.

For uniqueness when b > 0, suppose a = bq1 + r1 = bq2 + r2. Subtracting, we obtain b(q1 − q2) = r2 − r1 with |r2 − r1| < b, and this is a contradiction unless r2 − r1 = 0.
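The q and r of Proposition 1.1 are easy to compute. A minimal Python sketch (the function name is my own): Python's divmod uses floor division, which already yields 0 ≤ r < |b| when the divisor is positive, and for b < 0 one replaces q by −q exactly as in the proof.

```python
def division_algorithm(a, b):
    """Return (q, r) with a == b*q + r and 0 <= r < |b|, as in Proposition 1.1."""
    if b == 0:
        raise ValueError("b must be nonzero")
    q, r = divmod(a, abs(b))   # floor division: 0 <= r < |b|
    if b < 0:                  # replacing q by -q, as in the proof
        q = -q
    return q, r

print(division_algorithm(13, 5))    # (2, 3)
print(division_algorithm(-13, 5))   # (-3, 2), since -13 = 5*(-3) + 2
print(division_algorithm(13, -5))   # (-2, 3), since 13 = (-5)*(-2) + 3
```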

Let a and b be integers not both 0. The greatest common divisor of a and b is the largest integer d > 0 such that d | a and d | b. Let us see existence. The integer 1 divides a and b. If b, for example, is nonzero, then any such d has |d| ≤ |b|, and hence the greatest common divisor indeed exists. We write d = GCD(a, b).

Let us suppose that b ≠ 0. The Euclidean algorithm consists of iterated application of the division algorithm (Proposition 1.1) to a and b until the remainder term r disappears:

a = bq1 + r1,            0 ≤ r1 < b,
b = r1q2 + r2,           0 ≤ r2 < r1,
r1 = r2q3 + r3,          0 ≤ r3 < r2,
...
r_{n−2} = r_{n−1}qn + rn,   0 ≤ rn < r_{n−1}   (with rn ≠ 0, say),
r_{n−1} = rnq_{n+1}.

¹It is to be understood that the prime factorization of 1 is as the empty product.


The process must stop with some remainder term r_{n+1} equal to 0 in this way since b > r1 > r2 > · · · ≥ 0. The last nonzero remainder term, namely rn above, will be of interest to us.

    EXAMPLE. For a = 13 and b = 5, the steps read

13 = 5 · 2 + 3,
 5 = 3 · 1 + 2,
 3 = 2 · 1 + [1],
 2 = 1 · 2.

    The last nonzero remainder term is written with a box around it.
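The steps above can be sketched in code. The following Python function (an illustration with names of my own choosing, not part of the text) records each division step; the last nonzero remainder is the rn of the discussion.

```python
def euclidean_algorithm(a, b):
    """Run the division steps a = b*q1 + r1, b = r1*q2 + r2, ...
    and return the list of steps; the last nonzero remainder is r_n."""
    steps = []
    while b != 0:
        q, r = divmod(a, b)
        steps.append((a, b, q, r))
        a, b = b, r                 # shift: divide the old divisor by the remainder
    return steps

for a, b, q, r in euclidean_algorithm(13, 5):
    print(f"{a} = {b}*{q} + {r}")
# 13 = 5*2 + 3
# 5 = 3*1 + 2
# 3 = 2*1 + 1
# 2 = 1*2 + 0
```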

Proposition 1.2. Let a and b be integers with b ≠ 0, and let d = GCD(a, b). Then

(a) the number rn in the Euclidean algorithm is exactly d,
(b) any divisor d′ of both a and b necessarily divides d,
(c) there exist integers x and y such that ax + by = d.

REMARK. Proposition 1.2c is sometimes called Bezout's identity.

EXAMPLE, CONTINUED. We rewrite the steps of the Euclidean algorithm, as applied in the above example with a = 13 and b = 5, so as to yield successive substitutions:

13 = 5 · 2 + 3,    3 = 13 − 5 · 2,
 5 = 3 · 1 + 2,    2 = 5 − 3 · 1 = 5 − (13 − 5 · 2) · 1 = 5 · 3 − 13 · 1,
 3 = 2 · 1 + [1],  1 = 3 − 2 · 1 = (13 − 5 · 2) − (5 · 3 − 13 · 1) · 1 = 13 · 2 − 5 · 5.

Thus we see that 1 = 13x + 5y with x = 2 and y = −5. This shows for the example that the number rn works in place of d in Proposition 1.2c, and the rest of the proof of the proposition for this example is quite easy. Let us now adjust this computation to obtain a complete proof of the proposition in general.

PROOF OF PROPOSITION 1.2. Put r0 = b and r_{−1} = a, so that

r_{k−2} = r_{k−1}qk + rk for 1 ≤ k ≤ n. (∗)

    The argument proceeds in three steps.


Step 1. We show that rn is a divisor of both a and b. In fact, from r_{n−1} = rnq_{n+1}, we have rn | r_{n−1}. Let k ≤ n, and assume inductively that rn divides r_{k−1}, . . . , r_{n−1}, rn. Then (∗) shows that rn divides r_{k−2}. Induction allows us to conclude that rn divides r_{−1}, r0, . . . , r_{n−1}. In particular, rn divides a and b.

Step 2. We prove that ax + by = rn for suitable integers x and y. In fact, we show by induction on k for k ≤ n that there exist integers x and y with ax + by = rk. For k = −1 and k = 0, this conclusion is trivial. If k ≥ 1 is given and if the result is known for k − 2 and k − 1, then we have

ax2 + by2 = r_{k−2},
ax1 + by1 = r_{k−1}    (∗∗)

for suitable integers x2, y2, x1, y1. We multiply the second of the equalities of (∗∗) by qk, subtract, and substitute into (∗). The result is

rk = r_{k−2} − r_{k−1}qk = a(x2 − qkx1) + b(y2 − qky1),

and the induction is complete. Thus ax + by = rn for suitable x and y.

Step 3. Finally we deduce (a), (b), and (c). Step 1 shows that rn divides a and b. If d′ > 0 divides both a and b, the result of Step 2 shows that d′ | rn. Thus d′ ≤ rn, and rn is the greatest common divisor. This is the conclusion of (a); (b) follows from (a) since d′ | rn, and (c) follows from (a) and Step 2.
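Step 2 of the proof is effectively an algorithm: carry Bézout coefficients along with each remainder. A hedged Python sketch (the function name is mine), which reproduces the x = 2, y = −5 of the example:

```python
def extended_gcd(a, b):
    """Return (d, x, y) with a*x + b*y = d = GCD(a, b), maintaining the
    coefficients of Step 2 of the proof through each division step."""
    x0, y0, x1, y1 = 1, 0, 0, 1     # a*1 + b*0 = a,  a*0 + b*1 = b
    while b != 0:
        q, r = divmod(a, b)
        a, b = b, r
        x0, x1 = x1, x0 - q * x1    # r_k = r_{k-2} - q_k * r_{k-1}
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0

print(extended_gcd(13, 5))  # (1, 2, -5), matching 13*2 + 5*(-5) = 1
```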

Corollary 1.3. Within Z, if c is a nonzero integer that divides a product mn and if GCD(c, m) = 1, then c divides n.

PROOF. Proposition 1.2c produces integers x and y with cx + my = 1. Multiplying by n, we obtain cnx + mny = n. Since c divides mn and divides itself, c divides both terms on the left side. Therefore it divides the right side, which is n.

Corollary 1.4. Within Z, if a and b are nonzero integers with GCD(a, b) = 1 and if both of them divide the integer m, then ab divides m.

PROOF. Proposition 1.2c produces integers x and y with ax + by = 1. Multiplying by m, we obtain amx + bmy = m, which we rewrite in integers as ab(m/b)x + ab(m/a)y = m. Since ab divides each term on the left side, it divides the right side, which is m.

    2. Unique Factorization of Integers

We come now to the theorem asserting unique factorization for the integers. The precise statement is as follows.


Theorem 1.5 (Fundamental Theorem of Arithmetic). Each positive integer n can be written as a product of primes, n = p1p2 · · · pr, with the integer 1 being written as an empty product. This factorization is unique in the following sense: if n = q1q2 · · · qs is another such factorization, then r = s and, after some reordering of the factors, qj = pj for 1 ≤ j ≤ r.

    The main step is the following lemma, which relies on Corollary 1.3.

Lemma 1.6. Within Z, if p is a prime and p divides a product ab, then p divides a or p divides b.

REMARK. Lemma 1.6 is sometimes known as Euclid's Lemma.

PROOF. Suppose that p does not divide a. Since p is prime, GCD(a, p) = 1. Taking m = a, n = b, and c = p in Corollary 1.3, we see that p divides b.

PROOF OF EXISTENCE IN THEOREM 1.5. We induct on n, the case n = 1 being handled by an empty product expansion. If the result holds for k = 1 through k = n − 1, there are two cases: n is prime and n is not prime. If n is prime, then n = n is the desired factorization. Otherwise we can write n = ab nontrivially with a > 1 and b > 1. Then a ≤ n − 1 and b ≤ n − 1, so that a and b have factorizations into primes by the inductive hypothesis. Putting them together yields a factorization into primes for n = ab.

PROOF OF UNIQUENESS IN THEOREM 1.5. Suppose that n = p1p2 · · · pr = q1q2 · · · qs with all factors prime and with r ≤ s. We prove the uniqueness by induction on r, the case r = 0 being trivial and the case r = 1 following from the definition of prime. Inductively from Lemma 1.6 we have pr | qk for some k. Since qk is prime, pr = qk. Thus we can cancel and obtain p1p2 · · · p_{r−1} = q1q2 · · · q̂k · · · qs, the hat indicating an omitted factor. By induction the factors on the two sides here are the same except for order. Thus the same conclusion is valid when comparing the two sides of the equality p1p2 · · · pr = q1q2 · · · qs. The induction is complete, and the desired uniqueness follows.

In the product expansion of Theorem 1.5, it is customary to group factors that are equal, thus writing the positive integer n as n = p1^{k1} · · · pr^{kr} with the primes pj distinct and with the integers kj all ≥ 0. This kind of decomposition is unique up to order if all factors pj^{kj} with kj = 0 are dropped, and we call it a prime factorization of n.
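A prime factorization in the grouped form above can be computed by repeated trial division, the simplest (though not fastest) method; a hedged Python sketch, not drawn from the text:

```python
def prime_factorization(n):
    """Return the prime factorization of n >= 1 as a dict {p: k},
    so that n = p1^k1 * ... * pr^kr; n = 1 gives the empty product {}."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:                      # divide out p completely
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:                                  # leftover factor is prime
        factors[n] = factors.get(n, 0) + 1
    return factors

print(prime_factorization(360))  # {2: 3, 3: 2, 5: 1}
print(prime_factorization(1))    # {}
```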

Corollary 1.7. If n = p1^{k1} · · · pr^{kr} is a prime factorization of a positive integer n, then the positive divisors d of n are exactly all products d = p1^{l1} · · · pr^{lr} with 0 ≤ lj ≤ kj for all j.


REMARK. A general divisor of n within Z is the product of a unit ±1 and a positive divisor.

PROOF. Certainly any such product divides n. Conversely if d divides n, write n = dx for some positive integer x. Apply Theorem 1.5 to d and to x, form the resulting prime factorizations, and multiply them together. Then we see from the uniqueness for the prime factorization of n that the only primes that can occur in the expansions of d and x are p1, . . . , pr and that the sum of the exponents of pj in the expansions of d and x is kj. The result follows.

If we want to compare prime factorizations for two positive integers, we can insert 0th powers of primes as necessary and thereby assume that the same primes appear in both expansions. Using this device, we obtain a formula for greatest common divisors.

Corollary 1.8. If two positive integers a and b have expansions as products of powers of r distinct primes given by a = p1^{k1} · · · pr^{kr} and b = p1^{l1} · · · pr^{lr}, then

GCD(a, b) = p1^{min(k1,l1)} · · · pr^{min(kr,lr)}.

PROOF. Let d′ be the right side of the displayed equation. It is plain that d′ is positive and that d′ divides a and b. On the other hand, two applications of Corollary 1.7 show that the greatest common divisor of a and b is a number d of the form p1^{m1} · · · pr^{mr} with the property that mj ≤ kj and mj ≤ lj for all j. Therefore mj ≤ min(kj, lj) for all j, and d ≤ d′. Since any positive divisor of both a and b is ≤ d, we have d′ ≤ d. Thus d′ = d.

In special cases Corollary 1.8 provides a useful way to compute GCD(a, b), but the Euclidean algorithm is usually a more efficient procedure. Nevertheless, Corollary 1.8 remains a handy tool for theoretical purposes. Here is an example: Two nonzero integers a and b are said to be relatively prime if GCD(a, b) = 1. It is immediate from Corollary 1.8 that two nonzero integers a and b are relatively prime if and only if there is no prime p that divides both a and b.
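The min-of-exponents formula of Corollary 1.8 can be checked numerically against the Euclidean algorithm. A small illustration, with names of my own choosing:

```python
import math

def gcd_by_exponents(factors_a, factors_b):
    """GCD from prime factorizations via Corollary 1.8: take the minimum
    exponent of each prime (a prime absent from an expansion has exponent 0)."""
    d = 1
    for p in set(factors_a) | set(factors_b):
        d *= p ** min(factors_a.get(p, 0), factors_b.get(p, 0))
    return d

# a = 360 = 2^3 * 3^2 * 5   and   b = 84 = 2^2 * 3 * 7
print(gcd_by_exponents({2: 3, 3: 2, 5: 1}, {2: 2, 3: 1, 7: 1}))  # 12
print(math.gcd(360, 84))                                         # 12
```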

Corollary 1.9 (Chinese Remainder Theorem). Let a and b be positive relatively prime integers. To each pair (r, s) of integers with 0 ≤ r < a and 0 ≤ s < b corresponds a unique integer n such that 0 ≤ n < ab, a divides n − r, and b divides n − s. Moreover, every integer n with 0 ≤ n < ab arises from some such pair (r, s).

REMARK. In notation for congruences that we introduce formally in Chapter IV, the result says that if GCD(a, b) = 1, then the congruences n ≡ r mod a and n ≡ s mod b have one and only one simultaneous solution n with 0 ≤ n < ab.


PROOF. Let us see that n exists as asserted. Since a and b are relatively prime, Proposition 1.2c produces integers x′ and y′ such that ax′ − by′ = 1. Multiplying by s − r, we obtain ax − by = s − r for suitable integers x and y. Put t = ax + r = by + s, and write by the division algorithm (Proposition 1.1) t = abq + n for some integer q and for some integer n with 0 ≤ n < ab. Then n − r = t − abq − r = ax − abq is divisible by a, and similarly n − s is divisible by b.

Suppose that n and n′ both have the asserted properties. Then a divides n − n′ = (n − r) − (n′ − r), and b divides n − n′ = (n − s) − (n′ − s). Since a and b are relatively prime, Corollary 1.4 shows that ab divides n − n′. But |n − n′| < ab, and the only integer N with |N| < ab that is divisible by ab is N = 0. Thus n − n′ = 0 and n = n′. This proves uniqueness.

Finally the argument just given defines a one-one function from a set of ab pairs (r, s) to a set of ab elements n. Its image must therefore be all such integers n. This proves the corollary.
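The construction in the existence half of the proof is directly computable. A Python sketch (names are mine): obtain Bézout coefficients for a and b, scale by s − r, and reduce mod ab.

```python
def crt(r, s, a, b):
    """Given GCD(a, b) = 1, return the unique n with 0 <= n < a*b such that
    n ≡ r (mod a) and n ≡ s (mod b), following the proof of Corollary 1.9."""
    def ext_gcd(u, v):
        # recursive extended Euclid: returns (d, x, y) with u*x + v*y = d
        if v == 0:
            return u, 1, 0
        d, x, y = ext_gcd(v, u % v)
        return d, y, x - (u // v) * y

    d, x, y = ext_gcd(a, b)
    assert d == 1, "a and b must be relatively prime"
    # t = a*x*(s-r) + r satisfies t ≡ r (mod a), and t ≡ s (mod b) since a*x ≡ 1 (mod b)
    t = a * x * (s - r) + r
    return t % (a * b)

print(crt(2, 3, 3, 5))  # 8, and indeed 8 mod 3 == 2 and 8 mod 5 == 3
```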

If n is a positive integer, we define φ(n) to be the number of integers k with 0 ≤ k < n such that k and n are relatively prime. The function φ is called the Euler φ function.

Corollary 1.10. Let N > 1 be an integer, and let N = p1^{k1} · · · pr^{kr} be a prime factorization of N. Then

φ(N) = ∏_{j=1}^{r} pj^{kj−1}(pj − 1).

REMARK. The conclusion is valid also for N = 1 if we interpret the right side of the formula to be the empty product.

PROOF. For positive integers a and b, let us check that

φ(ab) = φ(a)φ(b) if GCD(a, b) = 1. (∗)

In view of Corollary 1.9, it is enough to prove that the mapping (r, s) ↦ n given in that corollary has the property that GCD(r, a) = GCD(s, b) = 1 if and only if GCD(n, ab) = 1.

To see this property, suppose that n satisfies 0 ≤ n < ab and GCD(n, ab) > 1.

Choose a prime p dividing both n and ab. By Lemma 1.6, p divides a or p divides b. By symmetry we may assume that p divides a. If (r, s) is the pair corresponding to n under Corollary 1.9, then the corollary says that a divides n − r. Since p divides a, p divides n − r. Since p divides n, p divides r. Thus GCD(r, a) > 1.

Conversely suppose that (r, s) is a pair with 0 ≤ r < a and 0 ≤ s < b such

    that GCD(r, a) = GCD(s, b) = 1 is false. Without loss of generality, we may


assume that GCD(r, a) > 1. Choose a prime p dividing both r and a. If n is the integer with 0 ≤ n < ab that corresponds to (r, s) under Corollary 1.9, then the corollary says that a divides n − r. Since p divides a, p divides n − r. Since p divides r, p divides n. Thus GCD(n, ab) > 1. This completes the proof of (∗).

For a power p^k of a prime p with k > 0, the integers n with 0 ≤ n < p^k

such that GCD(n, p^k) > 1 are the multiples of p, namely 0, p, 2p, . . . , p^k − p. There are p^{k−1} of them. Thus the number of integers n with 0 ≤ n < p^k such that GCD(n, p^k) = 1 is p^k − p^{k−1} = p^{k−1}(p − 1). In other words,

φ(p^k) = p^{k−1}(p − 1) if p is prime and k ≥ 1. (∗∗)

    To prove the corollary, we induct on r , the case r = 1 being handled by (). Ifthe formula of the corollary is valid for r 1, then () allows us to combine thatresult with the formula for (pkr ) given in () to obtain the formula for (N ).

We conclude this section by extending the notion of greatest common divisor to apply to more than two integers. If a1, . . . , at are integers not all 0, their greatest common divisor is the largest integer d > 0 that divides all of a1, . . . , at. This exists, and we write d = GCD(a1, . . . , at) for it. It is immediate that d equals the greatest common divisor of the nonzero members of the set {a1, . . . , at}. Thus, in deriving properties of greatest common divisors, we may assume that all the integers are nonzero.

Corollary 1.11. Let a1, . . . , at be positive integers, and let d be their greatest common divisor. Then

(a) if for each j with 1 ≤ j ≤ t, aj = p1^{k_{1,j}} · · · pr^{k_{r,j}} is an expansion of aj as a product of powers of r distinct primes p1, . . . , pr, it follows that

d = p1^{min_{1≤j≤t} k_{1,j}} · · · pr^{min_{1≤j≤t} k_{r,j}},

(b) any divisor d′ of all of a1, . . . , at necessarily divides d,
(c) d = GCD(GCD(a1, . . . , a_{t−1}), at) if t > 1,
(d) there exist integers x1, . . . , xt such that a1x1 + · · · + atxt = d.

PROOF. Part (a) is proved in the same way as Corollary 1.8 except that Corollary 1.7 is to be applied r times rather than just twice. Further application of Corollary 1.7 shows that any positive divisor d′ of a1, . . . , at is of the form d′ = p1^{m1} · · · pr^{mr} with m1 ≤ k_{1,j} for all j, . . . , and with mr ≤ k_{r,j} for all j. Therefore m1 ≤ min_{1≤j≤t} k_{1,j}, . . . , and mr ≤ min_{1≤j≤t} k_{r,j}, and it follows that d′ divides d. This proves (b). Conclusion (c) follows by using the formula in (a), and (d) follows by combining (c), Proposition 1.2c, and induction.
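Part (c) of Corollary 1.11 is itself an algorithm: fold the two-argument GCD across the list. A one-line Python sketch (function name mine):

```python
import math
from functools import reduce

def multi_gcd(*nums):
    """GCD of several positive integers via Corollary 1.11c:
    GCD(a1, ..., at) = GCD(GCD(a1, ..., a_{t-1}), at)."""
    return reduce(math.gcd, nums)

print(multi_gcd(60, 84, 132))  # 12
```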


    3. Unique Factorization of Polynomials

This section establishes unique factorization for ordinary rational, real, and complex polynomials. We write Q for the set of rational numbers, R for the set of real numbers, and C for the set of complex numbers, each with its arithmetic operations. The rational numbers are constructed from the integers by a process reviewed in Section A3 of the appendix, the real numbers are defined from the rational numbers by a process reviewed in that same section, and the complex numbers are defined from the real numbers by a process reviewed in Section A4 of the appendix. Sections A3 and A4 of the appendix mention special properties of R and C beyond those of the arithmetic operations, but we shall not make serious use of these special properties here until nearly the end of the section after unique factorization of polynomials has been established. Let F denote any of Q, R, or C. The members of F are called scalars.

We work with ordinary polynomials with coefficients in F. Informally these

are expressions P(X) = anX^n + · · · + a1X + a0 with an, . . . , a1, a0 in F. Although it is tempting to think of P(X) as a function with independent variable X, it is better to identify P with the sequence (a0, a1, . . . , an, 0, 0, . . . ) of coefficients, using expressions P(X) = anX^n + · · · + a1X + a0 only for conciseness and for motivation of the definitions of various operations.

The precise definition therefore is that a polynomial in one indeterminate

with coefficients in F is an infinite sequence of members of F such that all terms of the sequence are 0 from some point on. The indexing of the sequence is to begin with 0. We may refer to a polynomial P as P(X) if we want to emphasize that the indeterminate is called X. Addition, subtraction, and scalar multiplication are defined in coordinate-by-coordinate fashion:

(a0, a1, . . . , an, 0, 0, . . . ) + (b0, b1, . . . , bn, 0, 0, . . . )
    = (a0 + b0, a1 + b1, . . . , an + bn, 0, 0, . . . ),
(a0, a1, . . . , an, 0, 0, . . . ) − (b0, b1, . . . , bn, 0, 0, . . . )
    = (a0 − b0, a1 − b1, . . . , an − bn, 0, 0, . . . ),
c(a0, a1, . . . , an, 0, 0, . . . ) = (ca0, ca1, . . . , can, 0, 0, . . . ).

Polynomial multiplication is defined so as to match multiplication of expressions anX^n + · · · + a1X + a0 if the product is expanded out, powers of X are added, and then terms containing like powers of X are collected:

(a0, a1, . . . , 0, 0, . . . )(b0, b1, . . . , 0, 0, . . . ) = (c0, c1, . . . , 0, 0, . . . ),

where cN = ∑_{k=0}^{N} ak b_{N−k}. We take it as known that the usual associative, commutative, and distributive laws are then valid. The set of all polynomials in the indeterminate X is denoted by F[X].


The polynomial with all entries 0 is denoted by 0 and is called the zero polynomial. For all polynomials P = (a0, . . . , an, 0, . . . ) other than 0, the degree of P, denoted by deg P, is defined to be the largest index n such that an ≠ 0. The constant polynomials are by definition the zero polynomial and the polynomials of degree 0. If P and Q are nonzero polynomials, then

P + Q = 0   or   deg(P + Q) ≤ max(deg P, deg Q),
deg(cP) = deg P   if c ≠ 0,
deg(PQ) = deg P + deg Q.

In the formula for deg(P + Q), equality holds if deg P ≠ deg Q. Implicit in the formula for deg(PQ) is the fact that PQ cannot be 0 unless P = 0 or Q = 0. A cancellation law for multiplication is an immediate consequence:

PR = QR with R ≠ 0 implies P = Q.

In fact, PR = QR implies (P − Q)R = 0; since R ≠ 0, P − Q must be 0.

If P = (a0, . . . , an, 0, . . . ) is a polynomial and r is in F, we can evaluate P

at r, obtaining as a result the number P(r) = anr^n + · · · + a1r + a0. Taking into account all values of r, we obtain a mapping P ↦ P(·) of F[X] into the set of functions from F into F. Because of the way that the arithmetic operations on polynomials have been defined, we have

(P + Q)(r) = P(r) + Q(r),
(P − Q)(r) = P(r) − Q(r),
(cP)(r) = cP(r),
(PQ)(r) = P(r)Q(r).

In other words, the mapping P ↦ P(·) respects the arithmetic operations. We say that r is a root of P if P(r) = 0.

Now we turn to the question of unique factorization. The definitions and the

    proof are completely analogous to those for the integers. A factor of a polynomialA is a nonzero polynomial B such that A = BQ for some polynomial Q. Inthis case we say also that B divides A, that B is a divisor of A, and that A is amultiple of B. We write B | A for this relationship. If A is nonzero, any productformula A = BQ1 Qr is a factorization of A. A unit inF[X] is a divisor of 1,hence is any polynomial of degree 0; such a polynomial is a constant polynomialA(X) = c with c equal to a nonzero scalar. The factorization A = BQ ofA 6= 0 is called nontrivial if neither B nor Q is a unit. A prime P in F[X] is anonzero polynomial that is not a unit and has no nontrivial factorization P = BQ.Observe that the product of a prime and a unit is always a prime.

  • 3. Unique Factorization of Polynomials 11

Proposition 1.12 (division algorithm). If A and B are polynomials in F[X] and if B is not the 0 polynomial, then there exist unique polynomials Q and R in F[X] such that

(a) A = BQ + R and
(b) either R is the 0 polynomial or deg R < deg B.

REMARK. This result codifies the usual method of dividing polynomials in high-school algebra. That method writes A/B = Q + R/B, and then one obtains the above result by multiplying by B. The polynomial Q is the quotient in the division, and R is the remainder.

PROOF OF UNIQUENESS. If A = BQ + R = BQ_1 + R_1, then B(Q − Q_1) = R_1 − R. Without loss of generality, R_1 − R is not the 0 polynomial, since otherwise Q − Q_1 = 0 also. Then

deg B + deg(Q − Q_1) = deg(R_1 − R) ≤ max(deg R, deg R_1) < deg B,

and we have a contradiction.

PROOF OF EXISTENCE. If A = 0 or deg A < deg B, we take Q = 0 and R = A, and we are done. Otherwise we induct on deg A. Assume the result for degree ≤ n − 1, and let deg A = n. Write A = a_n X^n + A_1 with A_1 = 0 or deg A_1 < deg A. Let B = b_k X^k + B_1 with B_1 = 0 or deg B_1 < deg B. Put Q_1 = a_n b_k^{−1} X^{n−k}. Then

A − BQ_1 = a_n X^n + A_1 − a_n X^n − a_n b_k^{−1} X^{n−k} B_1 = A_1 − a_n b_k^{−1} X^{n−k} B_1

with the right side equal to 0 or of degree < deg A. Then the right side, by induction, is of the form BQ_2 + R, and A = B(Q_1 + Q_2) + R is the required decomposition.
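The existence proof is constructive: repeatedly subtract (a_n / b_k) X^{n−k} · B from the current remainder. A hedged sketch of that loop, with the function name `poly_divmod` and the coefficient-list representation our own, working over F = Q via exact rationals:

```python
from fractions import Fraction

# Sketch of the division algorithm of Proposition 1.12.  Polynomials are
# coefficient lists, lowest power first; each pass subtracts coef * X^shift * b,
# exactly as in the inductive step of the existence proof.

def poly_divmod(a, b):
    """Return (q, r) with a = b*q + r and deg r < deg b (r possibly empty)."""
    a = [Fraction(x) for x in a]
    b = [Fraction(x) for x in b]
    while b and b[-1] == 0:        # normalize: drop trailing zero coefficients
        b.pop()
    if not b:
        raise ZeroDivisionError("division by the zero polynomial")
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 1)
    r = a[:]
    while r and r[-1] == 0:
        r.pop()
    while len(r) >= len(b):
        coef = r[-1] / b[-1]       # leading coefficient of r over that of b
        shift = len(r) - len(b)    # the exponent n - k in the proof
        q[shift] = coef
        for i, bi in enumerate(b):
            r[i + shift] -= coef * bi
        while r and r[-1] == 0:
            r.pop()
    return q, r

# X^3 + 1 divided by X + 1: quotient X^2 - X + 1, remainder 0
q, r = poly_divmod([1, 0, 0, 1], [1, 1])
print(q, r)
```

Using `Fraction` keeps the arithmetic exact, matching the fact that F is a field and b_k^{−1} always exists.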

Corollary 1.13 (Factor Theorem). If r is in F and if P is a polynomial in F[X], then X − r divides P if and only if P(r) = 0.

PROOF. If P = (X − r)Q, then P(r) = (r − r)Q(r) = 0. Conversely let P(r) = 0. Taking B(X) = X − r in the division algorithm (Proposition 1.12), we obtain P = (X − r)Q + R with R = 0 or deg R < deg(X − r) = 1. Thus R is a constant polynomial, possibly 0. In any case we have 0 = P(r) = (r − r)Q(r) + R(r), and thus R(r) = 0. Since R is constant, we must have R = 0, and then P = (X − r)Q.

Corollary 1.14. If P is a nonzero polynomial with coefficients in F and if deg P = n, then P has at most n distinct roots.


REMARKS. Since there are infinitely many scalars in any of Q and R and C, the corollary implies that the function from F to F associated to P, namely r ↦ P(r), cannot be identically 0 if P ≠ 0. Starting in Chapter IV, we shall allow other F's besides Q and R and C, and then this implication can fail. For example, when F is the two-element field F = {0, 1} with 1 + 1 = 0 and with otherwise the expected addition and multiplication, then P(X) = X^2 + X is not the zero polynomial but P(r) = 0 for r = 0 and r = 1. It is thus important to distinguish polynomials in one indeterminate from their associated functions of one variable.

PROOF. Let r_1, . . . , r_{n+1} be distinct roots of P(X). By the Factor Theorem (Corollary 1.13), X − r_1 is a factor of P(X). We prove inductively on k that the product (X − r_1)(X − r_2) · · · (X − r_k) is a factor of P(X). Assume that this assertion holds for k, so that P(X) = (X − r_1) · · · (X − r_k)Q(X) and

0 = P(r_{k+1}) = (r_{k+1} − r_1) · · · (r_{k+1} − r_k)Q(r_{k+1}).

Since the r_j's are distinct, we must have Q(r_{k+1}) = 0. By the Factor Theorem, we can write Q(X) = (X − r_{k+1})R(X) for some polynomial R(X). Substitution gives P(X) = (X − r_1) · · · (X − r_k)(X − r_{k+1})R(X), and (X − r_1) · · · (X − r_{k+1}) is exhibited as a factor of P(X). This completes the induction. Consequently

P(X) = (X − r_1) · · · (X − r_{n+1})S(X)

for some polynomial S(X). Comparing the degrees of the two sides, we find that deg S = −1, and we have a contradiction.

We can use the division algorithm in the same way as with the integers in Sections 1–2 to obtain unique factorization. Within the set of integers, we defined greatest common divisors so as to be positive, but their negatives would have worked equally well. That flexibility persists with polynomials; the essential feature of any greatest common divisor of polynomials is shared by any product of that polynomial by a unit. A greatest common divisor of polynomials A and B with B ≠ 0 is any polynomial D of maximum degree such that D divides A and D divides B. We shall see that D is indeed unique up to multiplication by a nonzero scalar.2

2For some purposes it is helpful to isolate one particular greatest common divisor by taking the coefficient of the highest power of X to be 1.


The Euclidean algorithm is the iterative process that makes use of the division algorithm in the form

A = BQ_1 + R_1,        R_1 = 0 or deg R_1 < deg B,
B = R_1Q_2 + R_2,      R_2 = 0 or deg R_2 < deg R_1,
R_1 = R_2Q_3 + R_3,    R_3 = 0 or deg R_3 < deg R_2,
        ...
R_{n−2} = R_{n−1}Q_n + R_n,    R_n = 0 or deg R_n < deg R_{n−1},
R_{n−1} = R_nQ_{n+1}.

In the above computation the integer n is defined by the conditions that R_n ≠ 0 and that R_{n+1} = 0. Such an n must exist since deg B > deg R_1 > deg R_2 > · · · ≥ 0. We can now obtain an analog for F[X] of the result for Z given as Proposition 1.2.
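The chain of divisions above terminates because the remainder degrees strictly decrease, and the last nonzero remainder is a greatest common divisor. A hedged sketch over F = Q, with all helper names (`poly_mod`, `poly_gcd`) our own:

```python
from fractions import Fraction

# Sketch of the Euclidean algorithm for F[X], following the chain
# A = B*Q1 + R1, B = R1*Q2 + R2, ... ; polynomials are coefficient lists,
# lowest power first, with no trailing zeros.

def poly_mod(a, b):
    """Remainder of a on division by b (b nonzero, normalized)."""
    r = [Fraction(x) for x in a]
    b = [Fraction(x) for x in b]
    while len(r) >= len(b):
        coef = r[-1] / b[-1]
        shift = len(r) - len(b)
        for i, bi in enumerate(b):
            r[i + shift] -= coef * bi
        while r and r[-1] == 0:
            r.pop()
    return r

def poly_gcd(a, b):
    """Last nonzero remainder, scaled monic (see the footnote on gcds)."""
    while b:
        a, b = b, poly_mod(a, b)
    return [c / a[-1] for c in a]

# gcd(X^2 - 1, X^2 + 2X + 1) is X + 1 up to a unit
print(poly_gcd([-1, 0, 1], [1, 2, 1]))
```

Scaling the answer monic is one way to pick a distinguished representative; any nonzero scalar multiple would serve equally well, exactly as the text says.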

Proposition 1.15. Let A and B be polynomials in F[X] with B ≠ 0, and let R_1, . . . , R_n be the remainders generated by the Euclidean algorithm when applied to A and B. Then

(a) R_n is a greatest common divisor of A and B,
(b) any D_1 that divides both A and B necessarily divides R_n,
(c) the greatest common divisor of A and B is unique up to multiplication by a nonzero scalar,
(d) any greatest common divisor D has the property that there exist polynomials P and Q with AP + BQ = D.

PROOF. Conclusions (a) and (b) are proved in the same way that parts (a) and (b) of Proposition 1.2 are proved, and conclusion (d) is proved with D = R_n in the same way that Proposition 1.2c is proved.

If D is a greatest common divisor of A and B, it follows from (a) and (b) that D divides R_n and that deg D = deg R_n. This proves (c).
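Conclusion (d) can be made effective by tracking coefficient polynomials through the Euclidean algorithm (the extended Euclidean algorithm; the text only asserts existence). A sketch over F = Q, with all names our own:

```python
from fractions import Fraction

# Hedged sketch of Proposition 1.15d: maintain the invariants
# r0 = A*s0 + B*t0 and r1 = A*s1 + B*t1 through the Euclidean algorithm.
# Polynomials are coefficient lists, lowest power first, no trailing zeros.

def poly_divmod(a, b):
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 1)
    r = [Fraction(x) for x in a]
    b = [Fraction(x) for x in b]
    while len(r) >= len(b):
        coef, shift = r[-1] / b[-1], len(r) - len(b)
        q[shift] = coef
        for i, bi in enumerate(b):
            r[i + shift] -= coef * bi
        while r and r[-1] == 0:
            r.pop()
    return q, r

def poly_mul(a, b):
    if not a or not b:
        return []
    c = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

def poly_sub(a, b):
    out = [(a[i] if i < len(a) else 0) - (b[i] if i < len(b) else 0)
           for i in range(max(len(a), len(b)))]
    while out and out[-1] == 0:
        out.pop()
    return out

def extended_gcd(a, b):
    r0, r1 = [Fraction(x) for x in a], [Fraction(x) for x in b]
    s0, s1 = [Fraction(1)], []
    t0, t1 = [], [Fraction(1)]
    while r1:
        quot, rem = poly_divmod(r0, r1)
        r0, r1 = r1, rem
        s0, s1 = s1, poly_sub(s0, poly_mul(quot, s1))
        t0, t1 = t1, poly_sub(t0, poly_mul(quot, t1))
    return r0, s0, t0     # gcd D and P, Q with A*P + B*Q = D

# A = X^2 - 1, B = X^2 + 2X + 1: here D = -2X - 2, a unit times X + 1
d, p, q = extended_gcd([-1, 0, 1], [1, 2, 1])
print(d, p, q)
```

The returned D is whatever the last nonzero remainder happens to be; by conclusion (c) it differs from any other gcd only by a nonzero scalar.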

Using Proposition 1.15, we can prove analogs for F[X] of the two corollaries of Proposition 1.2. But let us instead skip directly to what is needed to obtain an analog for F[X] of unique factorization as in Theorem 1.5.

Lemma 1.16. If A and B are nonzero polynomials with coefficients in F and if P is a prime polynomial such that P divides AB, then P divides A or P divides B.

PROOF. If P does not divide A, then 1 is a greatest common divisor of A and P, and Proposition 1.15d produces polynomials S and T such that AS + PT = 1. Multiplication by B gives ABS + PT B = B. Then P divides ABS because it divides AB, and P divides PT B because it divides P. Hence P divides B.


Theorem 1.17 (unique factorization). Every member of F[X] of degree ≥ 1 is a product of primes. This factorization is unique up to order and up to multiplication of each prime factor by a unit, i.e., by a nonzero scalar.

PROOF. The existence follows in the same way as the existence in Theorem 1.5; induction on the integers is to be replaced by induction on the degree. The uniqueness follows from Lemma 1.16 in the same way that the uniqueness in Theorem 1.5 follows from Lemma 1.6.

We turn to a consideration of properties of polynomials that take into account special features of R and C. If F is R, then X^2 + 1 is prime. The reason is that a nontrivial factorization of X^2 + 1 would have to involve two first-degree real polynomials, and then r^2 + 1 would have to be 0 for some real r, namely for r equal to the root of either of the first-degree polynomials. On the other hand, X^2 + 1 is not prime when F = C since X^2 + 1 = (X + i)(X − i). The Fundamental Theorem of Algebra, stated below, implies that every prime polynomial over C is of degree 1. It is possible to prove the Fundamental Theorem of Algebra within complex analysis as a consequence of Liouville's Theorem or within real analysis as a consequence of the Heine–Borel Theorem and other facts about compactness. This text gives a proof of the Fundamental Theorem of Algebra in Chapter IX using modern algebra, specifically Sylow theory as in Chapter IV and Galois theory as in Chapter IX. One further fact is needed; this fact uses elementary calculus and is proved below as Proposition 1.20.

Theorem 1.18 (Fundamental Theorem of Algebra). Any polynomial in C[X] with degree ≥ 1 has at least one root.

Corollary 1.19. Let P be a nonzero polynomial of degree n in C[X], and let r_1, . . . , r_k be the distinct roots. Then there exist unique integers m_j > 0 for 1 ≤ j ≤ k such that P(X) is a scalar multiple of ∏_{j=1}^{k} (X − r_j)^{m_j}. The numbers m_j have ∑_{j=1}^{k} m_j = n.

PROOF. We may assume that deg P > 0. We apply unique factorization (Theorem 1.17) to P(X). It follows from the Fundamental Theorem of Algebra (Theorem 1.18) and the Factor Theorem (Corollary 1.13) that each prime polynomial with coefficients in C has degree 1. Thus the unique factorization of P(X) has to be of the form c ∏_{l=1}^{n} (X − z_l) for some c ≠ 0 and for some complex numbers z_l that are unique up to order. The z_l's are roots, and every root is a z_l by the Factor Theorem. Grouping like factors proves the desired factorization and its uniqueness. The numbers m_j have ∑_{j=1}^{k} m_j = n by a count of degrees.

The integers m_j in the corollary are called the multiplicities of the roots of the polynomial P(X).


We conclude this section by proving the result from calculus that will enter the proof of the Fundamental Theorem of Algebra in Chapter IX.

Proposition 1.20. Any polynomial in R[X] with odd degree has at least one root.

PROOF. Without loss of generality, we may take the leading coefficient to be 1. Thus let the polynomial be P(X) = X^{2n+1} + a_{2n}X^{2n} + · · · + a_1X + a_0 = X^{2n+1} + R(X). Since lim_{x→±∞} P(x)/x^{2n+1} = 1, there is some positive r_0 such that P(−r_0) < 0 and P(r_0) > 0. By the Intermediate Value Theorem, given in Section A3 of the appendix, P(r) = 0 for some r with −r_0 ≤ r ≤ r_0.
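The Intermediate Value Theorem argument is effective once a bracket with P(−r_0) < 0 < P(r_0) is in hand: bisection closes in on a root. A numerical illustration only; the sample polynomial, the bracket, and the names are our own choices, not from the text.

```python
# Bisection illustrating the IVT step in Proposition 1.20 on a sample
# odd-degree polynomial with leading coefficient 1.

def P(x):
    return x**3 - 2*x - 5          # odd degree; P(-10) < 0 < P(10)

def bisect_root(f, lo, hi, steps=60):
    """Assumes f(lo) < 0 < f(hi); halves the sign-changing bracket repeatedly."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid               # root still lies in [mid, hi]
        else:
            hi = mid               # root still lies in [lo, mid]
    return (lo + hi) / 2

r = bisect_root(P, -10.0, 10.0)
print(round(r, 6))
```

The sign change guaranteed by the limit P(x)/x^{2n+1} → 1 is exactly what makes such a bracket available for every odd-degree real polynomial.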

    4. Permutations and Their Signs

Let S be a finite nonempty set of n elements. A permutation of S is a one-one function from S onto S. The elements might be listed as a_1, a_2, . . . , a_n, but it will simplify the notation to view them simply as 1, 2, . . . , n. We use ordinary function notation for describing the effect of permutations. Thus the value of a permutation σ at j is σ(j), and the composition of σ followed by τ is τ ∘ σ or simply τσ, with (τσ)(j) = τ(σ(j)). Composition is automatically associative, i.e., (ρτ)σ = ρ(τσ), because the effect of both sides on j, when we expand things out, is ρ(τ(σ(j))). The composition of two permutations is also called their product.

The identity permutation will be denoted by 1. Any permutation σ, being a one-one onto function, has a well-defined inverse permutation σ^{−1} with the property that σσ^{−1} = σ^{−1}σ = 1. One way of describing concisely the effect of a permutation is to list its domain values and to put the corresponding range values beneath them. Thus

    σ = ( 1 2 3 4 5
          4 3 5 1 2 )

is the permutation of {1, 2, 3, 4, 5} with σ(1) = 4, σ(2) = 3, σ(3) = 5, σ(4) = 1, and σ(5) = 2. The inverse permutation is obtained by interchanging the two rows to obtain

    ( 4 3 5 1 2
      1 2 3 4 5 )

and then adjusting the entries in the rows so that the first row is in the usual order:

    σ^{−1} = ( 1 2 3 4 5
               4 5 2 1 3 ).
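The two-row notation can be modeled as a map from each domain value to the range value beneath it; this sketch (representation and names our own) recomputes the inverse found above by interchanging the rows.

```python
# Permutations of {1, ..., n} as dicts: sigma[j] is the value sigma(j).

sigma = {1: 4, 2: 3, 3: 5, 4: 1, 5: 2}

def compose(tau, sigma):
    """(tau sigma)(j) = tau(sigma(j)): sigma is applied first."""
    return {j: tau[sigma[j]] for j in sigma}

def inverse(sigma):
    """Interchange the two rows: if sigma sends j to v, the inverse sends v to j."""
    return {v: k for k, v in sigma.items()}

print(inverse(sigma) == {1: 4, 2: 5, 3: 2, 4: 1, 5: 3})        # True
print(compose(sigma, inverse(sigma)) == {j: j for j in sigma})  # True
```

The second check is the defining property σσ^{−1} = 1 from the text.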

If 2 ≤ k ≤ n, a k-cycle is a permutation σ that fixes each element in some subset of n − k elements and moves the remaining elements c_1, . . . , c_k according to σ(c_1) = c_2, σ(c_2) = c_3, . . . , σ(c_{k−1}) = c_k, σ(c_k) = c_1. Such a cycle may be denoted by (c_1 c_2 · · · c_{k−1} c_k) to stress its structure. For example take n = 5; then σ = (2 3 5) is the 3-cycle given in our earlier notation by

    ( 1 2 3 4 5
      1 3 5 4 2 ).


The cycle (2 3 5) is the same as the cycle (3 5 2) and the cycle (5 2 3). It is sometimes helpful to speak of the identity permutation 1 as the unique 1-cycle.

A system of cycles is said to be disjoint if the sets that each of them moves are disjoint in pairs. Thus (2 3 5) and (1 4) are disjoint, but (2 3 5) and (1 3) are not. Any two disjoint cycles γ and γ′ commute in the sense that γγ′ = γ′γ.

Proposition 1.21. Any permutation σ of {1, 2, . . . , n} is a product of disjoint cycles. The individual cycles in the decomposition are unique in the sense of being determined by σ.

EXAMPLE.

    ( 1 2 3 4 5
      4 3 5 1 2 )  =  (2 3 5)(1 4).

PROOF. Let us prove existence. Working with {1, 2, . . . , n}, we show that any σ is the disjoint product of cycles in such a way that no cycle moves an element j unless σ moves j. We do so for all σ simultaneously by induction downward on the number of elements fixed by σ. The starting case of the induction is that σ fixes all n elements. Then σ is the identity, and we are regarding the identity as a 1-cycle.

For the inductive step suppose σ fixes the elements in a subset T of r elements of {1, 2, . . . , n} with r < n. Let j be an element not in T, so that σ(j) ≠ j. Choose k as small as possible so that some element is repeated among j, σ(j), σ^2(j), . . . , σ^k(j). This condition means that σ^l(j) = σ^k(j) for some l with 0 ≤ l < k. Then σ^{k−l}(j) = j, and we obtain a contradiction to the minimality of k unless k − l = k, i.e., l = 0. In other words, we have σ^k(j) = j. We may thus form the k-cycle γ = (j σ(j) σ^2(j) · · · σ^{k−1}(j)). The permutation γ^{−1}σ then fixes the r + k elements of T ∪ U, where U is the set of elements j, σ(j), σ^2(j), . . . , σ^{k−1}(j). By the inductive hypothesis, γ^{−1}σ is the product γ_1 · · · γ_p of disjoint cycles that move only elements not in T ∪ U. Since γ moves only the elements in U, γ is disjoint from each of γ_1, . . . , γ_p. Therefore σ = γγ_1 · · · γ_p provides the required decomposition of σ.

For uniqueness we observe from the proof of existence that each element j generates a k-cycle C_j for some k ≥ 1 depending on j. If we have two decompositions as in the proposition, then the cycle within each decomposition that contains j must be C_j. Hence the cycles in the two decompositions must match.
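The existence argument is already an algorithm: follow each not-yet-visited element j through j, σ(j), σ^2(j), . . . until it returns, emitting one cycle per orbit. A sketch with our own representation (dicts and tuples) and names:

```python
# Disjoint-cycle decomposition, mirroring the proof of Proposition 1.21.
# Fixed points (1-cycles) are skipped, as in the statement "no cycle moves
# an element j unless sigma moves j".

def disjoint_cycles(sigma):
    seen = set()
    cycles = []
    for j in sorted(sigma):
        if j in seen or sigma[j] == j:      # skip visited elements and fixed points
            continue
        cycle = [j]
        seen.add(j)
        k = sigma[j]
        while k != j:                        # sigma^k(j) = j eventually, as proved
            cycle.append(k)
            seen.add(k)
            k = sigma[k]
        cycles.append(tuple(cycle))
    return cycles

sigma = {1: 4, 2: 3, 3: 5, 4: 1, 5: 2}
print(disjoint_cycles(sigma))  # [(1, 4), (2, 3, 5)]
```

This reproduces the example from the text, where the permutation with second row 4 3 5 1 2 decomposes as (2 3 5)(1 4).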

A 2-cycle is often called a transposition. The proposition allows us to see quickly that any permutation is a product of transpositions.

Corollary 1.22. Any k-cycle permuting {1, 2, . . . , n} is a product of k − 1 transpositions if k > 1. Therefore any permutation of {1, 2, . . . , n} is a product of transpositions.


PROOF. For the first statement, we observe that (c_1 c_2 · · · c_{k−1} c_k) = (c_1 c_k)(c_1 c_{k−1}) · · · (c_1 c_3)(c_1 c_2). The second statement follows by combining this fact with Proposition 1.21.
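The identity in the proof can be checked numerically on the 3-cycle (2 3 5). This is a verification sketch only, with our own dict representation; composition is applied right to left, so the rightmost factor (c_1 c_2) acts first.

```python
# Check that (c1 c2 ... ck) = (c1 ck)(c1 c(k-1)) ... (c1 c2) for (2 3 5).

def transposition(a, b, n):
    """The 2-cycle (a b) as a dict on {1, ..., n}."""
    perm = {j: j for j in range(1, n + 1)}
    perm[a], perm[b] = b, a
    return perm

def compose(tau, sigma):
    """(tau sigma)(j) = tau(sigma(j)): sigma is applied first."""
    return {j: tau[sigma[j]] for j in sigma}

n, c = 5, [2, 3, 5]                  # the 3-cycle (2 3 5) inside {1, ..., 5}
product = {j: j for j in range(1, n + 1)}
for i in range(1, len(c)):           # build (c1 c3)(c1 c2), rightmost first
    product = compose(transposition(c[0], c[i], n), product)

cycle = {1: 1, 2: 3, 3: 5, 4: 4, 5: 2}   # (2 3 5) in two-row form
print(product == cycle)  # True
```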

Our final tasks for this section are to attach a sign to each permutation and to examine the properties of these signs. We begin with the special case that our underlying set S is {1, . . . , n}. If σ is a permutation of {1, . . . , n}, consider the numerical products

    ∏_{1≤j<k≤n} (k − j)    and    ∏_{1≤j<k≤n} (σ(k) − σ(j)).

Each factor of the first product matches, up to a sign, exactly one factor of the second, and hence the quotient of the second product by the first is +1 or −1. This quotient is called the sign of σ, denoted by sgn σ.

Lemma 1.23. If σ is a permutation of {1, . . . , n} and (a b) is a transposition with a < b, then sgn(σ(a b)) = −sgn σ.

PROOF. We compare, pair by pair, the factors contributed to the products for σ and for σ(a b). Case 1 concerns pairs involving neither a nor b; each such pair contributes the same factor to both products and can be ignored. Case 2 concerns the pairs (a, t) and (t, b) with a < t < b; their combined contribution to the product for σ(a b) equals their combined contribution to the product for σ, and these pairs too can be ignored.


Case 3. Continuing with matters as in Case 2, we next consider pairs (a, t) and (b, t) with a < b < t. These together contribute the factors (σ(t) − σ(a)) and (σ(t) − σ(b)) to the product for σ, and they contribute the factors (σ(t) − σ(b)) and (σ(t) − σ(a)) to the product for σ(a b). Since

(σ(t) − σ(a))(σ(t) − σ(b)) = (σ(t) − σ(b))(σ(t) − σ(a)),

the pairs together make the same contribution to the product for σ(a b) as to the product for σ, and they can be ignored.

Case 4. Still with matters as in Case 2, we consider pairs (t, a) and (t, b) with t < a < b. Arguing as in Case 3, we are led to an equality

(σ(a) − σ(t))(σ(b) − σ(t)) = (σ(b) − σ(t))(σ(a) − σ(t)),

and these pairs can be ignored.

Case 5. Finally we consider the pair (a, b) itself. It contributes σ(b) − σ(a) to the product for σ, and it contributes σ(a) − σ(b) to the product for σ(a b). These are negatives of one another, and we get a net contribution of one minus sign in comparing our two product formulas. The lemma follows.

Proposition 1.24. The signs of permutations of {1, 2, . . . , n} have the following properties:

(a) sgn 1 = +1,
(b) sgn σ = (−1)^k if σ can be written as the product of k transpositions,
(c) sgn(στ) = (sgn σ)(sgn τ),
(d) sgn(σ^{−1}) = sgn σ.

PROOF. Conclusion (a) is immediate from the definition. For (b), let σ = τ_1 · · · τ_k with each τ_j equal to a transposition. We apply Lemma 1.23 recursively, using (a) at the end:

sgn(τ_1 · · · τ_k) = (−1) sgn(τ_1 · · · τ_{k−1}) = (−1)^2 sgn(τ_1 · · · τ_{k−2})
    = · · · = (−1)^{k−1} sgn τ_1 = (−1)^k sgn 1 = (−1)^k.

For (c), Corollary 1.22 shows that any permutation is the product of transpositions. If σ is the product of k transpositions and τ is the product of l transpositions, then στ is manifestly the product of k + l transpositions. Thus (c) follows from (b). Finally (d) follows from (c) and (a) by taking τ = σ^{−1}.
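Combining Corollary 1.22 with conclusion (b) gives a practical way to compute signs: a k-cycle is a product of k − 1 transpositions, so sgn σ = (−1)^{n − c}, where c counts the cycles of σ with fixed points counted as 1-cycles. A sketch, with the representation and names our own:

```python
# Sign of a permutation via its cycle structure: each k-cycle contributes
# k - 1 transpositions, and the exponents sum to n minus the number of cycles.

def sign(sigma):
    seen, cycles = set(), 0
    for j in sigma:
        if j not in seen:
            cycles += 1                  # one new cycle (possibly a fixed point)
            while j not in seen:
                seen.add(j)
                j = sigma[j]
    return (-1) ** (len(sigma) - cycles)

sigma = {1: 4, 2: 3, 3: 5, 4: 1, 5: 2}  # (2 3 5)(1 4): 2 + 1 transpositions
print(sign(sigma))  # -1
```

The printed value −1 agrees with (b): three transpositions give (−1)^3.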

Our discussion of signs has so far attached signs only to permutations of S = {1, 2, . . . , n}. If we are given some other set S′ of n elements and we want to adapt our discussion of signs so that it applies to permutations of S′, we need


to identify S with S′, say by a one-one onto function ϕ : S → S′. If τ is a permutation of S′, then ϕ^{−1}τϕ is a permutation of S, and we can define sgn(τ) = sgn(ϕ^{−1}τϕ). The question is whether this definition is independent of ϕ.

Fortunately the answer is yes, and the proof is easy. Suppose that ψ : S → S′ is a second one-one onto function, so that sgn(τ) = sgn(ψ^{−1}τψ). Then ψ^{−1}ϕ = σ is a permutation of {1, 2, . . . , n}, and (c) and (d) in Proposition 1.24 give

sgn(ψ^{−1}τψ) = sgn((ψ^{−1}ϕ)(ϕ^{−1}τϕ)(ϕ^{−1}ψ)) = sgn(σ) sgn(ϕ^{−1}τϕ) sgn(σ^{−1}) = sgn(ϕ^{−1}τϕ).

Consequently the definition of signs of permutations of {1, 2, . . . , n} can be carried over to give a definition of signs of permutations of any finite nonempty set of n elements, and the resulting signs are independent of the way we enumerate the set. The conclusions of Proposition 1.24 are valid for this extended definition of signs of permutations.

    5. Row Reduction

This section and the next review row reduction and matrix algebra for rational, real, and complex matrices. As in Section 3 let F denote Q or R or C. The members of F are called scalars.

The term row reduction refers to the main part of the algorithm used for solving simultaneous systems of algebraic linear equations with coefficients in F. Such a system is of the form

a11x1 + a12x2 + · · · + a1nxn = b1,
        ...
ak1x1 + ak2x2 + · · · + aknxn = bk,

where the ai j and bi are known scalars and the xj are the unknowns, or variables. The algorithm makes repeated use of three operations on the equations, each of which preserves the set of solutions (x1, . . . , xn) because its inverse is an operation of the same kind:

(i) interchange two equations,
(ii) multiply an equation by a nonzero scalar,
(iii) replace an equation by the sum of it and a multiple of some other equation.


The repeated writing of the variables in carrying out these steps is tedious and unnecessary, since the steps affect only the known coefficients. Instead, we can simply work with an array of the form

    ( a11 a12 · · · a1n | b1 )
    (  ⋮    ⋮         ⋮  |  ⋮ )
    ( ak1 ak2 · · · akn | bk ).

The individual scalars appearing in the array are called entries. The above operations on equations correspond exactly to operations on the rows3 of the array, and they become

(i) interchange two rows,
(ii) multiply a row by a nonzero scalar,
(iii) replace a row by the sum of it and a multiple of some other row.

Any operation of these types is called an elementary row operation. The vertical line in the array is handy from one point of view in that it separates the left sides of the equations from the right sides; if we have more than one set of right sides, we can include all of them to the right of the vertical line and thereby solve all the systems at the same time. But from another point of view, the vertical line is unnecessary since it does not affect which operation we perform at a particular time. Let us therefore drop it, abbreviating the system as

    ( a11 a12 · · · a1n b1 )
    (  ⋮    ⋮         ⋮   ⋮ )
    ( ak1 ak2 · · · akn bk ).

The main step in solving the system is to apply the three operations in succession to the array to reduce it to a particularly simple form. An array with k rows and m columns4 is in reduced row-echelon form if it meets several conditions:

• Each member of the first l of the rows, for some l with 0 ≤ l ≤ k, has at least one nonzero entry, and the other rows have all entries 0.

• Each of the nonzero rows has 1 as its first nonzero entry; let us say that the ith nonzero row has this 1 in its j(i)th entry.

• The integers j(i) are to be strictly increasing as a function of i, and the only entry in the j(i)th column that is nonzero is to be the one in the ith row.

Proposition 1.25. Any array with k rows and m columns can be transformed into reduced row-echelon form by a succession of steps of types (i), (ii), (iii).

3Rows are understood to be horizontal, while columns are vertical.
4In the above displayed matrix, the array has m = n + 1 columns.


In fact, the transformation in the proposition is carried out by an algorithm known as the method of row reduction of the array. Let us begin with an example, indicating the particular operation at each stage by a label over an arrow →. To keep the example from being unwieldy, we consolidate steps of type (iii) into a single step when the other row is the same.

    EXAMPLE. In this example, k = m = 4. Row reduction gives

    ( 0 0 2  7 )         ( 1 1 1  1 )
    ( 1 1 1  1 )   (i)   ( 0 0 2  7 )
    ( 1 1 4 −5 )   −→    ( 1 1 4 −5 )
    ( 2 2 5 −4 )         ( 2 2 5 −4 )

          ( 1 1 1  1 )         ( 1 1 1  1  )
    (iii) ( 0 0 2  7 )   (ii)  ( 0 0 1 7/2 )
    −→    ( 0 0 3 −6 )   −→    ( 0 0 3 −6  )
          ( 0 0 3 −6 )         ( 0 0 3 −6  )

          ( 1 1 0  −5/2 )        ( 1 1 0  −5/2 )
    (iii) ( 0 0 1   7/2 )  (ii)  ( 0 0 1   7/2 )
    −→    ( 0 0 0 −33/2 )  −→    ( 0 0 0    1  )
          ( 0 0 0 −33/2 )        ( 0 0 0 −33/2 )

          ( 1 1 0 0 )
    (iii) ( 0 0 1 0 )
    −→    ( 0 0 0 1 )
          ( 0 0 0 0 ).

The final matrix here is in reduced row-echelon form. In the notation of the definition, the number of nonzero rows in the reduced row-echelon form is l = 3, and the integers j(i) are j(1) = 1, j(2) = 3, and j(3) = 4.

The example makes clear what the algorithm is that proves Proposition 1.25. We find the first nonzero column, apply an interchange (an operation of type (i)) if necessary to make the first entry in the column nonzero, multiply by a nonzero scalar to make the first entry 1 (an operation of type (ii)), and apply operations of type (iii) to eliminate the other nonzero entries in the column. Then we look for the next column with a nonzero entry in entries 2 and later, interchange to get the nonzero entry into entry 2 of the column, multiply to make the entry 1, and apply operations of type (iii) to eliminate the other entries in the column. Continuing in this way, we arrive at reduced row-echelon form.

In the general case, as soon as our array, which contains both sides of our system of equations, has been transformed into reduced row-echelon form, we can read off exactly what the solutions are. It will be handy to distinguish two kinds of variables among x1, . . . , xn without including any added variables xn+1, . . . , xm in either of the classes. The corner variables are those xj's for which j ≤ n and j is some j(i) in the definition of reduced row-echelon form, and the other xj's with j ≤ n will be called independent variables. Let us describe the last steps of the solution technique in the setting of an example. We restore the vertical line that separated the data on the two sides of the equations.
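The algorithm just described (find a pivot column, interchange, scale, eliminate) can be sketched programmatically. The function name `rref`, the list-of-lists representation, and the use of exact rational arithmetic via `Fraction` are our own choices, and the sample entries are the 4-by-4 example as we read it in the text.

```python
from fractions import Fraction

# Sketch of the row-reduction algorithm behind Proposition 1.25, using only
# elementary row operations (i), (ii), (iii).

def rref(rows):
    a = [[Fraction(x) for x in row] for row in rows]
    pivot_row = 0
    for col in range(len(a[0])):
        # find a row at or below pivot_row with a nonzero entry in this column
        pivot = next((r for r in range(pivot_row, len(a)) if a[r][col] != 0), None)
        if pivot is None:
            continue
        a[pivot_row], a[pivot] = a[pivot], a[pivot_row]          # operation (i)
        factor = a[pivot_row][col]
        a[pivot_row] = [x / factor for x in a[pivot_row]]        # operation (ii)
        for r in range(len(a)):
            if r != pivot_row and a[r][col] != 0:                # operation (iii)
                a[r] = [x - a[r][col] * y for x, y in zip(a[r], a[pivot_row])]
        pivot_row += 1
        if pivot_row == len(a):
            break
    return a

result = rref([[0, 0, 2, 7], [1, 1, 1, 1], [1, 1, 4, -5], [2, 2, 5, -4]])
for row in result:
    print(row)
```

On these entries the output is the matrix with rows (1 1 0 0), (0 0 1 0), (0 0 0 1), (0 0 0 0), matching the reduced row-echelon form reached in the worked example.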


EXAMPLE. We consider what might happen to a certain system of 4 equations in 4 unknowns. Putting the data in place for the right side makes the array have 4 rows and 5 columns. We transform the array into reduced row-echelon form and suppose that it comes out to be

    ( 1 1 0 … )
    ( 0 0 1 … )
    ( 0 0 0 … )
    ( 0 0 0 … )