
Chapter 1

The nature of mathematics

This chapter is a guide to the mathematics described in this book.

1.1 What are algebra, geometry and combinatorics?

1.1.1 Algebra

Algebra started as the study of equations. The simplest kinds of equations are ones like

3x − 1 = 0

where there is only one unknown x and that unknown occurs to the power 1. This means we have x alone and not, say, x^1000. It is easy to solve this specific equation. Add 1 to both sides to get

3x = 1

and then divide both sides by 3 to get

x = 1/3.

This is the solution to my original equation and, to make sure, we check our answer by calculating

3 · (1/3) − 1

and observing that we really do get 0 as required. Even this simple example raises an important point: to carry out these calculations, I had to know what rules the numbers and symbols obeyed. You probably applied these rules unconsciously, but in this book it will be important to know explicitly what they are. The method used for the specific example above can be applied to any equation of the form

ax + b = 0

as long as a ≠ 0. Here a, b are specific numbers, probably real numbers, and x is the real number I am trying to find. This equation is the most general example of a linear equation in one unknown.
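The two steps used above (move b to the other side, then divide by a) can be sketched in a few lines of Python. This is only an illustration and not part of the original text: the function name solve_linear is invented for the example, and exact fractions from the standard library are assumed so that the final check gives exactly 0 rather than a rounding error.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Solve a*x + b = 0 for x, assuming a is not zero (hypothetical helper)."""
    a, b = Fraction(a), Fraction(b)
    if a == 0:
        raise ValueError("a must be non-zero for a linear equation")
    # Subtract b from both sides to get a*x = -b, then divide both sides by a.
    return -b / a


x = solve_linear(3, -1)   # the equation 3x - 1 = 0
print(x)                  # 1/3
print(3 * x - 1)          # 0, so the answer checks out
```

Notice that the code has to state the rule a ≠ 0 explicitly, which is exactly the kind of hidden assumption the text is drawing attention to.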

If x occurs to the power 2 then we get

ax^2 + bx + c = 0

where a ≠ 0. This is an example of a quadratic equation in one unknown. You will have learnt a formula to solve such equations. But there is no reason to stop at 2. If x occurs to the power 3 we get a cubic equation in one unknown

ax^3 + bx^2 + cx + d = 0

where a ≠ 0. Solving such equations is much harder than solving quadratics but there is also an algebraic formula for the roots. But there is no reason to stop at cubics. We could look at equations in which x occurs to the power 4, quartics, and once again there is a formula for finding the roots. The highest power of x that occurs in such an equation is called its degree. These results might lead you to expect that there are always algebraic formulae for finding the roots of any polynomial equation whatever its degree. There aren’t. For equations of degree 5, the quintics, and more, there are no algebraic formulae which enable you to solve the equations. I don’t mean that no formulae have yet been discovered, I mean that someone has proved that such a formula is impossible, that someone being the young French mathematician Évariste Galois (1811–1832), the James Dean of mathematics. Galois’s work meant the end of the view that algebra was about finding formulae to solve equations. We shall not study Galois’s work in this book but it has had a huge impact on algebra. It is one of the reasons why the algebra you study later in your university careers will look very different from the algebra you studied at school. In fact, one of my goals in writing this book is to help you navigate this transition.


I have talked about solving equations where there is one unknown but there is no reason to stop there. We can also study equations where there are any finite number of unknowns and those unknowns occur to any powers. The best place to start is where we have any number of unknowns but each unknown can occur only to the first power and no products of unknowns are allowed. This means we are studying linear equations like

x + 2y + 3z = 4.

Our goal is to find all the values of x, y and z that satisfy this equation. Thus the solutions are ordered triples (x, y, z). For example, both (0, 2, 0) and (2, 1, 0) are solutions whereas (1, 1, 1) is not a solution. It is unusual to have just one linear equation to solve. Usually we have two or more such as

x + 2y + 3z = 4 and x + y + z = 0.

We then need to find all the triples (x, y, z) that satisfy both equations simultaneously. In fact, as you should check, all the triples

(λ − 4, 4 − 2λ, λ)

where λ is any number satisfy both equations. For this reason, we often speak about simultaneous linear equations. It turns out that solving systems of linear equations never becomes difficult however many unknowns there are. The modern way of studying systems of linear equations uses matrix theory.
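The check suggested above can be done by hand: substituting gives (λ − 4) + 2(4 − 2λ) + 3λ = 4 and (λ − 4) + (4 − 2λ) + λ = 0. As a small sanity check, the same substitution can be sketched in Python. The function names family and satisfies_both are invented for this illustration, and trying a handful of values of λ is only evidence, not a proof; the algebra is the proof.

```python
from fractions import Fraction


def satisfies_both(x, y, z):
    """True when (x, y, z) solves x + 2y + 3z = 4 and x + y + z = 0."""
    return x + 2 * y + 3 * z == 4 and x + y + z == 0


def family(lam):
    """The triple (lam - 4, 4 - 2*lam, lam) from the text."""
    return (lam - 4, 4 - 2 * lam, lam)


# Sample values of lambda: the fractions -9/3, -8/3, ..., 9/3.
samples = [Fraction(n, 3) for n in range(-9, 10)]
print(all(satisfies_both(*family(lam)) for lam in samples))  # True

# (0, 2, 0) solves the first equation but not the second.
print(satisfies_both(0, 2, 0))  # False
```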

That leaves studying equations where there are at least 2 unknowns and where there are no constraints on the powers of the unknowns and the extent to which they may be multiplied together. This is much more complicated. If you only allow squares such as x^2 or products of at most two unknowns, such as xy, then there are relatively simple methods for solving them. But, even here, strange things happen. For example, the solutions to

x^2 + y^2 = 1

can be written (x, y) = (sin θ, cos θ). If you allow cubes or products of more than two unknowns then you enter the world of subjects like algebraic geometry and even connect with current research.

In this book, I shall introduce you to the theory of polynomial equations and also to the theory of linear equations. I shall also show you how to solve equations that look like this

ax^2 + bxy + cy^2 + dx + ey + f = 0.


So far, I have been talking about the algebra of numbers. But I shall also introduce you to the algebra of matrices, and the algebra of vectors, and the algebra of subsets of a set, amongst others. In fact, I think the first shock on encountering university mathematics can be summed up in the following statement.

There is not one algebra, but many different algebras, each designed for different purposes.

These different algebras are governed by different sets of rules. For this reason, it becomes crucial in university mathematics to make those rules explicit. In this book, the algebra you studied at school I often call high-school algebra so we know what we are talking about.

In my description of solving equations, I have left to one side something that probably seemed obvious: the nature of the solutions. These solutions are of course numbers but what do we mean by ‘numbers’? You might think that a number is a number but in mathematics this concept turns out to be much more interesting than it might first appear. The everyday idea of a number is essentially that of a real number. Informally, these are the numbers that can be expressed as positive or negative decimals, with possibly an infinite number of digits after the decimal point such as

π = 3.14159265358…

where the dots indicate that this can be continued forever. Whilst such numbers are sufficient to solve linear equations in one unknown, they are not enough to solve quadratics, cubics, quartics etc. These require the introduction of complex numbers which involve such apparent ineffabilities as the square root of minus one. Because such numbers don’t occur in everyday life, there is a temptation to view them as somehow artificial or of purely theoretical interest. This is wrong with a capital w. All numbers are artificial, in that they are artefacts of our imaginations that help us to understand the world. Although you can see examples of two things you cannot see the number two. It is an idea, an abstraction. As for being of only theoretical interest, it is worth noting that quantum mechanics, the theory that explains the behaviour of atoms and their constituents, uses complex numbers in an essential way. In fact, for mathematicians the word ‘number’ usually means ‘complex number’ and mathematics is unthinkable without them.


But this is not the end of our excavations of what we mean by the word ‘number’. There are occasions when we want to restrict the solutions: we might want whole number solutions or solutions as fractions. It turns out that the usual high-school methods for solving equations don’t work in these cases. For example, consider the equation

2x + 4y = 3.

To find the real or complex solutions, we let x be any real or complex value and then we can solve the equation to work out the corresponding value of y. But suppose that we are only interested in whole number solutions? In fact, there are none. You can see why by noting that the left-hand side of the equation is exactly divisible by 2, whereas the right-hand side isn’t. When we are interested in solving equations, of whatever type, by means of whole numbers or fractions we say that we are studying Diophantine equations. The name comes from Diophantus of Alexandria who flourished around 250 CE, and who studied such equations in his book Arithmetica. It is ironic that solving Diophantine equations is often much harder than solving equations using real or complex numbers.
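The divisibility argument above generalizes: for whole numbers a and b, not both zero, the equation ax + by = c has whole number solutions exactly when the greatest common divisor of a and b divides c. That standard fact is not stated in the text above, so treat the following Python sketch as a supplementary illustration. The names has_integer_solution and small_search are invented for the example, and the brute-force search over a small box is included only as a cross-check.

```python
from math import gcd


def has_integer_solution(a, b, c):
    """Whether a*x + b*y = c has whole number solutions (gcd criterion)."""
    g = gcd(a, b)
    if g == 0:            # a = b = 0: solvable only when c is also 0
        return c == 0
    return c % g == 0


def small_search(a, b, c, bound=20):
    """All solutions with -bound <= x, y <= bound, found by brute force."""
    values = range(-bound, bound + 1)
    return [(x, y) for x in values for y in values if a * x + b * y == c]


print(has_integer_solution(2, 4, 3))   # False: 2 divides the left side but not 3
print(small_search(2, 4, 3))           # []
print(has_integer_solution(2, 4, 6))   # True, for example x = 1, y = 1
```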

1.1.2 Geometry

If algebra is about manipulating symbols, then geometry is about pictures. The Ancient Greeks developed geometry to a very high level. Some of their achievements are recorded in Euclid’s book the Elements which I shall have more to say about later. It developed the whole of what became known as Euclidean geometry on the basis of a few rules known as axioms. This geometry gives every impression of being a faithful mathematical version of the geometry of actual space and for that reason you might expect that, unlike algebra, there is only one geometry and that’s that. In fact, it was discovered in the nineteenth century that there are other mathematical geometries such as spherical geometry and hyperbolic geometry. In the twentieth century, it became apparent that even the space we inhabit was much more complex than it appeared. First came the four-dimensional geometry of special relativity and then the curved space-time of general relativity. Modern particle physics suggests that there may be many more dimensions in real space than we can see. So, in fact, we have the following.

There is not one geometry, but many different geometries, each designed for different purposes.

In this book, I will only talk about three-dimensional Euclidean geometry, but this is the gateway to all these other geometries.

This, however, is not the end of the story. In fact, any book about algebra must also be about geometry. The two are indivisible but it was not always like that. Unlike geometry which began with a sort of Big Bang in Ancient Greece, algebra crystallized much more slowly over time and in different places. There is even some algebra, disguised, in the Elements. In the 17th century, René Descartes discovered the first connection between algebra and geometry which will be completely familiar to you. For example, x^2 + y^2 = 1 is an algebraic equation, but it also describes something geometric: a circle of unit radius centred on the origin. This connection between algebra and geometry will play an important role in our study of linear equations and vectors. But it is just a beginning.

If you are studying an algebra look for an accompanying geometry, and if you are studying a geometry find a companion algebra.

This is quite a fancy way of saying things, but it boils down to the fact that manipulating symbols is often helped by drawing pictures, and sometimes the pictures are too complex so it is helpful to replace them with symbols. It’s not a one-way street.

I want to give you some idea of why the connection between algebra and geometry is so significant. Let me start with a problem that looks completely algebraic. Problem: find all whole numbers a, b, c that satisfy the equation a^2 + b^2 = c^2. I’ll write solutions that satisfy this equation as (a, b, c). Such numbers are called Pythagorean triples. Thus (0, 0, 0) is a solution and so is (3, 4, 5), and I can put in minus signs since when squared they disappear so (−3, 4, −5) is a solution. In addition, if (a, b, c) is a solution so is (λa, λb, λc) where λ is any whole number. I shall now show that this problem is equivalent to one in geometry. Suppose first that a^2 + b^2 = c^2. We exclude the case where c = 0 since then a = 0 and b = 0. We may therefore divide both sides by c^2 and get

(a/c)^2 + (b/c)^2 = 1.

Recall that a rational number is a real number that can be written in the form u/v where u and v are whole numbers and v ≠ 0. It follows that

(x, y) = (a/c, b/c)

is a rational point on the unit circle; that is, a point with rational co-ordinates. On the other hand, if

(x, y) = (m/n, p/q)

is a rational point on the unit circle then

(mq)^2/(nq)^2 + (np)^2/(nq)^2 = 1.

Thus (mq, pn, nq) is a Pythagorean triple. We may therefore interpret our algebraic question as a geometric one: to find all Pythagorean triples, find all those points on the unit circle with centre the origin whose x and y co-ordinates are both rational. In fact, this can be used to get a very nice solution to the original algebraic problem as we shall show later.
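The passage between triples and rational points can be traced in a short Python sketch. It is an illustration only: the function names are invented here, and the standard-library Fraction type is assumed as a stand-in for rational numbers. Note that Fraction reduces m/n and p/q to lowest terms, so the triple that comes back may be a whole number multiple of the one you started with.

```python
from fractions import Fraction


def triple_to_point(a, b, c):
    """Send a Pythagorean triple with c != 0 to the point (a/c, b/c)."""
    return Fraction(a, c), Fraction(b, c)


def on_unit_circle(x, y):
    """Exact test for x^2 + y^2 = 1."""
    return x * x + y * y == 1


def point_to_triple(x, y):
    """Send a rational point (m/n, p/q) to the triple (mq, pn, nq)."""
    m, n = x.numerator, x.denominator
    p, q = y.numerator, y.denominator
    return m * q, p * n, n * q


x, y = triple_to_point(3, 4, 5)
print(on_unit_circle(x, y))       # True
a, b, c = point_to_triple(x, y)
print((a, b, c))                  # (15, 20, 25), which is 5 times (3, 4, 5)
print(a * a + b * b == c * c)     # True
```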

1.1.3 Combinatorics

The term ‘combinatorics’ may not be familiar though the sorts of questions it deals with are. Combinatorics is the branch of mathematics that deals with arrangements and the counting of arrangements. The fact that it deals in counting makes it sound like this should be an easy subject. In fact, it is often very difficult. For example, counting lies behind probability theory, a subject that can often defy intuition. Let me give you a simple example. In a class of, say, 25 students, how likely do you think it is that two students will share the same birthday? By this I mean the same day and month, though not year. Unless you’ve seen this problem before, I think the instinct is to say ‘not very’. This is because we imagine in our mind’s eye those 25 students to be arranged across 365 days without any pair of students landing on the same date. In fact the answer, which you can calculate using the methods of this book, is just over a half. In other words, there is roughly the same chance of two students sharing the same birthday as there is of tossing a coin and getting heads. This little problem is often known as the birthday paradox. It is a good example of where maths can be used to correct our faulty intuition. But this is really a counting problem. To get the right answers to such problems, you need to think about what you are counting in the right way.
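To see where ‘just over a half’ comes from, it is easier to count the opposite event: that all the birthdays are different. The Python sketch below does this under assumptions that are mine rather than the text’s: every one of 365 days is equally likely, leap years are ignored, birthdays are independent, and ‘two students share a birthday’ is read as ‘at least two’. The function name is invented for the example.

```python
def prob_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday."""
    p_all_distinct = 1.0
    for i in range(n):
        # The (i+1)-th person must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1 - p_all_distinct


print(round(prob_shared_birthday(25), 3))   # about 0.569
print(round(prob_shared_birthday(23), 3))   # about 0.507
```

Under these assumptions the probability already passes one half with 23 students.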

1.2 The scope of mathematics

The most common replies to the question ‘what is mathematics?’ addressed to a non-mathematician are usually the depressing ‘arithmetic’ or ‘accountancy’. Asked what they remember about school maths, they might be able to dredge up some more-or-less arcane words with challenging spellings: hypotenuse, isosceles, parallelogram. It either sounds a bit boring or a bit weird, but in any event is so obviously completely removed from real life that it can safely be ignored.

Mathematics, therefore, has an image problem.

I think part of the reason for this is the kind of maths that is taught in schools and the way it is taught. School mathematics suffers by being based on the narrow syllabuses prescribed by examining boards under political direction. As a result, it is more by luck than design if anyone at school gets an idea of what maths is actually about. In addition, teaching too often means teaching to the exam, which means working through past exam papers and learning tricks [1].

Let me begin by showing you just how vast a subject mathematics really is. The official Mathematics Subject Classification currently divides mathematics into 64 broad areas in any one of which a mathematician could work their entire professional life. You can see what they are in the box. By the way, the missing numbers are deliberate and not because I cannot count.

[1] I say teaching and not teachers. My criticism is directed at policy not those who are forced to carry out that policy often under enormous pressures.

Mathematics Subject Classification 2010 (adapted)

00. General 01. History and biography 03. Mathematical logic and foundations 05. Combinatorics 06. Order theory 08. General algebraic systems 11. Number theory 12. Field theory 13. Commutative rings 14. Algebraic geometry 15. Linear and multilinear algebra 16. Associative rings 17. Non-associative rings 18. Category theory 19. K-theory 20. Group theory and generalizations 22. Topological groups 26. Real functions 28. Measure and integration 30. Complex functions 31. Potential theory 32. Several complex variables 33. Special functions 34. Ordinary differential equations 35. Partial differential equations 37. Dynamical systems 39. Difference equations 40. Sequences, series, summability 41. Approximations and expansions 42. Harmonic analysis 43. Abstract harmonic analysis 44. Integral transforms 45. Integral equations 46. Functional analysis 47. Operator theory 49. Calculus of variations 51. Geometry 52. Convex geometry and discrete geometry 53. Differential geometry 54. General topology 55. Algebraic topology 57. Manifolds 58. Global analysis 60. Probability theory 62. Statistics 65. Numerical analysis 68. Computer science 70. Mechanics 74. Mechanics of deformable solids 76. Fluid mechanics 78. Optics 80. Classical thermodynamics 81. Quantum theory 82. Statistical mechanics 83. Relativity 85. Astronomy and astrophysics 86. Geophysics 90. Operations research 91. Game theory 92. Biology 93. Systems theory 94. Information and communication 97. Mathematics education

Each of these broad areas is then subdivided into a large number of smaller areas, any one of which could be the subject of a PhD thesis. This is a little overwhelming, so to make it more manageable it can be summarized, very roughly, into the following ten areas:

Algebra                   Number theory
Calculus and analysis     Probability and statistics
Combinatorics             Differential equations
Geometry and topology     Mathematical physics
Logic                     Computing

Most undergraduate courses will fit under one of these headings. But it is important to remember that mathematics is one subject: dividing it up into smaller areas is done for convenience only. When solving a problem any and all of the above areas might be needed.

1.3 Pure versus applied mathematics

Sometimes a distinction is drawn between pure and applied mathematics. Pure maths is supposed to be maths done for its own sake with no thought to applications, whereas applied maths is maths used to solve some, presumably practical, problem. I think there is often an implicit moralistic undertone to this distinction with pure maths being viewed as perhaps rather self-indulgent and decorative, and applied maths as socially responsible grown-up maths that pays its way. Politicians prefer applied maths because they think it will make money. Evidence for this distinction is the following quote from the English mathematician G. H. Hardy (1877–1947) that is often used to prove the point:

“I have never done anything ‘useful’. No discovery of mine has made, or is likely to make, directly or indirectly, for good or ill, the least difference to the amenity of the world.”

Hardy was a truly great mathematician and a decent human being. As his dates show, he was of the generation that witnessed the First World War where science and technology were applied to the business of wholesale slaughter. His views on maths are therefore a not unnatural reaction on the part of someone who taught young people who then went to war never to return. Maths for him was perhaps a sanctuary [2]. In reality, the terms pure and applied are extremely fuzzy. A mathematician might start work on solving a real-life problem and then be led to develop new pure mathematics, or start in pure maths and develop an application. Calculus, for example, developed mainly out of the need to solve problems in physics and then was applied to pure maths. Complex numbers couldn’t have been more pure, introduced to provide the missing roots to polynomial equations, but are now the basis of quantum mechanics. In reality, there is just one mathematics.

The Banach-Tarski Paradox

The glory of mathematics is often to be found in its sheer weirdness. For a universe founded on logic, it can lead to some pretty confounding conclusions. For example, a solid the size of a pea may be cut into a finite number of pieces which may then be reassembled in such a way as to form another solid the size of the sun. This is known as the Banach-Tarski Paradox (1924). There’s no trickery involved here and no sleight of hand. This is clearly pure maths (give me a real pea and whatever I do it will remain resolutely pea-sized) but the ideas it uses involve such fundamental and seemingly straightforward notions as length, area and volume that have important applications in applied maths.

[2] There was a similar reaction at the end of the Second World War amongst physicists who turned instead to biology as an alternative to building weapons.

1.4 The antiquity of mathematics

The history of chemistry or astronomy is not hugely relevant, however interesting it may be, to modern theories of chemistry or astronomy. A few hundred years ago, chemistry was alchemy and astronomy was astrology: modern chemists are not searching for the philosopher’s stone and astronomers don’t cast horoscopes. Alchemists and astrologers are often the forebears they would prefer to forget [3]. Maths is different, since what was mathematically true hundreds of years ago remains true today. Here is a famous example. Plimpton 322 is a small clay tablet kept in the George A. Plimpton Collection at Columbia University dating to about 1,800 BCE. Impressed on the tablet are a number of columns of numbers written in cuneiform. The numbers are written not in base 10 but in base 60, the base that still lies behind the way we tell the time and measure angles. The meaning and purpose of this clay tablet is much disputed. But the second and third columns consist of the following numbers, where I have given the usual corrected numbers. I have given the first seven lines of the table; there are fifteen in the original.

      B        C
1     119      169
2     3367     4825
3     4601     6649
4     12709    18541
5     65       97
6     319      481
7     2291     3541

If you calculate C^2 − B^2 you will get a perfect square D^2. Thus (B, D, C) is a Pythagorean triple. How such large Pythagorean triples were computed is a mystery.
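The claim about C^2 − B^2 is easy to test for the seven rows quoted above. The following Python sketch is an illustration added here, not part of the original text; the function name third_side is invented, and math.isqrt from the standard library is assumed for an exact integer square root.

```python
from math import isqrt

# The (B, C) pairs from the first seven lines of the table.
rows = [
    (119, 169),
    (3367, 4825),
    (4601, 6649),
    (12709, 18541),
    (65, 97),
    (319, 481),
    (2291, 3541),
]


def third_side(b, c):
    """Return D with B^2 + D^2 = C^2, or None if C^2 - B^2 is not a square."""
    d_squared = c * c - b * b
    d = isqrt(d_squared)
    return d if d * d == d_squared else None


for b, c in rows:
    print(b, third_side(b, c), c)
# The D values come out as 120, 3456, 4800, 13500, 72, 360, 2700.
```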

[3] I am exaggerating a little here for rhetorical purposes. In fact, much fine work was carried out under the guise of alchemy and astrology.

This antiquity, combined with the fact that maths is a cumulative subject, meaning that you have to learn X before you can learn Y, has the unfortunate effect that most of the mathematics you learnt at school was invented before 1800. Here is a very rough chronology.

BCE                                      CE
2000  Solving quadratics                 1550  Solving cubics and quartics
400   Existence of irrational numbers    1590  Logarithms
300   Euclidean geometry                 1630  Analytic geometry
200   Conics                             1675  Calculus
                                         1700  Probability
                                         1795  Complex numbers

Only matrices (1850) and vectors (1880) were introduced more recently. However, if you think of all the developments in physics since 1800 such as black holes, the big bang theory, parallel universes and quantum mechanics, then you might suspect that there have also been big developments in mathematics. There have, but you would be forgiven for not knowing about them because they are not promoted in the media or taught in school.

I should add that like any other field of human endeavour, it is of course true that mathematical ideas go in and out of fashion, but crucially they don’t become wrong with time.

1.5 The modernity of mathematics

The fact that what’s taught in schools doesn’t seem to change much from generation to generation leads to one of the biggest misconceptions about mathematics: that it has already all been discovered. To try and bring you up to date, I am going to say a little about three mathematicians and their work: Alan Turing (1912–1954), Sir Andrew Wiles (b. 1953), and Terence Tao (b. 1975). I have chosen them to illustrate some additional points I want to make about maths.

Alan Turing

Alan Turing is the only mathematician I know who has had a West End play written about his life: the 1986 play Breaking the Code by Hugh Whitemore. Turing is best known as one of the leading members of Bletchley Park during the Second World War, for his role in the British development of computers during and after the War, and for the ultimately tragic nature of his early death. Here I want to return to Turing the mathematician. As a graduate student, he wrote a paper in 1936 entitled On computable numbers, with an application to the Entscheidungsproblem, where the long German word means decision problem and refers to a specific question in mathematical logic. It was as a result of solving this problem that Turing was led to formulate a precise mathematical blueprint for a computer, now called a Turing machine in his honour. This is the most extreme example I know of a problem in pure maths leading to new applied maths; in fact, it led to the whole field of computer science and the information age we now inhabit. Amongst computer scientists, Turing is regarded as the father of computer science. So, mathematicians invented the modern world.

Andrew Wiles

Mathematicians operate on a completely different timescale from everyone else. I have already talked about Pythagorean triples, those whole numbers (x, y, z) that satisfy the equation x^2 + y^2 = z^2. Here’s an idle thought. What happens if we try to find whole number solutions to x^3 + y^3 = z^3 or x^4 + y^4 = z^4 or more generally x^n + y^n = z^n where n ≥ 3? Let’s exclude the trivial case where some of the numbers x, y or z are 0. So, here is the question: for n ≥ 3 find all whole number solutions to x^n + y^n = z^n where xyz ≠ 0. Back in the 17th century, Pierre de Fermat (1601?–1665) wrote in the margin of a book, the Arithmetica of Diophantus, that he had found a proof that there were no such solutions but that sadly there wasn’t enough room for him to record it. This became known as Fermat’s Last Theorem. In fact, since Fermat’s supposed proof was never found, it was really a conjecture. More to the point, it is highly unlikely that he ever had a proof since in the subsequent centuries many attempts were made to prove this result, all in vain, although substantial progress was made. This problem became one of mathematics’ many Mount Everests: the peak that everyone wanted to scale. Finally, on Monday 19th September, 1994, sitting at his desk, Andrew Wiles, building on over three centuries of work, and haunted by his premature announcement of his success the previous year, had a moment of inspiration as the following quote from the Daily Telegraph dated 3rd May 1997 reveals:

“Suddenly, totally unexpectedly, I had this incredible revelation. It was so indescribably beautiful, it was so simple and so elegant.”

As a result Fermat’s Conjecture really is a theorem, but the proof required travelling through what can only be described as mathematical hyperspace. Wiles’s reaction to his discovery is also a glimpse of the profound intellectual excitement that engages the emotions as well as the intellect when doing mathematics [4].

Terence Tao

Tao won the 2006 Fields Medal. This is a mathematical honour comparable with a Nobel Prize though with the added twist that you have to be under 40 to get one. You can read his thoughts at his blog, as well as use it to find all manner of interesting things. So, what sorts of things does he do? Here is one example that is remarkably easy to explain though the proof is formidable. You know what primes are and, in any event, we shall talk about them later. They can be regarded as the atoms of numbers and their properties have inspired hard questions and deep results. One of the things that interests mathematicians is the sorts of patterns that can be found in primes. An arithmetic progression is a sequence of numbers of the form a + dk, for k = 0, 1, 2, …, where a and d are fixed numbers. Consider the arithmetic progression 3 + 2k. Observe that for the consecutive values of k = 0, 1, 2, the numbers 3, 5, 7 which arise are all prime. But when k = 3 we get 9 which is not prime. Our little example is an instance of an arithmetic progression with 3 terms all prime. Here is one with 10 terms: 199 + 210k where k = 0, 1, . . . , 9. In 2004, Tao and his colleague Ben Green proved that there were arithmetic progressions of arbitrary length all of whose terms are prime. In other words, for any number n there is an arithmetic progression so that the first n terms are all prime.
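The two small progressions mentioned above can be checked directly. The Python sketch below is an added illustration: the function names are invented, the primality test is plain trial division (fine for numbers this small, far too slow for serious work), and it assumes d is a positive whole number. It says nothing about how the Green and Tao theorem is proved.

```python
def is_prime(n):
    """Trial division: adequate for the small numbers used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def prime_run_length(a, d):
    """Number of initial terms a, a + d, a + 2d, ... that are prime (d > 0)."""
    k = 0
    while is_prime(a + d * k):
        k += 1
    return k


print(prime_run_length(3, 2))       # 3: the terms 3, 5, 7, then 9 fails
print(prime_run_length(199, 210))   # 10: k = 0, ..., 9, then 2299 = 11 * 11 * 19
```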

1.6 The legacy of the Greeks

The word ‘mathematics’ is Greek. In fact, many mathematical terms are Greek: lemma, theorem, hypotenuse, orthogonal, polygon, to name just a few. The Greek alphabet is used as a standard part of mathematical notation. The very concept of a mathematical proof is a Greek idea. All of this reflects the fact that Ancient Greece is the single most important historical influence on the development and content of mathematics. By Ancient Greek mathematics, I mean the mathematics developed in the wider Greek world around the Mediterranean in the thousand or more years between roughly 600 BCE and 600 CE. It begins with the work of semi-mythical figures, such as Thales of Miletus and Pythagoras of Samos, and is developed in the books of such mathematicians as Euclid, Archimedes, Apollonius of Perga, Diophantus and Pappus. Of all the Ancient Greek mathematicians the greatest was Archimedes. His work is sophisticated mathematics of the highest order. In particular, he developed methods that are close to those of integral calculus and used them to calculate areas and volumes of complicated curved shapes.

[4] There is a BBC documentary directed by Simon Singh about Andrew Wiles made for the BBC’s Horizon series. It is an exemplary demonstration of how to portray complex mathematics in an accessible way and cannot be too highly recommended.

1.7 The legacy of the Romans

For all their aqueducts, roads, baths and maintenance of public order, it has been said of the Romans that their only contribution to mathematics was when Cicero rediscovered the grave of Archimedes and had it restored [5].

1.8 What they didn’t tell you in school

This book is written to help you make the transition from school maths to university maths. You might well still be in school, or you might have left school fifty years ago; it doesn’t matter. Maths as taught in school and the maths taught at university are very different, and the failure to understand those differences can cause problems. To be successful in university mathematics you have to think in new ways. University Mathematics is not just School Mathematics with harder sums and fancier notation; it is different, fundamentally different, from what you did at school.

In much of school mathematics, you learn methods for solving specific problems. Often, you just learn formulae.

A method for solving a problem that requires little thought in its application is called an algorithm. Computer programs are the supreme examples of algorithms, and it is certainly true that finding algorithms for solving specific problems is an important part of mathematics, but it is by no means the only part. Problems do not come neatly labelled with the methods needed for their solution. A new problem might be solvable using old methods or it might require you to adapt those methods. On the other hand, you may have to invent completely new methods to solve it. Such new methods require new ideas. In fact, what you might not have appreciated from school mathematics is the important role played in mathematics by ideas. An idea is a tool to help you think.

[5] George Simmons, Calculus Gems, McGraw-Hill, Inc., New York, 1992, page 38.

Mathematics at school is often taught without reasons being given for why the methods work.

This is the fundamental difference between school mathematics and university mathematics. A reason why something works is called a proof. I shall say a lot more about proofs in Chapter 2.

The Millennium Problems

Mathematics is difficult but intellectually rewarding. Just how hard can be gauged by the following. The Millennium Problems are a list of seven outstanding problems posed by the Clay Mathematics Institute in the year 2000. A correct solution to any one of them carries a one million dollar prize. To date, only one has been solved, the Poincaré conjecture, by Grigori Perelman, whose proof appeared in 2002–2003; he was awarded the prize in 2010 but declined to take the money. The point is that no one offers a million dollars for something that is trivial. You can read more about these problems at

http://www.claymath.org/millennium-problems

1.9 Further reading and links

There is a wealth of material about mathematics available on the Web and I would encourage exploration. Here, I will point out some books and links that develop the themes of this chapter. A book that is in tune with the goals of this chapter is

P. Davis, R. Hersh, E. A. Marchisotto, The mathematical experience, Birkhäuser, 2012.


It’s one of those books that you can dip into and you will learn something interesting but, most importantly, it will expand your understanding of what mathematics is, as it did mine.

A good source book for the history of mathematics, and again something that can be dipped into, is

C. B. Boyer, U. C. Merzbach, A history of mathematics, Jossey Bass, 3rd Edition, 2011.

The books above are about maths rather than doing maths. Let me now turn to some books that do maths in a readable way. There is a plethora of popular maths books now available, and if you pick up any books by Ian Stewart (though if the book appears to be rather more about volcanoes than is seemly in a maths book, you have Iain Stewart) and Peter Higgins then you will find something interesting. Sir (William) Timothy Gowers won a Fields Medal in 1998 and so can be assumed to know what he is talking about.

T. Gowers, Mathematics: A Very Short Introduction, Oxford University Press, 2002.

It is worth checking out his homepage for some interesting links. He also has his own blog which is worth checking out. I think the Web is serving to humanize mathematicians: their ivory towers all have wi-fi. A classic book of this type is

R. Courant, H. Robbins, What is mathematics, OUP, 1996.

This is also an introduction to university-level maths, and it has influenced my thinking on the subject.

If you have never looked into Euclid’s book the Elements, then I would recommend you do.⁶ There is an online version that you can access via David E. Joyce’s website at Clark University. A handsome printed version, edited by Dana Densmore, has been published by Green Lion Press, Santa Fe, New Mexico.

⁶ Whenever I refer to Euclid, it will always be to this book. It consists of thirteen chapters, themselves called ‘books’, which are numbered in the Roman fashion I–XIII.

Finally, let me mention the books of Martin Gardner. For a quarter of a century, he wrote a monthly column on recreational mathematics for the Scientific American which inspired amateurs and professionals alike. I would start with

M. Gardner, Hexaflexagons, probability paradoxes, and the Tower of Hanoi: Martin Gardner’s first book of mathematical puzzles and games, CUP, 2002

and follow your interests.

Chapter 2

Proofs

Part of the argument sketch, Monty Python

M = Man looking for an argument
A = Arguer

M: An argument isn’t just contradiction.
A: It can be.
M: No it can’t. An argument is a connected series of statements intended to establish a proposition.
A: No it isn’t.
M: Yes it is! It’s not just contradiction.
A: Look, if I argue with you, I must take up a contrary position.
M: Yes, but that’s not just saying ‘No it isn’t.’
A: Yes it is!
M: No it isn’t!
A: Yes it is!
M: Argument is an intellectual process. Contradiction is just the automatic gainsaying of any statement the other person makes.
(short pause)
A: No it isn’t.

The most fundamental difference between school and university mathematics lies in proofs. At school, you were probably told mathematical facts and given recipes that solved particular kinds of problems. But the chances are, you were not given any reasons to back up those facts or explanation as to why those recipes worked. University and professional mathematics is different. Reasons and explanations are essential and are called proofs. They are the essence of mathematics. Mathematical truth, and the notion of proof that supports it, is so different from what we encounter in everyday life that I shall need to begin by setting the scene.

2.1 How do we know what we think is true is true?

Human beings usually believe something first for emotional reasons, and then look for the evidence to back it up. The pitfalls of this are obvious. We shall therefore be interested in reasons that do not involve emotion. To be concrete, how would you verify the following claim: Mount Everest is between 8 and 9 km high?

The appeal to authority

In the past, claims such as this would be resolved by consulting an encyclopedia or atlas whereas today, of course, we would simply go online. If you do this, you will find that a height of about 8.8 km is quoted. For most purposes this would settle things. But it’s important to understand what this entails. We are, in effect, taking someone’s word for it. We assume that whoever posted this information knows what they are talking about. What we are doing, therefore, is appealing to authority. Most of what we take to be true is based on such appeals to authority: parents, teachers, politicians, religiosi etc. tell us things that they claim to be true and more often than not we believe them. There’s a small element of laziness involved on our part, but it is so convenient. The pitfalls of this are also obvious.

The appeal to experiment

But where did the figure of 8.8 km come from? It wasn’t just plucked from the sky. The height of Mount Everest was first measured as part of the great survey of India undertaken in the nineteenth century. This was carried out by a team of expert surveyors who not only employed extremely precise instruments that were used to take multiple measurements but who also tried to minimize the effect of factors influencing the accuracy of their measurements such as temperature and, amazingly, variations in gravity. Making measurements and taking great pains over those measurements together with estimations of the error bounds is such an important part of science that science itself would be impossible without it. Let’s call this the appeal to experiment.

This brings me to how we know statements are true in mathematics. The essential point is the following:

Neither of the above methods for ascertaining truth plays any role whatsoever in determining mathematical truth.

This is so important, I am going to say it again in a different way:

• Results are not true in maths because I say so or because someone important said they were true a long time ago.

• Results are not true in mathematics because I have carried out experiments and I always get the same answer.

• Results are not true in maths ‘just because they are’.

How then can we determine whether something in mathematics is true?

• Results are true in maths only because they have been proved to be true.

• A proof shows that a result is true.

• A proof is something that you yourself can follow and at the end you will see the truth of what has been proved.

• A result that has been proved to be true is called a theorem.

• The appeal to authority and the appeal to experiment are both fallible. The appeal to proof is never fallible. The only truths we know for certain are mathematical truths.

This is heady stuff. So what, then, is a proof? The remainder of this chapter is devoted to an introductory answer to this question.


2.2 Three fundamental assumptions of logic

In order to understand how mathematical proofs work, there are three simple, but fundamental, assumptions you have to understand.

I. Mathematics only deals in statements that are capable of being either true or false.

Mathematics does not deal in statements which are ‘sometimes true’ or ‘mostly false’. There are no approximations to the truth in mathematics and no grey areas. Either a statement is true or a statement is false, though we might not know which. This is quite different from everyday life, where we often say things which contain a grain of truth or where we say things for rhetorical reasons which we don’t entirely mean. Mathematics also doesn’t deal in statements that are neither true nor false like exclamations such as ‘Out damned spot!’ or with questions such as ‘To be or not to be?’.

II. If a statement is true then its negation is false, and if a statement is false then its negation is true.

In natural languages, negating a sentence is achieved in different ways. In English, the negation of ‘It is raining’ is ‘It is not raining’. In French, the negation of ‘Il pleut’ is obtained by wrapping the verb in ‘ne . . . pas’ to get ‘Il ne pleut pas’. To avoid grammatical idiosyncrasies, we can use the formal phrase ‘it is not the case that’ and place it in front of any sentence to negate it. So, ‘It is not the case that it is raining’ is the negation of ‘It is raining’. In some languages, and French is one of them, adding negatives is used for emphasis. This used to be the case in older forms of English and is often the case in informal English. In formal English, we are taught that two negatives make a positive, which is actually the rule taken from mathematics above, where it is true. In fact, negating negatives in natural languages is more complex than this. For example, if your partner says they are ‘not unhappy’ then this isn’t quite the same as being ‘happy’ and maybe you need to talk.

III. Mathematics is free of contradictions.


A contradiction is where both a statement and its negation are true. This is impossible by (II) above. This assumption will play a vital role in proofs as we shall see later.

2.3 Examples of proofs

Armed with the three assumptions above, I am going to take you through five proofs of five results, three of them being major theorems. This will enable me to show you examples of proofs but will also illustrate important issues about how proofs, and mathematics, work.

Although proofs can be long or short, hard or easy, they all tend to follow the same script. First, there will be a statement of what is going to be proved. This usually has the form: if a bunch of things are assumed true then something else is also true. If the things assumed true are lumped together as A, for assumptions, and the thing to be proved true is labelled C, for conclusion, then a statement to be proved usually has the shape ‘if A then C’ or ‘A implies C’ or, in notation, ‘A ⇒ C’. The proof itself should be thought of as a (rational) argument between two protagonists whom we shall call Alice and Bob. We assume that Alice wants to prove C. She can use any of the assumptions A, any previously proved theorems, the rules of logic, which I shall describe as we meet them, and definitions. Bob’s role is to act like an attorney and to demand that Alice justify each claim she makes. Thus Alice cannot just make assertions without justifying them, and she is limited in the sorts of things that count as justifications. At the end of this, Alice can say something like ‘ . . . and so C is proved’ and Bob will be forced to agree.

2.3.1 Proof 1

We shall prove the following statement.

The square of an even number is even, and the square of an odd number is odd.

In fact, this is really two statements: ‘If n is an even number then $n^2$ is even’ and ‘If n is an odd number then $n^2$ is odd.’ Before we can prove them, we need to understand what they are actually saying. The terms odd and even are only used of whole numbers such as

0, 1, 2, 3, 4, . . .

These numbers are called the natural numbers and they are the first kinds of numbers we learn about as children. Thus we are being asked to prove a statement about natural numbers. The terms ‘odd’ and ‘even’ might seem obvious, but we need to be clear about how they are used in maths. By definition, a natural number n is even if it is exactly divisible by 2, otherwise it is said to be odd. In maths, we usually just say divisible rather than exactly divisible. This definition of divisibility only makes sense when talking about whole numbers. For fractions, for example, it is pointless since one non-zero fraction will always divide another fraction. Notice that 0 is an even number because 0 = 2 × 0. In other words, 0 is exactly divisible by 2. However, remember, you cannot divide by 0 but you can certainly divide into 0. You might have been told that a number is even if its last digit is one of the digits 0, 2, 4, 6, 8. In fact, this is a consequence of our definition rather than a definition itself. I shall ask you to prove this result in the exercises. I shall say no more about the definition of even. What about the definition of odd? A number is odd if it is not even. This is not a very useful definition since it only says that a number is odd if it fails to be even. We want a more positive characterization. So we shall describe a better one. If you attempt to divide a number by 2 then there are two possibilities: either it goes exactly, in which case the number is even, or it goes so many times plus a remainder of 1, in which case the number is odd. It follows that a better way of defining an odd number n is one that can be written n = 2m + 1 for some natural number m. So, the even numbers are those natural numbers that are divisible by 2, thus the numbers of the form 2n for some n, and the odd numbers are those that leave the remainder 1 when divided by 2, thus the numbers of the form 2n + 1 for some n. Every number is either odd or even but not both.
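The two characterizations just described (n = 2m for even numbers, n = 2m + 1 for odd numbers) can be computed for any particular number by dividing by 2 with remainder. The Python sketch below is only an illustration of the definitions; the function name `parity_witness` is my own invention and not a library routine.

```python
def parity_witness(n: int) -> tuple[str, int]:
    """Return ('even', m) with n == 2*m, or ('odd', m) with n == 2*m + 1."""
    if n < 0:
        raise ValueError("only natural numbers 0, 1, 2, ... are allowed")
    # divmod gives the quotient m and the remainder, so n == 2*m + remainder
    m, remainder = divmod(n, 2)
    return ("even", m) if remainder == 0 else ("odd", m)


for n in [0, 7, 12]:
    kind, m = parity_witness(n)
    print(n, kind, m)
```

Since the remainder on dividing by 2 is either 0 or 1 and never both, exactly one of the two cases applies to each number.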

There is a moral to be drawn from what I have just done, and I shall state it boldly because of its importance. It may seem obvious but experience shows that it is, in fact, not.

Every time you are asked to prove a statement, you must ensure that you understand what that statement is saying. This means, in particular, checking that you understand what all the words in the statement mean.

The next point is that we are making a claim about all even numbers. If you pick a few even numbers at random and square them then you will find in every case that the result is even but this does not prove our claim. Even if you checked a trillion even numbers and squared them and the results were all even it wouldn’t prove the claim. Maths, remember, is not an experimental science. There are plenty of examples in maths of statements that look true and are true for umpteen cases but are in fact bunkum.
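One classical example of such a statement, which is my choice of illustration and not taken from this chapter, is the claim that $n^2 + n + 41$ is always a prime number. The Python sketch below tests the claim case by case; the helper names `is_prime` and `f` are hypothetical.

```python
def is_prime(k: int) -> bool:
    """Trial division; adequate for the small numbers used here."""
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True


def f(n: int) -> int:
    """The polynomial n^2 + n + 41."""
    return n * n + n + 41


# Find the first n for which the 'obvious' pattern breaks.
first_failure = next(n for n in range(100) if not is_prime(f(n)))
print(first_failure, f(first_failure))  # 40 1681, and 1681 == 41 * 41
```

The claim survives forty experiments in a row, from n = 0 to n = 39, and then fails, which is exactly why checking cases is not a proof.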

This means that, in effect, we have to prove an infinite number of statements: $0^2$ is even, and $2^2$ is even, and $4^2$ is even . . . I cannot therefore prove my claim by picking a specific even number, like 12, and checking that its square is even. This simply verifies one of the infinitely many statements above. As a result, the starting point for my proof cannot be a specific even number. It has to be a general even number. We are now in a position to prove our claims.

First, we prove that the square of an even number is even.

1. Let n be an even number. This is the assumption that gets the ball rolling. Notice that n is not a specific even number. We want to prove something for all even numbers so we cannot argue with a specific one.

2. Then n = 2m for some natural number m. Here we are using the definition of what it means to be an even number.

3. Square both sides of the equation in (2) to get $n^2 = 4m^2$. To do this correctly, you need to follow the rules of high-school algebra.

4. Now rewrite this equation as $n^2 = 2(2m^2)$. This uses more basic high-school algebra.

5. Since $2m^2$ is a natural number, it follows that $n^2$ is even using our definition of an even number. This proves our claim.

Second, we prove that the square of an odd number is odd. I’ll provide less commentary than in the previous case.

1. Let n be an odd number.

2. By definition n = 2m + 1 for some natural number m.

3. Square both sides of the equation in (2) to get $n^2 = 4m^2 + 4m + 1$.

4. Now rewrite the equation in (3) as $n^2 = 2(2m^2 + 2m) + 1$.

5. Since $2m^2 + 2m$ is a natural number, it follows that $n^2$ is odd using our definition of an odd number. This proves our claim.

We have therefore proved our two claims. I admit that they are not exciting but just bear with me.
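For reference, the algebra used in the two arguments of Proof 1 can be laid out side by side as worked equations. Nothing new is assumed here: m is the natural number supplied by the definitions of even and odd.

```latex
\begin{align*}
n = 2m     \;&\Longrightarrow\; n^2 = (2m)^2 = 4m^2 = 2\,(2m^2),\\
n = 2m + 1 \;&\Longrightarrow\; n^2 = (2m+1)^2 = 4m^2 + 4m + 1 = 2\,(2m^2 + 2m) + 1.
\end{align*}
```

In the first line $n^2$ has the form $2k$ with $k = 2m^2$, and in the second it has the form $2k + 1$ with $k = 2m^2 + 2m$, which is what the definitions of even and odd require.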

2.3.2 Proof 2

We shall prove the following statement.

If the square of a number is even then that number is even, and if the square of a number is odd then that number is odd.

In fact, this is really two statements: ‘If $n^2$ is even then n is even’ and ‘If $n^2$ is odd then n is odd’. At first reading, you might think that I am simply repeating what I proved above. But in Proof 1, I proved

‘if n is even then $n^2$ is even’

whereas now I want to prove

‘if $n^2$ is even then n is even’.

Our assumptions in each case are different and our conclusions in each case are different. It is therefore important to distinguish between A ⇒ B and B ⇒ A. The statement B ⇒ A is called the converse of the statement A ⇒ B. Experience shows that people are prone to swapping assumptions and conclusions without being aware of it.

We prove the first claim.

1. Suppose that $n^2$ is even.

2. Now it is very tempting to try and use the definition of even here, just as we did in Proof 1, and write $n^2 = 2m$ for some natural number m. But this turns out to be a dead-end. Just like playing a game such as chess, not every possible move is a good one. Choosing the right move comes with experience and sometimes just plain trial-and-error.

3. So we make a different move. We know that n is either odd or even. Our goal is to prove that it must be even.

4. Could n be odd? The answer is no because, as we showed in Proof 1, if n is odd then $n^2$ is odd, whereas we are assuming that $n^2$ is even.

5. Therefore n is not odd.

6. But a number that is not odd must be even. It follows that n is even.

We use a similar strategy to prove the second claim.

The proofs here were more subtle, and less direct, than in our first example and they employed the following important strategy: if there are two possibilities, exactly one of which is true, we rule out one of those possibilities and so deduce that the other possibility must be true.¹

Here is a concrete example. There are two politicians, Alice and Bob. One of them always lies and the other always tells the truth. Suppose you ask Bob the question: is it true that 2 + 2 = 5? If he replies ‘yes’ then you know Bob is lying. Without further ado, you can deduce that Alice is that paragon of politicians and always tells the truth.

If A ⇒ B and B ⇒ A then we say that A if, and only if, B, or A iff B, or A ⇔ B. The use of the word iff is peculiar to mathematical English. If we combine Proofs 1 and 2, we have proved the following two statements for all natural numbers n: ‘n is even if, and only if, $n^2$ is even’ and ‘n is odd if, and only if, $n^2$ is odd’.

It is important to remember that the statement ‘A if, and only if, B’ is in fact two statements in one. It means (1) ‘A implies B’ and (2) ‘B implies A’. So, to prove the statement ‘A if, and only if, B’ we have to prove TWO statements: we have to prove ‘A implies B’ and we have to prove ‘B implies A’.
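The difference between a statement and its converse, and the way the two combine into ‘if, and only if’, can be tabulated. The sketch below assumes the standard truth-table reading of ‘A implies B’ (false only when A is true and B is false), which this chapter has not formally introduced; the function names are my own.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """'a implies b' is false only when a is true and b is false."""
    return (not a) or b


def iff(a: bool, b: bool) -> bool:
    """'a if, and only if, b' holds when a and b have the same truth value."""
    return a == b


print("A      B      A=>B   B=>A   both   A<=>B")
for a, b in product([True, False], repeat=2):
    both = implies(a, b) and implies(b, a)
    row = [a, b, implies(a, b), implies(b, a), both, iff(a, b)]
    print("  ".join(f"{str(v):5}" for v in row))
```

The columns for A ⇒ B and B ⇒ A disagree on two of the four rows, so neither statement can stand in for the other, while the last two columns always agree.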

The results of this example were trickier to prove than the previous ones, but not much more exciting. However, we have now laid the foundations for a truly remarkable result.

¹ This might be called the Sherlock Holmes method. “How often have I said to you that when you have eliminated the impossible, whatever remains, however improbable, must be the truth?” The Sign of Four, 1890.


2.3.3 Proof 3

We shall now prove our first real theorem.

√2 cannot be written as an exact fraction.

If you square each of the fractions in turn

$\frac{3}{2}, \frac{7}{5}, \frac{17}{12}, \frac{41}{29}, \ldots$

you will find that you get closer and closer to 2 and so each of these numbers is an approximation to the square root of 2. This raises the question: is it possible to find a fraction $\frac{x}{y}$ whose square is exactly 2? In fact, it isn’t, but that isn’t proved just because my attempts above failed. Maybe I just haven’t looked hard enough. So, I have to prove that it is impossible. To prove that $\sqrt{2}$ is not an exact fraction, I am actually going to begin by trying to show you that it is.

1. Suppose that $\sqrt{2} = \frac{x}{y}$ where x and y are positive whole numbers (so, in particular, $y \neq 0$).

2. We may assume that $\frac{x}{y}$ is a fraction in its lowest terms so that the only natural number that divides both x and y is 1. Keep your eye on this assumption because it will come back to sting us later.

3. Square both sides of the equation in (1) to get $2 = \frac{x^2}{y^2}$.

4. Multiply both sides of the equation in (3) by $y^2$.

5. We therefore get the equation $2y^2 = x^2$.

6. Since 2 divides the left-hand side of this equation, it must divide the right-hand side. This means that $x^2$ is even.

7. We now use Proof 2 to deduce that x is even.

8. We may therefore write x = 2u for some natural number u.

9. Substitute this value for x into the equation in (5) to get $2y^2 = 4u^2$.

10. Divide both sides of the equation in (9) by 2 to get $y^2 = 2u^2$.

11. Since the right-hand side of the equation in (10) is even so is the left-hand side. Thus $y^2$ is even.

12. Since $y^2$ is even, it follows by Proof 2 that y is even.

13. If (1) is true then we are led to the following two conclusions. From (2), we have that the only natural number to divide both x and y is 1. From (7) and (12), 2 divides both x and y. This is a contradiction. Thus (1) cannot be true. Hence $\sqrt{2}$ cannot be written as an exact fraction.

This result is phenomenal. It says that no matter how much money you spend on a computer it will never be able to write out the exact value of $\sqrt{2}$ as a fraction, just a very, very good approximation. We now make a very important definition. A real number that is not rational is called irrational. We have therefore proved that $\sqrt{2}$ is irrational.
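The fractions 3/2, 7/5, 17/12 and 41/29 mentioned before the proof can be squared exactly with Python’s `fractions` module. The sketch below also searches for a fraction whose square is exactly 2; the function name `exact_root_two_exists` and the bound of 1000 are arbitrary choices of mine. A failed search is, of course, an experiment and not a proof, which is the whole point of Proof 3.

```python
from fractions import Fraction
from math import isqrt

approximations = [Fraction(3, 2), Fraction(7, 5), Fraction(17, 12), Fraction(41, 29)]
for q in approximations:
    square = q * q
    # Exact square, followed by its (approximate) distance from 2.
    print(q, square, float(square - 2))


def exact_root_two_exists(max_denominator: int) -> bool:
    """Look for whole numbers x, y with 1 <= y <= max_denominator and x*x == 2*y*y."""
    for y in range(1, max_denominator + 1):
        x = isqrt(2 * y * y)  # largest whole number whose square is <= 2*y*y
        if x * x == 2 * y * y:
            return True
    return False


print(exact_root_two_exists(1000))  # False
```

The squares are 9/4, 49/25, 289/144 and 1681/841, which land alternately just above and just below 2 without ever hitting it.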

2.3.4 Proof 4

We now prove our second real theorem.

The angles in a triangle add up to 180°.

This is a famous result that everyone knows. You might have learnt about it at school by drawing lots of triangles and measuring their angles but, as I said above, maths is not an experimental science and so this enterprise proves nothing. The proof I give is very old and occurs in Euclid’s book the Elements: specifically, Book I, Proposition 32. Draw a triangle and call its three angles α, β and γ respectively.

[Figure: a triangle with base angles α and γ and angle β at the top vertex.]

Our goal is to prove that

α + β + γ = 180°.

In fact, we shall show that the three angles add up to a straight line, which is the same thing. Draw a line through the point P, the vertex at which the angle β sits, parallel to the base of the triangle.


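[Figure: the same triangle with a line drawn through the top vertex P parallel to the base.]

Then extend the two sides of the triangle that meet at the point P as shown.

[Figure: the two sides meeting at P extended beyond P, creating the angles γ′, β′ and α′ along the parallel line.]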

As a result, we get three angles that I have called α′, β′ and γ′. I now make the following claims.

• β′ = β because the angles are opposite each other in a pair of intersecting straight lines.

• α′ = α because these two angles are formed from a straight line cutting two parallel lines.

• γ′ = γ for the same reason as above.

But since α′ and β′ and γ′ add up to give a straight line, we have proved the claim.

Now this is all well and good, but we have proved our result on the basis of three other results currently unproved:

1. That given a line l and a point P not on that line I may draw a line through the point P and parallel to l.

2. If two lines intersect, then opposite angles are equal.

3. If a line l cuts two parallel lines $l_1$ and $l_2$, the angle l makes with $l_1$ is the same as the angle it makes with $l_2$.


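How do we know they are true? Result (2) can readily be proved. We shall use the diagram below.

[Figure: two intersecting straight lines forming, in order around the point of intersection, the angles α, β, γ and δ.]

The proof that α = γ follows from the simple observation that α + β = β + γ, since each side of this equation is the angle on a straight line. This still leaves (1) and (3). I shall say more about them later when I talk about axioms.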

2.3.5 Proof 5

The most famous theorem of them all is the one attributed to Pythagoras and proved in Book I, Proposition 47 of Euclid. We begin with a right-angled triangle.

[Figure: a right-angled triangle with hypotenuse c and shorter sides a and b.]

We want to prove, of course, that

$a^2 + b^2 = c^2$.

Consider the shape below. It has been constructed from four copies of our triangle and two squares of areas $a^2$ and $b^2$, respectively. I claim that this shape is actually a square. First, the sides all have the same length a + b. Second, the angles at the corners are right angles by Proof 4.


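[Figure: a square of side a + b made up of a square of area $a^2$, a square of area $b^2$ and four copies of the triangle.]

Now look at the following picture. This is also a square with sides a + b so it has the same area as the first square. Using Proof 4, the shape in the middle really is a square with area $c^2$.

[Figure: a square of side a + b with the four copies of the triangle placed in its corners, leaving a central square of area $c^2$.]

If we subtract the four copies of the original triangle from both squares, the shapes that remain must have the same areas, and we have proved the claim.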
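The area comparison can also be written out as worked equations. This is a sketch under the assumption, read off from the two pictures, that each copy of the triangle has shorter sides a and b and therefore area $\tfrac{1}{2}ab$.

```latex
\begin{align*}
(a+b)^2 &= a^2 + b^2 + 4\cdot\tfrac{1}{2}ab && \text{(first square)},\\
(a+b)^2 &= c^2 + 4\cdot\tfrac{1}{2}ab       && \text{(second square)},\\
a^2 + b^2 &= c^2                             && \text{(subtract the four triangles from both)}.
\end{align*}
```

The first line agrees with the familiar expansion $(a+b)^2 = a^2 + 2ab + b^2$, which is a useful check that the pieces of the first picture have been counted correctly.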


Exercises 2.3

1. Raymond Smullyan is both a mathematician and a magician. Here are two of his puzzles. On an island there are two kinds of people: knights who always tell the truth and knaves who always lie. They are indistinguishable.

(a) You meet three such inhabitants A, B and C. You ask A whether he is a knight or knave. He replies so softly that you cannot make out what he said. You ask B what A said and B says ‘he said he is a knave’. At which point C interjects and says ‘that’s a lie!’. Was C a knight or a knave?

(b) You encounter three inhabitants: A, B and C.
A says ‘exactly one of us is a knave’.
B says ‘exactly two of us are knaves’.
C says ‘all of us are knaves’.
What type is each?

2. There are five houses in a row, from left to right, each of which is painted a different colour. Their inhabitants are called W, C, O, S and M, but not necessarily in that order; they own different pets, drink different drinks and drive different cars.

(a) There are five houses.

(b) W lives in the red house.

(c) C owns the dog.

(d) Coffee is drunk in the green house.

(e) O drinks tea.

(f) The green house is immediately to the right (that is: your right) of the ivory house.

(g) The Oldsmobile driver owns snails.

(h) The Bentley owner lives in the yellow house.

(i) Milk is drunk in the middle house.

(j) S lives in the first house.


(k) The person who drives the Chevy lives in the house next to the man with the fox.

(l) The Bentley owner lives in a house next to the house where the horse is kept.

(m) The Lotus owner drinks orange juice.

(n) M drives the Porsche.

(o) S lives next to the blue house.

There are two questions: who drinks water and who owns the aardvark?

3. Prove that the sum of any two even numbers is even, that the sum of any two odd numbers is even, and that the sum of an odd and an even number is odd.

4. Prove that the sum of the interior angles in any quadrilateral is equal to 360°.

5.

(a) A rectangular box has sides of length 2, 3 and 7 units. What is the length of the longest diagonal?

(b) I draw a square. Without measuring any lengths, you now have to construct a square that has exactly twice the area.

(c) A right-angled triangle has sides with lengths x, y and hypotenuse z. Prove that if the area of the triangle is $\frac{z^2}{4}$ then the triangle is isosceles.

6.

(a) Prove that the last digit in the square of a positive whole number must be one of 0, 1, 4, 5, 6, or 9. Is the converse true?

(b) Prove that a natural number is even if, and only if, its last digit is even.

(c) Prove that a natural number is exactly divisible by 9 if, and only if, the sum of its digits is divisible by 9.

7. Prove that $\sqrt{3}$ cannot be written as an exact fraction.


8. The goal of this question is to prove Ptolemy’s theorem.² This deals with cyclic quadrilaterals, that is those quadrilaterals whose vertices lie on a circle. With reference to the diagram below,

[Figure: a cyclic quadrilateral with vertices A, B, C, D, sides of lengths a, b, c, d and diagonals of lengths x and y.]

this theorem states that

xy = ac + bd.

Hint. Show that on the line BD there is a point X such that the angle XAD is equal to the angle BAC. Deduce that the triangles AXD and ABC are similar, and that the triangles AXB and ACD are similar. Let the distance between D and X be e. Show that

$\frac{e}{a} = \frac{c}{x}$ and that $\frac{y - e}{d} = \frac{b}{x}$.

From this, the result follows by simple algebra. To help you show that the triangles are similar, you will need to use Proposition III.21 from Euclid which is illustrated by the following diagram.

[Figure: illustration of Euclid III.21, which states that angles in the same segment of a circle are equal.]

² Claudius Ptolemaeus was a Greek mathematician and astronomer who flourished around 150 CE in the city of Alexandria.


9. The goal of this question is to find all Pythagorean triples. That is, triples of natural numbers (a, b, c) such that $a^2 + b^2 = c^2$. We shall do this using geometry by finding all the rational points on the unit circle. We shall use the diagram below.

[Figure: the unit circle centred at the origin, with the point A = (−1, 0) joined by a straight line to another point P on the circle.]

We have drawn a unit circle centred at the origin. From the point (−1, 0), called A, we draw a line to any other point P on the circle.

(a) Show that any non-vertical line passing through the point A has the equation y = t(x + 1) where t is a real number.

(b) Show that this line intersects the circle at some point P on the circle, different from A, when

$(x, y) = \left( \frac{1 - t^2}{1 + t^2}, \frac{2t}{1 + t^2} \right)$.

(c) Deduce that the rational points on the circle correspond to the values of t which are rational.


(d) Put $t = \frac{p}{q}$, in its lowest terms. Deduce that all Pythagorean triples are obtained as the following

$(r(q^2 - p^2), 2pqr, r(p^2 + q^2))$

where p, q, r are any integers.
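To get a feel for the formula in part (d) before proving anything, it can be evaluated for a few small values of p and q. The Python sketch below uses a hypothetical helper `triple`; it only checks individual cases and does not show that every Pythagorean triple arises in this way.

```python
def triple(p: int, q: int, r: int = 1) -> tuple[int, int, int]:
    """The triple (r(q^2 - p^2), 2pqr, r(p^2 + q^2)) from part (d)."""
    return (r * (q * q - p * p), 2 * p * q * r, r * (p * p + q * q))


for p, q in [(1, 2), (2, 3), (1, 4)]:
    a, b, c = triple(p, q)
    # The final column reports whether a^2 + b^2 == c^2 for this case.
    print((p, q), (a, b, c), a * a + b * b == c * c)
```

With r = 1 the pairs (1, 2), (2, 3) and (1, 4) give the triples (3, 4, 5), (5, 12, 13) and (15, 8, 17).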

10. Take any positive natural number n; so n = 1, 2, 3, . . . If n is even, divide it by 2 to get $\frac{n}{2}$; if n is odd, multiply it by 3 and add 1 to obtain 3n + 1. Now repeat this process and stop only if you get 1. For example, if n = 6 you get 6, 3, 10, 5, 16, 8, 4, 2, 1. What happens if n = 11? What about n = 27? Prove that no matter what number you start with, you will always eventually reach 1.
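The process in this question is easy to run by machine, which helps with the cases n = 11 and n = 27. The sketch below uses a hypothetical function `trajectory`; the cap `max_steps` is a precaution of mine, because running the process does not by itself show that it stops for every starting number.

```python
def trajectory(n: int, max_steps: int = 10_000) -> list[int]:
    """Apply 'halve if even, else 3n + 1' repeatedly, stopping at 1."""
    if n < 1:
        raise ValueError("n must be a positive natural number")
    seq = [n]
    while n != 1:
        if len(seq) > max_steps:
            raise RuntimeError("step cap reached without arriving at 1")
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        seq.append(n)
    return seq


print(trajectory(6))            # [6, 3, 10, 5, 16, 8, 4, 2, 1]
print(trajectory(11))
print(len(trajectory(27)) - 1)  # number of steps taken from n = 27
```

Trying many starting values is an experiment in the sense of Section 2.1, so it can suggest what is true but cannot replace the proof the question asks for.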

2.4 Axioms

At this point, I need to confront some potential problems with the idea of proof I have been developing. Once this is done, I will then be able to complete Proof 4. Suppose I am trying to prove the statement S. Then I am done if I can find a theorem $S_1$ so that $S_1 \Rightarrow S$. But this raises the question of how I know that $S_1$ is a theorem. This can only be because I can find a theorem $S_2$ such that $S_2 \Rightarrow S_1$. There are now three possibilities:

1. At some point I find a theorem $S_n$ such that $S \Rightarrow S_n$. This is clearly a bad thing. In trying to prove S I have in fact used S and so haven’t proved anything at all. This is an example of circular reasoning and has to be avoided. I can do this by organizing what I know in a hierarchy: to prove a result, I am only allowed to use those theorems already proved. In this way, I can avoid going around in circles.

2. Assuming I have avoided the above pitfall, the next nasty possibility is that I get an infinite sequence of implications:

$\cdots \Rightarrow S_n \Rightarrow S_{n-1} \Rightarrow \cdots \Rightarrow S_1 \Rightarrow S.$

I never actually know that S is a theorem because it is always proved in terms of something else without end. This is also clearly a bad thing. I establish relative truth, a statement is true if another is true, but not absolute truth. I clearly don’t want this to happen. But if not, then I am led inexorably to the third possibility.

3. To prove S, I have to prove only a finite number of implications

$S_n \Rightarrow S_{n-1} \Rightarrow \cdots \Rightarrow S_1 \Rightarrow S.$

But, if $S_n$ is supposed to be a theorem then how do I know it is true if not in terms of something else, contradicting the assumption that this was supposed to be a complete argument?
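The hierarchy mentioned in possibility (1) can be pictured as a dependency graph. The sketch below records which results in this chapter lean on which others (Proof 2 uses Proof 1, Proof 3 uses Proof 2, Proof 4 rests on its claims (1), (2) and (3), and Proof 5 uses Proof 4) and asks Python’s standard `graphlib` module for an order in which they can be proved. The labels are my own shorthand, and the circular variant is a deliberately invented mistake.

```python
from graphlib import CycleError, TopologicalSorter

# Each result maps to the set of results that its proof relies on.
depends_on = {
    "Proof 2": {"Proof 1"},
    "Proof 3": {"Proof 2"},
    "Proof 4": {"claim (1)", "claim (2)", "claim (3)"},
    "Proof 5": {"Proof 4"},
}

# A valid order lists every result after everything it depends on.
order = list(TopologicalSorter(depends_on).static_order())
print(order)

# Hypothetical mistake: make Proof 1 depend on Proof 3, closing a circle.
circular = dict(depends_on)
circular["Proof 1"] = {"Proof 3"}
try:
    list(TopologicalSorter(circular).static_order())
    is_circular = False
except CycleError:
    is_circular = True
print(is_circular)  # True
```

A valid order exists exactly when there is no circle of dependencies, which is the hierarchy condition in a mechanical form.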

I shall now delve into case (3) above in more detail, since resolving it will lead to an important insight. Maths is supposed to be about proving theorems but the analysis above has led us to the uncomfortable possibility that some things have to be accepted as true ‘because they are’ which contradicts what I went to great trouble to rubbish earlier. Before I explain the way out of this conundrum, let me first consider an example from an apparently completely different enterprise: playing a game.

To be concrete, let’s take the game of chess. Most people have learnt chess at some point even if, like me, you are not very good at it. This game consists of a board and some pieces. The pieces are of different types (kings, queens, knights, bishops, castles, pawns), each of which can be moved in different ways. To play chess means to accept the rules of chess and to move the pieces in accordance with the rules. Whether one player wins or there is a draw is also described by the rules of chess. It’s meaningless to ask whether the rules of chess are true. But a move in chess is valid, which is another way of saying true, if it is made according to those rules. This example provides a way of understanding how maths works.

Maths should be viewed as a collection of different mathematical domains each described by its own ‘rules of the game’ which in maths are termed axioms. These axioms are the basic assumptions on which the theory is built and are the building blocks of all proofs within that mathematical domain. Our goal is to prove interesting theorems from those axioms.

As an example, consider Euclidean geometry. The Greeks attributed the discovery of geometry to the Ancient Egyptians who needed it in recalculating land boundaries for the purposes of tax assessment after the yearly flood of the Nile. Thus geometry probably first existed as a collection of geometrical methods that worked: the tax was calculated, the pyramids built and everyone was happy. But it was the Ancient Greeks themselves who elevated it into a mathematical science and a model of what could be achieved in mathematics. Euclid’s book the Elements codified what was known about geometry into a handful of axioms and then showed that all of geometry could be deduced from those axioms by the use of mathematical proof. The Elements is not only the single most important mathematics book ever written but one of the most important books, full stop. Here is a list of the key axioms.

1. Two distinct points determine a unique straight line.

2. A line segment can be extended infinitely in either direction.

3. Circles can be drawn with any centre and any radius.

4. Any two right angles are equal to each other.

5. Suppose that a straight line cuts two lines l_1 and l_2. If the interior angles on the same side add up to strictly less than 180°, then if l_1 and l_2 are extended on that side they will eventually meet.

The last axiom needs a picture to illustrate what is going on.

[Figure: a straight line cutting the two lines l_1 and l_2.]

In principle, all of the results you learnt in school about triangles and circles can be proved from these axioms. I say ‘in principle’ since there were a few bugs which were later fixed by a number of mathematicians, most notably David Hilbert. But this shouldn’t detract from what an enormous achievement Euclid’s book was and is. We may now finish off Proof 4: claim (1) is proved in Book I, Proposition 31, and claim (3) is proved in Book I, Proposition 29.

One way of teaching maths at university would therefore be to start with a list of axioms and start proving things. But this approach has a number of disadvantages: it is time-consuming, laborious, sometimes, even, a bit tedious, and takes a very, very long time to reach the really interesting theorems. Therefore, in this book, I shall usually base each topic on quite high-level axioms so that we can get to the interesting theorems quickly, but I shall also give pointers to readers who want to see the full axiomatic development.

Exercises 2.4

1. Hofstadter’s MU-puzzle. A string is just an ordered sequence of symbols. In this puzzle, you will construct strings using the letters M, I, U. You are given the string MI which is your only axiom. You can make new strings only by using the following rules any number of times in succession in any order:

(I) If you have a string that ends in I then you can add a U on at the end.

(II) If you have a string Mx where x is a string then you may form Mxx.

(III) If III occurs in a string then you may make a new string with III replaced by U.

(IV) If UU occurs in a string then you may erase it.

I shall write x → y to mean that y is the string obtained from the string x by applying one of the above four rules. Here are some examples:

• By rule (I), MI → MIU.

• By rule (II), MIU → MIUIU.

• By rule (III), UMIIIMU → UMUMU.

• By rule (IV), MUUUII → MUII.

The question is: can you make MU?
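The four rules are mechanical enough to be tried out by machine. The following is a minimal Python sketch and not part of the original puzzle: the function names `successors` and `reachable`, and the length bound `max_len`, are my own assumptions. It lists every string obtainable by one rule application and then explores outwards from the axiom MI. A bounded search of this kind can suggest an answer to the question, but it cannot prove one.

```python
from collections import deque

def successors(s):
    """Return every string obtainable from s by one use of rules (I)-(IV)."""
    out = set()
    if s.endswith("I"):                  # rule (I): add a U at the end
        out.add(s + "U")
    if s.startswith("M"):                # rule (II): Mx -> Mxx
        out.add(s + s[1:])
    for i in range(len(s) - 2):          # rule (III): replace an occurrence of III by U
        if s[i:i + 3] == "III":
            out.add(s[:i] + "U" + s[i + 3:])
    for i in range(len(s) - 1):          # rule (IV): erase an occurrence of UU
        if s[i:i + 2] == "UU":
            out.add(s[:i] + s[i + 2:])
    return out

def reachable(axiom="MI", max_len=10):
    """Breadth-first search over all strings of length <= max_len (an assumed cut-off)."""
    seen = {axiom}
    queue = deque([axiom])
    while queue:
        s = queue.popleft()
        for t in successors(s):
            if len(t) <= max_len and t not in seen:
                seen.add(t)
                queue.append(t)
    return seen

if __name__ == "__main__":
    found = reachable(max_len=10)
    print(len(found), "strings found; MU among them:", "MU" in found)
```

The four worked examples above can be used to check the sketch: each right-hand string should appear in `successors` of the corresponding left-hand string.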


2.5 Mathematics and the real world

Euclidean geometry appears to be about the real world. In fact, for thousands of years this was what mathematicians believed until they discovered other geometries with different properties. On the surface of a sphere, for example, the sum of the angles in a spherical triangle will actually be bigger than 180°, the exact amount being determined by the area of the triangle. This result played an important role in surveying. But our analysis above leads us to the following conclusion:

Mathematics is about logically consistent mathematical universes.

A mathematical truth is therefore something proved in one of those mathematical universes, and is not a truth about ‘out there’. Despite this, mathematical truths do help us to understand the actual physical universe we inhabit. For example, does the geometry of the universe follow the rules of Euclidean geometry? Here is what NASA says on the basis of the Wilkinson Microwave Anisotropy Probe (WMAP):

“WMAP also confirms the predictions that the amplitude of the variations in the density of the universe on big scales should be slightly larger than smaller scales, and that the universe should obey the rules of Euclidean geometry so the sum of the interior angles of a triangle add to 180 degrees.”

http://map.gsfc.nasa.gov/news/index.html

2.6 Proving something false

‘Proving a statement true’ and ‘proving a statement false’ sound similar but it turns out that ‘proving a statement false’ requires a lot less work than ‘proving a statement true’. There is an asymmetry between them. To prove a statement false all you need do is find a counterexample. Here is an example. Consider the following statement: every odd number bigger than 1 is a prime. This is false. The reason is that 9 is odd, bigger than 1, and not prime. Thus 9 is a counterexample. The number 9 here can be regarded as a witness that shows the claim to be false. To prove a statement true, you have to work hard. To prove a statement false, you only have to find one counterexample and you are done. (Though in research mathematics finding a counterexample can be a Herculean task.)
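Hunting for a witness is something a computer does well. Here is a small Python sketch of the search for the statement above; the helper names `is_prime` and `first_counterexample`, and the search bound `limit`, are assumptions of mine rather than anything from the text. It simply tests the odd numbers 3, 5, 7, . . . in turn and stops at the first one that is not prime.

```python
def is_prime(n):
    """Trial division: adequate for the small numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def first_counterexample(limit=100):
    """Smallest odd number n with 1 < n < limit that is not prime, or None."""
    for n in range(3, limit, 2):
        if not is_prime(n):
            return n
    return None

if __name__ == "__main__":
    print(first_counterexample())
```

Note the asymmetry again: finding one witness settles the matter, whereas a search that finds nothing below `limit` would prove nothing at all.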

2.7 Key points

• One of the goals of this book is to introduce you to proofs. This does not mean that you will afterwards be able to do proofs. That takes time and practice.

• Initially, you should aim to understand proofs. This means seeing why a proof is true. A good test of whether you really understand a proof is whether you can explain it to someone else. It is much easier to check that a proof is correct than it is to invent the proof in the first place. Nevertheless, be warned, it can also take a long time just to understand a proof.

• I shall ask you to find some proofs for yourself. But do not expect to find them in a few minutes. Constructing proofs takes time, trial and error and, yes, luck.

• If you don’t understand the words used in a statement that you are asked to prove then you are not going to be able to prove that statement. Definitions are vitally important in mathematics.

• Every statement that you make in a proof must be justified: if it is a definition, say that it is a definition; if it is a result known to be true, that is a theorem, say that it is known to be true; if it is one of the assumptions, say that it is one of the assumptions; if it is an axiom, say that it is an axiom.

• When starting out, it is probably best to write each statement of a proof on a separate line followed by its justification.

Finally, there are one or two pieces of terminology and notation that are worth mentioning here. The conclusion of a proof is marked using the symbol □. This replaces the older use of QED. If we believe something might be true but there isn’t yet a proof we say that it is a conjecture. The things we can prove fall, roughly, into the following categories: a theorem is a major result, worthy of note; a proposition is a result, and a lemma is an auxiliary result, a tool, useful in many different places; a corollary is a result we can deduce with little or no effort from a proposition or theorem.

2.8 Mathematical creativity

Everything I have said above is true, but does need to be placed in perspective. Where do proofs come from? More to the point, where do theorems come from? Music is a useful analogy. You can learn how to write music down, but that doesn’t make you a musician. In fact, there are some talented musicians who cannot even read music. Proofs keep us honest and ground what we are doing, but what makes maths fun is that it is creative, and for creativity there are no rules. For example, in dreaming up a theorem, experimentation may well play a role. Sometimes a theorem may evolve in tandem with a proof, at other times the theorem, or more accurately, the conjecture comes first and then there is the struggle to prove it, which may take place over many generations and centuries.

2.9 Set theory: the language of mathematics

Everyday English is good at everyday jobs, but can be hopelessly imprecise where accuracy is important. To get around this, special varieties of English, little dialects, have been constructed for particular purposes. In mathematics, we use precise versions of everyday language augmented with special symbols. Part of this special language is that of set theory, invented by Georg Cantor (1845–1918) in the last quarter of the nineteenth century. This section is mainly a phrasebook of the most important terms we shall need for most of this book. I shall develop this language further when I need to when studying combinatorics.

The starting point of set theory is the following pair of deceptively simple definitions:

• A set is a collection of objects which we wish to regard as a whole. The members of a set are called its elements.[3]

• Two sets are equal precisely when they have the same elements.

[3] Strictly speaking this definition is nonsense. Why?


We often use capital letters to name sets, such as A, B, or C, or fancy capital letters such as N and Z. The elements of a set are usually denoted by lower case letters. If x is an element of the set A then we write

x ∈ A

and if x is not an element of the set A then we write

x ∉ A.

A set should be regarded as a bag of elements, and so the order of the elements within the set is not important. In addition, repetition of elements is ignored.[4]

Examples 2.9.1.

1. The following sets are all equal: {a, b}, {b, a}, {a, a, b}, {a, a, a, a, b, b, b, a} because the order of the elements within a set is not important and any repetitions are ignored. Despite this it is usual to write sets without repetitions to avoid confusion. We have that a ∈ {a, b} and b ∈ {a, b} but α ∉ {a, b}.

2. The set {} is empty and is called the empty set. It is given a special symbol ∅, which is taken from Danish and is the first letter of the Danish word meaning ‘empty’. Remember that ∅ means the same thing as {}. Take careful note that ∅ ≠ {∅}. The reason is that the empty set contains no elements whereas the set {∅} contains one element. By the way, the symbol for the empty set is different from the Greek letter phi: φ or Φ.
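Both examples can be tried out in Python, whose built-in sets also ignore order and repetition. This is only an illustrative sketch: the variable names are my own, and `frozenset` is used because Python requires the elements of a set to be immutable, which is a feature of the language and not of set theory.

```python
# Order and repetition are ignored, so these three sets are equal.
s1 = {"a", "b"}
s2 = {"b", "a"}
s3 = {"a", "a", "a", "a", "b", "b", "b", "a"}
all_equal = (s1 == s2 == s3)

# Membership: 'a' and 'b' belong to the set, the Greek letter alpha does not.
alpha_is_member = "α" in s1

# The empty set versus the set whose only element is the empty set.
empty = frozenset()
set_of_empty = frozenset({empty})
sizes = (len(empty), len(set_of_empty))   # zero elements versus one element

if __name__ == "__main__":
    print(all_equal, alpha_is_member, sizes, empty == set_of_empty)
```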

The number of elements in a set is called its cardinality. If X is a set then |X| denotes its cardinality. A set is finite if it only has a finite number of elements, otherwise it is infinite. If a set has only finitely many elements then we might be able to list them if there aren’t too many: this is done by putting them in ‘curly brackets’ { and }. We can sometimes define infinite sets by using curly brackets but then, because we can’t list all elements in an infinite set, we use ‘. . .’ to mean ‘and so on in the obvious way’. This can also be used to define finite sets where there is an obvious pattern. Often, we describe a set by saying what properties an element must have to belong to the set. Thus

{x : P(x)}

means ‘the set of all things x which satisfy the condition P’. Here are some examples of sets defined in various ways.

[4] If you want to take account of repetitions you have to use multisets.

Examples 2.9.2.

1. D = {Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday}, the set of the days of the week. This is a small finite set and so we can conveniently list its elements.

2. M = {January, February, March, . . . , November, December}, the set of the months of the year. This is a finite set but I didn’t want to write down all the elements so I wrote ‘. . .’ to indicate that there were other elements of the set which I was too lazy to write down explicitly but which are, nevertheless, there.

3. A = {x : x is a prime number}. I define a set by describing the properties that the elements of the set must have. Here P(x) is the statement ‘x is a prime number’ and those natural numbers x are admitted membership to the set when they are indeed prime.
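The notation {x : P(x)} has a close relative in Python’s set comprehensions. The sketch below is only an approximation to Example 3: a computer cannot hold the infinite set of all primes, so I restrict x to the numbers below an assumed cut-off `bound`. The names `P`, `bound` and `A_below_bound` are mine.

```python
def P(x):
    """The condition 'x is a prime number', tested by trial division."""
    return x >= 2 and all(x % d != 0 for d in range(2, int(x ** 0.5) + 1))

bound = 30  # assumed cut-off, since the full set of primes is infinite

# Reads almost like the mathematics: the set of x below bound such that P(x).
A_below_bound = {x for x in range(bound) if P(x)}

if __name__ == "__main__":
    print(sorted(A_below_bound))
```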

In this book, the following sets of numbers will play a special role. We shall use this notation throughout and so it is worthwhile getting used to it.

Examples 2.9.3.

1. The set N = {0, 1, 2, 3, . . .} of all natural numbers.

2. The set Z = {. . . , −3, −2, −1, 0, 1, 2, 3, . . .} of all integers. The reason Z is used to designate this set is because ‘Z’ is the first letter of the word ‘Zahl’, the German for number.

3. The set Q of all rational numbers i.e. those numbers that can be written as fractions whether positive or negative.

4. The set R of all real numbers i.e. all numbers which can be represented by decimals with potentially infinitely many digits after the decimal point.

5. The set C of all complex numbers, which I shall introduce from scratch later on.

Given a set A, a new set B can be formed by choosing elements from A to put in B. We say that B is a subset of A, which is written B ⊆ A. In mathematics, the word ‘choose’ also includes the possibility of choosing nothing and the possibility of choosing everything. In addition, there doesn’t have to be any rhyme or reason to your choices: you can pick elements ‘at random’ if you want. If B ⊆ A and A ≠ B then we say that B is a proper subset of A.

Examples 2.9.4.

1. ∅ ⊆ A for every set A, where we choose no elements from A. It is a very common mistake to forget the empty set when listing subsets of a set.

2. A ⊆ A for every set A, where we choose all the elements from A. It is a very common mistake to forget the set itself when listing subsets of a set.

3. N ⊆ Z ⊆ Q ⊆ R ⊆ C.

4. E, the set of even natural numbers, is a subset of N.

5. O, the set of odd natural numbers, is a subset of N.

6. P = {2, 3, 5, 7, 11, 13, 17, 19, 23, . . .}, the set of primes, is a subset of N.

7. A = {x : x ∈ R and x^2 = 4} which is just the set {−2, 2}.

There is a particular kind of subset that will be convenient to define now. If A and B are sets we define the set A \ B to consist of those elements of A that are not in B. Thus, in particular, A \ B ⊆ A. The operation is called relative complement. For example, N \ E = O. The set R \ Q is precisely the set of irrational numbers.
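Subsets and relative complements can be explored with Python’s set operators, where `<=` plays the role of ⊆, `<` of proper subset, and `-` of the relative complement. Since N is infinite, the sketch below uses the finite stand-in {0, 1, . . . , 19}; that truncation, and all the variable names, are assumptions made purely for illustration.

```python
N_small = set(range(20))                      # finite stand-in for N
E = {n for n in N_small if n % 2 == 0}        # even numbers in the stand-in
O = {n for n in N_small if n % 2 == 1}        # odd numbers in the stand-in

checks = {
    "empty set is a subset":    set() <= N_small,
    "set is a subset of itself": N_small <= N_small,
    "E is a proper subset":     E < N_small,
    "N minus E equals O":       N_small - E == O,
    "A minus B lies inside A":  (N_small - E) <= N_small,
}

if __name__ == "__main__":
    for name, ok in checks.items():
        print(name, ok)
```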

When set theory is first encountered it doesn’t look very impressive. What could you possibly do with these very simple, if not simple-minded, definitions? In fact, all of mathematics can be developed using set theory. I am going to finish off this section with a first glimpse at the power of set theory.

Consider the set {a, b}. I have explained above that order doesn’t matter and so this is the same set as {b, a}. But there are many occasions where we do want order to matter. For example, in the Olympics it is important to know who came first and second in the 100m sprint, not merely that the first two over the finishing line were X and Y in alphabetical order. So we need a new notion where order does matter. It is called an ordered pair and is written (a, b), where a is called the first component and b is called the second component. The key feature of this new object is that (a, b) = (c, d) if, and only if, a = c and b = d. So, order matters. For example, the ordered pair (1, 2) is different from the ordered pair (2, 1). Furthermore, (1, 1) does not mean the same as 1 on its own. The idea of an ordered pair is a familiar one from co-ordinate geometry. We use ordered pairs of real numbers (x, y) to specify points in the plane. At first blush, set theory seems inadequate to define ordered pairs. But in fact it can. I have put the details in a box and you don’t need to read them to understand the rest of the book.

Ordered Pairs

I am going to show you how sets, which don’t encode order directly, can nevertheless be used to define ordered pairs. It is an idea due to Kuratowski (1896–1980). Define

(a, b) = {{a}, {a, b}}.

We have to prove, using only this definition, that we have (a, b) = (c, d) if, and only if, a = c and b = d. The proof is essentially an exercise in special cases. I shall prove the hard direction. Suppose that

{{a}, {a, b}} = {{c}, {c, d}}.

Since {a} is an element of the lefthand side it must be an element of the righthand side. So {a} ∈ {{c}, {c, d}}. There are now two possibilities. Either {a} = {c} or {a} = {c, d}. The first case gives us that a = c, and the second case gives us that a = c = d. Since {a, b} is an element of the lefthand side it must be an element of the righthand side. So {a, b} ∈ {{c}, {c, d}}. There are again two possibilities. Either {a, b} = {c} or {a, b} = {c, d}. The first case gives us that a = b = c, and the second case gives us that (a = c and b = d) or (a = d and b = c). We therefore have the following possibilities:

• a = b = c. But then {{a}, {a, b}} = {{a}}. It follows that c = d and so a = b = c = d and, in particular, a = c and b = d.

• a = c and b = d.

• In all remaining cases, a = b = c = d and so, in particular, a = c and b = d.
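Kuratowski’s definition can be tested on a small scale. The Python sketch below is an experiment, not a proof: it builds {{a}, {a, b}} out of `frozenset`s (needed because Python sets can only contain immutable elements) and checks the key property for every choice of a, b, c, d from an assumed three-element universe. The names `pair` and `universe` are mine.

```python
from itertools import product

def pair(a, b):
    """Kuratowski's ordered pair (a, b) = {{a}, {a, b}}, built from frozensets."""
    return frozenset({frozenset({a}), frozenset({a, b})})

universe = range(3)  # assumed small universe; enough to meet every special case

# (a, b) = (c, d) should hold exactly when a = c and b = d.
property_holds = all(
    (pair(a, b) == pair(c, d)) == (a == c and b == d)
    for a, b, c, d in product(universe, repeat=4)
)

if __name__ == "__main__":
    print(property_holds, pair(1, 2) == pair(2, 1), len(pair(1, 1)))
```

The last printed value illustrates the first case of the proof: when a = b the pair collapses to the one-element set {{a}}.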

We can now build sets of ordered pairs. Let A and B be sets. Define A × B, the product of A and B, to be the set

A × B = {(a, b) : a ∈ A and b ∈ B}.

Example 2.9.5. Let A = {1, 2, 3} and let B = {a, b}. Then

A × B = {(1, a), (1, b), (2, a), (2, b), (3, a), (3, b)}

and

B × A = {(a, 1), (b, 1), (a, 2), (b, 2), (a, 3), (b, 3)}.

So, in particular, A × B ≠ B × A, in general.
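The products in Example 2.9.5 can be generated with `itertools.product` from Python’s standard library, with Python tuples standing in for ordered pairs. The variable names below are my own choices.

```python
from itertools import product

A = {1, 2, 3}
B = {"a", "b"}

AxB = set(product(A, B))   # all pairs (x, y) with x in A and y in B
BxA = set(product(B, A))   # all pairs (y, x) with y in B and x in A

if __name__ == "__main__":
    print(sorted(AxB))
    print(sorted(BxA))
    print(AxB == BxA)
```

Both products have the same number of elements, but they are different sets because the components of each pair appear in the opposite order.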

If A = B it is natural to abbreviate A × A as A^2. This now agrees with the notation R^2 which is the set of all ordered pairs of real numbers and, geometrically, can be regarded as the real plane.

We have defined ordered pairs but there is no reason to stop with just pairs. We may also define ordered triples. This can be done by defining

(x, y, z) = ((x, y), z).

The key property of ordered triples is that if (a, b, c) = (d, e, f) then a = d, b = e and c = f. Given three sets A, B and C we may define their product A × B × C to be the set of all ordered triples (a, b, c) where a ∈ A, b ∈ B and c ∈ C. A good example of an ordered triple in everyday life is a date that consists of a day, a month and a year. Thus the 16th June 1904 is really an ordered triple (16, June, 1904) where we specify day, month and year in that order. If A = B = C then we write A^3 rather than A × A × A. Thus the set R^3 consists of all Cartesian co-ordinates (x, y, z). In general, we may define ordered n-tuples, which look like this (x_1, . . . , x_n), and products of n sets A_1 × . . . × A_n. And if A_1 = . . . = A_n then we write A^n for their n-fold product.


Russell’s Paradox

There is more to sets than meets the eye. I shall now describe a famous result in the history of mathematics called Russell’s Paradox. Define the following

R = {x : x ∉ x},

in other words: the set of all sets that do not contain themselves as an element. For example, ∅ ∈ R. We now ask the question: is R ∈ R? Before resolving this question, let’s back off a bit and ask what it means for X ∈ R. From the entry requirements, we would have to show that X ∉ X. Putting X = R we deduce that R ∈ R is true only if R ∉ R. Since this is an evident contradiction, we are inclined to deduce that R ∉ R. However, if R ∉ R then in fact R satisfies the entry requirements to be an element of R and so R ∈ R. Thus exactly one of R ∈ R and R ∉ R must be true but assuming one is true implies the other is true. We therefore have an honest-to-goodness contradiction. Our only way out is to conclude that, whatever R might be, it is not a set. But this in turn contradicts my definition of a set as a collection of objects since R is a collection of objects. If you want to understand how to escape this predicament, you will have to study set theory. Disconcerting as this might be to you, imagine how much more so it was to the mathematician Gottlob Frege (1848–1925). He was working on a book which based the development of maths on sets when he received a letter from Russell describing this paradox and undermining what Frege was attempting to achieve.

Bertrand Russell himself was an Anglo-Welsh philosopher born in 1872, when Queen Victoria still had another thirty years on the throne as ‘Queen empress’, and who died in 1970 a few months after Neil Armstrong stepped onto the moon. As a young man he made important contributions to the foundations of mathematics but in the course of his extraordinary life he found time to stand for parliament, encourage the philosopher Ludwig Wittgenstein, receive two prison sentences, win the Nobel prize for literature, serve as the first president of CND, and campaign against the Vietnam war. See Russell: a very short introduction by A. C. Grayling published by OUP, 2002, for a very short introduction.

I shall conclude this section by touching on a fundamental notion of mathematics: that of a function. I shall approach it by first defining something more general.

Let A and B be any sets. By definition a subset X ⊆ A × B is called a relation from A to B. To motivate this definition, and new terminology, I shall consider an example.

Example 2.9.6. Let A be the set {A(dam), B(eth), C(ate), D(ave)} of people. Let B be the set {a(pples), b(ananas), o(ranges)} of fruit. Define X to be the following set of ordered pairs

{(A, a), (A, o), (B, b), (D, a), (D, b), (D, o)}

which tells us who likes which fruit. Thus, for example, Adam likes apples and oranges (but not bananas) and Cate doesn’t like any of the fruit on offer. It is pretty irresistible to represent this information by means of a directed graph, such as the one below. Clearly, such graphs can be drawn to represent any relation. The term ‘relation’ is now explained by the fact that X tells us how the elements of A are related to the elements of B. In this case, the relation is ‘likes to eat’.

[Figure: directed graph of the relation X, with arrows from the people A, B, C, D to the fruit a, b, o.]

Let X be a relation from A to B. We say that X is a function if it satisfies two additional conditions: first, for each a ∈ A there is at least one b ∈ B such that (a, b) ∈ X; second, if (a, b), (a, c) ∈ X then b = c. If we think back to the graph in our example above, then the first condition says that every element in A is at the base of an arrow, and the second condition says that no element in A is ever at the base of two, or more, arrows. Slightly different notation is used when dealing with functions. Rather than thinking of ordered pairs, we think instead of inputs and outputs. Thus a function from A to B is determined when for each a ∈ A there is associated exactly one element b ∈ B. We think of a as the input and b as the corresponding, uniquely determined, output. If we denote our function by f then we write b = f(a) or a ↦ b. Thus the corresponding relation is the set of all ordered pairs (a, f(a)) where a ∈ A. We call the set A the domain of the function and the set B the codomain of the function. We write f : A → B or A \xrightarrow{f} B.

Example 2.9.7. Here is an example of a function f : A → B. Let A be the set of all students in the lecture theatre at this time. Let B be the set of natural numbers. Then f is defined when for each student a ∈ A we associate their age f(a). We can see why this is precisely a function and not merely a relation. First, everyone has an age and, assuming they don’t lie, they have exactly one age. On the other hand, if we kept A as it is and let B be the set of nationalities then we will no longer have a function in general. Some people might be stateless, but even if we include that as a possibility in the set B, we still won’t necessarily have a function since some people own more than one passport.
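The two conditions that turn a relation into a function are easy to test mechanically. In the Python sketch below a relation is simply a set of tuples. The relation `likes` is the set X of Example 2.9.6, while the ages in `ages` are invented purely for illustration; the function names `is_relation` and `is_function` are likewise my own.

```python
people = {"A", "B", "C", "D"}
fruit = {"a", "b", "o"}

# The relation X of Example 2.9.6: who likes which fruit.
likes = {("A", "a"), ("A", "o"), ("B", "b"), ("D", "a"), ("D", "b"), ("D", "o")}

# Hypothetical ages, made up for this sketch only.
ages = {("A", 19), ("B", 18), ("C", 20), ("D", 19)}

def is_relation(X, A, B):
    """X is a relation from A to B if it is a subset of A x B."""
    return all(a in A and b in B for (a, b) in X)

def is_function(X, A, B):
    """Each a in A must be related to exactly one b in B."""
    if not is_relation(X, A, B):
        return False
    return all(len({b for (x, b) in X if x == a}) == 1 for a in A)

if __name__ == "__main__":
    print(is_function(likes, people, fruit))            # Cate has no arrow, Adam has two
    print(is_function(ages, people, set(range(150))))   # everyone has exactly one age
```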

Exercises 2.9

1. Let A = {♣, ♦, ♥, ♠}, B = {♠, ♦, ♣, ♥} and C = {♠, ♦, ♣, ♥, ♣, ♦, ♥, ♠}. Is it true or false that A = B and B = C? Explain.

2. Let X = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. Write down the following subsets of X:

(a) The subset A of even elements of X.

(b) The subset B of odd elements of X.

(c) C = {x : x ∈ X and x ≥ 6}.

(d) D = {x : x ∈ X and x > 10}.

(e) E = {x : x ∈ X and x is prime}.

(f) F = {x : x ∈ X and (x ≤ 4 or x ≥ 7)}.

3. (a) Find all subsets of {a, b}. How many are there? Write down also the number of subsets with respectively 0, 1 and 2 elements.

(b) Find all subsets of {a, b, c}. How many are there? Write down also the number of subsets with respectively 0, 1, 2 and 3 elements.

(c) Find all subsets of the set {a, b, c, d}. How many are there? Write down also the number of subsets with respectively 0, 1, 2, 3 and 4 elements.

(d) What patterns do you notice arising from these calculations?

4. If the set A has m elements and the set B has n elements how many elements does the set A × B have?

5. If A has m elements, how many elements does the set A^n have?

6. Given a set A define a new set A^+ = A ∪ {A}. Calculate in succession the sets

∅^+, ∅^{++}, ∅^{+++}

which are obtained by repeated application of the operation +. Write down the cardinalities of these sets.

7. Prove that two sets A and B are equal if, and only if, A ⊆ B and B ⊆ A.

2.10 Proof by induction

This is a method of proof that, although useful, does not always deliver much insight into why something is true. The basis of this method is the following:

Let X be a subset of N that satisfies the following two conditions: first, 0 ∈ X, and second, if n ∈ X then n + 1 ∈ X. Then X = N.

This fact is called the induction principle, and can be viewed as one of the basic axioms describing the natural numbers.

We may use it as a proof technique in the following way. Suppose we have an infinite number of statements S_0, S_1, S_2, . . . which we want to prove. By the induction principle, it is enough to do two things:

1. Show that S_0 is true.

2. Show that if S_n is true then S_{n+1} is also true.

It will then follow that S_i is true for all natural numbers i.

Proofs by induction have the following script:


Base step Show that the case n = 0 holds.

Induction hypothesis (IH) Assume that the case where n = k holds.

Proof bit Now use (IH) to show that the case where n = k + 1 holds.

Conclude that the result holds for all n by the induction principle.

Example 2.10.1. Prove by induction that n^3 + 2n is exactly divisible by 3 for all natural numbers n ≥ 0.

Base step: when n = 0, we have that 0^3 + 2 · 0 = 0 which is exactly divisible by 3.

Induction hypothesis: assume the result is true for n = k. We prove it for n = k + 1. We need to prove that (k + 1)^3 + 2(k + 1) is exactly divisible by 3 assuming only that k^3 + 2k is exactly divisible by 3. We first expand (k + 1)^3 + 2(k + 1) to get

k^3 + 3k^2 + 3k + 1 + 2k + 2.

This is equal to

(k^3 + 2k) + 3(k^2 + k + 1)

which is exactly divisible by 3 using the induction hypothesis.
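A computer cannot carry out the induction, but it can check finitely many cases and, more usefully, check the algebraic identity on which the induction step rests. The Python sketch below does both for the first thousand natural numbers; the function name `f` and the bound `LIMIT` are assumptions of mine. Passing these checks is evidence only, the proof above is what establishes the result for all n.

```python
LIMIT = 1000  # assumed number of cases to test

def f(n):
    """The expression n^3 + 2n from Example 2.10.1."""
    return n ** 3 + 2 * n

# Finitely many instances of the claim itself.
divisible = all(f(n) % 3 == 0 for n in range(LIMIT))

# The identity used in the induction step:
# (k + 1)^3 + 2(k + 1) = (k^3 + 2k) + 3(k^2 + k + 1).
step_identity = all(f(k + 1) == f(k) + 3 * (k * k + k + 1) for k in range(LIMIT))

if __name__ == "__main__":
    print(divisible, step_identity)
```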

In practice, some simple variants of this principle are used. Rather than the whole set N, we often work with a set of the form

N_{≥k} = N \ {0, 1, . . . , k − 1}

where k ≥ 1. Our induction principle is modified accordingly: a subset X of N_{≥k} that contains k and contains n + 1 whenever it contains n must be equal to the whole of N_{≥k}. In our script above, the base step involves checking the case where n = k.

What I described above I shall call basic induction. There is also something called the strong induction principle which runs as follows:

Let X be a subset of N that satisfies the following two conditions: first, 0 ∈ X and second, if {0, 1, . . . , n} ⊆ X, where n ≥ 0, then {0, 1, . . . , n + 1} ⊆ X. Then X = N.


Finally, there is the well-ordering principle that states that every non-empty subset of the natural numbers has a smallest element.

Induction, strong induction and well-ordering look very different from each other. In fact, they are equivalent and all useful in proving theorems.

Proposition 2.10.2. The following are equivalent.

1. The induction principle.

2. The strong induction principle.

3. The well-ordering principle.

Proof. (1)⇒(2). I shall assume that the induction principle holds and prove that the strong induction principle holds. Let X ⊆ N be such that 0 ∈ X and if {0, 1, . . . , n} ⊆ X, where n ≥ 0, then {0, 1, . . . , n + 1} ⊆ X. We shall use induction to prove that X = N. Let Y ⊆ N consist of all natural numbers n such that {0, 1, . . . , n} ⊆ X. We have that 0 ∈ Y and we have that n + 1 ∈ Y whenever n ∈ Y. By induction, we deduce that Y = N. It follows that X = N.

(2)⇒(3). I shall assume that the strong induction principle holds and prove that the well-ordering principle holds. Let X ⊆ N be a subset that has no smallest element. I shall prove that X must be empty. Put Y = N \ X. I claim that 0 ∈ Y. If not, then 0 ∈ X and that would obviously have to be the smallest element, which is a contradiction. Suppose that {0, 1, . . . , n} ⊆ Y. Then we must have that n + 1 ∈ Y because otherwise n + 1 would be the smallest element of X. We now invoke strong induction to deduce that Y = N and so X = ∅.

(3)⇒(1). I shall assume the well-ordering principle and prove the induction principle. Let X ⊆ N be a subset such that 0 ∈ X and whenever n ∈ X then n + 1 ∈ X. Suppose that N \ X is non-empty. Then it would have a smallest element k say. Since 0 ∈ X, we have k ≥ 1, and k − 1 ∈ X because k is the smallest element of N \ X. So, by assumption, k ∈ X, which is a contradiction. Thus N \ X is empty and so X = N.

Strong induction will be used in a few places in this book but I will discuss it in more detail when needed.

Exercises 2.10


1. Prove that for each natural number n ≥ 3, we have that

n^2 > 2n + 1.

2. Prove that for each natural number n ≥ 5, we have that

2^n > n^2.

3. Prove that for each natural number n ≥ 1, the number 4^n + 2 is divisible by 3.

4. Prove that

1 + 2 + 3 + . . . + n = n(n + 1)/2.

5. Prove that

2 + 4 + 6 + . . . + 2n = n(n + 1).

6. Prove that

1^3 + 2^3 + 3^3 + . . . + n^3 = (n(n + 1)/2)^2.

7. Prove that a set with n ≥ 0 elements has exactly 2^n subsets.
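Before attempting a proof by induction it is often reassuring to check that the statement holds in the first few cases. The Python sketch below tests each of the seven statements above for small n. It proves nothing: the bound `LIMIT` and the function names `count_subsets` and `check_exercises` are my own assumptions, and subsets are counted by brute force using `itertools.combinations`.

```python
from itertools import combinations

LIMIT = 12  # assumed number of small cases to try

def count_subsets(n):
    """Count the subsets of an n-element set by listing them size by size."""
    return sum(1 for r in range(n + 1) for _ in combinations(range(n), r))

def check_exercises(limit=LIMIT):
    for n in range(limit):
        if n >= 3 and not n ** 2 > 2 * n + 1:
            return False
        if n >= 5 and not 2 ** n > n ** 2:
            return False
        if n >= 1:
            if (4 ** n + 2) % 3 != 0:
                return False
            if sum(range(1, n + 1)) != n * (n + 1) // 2:
                return False
            if sum(2 * i for i in range(1, n + 1)) != n * (n + 1):
                return False
            if sum(i ** 3 for i in range(1, n + 1)) != (n * (n + 1) // 2) ** 2:
                return False
        if count_subsets(n) != 2 ** n:
            return False
    return True

if __name__ == "__main__":
    print(check_exercises())
```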

Chapter 3

High-school algebra revisited

In this chapter, I will review some of the basic constructions from high-school algebra from the perspective of this book.

3.1 The rules of the game

3.1.1 The axioms

Algebra deals with the manipulation of symbols. This means that symbols are altered and combined according to certain rules. In high-school, the algebra you studied was mainly based on the properties of the real numbers. This means that when you write x you mean an unknown or yet-to-be-determined real number. In this section, I shall describe the rules, or axioms, that you use for doing algebra with real numbers. The primary operations we are interested in are addition x + y and multiplication x × y. As usual, I shall abbreviate the operation of multiplication by concatenation, which simply means we write xy. Sometimes, it is helpful to denote multiplication as follows: x · y. Of course, there are two other familiar operations: subtraction and division. We shall see that these should be treated in a different way: subtraction as the inverse of addition, and division as the inverse of multiplication.

Both addition and multiplication require two inputs and then deliver one output with the inputs and outputs all being taken from the same set. They are therefore examples of what are called binary operations and are the commonest kinds of operations in algebra. For example, as we shall see later, matrix addition and matrix multiplication are both binary operations, the vector product of two vectors is a binary operation, and the intersection and union of two sets are both binary operations. A binary operation on a set X is nothing other than a function from X × X to X. I shall use ∗ to mean any binary operation defined on some specified set X. We usually write binary operations between the inputs rather than using the usual functional notation.

[Figure: a box labelled ∗ with inputs a and b and output a ∗ b.]

The two most important properties a binary operation may have are commutativity and associativity.

A binary operation is commutative if

a ∗ b = b ∗ a

in all cases. That is, the order in which you carry out the operation is not important. Addition and multiplication of real, and as we shall see later, complex numbers are commutative. But we shall also meet binary operations that are not commutative: both matrix multiplication and vector products are examples. Commutativity is therefore not automatic.

A binary operation is associative if

(a ∗ b) ∗ c = a ∗ (b ∗ c)

in all cases. Remember that the brackets tell you how to work out the product. Thus (a ∗ b) ∗ c means first work out a ∗ b, let’s call it d, and then work out d ∗ c. Almost all the binary operations we shall meet in this book are associative, the one important exception being the vector product.

In order to show that a binary operation ∗ is associative, we have to check that all possible products (a ∗ b) ∗ c and a ∗ (b ∗ c) are equal. To show that a binary operation is not associative, we simply have to find specific values for a, b and c so that (a ∗ b) ∗ c ≠ a ∗ (b ∗ c). Here are examples of both of these possibilities.

Example 3.1.1. Let’s take the set of real numbers R and investigate a new binary operation denoted by ◦ that is defined as follows

a ◦ b = a+ b+ ab.

3.1. THE RULES OF THE GAME 59

We shall prove that it is associative. First, we have to understand what it is we have to show. From the definition of associativity, we have to prove that

(a ◦ b) ◦ c = a ◦ (b ◦ c)

for all real numbers a, b and c. To do this, we calculate first the lefthand side and then the righthand side and then verify they are equal. Because we are trying to prove a result true for all real numbers, we cannot choose specific values of a, b and c. We first calculate (a ◦ b) ◦ c. Using the axioms for real numbers, we get that

(a ◦ b) ◦ c = (a+ b+ ab) ◦ c = (a+ b+ ab) + c+ (a+ b+ ab)c

which is equal to a + b + c + ab + ac + bc + abc. Now we calculate a ◦ (b ◦ c). We get that

a ◦ (b ◦ c) = a ◦ (b+ c+ bc) = a+ (b+ c+ bc) + a(b+ c+ bc)

which is equal to a + b + c + ab + ac + bc + abc. We now see that we get the same answers however we bracket the product and so we have proved that the binary operation ◦ is associative.

Example 3.1.2. Let’s take the set N and define the binary operation ⊕ as follows

\[ a \oplus b = a^2 + b^2. \]

I shall show that this binary operation is not associative. Let’s calculate first (1 ⊕ 2) ⊕ 3. By definition this is computed as follows

\[ (1 \oplus 2) \oplus 3 = (1^2 + 2^2) \oplus 3 = 5 \oplus 3 = 5^2 + 3^2 = 25 + 9 = 34. \]

Now we calculate 1 ⊕ (2 ⊕ 3) as follows

\[ 1 \oplus (2 \oplus 3) = 1 \oplus (2^2 + 3^2) = 1 \oplus (4 + 9) = 1 \oplus 13 = 1^2 + 13^2 = 1 + 169 = 170. \]

Therefore

\[ (1 \oplus 2) \oplus 3 \neq 1 \oplus (2 \oplus 3). \]

It follows that the binary operation ⊕ is not associative.
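The two examples above can also be explored on a computer. The sketch below is only an illustration: the function names `circ`, `oplus` and `associative_on` are my own, and checking a finite sample of values can refute associativity but can never prove it. The proof in Example 3.1.1 is still needed.

```python
from fractions import Fraction
from itertools import product

def circ(a, b):
    """The operation a ◦ b = a + b + ab from Example 3.1.1."""
    return a + b + a * b

def oplus(a, b):
    """The operation a ⊕ b = a^2 + b^2 from Example 3.1.2."""
    return a**2 + b**2

def associative_on(op, samples):
    """True if (a op b) op c == a op (b op c) for every triple drawn from samples."""
    return all(op(op(a, b), c) == op(a, op(b, c))
               for a, b, c in product(samples, repeat=3))

# A small, arbitrary sample of exact rational numbers (an assumption of this sketch).
samples = [Fraction(n, d) for n in range(-3, 4) for d in (1, 2, 3)]

print(associative_on(circ, samples))                  # no counterexample in the sample
print(oplus(oplus(1, 2), 3), oplus(1, oplus(2, 3)))   # 34 170
```

Exact rationals are used rather than floating-point numbers so that equality tests are not disturbed by rounding.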

We are now ready to state the algebraic axioms that form the basis of high-school algebra. We shall split them up into three groups: those dealing only with addition, those dealing only with multiplication, and finally those that deal with both operations together.


Axioms for addition

(F1) Addition is associative. Let x, y and z be any real numbers. Then (x + y) + z = x + (y + z).

(F2) There is an additive identity. The number 0 (zero) is the additive identity. This means that for any real number x we have that x + 0 = x = 0 + x. Thus adding zero to a number leaves it unchanged.

(F3) Each element has a unique additive inverse. This means that for each number x there is another number, written −x, with the property that x + (−x) = 0 = (−x) + x. The number −x is called the additive inverse of the number x.

(F4) Addition is commutative. Let x and y be any real numbers. Then x + y = y + x. The word commutative means that the order in which you add the numbers does not matter.

The first thing to understand is that none of these axioms should be surprising. They should all agree with your intuition.

Axioms for multiplication

(F5) Multiplication is associative. Let x, y and z be any real numbers. Then (xy)z = x(yz).

(F6) There is a multiplicative identity. The number 1 is the multiplicative identity. This means that for any real number x we have that 1x = x = x1.

(F7) Each non-zero number has a unique multiplicative inverse. Let x ≠ 0. Then there is a unique real number written $x^{-1}$ with the property that $x^{-1}x = 1 = xx^{-1}$. The number $x^{-1}$ is called the multiplicative inverse of x. It is, of course, the number $\frac{1}{x}$. It is very important to observe that zero does not have a multiplicative inverse.

(F8) Multiplication is commutative. Let x and y be any real numbers. Then xy = yx. Once again the word commutative means that the order in which you carry out the operations doesn’t matter. In this case, the operation is multiplication.

The axioms for multiplication are very similar to those for addition. The only real difference between them is axiom (F7). This expresses the fact that you cannot divide by zero.

Linking axioms

(F9) 0 ≠ 1.

(F10) The additive identity is a multiplicative zero. This means that 0x = 0 = x0. If you multiply any real number by 0 then you get 0.

(F11) Multiplication distributes over addition on the left and the right. There are actually two distributive laws: the left distributive law

x(y + z) = xy + xz

and the right distributive law

(y + z)x = yx+ zx.

Let me come back to the omission of subtraction and division. These are not viewed as binary operations in their own right. Instead, we define a − b to mean a + (−b). Thus to subtract b means the same thing as adding −b. Likewise, we define a ÷ b, when b ≠ 0, to mean $a \times b^{-1}$. Thus to divide by b is to multiply by $b^{-1}$.

We have missed out one further ingredient in algebra, and that is the properties of equality.

Properties of equality

(E1) If a = b then c+ a = c+ b.

(E2) If a = b then ca = cb.

Example 3.1.3. When I talked about algebra in Chapter 1, I mentioned that the usual way of solving a linear equation in one unknown depended on the properties of real numbers. Let me now show you how we use the above axioms to solve ax + b = 0 where a ≠ 0. Throughout, I use without comment the two properties of equality I have listed above.

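\begin{align*}
ax + b &= 0 \\
(ax + b) + (-b) &= 0 + (-b) && \text{by (F3)} \\
ax + (b + (-b)) &= 0 + (-b) && \text{by (F1)} \\
ax + 0 &= 0 + (-b) && \text{by (F3)} \\
ax &= 0 + (-b) && \text{by (F2)} \\
ax &= -b && \text{by (F2)} \\
a^{-1}(ax) &= a^{-1}(-b) && \text{by (F7) since } a \neq 0 \\
(a^{-1}a)x &= a^{-1}(-b) && \text{by (F5)} \\
1x &= a^{-1}(-b) && \text{by (F7)} \\
x &= a^{-1}(-b) && \text{by (F6)}
\end{align*}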

I don’t propose that you go into quite such gory detail when solving equations, but I wanted to show you what actually lay behind the rules that you might have been taught at school.
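The end result of Example 3.1.3, $x = a^{-1}(-b)$, can be turned into a short program. The following is a minimal sketch, assuming rational coefficients so that Python's `fractions.Fraction` gives exact answers; the function name `solve_linear` is my own.

```python
from fractions import Fraction

def solve_linear(a, b):
    """Solve ax + b = 0 for a != 0 as x = a^{-1}(-b), using exact rationals."""
    a, b = Fraction(a), Fraction(b)
    if a == 0:
        # Zero has no multiplicative inverse, so the method does not apply.
        raise ValueError("a must be non-zero")
    a_inv = 1 / a          # the multiplicative inverse of a, which exists by (F7)
    return a_inv * (-b)    # -b is the additive inverse of b, which exists by (F3)

x = solve_linear(3, -1)    # the equation 3x - 1 = 0 from Chapter 1
print(x)                   # 1/3
print(3 * x - 1 == 0)      # True
```

The check in the last line is the same substitution that was carried out by hand in Chapter 1.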

Example 3.1.4. We can use our axioms to prove that −1 × −1 = 1, something which is hard to understand in any other way. By definition, −1 is the additive inverse of 1. This means that 1 + (−1) = 0. Let us calculate (−1)(−1) − 1. We have that

(−1)(−1)− 1 = (−1)(−1) + (−1) by definition of subtraction

= (−1)(−1) + (−1)1 since 1 is the multiplicative identity

= (−1)[(−1) + 1] by the left distributivity law

= (−1)0 by properties of additive inverses

= 0 by properties of zero

Hence (−1)(−1) = 1. In other words, the result follows from the usual rules of algebra.

3.1.2 Indices

We usually write $a^2$ rather than aa, and $a^3$ instead of aaa. In this section, I want to review the meaning of algebraic expressions such as $a^{\frac{r}{s}}$ where $\frac{r}{s}$ is any rational number. Our starting point is a result that I would encourage you to assume as an axiom at a first reading. I have included the proof to show you a more sophisticated example of proof by induction.

Lemma 3.1.5 (Generalized associativity). Let ∗ be any binary operation defined on a set X. If ∗ is associative then however you bracket a product such as

\[ x_1 \ast \dots \ast x_n \]

you will always get the same answer.

Proof. If $x_1, x_2, \dots, x_n$ are elements of the set X then one particular bracketing will play an important role in our proof

\[ x_1 \ast (x_2 \ast (\cdots (x_{n-1} \ast x_n) \cdots )) \]

which we write as $[x_1 x_2 \dots x_n]$.

The proof is by strong induction on the length n of the product in question. The base case is where n = 3 and is just an application of the associative law. Assume that n ≥ 4 and that for all k < n, all bracketings of a sequence of k elements of X lead to the same answer. This is therefore the induction hypothesis for strong induction. Let X denote any properly bracketed expression obtained by inserting brackets into the sequence $x_1, x_2, \dots, x_n$. Observe that the computation of such a bracketed product involves computing n − 1 products. This is because at each step we can only compute the product of adjacent letters $x_i \ast x_{i+1}$. Thus at each step of our calculation we reduce the number of letters by one until there is only one letter left. However the expression may be bracketed, the final step in the computation will be of the form Y ∗ Z, where Y and Z will each have arisen from properly bracketed expressions. In the case of Y it will involve a bracketing of some sequence $x_1, x_2, \dots, x_r$, and for Z the sequence $x_{r+1}, x_{r+2}, \dots, x_n$ for some r such that 1 ≤ r ≤ n − 1. Since Y involves a product of length r < n, we may assume by the induction hypothesis that $Y = [x_1 x_2 \dots x_r]$. Observe that $[x_1 x_2 \dots x_r] = x_1 \ast [x_2 \dots x_r]$. Hence by associativity,

\[ X = Y \ast Z = (x_1 \ast [x_2 \dots x_r]) \ast Z = x_1 \ast ([x_2 \dots x_r] \ast Z). \]

But $[x_2 \dots x_r] \ast Z$ is a properly bracketed expression of length n − 1 in $x_2, \dots, x_n$ and so using the induction hypothesis must equal $[x_2 x_3 \dots x_n]$. It follows that $X = [x_1 x_2 \dots x_n]$. We have therefore shown that all possible bracketings yield the same result in the presence of associativity.


We illustrate a special case of the above proof in the example below.

Example 3.1.6. Take n = 5. Then the notation $[x_1 x_2 x_3 x_4 x_5]$ introduced in the above proof means $x_1 \ast (x_2 \ast (x_3 \ast (x_4 \ast x_5)))$. Consider the product $((x_1 \ast x_2) \ast x_3) \ast (x_4 \ast x_5)$. Here we have $Y = (x_1 \ast x_2) \ast x_3$ and $Z = x_4 \ast x_5$. By associativity $Y = x_1 \ast (x_2 \ast x_3)$. Thus $Y \ast Z = (x_1 \ast (x_2 \ast x_3)) \ast (x_4 \ast x_5)$. But this is equal to $x_1 \ast ((x_2 \ast x_3) \ast (x_4 \ast x_5))$ again by associativity. By the induction hypothesis $(x_2 \ast x_3) \ast (x_4 \ast x_5) = x_2 \ast (x_3 \ast (x_4 \ast x_5))$, and so

\[ ((x_1 \ast x_2) \ast x_3) \ast (x_4 \ast x_5) = x_1 \ast (x_2 \ast (x_3 \ast (x_4 \ast x_5))), \]

as required.
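The structure of the proof, in which every bracketing ends with a final step Y ∗ Z, translates directly into a recursive enumeration of all bracketings. The sketch below is an illustration only (the function name `all_bracketings` and the two sample operations are my choices): it collects the values produced by every bracketing of a sequence. For an associative operation the lemma predicts exactly one value.

```python
def all_bracketings(op, xs):
    """Return the set of values obtained from every bracketing of xs[0] op ... op xs[-1]."""
    if len(xs) == 1:
        return {xs[0]}
    results = set()
    for r in range(1, len(xs)):                  # the final step is Y op Z
        for y in all_bracketings(op, xs[:r]):    # Y brackets x_1, ..., x_r
            for z in all_bracketings(op, xs[r:]):  # Z brackets x_{r+1}, ..., x_n
                results.add(op(y, z))
    return results

def concat(s, t):
    """String concatenation: an associative (but not commutative) operation."""
    return s + t

def minus(a, b):
    """Subtraction: not associative."""
    return a - b

print(all_bracketings(concat, ["x1", "x2", "x3", "x4", "x5"]))  # a single value
print(sorted(all_bracketings(minus, [1, 2, 3, 4])))             # several different values
```

String concatenation is used for the associative case because the single resulting string makes it visible that the order of the letters is preserved.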

If a binary operation is associative then the above lemma tells us that computing products of elements is straightforward because we never have to worry about how to evaluate them as long as we maintain the order of the elements. We now consider a special case of this result. Let a be any real number. Define the nth power $a^n$ of a, where n is a natural number, as follows: $a^1 = a$ and $a^n = a a^{n-1}$ for any n ≥ 2. Generalized associativity tells us that $a^n$ can in fact be calculated in any way we like because we shall always obtain the same answer. The following result should be familiar. I shall ask you to prove it in the exercises.

Lemma 3.1.7 (Laws of exponents). Let m, n ≥ 1 be any natural numbers.

1. $a^{m+n} = a^m a^n$.

2. $(a^m)^n = a^{mn}$.

It follows from the above lemma that powers of the same element a commute with one another: $a^m a^n = a^n a^m$ as both products equal $a^{m+n}$. Our goal now is to define what $a^m$ means when m is an arbitrary rational number. We shall be guided by the requirement that the above laws of exponents should continue to hold. We may extend the laws of exponents to allow m or n to be 0. The only way to do this is to define $a^0 = 1$, where 1 is the identity and a ≠ 0.

An extreme case! What about $0^0$? This is a can of worms. For this book, it is probably best to define $0^0 = 1$.


We have explained what $a^n$ means when n is positive but what can we say when the exponent is negative? In other words, what does $a^{-n}$ mean? We assume that the rules above still apply. Thus whatever $a^{-n}$ means we should have that $a^{-n} a^n = a^0 = 1$. It follows that $a^{-n} = \frac{1}{a^n}$. With this interpretation we have defined $a^n$ for all integer values of n.
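The definitions of $a^n$ for positive, zero and negative integers n can be written as one recursive function. This is a sketch under the assumption that a is rational, so that `fractions.Fraction` keeps the arithmetic exact; the name `power` is mine, and the convention $0^0 = 1$ is the one suggested above.

```python
from fractions import Fraction

def power(a, n):
    """Compute a^n for an integer n from a^1 = a, a^n = a * a^(n-1), a^0 = 1, a^(-n) = 1/a^n."""
    a = Fraction(a)
    if n == 0:
        return Fraction(1)            # includes the convention 0^0 = 1
    if n < 0:
        if a == 0:
            raise ZeroDivisionError("0 has no multiplicative inverse")
        return 1 / power(a, -n)
    if n == 1:
        return a
    return a * power(a, n - 1)

a = Fraction(2, 3)
print(power(a, 3))     # 8/27
print(power(3, -2))    # 1/9

# The laws of exponents, checked on a small range of integer exponents.
print(all(power(a, m + n) == power(a, m) * power(a, n)
          and power(power(a, m), n) == power(a, m * n)
          for m in range(-3, 4) for n in range(-3, 4)))
```

As with any finite check, the last line only samples the laws of exponents; the proofs by induction are the subject of the exercises.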

We now investigate what $a^{\frac{1}{n}}$ should mean. If the laws of exponents are to continue holding, then $(a^{\frac{1}{n}})^n = a^1 = a$. It follows that $a^{\frac{1}{n}} = \sqrt[n]{a}$.

We may now calculate $a^{\frac{r}{s}}$: it is equal to

\[ a^{\frac{r}{s}} = (\sqrt[s]{a})^r. \]

How do we calculate $(ab)^n$? This is just ab times itself n times. But the order in which we multiply a’s and b’s doesn’t matter and so we can arrange all the a’s to the front. Thus $(ab)^n = a^n b^n$.

We also have similar results for addition. We define 2x = x + x and nx = x + … + x where the x occurs n times. We have 1x = x and 0x = 0.

Let $\{a_1, \dots, a_n\}$ be a set of n elements. If we write them all in some order $a_{i_1}, \dots, a_{i_n}$ then we have what is called a permutation of the elements. The following lemma can be treated as an axiom and the proof omitted until later.

Lemma 3.1.8 (Generalized commutativity). Let ∗ be an associative and commutative binary operation on a set X. Let $a_1, \dots, a_n$ be any n elements of X and let $a_{i_1}, \dots, a_{i_n}$ be any permutation of them. Then

\[ a_1 \ast \dots \ast a_n = a_{i_1} \ast \dots \ast a_{i_n}. \]

Proof. First prove by induction the result that

\[ a_1 \ast \dots \ast a_n \ast b = b \ast a_1 \ast \dots \ast a_n. \]

Let $a_1, \dots, a_n, a_{n+1}$ be n + 1 elements. Consider the product $a_{i_1} \ast \dots \ast a_{i_n} \ast a_{i_{n+1}}$. Suppose that $a_{n+1} = a_{i_r}$. Then

\[ a_{i_1} \ast \dots \ast a_{i_r} \ast \dots \ast a_{i_n} \ast a_{i_{n+1}} = (a_{i_1} \ast \dots \ast a_{i_n}) \ast a_{n+1} \]

where the expression in the brackets is a product of some permutation of the elements $a_1, \dots, a_n$. We have used here our result above. But by the induction hypothesis, we may write $a_{i_1} \ast \dots \ast a_{i_n} = a_1 \ast \dots \ast a_n$. The result follows by induction.
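Generalized commutativity can be tested on small examples by combining the elements in every possible order. The sketch below is an illustration, with the helper name `values_over_orderings` and the sample numbers chosen by me. It uses `functools.reduce`, which brackets to the left; by generalized associativity that choice is harmless for an associative operation.

```python
from functools import reduce
from itertools import permutations
from operator import add, mul, sub

def values_over_orderings(op, elements):
    """Combine the elements with op in every order and collect the distinct results."""
    return {reduce(op, ordering) for ordering in permutations(elements)}

numbers = [1, 2, 3, 4]
print(values_over_orderings(add, numbers))            # {10}
print(values_over_orderings(mul, numbers))            # {24}
print(len(values_over_orderings(sub, numbers)) > 1)   # True: subtraction depends on the order
```

Subtraction is included as a contrast: it is neither associative nor commutative, and the result changes with the order.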


3.1.3 Sigma notation

At this point, it is appropriate to introduce some useful notation. Let $a_1, a_2, \dots, a_n$ be n numbers. Their sum is $a_1 + a_2 + \dots + a_n$ and because of generalized associativity we don’t have to worry about brackets. We now abbreviate this as

\[ \sum_{i=1}^{n} a_i. \]

Here $\sum$ is Greek ‘S’ and stands for Sum. The letter i is called a subscript. The equality i = 1 tells us that we start the value of i at 1. The n above the sigma tells us that we end the value of i at n. Although I have started the sum at 1, I could, in other circumstances, have started at 0, or any other appropriate number. This notation is very useful and can be manipulated using the rules above. If 1 < s < n, then we can write

\[ \sum_{i=1}^{n} a_i = \sum_{i=1}^{s} a_i + \sum_{i=s+1}^{n} a_i. \]

If b is any number then

\[ b\left(\sum_{i=1}^{n} a_i\right) = \sum_{i=1}^{n} b a_i \]

is the generalized distributivity law that you are asked to prove in the exercises. These uses of sigma-notation shouldn’t cause any problems.
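Sigma notation corresponds closely to a loop, or to Python's built-in `sum`. The following sketch checks the splitting rule and the generalized distributivity law on one made-up list of numbers; the helper `sigma` and the particular values of a, s and b are assumptions of the example.

```python
def sigma(a, lo, hi):
    """Return a_lo + ... + a_hi, where a_1 is stored at a[0] (the text indexes from 1)."""
    return sum(a[i - 1] for i in range(lo, hi + 1))

a = [3, 1, 4, 1, 5, 9]    # a_1, ..., a_6: arbitrary sample values
n = len(a)
s = 2                     # any s with 1 < s < n
b = 7                     # any number

# Splitting a sum at s.
print(sigma(a, 1, n) == sigma(a, 1, s) + sigma(a, s + 1, n))    # True

# Generalized distributivity: b times the sum equals the sum of the b * a_i.
print(b * sigma(a, 1, n) == sigma([b * x for x in a], 1, n))    # True
```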

The most complicated use of $\sum$-notation arises when we have to sum up what is called an array of numbers $a_{ij}$ where 1 ≤ i ≤ m and 1 ≤ j ≤ n. This arises in matrix theory, for example. For concreteness, I shall give the example where m = 3 and n = 4. We can therefore think of the numbers $a_{ij}$ as being arranged in a 3 × 4 array as follows:

\[
\begin{array}{cccc}
a_{11} & a_{12} & a_{13} & a_{14} \\
a_{21} & a_{22} & a_{23} & a_{24} \\
a_{31} & a_{32} & a_{33} & a_{34}
\end{array}
\]

Observe that the first subscript tells you the row and the second subscript tells you the column. Thus $a_{23}$ is the number in the second row and the third column. Now we can add these numbers up in two different ways getting the same answer in both cases. The first way is to add the numbers up along the rows. So, we calculate the following sums

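\[ \sum_{j=1}^{4} a_{1j}, \quad \sum_{j=1}^{4} a_{2j}, \quad \sum_{j=1}^{4} a_{3j}. \]

We then add up these three numbers

\[ \sum_{j=1}^{4} a_{1j} + \sum_{j=1}^{4} a_{2j} + \sum_{j=1}^{4} a_{3j} = \sum_{i=1}^{3} \left( \sum_{j=1}^{4} a_{ij} \right). \]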

The second way is to add the numbers up along the columns. So, we calculate the following sums

\[ \sum_{i=1}^{3} a_{i1}, \quad \sum_{i=1}^{3} a_{i2}, \quad \sum_{i=1}^{3} a_{i3}, \quad \sum_{i=1}^{3} a_{i4}. \]

We then add up these four numbers

\[ \sum_{i=1}^{3} a_{i1} + \sum_{i=1}^{3} a_{i2} + \sum_{i=1}^{3} a_{i3} + \sum_{i=1}^{3} a_{i4} = \sum_{j=1}^{4} \left( \sum_{i=1}^{3} a_{ij} \right). \]

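The fact that

\[ \sum_{i=1}^{3} \left( \sum_{j=1}^{4} a_{ij} \right) = \sum_{j=1}^{4} \left( \sum_{i=1}^{3} a_{ij} \right) \]

is a consequence of the generalized commutativity law that you are asked to prove in the exercises. We therefore have in general that

\[ \sum_{i=1}^{m} \left( \sum_{j=1}^{n} a_{ij} \right) = \sum_{j=1}^{n} \left( \sum_{i=1}^{m} a_{ij} \right). \]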
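The two ways of adding up an array correspond to two nested loops with the order of nesting swapped. Here is a small sketch with a made-up 3 × 4 array (the entries are arbitrary and are not taken from the text); Python lists are indexed from 0, so `a[i][j]` plays the role of $a_{(i+1)(j+1)}$.

```python
a = [
    [2, 7, 1, 8],
    [2, 8, 1, 8],
    [2, 8, 4, 5],
]
m, n = 3, 4   # number of rows and number of columns

# Sum along each row first, then add the row totals.
rows_first = sum(sum(a[i][j] for j in range(n)) for i in range(m))

# Sum down each column first, then add the column totals.
columns_first = sum(sum(a[i][j] for i in range(m)) for j in range(n))

print(rows_first, columns_first)   # 56 56
```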

3.1.4 Infinite sums

What I have defined so far are finite sums and form part of algebra. There are also infinite sums

\[ \sum_{i=1}^{\infty} a_i \]

which form part of analysis, the subject that provides the foundations for calculus. There is one place where we use infinite sums in everyday life, and that is in the decimal representations of numbers. Thus the fraction $\frac{1}{3}$ can be written as 0·3333… and this is in fact an infinite sum: it means the infinite sum

\[ \sum_{i=1}^{\infty} \frac{3}{10^i}. \]

But in general infinite sums are problematic. For example, consider the infinite sum

\[ S = \sum_{i=1}^{\infty} (-1)^{i+1}. \]

So, this is just

\[ S = 1 - 1 + 1 - 1 + \dots \]

What is S? Your first instinct might be to say 0 because

\[ S = (1 - 1) + (1 - 1) + \dots \]

But it could equally well be 1 calculated as follows

\[ S = 1 + (-1 + 1) + (-1 + 1) + \dots \]

In fact, it could even be $\frac{1}{2}$ since S + S = 1 and so $S = \frac{1}{2}$. There is clearly something seriously awry here, and it is that infinite sums have to be handled very carefully if they are to make sense. Just how is the business of analysis and won’t be an issue in this book.
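One way to see the difference between the two infinite sums above is to look at their partial sums, that is, the finite sums obtained by stopping after the first few terms. The sketch below only computes finitely many of these (the cut-offs 6 and 8 are arbitrary choices of mine); what it means for the partial sums to settle down is a question for analysis.

```python
from fractions import Fraction
from itertools import accumulate

def partial_sums(terms):
    """Return the list of running totals t_1, t_1 + t_2, t_1 + t_2 + t_3, ..."""
    return list(accumulate(terms))

# Partial sums of 3/10 + 3/100 + 3/1000 + ... get closer and closer to 1/3.
thirds = partial_sums(Fraction(3, 10**i) for i in range(1, 7))
print([str(t) for t in thirds])
print([str(Fraction(1, 3) - t) for t in thirds])   # the gaps shrink by a factor of 10 each time

# Partial sums of 1 - 1 + 1 - 1 + ... never settle down.
alternating = partial_sums((-1) ** (i + 1) for i in range(1, 9))
print(alternating)   # [1, 0, 1, 0, 1, 0, 1, 0]
```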

Warning! ∞ is not a number. It simply tells us to keep adding on terms for increasing values of i without end so we never write

\[ \frac{3}{10^{\infty}}. \]

Exercises 3.1

1. Prove the following identities using the axioms introduced.

(a) $(a + b)^2 = a^2 + 2ab + b^2$.

(b) $(a + b)^3 = a^3 + 3a^2b + 3ab^2 + b^3$.

(c) $a^2 - b^2 = (a + b)(a - b)$.

(d) $(a^2 + b^2)(c^2 + d^2) = (ac - bd)^2 + (ad + bc)^2$.

2. Calculate the following.

(a) $2^3$.

(b) $2^{\frac{1}{3}}$.

(c) $2^{-4}$.

(d) $2^{-\frac{3}{2}}$.

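3. Assume that $a_{ij}$ are assigned the following values

\[
\begin{array}{llll}
a_{11} = 1 & a_{12} = 2 & a_{13} = 3 & a_{14} = 4 \\
a_{21} = 5 & a_{22} = 6 & a_{23} = 7 & a_{24} = 8 \\
a_{31} = 9 & a_{32} = 10 & a_{33} = 11 & a_{34} = 12
\end{array}
\]

Calculate the following sums.

(a) $\sum_{i=1}^{3} a_{i2}$.

(b) $\sum_{j=1}^{4} a_{3j}$.

(c) $\sum_{i=1}^{3} \left( \sum_{j=1}^{4} a_{ij}^2 \right)$.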

4. Let a, b, c ∈ R. If ab = ac is it true that b = c? Explain.

5. Laws of exponents.

(a) Prove by induction that $a^{m+n} = a^m a^n$. To do this, fix m and then prove the result by induction on n. Deduce that it holds for all m.

(b) Prove by induction that $(a^m)^n = a^{mn}$. To do this, fix m and then prove the result by induction on n. Deduce that it holds for all m.

6. Prove by induction that the left generalized distributivity law holds

\[ a(b_1 + b_2 + b_3 + \dots + b_n) = ab_1 + ab_2 + ab_3 + \dots + ab_n, \]

for any n ≥ 2.


3.2 Solving quadratic equations

The previous section might have given the impression that algebraic calculations are routine. In fact, once you pass beyond linear equations, they usually require good ideas. The first place where a good idea is needed is in solving quadratic equations. Quadratic equations were solved by the Babylonians and the Egyptians and are dealt with in all school algebra courses. I have included them here because I want to show you that you don’t have to remember a formula to solve such equations; what you have to remember is a method. Let’s recall some definitions. An expression of the form

\[ ax^2 + bx + c \]

where a, b, c are numbers and a ≠ 0 is called a quadratic polynomial or a polynomial of degree 2. The numbers a, b, c are called the coefficients of the quadratic. A quadratic where a = 1 is said to be monic. A number r such that

\[ ar^2 + br + c = 0 \]

is called a root of the polynomial. The problem of finding all the roots of a quadratic is called solving the quadratic. Usually this problem is stated in the form: ‘solve the quadratic equation $ax^2 + bx + c = 0$’. It is called an equation because we have set the polynomial equal to zero. I shall now show you how to solve a quadratic equation without having to remember a formula. Observe first that if $ax^2 + bx + c = 0$ then

\[ x^2 + \frac{b}{a}x + \frac{c}{a} = 0. \]

Thus it is enough to find the roots of monic quadratics. We shall solve this equation by trying to do the following: write $x^2 + \frac{b}{a}x$ as a perfect square plus a number. This will turn out to be the crux of solving the quadratic. We shall illustrate our construction by using some diagrams. First, we represent geometrically the expression $x^2 + \frac{b}{a}x$.

[Figure: a square with sides of length x next to a rectangle, shown in red, with sides of length x and $\frac{b}{a}$.]

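Now cut the red rectangle into two pieces along the dotted line and rearrange them as shown below.

[Figure: the square with sides of length x with a rectangle of width $\frac{b}{2a}$ attached to each of two adjacent sides, leaving a small dotted square with sides of length $\frac{b}{2a}$ in the corner.]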

It is now geometrically obvious that if we add in the small dotted square, we get a new bigger square. This explains why the procedure is called completing the square. We now express in algebraic terms what these diagrams suggest.

\[ x^2 + \frac{b}{a}x = \left( x^2 + \frac{b}{a}x + \frac{b^2}{4a^2} \right) - \frac{b^2}{4a^2} = \left( x + \frac{b}{2a} \right)^2 - \frac{b^2}{4a^2}. \]

We therefore have that

\[ x^2 + \frac{b}{a}x = \left( x + \frac{b}{2a} \right)^2 - \frac{b^2}{4a^2}. \]

Look carefully at what we have done here: we have rewritten the lefthand side as a perfect square (the first term on the righthand side) plus a number (the second term on the righthand side). It follows that

\[ x^2 + \frac{b}{a}x + \frac{c}{a} = \left( x + \frac{b}{2a} \right)^2 - \frac{b^2}{4a^2} + \frac{c}{a} = \left( x + \frac{b}{2a} \right)^2 + \frac{4ac - b^2}{4a^2}. \]

Setting the last expression equal to zero and rearranging, we get

\[ \left( x + \frac{b}{2a} \right)^2 = \frac{b^2 - 4ac}{4a^2}. \]

Now take square roots of both sides, remembering that a non-zero number has two square roots:

\[ x + \frac{b}{2a} = \pm \sqrt{\frac{b^2 - 4ac}{4a^2}} \]

which of course simplifies to

\[ x + \frac{b}{2a} = \pm \frac{\sqrt{b^2 - 4ac}}{2a}. \]

Thus

\[ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}, \]

the usual formula for finding the roots of a quadratic.

Example 3.2.1. Solve the quadratic equation

\[ 2x^2 - 5x + 1 = 0 \]

by completing the square. Divide through by 2 to make the quadratic monic giving

\[ x^2 - \frac{5}{2}x + \frac{1}{2} = 0. \]

We now want to write

\[ x^2 - \frac{5}{2}x \]

as a perfect square plus a number. We get

\[ x^2 - \frac{5}{2}x = \left( x - \frac{5}{4} \right)^2 - \frac{25}{16}. \]

Thus our quadratic becomes

\[ \left( x - \frac{5}{4} \right)^2 - \frac{25}{16} + \frac{1}{2} = 0. \]

Rearranging and taking roots gives us

\[ x = \frac{5}{4} \pm \frac{\sqrt{17}}{4} = \frac{5 \pm \sqrt{17}}{4}. \]

We now check our answer by substituting each of our two roots back into the original quadratic and ensuring that we get zero in both cases.
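The method of completing the square, and the check by substitution, can be sketched in a few lines of Python. This is an illustration rather than a definitive solver: the function name is my own, the arithmetic is floating point (so the check uses a small tolerance instead of exact equality), and `cmath.sqrt` is used so that the sketch also returns an answer when the number under the square root is negative.

```python
import cmath

def solve_by_completing_square(a, b, c):
    """Return the two roots of ax^2 + bx + c = 0 (a != 0) by completing the square."""
    if a == 0:
        raise ValueError("not a quadratic: a must be non-zero")
    p, q = b / a, c / a            # monic form x^2 + p x + q = 0
    half = p / 2                   # x^2 + p x = (x + p/2)^2 - p^2/4
    rhs = half * half - q          # so (x + p/2)^2 = p^2/4 - q
    root = cmath.sqrt(rhs)         # one of the two square roots of rhs
    return -half + root, -half - root

# Example 3.2.1: 2x^2 - 5x + 1 = 0.
for r in solve_by_completing_square(2, -5, 1):
    residual = 2 * r**2 - 5 * r + 1          # should be (very nearly) zero
    print(r, abs(residual) < 1e-12)
```

The two printed roots should agree, up to rounding, with $(5 \pm \sqrt{17})/4$.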

For the quadratic equation

\[ ax^2 + bx + c = 0 \]

the number $D = b^2 - 4ac$, called the discriminant of the quadratic, plays an important role.

• If D > 0 then the quadratic equation has two distinct real solutions.

• If D = 0 then the quadratic equation has one real root repeated. In this case, the monic form of the quadratic is the perfect square $\left( x + \frac{b}{2a} \right)^2$.

• If D < 0 then we shall see that the quadratic equation has two complex roots which are complex conjugate to each other. This is called the irreducible case.
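The three cases can be expressed as a small function of the coefficients. In this sketch the function name `classify` and the three sample quadratics are my own choices; integer or rational coefficients are assumed so that the test D = 0 is exact.

```python
def classify(a, b, c):
    """Classify the roots of ax^2 + bx + c = 0 (a != 0) using the discriminant D = b^2 - 4ac."""
    if a == 0:
        raise ValueError("not a quadratic: a must be non-zero")
    D = b * b - 4 * a * c
    if D > 0:
        return "two distinct real roots"
    if D == 0:
        return "one repeated real root"
    return "two complex conjugate roots"

print(classify(1, -3, 2))   # D = 1:  two distinct real roots
print(classify(1, 2, 1))    # D = 0:  one repeated real root
print(classify(1, 1, 1))    # D = -3: two complex conjugate roots
```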

If we put $y = ax^2 + bx + c$ then we may draw the graph of this equation. The roots of the original quadratic therefore correspond to the points where this graph crosses the x-axis. The diagrams below illustrate the three cases that can arise.

[Figure: three graphs of $y = ax^2 + bx + c$, labelled D > 0, D = 0 and D < 0.]

Exercises 3.2

1. Calculate the discriminants of the following quadratics and so determine whether they have two distinct roots, or repeated roots, or no real roots.

(a) $x^2 + 6x + 5$.

(b) $x^2 - 4x + 4$.

(c) $x^2 - 2x + 5$.

2. Solve the following quadratic equations by completing the square. Check your answers.

(a) $x^2 + 10x + 16 = 0$.

(b) $x^2 + 4x + 2 = 0$.

(c) $2x^2 - x - 7 = 0$.

3. I am thinking of two numbers x and y. I tell you their sum a and their product b. What are x and y in terms of a and b?

4. Let $p(x) = x^2 + bx + c$ be a monic quadratic with roots $x_1$ and $x_2$. Express the discriminant of p(x) in terms of $x_1$ and $x_2$.

5. This question is an interpretation of part of Book X of Euclid. We shall be interested in numbers of the form $a + \sqrt{b}$ where a and b are rational and b > 0 where $\sqrt{b}$ is irrational¹.

(a) If $\sqrt{a} = b + \sqrt{c}$ where $\sqrt{c}$ is irrational then b = 0.

(b) If $a + \sqrt{b} = c + \sqrt{d}$ where a and c are rational and $\sqrt{b}$ and $\sqrt{d}$ are irrational then a = c and $\sqrt{b} = \sqrt{d}$.

(c) Prove that the square roots of $a + \sqrt{b}$ have the form $\pm(\sqrt{x} + \sqrt{y})$.

3.3 Order

In addition to algebraic operations, the real numbers are also ordered: we can always say of two real numbers whether they are equal or whether one of them is bigger than the other. I shall write down first the axioms for order that hold both for rational and real numbers. The following notation is important. If a ≤ b and a ≠ b then we write a < b and say that a is strictly less than b.

Axioms for order

(O1) For every element a we have that a ≤ a.

1Remember that irrational means not rational.


(O2) If a ≤ b and b ≤ a then a = b.

(O3) If a ≤ b and b ≤ c then a ≤ c.

(O4) Given any two elements a and b then either a ≤ b or b ≤ a or a = b.

If a > 0 then we say that it is positive and if a < 0 we say it is negative.

(O5) If a ≤ b and c ≤ d then a + c ≤ b + d.

(O6) If a ≤ b and c is positive then ac ≤ bc.

The only axiom that you really have to watch is (O6). Here is an example of a proof using these axioms.

Example 3.3.1. We prove that a ≤ b if, and only if, b − a is non-negative, that is, 0 ≤ b − a. Since this statement involves an ‘if, and only if’ there are, as usual, two statements to be proved. Suppose first that a ≤ b. By axiom (O5), we may add −a to both sides to get a + (−a) ≤ b + (−a). But a + (−a) = 0 and b + (−a) = b − a, by definition. It follows that 0 ≤ b − a. Now we prove the converse. Suppose that 0 ≤ b − a. By definition, b − a = b + (−a). Thus 0 ≤ b + (−a). By axiom (O5), we may add a to both sides to get 0 + a ≤ (b + (−a)) + a. But 0 + a = a and (b + (−a)) + a quickly simplifies to b. We have therefore proved that a ≤ b, as required.

Exercises 3.3

1. Prove that between any two distinct rational numbers there is anotherrational number.

2. Prove the following using the axioms.

(a) If a ≤ b then −b ≤ −a.

(b) $a^2$ is positive for all a ≠ 0.

(c) If 0 < a < b then $0 < b^{-1} < a^{-1}$.


3.4 The real numbers

The axioms I have introduced so far apply equally well to both the rational numbers Q and the real numbers R. But we have seen that although Q ⊆ R the two sets are not equal because we have proved that $\sqrt{2} \notin \mathbb{Q}$. In fact, we shall see later that there are many more irrational numbers than there are rational numbers. In this section, I shall explain the fundamental difference between rationals and reals. This material will not be needed in the rest of this book; instead, its role is to connect with the foundations of calculus, that is, with analysis.

It is convenient to write K to mean either Q or R in what follows because I want to make the same definitions for both sets. A non-empty subset A ⊆ K is said to be bounded above if there is some number b ∈ K so that for all a ∈ A we have that a ≤ b. For example, the set A = {2n : n ≥ 0} is not bounded above since its elements get bigger and bigger without limit. On the other hand, the set $B = \{ (\tfrac{1}{2})^n : n \geq 0 \}$ is bounded above, for example by 1. A non-empty subset A as above is said to have a least upper bound if you can find a number a ∈ K with the following two properties: first of all, a must be an upper bound for A and second of all if b is any upper bound for A then a ≤ b. We shall now apply these definitions to a result we obtained earlier.

Let

\[ A = \{ a : a \in \mathbb{Q} \text{ and } a^2 \leq 2 \} \]

and let

\[ B = \{ a : a \in \mathbb{R} \text{ and } a^2 \leq 2 \}. \]

Then A ⊆ Q and B ⊆ R. Both sets are bounded above: the number $1\tfrac{1}{2}$, for example, works in both cases. However, I shall prove that the subset A does not have a least upper bound, whereas the subset B does.

Let’s consider subset A first. Suppose that r were a least upper bound. I claim that $r^2$ would have to equal 2, which is impossible because we have proved that $\sqrt{2}$ is irrational.

Suppose first that $r^2 < 2$. Then I claim there is a rational number $r_1$ such that $r < r_1$ and $r_1^2 < 2$. Choose any rational number h such that 0 < h < 1 and

\[ h < \frac{2 - r^2}{2r + 1}. \]

Put $r_1 = r + h$. By construction $r_1 > r$. We calculate $r_1^2$ as follows

\[ r_1^2 = r^2 + 2rh + h^2 = r^2 + (2r + h)h < r^2 + (2r + 1)h < r^2 + (2 - r^2) = 2. \]

Thus $r_1^2 < 2$ as claimed. But this contradicts the fact that r is an upper bound of the set A.

Suppose now that $2 < r^2$. Then I claim that I can find a rational number $r_1$ such that $r_1 < r$ and $2 < r_1^2$. Put $h = \frac{r^2 - 2}{2r}$ and define $r_1 = r - h$. Clearly, $0 < r_1 < r$. We calculate $r_1^2$ as follows

\[ r_1^2 = r^2 - 2rh + h^2 = r^2 - (r^2 - 2) + h^2 > r^2 - (r^2 - 2) = 2. \]

But then $r_1$ is an upper bound of A that is smaller than r, which contradicts the fact that r is supposed to be a least upper bound.

We have therefore proved that if r is a least upper bound of A then $r = \sqrt{2}$. But this is impossible because we have proved that $\sqrt{2}$ is irrational. Thus the set A does not have a least upper bound in the rationals. However, by essentially the same reasoning the set B does have a least upper bound in the reals: the number $\sqrt{2}$. This motivates the following definition. It is this axiom that is needed to develop calculus properly.

The completeness axiom for R

Every non-empty subset of the reals that is bounded above has a least upper bound.
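The two constructions used in the argument about the set A are completely explicit, so they can be carried out with exact rational arithmetic. In the sketch below the function names are mine, and the particular choice of h in `improve_lower` is just one rational number satisfying the two conditions required in the text. Repeating the steps shows rationals creeping up on, and down towards, a number whose square is 2 without ever reaching it.

```python
from fractions import Fraction

def improve_lower(r):
    """Given a rational r > 0 with r^2 < 2, return a rational r1 > r with r1^2 < 2."""
    r = Fraction(r)
    assert r > 0 and r * r < 2
    bound = (2 - r * r) / (2 * r + 1)
    h = min(Fraction(1, 2), bound / 2)   # one choice with 0 < h < 1 and h < bound
    return r + h

def improve_upper(r):
    """Given a rational r > 0 with r^2 > 2, return a rational r1 with 0 < r1 < r and r1^2 > 2."""
    r = Fraction(r)
    assert r > 0 and r * r > 2
    h = (r * r - 2) / (2 * r)
    return r - h

low, high = Fraction(1), Fraction(3, 2)
for _ in range(4):
    low, high = improve_lower(low), improve_upper(high)
    print(float(low), float(high))
```

No rational number is both the end of the increasing sequence and the start of the decreasing one, which is the gap in Q that the completeness axiom closes in R.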

The Peano Axioms

Set theory is supposed to be a framework in which all of mathematics can take place. Let me briefly sketch out how we can construct the real numbers using set theory. The starting point is the Peano axioms studied by G. Peano (1858–1932). These deal with a set P and an operation on this set called the successor function which for each n ∈ P produces a unique element $n^+$. The following four axioms should hold:

(P1) There is a distinguished element of P that we denote by 0.

(P2) There is no element n ∈ P such that $n^+ = 0$.

(P3) If m, n ∈ P and $m^+ = n^+$ then m = n.

(P4) If X ⊆ P is such that 0 ∈ X and, whenever n ∈ X, also $n^+ \in X$, then X = P.

By using ideas from set theory, one shows that P is essentially the set of natural numbers together with its operations of addition and multiplication.
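The text does not spell out how addition and multiplication arise from the successor function. One standard way, which I sketch here as an assumption rather than as the construction the author has in mind, is by recursion: m + 0 = m and $m + n^+ = (m + n)^+$, then m · 0 = 0 and $m \cdot n^+ = m \cdot n + m$. In the sketch, Python's non-negative integers stand in for the elements of P, and `n - 1` is used only to recover the element whose successor is n.

```python
def succ(n):
    """The successor function n -> n^+, modelled on Python's non-negative integers."""
    return n + 1

def add(m, n):
    """Addition by recursion on n: m + 0 = m and m + k^+ = (m + k)^+."""
    return m if n == 0 else succ(add(m, n - 1))

def mul(m, n):
    """Multiplication by recursion on n: m * 0 = 0 and m * k^+ = (m * k) + m."""
    return 0 if n == 0 else add(mul(m, n - 1), m)

print(add(3, 4))   # 7
print(mul(3, 4))   # 12
```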

The natural numbers are deficient in that it is not always possible to solve equations of the form a + x = b because of the lack of negative numbers. However, we can use set theory to construct Z from N by using ordered pairs. The idea is to regard (a, b) as meaning a − b. However, there are many names for the same negative number so we should have (0, 1) and (2, 3) and (3, 4) all signifying the same number: namely, −1. To make this work, one uses another idea from set theory, that of equivalence relations which we shall meet later. This gives rise to the set Z. Again using ideas from set theory, the usual operations can be constructed on Z.

But the integers are deficient because we cannot always solve equations of the form ax + b = 0 because of the lack of rational numbers. To construct them we use ordered pairs again. This time (a, b), where b ≠ 0, is interpreted as $\frac{a}{b}$. But again we have the problem of multiple names for what should be the same number. Thus (1, 2) should equal (−1, −2) should equal (2, 4) and so forth. Once again this problem is solved by using an equivalence relation, and once again, the set which arises, which is denoted by Q, is endowed with the usual operations.
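The equivalence relations themselves are deferred to later in the book, but the usual tests for when two pairs name the same number are simple to state, and I sketch them here as an assumption about what is intended. Pairs (a, b) and (c, d) name the same integer when a + d = c + b, a condition that avoids subtraction; they name the same rational when ad = cb, a condition that avoids division. The function names are my own.

```python
def same_integer(p, q):
    """Pairs (a, b), read as a - b, name the same integer when a + d == c + b."""
    (a, b), (c, d) = p, q
    return a + d == c + b

def same_rational(p, q):
    """Pairs (a, b) with b != 0, read as a/b, name the same rational when a * d == c * b."""
    (a, b), (c, d) = p, q
    if b == 0 or d == 0:
        raise ValueError("the second entry of each pair must be non-zero")
    return a * d == c * b

# The names for -1 mentioned in the text.
print(same_integer((0, 1), (2, 3)), same_integer((2, 3), (3, 4)))        # True True

# The names for one half mentioned in the text.
print(same_rational((1, 2), (-1, -2)), same_rational((1, 2), (2, 4)))    # True True
```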

As we have seen, the rationals are deficient in not containing numbers like $\sqrt{2}$. The intuitive idea behind the construction of the reals from the rationals is that we want to construct R as all the numbers that can be approximated arbitrarily closely by rational numbers. To do this, we form the set of all subsets X of Q which have the following characteristics: X ≠ ∅, X ≠ Q, if x ∈ X and y ≤ x then y ∈ X, and X doesn’t have a biggest element. These subsets are called Dedekind cuts and should be regarded as defining the real number r so that X consists of all the rational numbers less than r.
