microchip mathematics number theory
Microchip Mathematics number theory for computer users
Keith Devlin, Mathematics Department, University of Lancaster
-
SHIVA PUBLISHING LIMITED 64 Welsh Row, Nantwich, Cheshire CW5 5ES, England
Keith Devlin, 1984
ISBN 1 850140472
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording and/or otherwise, without the prior written permission of the Publishers.
This book is sold subject to the Standard Conditions of Sale of Net Books and may not be resold in the UK below the net price given by the Publishers in their current price list.
The front cover shows the author with a print-out of the largest known prime number, a number with 39751 digits. The print-out is 9 feet in length. It required over half an hour of main frame computer time to work out the digits in this number. (Photograph taken at The Computer Unit, Warwick University, courtesy of Dr Keith Halstead.)
Printed and bound in Great Britain by Billing and Sons Limited
-
Contents

PREFACE

0. BACKGROUND: PRIME NUMBERS
   1. Prime Numbers
   2. The Sieve of Eratosthenes
   3. The Distribution of Primes
   4. Largest Known Primes
   5. Conjectures About Primes
   Exercises 0
   Computer Problems 0

I. BASIC CONCEPTS
   1. Mathematical Induction
   2. Divisibility. The Euclidean Algorithm
   3. Efficiency of Algorithms. Multiprecision Arithmetic
   4. The Fibonacci Sequence and the Efficiency of the Euclidean Algorithm
   5. Prime Numbers
   6. Diophantine Equations
   Exercises I
   Computer Problems I

II. CONGRUENCES
   1. Congruence
   2. Modular Arithmetic
   3. Fermat's Little Theorem and the Euler Phi-Function
   4. Random Number Generators and Primitive Roots
   Exercises II
   Computer Problems II

III. PRIMALITY TESTING AND FACTORISATION
   1. Perfect Numbers and Mersenne Primes
   2. Public Key Cryptography
   3. Primality Testing
   4. Factorisation Techniques
   Exercises III
   Computer Problems III

RECOMMENDED FURTHER READING
INDEX OF NOTATION
INDEX
-
Preface
In the Autumn of 1983, in the face of the phenomenal growth of
home computer sales in the U.K., the national British newspaper
The Guardian decided to produce, each week, a 'Computer Page'.
No one was quite sure exactly what should go into the page on a
regular basis, but it was thought that a fortnightly column on
computer mathematics might be a good idea, and when the computer
page first appeared on 20th October of that year, it included a
small item on binary arithmetic by me.
From the mail I received after my column had been running
for a few months, it was clear that the microcomputer age had brought
with it a huge increase in the number of (potential)
'recreational mathematicians'. Though in many cases without any
formal training in mathematics, my correspondents displayed tremendous
mathematical ability, and I was frequently asked if I could recommend
any suitable books. What they seemed to want was a genuine
mathematics text book, but one which did not require a great deal
of prior knowledge. This is intended to be just such a book.
Number Theory is one of the few areas of modern mathematics
which is accessible to the non-expert. (At least, the kind of
Number Theory considered here: there is a lot of other material
which also goes under the title 'Number Theory', most of which
is pretty well inaccessible to the majority of trained
mathematicians!) It is also an area in which there is a genuine
two-way flow between man and the computer. Indeed, it was this
fascinating interplay of brain power and computer power that
awakened my own interest in the subject to a level where I began
to give a course on the subject at Lancaster University and,
coincidentally, write about it in The Guardian. (Previously my
mathematical research work had been in Set Theory, a subject dealing
almost exclusively with the mysterious world of the infinite.)
This is a book about (the computational aspects of) Number
Theory. Though written for university undergraduates in
mathematics, I have tried to present the material in such a way
that it can be followed by the keen but largely untrained 'amateur'
sitting at home with (or possibly even without) a cheap home
computer. I do not pretend to give a complete coverage of the
computational aspects of Number Theory. (For instance, no mention
is made of Quadratic Reciprocity, a tremendously important part
of the subject.) Rather my aim is to cover the (very) basic parts
of Number Theory and at the same time give some indication of the
way in which Number Theory both feeds off and leads to advances
in Computation Theory. Consequently, if the book were used as
a text to accompany a university lecture course, the lecturer would
presumably deal with additional topics not covered in this book.
In writing this book, I made extensive reference to, in
particular, two excellent books, to which this text could be regarded
as a precursor. David Burton's book Elementary Number Theory
gives a wonderfully readable coverage of (essentially the
non-computational aspects of) Number Theory, and covers many more topics
than I have space for here, whilst Donald Knuth's 'The Art of
Computer Programming, Volume 2' is the 'bible' for serious
computational number theorists.
The book is structured in a way that assumes a more or less
direct passage from start to finish, though an index is provided
to enable the book to be used as a reference text if necessary.
Each chapter (including an informal preparatory chapter) ends with
a selection of (mathematical) Exercises, grouped according to the
section they refer to, and some Computer Problems. The latter
are, for the most part, just initial 'pointers' as to what can be
tried out on a computer, and I would hope that these are enough
to spur the reader on to carrying out further computer investigations
of his or her own devising.
To assist readers who wish to skip proofs and concentrate on
the development of the main results, the symbol □ is used to
indicate the end of a proof. (Whenever this symbol occurs
immediately following the statement of a result, this indicates
that the proof is so obvious as to require no further comment.)
For easy reference, all results obtained are numbered consecutively,
the reference numbers consisting of the Chapter number, section
number, and result number.
Keith Devlin
Lancaster, August 1984
-
Pierre de Fermat: 'The Father of Number Theory'. Born in 1601
near Toulouse in France, Fermat was a jurist by profession, and
only took up mathematics as a hobby in his thirties. Through
correspondence with many of the leading scholars of the day, Fermat
developed most of the pivotal ideas of present day Number Theory.
Many of his ideas to simplify mental calculation are nowadays
employed to speed up computer algorithms. This painting is from
the collection of the Academie des Sciences, Inscriptions et
Belles Lettres de Toulouse; it is reproduced here with the kind
permission of Robert Gillis.
-
0 Background: Prime Numbers
Numbers constitute the one mathematical system familiar to all
mankind, at least if by 'number' you mean 'positive whole number'
as did the Ancient Greeks. Today the professional mathematician
uses the phrase 'natural number' to denote the positive whole
numbers 1, 2, 3, ... This is a book about these 'natural' numbers,
and we shall rarely have occasion to speak of other numbers such
as proper fractions. The study of the natural
numbers is known as 'Number Theory', and in keeping with the
traditions of that subject we shall use the word 'number' to mean
'natural number' unless otherwise indicated. (This convention
is used in the very name 'Number Theory' of course.)
The natural numbers are so fundamental to the rest of
mathematics that the famous 19th Century mathematician Leopold
Kronecker once remarked that 'God created the natural numbers,
and all the rest is the work of man.' What he meant by this
was that, starting from the natural numbers it is possible to
construct, in a rigorous fashion, the entire edifice of modern
mathematics, which is true, and that the natural numbers themselves
cannot be constructed (in a mathematical sense) from any simpler
entities, which was true when Kronecker made his remark but is
no longer valid, Cantor's Set Theory having provided a way of
constructing the natural numbers using simple sets. But this
last point notwithstanding, Kronecker's remark is still pretty
indicative of the status of the natural numbers in mathematics.
As the natural numbers are fundamental to the rest of
mathematics, so are the prime numbers fundamental to the natural
numbers. Strictly speaking, we shall not be in a position to
make a proper study of the prime numbers until we have developed
our Number Theory sufficiently, but so basic are the prime numbers
that it will be helpful to present a few basic facts before we
do anything else. All of these facts will be proved rigorously
in due course. (This is not to say that anything we say is at
all likely to strike you as unlikely, quite the contrary. But
in mathematics it is prudent to leave nothing to chance, as history
has taught us time and time again.)
1. PRIME NUMBERS
A number (natural number) p is said to be prime if it is greater
than 1 and is divisible (without remainder) by only 1 and p.
A number greater than 1 which is not prime is said to be composite.
For example, 2,3,5,7,11,13,17,19 are all prime and 4,6,8,9,10,
12,14,15,16,18,20 are all composite. Obviously, with the
exception of 2, all primes are odd numbers, a fact which leads
to the old joke that 2, being even, is a very 'odd' prime.
In Book IX of his 'Elements', Euclid proved that there are
infinitely many prime numbers. (It is obvious that there are
infinitely many composite numbers, for instance every even number
greater than 2 is composite.)
The reason why the prime numbers play such a fundamental part
in Number Theory lies to a great extent in the following simple
fact, which we shall prove when we come to develop the theory of
prime numbers in a proper fashion: if p is a prime and p divides
a product ab of two numbers a and b, then p must divide (at least)
one of a and b on its own.
(This is not true for non-primes p: for instance, 6 divides
36 = 4 × 9, but 6 does not divide either of 4 or 9.)
Using the above fact, it can be proved that every number greater
than 1 can be expressed as a product of prime numbers, and that
moreover such an expression is unique apart from the order in which
the prime factors appear.
This result is known as the Fundamental Theorem of Arithmetic.
For example, $1200 = 2^4 \cdot 3 \cdot 5^2$. (Actually,
it is perhaps prudent to make a remark here about the use of the
word 'product' in mathematics. Ordinarily, by a 'product' of
numbers one means two numbers multiplied together. In mathematics,
the word 'product' is used to mean the result of any number of
numbers multiplied together. Included in this is the degenerate
case of a single number, where in reality there is no multiplication
involved at all. Thus, for example, the prime number 3 is a
'product' of primes, as is any prime number. Though on the face
of it it may seem a little strange to refer to individual primes
as 'products' of primes, this is done in order to simplify the
statement of mathematical results. For instance, without this
convention it would be necessary to exclude the prime numbers from
the statement of The Fundamental Theorem of Arithmetic.)
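As an illustration of the Fundamental Theorem of Arithmetic, the following Python sketch finds the prime factors of a number by repeated trial division. The function name `prime_factors` is an assumption made for this example and does not come from the text; it is a simple demonstration, not an efficient factorisation method.

```python
def prime_factors(n):
    """Return the prime factors of n (n > 1) in non-decreasing order."""
    factors = []
    d = 2
    while d * d <= n:
        # When d divides n here, d is prime: all smaller factors
        # have already been divided out.
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        # Whatever remains has no factor up to its square root, so it is prime.
        factors.append(n)
    return factors


print(prime_factors(1200))  # [2, 2, 2, 2, 3, 5, 5]
print(prime_factors(3))     # [3]
```

Note that a prime such as 3 comes back as the one-element list `[3]`, which matches the convention that a single prime counts as a 'product' of primes.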
2. THE SIEVE OF ERATOSTHENES
Given a particular number, how can we determine whether it is prime
or not? The most obvious method is to go through all smaller
numbers greater than 1 and see if any of them divide into it.
If a divisor is found, the number cannot be prime; if no divisor
is found, it must be prime. Though simple to describe, this method
is unwieldy in practice: for example, to check if 83 is prime would
require 81 trial divisions.
The above can be speeded up considerably by the observation
that if a number a has a factor (other than 1 and a) it necessarily
has a factor no greater than $\sqrt{a}$. (This is easily proved.) So in order
to check if a number is prime it is only necessary to look for
possible factors up to its square root. For numbers such
as 83, this makes the method feasible, of course, since then only
the numbers 2, ..., 9 need be checked, but for larger numbers the
method is still unwieldy.
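The square-root shortcut can be sketched in Python as follows. The function name `is_prime` is an assumption for this example; `math.isqrt` returns the integer part of the square root.

```python
import math


def is_prime(a):
    """Trial division by 2, 3, ..., up to the integer square root of a."""
    if a < 2:
        return False
    for d in range(2, math.isqrt(a) + 1):
        if a % d == 0:
            return False   # found a divisor other than 1 and a
    return True


print(is_prime(83))                         # True
print(len(range(2, math.isqrt(83) + 1)))    # 8 trial divisors: 2, ..., 9
```

For 83 this needs 8 trial divisions instead of 81.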
A simple technique for determining all the primes less than
a given number without using any arithmetic at all was invented
by the Greek mathematician Eratosthenes of Cyrene (276-194 B.C.).
To find all the primes less than N, you begin by writing all the
numbers 2, 3, 4, 5, ..., N in a list. Starting from 2, every second
number on the list will be even, of course, and hence, excepting
2 itself, will be composite. So you go through the list deleting
every second number (but leaving 2 untouched). Now turn your
attention to the next number on the list which has not been crossed
out, namely 3. Starting from 3, every third number will be a
multiple of 3, hence, excepting 3 itself, composite. So leave
3 untouched and then proceed to cross out every third number
thereafter. (In counting every third number, you include the
crossed-out numbers.) The next number remaining on the list (i.e.
not crossed out) is 5. Starting at 5, cross out every fifth number
(but leave 5). And so on. By the time you reach the largest
prime not exceeding the square root of N by this procedure, you will
have deleted all composite numbers from the list, and what is
left will constitute a list of all the primes less than N.
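The crossing-out procedure just described can be sketched in Python. This is a minimal illustration, with the function name `sieve` and the list `crossed_out` chosen for this example. For simplicity the outer loop runs over the whole list instead of stopping at the square root of N; the extra passes cross out nothing new.

```python
def sieve(N):
    """Return all primes up to and including N by the Sieve of Eratosthenes."""
    crossed_out = [False] * (N + 1)
    primes = []
    for p in range(2, N + 1):
        if crossed_out[p]:
            continue                 # p was deleted earlier, so it is composite
        primes.append(p)             # p survived every earlier pass: it is prime
        # Leave p itself untouched and cross out every p-th number after it.
        for m in range(2 * p, N + 1, p):
            crossed_out[m] = True
    return primes


print(sieve(30))         # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
print(len(sieve(100)))   # 25
```

No division is performed anywhere: the inner loop only counts forward in steps of p, just as in the pencil-and-paper method.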
The process of successively eliminating the multiples of
2,3,5,7, etc in the above method is known as 'sieving' (for obvious
reasons): hence the name 'The Sieve of Eratosthenes'. Later in
the book we shall have occasion to study other 'sieving' procedures.
Though it eliminates the need for arithmetic, it is clear
that if N is much greater than, say 100, sieving is also not a
very practical way to find prime numbers. In fact, utilising
various mathematical results we shall obtain in this book, it is
possible to develop much more efficient methods for primality
testing.
3. THE DISTRIBUTION OF THE PRIMES
If you use the Sieve of Eratosthenes to list all the primes less
than, say, 100, you will be able to see that, though the primes
are common amongst the smaller numbers (less than 20, say), they
become less frequent the higher up you go. In fact, the sieving
method makes it quite clear why this is the case. The higher
up you are, the more numbers will be sieved out by the time you
get there.
If we denote by $\pi(n)$ the number of primes less than n, then
the following table shows how $\pi(n)$ varies with n for a few values
of n.

            n     $\pi(n)$
        1,000          168
       10,000        1,229
      100,000        9,592
    1,000,000       78,498
   10,000,000      664,579
  100,000,000    5,761,455
In 1896, Hadamard and de la Vallée Poussin independently
succeeded in proving that as n tends to infinity, the ratio of $\pi(n)$
to $n/\log(n)$ approaches 1, i.e.
$$\lim_{n \to \infty} \frac{\pi(n)}{n/\log(n)} = 1.$$
(This followed considerable work on the problem by Tchebychef,
Riemann, and others.) This result is known as The Prime Number
Theorem. It had been conjectured over a hundred years earlier
by Legendre and Gauss, based upon the numerical evidence supplied
by tables such as the above.
An even better formula approximating $\pi(n)$ for 'large' n was
suggested by Gauss and subsequently proved by de la Vallée Poussin,
namely the function
$$\mathrm{Li}(n) = \int_2^n \frac{dx}{\log(x)}.$$
The accuracy of these approximating functions can be judged from
the following table, which extends the one above.

            n     $\pi(n)$     n/log(n)        Li(n)
        1,000          168          145          178
       10,000        1,229        1,086        1,246
      100,000        9,592        8,686        9,630
    1,000,000       78,498       72,382       78,628
   10,000,000      664,579      620,420      664,918
  100,000,000    5,761,455    5,428,681    5,762,209
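The first rows of the table can be reproduced on a small computer. The sketch below counts primes by sieving and compares the count with n/log(n), where log is the natural logarithm; the function name `prime_count` is an assumption for this example, and the Li(n) column is left out because it needs numerical integration.

```python
import math


def prime_count(n):
    """pi(n): the number of primes less than n, found by sieving."""
    if n < 3:
        return 0
    is_composite = bytearray(n)      # one flag for each of 0, 1, ..., n-1
    count = 0
    for p in range(2, n):
        if not is_composite[p]:
            count += 1
            # Multiples below p*p were already marked by smaller primes.
            for m in range(p * p, n, p):
                is_composite[m] = 1
    return count


for n in (1_000, 10_000, 100_000):
    print(n, prime_count(n), round(n / math.log(n)))
# 1000 168 145
# 10000 1229 1086
# 100000 9592 8686
```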
One thing that is immediately apparent from the above table
is that, whilst $\mathrm{Li}(n)$ approximates $\pi(n)$ with considerable accuracy
for quite modest values of n, it always does so on the large side:
$\mathrm{Li}(n) - \pi(n)$ is always positive. Is this in fact always the case,
or are there values of n for which $\mathrm{Li}(n) - \pi(n)$ is negative?
This is one of those salutary occasions when the mathematical
fact is at variance with all the available numerical evidence.
No number n has ever been found for which $\mathrm{Li}(n) - \pi(n)$ is
negative, despite considerable computer searches. Nevertheless,
the mathematician J.E. Littlewood proved that such an n must exist.
In fact, the sign of $\mathrm{Li}(n) - \pi(n)$ changes infinitely often as n
runs up through all the numbers. It must certainly change
somewhere before a certain number of incomprehensible magnitude,
almost the largest number ever to play a genuine part in
mathematics. It seems
likely that, no matter how much computers develop in the future,
no one will ever know of a specific example of a number n for
which $\pi(n)$ exceeds $\mathrm{Li}(n)$.
4. LARGEST KNOWN PRIMES
Knowing that there are infinitely many primes, mankind's curiosity
has naturally resulted in computer searches being made for 'record'
primes. Such searches involve some interesting mathematics, and
require very efficient computer programs. For mathematical reasons
which will be explained later in the book, record primes are
nowadays always of the form $2^n - 1$ for certain numbers n.
Prior to 1971, the largest known prime was $2^{11,213} - 1$, a
number which would require some 3,376 digits to write out in the
normal way. This was discovered by Donald B. Gillies in 1963
using the ILLIAC-II computer. In 1971, Bryant Tuckerman used
an IBM 360-91 computer to show that the 6,002 digit number
$2^{19,937} - 1$ is prime. In 1978, two 18 year old American high
school students, Laura Nickel and Curt Noll, discovered the prime
$2^{21,701} - 1$, using a CDC-CYBER-174. This feat so caught the
imagination of the American public that Nickel and Noll's discovery
was announced on nationwide television and made every major American
newspaper. The Nickel-Noll prime has 6,533 digits. One year
later, Noll used the same computer to better the record with the
6,987 digit number $2^{23,209} - 1$.
It took the CDC-CYBER-174 well over eight hours to run the
check on primality for Noll's number. Two weeks later, David
Slowinski used the immensely powerful CRAY-1 computer to check
the primality of the same number: it took a mere seven minutes.
Aided by Harry Nelson, Slowinski used the CRAY-1 to discover, on
April 8, 1979, that the 13,395 digit number $2^{44,497} - 1$ is prime.
For the period 1976 until 1984, the CRAY-1 was probably the most
powerful computer in the world, so it is not too surprising to
learn that Slowinski and his CRAY-1 kept the record for the world's
largest known prime. On September 25, 1982 the 25,962 digit
prime $2^{86,243} - 1$ was discovered. Then, on September 19, 1983
(at 11:36:33 a.m.), the 39,751 digit giant $2^{132,049} - 1$ was found,
this time using a CRAY-XMP computer, essentially two CRAY-1
computers joined together. At the time of writing, this is the
largest known prime number in the world.
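The digit counts quoted in this section can be checked without writing out the numbers. Since $2^p$ is never a power of 10, $2^p - 1$ has the same number of decimal digits as $2^p$, namely the integer part of $p \log_{10} 2$, plus 1. The following Python sketch applies this; the function name `mersenne_digits` is an assumption for this example.

```python
import math


def mersenne_digits(p):
    """Number of decimal digits of 2**p - 1, via floor(p * log10(2)) + 1."""
    return math.floor(p * math.log10(2)) + 1


for p in (11213, 19937, 21701, 23209, 44497, 86243, 132049):
    print(p, mersenne_digits(p))
# 11213 3376
# 19937 6002
# 21701 6533
# 23209 6987
# 44497 13395
# 86243 25962
# 132049 39751
```

This only counts digits; it says nothing about whether the number is prime.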
5. CONJECTURES ABOUT PRIMES
There are many easily formulated conjectures about primes, based
upon numerical evidence, which have resisted numerous attempts
at solution. For instance, to make the business of primality
testing feasible, record primes are nowadays always sought amongst
the numbers of the type $2^n - 1$. It is conjectured that there
are infinitely many prime numbers of this kind, but this has never
been proved. In fact the numerical evidence is rather flimsy.
Including the examples listed in the previous section, only 29
examples of such primes are known. A similar unsolved problem
is whether there are infinitely many primes of the form $2^n + 1$.
Are there infinitely many primes of the form $n^2 + 1$? The
conjecture is that there are. Again, Fermat, the great 17th
Century number theorist, conjectured that all numbers of the form
$$F_n = 2^{2^n} + 1$$
are prime. This is certainly true for $F_0 = 3$, $F_1 = 5$, $F_2 = 17$,
$F_3 = 257$, $F_4 = 65,537$. But unfortunately, there it stops.
In 1732, Euler found that $F_5 = 4,294,967,297$ is divisible by 641.
Despite considerable computerised searches, no prime numbers of
the form $F_n$ for n > 4 have ever been found, and the present day
conjecture is that $F_n$ is composite for all n > 4.
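Euler's observation about the Fermat numbers is easy to confirm by machine. The short Python sketch below computes the first few Fermat numbers and divides the fifth by 641; the function name `fermat` is an assumption for this example.

```python
def fermat(n):
    """The n-th Fermat number, 2**(2**n) + 1."""
    return 2 ** (2 ** n) + 1


print([fermat(n) for n in range(5)])   # [3, 5, 17, 257, 65537]
print(fermat(5))                       # 4294967297
print(fermat(5) % 641)                 # 0, so 641 divides F5
print(fermat(5) // 641)                # 6700417
```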
Two primes which are only 2 apart are said to be twin primes.
For example, 5 and 7 constitute a pair of twin primes, as do 17
and 19. Thousands of examples of such pairs have been discovered,
but the conjecture that there are infinitely many pairs of twin
primes remains unresolved.
In a letter to his colleague Euler written in 1742, Christian
Goldbach conjectured that every even number is the sum of two
numbers that are either prime or 1. For example, 4 = 2+2, 6 = 3+3,
8 = 3+5. Computer searches have demonstrated that this is true
up to 1,000,000,000, but the general problem remains unsolved,
and is known today as the Goldbach Conjecture. A similar open
question is whether every even number can be expressed as the
difference of two consecutive primes in infinitely many ways.
And in 1775 Lagrange conjectured that every odd number greater
than 5 can be written in the form p + 2q where p and q are both
primes, again still open.
Is it possible to find arbitrarily long finite arithmetic
progressions of prime numbers? At present the longest known is
of length 18, starting with the prime 107,928,278,317 and increasing
in steps of 9,922,782,870 until the number 276,615,587,107 is
reached. Even more demanding, are there arbitrarily long finite
arithmetic progressions of consecutive primes? The longest known
has length 6, starting with 121,174,811 and going up in steps of
30.
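The two progressions quoted above can be checked with nothing more than trial division. The Python sketch below is one way to do so; the names `is_prime`, `ap18` and `ap6` are assumptions for this example. For the progression of consecutive primes it also confirms that no other prime lies between the first and last terms.

```python
import math


def is_prime(a):
    """Trial division by 2 and by odd numbers up to the integer square root."""
    if a < 2:
        return False
    if a % 2 == 0:
        return a == 2
    for d in range(3, math.isqrt(a) + 1, 2):
        if a % d == 0:
            return False
    return True


# The 18-term progression: first term and common difference as quoted.
ap18 = [107_928_278_317 + k * 9_922_782_870 for k in range(18)]
ap18_all_prime = all(is_prime(t) for t in ap18)
print(ap18[-1])            # 276615587107
print(ap18_all_prime)

# Six consecutive primes in steps of 30.
ap6 = [121_174_811 + 30 * k for k in range(6)]
primes_in_range = [m for m in range(ap6[0], ap6[-1] + 1) if is_prime(m)]
ap6_consecutive = primes_in_range == ap6
print(ap6_consecutive)
```

The terms of the longer progression are near $10^{11}$, so each test needs a few hundred thousand trial divisions; this is slow by modern standards but well within reach.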
Occasionally a conjecture about primes does get solved.
For instance, in 1850, Tchebychef established Bertrand's Conjecture
that for every number n > 1 there is a prime number strictly between
n and 2n. And in 1950 it was shown that every number greater
than 9 can be written as a sum of distinct odd primes. But by
and large, most of the present day open conjectures about primes
seem to be extremely hard to answer.
EXERCISES 0
1. Use the Sieve of Eratosthenes to determine all the primes
less than 100.
2. Prove that if n is composite, it has a prime factor no greater
than $\sqrt{n}$.
3. A number is said to be square-free if it is not divisible
by any perfect square greater than 1. Prove that a number n > 1 is
square-free if, and only if, it is a product of distinct primes.
4. Prove that the only prime of the form $n^3 - 1$ is 7.
5. Prove that if $a^n - 1$ is prime, then a = 2.
6. Show that any prime greater than 3 is either one less or one
more than a multiple of 6. (This requires the Division
Algorithm considered in Chapter I.)
7. Show that if p is a prime greater than 5, either $p^2 - 1$ or $p^2 + 1$
is divisible by 10. (This requires the Division Algorithm,
considered in Chapter I.)
8. Use Bertrand's Conjecture to show that if $p_n$ is the n-th prime
(so $p_1 = 2$, $p_2 = 3$, $p_3 = 5$, etc.), then $p_n$
-
4. A natural question which arises from the proof of the
infinitude of the primes (see question 9 in the above
Exercises: see also Chapter I) is whether the numbers
$$p_1 p_2 \cdots p_n + 1$$
are prime, where $p_n$ denotes the n-th prime number.
Investigate this question.
5. Investigate the primality of the values of the quadratic
polynomial
$$f(n) = n^2 + n + 41$$
for n = 0, 1, 2, ..., 100.
6. There is an arithmetic progression of seven primes which starts
with a number less than 10 and increases in steps of 150.
Find it.
7. The de Polignac Conjecture states that every odd number is
the sum of a prime and a power of 2. Find a counterexample.
8. Verify the Goldbach Conjecture for all even numbers up to
1,000.
9. Find five primes of the form $2^n - 1$.
10. Find 10 pairs of twin primes. Find 100. Investigate the
behaviour of the function t(n) which gives the number of twin
pairs less than n.
-
I Basic Concepts
This preparatory chapter collects together an assortment of concepts
and ideas that will be required throughout the remainder of the
book. Most of the material covered will probably be familiar to
you, though the 'casual' reader may be a little surprised at the
degree of rigour with which the development proceeds. Exercising
great care and leaving nothing to 'chance' may well seem unnecessary
during the early stages, but when things become more complicated
and there is little or no intuition to be the guide, mathematical
rigour is the only guarantee of correctness, so it is as well to
start out as you mean to go on.
1. MATHEMATICAL INDUCTION
Many results in Number Theory are of the form 'all numbers have
such and such a property'.
For example, the statement that for any number n
$$1 + 2 + 3 + \cdots + (n-1) + n = \tfrac{1}{2}n(n+1).$$
How can one set about proving that such a statement is true?
Certainly one might begin by trying the formula with a few values
of n, say n = 1, 2, 3, ..., 10. In the case of the example given,
this will show that the formula is valid for each of these values
of n. Whilst this may well give cause to suspect the general
validity of the formula for all values of n, it does not at all
prove its universal validity. Indeed, the unreliability of
numerical evidence in situations such as this was highlighted by
the result of Littlewood mentioned in Chapter 0.3. (Another
instance was provided by Computer Problem 5 in that chapter.)
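The kind of numerical check described above takes only a few lines of Python. The function name `formula_holds` is an assumption for this example. As the text stresses, a run of successes is evidence only and proves nothing about the values of n that were not tried.

```python
def formula_holds(n):
    """Compare 1 + 2 + ... + n with n(n+1)/2 for one value of n."""
    lhs = sum(range(1, n + 1))
    rhs = n * (n + 1) // 2      # n(n+1) is always even, so // is exact
    return lhs == rhs


for n in range(1, 11):
    print(n, sum(range(1, n + 1)), formula_holds(n))
```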
In order to demonstrate rigorously that a statement is valid
for all values of the number n, it is necessary to prove that it
is impossible for there ever to be an n for which the statement
is false, no matter how large and computably inaccessible it may
be. In order to do this, it is common to argue like this.
Suppose, for the sake of argument, that there were a value of n
for which the statement in question is false. Then there would
have to be a first (i.e. least) such value of n. For this particular
value of n we would then have the situation where the statement
is true for all values 1, 2, 3, ..., n-1 but fails at n. The method
of proof by Mathematical Induction works by preventing this situation,
or rather by demonstrating that it can never arise.
The method is more easily understood by considering a specific
example, such as the one given above. That is, we want to prove
that for any number n
$$1 + 2 + 3 + \cdots + n = \tfrac{1}{2}n(n+1).$$
We begin by observing that the formula is valid for the special
case n = 1. Now suppose that it were false for some number n.
Let K be the least number for which the formula is false. We
thus have the situation
(i) $1 + 2 + \cdots + (K-1) = \tfrac{1}{2}(K-1)K$
(ii) $1 + 2 + \cdots + K \neq \tfrac{1}{2}K(K+1)$
To complete the proof we demonstrate that this situation is
contradictory, i.e. that equations (i) and (ii) cannot both be
valid. This is the part of any induction proof where some ingenuity
is required.
In the case of this example suppose we take equation (i) and
add K to both sides. This gives
$$1 + 2 + \cdots + (K-1) + K = \tfrac{1}{2}(K-1)K + K$$
which, when we simplify the right hand side, gives
$$1 + 2 + \cdots + (K-1) + K = \tfrac{1}{2}K(K+1).$$
This contradicts equation (ii), of course. (Strictly speaking,
(ii) is not an 'equation' but an 'inequation' but mathematicians
never bother about this kind of detail: there are more important
details to worry about.) It follows that the original assumption
that there were an n for which the formula is false cannot be a
valid assumption. In other words, the formula must be true for
all numbers n.
The reason the above method of induction works can be explained
like this. The difficulty in trying to prove that something is
true for all numbers n is that there are infinitely many numbers
and it is impossible to consider them all individually. By making
the assumption that the statement is false for some number, and
then concentrating on the least number for which it is false, you
are reduced to looking at just one number, namely that least value.
Of course, you do not know just which number that is. (Indeed,
if the result you are trying to prove turns out to be true, then
there will in fact not be such a number, so there is no way you
can know its value. But at the time of trying to prove the result
you do not know this, at least not 'officially'.) But as the
above example shows, in order to arrive at a contradiction it is
not at all necessary to know anything about the critical value
K other than that it is (by assumption) a critical value where
the statement in question first becomes false.
Notice that in the above example, in order to reach our final
contradiction we began with equation (i). But there will only
be an equation (i) provided that K > 1. This was why the proof
began with the observation that the result was valid for n = 1.
This means that K, for which the result is assumed false, must
indeed be greater than 1. This important point is often overlooked
by the beginner, so we shall emphasise it by trying to prove,
by induction, the false statement
$$1 + 3 + 5 + \cdots + (2n-1) = n^2 + 3.$$
Suppose that the above equation is false for some value of
n. Let K be the least value of n for which it fails. Then we
have the situation
(i) $1 + 3 + 5 + \cdots + (2K-3) = (K-1)^2 + 3$
(ii) $1 + 3 + 5 + \cdots + (2K-1) \neq K^2 + 3$
Add 2K-1 to both sides of equation (i) to obtain
$$1 + 3 + 5 + \cdots + (2K-3) + (2K-1) = (K-1)^2 + 2K + 2.$$
Rearranging the right hand side of this equation gives
$$1 + 3 + 5 + \cdots + (2K-1) = K^2 + 3,$$
which contradicts (ii). So far this looks very similar to the
proof used in our first example. The difference is that in this
example the result is not true for n = 1. This 'small' fact means
that the argument just given does not lead to the conclusion that
the formula concerned is valid for all values of n. In fact it
is false for all values of n.
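A quick numerical check makes the failure visible. In the Python sketch below (the function name `odd_sum` is an assumption for this example) the sum of the first n odd numbers is compared with $n^2 + 3$; the two never agree for the values tried, and the sum is always exactly 3 short.

```python
def odd_sum(n):
    """1 + 3 + 5 + ... + (2n - 1), computed term by term."""
    return sum(2 * i - 1 for i in range(1, n + 1))


for n in range(1, 8):
    print(n, odd_sum(n), n * n + 3, odd_sum(n) == n * n + 3)
# The last column is False on every line: the base case n = 1 already
# fails (1 versus 4), so the induction argument never gets started.
```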
Proofs by mathematical induction are often written in a slightly
different fashion. In order to prove that some statement A(n)
involving the number n is valid for all numbers n, it is possible
to proceed as follows.
1. Establish (usually by simple observation) that A(1) is
valid.
2. Give an algebraic proof that the truth of A(n) implies
that of A(n+1) (for an unspecified n).
This procedure is in fact logically equivalent to the first one.
Step 1 is, of course, common to both approaches. Step 2 above
will clearly preclude the existence of a K for which A(K) is false,
since any such K will have to be greater than 1 (by Step 1), and
so the least K will be of the form K = n+1 where A(n) is true
(K being the least for which A(K) is false), and Step 2 then implies
A(n+1) is true, i.e. A(K) is true, a contradiction.
We shall use this method of writing the proof to establish
the correct version of the formula for the sum of the first n
odd numbers, considered above. The formula is
(i) $1 + 3 + 5 + \cdots + (2n-1) = n^2$
To prove this by the method of induction, we begin by observing
that the formula is valid for n = 1. Now we make the assumption
that it is valid for an unspecified n, i.e. we assume that equation
(i) is indeed valid for some (unspecified) n, and we try to prove
that it is valid for n+1, i.e. that
(ii) $1 + 3 + 5 + \cdots + (2n+1) = (n+1)^2$
How do we prove that (ii) follows from (i)? This is easy.
Simply add 2n+1 to both sides of equation (i) and simplify the
right hand side.
Really the only difference in the two approaches is that in
the former we perform the algebra on the special (but unknown)
value n=K where the statement is false for the first time, and
in the second we perform the same algebra on some fixed, but likewise
unknown, n.
Notice in particular that in the second formulation of
induction, we do not make the assumption that A(n) is true for
all n, indeed, it is precisely in order to prove this that induction
is being used in the first place. Rather we assume that A(n)
is true for a single but totally unspecified value of n, which,
being unspecified, has to be referred to as 'n' throughout. (Some
authors introduce a second symbol, 'k', at this point and speak
about 'letting n=k', and you can do this if you prefer, preserving
the distinction between 'n', the variable, and 'k', a fixed but
arbitrary number. But the algebra remains the same, except that
n becomes k everywhere.)
The two examples of induction considered so far both involved
the verification of an equation. This is not always the case.
As an illustration, let us use induction to prove that for all
numbers n, 6 divides into 7^n - 1. For n=1 this is obviously true.
Now assume the result is valid for some arbitrary but fixed number
n. We shall try to use this assumption in order to prove that
6 divides into 7^(n+1) - 1. Notice that
7^(n+1) - 1 = 7.7^n - 7 + 6 = 7.(7^n - 1) + 6
By our 'induction hypothesis', 6 divides into 7^n - 1, so certainly
6 divides into 7.(7^n - 1). It follows at once that 6 divides into
7.(7^n - 1) + 6, of course, so we have succeeded in proving that 6
divides into 7^(n+1) - 1. It follows ('by induction') that 6 divides
into 7^n - 1 for all n.
You may well ask, why write 7^(n+1) - 1 in the form we did?
The only answer is that this led to the result we wanted. Different
situations will require different 'tricks', and induction proofs
often require considerable ingenuity at the 'n to n+1 step'.
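The divisibility claim above is easy to spot-check by machine. The following is only a sketch (the function name is our own, not part of the text); it checks both the rewriting 7^(n+1) - 1 = 7.(7^n - 1) + 6 used in the induction step and the conclusion itself for small n. A finite check of this kind is of course no substitute for the induction proof.

```python
def seven_power_minus_one(n: int) -> int:
    """Return 7^n - 1 (hypothetical helper name, for illustration only)."""
    return 7 ** n - 1


for n in range(1, 11):
    value = seven_power_minus_one(n)
    # The rewriting used in the 'n to n+1 step'.
    assert seven_power_minus_one(n + 1) == 7 * value + 6
    # The statement being proved: 6 divides 7^n - 1.
    assert value % 6 == 0
```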
We end this section with a particularly important application
of the method of induction: The Binomial Theorem. This allows
us to express powers of the form (a + b)^n as a sum of products
of powers of a and b. For example, the following are well known
and easily proved by direct evaluation:
(a + b)^2 = a^2 + 2ab + b^2,
(a + b)^3 = a^3 + 3.a^2.b + 3.a.b^2 + b^3.
In order to obtain a general result of this kind we need the
factorial function. For any number n, factorial n (also called
'n factorial') is that number, denoted by n!, obtained by
multiplying together all the numbers 1,2,3,...,n. Thus
n! = n(n-1)...3.2.1.
For example,
1! = 1, 2! = 2.1 = 2, 3! = 3.2.1 = 6, 4! = 4.3.2.1 = 24,
5! = 5.4.3.2.1 = 120, 6! = 6.5.4.3.2.1 = 720,
7! = 7.6.5.4.3.2.1 = 5040, 8! = 40320.
From the above examples it should be clear that the values of n!
increase very rapidly as n increases. It should also be clear
that there is a simple recursive procedure for calculating values
of n!, namely, for any n,
(n+1)! = (n+1).(n!)
For convenience, we define 0! = 1.
For any numbers n,r such that 0 ≤ r ≤ n, the binomial
coefficient C^n_r is defined by:
C^n_r = n! / (r!(n-r)!) = n(n-1)...(n-r+1) / r!
For example,
C^2_0 = 1, C^2_1 = 2, C^2_2 = 1;
C^3_0 = 1, C^3_1 = 3, C^3_2 = 3, C^3_3 = 1;
C^4_0 = 1, C^4_1 = 4, C^4_2 = 6, C^4_3 = 4, C^4_4 = 1.
Note that for any n, C^n_0 = C^n_n = 1, and that for any n,r,
C^n_r = C^n_(n-r).
Theorem I.1.1 (The Binomial Theorem) For any n ≥ 2,
(a + b)^n = C^n_0 a^n + C^n_1 a^(n-1) b + ... + C^n_r a^(n-r) b^r + ... + C^n_n b^n
Proof: By induction on n. The cases n = 2, 3 follow from
previous observations. So assume the result holds for n (i.e.
as stated above) and prove it for n+1. By this induction
hypothesis, we have:
a.(a + b)^n = C^n_0 a^(n+1) + C^n_1 a^n b + C^n_2 a^(n-1) b^2 + ... + C^n_r a^(n-r+1) b^r + ... + C^n_n a b^n
b.(a + b)^n = C^n_0 a^n b + C^n_1 a^(n-1) b^2 + C^n_2 a^(n-2) b^3 + ... + C^n_r a^(n-r) b^(r+1) + ... + C^n_(n-1) a b^n + C^n_n b^(n+1)
Adding these two expressions, we obtain
(a + b)^(n+1) = C^n_0 a^(n+1) + (C^n_1 + C^n_0) a^n b + ... + (C^n_r + C^n_(r-1)) a^(n-r+1) b^r + ... + (C^n_n + C^n_(n-1)) a b^n + C^n_n b^(n+1)
Since C^n_0 = C^(n+1)_0 = 1 and C^n_n = C^(n+1)_(n+1) = 1, we shall have completed
our proof if we can show that C^n_r + C^n_(r-1) = C^(n+1)_r for all r, since
the above expression will then be the theorem for n+1 in place
of n. So we must prove that
n!/(r!(n-r)!) + n!/((r-1)!(n-r+1)!) = (n+1)!/(r!(n+1-r)!)
But this is easy. Simply combine the two fractions on the left
into a single fraction, and upon simplification the expression
on the right is obtained. This completes the proof of the
binomial theorem. □
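For readers who like to see such identities confirmed numerically, here is a minimal sketch (the function names factorial, binomial and expand are our own choices). It computes the binomial coefficients from the factorial definition, checks the addition rule on which the induction step rests, and compares the expanded sum with a direct evaluation of (a + b)^n for one choice of integers.

```python
def factorial(n: int) -> int:
    """n! computed by the recursive rule (k+1)! = (k+1).(k!), with 0! = 1."""
    result = 1
    for k in range(1, n + 1):
        result *= k
    return result


def binomial(n: int, r: int) -> int:
    """The binomial coefficient n! / (r!(n-r)!) for 0 <= r <= n."""
    return factorial(n) // (factorial(r) * factorial(n - r))


def expand(a: int, b: int, n: int) -> int:
    """Right-hand side of the Binomial Theorem, summed term by term."""
    return sum(binomial(n, r) * a ** (n - r) * b ** r for r in range(n + 1))


# The rule needed in the induction step: C(n,r) + C(n,r-1) = C(n+1,r).
for n in range(1, 12):
    for r in range(1, n + 1):
        assert binomial(n, r) + binomial(n, r - 1) == binomial(n + 1, r)

assert expand(3, 5, 7) == (3 + 5) ** 7
```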
2. DIVISIBILITY. THE EUCLIDEAN ALGORITHM
The notion of divisibility of one number by another is fundamental
to practically all aspects of Number Theory. Given any two
numbers one can add them or multiply them and obtain a new (natural)
number. If you allow for negative numbers (and zero), by
considering the integers rather than just the positive integers,
you can subtract as well. But division cannot, in general, be
performed, which is to say the result of dividing one number*
or integer by another is not necessarily another number* or
integer. For instance, you cannot divide 2 by 3 and obtain a
natural number as the result. Division is an operation for which
you need, at the very least, the rational number system. But
rational numbers are not what we study in Number Theory (at least,
not for most of the time).
When you are restricting yourself to whole numbers, either
the natural numbers or the integers, the process of division
results in a 'quotient' and a 'remainder'. For example, when
you try to divide 9 by 4 you get a quotient of 2 and a remainder
of 1:
9 = 4.2 + 1
This fundamental fact is embodied in a result called The Division
Algorithm. This is a bit of a misnomer, since the result itself
is not an 'algorithm' at all. On the contrary, it merely asserts
the existence of a quotient and a remainder, and does not tell
you how to calculate them. (Though it can presumably be safely
assumed that you are, in fact, able to perform this task should
it prove necessary to do so.)
* Remember that we have agreed that the word 'number' shall mean 'natural number' except where indicated otherwise.
Theorem I.2.1 (The Division Algorithm) Let a,b be integers, b > 0.
Then there exist unique integers q,r such that
a = q.b + r and 0 ≤ r < b.
-
where O~r,r'
-
The following result is more general than Theorem I.2.1.
Theorem I.2.2 Let a,b be integers, b ≠ 0. Then there are
unique integers q,r such that
a = qb + r and 0 ≤ r < |b|.
Proof: We only need to consider the case b < 0, the case b > 0
having been dealt with in Theorem I.2.1 above. Since |b| > 0,
an application of Theorem I.2.1 gives unique integers q',r' such
that
a = q'.|b| + r' and 0 ≤ r' < |b|.
Since |b| = -b, if we set q = -q' and r = r', we get
a = qb + r and 0 ≤ r < |b|,
which proves the theorem. □
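On a computer the quotient and remainder of the theorem have to be produced with some care, since built-in division operators differ in how they treat negative operands. The sketch below (the function name division_algorithm is our own) follows the proof above: it divides by |b|, relying on the fact that Python's divmod returns a remainder r with 0 ≤ r < |b| when the divisor is positive, and then changes the sign of the quotient when b < 0.

```python
def division_algorithm(a: int, b: int) -> tuple[int, int]:
    """Return (q, r) with a = q*b + r and 0 <= r < |b|, for any integer b != 0."""
    if b == 0:
        raise ValueError("b must be non-zero")
    # For a positive divisor, divmod gives a remainder in the range 0 .. |b|-1.
    q, r = divmod(a, abs(b))
    # As in the proof: if b < 0 then |b| = -b, so the quotient changes sign.
    if b < 0:
        q = -q
    return q, r


for a in range(-20, 21):
    for b in list(range(-7, 0)) + list(range(1, 8)):
        q, r = division_algorithm(a, b)
        assert a == q * b + r and 0 <= r < abs(b)
```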
Simple though it is, the Division Algorithm enables us to
prove various results that can be of help in simplifying
computational work. For instance, suppose that we were looking
for numbers which are the square of a prime. It may be useful
to know that the square of any prime greater than 2, in fact the
square of any odd number, is one more than a multiple of 8. (For
example, 3^2 = 9 = 8 + 1, 5^2 = 25 = 3.8 + 1.) To prove this fact,
note that by the Division Algorithm, any number can be expressed
in one of the forms 4q, 4q+1, 4q+2, 4q+3, so any odd number is
of one of the forms 4q+1, 4q+3. Squaring each of these gives
(4q+1)^2 = 16q^2 + 8q + 1 = 8(2q^2 + q) + 1
(4q+3)^2 = 16q^2 + 24q + 9 = 8(2q^2 + 3q + 1) + 1.
In both cases the result is one more than a multiple of 8.
Further examples of this kind are given in the Exercises at
the end of this chapter. (See also Exercises 6 and 7 in
Chapter 0.)
An integer b is said to be divisible by a non-zero integer a,
written symbolically as a|b, if and only if there is an integer c
such that b = ac.
The following result lists the basic properties of divisibility.
Lemma I.2.3 Let a,b,c be integers, a ≠ 0. Then:
(i) a|0, 1|a, a|a;
(ii) a|1 if and only if a = ±1;
(iii) if a|b and c|d then ac|bd (for c ≠ 0);
(iv) if a|b and b|c then a|c (for b ≠ 0);
(v) [a|b and b|a] if and only if a = ±b;
(vi) if a|b and b ≠ 0, then |a| ≤ |b|;
(vii) if a|b and a|c then a|(bx+cy) for any integers x,y.
Proof: In each case the proof simply involves going back to the
definition of a|b. For example, to prove (iv), the assumptions
mean that there are integers d,e such that b = da and c = eb, from
which it follows that c = (de)a, and so a|c. To take another
case, consider (vi). Since a|b there is an integer c such that
b = ac. Thus |b| = |a|.|c|. Since b ≠ 0, we must have c ≠ 0, so
|c| ≥ 1, and hence |b| ≥ |a|. The remaining cases are left as an exercise. □
Let a,b be given integers, at least one of which is not zero.
Then there are only a finite number of numbers d such that d|a
and d|b. The largest of these numbers d is called the greatest
common divisor of a and b, denoted by gcd(a,b) or, more simply
if the meaning is clear from the context, by (a,b).
For example, take the integers 18 and -24. The positive
divisors of 18 are 1,2,3,6,9,18, whilst those of -24 are
1,2,3,4,6,8,12,24. The common divisors of 18 and -24 (amongst
the positive numbers) are thus 1,2,3,6. Thus 6 = gcd(18,-24).
(In the shorter notation, 6 = (18,-24).) Notice that, as this
example indicates, the gcd of two integers is always positive,
regardless of the sign of the two given integers.
A very useful result concerning the gcd of two integers is
that gcd(a,b) can always be expressed as a linear combination
(with integer coefficients) of a and b. For instance, in the
case of the example 6 = (18,-24) considered above, we have
6 = (-1).(-24) + (-1).(18).
Theorem I.2.4 Let a,b be integers, not both zero. Then there
are integers x,y such that
(a,b) = xa + yb
Proof: Assume, for the sake of argument, that a ≠ 0. Let S
be the set
S = {ua + vb | u,v are integers and ua + vb > 0}
Let u = 1 if a > 0 and let u = -1 if a < 0. Then ua + 0b ∈ S.
This shows that S is not empty. Let d be the smallest member
of S. Pick x,y so that d = xa + yb, as guaranteed by the
definition of S. We complete the proof by demonstrating that
d = (a,b).
By the Division Algorithm there are integers q,r such that
a = qd + r and 0 ≤ r < d.
Then,
r = a - qd = a - q(xa + yb) = (1 - qx)a + (-qy)b.
If r > 0, this would mean that r ∈ S, contrary to the minimality
of d in S. Thus we must have r = 0. But then a = qd, so d|a.
Similarly, d|b.
Having proved that d divides into both a and b, we now prove
that d is the largest such number. Suppose that c is a number
which divides both a and b. Then (by Lemma I.2.3, part (vii))
c|(xa + yb), i.e. c|d. It follows that c ≤ d, and hence that d
is the greatest common divisor of a,b. □
Corollary I.2.5 Let a,b be integers, not both zero, and let
T = {xa + yb | x,y are integers}
Then T is the set of all multiples of (a,b).
Proof: Let d = (a,b). Clearly, every member of T is a multiple
of d. Conversely, since we may write d = x_0.a + y_0.b for some
integers x_0,y_0, we have
nd = (n.x_0)a + (n.y_0)b
for any integer n, so every multiple of d is a member of T. □
We say that integers a,b are coprime (or relatively prime)
if (a,b) = 1.
In the case of coprime integers, Theorem I.2.4 has a converse,
namely:
Theorem I.2.6 Let a,b be integers, not both zero. Then a,b
are coprime if and only if there are integers x,y such that
xa + yb = 1.
Proof: If (a,b) = 1 then the existence of such x,y follows from
Theorem I.2.4. Conversely, suppose such x,y exist. Let d = (a,b).
Since d|a and d|b we must have d|(xa + yb), i.e. d|1. But d > 0
(since it is a gcd). Thus d = 1. □
We have already observed that division is not a permissible
operation when we are restricting ourselves to whole numbers.
But in the case where integers a,b are such that a|b, by definition
there is a (necessarily unique) integer c such that b = ac, and
we shall write b/a to denote that unique integer c. We make use
of this natural convention in the next result, a corollary to the
above theorem.
Lemma I.2.7 If (a,b) = d then (a/d, b/d) = 1.
Proof: Write d = xa + yb. Since d|a and d|b we can rewrite
this equation as
1 = x(a/d) + y(b/d).
So by Theorem I.2.6, (a/d, b/d) = 1. □
Notice that a|c and b|c do not necessarily imply that ab|c.
For example, 6|24 and 8|24 but 48 does not divide 24. However, we do have:
Lemma I.2.8 Suppose that (a,b) = 1. If a|c and b|c then ab|c.
Proof: Pick r,s so that c = ra, c = sb. Pick x,y so that
xa + yb = 1. Then xac + ybc = c, so
c = xasb + ybra = ab(xs + yr).
Thus ab|c. □
The following result, sometimes known as Euclid's Lemma, turns
out to be of fundamental importance in Number Theory.
Theorem I.2.9 If a|bc and (a,b) = 1 then a|c.
Proof: Write 1 = xa + yb, bc = na. Then
c = xac + ybc = xac + yna = a(xc + yn),
so a|c. □
How do you go about calculating the gcd of two given integers?
The 'obvious' method is to factor each number into a product of
primes and see which primes (with multiplicities) are common to
both. For example, to calculate (90,2268), notice that
90 = 2.3^2.5 and 2268 = 2^2.3^4.7
so that (90,2268) = 2.3^2 = 18. (It is easy to see that this method
always works.) The problem with this method is that factoring
a number into primes is an extremely time consuming business.
(See later.) A much more efficient method of calculating a gcd
is to use The Euclidean Algorithm. This depends upon the following
lemma.
Lemma I.2.10 If a = qb + r then (a,b) = (b,r).
Proof: Let d = (a,b). Then d|a and d|b, so d|r. Hence d is a
common divisor of b and r. Suppose that c > 0 is also a common
divisor of b and r. Then c|b and c|r so c|a (= qb + r). Thus c
divides both a and b, so c|d (by Lemma I.2.3, part (vii), since d
is a linear combination of a and b). It follows that c ≤ d. So,
by definition, d = (b,r). □
We are now able to describe the Euclidean Algorithm to determine
the gcd of two given integers a,b. We may assume that neither
of a,b is zero. (Otherwise the problem is trivial.) Since
(a,b) = (b,a) = (|a|,|b|), we may further assume that a ≥ b > 0.
By the Division Algorithm applied to the pair a,b we can find
integers q_1,r_1 such that
a = q_1.b + r_1 and 0 ≤ r_1 < b.
If r_1 = 0 then b|a so (a,b) = b and we are done. Otherwise
r_1 > 0 and by the above Lemma I.2.10 we have (a,b) = (b,r_1).
We now apply the Division Algorithm to b,r_1 to obtain integers
q_2,r_2 such that
b = q_2.r_1 + r_2 and 0 ≤ r_2 < r_1.
If r_2 = 0 then r_1|b so (a,b) = (b,r_1) = r_1 and we are done.
Otherwise r_2 > 0 and by Lemma I.2.10 again we have
(a,b) = (b,r_1) = (r_1,r_2). Now apply the Division Algorithm to
r_1,r_2 to obtain q_3,r_3 such that
r_1 = q_3.r_2 + r_3 and 0 ≤ r_3 < r_2.
Keep on in this fashion. Since b > r_1 > r_2 > r_3 > ... ≥ 0, there
must come a stage n for which r_(n+1) = 0. Then r_n = (a,b), and we
are done.
As an example, we shall find the gcd of the numbers 12345
and 678. Applying the Euclidean Algorithm as just outlined, we
obtain the following steps:
12345 = 18.678 + 141
678 = 4.141 + 114
141 = 1.114 + 27
114 = 4.27 + 6
27 = 4.6 + 3
6 = 2.3 + 0
Thus the gcd of 12345 and 678 is 3, the last non-zero remainder
obtained.
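The steps of the Euclidean Algorithm translate almost word for word into a short program. What follows is a sketch in Python rather than a definitive implementation; the names euclid_steps and gcd are our own. The first function records each division performed, so that the table of steps for 12345 and 678 can be reproduced.

```python
def euclid_steps(a: int, b: int) -> list[tuple[int, int, int, int]]:
    """Return the divisions (dividend, quotient, divisor, remainder) for a >= b > 0."""
    steps = []
    while b != 0:
        q, r = divmod(a, b)      # a = q.b + r with 0 <= r < b
        steps.append((a, q, b, r))
        a, b = b, r              # (a,b) = (b,r), so continue with the pair b, r
    return steps


def gcd(a: int, b: int) -> int:
    """Greatest common divisor of two integers, not both zero."""
    a, b = abs(a), abs(b)        # the gcd does not depend on the signs
    if a < b:
        a, b = b, a
    if b == 0:
        return a
    # The divisor in the final step is the last non-zero remainder.
    return euclid_steps(a, b)[-1][2]


for dividend, q, divisor, r in euclid_steps(12345, 678):
    print(f"{dividend} = {q}.{divisor} + {r}")
```

Run on 12345 and 678, the loop prints the six divisions of the worked example, one per line.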
It is obvious that the above computation is easily carried
out using at most a pocket calculator. To obtain the same result
by factoring into primes takes longer. The relevant factorisations
are
12345 = 3.5.823 and 678 = 2.3.113.
From these factorisations it is immediate that the gcd is 3, the
only prime factor the two numbers have in common. In fact, for
numbers of this size it is not so apparent that the factorisation
technique is not always feasible. The necessity of checking that
the numbers 823 and 113 are prime in the above example, though
involving more work than in the Euclidean Algorithm, is nonetheless
not too onerous. But, as we shall indicate in Chapter III, for
larger numbers, factorisation is virtually impossible, and must
therefore be avoided wherever possible.
Theorem 1.2.4 tells us that the gcd of two numbers can be
expressed as a linear combination (with integer coefficients) of
those two numbers. By tracing backwards through the Euclidean
Algorithm it is possible to find such an expression. This method
is best explained by means of an example. Consider the computation
above to determine gcd(12345,678). How can we express 3, the
answer, as a linear combination of 12345 and 678? Working our
way back through the calculation we find:
3 = 27 - 4.6
  = 27 - 4.(114 - 4.27)
  = 27 - 4.114 + 16.27
  = 17.27 - 4.114
  = 17.(141 - 1.114) - 4.114
  = 17.141 - 21.114
  = 17.141 - 21.(678 - 4.141)
  = 101.141 - 21.678
  = 101.(12345 - 18.678) - 21.678
  = 101.12345 - 1839.678
We shall examine the Euclidean Algorithm more closely in the
next section.
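The process of tracing backwards through the Euclidean Algorithm can also be mechanised. The recursive sketch below (the name extended_gcd is our own, and the routine assumes non-negative inputs, not both zero) returns the gcd together with one pair of coefficients x,y. Other pairs exist; this routine simply produces the pair that back-substitution yields.

```python
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (d, x, y) with d = (a,b) = x*a + y*b, for a, b >= 0 not both zero."""
    if b == 0:
        return a, 1, 0                 # (a,0) = a = 1.a + 0.0
    q, r = divmod(a, b)                # a = q.b + r
    d, x, y = extended_gcd(b, r)       # d = x.b + y.r
    # Substitute r = a - q.b to express d in terms of a and b.
    return d, y, x - q * y


d, x, y = extended_gcd(12345, 678)
assert d == x * 12345 + y * 678
print(d, x, y)                         # 3 101 -1839
```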
3. EFFICIENCY OF ALGORITHMS. MULTI-PRECISION ARITHMETIC
There are two distinct senses in which a mathematical problem can
be said to be 'solved'. First there is the pure 'existence' proof,
which demonstrates that, say, a number exists having certain
properties, but gives no indication as to just what that number
is. An example of such a solution is Littlewood's Theorem,
mentioned in Chapter 0.3, that there is a number n for which
Li(n) - π(n) is negative. No-one has any real idea of how to
actually find such a number. (Of course, in a sense there is
a method: examine each number in turn until one is found with the
desired property, but for reasons indicated in Chapter 0.3 this
is not at all a feasible method.) The second type of solution
is the computational solution, whereby a method (or 'algorithm')
is given which enables one to calculate numbers with the property
concerned. The Euclidean Algorithm is a good example of such
a solution.
As soon as you start talking about algorithms for the solution
of problems, the questions arise: 'How efficient is the algorithm?'
'Is it feasible in practical, computing terms?' 'And if so, for
what 'inputs' is it feasible?' The whole subject of algorithm
efficiency is a big one in its own right, and for the most part
lies outside our present scope, but insofar as it concerns our
subject matter we need to know a little bit about it.
First of all, just what do we mean by an 'algorithm'? It
is possible to give a fairly precise definition, but at this stage
it is sufficient to say that an algorithm is a sequence of
instructions which describe, in 'reasonable' detail, the steps
that must be performed in order to compute something: usually
the algorithm will have one or more numerical 'inputs' and produce
one or more numerical 'outputs'. The Euclidean Algorithm described
in the last section is a good example of such a procedure. (The
name 'algorithm' derives from al-Khowârizmî, a 9th Century Arabic
mathematician who wrote an influential textbook explaining the
Hindu system of decimal arithmetic.)
The first arithmetical algorithm that we ever meet is the
classical method for adding two numbers in decimal notation.
In order to develop the ideas we shall need to discuss algorithm
efficiency, let us have a quick look at this algorithm.
The classical addition algorithm depends upon the prior
knowledge of the sums of all pairs of 1-digit numbers (1 + 3 = 4,
5 + 7 = 12, etc.) Then, to add two n-digit numbers
X = x_n x_(n-1) ... x_2 x_1 and Y = y_n y_(n-1) ... y_2 y_1 (where the x_i, y_j are single
digits), we perform a sequence of n additions of the form
x_i + y_i + c_i (i = 1,...,n)
where c_2,...,c_n are the possible 'carries', defined by (setting
c_1 = 0 for convenience)
c_(i+1) = 0, if x_i + y_i + c_i ≤ 9
c_(i+1) = 1, if x_i + y_i + c_i > 9
Now, our discussion of algorithms will really only make sense
when applied to computers, which perform the steps of the
algorithm in sequence at a fixed rate. So let us imagine that
we are to use the above addition algorithm in such a fashion,
taking no short cuts and performing each step in succession.
(The basic operation of adding two 1-digit numbers will correspond
to the basic addition operation provided in the computer hardware.)
Let t_0 be the (assumed constant) time it takes to perform one basic,
single digit addition, and let T(n) denote the time taken to add
two n-digit numbers using the above algorithm. At first glance
it would seem that
T(n) = n.t_0
This is not quite accurate, however, since we have ignored the
various 'book-keeping' tasks involved to keep track of where we
are in the algorithm. (Computer programmers refer to the time
taken for such operations as the 'overheads' involved in the
computation.) A few moments' reflection should indicate that
these additional steps might themselves require a total time of
the order of 2n.t_0. At any rate, we will have a bound of the
form
T(n) ≤ c.n.t_0
for some constant c. We would say that the addition algorithm
'runs in linear time' to describe this situation: that is, the
time taken to perform the computation using the algorithm depends
linearly upon the size of the inputs (expressed in terms of the
number of digits in the two inputs). (If we wanted to express
the efficiency of the algorithm in terms of the magnitude of the
inputs rather than the number of digits involved, we would say
that the algorithm runs in 'log linear time'. This is because
the number of digits in a number N is approximately equal to
log10 N, which means that the computation bound would be of the
form
Time taken to add two numbers of the order of N ≤ c.log10 N.t_0.)
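To make the classical addition algorithm concrete, here is a sketch in which a number is held as a list of decimal digits, least significant digit first. This representation and the function names are our own assumptions for illustration. The loop body is executed exactly once per digit position, which is the source of the linear running time.

```python
def to_digits(n: int, width: int) -> list[int]:
    """Decimal digits of n, least significant first, padded to the given width."""
    return [(n // 10 ** i) % 10 for i in range(width)]


def from_digits(digits: list[int]) -> int:
    """Inverse of to_digits."""
    return sum(d * 10 ** i for i, d in enumerate(digits))


def add_digit_lists(x: list[int], y: list[int]) -> list[int]:
    """Add two n-digit numbers digit by digit, propagating the carries."""
    if len(x) != len(y):
        raise ValueError("both numbers must have the same number of digits")
    total, carry = [], 0                     # the first carry is 0
    for xi, yi in zip(x, y):                 # one pass: n basic steps
        s = xi + yi + carry
        total.append(s - 10 if s > 9 else s)
        carry = 1 if s > 9 else 0            # the next carry is 0 or 1
    total.append(carry)                      # possible extra leading digit
    return total


assert from_digits(add_digit_lists(to_digits(4567, 4), to_digits(9876, 4))) == 4567 + 9876
```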
So much for addition (and, by a trivial modification to the
algorithm, subtraction). What about the other fundamental number
theoretic operation: multiplication? We start by examining the
conventional multiplication algorithm we learn at school. This
depends upon knowing in advance the product of any two 1-digit
numbers (4.5 = 20, 6.9 = 54, etc.). Normally, when we make use
of this algorithm we lay out the calculation more or less like
this:
   35
   24 x
  ---
   20   (4 x 5 = 20)
  120   (4 x 3 = 12)
  100   (2 x 5 = 10)
  600   (2 x 3 = 6)
  ---
  840   (adding)
Thus we reduce the problem of multiplying two 2-digit numbers to
that of performing 4 multiplications of 1-digit pairs, using
position to take care of the multiples of 10 involved (with a
units column, a tens column, etc.). In fact it will be more
convenient for us to write out such a calculation in the form
24.35 = 100.2.3 + 10.2.5 + 10.4.3 + 4.5
In general, if X and Y are two 2-digit numbers, say X = x_2 x_1
and Y = y_2 y_1, then
XY = 100.x_2.y_2 + 10.x_2.y_1 + 10.x_1.y_2 + x_1.y_1
Extending the above algorithm to the general case of two
n-digit numbers we have: if
X = x_n x_(n-1) ... x_2 x_1 and Y = y_n y_(n-1) ... y_2 y_1
are n-digit numbers then
XY = sum over i,j = 1,...,n of 10^(i+j-2).x_i.y_j
In the course of this calculation, x_i.y_j is calculated for each
value of i,j = 1,...,n. What else is involved? There are some
additions, of course, n(n+1) of them, ignoring the final collection
of the various powers of 10. Each of these is essentially a 2-digit
addition, so runs in time 2t_0, where t_0 is the time for single
digit addition, giving a total addition time of 2n(n+1)t_0. (This
will turn out to be a good enough approximation for our needs.)
There are also the multiplications by the various powers of 10,
but since multiplication by 10^k simply involves a 'shift' along
(accompanied by the addition of zeros) of k places, this operation
can be assumed to require a time k.t_0. The final additions will
require a time of at most c_0.n for some constant c_0, and there
are 2n of them, so this part of the calculation requires a time
bounded by c_1.n^2 for a suitable constant c_1. We may assume that
t_1, the time taken to perform a basic single digit multiplication,
is not less than t_0. Thus the algorithm has a running time
T(n)
where c_2 is chosen large enough to take care of any overheads
involved in the basic multiplications, c_3 to allow for the 2-digit
addition overheads, c_4 to allow for the shifting overheads, and
c_5 covers overheads in running the whole show. In other words,
T(n) ≤ c.n^2
for a suitable constant c.
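The quadratic behaviour of the classroom method is easily seen in a program. In the sketch below (the digit-list representation, least significant digit first, and the function name are our own assumptions) every digit of X is multiplied by every digit of Y, and a counter records how many basic 1-digit multiplications were performed. For simplicity the partial products are accumulated using the machine's own arithmetic rather than digit by digit.

```python
def schoolbook_multiply(x: list[int], y: list[int]) -> tuple[int, int]:
    """Multiply two digit lists; return (product, number of 1-digit multiplications)."""
    product, count = 0, 0
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            # Positions are counted from 0 here, so the shift is i + j places.
            product += xi * yj * 10 ** (i + j)
            count += 1
    return product, count


# 24 x 35 as in the text: digits are listed units first.
assert schoolbook_multiply([4, 2], [5, 3]) == (840, 4)
```

For two n-digit numbers the counter always finishes at n^2.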
You may well think that a multiplication algorithm for
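multiplying two n-digit numbers in a time proportional to n^2 is
the best possible. As we show next, this is not the case at all.
There is room for considerable improvement.
Let
X = x_(2n) ... x_2 x_1 and Y = y_(2n) ... y_2 y_1
be two 2n-digit numbers. We wish to calculate the 4n-digit
product XY. To this end, split each of the numbers X,Y into a
most significant (left) half and a least significant (right) half,
as follows:
X_l = x_(2n) ... x_(n+1), X_r = x_n ... x_1,
Y_l = y_(2n) ... y_(n+1), Y_r = y_n ... y_1.
Clearly,
X = 10^n.X_l + X_r and Y = 10^n.Y_l + Y_r.
Then:
XY = 10^(2n).X_l.Y_l + 10^n.(X_l.Y_r + X_r.Y_l) + X_r.Y_r
Now observe that this can be rearranged to give
XY = (10^(2n) + 10^n).X_l.Y_l + 10^n.(X_l - X_r).(Y_r - Y_l) + (10^n + 1).X_r.Y_r
Apart from various shifts and additions (including the formation
of X_l - X_r and Y_r - Y_l), only three multiplications are required here,
namely
X_l.Y_l, (X_l - X_r).(Y_r - Y_l), X_r.Y_r,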
each of which is a multiplication of two n-digit numbers. Thus,
for this algorithm (which simply reduces a single 2n-digit multi-
plication to three n-digit multiplications, and does not
completely 'solve' the problem) we have, for a suitable constant c
T(2n) ≤ 3.T(n) + cn (*)
How do we take care of the three n-digit multiplications? We
use the same trick again (replace n by n+1 if n is odd). And
so on, until you get down to basic, 1-digit products. If we do
this we obtain a 'recursive' algorithm which keeps referring back
to itself for smaller and smaller arguments. What is the running
time for this algorithm? If we choose the constant c large
enough so that c ≥ T(2), then by an easy induction argument using
inequality (*) we see that for all k ≥ 1,
T(2^k) ≤ c.(3^k - 2^k)
Let ⌈x⌉ denote, for any real number x, the least integer
greater than or equal to x. Then, for any number n we have,
from the above inequality,
T(n) ≤ T(2^⌈log2 n⌉) ≤ c.3^⌈log2 n⌉ < 3c.3^(log2 n) = 3c.n^(log2 3)
Since log2 3 ≈ 1.59, this means that for some constant K,
T(n) ≤ K.n^1.59
For 'large' values of n this will be significantly faster than
the classroom algorithm, of course. (In practice, 'large' may
mean 'greater than 4' here.)
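The recursive scheme can be sketched in a few lines. The version below works on ordinary Python integers and splits them by decimal digits; the function name, the choice of split point, and the decision to treat any product with a single-digit factor as 'basic' are our own simplifying assumptions. It uses the three-product formula XY = (10^(2n) + 10^n).X_l.Y_l + 10^n.(X_l - X_r).(Y_r - Y_l) + (10^n + 1).X_r.Y_r, taking care of the sign of the middle product separately.

```python
def karatsuba(x: int, y: int) -> int:
    """Multiply non-negative integers using three half-size products per level."""
    if x < 10 or y < 10:
        return x * y                       # treated as a basic product
    n = max(len(str(x)), len(str(y))) // 2
    shift = 10 ** n                        # multiplying by this is a 'shift'
    x_left, x_right = divmod(x, shift)     # X = 10^n.X_l + X_r
    y_left, y_right = divmod(y, shift)     # Y = 10^n.Y_l + Y_r
    left = karatsuba(x_left, y_left)
    right = karatsuba(x_right, y_right)
    dx, dy = x_left - x_right, y_right - y_left
    sign = -1 if (dx < 0) != (dy < 0) else 1
    middle = sign * karatsuba(abs(dx), abs(dy))
    return (shift * shift + shift) * left + shift * middle + (shift + 1) * right


assert karatsuba(35, 24) == 840
assert karatsuba(12345678, 87654321) == 12345678 * 87654321
```

In a genuine multi-precision routine the splitting and shifting would be done on the stored words of the numbers rather than through decimal strings; the string lengths are used here only to keep the sketch short.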
So far all of our discussion has had a somewhat artificial
air to it, since in practice all sorts of short cuts are available
in hand calculation, and in any case no-one would ever commence
a calculation that looked too complicated to carry out in a
reasonable time. But for the computer programmer, prior knowledge
of how long it will take a program to run is very important, as
are any tricks that might be employed to speed up a calculation.
The programmer only needs to examine algorithms for addition and
multiplication when it is necessary to deal with numbers which
are too large to fit into one half a computer word (when overflow
would result when a multiplication of two such numbers were
attempted).
Multi-Precision Arithmetic is the name used to describe the
procedures for performing arithmetic on numbers larger than one-
half the computer word size. The numbers themselves have to be
stored in arrays over two or more words, and to manipulate them
one needs to use algorithms very like the ones described above.
In fact only minor changes need to be made to adapt the algorithms
we have studied to make them suitable for computer implementation.
First of all the basic 'units' involved in the computations
are not single digits but the single-word parts of the multi-word
numbers. Secondly, the basic operations in terms of which the
computations must be performed are, as you might expect, the standard
single-word arithmetical operations provided by the computer
hardware. Thirdly, since modern computers perform all of their
arithmetic in binary form rather than decimal form, it is necessary
to replace '10' by '2' throughout. Subject to these changes,
all of our discussion about algorithm running times now holds for
multi-precision arithmetic routines on a computer.
4. THE FIBONACCI SEQUENCE AND THE EFFICIENCY OF THE EUCLIDEAN
ALGORITHM
In order to investigate the efficiency of the Euclidean Algorithm
it will be helpful to introduce a famous, classical number sequence:
the Fibonacci Sequence.
The Fibonacci sequence gets its name from the great 13th
Century Italian mathematician Leonardo of Pisa, who wrote under
the name of 'Fibonacci' (from 'filius Bonacci' - son of Bonacci).
His influential work Liber Abaci, written in 1202, introduced
the Hindu-Arabic decimal number system to Western Europe. In
this book appears the following problem:
A man puts one pair of rabbits in a certain place
surrounded by a wall. How many pairs of rabbits can
be produced from that pair in a year, if the nature of
these rabbits is such that every month each pair bears
a new pair which from the second month on becomes
productive?
It does not take long to figure out that the number of pairs
of rabbits present each month is given by the sequence
1,2,3,5,8,13,21,34,55,89,...
The general rule for generating this sequence is
u_(n+2) = u_(n+1) + u_n (for all n ≥ 1),
where u_n is the n-th term in the sequence. Thus:
3 = 2 + 1
5 = 3 + 2
8 = 5 + 3
etc.
This sequence is now known as the Fibonacci sequence.
From the recursive definition of the Fibonacci sequence given
above, it is easy to prove the following result:
Lemma I.4.1 For any n ≥ 1, (u_n, u_(n+1)) = 1.
Proof: Suppose that the lemma were false, and let d > 1 divide
both u_n and u_(n+1). Then d divides u_(n-1) = u_(n+1) - u_n. Hence d
divides u_(n-2) = u_n - u_(n-1). Continuing in this fashion we arrive
eventually at the conclusion that d divides u_1, i.e. d|1, which
is absurd. Thus the lemma must in fact be true. □
Using the Fibonacci sequence we can easily show that there
is no upper bound on the number of steps (divisions) necessary
to calculate a gcd using the Euclidean Algorithm. Specifically:
Lemma I.4.2 Let n > 1. The number of divisions necessary to
calculate (u_n, u_(n+1)) is exactly n.
Proof: Applying the Euclidean Algorithm to u_n, u_(n+1) clearly
leads to the following system of equations:
u_(n+1) = 1.u_n + u_(n-1)
u_n = 1.u_(n-1) + u_(n-2)
...
u_3 = 1.u_2 + u_1
u_2 = 2.u_1 + 0
Thus (u_n, u_(n+1)) = 1, and exactly n divisions have been
required. □
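The lemma is easily confirmed by experiment. In the sketch below (function names our own) the Fibonacci numbers are generated with the convention u_1 = 1, u_2 = 2 used in this section, and the divisions performed by the Euclidean Algorithm on each consecutive pair are counted.

```python
def fibonacci(n: int) -> int:
    """u_n in the sequence 1, 2, 3, 5, 8, ... (so u_1 = 1 and u_2 = 2)."""
    a, b = 1, 2
    for _ in range(n - 1):
        a, b = b, a + b
    return a


def count_divisions(a: int, b: int) -> int:
    """Number of divisions the Euclidean Algorithm performs on the pair a < b."""
    count = 0
    while a != 0:
        b, a = a, b % a      # one division: b = q.a + r
        count += 1
    return count


for n in range(2, 30):
    assert count_divisions(fibonacci(n), fibonacci(n + 1)) == n
```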
Closely related to the above lemma is the following result,
which shows that the Fibonacci numbers are rather special with
regards to the Euclidean Algorithm.
Lemma I.4.3 For any n > 1, u_n is the least number such that
there is a number b > u_n for which n divisions are required in
order to calculate (u_n, b) using the Euclidean Algorithm.
Proof: Let a be the least number such that there is a b > a
for which n divisions are required in order to calculate (a,b)
using the Euclidean Algorithm. By virtue of Lemma I.4.2 we know
that u_n ≥ a, so it suffices to prove that a ≥ u_n.
Let the Euclidean Algorithm applied to the pair (a,b) be:
b = q_n.a + r_(n-1) (0 < r_(n-1) < a)
a = q_(n-1).r_(n-1) + r_(n-2) (0 < r_(n-2) < r_(n-1))
r_(n-1) = q_(n-2).r_(n-2) + r_(n-3) (0 < r_(n-3) < r_(n-2))
...
r_4 = q_3.r_3 + r_2 (0 < r_2 < r_3)
r_3 = q_2.r_2 + r_1 (0 < r_1 < r_2)
r_2 = q_1.r_1
Now, we know that r_2 > r_1 > 0. Also, each q_i is a natural
number. Hence, working our way back through the above equations
we see that:
r_2 > r_1
-
10.u n
(by equation (1. o
Corollary I.4.5 For any n ≥ 1, u_(5n+1) has at least n+1 digits.
Proof: Since u_6 = 13, which has 2 digits, the result is valid
for n=1. The result follows by induction using the lemma.
(The easy details are left as an exercise.) □
At last we are able to prove our result concerning the
efficiency of the Euclidean Algorithm.
Theorem I.4.6 Let b > a > 1. In order to calculate (a,b) using
the Euclidean Algorithm, at most 5.k divisions are required, where
k is the number of digits in a.
Proof: Let n be the number of divisions required to calculate
(a,b) using the Euclidean Algorithm. We must show that n ≤ 5.k.
By Lemma I.4.3, a ≥ u_n. Let d be the number of digits in
u_n. Since d ≤ k, it suffices to prove that n ≤ 5.d.
For some number t we have
5.t < n ≤ 5.(t+1)
Since n > 5t, Corollary I.4.5 implies that u_n has at least t+1
digits, i.e. d ≥ t+1. Thus
n ≤ 5.(t+1) ≤ 5.d,
as required. □
Further discussion of the Euclidean Algorithm and its
efficiency is provided in the Exercises to this chapter.
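The bound of 5 divisions per decimal digit of the smaller number can be tested exhaustively over a small range. The following sketch (function names and the search limit are our own choices) reports the largest amount by which the number of divisions exceeds the bound; a result of zero means the bound is attained but never broken in that range, as happens for the pair 8, 13.

```python
def count_divisions(a: int, b: int) -> int:
    """Number of divisions the Euclidean Algorithm performs on the pair a < b."""
    count = 0
    while a != 0:
        b, a = a, b % a
        count += 1
    return count


def max_excess(limit: int) -> int:
    """Largest value of (divisions - 5.digits(a)) over all 1 < a < b <= limit."""
    return max(
        count_divisions(a, b) - 5 * len(str(a))
        for b in range(3, limit + 1)
        for a in range(2, b)
    )


assert count_divisions(8, 13) == 5     # a has 1 digit, and 5 divisions are needed
assert max_excess(200) == 0            # the bound is met but never exceeded
```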
5. PRIME NUMBERS
Though discussed briefly in Chapter 0, the treatment given there
was far from rigorous, so we shall here develop the theory of
prime numbers from the very beginning.
A prime number is a number p > 1 whose only divisors amongst
the integers are 1 and p (alternatively, whose only divisors
amongst the natural numbers are 1 and p). A number greater than
1 which is not prime is said to be composite.
One of the most basic properties of prime numbers is provided
by our first lemma on the subject:
Lemma I.5.1 If p is a prime and p|ab, then p|a or p|b.
Proof: Assume that p does not divide a. Thus (p,a) = 1. (By the
definition of p being a prime.) So by Theorem I.2.9 (Euclid's
Lemma), p|b, and we are done. □
Corollary I.5.2 If p is a prime and p|a_1 a_2 ... a_n, then p|a_i for
some i (1 ≤ i ≤ n).
Proof: This follows from the lemma by an easy induction
argument which we leave to the reader to supply. □
Using the above corollary we can already establish one of
the most fundamental theorems of Number Theory.
Theorem I.5.3 (The Fundamental Theorem of Arithmetic.) Every
number n > 1 can be expressed as a product of prime numbers;
furthermore, this expression is unique up to the order of the
prime factors.
Proof: The theorem is certainly valid for n=2, since 2 is already
prime and hence, by convention, is a 'product' of prime numbers.
So, if we assume that the theorem is false and let n be the least
number which is not a product of primes, we have n > 2. If n
were prime it would be (by convention) a 'product' of primes.
Consequently, n cannot be prime. Thus there must be numbers
a and b such that n = ab, where both a and b are less than n.
Being less than n, a and b must be products of primes. But then
n = ab is a product of products of primes, and is thus a product
of primes, a contradiction. This proves that every number is
a product of primes.
We turn now to the uniqueness of the prime factorisation of
any number. Suppose that there were a number N which had two
prime factorisations
N = p_1 p_2 ... p_m = q_1 q_2 ... q_n
(possibly with m ≠ n).
Then p_1|q_1 q_2 ... q_n, so by Corollary I.5.2, p_1 divides one of
q_1,q_2,...,q_n. By rearranging q_1,q_2,...,q_n if necessary we may
assume that p_1 divides q_1. This means that p_1 = q_1, of course.
So we can divide p_1 from the above equation to obtain
p_2 p_3 ... p_m = q_2 q_3 ... q_n
Repeating the same argument we see that, with a possible
rearrangement of q_2,q_3,...,q_n, p_2 = q_2, and hence that
p_3 ... p_m = q_3 ... q_n
Continuing this process we see that it must lead to the
conclusion that m=n and (after various rearrangements)
p_1 = q_1, p_2 = q_2, ..., p_m = q_m.
This completes the proof of the theorem. □
Corollary I.5.4 Every number n > 1 can be written uniquely in
the form
    n = p_1^{k_1} p_2^{k_2} ... p_r^{k_r},
where each p_i is prime, p_1 < p_2 < ... < p_r, and each k_i is a
positive integer.
Proof: Immediate. □
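For small numbers the factorisation promised by the Fundamental Theorem can be found by repeated trial division. The sketch below is one simple way to do this in Python; the function name factorise is hypothetical, and the method is far too slow for large n. It returns the primes in increasing order together with their exponents.

```python
def factorise(n):
    """Return [(p, k), ...] with the primes p increasing and n equal to the
    product of the powers p**k."""
    if n < 2:
        raise ValueError("n must be greater than 1")
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            k = 0
            while n % p == 0:      # divide out every copy of p
                n //= p
                k += 1
            factors.append((p, k))
        p += 1
    if n > 1:
        # What remains has no divisor up to its square root, so it is prime.
        factors.append((n, 1))
    return factors


print(factorise(360))    # [(2, 3), (3, 2), (5, 1)]
print(factorise(1001))   # [(7, 1), (11, 1), (13, 1)]
```

Only prime values of p are ever recorded, because by the time a composite p is reached all of its prime factors have already been divided out of n.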
The following proof that there are infinitely many primes is
due essentially to Euclid.
Theorem I.5.5 There are infinitely many primes.
Proof: Assume, on the contrary, that there were only a finite
number of primes, say
    p_1 < p_2 < ... < p_n.
Now form the number
    P = p_1 p_2 ... p_n + 1.
Since P > p_n, P must be composite. Hence P is divisible by some
prime less than P. Thus for some k, P is divisible by p_k.
But the division of P by p_k clearly leaves a remainder of 1, so
this is impossible. This contradiction proves the theorem. □
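The construction in the proof can be tried numerically by taking the first few primes in place of the supposed complete list. The number obtained by adding 1 to their product is not always prime, but its prime factors always lie outside the list used. A small Python sketch, with helper names of our own choosing:

```python
def smallest_prime_factor(n):
    """Smallest prime factor of n > 1, found by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


def first_primes(count):
    """The first `count` primes, found by trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if smallest_prime_factor(candidate) == candidate:
            primes.append(candidate)
        candidate += 1
    return primes


product = 1
for p in first_primes(6):
    product *= p                 # product of all primes up to p
    P = product + 1
    q = smallest_prime_factor(P)
    status = "prime" if q == P else f"composite, smallest factor {q}"
    print(f"primes up to {p}: P = {P} ({status})")
```

For the first five primes the number formed is itself prime (3, 7, 31, 211, 2311), but 2·3·5·7·11·13 + 1 = 30031 = 59·509.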
The above proof raises the question whether infinitely many
of the numbers
    P_n = p_1 p_2 ... p_n + 1,
where p_1, p_2, p_3, ..., p_n, ... enumerates the primes in order, are
themselves prime. This is not known. Nor is it known if
infinitely many of them are composite. (Presumably the answer
to both questions is 'Yes'.)
6. DIOPHANTINE EQUATIONS
In honour of the Ancient Greek mathematician Diophantus, we use
the name Diophantine Equation to refer to an equation with integer
coefficients for which a solution is sought in the integers.
The simplest non-trivial form of Diophantine equation is
the linear equation in two variables:
    ax + by = c,
where a,b,c are integers and integer solutions for x,y are sought.
There may be no solutions, as is the case with the equation
    6x + 8y = 13.
Or there may be many solutions. For instance, the equation
    6x + 8y = 14
has the solutions x=1,y=1, and x=5,y=-2, and x=9,y=-5 (and
infinitely many more).
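The two examples are easily checked by a direct search. The sketch below lists every solution with |x| and |y| at most 10; the function name solutions_in_box and the choice of bound are ours, for illustration only. The first equation has no solutions at all, since 6x + 8y is always even.

```python
def solutions_in_box(a, b, c, bound):
    """All integer pairs (x, y) with |x|, |y| <= bound and a*x + b*y == c."""
    return [(x, y)
            for x in range(-bound, bound + 1)
            for y in range(-bound, bound + 1)
            if a * x + b * y == c]


print(solutions_in_box(6, 8, 13, 10))   # []
print(solutions_in_box(6, 8, 14, 10))
# [(-7, 7), (-3, 4), (1, 1), (5, -2), (9, -5)]
```

In the second list successive solutions differ by 4 in x and by -3 in y.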
In a moment we shall see how the Euclidean Algorithm may
be used to find the solutions to Diophantine equations directly,
but first we prove a theorem which tells us exactly when a solution
will exist, and what form the solutions will then have.
Theorem I.6.1 The Diophantine equation
    ax + by = c
has a solution if and only if (a,b) | c. If (x_0, y_0) is one solution,
then all other solutions are given by
    x = x_0 + (b/d)t,    y = y_0 - (a/d)t,
where t is any integer, and where d = (a,b).
Proof: Suppose first that a solution exists. Then by
Corollary I.2.5 we know that d | c. So that's half the theorem
already.
Conversely, suppose that d | c, say c = dt. Pick integers
x_0, y_0 so that
    a x_0 + b y_0 = d.
(By Theorem I.2.4 we know that such integers exist.) Then
    c = dt = (a x_0 + b y_0)t = a(x_0 t) + b(y_0 t),
so x = x_0 t and y = y_0 t solve the equation.
Now suppose that x_0, y_0 is any solution to the equation.
Thus, if x_1, y_1 is any other solution, we will have
    a x_0 + b y_0 = c = a x_1 + b y_1,
so
    a(x_0 - x_1) = b(y_1 - y_0).
By Lemma I.2.7 there are relatively prime integers r,s such that
a = dr, b = ds. So
    dr(x_0 - x_1) = ds(y_1 - y_0),
i.e.
    r(x_0 - x_1) = s(y_1 - y_0).
Now, r | s(y_1 - y_0) and (r,s) = 1 so by Euclid's Lemma (Theorem
I.2.9), r | (y_1 - y_0). So for some integer t, y_1 - y_0 = rt.
Thus r(x_0 - x_1) = srt, which gives x_0 - x_1 = st. Thus
    x_1 = x_0 - (b/d)t and y_1 = y_0 + rt = y_0 + (a/d)t.
(Replacing t by -t puts these in the form stated in the theorem.)
Moreover, for any value of t, if x_1 and y_1 are as above, then
x_1, y_1 are solutions to the given equation, as is easily seen,
so our proof is complete. □
The existence part of the above proof indicates how the
calculation of the greatest common divisor of a,b and its
expression as a linear combination of a and b plays a role in
the solution of such an equation. We illustrate this by means
of an example.
We shall solve the Diophantine equation
    210x + 1001y = 21.
First we use the Euclidean Algorithm to find (210,1001).
    1001 = 4·210 + 161
     210 = 1·161 + 49
     161 = 3·49 + 14
      49 = 3·14 + 7
      14 = 2·7.
Thus (210,1001) = 7. Since 7 | 21, the equation does have a
solution. To find a solution we work back through the above
calculation to find 7 as a linear combination of 210 and 1001.
    7 = 49 - 3·14
      = 49 - 3·(161 - 3·49)
      = 10·49 - 3·161
      = 10·(210 - 1·161) - 3·161
      = 10·210 - 13·161
      = 10·210 - 13·(1001 - 4·210)
      = 62·210 - 13·1001.
Thus
    7 = 62·210 - 13·1001.
Multiplying through by 3 to make the left hand side equal to 21,
the constant term in the original equation, we get
    21 = 210·(186) + 1001·(-39).
Thus x = 186, y = -39 is a solution to the original Diophantine
equation. All other solutions are given by
    x = 186 + (1001/7)t = 186 + 143t,
    y = -39 - (210/7)t = -39 - 30t,
as t ranges over all integers. For instance, putting t = -1 we
obtain the solution consisting of the smallest numbers in absolute
value, namely x = 43, y = -9.
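The working above can be mechanised. The following Python sketch carries the coefficients of a and b along with each remainder, so that no separate back-substitution is needed; this is a common variant of the method and not necessarily the organisation the reader will prefer. The names extended_gcd and solve_linear_diophantine are hypothetical, and non-negative a and b (not both zero) are assumed.

```python
def extended_gcd(a, b):
    """Return (d, x, y) with d = (a,b) and a*x + b*y == d.
    Assumes a, b >= 0 and not both zero."""
    old_r, r = a, b
    old_x, x = 1, 0      # coefficients of a in old_r and r
    old_y, y = 0, 1      # coefficients of b in old_r and r
    while r != 0:
        q = old_r // r   # one division step of the Euclidean Algorithm
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def solve_linear_diophantine(a, b, c):
    """Return ((x0, y0), (dx, dy)) such that every solution of a*x + b*y == c
    is (x0 + dx*t, y0 + dy*t) for an integer t, or None if none exists."""
    d, x, y = extended_gcd(a, b)
    if c % d != 0:
        return None      # (a,b) does not divide c
    k = c // d
    return (x * k, y * k), (b // d, -(a // d))


print(extended_gcd(210, 1001))                   # (7, 62, -13)
particular, step = solve_linear_diophantine(210, 1001, 21)
print(particular, step)                          # (186, -39) (143, -30)
```

Taking t = -1 in the returned family gives (186 - 143, -39 + 30) = (43, -9), in agreement with the hand calculation.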
Sometimes we are only interested in solutions within a certain
range. For instance, suppose that in the above example we want
to find all positive solutions. Thus we need to find all those
values of t for which
    -39 - 30t > 0 and 186 + 143t > 0.
The first of these inequalities implies that t ≤ -2 whilst the
second implies that t ≥ -1. Thus in this case we see that there
are in fact no positive solutions.
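The same two inequalities can be solved for t in integer arithmetic. The sketch below assumes that the solutions have the form x = x0 + dx·t, y = y0 + dy·t with dx > 0 and dy < 0, as in the example, so that only finitely many values of t can qualify. The function name positive_solutions is our own.

```python
def positive_solutions(x0, y0, dx, dy):
    """All (x0 + dx*t, y0 + dy*t) with both coordinates positive.
    Assumes dx > 0 and dy < 0."""
    # x > 0  <=>  t > -x0/dx  <=>  t >= floor(-x0/dx) + 1
    t_min = (-x0) // dx + 1
    # y > 0  <=>  t < y0/(-dy)  <=>  t <= ceil(y0/(-dy)) - 1
    t_max = -((-y0) // (-dy)) - 1
    return [(x0 + dx * t, y0 + dy * t) for t in range(t_min, t_max + 1)]


# 210x + 1001y = 21: x = 186 + 143t, y = -39 - 30t
print(positive_solutions(186, -39, 143, -30))   # []
# 6x + 8y = 14: x = 1 + 4t, y = 1 - 3t
print(positive_solutions(1, 1, 4, -3))          # [(1, 1)]
```

In the first call the bounds come out as t ≥ -1 and t ≤ -2, which cannot both hold, so the list is empty.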
EXERCISES I
SECTION 1
1. Prove each of the following statements by induction. Try
to use both methods of writing out your proof.
(i) 1 + 4 + 9 + ... + n^2 = n(n+1)(2n+1)/6.
(ii) 1 + 8 + 27 + ... + n^3 = (½n(n+1))^2 = (1 + 2 + 3 + ... + n)^2.
(iii) 1/(1·2) + 1/(2·3) + ... + 1/(n(n+1)) = n/(n+1).
2. Prove that for any n,
    1·(1!) + 2·(2!) + ... + n·(n!) = (n+1)! - 1.
3. Do Exercise 0.8.
SECTION 2
4. Let a,b,c be integers. Prove the following:
(i) if a | b then a | bc;
(ii) if a | b and a | c then a^2 | bc;
(iii) if c ≠ 0, then a | b if and only if ac | bc.
5. Prove that every odd number is of one of the forms 4n+1 or
4n+3. (In advanced work, this classification of all odd numbers
into two classes turns out to be a fundamental one. See
also Exercise 16 below.)
6. Let a,b be integers, not both zero, and let d be any number.
Prove that d = (a,b) if and only if
(i) d | a and d | b, and
(ii) whenever c | a and c | b then c | d.
7. Prove the following:
(i) if (a,b) = 1 and (a,c) = 1 then (a,bc) = 1;
(ii) if (a,b) = 1 and c | a then (b,c) = 1;
(iii) if (a,b) = 1 then (ac,b) = (c,b).
8. The least common multiple of two non-zero integers a and b,
written lcm(a,b), is defined to be the smallest positive
integer m such that a | m and b | m. Prove that this is always
defined and that for any positive integers a,b,
    (a,b)·lcm(a,b) = ab.
Deduce that for any numbers a and b, lcm(a,b) = ab if and only
if (a,b) = 1.
9. Use the Euclidean Algorithm to find the greatest common
divisor of each of the following pairs of numbers, and in
each case express the gcd as a linear combination of the two
given numbers:
    56, 72;   24, 138;   119, 272;   1769, 2378.
10. Prove that the product of four consecutive integers is one
less than a perfect square.
11. Prove the following version of the Division Algorithm. Given
integers a and b with b ≠ 0, there exist unique integers q
and r such that
    a = qb + r   and   -½|b| < r ≤ ½|b|.
(Hint. Write a = q'b + r', where 0 ≤ r' < |b|. If
0 ≤ r' ≤ ½|b|, let r = r' and q = q'. If ½|b| < r' < |b|,
let r = r' - |b| and q = q'+1 if b > 0 or q = q' - 1 if
b < 0.)
SECTION 4
12. Define numbers u_n by u_0 [...]
Show that
    u_n [...]
and
    [...]
Show further that the smallest numbers a > b > 0 for which
the algorithm of question 11 requires n division steps are
a = u_n + u_{n-1} and b = u_n.
SECTION 5
13. Do Exercises 2 through 7 and 10 of Chapter 0.
14. A classical theorem of Dirichlet says that if a and b are
relatively prime numbers, then the arithmetic progression
    a, a+b, a+2b, a+3b, ..., a+kb, ...
contains infinitely many primes. Prove that no arithmetic
progression can consist entirely of primes.
15. Prove that the sequence
    (n+1)! + 2, (n+1)! + 3, ..., (n+1)! + (n+1)
provides a sequence of n consecutive composite numbers.
16. Prove that there are infinitely many primes of the form 4n+3.
(There are also infinitely many primes of the form 4n+1, but
the proof of this is rather difficult.)
SECTION 6
17. Find all solutions to the following Diophantine equations:
(i) 56x + 72y = 40,
(ii) 221x + 91y = 117.
18. Find all positive solutions to the following Diophantine
equations:
(i) 30x + 17y = 300,
(ii) 54x + 21y = 906.
19. Professor Euclid cashes a cheque at the bank, but the cashier
mixes up the number of pounds and the number of pence, so
instead of receiving £a.b he receives £b.a. Professor Euclid
fails to notice this, but after spending 68p he is surprised
to see that he still has twice the amount he wrote his cheque
for. What is the smallest value for which the cheque could
have been made out?
COMPUTER PROBLEMS I
1. Write a computer program which calculates n! for any given n.
(Hint. It may be a better approach to consider the following
'recursive' definition of n! :
    1! = 1,    (n+1)! = (n+1)·(n!).
In any event, the rapid growth of n! as n increases will mean
that your program will only run for a few values of n.)
Arrange for the computer to print out the values 1!,2!,3!,etc.
as far as it will go.
2. Write a routine for carrying out multi-precision multiplication
for numbers containing twice as many digits as your
computer allows in integers, and use this routine to extend
your program to calculate n! from Problem 1 above.
3. Write routines for the addition and the multiplication of
integers (positive or negative) of arbitrary (as far as possible)
size. Use the multiplication routine to obtain decimal print-outs
of the record prime numbers described in Chapter 0.4.
4. Write a multiplication routine for numbers occupying 2n computer
words using the 'fast' method described in section 3. Compare
its running time with that of the classical method. (This
will require your accessing the internal clock of your
computer.)
5. Multi-precision routines written commercially are usually
written in the assembly language of the computer concerned,
to enable efficient manipulation of the individual bits of
the numbers in store. If you are able to program in assembly
language, write a routine for the multiplication of two 2n
bit binary numbers using the 'fast' method described in section
3. Compare the speed of this routine with that of the
classical algorithm programmed in a high level language.
(This will require your being able to access the internal clock
of your computer.)
6. Write a program to calculate the greatest common divisor of
two given numbers using the Euclidean Algorithm. Include
in your program a count of the number of division steps required
in each calculation.
7. Fix a value of a and run your Euclidean Algorithm program to
find (a,b) for a series of different values of b > a. (Do
this by means of a loop so as to obtain a large number of runs.)
Theoretical considerations indicate that the average number
of division steps required by the Euclidean Algorithm for
varying values of b greater than a fixed value of a is
approximately 1.94 log_10 a. (This is, of course, much less
than the bound provided by Theorem I.4.6.) See how closely
your computed results agree with this theoretical estimate.
Repeat the computation for different values of the number a.
8. If multiprecision arithmetic is required, the Euclidean
Algorithm becomes a rather inefficient method for calculating
greatest common divisors, since multi-precision division
routines tend to be relatively slow. There is a simple
algorithm for calculating greatest common divisors which uses
only the operations of subtraction, testing whether a number
is even or odd (which for binary numbers involves simply looking
at the last bit), and halving even numbers (which for binary
numbers involves nothing more than a shift of the entire number
one place to the right). This algorithm depends upon the
following facts about positive numbers a and b:
(1) If a and b are both even then (a,b) = 2(a/2,b/2).
(2) If a is even and b is odd then (a,b) = (a/2,b).
(3) If a > b, then (a,b) = (a-b,b).
(4) If a and b are both odd, then a-b is even and
|a-b| < max(a,b).
Prove these facts and then use them to develop an algorithm
to calculate greatest common divisors of binary numbers.
If you can program in assembly language, write a program
which implements this algorithm, both for single precision
arithmetic and multiple precision work.
9. Write a program that finds a solution to a given Diophantine
equation of the form
    ax + by = c,
using the Euclidean algorithm, as described in section 6.
10. Modify the program from Problem 9 to look for a positive
solution to the equation.
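As a starting point for Problem 8, here is one possible way of combining facts (1) to (4) into an algorithm, sketched in Python rather than assembly language. It is an illustration under our own choices, not the only arrangement: the name binary_gcd is hypothetical, and the operations % 2 and // 2 stand in for inspecting the last bit and shifting one place to the right.

```python
def binary_gcd(a, b):
    """Greatest common divisor of positive integers a and b, using only
    subtraction, parity tests and halving."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    shift = 0
    # Fact (1): remove the factors of 2 common to a and b.
    while a % 2 == 0 and b % 2 == 0:
        a //= 2
        b //= 2
        shift += 1
    # From here on at least one of a, b is odd.
    while a != b:
        if a % 2 == 0:       # Fact (2): a even, b odd
            a //= 2
        elif b % 2 == 0:     # Fact (2) with the roles of a and b exchanged
            b //= 2
        elif a > b:          # Facts (3) and (4): both odd, so a - b is even
            a -= b
        else:
            b -= a
    return a * 2 ** shift    # restore the common factors of 2


print(binary_gcd(210, 1001))   # 7
print(binary_gcd(56, 72))      # 8
```

The loop terminates because each pass either halves one of the numbers or replaces the larger by a smaller positive number.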
II Congruences
Frequently in mathematics, a real breakthrough is made simply by
regarding a familiar notion from a different viewpoint. Such
is the case with the study of the notion of congruence, which is
but a study of divisibility carried out in a special way. It
is the brain-child of the great 19th Century German mathematician
Karl Friedrich Gauss. Large parts of modern day number theory
can be traced back to their origins in Gauss' Disquisitiones
Arithmeticae, a monumental work carried out whilst Gauss was in
his early twenties. Congruences appear in the first chapter of
this volume.
1. CONGRUENCE
Let n be a fixed number. Two integers a and b are said to be
congruent modulo n, written
    a ≡ b (mod n),
if and only if n | (a - b).
For example, 3 ≡ 24 (mod 7), -31 ≡ 11 (mod 7), -15 ≡ -64 (mod 7).
Given any integer a, by the Division Algorithm there are
integers q,r such that
    a = qn + r (0 ≤ r < n).
By definition of congruence,
    a ≡ r (mod n).
Clearly, no two numbers less than n can be congruent modulo n
(unless they are equal), so we see that every integer a is
congruent modulo n to a unique r such that 0 ≤ r < n. The unique
number r is called the residue of a modulo n, or more precisely,
the least positive residue modulo n. This last remark is to allow
f