alan baker - a comprehensive course in number theory - cambridge university press, 2012 - 269p

A Comprehensive Course in Number Theory


Developed from the author's popular text, A Concise Introduction to the Theory of Numbers, this book provides a comprehensive initiation to all the major branches of number theory. Beginning with the rudiments of the subject, the author proceeds to more advanced topics, including elements of cryptography and primality testing; an account of number fields in the classical vein including properties of their units, ideals and ideal classes; aspects of analytic number theory including studies of the Riemann zeta-function, the prime-number theorem and primes in arithmetical progressions; a description of the Hardy–Littlewood and sieve methods from, respectively, additive and multiplicative number theory; and an exposition of the arithmetic of elliptic curves.

The book includes many worked examples, exercises and, as with the earlier volume, there is a guide to further reading at the end of each chapter. Its wide coverage and versatility make this book suitable for courses extending from the elementary to the graduate level.

Alan Baker, FRS, is Emeritus Professor of Pure Mathematics in the University of Cambridge and Fellow of Trinity College, Cambridge. His many distinctions include the Fields Medal (1970) and the Adams Prize (1972).

Cover designed by Hart McLeod Ltd


A COMPREHENSIVE COURSE IN NUMBER THEORY

ALAN BAKER
University of Cambridge

CAMBRIDGE UNIVERSITY PRESS

Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo, Delhi, Mexico City

Cambridge University Press
The Edinburgh Building, Cambridge CB2 8RU, UK

Published in the United States of America by Cambridge University Press, New York

www.cambridge.org
Information on this title: www.cambridge.org/9781107019010

© Cambridge University Press 2012

This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published 2012

Printed and bound in the United Kingdom by the MPG Books Group

A catalogue record for this publication is available from the British Library

Library of Congress Cataloguing in Publication data
Baker, Alan, 1939–
A comprehensive course in number theory / Alan Baker.
p. cm.
Includes bibliographical references and index.
ISBN 978-1-107-01901-0 (hardback)
1. Number theory – Textbooks. I. Title.
QA241.B237 2012
512.7–dc23 2012013414

ISBN 978-1-107-01901-0 Hardback
ISBN 978-1-107-60379-0 Paperback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.

Contents

Preface
Introduction

1 Divisibility
  1.1 Foundations
  1.2 Division algorithm
  1.3 Greatest common divisor
  1.4 Euclid's algorithm
  1.5 Fundamental theorem
  1.6 Properties of the primes
  1.7 Further reading
  1.8 Exercises

2 Arithmetical functions
  2.1 The function $[x]$
  2.2 Multiplicative functions
  2.3 Euler's (totient) function $\phi(n)$
  2.4 The Möbius function $\mu(n)$
  2.5 The functions $\tau(n)$ and $\sigma(n)$
  2.6 Average orders
  2.7 Perfect numbers
  2.8 The Riemann zeta-function
  2.9 Further reading
  2.10 Exercises

3 Congruences
  3.1 Definitions
  3.2 Chinese remainder theorem
  3.3 The theorems of Fermat and Euler
  3.4 Wilson's theorem
  3.5 Lagrange's theorem
  3.6 Primitive roots
  3.7 Indices
  3.8 Further reading
  3.9 Exercises

4 Quadratic residues
  4.1 Legendre's symbol
  4.2 Euler's criterion
  4.3 Gauss' lemma
  4.4 Law of quadratic reciprocity
  4.5 Jacobi's symbol
  4.6 Further reading
  4.7 Exercises

5 Quadratic forms
  5.1 Equivalence
  5.2 Reduction
  5.3 Representations by binary forms
  5.4 Sums of two squares
  5.5 Sums of four squares
  5.6 Further reading
  5.7 Exercises

6 Diophantine approximation
  6.1 Dirichlet's theorem
  6.2 Continued fractions
  6.3 Rational approximations
  6.4 Quadratic irrationals
  6.5 Liouville's theorem
  6.6 Transcendental numbers
  6.7 Minkowski's theorem
  6.8 Further reading
  6.9 Exercises

7 Quadratic fields
  7.1 Algebraic number fields
  7.2 The quadratic field
  7.3 Units
  7.4 Primes and factorization
  7.5 Euclidean fields
  7.6 The Gaussian field
  7.7 Further reading
  7.8 Exercises

8 Diophantine equations
  8.1 The Pell equation
  8.2 The Thue equation
  8.3 The Mordell equation
  8.4 The Fermat equation
  8.5 The Catalan equation
  8.6 The abc-conjecture
  8.7 Further reading
  8.8 Exercises

9 Factorization and primality testing
  9.1 Fermat pseudoprimes
  9.2 Euler pseudoprimes
  9.3 Fermat factorization
  9.4 Fermat bases
  9.5 The continued-fraction method
  9.6 Pollard's method
  9.7 Cryptography
  9.8 Further reading
  9.9 Exercises

10 Number fields
  10.1 Introduction
  10.2 Algebraic numbers
  10.3 Algebraic number fields
  10.4 Dimension theorem
  10.5 Norm and trace
  10.6 Algebraic integers
  10.7 Basis and discriminant
  10.8 Calculation of bases
  10.9 Further reading
  10.10 Exercises

11 Ideals
  11.1 Origins
  11.2 Definitions
  11.3 Principal ideals
  11.4 Prime ideals
  11.5 Norm of an ideal
  11.6 Formula for the norm
  11.7 The different
  11.8 Further reading
  11.9 Exercises

12 Units and ideal classes
  12.1 Units
  12.2 Dirichlet's unit theorem
  12.3 Ideal classes
  12.4 Minkowski's constant
  12.5 Dedekind's theorem
  12.6 The cyclotomic field
  12.7 Calculation of class numbers
  12.8 Local fields
  12.9 Further reading
  12.10 Exercises

13 Analytic number theory
  13.1 Introduction
  13.2 Dirichlet series
  13.3 Tchebychev's estimates
  13.4 Partial summation formula
  13.5 Mertens' results
  13.6 The Tchebychev functions
  13.7 The irrationality of $\zeta(3)$
  13.8 Further reading
  13.9 Exercises

14 On the zeros of the zeta-function
  14.1 Introduction
  14.2 The functional equation
  14.3 The Euler product
  14.4 On the logarithmic derivative of $\zeta(s)$
  14.5 The Riemann hypothesis
  14.6 Explicit formula for $\zeta'(s)/\zeta(s)$
  14.7 On certain sums
  14.8 The Riemann–von Mangoldt formula
  14.9 Further reading
  14.10 Exercises

15 On the distribution of the primes
  15.1 The prime-number theorem
  15.2 Refinements and developments
  15.3 Dirichlet characters
  15.4 Dirichlet L-functions
  15.5 Primes in arithmetical progressions
  15.6 The class number formulae
  15.7 Siegel's theorem
  15.8 Further reading
  15.9 Exercises

16 The sieve and circle methods
  16.1 The Eratosthenes sieve
  16.2 The Selberg upper-bound sieve
  16.3 Applications of the Selberg sieve
  16.4 The large sieve
  16.5 The circle method
  16.6 Additive prime number theory
  16.7 Further reading
  16.8 Exercises

17 Elliptic curves
  17.1 Introduction
  17.2 The Weierstrass $\wp$-function
  17.3 The Mordell–Weil group
  17.4 Heights on elliptic curves
  17.5 The Mordell–Weil theorem
  17.6 Computing the torsion subgroup
  17.7 Conjectures on the rank
  17.8 Isogenies and endomorphisms
  17.9 Further reading
  17.10 Exercises

Bibliography
Index

Preface

This is a sequel to my earlier book, A Concise Introduction to the Theory of Numbers. The latter was based on a short preparatory course of the kind traditionally taught in Cambridge at around the time of publication about 25 years ago. Clearly it was in need of updating, and it was originally intended that a second edition be produced. However, on looking through, it became apparent that the work would blend well with more advanced material arising from my lecture courses in Cambridge at a higher level, and it was decided accordingly that it would be more appropriate to produce a substantially new book. The now much expanded text covers elements of cryptography and primality testing. It also provides an account of number fields in the classical vein including properties of their units, ideals and ideal classes. In addition it covers various aspects of analytic number theory including studies of the Riemann zeta-function, the prime-number theorem, primes in arithmetical progressions and a brief exposition of the Hardy–Littlewood and sieve methods. Many worked examples are given and, as with the earlier volume, there are guides to further reading at the ends of the chapters.

The following remarks, taken from the Concise Introduction, apply even more appropriately here:

The theory of numbers has a long and distinguished history, and indeed the concepts and problems relating to the field have been instrumental in the foundation of a large part of mathematics. It is very much to be hoped that our exposition will serve to stimulate the reader to delve into the rich literature associated with the subject and thereby to discover some of the deep and beautiful theories that have been created as a result of numerous researches over the centuries. By way of introduction, there is a short account of the Disquisitiones Arithmeticae of Gauss, and, to begin with, the reader can scarcely do better than to consult this famous work.

To complete the text there is a chapter on elliptic curves; here my main source has been lecture notes by Dr Tom Fisher of a course that he has given regularly in Cambridge in recent times. I am indebted to him for generously providing me with a copy of the notes and for further expert advice. I am grateful also to Mrs Michèle Bailey for her invaluable secretarial assistance with my lectures over many years and to Dr David Tranah of Cambridge University Press for his constant encouragement in the production of this book.

Cambridge 2012 A.B.

Introduction

Gauss and Number Theory

Without doubt the theory of numbers was Gauss' favourite subject. Indeed, in a much quoted dictum, he asserted that 'Mathematics is the Queen of the Sciences and the Theory of Numbers is the Queen of Mathematics.' Moreover, in the introduction to Eisenstein's Mathematische Abhandlungen, Gauss wrote:

The Higher Arithmetic presents us with an inexhaustible storehouse of interesting truths – of truths, too, which are not isolated but stand in the closest relation to one another, and between which, with each successive advance of the science, we continually discover new and sometimes wholly unexpected points of contact. A great part of the theories of Arithmetic derive an additional charm from the peculiarity that we easily arrive by induction at important propositions which have the stamp of simplicity upon them but the demonstration of which lies so deep as not to be discovered until after many fruitless efforts; and even then it is obtained by some tedious and artificial process while the simpler methods of proof long remain hidden from us.

All this is well illustrated by what is perhaps Gauss' most profound publication, namely his Disquisitiones Arithmeticae. It has been described, quite justifiably I believe, as the Magna Carta of Number Theory, and the depth and originality of thought manifest in this work are particularly remarkable considering that it was written when Gauss was only about 18 years of age. Of course, as Gauss said himself, not all of the subject matter was new at the time of writing, and Gauss acknowledged the considerable debt that he owed to earlier scholars, in particular Fermat, Euler, Lagrange and Legendre. But the Disquisitiones Arithmeticae was the first systematic treatise on the Higher Arithmetic and it provided the foundations and stimulus for a great volume of subsequent research which is in fact continuing to this day. The importance of the work was recognized as soon as it was published in 1801 and the first edition quickly became unobtainable; indeed many scholars of the time had to resort to taking handwritten copies. But it was generally regarded as a rather impenetrable work and it was probably not widely understood; perhaps the formal Latin style contributed in this respect. Now, however, after numerous reformulations, most of the material is very well known, and the earlier sections at least are included in every basic course on number theory.

This article was originally prepared for a meeting of the British Society for the History of Mathematics held in Cambridge in 1977 to celebrate the bicentenary of Gauss' birth.

The text begins with the definition of a congruence, namely two numbers are said to be congruent modulo $n$ if their difference is divisible by $n$. This is plainly an equivalence relation in the now familiar terminology. Gauss proceeds to the discussion of linear congruences and shows that they can in fact be treated somewhat analogously to linear equations. He then turns his attention to power residues and introduces, amongst other things, the concepts of primitive roots and indices; and he notes, in particular, the resemblance between the latter and the ordinary logarithms. There follows an exposition of the theory of quadratic congruences, and it is here that we meet, more especially, the famous law of quadratic reciprocity; this asserts that if $p$, $q$ are primes, not both congruent to 3 (mod 4), then $p$ is a residue or non-residue of $q$ according as $q$ is a residue or non-residue of $p$, while in the remaining case the opposite occurs. As is well known, Gauss spent a great deal of time on this result and gave several demonstrations; and it has subsequently stimulated much excellent research. In particular, following works of Jacobi, Eisenstein and Kummer, Hilbert raised as the ninth of his famous list of problems presented at the Paris Congress of 1900 the question of obtaining higher reciprocity laws, and this led to the celebrated studies of Furtwängler, Artin and others in the context of class field theory.

By far the largest section of the Disquisitiones Arithmeticae is concerned with the theory of binary quadratic forms. Here Gauss describes how quadratic forms with a given discriminant can be divided into classes so that two forms belong to the same class if and only if there exists an integral unimodular substitution relating them, and how the classes can be divided into genera, so that two forms are in the same genus if and only if they are rationally equivalent. He proceeds to apply these concepts so as, for instance, to throw light on the difficult question of the representation of integers by binary forms. It is a remarkable and beautiful theory with many important ramifications. Indeed, after re-interpretation in terms of quadratic fields, it became apparent that it could be applied much more widely, and in fact it can be regarded as having provided the foundations for the whole of algebraic number theory. The term Gaussian field, meaning the field generated over the rationals by $i$, is a reminder of Gauss' pioneering work in this area.

The remainder of the Disquisitiones Arithmeticae contains results of a more miscellaneous character, relating, for instance, to the construction of 17-sided polygons, which was clearly of particular appeal to Gauss, and to what is now termed the cyclotomic field, that is, the field generated by a primitive root of unity. And especially noteworthy here is the discussion of certain sums involving roots of unity, now referred to as Gaussian sums, which play a fundamental role in the analytic theory of numbers.

I conclude this introduction with some words of Mordell. In an essay published in 1917 he wrote 'The theory of numbers is unrivalled for the number and variety of its results and for the beauty and wealth of its demonstrations. The Higher Arithmetic seems to include most of the romance of mathematics. As Gauss wrote to Sophie Germain, the enchanting beauties of this sublime study are revealed in their full charm only to those who have the courage to pursue it.' And Mordell added 'We are reminded of the folk-tales, current amongst all peoples, of the Prince Charming who can assume his proper form as a handsome prince only because of the devotedness of the faithful heroine.'

1 Divisibility

1.1 Foundations

The set $1, 2, 3, \ldots$ of all natural numbers will be denoted by $\mathbb{N}$. There is no need to enter here into philosophical questions concerning the existence of $\mathbb{N}$. It will suffice to assume that it is a given set for which the Peano axioms are satisfied. They imply that addition and multiplication can be defined on $\mathbb{N}$ such that the commutative, associative and distributive laws are valid. Further, an ordering on $\mathbb{N}$ can be introduced so that either $m < n$ or $n < m$ for any distinct elements $m$, $n$ in $\mathbb{N}$. Furthermore, it is evident from the axioms that the principle of mathematical induction holds and that every non-empty subset of $\mathbb{N}$ has a least member. We shall frequently appeal to these properties.

As customary, we shall denote by $\mathbb{Z}$ the set of integers $0, \pm 1, \pm 2, \ldots$, and by $\mathbb{Q}$ the set of rationals, that is, the numbers $p/q$ with $p$ in $\mathbb{Z}$ and $q$ in $\mathbb{N}$. The construction, commencing with $\mathbb{N}$, of $\mathbb{Z}$, $\mathbb{Q}$ and then, through Cauchy sequences and ordered pairs, the real and complex numbers $\mathbb{R}$ and $\mathbb{C}$ forms the basis of mathematical analysis and it is assumed known.

1.2 Division algorithm

Suppose that $a$, $b$ are elements of $\mathbb{N}$. One says that $b$ divides $a$ (written $b \mid a$) if there exists an element $c$ of $\mathbb{N}$ such that $a = bc$. In this case $b$ is referred to as a divisor of $a$, and $a$ is called a multiple of $b$. The relation $b \mid a$ is reflexive and transitive but not symmetric; in fact if $b \mid a$ and $a \mid b$ then $a = b$. Clearly also if $b \mid a$ then $b \le a$ and so a natural number has only finitely many divisors. The concept of divisibility is readily extended to $\mathbb{Z}$; if $a$, $b$ are elements of $\mathbb{Z}$, with $b \ne 0$, then $b$ is said to divide $a$ if there exists $c$ in $\mathbb{Z}$ such that $a = bc$.

We shall frequently appeal to the division algorithm. This asserts that for any $a$, $b$ in $\mathbb{Z}$, with $b > 0$, there exist $q$, $r$ in $\mathbb{Z}$ such that $a = bq + r$ and $0 \le r < b$. The proof is simple; indeed if $bq$ is the largest multiple of $b$ that does not exceed $a$ then the integer $r = a - bq$ is certainly non-negative and, since $b(q + 1) > a$, we have $r < b$. The result plainly remains valid for any integer $b \ne 0$ provided that the bound $r < b$ is replaced by $r < |b|$.
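By way of illustration, the division algorithm can be sketched in Python. The function name `division_algorithm` is our own choice and not part of the text; the sketch relies on the fact that Python's `//` operator rounds towards minus infinity, so that $bq$ is indeed the largest multiple of $b$ not exceeding $a$.

```python
def division_algorithm(a: int, b: int) -> tuple[int, int]:
    """Return (q, r) with a == b*q + r and 0 <= r < |b| (b must be non-zero)."""
    if b == 0:
        raise ValueError("b must be non-zero")
    m = abs(b)
    # Floor division makes m*q the largest multiple of m not exceeding a,
    # so r = a - m*q satisfies 0 <= r < m.
    q = a // m
    r = a - m * q
    # For negative b, rewrite a = m*q + r as a = b*(-q) + r.
    return (q if b > 0 else -q), r


print(division_algorithm(187, 35))   # (5, 12)
print(division_algorithm(-7, 3))     # (-3, 2)
print(division_algorithm(7, -3))     # (-2, 1)
```

The last call shows the extension to negative $b$, where the remainder is bounded by $|b|$.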

1.3 Greatest common divisor

By the greatest common divisor of natural numbers $a$, $b$ we mean an element $d$ of $\mathbb{N}$ such that $d \mid a$, $d \mid b$ and every common divisor of $a$ and $b$ also divides $d$. We proceed to prove that a number $d$ with these properties exists; plainly it will be unique, for any other such number $d'$ would divide $a$, $b$ and so also $d$, and since similarly $d \mid d'$ we have $d = d'$.

Accordingly consider the set of all natural numbers of the form $ax + by$ with $x$, $y$ in $\mathbb{Z}$. The set is not empty since, for instance, it contains $a$ and $b$; hence there is a least member $d$, say. Now $d = ax + by$ for some integers $x$, $y$, whence every common divisor of $a$ and $b$ certainly divides $d$. Further, by the division algorithm, we have $a = dq + r$ for some $q$, $r$ in $\mathbb{Z}$ with $0 \le r < d$; this gives $r = ax' + by'$, where $x' = 1 - qx$ and $y' = -qy$. Thus, from the minimal property of $d$, it follows that $r = 0$, whence $d \mid a$. Similarly we have $d \mid b$, as required.

It is customary to signify the greatest common divisor of $a$, $b$ by $(a, b)$. Clearly, for any $n$ in $\mathbb{N}$, the equation $ax + by = n$ is soluble in integers $x$, $y$ if and only if $(a, b)$ divides $n$. In the case $(a, b) = 1$ we say that $a$ and $b$ are relatively prime or coprime (or that $a$ is prime to $b$). Then the equation $ax + by = n$ is always soluble.

Obviously one can extend these concepts to more than two numbers. In fact one can show that any elements $a_1, \ldots, a_m$ of $\mathbb{N}$ have a greatest common divisor $d = (a_1, \ldots, a_m)$ such that $d = a_1 x_1 + \cdots + a_m x_m$ for some integers $x_1, \ldots, x_m$. Further, if $d = 1$, we say that $a_1, \ldots, a_m$ are relatively prime and then the equation $a_1 x_1 + \cdots + a_m x_m = n$ is always soluble.

1.4 Euclid's algorithm

A method for finding the greatest common divisor $d$ of $a$, $b$ was described by Euclid. It proceeds as follows.

By the division algorithm there exist integers $q_1$, $r_1$ such that $a = bq_1 + r_1$ and $0 \le r_1 < b$. If $r_1 \ne 0$ then there exist integers $q_2$, $r_2$ such that $b = r_1 q_2 + r_2$ and $0 \le r_2 < r_1$. If $r_2 \ne 0$ then there exist integers $q_3$, $r_3$ such that $r_1 = r_2 q_3 + r_3$ and $0 \le r_3 < r_2$. Continuing thus, one obtains a decreasing sequence $r_1, r_2, \ldots$ satisfying $r_{j-2} = r_{j-1} q_j + r_j$. The sequence terminates when $r_{k+1} = 0$ for some $k$, that is, when $r_{k-1} = r_k q_{k+1}$. It is then readily verified that $d = r_k$. Indeed it is evident from the equations

$$\begin{aligned}
a &= bq_1 + r_1, & 0 < r_1 < b;\\
b &= r_1 q_2 + r_2, & 0 < r_2 < r_1;\\
r_1 &= r_2 q_3 + r_3, & 0 < r_3 < r_2;\\
&\;\;\vdots\\
r_{k-2} &= r_{k-1} q_k + r_k, & 0 < r_k < r_{k-1};\\
r_{k-1} &= r_k q_{k+1}
\end{aligned}$$

that every common divisor of $a$ and $b$ divides $r_1, r_2, \ldots, r_k$; and, moreover, viewing the equations in the reverse order, it is clear that $r_k$ divides each $r_j$ and so also $b$ and $a$.

Euclid's algorithm furnishes another proof of the existence of integers $x$, $y$ satisfying $d = ax + by$, and furthermore it enables these $x$, $y$ to be explicitly calculated. For we have $d = r_k$ and $r_j = r_{j-2} - r_{j-1} q_j$, whence the required values can be obtained by successive substitution. Let us take, for example, $a = 187$ and $b = 35$. Then, following Euclid, we have

$$187 = 35 \cdot 5 + 12, \quad 35 = 12 \cdot 2 + 11, \quad 12 = 11 \cdot 1 + 1.$$

Thus we see that $(187, 35) = 1$ and moreover

$$1 = 12 - 11 \cdot 1 = 12 - (35 - 12 \cdot 2) = 3(187 - 35 \cdot 5) - 35.$$

Hence a solution of the equation $187x + 35y = 1$ in integers $x$, $y$ is given by $x = 3$, $y = -16$.

For another example let us take $a = 1000$ and $b = 45$; then we get

$$1000 = 45 \cdot 22 + 10, \quad 45 = 10 \cdot 4 + 5, \quad 10 = 5 \cdot 2$$

and so $d = 5$. The solutions to $ax + by = d$ can then be calculated from

$$5 = 45 - 10 \cdot 4 = 45 - (1000 - 45 \cdot 22) \cdot 4 = 45 \cdot 89 - 1000 \cdot 4$$

which gives $x = -4$, $y = 89$. Note that the process is very efficient: if $a > b$ then a solution $x$, $y$ can be found in $O((\log a)^3)$ bit operations.
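The successive substitution just described can be carried out alongside the divisions themselves. The following Python sketch is one common way of organizing the computation (the iterative "extended" form of Euclid's algorithm); the function name `extended_euclid` and the variable names are our own and do not come from the text.

```python
def extended_euclid(a: int, b: int) -> tuple[int, int, int]:
    """Return (d, x, y) with d = (a, b) and d == a*x + b*y, for natural numbers a, b."""
    # Invariants maintained throughout:
    #   old_r == a*old_x + b*old_y   and   r == a*x + b*y.
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r                    # quotient q_j in r_{j-2} = r_{j-1} q_j + r_j
        old_r, r = r, old_r - q * r       # next remainder
        old_x, x = x, old_x - q * x       # same substitution applied to the coefficients
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


print(extended_euclid(187, 35))    # (1, 3, -16)
print(extended_euclid(1000, 45))   # (5, -4, 89)
```

Both calls reproduce the solutions obtained above by hand, namely $187 \cdot 3 - 35 \cdot 16 = 1$ and $45 \cdot 89 - 1000 \cdot 4 = 5$.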

There is a close connection between Euclid's algorithm and the theory of continued fractions; this will be discussed in Chapter 6.


1.5 Fundamental theorem

A natural number, other than 1, is called a prime if it is divisible only by itself and 1. The smallest primes are therefore given by 2, 3, 5, 7, 11, . . . .

Let $n$ be any natural number other than 1. The least divisor of $n$ that exceeds 1 is plainly a prime, say $p_1$. If $n \ne p_1$ then, similarly, there is a prime $p_2$ dividing $n/p_1$. If $n \ne p_1 p_2$ then there is a prime $p_3$ dividing $n/p_1 p_2$; and so on. After a finite number of steps we obtain $n = p_1 \cdots p_m$; and by grouping together we get the standard factorization (or canonical decomposition) $n = p_1^{j_1} \cdots p_k^{j_k}$, where $p_1, \ldots, p_k$ denote distinct primes and $j_1, \ldots, j_k$ are elements of $\mathbb{N}$.
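The procedure above, in which one repeatedly removes the least divisor exceeding 1, translates directly into trial division. The following Python sketch is illustrative only; the function name `standard_factorization` is our own, and the method is practical only for small $n$.

```python
def standard_factorization(n: int) -> dict[int, int]:
    """Return {p: j} such that n is the product of p**j over distinct primes p (n > 1)."""
    if n < 2:
        raise ValueError("n must be a natural number other than 1")
    factors: dict[int, int] = {}
    p = 2
    while p * p <= n:
        # If p divides n here, p is the least divisor > 1 of the current n,
        # and is therefore prime.
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        # What remains has no divisor d with 1 < d*d <= n, so it is itself prime.
        factors[n] = factors.get(n, 0) + 1
    return factors


print(standard_factorization(360))   # {2: 3, 3: 2, 5: 1}
print(standard_factorization(187))   # {11: 1, 17: 1}
```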

The fundamental theorem of arithmetic asserts that the above factorization is unique except for the order of the factors. To prove the result, note first that if a prime $p$ divides a product $mn$ of natural numbers then either $p$ divides $m$ or $p$ divides $n$. Indeed if $p$ does not divide $m$ then $(p, m) = 1$, whence there exist integers $x$, $y$ such that $px + my = 1$; thus we have $pnx + mny = n$ and hence $p$ divides $n$. More generally we conclude that if $p$ divides $n_1 n_2 \cdots n_k$ then $p$ divides $n_l$ for some $l$. Now suppose that, apart from the factorization $n = p_1^{j_1} \cdots p_k^{j_k}$ derived above, there is another decomposition and that $p'$ is one of the primes occurring therein. From the preceding conclusion we obtain $p' = p_l$ for some $l$. Hence we deduce that, if the standard factorization for $n/p'$ is unique, then so also is that for $n$. The fundamental theorem follows by induction.

It is simple to express the greatest common divisor $(a, b)$ of elements $a$, $b$ of $\mathbb{N}$ in terms of the primes occurring in their decompositions. In fact we can write $a = p_1^{\alpha_1} \cdots p_k^{\alpha_k}$ and $b = p_1^{\beta_1} \cdots p_k^{\beta_k}$, where $p_1, \ldots, p_k$ are distinct primes and the $\alpha$s and $\beta$s are non-negative integers; then $(a, b) = p_1^{\gamma_1} \cdots p_k^{\gamma_k}$, where $\gamma_l = \min(\alpha_l, \beta_l)$. With the same notation, the lowest common multiple of $a$, $b$ is defined by $\{a, b\} = p_1^{\delta_1} \cdots p_k^{\delta_k}$, where $\delta_l = \max(\alpha_l, \beta_l)$. The identity $(a, b)\{a, b\} = ab$ is readily verified.
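The formulae for $(a, b)$ and $\{a, b\}$ in terms of prime exponents can be checked numerically. The sketch below is a minimal illustration; the helper names `prime_exponents` and `gcd_lcm_from_factorizations` are hypothetical, and trial division is used purely for simplicity.

```python
def prime_exponents(n: int) -> dict[int, int]:
    """Exponents in the standard factorization of n by trial division (empty for n = 1)."""
    exps: dict[int, int] = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            exps[p] = exps.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        exps[n] = exps.get(n, 0) + 1
    return exps


def gcd_lcm_from_factorizations(a: int, b: int) -> tuple[int, int]:
    """Return ((a, b), {a, b}) using min and max of the prime exponents."""
    fa, fb = prime_exponents(a), prime_exponents(b)
    g = l = 1
    for p in set(fa) | set(fb):
        alpha, beta = fa.get(p, 0), fb.get(p, 0)   # a missing prime has exponent 0
        g *= p ** min(alpha, beta)
        l *= p ** max(alpha, beta)
    return g, l


g, l = gcd_lcm_from_factorizations(1000, 45)
print(g, l, g * l == 1000 * 45)   # 5 9000 True
```

Since $\min(\alpha_l, \beta_l) + \max(\alpha_l, \beta_l) = \alpha_l + \beta_l$ for each $l$, the product of the two results equals $ab$, which is the identity stated above.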

1.6 Properties of the primes

There exist infinitely many primes, for if $p_1, \ldots, p_n$ is any finite set of primes then $p_1 \cdots p_n + 1$ is divisible by a prime different from $p_1, \ldots, p_n$; the argument is due to Euclid. It follows that, if $p_n$ is the $n$th prime in ascending order of magnitude, then $p_m$ divides $p_1 \cdots p_n + 1$ for some $m \ge n + 1$; from this we deduce by induction that $p_n < 2^{2^n}$. In fact a much stronger result is known; indeed $p_n \sim n \log n$ as $n \to \infty$. (The notation $f \sim g$ means that $f/g \to 1$; and one says that $f$ is asymptotic to $g$.) The result is equivalent to the assertion that the number $\pi(x)$ of primes $p \le x$ satisfies $\pi(x) \sim x/\log x$ as $x \to \infty$. This is called the prime-number theorem and it was proved by Hadamard and de la Vallée Poussin independently in 1896. Their proofs were based on properties of the Riemann zeta-function about which we shall speak in Chapter 2. In 1737 Euler proved that the series $\sum 1/p_n$ diverges and he noted that this gives another demonstration of the existence of infinitely many primes. In fact it can be shown by elementary arguments that, for some number $c$,

$$\sum_{p \le x} 1/p = \log\log x + c + O(1/\log x).$$
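To give some numerical feeling for the prime-number theorem, one may tabulate $\pi(x)$ against $x/\log x$. The Python sketch below is illustrative only: the function name `prime_count` is our own, and the counting is done by a simple sieve rather than by any method from the text. The printed ratio $\pi(x)\log x / x$ tends to 1, though only slowly.

```python
import math


def prime_count(x: int) -> int:
    """Return pi(x), the number of primes p <= x, using a simple sieve."""
    if x < 2:
        return 0
    is_prime = bytearray([1]) * (x + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(x) + 1):
        if is_prime[i]:
            # Strike out i*i, i*i + i, ...; smaller multiples were struck earlier.
            count = len(range(i * i, x + 1, i))
            is_prime[i * i :: i] = bytearray(count)
    return sum(is_prime)


for x in (10**3, 10**4, 10**5, 10**6):
    pi_x = prime_count(x)
    print(x, pi_x, round(pi_x * math.log(x) / x, 4))
```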

Fermat conjectured that the numbers $2^{2^n} + 1$ $(n = 1, 2, \ldots)$ are all primes; this is true for $n = 1, 2, 3$ and 4 but false for $n = 5$, as was proved by Euler. In fact 641 divides $2^{32} + 1$. Numbers of the above form that are primes are called Fermat primes. They are closely connected with the existence of a construction of a regular plane polygon with ruler and compasses only. In fact the regular plane polygon with $p$ sides, where $p$ is a prime, is capable of construction if and only if $p$ is a Fermat prime. It is not known at present whether the number of Fermat primes is finite or infinite.
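The statements about the first few numbers $2^{2^n} + 1$ are easily confirmed by machine. In the Python sketch below, the names `is_prime` and `fermat_number` are our own, and primality is tested by naive trial division, which suffices for numbers of this size.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate only for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def fermat_number(n: int) -> int:
    """Return 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1


for n in range(1, 5):
    print(n, fermat_number(n), is_prime(fermat_number(n)))   # 5, 17, 257, 65537: all prime

print(fermat_number(5) % 641)                 # 0, so 641 divides 2^32 + 1
print(fermat_number(5) == 641 * 6700417)      # True
```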

Numbers of the form $2^n - 1$ that are primes are called Mersenne primes. In this case $n$ is a prime, for plainly $2^m - 1$ divides $2^n - 1$ if $m$ divides $n$. Mersenne primes are of particular interest in providing examples of large prime numbers; for instance it is known that $2^{44\,497} - 1$ is the 27th Mersenne prime, a number with 13 395 digits.
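A short Python sketch illustrates the divisibility remark and lists the small exponents $n$ for which $2^n - 1$ is prime. The names `is_prime` and `mersenne` are our own; trial division is used, so only small exponents are examined, and the digit count of $2^{44\,497} - 1$ is obtained from a logarithm rather than by writing the number out.

```python
import math


def is_prime(n: int) -> bool:
    """Trial division; adequate only for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def mersenne(n: int) -> int:
    """Return 2^n - 1."""
    return 2 ** n - 1


# 2^m - 1 divides 2^n - 1 whenever m divides n.
divides = all(mersenne(n) % mersenne(m) == 0
              for n in range(1, 40) for m in range(1, n + 1) if n % m == 0)
print(divides)   # True

# Exponents n <= 31 giving Mersenne primes; each such n is itself prime.
exponents = [n for n in range(2, 32) if is_prime(mersenne(n))]
print(exponents)   # [2, 3, 5, 7, 13, 17, 19, 31]

# Number of decimal digits of 2^44497 - 1, computed as floor(44497 * log10(2)) + 1.
digits = math.floor(44497 * math.log10(2)) + 1
print(digits)   # 13395
```

Note that a prime exponent is necessary but not sufficient: $2^{11} - 1 = 23 \cdot 89$ is composite.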

It is easily seen that no polynomial $f(n)$ with integer coefficients can be prime for all $n$ in $\mathbb{N}$, or even for all sufficiently large $n$, unless $f$ is constant. Indeed by Taylor's theorem, $f(mf(n) + n)$ is divisible by $f(n)$ for all $m$ in $\mathbb{N}$. On the other hand, the remarkable polynomial $n^2 - n + 41$ is prime for $n = 1, 2, \ldots, 40$. Furthermore one can write down a polynomial $f(n_1, \ldots, n_k)$ with the property that, as the $n_j$ run through the elements of $\mathbb{N}$, the set of positive values assumed by $f$ is precisely the sequence of primes. The latter result arises from studies in logic relating to Hilbert's tenth problem (see Chapter 8).
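Both observations about $n^2 - n + 41$ can be verified directly. The following Python sketch uses our own names `is_prime` and `f`; it checks primality for $n = 1, \ldots, 40$, shows that the pattern fails at $n = 41$, and tests the divisibility of $f(mf(n) + n)$ by $f(n)$ over a small range.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate only for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def f(n: int) -> int:
    """The polynomial n^2 - n + 41."""
    return n * n - n + 41


print(all(is_prime(f(n)) for n in range(1, 41)))   # True
print(f(41), is_prime(f(41)))                      # 1681 False, since 1681 = 41^2

# f(m*f(n) + n) is congruent to f(n) modulo f(n), hence divisible by f(n).
print(all(f(m * f(n) + n) % f(n) == 0
          for n in range(1, 20) for m in range(1, 20)))   # True
```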

The primes are well distributed in the sense that, for every $n > 1$, there is always a prime between $n$ and $2n$. This result, which is commonly referred to as Bertrand's postulate, can be regarded as the forerunner of extensive researches on the difference $p_{n+1} - p_n$ of consecutive primes. In fact estimates of the form $p_{n+1} - p_n = O(p_n^{\kappa})$ are known with values of $\kappa$ just a little greater than $\tfrac{1}{2}$; but, on the other hand, the difference is certainly not bounded, since the consecutive integers $n! + m$ with $m = 2, 3, \ldots, n$ are all composite. A famous theorem of Dirichlet asserts that any arithmetical progression $a$, $a + q$, $a + 2q, \ldots$, where $(a, q) = 1$, contains infinitely many primes. Some special cases, for instance the existence of infinitely many primes of the form $4n + 3$, can be deduced simply by modifying Euclid's argument given at the beginning, but the general result lies quite deep. Indeed Dirichlet's proof involved, amongst other things, the concepts of characters and L-functions, and of class numbers of quadratic forms, and it has been of far-reaching significance in the history of mathematics.

Two notorious unsolved problems in prime-number theory are the Goldbach conjecture, mentioned in a letter to Euler of 1742, to the effect that every even integer (> 2) is the sum of two primes, and the twin-prime conjecture, to the effect that there exist infinitely many pairs of primes, such as 3, 5 and 17, 19, that differ by 2. By ingenious work on sieve methods, Chen showed in 1974 that these conjectures are valid if one of the primes is replaced by a number with at most two prime factors (assuming, in the Goldbach case, that the even integer is sufficiently large). The oldest known sieve, incidentally, is due to Eratosthenes. He observed that if one deletes from the set of integers $2, 3, \ldots, n$, first all multiples of 2, then all multiples of 3, and so on up to the largest integer not exceeding $\sqrt{n}$, then only primes remain. Studies on Goldbach's conjecture gave rise to the Hardy–Littlewood circle method of analysis and, in particular, to the celebrated theorem of Vinogradov to the effect that every sufficiently large odd integer is the sum of three primes.
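The sieve of Eratosthenes mentioned above is readily expressed in code. The following Python sketch follows the description literally, on the assumption that "multiples of $k$" means proper multiples $2k, 3k, \ldots$ (so that $k$ itself is retained); the function name `eratosthenes` is our own.

```python
import math


def eratosthenes(n: int) -> list[int]:
    """Return the primes up to n by deleting multiples of 2, 3, ..., floor(sqrt(n))."""
    remaining = set(range(2, n + 1))
    for k in range(2, math.isqrt(n) + 1):
        # Delete the proper multiples 2k, 3k, ... of k (k itself is kept).
        remaining -= set(range(2 * k, n + 1, k))
    return sorted(remaining)


print(eratosthenes(30))         # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
print(len(eratosthenes(100)))   # 25
```

Any composite integer not exceeding $n$ has a divisor $k$ with $1 < k \le \sqrt{n}$, which is why it suffices to stop at the largest integer not exceeding $\sqrt{n}$.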

1.7 Further reading

For a good account of the Peano axioms see the book by E. Landau, Foundations of Analysis (Chelsea Publishing, 1951).

The division algorithm, Euclid's algorithm and the fundamental theorem of arithmetic are discussed in every elementary text on number theory. The tracts are too numerous to list here but for many years the book by G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers (Oxford University Press, 2008) has been regarded as a standard work in the field. The books of similar title by T. Nagell (Wiley, 1951) and H. M. Stark (MIT Press, 1978) are also to be recommended, as well as the volume by E. Landau, Elementary Number Theory (Chelsea Publishing, 1958).

For properties of the primes, see the book by Hardy and Wright mentioned above and, for more advanced reading, see, for instance, H. Davenport, Multiplicative Number Theory (Springer, 2000) and H. Halberstam and H. E. Richert, Sieve Methods (Academic Press, 1974). The latter contains, in particular, a proof of Chen's theorem. The result referred to on a polynomial in several variables representing primes arose from work of Davis, Robinson, Putnam and Matiyasevich on Hilbert's tenth problem; see, for instance, the article by J. P. Jones et al. in American Math. Monthly 83 (1976), 449–464, where it is shown that 12 variables suffice. The best result to date, due to Matiyasevich, is 10 variables; a proof is given in the article by J. P. Jones in J. Symbolic Logic 47 (1982), 549–571.

For full publication details please refer to the Bibliography on page 240.

1.8 Exercises

(i) Find integers $x$, $y$ such that $22x + 37y = 1$.

(ii) Find integers $x$, $y$ such that $95x + 432y = 1$.

(iii) Find integers $x$, $y$, $z$ such that $6x + 15y + 10z = 1$.

(iv) Find integers $x$, $y$, $z$ such that $35x + 55y + 77z = 1$.

(v) Prove that $1 + \tfrac{1}{2} + \cdots + 1/n$ is not an integer for $n > 1$.

(vi) Prove that $(\{a, b\}, \{b, c\}, \{c, a\}) = \{(a, b), (b, c), (c, a)\}$.

(vii) Prove that if $g_1, g_2, \ldots$ are integers $> 1$ then every natural number can be expressed uniquely in the form $a_0 + a_1 g_1 + a_2 g_1 g_2 + \cdots + a_k g_1 \cdots g_k$, where the $a_j$ are integers satisfying $0 \le a_j < g_{j+1}$.

(viii) Show that there exist infinitely many primes of the form $4n + 3$.

(ix) Show that, if $2^n + 1$ is a prime, then it is in fact a Fermat prime.

(x) Show that, if $m > n$, then $2^{2^n} + 1$ divides $2^{2^m} - 1$ and so $(2^{2^m} + 1, 2^{2^n} + 1) = 1$.

(xi) Deduce that $p_{n+1} \le 2^{2^n} + 1$, whence $\pi(x) \ge \log\log x$ for $x \ge 2$.

2 Arithmetical functions

2.1 The function [x]

For any real $x$, one signifies by $[x]$ the largest integer $\le x$, that is, the unique integer such that $x - 1 < [x] \le x$. The function is called the integral part of $x$. It is readily verified that $[x + y] \ge [x] + [y]$ and that, for any positive integer $n$, $[x + n] = [x] + n$ and $[x/n] = [[x]/n]$. The difference $x - [x]$ is called the fractional part of $x$; it is written $\{x\}$ and satisfies $0 \le \{x\} < 1$.

Let now $p$ be a prime. The largest integer $l$ such that $p^l$ divides $n!$ can be neatly expressed in terms of the above function. In fact, on noting that $[n/p]$ of the numbers $1, 2, \ldots, n$ are divisible by $p$, that $[n/p^2]$ are divisible by $p^2$, and so on, we obtain

$$l = \sum_{m=1}^{n} \sum_{\substack{j=1 \\ p^j \mid m}}^{\infty} 1 = \sum_{j=1}^{\infty} \sum_{\substack{m=1 \\ p^j \mid m}}^{n} 1 = \sum_{j=1}^{\infty} [n/p^j].$$

It follows easily that $l \le [n/(p - 1)]$; for the latter sum is at most $n(1/p + 1/p^2 + \cdots)$. The result also shows at once that the binomial coefficient

$$\binom{m}{n} = \frac{m!}{n!\,(m - n)!}$$

is an integer; for we have

$$[m/p^j] \ge [n/p^j] + [(m - n)/p^j].$$

Indeed, more generally, if $n_1, \ldots, n_k$ are positive integers such that $n_1 + \cdots + n_k = m$ then the expression $m!/(n_1! \cdots n_k!)$ is an integer.
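The formula for the exponent of a prime in $n!$ is easy to check numerically. The following is a minimal Python sketch, not taken from the text; the function names are illustrative only. It evaluates the sum of the integral parts $[n/p^j]$ and compares it with a direct count and with the bound $[n/(p-1)]$.

```python
from math import factorial

def prime_power_in_factorial(n: int, p: int) -> int:
    """Exponent l of the prime p in n!, computed as the sum of [n / p^j], j >= 1."""
    l, q = 0, p
    while q <= n:
        l += n // q      # [n / p^j] numbers up to n are divisible by p^j
        q *= p
    return l

def exponent_by_division(value: int, p: int) -> int:
    """Brute-force check: divide p out of value as often as possible."""
    count = 0
    while value % p == 0:
        value //= p
        count += 1
    return count

n, p = 100, 5
print(prime_power_in_factorial(n, p),
      exponent_by_division(factorial(n), p),
      n // (p - 1))      # prints: 24 24 25
```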


2.2 Multiplicative functions

A real function $f$ defined on the positive integers is said to be multiplicative if $f(m) f(n) = f(mn)$ for all $m, n$ with $(m, n) = 1$. We shall meet many examples. Plainly if $f$ is multiplicative and does not vanish identically then $f(1) = 1$. Further, if $n = p_1^{j_1} \cdots p_k^{j_k}$ in standard form then

$$f(n) = f(p_1^{j_1}) \cdots f(p_k^{j_k}).$$

Thus to evaluate $f$ it suffices to calculate its values on the prime powers; we shall appeal to this property frequently.

We shall also use the fact that if $f$ is multiplicative and if

$$g(n) = \sum_{d \mid n} f(d),$$

where the sum is over all divisors $d$ of $n$, then $g$ is a multiplicative function. Indeed, if $(m, n) = 1$, we have

$$g(mn) = \sum_{d \mid m} \sum_{d' \mid n} f(dd') = \sum_{d \mid m} f(d) \sum_{d' \mid n} f(d') = g(m) g(n).$$

2.3 Euler's (totient) function $\phi(n)$

By $\phi(n)$ we mean the number of numbers $1, 2, \ldots, n$ that are relatively prime to $n$. Thus, in particular, $\phi(1) = \phi(2) = 1$ and $\phi(3) = \phi(4) = 2$.

We shall show, in the next chapter, from properties of congruences, that $\phi$ is multiplicative. Now, as is easily verified, $\phi(p^j) = p^j - p^{j-1}$ for all prime powers $p^j$. It follows at once that

$$\phi(n) = n \prod_{p \mid n} (1 - 1/p).$$

We proceed to establish this formula directly without assuming that $\phi$ is multiplicative. In fact the formula furnishes another proof of this property.

Let $p_1, \ldots, p_k$ be the distinct prime factors of $n$. Then it suffices to show that $\phi(n)$ is given by

$$n - \sum_{r} n/p_r + \sum_{r > s} n/(p_r p_s) - \sum_{r > s > t} n/(p_r p_s p_t) + \cdots.$$


But $n/p_r$ is the number of numbers $1, 2, \ldots, n$ that are divisible by $p_r$; $n/(p_r p_s)$ is the number that are divisible by $p_r p_s$; and so on. Hence the above expression is

$$\sum_{m=1}^{n} \Bigl( 1 - \sum_{\substack{r \\ p_r \mid m}} 1 + \sum_{\substack{r > s \\ p_r p_s \mid m}} 1 - \cdots \Bigr) = \sum_{m=1}^{n} \Bigl( 1 - \binom{l}{1} + \binom{l}{2} - \cdots \Bigr),$$

where $l = l(m)$ is the number of primes $p_1, \ldots, p_k$ that divide $m$. Now the summand on the right is $(1 - 1)^l = 0$ if $l > 0$, and it is 1 if $l = 0$. The required result follows. The demonstration is a particular example of an argument due to Sylvester. Note that the result can be obtained alternatively as an immediate application of the inclusion–exclusion principle. For the respective sums in the required expression for $\phi(n)$ give the number of elements in the set $1, 2, \ldots, n$ that possess precisely $1, 2, 3, \ldots$ of the properties of divisibility by $p_j$ for $1 \le j \le k$ and the principle (or rather the complement of it) gives the analogous expression for the number of elements in an arbitrary set of $n$ objects that possess none of $k$ possible properties.

It is a simple consequence of the multiplicative property of $\phi$ that

$$\sum_{d \mid n} \phi(d) = n.$$

In fact the expression on the left is multiplicative and, when $n = p^j$, it becomes

$$\phi(1) + \phi(p) + \cdots + \phi(p^j) = 1 + (p - 1) + \cdots + (p^j - p^{j-1}) = p^j.$$
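As a numerical illustration of this section, the sketch below (Python, with hypothetical helper names that do not come from the text) evaluates Euler's function from the product over the distinct primes dividing $n$, compares it with the defining count, and checks that the values of the function over the divisors of $n$ add up to $n$.

```python
from math import gcd

def distinct_prime_factors(n: int) -> list:
    """Distinct primes dividing n, by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes

def phi(n: int) -> int:
    """Euler's function via n * prod(1 - 1/p), kept in integer arithmetic."""
    result = n
    for p in distinct_prime_factors(n):
        result -= result // p    # multiply by (1 - 1/p)
    return result

def phi_by_counting(n: int) -> int:
    """Definition: how many of 1, ..., n are relatively prime to n."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

for n in range(1, 200):
    assert phi(n) == phi_by_counting(n)
    assert sum(phi(d) for d in range(1, n + 1) if n % d == 0) == n
```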

2.4 The Möbius function $\mu(n)$

This is defined, for any positive integer $n$, as 0 if $n$ contains a squared factor, and as $(-1)^k$ if $n = p_1 \cdots p_k$ as a product of $k$ distinct primes. Further, by convention, $\mu(1) = 1$.

It is clear that $\mu$ is multiplicative. Thus the function

$$\nu(n) = \sum_{d \mid n} \mu(d)$$

is also multiplicative. Now for all prime powers $p^j$ with $j > 0$ we have $\nu(p^j) = \mu(1) + \mu(p) = 0$. Hence we obtain the basic property, namely $\nu(n) = 0$ for $n > 1$ and $\nu(1) = 1$. We proceed to use this property to establish the Möbius inversion formulae.


Let $f$ be any arithmetical function, that is, a function defined on the positive integers, and let

$$g(n) = \sum_{d \mid n} f(d).$$

Then we have

$$f(n) = \sum_{d \mid n} \mu(d)\, g(n/d).$$

In fact the right-hand side is

$$\sum_{d \mid n} \sum_{d' \mid n/d} \mu(d) f(d') = \sum_{d' \mid n} f(d')\, \nu(n/d'),$$

and the result follows since $\nu(n/d') = 0$ unless $d' = n$. The converse also holds, for we can write the second equation in the form

$$f(n) = \sum_{d' \mid n} \mu(n/d')\, g(d')$$

and then

$$\sum_{d \mid n} f(d) = \sum_{d \mid n} f(n/d) = \sum_{d \mid n} \sum_{d' \mid n/d} \mu(n/(dd'))\, g(d') = \sum_{d' \mid n} g(d')\, \nu(n/d').$$

Again we have $\nu(n/d') = 0$ unless $d' = n$, whence the expression on the right is $g(n)$.

The Euler and Möbius functions are related by the equation

$$\phi(n) = n \sum_{d \mid n} \mu(d)/d.$$

This can be seen directly from the formula for $\phi$ established in Section 2.3, and it also follows at once by Möbius inversion from the property of $\phi$ recorded at the end of Section 2.3. Indeed the relation is clear from the multiplicative properties of $\phi$ and $\mu$.

There is an analogue of Möbius inversion for functions defined over the reals, namely if

$$g(x) = \sum_{n \le x} f(x/n)$$

then

$$f(x) = \sum_{n \le x} \mu(n)\, g(x/n).$$

In fact the last sum is

$$\sum_{n \le x} \sum_{m \le x/n} \mu(n) f(x/(mn)) = \sum_{l \le x} f(x/l)\, \nu(l)$$

and the result follows since $\nu(l) = 0$ for $l > 1$. We shall give several applications of Möbius inversion in the examples at the end of the chapter.
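The inversion formula for arithmetical functions can be tested on any concrete choice of $f$. The sketch below is in Python and is not part of the text; the test function `f` is an arbitrary choice made purely for illustration. It checks both the basic property of the sum of the Möbius function over the divisors of $n$ and the recovery of $f$ from its divisor sum $g$.

```python
def mobius(n: int) -> int:
    """mu(n): 0 if n has a squared factor, else (-1)^k for k distinct primes."""
    k, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:       # p^2 divided the original n
                return 0
            k += 1
        p += 1
    if n > 1:                    # one remaining prime factor
        k += 1
    return -1 if k % 2 else 1

def divisors(n: int) -> list:
    return [d for d in range(1, n + 1) if n % d == 0]

def f(n: int) -> int:
    """Arbitrary arithmetical function, chosen only for illustration."""
    return n * n + 3

def g(n: int) -> int:
    """Divisor sum of f."""
    return sum(f(d) for d in divisors(n))

for n in range(1, 150):
    # Basic property: the sum of mu over the divisors is 1 for n = 1, else 0.
    assert sum(mobius(d) for d in divisors(n)) == (1 if n == 1 else 0)
    # Inversion: f(n) is recovered from g.
    assert sum(mobius(d) * g(n // d) for d in divisors(n)) == f(n)
```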

2.5 The functions $\tau(n)$ and $\sigma(n)$

For any positive integer $n$, we denote by $\tau(n)$ the number of divisors of $n$ (in some books, in particular in that of Hardy and Wright, the function is written $d(n)$). By $\sigma(n)$ we denote the sum of the divisors of $n$. Thus

$$\tau(n) = \sum_{d \mid n} 1, \qquad \sigma(n) = \sum_{d \mid n} d.$$

It is plain that both $\tau(n)$ and $\sigma(n)$ are multiplicative. Further, for any prime power $p^j$ we have $\tau(p^j) = j + 1$ and

$$\sigma(p^j) = 1 + p + \cdots + p^j = (p^{j+1} - 1)/(p - 1).$$

Thus if $p^j$ is the highest power of $p$ that divides $n$ then

$$\tau(n) = \prod_{p \mid n} (j + 1), \qquad \sigma(n) = \prod_{p \mid n} (p^{j+1} - 1)/(p - 1).$$
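The product formulae translate directly into a computation from the factorization of $n$. The following Python sketch is illustrative only; the names `factorize`, `tau` and `sigma` are assumptions, not notation from the text beyond the two function symbols. It evaluates both functions from the prime powers dividing $n$.

```python
def factorize(n: int) -> dict:
    """Return {p: j} with n equal to the product of the prime powers p^j."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def tau(n: int) -> int:
    """Number of divisors: product of (j + 1) over prime powers p^j."""
    result = 1
    for j in factorize(n).values():
        result *= j + 1
    return result

def sigma(n: int) -> int:
    """Sum of divisors: product of (p^(j+1) - 1)/(p - 1)."""
    result = 1
    for p, j in factorize(n).items():
        result *= (p ** (j + 1) - 1) // (p - 1)
    return result

print(tau(360), sigma(360))   # 360 = 2^3 * 3^2 * 5; prints: 24 1170
```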

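It is easy to give rough estimates for the sizes of $\tau(n)$ and $\sigma(n)$. Indeed we have $\tau(n) < c\,n^{\delta}$ for any $\delta > 0$, where $c$ is a number depending only on $\delta$; for the function $f(n) = \tau(n)/n^{\delta}$ is multiplicative and satisfies $f(p^j) = (j + 1)/p^{j\delta} < 1$ for all but a finite number of values of $p$ and $j$, the exceptions being bounded in terms of $\delta$. Further we have

$$\sigma(n) = n \sum_{d \mid n} 1/d \le n \sum_{d \le n} 1/d < n(1 + \log n).$$

The last estimate implies that $\phi(n) > \tfrac14 n/\log n$ for $n > 1$. In fact the function $f(n) = \sigma(n)\phi(n)/n^2$ is multiplicative and, for any prime power $p^j$, we have

$$f(p^j) = 1 - p^{-j-1} \ge 1 - 1/p^2;$$

hence, since

$$\prod_{p \mid n} (1 - 1/p^2) \ge \prod_{m=2}^{\infty} (1 - 1/m^2) = \tfrac12,$$

it follows that $\sigma(n)\phi(n) \ge \tfrac12 n^2$, and this together with $\sigma(n) < 2n \log n$ for $n > 2$ gives the estimate for $\phi$.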

2.6 Average orders

It is often of interest to determine the magnitude on average of arithmetical functions $f$, that is, to find estimates for sums of the form $\sum f(n)$ with $n \le x$, where $x$ is a large real number. We shall obtain such estimates when $f$ is $\tau$, $\sigma$ and $\phi$.

First we observe that

$$\sum_{n \le x} \tau(n) = \sum_{n \le x} \sum_{d \mid n} 1 = \sum_{d \le x} \sum_{m \le x/d} 1 = \sum_{d \le x} [x/d].$$

Now we have

$$\sum_{d \le x} 1/d = \log x + O(1),$$

and hence

$$\sum_{n \le x} \tau(n) = x \log x + O(x).$$

This implies that $(1/x) \sum \tau(n) \sim \log x$ as $x \to \infty$. The argument can be refined to give

$$\sum_{n \le x} \tau(n) = x \log x + (2\gamma - 1)x + O(\sqrt{x}),$$

where $\gamma$ is Euler's constant. Note that although one can say that the average order of $\tau(n)$ is $\log n$ (since $\sum \log n \sim x \log x$), it is not true that almost all numbers have about $\log n$ divisors; here almost all numbers are said to have a certain property if the number $\le x$ not possessing the property is $o(x)$. In fact almost all numbers have about $(\log n)^{\log 2}$ divisors, that is, for any $\varepsilon > 0$ and for almost all $n$, the function $\tau(n)/(\log n)^{\log 2}$ lies between $(\log n)^{\varepsilon}$ and $(\log n)^{-\varepsilon}$.

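To determine the average order of $\sigma(n)$ we observe that

$$\sum_{n \le x} \sigma(n) = \sum_{n \le x} \sum_{d \mid n} (n/d) = \sum_{d \le x} \sum_{m \le x/d} m.$$

The last sum is

$$\tfrac12 [x/d]([x/d] + 1) = \tfrac12 (x/d)^2 + O(x/d).$$

Now

$$\sum_{d \le x} 1/d^2 = \sum_{d=1}^{\infty} 1/d^2 + O(1/x),$$

and thus we obtain

$$\sum_{n \le x} \sigma(n) = \tfrac{1}{12} \pi^2 x^2 + O(x \log x).$$

This implies that the average order of $\sigma(n)$ is $\tfrac16 \pi^2 n$ (since $\sum n \sim \tfrac12 x^2$).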

Finally we derive an average estimate for $\phi$. We have

$$\sum_{n \le x} \phi(n) = \sum_{n \le x} \sum_{d \mid n} \mu(d)(n/d) = \sum_{d \le x} \mu(d) \sum_{m \le x/d} m.$$

The last sum is

$$\tfrac12 (x/d)^2 + O(x/d).$$

Now

$$\sum_{d \le x} \mu(d)/d^2 = \sum_{d=1}^{\infty} \mu(d)/d^2 + O(1/x),$$

and the infinite series here has sum $6/\pi^2$, as will be clear from Section 2.8. Hence we obtain

$$\sum_{n \le x} \phi(n) = (3/\pi^2) x^2 + O(x \log x).$$

This implies that the average order of $\phi(n)$ is $6n/\pi^2$. Moreover the result shows that the probability that two integers are relatively prime is $6/\pi^2$. For there are $\tfrac12 n(n + 1)$ pairs of integers $p, q$ with $1 \le p \le q \le n$, and precisely $\phi(1) + \cdots + \phi(n)$ of the corresponding fractions $p/q$ are in their lowest terms.
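The three average-order estimates of this section can be examined numerically. The Python sketch below is not from the text; the function name and the cut-off 2000 are arbitrary choices. It tabulates the three functions by sieving and prints the ratio of each summatory function to its main term; the ratios should be close to 1, the first one less so because the secondary term $(2\gamma - 1)x$ is not included.

```python
from math import log, pi

def summatory_values(x: int):
    """Sums over n <= x of the divisor-count, divisor-sum and Euler functions."""
    tau = [0] * (x + 1)
    sigma = [0] * (x + 1)
    for d in range(1, x + 1):
        for multiple in range(d, x + 1, d):   # d divides each such multiple
            tau[multiple] += 1
            sigma[multiple] += d
    phi = list(range(x + 1))
    for p in range(2, x + 1):
        if phi[p] == p:                       # untouched so far, so p is prime
            for multiple in range(p, x + 1, p):
                phi[multiple] -= phi[multiple] // p
    return sum(tau), sum(sigma), sum(phi[1:])

x = 2000
t, s, f = summatory_values(x)
print(t / (x * log(x)))              # main term x log x
print(s / (pi ** 2 * x ** 2 / 12))   # main term pi^2 x^2 / 12
print(f / (3 * x ** 2 / pi ** 2))    # main term 3 x^2 / pi^2
```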

2.7 Perfect numbers

A natural number $n$ is said to be perfect if $\sigma(n) = 2n$, that is, if $n$ is equal to the sum of its divisors other than itself. Thus, for instance, 6 and 28 are perfect numbers.

Whether there exist any odd perfect numbers is a notorious unresolved problem. By contrast, however, the even perfect numbers can be specified precisely. Indeed an even number is perfect if and only if it has the form $2^{p-1}(2^p - 1)$, where both $p$ and $2^p - 1$ are primes. It suffices to prove the necessity, for it is readily verified that numbers of this form are certainly perfect. Suppose therefore that $\sigma(n) = 2n$ and that $n = 2^k m$, where $k$ and $m$ are positive integers with $m$ odd. We have $(2^{k+1} - 1)\sigma(m) = 2^{k+1} m$ and hence $\sigma(m) = 2^{k+1} l$ and $m = (2^{k+1} - 1) l$ for some positive integer $l$. If now $l$ were greater than 1 then $m$ would have distinct divisors $l$, $m$ and 1, whence we would have $\sigma(m) \ge l + m + 1$. But $l + m = 2^{k+1} l = \sigma(m)$, and this gives a contradiction. Thus $l = 1$ and $\sigma(m) = m + 1$, which implies that $m$ is a prime. In fact $m$ is a Mersenne prime and hence $k + 1$ is a prime $p$, say (cf. Section 1.6). This shows that $n$ has the required form.
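The characterization of the even perfect numbers can be checked over a small range. The Python sketch below is an illustration, not part of the text; the bound 10000 and the helper names are arbitrary. It lists the even numbers whose divisor sum is twice the number and, separately, the numbers of the stated form built from a prime $p$ with $2^p - 1$ prime.

```python
def sigma(n: int) -> int:
    """Sum of all divisors of n, pairing each divisor d with n // d."""
    total, d = 0, 1
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:
                total += n // d
        d += 1
    return total

def is_prime(n: int) -> bool:
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

even_perfect = [n for n in range(2, 10000, 2) if sigma(n) == 2 * n]
euclid_form = [2 ** (p - 1) * (2 ** p - 1) for p in range(2, 8)
               if is_prime(p) and is_prime(2 ** p - 1)]
print(even_perfect)   # [6, 28, 496, 8128]
print(euclid_form)    # [6, 28, 496, 8128]
```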

2.8 The Riemann zeta-function

In a classic memoir of 1860 Riemann showed that questions concerning the distribution of the primes are intimately related to properties of the zeta-function

$$\zeta(s) = \sum_{n=1}^{\infty} 1/n^s,$$

where $s$ denotes a complex variable. It is clear that the series converges absolutely for $\sigma > 1$, where $s = \sigma + it$ with $\sigma, t$ real, and indeed that it converges uniformly for $\sigma > 1 + \delta$ for any $\delta > 0$. Riemann showed that $\zeta(s)$ can be continued analytically throughout the complex plane and that it is regular there except for a simple pole at $s = 1$ with residue 1. He showed moreover that it satisfies the functional equation $\Xi(s) = \Xi(1 - s)$, where

$$\Xi(s) = \pi^{-\frac12 s}\, \Gamma(\tfrac12 s)\, \zeta(s).$$

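The fundamental connection between the zeta-function and the primes is given by the Euler product

$$\zeta(s) = \prod_{p} (1 - 1/p^s)^{-1},$$

valid for $\sigma > 1$. The relation is readily verified; in fact it is clear that, for any positive integer $N$,

$$\prod_{p \le N} (1 - 1/p^s)^{-1} = \prod_{p \le N} (1 + p^{-s} + p^{-2s} + \cdots) = \sum_{m} m^{-s},$$

where $m$ runs through all the positive integers that are divisible only by primes $\le N$, and

$$\Bigl| \sum_{m} m^{-s} - \sum_{n \le N} n^{-s} \Bigr| \le \sum_{n > N} n^{-\sigma} \to 0 \quad \text{as } N \to \infty.$$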

The Euler product shows that $\zeta(s)$ has no zeros for $\sigma > 1$. In view of the functional equation it follows that $\zeta(s)$ has no zeros for $\sigma < 0$ except at the points $s = -2, -4, -6, \ldots$; these are termed the trivial zeros. All other zeros of $\zeta(s)$ must lie in the critical strip given by $0 \le \sigma \le 1$, and Riemann conjectured that they in fact lie on the line $\sigma = \tfrac12$. This is the famous Riemann hypothesis and it remains unproved to this day. There is much evidence in favour of the hypothesis; in particular Hardy proved in 1915 that infinitely many zeros of $\zeta(s)$ lie on the critical line, and extensive computations have verified that at least the first trillion, that is, $10^{12}$, zeros above the real axis do so. It has been shown that, if the hypothesis is true, then, for instance, there is a refinement of the prime-number theorem to the effect that

$$\pi(x) = \int_{2}^{x} \frac{dt}{\log t} + O(\sqrt{x}\, \log x),$$

and that the difference between consecutive primes satisfies $p_{n+1} - p_n = O(p_n^{\frac12 + \varepsilon})$. In fact it has been shown that there is a narrow zero-free region for $\zeta(s)$ to the left of the line $\sigma = 1$, and this implies that results as above are indeed valid but with weaker error terms. It is also known that the Riemann hypothesis is equivalent to the assertion that, for any $\varepsilon > 0$,

$$\sum_{n \le x} \mu(n) = O(x^{\frac12 + \varepsilon}).$$

The basic relation between the Möbius function and the Riemann zeta-function is given by

$$1/\zeta(s) = \sum_{n=1}^{\infty} \mu(n)/n^s.$$

This is clearly valid for $\sigma > 1$ since the product of the series on the right with $\sum 1/n^s$ is $\sum \nu(n)/n^s$. In fact if the Riemann hypothesis holds then the equation remains true for $\sigma > \tfrac12$. There is a similar equation for the Euler function, valid for $\sigma > 2$, namely

$$\zeta(s - 1)/\zeta(s) = \sum_{n=1}^{\infty} \phi(n)/n^s.$$

This is readily verified from the result at the end of Section 2.3. Likewise there are equations for $\tau(n)$ and $\sigma(n)$, valid respectively for $\sigma > 1$ and $\sigma > 2$, namely

$$(\zeta(s))^2 = \sum_{n=1}^{\infty} \tau(n)/n^s, \qquad \zeta(s)\zeta(s - 1) = \sum_{n=1}^{\infty} \sigma(n)/n^s.$$

2.9 Further reading

The elementary arithmetical functions are discussed in every introductory text on number theory; again Hardy and Wright's An Introduction to the Theory of Numbers (Oxford University Press, 2008) is a good reference. Other books to be recommended are those of T. M. Apostol (Springer, 1976) and K. Chandrasekharan (Springer, 1968), both with the title Introduction to Analytic Number Theory; see also Chandrasekharan's Arithmetical Functions (Springer, 1970).

As regards the last section, the classic text on the subject is that of E. C. Titchmarsh, The Theory of the Riemann Zeta-Function (Oxford University Press, 1986). There are substantial books covering more recent ground by A. Ivić (Wiley, 1985) and by A. A. Karatsuba and S. M. Voronin (de Gruyter, 1992), both with the title The Riemann Zeta-Function. The volumes of similar title by H. M. Edwards (Academic Press, 1974) and S. J. Patterson (Cambridge University Press, 1988) provide accessible introductions to the topic.

2.10 Exercises

(i) Evaluate $\sum_{d \mid n} \mu(d)\sigma(d)$ in terms of the distinct prime factors of $n$.

(ii) Let $\Lambda(n) = \log p$ if $n$ is a power of a prime $p$ and let $\Lambda(n) = 0$ otherwise ($\Lambda$ is called von Mangoldt's function). Evaluate $\sum_{d \mid n} \Lambda(d)$. Express $\sum \Lambda(n)/n^s$ in terms of $\zeta(s)$.

(iii) Let $a$ run through all the integers with $1 \le a \le n$ and $(a, n) = 1$. Show that $f(n) = (1/n) \sum a$ satisfies $\sum_{d \mid n} f(d) = \tfrac12 (n + 1)$. Hence prove that $f(n) = \tfrac12 \phi(n)$ for $n > 1$.

(iv) Let $a$ run through the integers as in Exercise (iii). Prove that
$$(1/n^3) \sum a^3 = \tfrac14 \phi(n) \bigl( 1 + (-1)^k p_1 \cdots p_k / n^2 \bigr),$$
where $p_1, \ldots, p_k$ are the distinct prime factors of $n$ ($> 1$).

(v) Show that the product of all the integers $a$ in Exercise (iii) is given by
$$n^{\phi(n)} \prod_{d \mid n} (d!/d^d)^{\mu(n/d)}.$$

(vi) Show that $\sum_{n \le x} \mu(n)[x/n] = 1$. Hence prove that $\bigl| \sum_{n \le x} \mu(n)/n \bigr| \le 1$.

(vii) Let $m, n$ be positive integers and let $d$ run through all divisors of $(m, n)$. Prove that $\sum d\,\mu(n/d) = \mu(n/(m,n))\, \phi(n)/\phi(n/(m,n))$. (The sum here is called Ramanujan's sum.)

(viii) Prove that if $n$ has $k$ distinct prime factors then $\sum_{d \mid n} |\mu(d)| = 2^k$.

(ix) Prove that
$$\sum_{d \mid n} (\mu(d))^2/\phi(d) = n/\phi(n), \qquad \sum_{d \mid 2n} \mu(d)\phi(d) = 0.$$

(x) Find all positive integers $n$ such that (a) $\phi(n) \mid n$, (b) $\phi(n) = \tfrac12 n$, (c) $\phi(n) = \phi(2n)$, (d) $\phi(n) = 12$.

(xi) Prove that $\sum_{n=1}^{\infty} \phi(n) x^n/(1 - x^n) = x/(1 - x)^2$. (Series of this kind are called Lambert series.)

(xii) Prove that $\sum_{n \le x} \phi(n)/n = (6/\pi^2) x + O(\log x)$.

3 Congruences

3.1 Definitions

Suppose that $a, b$ are integers and that $n$ is a natural number. By $a \equiv b \pmod n$ one means $n$ divides $b - a$; and one says that $a$ is congruent to $b$ modulo $n$. If $0 \le b < n$ then one refers to $b$ as the residue of $a \pmod n$. It is readily verified that the congruence relation is an equivalence relation; the equivalence classes are called residue classes or congruence classes. By a complete set of residues (mod $n$) one means a set of $n$ integers, one from each residue class (mod $n$).

It is clear that if $a \equiv a' \pmod n$ and $b \equiv b' \pmod n$ then $a + b \equiv a' + b'$ and $a - b \equiv a' - b' \pmod n$. Further, we have $ab \equiv a'b' \pmod n$, since $n$ divides $(a - a')b + a'(b - b')$. Furthermore, if $f(x)$ is any polynomial with integer coefficients, then $f(a) \equiv f(a') \pmod n$.

Note also that if $ka \equiv ka' \pmod n$ for some natural number $k$ with $(k, n) = 1$ then $a \equiv a' \pmod n$; thus if $a_1, \ldots, a_n$ is a complete set of residues (mod $n$) then so is $ka_1, \ldots, ka_n$. More generally, if $k$ is any natural number such that $ka \equiv ka' \pmod n$ then $a \equiv a' \pmod{n/(k,n)}$, since obviously $k/(k,n)$ and $n/(k,n)$ are relatively prime.

3.2 Chinese remainder theorem

Let $a, n$ be natural numbers and let $b$ be any integer. We prove first that the linear congruence $ax \equiv b \pmod n$ is soluble for some integer $x$ if and only if $(a, n)$ divides $b$. The condition is certainly necessary, for $(a, n)$ divides both $a$ and $n$. To prove the sufficiency, suppose that $d = (a, n)$ divides $b$. Put $a' = a/d$, $b' = b/d$ and $n' = n/d$. Then it suffices to solve $a'x \equiv b' \pmod{n'}$. But this has precisely one solution (mod $n'$), since $(a', n') = 1$ and so $a'x$ runs through a complete set of residues (mod $n'$) as $x$ runs through such a set. It is clear that if $x'$ is any solution of $a'x' \equiv b' \pmod{n'}$ then the complete set of solutions (mod $n$) of $ax \equiv b \pmod n$ is given by $x = x' + mn'$, where $m = 1, 2, \ldots, d$. Hence, when $d$ divides $b$, the congruence $ax \equiv b \pmod n$ has precisely $d$ solutions (mod $n$).

It follows from the last result that if $p$ is a prime and if $a$ is not divisible by $p$ then the congruence $ax \equiv b \pmod p$ is always soluble; in fact there is a unique solution (mod $p$). This implies that the residues $0, 1, \ldots, p - 1$ form a field under addition and multiplication (mod $p$); for indeed every non-zero element has a unique inverse in the multiplicative group. We shall denote the field of residues mod $p$ by $\mathbb{F}_p$. Plainly the field has characteristic $p$. Since any other finite field with characteristic $p$ is a vector space over $\mathbb{F}_p$, it must have $q = p^e$ elements for some $e$; an essentially unique field with $q$ elements actually exists but we shall not be concerned with the theory relating to it here.

We turn now to simultaneous linear congruences and prove the Chinese remainder theorem; the result was apparently known to the Chinese at least 1500 years ago. Let $n_1, \ldots, n_k$ be natural numbers and suppose that they are coprime in pairs, that is, $(n_i, n_j) = 1$ for $i \ne j$. The theorem asserts that, for any integers $c_1, \ldots, c_k$, the congruences $x \equiv c_j \pmod{n_j}$, with $1 \le j \le k$, are soluble simultaneously for some integer $x$; in fact there is a unique solution modulo $n = n_1 \cdots n_k$. For the proof, let $m_j = n/n_j$ ($1 \le j \le k$). Then $(m_j, n_j) = 1$ and thus there is an integer $x_j$ such that $m_j x_j \equiv c_j \pmod{n_j}$. Now it is readily seen that $x = m_1 x_1 + \cdots + m_k x_k$ satisfies $x \equiv c_j \pmod{n_j}$, as required. The uniqueness is clear, for if $x, y$ are two solutions then $x \equiv y \pmod{n_j}$ for $1 \le j \le k$, whence, since the $n_j$ are coprime in pairs, we have $x \equiv y \pmod n$. Plainly the Chinese remainder theorem together with the first result of this section implies that if $n_1, \ldots, n_k$ are coprime in pairs then the congruences $a_j x \equiv b_j \pmod{n_j}$, with $1 \le j \le k$, are soluble simultaneously if and only if $(a_j, n_j)$ divides $b_j$ for all $j$.

As an example, consider the congruences $x \equiv 2 \pmod 5$, $x \equiv 3 \pmod 7$, $x \equiv 4 \pmod{11}$. In this case a solution is given by $x = 77x_1 + 55x_2 + 35x_3$, where $x_1, x_2, x_3$ satisfy $2x_1 \equiv 2 \pmod 5$, $6x_2 \equiv 3 \pmod 7$, $2x_3 \equiv 4 \pmod{11}$. Thus we can take $x_1 = 1$, $x_2 = 4$, $x_3 = 2$, and these give $x = 367$. The complete solution is $x \equiv -18 \pmod{385}$. As another example, consider the congruences $x \equiv 1 \pmod 3$, $x \equiv 2 \pmod{10}$, $x \equiv 3 \pmod{11}$. A solution is given by $x = 110x_1 + 33x_2 + 30x_3$, where $x_1, x_2, x_3$ satisfy $2x_1 \equiv 1 \pmod 3$, $3x_2 \equiv 2 \pmod{10}$, $8x_3 \equiv 3 \pmod{11}$. Again solving by inspection, we get $x_1 = 2$, $x_2 = 4$, $x_3 = 10$, which gives $x = 652$. The complete solution is $x \equiv -8 \pmod{330}$.

The notation $\mathbb{F}_p$ is currently the most common of several standard notations; they include $\mathbb{Z}/p\mathbb{Z}$, $\mathbb{Z}/p$ and GF($p$) (the Galois field with $p$ elements). The notation $\mathbb{Z}_p$, which was used in the Concise Introduction, also commonly occurs but it is open to objection since it clashes with notation customarily adopted in the context of $p$-adic numbers.


Note that, when $(a, n)$ divides $b$, an explicit solution to the congruence $ax \equiv b \pmod n$ can always be obtained from Euclid's algorithm although, as in the examples above, a simple observation often suffices.
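The constructions of this section can be carried out mechanically. The following Python sketch is an illustration under stated assumptions, not the author's procedure verbatim: the function names are hypothetical, Euclid's algorithm is used in its extended recursive form, and the moduli passed to `crt` are assumed to be coprime in pairs. It solves a single linear congruence and then assembles a simultaneous solution exactly as in the proof of the Chinese remainder theorem, with $m_j = n/n_j$ and $m_j x_j \equiv c_j \pmod{n_j}$.

```python
def extended_gcd(a: int, b: int):
    """Return (g, u, v) with g = (a, b) = a*u + b*v, by Euclid's algorithm."""
    if b == 0:
        return a, 1, 0
    g, u, v = extended_gcd(b, a % b)
    return g, v, u - (a // b) * v

def solve_linear(a: int, b: int, n: int) -> list:
    """All solutions x (mod n) of a*x = b (mod n); empty if (a, n) does not divide b."""
    d, u, _ = extended_gcd(a, n)
    if b % d:
        return []
    n1 = n // d
    x0 = (u * (b // d)) % n1            # the unique solution modulo n/d
    return [x0 + m * n1 for m in range(d)]

def crt(residues: list, moduli: list):
    """Solve x = c_j (mod n_j) for pairwise coprime n_j; returns (x mod n, n)."""
    n = 1
    for nj in moduli:
        n *= nj
    x = 0
    for cj, nj in zip(residues, moduli):
        mj = n // nj
        xj = solve_linear(mj, cj, nj)[0]   # m_j * x_j = c_j (mod n_j)
        x += mj * xj
    return x % n, n

print(solve_linear(6, 4, 10))           # [4, 9]
print(crt([2, 3, 4], [5, 7, 11]))       # (367, 385)
print(crt([1, 2, 3], [3, 10, 11]))      # (322, 330)
```

The two simultaneous systems are the worked examples of this section: 367 is congruent to $-18$ modulo 385, and 322 is congruent to $-8$ modulo 330.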

3.3 The theorems of Fermat and Euler

First we introduce the concept of a reduced set of residues (mod $n$). By this we mean a set of $\phi(n)$ numbers, one from each of the $\phi(n)$ residue classes that consist of numbers relatively prime to $n$. In particular, the numbers $a$ with $1 \le a \le n$ and $(a, n) = 1$ form a reduced set of residues (mod $n$).

We proceed now to establish the multiplicative property of $\phi$, referred to in Section 2.3, using the above concept. Accordingly let $n, n'$ be natural numbers with $(n, n') = 1$. Further, let $a$ and $a'$ run through reduced sets of residues (mod $n$) and (mod $n'$) respectively. Then it suffices to prove that $an' + a'n$ runs through a reduced set of residues (mod $nn'$); for this implies that $\phi(n)\phi(n') = \phi(nn')$, as required. Now clearly, since $(a, n) = 1$ and $(a', n') = 1$, the number $an' + a'n$ is relatively prime to $n$ and to $n'$ and so to $nn'$. Furthermore any two distinct numbers of the form are incongruent (mod $nn'$). Thus we have only to prove that if $(b, nn') = 1$ then $b \equiv an' + a'n \pmod{nn'}$ for some $a, a'$ as above. But since $(n, n') = 1$ there exist integers $m, m'$ satisfying $mn' + m'n = 1$. Plainly $(bm, n) = 1$ and so $a \equiv bm \pmod n$ for some $a$; similarly $a' \equiv bm' \pmod{n'}$ for some $a'$, and now it is easily seen that $a, a'$ have the required property.

Fermat's theorem states that if $a$ is any natural number and if $p$ is any prime then $a^p \equiv a \pmod p$. In particular, if $(a, p) = 1$ then $a^{p-1} \equiv 1 \pmod p$. The theorem was announced by Fermat in 1640 but without proof. Euler gave the first demonstration about a century later and, in 1760, he established a more general result to the effect that, if $a, n$ are natural numbers with $(a, n) = 1$ then $a^{\phi(n)} \equiv 1 \pmod n$. For the proof of Euler's theorem, we observe simply that as $x$ runs through a reduced set of residues (mod $n$) so also $ax$ runs through such a set. Hence $\prod (ax) \equiv \prod (x) \pmod n$, where the products are taken over all $x$ in the reduced set, and the theorem follows on cancelling $\prod (x)$ from both sides.

3.4 Wilson's theorem

This asserts that $(p - 1)! \equiv -1 \pmod p$ for any prime $p$. Though the result is attributed to Wilson, the statement was apparently first published by Waring in his Meditationes Algebraicae of 1770 and a proof was furnished a little later by Lagrange.

For the demonstration, it suffices to assume that $p$ is odd. Now to every integer $a$ with $0 < a < p$ there is a unique integer $a'$ with $0 < a' < p$ such that $aa' \equiv 1 \pmod p$. Further, if $a = a'$ then $a^2 \equiv 1 \pmod p$, whence $a = 1$ or $a = p - 1$. Thus the set $2, 3, \ldots, p - 2$ can be divided into $\tfrac12 (p - 3)$ pairs $a, a'$ with $aa' \equiv 1 \pmod p$. Hence we have $2 \cdot 3 \cdots (p - 2) \equiv 1 \pmod p$, and so $(p - 1)! \equiv p - 1 \equiv -1 \pmod p$, as required.

Wilson's theorem admits a converse and so yields a criterion for primes. Indeed an integer $n > 1$ is a prime if and only if $(n - 1)! \equiv -1 \pmod n$. To verify the sufficiency note that any divisor of $n$, other than itself, must divide $(n - 1)!$.

As an immediate deduction from Wilson's theorem we see that if $p$ is a prime with $p \equiv 1 \pmod 4$ then the congruence $x^2 \equiv -1 \pmod p$ has solutions $x = \pm(r!)$, where $r = \tfrac12 (p - 1)$. This follows on replacing $a + r$ in $(p - 1)!$ by the congruent integer $a - r - 1$ for each $a$ with $1 \le a \le r$. Note that the congruence has no solutions when $p \equiv 3 \pmod 4$, for otherwise we would have $x^{p-1} = x^{2r} \equiv (-1)^r = -1 \pmod p$, contrary to Fermat's theorem.
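Both the primality criterion and the deduction about square roots of $-1$ are easy to test for small moduli. The Python sketch below is illustrative and not from the text; the function name and the range 2 to 59 are arbitrary. As a practical test the criterion is of course far too slow, since it requires the full factorial.

```python
from math import factorial

def is_prime_by_wilson(n: int) -> bool:
    """Wilson's criterion: n > 1 is prime iff (n - 1)! = -1 (mod n)."""
    return n > 1 and factorial(n - 1) % n == n - 1

primes = [n for n in range(2, 60) if is_prime_by_wilson(n)]
print(primes)

for p in primes:
    if p % 4 == 1:
        r = (p - 1) // 2
        x = factorial(r) % p
        assert (x * x + 1) % p == 0          # x = r! solves x^2 = -1 (mod p)
    elif p % 4 == 3:
        # No solution of x^2 = -1 (mod p) exists in this case.
        assert all((x * x + 1) % p for x in range(p))
```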

3.5 Lagrange's theorem

Let $f(x)$ be a polynomial with integer coefficients and with degree $n$. Suppose that $p$ is a prime and that the leading coefficient of $f$, that is, the coefficient of $x^n$, is not divisible by $p$. Lagrange's theorem states that the congruence $f(x) \equiv 0 \pmod p$ has at most $n$ solutions (mod $p$).

The theorem certainly holds for $n = 1$ by the first result in Section 3.2. We assume that it is valid for polynomials with degree $n - 1$ and proceed inductively to prove the theorem for polynomials with degree $n$. Now, for any integer $a$ we have $f(x) - f(a) = (x - a)g(x)$, where $g$ is a polynomial with degree $n - 1$, with integer coefficients and with the same leading coefficient as $f$. Thus if $f(x) \equiv 0 \pmod p$ has a solution $x = a$ then all solutions of the congruence satisfy $(x - a)g(x) \equiv 0 \pmod p$. But, by the inductive hypothesis, the congruence $g(x) \equiv 0 \pmod p$ has at most $n - 1$ solutions (mod $p$). The theorem follows. It is customary to write $f(x) \equiv g(x) \pmod p$ to signify that the coefficients of like powers of $x$ in the polynomials $f, g$ are congruent (mod $p$); and it is clear that if the congruence $f(x) \equiv 0 \pmod p$ has its full complement $a_1, \ldots, a_n$ of solutions (mod $p$) then

$$f(x) \equiv c(x - a_1) \cdots (x - a_n) \pmod p,$$

where $c$ is the leading coefficient of $f$. In particular, by Fermat's theorem, we have

$$x^{p-1} - 1 \equiv (x - 1) \cdots (x - p + 1) \pmod p,$$

and, on comparing constant coefficients, we obtain another proof of Wilson's theorem.

Plainly, instead of speaking of congruences, we can express the above succinctly in terms of polynomials defined over $\mathbb{F}_p$. Thus Lagrange's theorem asserts that the number of zeros in $\mathbb{F}_p$ of a polynomial defined over this field cannot exceed its degree. The proof proceeds in this instance by supposing that $f(x)$ is a polynomial over $\mathbb{F}_p$ with degree $n$ and with at least one zero $a$ in $\mathbb{F}_p$; then $f(x) = f(x) - f(a) = (x - a)g(x)$, where $g(x)$ is a polynomial over $\mathbb{F}_p$ with degree $n - 1$ and as before, by induction on $n$, the result follows. As a corollary we deduce that the polynomial $x^d - 1$ has precisely $d$ zeros in $\mathbb{F}_p$ for each divisor $d$ of $p - 1$. For we have $x^{p-1} - 1 = (x^d - 1)g(x)$, where $g(x)$ has degree $p - 1 - d$. But, by Fermat's theorem, $x^{p-1} - 1$ has $p - 1$ zeros in $\mathbb{F}_p$ and, by Lagrange's theorem, $g(x)$ has at most $p - 1 - d$ zeros in $\mathbb{F}_p$. Thus $x^d - 1$ has at least $(p - 1) - (p - 1 - d) = d$ zeros in $\mathbb{F}_p$, whence the assertion. In particular, on taking $d = 4$, we deduce that $x^2 + 1$ has precisely two zeros in $\mathbb{F}_p$ when $p \equiv 1 \pmod 4$, a result related to both Section 3.4 and Section 4.2.

Lagrange's theorem does not remain true for composite moduli. In fact it is readily verified from the Chinese remainder theorem that if $m_1, \ldots, m_k$ are natural numbers coprime in pairs, if $f(x)$ is a polynomial with integer coefficients, and if the congruence $f(x) \equiv 0 \pmod{m_j}$ has $s_j$ solutions (mod $m_j$), then the congruence $f(x) \equiv 0 \pmod m$, where $m = m_1 \cdots m_k$, has $s = s_1 \cdots s_k$ solutions (mod $m$). Lagrange's theorem is still false for prime power moduli; for example $x^2 \equiv 1 \pmod 8$ has four solutions. But if the prime $p$ does not divide the discriminant of $f$ then the theorem holds for all powers $p^j$; indeed the number of solutions of $f(x) \equiv 0 \pmod{p^j}$ is, in this case, the same as the number of solutions of $f(x) \equiv 0 \pmod p$. This can be seen at once when, for instance, $f(x) = x^2 - a$; for if $p$ is any odd prime that does not divide $a$, then from a solution $y$ of $f(y) \equiv 0 \pmod{p^j}$ we obtain a solution $x = y + p^j z$ of $f(x) \equiv 0 \pmod{p^{j+1}}$ by solving the congruence $2yz + f(y)/p^j \equiv 0 \pmod p$ for $z$, as is possible since $(2y, p) = 1$.
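The lifting step for $f(x) = x^2 - a$ can be written out as a short routine. The following Python sketch is illustrative only: the function name is hypothetical, $p$ is assumed to be an odd prime not dividing $a$, and the three-argument `pow` with exponent $-1$ (modular inverse) assumes Python 3.8 or later. The example lifts the solution $3$ of $x^2 \equiv 2 \pmod 7$ to higher powers of 7.

```python
def lift_square_root(y: int, a: int, p: int, j: int) -> int:
    """Given y^2 = a (mod p^j), return x = y + p^j * z with x^2 = a (mod p^(j+1)).

    Assumes p is an odd prime that does not divide a, so that (2y, p) = 1.
    """
    pj = p ** j
    t = (y * y - a) // pj              # f(y) / p^j, an exact integer
    inverse_2y = pow(2 * y, -1, p)     # inverse of 2y modulo p
    z = (-t * inverse_2y) % p          # solves 2yz + f(y)/p^j = 0 (mod p)
    return y + pj * z

a, p = 2, 7
x = 3                                  # 3^2 = 9 = 2 (mod 7)
for j in range(1, 5):
    x = lift_square_root(x, a, p, j)
    assert (x * x - a) % p ** (j + 1) == 0
print(x % p ** 5)

# The number of solutions modulo 7^3 equals the number modulo 7, namely two.
print(sum(1 for u in range(p ** 3) if (u * u - a) % p ** 3 == 0))   # 2
```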

3.6 Primitive roots

Let $a, n$ be natural numbers with $(a, n) = 1$. The least natural number $d$ such that $a^d \equiv 1 \pmod n$ is called the order of $a$ (mod $n$), and $a$ is said to belong to $d$ (mod $n$). By Euler's theorem, the order $d$ exists and it divides $\phi(n)$. In fact $d$ divides every integer $k$ such that $a^k \equiv 1 \pmod n$, for, by the division algorithm, $k = dq + r$ with $0 \le r < d$, whence $a^r \equiv 1 \pmod n$ and so $r = 0$.

By a primitive root (mod $n$) we mean a number that belongs to $\phi(n)$ (mod $n$). Thus, for a prime $p$, a primitive root (mod $p$) is an integer $g$, not divisible by $p$, such that $p - 1$ is the smallest exponent with $g^{p-1} \equiv 1 \pmod p$. In other words, a primitive root (mod $p$) can be defined as a generator $g$ of the multiplicative group of the field $\mathbb{F}_p$. It is relatively easy to obtain examples of primitive roots (mod $p$). Thus, if we take $p = 17$, then, by testing sequentially, we find that the smallest primitive root is $g = 3$; in fact the respective powers of 3 (mod 17) are 3, 9, 10, 13, 5, 15, 11, 16, 14, 8, 7, 4, 12, 2, 6, 1.
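The sequential test described here is straightforward to automate. The Python sketch below is an illustration with hypothetical function names; it computes orders by repeated multiplication, which is adequate only for small moduli, and assumes $n \ge 2$.

```python
from math import gcd

def order(a: int, n: int) -> int:
    """Least d >= 1 with a^d = 1 (mod n); requires (a, n) = 1 and n >= 2."""
    if gcd(a, n) != 1:
        raise ValueError("a must be relatively prime to n")
    d, power = 1, a % n
    while power != 1:
        power = power * a % n
        d += 1
    return d

def phi(n: int) -> int:
    """Euler's function, by direct counting."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def smallest_primitive_root(n: int):
    """Smallest g belonging to phi(n) (mod n), or None if there is none."""
    target = phi(n)
    for g in range(1, n):
        if gcd(g, n) == 1 and order(g, n) == target:
            return g
    return None

print(smallest_primitive_root(17))                 # 3
print([pow(3, k, 17) for k in range(1, 17)])
print(smallest_primitive_root(8))                  # None
```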

We proceed to prove that for every odd prime $p$ there exists a primitive root (mod $p$) and indeed that there are precisely $\phi(p - 1)$ primitive roots (mod $p$). Now each of the numbers $1, 2, \ldots, p - 1$ belongs (mod $p$) to some divisor $d$ of $p - 1$; let $\psi(d)$ be the number that belongs to $d$ (mod $p$) so that

$$\sum_{d \mid (p-1)} \psi(d) = p - 1.$$

It will suffice to prove that if $\psi(d) \ne 0$ then $\psi(d) = \phi(d)$. For, by Section 2.3, we have

$$\sum_{d \mid (p-1)} \phi(d) = p - 1,$$

whence $\psi(d) \ne 0$ for all $d$ and so $\psi(p - 1) = \phi(p - 1)$ as required.

To verify the assertion concerning $\psi$, suppose that $\psi(d) \ne 0$ and let $a$ be a number that belongs to $d$ (mod $p$). Then $a, a^2, \ldots, a^d$ are mutually incongruent solutions of $x^d \equiv 1 \pmod p$ and thus, by Lagrange's theorem, they represent all the solutions (in fact we showed in Section 3.5 that the congruence has precisely $d$ solutions (mod $p$)). It is now easily seen that the numbers $a^m$ with $1 \le m \le d$ and $(m, d) = 1$ represent all the numbers that belong to $d$ (mod $p$); indeed each has order $d$, for if $a^{md'} \equiv 1$ then $d \mid d'$, and if $b$ is any number that belongs to $d$ (mod $p$) then $b \equiv a^m$ for some $m$ with $1 \le m \le d$, and we have $(m, d) = 1$ since $b^{d/(m,d)} \equiv (a^d)^{m/(m,d)} \equiv 1 \pmod p$. This gives $\psi(d) = \phi(d)$, as asserted.

As noted before, arguments of this kind can be expressed alternatively by referring to the field $\mathbb{F}_p$. In this context, by a primitive root (mod $p$) we mean a generator $g$ of the multiplicative group of $\mathbb{F}_p$ and by the order of a non-zero element $a$ of $\mathbb{F}_p$ we mean the least positive integer $d$ such that $a^d = 1$. Let $\psi(d)$ be the number of elements in $\mathbb{F}_p$ with order $d$. Supposing that $\psi(d) \ne 0$ and $a$ is any element of $\mathbb{F}_p$ with order $d$, we show that the $\phi(d)$ elements $a^m$ with $1 \le m \le d$ and $(m, d) = 1$ are precisely those with order $d$; this gives $\psi(d) = \phi(d)$ as required. Now certainly the $a^m$ with $1 \le m \le d$ are distinct zeros of the polynomial $x^d - 1$ and thus, by Lagrange's theorem, they are all the zeros. Hence any element with order $d$ is given by $a^m$ for some $m$ and, since $(a^m)^{d/(m,d)} = (a^d)^{m/(m,d)} = 1$, we must have $(m, d) = 1$. Further, each of the $a^m$ with $(m, d) = 1$ has order $d$ since $a^{md} = 1$ and $md$ is the smallest multiple of $m$ divisible by $d$. The result follows.

Let $g$ be a primitive root (mod $p$). We prove now that there exists an integer $x$ such that $g' = g + px$ is a primitive root (mod $p^j$) for all prime powers $p^j$. We have $g^{p-1} = 1 + py$ for some integer $y$ and so, by the binomial theorem, $g'^{\,p-1} = 1 + pz$, where

$$z \equiv y + (p - 1) g^{p-2} x \pmod p.$$

The coefficient of $x$ is not divisible by $p$ and so we can choose $x$ such that $(z, p) = 1$. Then $g'$ has the required property. For suppose that $g'$ belongs to $d$ (mod $p^j$). Then $d$ divides $\phi(p^j) = p^{j-1}(p - 1)$. But $g'$ is a primitive root (mod $p$) and thus $p - 1$ divides $d$. Hence $d = p^k (p - 1)$ for some $k < j$. Further, since $p$ is odd, we have

$$(1 + pz)^{p^k} = 1 + p^{k+1} z_k,$$

where $(z_k, p) = 1$. Now since $g'^{\,d} \equiv 1 \pmod{p^j}$ it follows that $j = k + 1$ and this gives $d = \phi(p^j)$, as required.

Finally we deduce that, for any natural number $n$, there exists a primitive root (mod $n$) if and only if $n$ has the form 2, 4, $p^j$ or $2p^j$, where $p$ is an odd prime. Clearly 1 and 3 are primitive roots (mod 2) and (mod 4). Further, if $g$ is a primitive root (mod $p^j$) then the odd element of the pair $g$, $g + p^j$ is a primitive root (mod $2p^j$), since $\phi(2p^j) = \phi(p^j)$. Hence it remains only to prove the necessity of the assertion. Now if $n = n_1 n_2$, where $(n_1, n_2) = 1$ and $n_1 > 2$, $n_2 > 2$, then there is no primitive root (mod $n$). For $\phi(n_1)$ and $\phi(n_2)$ are even and thus for any natural number $a$ we have

$$a^{\frac12 \phi(n)} = (a^{\phi(n_1)})^{\frac12 \phi(n_2)} \equiv 1 \pmod{n_1};$$

similarly $a^{\frac12 \phi(n)} \equiv 1 \pmod{n_2}$, whence $a^{\frac12 \phi(n)} \equiv 1 \pmod n$. Further, there are no primitive roots (mod $2^j$) for $j > 2$, since, by induction, we have $a^{2^{j-2}} \equiv 1 \pmod{2^j}$ for all odd numbers $a$. This proves the theorem.


3.7 Indices

Let $g$ be a primitive root (mod $n$). The numbers $g^l$ with $l = 0, 1, \ldots, \phi(n) - 1$ form a reduced set of residues (mod $n$). Hence, for every integer $a$ with $(a, n) = 1$ there is a unique $l$ such that $g^l \equiv a \pmod n$. The exponent $l$ is called the index of $a$ with respect to $g$ and it is denoted by ind $a$. Plainly we have

$$\text{ind } a + \text{ind } b \equiv \text{ind}(ab) \pmod{\phi(n)},$$

and ind $1 = 0$, ind $g = 1$. Further, for every natural number $m$, we have ind$(a^m) \equiv m$ ind $a \pmod{\phi(n)}$. These properties of the index are clearly analogous to the properties of logarithms. We also have ind$(-1) = \tfrac12 \phi(n)$ for $n > 2$ since $g^{2\,\text{ind}(-1)} \equiv 1 \pmod n$ and $2$ ind$(-1) < 2\phi(n)$.

As an example of the use of indices, consider the congruence $x^n \equiv a \pmod p$, where $p$ is a prime. We have $n$ ind $x \equiv$ ind $a \pmod{p - 1}$ and thus if $(n, p - 1) = 1$ then there is just one solution. Consider, in particular, $x^5 \equiv 2 \pmod 7$. It is readily verified that 3 is a primitive root (mod 7) and we have $3^2 \equiv 2 \pmod 7$. Thus $5$ ind $x \equiv 2 \pmod 6$, which gives ind $x = 4$ and $x \equiv 3^4 \equiv 4 \pmod 7$.
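The worked example can be reproduced by tabulating indices. The Python sketch below is illustrative; the function name `index_table` is an assumption, and the linear congruence for the index is solved by simple search, which suffices for so small a modulus.

```python
def index_table(g: int, p: int) -> dict:
    """Map each residue a = g^l (mod p) to its index l, for 0 <= l < p - 1."""
    table, power = {}, 1
    for l in range(p - 1):
        table[power] = l
        power = power * g % p
    return table

p, g = 7, 3                       # 3 is a primitive root (mod 7)
ind = index_table(g, p)
target = ind[2]                   # ind 2 = 2, since 3^2 = 2 (mod 7)

# Solve 5 * ind(x) = ind(2) (mod 6); the solution is unique as (5, 6) = 1.
l = next(l for l in range(p - 1) if (5 * l - target) % (p - 1) == 0)
x = pow(g, l, p)
print(l, x)                       # prints: 4 4
assert pow(x, 5, p) == 2
```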

Note that although there is no primitive root (mod $2^j$) for $j > 2$, the number 5 belongs to $2^{j-2}$ (mod $2^j$) and every odd integer $a$ is congruent (mod $2^j$) to just one integer of the form $(-1)^l 5^m$, where $l = 0, 1$ and $m = 0, 1, \ldots, 2^{j-2} - 1$. The pair $l$, $m$ has similar properties to the index defined above.
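The representation of odd residues (mod $2^j$) can be tabulated directly. The following sketch, with the hypothetical helper name `sign_power_pairs`, lists for each odd residue the pairs $(l, m)$ with $a \equiv (-1)^l 5^m$; uniqueness corresponds to every list having exactly one entry.

```python
def sign_power_pairs(j):
    """Map each residue a (mod 2^j) of the form (-1)^l 5^m to its pairs (l, m)."""
    modulus = 2 ** j
    pairs = {}
    for l in (0, 1):
        for m in range(2 ** (j - 2)):
            a = ((-1) ** l * pow(5, m, modulus)) % modulus
            pairs.setdefault(a, []).append((l, m))
    return pairs

table = sign_power_pairs(4)          # residues modulo 16
for a in sorted(table):
    print(a, table[a])
```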

3.8 Further reading

A good account of the elementary theory of congruences is given by T. Nagell, Introduction to Number Theory (Wiley, 1951); this contains, in particular, a table of primitive roots. There is another and in fact more extensive table in I. M. Vinogradov's An Introduction to the Theory of Numbers (Pergamon Press, 1961). Again Hardy and Wright's book of the same title (Oxford University Press, 2008) covers the subject well.

3.9 Exercises

(i) Find an integer $x$ such that $2x \equiv 1$ (mod 3), $3x \equiv 1$ (mod 5), $5x \equiv 1$ (mod 7).

(ii) Find an integer $x$ such that $3x \equiv 1$ (mod 5), $5x \equiv 1$ (mod 17), $7x \equiv 1$ (mod 23).

(iii) Find integers $a, b, c, d, e$ such that the congruences $x \equiv a$ (mod 2), $x \equiv b$ (mod 3), $x \equiv c$ (mod 4), $x \equiv d$ (mod 6), $x \equiv e$ (mod 12) overlap, that is, such that at least one is soluble for every $x$.

(iv) Show that $a^{kp-k+1} \equiv a$ (mod $p$) for all primes $p$, integers $a$ and positive integers $k$. Deduce that 798 divides $a^{19} - a$ for all integers $a$.

(v) Suppose that $a_1, \ldots, a_p$ and $b_1, \ldots, b_p$ are each complete sets of residues (mod $p$) for a prime $p$. Is it possible that $a_1 b_1, \ldots, a_p b_p$ is also a complete set of residues (mod $p$)?

(vi) Show that, for an odd prime $p$, the congruence $x^2 \equiv (-1)^{\frac12(p+1)}$ (mod $p$) has the solution $x = (\tfrac12(p-1))!$.

(vii) Show that, for composite $n$, the congruence $(n-1)! \equiv 0$ (mod $n$) holds with one exception. Show further that $(n-1)! + 1$ is not a power of $n$.

(viii) Prove that, for any positive integers $a$, $n$ with $(a, n) = 1$, $\sum \{ax/n\} = \tfrac12\phi(n)$, where the summation is over all $x$ in a reduced set of residues (mod $n$).

(ix) The integers $a$ and $n > 1$ satisfy $a^{n-1} \equiv 1$ (mod $n$) but $a^m \not\equiv 1$ (mod $n$) for each divisor $m$ of $n - 1$, other than itself. Prove that $n$ is a prime.

(x) Show that the congruence $x^{p-1} - 1 \equiv 0$ (mod $p^j$) has just $p - 1$ solutions (mod $p^j$) for every prime power $p^j$.

(xii) Prove that, for every natural number $n$, either there is no primitive root (mod $n$) or there are $\phi(\phi(n))$ primitive roots (mod $n$).

(xiii) Prove that, for any prime $p$, the sum of all the distinct primitive roots (mod $p$) is congruent to $\mu(p-1)$ (mod $p$).

(xiv) Prove that, for a prime $p > 3$, the product of all the distinct primitive roots (mod $p$) is congruent to 1 (mod $p$).

(xv) Prove that if $p$ is a prime and $k$ is a positive integer then $\sum_{n=1}^{p} n^k$ is congruent (mod $p$) to $-1$ if $p - 1$ divides $k$ and to 0 otherwise.

(xvi) Determine all the solutions of the congruence $y^2 \equiv 5x^3$ (mod 7) in integers $x$, $y$.

(xvii) Prove that, for any prime $p > 3$, the numerator of $1 + \tfrac12 + \cdots + 1/(p-1)$ is divisible by $p^2$ (Wolstenholme's theorem).

4 Quadratic residues

4.1 Legendre's symbol

In the last chapter we discussed the linear congruence $ax \equiv b$ (mod $n$). Here we shall study the quadratic congruence $x^2 \equiv a$ (mod $n$); in fact this amounts to the study of the general quadratic congruence $ax^2 + bx + c \equiv 0$ (mod $n$), since on writing $d = b^2 - 4ac$ and $y = 2ax + b$, the latter gives $y^2 \equiv d$ (mod $4an$).

Let $a$ be any integer, let $n$ be a natural number and suppose that $(a, n) = 1$. Then $a$ is called a quadratic residue (mod $n$) if the congruence $x^2 \equiv a$ (mod $n$) is soluble; otherwise it is called a quadratic non-residue (mod $n$). The Legendre symbol $\left(\frac{a}{p}\right)$, where $p$ is a prime and $(a, p) = 1$, is defined as 1 if $a$ is a quadratic residue (mod $p$) and as $-1$ if $a$ is a quadratic non-residue (mod $p$). The symbol is customarily extended to the case when $p$ divides $a$ by defining it as 0 in this instance. Clearly, if $a \equiv a'$ (mod $p$), we have

$$\left(\frac{a}{p}\right) = \left(\frac{a'}{p}\right).$$

4.2 Euler's criterion

This states that if $p$ is an odd prime then

$$\left(\frac{a}{p}\right) \equiv a^{\frac12(p-1)} \pmod{p}.$$

For the proof we write, for brevity, $r = \tfrac12(p-1)$ and we note first that if $a$ is a quadratic residue (mod $p$) then for some $x$ in $\mathbb{N}$ we have $x^2 \equiv a$ (mod $p$), whence, by Fermat's theorem, $a^r \equiv x^{p-1} \equiv 1$ (mod $p$). Thus it suffices to show that if $a$ is a quadratic non-residue (mod $p$) then $a^r \equiv -1$ (mod $p$). Now in any reduced set of residues (mod $p$) there are $r$ quadratic residues (mod $p$) and $r$ quadratic non-residues (mod $p$); for the numbers $1^2, 2^2, \ldots, r^2$ are mutually incongruent (mod $p$) and since, for any integer $k$, $(p - k)^2 \equiv k^2$ (mod $p$), the numbers represent all the quadratic residues (mod $p$). Each of the numbers satisfies $x^r \equiv 1$ (mod $p$), and, by Lagrange's theorem, the congruence has at most $r$ solutions (mod $p$). Hence if $a$ is a quadratic non-residue (mod $p$) then $a$ is not a solution of the congruence. But, by Fermat's theorem, $a^{p-1} \equiv 1$ (mod $p$), whence $a^r \equiv \pm 1$ (mod $p$). The required result follows.
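Euler's criterion gives an immediate way of computing the Legendre symbol by modular exponentiation. The sketch below is illustrative; the function names are assumptions made here, and the brute-force solubility test is included only so that the two can be compared for small primes.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    t = pow(a, (p - 1) // 2, p)      # t is 0, 1 or p - 1
    return -1 if t == p - 1 else t

def is_quadratic_residue(a, p):
    """Brute-force test of whether x^2 = a (mod p) is soluble."""
    return any((x * x - a) % p == 0 for x in range(p))

p = 13
residues = [a for a in range(1, p) if legendre(a, p) == 1]
print(residues)  # the r = (p - 1)/2 quadratic residues (mod 13)
```

The count of residues found should equal $r = \tfrac12(p-1)$, in agreement with the counting step in the proof.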

It will be seen that the proof given above can be expressed briefly in terms of the field $\mathbb{F}_p$. In fact it is enough to observe that, from Fermat's theorem, every element of $\mathbb{F}_p$ other than 0 is a zero of one of the polynomials $x^{\frac12(p-1)} \pm 1$ and, from Lagrange's theorem, $x^{\frac12(p-1)} - 1$ has precisely the zeros $1^2, 2^2, \ldots, (\tfrac12(p-1))^2$, which is a complete set of quadratic residues. Note also that one can argue alternatively in terms of a primitive root (mod $p$), say $g$; indeed it is clear that the quadratic residues (mod $p$) are given by $1, g^2, \ldots, g^{2(r-1)}$.

As an immediate corollary to Euler's criterion we have the multiplicative property of the Legendre symbol, namely

$$\left(\frac{a}{p}\right)\left(\frac{b}{p}\right) = \left(\frac{ab}{p}\right)$$

for all integers $a$, $b$ not divisible by $p$; here equality holds since both sides are $\pm 1$. Similarly we have

$$\left(\frac{-1}{p}\right) = (-1)^{\frac12(p-1)};$$

in other words, $-1$ is a quadratic residue of all primes $\equiv 1$ (mod 4) and a quadratic non-residue of all primes $\equiv 3$ (mod 4). It will be recalled from Section 3.4 that when $p \equiv 1$ (mod 4) the solutions of $x^2 \equiv -1$ (mod $p$) are given by $x = \pm(r!)$.

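4.3 Gauss' lemma

For any integer $a$ and any natural number $n$ we define the numerically least residue of $a$ (mod $n$) as that integer $a'$ for which $a \equiv a'$ (mod $n$) and $-\tfrac12 n < a' \le \tfrac12 n$. Let now $p$ be an odd prime and suppose that $(a, p) = 1$. Further let $a_j$ be the numerically least residue of $aj$ (mod $p$) for $j = 1, 2, \ldots$. Then Gauss' lemma states that $\left(\frac{a}{p}\right) = (-1)^l$, where $l$ is the number of $j \le \tfrac12(p-1)$ for which $a_j < 0$.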

For the proof we observe that the numbers $|a_j|$ with $1 \le j \le r$, where $r = \tfrac12(p-1)$, are simply the numbers $1, 2, \ldots, r$ in some order. For certainly we have $1 \le |a_j| \le r$, and the $|a_j|$ are distinct since $a_j = -a_k$, with $k \le r$, would give $a(j + k) \equiv 0$ (mod $p$) with $0 < j + k < p$, which is impossible, and $a_j = a_k$ gives $aj \equiv ak$ (mod $p$), whence $j = k$. Hence we have $a_1 \cdots a_r = (-1)^l r!$. But $a_j \equiv aj$ (mod $p$) and so $a_1 \cdots a_r \equiv a^r r!$ (mod $p$). Thus $a^r \equiv (-1)^l$ (mod $p$), and the result now follows from Euler's criterion.

As a corollary we obtain

$$\left(\frac{2}{p}\right) = (-1)^{\frac18(p^2-1)},$$

that is, 2 is a quadratic residue of all primes $\equiv \pm 1$ (mod 8) and a quadratic non-residue of all primes $\equiv \pm 3$ (mod 8). To verify this result, note that, when $a = 2$, we have $a_j = 2j$ for $1 \le j \le [\tfrac14 p]$ and $a_j = 2j - p$ for $[\tfrac14 p] < j \le \tfrac12(p-1)$. Hence in this case $l = \tfrac12(p-1) - [\tfrac14 p]$, and it is readily checked that $l \equiv \tfrac18(p^2-1)$ (mod 2).
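Gauss' lemma, in its standard form, evaluates the Legendre symbol $\left(\frac{a}{p}\right)$ as $(-1)^l$, where $l$ counts those $j \le \tfrac12(p-1)$ for which the numerically least residue of $aj$ (mod $p$) is negative. The following sketch, with function names chosen here for illustration, computes the symbol this way and allows a comparison with Euler's criterion and with the formula for $a = 2$.

```python
def least_residue(a, n):
    """Numerically least residue a' of a (mod n), with -n/2 < a' <= n/2."""
    r = a % n
    return r - n if 2 * r > n else r

def gauss_count(a, p):
    """The number l of j <= (p - 1)/2 whose least residue of a*j is negative."""
    return sum(1 for j in range(1, (p - 1) // 2 + 1)
               if least_residue(a * j, p) < 0)

def symbol_by_gauss(a, p):
    """(-1)^l as in Gauss' lemma; p an odd prime not dividing a."""
    return (-1) ** gauss_count(a, p)

def symbol_by_euler(a, p):
    """The same symbol from Euler's criterion, for comparison."""
    t = pow(a, (p - 1) // 2, p)
    return -1 if t == p - 1 else t

for p in (7, 11, 13, 17):
    # l for a = 2 should equal (p - 1)/2 - [p/4].
    print(p, gauss_count(2, p), (p - 1) // 2 - p // 4, symbol_by_gauss(2, p))
```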

4.4 Law of quadratic reciprocity

We come now to the famous theorem stated by Euler in 1783 and first proved by Gauss in 1796. Apparently Euler, Legendre and Gauss each discovered the theorem independently and Gauss worked on it intensively for a year before establishing the result; he subsequently gave no fewer than eight demonstrations.

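The law of quadratic reciprocity asserts that if $p$, $q$ are distinct odd primes then

$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac14(p-1)(q-1)}.$$

Thus if $p$, $q$ are not both congruent to 3 (mod 4) then

$$\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right),$$

and in the exceptional case

$$\left(\frac{p}{q}\right) = -\left(\frac{q}{p}\right).$$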
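The statement of the law can be confirmed numerically for small primes. The sketch below is an illustration under the assumption that the Legendre symbol is computed from Euler's criterion; the helper names are hypothetical.

```python
from math import isqrt

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    t = pow(a, (p - 1) // 2, p)
    return -1 if t == p - 1 else t

def reciprocity_holds(p, q):
    """Check (p/q)(q/p) = (-1)^((p-1)(q-1)/4) for distinct odd primes p, q."""
    sign = (-1) ** (((p - 1) // 2) * ((q - 1) // 2))
    return legendre(p, q) * legendre(q, p) == sign

# Odd primes below 60, found by trial division.
odd_primes = [n for n in range(3, 60, 2)
              if all(n % d for d in range(3, isqrt(n) + 1, 2))]

print(all(reciprocity_holds(p, q)
          for p in odd_primes for q in odd_primes if p != q))
```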

For the proof we observe that, by Gauss' lemma, $\left(\frac{p}{q}\right) = (-1)^l$, where $l$ is the number of lattice points $(x, y)$ (that is, pairs of integers) satisfying $0 < x < \tfrac12 q$ and $-\tfrac12 q < px - qy < 0$. […]

Fig. 4.1 The rectangle R in the proof of the law of quadratic reciprocity.

[…] points in the rectangle $R$ defined by $0 < x < \tfrac12 q$, $0 < y <$ […]

[…] $> 3$, then their sum $S$ satisfies $S \equiv 0$ (mod $p$). Deduce analogous results for the product and sum of all the quadratic non-residues (mod $p$).

(iv) Prove that if $p$ is a prime $\equiv 1$ (mod 4) then $\sum r = \tfrac14 p(p-1)$, where the summation is over all quadratic residues $r$ with $1 \le r \le p - 1$.

(v) Use Euler's criterion to show that the primitive roots (mod $p$) for a prime $p = 2^n + 1$ are precisely the quadratic non-residues (mod $p$). Deduce that (a) if $n > 1$ then 3 is a primitive root (mod $p$), (b) if $n = 2^k$ with $k > 1$ then 5 is a primitive root (mod $p$).

(vi) Show that the prime factors of $n^2 + 4$, where $n$ is a positive odd integer, are congruent to 1 or 5 (mod 8). Deduce that there are infinitely many primes congruent to 5 (mod 8). By considering $n^2 + 2$ and $n^2 - 2$, show further that there are infinitely many primes congruent to 3 (mod 8) and to 7 (mod 8).

(vii) Find the least integer $n > 1$ such that $a^n \equiv a$ (mod $12\,121$) for all integers $a$.

(viii) Let $p$ be an odd prime and let $a$ be an integer not divisible by $p$. Prove that, if $a$ is a quadratic residue (mod $p$), then it is a quadratic residue (mod $p^k$) for all positive integers $k$.

(ix) Show that, for $p > 3$, the latter holds also for cubic residues; by a cubic residue (mod $n$), one means an integer $a$ with $(a, n) = 1$ such that $x^3 \equiv a$ (mod $n$) is soluble.

(x) Evaluate the Jacobi symbol $\left(\frac{123}{917}\right)$.

(xi) Evaluate the Jacobi symbols $\left(\frac{103}{2773}\right)$ and $\left(\frac{117}{3553}\right)$. Are 103 and 117 quadratic residues mod 2773 and mod 3553 respectively?

(xii) Let $f(x) = ax^2 + bx + c$, where $a, b, c$ are integers, and let $p$ be an odd prime that does not divide $a$. Prove that the number of solutions of the congruence $f(x) \equiv 0$ (mod $p$) is $1 + \left(\frac{d}{p}\right)$, where $d = b^2 - 4ac$ and $\left(\frac{d}{p}\right) = 0$ if $p$ divides $d$.

(xiii) Find the number of solutions (mod 997) of (a) $x^2 + x + 1 \equiv 0$, (b) $x^2 + x - 2 \equiv 0$, (c) $x^2 + 25x - 93 \equiv 0$.

(xiv) With the notation of Exercise (xii), show that, if $p$ does not divide $d$, then

$$\sum_{x=1}^{p} \left(\frac{f(x)}{p}\right) = -\left(\frac{a}{p}\right).$$

Evaluate the sum when $p$ divides $d$.

(xv) Prove that if $p'$ is a prime $\equiv 1$ (mod 4) and if $p = 2p' + 1$ is a prime then 2 is a primitive root (mod $p$). For which primes $p'$ with $p = 2p' + 1$ prime is 5 a primitive root (mod $p$)?

(xvi) Show that if $p$ is a prime and $a, b, c$ are integers not divisible by $p$ then there are integers $x$, $y$ such that $ax^2 + by^2 \equiv c$ (mod $p$).

(xvii) Let $f = f(x_1, \ldots, x_n)$ be a polynomial with integer coefficients that vanishes at the origin and let $p$ be a prime. Prove that if the congruence $f \equiv 0$ (mod $p$) has only the trivial solution then the polynomial

$$1 - f^{p-1} - (1 - x_1^{p-1}) \cdots (1 - x_n^{p-1})$$

is divisible by $p$ for all integers $x_1, \ldots, x_n$. Deduce that if $f$ has total degree less than $n$ then the congruence $f \equiv 0$ (mod $p$) has a non-trivial solution (Chevalley's theorem).

(xviii) Prove that if $f = f(x_1, \ldots, x_n)$ is a quadratic form with integer coefficients, if $n \ge 3$ and if $p$ is a prime then the congruence $f \equiv 0$ (mod $p$) has a non-trivial solution.

5 Quadratic forms

5.1 Equivalence

We shall consider binary quadratic forms

$$f(x, y) = ax^2 + bxy + cy^2,$$

where $a$, $b$, $c$ are integers. By the discriminant of $f$ we mean the number $d = b^2 - 4ac$. Plainly $d \equiv 0$ (mod 4) if $b$ is even and $d \equiv 1$ (mod 4) if $b$ is odd. The forms $x^2 - \tfrac14 d y^2$ for $d \equiv 0$ (mod 4) and $x^2 + xy + \tfrac14(1 - d)y^2$ for $d \equiv 1$ (mod 4) are called the principal forms with discriminant $d$. We have

$$4a f(x, y) = (2ax + by)^2 - dy^2,$$

whence if $d < 0$ the values taken by $f$ are all of the same sign (or zero); $f$ is called positive or negative definite accordingly. If $d > 0$ then $f$ takes values of both signs and it is called indefinite.

We say that two quadratic forms are equivalent if one can be transformed into the other by an integral unimodular substitution, that is, a substitution of the form

$$x = px' + qy', \quad y = rx' + sy',$$

where $p, q, r, s$ are integers with $ps - qr = 1$. It is readily verified that this relation is reflexive, symmetric and transitive. Further, it is clear that the set of values assumed by equivalent forms as $x$, $y$ run through the integers are the same, and indeed they assume the same set of values as the pair $x$, $y$ runs through all relatively prime integers; for $(x, y) = 1$ if and only if $(x', y') = 1$. Furthermore equivalent forms have the same discriminant. For the substitution takes $f$ into

$$f'(x', y') = a'x'^2 + b'x'y' + c'y'^2,$$


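where

$$a' = f(p, r), \quad b' = 2apq + b(ps + qr) + 2crs, \quad c' = f(q, s),$$

and it is readily checked that $b'^2 - 4a'c' = d(ps - qr)^2$. Alternatively, in matrix notation, we can write $f$ as $X^T F X$ and the substitution as $X = UX'$, where

$$X = \begin{pmatrix} x \\ y \end{pmatrix}, \quad X' = \begin{pmatrix} x' \\ y' \end{pmatrix}, \quad F = \begin{pmatrix} a & \tfrac12 b \\ \tfrac12 b & c \end{pmatrix}, \quad U = \begin{pmatrix} p & q \\ r & s \end{pmatrix};$$

then $f$ is transformed into $X'^T F' X'$, where $F' = U^T F U$, and, since the determinant of $U$ is 1, it follows that the determinants of $F$ and $F'$ are equal.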
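The formulae for $a'$, $b'$, $c'$ are easy to exercise on a concrete form. The sketch below is illustrative only; it represents a form by the triple $(a, b, c)$ and a substitution by $(p, q, r, s)$, both conventions being assumptions made here rather than notation from the text.

```python
def evaluate(form, x, y):
    """Value of f(x, y) = a x^2 + b x y + c y^2."""
    a, b, c = form
    return a * x * x + b * x * y + c * y * y

def transform(form, sub):
    """Apply x = p x' + q y', y = r x' + s y' and return (a', b', c')."""
    a, b, c = form
    p, q, r, s = sub
    return (evaluate(form, p, r),
            2 * a * p * q + b * (p * s + q * r) + 2 * c * r * s,
            evaluate(form, q, s))

def discriminant(form):
    a, b, c = form
    return b * b - 4 * a * c

f = (2, 1, 3)                 # 2x^2 + xy + 3y^2, discriminant -23
U = (2, 1, 1, 1)              # ps - qr = 2*1 - 1*1 = 1
g = transform(f, U)
print(g, discriminant(f), discriminant(g))  # (13, 17, 6) -23 -23
```

Evaluating `g` at $(x', y')$ and `f` at $(px' + qy', rx' + sy')$ should give the same number for every integer pair, which is the sense in which the two forms take the same values.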

5.2 Reduction

There is an elegant theory of reduction relating to positive definite quadratic forms which we shall now describe. Accordingly we shall assume henceforth that $d < 0$ and that $a > 0$; then we have also $c > 0$.

We begin by observing that by a finite sequence of unimodular substitutions of the form $x = y'$, $y = -x'$ and $x = x' \pm y'$, $y = y'$, $f$ can be transformed into another binary form for which $|b| \le a \le c$. For the first of these substitutions interchanges $a$ and $c$, whence it allows one to replace $a > c$ by $a < c$; and the second has the effect of changing $b$ to $b \pm 2a$, leaving $a$ unchanged, whence, by finitely many applications it allows one to replace $|b| > a$ by $|b| \le a$. The process must terminate since whenever the first substitution is applied it results in a smaller value of $a$. In fact we can transform $f$ into a binary form for which either

$$-a < b \le a < c \quad \text{or} \quad 0 \le b \le a = c.$$

For if $b = -a$ then the second of the above substitutions allows one to take $b = a$, leaving $c$ unchanged, and if $a = c$ then the first substitution allows one to take $0 \le b$. A binary form for which one or other of the above conditions on $a$, $b$, $c$ holds is said to be reduced.
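The reduction process described above translates directly into a loop. The following is a minimal sketch, assuming a form is given by its coefficients $(a, b, c)$ with $a > 0$ and $b^2 - 4ac < 0$; the function name `reduce_form` is hypothetical. The first substitution sends $(a, b, c)$ to $(c, -b, a)$ and the second sends it to $(a, b \pm 2a, a \pm b + c)$.

```python
def reduce_form(a, b, c):
    """Reduce a positive definite form (a, b, c) by the two substitutions."""
    assert a > 0 and b * b - 4 * a * c < 0
    while True:
        if a > c:                    # first substitution: interchange a and c
            a, b, c = c, -b, a
        elif b > a:                  # second substitution, x = x' - y'
            a, b, c = a, b - 2 * a, a - b + c
        elif b <= -a:                # second substitution, x = x' + y'
            a, b, c = a, b + 2 * a, a + b + c
        elif a == c and b < 0:       # first substitution again, to get b >= 0
            a, b, c = c, -b, a
        else:
            return a, b, c

print(reduce_form(13, 17, 6))  # (2, 1, 3)
```

Each pass either decreases $a$ or moves $b$ towards the interval $-a < b \le a$, which mirrors the termination argument in the text.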

There are only finitely many reduced forms with a given discriminant $d$; for if $f$ is reduced then $-d = 4ac - b^2 \ge 3ac$, whence $a$, $c$ and $|b|$ cannot exceed $\tfrac13|d|$. The number of reduced forms with discriminant $d$ is called the class number and is denoted by $h(d)$. To calculate the class number when $d = -4$, for example, we note that the inequality $3ac \le 4$ gives $a = c = 1$, whence $b = 0$ and $h(-4) = 1$. The number $h(d)$ is actually the number of inequivalent classes of binary quadratic forms with discriminant $d$ since, as we shall now prove, any two reduced forms are not equivalent.
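Since the reduced forms with a given discriminant are finite in number, they can be listed by a short search. The sketch below is an illustration with a hypothetical function name; it uses the bound $3a^2 \le 3ac \le -d$ and, following the definition in the text, counts every reduced form with integer coefficients.

```python
def reduced_forms(d):
    """All reduced forms (a, b, c) with a > 0 and discriminant d < 0."""
    assert d < 0 and d % 4 in (0, 1)
    forms = []
    a = 1
    while 3 * a * a <= -d:                   # reduced implies 3a^2 <= -d
        for b in range(-a + 1, a + 1):       # -a < b <= a
            if (b * b - d) % (4 * a) == 0:
                c = (b * b - d) // (4 * a)
                # Either -a < b <= a < c, or 0 <= b <= a = c.
                if c > a or (c == a and b >= 0):
                    forms.append((a, b, c))
        a += 1
    return forms

for d in (-3, -4, -20, -23):
    print(d, len(reduced_forms(d)), reduced_forms(d))
```

For $d = -4$ the list consists of the single form $x^2 + y^2$, in agreement with $h(-4) = 1$.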


Let $f(x, y)$ be a reduced form. Then if $x$, $y$ are non-zero integers and $|x| \ge |y|$ we have

$$f(x, y) \ge |x|(a|x| - |by|) + c|y|^2 \ge |x|^2(a - |b|) + c|y|^2 \ge a - |b| + c.$$

Similarly if $|y| \ge |x|$ we have $f(x, y) \ge a - |b| + c$. Hence the smallest values assumed by $f$ for relatively prime integers $x$, $y$ are $a$, $c$ and $a - |b| + c$ in that order; these values are taken at $(1, 0)$, $(0, 1)$ and either $(1, 1)$ or $(1, -1)$. Now the sequences of values assumed by equivalent forms for relatively prime $x$, $y$ are the same, except for a rearrangement, and thus if $f'$ is a form, as in Section 5.1, equivalent to $f$, and if also $f'$ is reduced, then $a = a'$, $c = c'$ and $b = \pm b'$. It remains therefore to prove that if $b = -b'$ then in fact $b = 0$. We can assume here that $-a < b < a < c$, for, since $f'$ is reduced, we have $-a < -b$, and if $a = c$ then we have $b \ge 0$, $-b \ge 0$, whence $b = 0$. It follows that $f(x, y) \ge a - |b| + c > c > a$ for all non-zero integers $x$, $y$. But, with the notation of Section 5.1 for the substitution taking $f$ to $f'$, we have $a = f(p, r)$. Thus $p = \pm 1$, $r = 0$, and from $ps - qr = 1$ we obtain $s = \pm 1$. Further, we have $c = f(q, s)$, whence $q = 0$. Hence the only substitutions taking $f$ to $f'$ are $x = x'$, $y = y'$ and $x = -x'$, $y = -y'$. These give $b = 0$, as required.

5.3 Representations by binary forms

A number $n$ is said to be properly represented by a binary form $f$ if $n = f(x, y)$ for some integers $x$, $y$ with $(x, y) = 1$. There is a useful criterion in connection with such representations, namely $n$ is properly represented by some binary form with discriminant $d$ if and only if the congruence $x^2 \equiv d$ (mod $4n$) is soluble.

For the proof, suppose first that the congruence is soluble and let $x = b$ be a solution. Define $c$ by $b^2 - 4nc = d$ and put $a = n$. Then the form $f$, as in Section 5.1, has discriminant $d$ and it properly represents $n$; in fact $f(1, 0) = n$. Conversely suppose that $f$ has discriminant $d$ and that $n = f(p, r)$ for some integers $p$, $r$ with $(p, r) = 1$. Then there exist integers $q$, $s$ with $ps - qr = 1$ and $f$ is equivalent to a form $f'$ as in Section 5.1 with $a' = n$. But $f$ and $f'$ have the same discriminants and so $b'^2 - 4nc' = d$. Hence the congruence $x^2 \equiv d$ (mod $4n$) has a solution $x = b'$.
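The criterion can be tested in the simplest case $d = -4$. Since $h(-4) = 1$, every positive definite form with discriminant $-4$ is equivalent to $x^2 + y^2$, and equivalent forms properly represent the same numbers; so $n$ should be properly represented by $x^2 + y^2$ exactly when $x^2 \equiv -4$ (mod $4n$) is soluble. The sketch below is an illustration with hypothetical function names, restricted to this one discriminant.

```python
from math import gcd, isqrt

def properly_represented_by_two_squares(n):
    """True when n = x^2 + y^2 for some integers x, y with (x, y) = 1."""
    for x in range(isqrt(n) + 1):
        rest = n - x * x
        y = isqrt(rest)
        if y * y == rest and gcd(x, y) == 1:
            return True
    return False

def congruence_soluble(d, n):
    """True when x^2 = d (mod 4n) has a solution."""
    return any((x * x - d) % (4 * n) == 0 for x in range(4 * n))

agree = all(properly_represented_by_two_squares(n) == congruence_soluble(-4, n)
            for n in range(1, 200))
print(agree)
```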

The ideas here can be developed to furnish, in the case $(n, d) = 1$, the number of proper representations of $n$ by all reduced forms with a given discriminant $d$. Indeed the quantity in question is given by $ws$, where $s$ is the number of solutions of the congruence $x^2 \equiv d$ (mod $4n$) with $0 \le x < 2n$ and $w$ is the number of automorphs of a reduced form; by an automorph of $f$ we mean an integral unimodular substitution that takes $f$ into itself. The number $w$ is related to the solutions of the Pell equation (see Section 7.3); it is given by 2 for $d <$ […]

5.4 Sums of two squares

[…]

5.5 Sums of four squares

We prove now the famous theorem stated by Bachet in 1621 and first demonstrated by Lagrange in 1770 to the effect that every natural number can be expressed as the sum of four integer squares. Our proof will be based on the identity

$$(x^2 + y^2 + z^2 + w^2)(x'^2 + y'^2 + z'^2 + w'^2) = (xx' + yy' + zz' + ww')^2 + (xy' - yx' + wz' - zw')^2 + (xz' - zx' + yw' - wy')^2 + (xw' - wx' + zy' - yz')^2,$$

which is related to the theory of quaternions.

In view of the identity and the trivial representation $2 = 1^2 + 1^2 + 0^2 + 0^2$, it will suffice to prove the theorem for odd primes $p$. Now the numbers $x^2$ with $0 \le x \le \tfrac12(p-1)$ are mutually incongruent (mod $p$), and the same holds for the numbers $-1 - y^2$ with $0 \le y \le \tfrac12(p-1)$. Thus we have $x^2 \equiv -1 - y^2$ (mod $p$) for some $x$, $y$ satisfying $x^2 + y^2 + 1 < 1 + 2(\tfrac12 p)^2 < p^2$. Hence we obtain $mp = x^2 + y^2 + 1$ for some integer $m$ with $0 < m < p$.

Let $l$ be the least positive integer such that $lp = x^2 + y^2 + z^2 + w^2$ for some integers $x$, $y$, $z$, $w$. Then $l \le m < p$. Further, $l$ is odd, for if $l$ were even then an even number of $x$, $y$, $z$, $w$ would be odd and we could assume that $x + y$, $x - y$, $z + w$, $z - w$ are even; but

$$\tfrac12 lp = (\tfrac12(x + y))^2 + (\tfrac12(x - y))^2 + (\tfrac12(z + w))^2 + (\tfrac12(z - w))^2$$

and this is inconsistent with the minimal choice of $l$. To prove the theorem we have to show that $l = 1$; accordingly we suppose that $l > 1$ and obtain a contradiction. Let $x'$, $y'$, $z'$, $w'$ be the numerically least residues of $x$, $y$, $z$, $w$ (mod $l$) and put

$$n = x'^2 + y'^2 + z'^2 + w'^2.$$

Then $n \equiv 0$ (mod $l$) and we have $n > 0$, for otherwise $l$ would divide $p$. Further, since $l$ is odd, we have $n < 4(\tfrac12 l)^2 = l^2$. Thus $n = kl$ for some integer $k$ with $0 < k < l$. Now by the identity we see that $(kl)(lp)$ is expressible as a sum of four integer squares, and moreover it is clear that each of these squares is divisible by $l^2$. Thus $kp$ is expressible as a sum of four integer squares. But this contradicts the definition of $l$ and the theorem follows. The argument here is an illustration of Fermat's method of infinite descent.
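Both the four-square identity and the theorem itself can be checked numerically. The sketch below is illustrative and does not follow the descent argument of the proof: `euler_product` implements the four bilinear expressions of the identity, and `four_squares` is a plain exhaustive search. Both names are chosen here for convenience.

```python
from math import isqrt

def euler_product(u, v):
    """The four expressions whose squares sum to |u|^2 |v|^2 in the identity."""
    x, y, z, w = u
    x1, y1, z1, w1 = v
    return (x * x1 + y * y1 + z * z1 + w * w1,
            x * y1 - y * x1 + w * z1 - z * w1,
            x * z1 - z * x1 + y * w1 - w * y1,
            x * w1 - w * x1 + z * y1 - y * z1)

def norm(u):
    """Sum of the squares of the entries of u."""
    return sum(t * t for t in u)

def four_squares(n):
    """Search for non-negative (x, y, z, w) with x^2 + y^2 + z^2 + w^2 = n."""
    for x in range(isqrt(n) + 1):
        for y in range(x, isqrt(n - x * x) + 1):
            for z in range(y, isqrt(n - x * x - y * y) + 1):
                rest = n - x * x - y * y - z * z
                w = isqrt(rest)
                if w * w == rest:
                    return (x, y, z, w)
    return None

u, v = (1, 2, 3, 4), (5, -6, 7, 8)
print(norm(euler_product(u, v)) == norm(u) * norm(v))
print(four_squares(2012))
```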

There is a result dating back to Legendre and Gauss to the effect that a natural number is the sum of three squares if and only if it is not of the form $4^j(8k + 7)$ with $j$, $k$ non-negative integers. Here the necessity is obvious since a square is congruent to 0, 1 or 4 (mod 8) but the sufficiency depends on the theory of ternary quadratic forms.
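The three-square criterion is easily compared with an exhaustive search for small $n$. The following sketch is an illustration only; the function names are hypothetical, and squares of zero are allowed in the representation.

```python
from math import isqrt

def is_sum_of_three_squares(n):
    """Exhaustive test of whether n = x^2 + y^2 + z^2 in non-negative integers."""
    for x in range(isqrt(n) + 1):
        for y in range(x, isqrt(n - x * x) + 1):
            rest = n - x * x - y * y
            z = isqrt(rest)
            if z * z == rest:
                return True
    return False

def has_excluded_form(n):
    """True when n = 4^j (8k + 7) for some non-negative integers j, k."""
    while n % 4 == 0:
        n //= 4
    return n % 8 == 7

print([n for n in range(1, 64) if not is_sum_of_three_squares(n)])
```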

Waring conjectured in 1770 that every natural number can be represented as the sum of 4 squares, 9 cubes, 19 biquadrates and so on. One interprets the latter to mean that, for every integer $k \ge 2$ there exists an integer $s = s(k)$ such that every natural number $n$ can be expressed in the form $x_1^k + \cdots + x_s^k$ with $x_1, \ldots, x_s$ non-negative integers; and it is customary to denote the least such $s$ by $g(k)$. Thus we have $g(2) = 4$. Waring's conjecture was proved by Hilbert in 1909. Another, quite different proof was given by Hardy and Littlewood in 1920 and it was here that they described for the first time their famous circle method. The work depends on the identity

$$\sum_{n=0}^{\infty} r(n) z^n = (f(z))^s,$$

where $r(n)$ denotes the number of representations of $n$ in the required form and $f(z) = 1 + z^{1^k} + z^{2^k} + \cdots$. Thus we have

$$r(n) = \frac{1}{2\pi i} \int_C \frac{(f(z))^s}{z^{n+1}} \, dz$$

for a suitable contour $C$. The argument now involves a delicate division of the contour into major and minor arcs, and the analysis leads to an asymptotic expression for $r(n)$ and to precise estimates for $g(k)$.
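The generating-function identity can be made concrete by multiplying truncated power series. The sketch below is a purely combinatorial illustration and has nothing to do with the analytic circle method itself: it computes the coefficients $r(n)$ of $(f(z))^s$ up to a chosen degree $N$, counting ordered representations by non-negative integers. The function name and parameters are assumptions made here.

```python
def representation_counts(k, s, N):
    """Coefficients r(0), ..., r(N) of f(z)^s, with f(z) = sum of z^(x^k), x >= 0."""
    f = [0] * (N + 1)
    x = 0
    while x ** k <= N:
        f[x ** k] += 1
        x += 1
    r = [1] + [0] * N                    # the constant series 1
    for _ in range(s):                   # multiply by f, discarding degrees > N
        product = [0] * (N + 1)
        for i, ri in enumerate(r):
            if ri:
                for e, fe in enumerate(f):
                    if fe and i + e <= N:
                        product[i + e] += ri * fe
        r = product
    return r

squares = representation_counts(2, 4, 100)
print(all(count > 0 for count in squares))   # every n <= 100 is a sum of 4 squares
cubes8 = representation_counts(3, 8, 30)
cubes9 = representation_counts(3, 9, 30)
print(cubes8[23], cubes9[23] > 0)            # 23 needs nine cubes
```

The number 23 requires nine cubes because the only cubes not exceeding it are 0, 1 and 8, and $23 = 2 \cdot 8 + 7 \cdot 1$.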

5.6 Further reading

A careful account of the theory of binary quadratic forms is given in Landau, Elementary Number Theory (Chelsea Publishing, 1958); see also Davenport, The Higher Arithmetic (Cambridge University Press, 2008). As there, we have used the classical definition of equivalence in terms of substitutions with determinant 1; however, there is an analogous theory involving substitutions with determinant $\pm 1$ and this is described in Niven, Zuckerman and Montgomery, An Introduction to the Theory of Numbers (Wiley, 1991).

For a comprehensive account of the general theory of quadratic forms see Cassels, Rational Quadratic Forms (Academic Press, 1978). For an account of the analysis appertaining to Waring's problem see R. C. Vaughan, The Hardy–Littlewood Method (Cambridge University Press, 1997).


5.7 Exerc
