the riemann zeta function, lent 20141. first definition of the riemann zeta function the riemann...

LECTURE NOTES 1 FOR CAMBRIDGE PART III COURSE ON“THE RIEMANN ZETA FUNCTION”, LENT 2014

ADAM J HARPER

Abstract. These are rough notes covering the first block of lectures in “The Riemann

Zeta Function” course. In these first lectures we will introduce the zeta function ζ(s),

obtain some basic estimates for it, and use it to prove the Prime Number Theorem.

We will also develop a general procedure for widening the zero-free region for the zeta

function (given suitable analytic information), and thus improving the error term in

the Prime Number Theorem.

(No originality is claimed for any of the contents of these notes. In particular, they

borrow from the classic books of Ivic [1] and Titchmarsh [2].)

1. First definition of the Riemann zeta function

The Riemann zeta function ζ(s) is a meromorphic function on the entire complex

plane, but its definition is not straightforward to explain for all s ∈ C. We will begin by

defining the zeta function when <(s) > 1. Later we will extend the definition to cover

the range <(s) > 0, (which is by far the most important for applications), and finally

to cover all s ∈ C. (Note that, since the half plane {<(s) > 1} is a set containing a

limit point, the Identity Theorem from complex analysis implies there is at most one

analytic continuation of ζ(s) to a meromorphic function on C.)

Definition 1.1. For each s ∈ C such that <(s) > 1, the Riemann zeta function ζ(s) is

defined by

ζ(s) :=∞∑n=1

1

ns.

Note that the series is absolutely convergent.

Any series of the form∑∞

n=1anns

, where the an are fixed complex numbers and s is

variable, is called a Dirichlet series. If only finitely many of the an are non-zero, the

resulting sum∑

n≤Nanns

is sometimes called a Dirichlet polynomial. The zeta function

is the prototypical Dirichlet series, but we will meet some others later in the course.

To get a feel for how the zeta function behaves when <(s) > 1, we shall quickly prove

the following approximation and estimate.

Date: 28th February 2014.1

2 ADAM J HARPER

Lemma 1.2. If s = σ + it, with σ > 1 and t ∈ R, then for any natural number x we

have

ζ(s) =∑n≤x

1

ns+x1−s

s− 1+O(|s|x−σ).

Consequently, if σ > 1 and |t| ≥ 2 then ζ(σ + it) = O(log |t|).

Proof of Lemma 1.2. We might initially think that

ζ(s) =∞∑n=1

1

ns≈∫ ∞

1

dw

ws=

[w1−s

1− s

]∞1

=1

s− 1,

if <(s) > 1. This is not entirely correct because the integral doesn’t approximate the

sum very accurately at the beginning, but the lemma asserts that the approximation

becomes good for the later terms.

Note that ∫ ∞n

dw

ws+1=

[w−s

−s

]∞n

=1

s

1

ns,

so in fact we have∑n>x

1

ns= s

∑n>x

∫ ∞n

dw

ws+1= s

∫ ∞x

( ∑x<n≤w

1

)dw

ws+1,

where the interchange of summation and integration is justified because everything is

absolutely convergent (if σ > 1). Thus

ζ(s) =∑n≤x

1

ns+s

∫ ∞x

(bwc − x)dw

ws+1=∑n≤x

1

ns+s

∫ ∞x

(w − x)dw

ws+1+O

(|s|∫ ∞x

dw

wσ+1

),

where bwc here denotes the integer part of w. The integral is equal to x1−s/(s−1), and

the error term is O(|s|x−σ), as claimed.

Finally, applying the estimate we just proved, with x chosen as the integer part of

|t|, yields that

ζ(σ + it) =∑n≤|t|

1

nσ+it+O

(1

|t|

)+O(1) =

∑n≤|t|

1

nσ+it+O(1).

The sum here is O(∑

n≤|t|1n), and since that is O(log |t|) the claimed estimate follows.

�

The proof of Lemma 1.2 has lots of potential for improvement and further develop-

ment, and we shall revisit it shortly when we define the zeta function on the extended

range <(s) > 0.

For each <(s) > 1, the zeta function is built from some information about every

natural number n. Since each n is a (possibly empty) product of primes, in an essentially

unique way, one might hope that the values of the zeta function can be related to the

RIEMANN ZETA FUNCTION LECTURE NOTES 1 3

behaviour of ps for primes p only. The following result, the so-called Euler product

expression for ζ(s), provides such a connection.

Lemma 1.3 (Euler product expression). If <(s) > 1 then

ζ(s) =∏p

(1− 1

ps

)−1

,

where the infinite product is defined to be limP→∞∏

p≤P

(1− 1

ps

)−1

.

Proof of Lemma 1.3. Note that, for any prime p,(1− 1

ps

)−1

=∞∑k=0

1

pks.

Here the geometric series is absolutely convergent if <(s) > 0.

Since we can freely multiply out and rearrange the terms in a finite product of

absolutely convergent series, and since every integer has a unique prime factorisation

up to ordering (the Fundamental Theorem of Arithmetic), we have∏p≤P

(1− 1

ps

)−1

=∏p≤P

∞∑k=0

1

pks=∞∑n=1

cP (n)

ns,

where cP (n) is 1 if all the prime factors of n are ≤ P , and cP (n) is zero otherwise.

Then we see ∣∣∣∣∣ζ(s)−∏p≤P

(1− 1

ps

)−1∣∣∣∣∣ =

∣∣∣∣∣∣∣∣∞∑n=1,

cP (n)=0

1

ns

∣∣∣∣∣∣∣∣ ≤∞∑n=1,

cP (n)=0

1

n<(s).

Since we certainly have cP (n) = 1 if n ≤ P , the right hand side is

≤∞∑

n=P+1

1

n<(s).

If <(s) > 1 then this tends to zero as P →∞, as claimed. �

The foregoing results constitute the most basic theory of the zeta function. When

<(s) > 1, the zeta function is simultaneously a product over primes, which are the

basic object of study in multiplicative number theory, and a sum over integers that can

be approximated and manipulated analytically. The basic theme of this course, and

of most research on the zeta function, is to try to establish similar properties for other

s ∈ C, and to play the properties off against one another to deduce information about

the zeta function and the primes. We will begin to do this in the next section, when we

see how distributional information about the primes could be recovered if we had good

information about the zeta function.

4 ADAM J HARPER

2. Primes and Perron’s formula

When Riemann first proposed that one could use the zeta function, together with

complex analysis, to investigate the distribution of primes, he proposed working directly

with the counting function

π(x) :=∑p≤x

1.

However, as we shall see it turns out to be technically easier to study a weighted counting

function, and that is now the standard approach.

Definition 2.1. We define the von Mangoldt function Λ : N→ R by

Λ(n) :=

{log p if n = pk for some prime p, and some k ≥ 1

0 otherwise.

We then define Chebychev’s Psi counting function

Ψ(x) :=∑n≤x

Λ(n).

Note that Ψ(x) =∑

p≤x log p +∑∞

k=2

∑p≤x1/k log p =

∑p≤x log p + O(

√x log2 x) if

x ≥ 2, say, since each term in the sum over k is O(√x log x), and only the first O(log x)

terms are non-zero.

The basic analytic reason for introducing the von Mangoldt function is because the

corresponding Dirichlet series,∑∞

n=1Λ(n)ns

, can be related to the zeta function more

satisfactorily than the simpler Dirichlet series∑

p1ps

. Indeed, if <(s) > 1 we note that

∞∑n=1

Λ(n)

ns= −

∑ 1

k

d

ds

1

pks= − d

ds

∑ 1

k

1

pks,

where the sums are over all prime powers pk, and the interchange of summation and

differentiation is justified because we have the uniform bound∣∣∣ 1kdds

1pks

∣∣∣ ≤ log ppk<(s)

, and∑ log ppk<(s)

is convergent. Then we see

∑ 1

k

1

pks=∑p

log

(1− 1

ps

)−1

= log ζ(s),

in view of the Euler product expression for ζ(s), and so we have

∞∑n=1

Λ(n)

ns= − d

dslog ζ(s) = −ζ

′(s)

ζ(s), <(s) > 1.

Note that ζ(s) 6= 0 if <(s) > 1, (since the zeta function is an absolutely convergent

product in which no term vanishes), and so the logarithmic derivative ζ ′(s)/ζ(s) is well

defined if <(s) > 1. (In fact one sees this directly in the absolute convergence of all

series in the above calculations.)


Remark 2.2. In the foregoing calculations, we begin to see the relevance of vanishing of

the zeta function to the behaviour of prime numbers.

After the foregoing preparations, we would like to formulate a procedure for recov-

ering information about Ψ(x) from information about∑∞

n=1Λ(n)ns

. One can think of

the Dirichlet series∑∞

n=1Λ(n)nσ+it

as a kind of Fourier transform of the values of the von

Mangoldt function, in which the oscillating terms n−it = e−it logn are analogous to the

exponential phases e2πint in a Fourier series, and the terms n−σ are present to make

everything converge. With this in mind, one could reasonably hope to formulate a

procedure like Fourier inversion, allowing one to recover information about Ψ(x) by

summing/integrating the series∑∞

n=1Λ(n)nσ+it

over a suitable range of the “frequency”

variable t. This is the procedure that we shall now develop, in a quite general context.

Lemma 2.3. Let y, c, T > 0, and define

δ(y) :=

0 if 0 < y < 1

1/2 if y = 1

1 if y > 1.

Then ∣∣∣∣δ(y)− 1

2πi

∫ c+iT

c−iTysds

s

∣∣∣∣ <{yc min{1, 1

T | log y|} if y 6= 1

min{1, cT} if y = 1.

Proof of Lemma 2.3. The obvious approach is to use Cauchy’s Residue Theorem, and

evaluate the integral by deforming the line of integration in a suitable way.

For example, if 0 < y < 1 then the integrand ys

stends to zero as <(s) → ∞ (in a

uniform way, independently of =(s)), and is holomorphic on the line of integration and

to the right of it (the only pole being to the left, at s = 0). Thus Cauchy’s Residue

Theorem implies that

1

2πi

∫ c+iT

c−iTysds

s= − 1

2πi

∫ ∞+iT

c+iT

ysds

s+

1

2πi

∫ ∞−iTc−iT

ysds

s,

and we certainly have∣∣∣∫∞+iT

c+iTys ds

s

∣∣∣ ≤ 1T

∫∞cyσdσ = yc

T | log y| . On the other hand, Cauchy’s

Residue Theorem also implies that

1

2πi

∫ c+iT

c−iTysds

s= − 1

2πi

∫Γ(c,T )

ysds

s,

where Γ(c, T ) is the arc of the circle centred at the origin, with radius |c + iT | =√c2 + T 2, that runs from c + iT to c − iT on the right. And we have

∣∣∣∫Γ(c,T )ys ds

s

∣∣∣ ≤yc√c2+T 2

∫Γ(c,T )

|ds| ≤ πyc.

Similarly, if y > 1 then the integrand ys

stends to zero as <(s) → −∞, so one can

apply Cauchy’s Residue Theorem with the contour shifted to the left instead of the

6 ADAM J HARPER

right. This time the contour encloses the pole at s = 0, which contributes its residue of

1 to the value of the integral.

Finally, if y = 1 then the integral is quite easy to estimate directly (by real variable

methods). �

Note in particular that we have δ(y) = 12πi

∫ c+i∞c−i∞ ys ds

s, (where the infinite integral is

interpreted as limT→∞∫ c+iTc−iT ys ds

s), and also that for any x > 0 and any n ∈ N we have∣∣∣∣δ(x/n)− 1

2πi

∫ c+iT

c−iT

xs

nsds

s

∣∣∣∣ <{

xc

ncmin{1, 1

T | log(x/n)|} if n 6= x

min{1, cT} if n = x.

Lemma 2.4 (Truncated Perron formula). Let x, c, T > 0, and suppose that∑∞

n=1|an|nc

is convergent. Then

′∑n≤x

an =1

2πi

∫ c+iT

c−iT

(∞∑n=1

anns

)xsds

s+O

(xc

∞∑n=1

|an|nc

min{1, 1

T | log(x/n)|}

),

where∑′

n≤x denotes that if x is an integer, then the final summand ax is replaced by

(1/2)ax.

Proof of Lemma 2.4. We have

′∑n≤x

an =∞∑n=1

anδ(x/n) =∞∑n=1

an1

2πi

∫ c+iT

c−iT

xs

nsds

s+O

(xc

∞∑n=1

|an|nc

min{1, 1

T | log(x/n)|}

),

in view of our previous observation and the triangle inequality.

Finally, our assumption that∑∞

n=1|an|nc

converges implies that both∑∞

n=1

∫ c+iTc−iT

∣∣an xsns 1s

∣∣ |ds|and

∫ c+iTc−iT

∑∞n=1

∣∣an xsns 1s

∣∣ |ds| are convergent, and therefore we may swap the summation

and integration and deduce that

∞∑n=1

an1

2πi

∫ c+iT

c−iT

xs

nsds

s=

1

2πi

∫ c+iT

c−iT

(∞∑n=1

anns

)xsds

s.

�

Remark 2.5. In fact, as we have stated Lemma 2.4 it remains valid with∑′

n≤x an

replaced by∑

n≤x an, since if x is an integer then the “big Oh” term in the lemma is at

least as large as |ax|.

Lemma 2.4 provides our desired relationship between a Dirichlet series and the count-

ing function of its coefficients. Note that if T is chosen larger, meaning that we input

information about the Dirichlet series at a wider range of “frequencies” c+ it, then the

“big Oh” error term becomes smaller.

Remark 2.6. The integral appearing in the truncated Perron formula is a little unsatis-

factory, since the factor 1/s in the integrand doesn’t decay very fast when |=(s)| → ∞.


This is precisely analogous to the way in which the usual Fourier transform of (the

characteristic function of) an interval only decays like 1/|t|. One can obtain faster de-

cay on the Fourier side by introducing smooth weight functions (like the Fejer kernel)

on the “physical space” side, and the same is true here if one replaces∑′

n≤x an by∑n anΦ(n/x), where Φ is a smooth function that approximates the indicator function

1[0,1]. This is sometimes technically very useful.

We conclude from all our work in this section that for any x > 1, and any 1 < c ≤ 2

and 1 < T ≤ x (say), we have

Ψ(x) =1

2πi

∫ c+iT

c−iT

(∞∑n=1

Λ(n)

ns

)xsds

s+O

(xc

∞∑n=1

Λ(n)

ncmin{1, 1

T | log(x/n)|}

)

=1

2πi

∫ c+iT

c−iT

(−ζ′(s)

ζ(s)

)xsds

s+O

xcT

∞∑n=1

Λ(n)

nc+ 2c

∑x/2<n<2x

Λ(n) min{1, 1

T | log(x/n)|}

=

1

2πi

∫ c+iT

c−iT

(−ζ′(s)

ζ(s)

)xsds

s+O

xcT

∞∑n=1

log n

nc+ 2c log(2x)

∑x/2<n<2x

min{1, 1

T | log(x/n)|}

=

1

2πi

∫ c+iT

c−iT

(−ζ′(s)

ζ(s)

)xsds

s+O

(xc

T (c− 1)2+x log2(2x)

T

).

It is usual to choose c = 1 + 1/ log x, so that both “big Oh” terms are O(x log2(2x)/T ),

and the term xc+it in the integrand is of order x (which is the expected size of Ψ(x)).

At this point, a possible approach to investigating Ψ(x) would be to obtain very

precise information about ζ(1 + 1/ log x+ it) (and ζ ′(1 + 1/ log x+ it)), on a good range

of t relative to x, and try directly to evaluate the contour integral above, up to a small

error. Some nice work in multiplicative number theory does proceed a bit like that, but

it is not the classical approach and it is not what we shall do at the moment. Instead

we shall extend the definition of the zeta function to hold when <(s) ≤ 1, and then use

Cauchy’s Residue Theorem to estimate the integral by deforming the line of integration.

We shall also start to investigate the zeros of the zeta function, (i.e. the values s at

which ζ(s) = 0), since these may produce poles of the integrand(− ζ′(s)

ζ(s)

)xs

s, which

produce residues.

3. Second definition of the Riemann zeta function

As promised, we shall now give a definition of the zeta function that makes sense

whenever <(s) > 0, except at s = 1 (where the function has a simple pole), and agrees

with our original definition 1.1 when <(s) > 1.

8 ADAM J HARPER

Definition 3.1. For each s ∈ C such that <(s) > 0, except for s = 1, and for any

x > 0, the Riemann zeta function is defined by

ζ(s) :=∑n≤x

1

ns+x1−s

s− 1+{x}xs− s

∫ ∞x

{w} dwws+1

,

Here {w} := w − bwc denotes the fractional part of w.

The value of the right hand side is independent of the choice of x.

Proof of well definedness. It is obvious that, for any fixed x, the right hand side in

Definition 3.1 defines a holomorphic function on the half plane <(s) > 0, except for a

simple pole at s = 1 with residue 1 (coming from the term x1−s/(s− 1)). Moreover, in

the proof of Lemma 1.2 we have already seen that for any x ∈ N, the right hand side in

Definition 3.1 coincides with ζ(s) for <(s) > 1.

It only remains to check that the right hand side takes the same value for any choice

of x, and (by analytic continuation) it will suffice to do that when <(s) > 1. If x > 0

is not an integer, and N = bxc+ 1 is the smallest integer exceeding x, then we have

s

∫ N

x

{w} dwws+1

= s

∫ N

x

(w − (N − 1))dw

ws+1= s

([w1−s

1− s

]Nx

− (N − 1)

[w−s

−s

]Nx

)

=sN1−s

1− s− sx1−s

1− s+ (N − 1)N−s − N − 1

xs

= −N1−s

s− 1+ x1−s +

x1−s

s− 1− 1

N s− N − 1

xs.

Then we see(∑n≤x

1

ns+x1−s

s− 1+{x}xs− s

∫ ∞x

{w} dwws+1

)−

(∑n≤N

1

ns+N1−s

s− 1− s

∫ ∞N

{w} dwws+1

)

= − 1

N s+x1−s

s− 1− N1−s

s− 1+{x}xs− s

∫ N

x

{w} dwws+1

,

and since {x} = x− (N − 1) this all vanishes, so indeed∑n≤x

1

ns+x1−s

s− 1+{x}xs− s

∫ ∞x

{w} dwws+1

= ζ(s) ∀<(s) > 1, ∀x > 0.

�

It follows immediately from Definition 3.1 that the final estimate in Lemma 1.2

continues to hold on, and slightly to the left of, the line <(s) = 1.

Lemma 3.2. For any t such that |t| is sufficiently large, and any σ > 1 − 100log |t| (say),

we have

ζ(σ + it) = O(log |t|).


On the same range we have

ζ ′(σ + it) = O(log2 |t|).

Proof of Lemma 3.2. If |t| is sufficiently large then certainly 1 − 100log |t| > 0, so we can

apply Definition 3.1 with the choice x = |t|, finding that

ζ(σ + it) =∑n≤|t|

1

nσ+it+O(

|t|100/ log |t|

|t|) +O(|σ + it||t|−σ) =

∑n≤|t|

1

nσ+it+O(1).

Then we note that∑n≤|t|

1

nσ+it= O(

∑n≤|t|

1

n1−100/ log |t| ) = O(∑n≤|t|

1

n) = O(log |t|).

To prove the estimate for ζ ′(σ + it) we simply differentiate Definition 3.1, finding

that

ζ ′(s) = −∑n≤x

log n

ns−x

1−s log x

s− 1− x1−s

(s− 1)2−{x} log x

xs−∫ ∞x

{w} dwws+1

+s

∫ ∞x

{w} logwdw

ws+1.

Choosing x = |t|, and estimating as before, yields the result. �

Thus far, none of the information that we have about the zeta function is very strong

unless s is close to, or to the right of, the line <(s) = 1 (usually just called “the 1-line”).

In general it is a very difficult problem to obtain strong information to the left of the

1-line (especially about the zeros of zeta), and most of the course will be taken up with

doing so. We will finish this section by giving a first taste of some of the techniques

involved, proving an approximation to the zeta function with a sharper error term than

Lemma 1.2, and which is therefore useful to the left of the 1-line.

Theorem 3.3 (Hardy and Littlewood, 1921). If s = σ + it for any σ > 0 and any

t ∈ R; and if x ≥ |t|/π is arbitrary; we have

ζ(s) =∑n≤x

1

ns+x1−s

s− 1+O(x−σ).

The main new ingredient in proving this will be the following slightly tricky lemma,

which asserts that certain sums are well approximated by the corresponding integrals,

essentially provided the summands don’t oscillate too fast. (Hopefully this seems quite

intuitively plausible.)

Lemma 3.4 (Special case of Van der Corput, 1921). Let f(x) be a real-valued function

on the interval [a, b] ⊆ R. Suppose that f ′(x) is continuous and monotonic on [a, b],

and that |f ′(x)| ≤ δ for some δ < 1. As usual, write e(θ) := e2πiθ. Then∑a<n≤b

e(f(n)) =

∫ b

a

e(f(x))dx+O

(1

1− δ

).

10 ADAM J HARPER

Proof of Lemma 3.4. The idea is roughly to express e(f(n)) as a Fourier series (in fact

we will derive something like the Poisson summation formula). It will turn out that

the zero mode in the Fourier expansion produces the main term∫ bae(f(x))dx, whilst all

the other modes contribute to the error term O(1/(1− δ)) (under the hypotheses of the

lemma, which imply that the non-trivial Fourier phases e(−kx) oscillate more rapidly

than e(f(x)), and so produce cancellation).

More precisely, we have

e(f(n)) + e(f(n+ 1))

2= lim

K→∞

K∑k=−K

un(k), where un(k) =

∫ 1

0

e(f(n+ x)− kx)dx.

We note that un(0) =∫ n+1

ne(f(x))dx, and

un(k) =

[e(f(n+ x))e(−kx)

−2πik

]1

0

+1

k

∫ 1

0

f ′(n+ x)e(f(n+ x)− kx)dx

=e(f(n))− e(f(n+ 1))

2πik+

1

k

∫ n+1

n

f ′(x)e(f(x)− kx)dx

if k 6= 0, using integration by parts and the fact that e(−kx) = e(−k(n+ x)). Here the

term e(f(n))−e(f(n+1))2πik

always cancels the corresponding term at −k, so can be ignored.

Therefore∑a<n≤b

e(f(n)) =∑

bac+1≤n≤bbc−1

(un(0) +∑k 6=0

un(k)) +O(1)

=

∫ b

a

e(f(x))dx+O(1) +∑k 6=0

1

k

∫ bbcbac+1

f ′(x)e(f(x)− kx)dx

=

∫ b

a

e(f(x))dx+O(1) +∑k 6=0

1

2πik

∫ bbcbac+1

f ′(x)

f ′(x)− kd

dxe(f(x)− kx)dx.

Finally, since f ′(x) is monotonic on [a, b], and |f ′(x)| < 1, we observe that f ′(x)/(f ′(x)−k) is also monotonic (for each fixed k 6= 0). Now one can easily check the following sum-

mation lemma of Abel: if c1 ≥ c2 ≥ ... ≥ cN , and if d1, ..., dN ∈ R are arbitrary, then∣∣∣∣∣N∑n=1

cndn

∣∣∣∣∣ ≤ |cN |∣∣∣∣∣N∑n=1

dn

∣∣∣∣∣+(cN−1−cN)

∣∣∣∣∣N−1∑n=1

dn

∣∣∣∣∣+...+(c1−c2)|d1| ≤ (c1−cN+|cN |) max1≤N ′≤N

∣∣∣∣∣N ′∑n=1

dn

∣∣∣∣∣ .So if f ′(x)/(f ′(x) − k) is monotone decreasing then, by approximating the real and

imaginary parts of the integral by Riemann sums, we see∣∣∣∣∣∫ bbcbac+1

f ′(x)

f ′(x)− k< d

dxe(f(x)− kx)dx

∣∣∣∣∣� max

bac+1≤x≤bbc

∣∣∣∣ f ′(x)

f ′(x)− k

∣∣∣∣ · maxbac+1≤X≤bbc

∣∣∣∣∫ X

bac+1

< d

dxe(f(x)− kx)dx

∣∣∣∣� 1

|k| − δ,


similarly for the imaginary part. If instead f ′(x)/(f ′(x) − k) is monotone increasing

then a similar argument yields the bound O(1/(|k| − δ)), so finally summing over k

completes the proof of the lemma. �

Proof of Theorem 3.3. Let N ≥ x be a large parameter, and recall that we write σ =

<(s) and t = =(s). Then directly from Definition 3.1, we have

ζ(s) =∑n≤N

1

ns+N1−s

s− 1+O(N−σ) +O(|s|

∫ ∞N

dw

wσ+1) =

∑n≤N

1

ns+N1−s

s− 1+O(|s|N−σ/σ).

In particular, if we choose N large enough in terms of x, σ, |t| then the error term will

be O(x−σ).

Now to prove Theorem 3.3, it will suffice to show that(∑n≤N

1

ns+N1−s

s− 1

)−

(∑n≤x

1

ns+x1−s

s− 1

)= O(x−σ)

if x ≥ |t|/π. But the left hand side is∑x<n≤N

1

ns−∫ N

x

1

wsdw

=∑

x<n≤N

1

nσe−it logn −

∫ N

x

1

wσe−it logwdw

=∑

x<n≤N

(∫ n

x

−σvσ+1

dv +1

xσ

)e−it logn −

∫ N

x

(∫ w

x

−σvσ+1

dv +1

xσ

)e−it logwdw

=1

xσ

( ∑x<n≤N

e−it logn −∫ N

x

e−it logwdw

)+

∫ N

x

−σvσ+1

( ∑v<n≤N

e−it logn −∫ N

v

e−it logwdw

)dv.

(These manipulations are a version of summation/integration by parts.) And we have

e−it logn = e2πif(n), where f(w) = −(t/2π) logw is a real valued function whose derivative

f ′(w) = −t/(2πw) is continuous, monotonic, and of absolute value ≤ |t|/(2πx) ≤ 1/2

on the interval [x,N ]. So we can apply Lemma 3.4 to bound each of the bracketed

terms, finding ∑x<n≤N

1

ns−∫ N

x

1

wsdw � 1

xσ+ σ

∫ N

x

dv

vσ+1� 1

xσ,

as claimed. �

Note that Theorem 3.3 provides a good approximation of the zeta function by a

Dirichlet polynomial with around |t| terms (usually described as a Dirichlet polynomial

of length |t|), together with the easy to understand term x1−s/(s− 1). We usually think

of the length of a Dirichlet polynomial as a measure of its complexity. It turns out

that Dirichlet polynomials of length |t|, evaluated where =(s) � |t|, can be understood

12 ADAM J HARPER

moderately well (we will see this later), but if one could approximate by a shorter

Dirichlet polynomial that would be even better.

4. Proof of the Prime Number Theorem

We are now ready to prove the following fundamental result about primes.

Theorem 4.1 (Prime Number Theorem, Hadamard, de la Vallee Poussin, 1896). As

x→∞ we have

Ψ(x) ∼ x.

More precisely, we have

Ψ(x) = x+O(xe−c log1/10 x).

(Actually Hadamard and de la Vallee Poussin did not originally obtain the quanti-

tative estimate Ψ(x) = x + O(xe−c log1/10 x), but from a modern perspective it follows

easily from variants of their methods. A little later, de la Vallee Poussin obtained the

stronger estimate Ψ(x) = x + O(xe−c√

log x), which is usually called the Prime Number

Theorem with classical error term. We will discuss that result a little later.)

Recall that Ψ(x) :=∑

n≤x Λ(n) is a counting function of weighted prime powers.

Given Theorem 4.1, it is easy (using summation by parts) to deduce an asymptotic for

the unweighted counting function π(x) as well.

Corollary 4.2. As x→∞ we have

π(x) ∼∫ x

2

dt

log t.

The function Li(x) :=∫ x

2dt

log tis called the Logarithmic Integral. Integrating by parts

shows that Li(x) = xlog x

+O( xlog2 x

), and integrating by parts multiple times yields more

precise expansions for the logarithmic integral (e.g. Li(x) = xlog x

+ xlog2 x

+ O( xlog3 x

)).

Although primes had been studied since ancient times, it was only by using the zeta

function that it became possible to prove the asymptotic for π(x) (which had been

conjectured by Gauss, and in a weaker form by Legendre). Much later alternative

proofs were found that do not use ζ(s), but the zeta approach remains fundamental to

research and to proving stronger results.

In addition to some of the facts we established previously, to prove Theorem 4.1 we

need to prove a key result about the zeros of the zeta function.

Theorem 4.3 (Weak zero-free region, Hadamard, de la Vallee Poussin, 1896). There

exists a small absolute constant c > 0 such that the following is true. For any t ∈ Rand any σ ≥ 1− c/ log9(|t|+ 2),∣∣∣∣ 1

ζ(σ + it)

∣∣∣∣ = O(log7(|t|+ 2)).


In particular, the zeta function has no zeros s = σ + it in the region {s : σ ≥ 1 −c/ log9(|t|+ 2)}.

The zero-free region {s : σ ≥ 1 − c/ log9(|t| + 2)} supplied by Theorem 4.3 is only

a small extension of the half plane {σ > 1}, where the zeta function trivially has no

zeros (because it is an absolutely convergent product). However, it does include the line

<(s) = 1, and is sufficient to imply quite spectacular results like Theorem 4.1.

Lots of our recent work on the zeta function has made use of its series expansion.

In contrast, the proof of Theorem 4.3 is mainly based on its Euler product expression,

showing that this continues to have an influence a little to the left of the half plane

{σ > 1} where it is valid.

Proof of Theorem 4.3. We may assume throughout that σ − 1 is less than a small con-

stant, since if σ−1 is large then the result is a trivial consequence of the Euler product.

We may also assume that |t| is larger than a small constant, since otherwise ζ(σ+ it) is

in the neighbourhood of the pole of the zeta function at s = 1, in which case |1/ζ(σ+it)|is certainly small.

Let σ′ > 1 be a number, to be chosen later in terms of σ and t. For any t′ we see

|ζ(σ′+it′)| = exp{< log ζ(σ′+it′)} = exp{−<∑p

log

(1− 1

pσ′+it′

)} = exp{

∑pk

cos(kt′ log p)

kpkσ′},

since we have the Euler product expression (Lemma 1.3) for σ′ > 1. Here the sum∑

pk

is over all prime powers.

The key idea is to consider the product ζ(σ′)3|ζ(σ′ + it)|4|ζ(σ′ + 2it)|. We see

ζ(σ′)3|ζ(σ′ + it)|4|ζ(σ′ + 2it)| = exp{∑pk

3 + 4 cos(kt log p) + cos(2kt log p)

kpkσ′}

= exp{∑pk

2(1 + cos(kt log p))2

kpkσ′} ≥ 1,

in view of an elementary trigonometric identity. This means that the only way that

|ζ(σ′ + it)| can be very small is if ζ(σ′), |ζ(σ′ + 2it)| are very large. In fact we have

|ζ(σ′ + it)| ≥ 1

ζ(σ′)3/4|ζ(σ′ + 2it)|1/4� (σ′ − 1)3/4

log1/4(|t|+ 2),

since ζ(s) has a simple pole at s = 1, and since ζ(σ′ + 2it) = O(log(|t| + 2)) by the

second part of Lemma 1.2 (which was stated for |t| ≥ 2 but is valid, with the bound

O(log(|t|+ 2)), provided that s is away from 1).

Finally we observe

|ζ(σ + it)| ≥ |ζ(σ′ + it)| −∫ σ′

σ

|ζ ′(r + it)|dr = |ζ(σ′ + it)|+O((σ′ − σ) log2(|t|+ 2)),

14 ADAM J HARPER

by the second part of Lemma 3.2 (which, again, was stated for large |t| but is valid

for small |t|, with the bound O(log2(|t| + 2)), provided s is away from 1). So if σ ≥1 + c/ log9(|t|+ 2) we can choose σ′ = σ, and see

|ζ(σ + it)| � c3/4

log7(|t|+ 2);

whereas if 1−c/ log9(|t|+2) ≤ σ ≤ 1+c/ log9(|t|+2) we can choose σ′ = 1+c/ log9(|t|+2),

and see

(σ′ − 1)3/4

log1/4(|t|+ 2)=

c3/4

log7(|t|+ 2), and (σ′ − σ) log2(|t|+ 2) = O(

c

log7(|t|+ 2)).

In particular, provided c is a sufficiently small (but fixed) constant then |ζ(σ + it)| �c3/4

log7(|t|+2)in this case as well. �

Proof of Theorem 4.1. In view of the truncated Perron formula (Lemma 2.4) and the

subsequent discussion, for any large x and 1 < T ≤ x we have

Ψ(x) =1

2πi

∫ 1+1/ log x+iT

1+1/ log x−iT

(−ζ′(s)

ζ(s)

)xsds

s+O

(x log2 x

T

).

We shall try to evaluate the integral using Cauchy’s Residue Theorem and our preceding

estimates, and will succeed up to an error term involving T . Then we will choose T to

balance our two “big Oh” terms.

Indeed, by the residue theorem and Theorem 4.3 we see

1

2πi

∫ 1+1/ log x+iT

1+1/ log x−iT

(−ζ′(s)

ζ(s)

)xsds

s

= Ress=1

(−ζ′(s)

ζ(s)

xs

s

)+

1

2πi

∫ 1−c/ log9(T+2)+iT

1−c/ log9(T+2)−iT

(−ζ′(s)

ζ(s)

)xsds

s

+1

2πi

∫ 1+1/ log x+iT

1−c/ log9(T+2)+iT

(−ζ′(s)

ζ(s)

)xsds

s− 1

2πi

∫ 1+1/ log x−iT


(−ζ′(s)

ζ(s)

)xsds

s

= x+1

2πi

∫ 1−c/ log9(T+2)+iT


(−ζ′(s)

ζ(s)

)xsds

s

+1

2πi

∫ 1+1/ log x+iT

1−c/ log9(T+2)+iT

(−ζ′(s)

ζ(s)

)xsds

s− 1

2πi

∫ 1+1/ log x−iT


(−ζ′(s)

ζ(s)

)xsds

s,

since the only pole of the integrand we encounter when moving to the left comes from

the simple pole of the zeta function at s = 1 (there being no zeros of zeta enclosed by

the relevant contour), and that gives rise to a residue x.


Next, on the short horizontal line [1− c/ log9(T + 2) + iT, 1 + 1/ log x+ iT ] we have∣∣∣∣−ζ ′(s)ζ(s)

xs

s

∣∣∣∣ ≤ x1+1/ log x

Tmax

s∈[1−c/ log9(T+2)+iT,1+1/ log x+iT ]|ζ ′(s)| max

s∈[1−c/ log9(T+2)+iT,1+1/ log x+iT ]

∣∣∣∣ 1

ζ(s)

∣∣∣∣� x

Tlog9(T + 2)

� x log9 x

T,

in view of Lemma 3.2 and Theorem 4.3 (and the assumption that T ≤ x). The same is

obviously true on the other short line [1− c/ log9(T + 2)− iT, 1 + 1/ log x− iT ], and so

both of those integrals contribute an error term O((x/T ) log9 x).

On the vertical line [1− c/ log9(T + 2)− iT, 1− c/ log9(T + 2) + iT ] we have instead

that ∣∣∣∣−ζ ′(s)ζ(s)

xs

s

∣∣∣∣ ≤ x1−c/ log9(T+2)

|s|max

s∈[1−c/ log9(T+2)−iT,1−c/ log9(T+2)+iT ]|ζ ′(s)|

· maxs∈[1−c/ log9(T+2)−iT,1−c/ log9(T+2)+iT ]

∣∣∣∣ 1

ζ(s)

∣∣∣∣� x1−c/ log9(T+2)

|s|log7 x max

s∈[1−c/ log9(T+2)−iT,1−c/ log9(T+2)+iT ]|ζ ′(s)| ,

in view of Theorem 4.3 again. We need to be slightly careful when bounding max |ζ ′(s)|,since some of the relevant values of s are close to the pole at s = 1. However, since we

always have |s − 1| ≥ c/ log9(T + 2), a quick check of the proof of Lemma 3.2 reveals

that

maxs∈[1−c/ log9(T+2)−iT,1−c/ log9(T+2)+iT ]

|ζ ′(s)| � log18(T + 2)� log18 x.

Therefore we have that∣∣∣∣∣ 1

2πi

∫ 1−c/ log9(T+2)+iT


(−ζ′(s)

ζ(s)

)xsds

s

∣∣∣∣∣� x1−c/ log9(T+2) log25 x

∫ T

−T

1

1 + |t|dt� x1−c/ log9(T+2) log26 x.

In summary, we have shown that

Ψ(x) = x+O(x1−c/ log9(T+2) log26 x)+O

(x log9 x

T

)= x+O

(x(e−c(log x)/ log9(T+2) + e− log T

)log26 x

)for all large x and 1 < T ≤ x. (Note that the error term O((x/T ) log2 x) from our initial

application of Perron’s formula can be subsumed into the error term O((x/T ) log9 x)

from estimating the short integrals.) Choosing T = exp{log1/10 x}, we conclude that

Ψ(x) = x+O(xe−c log1/10 x log26 x),

and since log26 x = e26 log log x is of much smaller order than e(c/2) log1/10 x (say) the asser-

tion of Theorem 4.1 follows, with c replaced by c/2. �

16 ADAM J HARPER

Remark 4.4. From this point on we will usually feel free to “absorb logarithmic terms”,

as was done in the last sentence of the proof of Theorem 4.1, without much discussion.

This leads to neater looking bounds, which are therefore easier to think about. Thus

the reader should make sure that he or she understands the justification for removing

the factor log26 x (and, in compensation, replacing c by c/2 in the exponent). Note this

is also an example of recycling letters to mean different things: the constant c in the

statement of Theorem 4.1 turned out to be half as big (in our writing of the argument)

as the constant c from Theorem 4.3.

5. Widening the zero-free region

Theorem 4.1 is a fundamental result about the distribution of primes, and is very

useful throughout number theory, but the error term O(xe−c log1/10 x) that we obtained

there is a bit awkward and unsatisfactory. For example, if we wanted to investigate the

values of the von Mangoldt function Λ(n) in the fairly long interval (x, x+ xe− log1/4 x],

the obvious approach is to note that∑x<n≤x+xe− log1/4 x

Λ(n) = Ψ(x+ xe− log1/4 x)−Ψ(x)

= (x+ xe− log1/4 x +O(xe−c log1/10 x))− (x+O(xe−c log1/10 x))

= xe− log1/4 x +O(xe−c log1/10 x).

But this is useless, since the “big Oh” error term is much larger than the supposed

main term xe− log1/4 x. Thus they might, for all we know at this point, cancel each other

completely, so we cannot even guarantee in this way that there will be a single prime

in the interval (x, x+ xe− log1/4 x] (for all large x).

Much later in the course we will develop a more sophisticated approach to studying

primes in intervals. But the obvious goal, for this and other applications, is simply to

improve the error term in the Prime Number Theorem. In this section we will develop a

general procedure for doing this, and use it to quickly prove the Prime Number Theorem

with classical error term O(xe−c√

log x). In the next chapter of the course we will use the

same procedure, but inputting much more powerful analytic data about ζ(s), to obtain

the best known error term.

We saw in the proof of Theorem 4.1 that the quality of our error term depended on

how far we could shift the line of integration to the left, which in turn depended on

how large a zero-free region we have for the zeta function. To obtain a wider zero-free

region we shall prove an important technical theorem.

Theorem 5.1 (Landau, 1924). Let φ(t) ≥ 1 and w(t) ≥ 1 be non-decreasing functions

such that φ(t)→∞ as t→∞. Also let t0 ≥ 0 be any fixed constant.


Suppose that w(t) = O(eφ(t)/2), and that

ζ(σ + it) = O(eφ(t)) ∀ 1− 1

w(t)≤ σ ≤ 2, ∀t ≥ t0.

Then there exists a constant c > 0, depending only on the implicit constants for which

the preceding conditions are satisfied, such that

ζ(σ + it) 6= 0 ∀ 1− c

φ(2t+ 1)w(2t+ 1)≤ σ, ∀t ≥ t0.

Theorem 5.1 asserts that if one knows that ζ(s) isn’t too large in a region of width

1/w(t) (to the left of the 1-line), then in a slightly narrower region it cannot be zero.

This is a rather remarkable fact, whose proof we shall turn to shortly. As an immediate

corollary we improve our zero-free region from Theorem 4.3.

Corollary 5.2 (Classical zero-free region). There exists a small absolute constant c > 0

such that the zeta function has no zeros s = σ+it in the region {s : σ ≥ 1−c/ log(|t|+2)}.

Proof of Corollary 5.2. We apply Theorem 5.1 with the choices w(t) ≡ 2, φ(t) = log t

and t0 = 3, for example. We need to check that ζ(σ+ it) = O(t) whenever σ ≥ 1/2 and

t ≥ 3, but this follows immediately from Theorem 3.3 with the choice x = t, which in

fact implies that

ζ(σ + it) =∑n≤t

1

nσ+it+

t1−σ−it

σ + it− 1+O(t−σ) =

∑n≤t

1

nσ+it+O(1) = O(

√t).

So Theorem 5.1 implies that the zeta function is non-zero in the region

1− c

log(2t+ 1)≤ σ, t ≥ 3.

Using Theorem 4.3 to handle those 0 ≤ t < 3, and adjusting the value of c suitably, we

in fact conclude that zeta is non-zero in the region

1− c

log(|t|+ 2)≤ σ, t ≥ 0.

Finally, it is easy to check (e.g. in Definition 3.1) that we always have ζ(σ − it) =

ζ(σ + it), and so we have the result for negative t by symmetry. �

Remark 5.3. It isn’t really necessary to use Theorem 3.3 to check the conditions in

the proof of Corollary 5.2: it would suffice to use Definition 3.1. One can also avoid

appealing to Theorem 4.3 to handle 0 ≤ t < 3, by choosing t0 smaller and noting that

ζ(s) is certainly non-zero in the neighbourhood of the pole at s = 1.

Next we shall prove Theorem 5.1, which requires two ingredients. The first of these,

which is crucial in essentially all known zero-free region arguments, is the fact that

ζ(σ′)3|ζ(σ′ + it)|4|ζ(σ′ + 2it)| ≥ 1 when σ′ > 1, as we saw in the proof of Theorem 4.3.

The new ingredient is the following lemma.

18 ADAM J HARPER

Lemma 5.4. Let r,M > 0 and z0 ∈ C. Suppose that f(z) is a holomorphic function

on the disc |z − z0| ≤ r, that f(z0) 6= 0, and that |f(z)/f(z0)| ≤M for all |z − z0| ≤ r.

Then if f(z) 6= 0 in the right half of the disc (where <z ≥ <z0), we have

<f′(z0)

f(z0)≥ −8 logM

r+ <

∑ρ: f(ρ)=0,|ρ−z0|≤r/2,<ρ<<z0

1

z0 − ρ,

so in particular we have

<f′(z0)

f(z0)≥ −8 logM

r.

Proof of Lemma 5.4. The proof is based on some facts from complex analysis.

Let Z denote the multi-set of all zeros of f(z) in the small disc |z−z0| ≤ r/2, counted

with multiplicity. This must be a finite multi-set, since otherwise the zeros would have

a limit point and so f(z) would be identically zero, which is false by hypothesis. Then

we define a function g(z) on the large disc |z − z0| ≤ r, by setting

g(z) :=

{f(z)

∏ρ∈Z

1z−ρ if z /∈ Z

limz′→z g(z′) if z ∈ Z.

Note that g(z) is holomorphic on the large disc |z−z0| ≤ r, so by the maximum modulus

principle we have

max|z−z0|≤r

∣∣∣∣ g(z)

g(z0)

∣∣∣∣ = max|z−z0|=r

∣∣∣∣ g(z)

g(z0)

∣∣∣∣ ≤ max|z−z0|=r

∣∣∣∣ f(z)

f(z0)

∣∣∣∣ · max|z−z0|=r

∏ρ∈Z

∣∣∣∣z0 − ρz − ρ

∣∣∣∣ ≤M.

Next, by construction the function g(z) is non-zero on the small disc |z − z0| ≤ r/2,

so we can define h(z) := log(g(z)/g(z0)) by taking the principal branch of the logarithm.

Then h(z) is holomorphic on the small disc |z − z0| ≤ r/2, and there we have

h(z0) = 0, <h(z) = log |g(z)/g(z0)| ≤ logM.

But one can bound the modulus of a holomorphic function at a point given a bound

for its real part on a surrounding disc: since h(0) = 0, the Borel–Caratheodory theorem

implies that, for any r′ < r/2,

max|z−z0|≤r′

|h(z)| ≤ 2r′

r/2− r′logM.

Then Cauchy’s Integral Formula implies that |h′(z0)| = | 12πi

∫|z−z0|=r/4

h(z)(z−z0)2

dz| ≤8(logM)/r.

Finally, note that

f ′(z0)

f(z0)=

d

dzlog f(z)|z=z0 =

d

dzlog g(z)|z=z0 +

∑ρ∈Z

1

z0 − ρ= h′(z0) +

∑ρ∈Z

1

z0 − ρ,


from which the first lower bound claimed in the lemma follows. Also if <ρ < <z0 then

< 1

z0 − ρ= < z0 − ρ|z0 − ρ|2

> 0,

so the second lower bound <f′(z0)f(z0)

≥ −8 logMr

is weaker than the first. �

Now we are in a position to prove Theorem 5.1. The proof is, in overview, rather

similar to the proof of the weak zero-free region in Theorem 4.3, with Lemma 5.4

(applied with f = ζ) providing a size estimate for ζ ′(s)/ζ(s) that takes the place of the

size estimates for ζ(s) and ζ ′(s) in that proof. Because of the need to set the quantities

r,M in Lemma 5.4, there are however some additional fiddly details.

Proof of Theorem 5.1. Let t ≥ t0 and σ > 0. We wish to prove that if ζ(σ + it) = 0

then we must have σ < 1− cφ(2t+1)w(2t+1)

, where c > 0 is a small constant that depends

(possibly) on the implicit constants in the conditions of the theorem. In view of Theorem

4.3, we may certainly assume that σ < 1, and we may also assume that t ≥ 10, say.

Let σ′ > 1 be a number, to be chosen later in terms of σ and t. We will choose σ′− 1

sufficiently small that ζ ′(σ′)/ζ(σ′) is under the influence of the simple pole at 1, and

specifically so that ζ ′(σ′)/ζ(σ′) ≥ −(5/4)/(σ′ − 1), say.

As calculated at the beginning of section 2, (and similarly as in the proof of Theorem

4.3), since σ′ > 1 the Euler product expression implies that

−3ζ ′(σ′)

ζ(σ′)− 4<ζ

′(σ′ + it)

ζ(σ′ + it)−<ζ

′(σ′ + 2it)

ζ(σ′ + 2it)= 3

∞∑n=1

Λ(n)

nσ′+ 4<

∞∑n=1

Λ(n)

nσ′+it+ <

∞∑n=1

Λ(n)

nσ′+2it

=∞∑n=1

Λ(n)(3 + 4 cos(t log n) + cos(2t log n))

nσ′

=∞∑n=1

Λ(n)2(1 + cos(t log n))2

nσ′≥ 0.

Now let 0 < r ≤ 1 be another parameter (to be chosen later), and let M =

M(r, σ′, t) > 1 be such that |ζ(s)/ζ(σ′ + it)| ≤ M for all |s − (σ′ + it)| ≤ r, and

|ζ(s)/ζ(σ′+2it)| ≤M for all |s−(σ′+2it)| ≤ r. Then our assumption that ζ ′(σ′)/ζ(σ′) ≥−(5/4)/(σ′ − 1), together with Lemma 5.4 applied to f = ζ, imply that

−<ζ′(σ′ + it)

ζ(σ′ + it)≥ 3

4

ζ ′(σ′)

ζ(σ′)+

1

4<ζ′(σ′ + 2it)

ζ(σ′ + 2it)≥ −(15/16)

σ′ − 1− 2 logM

r.

On the other hand, Lemma 5.4 also implies that−< ζ′(σ′+it)ζ(σ′+it)

≤ 8 logMr−<

∑ρ: ζ(ρ)=0,

|ρ−(σ′+it)|≤r/2,<ρ<σ′

1σ′+it−ρ .

In particular, if ζ(σ + it) = 0 then either σ < σ′ − r/2, or else (taking only the term

20 ADAM J HARPER

ρ = σ + it in the sum, remembering that all the terms are non-negative) we must have

1

σ′ − σ≤ 15/16

σ′ − 1+

10 logM

r.

If we choose σ′ = 1 + cmin{1, r/ logM}, where c > 0 is a sufficiently small absolute

constant, then the right hand side will be ≤ 31/32cmin{1,r/ logM} , so we conclude that

either σ < 1 + cmin{1, r/ logM} − r/2, or σ < 1− 0.01cmin{1, r/ logM}.

To finish the proof, it remains to choose 0 < r ≤ 1 and to calculate a permissible

value of M . By the hypotheses of Theorem 5.1, if we choose r = 1/w(2t+ 1) ≤ 1 then

we have ζ(s) = O(eφ(2t+1)) in each of the discs |s − (σ′ + it)| ≤ r, |s − (σ′ + 2it)| ≤ r.

In view of the Euler product expression, we also have

1

|ζ(σ′ + it)|=∏p

∣∣∣∣1− 1

pσ′+it

∣∣∣∣ ≤∏p

(1 +

1

pσ′

)≤

∞∑n=1

1

nσ′� 1

σ′ − 1=

max{1, (logM)/r}c

=max{1, (logM)w(2t+ 1)}

c,

and clearly we have the same bound for 1|ζ(σ′+2it)| . So if we take M = eCφ(2t+1), where

C > 0 is a suitable large constant depending on the implicit constants in the assumptions

w(t) = O(eφ(t)/2), ζ(σ + it) = O(eφ(t)), then we indeed have∣∣∣∣ ζ(s)

ζ(σ′ + it)

∣∣∣∣ ≤ eCφ(2t+1) ∀|s− (σ′ + it)| ≤ 1

w(2t+ 1),

and the same in the other disc |s− (σ′ + 2it)| ≤ 1w(2t+1)

. Inserting these values of r and

M completes the proof of the theorem. �

Having established the classical zero-free region in Corollary 5.2, we are almost ready

to prove the Prime Number Theorem with classical error term O(xe−c√

log x). To do this

we just need to obtain a bound for ζ ′(s)/ζ(s) inside the classical zero-free region. (Recall

that when we proved our weak zero-free region, in Theorem 4.3, we proved a bound for

1/ζ(s) in that region at the same time.)

Lemma 5.5. There exists a small absolute constant c > 0 such that the following is

true. For any |t| ≥ 1 and any σ ≥ 1− c/ log(|t|+ 2),∣∣∣∣ζ ′(σ + it)

ζ(σ + it)

∣∣∣∣ = O(log(|t|+ 2)).

Sketch proof of Lemma 5.5. [[This proof is non-examinable, although it follows by slightly

adapting the proof of Lemma 5.4, and applying Corollary 5.2.]] We will only sketch the

argument.


One can check that, under the hypotheses of Lemma 5.4, one actually has

<f′(z)

f(z)≥ O

(logM

r

)+ <

∑ρ: f(ρ)=0,|ρ−z0|≤r/2,<ρ<<z0

1

z − ρ

whenever |z − z0| ≤ r/4 (say), and not just at z = z0. Note that the sum is still over

those |ρ− z0| ≤ r/2, regardless of the value of z. In particular, by applying Lemma 5.4

with the choices f = ζ, z0 = 9/8 + it, r = 1/2 and M = O(|t|), we obtain

<ζ′(z)

ζ(z)≥ O(log(|t|+ 2)) + <

∑ρ: ζ(ρ)=0,

|ρ−(9/8+it)|≤1/4

1

z − ρ.

By Corollary 5.2, if z = σ + it for any σ ≥ 1− c/(2 log(|t|+ 2)) then we have <ρ < <zfor every ρ counted in the sum (in other words there are no zeros to the right of z).

This means that < 1z−ρ = < z−ρ

|z−ρ|2 > 0 for every ρ in the sum, and so actually

<ζ′(z)

ζ(z)≥ O(log(|t|+ 2))

for all such z.

Finally, the Borel–Caratheodory theorem (applied on a circle around the point 1 +

c/(2 log(|t|+ 2)) + it) implies that∣∣∣∣ζ ′(σ + it)

ζ(σ + it)

∣∣∣∣ = O(log(|t|+ 2)) ∀|t| ≥ 1, ∀σ ≥ 1− c/(4 log(|t|+ 2)).

This is the claimed result, when c is replaced by c/4. �

Theorem 5.6 (Prime Number Theorem with classical error term). For all x ≥ 2 we

have

Ψ(x) = x+O(xe−c√

log x).

Proof of Theorem 5.6. The proof is exactly similar to the proof of the Prime Number

Theorem with weak error term (Theorem 4.1), but moving the line of integration further

to the left to exploit the classical zero-free region rather than the weak zero-free region.

We outline the main details.

Using the truncated Perron formula, Cauchy’s Residue Theorem, and the classical

zero-free region from Corollary 5.2, we obtain that

Ψ(x) = x+1

2πi

∫ 1−c/ log(T+2)+iT

1−c/ log(T+2)−iT

(−ζ′(s)

ζ(s)

)xsds

s+

1

2πi

∫ 1+1/ log x+iT

1−c/ log(T+2)+iT

(−ζ′(s)

ζ(s)

)xsds

s

− 1

2πi

∫ 1+1/ log x−iT

1−c/ log(T+2)−iT

(−ζ′(s)

ζ(s)

)xsds

s+O

(x log2 x

T

),

for any large x and any 1 < T ≤ x.

22 ADAM J HARPER

Next, on the short horizontal line [1− c/ log(T + 2) + iT, 1 + 1/ log x+ iT ] we have∣∣∣∣−ζ ′(s)ζ(s)

xs

s

∣∣∣∣ ≤ x1+1/ log x

Tmax

s∈[1−c/ log(T+2)+iT,1+1/ log x+iT ]

∣∣∣∣ζ ′(s)ζ(s)

∣∣∣∣� x log(T + 2)

T� x log x

T,

by Lemma 5.5. We have the same bound on the other horizontal line. Meanwhile, on

the vertical line [1− c/ log(T + 2)− iT, 1− c/ log(T + 2) + iT ] we have∣∣∣∣−ζ ′(s)ζ(s)

xs

s

∣∣∣∣ ≤ x1−c/ log(T+2)

|s|max

s∈[1−c/ log(T+2)−iT,1−c/ log(T+2)+iT ]

∣∣∣∣ζ ′(s)ζ(s)

∣∣∣∣� x1−c/ log(T+2) log x

|s|,

where the bound |ζ ′(s)/ζ(s)| � log(T + 2) � log x follows from Lemma 5.5 when

|=s| ≥ 1 (and in fact whenever s is a fixed distance away from the pole at s = 1), and

because ζ ′(s)/ζ(s) ∼ −1/(s− 1) as s→ 1.

In summary, we have shown that

Ψ(x) = x+O

(x1−c/ log(T+2) log x

∫ T

−T

dt

1 + |t|

)+O

(x log2 x

T

)= x+O

(x log2 x

(e−c(log x)/ log(T+2) + e− log T

)).

Choosing T = exp{√

log x} to balance the two terms, the theorem follows. �

6. A quick break: Other facts about the zeta function

We end this chapter by stating a couple more facts about the zeta function, which I

would be embarrassed not to include somewhere in the course. Since we will not need to

use these results (except perhaps at the very end of the course), the proofs are omitted.

As yet we have only defined ζ(s) when <(s) > 0. However, Riemann himself showed

that one can analytically continue ζ(s) to a meromorphic function on the whole of C,

and in the process he obtained some very important symmetry information about the

zeta function.

Theorem 6.1 (Functional equation, Riemann, 1859). For all s ∈ C we have

π−s/2Γ(s/2)ζ(s) = π−(1−s)/2Γ((1− s)/2)ζ(1− s),

where Γ(z) :=∫∞

0e−xxz−1dx if <(z) > 0, and Γ is defined elsewhere on C by analytic

continuation (e.g. using the formula zΓ(z) = Γ(z + 1), or the formula Γ(z)Γ(1 − z) =

π/ sin(πz)).

Remark 6.2. When 0 < <(s) < 1, all the terms in the functional equation have straight-

forward definitions, and it asserts a certain symmetry of behaviour on either side of the

line <(s) = 1/2 (called the critical line). When <(s) ≤ 0, the functional equation allows

us to define ζ(s) in terms of ζ(1− s) (which is already defined, since <(1− s) ≥ 1) and

some other functions.


Γ(z) is meromorphic and non-zero on C, with simple poles at z = 0,−1,−2, .... Thus

ζ(s) must vanish when s = −2,−4,−6, ..., to cancel the poles of Γ(s/2) on the left hand

side of the functional equation. These zeros of zeta are called the trivial zeros, since

they don’t seem to encode information about the primes (unlike any zeros in the strip

0 < <(s) < 1).

Remark 6.3. Chapter 2 of Titchmarsh’s book [2] contains seven different proofs of the

functional equation (a particularly famous one, due to Riemann, using a functional

equation for∑∞

n=−∞ e−n2πx)!

Remark 6.4. It is speculated that the reason the Riemann Hypothesis should hold is

because the Euler product should force all the non-trivial zeros of the zeta function to lie

close to the critical line <(s) = 1/2, and then the functional equation should somehow

force them to lie exactly on the line. In this course we will not get close to the critical

line and will not need the functional equation.

As an alternative to repeatedly using the truncated Perron formula, with the line of

integration moved to different positions, to study Ψ(x), one can use an explicit formula

that directly links Ψ(x) and the zeros of the zeta function. We state a version of this

below.

Theorem 6.5 (Explicit formula, von Mangoldt, 1895). For any 2 ≤ T ≤ x, we have

Ψ(x) = x−∑

ρ: ζ(ρ)=0,|=(ρ)|≤T

xρ

ρ+O

(x log2 x

T

).

The explicit formula is proved by starting with the truncated Perron formula, moving

the line of integration far to the left (picking up residues at the zeros of the zeta function,

which appear in the sum), and then estimating the contribution on the shifted line of

integration using the functional equation.

References

[1] A. Ivic. The Riemann zeta-function: theory and applications. Dover edition, published by Dover

Publications, Inc.. 2003

[2] E. C. Titchmarsh. The Theory of the Riemann Zeta-function. Second edition, revised by D. R.

Heath-Brown, published by Oxford University Press. 1986

Jesus College, Cambridge, CB5 8BL

E-mail address: [email protected]

the riemann zeta function, lent 20141. first definition of the riemann zeta function the riemann...

Documents