class field theory and the theory of n-fermat primes … · 2017-08-07 · class field theory and...

CLASS FIELD THEORY AND THE THEORY OF N-FERMAT PRIMES

BY

ANDREW KOBIN

A Thesis Submitted to the Graduate Faculty of

WAKE FOREST UNIVERSITY GRADUATE SCHOOL OF ARTS AND SCIENCES

in Partial Fulfillment of the Requirements

for the Degree of

MASTER OF ARTS

Mathematics

May, 2015

Winston-Salem, North Carolina

Approved By:

Frank Moore, Ph.D., Advisor

Hugh Howards, Ph.D., Chair

Jeremy Rouse, Ph.D.

Acknowledgments

This was probably the hardest page of this thesis to write, as no number of words is sufficient to praise those who have helped and supported me along the way to my Master’s degree.

First, I would like to thank the members of my committee. Thank you to Dr. Jeremy Rouse for his seemingly infinite wisdom and even greater generosity in sharing his knowledge with me. We had numerous discussions on the finer details of algebraic number theory that helped shape the direction of my research. He is also a fast runner. Thank you to Dr. Hugh Howards for his mentorship and advice going all the way back to 2010 when I was a mere freshman at Wake Forest. My identity as a mathematician is in large part due to the dedication of Dr. Howards as an educator. Lastly, thank you to my adviser, Dr. Frank Moore, for his selfless devotion to this project over nearly two years’ time. I would not be where I am today without his mathematical knowledge, worldly advice and sincere friendship.

The Department of Mathematics at Wake Forest has been like a second family to me for some years now. Thank you to everyone here who has helped me to survive and thrive at Wake Forest.

To my office mates, Mackenzie, Elliott, Elena and Amelie: thank you for your support and for putting up with my loud music!

Finally, I would like to thank my family for their love and for providing me with opportunities in life that have allowed me to succeed. They are my biggest supporters and I love them dearly.


Table of Contents

Acknowledgments

Abstract

Chapter 1 Algebraic Number Fields
1.1 Rings of Algebraic Integers
1.2 Dedekind Domains
1.3 Ramification of Primes
1.4 The Decomposition and Inertia Groups
1.5 Norms of Ideals
1.6 Discriminant and Different
1.7 The Class Group
1.8 The Hilbert Class Field
1.9 Orders
1.10 Units in a Number Field

Chapter 2 Class Field Theory
2.1 Valuations and Completions
2.2 Frobenius Automorphisms and the Artin Map
2.3 Ray Class Groups
2.4 L-series and Dirichlet Density
2.5 The Frobenius Density Theorem
2.6 The Second Fundamental Inequality
2.7 The Artin Reciprocity Theorem
2.8 The Conductor Theorem
2.9 The Existence and Classification Theorems
2.10 The Cebotarev Density Theorem
2.11 Ring Class Fields

Chapter 3 Quadratic Forms and n-Fermat Primes
3.1 The Theory of Binary Quadratic Forms
3.2 The Form Class Group
3.3 n-Fermat Primes

Bibliography

Appendix A Appendix
A.1 The Four Squares Theorem
A.2 The Snake Lemma
A.3 Cyclic Group Cohomology
A.4 Helpful Magma Functions

Curriculum Vitae

Abstract

Andrew J. Kobin

Most problems in number theory are exceedingly simple to state, yet many continue to elude mathematicians even centuries after they were originally posed. Such a question, “Given a positive integer n, when can a prime number be written in the form $x^2 + ny^2$?”, was solved by Cox [7], and although the statement is elementary, the solution requires the depth and power of class field theory to understand. In our approach to this question, we will explore a variety of topics, including: algebraic number fields; types of class groups and class fields; two density theorems; the main theorems in class field theory; and the theory of quadratic forms. Our discussion will culminate in Theorem 2.11.3, a full characterization of primes of the form $x^2 + ny^2$.

However, the intrigue doesn’t end there. In Chapter 3, we pose the related question: “If p is a prime of the form $x^2 + ny^2$, when is $y^2 + nx^2$ also prime?” This question turns out to be much harder to approach, but we will investigate the symmetric n-Fermat prime question thoroughly.

In certain sections (1.8, 2.10 and 3.3) we use the Magma Computational Algebra System to handle large or complicated computations. Many of the basic commands can be found in the Magma handbook, available at http://magma.maths.usyd.edu.au/magma/handbook/ through the University of Sydney’s Computational Algebra Group.


Chapter 1: Algebraic Number Fields

In the first chapter we provide a detailed description of the main topics in algebraic

number theory: algebraic number fields, rings of integers, the behavior of prime ideals

in extensions, norms of ideals, the discriminant and different, the class group, the

Hilbert class field, orders and Dirichlet’s unit theorem.

1.1 Rings of Algebraic Integers

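Let $\overline{\mathbb{Q}}$ be an algebraic closure of $\mathbb{Q}$. Then $\overline{\mathbb{Q}}$ is an infinite dimensional $\mathbb{Q}$-vector space and every polynomial $f \in \mathbb{Q}[x]$ splits in $\overline{\mathbb{Q}}[x]$. An example of such an algebraic closure is $\overline{\mathbb{Q}} = \{u \in \mathbb{C} \mid f(u) = 0 \text{ for some nonzero } f \in \mathbb{Q}[x]\}$. Then $\mathbb{Q} \subset \overline{\mathbb{Q}} \subset \mathbb{C}$. Note that any two choices of $\overline{\mathbb{Q}}$ are isomorphic.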

One of the most important elements of a number field we will be working with is:

Definition. An element $\alpha \in \overline{\mathbb{Q}}$ is an algebraic integer if it is a root of some monic polynomial with coefficients in $\mathbb{Z}$.

Example 1.1.1. $\sqrt{2}$ is an algebraic integer since it is a root of $x^2 - 2$. However, $\frac{1}{2}$, $\pi$ and $e$ are not algebraic integers. We will see in a moment why $\frac{1}{2}$ is not an algebraic integer; $\pi$ and $e$ are not even algebraic, but the proof of that fact is famously difficult.

Note that the set of algebraic integers in Q is precisely the integers Z. In a moment

we will generalize this set to fields other than Q.

Definition. The minimal polynomial of $\alpha \in \overline{\mathbb{Q}}$ is the monic polynomial $f \in \mathbb{Q}[x]$ of minimal degree such that $f(\alpha) = 0$.

The minimal polynomial of $\alpha$ is unique, as the following lemma shows.

Lemma 1.1.2. Suppose $\alpha \in \overline{\mathbb{Q}}$. Then the minimal polynomial $f$ of $\alpha$ divides any other polynomial $h \in \mathbb{Q}[x]$ such that $h(\alpha) = 0$.

Proof. Suppose h(α) = 0. Then by the division algorithm, h = fq + r with deg r <

deg f . Note that r(α) = h(α) − f(α)q(α) = 0 so α is a root of r. But since deg f is

minimal among all polynomials of which α is a root, r must be 0. This shows that f

divides h.

Lemma 1.1.3. If $\alpha \in \overline{\mathbb{Q}}$ is an algebraic integer then its minimal polynomial has coefficients in $\mathbb{Z}$.

Proof. Let $f \in \mathbb{Q}[x]$ be the minimal polynomial of $\alpha$. Since $\alpha$ is an algebraic integer, there is some monic $g \in \mathbb{Z}[x]$ such that $g(\alpha) = 0$. By Lemma 1.1.2, $g = fh$ for some monic $h \in \mathbb{Q}[x]$. Suppose $f \notin \mathbb{Z}[x]$. Then there is some prime $p$ dividing the denominator of at least one of the coefficients of $f$; let $p^i$ be the largest power of $p$ that divides the denominator of a coefficient of $f$. Likewise let $p^j$ be the largest power of $p$ that divides the denominator of a coefficient of $h$. Then $p^{i+j} g = (p^i f)(p^j h)$, and reducing mod $p$ gives 0 on the left but a product of two nonzero polynomials in $\mathbb{F}_p[x]$ on the right, a contradiction. Hence $f \in \mathbb{Z}[x]$.

An important characterization of algebraic integers is provided in the following

proposition.

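Proposition 1.1.4. $\alpha \in \overline{\mathbb{Q}}$ is an algebraic integer if and only if

\[ \mathbb{Z}[\alpha] = \left\{ \sum_{i=0}^{n} c_i \alpha^i : c_i \in \mathbb{Z},\ n \geq 0 \right\} \]

is a finitely generated $\mathbb{Z}$-module.

Proof. ($\Rightarrow$) Suppose $\alpha$ is integral with minimal polynomial $f \in \mathbb{Z}[x]$, where $\deg f = k$. Then $\mathbb{Z}[\alpha]$ is generated by $1, \alpha, \ldots, \alpha^{k-1}$.

($\Leftarrow$) Suppose $\alpha \in \overline{\mathbb{Q}}$ and $\mathbb{Z}[\alpha]$ is generated by $f_1(\alpha), \ldots, f_n(\alpha)$ with each $f_i \in \mathbb{Z}[x]$. Let $d > M$ where $M = \max\{\deg f_i \mid 1 \leq i \leq n\}$. Then

\[ \alpha^d = \sum_{i=1}^{n} a_i f_i(\alpha) \]

for some choice of $a_i \in \mathbb{Z}$. Hence $\alpha$ is a root of $x^d - \sum_{i=1}^{n} a_i f_i(x)$, which is monic of degree $d$ because $d > M$, so $\alpha$ is integral.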
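The forward direction of Proposition 1.1.4 can be made concrete: a monic relation $f(\alpha) = 0$ of degree $k$ lets every power $\alpha^d$ be rewritten as an integer combination of $1, \alpha, \ldots, \alpha^{k-1}$. The following is a minimal Python sketch of that reduction (Python is used for illustration only, and the function name `reduce_power` is hypothetical, not taken from the thesis).

```python
def reduce_power(f, d):
    """Express alpha^d in the basis 1, alpha, ..., alpha^(k-1).

    f = [a_0, ..., a_{k-1}, 1] lists the coefficients of a monic integer
    polynomial with f(alpha) = 0. Returns [c_0, ..., c_{k-1}] such that
    alpha^d = c_0 + c_1*alpha + ... + c_{k-1}*alpha^(k-1).
    """
    assert f[-1] == 1, "f must be monic"
    k = len(f) - 1
    coeffs = [1] + [0] * (k - 1)          # the element 1
    for _ in range(d):
        top = coeffs[-1]                   # coefficient that would move to alpha^k
        coeffs = [0] + coeffs[:-1]         # multiply by alpha
        # replace alpha^k by -(a_0 + a_1*alpha + ... + a_{k-1}*alpha^(k-1))
        coeffs = [c - top * a for c, a in zip(coeffs, f[:-1])]
    return coeffs

# alpha = golden ratio, root of x^2 - x - 1: powers reduce to Fibonacci numbers
golden = reduce_power([-1, -1, 1], 10)
# alpha = sqrt(2), root of x^2 - 2
root2_cubed = reduce_power([-2, 0, 1], 3)
```

Because the leading coefficient is 1, no division is ever needed, which is exactly why the coefficients stay in $\mathbb{Z}$.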

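Example 1.1.5. $\alpha = \frac{1}{2}$ is not an algebraic integer since $\mathbb{Z}[\frac{1}{2}]$ is not finitely generated as a $\mathbb{Z}$-module.

Definition. For a given algebraic closure $\overline{\mathbb{Q}}$ of $\mathbb{Q}$, we will denote the set of all algebraic integers in $\overline{\mathbb{Q}}$ by $\overline{\mathbb{Z}}$.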

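This set inherits the binary operations $+$ and $\cdot$ from $\overline{\mathbb{Q}}$, and an important property is that $\overline{\mathbb{Z}}$ is closed under these operations:

Proposition 1.1.6. The set $\overline{\mathbb{Z}}$ of all algebraic integers is a ring.

Proof. Note that 0 is a root of $x$ and 1 is a root of $x - 1$, so $0, 1 \in \overline{\mathbb{Z}}$. Moreover, if $\alpha$ is a root of a monic polynomial $f(x) \in \mathbb{Z}[x]$ of degree $m$, then $-\alpha$ is a root of the monic polynomial $(-1)^m f(-x)$. Then it suffices to prove closure under addition and multiplication.

Suppose $\alpha, \beta \in \overline{\mathbb{Z}}$ and let $m$ and $n$ be the degrees of their respective minimal polynomials. Then $1, \alpha, \ldots, \alpha^{m-1}$ span $\mathbb{Z}[\alpha]$ and $1, \beta, \ldots, \beta^{n-1}$ likewise span $\mathbb{Z}[\beta]$. So the elements $\alpha^i \beta^j$ for $0 \leq i \leq m-1$, $0 \leq j \leq n-1$ span $\mathbb{Z}[\alpha, \beta]$, so this $\mathbb{Z}$-module is finitely generated. Since $\mathbb{Z}$ is noetherian, this implies that the submodules $\mathbb{Z}[\alpha + \beta]$ and $\mathbb{Z}[\alpha\beta]$ of $\mathbb{Z}[\alpha, \beta]$ are also finitely generated, so it follows by Proposition 1.1.4 that $\alpha + \beta$ and $\alpha\beta$ are algebraic integers.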

The two most important objects of study in algebraic number theory are number

fields and their associated rings of integers, which are defined below.


Definition. A number field is a subfield $K \subset \overline{\mathbb{Q}}$ such that $K$ is a finite dimensional vector space over $\mathbb{Q}$. The dimension of $K$ over $\mathbb{Q}$ is called the degree of the field extension, denoted $[K : \mathbb{Q}]$.

Definition. The ring of integers of a number field $K$ is

\[ \mathcal{O}_K = K \cap \overline{\mathbb{Z}} = \{\alpha \in K \mid \alpha \text{ is an algebraic integer}\}. \]

Example 1.1.7. Q is the unique number field of degree 1, and its ring of integers is

the rational integers Z.

Example 1.1.8. Q(i) is a number field of degree 2. Its ring of integers is Z[i], the

Gaussian integers.

Example 1.1.9. $K = \mathbb{Q}(\sqrt{5})$ has ring of integers $\mathcal{O}_K = \mathbb{Z}[(1 + \sqrt{5})/2]$. The reader may recognize the number $(1 + \sqrt{5})/2$ as the golden ratio.
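To see why $(1 + \sqrt{5})/2$ belongs to the ring of integers even though its coordinates are not integers, one can check by exact arithmetic that it is a root of the monic integer polynomial $x^2 - x - 1$. The sketch below is an illustration in Python; the pair representation $(a, b) \leftrightarrow a + b\sqrt{5}$ and the helper names are assumptions made for this example.

```python
from fractions import Fraction

D = 5  # we work in Q(sqrt(5)); a pair (a, b) stands for a + b*sqrt(5)

def mul(u, v):
    """Multiply two elements of Q(sqrt(D)) given as coordinate pairs."""
    a, b = u
    c, d = v
    return (a * c + D * b * d, a * d + b * c)

def sub(u, v):
    """Subtract two elements coordinate-wise."""
    return (u[0] - v[0], u[1] - v[1])

phi = (Fraction(1, 2), Fraction(1, 2))   # (1 + sqrt(5)) / 2
one = (Fraction(1), Fraction(0))

value = sub(sub(mul(phi, phi), phi), one)            # phi^2 - phi - 1
is_root = value == (0, 0)                            # phi is integral over Z
in_Z_sqrt5 = all(c.denominator == 1 for c in phi)    # but phi is not in Z[sqrt(5)]
```

So the golden ratio is an algebraic integer of $\mathbb{Q}(\sqrt{5})$ that does not lie in $\mathbb{Z}[\sqrt{5}]$.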

An object we will study in Section 1.9 is:

Definition. An order in OK is any subring O ⊂ OK such that the quotient OK/O

of abelian groups is finite.

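Example 1.1.10. For $\mathcal{O}_{\mathbb{Q}(i)} = \mathbb{Z}[i]$, the subring $\mathbb{Z} + ni\mathbb{Z}$ is an order for every nonzero $n \in \mathbb{Z}$. However, $\mathbb{Z} \subset \mathbb{Z}[i]$ is not an order since $\mathbb{Z}$ does not have finite index in $\mathbb{Z}[i]$.

Example 1.1.11. For $K = \mathbb{Q}(\alpha)$ where $\alpha$ is an algebraic integer, $\mathbb{Z}[\alpha]$ is an order in $\mathcal{O}_K$ but in general $\mathbb{Z}[\alpha] \neq \mathcal{O}_K$. We study orders in further detail in Section 1.9 and see some important examples where $\mathbb{Z}[\alpha] \neq \mathcal{O}_K$.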

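Lemma 1.1.12. For any number field $K$, $\mathcal{O}_K \cap \mathbb{Q} = \mathbb{Z}$ and $\mathbb{Q}\mathcal{O}_K = K$.

Proof. Suppose $\alpha \in \mathcal{O}_K \cap \mathbb{Q}$ and write $\alpha = \frac{a}{b}$ in lowest terms. We may assume $b > 0$. Since $\alpha$ is integral, $\mathbb{Z}[\frac{a}{b}]$ is finitely generated as a $\mathbb{Z}$-module, so the denominators of its elements are bounded. The powers of $\frac{a}{b}$ have denominators $b^k$, so $b = 1$.

On the other hand, suppose $\alpha \in K$ with minimal polynomial $f(x) \in \mathbb{Q}[x]$, where $\deg f = n$. For any positive integer $d$, the minimal polynomial of $d\alpha$ is $d^n f(\frac{x}{d})$. In particular, let $d$ be the least common multiple of the denominators of the coefficients of $f$. Then $d^n f(\frac{x}{d})$ has integer coefficients, so $d\alpha \in \mathcal{O}_K$. Hence $\mathbb{Q}\mathcal{O}_K = K$.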
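The denominator-clearing step in this proof is easy to carry out explicitly. As a hedged sketch (Python 3.9 or later is assumed for `math.lcm`, and the function name `clear_denominators` is hypothetical), the code below takes the coefficients of a monic rational polynomial $f$ and returns $d$ together with the integer coefficients of $d^n f(x/d)$.

```python
from fractions import Fraction
from math import lcm

def clear_denominators(f):
    """Given f = [a_0, ..., a_{n-1}, 1] (monic, rational coefficients),
    return (d, g) where d is the lcm of the denominators of the a_k and
    g lists the integer coefficients of d^n * f(x/d)."""
    coeffs = [Fraction(a) for a in f]
    n = len(coeffs) - 1
    d = lcm(*(a.denominator for a in coeffs))
    # the coefficient of x^k in d^n * f(x/d) is a_k * d^(n-k)
    g = [int(a * d ** (n - k)) for k, a in enumerate(coeffs)]
    return d, g

# alpha = sqrt(2)/2 has minimal polynomial x^2 - 1/2; then 2*alpha = sqrt(2)
d, g = clear_denominators([Fraction(-1, 2), 0, 1])
```

Here `g` encodes $x^2 - 2$, the minimal polynomial of $2\alpha = \sqrt{2}$, which is visibly an algebraic integer.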

Definition. A lattice in a number field K is a subset L such that QL = K and L is

an abelian group of rank [K : Q].

Proposition 1.1.13. For any number field K, the ring of integers is a lattice in K.

Proof. QOK = K was proven in Lemma 1.1.12, and the second statement can be

shown by choosing a basis for K consisting of elements in OK .

Corollary 1.1.14. OK is a noetherian ring.

Proof. By the proposition, OK is finitely generated as a Z-module, so it is clearly

finitely generated as a ring. It is well known (cf. 2.2.7 in [25]) that this implies OK

is noetherian.

1.2 Dedekind Domains

A nice property of the integers $\mathbb{Z}$ is unique factorization: every nonzero integer $n$ can be written, uniquely up to sign and order, as a product of powers of prime numbers. This unique factorization property fails in general for rings of algebraic integers. However, $\mathcal{O}_K$ has the special property that every nonzero ideal factors uniquely as a product of prime ideals.

Definition. An integral domain R is integrally closed in its field of fractions K if

every α ∈ K that is a root of a monic polynomial f ∈ R[x] is itself in R.

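Proposition 1.2.1. $\overline{\mathbb{Z}}$ is integrally closed and for any number field $K$, its ring of integers $\mathcal{O}_K$ is integrally closed.

Proof. First suppose $\alpha \in \overline{\mathbb{Q}}$ is integral over $\overline{\mathbb{Z}}$. Then there is some monic polynomial $f(x)$ in $\overline{\mathbb{Z}}[x]$ such that $f(\alpha) = 0$. If $f(x) = a_0 + a_1 x + \ldots + x^n$ then the $a_i$ all lie in $\mathcal{O}_K$, where $K = \mathbb{Q}(a_0, \ldots, a_{n-1})$. Since $\mathcal{O}_K$ is finitely generated as a $\mathbb{Z}$-module, so is $\mathbb{Z}[a_0, \ldots, a_{n-1}]$. Now $f(\alpha) = 0$ means that we can write $\alpha^n$ as a combination of $\alpha^i$ for $i < n$, with weights $c_i \in \mathbb{Z}[a_0, \ldots, a_{n-1}]$. Thus $\mathbb{Z}[a_0, \ldots, a_{n-1}, \alpha]$ is also a finitely generated $\mathbb{Z}$-module. But notice that $\mathbb{Z}[\alpha]$ is a submodule of $\mathbb{Z}[a_0, \ldots, a_{n-1}, \alpha]$, so it too is finitely generated. Hence $\alpha$ is integral over $\mathbb{Z}$, meaning $\alpha \in \overline{\mathbb{Z}}$. This proves the first statement, and the second statement now follows easily. Suppose $\alpha \in K$ is integral over $\mathcal{O}_K$. Then since $\overline{\mathbb{Z}}$ is integrally closed, $\alpha \in \overline{\mathbb{Z}}$, implying $\alpha \in K \cap \overline{\mathbb{Z}} = \mathcal{O}_K$.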

This property of OK is important in establishing it as a special type of domain,

called a Dedekind domain.

Definition. An integral domain R is a Dedekind domain if

(1) R is noetherian.

(2) R is integrally closed in its field of fractions.

(3) Every nonzero prime ideal p ⊂ R is maximal.

Example 1.2.2. $\mathbb{Z}[\sqrt{5}]$ is not integrally closed since for example $(1 + \sqrt{5})/2 \in \mathbb{Q}(\sqrt{5})$ is integral over $\mathbb{Z}[\sqrt{5}]$ but is not itself an element of $\mathbb{Z}[\sqrt{5}]$. Therefore $\mathbb{Z}[\sqrt{5}]$ is not a Dedekind domain, but as we shall see in Section 1.9 it is an order of the ring of integers $\mathbb{Z}[(1 + \sqrt{5})/2]$.

Example 1.2.3. Any field is (trivially) a Dedekind domain.

Example 1.2.4. $\overline{\mathbb{Z}}$ is integrally closed and every nonzero prime ideal is maximal, but $\overline{\mathbb{Z}}$ is not noetherian and hence not a Dedekind domain.


Proposition 1.2.5. OK is a Dedekind domain.

Proof. We have shown (Corollary 1.1.14 and Proposition 1.2.1) that OK is integrally

closed and noetherian, so it suffices to show that prime ideals are maximal.

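Suppose $\mathfrak{p}$ is a nonzero prime ideal of $\mathcal{O}_K$. Let $\alpha \in \mathfrak{p}$ be nonzero and let $f(x) = x^n + a_{n-1}x^{n-1} + \ldots + a_0$ be its minimal polynomial, which has integer coefficients by Lemma 1.1.3. Then $f(\alpha) = 0$ so

\[ a_0 = -\alpha^n - a_{n-1}\alpha^{n-1} - \ldots - a_1\alpha \in \mathfrak{p}. \]

Moreover $a_0 \neq 0$, since $f$ is irreducible and $\alpha \neq 0$. Then $a_0 \in \mathbb{Z} \cap \mathfrak{p}$ so every element of the quotient $\mathcal{O}_K/\mathfrak{p}$ is killed by $a_0$; since $\mathcal{O}_K$ is a finitely generated $\mathbb{Z}$-module, this implies $\mathcal{O}_K/\mathfrak{p}$ is finite. Since $\mathfrak{p}$ is prime, $\mathcal{O}_K/\mathfrak{p}$ is an integral domain, and every finite integral domain is a field, which proves that $\mathfrak{p}$ is maximal. Hence $\mathcal{O}_K$ is a Dedekind domain.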

The crucial property of Dedekind domains is that their nonzero ideals factor

uniquely into prime ideals. In fact, unique factorization holds for a more general

class of objects in a Dedekind domain called fractional ideals.

Definition. Let R be a Dedekind domain and K be its field of fractions. A fractional

ideal of R is a nonzero R-submodule of K that is finitely generated as an R-module.

Note that since fractional ideals are finitely generated, we can clear denominators

of a generating set to realize every fractional ideal in the form

aI = {ab | b ∈ I}

where a ∈ K and I is an integral ideal of the ring R.

Example 1.2.6. $\frac{1}{2}\mathbb{Z}$ is a fractional ideal of $\mathbb{Z}$.

Lemma 1.2.7. Let R be a Dedekind domain. For every nonzero ideal I ⊂ R, there

exist prime ideals p1, . . . , pn such that p1 · · · pn ⊂ I.


Proof. Let S be the set of nonzero ideals in R that do not satisfy the conclusion of the

lemma. The idea here is to use the fact that R is noetherian to show that S must be

empty. Supposing to the contrary that S is not empty, the noetherian property allows

us to choose a maximal element I ∈ S. If I were prime, it would trivially contain a

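product of primes so we know this is not the case. Then there exist $a, b \in R \smallsetminus I$ such that $ab \in I$. Let $J_1 = I + (a)$ and $J_2 = I + (b)$. Then neither $J_1$ nor $J_2$ is in $S$ since both properly contain $I$ and $I$ is maximal in $S$, so each contains a product of primes, say

\[ \mathfrak{p}_1 \cdots \mathfrak{p}_r \subset J_1 \quad \text{and} \quad \mathfrak{q}_1 \cdots \mathfrak{q}_s \subset J_2. \]

Then $\mathfrak{p}_1 \cdots \mathfrak{p}_r \mathfrak{q}_1 \cdots \mathfrak{q}_s \subset J_1 J_2 = I^2 + I(b) + (a)I + (ab) \subset I$. We have shown I to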

contain a product of primes, producing the necessary contradiction to show that S is

empty. Hence every nonzero ideal of R contains a product of primes.

The critical property of fractional ideals is proven next.

Theorem 1.2.8. The set of fractional ideals of a Dedekind domain R forms an abelian

group under ideal multiplication, with identity R.

Proof. The product of two fractional ideals is again finitely generated, hence a fractional ideal. Also, for any fractional ideal $I$, $IR = I$, so it suffices to show the existence of inverses.

First we prove that if $\mathfrak{p} \subset R$ is a nonzero prime ideal, it has an inverse. Let $I = \{a \in K \mid a\mathfrak{p} \subset R\}$; we will show this is an inverse of $\mathfrak{p}$. Fix a nonzero $b \in \mathfrak{p}$. Since $I$ is an $R$-module, $bI$ is an ideal in $R$, so $I$ is a fractional ideal. And since $R \subset I$ we have $\mathfrak{p} \subset I\mathfrak{p} \subset R$, but $\mathfrak{p}$ is maximal ($R$ is a Dedekind domain) so either $\mathfrak{p} = I\mathfrak{p}$ or $I\mathfrak{p} = R$.

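If $I\mathfrak{p} = R$ then $I$ is an inverse of $\mathfrak{p}$ and we’re done. Instead suppose $I\mathfrak{p} = \mathfrak{p}$. By Lemma 1.2.7 we can choose a product of prime ideals $\mathfrak{p}_1\mathfrak{p}_2 \cdots \mathfrak{p}_m \subset (b) \subset \mathfrak{p}$ with $m$ minimal. If no $\mathfrak{p}_i$ is contained in $\mathfrak{p}$ then for each $i$ there is some $a_i \in \mathfrak{p}_i$ with $a_i \notin \mathfrak{p}$, but $\prod a_i \in \mathfrak{p}$ which contradicts that $\mathfrak{p}$ is a prime ideal. Thus there is some $\mathfrak{p}_i \subset \mathfrak{p}$; after relabeling we may take $i = 1$. However, every nonzero prime is maximal so $\mathfrak{p}_1 = \mathfrak{p}$. Since $m$ was minimal, $\mathfrak{p}_2\mathfrak{p}_3 \cdots \mathfrak{p}_m \not\subset (b)$ and so there is some $c \notin (b)$ that lies in $\mathfrak{p}_2\mathfrak{p}_3 \cdots \mathfrak{p}_m$. Then $\mathfrak{p}(c) \subset (b)$ so we have $d := \frac{c}{b} \in I$. However $d \notin R$ since if it were, $c$ would lie in $(b)$. But note that $d$ preserves $\mathfrak{p}$ as an $R$-module (that is, $d\mathfrak{p} \subset I\mathfrak{p} = \mathfrak{p}$) and $\mathfrak{p}$ is finitely generated, so $d$ is integral over $R$. Since $R$ is integrally closed, $d$ must be in $R$, a contradiction. Hence $I\mathfrak{p} = R$, so every nonzero prime ideal of $R$ has an inverse.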

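Now we turn to arbitrary fractional ideals. Every fractional ideal is of the form $aI$ for some nonzero $a \in K$ and $I$ a nonzero ideal of $R$, and $(a)$ has inverse $(a^{-1})$, so it suffices to invert $I$. Suppose some nonzero ideal of $R$ has no inverse; since $R$ is noetherian we may choose $I$ maximal among such ideals. Then $I \neq R$, and since the prime ideals are maximal in $R$, $I \subset \mathfrak{p}$ for some prime $\mathfrak{p}$. Multiplying both sides of this containment by $\mathfrak{p}^{-1}$, we have

\[ I \subset \mathfrak{p}^{-1}I \subset \mathfrak{p}^{-1}\mathfrak{p} = R. \]

By the same argument as above, $\mathfrak{p}^{-1}I \neq I$, so by maximality of $I$ the ideal $\mathfrak{p}^{-1}I$ has an inverse $J$. Then $J\mathfrak{p}^{-1}$ is an inverse of $I$, a contradiction. Hence every fractional ideal has an inverse.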

In the next two theorems we show that unique factorization of ideals holds in any

Dedekind domain.

Theorem 1.2.9. Every nonzero ideal I in a Dedekind domain R can be written as a

unique (up to order) product of prime ideals.

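Proof. Suppose $I$ is maximal among those ideals that cannot be factored into primes. Every proper ideal is contained in a maximal ideal so $I \subset \mathfrak{p}$ for some maximal $\mathfrak{p}$, which is also prime. If $I\mathfrak{p}^{-1} = I$ then $\mathfrak{p}^{-1} = R$ by group properties, but this is impossible. However, $R \subset \mathfrak{p}^{-1}$ which implies $I \subsetneq I\mathfrak{p}^{-1}$. By maximality of $I$, $I\mathfrak{p}^{-1} = \mathfrak{p}_1 \cdots \mathfrak{p}_n$ for prime ideals $\mathfrak{p}_i$. Then $I = \mathfrak{p}_1 \cdots \mathfrak{p}_n\mathfrak{p}$, which shows $I$ can in fact be written as a product of primes, contradicting our initial assumption. Hence every ideal has a prime factorization.

To prove uniqueness, suppose $\mathfrak{p}_1 \cdots \mathfrak{p}_n = \mathfrak{q}_1 \cdots \mathfrak{q}_m$. If no $\mathfrak{q}_i$ is contained in $\mathfrak{p}_1$ then for each $i$ there is some $a_i \in \mathfrak{q}_i \smallsetminus \mathfrak{p}_1$. But then $a_1 \cdots a_m \in \mathfrak{q}_1 \cdots \mathfrak{q}_m = \mathfrak{p}_1 \cdots \mathfrak{p}_n \subset \mathfrak{p}_1$ which contradicts primality of $\mathfrak{p}_1$. Thus $\mathfrak{q}_i \subset \mathfrak{p}_1$ for some $i$, and since $\mathfrak{q}_i$ is maximal, $\mathfrak{p}_1 = \mathfrak{q}_i$. Cancelling this common factor and repeating the argument for each $\mathfrak{p}_j$ shows that $\mathfrak{p}_j = \mathfrak{q}_i$ for some $i$. Thus the factorization is unique up to order.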

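Theorem 1.2.10. If $I$ is a fractional ideal of $R$ then there exist prime ideals $\mathfrak{p}_1, \ldots, \mathfrak{p}_n$ and $\mathfrak{q}_1, \ldots, \mathfrak{q}_m$ so that

\[ I = (\mathfrak{p}_1 \cdots \mathfrak{p}_n)(\mathfrak{q}_1 \cdots \mathfrak{q}_m)^{-1} \]

and this factorization is unique up to order.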

Proof. We can clear denominators to write aI = J for some a ∈ R and J an integral

ideal of R. Apply unique factorization to J and (a) and the result follows from

Theorem 1.2.8.

Example 1.2.11. Let $K = \mathbb{Q}(\sqrt{-6})$ with ring of integers $\mathcal{O}_K = \mathbb{Z}[\sqrt{-6}]$. If $ab = \sqrt{-6}$ with neither $a$ nor $b$ a unit, then $\mathrm{Norm}(a)\mathrm{Norm}(b) = 6$ (see Section 1.5). Without loss of generality let $\mathrm{Norm}(a) = 2$ and $\mathrm{Norm}(b) = 3$. If $a = x + y\sqrt{-6}$ then $\mathrm{Norm}(a) = x^2 + 6y^2 = 2$, which has no solutions in $\mathbb{Z}$. This shows that $\sqrt{-6}$ is irreducible. The same norm argument shows that 2 and 3 are irreducible, so $6 = 2 \cdot 3 = -(\sqrt{-6})^2$ has two essentially different factorizations. So 6 cannot be written uniquely as a product of irreducibles in $\mathcal{O}_K$. However, (6) factors into prime ideals as Theorem 1.2.9 suggests:

\[ (6) = (2, 2 + \sqrt{-6})^2 (3, 3 + \sqrt{-6})^2. \]

This is not trivial to calculate, but we will develop the techniques required to determine such a factorization in subsequent sections.
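The norm argument in Example 1.2.11 comes down to small Diophantine checks, which can be confirmed by exhaustive search. The following Python sketch is only an illustration (the function name `norm_solutions` is hypothetical); it lists all integer pairs $(x, y)$ with $x^2 + 6y^2 = m$.

```python
from math import isqrt

def norm_solutions(m, n=6):
    """All integer pairs (x, y) with x^2 + n*y^2 == m, for m >= 0 and n > 0."""
    sols = set()
    bound_y = isqrt(m // n)              # |y| cannot exceed sqrt(m / n)
    for y in range(-bound_y, bound_y + 1):
        rest = m - n * y * y
        x = isqrt(rest)
        if x * x == rest:                # rest is a perfect square
            sols.add((x, y))
            sols.add((-x, y))
    return sorted(sols)

no_norm_2 = norm_solutions(2)   # no element of Z[sqrt(-6)] has norm 2
no_norm_3 = norm_solutions(3)   # no element has norm 3
norm_6 = norm_solutions(6)      # only +/- sqrt(-6) has norm 6
norm_4 = norm_solutions(4)      # only +/- 2 has norm 4
```

Since no element has norm 2 or 3, any factorization of 2, 3 or $\sqrt{-6}$ (of norms 4, 9 and 6) must involve a unit, which is the irreducibility claim used above.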

A special case of a Dedekind domain is:

Definition. An integral domain R is a discrete valuation ring if it is noetherian,

integrally closed and contains exactly one nonzero prime ideal.


In Section 2.1, we will see where the name ‘discrete valuation ring’ comes from,

as well as study some of the properties of a DVR as they relate to absolute values on

a field.

We proved in Theorem 1.2.8 that the set of fractional ideals of a Dedekind domain

forms a group under ideal multiplication, but there is an even stronger characteriza-

tion.

Theorem 1.2.12. Let R be an integral domain. Then the following are equivalent:

(1) R is a Dedekind domain.

(2) For every prime ideal p ⊂ R, the local ring Rp is a discrete valuation ring.

(3) The fractional ideals of R form a group.

(4) For every fractional ideal I ⊂ R there is an ideal J ⊂ R such that IJ = R.

Proof. See VIII.6.10 in [14].

In the case of OK , there are some important groups that arise from fractional

ideals, the most important being the class group.

Definition. Let IK denote the group of fractional ideals of OK and let PK denote

the subgroup of all principal fractional ideals of OK :

PK = {αOK | α ∈ K∗}.

Then the quotient IK/PK is called the ideal class group of K, denoted C(OK).

In Section 1.7 we explore this group fully. A major result we will prove is

Theorem. C(OK) is a finite group.


1.3 Ramification of Primes

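Let $K$ be a number field and suppose $L/K$ is any finite extension. If $\mathfrak{p}$ is a nonzero prime ideal of $\mathcal{O}_K$ then $\mathfrak{p}\mathcal{O}_L$ is an ideal of $\mathcal{O}_L$ and hence has prime factorization

\[ \mathfrak{p}\mathcal{O}_L = \mathfrak{P}_1^{e_1} \cdots \mathfrak{P}_g^{e_g} \]

where the $\mathfrak{P}_i$ are the distinct prime ideals of $\mathcal{O}_L$ containing $\mathfrak{p}$. We will sometimes say a prime $\mathfrak{P}_i$ lies over $\mathfrak{p}$, $\mathfrak{P}_i$ contains $\mathfrak{p}$ or $\mathfrak{P}_i$ divides $\mathfrak{p}\mathcal{O}_L$.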

Definition. For each Pi, the integer ei is called the ramification index of p in Pi.

If any of these are greater than 1, p is said to ramify in L.

Definition. Each ideal Pi lying over p gives a residue field extension OL/Pi ⊃ OK/p.

The degree of this extension, denoted fi, is called the inertial degree of p in Pi.

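Definition. A prime $\mathfrak{p}$ is said to split completely in $L$ if $e_i = f_i = 1$ for all $\mathfrak{P}_i$ in the prime factorization of $\mathfrak{p}\mathcal{O}_L$. If instead $\mathfrak{p}\mathcal{O}_L$ is itself a prime ideal, i.e. $g = 1$ and $e_1 = 1$, we say $\mathfrak{p}$ is inert.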

The set of all prime ideals of a ring R together with (0) is called the spectrum

of R, denoted Spec(R). We will also occasionally use Spec(p) to denote the set of

primes P ⊂ OL lying over p ⊂ OK .

Example 1.3.1. In $\mathbb{Z}[i]$, $(2) = (1 + i)^2$ so (2) ramifies with $e_1 = 2$. By contrast, (3) is inert in $\mathbb{Q}(i)$ with residue field $\mathbb{Z}[i]/(3) \cong \mathbb{F}_9$, and $(5) = (2 + i)(2 - i)$ is unramified.
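The three behaviors in Example 1.3.1 can be read off from the roots of $x^2 + 1$ modulo $p$. As a rough sketch (assuming the standard fact that $\mathbb{Z}[i]$ is the full ring of integers, so that the factorization of $x^2 + 1$ mod $p$ mirrors that of $(p)$ even at $p = 2$; the function name is hypothetical), the Python code below returns the triple $(e, f, g)$ for a rational prime $p$.

```python
def splitting_type_gaussian(p):
    """Return (e, f, g) for the rational prime p in Z[i],
    read off from the roots of x^2 + 1 modulo p."""
    roots = [x for x in range(p) if (x * x + 1) % p == 0]
    if len(roots) == 2:
        return (1, 1, 2)    # two distinct roots: (p) is a product of two primes
    if len(roots) == 1:
        return (2, 1, 1)    # a double root: (p) is the square of a prime
    return (1, 2, 1)        # no roots: x^2 + 1 is irreducible mod p, (p) is inert

types = {p: splitting_type_gaussian(p) for p in (2, 3, 5, 7, 13)}
```

In every case the product $e \cdot f \cdot g$ equals $2 = [\mathbb{Q}(i) : \mathbb{Q}]$.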

The next lemma characterizes the primes of OL which divide pOL.

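Lemma 1.3.2. A prime ideal $\mathfrak{P} \subset \mathcal{O}_L$ divides $\mathfrak{p}\mathcal{O}_L$ if and only if $\mathfrak{p} = \mathfrak{P} \cap K$.

Proof. ($\Rightarrow$) Clearly $\mathfrak{p} \subset \mathfrak{P} \cap K \neq \mathcal{O}_K$. Since $\mathfrak{p}$ is maximal, this implies $\mathfrak{p} = \mathfrak{P} \cap K$.

($\Leftarrow$) If $\mathfrak{p} \subset \mathfrak{P}$ then we have seen that $\mathfrak{p}\mathcal{O}_L \subset \mathfrak{P}$ and this implies that $\mathfrak{P}$ occurs in the prime factorization of $\mathfrak{p}\mathcal{O}_L$.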

There is an important relation between the ramification indices, inertial degrees

and number of primes in Spec(p) that is described in the next theorem, known as the

efg theorem.

Theorem 1.3.3. Let $m = [L : K]$ and let $\mathfrak{P}_1, \ldots, \mathfrak{P}_g$ be the prime $\mathcal{O}_L$-ideals containing $\mathfrak{p} \subset \mathcal{O}_K$. Then

\[ \sum_{i=1}^{g} e_i f_i = m. \]

Furthermore, if $L/K$ is Galois, then all the ramification indices are equal to $e = e_1$ and all the inertial degrees are equal to $f = f_1$, so $efg = m$.

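Proof. The first statement is proven by showing both sides are equal to $[\mathcal{O}_L/\mathfrak{p}\mathcal{O}_L : \mathcal{O}_K/\mathfrak{p}]$. By the Chinese remainder theorem,

\[ \frac{\mathcal{O}_L}{\mathfrak{p}\mathcal{O}_L} = \frac{\mathcal{O}_L}{\prod \mathfrak{P}_i^{e_i}} \cong \prod \frac{\mathcal{O}_L}{\mathfrak{P}_i^{e_i}}. \]

For each $i = 1, \ldots, g$, $f_i$ is the degree of the extension $\mathcal{O}_L/\mathfrak{P}_i \supset \mathcal{O}_K/\mathfrak{p}$, and for each $r_i$, $\mathfrak{P}_i^{r_i}/\mathfrak{P}_i^{r_i+1}$ is an $\mathcal{O}_L/\mathfrak{P}_i$-module. Since there is no ideal strictly between $\mathfrak{P}_i^{r_i}$ and $\mathfrak{P}_i^{r_i+1}$ (the localization $(\mathcal{O}_L)_{\mathfrak{P}_i}$ is a DVR), this module has dimension 1 as an $\mathcal{O}_L/\mathfrak{P}_i$-vector space, and hence dimension $f_i$ as an $\mathcal{O}_K/\mathfrak{p}$-vector space. Therefore each quotient in the chain

\[ \mathcal{O}_L \supset \mathfrak{P}_i \supset \mathfrak{P}_i^2 \supset \cdots \supset \mathfrak{P}_i^{e_i} \]

has dimension $f_i$ over $\mathcal{O}_K/\mathfrak{p}$. Thus $[\mathcal{O}_L/\mathfrak{P}_i^{e_i} : \mathcal{O}_K/\mathfrak{p}] = e_i f_i$. This shows that the left side equals $[\mathcal{O}_L/\mathfrak{p}\mathcal{O}_L : \mathcal{O}_K/\mathfrak{p}]$.

For the other equality, we first prove it when $\mathcal{O}_L$ is a free $\mathcal{O}_K$-module (e.g. when $\mathcal{O}_K$ is a PID). On one hand, an isomorphism $\mathcal{O}_K^n \xrightarrow{\cong} \mathcal{O}_L$ induces an isomorphism $K^n \to L$ which shows that $n = m$. On the other hand, it also induces an isomorphism $(\mathcal{O}_K/\mathfrak{p})^n \to \mathcal{O}_L/\mathfrak{p}\mathcal{O}_L$ which shows that $m = n = [\mathcal{O}_L/\mathfrak{p}\mathcal{O}_L : \mathcal{O}_K/\mathfrak{p}]$. In the general case, localize $\mathcal{O}_K$ at $\mathfrak{p}$ to obtain a DVR $\mathcal{O}'_K = (\mathcal{O}_K)_{\mathfrak{p}}$. Since a DVR is always a PID, $\mathcal{O}'_L = (\mathcal{O}_L)_{\mathfrak{p}}$ is a free $\mathcal{O}'_K$-module and satisfies

\[ \mathfrak{p}\mathcal{O}'_L = \prod (\mathfrak{P}_i\mathcal{O}'_L)^{e_i} \]

so $[\mathcal{O}'_L/\mathfrak{p}\mathcal{O}'_L : \mathcal{O}'_K/\mathfrak{p}\mathcal{O}'_K] = m$. This completes the first part of the proof.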

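Now assume $L$ is Galois over $K$. Take $\sigma \in G = \mathrm{Gal}(L/K)$. Then if $\mathfrak{P} \subset \mathcal{O}_L$ is a prime ideal, so is $\sigma(\mathfrak{P})$. Moreover, if $\mathfrak{P}$ contains $\mathfrak{p}$ then by Lemma 1.3.2 so must $\sigma(\mathfrak{P})$. Clearly $e(\sigma(\mathfrak{P}) \mid \mathfrak{p}) = e(\mathfrak{P} \mid \mathfrak{p})$ and $f(\sigma(\mathfrak{P}) \mid \mathfrak{p}) = f(\mathfrak{P} \mid \mathfrak{p})$.

To complete the proof, we will show that $G$ acts transitively on $\mathrm{Spec}(\mathfrak{p})$, the set of prime ideals of $\mathcal{O}_L$ lying over $\mathfrak{p}$. Suppose $\mathfrak{P}$ and $\mathfrak{Q}$ both contain $\mathfrak{p}$ but are not Galois conjugates. By the Chinese remainder theorem we can find an element $\beta \in \mathfrak{Q}$ that does not lie in $\sigma(\mathfrak{P})$ for any $\sigma \in G$. Define $b = N(\beta) = \prod_{\sigma \in G} \sigma(\beta)$, where $N$ denotes the norm (see Section 1.5). Then $b \in \mathcal{O}_K$ and since $\beta \in \mathfrak{Q}$, $b \in \mathfrak{Q}$ as well. Thus $b \in \mathcal{O}_K \cap \mathfrak{Q} = \mathfrak{p} \subset \mathfrak{P}$. On the other hand, $\beta \notin \sigma^{-1}(\mathfrak{P})$ for any $\sigma \in G$ so $\sigma(\beta) \notin \mathfrak{P}$ for every $\sigma$. Thus the product $b$ of the elements $\sigma(\beta)$ lies in $\mathfrak{P}$ although none of its factors does, which contradicts primality of $\mathfrak{P}$. Hence $\mathrm{Gal}(L/K)$ acts transitively on the primes containing $\mathfrak{p}$ and the result follows by the preceding paragraph since $e$ and $f$ are invariant under $\mathrm{Gal}(L/K)$.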

As we saw in Example 1.2.11, it is hardly easy to determine the factorization of

ideals in a number field. The next theorem will be of immense importance going

forward, as it allows us to describe the splitting behavior of a prime p ⊂ OK as we

pass to an extension L/K.


Theorem 1.3.4. Let L/K be Galois, where L = K(α) for some α ∈ OL. Let

f(x) ∈ OK [x] be the minimal polynomial of α over K. Suppose p is a prime ideal of

OK and f(x) is separable mod p. Then

(1) p is unramified in L.

(2) If f(x) ≡ f1(x) · · · fg(x) mod p for distinct fi(x) which are irreducible mod p,

then Pi = pOL + fi(α)OL is a prime ideal of OL, and the prime factorization

of pOL is

pOL = P1 · · ·Pg.

Furthermore, deg fi = f(Pi | p) for all i, and since L/K is Galois, these are

all the same.

(3) p splits completely in L ⇐⇒ f(x) ≡ 0 mod p has a solution in OK.

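Proof. (1) and (3) will follow during the course of proving (2). To prove (2), observe that since $f(x)$ is separable mod $\mathfrak{p}$, $f(x) \equiv f_1(x) \cdots f_g(x)$ mod $\mathfrak{p}$ for distinct, irreducible (mod $\mathfrak{p}$) polynomials $f_i(x)$. If $\mathfrak{P} \subset \mathcal{O}_L$ is a prime lying over $\mathfrak{p}$, then $f_i(\alpha) \in \mathfrak{P}$ for some $i$; we may relabel the $f_i$ so that $f_1(\alpha) \in \mathfrak{P}$. Then the image of $\alpha$ in $\mathcal{O}_L/\mathfrak{P}$ is a root of $f_1$, which is irreducible mod $\mathfrak{p}$, so

\[ [\mathcal{O}_L/\mathfrak{P} : \mathcal{O}_K/\mathfrak{p}] \geq \deg f_1. \]

Now for any $\sigma \in \mathrm{Gal}(L/K)$ such that $\sigma(\mathfrak{P}) = \mathfrak{P}$, $f_1(\sigma(\alpha)) \in \mathfrak{P}$ and $f_1(x)$ is separable by hypothesis, so $\deg f_1 \geq ef$, where $e$ and $f$ are the ramification index and inertial degree, respectively, of $\mathfrak{p}$ in $\mathfrak{P}$. This shows that $e = 1$ and $f = \deg f_1$, so (1) is proved.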

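Now let $\mathfrak{p}\mathcal{O}_L = \mathfrak{P}_1 \cdots \mathfrak{P}_g$ be the prime factorization of $\mathfrak{p}\mathcal{O}_L$ into prime ideals of $\mathcal{O}_L$. Theorem 1.3.3 implies that $\deg f_i = f$ for all $i$, so it remains to prove that each $\mathfrak{P}_i$ is generated by $\mathfrak{p}$ and $f_i(\alpha)$. On one hand, $\mathfrak{p}\mathcal{O}_L + f_i(\alpha)\mathcal{O}_L$ is contained in $\mathfrak{P}_i$ since $f_i(\alpha) \in \mathfrak{P}_i$ (reindexing if necessary). On the other hand,

\[ \prod_{i=1}^{g} \left(\mathfrak{p}\mathcal{O}_L + f_i(\alpha)\mathcal{O}_L\right) \subset \mathfrak{p}\mathcal{O}_L. \]

Each ideal on the left is contained in a prime ideal in the factorization of $\mathfrak{p}\mathcal{O}_L$, and this must be $\mathfrak{P}_i$ for each $i$. This completes the proof of (2).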

We will develop further techniques for deciding when a prime ramifies/splits/stays

inert in Section 1.6. For the moment, we do not even know if there are an infinite

number of primes splitting in an extension L/K; this question will finally be given

an answer in Section 2.10.

Example 1.3.5. In this example we provide a full characterization of the splitting behavior of primes in quadratic extensions. Suppose $K = \mathbb{Q}(\sqrt{n})$ where $n$ is a squarefree integer. Then $K/\mathbb{Q}$ is Galois, so for each prime $p \in \mathbb{Z}$ we have $2 = efg$ by Theorem 1.3.3. There are exactly three possibilities for $e$, $f$ and $g$:

• $e = 2$ and $f = g = 1$. In this case $p$ ramifies in $\mathcal{O}_K$ so $p\mathcal{O}_K = \mathfrak{P}^2$ for some prime ideal $\mathfrak{P}$. It turns out that there are only finitely many such primes: by (1) of the previous theorem, an odd prime $p$ can ramify in $K$ only if $x^2 - n \equiv 0 \pmod{p}$ has a multiple root, i.e. only if $p \mid n$. This ties in with the idea that the discriminant of a polynomial detects repeated roots; in Section 1.6, we will see that the connection between ramification and discriminants runs even deeper.

• $f = 2$ and $e = g = 1$. In this case $p$ is inert, so $p\mathcal{O}_K$ is prime. It turns out that this happens for half of all primes, in the sense of density (minus the finitely many cases when a prime ramifies).

• $g = 2$ and $e = f = 1$. Here $p$ splits completely in $\mathcal{O}_K$, so $p\mathcal{O}_K = \mathfrak{P}_1\mathfrak{P}_2$ for prime ideals $\mathfrak{P}_1 \neq \mathfrak{P}_2$. This happens for the other half of the primes.
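The trichotomy above can be explored numerically for odd primes. The sketch below (Python, with hypothetical function names) classifies an odd prime $p$ in $\mathbb{Q}(\sqrt{n})$ using Euler's criterion to decide whether $n$ is a square mod $p$; it assumes $n$ is squarefree and deliberately leaves out $p = 2$, which needs the separate treatment given later via the discriminant.

```python
def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def quadratic_type(n, p):
    """Splitting type of an odd prime p in Q(sqrt(n)), n squarefree."""
    if n % p == 0:
        return "ramified"
    # Euler's criterion: n^((p-1)/2) = 1 mod p iff n is a nonzero square mod p
    return "split" if pow(n % p, (p - 1) // 2, p) == 1 else "inert"

def tally(n, bound):
    """Count odd primes p < bound by their splitting type in Q(sqrt(n))."""
    counts = {"ramified": 0, "split": 0, "inert": 0}
    for p in range(3, bound, 2):
        if is_prime(p):
            counts[quadratic_type(n, p)] += 1
    return counts

counts = tally(-5, 100)
```

Running `tally` with larger bounds gives an informal feel for the claim that split and inert primes each make up about half of all primes.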

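Definition. For a quadratic field $K = \mathbb{Q}(\sqrt{n})$ with $n$ squarefree, the discriminant of $K$ is

\[ d_K = \begin{cases} n & \text{if } n \equiv 1 \pmod{4} \\ 4n & \text{otherwise.} \end{cases} \]

For any integer $q \equiv 0$ or $1 \pmod{4}$ we also define the Kronecker symbol at 2 by

\[ \left(\frac{q}{2}\right) = \begin{cases} 0 & \text{if } q \equiv 0 \pmod{4} \\ 1 & \text{if } q \equiv 1 \pmod{8} \\ -1 & \text{if } q \equiv 5 \pmod{8}. \end{cases} \]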

As a consequence of the above description of primes in $\mathcal{O}_K$, where $K = \mathbb{Q}(\sqrt{n})$, we have the following criterion for the splitting of primes in a quadratic extension.

Proposition 1.3.6. A prime $p$ ramifies in $K = \mathbb{Q}(\sqrt{n})$ if and only if $p \mid d_K$, and $p$ splits completely in $K$ if and only if $\left(\frac{d_K}{p}\right) = 1$.
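Proposition 1.3.6 translates directly into a small computation. The following Python sketch is illustrative only (the function names are hypothetical); it computes $d_K$ from a squarefree $n$ and evaluates the symbol $\left(\frac{d_K}{p}\right)$, using the definition above at $p = 2$ and Euler's criterion for odd $p$.

```python
def field_discriminant(n):
    """Discriminant of Q(sqrt(n)) for a squarefree integer n."""
    return n if n % 4 == 1 else 4 * n

def kronecker(d, p):
    """Symbol (d/p) for a prime p and an integer d = 0 or 1 (mod 4)."""
    if p == 2:
        if d % 2 == 0:
            return 0
        return 1 if d % 8 == 1 else -1     # here d % 8 is either 1 or 5
    r = pow(d % p, (p - 1) // 2, p)        # Euler's criterion for odd p
    if r == 0:
        return 0
    return 1 if r == 1 else -1

def splitting(n, p):
    """Behavior of the prime p in Q(sqrt(n)) according to Proposition 1.3.6."""
    symbol = kronecker(field_discriminant(n), p)
    return {0: "ramified", 1: "split", -1: "inert"}[symbol]
```

For instance, `splitting(-1, 2)` reports that 2 ramifies in $\mathbb{Q}(i)$, matching Example 1.3.1.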

The first statement follows from the general case in Section 1.6 and the second is a consequence of (3) of Theorem 1.3.4, since for an odd prime $p$ not dividing $n$, $\left(\frac{4n}{p}\right) = \left(\frac{n}{p}\right) = 1$ if and only if $x^2 - n \equiv 0 \pmod{p}$ for some integer $x$. For now, let’s take a look at a familiar example.

Example 1.3.7. Let K = Q(i) and recall that the Gaussian integers Z[i] are the ring

of integers for K. In this example we will describe the splitting behavior of primes in

$\mathbb{Z}[i]$. From the last few results, we claim that for an odd prime integer $p$ the following are equivalent:

(i) p ≡ 1 (mod 4).

(ii) (p) splits completely in Z[i].

(iii) $p = x^2 + y^2$ for some integers $x, y$.

Proof. To prove our claim, note that $\mathbb{Z}[i]$ is the ring of integers for $K = \mathbb{Q}(i)$, so we may take the $\alpha$ in Theorem 1.3.4 to be $i$, which has minimal polynomial $x^2 + 1$ over $\mathbb{Q}$. Thus $(p)$ splits completely in $\mathbb{Z}[i]$ if and only if $x^2 + 1$ splits mod $p$. This in turn happens if and only if $\mathbb{F}_p$ contains a primitive fourth root of unity, i.e. $\mathbb{F}_p^\times$ contains an element of order 4. Since $\mathbb{F}_p^\times$ is cyclic of order $p - 1$, this is equivalent to $4 \mid p - 1$, and so (i) ⇐⇒ (ii) is proven.

Next suppose $(p)$ splits in $\mathbb{Z}[i]$; let $(p) = \mathfrak{p}_1\mathfrak{p}_2$ for prime ideals $\mathfrak{p}_1, \mathfrak{p}_2 \subset \mathbb{Z}[i]$. In Example 1.7.2, we will prove that the ring of Gaussian integers $\mathbb{Z}[i]$ is a PID. Using this fact, we know $\mathfrak{p}_1 = (x + yi)$ for integers $x$ and $y$, but then $\mathfrak{p}_2$ must be $(x - yi)$. Therefore $p = x^2 + y^2$ up to multiplication by a unit in $\mathbb{Z}[i]$. However, the only units are $\pm 1, \pm i$, and both $p$ and $x^2 + y^2$ are positive integers, so $p = x^2 + y^2$. Conversely, if $p = x^2 + y^2$ then $p = (x + yi)(x - yi)$ in $\mathbb{Z}[i]$.

Note that this proves Fermat’s theorem characterizing primes of the form $x^2 + y^2$. It will be a continuing theme in these notes to fully characterize primes of the form $x^2 + ny^2$ for all integers $n$.
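The equivalences in Example 1.3.7 are easy to test numerically. The following is a minimal sketch, not part of the original development: the helper names `legendre`, `splitting_type`, `two_squares` and `odd_primes_below` are hypothetical, and the splitting test assumes the criterion of Proposition 1.3.6 for odd primes, applied to the discriminant $-4$ of $\mathbb{Q}(i)$.

```python
from math import isqrt

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)   # r is 0, 1 or p - 1
    return -1 if r == p - 1 else r

def splitting_type(d, p):
    """Behavior of an odd prime p in the quadratic field of discriminant d."""
    return {0: "ramified", 1: "split", -1: "inert"}[legendre(d, p)]

def two_squares(p):
    """Return (x, y) with x*x + y*y == p and 0 < x <= y, or None if impossible."""
    x = 1
    while 2 * x * x <= p:
        y = isqrt(p - x * x)
        if x * x + y * y == p:
            return (x, y)
        x += 1
    return None

def odd_primes_below(bound):
    """Odd primes less than bound, by trial division."""
    return [p for p in range(3, bound, 2)
            if all(p % q for q in range(3, isqrt(p) + 1, 2))]

# Q(i) has discriminant -4.  Compare (i), (ii) and (iii) of Example 1.3.7.
for p in odd_primes_below(200):
    one_mod_four = (p % 4 == 1)
    splits = (splitting_type(-4, p) == "split")
    sum_of_squares = (two_squares(p) is not None)
    assert one_mod_four == splits == sum_of_squares
```

The same `splitting_type` function can be called with any quadratic discriminant, for instance `splitting_type(-20, 5)` for the prime 5 in $\mathbb{Q}(\sqrt{-5})$.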

1.4 The Decomposition and Inertia Groups

In this section we describe two important subgroups of Gal(L/K) for a Galois exten-

sion L/K of number fields.

Definition. For a Galois extension L/K and a prime ideal P ⊂ OL lying over

p ⊂ OK , the decomposition group of P is

DP = {σ ∈ Gal(L/K) | σ(P) = P}

and the inertia group of P is

IP = {σ ∈ Gal(L/K) | σ(α) ≡ α mod P for all α ∈ OL}.

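Let $k = \mathcal{O}_K/\mathfrak{p}$ and $\ell = \mathcal{O}_L/\mathfrak{P}$ denote the respective residue fields of $\mathfrak{p}$ and $\mathfrak{P}$. We will prove that there is an exact sequence

$$1 \to I_{\mathfrak{P}} \to D_{\mathfrak{P}} \to \mathrm{Gal}(\ell/k) \to 1.$$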


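Recall from the proof of Theorem 1.3.3 that $G = \mathrm{Gal}(L/K)$ acts transitively on $\mathrm{Spec}(\mathfrak{p})$, the set of primes of $\mathcal{O}_L$ lying over $\mathfrak{p}$. Then we can interpret $D_{\mathfrak{P}}$ as the stabilizer of $\mathfrak{P}$ under this action. The Orbit-Stabilizer Theorem tells us that $[G : D_{\mathfrak{P}}] = g$, where $g$ is the number of distinct primes in the factorization of $\mathfrak{p}\mathcal{O}_L$. Since $|G| = [L : K] = efg$, it follows that $|D_{\mathfrak{P}}| = ef$.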

Lemma 1.4.1. For a fixed prime ideal p ⊂ OK, the decomposition groups DP of the

prime ideals lying over p are conjugate subgroups of Gal(L/K).

Proof. This is a more general fact about the stabilizers of a transitive group action.

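Note that for $\sigma, \tau \in \mathrm{Gal}(L/K)$,

$$\tau^{-1}\sigma\tau \in D_{\mathfrak{P}} \iff \tau^{-1}\sigma\tau\mathfrak{P} = \mathfrak{P} \iff \sigma\tau\mathfrak{P} = \tau\mathfrak{P} \iff \sigma \in D_{\tau\mathfrak{P}},$$

which implies that $\sigma \in D_{\mathfrak{P}} \iff \tau\sigma\tau^{-1} \in D_{\tau\mathfrak{P}}$. Hence $\tau D_{\mathfrak{P}}\tau^{-1} = D_{\tau\mathfrak{P}}$.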

The decomposition group is useful because we can view an extension L/K as a

tower of extensions so that we understand the splitting of primes better in each step

of the tower.

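Proposition 1.4.2. Let $L/K$ be a Galois extension and fix a prime $\mathfrak{p} \subset \mathcal{O}_K$. Let $D = D_{\mathfrak{P}}$ be the decomposition group for a particular prime $\mathfrak{P}$ lying over $\mathfrak{p}$. Then the fixed field

$$L^D = \{\alpha \in L \mid \sigma(\alpha) = \alpha \text{ for all } \sigma \in D\}$$

is the smallest subfield $E$ of $L$ containing $K$ such that $g = 1$ for $\mathfrak{P} \cap \mathcal{O}_E$.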

Proof. First suppose $E = L^D$. By Galois theory, $\mathrm{Gal}(L/E) = D$, and as in the last section, $D$ acts transitively on the set of primes of $\mathcal{O}_L$ lying over $\mathfrak{P}_E := \mathfrak{P} \cap \mathcal{O}_E$. One of these primes is $\mathfrak{P}$ itself, and $D$ fixes $\mathfrak{P}$ by definition, so this must be the only prime lying over $\mathfrak{P}_E$, i.e. $g = 1$.

On the other hand, if $g = 1$ for $\mathfrak{P}_E$ then $\mathrm{Gal}(L/E)$ fixes $\mathfrak{P}$, since it is the only prime over $\mathfrak{P}_E$. So $\mathrm{Gal}(L/E) \leq D$ and by the Galois correspondence, $L^D \subset E$.


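This shows that the prime $\mathfrak{P} \cap \mathcal{O}_{L^D}$ does not split when moving from $L^D$ to $L$: it either ramifies or stays inert. Let $E = L^D$ and denote $\mathfrak{P} \cap \mathcal{O}_E$ by $\mathfrak{P}_E$, $e = e(\mathfrak{P} \mid \mathfrak{p})$, $f = f(\mathfrak{P} \mid \mathfrak{p})$ and $g = g(\mathfrak{P} \mid \mathfrak{p})$. To piece together more of the puzzle, we have the following.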

Proposition 1.4.3. Given K,L,E, p,P and PE as above, e(PE | p) = f(PE | p) =

1, g = [E : K], e = e(P | PE) and f = f(P | PE).

Proof. As mentioned in the remarks preceding Lemma 1.4.1, the Orbit-Stabilizer

Theorem implies that g = [Gal(L/K) : D]. Then by Galois theory, [Gal(L/K) : D] =

[E : K], so this equals g.

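The previous proposition gives us $g(\mathfrak{P} \mid \mathfrak{P}_E) = 1$, and by Theorem 1.3.3 we have

$$e(\mathfrak{P} \mid \mathfrak{P}_E)\, f(\mathfrak{P} \mid \mathfrak{P}_E) = [L : E] = \frac{[L : K]}{[E : K]} = \frac{efg}{[E : K]} = ef.$$

Now $e(\mathfrak{P} \mid \mathfrak{P}_E) \leq e$ and $f(\mathfrak{P} \mid \mathfrak{P}_E) \leq f$, so we must have that $e(\mathfrak{P} \mid \mathfrak{P}_E) = e$ and $f(\mathfrak{P} \mid \mathfrak{P}_E) = f$. Since $e$ and $f$ are multiplicative in towers, it follows that $e(\mathfrak{P}_E \mid \mathfrak{p})$ and $f(\mathfrak{P}_E \mid \mathfrak{p})$ are both 1.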

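Fix a prime $\mathfrak{P} \subset \mathcal{O}_L$ lying over $\mathfrak{p} \subset \mathcal{O}_K$. Observe that each $\sigma \in D_{\mathfrak{P}}$ acts on the finite field $\ell = \mathcal{O}_L/\mathfrak{P}$ and fixes $k = \mathcal{O}_K/\mathfrak{p}$, so we obtain a group homomorphism

$$\varphi : D_{\mathfrak{P}} \longrightarrow \mathrm{Gal}(\ell/k).$$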

The next two results establish that ϕ is surjective, which we will use to prove exactness

of the sequence described at the start of this section.

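Lemma 1.4.4. The residue field extension $\ell/k$ is Galois.

Proof. First we show that $\ell/k$ is normal. To do this, take any $\bar{\alpha} \in \ell$ and let $m(x)$ be its minimal polynomial over $k$. Let $\alpha \in \mathcal{O}_L$ be a lift of $\bar{\alpha}$. Then

$$f(x) = \prod_{\sigma \in \mathrm{Gal}(L/K)} (x - \sigma(\alpha)) \in \mathcal{O}_K[x]$$

splits completely over $\mathcal{O}_L$, and its reduction mod $\mathfrak{P}$ lies in $k[x]$ and has $\bar{\alpha}$ as a root. Hence $m(x)$ divides this reduction and so splits completely over $\ell$. Thus $\ell/k$ is normal. Furthermore, $\ell/k$ will be Galois whenever it is separable, but since $\mathcal{O}_K/\mathfrak{p}$ is a finite field, it is perfect and therefore any finite extension is separable [10].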

Proposition 1.4.5. ϕ is surjective.

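Proof. By the lemma, $\ell/k$ is Galois, and since it is a finite separable extension we may write $\ell = k(\bar{\alpha})$ for some $\bar{\alpha} \in \ell$. We will show that $\varphi(D_{\mathfrak{P}})$ acts transitively on the conjugates of $\bar{\alpha}$ over $k$. By the Chinese remainder theorem, one may choose $\alpha \in \mathcal{O}_L$ such that

$$\alpha \equiv \begin{cases} \bar{\alpha} & \bmod\ \mathfrak{P} \\ 0 & \bmod\ \mathfrak{P}' \text{ for any other } \mathfrak{P}' \text{ lying over } \mathfrak{p}. \end{cases}$$

Then for any $\sigma \in G \smallsetminus D_{\mathfrak{P}}$ we have $\alpha \equiv 0 \bmod \sigma^{-1}\mathfrak{P}$ and hence $\sigma(\alpha) \equiv 0 \bmod \mathfrak{P}$. This implies that the reduction mod $\mathfrak{P}$ of $f(x) = \prod_{\sigma \in G} (x - \sigma(\alpha)) \in \mathcal{O}_K[x]$ is

$$\bar{f}(x) = \prod_{\sigma \in D_{\mathfrak{P}}} \left(x - \overline{\sigma(\alpha)}\right) \prod_{\sigma \notin D_{\mathfrak{P}}} x = \prod_{\sigma \in D_{\mathfrak{P}}} \left(x - \varphi(\sigma)(\bar{\alpha})\right) \prod_{\sigma \notin D_{\mathfrak{P}}} x,$$

which lies in $k[x]$. Notice that the first product then also lies in $k[x]$ and has $\bar{\alpha}$ as a root, so it is divisible by the minimal polynomial of $\bar{\alpha}$ over $k$. So given any conjugate $\bar{\alpha}'$ of $\bar{\alpha}$, $(x - \bar{\alpha}')$ divides the first product above and thus $\bar{\alpha}'$ must equal $\varphi(\sigma)(\bar{\alpha})$ for some $\sigma \in D_{\mathfrak{P}}$. Hence the action of $\varphi(D_{\mathfrak{P}})$ on the conjugates of $\bar{\alpha}$ is transitive, and it follows that $\varphi$ is surjective since the image has at least $[\ell : k] = |\mathrm{Gal}(\ell/k)|$ elements.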

Next we relate the inertia group IP to the map ϕ, and use it to prove that the

original sequence we defined is exact.


Proposition 1.4.6. The inertia group $I_{\mathfrak{P}}$ is the kernel of $\varphi : D_{\mathfrak{P}} \to \mathrm{Gal}(\ell/k)$.

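Proof. By definition

$$\ker\varphi = \{\sigma \in D_{\mathfrak{P}} \mid \sigma(\alpha) \equiv \alpha \bmod \mathfrak{P} \text{ for all } \alpha \in \mathcal{O}_L\}$$

so it suffices to show that if $\sigma \notin D_{\mathfrak{P}}$ then there exists an $\alpha \in \mathcal{O}_L$ such that $\sigma(\alpha) \not\equiv \alpha \bmod \mathfrak{P}$. If $\sigma \notin D_{\mathfrak{P}}$ then of course $\sigma^{-1} \notin D_{\mathfrak{P}}$, so $\sigma^{-1}(\mathfrak{P}) \neq \mathfrak{P}$. Since both $\sigma^{-1}(\mathfrak{P})$ and $\mathfrak{P}$ are maximal ideals, there exists some $\alpha \in \mathfrak{P}$ with $\alpha \notin \sigma^{-1}(\mathfrak{P})$, which implies $\sigma(\alpha) \notin \mathfrak{P}$. Thus $\sigma(\alpha) \not\equiv \alpha \bmod \mathfrak{P}$ and it follows that $I_{\mathfrak{P}} = \ker\varphi$.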

We now summarize our findings.

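Corollary 1.4.7. If $L/K$ is a Galois extension, the sequence

$$1 \to I_{\mathfrak{P}} \to D_{\mathfrak{P}} \to \mathrm{Gal}(\ell/k) \to 1$$

is exact. Moreover, $|I_{\mathfrak{P}}| = e$ and $|D_{\mathfrak{P}}| = ef$, where $e$ and $f$ are the ramification index and inertial degree, respectively, of $\mathfrak{P}$ over $\mathfrak{p}$.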

Notice that the inertia group is a very useful measure of how a prime p ramifies

in a Galois extension L. This is a common theme in algebraic number theory: the

behavior of primes in an extension is often encoded in the automorphisms of the field

itself.
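To see Corollary 1.4.7 in a concrete case, one can compute both groups for $L = \mathbb{Q}(i)$ over $K = \mathbb{Q}$, where the Galois group consists of the identity and complex conjugation. The sketch below is illustrative only: Gaussian integers are stored as pairs `(a, b)` meaning $a + bi$, the function names are hypothetical, and the generators $1 + i$, $3$ and $2 + i$ of primes over 2, 3 and 5 are chosen by hand.

```python
def sub(z, w):
    """Difference of two Gaussian integers given as pairs."""
    return (z[0] - w[0], z[1] - w[1])

def divides(pi, z):
    """True if the Gaussian integer pi divides z in Z[i]."""
    (c, d), (a, b) = pi, z
    n = c * c + d * d                  # norm of pi
    # z / pi = z * conj(pi) / n, so pi | z iff n divides both coordinates
    return (a * c + b * d) % n == 0 and (b * c - a * d) % n == 0

GALOIS = {
    "id": lambda z: z,
    "conj": lambda z: (z[0], -z[1]),
}
Z_BASIS = [(1, 0), (0, 1)]             # Z-basis {1, i} of Z[i]

def decomposition_group(pi):
    """Automorphisms sigma with sigma(P) = P, where P = (pi) is prime."""
    return [name for name, s in GALOIS.items() if divides(pi, s(pi))]

def inertia_group(pi):
    """Automorphisms acting trivially on Z[i]/P; checking a Z-basis suffices."""
    return [name for name, s in GALOIS.items()
            if all(divides(pi, sub(s(b), b)) for b in Z_BASIS)]

for p, pi in [(2, (1, 1)), (3, (3, 0)), (5, (2, 1))]:
    print(p, decomposition_group(pi), inertia_group(pi))
```

For the ramified prime 2 both groups have order 2 (so $e = 2$, $f = 1$), for the inert prime 3 the decomposition group has order 2 and the inertia group is trivial ($e = 1$, $f = 2$), and for the split prime 5 both groups are trivial, in agreement with $|I_{\mathfrak{P}}| = e$ and $|D_{\mathfrak{P}}| = ef$.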

1.5 Norms of Ideals

In this section we define the norm of an ideal. As in previous sections, all of these

definitions and results generalize to any Dedekind domain A with integral closure B

– see [19] for the general cases. In our context, we will replace A with OK and B with

OL, which have fields of fractions K and L, respectively.


Let $I_K$ and $I_L$ denote the groups of fractional ideals of $\mathcal{O}_K$ and $\mathcal{O}_L$, respectively. We want to define a group homomorphism $\mathcal{N} : I_L \to I_K$. Since $I_L$ is the free abelian group on the set of prime ideals in $\mathcal{O}_L$, we only have to define $\mathcal{N}(\mathfrak{P})$ for $\mathfrak{P}$ prime.

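Let $\mathfrak{p}$ be a prime ideal of $\mathcal{O}_K$ and factor

$$\mathfrak{p}\mathcal{O}_L = \prod \mathfrak{P}_i^{e_i}$$

for $\mathfrak{P}_i$ prime. Suppose $\mathfrak{p} = (\pi)$ is principal. Then we should have

$$\mathcal{N}(\mathfrak{p}\mathcal{O}_L) = \mathcal{N}(\pi\mathcal{O}_L) = N(\pi)\mathcal{O}_K = (\pi)^m = \mathfrak{p}^m$$

where $m = [L : K]$ and $N(\pi) = \pi^m$ is the field norm of $\pi \in K$. We also want $\mathcal{N}$ to be a homomorphism, so we must have

$$\mathcal{N}(\mathfrak{p}\mathcal{O}_L) = \mathcal{N}\left(\prod \mathfrak{P}_i^{e_i}\right) = \prod \mathcal{N}(\mathfrak{P}_i)^{e_i}.$$

Recall that $m = \sum e_i f_i$, so the correct definition for $\mathcal{N}$ is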

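Definition. For a prime $\mathfrak{P} \subset \mathcal{O}_L$ lying over $\mathfrak{p} \subset \mathcal{O}_K$, the norm of $\mathfrak{P}$ is defined to be

$$\mathcal{N}(\mathfrak{P}) = \mathfrak{p}^f$$

where $f = [\mathcal{O}_L/\mathfrak{P} : \mathcal{O}_K/\mathfrak{p}]$.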

To distinguish this norm from a similar norm to be defined shortly, we will sometimes refer to $\mathcal{N}$ as the ideal norm. If the norm is taken with respect to an extension $L/K$, we write $\mathcal{N}_{L/K}$, but when the context is clear we will often drop the decoration.

Remark. By the properties of inertial degree $f$, it is easy to see that for a tower $M \supset L \supset K$,

$$\mathcal{N}_{L/K}(\mathcal{N}_{M/L}(\mathfrak{a})) = \mathcal{N}_{M/K}(\mathfrak{a}).$$

Next we check that the properties discussed above hold for the norm we have

defined.


Proposition 1.5.1. Let L/K, OK and OL be as above.

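(a) For any nonzero ideal $\mathfrak{a} \subset \mathcal{O}_K$, $\mathcal{N}(\mathfrak{a}\mathcal{O}_L) = \mathfrak{a}^m$ where $m = [L : K]$.

(b) If $L/K$ is Galois and $\mathfrak{P} \subset \mathcal{O}_L$ is any nonzero prime ideal with $\mathfrak{p} = \mathfrak{P} \cap \mathcal{O}_K$ and $\mathfrak{p}\mathcal{O}_L = (\mathfrak{P}_1 \cdots \mathfrak{P}_g)^e$, then

$$\mathcal{N}(\mathfrak{P})\mathcal{O}_L = (\mathfrak{P}_1 \cdots \mathfrak{P}_g)^{ef} = \prod_{\sigma \in \mathrm{Gal}(L/K)} \sigma(\mathfrak{P}).$$

(c) For any nonzero element $\beta \in \mathcal{O}_L$, $N_{L/K}(\beta)\mathcal{O}_K = \mathcal{N}(\beta\mathcal{O}_L)$, where $N_{L/K}$ denotes the usual field norm.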

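Proof. (a) It suffices to prove this for prime ideals, for which we have

$$\mathcal{N}(\mathfrak{p}\mathcal{O}_L) = \mathcal{N}\left(\prod \mathfrak{P}_i^{e_i}\right) = \mathfrak{p}^{\sum e_i f_i} = \mathfrak{p}^m$$

using Theorem 1.3.3.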

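(b) Since $\mathcal{N}(\mathfrak{P}_i) = \mathfrak{p}^f$ for any prime $\mathfrak{P}_i$ in the prime factorization of $\mathfrak{p}\mathcal{O}_L$, the left equality is clear. Recall that $G = \mathrm{Gal}(L/K)$ acts transitively on the set $\mathrm{Spec}(\mathfrak{p}) = \{\mathfrak{P}_1, \ldots, \mathfrak{P}_g\}$. Then by the Orbit-Stabilizer Theorem, each $\mathfrak{P}_i$ occurs

$$\frac{|\mathrm{Gal}(L/K)|}{|\mathrm{Spec}(\mathfrak{p})|} = \frac{m}{g} = ef$$

times in the collection $\{\sigma(\mathfrak{P}) \mid \sigma \in G\}$, which implies the right equality.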

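(c) First suppose $L/K$ is Galois. Denote $\beta\mathcal{O}_L$ by $\mathfrak{b}$. The map $I_K \to I_L$ given by $\mathfrak{a} \mapsto \mathfrak{a}\mathcal{O}_L$ is injective since $I_K$ and $I_L$ are free on nonzero prime ideals, so it suffices to show that $N_{L/K}(\beta)\mathcal{O}_L = \mathcal{N}(\mathfrak{b})\mathcal{O}_L$. But by (b),

$$\mathcal{N}(\mathfrak{b})\mathcal{O}_L = \prod_{\sigma \in G} \sigma(\mathfrak{b}) = \prod_{\sigma \in G} (\sigma(\beta)\mathcal{O}_L) = \left(\prod_{\sigma \in G} \sigma(\beta)\right)\mathcal{O}_L = N_{L/K}(\beta)\mathcal{O}_L.$$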


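In the general case, let $E$ be a finite Galois extension of $K$ containing $L$, with $d = [E : L]$ and $\mathcal{O}_E$ the integral closure of $\mathcal{O}_L$ in $E$. Then we have

$$\begin{aligned} \mathcal{N}_{L/K}(\beta\mathcal{O}_L)^d &= \mathcal{N}_{E/K}(\beta\mathcal{O}_E) && \text{by the remark} \\ &= N_{E/K}(\beta)\mathcal{O}_K && \text{by the Galois case} \\ &= N_{L/K}(\beta)^d\mathcal{O}_K. \end{aligned}$$

Lastly, since $I_K$ is torsion-free, the above implies that $\mathcal{N}_{L/K}(\beta\mathcal{O}_L) = N_{L/K}(\beta)\mathcal{O}_K$ for all nonzero $\beta \in \mathcal{O}_L$.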

For an extension $K/\mathbb{Q}$, we define a different norm taking ideals of $\mathcal{O}_K$ to integers. We will see that the definition below coincides with the ideal norm.

Definition. Let $\mathfrak{a} \subset \mathcal{O}_K$ be a nonzero ideal. The numerical norm of $\mathfrak{a}$ is its index in the lattice $\mathcal{O}_K$: $N(\mathfrak{a}) = [\mathcal{O}_K : \mathfrak{a}]$.

In order to justify this definition, we need to check that [OK : a] is always finite.

Proposition 1.5.2. Every nonzero ideal a in OK has finite index in the lattice OK.

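Proof. Let $\mathfrak{a}$ be a nonzero $\mathcal{O}_K$-ideal. The proof of Proposition 1.2.5 shows that $\mathfrak{a}$ contains a nonzero integer $m$ ($a_0$ from that proof). So consider $\varphi : \mathcal{O}_K/m\mathcal{O}_K \to \mathcal{O}_K/\mathfrak{a}$, which is clearly surjective. By Proposition 1.1.13, $\mathcal{O}_K$ is a free $\mathbb{Z}$-module of rank $n = [K : \mathbb{Q}]$. This means that

$$\frac{\mathcal{O}_K}{m\mathcal{O}_K} \cong \frac{\mathbb{Z}^n}{m\mathbb{Z}^n}$$

is a finite quotient of order $m^n$. Since $\varphi$ is surjective, it follows that $|\mathcal{O}_K/\mathfrak{a}| \leq m^n < \infty$.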

Notice that the ideal norm is defined for any extension L/K and outputs an ideal

of OK . On the other hand, the numerical norm is defined on K/Q and outputs

an integer in Z. The connection between the two norms is described in the next

proposition.


Proposition 1.5.3. Let K be any number field.

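(a) For any ideal $\mathfrak{a} \subset \mathcal{O}_K$, $\mathcal{N}_{K/\mathbb{Q}}(\mathfrak{a}) = (N(\mathfrak{a}))$ and therefore $N(\mathfrak{a}\mathfrak{b}) = N(\mathfrak{a})N(\mathfrak{b})$.

(b) For any fractional ideals $\mathfrak{b} \subset \mathfrak{a}$ of $\mathcal{O}_K$, $[\mathfrak{a} : \mathfrak{b}] = N(\mathfrak{a}^{-1}\mathfrak{b})$.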

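Proof. (a) Write $\mathfrak{a} = \prod \mathfrak{p}_i^{e_i}$ and let $f_i = f(\mathfrak{p}_i \mid p_i)$ where $(p_i) = \mathbb{Z} \cap \mathfrak{p}_i$. Then $\mathcal{N}(\mathfrak{p}_i) = (p_i)^{f_i}$. By the Chinese remainder theorem, $\mathcal{O}_K/\mathfrak{a} \cong \prod \mathcal{O}_K/\mathfrak{p}_i^{e_i}$ and thus

$$[\mathcal{O}_K : \mathfrak{a}] = \prod [\mathcal{O}_K : \mathfrak{p}_i^{e_i}].$$

We previously proved that $[\mathcal{O}_K : \mathfrak{p}_i^{e_i}] = p_i^{e_i f_i}$, thus $[\mathcal{O}_K : \mathfrak{a}] = \prod p_i^{e_i f_i}$, which is the positive generator of $\mathcal{N}_{K/\mathbb{Q}}(\mathfrak{a})$. When we identify the set of nonzero ideals of $\mathbb{Z}$ with the set of positive integer generators, the ideal norm $\mathcal{N}_{K/\mathbb{Q}}$ and the numerical norm $N$ are seen to coincide, and multiplicativity of $N$ follows from the same property of the ideal norm.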

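(b) We can multiply by some nonzero integer $d$ to make $\mathfrak{a}$ and $\mathfrak{b}$ integral ideals. Then part (a) gives us

$$[\mathfrak{a} : \mathfrak{b}] = [d\mathfrak{a} : d\mathfrak{b}] = \frac{[\mathcal{O}_K : d\mathfrak{b}]}{[\mathcal{O}_K : d\mathfrak{a}]} = \frac{N(d\mathfrak{b})}{N(d\mathfrak{a})} = N(\mathfrak{a}^{-1}\mathfrak{b}).$$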

1.6 Discriminant and Different

One may recall the definition of discriminant from field theory. Here we present it in

the context of extensions of number fields.

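Definition. For a number field extension $L/K$ with rings of integers $\mathcal{O}_L \supset \mathcal{O}_K$, suppose $\mathcal{O}_L$ has a basis $\{\beta_1, \ldots, \beta_m\}$ over $\mathcal{O}_K$. Then the discriminant of $\mathcal{O}_L$ is

$$D(\mathcal{O}_L) = D(\beta_1, \ldots, \beta_m) = \det(\mathrm{Tr}_{L/K}(\beta_i\beta_j))$$

where Tr denotes the trace.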

In this section we use the discriminant to characterize which primes ramify in

Galois extensions of a number field.

Definition. Let D = D(OL) as above. The discriminant ideal of OL, denoted

∆(L/K) or simply ∆, is the ideal of OK generated by D.

We will prove

Theorem. The primes which ramify in OL are those that divide ∆.

First, we establish some properties of the discriminant. The details can be found

in [19].

Definition. Let $L = K(\beta)$ for some $\beta \in L$ and let $f$ be the minimal polynomial of $\beta$ over $K$, setting $\deg f = m$. Then the discriminant of $f$ is defined to be

$$D(f) := D(1, \beta, \beta^2, \ldots, \beta^{m-1}) = (-1)^{m(m-1)/2} N_{L/K}(f'(\beta)).$$

Proposition 1.6.1. D(f) = 0 if and only if f has a repeated root, i.e. is not sepa-

rable.

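Lemma 1.6.2. Let $\mathcal{O}_L$ have basis $\{\beta_1, \ldots, \beta_m\}$ over $\mathcal{O}_K$. Then for any $\mathcal{O}_K$-ideal $\mathfrak{a}$, the images $\{\bar{\beta}_1, \ldots, \bar{\beta}_m\}$ form a basis for $\mathcal{O}_L/\mathfrak{a}\mathcal{O}_L$ over $\mathcal{O}_K/\mathfrak{a}$ and the discriminant satisfies

$$D(\bar{\beta}_1, \ldots, \bar{\beta}_m) \equiv D(\beta_1, \ldots, \beta_m) \bmod \mathfrak{a},$$

where the discriminant on the left is taken with respect to $\mathcal{O}_L/\mathfrak{a}\mathcal{O}_L$ (as a module over $\mathcal{O}_K/\mathfrak{a}$) and on the right with respect to $\mathcal{O}_L$ over $\mathcal{O}_K$.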

Proof. See 3.36 in [19].

We are now ready to prove the main result.


Theorem 1.6.3. A prime p ⊂ OK ramifies in OL if and only if p | ∆(L/K).

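Proof. By definition $\Delta = \Delta(L/K)$ is the ideal generated by $D = D(\mathcal{O}_L)$. Thus $\mathfrak{p} \mid \Delta$ if and only if $D \in \mathfrak{p}$, which in turn happens if and only if

$$D(\bar{\beta}_1, \ldots, \bar{\beta}_m) = \det(\mathrm{Tr}(\bar{\beta}_i\bar{\beta}_j)) = 0$$

in $\mathcal{O}_K/\mathfrak{p}$ by Lemma 1.6.2. Let $\mathfrak{p}$ have factorization $\mathfrak{p}\mathcal{O}_L = \mathfrak{P}_1^{e_1} \cdots \mathfrak{P}_g^{e_g}$. By the Chinese remainder theorem,

$$\mathcal{O}_L/\mathfrak{p}\mathcal{O}_L \cong \mathcal{O}_L/\mathfrak{P}_1^{e_1} \oplus \cdots \oplus \mathcal{O}_L/\mathfrak{P}_g^{e_g}.$$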

First suppose p is not ramified in L. Then each ei = 1 and OL/Pi is a separable

extension of OK/p. Let ti denote the trace map OL/Pi → OK/p. Select a basis {ui}

for OL/pOL such that {u1, . . . , uk} is a basis for OL/P1, {uk+1, . . . , uk+l} is a basis

for OL/P2, etc. Then for each y ∈ OL/pOL, y = y1 + . . .+ yg with yi ∈ OL/Pi. Each

multiplication map ri : x 7→ xyi takes OL/Pi to itself, and if ri has standard matrix

Ai then the matrix for r : x 7→ xy decomposes into the block matrix

$$A = \begin{pmatrix} A_1 & & & 0 \\ & A_2 & & \\ & & \ddots & \\ 0 & & & A_g \end{pmatrix}.$$

Then Tr(y) = t1(y1) + . . . + tg(yg). More importantly, the discriminant matrix has

block form

$$B = \begin{pmatrix} \Delta_1 & & & \\ & \Delta_2 & & \\ & & \ddots & \\ & & & \Delta_g \end{pmatrix}$$

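where $\Delta_i$ is the discriminant matrix of the chosen basis of $\mathcal{O}_L/\mathfrak{P}_i$ over $\mathcal{O}_K/\mathfrak{p}$. But $\mathcal{O}_L/\mathfrak{P}_i$ is separable over $\mathcal{O}_K/\mathfrak{p}$ if and only if $\det \Delta_i \neq 0$. Hence if $C$ denotes the change of basis matrix from $\{\bar{\beta}_i\}$ to $\{u_i\}$, we have

$$D(\bar{\beta}_1, \ldots, \bar{\beta}_m) = (\det C)^2 (\det \Delta_1)(\det \Delta_2) \cdots (\det \Delta_g) \neq 0.$$

By the initial comments, this shows that when $\mathfrak{p}$ is unramified, $\mathfrak{p} \nmid \Delta$.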

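On the other hand, suppose some $e_i > 1$, so that $\mathcal{O}_L/\mathfrak{P}_i^{e_i}$ contains nonzero nilpotent elements. We may reindex the primes lying over $\mathfrak{p}$ so that $e_1 > 1$. Choose a basis $\{v_i\}$ for $\mathcal{O}_L/\mathfrak{P}_1^{e_1}$ such that $v_1 \in \mathfrak{P}_1/\mathfrak{P}_1^{e_1}$. In the quotient, $v_1^{e_1} = 0$, so the multiplication map $r_{v_1} : x \mapsto xv_1$ from above (now defined for the $v_i$) is nilpotent. Moreover, $(v_1v_j)^{e_1} = 0$, so the map $r_{v_1v_j}$ is nilpotent as well and its characteristic polynomial has only 0 as a root. Thus $t_1(v_1v_j) = \mathrm{Tr}(r_{v_1v_j}) = 0$, so the discriminant matrix for $\mathcal{O}_L/\mathfrak{P}_1^{e_1}$ over $\mathcal{O}_K/\mathfrak{p}$ has a row of zeros. Hence $\det \Delta_1 = 0$, which implies $D(\bar{\beta}_1, \ldots, \bar{\beta}_m) = 0$. By the preliminary comments, this shows that $\mathfrak{p}$ divides $\Delta$.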
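As a small illustration of Theorem 1.6.3 over $K = \mathbb{Q}$, the discriminant of a quadratic field can be computed directly from the trace matrix of an integral basis. The sketch below assumes $n$ is squarefree and uses the standard integral bases $\{1, \sqrt{n}\}$ and $\{1, (1 + \sqrt{n})/2\}$; elements $a + b\sqrt{n}$ are stored as pairs of rationals, and all function names are hypothetical.

```python
from fractions import Fraction

def mul(x, y, n):
    """Product (a + b*sqrt(n)) * (c + d*sqrt(n)) as a pair of rationals."""
    (a, b), (c, d) = x, y
    return (a * c + n * b * d, a * d + b * c)

def trace(x):
    """Trace of a + b*sqrt(n) down to Q: the sum of the two conjugates."""
    return 2 * x[0]

def integral_basis(n):
    """Standard integral basis of Q(sqrt(n)) for squarefree n (an assumption)."""
    one = (Fraction(1), Fraction(0))
    if n % 4 == 1:
        return [one, (Fraction(1, 2), Fraction(1, 2))]
    return [one, (Fraction(0), Fraction(1))]

def discriminant(n):
    """det(Tr(b_i * b_j)) for the integral basis above."""
    basis = integral_basis(n)
    t = [[trace(mul(x, y, n)) for y in basis] for x in basis]
    return t[0][0] * t[1][1] - t[0][1] * t[1][0]

def ramified_primes(n):
    """Prime divisors of the discriminant, found by trial division."""
    d = abs(int(discriminant(n)))
    return [p for p in range(2, d + 1)
            if d % p == 0 and all(p % q for q in range(2, p))]

for n in (-1, -3, -5, 5):
    print(n, discriminant(n), ramified_primes(n))
```

The values agree with the formula $d_K = n$ or $4n$ from Section 1.3; for example $n = -5$ gives $d_K = -20$ with ramified primes 2 and 5.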

This tells us when a prime in OK ramifies in L, but the discriminant misses some

critical information:

• Which primes in OL lying over p ramify? That is, which primes P in the

factorization of pOL have ramification index greater than 1?

• How do we determine the multiplicity of a prime dividing the discriminant?

The rest of this section follows K. Conrad’s paper “The Different Ideal”, which

outlines the techniques required to answer these questions. To motivate the problem,

consider the following example.

Example 1.6.4. Let $K = \mathbb{Q}(\alpha)$ where $\alpha$ is a root of $f(x) = x^3 - x - 1$. This polynomial has discriminant $-23$, so 23 is the only integer prime which ramifies in $\mathcal{O}_K$. Since $[K : \mathbb{Q}] = 3$ and

$$x^3 - x - 1 \equiv (x - 3)(x - 10)^2 \bmod 23,$$

the factorization we obtain from Theorem 1.3.4 is $23\mathcal{O}_K = \mathfrak{p}\mathfrak{q}^2$ where $\mathfrak{p} \neq \mathfrak{q}$ and both are prime. In general, how do we know that $\mathfrak{q}$ ramifies but $\mathfrak{p}$ doesn’t?
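The arithmetic in Example 1.6.4 can be checked mechanically. The following sketch uses the formula $-4a^3 - 27b^2$ for the discriminant of $x^3 + ax + b$ and a hypothetical helper `polymul` for multiplying coefficient lists (lowest degree first) modulo a prime.

```python
P = 23

def f(x):
    return x**3 - x - 1

def df(x):
    """Formal derivative of f."""
    return 3 * x**2 - 1

def polymul(a, b, p):
    """Multiply two coefficient lists (lowest degree first) modulo p."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out

# Discriminant of x^3 + a*x + b with a = b = -1.
disc = -4 * (-1)**3 - 27 * (-1)**2

# Roots of f mod 23, and those roots at which f' also vanishes.
roots = [x for x in range(P) if f(x) % P == 0]
repeated = [x for x in roots if df(x) % P == 0]

# (x - 3)(x - 10)^2 compared with x^3 - x - 1, coefficientwise mod 23.
product = polymul([-3, 1], polymul([-10, 1], [-10, 1], P), P)
target = [c % P for c in (-1, -1, 0, 1)]

print(disc, roots, repeated, product == target)
```

The repeated root 10 corresponds to the ramified prime $\mathfrak{q}$ and the simple root 3 to the unramified prime $\mathfrak{p}$.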

Definition. For a lattice L in a number field K, its dual lattice is

L∨ = {α ∈ K | TrK/Q(αL) ⊂ Z}.

Proposition 1.6.5. For a lattice $L$ with $\mathbb{Z}$-basis $\{e_1, \ldots, e_n\}$, the dual lattice may be written

$$L^\vee = \bigoplus_{i=1}^n \mathbb{Z}e_i^\vee$$

where $\{e_i^\vee\}$ is the dual basis of $\{e_i\}$ relative to the trace product on $K/\mathbb{Q}$. In particular, $L^\vee$ is a lattice.


We will see in the next section just how useful lattices can be in algebraic number

theory, but for now we will focus on the lattice OK and its fractional ideals. Consider

the dual lattice O∨K . First, we should recognize that O∨K is not just the elements of

K with trace in Z – actually it’s smaller. But since algebraic integers have integral

trace, we see that OK ⊂ O∨K .

Proposition 1.6.6. For any fractional ideal a in K, a∨ is a fractional ideal and

a∨ = a−1O∨K. Moreover, O∨K is the largest fractional ideal of K whose elements all

have integral trace.

Proof. By definition, a∨ = {α ∈ K | TrK/Q(αa) ⊂ Z}. First, since any dual lattice

is a lattice by the previous proposition, we know a∨ is a finitely generated Z-module.


Take α ∈ a∨ and x ∈ OK . Then for any β ∈ a,

TrK/Q((xα)β) = TrK/Q(α(xβ)) ∈ Z

since xβ ∈ a and α ∈ a∨. Thus xα ∈ a∨ so a∨ is a fractional OK-ideal.

Next we check the formula for a∨. Take α ∈ a∨ again. Then for any β ∈ a,

Tr(αβOK) ⊂ Z since βOK ⊂ a. Thus αa ⊂ O∨K which implies α ∈ a−1O∨K . This

shows a∨ ⊂ a−1O∨K and the reverse containment is similarly shown.

For the last statement, note that any fractional OK-ideal satisfies a = aOK .

Therefore

Tr(a) ⊂ Z ⇐⇒ Tr(aOK) ⊂ Z ⇐⇒ a ⊂ O∨K .

Since O∨K is a fractional ideal containing OK , its inverse is an integral ideal con-

tained in OK , from which we define:

Definition. The different of $K$ is the ideal

$$\mathfrak{D}_K = (\mathcal{O}_K^\vee)^{-1} = \{x \in K \mid x\mathcal{O}_K^\vee \subset \mathcal{O}_K\}.$$

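Example 1.6.7. For $K = \mathbb{Q}(i)$ with $\mathcal{O}_K = \mathbb{Z}[i]$ and $a, b \in \mathbb{Q}$, we have $\mathrm{Tr}(a + bi) = 2a$ and $\mathrm{Tr}(i(a + bi)) = -2b$, so $a + bi \in \mathbb{Z}[i]^\vee$ precisely when $2a, 2b \in \mathbb{Z}$. Thus $\mathbb{Z}[i]^\vee = \frac{1}{2}\mathbb{Z}[i]$ and the different of $K$ is $2\mathbb{Z}[i]$. This can be verified with the next proposition.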

Proposition 1.6.8. If OK = Z[α] then DK = (f ′(α)) where f(x) is the minimal

polynomial of α over Q.



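Example 1.6.9. For a quadratic field $K = \mathbb{Q}(\sqrt{n})$ where $n$ is squarefree,

$$\mathfrak{D}_K = \begin{cases} (2\sqrt{n}) & \text{if } n \not\equiv 1 \pmod{4} \\ (\sqrt{n}) & \text{if } n \equiv 1 \pmod{4}. \end{cases}$$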

The different is related to the field discriminant dK by the following.

Theorem 1.6.10. For a number field K of discriminant dK, NK/Q(DK) = |dK |.

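Proof. Let $\{\beta_1, \ldots, \beta_n\}$ be a $\mathbb{Z}$-basis for $\mathcal{O}_K$ so that we have

$$\mathcal{O}_K = \bigoplus_{i=1}^n \mathbb{Z}\beta_i.$$

Then $\mathfrak{D}_K^{-1} = \mathcal{O}_K^\vee = \bigoplus_{i=1}^n \mathbb{Z}\beta_i^\vee$ by Proposition 1.6.5. Using the definition of norm, we have

$$N(\mathfrak{D}_K) = [\mathcal{O}_K : \mathfrak{D}_K] = [\mathfrak{D}_K^{-1} : \mathcal{O}_K] = [\mathcal{O}_K^\vee : \mathcal{O}_K].$$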

We can calculate [O∨K : OK ] by finding | detA| where A is the matrix expressing the

basis {β1, . . . , βn} in terms of the dual basis {β∨1 , . . . , β∨n}. Since {β∨i } is a dual basis

of {βi} it follows that

A = (TrK/Q(βiβj))

and by definition detA = D(OK) = dK . The result follows.
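The proof can be traced by hand for $K = \mathbb{Q}(i)$. The sketch below, which is only an illustration with hypothetical helper names, builds the trace matrix of the basis $\{1, i\}$, inverts it to obtain the dual basis, and reads off the index $[\mathcal{O}_K^\vee : \mathcal{O}_K]$ as the absolute value of its determinant. Elements $a + bi$ are stored as pairs of rationals.

```python
from fractions import Fraction

def mul(x, y):
    """Product of a + bi and c + di, stored as pairs (a, b) and (c, d)."""
    (a, b), (c, d) = x, y
    return (a * c - b * d, a * d + b * c)

def trace(x):
    """Trace from Q(i) down to Q."""
    return 2 * x[0]

basis = [(Fraction(1), Fraction(0)), (Fraction(0), Fraction(1))]   # {1, i}

# Trace matrix T = (Tr(b_i * b_j)) and its determinant.
T = [[trace(mul(x, y)) for y in basis] for x in basis]
det = T[0][0] * T[1][1] - T[0][1] * T[1][0]

# The rows of T^{-1} give the dual basis in terms of {1, i}.
inv = [[T[1][1] / det, -T[0][1] / det],
       [-T[1][0] / det, T[0][0] / det]]
dual = [tuple(sum(inv[i][j] * basis[j][k] for j in range(2)) for k in range(2))
        for i in range(2)]

# Check the defining property Tr(b_i * dual_j) = delta_ij.
for i in range(2):
    for j in range(2):
        assert trace(mul(basis[i], dual[j])) == (1 if i == j else 0)

print(dual, abs(det))
```

The dual basis comes out as $\{\tfrac{1}{2}, -\tfrac{i}{2}\}$, which spans $\tfrac{1}{2}\mathbb{Z}[i]$ as in Example 1.6.7, and the index is $4 = |d_K|$.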

Lemma 1.6.11. For any nonzero ideal a ⊂ OK, a | DK if and only if Tr(a−1) ⊂ Z.

Proof. This may be stated as a ⊃ DK = (O∨K)−1 which in turn is equivalent to

a−1 ⊂ O∨K . By Proposition 1.6.6, this is equivalent to Tr(a−1) ⊂ Z.

Dedekind proved the following characterization of ramified primes in terms of the

different ideal. The proof can be found in [6].


Theorem 1.6.12 (Dedekind). The prime factors of $\mathfrak{D}_K$ are exactly the primes in $K$ that ramify over $\mathbb{Q}$. In particular, for any prime ideal $\mathfrak{p} \subset \mathcal{O}_K$ lying over a prime $p \in \mathbb{Z}$ with ramification index $e$, the multiplicity of $\mathfrak{p}$ in $\mathfrak{D}_K$ is $e - 1$ when $e \not\equiv 0 \pmod{p}$, and at least $e$ when $p \mid e$.

Corollary 1.6.13. The primes in Z that ramify in K are precisely the prime divisors

of dK.

Proof. Use the fact that |dK | = N (DK) and Theorem 1.6.12.

Note that this also proves the rest of Proposition 1.3.6 which characterized ramified

primes in quadratic extensions.

The theorem above shows the true power of the different: while the discriminant also tells us whether a prime ramifies in an extension, it does not tell us anything about the ramification indices of the individual primes in the larger field. This information is conveyed by the different. Conversely, if we know the full factorization of $p\mathcal{O}_K$, we can relate these multiplicities to $d_K$:

Corollary 1.6.14. Suppose $p\mathcal{O}_K = \mathfrak{p}_1^{e_1} \cdots \mathfrak{p}_g^{e_g}$ with inertial degrees denoted by $f_i$. Then the multiplicity of $p$ in $d_K$ is at least

$$(e_1 - 1)f_1 + \ldots + (e_g - 1)f_g = n - (f_1 + \ldots + f_g).$$

Furthermore, if $p \nmid e_i$ for all $i$ then this is the exact multiplicity of $p$ in $d_K$.

It turns out that the multiplicity of a prime $\mathfrak{p}$ in $\mathfrak{D}_K$ is bounded by

$$e - 1 \leq \mathrm{ord}_{\mathfrak{p}}(\mathfrak{D}_K) \leq e - 1 + e\,\mathrm{ord}_p(e).$$

The left inequality is Theorem 1.6.12 and the right was proven by Hensel (see [6]).


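We can extend the ideas of $\mathcal{O}_K^\vee$ and $\mathfrak{D}_K$ to an arbitrary extension of number fields $L/K$ in the following way. Define the fractional ideal

$$\mathcal{O}_L^\vee = \{x \in L \mid \mathrm{Tr}_{L/K}(xy) \in \mathcal{O}_K \text{ for all } y \in \mathcal{O}_L\}.$$

Then we have

Definition. For an extension of number fields $L/K$, the relative different is

$$\delta_{L/K} = (\mathcal{O}_L^\vee)^{-1} = \{x \in L \mid x\mathcal{O}_L^\vee \subset \mathcal{O}_L\}$$

which is an integral ideal of $\mathcal{O}_L$.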

As in the case with K/Q, we have several important results for the relative different. See section 15 of [13] for details.

Theorem 1.6.15. For any extension of number fields L/K, the discriminant and

relative different are related by DL/K = NL/K(δL/K).

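Theorem 1.6.16. Let $d_L$ and $d_K$ be the field discriminants of $L$ and $K$, respectively. Then $d_L = \pm d_K^{[L:K]}\, N_{K/\mathbb{Q}}(D_{L/K})$, where $N_{K/\mathbb{Q}}(D_{L/K})$ denotes the numerical norm of the discriminant ideal $D_{L/K}$.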

Theorem 1.6.17. For all number field extensions L/K, DL = DKδL/K.

Corollary 1.6.18. The primes of OL that ramify over K are precisely those that

divide the relative different δL/K.

1.7 The Class Group

Recall that the class group of a number field K is C(OK) = IK/PK , where IK is the

group of fractional ideals of OK and PK is the subgroup of principal fractional ideals.

There is an exact sequence

0→ O∗K → K∗ → IK → C(OK)→ 0.


In this section we explore the structure of the class group and prove that its order,

called the class number hK of K, is finite.

In the previous section we defined the discriminant of a number field K to be

dK = D(OK). We will prove

Theorem. Let $K$ be a number field with $[K : \mathbb{Q}] = n$ and discriminant $d_K$. Let $2s$ be the number of nonreal embeddings of $K$ into $\mathbb{C}$. Then there exists a set of representatives for the ideal class group $C(\mathcal{O}_K)$ consisting of ideals $\mathfrak{a} \subset \mathcal{O}_K$ with

$$N(\mathfrak{a}) \leq \frac{n!}{n^n}\left(\frac{4}{\pi}\right)^s\sqrt{|d_K|}.$$

The value on the right is called the Minkowski bound and is often denoted BK .

According to [24], BK is currently the best known bound for a generating set of

C(OK) that does not depend on unproven conjectures.

In the statement of the main theorem, 2s counted the number of nonreal em-

beddings K ↪→ C. Alternatively, by the primitive element theorem we may write

K = Q(α) for some α ∈ K with minimal polynomial f(x) ∈ Q[x]. Then 2s is the

number of nonreal roots of f . We will also denote by r the number of real roots of f ,

so that we have

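$$K \otimes_{\mathbb{Q}} \mathbb{R} \cong \mathbb{R}^r \times \mathbb{C}^s \cong \mathbb{R}^{r+2s} \quad (\text{as } \mathbb{R}\text{-vector spaces}).$$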

Before proving the main theorem, we present some applications and examples

using the Minkowski bound. The first result is an important property of the class

group.

Theorem 1.7.1. The class number hK := |C(OK)| is finite for any number field K.

Proof. It suffices to show there are only finitely many ideals $\mathfrak{a} \subset \mathcal{O}_K$ whose norms fall under the bound. Let $\mathfrak{a} = \prod \mathfrak{p}_i^{r_i}$ so that $N(\mathfrak{a}) = \prod p_i^{r_i f_i}$ where $(p_i) = \mathbb{Z} \cap \mathfrak{p}_i$. Since $N(\mathfrak{a})$ is bounded by $B_K$, there are only finitely many possibilities for the integer primes $p_i$, and hence for the prime ideals $\mathfrak{p}_i$, and only finitely many possibilities for the $r_i$. Hence the number of such $\mathfrak{a}$ is finite, and it follows that the class group is finite.

Note that hK = 1 if and only if OK is a principal ideal domain. Thus the

class group is a direct measure of how far the ring of integers is from being a PID.

Since every PID is also a UFD, the class number is related to how badly unique

factorization fails in OK . An open question in class field theory asks if there are

infinitely many number fields with hK = 1. However, it is known [7] that there

are only nine imaginary quadratic fields with class number 1; these are Q(√n) for

n = −1,−2,−3,−7,−11,−19,−43,−67,−163.

Example 1.7.2. Let $K = \mathbb{Q}(i)$. Then $n = 2$, $s = 1$ and $|d_K| = 4$ so the Minkowski bound is

$$\frac{2!}{2^2}\left(\frac{4}{\pi}\right)^1\sqrt{4} = \frac{4}{\pi} < 2.$$

Thus every fractional ideal is equivalent to an ideal of norm 1. Since the only ideal

of norm 1 is (1), every ideal is principal. Hence hK = 1, which reflects the fact that

Z[i] is a PID.

Example 1.7.3. Let $K = \mathbb{Q}(\sqrt{-5})$. Then a set of representatives for $C(\mathcal{O}_K)$ may be chosen with $N(\mathfrak{a}) \leq B_K \approx 0.64\sqrt{20} < 3$. Thus every such ideal other than $\mathcal{O}_K$ has norm 2 and must divide $2\mathcal{O}_K$. In fact we can use Theorem 1.3.4 to compute the factorization of 2:

$$2\mathcal{O}_K = (2, 1 + \sqrt{-5})^2.$$

Since $N(2\mathcal{O}_K) = 2^2 = 4$, it must be that $N((2, 1 + \sqrt{-5})) = 2$. This shows that $\mathcal{O}_K$ and $(2, 1 + \sqrt{-5})$ form a set of representatives for $C(\mathcal{O}_K)$. Further, $(2, 1 + \sqrt{-5})$ cannot be principal because there is no element $\alpha = a + b\sqrt{-5}$ with $N(\alpha) = a^2 + 5b^2 = 2$. Hence $|C(\mathcal{O}_K)| = 2$.
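The numbers in Examples 1.7.2 and 1.7.3 can be reproduced with a few lines of code. This is a minimal sketch in which `minkowski_bound` is a hypothetical helper taking the degree $n$, the number $s$ of pairs of complex embeddings, and the discriminant $d_K$; the search for elements of norm 2 in $\mathbb{Z}[\sqrt{-5}]$ only needs a small box, since $a^2 + 5b^2 = 2$ forces $|a| \leq 1$ and $b = 0$.

```python
from math import factorial, pi, sqrt

def minkowski_bound(n, s, d):
    """B_K = n!/n^n * (4/pi)^s * sqrt(|d_K|)."""
    return factorial(n) / n**n * (4 / pi)**s * sqrt(abs(d))

b_gauss = minkowski_bound(2, 1, -4)      # K = Q(i)
b_minus5 = minkowski_bound(2, 1, -20)    # K = Q(sqrt(-5))

# Elements a + b*sqrt(-5) of norm 2, searched over a box that certainly
# contains every solution of a^2 + 5 b^2 = 2.
norm_two = [(a, b) for a in range(-2, 3) for b in range(-1, 2)
            if a * a + 5 * b * b == 2]

print(b_gauss, b_minus5, norm_two)
```

The first bound is below 2, the second lies between 2 and 3, and the list of elements of norm 2 is empty, matching the conclusions that $h_K = 1$ for $\mathbb{Q}(i)$ and that $(2, 1 + \sqrt{-5})$ is not principal.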


Another useful application of the Minkowski bound is to prove

Theorem 1.7.4. Every proper extension of Q ramifies at some prime.

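Proof. We will prove this for $K/\mathbb{Q}$ a finite extension of degree $n \geq 2$. A set of representatives of $C(\mathcal{O}_K)$ has at least one element and the element has numerical norm $\geq 1$, so $B_K \geq 1$. Define a sequence $a_n$ by $a_n = \frac{n^n}{n!}\left(\frac{\pi}{4}\right)^{n/2}$ and note that by the Minkowski bound, since $s \leq n/2$,

$$a_n \leq \frac{n^n}{n!}\left(\frac{\pi}{4}\right)^s \leq \sqrt{|d_K|}.$$

We can also see that $a_2 > 1$ and for all $n \geq 2$,

$$\frac{a_{n+1}}{a_n} = \left(\frac{\pi}{4}\right)^{1/2}\left(1 + \frac{1}{n}\right)^n > 1.$$

So the sequence $(a_n)$ is monotone increasing. This implies $|d_K| > 1$ and Corollary 1.6.13 tells us that some prime ramifies.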
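The two claims about the sequence $(a_n)$ are easy to sanity-check numerically. The following sketch, with a hypothetical function `a`, verifies that $a_2 > 1$ and that the ratio formula holds and exceeds 1 over a range of $n$; it is a numerical check only, not a substitute for the argument above.

```python
from math import factorial, isclose, pi, sqrt

def a(n):
    """The sequence a_n = n^n / n! * (pi/4)^(n/2)."""
    return n**n / factorial(n) * (pi / 4)**(n / 2)

assert a(2) > 1                      # a_2 = pi/2

for n in range(2, 60):
    ratio = a(n + 1) / a(n)
    # The ratio should equal sqrt(pi/4) * (1 + 1/n)^n and exceed 1.
    assert isclose(ratio, sqrt(pi / 4) * (1 + 1 / n)**n, rel_tol=1e-9)
    assert ratio > 1

print(a(2), a(3), a(4))
```

Since $(1 + 1/n)^n \geq 9/4$ for $n \geq 2$ and $\sqrt{\pi/4} \approx 0.886$, the ratio is in fact bounded below by roughly 1.99.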

This shows that the only unramified extension of Q is Q itself. However, there

may exist unramified extensions of a number field other than Q. In the next section

we will describe the Hilbert class field of K, which is the maximal unramified abelian

extension L of K. This field has the special property that Gal(L/K) is isomorphic to

the class group C(OK).

Constructing further abelian extensions of the Hilbert class field is called the class

field tower problem. Let K1 be a number field with class number h1 > 1. Let K2 be

the Hilbert class field of K1, let K3 be the Hilbert class field of K2, and so on. It is

an open question to decide when the tower

· · · ⊃ K3 ⊃ K2 ⊃ K1 ⊃ Q

is infinite, or terminates with a field of class number 1 in a finite number of steps.


Golod and Shafarevich [11] proved that there are fields K1 with infinite class field

towers.

For the rest of the section, we develop the mechanics required to prove the

Minkowski bound. First we redefine the notion of a lattice in a vector space – a slight

generalization of the definition given in Section 1.1, where lattices were assumed to

have full rank.

Definition. Let V be an n-dimensional vector space over R. A lattice in V is a

subgroup of the form

Λ = Ze1 + . . .+ Zer

where e1, . . . , er are linearly independent vectors in V . When r = n, Λ is said to be

a full lattice in V .

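Remark. A lattice is a free abelian subgroup of $V$ generated by elements of $V$ that are linearly independent over $\mathbb{R}$. A full lattice $\Lambda \subset V$ is a subgroup such that the map

$$\mathbb{R} \otimes_{\mathbb{Z}} \Lambda \longrightarrow V, \qquad \sum r_i \otimes x_i \longmapsto \sum r_i x_i$$

is an isomorphism.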

Since V is isomorphic to Rn, this induces a topology on V .

Definition. A subgroup $W \subset V$ is a discrete subgroup if every point in $W$ is open in the subspace topology that $W$ inherits from $V$, i.e. if every $w \in W$ has a neighborhood $U$ in $V$ such that $U \cap W = \{w\}$.

Proposition 1.7.5. A subgroup Λ ⊂ V is a lattice if and only if it is a discrete

subgroup.


Proof. ( =⇒ ) is clear. See 4.15 in [19] for the other direction.

Definition. Let $\Lambda = \sum \mathbb{Z}e_i$ be a full lattice in $V$. Then for any $\lambda_0 \in \Lambda$, the set

$$D = \left\{\lambda_0 + \sum a_i e_i : 0 \leq a_i < 1\right\}$$

is called a fundamental parallelopiped for $\Lambda$.

The shape of the parallelopiped depends on the choice of the $e_i$, but for a fixed basis we may vary $\lambda_0 \in \Lambda$ so that the parallelopipeds cover $\mathbb{R}^n$ without overlaps. Furthermore, if $D$ is a fundamental parallelopiped for $\Lambda = \sum \mathbb{Z}e_i$, the volume of $D$ is given by

$$\mu(D) = |\det(e_1, \ldots, e_n)|.$$

(Here µ is actually Lebesgue measure, but all our sets will have well-defined volumes.)

Notice that if Λ = Zf1 + . . .+ Zfn then the change-of-basis matrix between {ei} and

{fi} has determinant ±1, so the volume of D does not depend on the choice of basis

for Λ.

Definition. For a set $T \subset \mathbb{R}^n$, we say $T$ is convex if every pair of points in $T$ is connected by a line segment that lies in $T$.

Definition. A set T ⊂ Rn is symmetric about the origin if α ∈ T implies −α ∈ T .

Lemma 1.7.6. Let D be a fundamental parallelopiped for a full lattice Λ in V and

suppose S is a measurable subset of V . If µ(S) > µ(D) then S contains distinct points

α and β such that β − α ∈ Λ.


It will be useful to let $T$ be a subset of $V$ such that for any $\alpha, \beta \in T$, $\frac{1}{2}(\alpha - \beta) \in T$, and let $S = \frac{1}{2}T$. Then $T$ contains the difference of any two points in $S$, so by Lemma 1.7.6, $T$ will contain a point of $\Lambda \smallsetminus \{0\}$ whenever $\mu(D) < \mu\left(\frac{1}{2}T\right) = 2^{-n}\mu(T)$.


The main theorem en route to proving the Minkowski bound is a classic theorem

in the geometry of numbers in its own right:

Minkowski’s Theorem. Let $T$ be a subset of $V$ that is compact, convex and symmetric about the origin. If $\Lambda$ is a full lattice in $V$ with fundamental parallelopiped $D$ such that

$$\mu(T) \geq 2^n\mu(D)$$

then $T$ contains a point of $\Lambda$ other than the origin.

Proof. Let $\varepsilon > 0$. Then

$$\mu((1 + \varepsilon)T) = (1 + \varepsilon)^n\mu(T) > 2^n\mu(D)$$

so by the preceding comments, $(1 + \varepsilon)T$ contains a point of $\Lambda \smallsetminus \{0\}$. Moreover, $(1 + \varepsilon)T$ contains only finitely many points of $\Lambda \smallsetminus \{0\}$, since $\Lambda$ is discrete and $(1 + \varepsilon)T$ is compact. Now since $T$ is closed,

$$T = \bigcap_{\varepsilon > 0} (1 + \varepsilon)T$$

so if none of the finitely many points in $\Lambda \cap (1 + \varepsilon)T$ other than the origin were in $T$, we could keep making $\varepsilon > 0$ smaller until $(1 + \varepsilon)T$ contains no point of $\Lambda$ other than the origin. This contradicts the preceding comments, hence $T$ contains a point in $\Lambda \smallsetminus \{0\}$.

For a fascinating application of Minkowski’s theorem to the proof of the Four

Squares Theorem, see Appendix A.1.

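Moving forward, let $K$ be a number field with $[K : \mathbb{Q}] = n$. Suppose $K$ has $r$ real embeddings $\{\sigma_1, \ldots, \sigma_r\}$ and $2s$ complex embeddings $\{\sigma_{r+1}, \bar{\sigma}_{r+1}, \ldots, \sigma_{r+s}, \bar{\sigma}_{r+s}\}$, so that $n = r + 2s$. Then we have an embedding

$$\sigma : K \hookrightarrow \mathbb{R}^r \times \mathbb{C}^s, \qquad \alpha \mapsto (\sigma_1(\alpha), \ldots, \sigma_{r+s}(\alpha)).$$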

Let V = Rr × Cs and identify V with Rn using {1, i} as a basis for C. The relation

between ideals of OK and lattices in V is contained in the following proposition.

Proposition 1.7.7. Let $\mathfrak{a} \subset \mathcal{O}_K$ be any nonzero ideal. Then $\sigma(\mathfrak{a})$ is a full lattice in $V$ and the volume of any fundamental parallelopiped for $\sigma(\mathfrak{a})$ is $2^{-s}N(\mathfrak{a})\sqrt{|d_K|}$.

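Proof. Let $\{\alpha_1, \ldots, \alpha_n\}$ be a basis for $\mathfrak{a}$ as a $\mathbb{Z}$-module. We claim that $\{\sigma(\alpha_1), \ldots, \sigma(\alpha_n)\}$ is a basis for $\sigma(\mathfrak{a})$. To prove this, we will show that the matrix $A$ whose $i$th row is

$$(\sigma_1(\alpha_i), \ldots, \sigma_r(\alpha_i), \mathrm{Re}(\sigma_{r+1}(\alpha_i)), \mathrm{Im}(\sigma_{r+1}(\alpha_i)), \ldots, \mathrm{Re}(\sigma_{r+s}(\alpha_i)), \mathrm{Im}(\sigma_{r+s}(\alpha_i)))$$

has nonzero determinant. First consider the matrix $B$ with $i$th row

$$(\sigma_1(\alpha_i), \ldots, \sigma_r(\alpha_i), \sigma_{r+1}(\alpha_i), \bar{\sigma}_{r+1}(\alpha_i), \ldots, \sigma_{r+s}(\alpha_i), \bar{\sigma}_{r+s}(\alpha_i)).$$

By definition of the discriminant, $(\det B)^2 = D(\alpha_1, \ldots, \alpha_n) \neq 0$.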

Next we relate the determinants of $A$ and $B$. If we perform the column operations $C_{r+2} + C_{r+1} \to C_{r+1}$ and $-\frac{1}{2}C_{r+1} + C_{r+2} \to C_{r+2}$ on the matrix $B$, we will have $2\,\mathrm{Re}(\sigma_{r+1}(\alpha_i))$ in $C_{r+1}$ and $-i \cdot \mathrm{Im}(\sigma_{r+1}(\alpha_i))$ in $C_{r+2}$. Repeat this for the remaining pairs of columns to obtain a matrix $A'$. These column operations do not change $\det B$, and scaling the columns of $A'$ yields $A$, so we have $\det B = \det A' = (-2i)^s \det A$, or

$$\det A = (-2i)^{-s}\det B = \pm(-2i)^{-s}D(\alpha_1, \ldots, \alpha_n)^{1/2} \neq 0.$$

Thus $\{\sigma(\alpha_i)\}$ is a basis for $\sigma(\mathfrak{a})$, which proves $\sigma(\mathfrak{a})$ is a lattice in $V$ of rank $n$.

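Now we can write $\sigma(\mathfrak{a}) = \sum \mathbb{Z}\sigma(\alpha_i)$, so the fundamental parallelopiped for $\sigma(\mathfrak{a})$ has volume $|\det A|$. One can prove that

$$|D(\alpha_1, \ldots, \alpha_n)| = [\mathcal{O}_K : \mathfrak{a}]^2\,|D(\mathcal{O}_K/\mathbb{Z})| = N(\mathfrak{a})^2|d_K|$$

(see 4.26 in [19] for details). Hence $\mu(D) = 2^{-s}|D(\alpha_1, \ldots, \alpha_n)|^{1/2} = 2^{-s}N(\mathfrak{a})\sqrt{|d_K|}$.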
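For quadratic fields the volume formula of Proposition 1.7.7 can be compared against a direct determinant computation. The sketch below treats only the case $\mathfrak{a} = \mathcal{O}_K$ (so $N(\mathfrak{a}) = 1$) with $K = \mathbb{Q}(\sqrt{n})$ and $n$ squarefree; the function names and the choice of integral basis are assumptions made for illustration.

```python
from math import isclose, sqrt

def embed(a, b, n):
    """Image of a + b*sqrt(n) in R^2 under the embedding sigma."""
    root = sqrt(abs(n))
    if n > 0:                          # two real embeddings: r = 2, s = 0
        return (a + b * root, a - b * root)
    return (a, b * root)               # one complex pair: r = 0, s = 1, as (Re, Im)

def covolume_of_integers(n):
    """|det| of the images of an integral basis of Q(sqrt(n)), n squarefree."""
    if n % 4 == 1:
        basis = [(1, 0), (0.5, 0.5)]   # {1, (1 + sqrt(n))/2}
    else:
        basis = [(1, 0), (0, 1)]       # {1, sqrt(n)}
    (x1, y1), (x2, y2) = (embed(a, b, n) for a, b in basis)
    return abs(x1 * y2 - x2 * y1)

def predicted(n):
    """2^(-s) * N(a) * sqrt(|d_K|) with a = O_K."""
    d = n if n % 4 == 1 else 4 * n
    s = 0 if n > 0 else 1
    return 2**(-s) * sqrt(abs(d))

for n in (-1, -3, -5, 2, 5):
    assert isclose(covolume_of_integers(n), predicted(n), rel_tol=1e-12)
    print(n, covolume_of_integers(n))
```

For instance, $\mathbb{Z}[i]$ maps to the standard lattice $\mathbb{Z}^2$ of covolume 1, which equals $2^{-1}\sqrt{4}$.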

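Lemma 1.7.8. Let $\mathfrak{a} \subset \mathcal{O}_K$ be a nonzero integral ideal. Then $\mathfrak{a}$ contains an element $\alpha \in K^*$ whose norm is bounded by

\[ |N(\alpha)| \le \frac{n!}{n^n} \left(\frac{4}{\pi}\right)^{s} N(\mathfrak{a}) \sqrt{|d_K|}. \]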

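Proof. For a fixed positive $t \in \mathbb{R}$, define $X(t) = \{v \in V : \|v\| \le t\}$, where for $v = (x_1, \dots, x_r, z_1, \dots, z_s) \in \mathbb{R}^r \times \mathbb{C}^s$ we set $\|v\| = \sum_{i=1}^{r} |x_i| + 2\sum_{j=1}^{s} |z_j|$. One can calculate (see 4.27 in [19])

\[ \mu(X(t)) = 2^r \left(\frac{\pi}{2}\right)^{s} \frac{t^n}{n!}. \]

The set $X(t)$ is compact (it is closed and bounded), convex and symmetric about the origin, and we may choose a large enough $t$ so that

\[ \mu(X(t)) \ge 2^n \mu(D) \]

where $D$ is a fundamental parallelepiped for $\sigma(\mathfrak{a})$. Then Minkowski's theorem says that $X(t)$ contains some $\sigma(\alpha) \neq 0$, for some $\alpha \in \mathfrak{a}$. Consider

\[ |N(\alpha)| = |\sigma_1(\alpha)| \cdots |\sigma_r(\alpha)| \cdot |\sigma_{r+1}(\alpha)|^2 \cdots |\sigma_{r+s}(\alpha)|^2 \le \frac{\left(\sum_{i=1}^{r} |\sigma_i(\alpha)| + \sum_{i=r+1}^{r+s} 2|\sigma_i(\alpha)|\right)^n}{n^n} \le \frac{t^n}{n^n}, \]

where the first inequality is (geometric mean $\le$ arithmetic mean) applied to the $n$ factors of the product, and the second holds because $\sigma(\alpha) \in X(t)$.

Now in the case that $\mu(X(t)) \ge 2^n \mu(D)$, we must have by Proposition 1.7.7 that

\[ 2^r \left(\frac{\pi}{2}\right)^{s} \frac{t^n}{n!} \ge 2^n 2^{-s} N(\mathfrak{a}) \sqrt{|d_K|} \iff t^n \ge n!\, \frac{2^{n-r}}{\pi^s} N(\mathfrak{a}) \sqrt{|d_K|}. \]

So choose $t \in \mathbb{R}$ so that $t^n = n!\, \frac{2^{n-r}}{\pi^s} N(\mathfrak{a}) \sqrt{|d_K|}$. Since $2^{n-r} = 2^{2s} = 4^s$, the above work gives the result:

\[ |N(\alpha)| \le \frac{n!}{n^n} \cdot \frac{2^{n-r}}{\pi^s} N(\mathfrak{a}) \sqrt{|d_K|} = \frac{n!}{n^n} \left(\frac{4}{\pi}\right)^{s} N(\mathfrak{a}) \sqrt{|d_K|}. \]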

We are now ready to prove the main theorem:

Theorem 1.7.9. For a number field $K$, there exists a set of representatives for the class group $C(\mathcal{O}_K)$ consisting of integral ideals $\mathfrak{a}$ whose norms satisfy

\[ N(\mathfrak{a}) \le \frac{n!}{n^n} \left(\frac{4}{\pi}\right)^{s} \sqrt{|d_K|}. \]

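Proof. Write $B_K$ for the bound in the statement, and let $\mathfrak{c}$ be a fractional ideal in $I_K$. There is some $d \in K^*$ that clears the denominators of $\mathfrak{c}^{-1}$, so $\mathfrak{b} := d\mathfrak{c}^{-1}$ is an integral ideal. By Lemma 1.7.8 there exists a nonzero element $\beta \in \mathfrak{b}$ with $|N(\beta)| \le B_K N(\mathfrak{b})$. Note that $\beta\mathcal{O}_K \subset \mathfrak{b}$, which implies $\beta\mathcal{O}_K = \mathfrak{a}\mathfrak{b}$ for some integral ideal $\mathfrak{a}$, with $\mathfrak{a} \sim \mathfrak{b}^{-1} \sim \mathfrak{c}$ in the class group. Then $N(\mathfrak{a})N(\mathfrak{b}) = |N(\beta)| \le B_K N(\mathfrak{b})$. Since $N(\mathfrak{b}) = [\mathcal{O}_K : \mathfrak{b}] > 0$, we can cancel to obtain $N(\mathfrak{a}) \le B_K$.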

In general we can compute C(OK) with the following approach:

1) Use the results from Section 1.3 to list all prime ideals p ⊂ OK that appear in the

factorization of any prime p ≤ BK .

2) Find the group generated by the ideal classes [p] for the primes found in Step 1.
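For quadratic fields, Step 1 can be sketched in a few lines. The code below is only an illustration under stated assumptions: it handles a quadratic field given by its discriminant $d_K$, and it decides the splitting type of a prime $p$ by the standard criterion (ramified if $p \mid d_K$; otherwise split or inert according to whether $d_K$ is a square mod $p$, with $d_K \bmod 8$ deciding the case $p = 2$). All function names are hypothetical.

```python
import math

def minkowski_bound(n, s, d_K):
    """B_K = (n!/n^n) (4/pi)^s sqrt|d_K|."""
    return math.factorial(n) / n**n * (4 / math.pi)**s * math.sqrt(abs(d_K))

def primes_up_to(m):
    """All rational primes p <= m, by trial division (fine for small m)."""
    return [p for p in range(2, m + 1)
            if all(p % q for q in range(2, math.isqrt(p) + 1))]

def splitting_type(d_K, p):
    """Splitting of p in the quadratic field of discriminant d_K."""
    if d_K % p == 0:
        return "ramified"
    if p == 2:
        return "split" if d_K % 8 == 1 else "inert"
    # Euler's criterion for the Legendre symbol (d_K / p).
    return "split" if pow(d_K % p, (p - 1) // 2, p) == 1 else "inert"

def step_one(d_K):
    """Minkowski bound and splitting types of all primes p <= B_K."""
    s = 1 if d_K < 0 else 0
    bound = minkowski_bound(2, s, d_K)
    return bound, {p: splitting_type(d_K, p) for p in primes_up_to(int(bound))}

print(step_one(-19))
print(step_one(-68))
```

Step 2, finding the relations among the resulting ideal classes, still has to be done by hand or with a computer algebra system.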


Example 1.7.10. Let $K = \mathbb{Q}(\sqrt{-19})$ with ring of integers $\mathcal{O}_K = \mathbb{Z}[(1 + \sqrt{-19})/2]$. Since $n = 2$, $r = 0$, $s = 1$ and $d_K = -19$, the Minkowski bound for $K$ is

\[ B_K = \frac{2!}{2^2} \left(\frac{4}{\pi}\right)^{1} \sqrt{19} \approx 2.775. \]

So every class in $C(\mathcal{O}_K)$ is represented by an integral ideal with norm either 1 or 2. The ideal $2\mathcal{O}_K$ is unramified in $K$ since $2 \nmid d_K$. The minimal polynomial of $\alpha = (1 + \sqrt{-19})/2$ is $f(x) = x^2 - x + 5$, so because $\left(\frac{-19}{2}\right) = -1$ and $f$ has no roots mod 2, Theorem 1.3.4 tells us that $2\mathcal{O}_K$ is inert and thus prime in $K$. Clearly $2\mathcal{O}_K$ is principal, so the class group is trivial. By previous comments, $h(-19) = 1$ implies that $\mathbb{Z}[(1 + \sqrt{-19})/2]$ is a PID.

Example 1.7.11. Let $K = \mathbb{Q}(\sqrt{-2})$ with $\mathcal{O}_K = \mathbb{Z}[\sqrt{-2}]$. Note that $n = 2$, $r = 0$, $s = 1$ and $d_K = -8$, so the Minkowski bound is calculated to be

\[ B_K = \frac{2!}{2^2} \left(\frac{4}{\pi}\right)^{1} \sqrt{8} \approx 1.801. \]

It easily follows that $C(\mathcal{O}_K)$ is trivial and hence $\mathbb{Z}[\sqrt{-2}]$ is a PID. In particular, $\mathbb{Z}[\sqrt{-2}]$ has unique factorization. We will use this fact to deduce a famous theorem of Fermat whose proof was first discovered by Euler.

Theorem 1.7.12 (Fermat). The only integer solutions to $x^3 = y^2 + 2$ are $(x, y) = (3, \pm 5)$.

Proof. First suppose $ab = u^3$ in $\mathbb{Z}[\sqrt{-2}]$ where $a$ and $b$ are relatively prime. We will show that $a$ and $b$ must be cubes in $\mathbb{Z}[\sqrt{-2}]$. Since $\mathbb{Z}[\sqrt{-2}]$ is a UFD, we may write $u = \gamma \prod p_i^{e_i}$ for primes $p_i \in \mathbb{Z}[\sqrt{-2}]$, integers $e_i$ and some unit $\gamma$. Then

\[ ab = u^3 = \Big(\gamma \prod p_i^{e_i}\Big)^3 = \gamma^3 \prod p_i^{3e_i}. \]

Since $a$ and $b$ are relatively prime, each $p_i$ appears in exactly one of the factorizations for $a$ and $b$. So by the above equality, $a$ and $b$ each factor into products of primes whose exponents are all of the form $3e_i$. We have not worried about the unit $\gamma$ yet, but that is because the units in $\mathbb{Z}[\sqrt{-2}]$ are $\pm 1$, each of which is a cube in $\mathbb{Z}[\sqrt{-2}]$ anyway. Thus we conclude that $a$ and $b$ are both cubes in $\mathbb{Z}[\sqrt{-2}]$.

Now suppose $(x, y)$ is an integer solution to $x^3 = y^2 + 2 = (y + \sqrt{-2})(y - \sqrt{-2})$. If $d$ divides both $y + \sqrt{-2}$ and $y - \sqrt{-2}$, then it divides their difference:

\[ (y + \sqrt{-2}) - (y - \sqrt{-2}) = 2\sqrt{-2}. \]

Since the norm is multiplicative, $N(d)$ must divide $N(2\sqrt{-2}) = 8$. Suppose $x$ were even. Then we would have $y^2 + 2 \equiv x^3 \equiv 0 \pmod 8$, or $y^2 \equiv -2 \pmod 8$. Of course $-2$ is not a square mod 8, so $x$ must be odd. This forces $y$ to be odd as well. Now $d \mid y^2 + 2 = x^3$ implies that $N(d)$ divides $N(x^3) = x^6$, which is odd, so $N(d) = 1$ and $d$ is a unit. Hence $y + \sqrt{-2}$ and $y - \sqrt{-2}$ are coprime.

By the first part of the proof, $y + \sqrt{-2}$ and $y - \sqrt{-2}$ are both cubes in $\mathbb{Z}[\sqrt{-2}]$. Write $y + \sqrt{-2} = (a + b\sqrt{-2})^3 = (a^3 - 6ab^2) + (3a^2 b - 2b^3)\sqrt{-2}$ with $a, b \in \mathbb{Z}$. We now solve for $a$ and $b$ to show that $(3, \pm 5)$ are the only valid choices for $(x, y)$. From the above, we see that $1 = 3a^2 b - 2b^3 = b(3a^2 - 2b^2)$. Since $a$ and $b$ are integers, this implies $b = \pm 1$. If $b = -1$, the other factor must satisfy $3a^2 - 2 = -1$, which can be written $3a^2 = 1$. This of course is impossible. So $b = 1$, and this means $3a^2 - 2 = 1$, which has solutions $a = \pm 1$. Plugging these values in above, we see that $y = a^3 - 6ab^2 = \pm 5$ and $x = 3$.
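A brute-force search over a finite range is of course no substitute for the proof, but it is a cheap way to see the statement of Theorem 1.7.12 in action. The sketch below is illustrative only; the search bound is an arbitrary assumption.

```python
from math import isqrt

def solutions_up_to(x_max):
    """All integer solutions (x, y) of x^3 = y^2 + 2 with 1 <= x <= x_max."""
    found = []
    for x in range(1, x_max + 1):
        y_squared = x**3 - 2
        if y_squared < 0:
            continue
        y = isqrt(y_squared)
        if y * y == y_squared:
            # Both signs of y give a solution (y = 0 cannot occur here).
            found.extend([(x, -y), (x, y)])
    return found

print(solutions_up_to(10_000))
```

Within the searched range the only pairs found should be $(3, -5)$ and $(3, 5)$, as the theorem predicts.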

1.8 The Hilbert Class Field

Prime ideals p ⊂ OK are often referred to as finite primes to distinguish them from

infinite primes, which are defined as

Definition. A real infinite prime of a number field $K$ is an embedding $\sigma : K \hookrightarrow \mathbb{R}$, while a complex infinite prime is a pair of conjugate embeddings $\sigma, \bar\sigma : K \hookrightarrow \mathbb{C}$.


We will see why it is useful to include infinite primes in our list of primes of K in

Section 2.1. For now, we use it to define unramified extensions of a number field, but

first we need to define when an infinite prime ramifies.

Definition. Given an extension L/K, an infinite prime σ of K is said to ramify in

L if σ is real and has an extension to L which is complex.

Example 1.8.1. The infinite prime $\sigma : \mathbb{Q} \hookrightarrow \mathbb{R}$ is unramified in $\mathbb{Q}(\sqrt{2})$ but $\sigma$ is ramified in $\mathbb{Q}(\sqrt{-2})$.

Definition. We say an extension of number fields L/K is unramified if every prime

in K, finite or infinite, is unramified in L.

A number field may have unramified extensions of arbitrary degree – the work

of Golod and Shafarevich [11] in the 1960s was famous for its rather complicated

examples. However, if we restrict our focus to unramified abelian extensions, the

theory becomes more tractable.

Theorem. For every number field K, there exists a finite Galois extension L ⊃ K

such that L is an unramified abelian extension of K, and L contains every other

unramified abelian extension of K.

Proof. This will follow from a more general result established in Section 2.9.

Definition. The Hilbert class field of a number field K is the maximal unramified

abelian extension of K.

For now we will assume the existence of the Hilbert class field and further develop

the connections between Hilbert class fields and rings of integers. The main tool

in describing this relationship is the Artin symbol, whose existence is proved in the

following lemma.


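Lemma 1.8.2. Let $L/K$ be a Galois extension, $\mathfrak{p} \subset \mathcal{O}_K$ an unramified prime and $\mathfrak{P}$ a prime of $\mathcal{O}_L$ lying over $\mathfrak{p}$. Then there is a unique element $\sigma \in \mathrm{Gal}(L/K)$ such that for all $\alpha \in \mathcal{O}_L$,

\[ \sigma(\alpha) \equiv \alpha^{N(\mathfrak{p})} \bmod \mathfrak{P} \]

where $N(\mathfrak{p}) = [\mathcal{O}_K : \mathfrak{p}]$ is the norm of $\mathfrak{p}$.

Proof. Let $D = D_{\mathfrak{P}}$ and $I = I_{\mathfrak{P}}$ be the decomposition and inertia groups of $\mathfrak{P} \supset \mathfrak{p}$. Let $\ell = \mathcal{O}_L/\mathfrak{P}$ and $k = \mathcal{O}_K/\mathfrak{p}$, with $G = \mathrm{Gal}(\ell/k)$. Recall from Proposition 1.4.5 that each $\sigma \in D$ maps via $\varphi$ to an element $\bar\sigma \in G$. Since $\mathfrak{p}$ is unramified in $L$, $|I| = e(\mathfrak{P} \mid \mathfrak{p}) = 1$, and since $\ker\varphi = I$ by Proposition 1.4.6, $\varphi$ is an isomorphism. Let $q = N(\mathfrak{p}) = |\mathcal{O}_K/\mathfrak{p}|$. It is well known that $G$ is a cyclic group generated by the Frobenius automorphism $x \mapsto x^q$. Thus there is a unique $\sigma \in D$ which maps to the Frobenius automorphism. Since $q = N(\mathfrak{p})$, this $\sigma$ satisfies the congruence in the lemma. Finally, any $\sigma \in \mathrm{Gal}(L/K)$ satisfying the congruence maps $\mathfrak{P}$ into itself and therefore lies in $D$, which gives uniqueness in all of $\mathrm{Gal}(L/K)$.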

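Definition. For a given prime $\mathfrak{P} \subset \mathcal{O}_L$, the unique element $\sigma \in D_{\mathfrak{P}}$ described above is called the Artin symbol, denoted $\left(\frac{L/K}{\mathfrak{P}}\right)$. For all $\alpha \in \mathcal{O}_L$, it satisfies

\[ \left(\frac{L/K}{\mathfrak{P}}\right)(\alpha) \equiv \alpha^{N(\mathfrak{p})} \bmod \mathfrak{P}, \]

where $\mathfrak{p} = \mathfrak{P} \cap \mathcal{O}_K$. The element $\left(\frac{L/K}{\mathfrak{P}}\right)$ is also called a Frobenius element for $\mathfrak{p}$.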

We will describe Frobenius automorphisms in greater detail in Section 2.2 but for

now we will focus on their relation to the Hilbert class field.

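Proposition 1.8.3. For a Galois extension $L/K$, an unramified prime $\mathfrak{p} \subset \mathcal{O}_K$ and a prime $\mathfrak{P} \supset \mathfrak{p}$, the Artin symbol has the following properties.

(i) For all $\sigma \in \mathrm{Gal}(L/K)$, $\left(\frac{L/K}{\sigma(\mathfrak{P})}\right) = \sigma \left(\frac{L/K}{\mathfrak{P}}\right) \sigma^{-1}$.

(ii) The order of $\left(\frac{L/K}{\mathfrak{P}}\right)$ in $D_{\mathfrak{P}}$ is the inertial degree $f = f(\mathfrak{P} \mid \mathfrak{p})$.

(iii) $\mathfrak{p}$ splits completely in $L \iff \left(\frac{L/K}{\mathfrak{P}}\right) = 1$.

Proof. (i) follows from the uniqueness of $\left(\frac{L/K}{\mathfrak{P}}\right)$ and the fact, proven in Section 1.3, that all primes lying over $\mathfrak{p}$ are conjugates under the action of $\mathrm{Gal}(L/K)$.

(ii) From Lemma 1.8.2, $D_{\mathfrak{P}} \cong G = \mathrm{Gal}(\ell/k)$ and the order of $G$ is $[\mathcal{O}_L/\mathfrak{P} : \mathcal{O}_K/\mathfrak{p}] = f$. By definition, the Artin symbol maps to a generator of $G$, so the order of $\left(\frac{L/K}{\mathfrak{P}}\right)$ is $f$.

(iii) Recall that $\mathfrak{p}$ splits completely if and only if $e = f = 1$. Then $e = 1$ since we are assuming $\mathfrak{p}$ is unramified in $L$, and $f = 1 \iff \left(\frac{L/K}{\mathfrak{P}}\right) = 1$ follows from part (ii).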
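A small computation makes the defining congruence concrete. The sketch below assumes the simplest possible setting, $K = \mathbb{Q}$ and $L = \mathbb{Q}(i)$ with $\mathcal{O}_L = \mathbb{Z}[i]$, which is not an example taken from the text. For an odd prime $p$ it checks whether $\alpha^p \equiv \alpha$ or $\alpha^p \equiv \bar\alpha \pmod{p}$ for every residue class $\alpha$, that is, whether the Frobenius element at $p$ is the identity or complex conjugation. The function names are hypothetical.

```python
def gauss_pow_mod(a, b, e, p):
    """(a + b*i)**e in Z[i], with both coordinates reduced mod p."""
    rx, ry = 1, 0                  # running result, starts at 1
    bx, by = a % p, b % p          # running base
    while e > 0:
        if e & 1:
            rx, ry = (rx * bx - ry * by) % p, (rx * by + ry * bx) % p
        bx, by = (bx * bx - by * by) % p, (2 * bx * by) % p
        e >>= 1
    return rx, ry

def frobenius_is_identity(p):
    """True if alpha^p = alpha (mod p) for all alpha in Z[i]."""
    return all(gauss_pow_mod(a, b, p, p) == (a, b)
               for a in range(p) for b in range(p))

def frobenius_is_conjugation(p):
    """True if alpha^p = conj(alpha) (mod p) for all alpha in Z[i]."""
    return all(gauss_pow_mod(a, b, p, p) == (a, (-b) % p)
               for a in range(p) for b in range(p))

for p in (3, 5, 7, 11, 13, 17):
    print(p, p % 4, frobenius_is_identity(p), frobenius_is_conjugation(p))
```

The output should show the identity exactly for $p \equiv 1 \pmod 4$, which are the primes that split completely in $\mathbb{Q}(i)$, in line with part (iii) of Proposition 1.8.3.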

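When $L/K$ is abelian, the Artin symbol only depends on the underlying prime $\mathfrak{p}$: if $\mathfrak{P}$ and $\mathfrak{P}'$ are both primes of $\mathcal{O}_L$ containing $\mathfrak{p}$, then $\mathfrak{P}' = \sigma(\mathfrak{P})$ for some $\sigma \in \mathrm{Gal}(L/K)$ as we have already shown. Thus (i) of the proposition implies

\[ \left(\frac{L/K}{\mathfrak{P}'}\right) = \left(\frac{L/K}{\sigma(\mathfrak{P})}\right) = \sigma \left(\frac{L/K}{\mathfrak{P}}\right) \sigma^{-1} = \sigma\sigma^{-1} \left(\frac{L/K}{\mathfrak{P}}\right) = \left(\frac{L/K}{\mathfrak{P}}\right). \]

In this case we will write the Artin symbol as $\left(\frac{L/K}{\mathfrak{p}}\right)$ to indicate that it is determined by the underlying prime $\mathfrak{p} \subset \mathcal{O}_K$.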

The Artin symbol is the first step in establishing a powerful tool in class field

theory called Artin reciprocity (Section 2.7). The name comes from the fact that it

is a generalization of more elementary reciprocity laws, such as quadratic, cubic and

biquadratic reciprocities established by Euler, Legendre and Gauss.


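When $L/K$ is an unramified abelian extension, things are especially nice. Let $I_K$ be the group of fractional ideals of $\mathcal{O}_K$. For any $\mathfrak{a} \in I_K$ with prime factorization $\mathfrak{a} = \prod \mathfrak{p}_i^{r_i}$ we can define the Artin symbol on $\mathfrak{a}$ by

\[ \left(\frac{L/K}{\mathfrak{a}}\right) = \prod \left(\frac{L/K}{\mathfrak{p}_i}\right)^{r_i}. \]

Definition. The Artin map for an extension $L/K$ is the homomorphism

\[ \left(\frac{L/K}{\cdot}\right) : I_K \longrightarrow \mathrm{Gal}(L/K). \]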

Notice that if L/K is ramified at any primes, the Artin map is not defined for all

of IK . Likewise if Gal(L/K) is not abelian, the Artin symbol may not be uniquely

defined for all p ∈ IK . For this reason many of the main theorems in class field theory

are complicated to state, as we will see in Chapter 2. However when L is the Hilbert

class field of K we have the following characterization of the Artin map.

Theorem 1.8.4 (Artin Reciprocity for the Hilbert Class Field). If $L$ is the Hilbert class field of a number field $K$, the Artin map $\left(\frac{L/K}{\cdot}\right) : I_K \longrightarrow \mathrm{Gal}(L/K)$ is surjective and its kernel is $P_K$. Therefore the Artin map induces an isomorphism $C(\mathcal{O}_K) \cong \mathrm{Gal}(L/K)$, where $C(\mathcal{O}_K) = I_K/P_K$ is the ideal class group.

Proof. This will follow from the full Artin reciprocity theorem in Section 2.7.

Using Galois theory, we have the following classification of unramified abelian

extensions of K.

Corollary 1.8.5. For a number field $K$, there is a one-to-one correspondence

\[ \{\text{unramified abelian extensions } M \supset K\} \longleftrightarrow \{\text{subgroups } H \le C(\mathcal{O}_K)\}. \]

Furthermore, if the extension $M/K$ corresponds to the subgroup $H$, then the Artin map induces an isomorphism $C(\mathcal{O}_K)/H \cong \mathrm{Gal}(M/K)$.

Proof. This too will be proven in a more general setting in Section 2.9.

This is a good example of the general strategy employed in class field theory:

describe a certain type of extensions of K – in this case unramified abelian extensions

– using information encoded in K itself, e.g. subgroups of the class group.

Corollary 1.8.6. Let L be the Hilbert class field of a number field K and let p ⊂ OK

be a prime ideal. Then p splits completely in L ⇐⇒ p is a principal ideal.

Proof. By (iii) of Proposition 1.8.3, $\mathfrak{p}$ splits completely if and only if $\left(\frac{L/K}{\mathfrak{p}}\right) = 1$. Since the Artin map induces $C(\mathcal{O}_K) \cong \mathrm{Gal}(L/K)$ by the Artin reciprocity theorem (Theorem 1.8.4), $\left(\frac{L/K}{\mathfrak{p}}\right) = 1 \iff [\mathfrak{p}]$ is trivial in the class group, which is equivalent to $\mathfrak{p}$ being a principal ideal.

The Hilbert class field has an important application to the study of primes of the form $p = x^2 + ny^2$.

Theorem 1.8.7 ([7]). Let $n > 0$ be a squarefree integer such that $n \not\equiv 3 \pmod 4$. Then there is a monic irreducible polynomial $f_n(x) \in \mathbb{Z}[x]$ of degree $h(-4n)$, the class number of $K = \mathbb{Q}(\sqrt{-n})$, such that if $p$ is an odd prime that does not divide $n$ or the discriminant of $f_n$, then

\[ p = x^2 + ny^2 \iff \left(\frac{-n}{p}\right) = 1 \text{ and } f_n(x) \equiv 0 \pmod p \text{ has an integer solution.} \]

Furthermore, $f_n(x)$ may be taken to be the minimal polynomial of any real algebraic integer $\alpha$ for which $L = K(\alpha)$ is the Hilbert class field of $K$.

We devote the rest of this section to the proof of Theorem 1.8.7 and its applications. The first step is to relate $p = x^2 + ny^2$ to the splitting behavior of $p$ in the Hilbert class field.

Theorem 1.8.8. Let $L$ be the Hilbert class field of $K = \mathbb{Q}(\sqrt{-n})$, where $n > 0$ is squarefree and $n \not\equiv 3 \pmod 4$, so that $\mathcal{O}_K = \mathbb{Z}[\sqrt{-n}]$. If $p$ is an odd prime not dividing $n$, then

\[ p = x^2 + ny^2 \iff p \text{ splits completely in } L. \]

Proof. We will prove

\[ d_K = -4n \iff \mathcal{O}_K = \mathbb{Z}[\sqrt{-n}] \iff n \text{ is squarefree and } n \not\equiv 3 \pmod 4 \]

in the next section. For now, assume the conditions on $n$ imply that $d_K = -4n$. Let $p$ be an odd prime not dividing $n$, so that $p \nmid d_K$. By Theorem 1.6.3 this means that $p$ is unramified in $K$. To prove the theorem, we will prove that the following four statements are equivalent:

(i) $p = x^2 + ny^2$;

(ii) $p\mathcal{O}_K = \mathfrak{p}\mathfrak{q}$ where $\mathfrak{p} \neq \mathfrak{q}$ and $\mathfrak{p}$ is principal in $\mathcal{O}_K$;

(iii) $p\mathcal{O}_K = \mathfrak{p}\mathfrak{q}$, $\mathfrak{p} \neq \mathfrak{q}$ and $\mathfrak{p}$ splits completely in $L$;

(iv) $p$ splits completely in $L$.

(i) $\iff$ (ii) Suppose $p = x^2 + ny^2 = (x + y\sqrt{-n})(x - y\sqrt{-n})$. Let $\mathfrak{p} = (x + y\sqrt{-n})\mathcal{O}_K$. Then $p\mathcal{O}_K = \mathfrak{p}\mathfrak{q}$ must be the prime factorization of $p\mathcal{O}_K$, where $\mathfrak{q} = \bar{\mathfrak{p}} = (x - y\sqrt{-n})\mathcal{O}_K$. Since $p$ is unramified, $\mathfrak{p} \neq \mathfrak{q}$. This entire argument is reversible, so we have proved the first equivalence.

(ii) $\iff$ (iii) follows from Corollary 1.8.6.

(iii) $\iff$ (iv) First we prove that $L$ is Galois over $\mathbb{Q}$. To do this, let $\tau$ denote complex conjugation. It is easy to see that $\tau(L)$ is an unramified abelian extension of $\tau(K) = K$. Then since $[\tau(L) : K] = [L : K]$ and $L$ is the maximal unramified abelian extension of $K$ by definition, we must have $\tau(L) = L$. Hence $\tau$ restricts to an automorphism of $L$ that is nontrivial on $K$, and this implies $L/\mathbb{Q}$ is Galois by conventional Galois theory arguments.

To finish the final equivalence, note that condition (iii) says that $p$ splits in $K$ and some prime lying over $p$ splits completely in $L$. Since $L/\mathbb{Q}$ is Galois, this is the same as $p$ splitting completely in $L$. Hence $p = x^2 + ny^2$ if and only if $p$ splits completely in $L$.

The next step is to further describe the criteria for when p splits in L.

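Theorem 1.8.9. Let $K$ be an imaginary quadratic field and $L$ be a finite extension of $K$ that is Galois over $\mathbb{Q}$. Then

(1) There exists a real algebraic integer $\alpha$ such that $L = K(\alpha)$.

(2) Let $f$ denote the minimal polynomial of $\alpha$ over $\mathbb{Q}$, with $f(x) \in \mathbb{Z}[x]$. If $p$ is an odd prime not dividing the discriminant of $f(x)$, then

\[ p \text{ splits completely in } L \iff \left(\frac{d_K}{p}\right) = 1 \text{ and } f(x) \equiv 0 \pmod p \text{ has an integer solution.} \]

Proof. (1) By hypothesis, $L/\mathbb{Q}$ is Galois, so $[L \cap \mathbb{R} : \mathbb{Q}] = [L : K]$ since $L \cap \mathbb{R}$ is the fixed field of complex conjugation. Then for any $\alpha \in L \cap \mathbb{R}$, $L \cap \mathbb{R} = \mathbb{Q}(\alpha)$ precisely when $L = K(\alpha)$. Hence if $\alpha \in \mathcal{O}_L \cap \mathbb{R}$ is chosen such that $L \cap \mathbb{R} = \mathbb{Q}(\alpha)$, then $\alpha$ is a real algebraic integer generating the extension $L/K$. Such an element exists by the primitive element theorem.

(2) Now let $f$ be the minimal polynomial of $\alpha$ over $\mathbb{Q}$. By the first part, $[L \cap \mathbb{R} : \mathbb{Q}] = [L : K]$, so $f$ is also the minimal polynomial of $\alpha$ over $K$. Let $p$ be a prime not dividing the discriminant of $f(x)$. Then $f(x)$ is separable mod $p$, and by Proposition 1.3.6,

\[ p\mathcal{O}_K = \mathfrak{p}\bar{\mathfrak{p}} \text{ where } \mathfrak{p} \neq \bar{\mathfrak{p}} \iff \left(\frac{d_K}{p}\right) = 1. \]

We may assume $p$ splits completely in $K$, so that $\mathbb{Z}/p\mathbb{Z} \cong \mathcal{O}_K/\mathfrak{p}$. Since $f(x)$ is separable over $\mathbb{Z}/p\mathbb{Z}$, it is separable over $\mathcal{O}_K/\mathfrak{p}$. Then Theorem 1.3.4 gives us

\[ \mathfrak{p} \text{ splits completely in } L \iff f(x) \equiv 0 \bmod \mathfrak{p} \text{ is solvable in } \mathcal{O}_K \iff f(x) \equiv 0 \bmod p \text{ is solvable in } \mathbb{Z}. \]

Finally (2) is proven using (iii) $\iff$ (iv) from the previous proof.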

We are now ready to prove Theorem 1.8.7.

Proof. Since the Hilbert class field $L$ of $K = \mathbb{Q}(\sqrt{-n})$ is Galois over $\mathbb{Q}$, Theorem 1.8.9 says there is a real algebraic integer $\alpha$ which is a primitive element of the extension $L/K$. Let $f_n$ be its minimal polynomial and let $p$ be a prime that does not divide $n$ or the discriminant of $f_n$. Then the previous two theorems show that

\[ p = x^2 + ny^2 \iff p \text{ splits completely in } L \iff \left(\frac{-n}{p}\right) = 1 \text{ and } f_n(x) \equiv 0 \bmod p \text{ is solvable in } \mathbb{Z}. \]

As discussed in the proof of Theorem 1.8.8, the hypotheses imply that $d_K = -4n$, so $\left(\frac{d_K}{p}\right) = \left(\frac{-n}{p}\right)$.

It remains to show that $\deg f_n = h(-4n)$, but by Artin reciprocity, $[L : K] = |\mathrm{Gal}(L/K)| = |C(\mathcal{O}_K)|$, and $h(-4n) = |C(\mathcal{O}_K)|$ when $K = \mathbb{Q}(\sqrt{-n})$, so the theorem is proved.

The polynomial $f_n(x)$ is not unique since $L/K$ has infinitely many primitive elements. We can at least use this theorem to predict $\deg f_n$, and later we will see that $f_n(x)$ completely describes the Hilbert class field, which is quite an amazing result indeed!


The Hilbert class field also allows us to relate the ideal class group C(OK) to the

form class group C(dK) for binary quadratic forms. In Section 3.2 we prove

Theorem. Let $K$ be an imaginary quadratic field of discriminant $d_K = -4n$, $n \ge 1$.

(1) If $f(x, y) = ax^2 + bxy + cy^2$ is a primitive positive definite quadratic form of discriminant $d_K$, then

\[ [a, (-b + \sqrt{d_K})/2] = \{ma + n(-b + \sqrt{d_K})/2 \mid m, n \in \mathbb{Z}\} \]

is an ideal of $\mathcal{O}_K$.

(2) The map $f(x, y) \mapsto [a, (-b + \sqrt{d_K})/2]$ induces an isomorphism $C(d_K) \cong C(\mathcal{O}_K)$, and hence $|C(\mathcal{O}_K)| = h(d_K)$, which is the number of reduced forms of discriminant $d_K$.
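Because $h(d_K)$ counts reduced forms, class numbers of imaginary quadratic fields can be found by a finite enumeration. The following sketch assumes the usual definition of a reduced primitive positive definite form $(a, b, c)$, namely $|b| \le a \le c$, with $b \ge 0$ whenever $|b| = a$ or $a = c$; the function name is hypothetical.

```python
from math import gcd

def reduced_forms(D):
    """Reduced primitive positive definite forms (a, b, c) with b^2 - 4ac = D < 0."""
    assert D < 0 and D % 4 in (0, 1)
    forms = []
    a = 1
    # |b| <= a <= c forces 3a^2 <= 4ac - b^2 = |D|.
    while 3 * a * a <= -D:
        for b in range(-a + 1, a + 1):       # -a < b <= a
            numerator = b * b - D            # equals 4ac
            if numerator % (4 * a) != 0:
                continue
            c = numerator // (4 * a)
            if c < a:
                continue
            if a == c and b < 0:
                continue
            if gcd(gcd(a, abs(b)), c) == 1:
                forms.append((a, b, c))
        a += 1
    return forms

for D in (-8, -19, -47, -68):
    print(D, len(reduced_forms(D)), reduced_forms(D))
```

The counts for $D = -8$ and $D = -19$ agree with the trivial class groups found in Section 1.7, and the same enumeration applies verbatim to odd discriminants such as $-47$.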

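Lemma 1.8.10. Let $L = K(\sqrt{\beta})$ for some $\beta \in \mathcal{O}_K$ and let $\mathfrak{p} \subset \mathcal{O}_K$ be a prime ideal. Then $\mathfrak{p}$ is unramified in $L$ if either of the following two conditions is met:

(i) $2\beta \notin \mathfrak{p}$, or

(ii) $2 \in \mathfrak{p}$, $\beta \notin \mathfrak{p}$ and $\beta = b^2 - 4c$ for some $b, c \in \mathcal{O}_K$.

Proof. (i) Since the discriminant of $x^2 - \beta$ is $4\beta \notin \mathfrak{p}$, $x^2 - \beta$ is separable mod $\mathfrak{p}$ and hence $\mathfrak{p}$ is unramified by Theorem 1.3.4.

(ii) Note that $L = K(\gamma)$ as well, where $\gamma = \frac{-b + \sqrt{\beta}}{2}$ is a root of $x^2 + bx + c$. The discriminant of $x^2 + bx + c$ is $b^2 - 4c = \beta \notin \mathfrak{p}$, so by Theorem 1.3.4 again, $\mathfrak{p}$ is unramified.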

Example 1.8.11. Let $K = \mathbb{Q}(\sqrt{-17})$. Our goal is to prove a characterization of primes of the form $p = x^2 + 17y^2$ using Theorem 1.8.7. Note that $n = 2$, $r = 0$, $s = 1$ and $d_K = -68$, so the Minkowski bound is computed as

\[ B_K = \frac{2!}{2^2} \left(\frac{4}{\pi}\right)^{1} \sqrt{68} \approx 5.250. \]


Thus the class group $C(\mathcal{O}_K)$ is generated by the classes of prime ideals with norm $\le 5$. These are the prime ideals lying over $p = 2$, 3 and 5. Corollary 1.6.13 tells us that of these, only 2 ramifies, so we have the following factorizations:

• $2\mathcal{O}_K = \mathfrak{p}_2^2$ where $\mathfrak{p}_2$ is prime.

• Using quadratic reciprocity, we calculate

\[ \left(\frac{-17}{3}\right) = \left(\frac{-1}{3}\right)\left(\frac{17}{3}\right) = \left(\frac{-1}{3}\right)\left(\frac{2}{3}\right) = (-1)(-1) = 1. \]

Thus by our characterization of quadratic extensions in Proposition 1.3.6, 3 splits in $K$ and we write $3\mathcal{O}_K = \mathfrak{p}_3\mathfrak{p}_3'$ for prime ideals $\mathfrak{p}_3 \neq \mathfrak{p}_3'$.

• Likewise, for 5 we have

\[ \left(\frac{-17}{5}\right) = \left(\frac{-1}{5}\right)\left(\frac{17}{5}\right) = \left(\frac{-1}{5}\right)\left(\frac{2}{5}\right) = (1)(-1) = -1. \]

So 5 is inert, i.e. $5\mathcal{O}_K$ is prime.

This shows that $C(\mathcal{O}_K)$ may be generated by $[\mathfrak{p}_2]$ and $[\mathfrak{p}_3]$, since $\mathfrak{p}_3\mathfrak{p}_3'$ is principal. Suppose $\mathfrak{p}_2$ is principal, say $\mathfrak{p}_2 = \alpha\mathcal{O}_K$ for $\alpha = a + b\sqrt{-17}$. Then $2\mathcal{O}_K = \mathfrak{p}_2^2 = \alpha^2\mathcal{O}_K$, so we must have $4 = N(2\mathcal{O}_K) = N(\alpha)^2$, or $N(\alpha) = \pm 2$. However the equation $a^2 + 17b^2 = \pm 2$ has no integer solutions, so $\mathfrak{p}_2$ must not be principal. Thus its ideal class is an element of order 2 in the class group. Similar arguments show that $\mathfrak{p}_3$ is not principal, and that $[\mathfrak{p}_3]^2 = [\mathfrak{p}_2]$ in the class group. Therefore $|C(\mathcal{O}_K)| = 4$.

We claim that the Hilbert class field of $K$ is $L = K(\alpha)$, where $\alpha = \sqrt{(1 + \sqrt{17})/2}$, following a suggestion in [7]. The work above shows the Hilbert class field is a degree 4 extension of $K$, so it suffices to show that $L = K(\alpha)$ is an unramified abelian extension of degree 4 over $K$; the claim then follows from the uniqueness of the Hilbert class field.


It's easy to verify, using the minimal polynomial $x^2 - x - 4$ for $\alpha^2 = (1 + \sqrt{17})/2$, that the minimal polynomial for $\alpha$ is $f(x) = x^4 - x^2 - 4$, which splits in $L$. This shows that $L/K$ is Galois, so $[L : K] = 4$. Of course every group of order 4 is abelian, so $L/K$ is an abelian extension. It remains to check that $L/K$ is unramified at every prime of $\mathcal{O}_K$.

Of course any infinite prime is unramified since $K = \mathbb{Q}(\sqrt{-17})$ is imaginary quadratic and thus has no real embeddings. We will use Lemma 1.8.10 to show that $E/K$ and $L/E$, where $E = K(\sqrt{17})$, are both unramified extensions and it will follow that $L/K$ is unramified. As a side note, observe that $\alpha^2 = (1 + \sqrt{17})/2$ implies $\sqrt{17} \in L$, so $K \subset K(\sqrt{17}) \subset L$ and thus it makes sense to define the extensions $E/K$ and $L/E$.

Let $\mathfrak{p}$ be a prime ideal of $\mathcal{O}_K$. Since (i) of Lemma 1.8.10 tells us that $\mathfrak{p}$ is unramified in $E$ whenever $2 \cdot 17 \notin \mathfrak{p}$, and the prime over 17 is handled in the same way as below, let us assume $2 \in \mathfrak{p}$. Note that $17 \notin \mathfrak{p}$ and 17 can be written

\[ 17 = 1^2 - 4(-4) \]

and $1, -4 \in \mathbb{Z} \subset \mathcal{O}_K$, so (ii) of the lemma tells us that $\mathfrak{p}$ is unramified in $E$. Thus $E/K$ is an unramified extension.

Now we turn our attention to $L/E$. Let $\mu = (1 + \sqrt{17})/2$ and $\mu' = (1 - \sqrt{17})/2$, so that $L = E(\sqrt{\mu}) = E(\sqrt{\mu'})$. Suppose $\mathfrak{p} \subset \mathcal{O}_E$ is a prime ideal; we may assume $2 \in \mathfrak{p}$ by (i) (note that $\mu\mu' = -4$, so $\mu \notin \mathfrak{p}$ whenever $2 \notin \mathfrak{p}$), and furthermore $1 \notin \mathfrak{p}$, else it's the whole ring of integers. Notice that $\mu + \mu' = 1 \notin \mathfrak{p}$, so that either $\mu \notin \mathfrak{p}$ or $\mu' \notin \mathfrak{p}$. But these each satisfy $x = x^2 - 4$, which is of the form $b^2 - 4c$ with $b = x$ and $c = 1$, so (ii) of the lemma tells us that $\mathfrak{p}$ is unramified.

We have shown $L/K$ to be an unramified abelian extension of degree 4, so by uniqueness it is the Hilbert class field. We now use this to prove a theorem for primes of the form $x^2 + 17y^2$.


Theorem 1.8.12. Let $p \neq 17$ be an odd prime. Then

\[ p = x^2 + 17y^2 \iff \left(\frac{-17}{p}\right) = 1 \text{ and } x^2(x^2 - 1) \equiv 4 \bmod p \text{ has an integer solution.} \]

Proof. Let $K = \mathbb{Q}(\sqrt{-17})$. We proved that the Hilbert class field of $K$ is $L = K(\alpha)$ where $\alpha = \sqrt{(1 + \sqrt{17})/2}$. We also know that the minimal polynomial for $\alpha$ is $f_{17}(x) = x^4 - x^2 - 4 = x^2(x^2 - 1) - 4$. Note that the discriminant of $f_{17}$ is $-2^{6} \cdot 17^2$, which explains why we remove $p = 2$ and 17 from consideration. The result follows from Theorem 1.8.7.
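The criterion is easy to test numerically. The sketch below compares, for each odd prime $p \neq 17$ below an arbitrarily chosen limit, a direct search for a representation $p = x^2 + 17y^2$ with the two conditions of Theorem 1.8.12. It is only a sanity check under these assumptions, and the helper names are hypothetical.

```python
from math import isqrt

def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    return m >= 2 and all(m % q for q in range(2, isqrt(m) + 1))

def is_x2_plus_17y2(p):
    """True if p = x^2 + 17 y^2 for some integers x, y."""
    y = 0
    while 17 * y * y <= p:
        rest = p - 17 * y * y
        if isqrt(rest) ** 2 == rest:
            return True
        y += 1
    return False

def criterion(p):
    """(-17/p) = 1 and x^4 - x^2 - 4 has a root mod p."""
    legendre_is_one = pow(-17 % p, (p - 1) // 2, p) == 1
    has_root = any((x**4 - x**2 - 4) % p == 0 for x in range(p))
    return legendre_is_one and has_root

def criterion_matches(limit):
    """Compare both sides of the theorem for odd primes p != 17 below limit."""
    return all(is_x2_plus_17y2(p) == criterion(p)
               for p in range(3, limit) if is_prime(p) and p != 17)

print(criterion_matches(300))
print([p for p in range(3, 200) if is_prime(p) and p != 17 and is_x2_plus_17y2(p)])
```

For instance $53 = 6^2 + 17 \cdot 1^2$, and $x = 8$ is a root of $x^4 - x^2 - 4$ modulo 53, while $p = 3$ satisfies $\left(\frac{-17}{3}\right) = 1$ but admits no root and no representation.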

It is clear that even when K is only quadratic, the Hilbert class field is nontrivial

to compute. In the next example we show how Magma can be used to facilitate these

computations.

Example 1.8.13. Let K = Q(√−47). According to Magma, the class number of K

is 5:

> R<x> := PolynomialRing(RationalField());

> K<a> := NumberField(x^2 + 47);

> ClassNumber(K);

5

The next command produces the minimal polynomial for a primitive element of

L/K, where L is the Hilbert class field of K.

> L<b> := HilbertClassField(K);

> f<x> := MinimalPolynomial(b,RationalField());

> f;

x^10 + 10*x^8 - 295*x^6 + 17200*x^4 + 726840*x^2 + 6539063

Set $f(x) = x^{10} + 10x^8 - 295x^6 + 17200x^4 + 726840x^2 + 6539063$ and take $\alpha$ to be a root of $f$. Let $L = \mathbb{Q}(\alpha)$. It is easy to verify that $L$ is indeed the Hilbert class field of $K$. Since $f$ splits in $L$, $L/\mathbb{Q}$ is Galois and $[L : \mathbb{Q}] = 10$. This of course means $L/K$ is Galois and $[L : K] = 5$. Further, the only group of order 5 is $\mathbb{Z}/5\mathbb{Z}$, which is abelian, so $L/K$ is an abelian extension. Finally, Magma shows the extension is unramified:

> OL := MaximalOrder(L);

> IsUnramified(OL);

true

The commands Discriminant(OL) and Different(OL) may also be used to verify ramification. In any case, we have shown $L = \mathbb{Q}(\alpha)$ to be the maximal unramified abelian extension of $K$. Unfortunately, we cannot prove a theorem for primes of the form $x^2 + 47y^2$ just yet, as $47 \equiv 3 \pmod 4$ and Theorem 1.8.7 won't apply.

1.9 Orders

In the previous section we were able to prove a full characterization of when a prime is of the form $p = x^2 + ny^2$ given certain restrictions on $n$. We have thus described the main question for infinitely many $n$, but what about the rest?

In general, if $K = \mathbb{Q}(\sqrt{n})$ with $n$ squarefree, we have the following characterization (see [7]) of the ring of integers:

\[ \mathcal{O}_K = \begin{cases} \mathbb{Z}[\sqrt{n}] & \text{if } n \not\equiv 1 \pmod 4 \\ \mathbb{Z}\left[\frac{1 + \sqrt{n}}{2}\right] & \text{if } n \equiv 1 \pmod 4. \end{cases} \]

Recall that for a quadratic extension, the field discriminant is given by

\[ d_K = \begin{cases} n & \text{if } n \equiv 1 \pmod 4 \\ 4n & \text{otherwise.} \end{cases} \]

Using this allows us to write the ring of integers more succinctly:

\[ \mathcal{O}_K = \mathbb{Z}\left[\frac{d_K + \sqrt{d_K}}{2}\right]. \]

The important thing is that when $n$ does not satisfy the criteria in Section 1.8, i.e. when $\mathbb{Z}[\sqrt{-n}]$ is not the full ring of integers for $\mathbb{Q}(\sqrt{-n})$, we still have a characterization that involves $\mathbb{Z}[\sqrt{-n}]$. We will make some headway on the $x^2 + ny^2$ question towards the end of this section, but a full characterization of primes of the form $x^2 + ny^2$ will not be possible until we have the theorems of class field theory at our disposal.

The ring Z[√−n] is an example of an order. In Lemma 1.9.2 we will prove that

the following definition is equivalent to the one given in Section 1.1.

Definition. Let K be a number field. Then a subring O ⊂ K is an order if

• 1K ∈ O

• O is finitely generated as a Z-module

• O contains a Q-basis of K.

There is a more general notion of an order in an arbitrary ring R, but the behavior

is quite different even when R is not a field. We will primarily make use of orders in

quadratic fields.

Proposition 1.9.1. Let O be an order in a quadratic number field K.

(1) O is a free Z-module of rank 2.

(2) K is the field of fractions of O.

(3) OK is an order in K containing every other order. In other words OK is the

maximal order in K.

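Proof. (1) Clearly $\mathcal{O}$ is torsion free, so since it is a finitely generated $\mathbb{Z}$-module it is free. Also, since $\mathcal{O}$ contains a $\mathbb{Q}$-basis of a quadratic field, $\mathcal{O}$ has rank at least 2, and since $\mathcal{O} \subset K$ its rank is at most $[K : \mathbb{Q}] = 2$, so it must be exactly rank 2.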

(2) follows from the fact that O contains a Q-basis for K.


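(3) Since $1_K \in \mathcal{O}_K$ and $\mathcal{O}_K$ is a $\mathbb{Z}$-module of rank $[K : \mathbb{Q}] = 2$ by Proposition 1.1.13, it suffices to show that $\mathcal{O}_K$ contains a basis for $K/\mathbb{Q}$. But this follows from the discussion above: $\mathcal{O}_K$ is generated by 1 and $\frac{d_K + \sqrt{d_K}}{2}$.

Now let $\mathcal{O}$ be any order in $K$. Since $\mathcal{O}$ is a finitely generated $\mathbb{Z}$-module, it is noetherian as a $\mathbb{Z}$-module. Let $\alpha \in \mathcal{O}$ and consider the chain of $\mathbb{Z}$-submodules $I_0 \subset I_1 \subset I_2 \subset \cdots$ where $I_0 = \mathbb{Z}$ and for $n \ge 1$,

\[ I_n = \mathbb{Z} + \alpha\mathbb{Z} + \alpha^2\mathbb{Z} + \dots + \alpha^n\mathbb{Z}. \]

By the noetherian condition, there is some $n$ such that for all $m \ge n$, $I_m = I_n$. So for all such $m$ we have $\mathbb{Z} + \alpha\mathbb{Z} + \dots + \alpha^m\mathbb{Z} = \mathbb{Z} + \alpha\mathbb{Z} + \dots + \alpha^n\mathbb{Z}$. This implies that every power $\alpha^m$ is a $\mathbb{Z}$-linear combination of $1, \alpha, \dots, \alpha^n$. This shows that $\mathbb{Z}[\alpha]$ is finitely generated as a $\mathbb{Z}$-module, so Proposition 1.1.4 shows $\alpha \in \mathcal{O}_K$. Thus $\mathcal{O} \subset \mathcal{O}_K$.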

Recall that Z+ niZ is an order in Q(i) for every nonzero n ∈ Z. The next lemma

shows that this is essentially the form of every order in a quadratic field.

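Lemma 1.9.2. Let $\mathcal{O}$ be an order in a quadratic field $K$ with discriminant $d_K$ and ring of integers $\mathcal{O}_K$. Then $f = [\mathcal{O}_K : \mathcal{O}]$ is finite and $\mathcal{O} = \mathbb{Z} + f\mathcal{O}_K$.

Proof. The finiteness of $f$ is a result of the fact that $\mathcal{O}$ and $\mathcal{O}_K$ are both free $\mathbb{Z}$-modules of rank 2. On one hand, since $f = [\mathcal{O}_K : \mathcal{O}]$ we have

\[ f\mathcal{O}_K \subset \mathcal{O} \implies \mathbb{Z} + f\mathcal{O}_K \subset \mathcal{O}. \]

On the other hand, our description of $\mathcal{O}_K$ at the beginning of the section allows us to write $\mathbb{Z} + f\mathcal{O}_K = [1, f w_K]$, where

\[ w_K = \frac{d_K + \sqrt{d_K}}{2}. \]

Clearly $[1, f w_K]$ has index $f$ in $[1, w_K] = \mathcal{O}_K$. Since $\mathbb{Z} + f\mathcal{O}_K \subset \mathcal{O}$ and both have index $f$ in $\mathcal{O}_K$, they are equal, which proves the result.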

Definition. The index f = [OK : O] is called the conductor of the order.

This is not to be confused with the conductor of an extension in class field theory,

which will be discussed in Section 2.8. To add to the clutter, each order has an

associated value called the discriminant which is distinct from, although related to,

the field discriminant.

Definition. For an order $[\alpha, \beta]$, its discriminant is defined to be

\[ D = \left(\det \begin{bmatrix} \alpha & \beta \\ \alpha' & \beta' \end{bmatrix}\right)^2 \]

where $\alpha'$ and $\beta'$ denote the respective images of $\alpha$ and $\beta$ under the nontrivial automorphism of $K/\mathbb{Q}$.

The discriminant of an order is independent of the basis chosen: if $A = \begin{bmatrix} \alpha & \beta \\ \alpha' & \beta' \end{bmatrix}$, then changing basis multiplies $A$ by an integer matrix $B$ with $\det B = \pm 1$, and this doesn't change the value of $(\det A)^2$ above. Therefore we can let $\mathcal{O} = [1, f w_K]$ as in Lemma 1.9.2 and have $D = f^2 d_K$. Together with Lemma 1.9.2, this shows that an order in $K$ is determined by its conductor, or equivalently by its discriminant. Moreover, the maximal order $\mathcal{O}_K$ has conductor 1, which shows that the discriminant of $\mathcal{O}_K$ is $d_K$.

By our description of $d_K$ for quadratic fields, we see that $D \equiv 0, 1 \pmod 4$. Let $K = \mathbb{Q}(\sqrt{-n})$ for any positive integer $n$. Then $\mathbb{Z}[\sqrt{-n}]$ is an order in $K$ with discriminant $-4n$. By the comments above, $-4n = f^2 d_K$, which makes it relatively easy to compute the conductor of $\mathbb{Z}[\sqrt{-n}]$.
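The relation $-4n = f^2 d_K$ can be turned into a short computation. The sketch below assumes $n$ is a positive integer, obtains $d_K$ from the squarefree part of $n$ using the description of quadratic field discriminants given above, and then solves for $f$. The function names are hypothetical.

```python
from math import isqrt

def squarefree_part(n):
    """Squarefree part of a positive integer n (n divided by its largest square factor)."""
    m, d = n, 2
    while d * d <= m:
        while m % (d * d) == 0:
            m //= d * d
        d += 1
    return m

def field_discriminant(n):
    """d_K for K = Q(sqrt(-n)), n > 0."""
    n0 = squarefree_part(n)
    # -n0 = 1 mod 4 gives d_K = -n0; otherwise d_K = -4 n0.
    return -n0 if (-n0) % 4 == 1 else -4 * n0

def conductor(n):
    """Conductor f of the order Z[sqrt(-n)], from -4n = f^2 d_K."""
    d_K = field_discriminant(n)
    f = isqrt((-4 * n) // d_K)
    assert f * f * d_K == -4 * n
    return f

for n in (1, 2, 3, 12, 17, 47):
    print(n, field_discriminant(n), conductor(n))
```

In particular $\mathbb{Z}[\sqrt{-17}]$ has conductor 1, so it is the maximal order, while $\mathbb{Z}[\sqrt{-3}]$ and $\mathbb{Z}[\sqrt{-47}]$ both have conductor 2.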

In fact, if $D \equiv 0$ or $1 \pmod 4$ is not a perfect square, there will be an order in a quadratic field whose discriminant is $D$. For $D \equiv 0 \pmod 4$, we may write $D = 4n$ and see that the order $\mathbb{Z}[\sqrt{n}] = [1, \sqrt{n}]$ in $K = \mathbb{Q}(\sqrt{n})$ has discriminant $4n = D$. On the other hand, if $D \equiv 1 \pmod 4$, the order $\mathbb{Z}\left[\frac{1 + \sqrt{D}}{2}\right]$ in $\mathbb{Q}(\sqrt{D})$ has discriminant $D$.

Recall that $\mathcal{O}_K$ is a Dedekind domain and has unique factorization of ideals. Unfortunately this is not true in general for an order $\mathcal{O} \subsetneq \mathcal{O}_K$, so our description of

the ideals of O requires a bit more care. It turns out that we can still define a class

group C(O) by restricting to certain types of ideals. One should view the subsequent

construction as a precursor to the types of constructions used in class field theory in

Chapter 2.

Proposition 1.9.3. Let a be a nonzero ideal in an order O of K. Then the quotient

O/a is finite.

Proof. By Proposition 1.5.2, every nonzero ideal a of the maximal order OK has finite

index in OK . If b is a nonzero ideal in an order O of K, Proposition 1.9.1 tells us

that O ⊂ OK so that b ⊂ OK . Then [OK : b] = [OK : O][O : b] and the left side is

finite, so [O : b] must also be finite.

This allows us to define

Definition. For an order O, the norm of an O-ideal a is N(a) = [O : a].

For any nonzero ideal a ⊂ O, O ⊆ {β ∈ K : βa ⊂ a}, but equality may not always

hold. The ideals for which equality does hold have a special name.

Definition. An ideal a of an order O is a proper ideal if O = {β ∈ K : βa ⊂ a}.

Notice that principal ideals are always proper. Also, every ideal of the maximal

order OK is proper. From this definition we proceed with our construction of a class

group for O by defining an analog of fractional ideals.


Definition. For an order O, a fractional O-ideal is a subset of K which is finitely

generated as an O-module. We say a fractional O-ideal b is proper if O = {β ∈ K :

βb ⊂ b}.

Proposition 1.9.4. Every fractional O-ideal is of the form αa for some nonzero

α ∈ K and ideal a ⊂ O.

Proof. This is identical to the property for fractional ideals of a Dedekind domain;

see Section 1.2.

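Lemma 1.9.5. Let $K = \mathbb{Q}(\alpha)$ be a quadratic field and suppose $ax^2 + bx + c$ is the minimal polynomial for $\alpha$ with integer coefficients; we may assume $(a, b, c) = 1$. Then $[1, \alpha]$ is a proper fractional ideal of the order $[1, a\alpha]$ in $K$.

Proof. First, $[1, a\alpha] = \mathbb{Z} + a\alpha\mathbb{Z}$ is an order in $K$ (compare Lemma 1.9.2), since $a\alpha$ is an algebraic integer: it is a root of $x^2 + bx + ac$. Now suppose $\beta \in K$ such that $\beta[1, \alpha] \subset [1, \alpha]$. Equivalently,

\[ \beta \cdot 1 \in [1, \alpha] \quad \text{and} \quad \beta \cdot \alpha \in [1, \alpha]. \]

The first of these gives us $\beta = j + k\alpha$ for $j, k \in \mathbb{Z}$, so we can write the second as

\[ \beta \cdot \alpha = (j + k\alpha)\alpha = j\alpha + k\alpha^2 = j\alpha + \frac{k}{a}(-b\alpha - c) = -\frac{ck}{a} + \left(-\frac{bk}{a} + j\right)\alpha. \]

By hypothesis $(a, b, c) = 1$, so the above shows $\beta \cdot \alpha \in [1, \alpha]$ if and only if $a \mid k$. Thus

\[ \{\beta \in K : \beta[1, \alpha] \subset [1, \alpha]\} = [1, a\alpha], \]

proving $[1, \alpha]$ is a proper fractional ideal of $[1, a\alpha]$.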

For orders in a quadratic field, we have the following characterization of their

fractional ideals.

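Proposition 1.9.6. A fractional $\mathcal{O}$-ideal $\mathfrak{a}$ is proper if and only if $\mathfrak{a}$ is invertible.

Proof. ($\Longleftarrow$) If $\mathfrak{a}$ is invertible, there exists some fractional $\mathcal{O}$-ideal $\mathfrak{b}$ such that $\mathfrak{a}\mathfrak{b} = \mathcal{O}$. Suppose $\beta \in K$ such that $\beta\mathfrak{a} \subset \mathfrak{a}$. Then $\beta\mathcal{O} = \beta(\mathfrak{a}\mathfrak{b}) = (\beta\mathfrak{a})\mathfrak{b} \subset \mathfrak{a}\mathfrak{b} = \mathcal{O}$. This implies $\beta \in \mathcal{O}$, so $\mathfrak{a}$ is a proper fractional $\mathcal{O}$-ideal.

($\Longrightarrow$) Suppose $\mathfrak{a}$ is a proper fractional $\mathcal{O}$-ideal. Since $K$ is quadratic, $\mathfrak{a}$ is a free $\mathbb{Z}$-module of rank 2, so $\mathfrak{a} = [\beta, \gamma]$ for some $\beta, \gamma \in K$. Let $\alpha = \gamma/\beta$; then $\mathfrak{a} = \beta[1, \alpha]$ and Lemma 1.9.5 implies that $\mathcal{O} = [1, a\alpha]$, where $ax^2 + bx + c$ is the minimal polynomial of $\alpha$ over $\mathbb{Q}$, scaled so that $a, b, c$ are integers with $(a, b, c) = 1$. Let $z \mapsto z'$ be the nontrivial automorphism in $\mathrm{Gal}(K/\mathbb{Q})$. Since $\alpha'$ is also a root of $ax^2 + bx + c$, Lemma 1.9.5 also shows that $\mathfrak{a}' = \beta'[1, \alpha']$ is a fractional $\mathcal{O}$-ideal. We will show that $a\,\mathfrak{a}\mathfrak{a}' = N(\beta)\mathcal{O}$. Note that

\[ a\,\mathfrak{a}\mathfrak{a}' = a\beta\beta'[1, \alpha][1, \alpha'] = N(\beta)[a, a\alpha, a\alpha', a\alpha\alpha']. \]

Also observe that $\alpha + \alpha' = -\frac{b}{a}$ and $\alpha\alpha' = \frac{c}{a}$, so

\[ a\,\mathfrak{a}\mathfrak{a}' = N(\beta)[a, a\alpha, -b, c] = N(\beta)[1, a\alpha] = N(\beta)\mathcal{O} \]

since $(a, b, c) = 1$. This proves the claim, and it follows that $\mathfrak{a}$ is invertible.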

Example 1.9.7. O = Z[√−3] is an order of conductor 2 in K = Q(√−3). Consider
the ideal [2, 1 + √−3] in O. It’s easy to see that

O ⊊ {β ∈ K : β[2, 1 + √−3] ⊂ [2, 1 + √−3]} = OK,

so this ideal is not proper. Further, 2, 1 + √−3 and 1 − √−3 are all irreducible in O, but
4 = 2 · 2 = (1 + √−3)(1 − √−3), showing that unique factorization fails in O.
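The claims in Example 1.9.7 can be checked by direct computation. The following is a minimal sketch, not part of the original text: elements of K are stored as coordinate pairs, and the helper names (mul, norm, in_order, in_ideal) are illustrative assumptions. It verifies that ω = (1 + √−3)/2 stabilizes the ideal without lying in O, that 4 factors in two ways, and that O has no element of norm 2 (so the elements of norm 4 are irreducible).

```python
from fractions import Fraction

# Elements of K = Q(sqrt(-3)) are stored as pairs (x, y) meaning x + y*sqrt(-3).
def mul(u, v):
    (a, b), (c, d) = u, v
    return (a * c - 3 * b * d, a * d + b * c)

def norm(u):
    x, y = u
    return x * x + 3 * y * y

def in_order(u):
    """Membership in O = Z[sqrt(-3)]: both coordinates are integers."""
    return all(Fraction(t).denominator == 1 for t in u)

def in_ideal(u):
    """Membership in [2, 1 + sqrt(-3)] = {2m + n(1 + sqrt(-3)) : m, n in Z}."""
    x, y = u
    # Solving 2m + n = x, n = y gives n = y and m = (x - y)/2.
    return in_order(u) and (x - y) % 2 == 0

two = (Fraction(2), Fraction(0))
gen = (Fraction(1), Fraction(1))            # 1 + sqrt(-3)
omega = (Fraction(1, 2), Fraction(1, 2))    # (1 + sqrt(-3))/2, an element of O_K

# omega maps both Z-generators of the ideal back into the ideal.
stabilizes = in_ideal(mul(omega, two)) and in_ideal(mul(omega, gen))
product = mul((1, 1), (1, -1))              # (1 + sqrt(-3))(1 - sqrt(-3))
norm_two_elements = [(x, y) for x in range(-2, 3) for y in range(-2, 3)
                     if norm((x, y)) == 2]
print(stabilizes, in_order(omega), product, norm_two_elements)
```

Since 2 and 1 ± √−3 all have norm 4 and no element of O has norm 2, any factorization of one of them in O must involve a unit.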

In the next theorem we construct a class group C(O) for an order in a quadratic

number field. As with the class group in Section 1.7, we take a quotient of a fractional

ideal group by some principal fractional ideals, but in this context we must restrict

our consideration to proper fractional ideals in O.

Theorem 1.9.8. Given an order O in a quadratic number field, the set I(O) of

proper fractional O-ideals forms a group under ideal multiplication. Moreover, the

set P (O) of principal O-ideals is a subgroup of I(O) and hence the ideal class group

C(O) = I(O)/P (O) is defined.

Proof. Let a and b be proper fractional ideals of the order O. By Proposition 1.9.6,

it is equivalent to consider invertible ideals. First note that O is clearly the identity

in I(O). Since a is invertible, there is some fractional O-ideal which we will denote

a−1, such that aa−1 = O. This shows that a−1 is also invertible and hence proper, so

I(O) has inverses.

Now consider the product (ab)c, where we set c = b−1a−1. Then

(ab)c = abb−1a−1 = aOa−1 = aa−1 = O

so we see that ab is invertible and hence proper. This proves that I(O) is a group.

Clearly P (O) is a subgroup of I(O) since every principal ideal is proper, and the

product of principal ideals is again principal. C(O) = I(O)/P (O) is a quotient of

abelian groups, so it is a group. This completes the proof of the theorem.

In order to make our work on orders in quadratic fields more compatible with

the rest of class field theory, it will be advantageous to translate O-ideals into the

language of OK-ideals.

Definition. Given an order O of conductor f , we say that a nonzero O-ideal a is

prime to f if a + fO = O.

Lemma 1.9.9. Let O be an order of conductor f .

(1) An O-ideal a is prime to f ⇐⇒ N(a) is relatively prime to f .

(2) Every O-ideal that is prime to f is proper.

Proof. (1) Define the map ϕf : O/a→ O/a to be multiplication by f . Note that

a + fO = O ⇐⇒ ϕf is surjective

⇐⇒ ϕf is an isomorphism

⇐⇒ f and |O/a| are relatively prime

where the last equivalence comes from the fundamental theorem of finite abelian

groups. Then by definition of numerical norm, |O/a| = N(a) so (1) is proved.

(2) Suppose a is prime to the conductor. Let β ∈ K and suppose βa ⊂ a. Then

βO = β(a + fO) = βa + βfO ⊂ a + fOK .

But fOK ⊂ O so βO ⊂ O which proves β ∈ O. Hence a is proper.

Note that since the norm is multiplicative, (1) shows that the O-ideals prime to the
conductor are closed under multiplication, so they generate a subgroup I(O, f) ≤ I(O). Moreover, the principal ideals

{αO | α ∈ O, (N(α), f) = 1}

generate a subgroup P (O, f) of I(O, f). The next proposition describes the class group C(O) in

terms of O-ideals prime to the conductor.

Proposition 1.9.10. I(O, f)/P (O, f) ∼= I(O)/P (O) = C(O).

Proof. A result in Chapter 3 will imply that every ideal class in C(O) contains a

proper O-ideal whose norm is prime to a fixed M ∈ Z. Thus the map I(O, f)→ C(O)

is surjective with kernel I(O, f) ∩ P (O), so it suffices to show P (O, f) = I(O, f) ∩

P (O).

On one hand, P (O, f) ⊂ I(O, f) ∩ P (O) is clear from the definitions of these

subgroups. On the other hand, every element of I(O, f) ∩ P (O) is a fractional ideal

of the form αO = ab−1, where α ∈ K and a, b are O-ideals prime to f . Let m = N(b).

Then mO = bb̄ ∈ P (O, f), where b̄ denotes the conjugate ideal of b, and mb⁻¹ = b̄, which implies

mαO = mab⁻¹ = a(mb⁻¹) = ab̄ ⊂ O.

So mαO ∈ P (O, f). It follows that αO = (mαO)(mO)−1 ∈ P (O, f) and hence the

kernel is equal to P (O, f).

Given any positive integer m, an OK-ideal a is prime to m provided that a +

mOK = OK . By Lemma 1.9.9, this is equivalent to (N(a),m) = 1. This implies

that for every ring of integers OK , inside the group of fractional OK-ideals we have a

subgroup IK(m) ≤ IK . In Section 2.3 we will generalize this construction using class

field theory, but for now we have

Theorem 1.9.11. Let O be the order of conductor f in a quadratic field K.

(1) If a is an OK-ideal prime to f , then a ∩ O is an O-ideal prime to f and

N(a ∩ O) = N(a), where the first norm is taken with respect to O and the

second with respect to OK.

(2) If b is an O-ideal prime to f , then bOK is an OK-ideal prime to f with the

same norm.

(3) IK(f) ∼= I(O, f).

Proof. (1) Let a be an OK-ideal prime to f . By the natural injection ν : O/(a∩O) ↪→

OK/a, (N(a), f) = 1 implies (N(a∩O), f) = 1 as well. This shows a∩O is prime to

f . As in Lemma 1.9.9, the map ϕf is an automorphism of OK/a, but fOK ⊂ O so

the injection ν is also a surjection. Hence the norms are equal.

(2) and (3) Let b be an O-ideal prime to f . Then

bOK + fOK = (b + fO)OK = OOK = OK

which shows that bOK is an OK-ideal prime to f . In a moment we will show the

norms are equal, but first consider

bOK ∩ O = (bOK ∩ O)O

= (bOK ∩ O)(b + fO)

⊂ b + f(bOK ∩ O)

⊂ b + b(fOK).

Since fOK ⊂ O this proves bOK ∩ O ⊂ b. The other containment, b ⊂ bOK ∩ O, is

clear so we have bOK ∩ O = b.

On the other hand, suppose a is an OK-ideal prime to f . Then

a = aO = a(a ∩ O + fO) ⊂ (a ∩ O)OK + fa,

but fa ⊂ fOK ⊂ O so fa ⊂ a ∩ O ⊂ (a ∩ O)OK and it follows that a ⊂ (a ∩ O)OK .

Again the other inclusion is obvious, so we have (a∩O)OK = a. These two identities

for O- and OK-ideals, along with (1), prove the equality of norms in (2). Furthermore

we have established a bijection

IK(f) ←→ I(O, f)
a ↦ a ∩ O
bOK ↤ b.

To show this is an isomorphism, we must simply check that it is multiplicative:

(aa′)OK = (aOK)(a′OK)

and we have proven the theorem.

Corollary 1.9.12. Every O-ideal prime to the conductor has a unique decomposition

as a product of prime O-ideals which are prime to the conductor.

Proof. Apply unique factorization of ideals in OK and Theorem 1.9.11.

Finally we describe C(O) in terms of the maximal order.

Theorem 1.9.13. Let O be the order of conductor f in an imaginary quadratic field
K and define the subgroup PK,Z(f) of IK(f) by

PK,Z(f) = {αOK | α ∈ OK and α ≡ a mod fOK for some a ∈ Z, (a, f) = 1}.

Then C(O) ∼= IK(f)/PK,Z(f).

Proof. We have proven that C(O) ∼= I(O, f)/P (O, f). In the proof of Theorem 1.9.11

we saw that I(O, f) ∼= IK(f), so it suffices to show that the image of P (O, f) under

this isomorphism is PK,Z(f). To do so, we will prove that for α ∈ OK ,

α ≡ a mod fOK , a ∈ Z, (a, f) = 1 ⇐⇒ α ∈ O, (N(α), f) = 1.

( =⇒ ) Assume α ≡ a mod fOK where a ∈ Z is relatively prime to f . By

definition of the numerical norm in a quadratic field, N(α) ≡ a2 (mod f) which

implies (N(α), f) = (a2, f) = 1. Since fOK ⊂ O we see that α ∈ O.

( ⇐= ) Conversely, suppose α ∈ O = [1, fwK ] with (N(α), f) = 1. We may write

α = a + bfwK for a, b ∈ Z, so α ≡ a mod fOK . Since (N(α), f) = 1, N(α) ≡ a2

(mod f) again implies (a, f) = 1. This proves the stated equivalence.

Now by definition P (O, f) is generated by ideals αO, where α ∈ O and (N(α), f) =

1. Thus we see that the image of P (O, f) under the isomorphism I(O, f) ≅ IK(f)

is generated by the corresponding ideals αOK . By the equivalence proven above, this

proves the image is precisely PK,Z(f).

We are by no means finished working with orders. In Section 2.11 we will re-

alize PK,Z(f) as a congruence subgroup for the conductor, and show that there is

a corresponding field extension L/K with the special property that Gal(L/K) ∼=

IK(f)/PK,Z(f). This will allow us to provide a full solution to the question of when

a prime is of the form p = x2 + ny2, which we have only answered partially as of

Section 1.8.

1.10 Units in a Number Field

In this section we further describe the structure of OK by characterizing the group of

units UK , which is the group of all elements of OK which have a multiplicative inverse in OK.

In particular we will prove Dirichlet’s unit theorem, which is the main structure

theorem for the group of units in a number field. At the end of the section we will

discuss a nice application of Dirichlet’s unit theorem to Pell’s equation x2 − dy2 = 1.

By the theory of finitely generated abelian groups, O_K^× ≅ Z^t × T where t is the
rank of O_K^× and T is the torsion subgroup of O_K^×. Henceforth we will write O_K^× = UK
and T = µ(K), which is equivalently the set of roots of unity which lie in OK .

Definition. A set of units u1, . . . , ut is called a fundamental system of units if
it forms a basis for UK modulo torsion, i.e. if every unit u ∈ UK can be written uniquely as
u = ζ u1^{m1} · · · ut^{mt} for ζ ∈ µ(K) and mi ∈ Z.

Proposition 1.10.1. The torsion subgroup µ(K) is finite and cyclic.

Proof. Recall that if ζ is a primitive mth root of unity then Q(ζ) is a Galois extension

of Q with Gal(Q(ζ)/Q) ∼= (Z/mZ)×. Moreover, [Q(ζ) : Q] = |Gal(Q(ζ)/Q)| = φ(m)

where φ is Euler’s function, defined to be the number of positive integers less than m

that are relatively prime to m – we will provide a proof of this well-known fact using

the Frobenius density theorem in Section 2.5. There is a product formula for φ(m):

if m = p1^{r1} · · · ps^{rs} is the prime factorization of m, then φ(m) = ∏ pi^{ri−1}(pi − 1). Now
note that for any number field K, if ζ is a primitive mth root of unity then

ζ ∈ µ(K) =⇒ ζ ∈ K =⇒ Q(ζ) ⊂ K =⇒ φ(m) | [K : Q].

Since [K : Q] is finite, the product formula shows that φ(m) | [K : Q] holds for only
finitely many m, so µ(K) is finite. Finally, every finite subgroup of the multiplicative
group of a field is cyclic, so µ(K) is cyclic, generated by a root of unity ζ of maximal order.
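To make the finiteness argument concrete, the following sketch lists every m with φ(m) dividing a given degree [K : Q]. It is an illustration rather than part of the original proof: the function names are hypothetical, and the search bound m ≤ 2n² relies on the standard estimate φ(m) ≥ √(m/2), which is an assumption imported from elementary number theory and not proved in the text.

```python
def phi(m):
    """Euler's function, computed from the product formula over prime powers."""
    result, n, p = 1, m, 2
    while p * p <= n:
        if n % p == 0:
            r = 0
            while n % p == 0:
                n //= p
                r += 1
            result *= p ** (r - 1) * (p - 1)
        p += 1
    if n > 1:               # one remaining prime factor with exponent 1
        result *= n - 1
    return result

def admissible_orders(degree):
    """All m such that phi(m) divides the degree [K : Q].

    Uses the assumed bound phi(m) >= sqrt(m / 2), so that phi(m) <= degree
    forces m <= 2 * degree**2 and the search is finite.
    """
    return [m for m in range(1, 2 * degree * degree + 1)
            if degree % phi(m) == 0]

print(admissible_orders(2))   # candidates for quadratic fields
print(admissible_orders(4))   # candidates for quartic fields
```

For degree 2 this returns exactly the divisors of 4 and 6, which are the only possible orders of roots of unity in a quadratic field.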

Using the field norm N = NK/Q, we have the following characterization of the

units of a number field K.

Proposition 1.10.2. An element α ∈ K is a unit if and only if α ∈ OK and

NK/Q(α) = ±1.

Proof. ( =⇒ ) If α is a unit, then there exists a β ∈ OK such that αβ = 1. Then

N(α), N(β) ∈ Z and since norm is multiplicative, 1 = N(αβ) = N(α)N(β). There-

fore N(α) = ±1.

( ⇐= ) Fix an embedding σ0 : K ↪→ C and identify α with σ0(α). By properties of the field norm,

N(α) = ∏_{σ:K↪→C} σ(α) = α ∏_{σ≠σ0} σ(α).

Let β = ∏_{σ≠σ0} σ(α). Each σ(α) is an algebraic integer, so β is an algebraic integer, and
β = N(α)/α ∈ K, so α ∈ OK implies β ∈ OK as well. If N(α) = ±1, then
β = ±α⁻¹ so α has an inverse ±β in OK and is therefore a unit.

For all real fields K, µ(K) = {±1}. This turns out to be the group of roots of unity for many
nonreal fields as well.

Example 1.10.3. Let K = Q(√d), a quadratic field with d squarefree. Recall that

OK = Z[√d] if d ≢ 1 (mod 4),
OK = Z[(1 + √d)/2] if d ≡ 1 (mod 4).

In each case, the units in OK are the solutions to one of

x2 − dy2 = ±1

(2x+ y)2 − dy2 = ±4.

When d < 0 (i.e. K is imaginary quadratic), these equations only have finitely many
solutions, so UK = µ(K). In fact, ζm ∈ K implies φ(m) ≤ 2, which only happens
for m dividing 4 or 6. Thus µ(K) = {±1} except for the following cases:

Q(i) : µ(K) = {±1, ±i}
Q(√−3) : µ(K) = {±1, ±ρ, ±ρ²} where ρ = (1 + √−3)/2.

For d > 0, there are infinitely many solutions. The equation x2 − dy2 = ±1 is known

as Pell’s equation; we will describe this case further after proving the unit theorem.

Proposition 1.10.4. For any m,M ∈ Z, the set of all algebraic integers α such that

the degree of α is ≤ m and |α′| ≤M for all conjugates α′ of α is finite.

Proof. If the degree of α is bounded by m then α is equivalently the root of a monic

irreducible polynomial with degree at most m. On the other hand, |α′| ≤ M for all

conjugates α′ of α implies the coefficients of the polynomial are all bounded. Since

the degree and coefficients of such a polynomial are all integers, and in this case

they are all bounded, there are only a finite number of polynomials satisfying the

requirements, each one having only finitely many roots. Hence the set of these α

described above is finite.

Corollary 1.10.5. An algebraic integer α is a root of unity ⇐⇒ |α′| = 1 for all

conjugates α′ of α.

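Proof. If α is a root of unity then so is every conjugate α′, so |α′| = 1. Conversely, suppose
|α′| = 1 for all conjugates α′. Every power of α has degree at most that of α and all of its
conjugates have absolute value 1, so by the proposition the set {1, α, α², . . .} is finite. Hence
α^i = α^j for some i < j, and so α^n = 1 with n = j − i ∈ N.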

As in Section 1.7, consider the map

σ : K −→ R^r × C^s
α ↦ (σ1(α), . . . , σr(α), σr+1(α), . . . , σr+s(α))

where {σ1, . . . , σr, σr+1, σ̄r+1, . . . , σr+s, σ̄r+s} is the set of all embeddings of K into C. This
map takes sums to sums, but to describe a multiplicative group such as UK we want to
map products to sums instead. The solution is to construct a map using logarithms:

L : UK −→ R^{r+s}
α ↦ (log |σ1(α)|, . . . , log |σr(α)|, log |σr+1(α)|, . . . , log |σr+s(α)|).

If u is a unit in OK , then by Proposition 1.10.2, N(u) = ±1 and so

|σ1(u)| · · · |σr(u)| |σr+1(u)|2 · · · |σr+s(u)|2 = 1.

Taking the log of both sides shows that the image L(u) lands in the hyperplane

H : x1 + . . .+ xr + 2xr+1 + . . .+ 2xr+s = 0.

Since H is cut out by a single nonzero linear equation in R^{r+s}, it is a subspace of dimension
r + s − 1, so H is isomorphic
to R^{r+s−1}. This suggests the main result in this section, Dirichlet’s unit theorem.
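As a numerical illustration of the hyperplane condition, the sketch below takes K = Q(√2), where r = 2 and s = 0, and the unit u = 1 + √2. The choice of field and unit, the pair representation, and the function names are assumptions made for the example only. It checks in floating point that L(u^k) lies on x1 + x2 = 0 and that L turns products into sums.

```python
import math

SQRT2 = math.sqrt(2)

def mul(u, v):
    """Multiply x1 + y1*sqrt(2) and x2 + y2*sqrt(2), stored as integer pairs."""
    (a, b), (c, d) = u, v
    return (a * c + 2 * b * d, a * d + b * c)

def log_embedding(u):
    """L(u) for K = Q(sqrt(2)): the two real embeddings send sqrt(2) to +-sqrt(2)."""
    x, y = u
    return (math.log(abs(x + y * SQRT2)), math.log(abs(x - y * SQRT2)))

u = (1, 1)            # 1 + sqrt(2), which has norm 1 - 2 = -1
power = (1, 0)
rows = []
for k in range(1, 6):
    power = mul(power, u)
    rows.append((k, power, log_embedding(power)))

for k, unit, (x1, x2) in rows:
    # x1 + x2 should vanish up to rounding, and x1 should equal k * L(u)[0].
    print(k, unit, round(x1 + x2, 9), round(x1 - k * log_embedding(u)[0], 9))
```

Here H is a line in R², consistent with r + s − 1 = 1, and the images L(u^k) are the integer multiples of L(u) along it.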

Dirichlet’s Unit Theorem. Let K be a number field with ring of integers OK.

Then UK ∼= Zr+s−1 × µ(K) where r is the number of real embeddings of K and s is

the number of pairs of complex embeddings.

We will prove this theorem by establishing that L(UK) is a lattice in Rr+s−1 with

full rank. First we have

Lemma 1.10.6. The image of L : UK → H is a lattice in H and the kernel is µ(K).

Proof. Let C be a bounded subset of H, say C = {(xi) ∈ H : |xi| ≤ M for all i}.

If L(u) ∈ C for a unit u ∈ UK then |σj(u)| ≤ e^M for each embedding σj. By

Proposition 1.10.4, this implies there are only a finite number of such u, and thus

L(UK) ∩ C is finite. This implies L(UK) is a lattice in H. Further, u ∈ kerL if and

only if |σi(u)| = 1 for all i and thus Corollary 1.10.5 says that kerL = µ(K).

We will also need

Lemma 1.10.7. Suppose A = (aij) is an m×m matrix such that

• aij < 0 for all i ≠ j

• ai1 + ai2 + . . . + aim > 0 for each i.

Then A is invertible.

Proof. If A is not invertible, then the system of equations

∑_{j=1}^{m} aij xj = 0, i = 1, . . . , m

has a nontrivial solution x = (x1, . . . , xm). Suppose xk is a component such that
|xk| = max{|xj|}. We may scale x so that xk = 1. Then |xj| ≤ 1 for each j ≠ k, and since
akj < 0 this gives akj xj ≥ akj, which implies

0 = ∑_{j=1}^{m} akj xj = akk + ∑_{j≠k} akj xj ≥ akk + ∑_{j≠k} akj > 0,

a contradiction. Thus A must be invertible.
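The lemma can be sanity-checked on a small example. The sketch below is illustrative only: the matrices A and B are arbitrary choices, and the helper names are hypothetical. It computes an exact determinant over the rationals for a matrix satisfying both hypotheses, and contrasts it with a matrix whose row sums are zero, where invertibility fails.

```python
from fractions import Fraction

def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det          # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def satisfies_hypotheses(matrix):
    """Off-diagonal entries negative and every row sum positive."""
    n = len(matrix)
    off_diagonal_negative = all(matrix[i][j] < 0
                                for i in range(n) for j in range(n) if i != j)
    row_sums_positive = all(sum(row) > 0 for row in matrix)
    return off_diagonal_negative and row_sums_positive

A = [[5, -1, -2],
     [-3, 7, -1],
     [-2, -2, 6]]
B = [[1, -1],
     [-1, 1]]       # row sums are 0, so the second hypothesis fails

print(satisfies_hypotheses(A), determinant(A))
print(satisfies_hypotheses(B), determinant(B))
```

The second matrix shows that the positivity of the row sums cannot simply be dropped.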

We now proceed to the proof of the Unit Theorem.

Proof. We will show that L(UK) is a full lattice in H, which, since kerL is finite, will

imply that UK has rank r+s−1. Again we will make use of the map σ : K ↪→ Rr×Cs.

Set V = Rr × Cs. For each x ∈ V , define

N(x) = x1 · · · xr · xr+1 x̄r+1 · · · xr+s x̄r+s.

Then N(σ(α)) = NK/Q(α). Also note that |N(x)| = |x1| · · · |xr| |xr+1|² · · · |xr+s|².
Recall from Proposition 1.7.7 that σ(OK) is a full lattice in V and the volume of its
fundamental parallelepiped is 2^{−s}√|dK|. Equivalently, if {α1, . . . , αn} is a Z-basis for
OK then we showed that the absolute value of the determinant of the matrix whose
ith row is

σ(αi) = (σ1(αi), . . . , Re(σr+1(αi)), Im(σr+1(αi)), . . . , Im(σr+s(αi)))

is equal to 2^{−s}√|dK|. For the remainder of the proof, let x ∈ V with 1/2 ≤ N(x) ≤ 1.
Define an action of V on σ(OK) by x · σ(OK) = {x · σ(α) | α ∈ OK}, where u · v is the
multiplication operation of V as a ring. Then x · σ(OK) is again a lattice in V and
the volume of its fundamental parallelepiped is the absolute value of the determinant of the matrix with
ith row

(x1 · σ1(αi), . . . , Re(xr+1 · σr+1(αi)), Im(xr+1 · σr+1(αi)), . . .)

which equals 2^{−s}√|dK| |N(x)|. Observe that as x ranges over the set of points with
1/2 ≤ N(x) ≤ 1 these volumes remain bounded.

Let T be a compact subset of V such that T is symmetric about the origin and

convex, and large enough in volume so that by Minkowski’s Theorem T contains a

point x · σ(γ) for some nonzero γ ∈ OK . The points in T have bounded coordinates

(by compactness and the Heine-Borel Theorem) and thus bounded norms; say M is

a bound on their norms. Then since x · σ(γ) ∈ T , |N(x · σ(γ))| ≤M and hence

|NK/Q(γ)| ≤ M/N(x) ≤ 2M.

Consider the set of ideals γOK where γ runs through the chosen algebraic integers

above for each x, i.e. those γ such that x · σ(γ) ∈ T . The norm of these ideals

is bounded by 2M , so there are only finitely many: {γ1OK , . . . , γtOK}. If γ is any

nonzero algebraic integer in OK such that x · σ(γ) ∈ T then γOK = γiOK for some
1 ≤ i ≤ t, and thus for some unit u we have γ = γiu. This shows that x · σ(u) ∈
σ(γi⁻¹)T. Note that the set T′ = ⋃_{i=1}^{t} σ(γi⁻¹)T is bounded and does not depend on
x. We have therefore shown that for each x with 1/2 ≤ N(x) ≤ 1 there exists a unit u

for which the coordinates of x · σ(u) are bounded uniformly.

To prove L(UK) is a full lattice in H, we may assume r+s−1 ≥ 1, since otherwise

the proof is trivial. For each i between 1 and r + s, we may choose an x ∈ V with

1/2 ≤ N(x) ≤ 1 such that all the coordinates xj for j ≠ i are large compared to all
y ∈ T′, and xi is small enough so that |N(x)| = 1. For each i = 1, . . . , r + s − 1, we
have shown that there exists a unit ui such that the coordinates of x · σ(ui) are all
bounded, and so for all j ≠ i,

|σj(ui)| < 1 =⇒ log |σj(ui)| < 0.

We claim that L(u1), . . . , L(ur+s−1) are linearly independent vectors in L(UK), which
we prove by showing that the matrix with ith row

(log |σ1(ui)|, . . . , log |σr+s−1(ui)|)

is invertible. Let ej = 1 for j ≤ r and ej = 2 for j > r; multiplying the jth column by
ej > 0 does not affect invertibility. The non-diagonal entries of the rescaled matrix are all
negative, and L(ui) ∈ H, so

e1 log |σ1(ui)| + . . . + er+s log |σr+s(ui)| = 0
=⇒ e1 log |σ1(ui)| + . . . + er+s−1 log |σr+s−1(ui)| = −er+s log |σr+s(ui)| > 0.

In other words the sum of the entries across each row of the rescaled matrix is positive. Then Lemma 1.10.7
implies that the rescaled matrix, and hence our matrix, is invertible. Thus rank L(UK) ≥ r + s − 1, but H has dimension

r+s−1. It follows that L(UK) is a full lattice in H, completing the proof of the Unit

Theorem.

We now apply the Unit Theorem to Pell’s equation.

Example 1.10.8. The famous problem of Pell is to find all positive integer solutions

(x, y) to x² − dy² = 1, where d > 0 is squarefree. Let K = Q(√d). Note that for any
α = x + y√d ∈ K,

N(α) = (x + y√d)(x − y√d) = x² − dy².

Thus the integer solutions to Pell’s equation correspond to the units of norm 1 in Z[√d],
which form a finite-index subgroup of UK , the group
of units. By Dirichlet’s Unit Theorem, the solution set is an infinite abelian group of
rank 1, a fact that is considerably harder to show with elementary number theory.
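The rank 1 statement says that all positive solutions are powers of a single fundamental solution. The sketch below illustrates this under stated assumptions: d is a positive nonsquare integer, the fundamental solution is found by a naive search on y (which can be very slow for some d, such as d = 61), and the function names are hypothetical.

```python
from math import isqrt

def fundamental_solution(d):
    """Smallest positive (x, y) with x^2 - d*y^2 = 1, by brute-force search on y.

    Assumes d is a positive nonsquare integer, so that a solution exists.
    """
    y = 1
    while True:
        x_squared = 1 + d * y * y
        x = isqrt(x_squared)
        if x * x == x_squared:
            return x, y
        y += 1

def solutions(d, count):
    """First `count` positive solutions, as powers of the fundamental one."""
    x1, y1 = fundamental_solution(d)
    x, y = x1, y1
    found = []
    for _ in range(count):
        found.append((x, y))
        # (x + y*sqrt(d)) * (x1 + y1*sqrt(d)), expanded in the basis {1, sqrt(d)}
        x, y = x1 * x + d * y1 * y, x1 * y + y1 * x
    return found

print(solutions(2, 3))
print(fundamental_solution(7))
```

Multiplying by the fundamental unit corresponds to adding L of that unit in the logarithmic picture, which is why the solutions form a rank 1 group.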

Definition. Let t = r+ s− 1 and let u1, . . . , ut be a fundamental system of units for

UK . The regulator of K is the determinant of the matrix whose ith row is

L(ui) = (log |σ1(ui)|, . . . , 2 log |σt(ui)|).

In other words, the regulator is the signed volume of the fundamental parallelepiped
for L(UK). We will encounter the regulator again in Section 2.4.

Chapter 2: Class Field Theory

In this chapter we develop the main concepts and theorems in class field theory,

including valuations on a field, the Artin map, ray class groups, Dirichlet L-series,

Artin reciprocity, the Conductor and Existence Theorems and two density theorems.

At the end of the chapter, we prove the main characterization of primes of the form

x2 + ny2 by applying class field theory to the ring class field of Z[√−n].

2.1 Valuations and Completions

Definition. A function | · | : K → R on a field K is called an absolute value, or

valuation, if it satisfies

(1) |x| ≥ 0 for all x ∈ K and |x| = 0 ⇐⇒ x = 0.

(2) |xy| = |x| |y| for all x, y ∈ K.

(3) |x+ y| ≤ |x|+ |y|. This is called the triangle inequality.

There is an additional axiom that defines an important type of valuation.

Definition. An absolute value on K is nonarchimedean if |x + y| ≤ max{|x|, |y|}

for all x, y ∈ K. Otherwise, | · | is said to be archimedean.

By (1) and (2), | · | is a multiplicative homomorphism from K× to the positive
reals. Moreover, since R>0 is torsion-free, | · | maps every root of unity in K to 1. This implies

(a) | − 1| = 1.

(b) | − x| = |x| for all x ∈ K.

Examples.

1 For any number field K, let σ : K ↪→ C be any complex embedding. Then

|x| := |σ(x)| is a valuation, where the second set of absolute values denotes the

usual absolute value on C.

2 Let ord : K× → Z be a discrete (additive) valuation. Then for any real number

c > 1, the following gives us a nonarchimedean absolute value on K:

|x| = c^{−ord(x)} if x ≠ 0, and |x| = 0 if x = 0.

3 The most important example of the valuation in 2 is the p-adic valuation for
any prime number p. It is a map | · |p : Q → R defined by |a|p = p^{−ordp(a)} = p^{−v}
where a = p^v m such that p does not divide the numerator or denominator of m.

4 This can be generalized to extensions of Q in the following way. For any prime

ideal p in the ring of integers of a number field K, we define the normalized

p-adic valuation

| · |p : K −→ R
a ↦ |a|p := N(p)^{−ordp(a)}

where N(p) is the ideal norm from Section 1.5 and ordp(a) is the
exponent of p in the prime factorization of the fractional ideal aOK (with |0|p = 0).

5 On any field, the trivial absolute value is |x| = 1 for all x ≠ 0. Note that the
only absolute value on a finite field F is the trivial absolute value since every
nonzero element of F is a root of unity.
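The p-adic absolute value on Q from example 3 is easy to compute exactly. The following is a minimal sketch using rational arithmetic; the function names ord_p and abs_p are illustrative choices, not notation from the text. It also spot-checks multiplicativity and the nonarchimedean inequality on one pair of rationals.

```python
from fractions import Fraction

def ord_p(a, p):
    """Exponent of the prime p in the nonzero rational a (may be negative)."""
    a = Fraction(a)
    if a == 0:
        raise ValueError("ord_p(0) is undefined (conventionally +infinity)")
    v, num, den = 0, a.numerator, a.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def abs_p(a, p):
    """Normalized p-adic absolute value |a|_p = p^(-ord_p(a)), with |0|_p = 0."""
    a = Fraction(a)
    if a == 0:
        return Fraction(0)
    return Fraction(p) ** (-ord_p(a, p))

x, y = Fraction(3, 4), Fraction(9, 2)
print(abs_p(Fraction(18, 5), 3), abs_p(Fraction(18, 5), 5))
print(abs_p(x * y, 3) == abs_p(x, 3) * abs_p(y, 3))          # multiplicativity
print(abs_p(x + y, 3) <= max(abs_p(x, 3), abs_p(y, 3)))      # nonarchimedean
```

Note that a rational with p in its denominator has p-adic absolute value greater than 1.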

Remark. The condition |x+y| ≤ max{|x|, |y|}, called the nonarchimedean condition,

is equivalent to |∑ xi| ≤ max{|xi|} for every finite sum ∑ xi.

We also have the following characterization.

Proposition 2.1.1. An absolute value | · | is nonarchimedean if and only if it is

bounded on the set {m · 1K | m ∈ Z}.

Proof. ( =⇒ ) If | · | is nonarchimedean, then for any m > 0,

|m · 1| = |1 + 1 + . . .+ 1| ≤ |1| = 1.

By property (b) following the definition of absolute values, | −m · 1| = |m · 1| so the

values are bounded for all m ∈ Z.

( ⇐= ) Conversely, suppose there is some N such that |m · 1| ≤ N for all m ∈ Z.
Then for every n ≥ 1,

|x + y|^n = |∑_{r=0}^{n} C(n, r) x^r y^{n−r}| ≤ ∑_{r=0}^{n} |C(n, r)| |x|^r |y|^{n−r}

by the triangle inequality for sums, where C(n, r) denotes the binomial coefficient. Clearly
|x|^r |y|^{n−r} ≤ max{|x|^n, |y|^n} = max{|x|, |y|}^n
and C(n, r) is an integer, so we see that

|x + y|^n ≤ N(1 + n) max{|x|, |y|}^n =⇒ |x + y| ≤ N^{1/n}(1 + n)^{1/n} max{|x|, |y|}.

As n → ∞, N^{1/n}(1 + n)^{1/n} tends to 1 by limit laws, so we have |x + y| ≤ max{|x|, |y|}.

Hence | · | is nonarchimedean.

Corollary 2.1.2. If K is a field of characteristic p ≠ 0, then every absolute value on
K is nonarchimedean.

Proof. If char K ≠ 0 then the set {m · 1K | m ∈ Z} is finite and therefore bounded

under | · |. Apply Proposition 2.1.1.

In example 2, we saw that an additive map ord on K induces an absolute value
|x| = c^{−ord(x)}, where c > 1 is any real number. Taking logs, we have

log_c |x| = −ord(x), or ord(x) = −log_c |x|

which suggests the following connection between additive and multiplicative valua-

tions.

Definition. An absolute value | · | on K is said to be discrete if |K×|, the image of

the units of K, is a discrete subgroup of R>0.

Proposition 2.1.3. For any nonarchimedean absolute value | · | on K, define v :
K× → R by v(x) = −log |x| (with the convention v(0) = ∞). Then

(i) v(xy) = v(x) + v(y),

(ii) v(x + y) ≥ min{v(x), v(y)}.

Furthermore, if the image v(K×) is discrete in R, then v is a multiple of some discrete
valuation ord : K× ↠ Z ⊂ R.

Proof. Properties (i) and (ii) follow by applying −log to the multiplicativity and
nonarchimedean conditions. See [19] for the details.

We next define several important objects which arise from a nonarchimedean

valuation on a field.

Definition. For any nonarchimedean valuation | · | on K, define

A = {a ∈ K : |a| ≤ 1}

U = {a ∈ K : |a| = 1}

m = {a ∈ K : |a| < 1}.

Proposition 2.1.4. Let | · | be a nonarchimedean valuation on K. Then A is a

local subring of K with U as its group of units and m as its unique maximal ideal.

Furthermore, | · | is discrete if and only if m is principal, in which case A is a DVR.

Proof. It follows easily from the properties of a nonarchimedean valuation that A is

a ring. By property (a) following the definition of valuations, it is easy to see that U

is the group of units of A. By the nonarchimedean condition, m is an ideal of A. To

see that it is the unique maximal ideal, let y ∈ A ∖ m. Then |y| = 1 by definition,
and |y⁻¹| = 1 so y⁻¹ ∈ A as well. Thus every element in A outside of m is a unit,
which implies that m is the unique maximal ideal of A.

Turning to the last statement, suppose | · | is discrete and nontrivial. By Proposition 2.1.3
we may rescale v(x) = −log |x| so that v : K× → Z is surjective; note that v(x) ≥ 0
exactly when x ∈ A and v(x) = 0 exactly when x ∈ U.
Let π ∈ A such that v(π) = 1. For any nonzero x ∈ A, v(x) = n ≥ 0 and so v(xπ^{−n}) = 0,
that is, |xπ^{−n}| = 1. Thus u = xπ^{−n} is a unit, and
x = uπ^n. We have proven that every nonzero x ∈ A can be written as a unit times
a power of π, so the only nonzero ideals in A are powers of πA. Hence m = πA must be principal,
which of course implies A is a DVR. Conversely, if m = πA is principal and nonzero, then
every x ∈ K× can be written as uπ^n with u ∈ U and n ∈ Z, so |K×| = {|π|^n : n ∈ Z}
is a discrete subgroup
of R>0.

Definition. The ring A defined above is called the valuation ring for | · |, and m is

called its valuation ideal.

The last statement in Proposition 2.1.4 says that | · | is discrete exactly when A

is a discrete valuation ring, which explains where this term comes from.

Every absolute value defines a metric on K given by d(x, y) = |x − y| and hence

induces a metric topology on K. For each x ∈ K, the sets

B(x, ε) = {y ∈ K : |x − y| < ε}

form a basis of neighborhoods around x.

Example 2.1.5. The p-adics are a perfect case study of absolute values and their

related groups, rings and topologies. In the p-adic topology induced on Q by | · |p,

two rationals a and b are considered “close” if their difference is divisible by a high

power of p. For example, the sequence xn = p^n, that is (1, p, p², . . .), converges to 0 quite
rapidly in the p-adic topology. Even the series ∑_{n=1}^{∞} p^n converges in this topology.
Gouvêa [12] provides a great introductory-level examination of the p-adics.
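The convergence of the series in Example 2.1.5 can be observed with exact arithmetic. The sketch below assumes the geometric series value p/(1 − p) as the candidate limit, fixes p = 5 for illustration, and uses a hypothetical helper abs_p; it measures the p-adic distance from each partial sum to that limit.

```python
from fractions import Fraction

def abs_p(a, p):
    """p-adic absolute value of a rational number, with |0|_p = 0."""
    a = Fraction(a)
    if a == 0:
        return Fraction(0)
    v, num, den = 0, a.numerator, a.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return Fraction(p) ** (-v)

p = 5
limit = Fraction(p, 1 - p)      # candidate limit of p + p^2 + p^3 + ...
partial = Fraction(0)
errors = []
for n in range(1, 7):
    partial += p ** n
    errors.append(abs_p(partial - limit, p))

print(errors)    # p-adic distances from the partial sums to the limit
```

Each additional term shrinks the p-adic error by a factor of p, even though the partial sums grow without bound in the usual absolute value.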

Definition. Two absolute values | · |1 and | · |2 on a field K are said to be equivalent

if they induce the same topology on K.

Proposition 2.1.6. For two absolute values | · |1, | · |2 : K → R, the following are

equivalent:

(1) | · |1 and | · |2 are equivalent.

(2) |x|1 < 1 ⇐⇒ |x|2 < 1 for all x ∈ K.

(3) There exists some c > 0 such that |x|1 = (|x|2)^c for all x ∈ K.

Proof. This is a standard proof in the study of absolute values. See [19] or any

topology book for details.

Nonarchimedean absolute values cause the metric topology on K to have some

strange properties. For example,

• If y ∈ B(x, ε) then B(y, ε) = B(x, ε).

• The open ball B(x, ε) is also closed.

• Under this topology, K is totally disconnected.

The next theorem characterizes all possible (equivalence classes of) absolute values

on Q. We will write | · |∞ for the usual absolute value on R.

Theorem 2.1.7 (Ostrowski). Let | · | be a nontrivial absolute value on Q.

(1) If | · | is archimedean, it is equivalent to | · |∞.

(2) If | · | is nonarchimedean, it is equivalent to | · |p for exactly one prime p.

Proof. Let m, n > 1 be integers. Then we can write m = a0 + a1n + . . . + ar n^r where ai ∈ Z,
0 ≤ ai < n and n^r ≤ m. Let N = max{1, |n|}. By the triangle inequality,

|m| ≤ ∑_{i=0}^{r} |ai| |n|^i ≤ ∑_{i=0}^{r} |ai| N^r.

Note that r ≤ log m / log n and |ai| = |1 + . . . + 1| ≤ ai |1| = ai ≤ n. So we have

|m| ≤ (1 + r) n N^r ≤ (1 + log m / log n) n N^{log m / log n}.

If we replace m with m^t for a positive integer t and take tth roots, this gives us

|m| ≤ (1 + t log m / log n)^{1/t} n^{1/t} N^{log m / log n}.

Then as t → ∞, this becomes |m| ≤ N^{log m / log n}.

For the first case, suppose that |n| > 1 for all integers n > 1. Here N = |n|, so
we see that |m|^{1/ log m} ≤ |n|^{1/ log n}. By symmetry, these must be equal so there exists
a number c > 1 such that c = |m|^{1/ log m} for all integers m > 1. Hence

|m| = c^{log m} = e^{log c log m} = m^{log c}

for all m > 1. This shows that |m| = (|m|∞)^{log c} for all m ∈ Z, m > 1. Since | · | and | · |∞

are equivalent on a set of generators for Q (the integers), they must be equivalent

absolute values.

Now suppose there exists some integer n > 1 such that |n| ≤ 1. In this case N = 1

and the inequality proved in the first part of the proof implies |m| ≤ 1 for all m ∈ Z.

By Proposition 2.1.1, | · | is nonarchimedean. Let A and m be its valuation ring and

ideal, respectively. By definition, Z ⊂ A and so m ∩ Z is a prime ideal in Z, which is
nonzero since | · | is nontrivial. Thus m ∩ Z = (p) for some prime p. This implies that |m| = 1 if p ∤ m, so for
all rationals q such that p does not divide its numerator or denominator, |qp^r| = |p|^r.
Let c be a number such that |p| = p^{−c}. Then |x| = (|x|p)^c for all x ∈ Q and we have

shown that | · | is equivalent to the p-adic valuation | · |p.

Definition. For a number field K, an equivalence class of valuations on K is called

a place or prime of K.

By Theorem 2.1.7, the places on Q are in one-to-one correspondence with the

prime integers – it is for this reason that places are also called primes – with the

exception of ∞. To rectify this, we refer to ∞ as the infinite prime, which thus

corresponds to the equivalence class of the archimedean absolute values on Q. This

motivates our use of the term infinite prime in Section 1.8.

Definition. For each place p of Q, let p be its prime integer representative. There is

a valuation | · |p in p that satisfies |p|p = 1/p, called the normalized absolute value

of p. By convention, the normalized absolute value of the infinite place is taken to be

| · |∞.

The next theorem shows an important relation between the values of any x ∈ Q

under the normalized absolute values of all primes on Q.

Theorem 2.1.8 (The Product Formula). For each prime p = 2, 3, 5, . . . ,∞, let | · |p

be the corresponding normalized absolute value on Q. Then for all nonzero x ∈ Q,

∏_p |x|p = 1.

Proof. Let x = a/b for nonzero a, b ∈ Z. Then |x|p = 1 for every finite prime p except when p | a or p | b. (Importantly,
this establishes that the product above really is finite.) The map

Q× −→ R×
x ↦ ∏_p |x|p

is a homomorphism, so it suffices to show that the image of −1 and all primes q ∈ Z
is 1. But | − 1| = 1 for any absolute value, and since q is prime we have

|q|p = q if p = ∞,
|q|p = 1/q if p = q,
|q|p = 1 if p ≠ q, ∞.

Therefore the product over all primes of Q is equal to 1.
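The product formula is straightforward to verify for any particular rational. The sketch below is an illustration with hypothetical helper names: it multiplies the usual absolute value by the p-adic absolute values at the primes dividing the numerator or denominator, since all other factors equal 1.

```python
from fractions import Fraction

def prime_factors(n):
    """Set of primes dividing the nonzero integer n (trial division)."""
    n = abs(n)
    primes, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            primes.add(p)
            n //= p
        p += 1
    if n > 1:
        primes.add(n)
    return primes

def abs_p(x, p):
    """Normalized p-adic absolute value of a nonzero rational x."""
    v, num, den = 0, x.numerator, x.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return Fraction(p) ** (-v)

def product_of_absolute_values(x):
    """Product of |x|_p over p = infinity and all primes p, for nonzero x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("the product formula applies to nonzero rationals")
    product = abs(x)    # the archimedean factor |x|_infinity
    for p in prime_factors(x.numerator) | prime_factors(x.denominator):
        product *= abs_p(x, p)
    return product

print(product_of_absolute_values(Fraction(-18, 35)))
```

For x = −18/35 the factors are 18/35, 1/2, 1/9, 5 and 7, whose product is 1.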

One of the first objectives in class field theory is to prove an analog of the product

formula for finite extensions K/Q. This will require a description of how to extend

an absolute value to a field extension of Q.

Recall that a field K is said to be complete with respect to an absolute value | · |

if every Cauchy sequence in K converges with respect to | · |. Of course not every

field is complete: for example, many sequences in Q itself do not converge to rational

numbers. For this reason we embed K into a complete field K̂, called the completion
of K. The most common method is to define K̂ to consist of equivalence classes
of Cauchy sequences in K. This is a common construction in many first courses in

analysis; the reader may consult [16] for the general case or [15] for the construction

in the number field context. We highlight one important result.

Theorem 2.1.9. The completion (K̂, | · |) of (K, | · |) is unique up to valuation-
preserving isomorphism.

Example 2.1.10. For Q, the completion with respect to | · |∞ is isomorphic to R.
For any prime p, the completion of (Q, | · |p) is called the p-adic numbers and is denoted Qp. The ring
of integers in this completion is called the p-adic integers, denoted Zp.

We briefly state, without proof, some results about nonarchimedean valuations

in completions. Our objective is to obtain a description of all possible completions

of a number field K, so we will content ourselves with highlights from the lengthier

discussions in [15] and [19].

The objects related to nonarchimedean absolute values extend nicely in a comple-

tion.

Proposition 2.1.11. If | · | is a nonarchimedean absolute value on K whose valuation

ring A is a DVR, then the valuation ring A of the completion (K, | · |) is also a DVR,

and its maximal ideal has the same generator as m ⊂ A.

Proposition 2.1.12. Let K be a completion of a number field K with respect to | · |.

Every element α ∈ K r {0} has a unique representation as a power series

α = πr(s0 + s1π + s2π2 + . . .)

where si ∈ A and π is a generator of m.

Theorem 2.1.13. Let K be complete with respect to a nonarchimedean absolute value

| · | and let L be a finite extension of K, with n = [L : K]. Then | · | extends uniquely

to L by

\[ |x| = |N_{L/K}(x)|^{1/n}. \]

The next result is a famous one in the study of absolute values.

Hensel’s Lemma. Suppose A is a commutative ring that is complete with respect

to a maximal ideal m. Let k = A/m denote the residue field for A, and for any f(x) ∈ A[x] denote

its image in k[x] by $\bar{f}(x)$. Then for every monic polynomial f(x) ∈ A[x] such that

$\bar{f}(x) = g_0(x)h_0(x)$ where g0 and h0 are monic and relatively prime in k[x], we can

factor f(x) = g(x)h(x) where g and h are monic with $\bar{g} = g_0$ and $\bar{h} = h_0$.

Corollary 2.1.14. In Qp, $x^{p-1} - 1$ has p − 1 distinct roots.
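
The corollary can be checked numerically to any finite precision. The following is a minimal Python sketch, not part of the original argument: it lifts each nonzero residue a mod p to a root of x^(p−1) − 1 modulo p^k by Newton iteration, which is one standard way to realize Hensel lifting for simple roots. The function name teichmuller_lifts and the choice of k iterations are assumptions made for illustration.

```python
def teichmuller_lifts(p, k):
    """Lift each a in 1..p-1 to a root of x^(p-1) - 1 modulo p^k.

    Assumes p is prime. Newton's iteration doubles the p-adic precision
    at each step, so k steps are more than enough to reach precision k.
    """
    modulus = p ** k
    roots = []
    for a in range(1, p):
        x = a
        for _ in range(k):
            f = pow(x, p - 1, modulus) - 1            # f(x) = x^(p-1) - 1
            df = (p - 1) * pow(x, p - 2, modulus)     # f'(x), a unit mod p
            x = (x - f * pow(df, -1, modulus)) % modulus
        roots.append(x)
    return roots


if __name__ == "__main__":
    for a, root in enumerate(teichmuller_lifts(7, 5), start=1):
        print(a, root)
```

Each lifted root reduces to its starting residue mod p, so the p − 1 roots are distinct, in agreement with the corollary.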

Let K be an algebraic number field, A a DVR such that K is its field of fractions,

and p the maximal ideal of A. Let L be a finite extension of K and let B be a DVR

which is the integral closure of A in L, with P the maximal ideal of B. Now complete

both fields with respect to the p-adic valuation | · |p, which by Theorem 2.1.13 extends

uniquely to | · |P on L. The next proposition completely describes the behavior of

primes in the extension L/K in the language of Section 1.3.

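Proposition 2.1.15. Let $\hat{A}$, $\hat{B}$, $\hat{\mathfrak{p}}$ and $\hat{\mathfrak{P}}$ denote the DVRs and maximal ideals, respectively, of the completed extension $\hat{L}/\hat{K}$. Then

(a) $\hat{\mathfrak{p}} = \mathfrak{p}\hat{A}$ and $\hat{\mathfrak{P}} = \mathfrak{P}\hat{B}$.

(b) $\mathfrak{p}\hat{B} = \hat{\mathfrak{P}}^e$ where e = e(P | p), the ramification index in L/K.

(c) Moreover, $e(\hat{\mathfrak{P}} \mid \hat{\mathfrak{p}}) = e(\mathfrak{P} \mid \mathfrak{p}) = e$ and $f(\hat{\mathfrak{P}} \mid \hat{\mathfrak{p}}) = f(\mathfrak{P} \mid \mathfrak{p}) = f$.

(d) $[\hat{L} : \hat{K}] = ef$.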

Proof. It suffices to prove (a) and (b), since the main results in Section 1.3 will then

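imply that (c) and (d) are true. Proposition 2.1.11 directly implies (a), since $\hat{\mathfrak{p}}$ has

the same generator as $\mathfrak{p}$ and $\hat{\mathfrak{P}}$ has the same generator as $\mathfrak{P}$.

To prove (b), let $\mathfrak{p}B$ have prime factorization $\mathfrak{p}B = \mathfrak{P}_1^{e_1} \cdots \mathfrak{P}_g^{e_g}$ where $\mathfrak{P} = \mathfrak{P}_1$

(this is possible since B is a DVR and hence Dedekind). Each $\mathfrak{P}_i$, i ≥ 2, contains

elements of B outside of the maximal ideal $\mathfrak{P}$, and since Proposition 2.1.11 says that

$\hat{B}$ is a DVR with $\mathfrak{P}\hat{B}$ as its maximal ideal, this implies that $\mathfrak{P}_i\hat{B} = \hat{B}$ for i ≥ 2.

Hence

\[ \mathfrak{p}\hat{B} = (\mathfrak{p}B)\hat{B} = (\mathfrak{P}_1^{e_1} \cdots \mathfrak{P}_g^{e_g})\hat{B} = (\mathfrak{P}_1\hat{B})^{e_1} = \hat{\mathfrak{P}}^{e_1}. \]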

For now, this concludes our description of nonarchimedean absolute values in ex-

tensions. We conclude the section by discussing how to extend archimedean absolute

values.

Theorem 2.1.16. If K is complete with respect to an archimedean absolute value | · |,

then K is isomorphic to either R or C, and | · | is equivalent to the usual absolute

value on these.

Proof. Since | · | is archimedean, the values for |n| where n ∈ Z are unbounded. Thus

K must have characteristic zero and therefore |·| restricts to an archimedean valuation

on Q. By Theorem 2.1.7 (Ostrowski’s theorem), we may replace | · | with the usual

(archimedean) absolute value on Q.

K is complete, so the completion of Q with respect to | · | is contained in K. We

know this completion is isomorphic to R, with | · | equivalent to the usual absolute

value on R, so it remains to show that either K = R already or K = C (up to

isomorphism).

Suppose K contains i, a root of $x^2 + 1 = 0$. Then K contains R(i) = C. If not, adjoin i

to obtain a field K(i). Then | · | extends to K(i) by

\[ |a + bi| = \sqrt{|a|^2 + |b|^2} \]

and K(i) is complete under this valuation. It is straightforward to check that this is

equivalent to the usual absolute value on C. In any case we may at this point assume

C ⊆ K or replace K with K(i). Janusz [15] shows that C ≠ K (or K(i)) produces a

contradiction, so we must have K = C or K(i) = C, proving the theorem.

Corollary 2.1.17. Let K be an algebraic number field and {σ1, . . . , σr, σr+1, . . . , σr+s}

be the set of embeddings of K into C. Then every archimedean absolute value on K

is equivalent to exactly one of the form |x|i = |σi(x)|.

From this description of absolute values on K, both archimedean and nonar-

chimedean, we have a generalized product formula for the places of K.

The Product Formula. Let K be an algebraic number field. For each prime p of

K, finite or infinite, we may select a valuation | · |p in p such that for all nonzero

x ∈ K,

\[ \prod_{\mathfrak{p}} |x|_{\mathfrak{p}} = 1. \]
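
For K = Q the product formula can be verified directly with exact arithmetic. The sketch below is an illustration only: it uses the normalization |x|_p = p^(−v_p(x)) at each finite prime together with the usual absolute value at the infinite place, and the helper names (p_adic_abs, prime_factors, product_of_absolute_values) are hypothetical.

```python
from fractions import Fraction


def p_adic_abs(x, p):
    """Return |x|_p = p^(-v_p(x)) for a nonzero rational x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("x must be nonzero")
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return Fraction(1, p) ** v


def prime_factors(n):
    """Set of primes dividing the nonzero integer n (trial division)."""
    n, out, d = abs(n), set(), 2
    while d * d <= n:
        while n % d == 0:
            out.add(d)
            n //= d
        d += 1
    if n > 1:
        out.add(n)
    return out


def product_of_absolute_values(x):
    """Product of |x|_v over the infinite place and every prime dividing x."""
    x = Fraction(x)
    total = abs(x)  # archimedean place
    for p in prime_factors(x.numerator) | prime_factors(x.denominator):
        total *= p_adic_abs(x, p)
    return total


if __name__ == "__main__":
    print(product_of_absolute_values(Fraction(-360, 49)))
```

Primes that divide neither the numerator nor the denominator contribute a factor of 1, which is why the product over all places is effectively finite.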

2.2 Frobenius Automorphisms and the Artin Map

Fix a Galois extension L of a number field K and let G be the Galois group of this

extension. Recall from Section 1.8 that for an unramified prime P ⊂ OL, there

is an automorphism σ ∈ G called the Artin symbol such that $\sigma(\alpha) = \alpha^q$ for all

α ∈ OL/P, where q = |OK/p| if p = P ∩ OK. Cox [7] denotes the Artin symbol by

$\left(\frac{L/K}{\mathfrak{P}}\right)$ since it is used to define the Artin map $\left(\frac{L/K}{\cdot}\right) : I_K \to G$ in the abelian

case. On the other hand, Janusz [15] and many other authors refer to this element as

the Frobenius automorphism, denoted FrobL/K(P). We will use these names and

notations interchangeably, since each has its uses in particular contexts and neither

is really preferred in the literature. There should be no confusion.

We’ve already proven the existence and uniqueness of the Frobenius automorphism

(Lemma 1.8.2) and in Proposition 1.8.3 we gave some nice properties:

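(i) For all σ ∈ G, $\left(\frac{L/K}{\sigma(\mathfrak{P})}\right) = \sigma \left(\frac{L/K}{\mathfrak{P}}\right) \sigma^{-1}$.

(ii) FrobL/K(P) has order f = [OL/P : OK/p] in G.

(iii) p splits completely in L ⇐⇒ $\left(\frac{L/K}{\mathfrak{P}}\right) = 1$ for any prime P lying over p.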

Note that (i) means that in general, the set {FrobL/K(P) | P ⊂ OL divides p} is a

conjugacy class in G. If L/K is abelian, this represents a single element of G which

we denote with $\left(\frac{L/K}{\mathfrak{p}}\right)$ or FrobL/K(p). One sometimes sees $\left(\frac{L/K}{\mathfrak{p}}\right)$ denote the

conjugacy class as well.

It will be useful to know how the Frobenius automorphism behaves in towers.

Suppose L ⊃ E ⊃ K and denote P∩E by pE. If p = P∩K is unramified in L, pE is

clearly also unramified in L so there is a Frobenius automorphism FrobL/E(P) which

relates to FrobL/K(P) by the next few results.

Proposition 2.2.1. Let f0 = f(pE | p). Then $\left(\frac{L/K}{\mathfrak{P}}\right)^{f_0} = \left(\frac{L/E}{\mathfrak{P}}\right)$.

Proof. The residue fields are related in the following way:

OL/P ⊃ OE/pE ⊃ OK/p

and they have orders $q^f$, $q^{f_0}$ and q, respectively. Consider G′ = Gal(ℓ/ε), where

ℓ = OL/P and ε = OE/pE. This group is generated by the automorphism $x \mapsto x^{q^{f_0}}$

which is the f0th power of the generator of Gal(ℓ/k), where k = OK/p. The proposition then follows

from the definitions of the Frobenius automorphisms.

Proposition 2.2.2. Suppose L ⊃ E ⊃ K is a tower of fields so that L/K is abelian

and E/K is normal. Let m be a modulus on K and let mE denote the modulus of E

defined by the primes lying over each p | m. Then the following diagram commutes:

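[Commutative diagram relating the ideal groups $I_K^{\mathfrak{m}}$ and $I_E^{\mathfrak{m}_E}$ to the Galois groups Gal(L/K) and Gal(E/K) via the maps FrobL/K(·), FrobE/K(·) and the restriction σ ↦ σ|E.]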

Proof. Let P ⊂ OL be a prime and set pE = P ∩ E. Since E/K is normal, FrobE/K(pE) is

defined. To show the diagram commutes, it suffices to prove that the restriction of

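FrobL/K(P) to E is exactly FrobE/K(pE). For any α ∈ OE, $\sigma(\alpha) \equiv \alpha^q$ mod P if

and only if $\sigma(\alpha) \equiv \alpha^q$ mod pE since pE = P ∩ E is fixed by all of G when E/K is

normal. Therefore

\[ \mathrm{Frob}_{L/K}(\mathfrak{P})\big|_E = \mathrm{Frob}_{E/K}(\mathfrak{p}_E). \]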

Corollary 2.2.3. Suppose E1 and E2 are normal extensions of K and L = E1E2.

Define p1 = P∩E1 and p2 = P∩E2 so that their Frobenius elements are all defined.

Then the homomorphism

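\[ \mathrm{Gal}(L/K) \longrightarrow \mathrm{Gal}(E_1/K) \times \mathrm{Gal}(E_2/K), \qquad \sigma \longmapsto (\sigma|_{E_1}, \sigma|_{E_2}) \]

is one-to-one and therefore

\[ \left(\frac{L/K}{\mathfrak{P}}\right) = \left(\frac{E_1/K}{\mathfrak{p}_1}\right) \times \left(\frac{E_2/K}{\mathfrak{p}_2}\right). \]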

Proof. The previous proposition shows that the map is a well-defined homomorphism.

Then the fact that p splits completely in L ⇐⇒ p splits completely in E1 and E2

proves the map is one-to-one.

Let’s take a look at Frobenius automorphisms in our favourite example.

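Example 2.2.4. Let K = Q(i) and take any odd prime integer p. Since K/Q is abelian, $\left(\frac{K/\mathbb{Q}}{p}\right)$

represents a single element. We claim that

\[ \left(\frac{K/\mathbb{Q}}{p}\right) = \begin{cases} \text{complex conjugation} & \text{if } p \equiv 3 \pmod 4 \\ 1 & \text{if } p \equiv 1 \pmod 4. \end{cases} \]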

To prove this, first let p ≡ 3 (mod 4). Then p remains prime in Q(i) and the residue

fields are given by

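\[ \ell = \mathbb{Z}[i]/p\mathbb{Z}[i] = \mathbb{F}_{p^2} \quad \text{and} \quad k = \mathbb{Z}/p\mathbb{Z} = \mathbb{F}_p. \]

The Frobenius element for p in ℓ/k must be $x \mapsto x^p$:

\[ (a + bi)^p \equiv a^p + b^p i^p \equiv a - bi \pmod p. \]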

So the Frobenius element of any prime p ≡ 3 (mod 4) is complex conjugation.

On the other hand, recall that if p ≡ 1 (mod 4), (p) splits completely in Q(i). If

pZ[i] = p1p2, these prime ideals must be complex conjugates. Then we have

Z[i]/p1 = Z[i]/p2 = Fp and Z/pZ = Fp

so the Frobenius automorphism is the identity.
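
The computation in this example is easy to test by machine. The following Python sketch (an illustration; the function names are hypothetical) raises every element a + bi of Z[i]/pZ[i] to the pth power and compares the result with a − bi and with a + bi.

```python
def gaussian_pow_mod(a, b, e, p):
    """Compute (a + b*i)^e in Z[i]/pZ[i] as a pair (real, imag) mod p."""
    result = (1, 0)
    base = (a % p, b % p)
    while e > 0:
        if e & 1:
            result = ((result[0] * base[0] - result[1] * base[1]) % p,
                      (result[0] * base[1] + result[1] * base[0]) % p)
        base = ((base[0] * base[0] - base[1] * base[1]) % p,
                (2 * base[0] * base[1]) % p)
        e >>= 1
    return result


def frobenius_is_conjugation(p):
    """True if x -> x^p acts as a + bi -> a - bi on all of Z[i]/pZ[i]."""
    return all(gaussian_pow_mod(a, b, p, p) == (a, (-b) % p)
               for a in range(p) for b in range(p))


def frobenius_is_identity(p):
    """True if x -> x^p fixes every element of Z[i]/pZ[i]."""
    return all(gaussian_pow_mod(a, b, p, p) == (a, b)
               for a in range(p) for b in range(p))


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13):
        print(p, p % 4, frobenius_is_conjugation(p), frobenius_is_identity(p))
```

For p ≡ 1 (mod 4) the quotient Z[i]/pZ[i] is a product of two copies of Fp rather than a field, and the pth power map is the identity on it, matching the trivial Frobenius element.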

Next we describe Frobenius automorphisms in general cyclotomic extensions.

Example 2.2.5. Let K = Q(ζn) where $\zeta_n = e^{2\pi i/n}$ for some n ≥ 2. Then Gal(K/Q) ∼=

(Z/nZ)× via the isomorphism identifying [k] ∈ (Z/nZ)× with the map $\zeta_n \mapsto \zeta_n^k$.

For a prime p ∤ n, this implies that

\[ \left(\frac{K/\mathbb{Q}}{p}\right) = (\zeta_n \mapsto \zeta_n^p) \longleftrightarrow p \pmod n. \]

In particular, (p) splits completely in Q(ζn) if and only if p ≡ 1 (mod n).
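
Combined with property (ii) of the Frobenius automorphism, this example says that the residue degree of an unramified prime p in Q(ζn) is the multiplicative order of p modulo n. A small Python sketch of that computation follows; the function names are chosen here for illustration.

```python
from math import gcd


def frobenius_order(p, n):
    """Multiplicative order of p in (Z/nZ)^*, for n >= 2 and gcd(p, n) = 1.

    By the example above this is the order of the Frobenius element
    of p in Gal(Q(zeta_n)/Q).
    """
    if n < 2 or gcd(p, n) != 1:
        raise ValueError("need n >= 2 and p coprime to n")
    k, x = 1, p % n
    while x != 1:
        x = (x * p) % n
        k += 1
    return k


def splits_completely(p, n):
    """True exactly when the Frobenius element of p is trivial."""
    return frobenius_order(p, n) == 1


if __name__ == "__main__":
    for p in (2, 3, 11, 29, 41):
        print(p, frobenius_order(p, 7), splits_completely(p, 7))
```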

For the rest of the section, we focus on setting up the right conditions for a

generalization of the Artin map. The definition is simpler when it is a map on

unramified primes of OK so we need a way to restrict to these primes.

Definition. For a number field K, let IK be the group of fractional OK-ideals and

let S be a finite set of primes in OK. Then $I_K^S$ is defined to be the subgroup of IK

generated by those prime ideals which are not in S.

In practice we will take S to be the set of primes of OK that ramify in L. For this choice of S, we define

Definition. Suppose L/K is abelian and let S = {primes p ⊂ OK | p ramifies in L}

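so that $I_K^S$ is generated by the unramified primes in OK. Define the Artin map to

be the homomorphism

\[ \phi_{L/K} : I_K^S \longrightarrow G = \mathrm{Gal}(L/K), \qquad \mathfrak{a} \longmapsto \prod_i \left(\frac{L/K}{\mathfrak{p}_i}\right)^{e_i} \]

where a is a fractional ideal with prime factorization $\mathfrak{a} = \prod_i \mathfrak{p}_i^{e_i}$.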

Since L/K is abelian, this map is well-defined. We will later (Section 2.10) generalize

the Artin map to non-abelian extensions.

Suppose E is a finite extension of K. Then EL/E is an abelian extension whose

Galois group, say H, is a subgroup of Gal(L/K) when we restrict elements of H to

L. Let ISE denote the subgroup of IE generated by primes in OE that do not lie over

any prime in S. Note that this is equivalent to saying ISE is generated by the primes

of OE which have norm in ISK .

Proposition 2.2.6. Let G = Gal(L/K) and H = Gal(EL/E). Then restricting H

to L gives us $\phi_{EL/E} = \phi_{L/K} \circ N_{E/K}$ on $I_E^S$.

Proof. Let P ⊂ OEL be prime and let PE = P ∩ E, PL = P ∩ L and p = P ∩K.

Then q := NK/Q(p) is a prime power and $N_{E/K}(\mathfrak{P}_E) = \mathfrak{p}^f$. Let σ = FrobEL/E(PE).

Then for each α ∈ OEL we have $\sigma(\alpha) \equiv \alpha^{q^f}$ mod P. Recall that σ(P) = P and

σ(PL) = PL. Let τ = FrobL/K(p). Then when α ∈ OL we have

\[ \tau(\alpha) \equiv \alpha^q \bmod \mathfrak{P}_L \implies \tau^f(\alpha) \equiv \alpha^{q^f} \bmod \mathfrak{P}_L. \]

Since the Frobenius automorphism is unique, $\tau^f = \sigma$ on L. This proves the property

for all primes in ISE and since they generate ISE we’re done.

Corollary 2.2.7. Let ϕ be the Artin map in an extension L/K. Then $N_{L/K}(I_L^S) \subseteq \ker\phi$.

Proof. Let E = L and apply Proposition 2.2.6 to obtain $\phi_{L/K} \circ N_{L/K} = \phi_{L/L} = 1$.

From this we obtain a nice description of ϕ for any abelian extension K of Q.

Theorem 2.2.8. Let K/Q and let S be the set of prime ideals containing (m) for

some positive integer m. Then the Artin map $\phi : I_{\mathbb{Q}}^S \to \mathrm{Gal}(K/\mathbb{Q})$ is surjective with

\[ \ker\phi = \{ \text{fractional ideals } (a/b) : a \equiv b \pmod{m} \}. \]

Proof. See III.3.3 of [15]. Surjectivity of ϕ will follow from the Frobenius Density

Theorem in Section 2.5.

When L/K is not an abelian extension, a description of the Artin map becomes

more difficult. For this reason many theorems in class field theory are complicated

to state. It is our goal in the next few sections to provide a glimpse of some of the

constructions required to prove a more general description of the Artin map.

2.3 Ray Class Groups

In this section we generalize the class group from Chapter 1.

Definition. A modulus m is a formal product of places of K:

\[ \mathfrak{m} = \prod_{\mathfrak{p}} \mathfrak{p}^{n(\mathfrak{p})}. \]

This product is taken over all places of K, and the n(p) are nonnegative integers

subject to the following conditions:

(1) If p is finite then n(p) ≥ 0 and only finitely many of these are nonzero.

(2) If p is a real infinite prime, n(p) = 0 or 1.

(3) If p is a complex infinite prime, n(p) = 0.

It is common to write a modulus as m = m0m∞ where m0 denotes the product

of all finite primes with positive exponent and m∞ denotes the product of the real

primes in m. In this way m0 may be realized as an integral ideal in OK .

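Fix a place p of K and take α ∈ K∗. If p is a real infinite place, we say α ≡ 1

mod p if $\alpha_{\mathfrak{p}} > 0$, where $\alpha_{\mathfrak{p}}$ is the image of α under the corresponding real embedding. Otherwise α ≢ 1 mod p. If p is finite, we say α ≡ 1 mod $\mathfrak{p}^{n(\mathfrak{p})}$ if

α is in the valuation ring corresponding to p and $\alpha - 1 \in \mathfrak{p}^{n(\mathfrak{p})}$. We can extend this

notion of congruence for elements of K∗ to any modulus m by α ≡ 1 mod m if and

only if α ≡ 1 mod $\mathfrak{p}^{n(\mathfrak{p})}$ for all primes with n(p) > 0.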

Definition. For a modulus m of a field K, define the following subgroups of K∗:

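\[ K_{\mathfrak{m}} = \{ a/b \mid a, b \in \mathcal{O}_K \text{ and } a\mathcal{O}_K,\ b\mathcal{O}_K \text{ are relatively prime to } \mathfrak{m}_0 \} \]

\[ K_{\mathfrak{m},1} = \{ \alpha \in K_{\mathfrak{m}} \mid \alpha \equiv 1 \bmod \mathfrak{m} \}. \]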

Let ISK be as in the last section; that is, for any set of primes S, ISK is the subgroup

of IK generated by primes outside S. We define a special case of this for a modulus.

Definition. Let S be the set of primes dividing m0 for some modulus m. Then we

denote the subgroup ISK ≤ IK by Im.

There is a natural homomorphism i : K∗ → IK given by α ↦ (α); we denote the image

of Km,1 under this map by PK(m, 1) := i(Km,1). This allows us to define

Definition. The ray class group of a modulus m is CK(m) = Im/PK(m, 1). The

cosets of PK(m, 1) in this quotient are referred to as ray classes mod m.

Example 2.3.1. If m = 1 then PK(m, 1) is just the subgroup of principal ideals and

thus CK(m) is the full ideal class group C(OK).

Example 2.3.2. If $\mathfrak{m} = \prod_{\nu \text{ real}} \nu$ then $C_K(\mathfrak{m}) = I_K / \{(a) : a_\nu > 0 \text{ for all real } \nu\}$ is

called the narrow class group of K.

Example 2.3.3. Let $\mathfrak{m} = (2)^3(17)^2(19)\cdot\infty$, a modulus of Q. Then $\mathfrak{m}_0 = (2)^3(17)^2(19)$

so Qm,1 consists of all x ∈ Q satisfying

\[ x > 0, \qquad x \equiv 1 \bmod 2^3, \qquad x \equiv 1 \bmod 17^2, \qquad x \equiv 1 \bmod 19. \]

For example, if x = a/b in lowest terms for a, b ∈ Z and b ≠ 0 then the condition at the place 2 tells us

a and b are odd and $ab^{-1} \equiv 1$ mod 8. This looks similar to the Chinese remainder

theorem. The connection is made clear in the weak approximation theorem.
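
The conditions of Example 2.3.3 translate into a short membership test. The Python sketch below is an illustration under the assumption that x is given as an exact rational; the function name in_Q_m1 and the constant M0 are introduced here and do not appear in the text. It uses the fact that, for a and b prime to m0, the simultaneous congruences ab⁻¹ ≡ 1 modulo 8, 17² and 19 are together equivalent to a ≡ b mod m0.

```python
from fractions import Fraction
from math import gcd

M0 = 2**3 * 17**2 * 19  # finite part of the modulus in Example 2.3.3


def in_Q_m1(x, m0=M0):
    """Test whether the rational x lies in Q_{m,1} for m = (m0) * infinity."""
    x = Fraction(x)
    if x <= 0:                      # condition at the real place
        return False
    a, b = x.numerator, x.denominator
    if gcd(a, m0) != 1 or gcd(b, m0) != 1:
        return False                # x must be a unit at each prime of m0
    return (a - b) % m0 == 0        # a * b^(-1) = 1 in (Z/m0 Z)^*


if __name__ == "__main__":
    for x in (1, Fraction(43931, 3), Fraction(9, 1), -1, Fraction(1, 2)):
        print(x, in_Q_m1(x))
```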

Weak Approximation Theorem. Let | · |1, . . . , | · |n be inequivalent, nontrivial

valuations on K and let β1, . . . , βn ∈ K∗. Then for any ε > 0 there exists an element

α ∈ K such that |α− βi|i < ε for all i = 1, . . . , n.

Proof. Following the proof in [15], we first prove the existence of elements y1, . . . , yn

in K that satisfy

|yi|i > 1 and |yi|j < 1 for all j ≠ i.

We do this by inducting on n. The base case n = 2 is proven by the negation of the

definition of equivalent valuations. Now suppose there is an element y ∈ K satisfying

|y|1 > 1 and |y|j < 1 for j = 2, . . . , n− 1.

By the base case there exists some t ∈ K such that |t|1 > 1 and |t|n < 1. Now choose

y1 according to

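\[ y_1 = \begin{cases} y & \text{if } |y|_n < 1 \\ y^r t & \text{if } |y|_n = 1 \\ \dfrac{y^r t}{1 + y^r} & \text{if } |y|_n > 1 \end{cases} \]

for a positive integer r yet to be chosen. If |y|n = 1 then $|y_1|_j = |y|_j^r |t|_j$ for all 2 ≤ j ≤ n.

Thus for sufficiently large r, |y1|j < 1 for all 2 ≤ j ≤ n. In the case that |y|n > 1,

note that for each j with |y|j < 1,

\[ \frac{|y^r|_j}{|1 + y^r|_j} \le \frac{1}{|y^{-r}|_j - 1} \longrightarrow 0 \text{ as } r \to \infty. \]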

In all cases we have y1 ∈ K that is “large” at | · |1 and “small” at the other valuations.

We could have picked another valuation to start with, so the same proof produces

y2, . . . , yn with the desired properties.

Now let

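\[ \alpha = \sum_{i=1}^{n} \frac{y_i^r}{1 + y_i^r}\, \beta_i. \]

We claim that α is the element prescribed by the theorem when r is chosen appropriately.

By the triangle inequality,

\[ |\alpha - \beta_i|_i \le \left| \frac{\beta_i}{1 + y_i^r} \right|_i + \sum_{j \ne i} \left| \frac{y_j^r}{1 + y_j^r}\, \beta_j \right|_i . \]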

For any ε > 0 we can choose r large enough so that both terms on the right are less

than ε. This completes the proof.

Remark. When p is an infinite place of K, the statement |α−β|p < ε for small ε > 0

is equivalent to $(\alpha/\beta)_{\mathfrak{p}} > 0$, i.e. α ≡ β mod p. When p is a finite place, recall that

$|\alpha|_{\mathfrak{p}} = c^{v(\alpha)}$ for some real number c, 0 < c < 1. Then we see that |α − β|p < ε is

equivalent to

\[ \left| \frac{\alpha}{\beta} - 1 \right|_{\mathfrak{p}} < \frac{\varepsilon}{|\beta|_{\mathfrak{p}}} =: \varepsilon'. \]

In turn when ε′ is small, say $\varepsilon' < c^n$ for some n, then $v(\alpha/\beta - 1) > n$, which means α/β − 1

lies in $\mathfrak{p}^n$. Recall that this is the same as saying α ≡ β mod $\mathfrak{p}^n$.

So in general we see that |α−β|p < ε is equivalent to α ≡ β mod pn for a sufficiently

large n. As suggested in Example 2.3.3, the reformulation of the weak approximation

theorem in terms of congruences allows us to view it as a generalization of the Chinese

remainder theorem.
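
For K = Q and finitely many distinct primes, the congruence form of weak approximation is exactly the classical Chinese remainder theorem. As a minimal sketch (the function name crt is chosen here, and pairwise coprime moduli are assumed), one can solve the simultaneous congruences incrementally in Python:

```python
def crt(residues, moduli):
    """Return x with x = r_i (mod m_i) for pairwise coprime moduli m_i."""
    x, M = 0, 1
    for r, m in zip(residues, moduli):
        # Choose t so that x + M*t = r (mod m); M is invertible mod m.
        t = ((r - x) * pow(M, -1, m)) % m
        x += M * t
        M *= m
    return x % M


if __name__ == "__main__":
    # An integer close to 1 at the place 2, to 2 at 17, and to 3 at 19.
    alpha = crt([1, 2, 3], [2**3, 17**2, 19])
    print(alpha, alpha % 8, alpha % 289, alpha % 19)
```

Raising the exponents of the prime powers in the moduli corresponds to shrinking ε at the respective finite places.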

Theorem 2.3.4. For every modulus m of K, there is an exact sequence

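\[ 0 \to U_K/U_{\mathfrak{m},1} \to K_{\mathfrak{m}}/K_{\mathfrak{m},1} \to C_K(\mathfrak{m}) \to C(\mathcal{O}_K) \to 0 \]

and isomorphisms

\[ K_{\mathfrak{m}}/K_{\mathfrak{m},1} \cong \prod_{\substack{\mathfrak{p} \text{ real} \\ \mathfrak{p} \mid \mathfrak{m}}} \{\pm 1\} \times \prod_{\mathfrak{p} \mid \mathfrak{m}_0} (\mathcal{O}_K/\mathfrak{p}^{n(\mathfrak{p})})^\times \cong \prod_{\substack{\mathfrak{p} \text{ real} \\ \mathfrak{p} \mid \mathfrak{m}}} \{\pm 1\} \times (\mathcal{O}_K/\mathfrak{m}_0)^\times \]

where Um,1 = UK ∩ Km,1.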

Proof. First, the inclusion Im ↪→ IK induces a homomorphism CK(m) → C(OK).

Consider the sequence

0→ UK → Km → Im → C(OK)→ 0.

We will show that it is exact. In particular, to show Im → C(OK) is surjective,

we must prove that every ideal class is represented by an ideal in Im. Let a be a

fractional ideal; we may write a = bc−1 where b and c are integral ideals. For any

c ∈ c, a · (c) = bc−1(c) is integral so we may assume a is integral in the first place.

Write

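\[ \mathfrak{a} = \prod_{\mathfrak{p} \mid \mathfrak{m}} \mathfrak{p}^{n(\mathfrak{p})}\, \mathfrak{b} \]

where b ∈ Im. For each p | m, choose $\pi_{\mathfrak{p}} \in \mathfrak{p} \setminus \mathfrak{p}^2$ such that πp ≡ 1 mod p. By the

weak approximation theorem, there is some a ∈ OK so that $a \equiv \pi_{\mathfrak{p}}^{n(\mathfrak{p})} \bmod \mathfrak{p}^{n(\mathfrak{p})+1}$

for all p | m. This means we can write

\[ (a) = \prod_{\mathfrak{p} \mid \mathfrak{m}} \mathfrak{p}^{n(\mathfrak{p})}\, \mathfrak{b}' \quad \text{where } \mathfrak{b}' \in I^{\mathfrak{m}} \]

but then $(a)^{-1}\mathfrak{a} \in I^{\mathfrak{m}}$ and this belongs to the same ideal class as a. Hence Im → C(OK)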

is surjective. Next, if a ∈ Im maps to the trivial class in C(OK) then a = (α) for some

α ∈ Km and this α is uniquely determined up to multiplication by a unit u ∈ UK .

This implies exactness of the rest of the sequence.

Now consider the maps $K_{\mathfrak{m},1} \xrightarrow{f} K_{\mathfrak{m}} \xrightarrow{g} I^{\mathfrak{m}}$. By the work above, ker g = UK and

coker g = C(OK). By definition, coker(g ◦ f) = CK(m) and ker(g ◦ f) = Km,1 ∩UK =

Um,1. Finally, f is injective by the definitions of Km and Km,1. Hence by Lemma A.2.1

we have an exact sequence

0→ Um,1 → UK → Km/Km,1 → CK(m)→ C(OK)→ 0.

Next we prove the isomorphisms. Let p | m. If p is an infinite prime we map

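α ∈ Km to the sign (+ or −) of the image of α under the embedding (·)p : K ↪ C. If

p is finite, we map α to $[a][b]^{-1} \in (\mathcal{O}_K/\mathfrak{p}^{n(\mathfrak{p})})^\times$ where α = a/b with a, b ∈ OK relatively prime to m0.

Since a and b are in particular relatively prime to p, it makes sense to define

their equivalence classes and take inverses in $(\mathcal{O}_K/\mathfrak{p}^{n(\mathfrak{p})})^\times$. Consider the map we have

defined:

\[ \phi : K_{\mathfrak{m}} \longrightarrow \prod_{\substack{\mathfrak{p} \text{ real} \\ \mathfrak{p} \mid \mathfrak{m}}} \{\pm 1\} \times \prod_{\mathfrak{p} \mid \mathfrak{m}_0} (\mathcal{O}_K/\mathfrak{p}^{n(\mathfrak{p})})^\times . \]

By the weak approximation theorem and subsequent remark, ϕ is surjective. Moreover,

its kernel is Km,1 by the way this subgroup is defined. This shows the first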

isomorphism; the second is easily concluded from the Chinese remainder theorem.

Corollary 2.3.5. For any m, the ray class group CK(m) is a finite group of order

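\[ h_{\mathfrak{m}} = \frac{h_K\, 2^{r_0}\, N(\mathfrak{m}_0)}{[U_K : U_{\mathfrak{m},1}]} \prod_{\mathfrak{p} \mid \mathfrak{m}_0} \left( 1 - \frac{1}{N(\mathfrak{p})} \right) \]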

where r0 is the number of real primes dividing m.

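Proof. First, $\mathcal{O}_K/\mathfrak{p}^n$ is a local ring with maximal ideal $\mathfrak{p}/\mathfrak{p}^n$; this can be seen by the

correspondence between its ideals and the ideals of OK containing $\mathfrak{p}^n$. Moreover, the

units in $\mathcal{O}_K/\mathfrak{p}^n$ are precisely those elements not in $\mathfrak{p}/\mathfrak{p}^n$. It follows that $(\mathcal{O}_K/\mathfrak{p}^n)^\times$

has order $q^{n-1}(q - 1)$ where q = N(p) = [OK : p]. Then by Theorem 2.3.4,

\[ \begin{aligned} |C_K(\mathfrak{m})| &= |(K_{\mathfrak{m}}/K_{\mathfrak{m},1})/(U_K/U_{\mathfrak{m},1})| \cdot |C(\mathcal{O}_K)| \\ &= \Bigl| \prod_{\mathfrak{p} \text{ real}} \{\pm 1\} \times \prod_{\mathfrak{p} \mid \mathfrak{m}_0} (\mathcal{O}_K/\mathfrak{p}^{n(\mathfrak{p})})^\times \Bigr| \, [U_K : U_{\mathfrak{m},1}]^{-1} \cdot h_K \\ &= h_K\, 2^{r_0}\, [U_K : U_{\mathfrak{m},1}]^{-1} \prod_{\mathfrak{p} \mid \mathfrak{m}_0} N(\mathfrak{p})^{n(\mathfrak{p})-1} (N(\mathfrak{p}) - 1). \end{aligned} \]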

Furthermore, this expression is equal to the desired one when we factor out N(m0)

from the product on the right, using that N is multiplicative.

The most important implication of Corollary 2.3.5 is that every ray class group

CK(m) is finite. Let’s take a look at some examples.

Example 2.3.6. For K = Q, the narrow class group is trivial.

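Example 2.3.7. Let K = Q(√n) for a squarefree integer n > 1. Here there are two real primes and

$U_K = \{\pm\varepsilon^k : k \in \mathbb{Z}\} \cong \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}$ for a fundamental unit ε. Let $\bar{\varepsilon}$ be the conjugate of ε. Then, taking m to be the product of the two real primes,

\[ h_{\mathfrak{m}} = \begin{cases} 2h_K & \text{if } \varepsilon, \bar{\varepsilon} \text{ have the same sign} \\ h_K & \text{otherwise.} \end{cases} \]

Also note that N(ε) = −1 if and only if ε and $\bar{\varepsilon}$ have different signs. For the first few

values of n we have

n    hK    ε             N(ε)

2    1     1 + √2        −1

3    1     2 + √3        1

5    1     (1 + √5)/2    −1

6    1     5 + 2√6       1

so we see that the narrow class numbers for Q(√3) and Q(√6) are 2, whereas the

others have narrow class number 1.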
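
The entries of this table can be reproduced by brute force. The sketch below is an illustration, not the method of the text: it assumes n > 1 is squarefree and searches for the smallest y ≥ 1 such that x² − ny² = ±4 has a solution, trying the smaller x first, so that ε = (x + y√n)/2. The class number hK is taken as an input rather than computed, and the function names are hypothetical.

```python
from math import isqrt


def fundamental_unit(n):
    """Return (x, y, norm) with epsilon = (x + y*sqrt(n)) / 2.

    Assumes n > 1 is squarefree. For each y the candidate with norm -1
    has the smaller x, so it is tested first.
    """
    y = 1
    while True:
        for norm in (-1, 1):
            x_squared = n * y * y + 4 * norm
            if x_squared > 0:
                x = isqrt(x_squared)
                if x * x == x_squared:
                    return x, y, norm
        y += 1


def narrow_class_number(n, h):
    """Narrow class number of Q(sqrt(n)) given its class number h."""
    _, _, norm = fundamental_unit(n)
    return 2 * h if norm == 1 else h


if __name__ == "__main__":
    for n in (2, 3, 5, 6):
        print(n, fundamental_unit(n), narrow_class_number(n, 1))
```

For example, the output (4, 2, 1) for n = 3 encodes ε = (4 + 2√3)/2 = 2 + √3 with norm 1.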

Example 2.3.8. Let’s look at the important example of cyclotomic extensions. Let

L = Q(ζm) where $\zeta_m = e^{2\pi i/m}$ for m > 2. Define the modulus m = (m)∞ on Q. We

claim that all ramified primes of L divide m. The minimal polynomial of ζm over Q

is well known: it is the mth cyclotomic polynomial Φm(x). These polynomials are

constructed by setting Φ1(x) = x − 1 and recursively defining

\[ \Phi_m(x) = \frac{x^m - 1}{\prod_{d \mid m,\ d < m} \Phi_d(x)}. \]

The relevant property we will use is that Φm(x) is a factor of $x^m - 1$. For a prime

p, consider $x^m - 1$ over the finite field Fp. Since the formal derivative of $x^m - 1$ is

$mx^{m-1}$, these polynomials are relatively prime unless m = 0 in Fp, i.e. p | m. In

particular this shows that if p ∤ m, $x^m - 1$ is separable mod p and so are all of its

factors, in particular Φm(x). Hence by Theorem 1.3.4, p is unramified in L.

This allows us to consider the Artin map ϕL/Q : ImQ → Gal(L/Q) ∼= (Z/mZ)×.

We know from Section 2.2 that in any abelian extension L/K, the Artin map takes

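a prime p ∈ ImK to the Frobenius automorphism $x \mapsto x^q$ where q = |OK/p|. In this

example K = Q and L = Q(ζm) so OK = Z and p = (p) for a prime integer p. The

isomorphism Gal(L/Q) ∼= (Z/mZ)× is exhibited by $(\sigma : \zeta_m \mapsto \zeta_m^k) \mapsto [k]$. Using this

description, we can extend ϕL/Q to all fractional ideals. If $a = \prod p^{s_p}$ and $b = \prod r^{t_r}$

then we should have

\[ \begin{aligned} \phi_{L/\mathbb{Q}}\left( \tfrac{a}{b}\mathbb{Z} \right) &= \prod_{p \mid a} \phi_{L/\mathbb{Q}}(p\mathbb{Z})^{s_p} \Bigl( \prod_{r \mid b} \phi_{L/\mathbb{Q}}(r\mathbb{Z})^{t_r} \Bigr)^{-1} \\ &= \prod_{p \mid a} |\mathbb{Z}/p\mathbb{Z}|^{s_p} \Bigl( \prod_{r \mid b} |\mathbb{Z}/r\mathbb{Z}|^{t_r} \Bigr)^{-1} \\ &= \prod_{p \mid a} p^{s_p} \Bigl( \prod_{r \mid b} r^{t_r} \Bigr)^{-1} = [a][b]^{-1}. \end{aligned} \]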

It’s easy to see that the kernel of the Artin map is precisely PQ(m, 1):

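\[ \begin{aligned} P_{\mathbb{Q}}(\mathfrak{m}, 1) &= \{ (\alpha) \in I_{\mathbb{Q}} \mid \alpha \equiv 1 \bmod \mathfrak{m} \} \\ &= \{ (a/b)\mathbb{Z} \mid a, b \in \mathbb{Z} \text{ and } a \equiv b \bmod m \} \\ &= \{ (a/b)\mathbb{Z} \mid [a][b]^{-1} = 1 \in (\mathbb{Z}/m\mathbb{Z})^\times \}. \end{aligned} \]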

Moreover, the Artin map in this case is clearly surjective (this will be proven in general

in Section 2.5). This implies that the ray class group for m = (m)∞ is isomorphic to

(Z/mZ)×.
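
The description of the Artin map for Q(ζm)/Q as (a/b)Z ↦ [a][b]⁻¹ is concrete enough to compute with. A minimal Python sketch follows, with the function name artin_map introduced here for illustration; it represents the ideal xZ by its positive generator, which is the convention forced by the factor ∞ in the modulus.

```python
from fractions import Fraction
from math import gcd


def artin_map(x, m):
    """Image in (Z/mZ)^* of the fractional ideal xZ, for x prime to m.

    The ideal xZ is represented by its positive generator |x| = a/b,
    and the image is [a][b]^(-1).
    """
    x = abs(Fraction(x))
    a, b = x.numerator, x.denominator
    if a == 0 or gcd(a, m) != 1 or gcd(b, m) != 1:
        raise ValueError("x must be nonzero and relatively prime to m")
    return (a * pow(b, -1, m)) % m


if __name__ == "__main__":
    m = 7
    print(artin_map(Fraction(3, 5), m))   # [3][5]^(-1) in (Z/7Z)^*
    print(artin_map(8, m))                # 8 = 1 mod 7, so (8) is in the kernel
```

On a prime ideal pZ with p ∤ m the function returns p mod m, the class corresponding to the Frobenius automorphism ζm ↦ ζm^p.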

We can use Corollary 2.3.5 to get even more information out of this example. For

m = (m)∞, the above shows that |CQ(m)| = φ(m). Plugging this into the ray class

formula, we have

\[ \varphi(m) = \frac{h_K\, 2^{r_0}\, N(\mathfrak{m}_0)}{[U_K : U_{\mathfrak{m},1}]} \prod_{\mathfrak{p} \mid \mathfrak{m}_0} \left( 1 - \frac{1}{N(\mathfrak{p})} \right). \]

Notice that the numerical norm on Q just evaluates to the integer itself, so we can

multiply N(m0) = m back into the product on the right to obtain

\[ \varphi(m) = \frac{h_K\, 2^{r_0}}{[U_K : U_{\mathfrak{m},1}]} \prod_{p \mid m} p^{n(p)-1}(p - 1) \]

where n(p) is the exponent of p in the prime factorization of m. This product is now

easily recognized as φ(m), so we can cancel this from both sides and rearrange:

\[ [U_K : U_{\mathfrak{m},1}] = h_K\, 2^{r_0}. \]
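
For the base field Q every quantity in this identity is known: the class number is 1, there is r0 = 1 real prime in the modulus, and the only unit of Z that is positive and congruent to 1 is 1 itself, so the unit index is 2. The Python sketch below (function names are illustrative) plugs these values into the formula of Corollary 2.3.5 and compares the result with a direct count of (Z/mZ)×.

```python
from fractions import Fraction
from math import gcd


def prime_divisors(m):
    """Sorted list of the primes dividing the positive integer m."""
    out, d = [], 2
    while d * d <= m:
        if m % d == 0:
            out.append(d)
            while m % d == 0:
                m //= d
        d += 1
    if m > 1:
        out.append(m)
    return out


def ray_class_number_Q(m):
    """Order of C_Q((m) * infinity) from the formula of Corollary 2.3.5."""
    h, r0, unit_index = 1, 1, 2     # values for the base field Q
    value = Fraction(h * 2**r0 * m, unit_index)
    for p in prime_divisors(m):
        value *= 1 - Fraction(1, p)
    return int(value)


def euler_phi(m):
    """Euler's totient function by direct counting."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)


if __name__ == "__main__":
    for m in (3, 8, 12, 23, 36):
        print(m, ray_class_number_Q(m), euler_phi(m))
```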

In general it is a very hard problem to compute the class number of a cyclotomic field

so we end the discussion here. The study of the cyclotomic fields is closely related

to 20th Century pursuits of a proof of Fermat’s Last Theorem. For example, unique

factorization can be used to prove FLT when Q(ζm) has class number 1 but this fails

for m as small as 23. To worsen matters, the class number of Q(ζm) is not even known

for sure for m > 70, and even assuming the Generalized Riemann Hypothesis only

allows for computations [22] up to m = 163.

2.4 L-series and Dirichlet Density

In these next two sections we delve into one of my favourite topics in number theory:

Dirichlet series. At the end of Section 2.5 we will be able to prove Dirichlet’s theorem

on primes in arithmetic progression, one of the cornerstones of early analytic number

theory.

Definition. For any positive integer m, a Dirichlet character mod m is a homo-

morphism χ : (Z/mZ)× → C×. It is typical to extend a character to the entire ring

of integers by

\[ \chi(n) = \begin{cases} \chi([n]) & \text{if } \gcd(n, m) = 1 \\ 0 & \text{if } \gcd(n, m) \ne 1. \end{cases} \]

Note that since (Z/mZ)× is a finite group for all m ∈ Z+, χ([n]) is a root of unity

for all congruence classes [n] ∈ (Z/mZ)×. In other words, a Dirichlet character is a

multiplicative homomorphism from (Z/mZ)× to the circle group S1 ⊂ C.

Example 2.4.1. The trivial character mod m, which takes every [n] ∈ (Z/mZ)× to 1

(and every other integer to 0), is called the principal Dirichlet character, denoted

χ0.

Definition. For a Dirichlet character χ, we define a complex-valued function

\[ L(s, \chi) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s} \]

called a Dirichlet L-series.

The product form for L-series is

\[ L(s, \chi) = \prod_{p \nmid m} \frac{1}{1 - \chi(p)p^{-s}} \]

which may be obtained by using unique factorization of n and multiplicativity of χ.

Note that both expressions for L(s, χ) converge when Re(s) > 1. The most important

and probably the most thoroughly studied example of an L-series is the Riemann zeta

function:

Example 2.4.2. The Riemann zeta function is the L-series

\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = L(s, \chi_0), \]

where χ0 is the principal Dirichlet character for m = 1. Notice that for any m > 1,

L(s, χ0) differs from ζ(s) only by the factors $\frac{1}{1 - p^{-s}}$ for p | m. It is well known in analytic

number theory [23] that ζ(s) extends to a meromorphic function on the half-plane

Re(s) > 0 and satisfies

\[ \zeta(s) = \frac{1}{s - 1} + g(s) \]

for some holomorphic function g(s) defined on Re(s) > 0.
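
The pole of ζ(s) at s = 1 can be observed numerically for real s > 1. The sketch below is an illustration only: it approximates ζ(s) by a partial sum plus the first two tail terms of the Euler–Maclaurin formula, which is a standard numerical device and not a construction from the text. The function name zeta_approx and the cutoff N are assumptions.

```python
import math


def zeta_approx(s, N=1000):
    """Approximate zeta(s) for real s > 1.

    Uses sum_{n < N} n^(-s) plus the tail estimate
    N^(1-s)/(s-1) + N^(-s)/2 from the Euler-Maclaurin formula.
    """
    partial = sum(n ** (-s) for n in range(1, N))
    return partial + N ** (1 - s) / (s - 1) + 0.5 * N ** (-s)


if __name__ == "__main__":
    print(zeta_approx(2.0), math.pi ** 2 / 6)
    for s in (1.1, 1.01, 1.001):
        print(s, (s - 1) * zeta_approx(s))
```

As s decreases to 1 the printed values of (s − 1)ζ(s) approach 1, as expected for a simple pole with residue 1.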

As a result of the relation between L(s, χ) and ζ(s), we have the following analytic

properties of L-series. Because we are focusing on the algebra of number fields, we

leave out many analytic proofs but provide references for where one may find them.

Proposition 2.4.3 ([23]). If χ is a nonprincipal Dirichlet character, then L(s, χ)

converges for all Re(s) > 0 and L(1, χ) ≠ 0.
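
As a concrete instance, take the nonprincipal character mod 4. Its L-series at s = 1 is the Leibniz series 1 − 1/3 + 1/5 − ⋯ = π/4, which is visibly nonzero, and at s = 2 the series and the Euler product can be compared directly. The Python sketch below uses truncations chosen for illustration; the names chi4, L_series and L_euler are not from the text.

```python
import math


def chi4(n):
    """The nonprincipal Dirichlet character mod 4, extended by 0."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1


def L_series(s, terms):
    """Partial sum of L(s, chi4) over n = 1..terms."""
    return sum(chi4(n) / n ** s for n in range(1, terms + 1))


def primes_up_to(bound):
    """Sieve of Eratosthenes."""
    sieve = [True] * (bound + 1)
    sieve[0:2] = [False, False]
    for i in range(2, math.isqrt(bound) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(sieve[i * i::i])
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def L_euler(s, bound):
    """Truncated Euler product of L(s, chi4) over primes p <= bound, p != 2."""
    product = 1.0
    for p in primes_up_to(bound):
        if p != 2:
            product *= 1 / (1 - chi4(p) * p ** (-s))
    return product


if __name__ == "__main__":
    print(L_series(1, 200001), math.pi / 4)
    print(L_series(2, 10000), L_euler(2, 10000))
```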

Proposition 2.4.4 ([15]). For an L-series L(s, χ), define

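\[ s(x) = \sum_{n \le x} \chi(n) \]

and suppose there exist real numbers a, b > 0 such that $|s(x)| \le ax^b$ for all x ≥ 1.

Then

(1) For any ε, δ > 0, L(s, χ) is uniformly convergent on the domain

\[ D = \left\{ s \in \mathbb{C} : \mathrm{Re}(s) \ge b + \delta,\ |\mathrm{Arg}(s - b)| \le \tfrac{\pi}{2} - \varepsilon \right\}. \]

(2) L(s, χ) is analytic on the half-plane Re(s) > b.

(3) For all $s \in D_0 = \left\{ s \in \mathbb{C} : \mathrm{Re}(s) \ge 1,\ |\mathrm{Arg}(s - 1)| \le \tfrac{\pi}{2} - \varepsilon \right\}$,

\[ \lim_{s \to 1} (s - 1)L(s, \chi) = \lim_{x \to \infty} \frac{s(x)}{x}. \]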

We can extend the idea of Riemann’s zeta function to an arbitrary algebraic

number field in the following way.

Definition. Let K be an algebraic number field and for any nonzero ideal a ⊂ OK ,

let N(a) denote its numerical norm. Then the Dedekind zeta function for K is

the complex-valued function

\[ \zeta_K(s) = \sum_{\mathfrak{a} \subset \mathcal{O}_K} \frac{1}{N(\mathfrak{a})^s}. \]

Notice that when K = Q, the zeta function is simply the Riemann zeta function.

An even further generalization of ζK(s) is obtained by taking a modulus m of K and

letting k be a class in the ray class group CK(m), and defining

\[ \zeta(s, k) = \sum_{\mathfrak{a} \in k} \frac{1}{N(\mathfrak{a})^s}. \]

In particular when m = 1, $\zeta_K(s) = \sum_{k \in C(\mathcal{O}_K)} \zeta(s, k)$.

We are interested in computing the limit of (s− 1)ζ(s, k) as s→ 1. If we write

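\[ \zeta(s, k) = \sum_{\mathfrak{a} \subset \mathcal{O}_K} \frac{\chi(\mathfrak{a})}{N(\mathfrak{a})^s} \]

where χ(a) = 1 if a ∈ k and 0 otherwise, then s(x) simply counts the number of integral ideals

in k with norm less than or equal to x. By Proposition 2.4.4,

\[ \lim_{s \to 1} (s - 1)\zeta(s, k) = \lim_{x \to \infty} \frac{s(x)}{x}. \]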

To evaluate the limit on the right, we require a bit more machinery.

For a lattice L in an n-dimensional vector space V (as in Section 1.7), and any

bounded region D ⊂ V , let T (γ) denote the number of points of γLv in D, where γ > 0

is real and Lv := v + L for some vector v ∈ V . Define the function M(t) = T (t−1).

Then the Euclidean volume (or Lebesgue measure) of D is given by [15]

\[ \mathrm{vol}(D) = \lim_{t \to \infty} \frac{M(t)}{t^n}. \]

The plan is to identify s(x)/x with $M(t)/t^n$ for suitably chosen L, D and M(t). First we

observe the following.

Lemma 2.4.5. Each ray class k ∈ CK(m) contains an integral ideal.

Proof. Since CK(m) is finite, each prime not dividing m has some power in the trivial

class. If $\mathfrak{a} = \mathfrak{a}_1\mathfrak{a}_2^{-1}$ is an ideal in the class k, where a1 and a2 are integral ideals, then

$\mathfrak{a}_2^t$ is trivial for some t > 1. Thus $\mathfrak{a}\mathfrak{a}_2^t$ is an integral ideal in $k = k\mathfrak{a}_2^t$.

Now suppose a is an integral ideal in k with N(a) ≤ n for a fixed n ∈ N. Then

for any integral ideal $\mathfrak{b} \in k^{-1}$, ab is trivial in CK(m) so ab = (α) for some α ∈ b ∩ Km,1

with N(α) ≤ nN(b). On the other hand, if we have such an α, then $\mathfrak{a} = (\alpha)\mathfrak{b}^{-1} \in k$

has norm less than or equal to n. We summarize this in the following lemma.

Lemma 2.4.6. For any n, the value s(n) is the number of principal ideals (α) such

that α ∈ b ∩Km,1 and N(α) ≤ nN(b). Furthermore, there is some α0 ∈ K satisfying

α0 ≡ 1 mod m0 and α0 ≡ 0 mod b

such that α ≡ α0 mod m0b for every α counted by s(n).

The existence of such an α0 is guaranteed by the weak approximation theorem

(Section 2.3) and the fact that b ∈ Im implies b is relatively prime to m0.

Now let β1, . . . , βn be a basis for the ideal m0b, where n = [K : Q]. Then we may

write any α from Lemma 2.4.6 in the form

\[ \alpha = \alpha_0 + \sum_{i=1}^{n} a_i\beta_i. \]

Moreover, $\alpha_0 = \sum h_i\beta_i$ for some hi ∈ Q. To connect ideals with lattices once again,

let L be the lattice in $\mathbb{R}^n$ of points with integer coordinates, i.e. $L = \mathbb{Z}^n$. Take

v = (hi) and recall the notation Lv = v + L. Then the map

\[ L_v \longrightarrow K^*, \qquad (x_i) \longmapsto \sum x_i\beta_i \]

gives a one-to-one correspondence between points in Lv and elements α ∈ K∗ which

satisfy Lemma 2.4.6. We also need

Lemma 2.4.7. Um,1 = UK ∩Km,1 is the direct product of a finite cyclic group with a

free abelian group of rank r+ s− 1, where r and s are respectively the number of real

and complex pairs of embeddings K ↪→ C.

Proof. Recall from Section 1.10 the map L : UK → Rr+s. We used L to embed UK

as an (r + s− 1)-dimensional lattice in Rr+s. Corollary 2.3.5 says that UK/Um,1 is a

finite group, which implies that L(Um,1) has finite index in L(UK) and so it too is an

(r + s− 1)-dimensional lattice.

From this, we have

Lemma 2.4.8 ([15]). Let wm denote the number of roots of unity in Um,1. Then there

are exactly wm · s(n) points (x1, . . . , xn) ∈ Lv which satisfy

(1) $\alpha = \sum_{i=1}^{n} x_i\beta_i$.

(2) α ≡ 1 mod m∞.

(3) 0 < N(α) ≤ nN(b).

(4) $L(\alpha) = c_0 w_0 + \sum_{i=1}^{n} c_i w_i$, where 0 ≤ ci < 1, $w_0 = (\underbrace{1, \ldots, 1}_{r}, \underbrace{2, \ldots, 2}_{s})$ and wi =

L(ui), the images of the generators of the unit group Um,1.

Proof sketch. We know there are s(n) principal ideals (α) satisfying (2) and (3) by

Lemma 2.4.6. Each ideal (α) may be generated by any α′ = uα, where u ∈ Um,1.

Out of all these elements, exactly wm satisfy (4). Finally, the map L : UK → Rr+s

restricted to Um,1 provides the connection between these ideals and points in Lv.

Now let D be the set of all points (x1, . . . , xn) ∈ Rn satisfying Lemma 2.4.8 such

that each xi ≥ 0. We skip straight to the statement of the volume; see section IV.2

of [15] to see how it is derived.

Proposition 2.4.9. As before, let r0 be the number of real primes dividing a modulus

m. For D defined above,

\[ \mathrm{vol}(D) = \frac{2^{r - r_0}\, \mathrm{reg}(\mathfrak{m})\, (2\pi)^s}{N(\mathfrak{m}_0\mathfrak{b}) \sqrt{|d_K|}} \]

where reg(m) is the regulator for Um,1, as defined in Section 1.10.

Recall that reg(m) is the determinant of the matrix whose ith row is L(ui). Above

we defined r0 to be the number of real primes dividing m∞. We can extend the norm

to any modulus by setting N(m∞) = 2r0 , so that N(m) = 2r0N(m0). This leads to

the main result.

Theorem 2.4.10. Let K be a number field, m a modulus of K and k a class of ideals

in CK(m). Then

\[ \lim_{s \to 1} (s - 1)\zeta(s, k) = \frac{2^r (2\pi)^s\, \mathrm{reg}(\mathfrak{m})}{N(\mathfrak{m})\, w_{\mathfrak{m}} \sqrt{|d_K|}} \]

where r is the number of real primes of K, s is the number of pairs of complex primes

of K and wm is the number of roots of unity in Um,1.

Corollary 2.4.11. Let ζK(s) be the Dedekind zeta function for a number field K.

Then

\[ \lim_{s \to 1} (s - 1)\zeta_K(s) = \frac{2^r (2\pi)^s\, \mathrm{reg}(K)}{w_K \sqrt{|d_K|}}\, h_K \]

where wK = |µ(K)| and hK is the class number.

Proof. Remember that ζK(s) coincides with the sum of all the ζ(s, k) for m = 1,

i.e. k are the distinct ideal classes in C(OK). Taking the sum of the formula in

Theorem 2.4.10 over all k ∈ C(OK) gives the result.
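
The formula can be tested against a direct count in the simplest imaginary case K = Q(i), where r = 0, s = 1, wK = 4, |dK| = 4, hK = 1 and the regulator is 1, so the predicted limit is 2π/(4 · 2) = π/4. By Proposition 2.4.4 this limit equals the limit of s(x)/x, where s(x) counts nonzero ideals of Z[i] of norm at most x. In the Python sketch below (names are illustrative) each ideal (a + bi) is counted through its four generators.

```python
from math import isqrt, pi


def count_ideals_Zi(x):
    """Number of nonzero ideals of Z[i] with norm at most x.

    Z[i] is a PID with unit group {1, -1, i, -i}, so each nonzero ideal
    corresponds to exactly four lattice points (a, b) with a^2 + b^2 <= x.
    """
    r = isqrt(x)
    points = 0
    for a in range(-r, r + 1):
        b_max = isqrt(x - a * a)
        points += 2 * b_max + 1
    return (points - 1) // 4        # drop the origin, divide by the units


if __name__ == "__main__":
    for x in (10**2, 10**4, 10**6):
        print(x, count_ideals_Zi(x) / x, pi / 4)
```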

Example 2.4.12. In the case when K = Q, the Riemann zeta function has a simple

pole at s = 1 since by Corollary 2.4.11,

\[ \lim_{s \to 1} (s - 1)\zeta(s) = 1. \]

This is a well-known fact about the Riemann zeta function, however our work on

ζK(s) gives us a simple proof. What’s more, the Dedekind zeta function for any

number field can be analytically continued to the whole complex plane except for a

simple pole at s = 1. For more information on this, see section 5.1 in [3].

Next we extend L-series to arbitrary number fields in a similar fashion to what we did with zeta functions. Let $\mathfrak{m}$ be a modulus of $K$ and let $\chi$ be any multiplicative function $\chi : C_K(\mathfrak{m}) \to \mathbb{C}^\times$. We extend $\chi$ to a character on all of $I^{\mathfrak{m}}$ by defining $\chi(\mathfrak{a})$ for an ideal $\mathfrak{a} \in I^{\mathfrak{m}}$ to be the value of $\chi$ at the ideal class $[\mathfrak{a}]$ in $C_K(\mathfrak{m})$.

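Definition. The L-series for $\chi$ is

\[ L(s,\chi) = \sum_{\mathfrak{a}} \frac{\chi(\mathfrak{a})}{N(\mathfrak{a})^s} \]

where the sum is taken over all $\mathfrak{a} \in I^{\mathfrak{m}}$, i.e. all integral ideals relatively prime to $\mathfrak{m}$.

Note that since $\chi(\mathfrak{a})$ only depends on $\mathfrak{k} = [\mathfrak{a}]$, we may express $L(s,\chi)$ in terms of zeta functions as we did with the Dedekind zeta function:

\[ L(s,\chi) = \sum_{\mathfrak{k}\in C_K(\mathfrak{m})} \chi(\mathfrak{k})\,\zeta(s,\mathfrak{k}). \]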

As with L-series over the rational field, we have

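Proposition 2.4.13 (Product Formula). Fix a modulus $\mathfrak{m}$ of a number field $K$. For all $s \in \mathbb{C}$ with $\operatorname{Re}(s) > 1$ and for any character $\chi : I^{\mathfrak{m}} \to \mathbb{C}^\times$, $L(s,\chi)$ may be expressed as the uniform limit of the product

\[ L(s,\chi) = \prod_{\mathfrak{p}\nmid\mathfrak{m}} \left(1 - \frac{\chi(\mathfrak{p})}{N(\mathfrak{p})^s}\right)^{-1}. \]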

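Proof. Let $\mathfrak{p}$ be any prime ideal in $\mathcal{O}_K$. Then the series

\[ \left(1 - \frac{\chi(\mathfrak{p})}{N(\mathfrak{p})^s}\right)^{-1} = 1 + \frac{\chi(\mathfrak{p})}{N(\mathfrak{p})^s} + \frac{\chi(\mathfrak{p}^2)}{N(\mathfrak{p}^2)^s} + \frac{\chi(\mathfrak{p}^3)}{N(\mathfrak{p}^3)^s} + \dots \]

converges absolutely. Suppose $\mathfrak{p}_1, \dots, \mathfrak{p}_r$ are all the primes in $I^{\mathfrak{m}}$ with norm at most $n$; by the Minkowski bound from Section 1.7 there are finitely many of these. Then

\[ \prod_{i=1}^{r} \left(1 - \frac{\chi(\mathfrak{p}_i)}{N(\mathfrak{p}_i)^s}\right)^{-1} = \sum \frac{\chi(\mathfrak{p}_1^{a_1}\cdots\mathfrak{p}_r^{a_r})}{N(\mathfrak{p}_1^{a_1}\cdots\mathfrak{p}_r^{a_r})^s} = \sum_{\substack{\mathfrak{a}\in I^{\mathfrak{m}} \\ N(\mathfrak{a})\le n}} \frac{\chi(\mathfrak{a})}{N(\mathfrak{a})^s}. \]

Rearranging the terms of the L-series, we see that

\[ \left| L(s,\chi) - \prod_{N(\mathfrak{p})\le n} \left(1 - \frac{\chi(\mathfrak{p})}{N(\mathfrak{p})^s}\right)^{-1} \right| \le \left| \sum_{N(\mathfrak{a})>n} \frac{\chi(\mathfrak{a})}{N(\mathfrak{a})^s} \right|. \]

$L(s,\chi)$ converges for all $\operatorname{Re}(s) > 1$ (in fact for all $\operatorname{Re}(s) > 0$ as with L-series over $\mathbb{Q}$; see section 5.2 of [3]) so the remainder term on the right must tend to 0 as $n \to \infty$. Hence for all $\operatorname{Re}(s) > 1$,

\[ L(s,\chi) = \prod_{\mathfrak{p}\nmid\mathfrak{m}} \left(1 - \frac{\chi(\mathfrak{p})}{N(\mathfrak{p})^s}\right)^{-1}. \]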
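The product formula can be observed numerically in the simplest case $K = \mathbb{Q}$. The following is a minimal sketch, assuming the nonprincipal Dirichlet character modulo 4 as the character and $s = 2$; the helper names (`chi4`, `primes_up_to`, `l_series`, `euler_product`) and the truncation bounds are choices made here for illustration, not part of the text.

```python
def chi4(n):
    """Nonprincipal Dirichlet character modulo 4."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(limit**0.5) + 1):
        if sieve[i]:
            sieve[i*i::i] = bytearray(len(range(i*i, limit + 1, i)))
    return [i for i in range(limit + 1) if sieve[i]]

def l_series(s, terms):
    """Partial sum of the Dirichlet series sum chi(n) / n^s."""
    return sum(chi4(n) / n**s for n in range(1, terms + 1))

def euler_product(s, prime_bound):
    """Partial Euler product over primes p <= prime_bound."""
    product = 1.0
    for p in primes_up_to(prime_bound):
        product /= 1.0 - chi4(p) / p**s
    return product

series_value = l_series(2.0, 100_000)
product_value = euler_product(2.0, 10_000)
print(series_value, product_value)
```

Both truncations approximate the same value, and the gap between them shrinks as the prime bound grows, which is the content of the remainder estimate in the proof.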

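The function $\log z$ is well-known from complex analysis. One typically restricts the argument of $z$ to $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$ for $\operatorname{Re}(z) > 0$ (called the principal branch of the logarithm) and writes its series expansion, valid for $|z| < 1$, as

\[ -\log(1-z) = z + \frac{z^2}{2} + \frac{z^3}{3} + \dots = \sum_{n=1}^{\infty} \frac{z^n}{n}. \]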

It is also known that every L-series satisfies

\[ \log L(s,\chi) = \sum_{\mathfrak{p}\in I^{\mathfrak{m}}} \frac{\chi(\mathfrak{p})}{N(\mathfrak{p})^s} + g_\chi(s) \]

for some function $g_\chi$ which is bounded on a neighborhood of $s = 1$. Consult [15] or [23] for details of these and other analytic properties of $L(s,\chi)$.

Example 2.4.14. Suppose there are only a finite number of primes $p \in \mathbb{Z}$. Then $\zeta(s) = \zeta_{\mathbb{Q}}(s)$ would have to be bounded near $s = 1$. Recall that $\lim_{s\to 1}(s-1)\zeta(s) = 1$ by Example 2.4.12. Then $(s-1)\zeta(s)$ is also bounded near $s = 1$. This means

\[ \log(s-1) = \log\bigl((s-1)\zeta(s)\bigr) - \log\zeta(s) \]

is bounded near $s = 1$, which of course is impossible since $\log(s-1) \to -\infty$ as $s \to 1$. This is a rather neat proof that there are an infinite number of rational primes using the Riemann zeta function. Moreover, we showed that

\[ \log\zeta(s) \sim -\log(s-1) \]

where $f(z) \sim g(z)$ as usual means

\[ \lim_{z\to 1} |f(z) - g(z)| < \infty. \]

This generalizes in an important way.
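The relation $\log\zeta(s) \sim -\log(s-1)$ can be checked numerically for real $s$ slightly larger than 1. The sketch below approximates $\zeta(s)$ by a truncated Euler-Maclaurin expansion; this approximation, the function name `zeta` and the cutoff are assumptions made for this illustration and are not taken from the text.

```python
import math

def zeta(s, cutoff=1000):
    """Approximate zeta(s) for real s > 1 by Euler-Maclaurin summation."""
    head = sum(n**-s for n in range(1, cutoff))
    # Integral term, half of the boundary term, and first derivative correction.
    tail = cutoff**(1 - s) / (s - 1) + 0.5 * cutoff**-s + s * cutoff**(-s - 1) / 12
    return head + tail

differences = []
for eps in (1e-1, 1e-2, 1e-3):
    s = 1 + eps
    difference = math.log(zeta(s)) + math.log(s - 1)
    differences.append(difference)
    print(eps, (s - 1) * zeta(s), difference)
```

The printed differences $\log\zeta(s) + \log(s-1)$ stay bounded (in fact they shrink) as $s \to 1^+$, consistent with $(s-1)\zeta(s) \to 1$.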

Definition. Let $K$ be an algebraic number field and $S$ a set of prime ideals in $\mathcal{O}_K$. If there exists a real number $\delta$ such that

\[ \sum_{\mathfrak{p}\in S} \frac{1}{N(\mathfrak{p})^s} \sim -\delta\log(s-1) \]

then $S$ is said to have Dirichlet density $\delta$, denoted $\delta(S) = \delta$.

Example 2.4.14 shows that the set of rational primes has Dirichlet density δ = 1.

In general, establishing that a set has nonzero density is important for the following

reason.

Proposition 2.4.15. For any set $S$ whose Dirichlet density $\delta(S)$ is defined, $0 \le \delta(S) \le 1$, and if $\delta(S) \ne 0$ then $S$ is an infinite set.

Proof. The first statement comes from the more general fact that if $T \subseteq S$ then $\delta(T) \le \delta(S)$. This in turn is a result of the fact that $\sum_{\mathfrak{p}\in S} \frac{1}{N(\mathfrak{p})^s}$ cannot be negative for $s \in \mathbb{R}$ sufficiently close to $s = 1$. To prove the second statement, consider the contrapositive: if $S$ is finite then

\[ \sum_{\mathfrak{p}\in S} \frac{1}{N(\mathfrak{p})^s} \sim 0. \]

This is true by definition of $\sim$ and the desired statement follows.

Consider the set S of primes p ⊂ OK having inertial degree f = 1. We call S the

set of degree 1 primes of K. In the following lemma we prove that there are infinitely

many of these primes in any number field.

Lemma 2.4.16. The set S of degree 1 primes of a number field K is an infinite set.

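Proof. Since there are only a finite number of primes that ramify in $K$, we may assume $S$ excludes these. Then $S$ consists of precisely those primes $\mathfrak{p} \subset \mathcal{O}_K$ whose norm $N(\mathfrak{p})$ is a prime integer. Then

\[ \log\zeta_K(s) \sim \sum_{\mathfrak{p}\subset\mathcal{O}_K} \frac{1}{N(\mathfrak{p})^s} \]

where the $\mathfrak{p}$ are all primes in $\mathcal{O}_K$. For $\mathfrak{p} \notin S$ (again excluding ramified primes, since the sum above is bounded at $s = 1$ for finite sums), $N(\mathfrak{p}) = p^f \ge p^2$, where $(p) = \mathfrak{p} \cap \mathbb{Z}$. At most $[K : \mathbb{Q}]$ of these $\mathfrak{p}$ have their norms equal to a power of the same prime. Therefore we bound the sum by

\[ \left| \sum_{\mathfrak{p}\notin S} \frac{1}{N(\mathfrak{p})^s} \right| \le [K : \mathbb{Q}] \sum_{p \text{ prime}} \frac{1}{p^{2s}}. \]

The sum on the right is bounded at $s = 1$, so

\[ \log\zeta_K(s) \sim \sum_{\mathfrak{p}\in S} \frac{1}{N(\mathfrak{p})^s}. \]

Corollary 2.4.11 now tells us that $\log\bigl((s-1)\zeta_K(s)\bigr)$ is bounded at $s = 1$, but since $\log(s-1)$ is clearly not bounded at $s = 1$, we must have

\[ \sum_{\mathfrak{p}\in S} \frac{1}{N(\mathfrak{p})^s} \sim \log\zeta_K(s) \sim -\log(s-1). \]

This shows that $S$ is an infinite set; in fact, we have shown that $\delta(S) = 1$. This will be important in Section 2.5.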

We will need the next theorem in the course of proving Dirichlet’s theorem on

arithmetic progressions in Section 2.5.

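Theorem 2.4.17. Let $\mathfrak{m}$ be a modulus of $K$ and take $H$ to be a subgroup with $P_K(\mathfrak{m},1) \le H \le I^{\mathfrak{m}}$, setting $h = [I^{\mathfrak{m}} : H]$. If $S$ is a set of primes in $H$ with density $\delta(S)$, then

\[ \delta(S) \le \frac{1}{h}. \]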

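Proof. First note that Corollary 2.3.5 ensures that the index $h$ will be finite. Let $\chi$ be a character defined on $I^{\mathfrak{m}}/H$; we may view $\chi$ as a homomorphism $I^{\mathfrak{m}} \to \mathbb{C}^\times$ whose kernel contains $H$. Then by previous remarks,

\[ \log L(s,\chi) = \sum_{\mathfrak{p}\nmid\mathfrak{m}} \frac{\chi(\mathfrak{p})}{N(\mathfrak{p})^s} + g_\chi(s) \]

for $g_\chi(s)$ convergent on $\operatorname{Re}(s) > 0$ and bounded at $s = 1$. For any $\mathfrak{p} \in I^{\mathfrak{m}}$, the sum $\sum_\chi \chi(\mathfrak{p})$ taken over all characters $\chi$ of $I^{\mathfrak{m}}/H$ is either $h$ if $\mathfrak{p} \in H$ or 0 otherwise. Then we see that

\[ \sum_{\mathfrak{p}\in H} \frac{h}{N(\mathfrak{p})^s} = \sum_{\chi\ne\chi_0} \bigl(\log L(s,\chi) - g_\chi(s)\bigr) + \log\bigl((s-1)L(s,\chi_0)\bigr) - \log(s-1) - g_{\chi_0}(s). \]

We also have that

\[ \sum_{\mathfrak{p}\in S} \frac{1}{N(\mathfrak{p})^s} = -\delta(S)\log(s-1) + g(s) \]

for some $g(s)$ bounded at $s = 1$. Since $S \subseteq H$, Proposition 2.4.15 implies that

\[ \sum_{\mathfrak{p}\in H} \frac{1}{N(\mathfrak{p})^s} - \sum_{\mathfrak{p}\in S} \frac{1}{N(\mathfrak{p})^s} \ge 0 \]

for all real $s > 1$. Hence for all such $s$,

\[ -\left(\frac{1}{h} - \delta(S)\right)\log(s-1) + \sum_{\chi\ne\chi_0} \bigl(\log L(s,\chi) - g_\chi(s)\bigr) + \log\bigl((s-1)L(s,\chi_0)\bigr) - g_{\chi_0}(s) - g(s) > 0. \]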

Each of the $\log L(s,\chi)$ terms is bounded at $s = 1$ unless $L(1,\chi) = 0$, in which case the term becomes negatively infinite at $s = 1$. However since we are assuming that $s$ is real and $s > 1$, $\log(s-1)$ is negative near $s = 1$. Hence for the above expression to be positive, we must have $\frac{1}{h} - \delta(S) \ge 0$, which implies $\delta(S) \le \frac{1}{h}$ as claimed.

Our proof implies that if $\delta(S) = \frac{1}{h}$ then $L(1,\chi) \ne 0$ for any nonprincipal character $\chi$ of $I^{\mathfrak{m}}/H$. In Section 2.10 we will see that the condition $\delta(S) = \frac{1}{[I^{\mathfrak{m}} : H]}$ holds when $S$ is the set of splitting primes and use this to prove a generalization of the Frobenius density theorem for non-abelian extensions.

2.5 The Frobenius Density Theorem

In this section we prove the first main density theorem used in class field theory. In

some ways the Frobenius density theorem has been rendered obsolete by the more

powerful Cebotarev density theorem (Section 2.10), but we felt it is important to see

Frobenius’ earlier result which was intimately related to Dirichlet’s study of primes

in arithmetic progression. At the end of the section, we present a proof of Dirichlet’s

Theorem using the Frobenius density theorem.

For this section, fix a number field K, a Galois extension L/K and let G =

Gal(L/K).

Definition. Let $\sigma \in G$ be an element of order $n$. The division of $\sigma$ is the set of all elements of $G$ which are conjugate to some $\sigma^m$ where $m \in \mathbb{Z}$ is relatively prime to $n$. Equivalently, the division of $\sigma$ is the union of the conjugacy classes of all generators of the cyclic subgroup $\langle\sigma\rangle$.

Lemma 2.5.1. Let $\sigma \in G$ be an element of order $n$, $H = \langle\sigma\rangle$ and $t$ the number of elements in the division of $\sigma$. Then $t = \varphi(n)[G : N_G(H)]$ where $\varphi$ is Euler's function and $N_G(H)$ denotes the normalizer of $H$.

Proof. For all $m$ relatively prime to $n = |\sigma|$, $Z_G(\sigma^m) = Z_G(\sigma)$, where $Z_G$ denotes the centralizer of an element. Thus as $m$ ranges over the integers relatively prime to $n$, we count $\varphi(n)[G : Z_G(\sigma)]$ conjugates. However, some of these need not be distinct. An element is counted $q$ times if it is conjugate to $q$ distinct powers of $\sigma$. Equivalently, $q$ counts the number of conjugates of $\sigma^m$ which are also powers of $\sigma$, i.e. $q$ is the number of distinct automorphisms of $H$ induced under the conjugation action of $G$. Thus $q = [N_G(H) : Z_G(\sigma)]$. Putting this together,

\[ t = \frac{\varphi(n)[G : Z_G(\sigma)]}{[N_G(H) : Z_G(H)]} = \varphi(n)[G : N_G(H)]. \]
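Lemma 2.5.1 is easy to test by brute force in a small group. The sketch below is an illustration under stated assumptions: it takes $G = S_4$ (not a group singled out in the text), represents permutations as tuples, and uses hypothetical helper names (`compose`, `division`, `normalizer`, and so on) to compare the size of a division with $\varphi(n)[G : N_G(H)]$.

```python
from itertools import permutations
from math import gcd

def compose(p, q):
    """Return the permutation i -> p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def power(p, k):
    result = tuple(range(len(p)))
    for _ in range(k):
        result = compose(result, p)
    return result

def order(p):
    identity = tuple(range(len(p)))
    k, current = 1, p
    while current != identity:
        current = compose(current, p)
        k += 1
    return k

def euler_phi(n):
    return sum(1 for m in range(1, n + 1) if gcd(m, n) == 1)

def division(group, sigma):
    """All conjugates of the generators of the cyclic group <sigma>."""
    n = order(sigma)
    gens = [power(sigma, m) for m in range(1, n + 1) if gcd(m, n) == 1]
    return {compose(compose(g, x), inverse(g)) for g in group for x in gens}

def normalizer(group, sigma):
    """Elements g with g H g^{-1} = H, where H = <sigma>."""
    H = {power(sigma, k) for k in range(order(sigma))}
    return [g for g in group
            if {compose(compose(g, h), inverse(g)) for h in H} == H]

G = list(permutations(range(4)))      # the symmetric group S_4
sigma = (1, 2, 3, 0)                  # a 4-cycle
t = len(division(G, sigma))
predicted = euler_phi(order(sigma)) * len(G) // len(normalizer(G, sigma))
print(t, predicted)
```

For the 4-cycle the division consists of all six 4-cycles, matching $\varphi(4)\cdot[S_4 : D_4] = 2 \cdot 3$.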

We now state and prove the Frobenius density theorem.

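Frobenius Density Theorem. Let $\sigma \in G = \operatorname{Gal}(L/K)$, let $t$ denote the number of elements in the division of $\sigma$ and let $S$ be the set of primes $\mathfrak{p} \subset \mathcal{O}_K$ such that there is some prime $\mathfrak{P} \subset \mathcal{O}_L$ dividing $\mathfrak{p}$ whose Frobenius automorphism $\operatorname{Frob}_{L/K}(\mathfrak{P})$ is in the division of $\sigma$. Then

\[ \delta(S) = \frac{t}{|G|}. \]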

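Proof. We induct on $n = |\langle\sigma\rangle|$. For the base case, $n = 1$ means $\sigma$ is the identity and $S$ is the set of primes of $K$ which split completely in $L$. Let $S^*$ denote the set of primes $\mathfrak{P} \subset \mathcal{O}_L$ dividing some prime in $S$. For each $\mathfrak{p} \in S$, there are exactly $|G| = [L : K]$ primes in $S^*$ dividing $\mathfrak{p}$, each of which has norm equal to $\mathfrak{p}$. Then

\[ \sum_{\mathfrak{P}\in S^*} \frac{1}{N_{L/\mathbb{Q}}(\mathfrak{P})^s} = \sum_{\mathfrak{P}\in S^*} \frac{1}{N_{K/\mathbb{Q}}(N_{L/K}(\mathfrak{P}))^s} = |G| \sum_{\mathfrak{p}\in S} \frac{1}{N_{K/\mathbb{Q}}(\mathfrak{p})^s}. \]

Let $T$ be the set of degree 1 primes of $L$ (those having inertial degree $f = 1$ over $\mathbb{Q}$). Recall that in the proof of Lemma 2.4.16 we showed that $\delta(T) = 1$. By properties of Dirichlet density, $T \subseteq S^*$ implies that $\delta(S^*) \ge \delta(T) = 1$, so $\delta(S^*) = 1$. This combines with the above work to give us

\[ \sum_{\mathfrak{p}\in S} \frac{1}{N(\mathfrak{p})^s} \sim \frac{1}{|G|}\bigl(-\log(s-1)\bigr) \]

and hence $\delta(S) = \frac{1}{|G|}$, proving the base case.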

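Now assume that $n = |\langle\sigma\rangle| > 1$. Let $H = \langle\sigma\rangle$ and $E = L^H$, the subfield of $L$ fixed by $H$. The primes $\mathfrak{p} \subset \mathcal{O}_K$ which have at least one degree 1 prime factor in $\mathcal{O}_E$ are exactly those divisible by a prime $\mathfrak{P} \subset \mathcal{O}_L$ such that $\operatorname{Frob}_{L/K}(\mathfrak{P})$ is conjugate to some power of $\sigma$. In other words $\mathfrak{p} \in S_d$ for some $d \mid n$, where the sets $S_d$ are defined as follows.

For each $d \mid n$, let $t_d$ denote the size of the division of $\sigma^d$. Let $S_d$ denote the set of $\mathcal{O}_K$-primes divisible by an $\mathcal{O}_L$-prime whose Frobenius automorphism lies in the division of $\sigma^d$. By induction, we have $\delta(S_d) = \frac{t_d}{|G|}$ when $d \ne 1$.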

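Let $S_E$ denote the primes of $E$ having inertial degree 1 over $K$. For each $\mathfrak{p} \in S_d$ let $n(\mathfrak{p})$ denote the number of primes in $S_E$ dividing $\mathfrak{p}$. Then each $\mathfrak{p} \in S_d$ is the norm of exactly $n(\mathfrak{p})$ distinct primes in $S_E$. As in the base case, $S_E$ contains all the degree 1 primes of $E$ (over $\mathbb{Q}$), so $\delta(S_E) = 1$. Therefore

\[ -\log(s-1) \sim \sum_{\mathfrak{P}\in S_E} \frac{1}{N_{K/\mathbb{Q}}(N_{E/K}(\mathfrak{P}))^s} = \sum_{d\mid n} \sum_{\mathfrak{p}\in S_d} \frac{n(\mathfrak{p})}{N(\mathfrak{p})^s}. \]

Note that for any $\mathfrak{p} \in S_d$, $n(\mathfrak{p})$ is exactly the number of distinct cosets $H\tau_i$ such that $H\tau_i\sigma^d = H\tau_i$ (see section IV.4 of [15] for details). This coset equivalence occurs if and only if $\tau_i\sigma^d\tau_i^{-1} \in H$, but since $H$ is cyclic, this can only happen if $\tau_i \in N_G(\langle\sigma^d\rangle)$.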

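Thus $n(\mathfrak{p}) = [N_G(\langle\sigma^d\rangle) : H]$ and using the inductive hypothesis, we write

\[ [N_G(H) : H] \sum_{\mathfrak{p}\in S} \frac{1}{N(\mathfrak{p})^s} \sim \Biggl( -1 + \sum_{\substack{d\mid n \\ d\ne 1}} \frac{[N_G(\langle\sigma^d\rangle) : H]\, t_d}{|G|} \Biggr) \log(s-1). \]

By Lemma 2.5.1, the coefficient on the right becomes

\begin{align*}
-1 + \sum_{\substack{d\mid n \\ d\ne 1}} \frac{\varphi\!\left(\frac{n}{d}\right)[G : N_G(\langle\sigma^d\rangle)]\,[N_G(\langle\sigma^d\rangle) : H]}{|G|}
&= -1 + \sum_{\substack{d\mid n \\ d\ne 1}} \frac{\varphi\!\left(\frac{n}{d}\right)}{|H|} \\
&= -1 + \sum_{\substack{d\mid n \\ d\ne 1}} \frac{1}{n}\,\varphi\!\left(\frac{n}{d}\right) \\
&= -1 - \frac{\varphi(n)}{n} + \frac{1}{n} \sum_{d\mid n} \varphi\!\left(\frac{n}{d}\right).
\end{align*}

A well-known property of Euler's function states that

\[ \sum_{d\mid n} \varphi\!\left(\frac{n}{d}\right) = n \]

so the whole coefficient is $-1 - \frac{\varphi(n)}{n} + \frac{1}{n}\cdot n = -\frac{\varphi(n)}{n}$. Finally, this implies

\[ \sum_{\mathfrak{p}\in S} \frac{1}{N(\mathfrak{p})^s} \sim -\frac{\varphi(n)}{[N_G(H) : H]\,n}\log(s-1) = -\frac{t}{|G|}\log(s-1) \]

using Lemma 2.5.1 again. Hence $\delta(S) = \frac{t}{|G|}$.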
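The arithmetic at the end of the proof rests on the divisor-sum identity for Euler's function. A minimal sketch that checks both the identity and the simplified coefficient $-\varphi(n)/n$ with exact rational arithmetic follows; the function names are hypothetical and the range of $n$ tested is arbitrary.

```python
from fractions import Fraction
from math import gcd

def euler_phi(n):
    return sum(1 for m in range(1, n + 1) if gcd(m, n) == 1)

def divisor_sum(n):
    """Sum of phi(n/d) over all divisors d of n."""
    return sum(euler_phi(n // d) for d in range(1, n + 1) if n % d == 0)

def coefficient(n):
    """-1 + (1/n) * (sum of phi(n/d) over divisors d of n with d != 1)."""
    partial = sum(euler_phi(n // d) for d in range(2, n + 1) if n % d == 0)
    return -1 + Fraction(partial, n)

for n in (2, 6, 12, 30):
    print(n, divisor_sum(n), coefficient(n), Fraction(-euler_phi(n), n))
```

For each $n$ the last two printed values coincide, as the proof requires.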

Now we can prove an important property of the Artin map:

Corollary 2.5.2. Let $L/K$ be an abelian extension of number fields and suppose $S$ is a finite set of primes of $K$ that contains all the primes that ramify in $L$. Then the Artin map $\phi_{L/K} : I_K^S \to \operatorname{Gal}(L/K)$ is surjective.

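Proof. Let $G = \operatorname{Gal}(L/K)$ and take $\sigma \in G$. Since $G$ is abelian, the division of $\sigma$ is precisely the set of generators of the cyclic group $\langle\sigma\rangle$. By the Frobenius density theorem, there exist infinitely many primes $\mathfrak{P} \subset \mathcal{O}_L$ such that $\operatorname{Frob}_{L/K}(\mathfrak{P})$ generates $\langle\sigma\rangle$ and so one can certainly be found outside the finite set $S$. Recall that when $L/K$ is abelian, $\phi_{L/K}$ is well-defined on the ideals of $\mathcal{O}_K$. Thus we can find $\mathfrak{p} \subset \mathcal{O}_K$ such that $\phi_{L/K}(\mathfrak{p}) = \sigma'$, a generator of $\langle\sigma\rangle$. The image of $\phi_{L/K}$ is a subgroup of $G$ containing $\sigma'$, so it contains $\langle\sigma'\rangle = \langle\sigma\rangle$ and in particular $\sigma$. Since $\sigma \in G$ was arbitrary, $\phi_{L/K}$ is onto.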

Corollary 2.5.3 ([15]). Let L1 and L2 be Galois extensions of a number field K

and let S1 and S2 be the sets of primes of K which split completely in L1 and L2,

respectively. Then S1 ⊆ S2 if and only if L2 ⊆ L1.

Another important result we can prove now that we have the Frobenius density theorem is known as the first fundamental inequality of class field theory. Recall the map $i : K^* \to I_K$ that takes $\alpha \mapsto (\alpha)$. In Section 2.3 we denoted the image of $K_{\mathfrak{m},1}$ under this map by $P_K(\mathfrak{m},1)$; it is also common in the literature to write $i(K_{\mathfrak{m},1})$ so we will use them interchangeably.

Theorem 2.5.4 (First Inequality). Let $L/K$ be a Galois extension of number fields, let $\mathfrak{m}$ be a modulus of $K$ and let $I_L^{\mathfrak{m}}$ denote the subgroup of $I_L$ generated by all primes $\mathfrak{P} \subset \mathcal{O}_L$ for which $\mathfrak{P} \cap K$ lies in $I_K^{\mathfrak{m}}$. Then

\[ [I_K^{\mathfrak{m}} : N_{L/K}(I_L^{\mathfrak{m}})\,i(K_{\mathfrak{m},1})] \le [L : K]. \]

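Proof. With finitely many exceptions, the primes that split completely in $L$ lie in $N_{L/K}(I_L^{\mathfrak{m}})$. By Frobenius density, the density of the set of these primes is

\[ \frac{1}{|G|} = \frac{1}{[L : K]} \]

since it is the set of primes $\mathfrak{p}$ such that $\operatorname{Frob}_{L/K}(\mathfrak{p}\mathcal{O}_L) = 1 \in G$. Then by properties of Dirichlet density (Theorem 2.4.17),

\[ \frac{1}{[L : K]} \le \frac{1}{[I_K^{\mathfrak{m}} : N_{L/K}(I_L^{\mathfrak{m}})\,i(K_{\mathfrak{m},1})]} \]

which implies the first fundamental inequality.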

Under certain conditions the reverse inequality holds. This is called, as one might

expect, the second fundamental inequality of class field theory and will be discussed

in the next section.

We conclude the section with a proof of Dirichlet’s famous theorem on the infini-

tude of primes in arithmetic progression. We first use the Frobenius density theorem

to prove a nice fact that is often hard to come by: the cyclotomic polynomials are

irreducible.

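Proposition 2.5.5. Let $\zeta_m$ denote a primitive $m$th root of unity. Then $[\mathbb{Q}(\zeta_m) : \mathbb{Q}] = \varphi(m)$.

Proof. For $m \in \mathbb{Z}^+$, let $\mathfrak{m} = (m)\infty$ which is a modulus of $\mathbb{Q}$. Set $H = i(\mathbb{Q}_{\mathfrak{m},1}) \le I_{\mathbb{Q}}^{\mathfrak{m}}$. Then by Example 2.3.8, the set of primes in $\mathbb{Q}$ that split completely in $K = \mathbb{Q}(\zeta_m)$ is precisely the primes in $H$. The Frobenius density theorem says that the density of this set is $\frac{1}{[K : \mathbb{Q}]}$. Therefore by properties of Dirichlet density, this is at most

\[ \frac{1}{[I_{\mathbb{Q}}^{\mathfrak{m}} : H]} = \frac{1}{\varphi(m)} \]

which implies $[K : \mathbb{Q}] \ge \varphi(m)$. On the other hand, the minimal polynomial of $\zeta_m$ over $\mathbb{Q}$, which is by definition the $m$th cyclotomic polynomial, has degree $\le \varphi(m)$ since $|G| = |(\mathbb{Z}/m\mathbb{Z})^\times| = \varphi(m)$. Hence we conclude that $[K : \mathbb{Q}] = \varphi(m)$.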

Corollary 2.5.6. For any nonprincipal character $\chi$ of the ray class group $C_{\mathbb{Q}}(\mathfrak{m})$, where $\mathfrak{m} = (m)\infty$ as above, $L(1,\chi) \ne 0$.

Proof. Apply Theorem 2.4.17 and Proposition 2.5.5 to see that

\[ \sum_{\chi\ne\chi_0} \bigl(\log L(s,\chi) - g_\chi(s)\bigr) + \log\bigl((s-1)L(s,\chi_0)\bigr) - g_{\chi_0}(s) - g(s) > 0 \]

since the $\log(s-1)$ term from the proof of Theorem 2.4.17 vanishes. The terms in the expression above are either all bounded at $s = 1$, or become negatively infinite when $L(1,\chi) = 0$. Since the expression must be positive, $L(1,\chi)$ must be nonzero.

The next result is the main step towards proving Dirichlet’s theorem. It is an

interesting result in its own right, since it unites the theories of L-series, Dirichlet

density and ray class groups we have studied so far.

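Theorem 2.5.7. Let $\mathfrak{k}_0$ be any ray class in $C_{\mathbb{Q}}(\mathfrak{m})$, where $\mathfrak{m} = (m)\infty$. The set of primes in $\mathfrak{k}_0$ has density $\frac{1}{\varphi(m)}$.

Proof. For any character $\chi$ of $C_{\mathbb{Q}}(\mathfrak{m})$ we have

\[ \log L(s,\chi) \sim \sum_{p \text{ prime}} \frac{\chi(p)}{p^s} = \sum_{\mathfrak{k}\in C_{\mathbb{Q}}(\mathfrak{m})} \chi(\mathfrak{k}) \sum_{p\in\mathfrak{k}} \frac{1}{p^s}. \]

Multiplying by $\chi(\mathfrak{k}_0^{-1})$ and summing over all characters of $C_{\mathbb{Q}}(\mathfrak{m})$ yields

\[ \log L(s,\chi_0) + \sum_{\chi\ne\chi_0} \chi(\mathfrak{k}_0^{-1}) \log L(s,\chi) = \sum_{\mathfrak{k}} \sum_{\chi} \chi(\mathfrak{k}_0^{-1}\mathfrak{k}) \sum_{p\in\mathfrak{k}} \frac{1}{p^s}. \]

Note the following orthogonality relations for a finite abelian group $A$:

(1) For $\chi_1, \chi_2$ characters on $A$,

\[ \sum_{a\in A} \chi_1(a)\chi_2(a) = \begin{cases} 0 & \text{if } \chi_1 \ne \chi_2^{-1} \\ |A| & \text{if } \chi_1 = \chi_2^{-1}. \end{cases} \]

(2) For any $a, b \in A$,

\[ \sum_{\chi} \chi(a)\chi(b) = \begin{cases} 0 & \text{if } ab \ne 1 \\ |A| & \text{if } ab = 1. \end{cases} \]

(For details, see section IV.3 of [15].) These imply

\[ \sum_{\chi} \chi(\mathfrak{k}_0^{-1}\mathfrak{k}) = \begin{cases} 0 & \text{if } \mathfrak{k} \ne \mathfrak{k}_0 \\ \varphi(m) & \text{if } \mathfrak{k} = \mathfrak{k}_0 \end{cases} \]

where the sum is over all characters $\chi$ of $C_{\mathbb{Q}}(\mathfrak{m})$. Moreover, Corollary 2.5.6 implies that the sum over nonprincipal characters is bounded at $s = 1$ since $L(1,\chi) \ne 0$ for $\chi \ne \chi_0$. Therefore

\[ \log L(s,\chi_0) \sim \varphi(m) \sum_{p\in\mathfrak{k}_0} \frac{1}{p^s}. \]

Recall from Section 2.4 that $L(s,\chi_0)$ differs from the Riemann zeta function $\zeta(s)$ only by finitely many Euler factors, so $\log L(s,\chi_0) \sim \log\zeta(s) \sim -\log(s-1)$. Finally this shows that

\[ \sum_{p\in\mathfrak{k}_0} \frac{1}{p^s} \sim -\frac{1}{\varphi(m)}\log(s-1). \]

By definition this means the Dirichlet density of the set of primes in any $\mathfrak{k}_0$ in the ray class group $C_{\mathbb{Q}}(\mathfrak{m})$ is $\frac{1}{\varphi(m)}$.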
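The orthogonality relation used in the proof above can be verified directly for a small modulus. The sketch below assumes $m = 7$, for which $(\mathbb{Z}/m\mathbb{Z})^\times$ is cyclic with primitive root 3, so every character is determined by its value on that generator; the modulus, the names `chi` and `character_sum`, and the class $k_0 = 2$ are all choices made for this illustration.

```python
import cmath

m = 7            # a prime modulus, so (Z/m)^x is cyclic
g = 3            # a primitive root modulo 7
order = m - 1    # phi(m) for prime m

# Discrete logarithm table: dlog[g^k mod m] = k.
dlog = {pow(g, k, m): k for k in range(order)}

def chi(index, a):
    """Character sending g to exp(2*pi*i*index/order), evaluated at a."""
    return cmath.exp(2j * cmath.pi * index * dlog[a % m] / order)

def character_sum(k0, k):
    """Sum over all characters chi of chi(k0^{-1} * k)."""
    arg = pow(k0, -1, m) * k % m
    return sum(chi(index, arg) for index in range(order))

k0 = 2
sums = {k: character_sum(k0, k) for k in range(1, m)}
for k, value in sums.items():
    print(k, round(abs(value), 6))
```

Only the class $k = k_0$ gives a nonzero sum, equal to $\varphi(7) = 6$; every other class sums to zero up to floating-point error. The modular inverse via `pow(k0, -1, m)` needs Python 3.8 or later.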

Now we are prepared to state and prove the famous result.

Dirichlet’s Theorem. For each positive integer $m$ and each integer $a$ relatively prime to $m$, there are infinitely many primes of the form $p = mb + a$.

Proof. To access our work with the Dirichlet density, we turn the problem into one involving ray classes. Suppose $p$ is a prime in the arithmetic progression $mb + a$, where $b \in \mathbb{Z}$. Then $mb + a \equiv a \pmod{m}$ implies $\frac{mb+a}{a} \in \mathbb{Q}_{\mathfrak{m},1}$, where $\mathfrak{m} = (m)\infty$ as before. This means $p$ lies in the coset $a\mathbb{Q}_{\mathfrak{m},1}$. On the other hand, if $p \in a\mathbb{Q}_{\mathfrak{m},1}$ then $p = a\frac{x}{y}$ with $x \equiv y \pmod{m}$. It follows that $x = mq + y$ for some integer $q$ and so $p = mb + a$ for some $b$. Hence each prime congruent to $a$ mod $m$ generates a prime ideal in a fixed coset of $i(\mathbb{Q}_{\mathfrak{m},1})$, which is a ray class in the ray class group $C_{\mathbb{Q}}(\mathfrak{m})$. By Theorem 2.5.7, the density of such primes is $\frac{1}{\varphi(m)}$ so in particular there are infinitely many.
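The equidistribution behind Dirichlet's theorem is visible in prime counts. The sketch below tallies primes up to a bound by residue class modulo $m$; note that it measures a finite proportion (a stand-in for natural density), not the Dirichlet density defined in Section 2.4, and the modulus 12, the bound and the function names are illustrative assumptions.

```python
from math import gcd

def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(limit**0.5) + 1):
        if sieve[i]:
            sieve[i*i::i] = bytearray(len(range(i*i, limit + 1, i)))
    return [i for i in range(limit + 1) if sieve[i]]

def class_proportions(m, limit):
    """Share of primes p <= limit (not dividing m) in each unit class mod m."""
    counts = {a: 0 for a in range(m) if gcd(a, m) == 1}
    for p in primes_up_to(limit):
        if p % m in counts:
            counts[p % m] += 1
    total = sum(counts.values())
    return {a: c / total for a, c in counts.items()}

proportions = class_proportions(12, 200_000)
for a, share in proportions.items():
    print(a, round(share, 4))
```

Each of the $\varphi(12) = 4$ classes receives close to a quarter of the primes, in line with the density $\frac{1}{\varphi(m)}$ from Theorem 2.5.7.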

Remarkably, Dirichlet proved his theorem several decades before Frobenius had a proof of the density theorem. We discuss the history of these theorems at greater length in Section 2.10 and relate everything to Cebotarev’s density theorem.

Dirichlet’s theorem has an important generalization to classes of ideals in generalized ideal class groups which we will examine in Section 2.10. The proof of that result depends on the condition that $L(1,\chi) \ne 0$ for any nonprincipal character $\chi$ of the class group in question. One should note that such results are highly nontrivial, as the nonvanishing of L-series in all cases is only guaranteed by a positive proof of the Generalized Riemann Hypothesis.

2.6 The Second Fundamental Inequality

In Section 2.5, we proved that $N_{L/K}(I_L^{\mathfrak{m}})\,i(K_{\mathfrak{m},1})$ has index less than or equal to $[L : K]$ in $I_K^{\mathfrak{m}}$ for any modulus $\mathfrak{m}$ of $K$ (the first fundamental inequality). We have also seen (courtesy of Corollary 2.5.2) that the Artin map is surjective onto $\operatorname{Gal}(L/K)$, so $\ker\phi_{L/K}$ has index $[L : K]$ in $I_K^{\mathfrak{m}}$. We want to show $\ker\phi_{L/K} = N_{L/K}(I_L^{\mathfrak{m}})\,i(K_{\mathfrak{m},1})$ for all abelian extensions $L/K$ precisely when $\mathfrak{m}$ is divisible by all primes of $K$ that ramify in $L$. This is obtained via the second fundamental inequality of class field theory:

Theorem 2.6.1 (Second Inequality). For an abelian extension $L/K$, if $\mathfrak{m}$ is divisible by the primes of $K$ which ramify in $L$, then

\[ [I_K^{\mathfrak{m}} : N_{L/K}(I_L^{\mathfrak{m}})\,i(K_{\mathfrak{m},1})] \ge [L : K]. \]

In his formulation of the main theorems of class field theory, Takagi proved the general form of the fundamental equality. Since our approach to the Artin reciprocity theorem in Section 2.7 requires and later generalizes the cyclic case, it will suffice to prove the second fundamental inequality for cyclic extensions $L/K$.

Let L/K be a Galois extension with cyclic Galois group G = 〈σ〉. Suppose m

is a modulus of K divisible by all primes that ramify in L. We first compute some

cohomology groups (in the sense of Section A.3 of the Appendix).

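Proposition 2.6.2. Let $L$, $K$ and $\mathfrak{m}$ be as above. Then

(i) $H^0(I_L^{\mathfrak{m}}) = I_K^{\mathfrak{m}}/\mathcal{N}(I_L^{\mathfrak{m}})$.

(ii) $H^1(I_L^{\mathfrak{m}}) = 1$.

(iii) $H^0(L^*) = K^*/\mathcal{N}(L^*)$.

(iv) $H^1(L^*) = 1$.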

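Proof. (i) Let $\mathfrak{a} = \prod \mathfrak{P}_i^{a_i}$ be a fractional ideal in $I_L^{\mathfrak{m}}$ which is fixed by $\sigma$, i.e. $\mathfrak{a} \in \ker(\sigma - 1)$. Since $\sigma(\mathfrak{a}) = \mathfrak{a}$, the distinct conjugates $\sigma^j(\mathfrak{P}_i)$ of the primes dividing $\mathfrak{a}$ appear with the same exponent. If we denote $\mathfrak{p} = \mathfrak{P}_i \cap K$, then

\[ \mathfrak{p}\mathcal{O}_L = \prod_{j=0}^{g-1} \sigma^j(\mathfrak{P}_i) \]

where $g$ is the smallest positive integer such that $\sigma^g(\mathfrak{P}_i) = \mathfrak{P}_i$. This demonstrates that the $\mathfrak{P}_i$ contribute precisely the factor $\mathfrak{p}^{a_i}$ to the decomposition of $\mathfrak{a}$, and since $\mathfrak{P}_i$ was arbitrary, we conclude that $\mathfrak{a} \in I_K^{\mathfrak{m}}$. Therefore $I_K^{\mathfrak{m}}$ is the subgroup of $I_L^{\mathfrak{m}}$ fixed by $G$, so

\[ H^0(I_L^{\mathfrak{m}}) = (I_L^{\mathfrak{m}})^G/\mathcal{N}(I_L^{\mathfrak{m}}) = I_K^{\mathfrak{m}}/\mathcal{N}(I_L^{\mathfrak{m}}). \]

(ii) Now suppose $\mathfrak{a} \in \ker\mathcal{N}$, so $\mathcal{N}(\mathfrak{a}) = \mathcal{O}_K$. Let $\mathfrak{P}_0 \subset \mathcal{O}_L$ be a prime in the factorization of $\mathfrak{a}$ which has $g$ distinct images under the $G$-action. For $0 \le i \le g-1$, let $\mathfrak{P}_i = \sigma^i(\mathfrak{P}_0)$ and as above, let $a_i$ be the exponent of $\mathfrak{P}_i$ in $\mathfrak{a}$. Let $\mathfrak{B} = \prod_{i=0}^{g-2} \mathfrak{P}_i^{c_i}$ where for each $i$, $c_i = a_0 + \dots + a_i$. Then we have

\[ (\sigma - 1)\mathfrak{B} = \mathfrak{P}_0^{a_0}\mathfrak{P}_1^{a_1}\cdots\mathfrak{P}_{g-2}^{a_{g-2}}\mathfrak{P}_{g-1}^{-c_{g-2}}. \]

Let $\mathfrak{p}^f = \mathcal{N}(\mathfrak{P}_0)$. Since $\mathcal{N}(\mathfrak{a}) = 1$, we see that

\[ \mathcal{N}\Biggl(\prod_{i=0}^{g-1} \mathfrak{P}_i^{a_i}\Biggr) = \mathfrak{p}^{f(a_0 + \dots + a_{g-1})} = 1. \]

Since $f \ge 1$, this shows that $a_0 + \dots + a_{g-1} = 0$, i.e. $-c_{g-2} = a_{g-1}$. Thus $(\sigma - 1)\mathfrak{B}$ is precisely the part of $\mathfrak{a}$ contributed by the $\mathfrak{P}_i$. Since $\mathfrak{P}_i$ was arbitrary, $\mathfrak{a} \in \operatorname{im}(\sigma - 1)$ so $\ker\mathcal{N} = \operatorname{im}(\sigma - 1)$. By definition, this proves $H^1(I_L^{\mathfrak{m}}) = 1$.

(iii) comes from the fact that the kernel of $\sigma - 1$ on $L^*$ is $K^*$.

(iv) is just Hilbert’s Theorem 90 (Appendix A.3).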

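Definition. For a modulus $\mathfrak{m}$ of $K$ divisible by the primes ramifying in $L$, we define a $G$-module homomorphism $j_{\mathfrak{m}} : I_L \to I_L^{\mathfrak{m}}$ by

\[ j_{\mathfrak{m}}(\mathfrak{P}) = \begin{cases} \mathfrak{P} & \text{if } \mathfrak{P} \nmid \mathfrak{m} \\ 1 & \text{if } \mathfrak{P} \mid \mathfrak{m}. \end{cases} \]

We further define a homomorphism $f_{\mathfrak{m}} : L^* \to I_L^{\mathfrak{m}}$ as the composite $f_{\mathfrak{m}} = j_{\mathfrak{m}} \circ i$, where $i : L^* \to I_L$ is the map $\alpha \mapsto (\alpha)$.

Let $S$ be the set of primes dividing $\mathfrak{m}$ and set $L^S = \ker f_{\mathfrak{m}}$. Then we see that

\[ L^S = \{\alpha \in L^* \mid i(\alpha) \text{ is divisible only by primes in } S\}. \]

The following relates the Herbrand quotients (Appendix A.3) of $L^S$, $U_L$ and $\ker j_{\mathfrak{m}}$.

Lemma 2.6.3. If $q(U_L)$ and $q(\ker j_{\mathfrak{m}})$ are defined then $q(L^S) = q(U_L)\,q(\ker j_{\mathfrak{m}})$.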

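Proof. Since $f_{\mathfrak{m}}(L^S) = j_{\mathfrak{m}} \circ i(L^S) = 1$, we get an exact sequence

\[ 1 \to i(L^S) \to \ker j_{\mathfrak{m}} \to C \to 1 \]

for some $G$-module $C$ satisfying

\[ C \cong \frac{\ker j_{\mathfrak{m}}}{i(L^S)} \cong \frac{\ker j_{\mathfrak{m}}}{i(L^*) \cap \ker j_{\mathfrak{m}}} \cong \frac{i(L^*)\ker j_{\mathfrak{m}}}{i(L^*)}. \]

Notice that $C$ is itself a subgroup of $C(\mathcal{O}_L)$, and since the class group is finite by Corollary 2.3.5, $C$ is finite as well. Therefore by Corollary A.3.3, $q(i(L^S)) = q(\ker j_{\mathfrak{m}})$. Finally, the exact sequence

\[ 1 \to U_L \to L^S \to i(L^S) \to 1 \]

and Corollary A.3.3 can similarly be used to conclude $q(L^S) = q(U_L)\,q(i(L^S)) = q(U_L)\,q(\ker j_{\mathfrak{m}})$.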

This lemma shows that computing $q(L^S)$ comes down to finding $q(U_L)$ and $q(\ker j_{\mathfrak{m}})$. One can obtain the following results using local class field theory [15] or ideles [20].

Theorem 2.6.4. Let $r_0$ be the number of infinite primes ramifying in the extension $L/K$. Then

\[ q(U_L) = \frac{[L : K]}{2^{r_0}}. \]

Theorem 2.6.5. Let $j_{\mathfrak{m}} : I_L \to I_L^{\mathfrak{m}}$ be the homomorphism defined above for a modulus $\mathfrak{m}$ of $K$ containing every prime that ramifies in $L$. Then

\[ q(\ker j_{\mathfrak{m}}) = \frac{1}{\prod_{\mathfrak{p}\mid\mathfrak{m}_0} e_{\mathfrak{p}} f_{\mathfrak{p}}} \]

where the product is over all primes $\mathfrak{p}$ dividing $\mathfrak{m}_0$, the finite part of $\mathfrak{m}$, and $e_{\mathfrak{p}}$ and $f_{\mathfrak{p}}$ denote respectively the ramification index and inertial degree of $\mathfrak{p}$.

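Corollary 2.6.6. Let $S$ be the set of primes which divide $\mathfrak{m}$, a modulus of $K$ containing all ramified primes of $L/K$. Then the Herbrand quotient of $L^S$ is

\[ q(L^S) = \frac{[L : K]}{\prod_{\mathfrak{p}\mid\mathfrak{m}} e_{\mathfrak{p}} f_{\mathfrak{p}}}. \]

Theorem 2.6.7. For a cyclic extension $L/K$, suppose $\mathfrak{m}$ is a modulus of $K$ divisible by sufficiently high powers of the ramified primes in $L/K$. Then

\[ a(\mathfrak{m}) := [K^* : \mathcal{N}(L^*)K_{\mathfrak{m},1}] = \prod_{\mathfrak{p}\mid\mathfrak{m}} e_{\mathfrak{p}} f_{\mathfrak{p}}. \]

Denote the main index in the fundamental inequality by

\[ h_{\mathfrak{m}}(L/K) = [I_K^{\mathfrak{m}} : N_{L/K}(I_L^{\mathfrak{m}})\,i(K_{\mathfrak{m},1})]. \]

To prove Theorem 2.6.1, we will prove $h_{\mathfrak{m}}(L/K) = [L : K]$ under certain conditions on a cyclic extension $L/K$.

For the set $S$ of primes dividing $\mathfrak{m}$, the map $f_{\mathfrak{m}} = j_{\mathfrak{m}} \circ i$ gives us an exact sequence

\[ 1 \to L^S \to L^* \xrightarrow{f_{\mathfrak{m}}} I_L^{\mathfrak{m}} \to V \to 1 \]

for some group $V$. Looking closer, this sequence contains two short exact sequences:

\[ 1 \to L^S \xrightarrow{\gamma} L^* \xrightarrow{\alpha} f_{\mathfrak{m}}(L^*) \to 1 \tag{2.1} \]

and

\[ 1 \to f_{\mathfrak{m}}(L^*) \xrightarrow{\beta} I_L^{\mathfrak{m}} \to V \to 1. \tag{2.2} \]

It is from these two sequences (and their cohomologies) that we derive the ingredients for the second fundamental inequality. Define

\[ P = \{\alpha \in K^* \mid f_{\mathfrak{m}}(\alpha) \in \mathcal{N}(I_L^{\mathfrak{m}})\} \]

and

\[ Q = \{\alpha \in K^* \mid j_{\mathfrak{m}}(\alpha) \in \mathcal{N}(I_L^{\mathfrak{m}})\,i(K_{\mathfrak{m},1})\}. \]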

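Consider the following commutative diagram, which is constructed using the sequences (2.1) and (2.2) above. Its three rows, from top to bottom, are

\[ \frac{\mathcal{N}(L^*)K_{\mathfrak{m},1}}{\mathcal{N}(L^*)} \xrightarrow{\;f_0^*\;} \frac{\mathcal{N}(I_L^{\mathfrak{m}})\,i(K_{\mathfrak{m},1})}{\mathcal{N}(I_L^{\mathfrak{m}})} \xrightarrow{\;p^*\;} X \to 1 \]

\[ \frac{P}{\mathcal{N}(L^*)} \to \frac{K^*}{\mathcal{N}(L^*)} \xrightarrow{\;f_0\;} \frac{I_K^{\mathfrak{m}}}{\mathcal{N}(I_L^{\mathfrak{m}})} \xrightarrow{\;p\;} \operatorname{coker} f_0 \to 1 \]

\[ \frac{Q}{\mathcal{N}(L^*)K_{\mathfrak{m},1}} \to \frac{K^*}{\mathcal{N}(L^*)K_{\mathfrak{m},1}} \xrightarrow{\;g\;} \frac{I_K^{\mathfrak{m}}}{\mathcal{N}(I_L^{\mathfrak{m}})\,i(K_{\mathfrak{m},1})} \xrightarrow{\;p'\;} \operatorname{coker} g \to 1 \]

[Commutative diagram: the top row sits above the last three columns of the other two rows, and each of those three columns ends in 1; the vertical arrows are omitted here.]

Set $n(\mathfrak{m}) = [K_{\mathfrak{m}} \cap i^{-1}(\mathcal{N}(I_L^{\mathfrak{m}})) : K_{\mathfrak{m},1} \cap \mathcal{N}(L^*)]$. A standard diagram chase (cf. section V.4 in [15]) shows that $\operatorname{coker} f_0 \cong \operatorname{coker} g$ and $|\ker f_0| = |\ker g| \cdot n(\mathfrak{m})$. Note that

\[ \ker f_0 = \frac{P}{\mathcal{N}(L^*)} \quad\text{and}\quad \ker g = \frac{Q}{\mathcal{N}(L^*)K_{\mathfrak{m},1}}. \]

Next we relate $\ker f_0$ and $\operatorname{coker} f_0$ to $q(L^S)$. Recall from Proposition 2.6.2 that $H^1(L^*)$ and $H^1(I_L^{\mathfrak{m}})$ are trivial. Then the exact sequences (2.1) and (2.2) from above give us exact hexagons (see Lemma A.3.1) which may be laid flat:

\[ 1 \to H^1(f_{\mathfrak{m}}(L^*)) \xrightarrow{\delta_1} H^0(L^S) \xrightarrow{\gamma_0} H^0(L^*) \xrightarrow{\alpha_0} H^0(f_{\mathfrak{m}}(L^*)) \xrightarrow{\delta_2} H^1(L^S) \to 1 \]

\[ 1 \to H^1(V) \xrightarrow{\delta_3} H^0(f_{\mathfrak{m}}(L^*)) \xrightarrow{\beta_0} H^0(I_L^{\mathfrak{m}}) \xrightarrow{\gamma_0} H^0(V) \xrightarrow{\delta_4} H^1(f_{\mathfrak{m}}(L^*)) \to 1 \]

In the original figure a dashed arrow joins the two copies of $H^0(f_{\mathfrak{m}}(L^*))$ and a vertical arrow labeled $f_0$ runs from $H^0(L^*)$ to $H^0(I_L^{\mathfrak{m}})$.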

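The dashed arrow is the identity map on $H^0(f_{\mathfrak{m}}(L^*))$, and correspondingly the vertical arrow is $f_0 = \beta_0\alpha_0$. Then

\begin{align*}
|\operatorname{coker} f_0| &= [H^0(I_L^{\mathfrak{m}}) : \operatorname{im}\beta_0\alpha_0] = [H^0(I_L^{\mathfrak{m}}) : \operatorname{im}\beta_0]\,[\operatorname{im}\beta_0 : \operatorname{im}\beta_0\alpha_0] \\
&= [H^0(I_L^{\mathfrak{m}}) : \operatorname{im}\beta_0]\,\frac{[H^0(f_{\mathfrak{m}}(L^*)) : \operatorname{im}\alpha_0]}{[\ker\beta_0 : \ker\beta_0 \cap \operatorname{im}\alpha_0]} && \text{by isomorphism theorems} \\
&= \frac{|\operatorname{coker}\beta_0|\,|\operatorname{coker}\alpha_0|}{[\ker\beta_0 : \ker\beta_0 \cap \operatorname{im}\alpha_0]} \\
&= \frac{|\operatorname{im}\gamma_0|\,|\operatorname{im}\delta_2|}{[\ker\beta_0 : \ker\beta_0 \cap \operatorname{im}\alpha_0]} && \text{by exactness} \\
&= \frac{|\operatorname{im}\gamma_0|\,|H^1(L^S)|}{[\ker\beta_0 : \ker\beta_0 \cap \operatorname{im}\alpha_0]}.
\end{align*}

Also note that $|H^0(V)| = |\operatorname{im}\gamma_0|\,|H^1(f_{\mathfrak{m}}(L^*))|$ by the second exact hexagon, so

\[ |\operatorname{coker} f_0| = \frac{|H^0(V)|\,|H^1(L^S)|}{|H^1(f_{\mathfrak{m}}(L^*))|\,[\ker\beta_0 : \ker\beta_0 \cap \operatorname{im}\alpha_0]}. \]

In a similar fashion, we use the exact hexagons to compute $|\ker f_0|$:

\begin{align*}
|\ker f_0| &= |\ker\beta_0\alpha_0| \\
&= |\ker\beta_0 \cap \operatorname{im}\alpha_0|\,|\ker\alpha_0| \\
&= |\ker\beta_0 \cap \operatorname{im}\alpha_0|\,|\operatorname{im}\gamma_0| \\
&= |\ker\beta_0 \cap \operatorname{im}\alpha_0|\,\frac{|H^0(L^S)|}{|H^1(f_{\mathfrak{m}}(L^*))|}.
\end{align*}

Lemma 2.6.8. $q(L^S) = \dfrac{|\operatorname{coker} f_0|}{|\ker f_0|}$.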

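Proof. By the computations above,

\begin{align*}
\frac{|\operatorname{coker} f_0|}{|\ker f_0|} &= \frac{|H^0(V)|\,|H^1(L^S)|}{|H^1(f_{\mathfrak{m}}(L^*))|\,[\ker\beta_0 : \ker\beta_0 \cap \operatorname{im}\alpha_0]} \cdot \frac{|H^1(f_{\mathfrak{m}}(L^*))|}{|\ker\beta_0 \cap \operatorname{im}\alpha_0|\,|H^0(L^S)|} \\
&= \frac{|H^1(L^S)|}{|H^0(L^S)|} \cdot \frac{|H^0(V)|}{|\ker\beta_0|} = \frac{|H^1(L^S)|}{|H^0(L^S)|} \cdot \frac{|H^0(V)|}{|H^1(V)|} = \frac{q(L^S)}{q(V)}.
\end{align*}

Now, notice that since $V$ is a quotient of the class group of $L$, which by Corollary 2.3.5 is finite, $V$ is also finite. Then applying Corollary A.3.3 shows that $q(V) = 1$. The result follows.

We now focus on the bottom row of the big commutative diagram from above,

\[ 1 \to \ker g \to \frac{K^*}{\mathcal{N}(L^*)K_{\mathfrak{m},1}} \xrightarrow{\;g\;} \frac{I_K^{\mathfrak{m}}}{\mathcal{N}(I_L^{\mathfrak{m}})\,i(K_{\mathfrak{m},1})} \xrightarrow{\;p'\;} \operatorname{coker} g \to 1. \]

Using this and Theorem 2.6.7, we know that when $\mathfrak{m}$ is divisible by sufficiently high powers of the ramified primes in $L/K$,

\[ h_{\mathfrak{m}}(L/K) = |\operatorname{im} g|\,|\operatorname{coker} g| = a(\mathfrak{m})\,\frac{|\operatorname{coker} g|}{|\ker g|}. \]

Then by Lemma 2.6.8, together with the relations $\operatorname{coker} f_0 \cong \operatorname{coker} g$ and $|\ker f_0| = |\ker g| \cdot n(\mathfrak{m})$, this can be written

\[ h_{\mathfrak{m}}(L/K) = a(\mathfrak{m})\,n(\mathfrak{m})\,\frac{|\operatorname{coker} f_0|}{|\ker f_0|} = a(\mathfrak{m})\,n(\mathfrak{m})\,q(L^S). \]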

We are now ready to prove the second inequality for cyclic extensions.

Theorem 2.6.9 (Second Inequality for Cyclic Extensions). For $L/K$ a cyclic extension of number fields and $\mathfrak{m}$ a modulus of $K$ divisible by sufficiently high powers of the ramified primes of the extension,

\[ h_{\mathfrak{m}}(L/K) = [I_K^{\mathfrak{m}} : \mathcal{N}(I_L^{\mathfrak{m}})\,i(K_{\mathfrak{m},1})] \ge [L : K]. \]

Proof. By the work directly preceding the theorem, $h_{\mathfrak{m}}(L/K) = a(\mathfrak{m})\,n(\mathfrak{m})\,q(L^S)$. The hypotheses allow us to apply Corollary 2.6.6 and Theorem 2.6.7, which say

\[ q(L^S) = \frac{[L : K]}{\prod_{\mathfrak{p}\mid\mathfrak{m}} e_{\mathfrak{p}} f_{\mathfrak{p}}} \quad\text{and}\quad a(\mathfrak{m}) = \prod_{\mathfrak{p}\mid\mathfrak{m}} e_{\mathfrak{p}} f_{\mathfrak{p}}. \]

Putting these together with the expression for $h_{\mathfrak{m}}(L/K)$ yields

\[ h_{\mathfrak{m}}(L/K) = n(\mathfrak{m})\,[L : K] \]

so in particular $h_{\mathfrak{m}}(L/K) \ge [L : K]$. This proves the second inequality.

Finally, combining the results from Theorems 2.5.4 and 2.6.9 gives us the fundamental equality for cyclic extensions.

Corollary 2.6.10 (Fundamental Equality for Cyclic Extensions). Let $L/K$ be a Galois extension of number fields such that $\operatorname{Gal}(L/K)$ is cyclic. If $\mathfrak{m}$ is a modulus of $K$ that is divisible by sufficiently high powers of every prime ramifying in $L$, then

\[ [I_K^{\mathfrak{m}} : \mathcal{N}(I_L^{\mathfrak{m}})\,i(K_{\mathfrak{m},1})] = [L : K]. \]

2.7 The Artin Reciprocity Theorem

Recall the subgroup $P_K(\mathfrak{m},1) \le I_K^{\mathfrak{m}}$ for a modulus $\mathfrak{m}$ of $K$. In Section 2.3 it was used to define the ray class group $C_K(\mathfrak{m}) = I_K^{\mathfrak{m}}/P_K(\mathfrak{m},1)$, and Corollary 2.3.5 showed that $P_K(\mathfrak{m},1)$ has finite index in $I_K^{\mathfrak{m}}$.

Definition. Let $K$ be a number field. A subgroup $H$ of the group of fractional ideals prime to a modulus $\mathfrak{m}$ of $K$ is a congruence subgroup for $\mathfrak{m}$ if $P_K(\mathfrak{m},1) \le H \le I_K^{\mathfrak{m}}$. The quotient $I_K^{\mathfrak{m}}/H$ is called a generalized ideal class group for $\mathfrak{m}$.

Corollary 2.3.5 implies that every congruence subgroup has finite index in $I_K^{\mathfrak{m}}$.

Example 2.7.1. Let $\mathfrak{m} = 1$ so that $I_K^{\mathfrak{m}}$ is the full group of fractional ideals $I_K$. Then $P_K = P_K(\mathfrak{m},1)$ is a congruence subgroup for $\mathfrak{m}$. This shows that generalized ideal class groups properly encompass the class group.

Example 2.7.2. Let $\mathcal{O}$ be the order of conductor $f$ in $K = \mathbb{Q}(\sqrt{-n})$ for $n \in \mathbb{N}$. We proved in Section 1.9 that the ideal class group for $\mathcal{O}$ can be written $C(\mathcal{O}) \cong I_K(f)/P_{K,\mathbb{Z}}(f)$ where $P_{K,\mathbb{Z}}(f)$ is the subgroup generated by principal fractional ideals $\alpha\mathcal{O}_K$ with generators satisfying $\alpha \equiv a \bmod f\mathcal{O}_K$, $a \in \mathbb{Z}$ and $(a, f) = 1$. Since $f\mathcal{O}_K$ is a modulus,

\[ P_K(f\mathcal{O}_K, 1) \le P_{K,\mathbb{Z}}(f) \le I_K(f) \]

so $C(\mathcal{O})$ is a generalized ideal class group for $f\mathcal{O}_K$.

It turns out that the generalized ideal class groups are exactly the Galois groups of all abelian extensions of $K$. This correspondence is encoded in the Artin map

\[ \phi_{L/K} : I_K^{\mathfrak{m}} \longrightarrow \operatorname{Gal}(L/K) \]

where $\mathfrak{m}$ is chosen so that it is divisible by every prime of $K$ that ramifies in $L$. We have seen (courtesy of Corollary 2.5.2) that the Artin map is surjective onto $\operatorname{Gal}(L/K)$, so $\ker\phi_{L/K}$ has index $[L : K]$ in $I_K^{\mathfrak{m}}$.

The main result in this section is one of central importance in class field theory:

Artin Reciprocity Theorem. Let $L/K$ be an abelian extension of number fields with $G = \operatorname{Gal}(L/K)$. If $\mathfrak{m}$ is a modulus divisible by sufficiently high powers of every prime in $K$ that ramifies in $L$, then the Artin map

\[ \phi_{L/K} : I_K^{\mathfrak{m}} \longrightarrow G \]

is surjective and $\ker\phi_{L/K} = N_{L/K}(I_L^{\mathfrak{m}})\,i(K_{\mathfrak{m},1})$. In particular, $G$ is a generalized ideal class group for $\mathfrak{m}$.

We now focus on developing the tools to prove Artin reciprocity.

Definition. Let $L/K$ be an abelian extension of number fields and take $\mathfrak{m}$ a modulus of $K$. We say the reciprocity law holds for the triple $(L, K, \mathfrak{m})$ provided $i(K_{\mathfrak{m},1}) \subseteq \ker\phi_{L/K}$.

The reciprocity law is important to the proof of Artin reciprocity for the following reason.

Lemma 2.7.3. If $\mathfrak{m}$ is divisible by all primes ramifying in $L$ and the reciprocity law holds for $(L, K, \mathfrak{m})$ then $\ker\phi_{L/K} = N_{L/K}(I_L^{\mathfrak{m}})\,i(K_{\mathfrak{m},1})$.

Proof. By Corollary 2.2.7 we know $N_{L/K}(I_L^{\mathfrak{m}}) \subseteq \ker\phi_{L/K}$ and so $N_{L/K}(I_L^{\mathfrak{m}})\,i(K_{\mathfrak{m},1}) \subseteq \ker\phi_{L/K}$ as long as the reciprocity law holds. The first fundamental inequality says that

\[ [I_K^{\mathfrak{m}} : N_{L/K}(I_L^{\mathfrak{m}})\,i(K_{\mathfrak{m},1})] \le [L : K], \]

but since $[I_K^{\mathfrak{m}} : \ker\phi_{L/K}] = |\operatorname{Gal}(L/K)| = [L : K]$ by surjectivity, we must have $N_{L/K}(I_L^{\mathfrak{m}})\,i(K_{\mathfrak{m},1}) = \ker\phi_{L/K}$.

Example 2.7.4. We have previously shown (Example 2.3.8) that for a primitive $m$th root of unity $\zeta_m$ and the modulus $\mathfrak{m} = (m)\infty$, the reciprocity law holds for $(\mathbb{Q}(\zeta_m), \mathbb{Q}, \mathfrak{m})$; in fact we proved that $i(\mathbb{Q}_{\mathfrak{m},1}) = \ker\phi_{\mathbb{Q}(\zeta_m)/\mathbb{Q}}$.

Remark. By properties of the Artin map (Section 2.2), one can easily prove that

• If the reciprocity law holds for $(L, K, \mathfrak{m})$ and $E$ is any finite extension of $K$, then the reciprocity law holds for $(LE, E, \mathfrak{m})$.

• If the reciprocity law holds for $(L, K, \mathfrak{m})$, then it holds for $(L, K, \mathfrak{m}\mathfrak{n})$ where $\mathfrak{n}$ is any modulus of $K$.

• Combining these with the previous example, we see that for any primitive $m$th root of unity $\zeta_m$ and any modulus $\mathfrak{m}$ of $K$ divisible by $(m)\infty$, reciprocity holds for $(K(\zeta_m), K, \mathfrak{m})$.

It is clear that creating certain cyclotomic extensions of number fields is critical

to preserving the reciprocity law. This connection runs deep throughout this section,

culminating in the Kronecker-Weber Theorem at the end.

Before proving some properties of cyclotomic extensions of K, we need two results

from number theory.

Lemma 2.7.5 ([15]). Let $a, r \in \mathbb{Z}$ such that $a, r \ge 2$ and let $q$ be prime. Then there exists a prime $p$ such that $\operatorname{ord}_p(a) = q^r$.

Lemma 2.7.6 ([15]). Let $n$ be an integer with prime factorization $n = p_1^{r_1} \cdots p_s^{r_s}$. Then for any integer $a > 1$ there exist infinitely many squarefree integers $m$ such that $n \mid \operatorname{ord}_m(a)$. Furthermore, there exists an integer $b > 1$ such that $a \not\equiv b \pmod{m}$ and $n \mid \operatorname{ord}_m(b)$.
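Lemma 2.7.5 can be explored computationally. The sketch below reads $\operatorname{ord}_p(a)$ as the multiplicative order of $a$ modulo $p$ and the exponent in the lemma as $q^r$; both readings are assumptions about the notation. Any prime with $\operatorname{ord}_p(a) = q^r$ must divide $a^{q^r} - 1$, so the search only inspects the prime factors of that number. The function names are hypothetical.

```python
from math import gcd

def mult_order(a, m):
    """Multiplicative order of a modulo m (requires gcd(a, m) == 1)."""
    if gcd(a, m) != 1:
        raise ValueError("a must be a unit modulo m")
    k, x = 1, a % m
    while x != 1:
        x = x * a % m
        k += 1
    return k

def prime_factors(n):
    """Distinct prime factors of n by trial division."""
    factors, p = [], 2
    while p * p <= n:
        if n % p == 0:
            factors.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors

def prime_with_order(a, q, r):
    """Smallest prime p with ord_p(a) == q**r, or None if there is none."""
    target = q**r
    for p in prime_factors(a**target - 1):
        if mult_order(a, p) == target:
            return p
    return None

for a, q, r in [(2, 2, 2), (2, 3, 2), (3, 2, 3)]:
    p = prime_with_order(a, q, r)
    print(a, q, r, p, mult_order(a, p))
```

For instance, 2 has order $4 = 2^2$ modulo 5 and order $9 = 3^2$ modulo 73. Trial division keeps the sketch short but is only practical for small $a^{q^r} - 1$.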

Now let L/K be an abelian extension of number fields.

Proposition 2.7.7. Let $n = [L : K]$ and suppose $s$ is a positive integer. Take a prime $\mathfrak{p} \subset \mathcal{O}_K$ which is unramified in $L$. Then there exists a primitive $m$th root of unity $\zeta_m$, with $E = K(\zeta_m)$, such that $m$ is relatively prime to $\mathfrak{p}$ and $s$, and the following conditions are met:

(i) $L \cap E = K$.

(ii) The element $\phi_{E/K}(\mathfrak{p})$ in $\operatorname{Gal}(E/K)$ has order divisible by $n$.

(iii) There is some element $\sigma \in \operatorname{Gal}(E/K)$ whose order is divisible by $n$ that satisfies $\langle\sigma\rangle \cap \langle\phi_{E/K}(\mathfrak{p})\rangle = \{1\}$.

Proof. (i) We apply Lemma 2.7.6 to $a = N(\mathfrak{p})$. Since $L$ only has finitely many subfields, there is some $M$ such that $\mathbb{Q}(e^{2\pi i/M})$ contains every cyclotomic subfield of $L$. Lemma 2.7.6 allows us to select $m$ with no prime divisors less than $M \cdot s$. Then $\mathbb{Q}(e^{2\pi i/M}) \cap \mathbb{Q}(\zeta_m) = \mathbb{Q}$ and $L \cap \mathbb{Q}(\zeta_m) = \mathbb{Q}$. Taking $E = K(\zeta_m)$ it follows that $L \cap E = K$.

(ii) Let $\tau = \phi_{E/K}(\mathfrak{p}) \in \operatorname{Gal}(E/K)$. By definition $\phi_{E/K}(\mathfrak{p})$ is a Frobenius automorphism satisfying $\tau(\zeta_m) = \zeta_m^{N(\mathfrak{p})} = \zeta_m^a$. Thus $\tau$ has order divisible by $n$.

(iii) Finally, choose $b \in \mathbb{Z}$ according to Lemma 2.7.6 and define $\sigma \in \operatorname{Gal}(E/K)$ on the primitive element of $E/K$ by $\sigma(\zeta_m) = \zeta_m^b$. Then $\sigma$ has order divisible by $n$. Since $(a, b) = 1$, it is clear that $\langle\sigma\rangle \cap \langle\tau\rangle = \{1\}$ as desired.

Lemma 2.7.8 (Artin). Let L/K be a cyclic extension and p ⊂ OK a prime that is

unramified in L. Then there exists an mth root of unity ζm and an extension F/K

such that

(1) L ∩ F = K.

(2) L ∩K(ζm) = K.

(3) L(ζm) = F (ζm).

(4) p splits completely in F .

Proof. Choose m and ζ = ζm as in Proposition 2.7.7. Then L(ζ) = LE and L ∩

E = K (so (2) is done). This means that Gal(L(ζ)/K) ∼= Gal(L/K) × Gal(E/K).

Let σ be a generator of Gal(L/K) and choose τ ∈ Gal(E/K) according to (iii) of
Proposition 2.7.7. Define H to be the subgroup of Gal(L(ζ)/K) generated by (σ, τ)
and $(\varphi_{L/K}(\mathfrak{p}), \varphi_{E/K}(\mathfrak{p}))$. We claim that $F = (LE)^H$ is the desired field extension of
K.

By Corollary 2.2.3, $\varphi_{LE/K}(\mathfrak{p}) = (\varphi_{L/K}(\mathfrak{p}), \varphi_{E/K}(\mathfrak{p}))$, which generates the decomposition
group of p (rather, of a prime lying over p) in Gal(LE/K), so in particular the
decomposition group is contained in H. Since LE/K is abelian, it follows that p splits
completely in $F = (LE)^H$, proving (4).

Next, note that F(ζ) = FE is the fixed field of H ∩ (Gal(L/K) × {1}). Suppose we
have an element $(\sigma, \tau)^a (\varphi_{L/K}(\mathfrak{p}), \varphi_{E/K}(\mathfrak{p}))^b$ of H that lies in Gal(L/K) × {1}. Then
$\tau^a \in \langle \varphi_{E/K}(\mathfrak{p}) \rangle$, so $\tau^a = 1$ since $\langle \tau \rangle \cap \langle \varphi_{E/K}(\mathfrak{p}) \rangle = \{1\}$ by (iii) of Proposition 2.7.7. This
implies n = [L : K] divides a, and since the order of σ is n we have $\sigma^a = 1$. This
further shows that $\varphi_{E/K}(\mathfrak{p})^b = 1$ and n | b by Proposition 2.7.7. Thus $\varphi_{L/K}(\mathfrak{p})^b = 1$.
All of this shows that H ∩ (Gal(L/K) × {1}) = {1} so F(ζ) = LE = L(ζ), proving
(3).

Finally, observe that L∩F is the subfield of L fixed by H. Since (σ, τ) ∈ H, L∩F

is really the subfield fixed by σ, which is K. This proves (1) and we’re finished.

We next prove an intermediate result for cyclic extensions which we will use to

prove the Artin Reciprocity Theorem for all abelian extensions.

Theorem 2.7.9. Let L/K be a cyclic extension, G = Gal(L/K), m a modulus of

K divisible by all ramified (in L) primes of OK. Then the reciprocity law holds for

(L,K,m).

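Proof. By Corollary 2.6.10, the fundamental equality holds for the cyclic extension
L/K, so it suffices to prove $\ker\varphi_{L/K} \subseteq N_{L/K}(I_L^{\mathfrak{m}})\, i(K_{\mathfrak{m},1})$. Take an ideal $\mathfrak{a} \in \ker\varphi_{L/K}$
and write its prime factorization $\mathfrak{a} = \mathfrak{p}_1^{a_1} \cdots \mathfrak{p}_r^{a_r}$. The $\mathfrak{p}_i$ are all unramified in L since
$\mathfrak{a} \in I_K^{\mathfrak{m}}$ and m is assumed to contain all the ramified primes. For each $\mathfrak{p}_i$ we may
use Artin’s Lemma to select a root of unity $\zeta_{m_i}$ such that $(m_i, m_j) = 1$ for all $i \neq j$,
$i, j = 1, \ldots, r$. By Proposition 2.7.7, we can also force $K \cap \mathbb{Q}(\zeta_{m_i}) = \mathbb{Q}$ for each
i. Define $G_i := \mathrm{Gal}(K(\zeta_{m_i})/K)$. Then $G_i \cong \mathrm{Gal}(\mathbb{Q}(\zeta_{m_i})/\mathbb{Q})$ and the automorphism
group of $L(\zeta_{m_1}, \ldots, \zeta_{m_r})/K$ is $G \times G_1 \times \cdots \times G_r$.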

Suppose G = 〈σ〉. For each i let $\tau_i$ be the element in $G_i$ chosen via (iii) of
Proposition 2.7.7. Let $H_i$ be the subgroup of $G \times G_i$ generated by the elements

$(\sigma, \tau_i)$ and $(\varphi_{L/K}(\mathfrak{p}_i), \varphi_{K(\zeta_{m_i})/K}(\mathfrak{p}_i))$.

Furthermore, let $F_i$ be the fixed field of $H_i \times \prod_{j \neq i} G_j$ and set $F = F_1 \cdots F_r$. We take
a moment to verify that L ∩ F = K and Gal(L/K) = Gal(LF/F). Note that the
intersection of all the $\mathrm{Gal}(LF/F_i)$ fixes F and contains $(\sigma, \tau_1, \ldots, \tau_r)$. The field L ∩ F
is also fixed by this element and by $(1, \tau_1, \ldots, \tau_r)$, so L ∩ F is fixed by σ and therefore
L ∩ F = K.

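Now let $\varphi_{L/K}(\mathfrak{p}_i^{a_i}) = \sigma^{d_i}$ where $d_i \geq 0$. Then $1 = \varphi_{L/K}(\mathfrak{a}) = \sigma^d$ where
$d = d_1 + \cdots + d_r$ and $[L : K] \mid d$. For a sufficiently large modulus $\mathfrak{m}'$, the Artin map

$\varphi_{LF/F} : I_F^{\mathfrak{m}'} \longrightarrow \mathrm{Gal}(LF/F)$

is surjective so there is an ideal $\mathfrak{b}_0$ relatively prime to m and all the $m_i$ such that
$\varphi_{LF/F}(\mathfrak{b}_0) = \sigma$. Let $\mathfrak{b} = N_{F/K}(\mathfrak{b}_0) \in I_K^{\mathfrak{m}}$. By properties of the Artin map in extensions
(Proposition 2.2.6), we see that $\varphi_{L/K}(\mathfrak{b}) = \sigma$. For each i, $\mathfrak{p}_i$ splits completely so there
exists an ideal $\mathfrak{c}_i$ relatively prime to m and each $m_j$ such that $N_{F_i/K}(\mathfrak{c}_i) = \mathfrak{p}_i^{a_i} \mathfrak{b}^{-d_i}$.
By our choice of $d_i$,

$\varphi_{LF_i/F_i}(\mathfrak{c}_i) = \varphi_{L/K}(N_{F_i/K}(\mathfrak{c}_i)) = 1.$

By properties of the reciprocity law, $F_i \subset LF_i \subset F_i(\zeta_{m_i})$ and so the reciprocity law
holds for $(LF_i, F_i, \mathfrak{m}')$ as long as $\mathfrak{m}'$ is divisible by $(m_i)\infty$.

We chose $\mathfrak{c}_i$ prime to the $m_i$ so we may select $\mathfrak{m}'$ so that $\mathfrak{c}_i \in I_{F_i}^{\mathfrak{m}'}$. Then there exist
$\gamma_i \in F_i$, $\gamma_i \equiv 1 \bmod \mathfrak{m}'$ and an ideal $\mathfrak{d}_i \in I_{LF_i}^{\mathfrak{m}'}$ such that $\mathfrak{c}_i = (\gamma_i)\, N_{LF_i/F_i}(\mathfrak{d}_i)$. Taking
K-norms yields

$\mathfrak{p}_i^{a_i} \mathfrak{b}^{-d_i} = (N_{F_i/K}(\gamma_i))\, N_{LF_i/K}(\mathfrak{d}_i).$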

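Selecting $\mathfrak{m}'$ so that $\mathfrak{m} \mid \mathfrak{m}'$ ensures that $\alpha_i := N_{F_i/K}(\gamma_i)$ lies in $K_{\mathfrak{m},1}$. Now taking
products of the above pieces over all i gives us

$$\mathfrak{a}\mathfrak{b}^{-d} = \prod_{i=1}^{r} \mathfrak{p}_i^{a_i} \mathfrak{b}^{-d_i} = \prod_{i=1}^{r} \alpha_i \prod_{i=1}^{r} N_{LF_i/K}(\mathfrak{d}_i).$$

Write $\mathfrak{d}_i' = N_{LF_i/L}(\mathfrak{d}_i)$. Then $\mathfrak{a} = \mathfrak{b}^d (\alpha_1 \cdots \alpha_r)\, N_{L/K}(\mathfrak{d}_1' \cdots \mathfrak{d}_r')$. Above we saw that
[L : K] divides d, so $\mathfrak{b}^d$ is a norm on L/K. Hence we have shown that $\mathfrak{a} \in N_{L/K}(I_L^{\mathfrak{m}})\, i(K_{\mathfrak{m},1})$
and the theorem is proved.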

A small bit of work remains to prove the main result, which we restate here.

Artin Reciprocity Theorem. Let L/K be an abelian extension with G = Gal(L/K).
Suppose m is a modulus of K divisible by all primes in K which ramify in L and
assume their exponents are sufficiently large. Then the Artin map

$\varphi_{L/K} : I_K^{\mathfrak{m}} \longrightarrow G$

is surjective with $\ker\varphi_{L/K} = N_{L/K}(I_L^{\mathfrak{m}})\, i(K_{\mathfrak{m},1})$.

Proof. Surjectivity was proven in Corollary 2.5.2. By the fundamental theorem of
finite abelian groups we can express G as the product of cyclic groups:

$G = C_1 \times \cdots \times C_s.$

Set $H_j = \prod_{i \neq j} C_i$ so that $G = C_i \times H_i$ for any i. Let $E_i$ denote the subfield of L fixed
by $H_i$. Then $E_i/K$ is a cyclic extension with Galois group $C_i$ and by Theorem 2.7.9
there is a modulus $\mathfrak{m}_i$ such that the reciprocity law holds for $(E_i, K, \mathfrak{m}_i)$. We may
choose each $\mathfrak{m}_i$ so that $\mathfrak{m}_i \mid \mathfrak{m}$, meaning the reciprocity law also holds for $(E_i, K, \mathfrak{m})$
and thus

$$i(K_{\mathfrak{m},1}) \subseteq \bigcap_{i=1}^{s} \ker\varphi_{E_i/K}.$$

By properties of the Frobenius automorphism (Proposition 2.2.6), we have
$\varphi_{L/K}(\mathfrak{a})|_{E_i} = \varphi_{E_i/K}(\mathfrak{a})$ for any fractional ideal $\mathfrak{a}$ of $\mathcal{O}_K$. In particular, if $\mathfrak{a} \in i(K_{\mathfrak{m},1})$ then
$\varphi_{L/K}(\mathfrak{a})|_{E_i} = 1$ for all i. But $E_1 \cdots E_s = L$ because the group that fixes all the $E_i$ is
$\bigcap H_i = \{1\}$. Thus any automorphism acting trivially on all the $E_i$ is the identity on L, which gives
us $i(K_{\mathfrak{m},1}) \subseteq \ker\varphi_{L/K}$. The theorem follows at once from Lemma 2.7.3.

We have therefore also proven Theorem 1.8.4 which was instrumental in construct-

ing the connection between the Hilbert class field and the class group C(OK). Here

we have proven a much stronger connection between Artin maps for a large class of

moduli and generalized ideal class groups. The full picture will become clear in Sec-

tion 2.9 when we show that the finite abelian extensions of K and generalized ideal

class groups are in correspondence.

Corollary 2.7.10 ([15]). Let L/K be abelian and suppose m is a modulus of K such
that the reciprocity law holds for (L, K, m). If E is a normal extension of K such that

$N_{E/K}(I_E^{\mathfrak{m}}) \subseteq N_{L/K}(I_L^{\mathfrak{m}})\, i(K_{\mathfrak{m},1})$

then L ⊂ E.

We use this corollary to prove another important result in class field theory. One

has probably noticed by now that the roots of unity are an important tool in describ-

ing Artin reciprocity for abelian extensions. The famous Kronecker-Weber Theorem

characterizes every abelian extension of Q as a subfield of some cyclotomic field.

Kronecker-Weber Theorem. Every abelian extension K of Q is contained in

Q(ζm) for some primitive mth root of unity ζm.

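Proof. Our proof of the Artin Reciprocity Theorem shows that the reciprocity law
holds for $(K, \mathbb{Q}, \mathfrak{m})$ for some modulus $\mathfrak{m}$. We may write $\mathfrak{m} = (m)\infty$ where m is
a positive integer. Let $\zeta_m = e^{2\pi i/m}$, a primitive mth root of unity, and consider
$L = \mathbb{Q}(\zeta_m)$. In Example 2.3.8 we computed the kernel of $\varphi_{L/\mathbb{Q}}$ to be $i(\mathbb{Q}_{\mathfrak{m},1})$, so we
have

$i(\mathbb{Q}_{\mathfrak{m},1}) = N_{L/\mathbb{Q}}(I_L^{\mathfrak{m}})\, i(\mathbb{Q}_{\mathfrak{m},1}) \subseteq N_{K/\mathbb{Q}}(I_K^{\mathfrak{m}})\, i(\mathbb{Q}_{\mathfrak{m},1}) = \ker\varphi_{K/\mathbb{Q}}.$

By Corollary 2.7.10, we conclude that $K \subset L = \mathbb{Q}(\zeta_m)$.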
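The simplest explicit instance of the Kronecker-Weber Theorem is a quadratic field. As a hedged numerical illustration (not part of the proof above), the classical quadratic Gauss sum $g = \sum_{a=1}^{p-1} \left(\tfrac{a}{p}\right) \zeta_p^a$ satisfies $g^2 = (-1)^{(p-1)/2} p$, which exhibits $\mathbb{Q}(\sqrt{\pm p})$ inside $\mathbb{Q}(\zeta_p)$. The function names below are illustrative, and the check is in floating point, so it is a sanity check rather than a proof.

```python
import cmath


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    s = pow(a, (p - 1) // 2, p)
    return -1 if s == p - 1 else s


def gauss_sum(p):
    """Quadratic Gauss sum: sum of (a/p) * zeta_p^a over a = 1, ..., p - 1."""
    zeta = cmath.exp(2j * cmath.pi / p)
    return sum(legendre(a, p) * zeta**a for a in range(1, p))


def gauss_sum_squared(p):
    """g^2, which should be numerically close to (-1)^((p-1)/2) * p."""
    g = gauss_sum(p)
    return g * g
```

For p = 5 the square is close to 5, so $\sqrt{5} \in \mathbb{Q}(\zeta_5)$; for p = 7 it is close to −7, so $\sqrt{-7} \in \mathbb{Q}(\zeta_7)$.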

For a proof of Kronecker-Weber that does not rely on class field theory, see the

exercises in chapter 4 of [18]. This completes our discussion of Artin reciprocity and

the Kronecker-Weber Theorem for now, although these concepts continue to crop up

in future discussions as they are integral to class field theory as a whole.

2.8 The Conductor Theorem

For an abelian extension L/K, the Artin reciprocity theorem and its corollary (2.7.10)

imply that Gal(L/K) is a generalized ideal class group for an infinite number of

moduli m, namely those divisible by the primes of K that ramify in L. There is in

fact a ‘best’ modulus for a particular extension L/K, called the conductor, which is

divisible by only those primes that ramify.

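Fix a prime $\mathfrak{p} \subset \mathcal{O}_K$ and take $\mathfrak{m}$ to be any modulus divisible by $\mathfrak{p}$. Theorem 2.3.4
gives us an exact sequence

$$0 \to (\mathcal{O}_K/\mathfrak{p}^{m(\mathfrak{p})})^\times \to K_{\mathfrak{m}}/K_{\mathfrak{m},1} \to C_K(\mathfrak{m}) \xrightarrow{\ \varphi_{L/K}\ } C(\mathcal{O}_K) \to 0,$$

where $\varphi_{L/K}$ is the Artin map for $\mathfrak{m}$. There is a smallest integer $f(\mathfrak{p}) \leq m(\mathfrak{p})$ such
that this sequence factors through $(\mathcal{O}_K/\mathfrak{p}^{f(\mathfrak{p})})^\times$.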

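Definition. Let $f(\mathfrak{p})$ be as above and let $\mathfrak{m}_\infty$ be the modulus of all infinite primes
of K. The modulus $\mathfrak{f}(L/K) = \mathfrak{m}_\infty \prod_{\mathfrak{p}} \mathfrak{p}^{f(\mathfrak{p})}$ is called the conductor of the extension
L/K. It is the smallest modulus $\mathfrak{f}$ such that the Artin map $\varphi_{L/K}$ factors through
$C_K(\mathfrak{f})$.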

Proposition 2.8.1. If the reciprocity law holds for (L,K,m) then f(L/K) | m.

Proof. Obvious.

So far we do not know if the reciprocity law holds for f(L/K); of particular concern

is that some ramified primes might not divide the conductor. The Conductor Theorem

states that this does not happen.

The Conductor Theorem. Let L/K be abelian with conductor f = f(L/K). Then

a prime of K (finite or infinite) ramifies in L if and only if it divides f. Moreover, a

modulus m is divisible by f if and only if kerϕL/K is a congruence subgroup for m.

The proof of the conductor theorem is rather interesting, as it makes extensive use

of the local Artin map and thus establishes one of the powerful local-global connections

in class field theory. For details, consult sections V.11–12 of [15].

Proposition 2.8.2. Let L = Q(ζm) where ζm is a primitive mth root of unity. The
conductor of L/Q is determined by

$$\mathfrak{f}(L/\mathbb{Q}) = \begin{cases} 1 & m \leq 2 \\ (n)\infty & m = 2n \text{ where } n > 1 \text{ is odd} \\ (m)\infty & \text{otherwise.} \end{cases}$$

Proof. The conductor theorem says that f(L/Q) is the modulus of Q divisible by
exactly those primes, finite and infinite, which ramify in L. Every modulus of Q
divisible by the infinite prime is of the form (n)∞ for some positive integer n, so write
f = (n)∞. When m = 1, 2 the conductor is clearly 1 since Q(ζm) = Q in both cases.
When m > 2, Example 2.3.8 tells us that all ramified primes divide the modulus
$\mathfrak{m} = (m)\infty$ and that the reciprocity law holds for it, so by Proposition 2.8.1 the
conductor divides (m)∞, that is, n | m.

What’s more, $\mathfrak{m}$ is a modulus of Q that is divisible by every ramified prime of
both L and M = Q(ζn). This implies that $\ker\varphi_{M/\mathbb{Q}}$ is a subgroup of $\ker\varphi_{L/\mathbb{Q}}$ (both taken
for the modulus $\mathfrak{m}$), which by Corollary 2.7.10 shows that L ⊂ M. Since both extensions
are Galois over Q, we must have that |Gal(L/Q)| divides |Gal(M/Q)|, that is, φ(m) | φ(n).
It is well known that n | m always implies φ(n) | φ(m), so in this case we see that
φ(n) = φ(m). Now, under the condition n | m, this can only happen when m = n or when
m = 2n with n odd. Notice that this corresponds precisely with the second and third
lines of the formula for f(L/Q) given above, so we are done.
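The last step of this proof rests on an elementary fact about Euler's function: if n | m and φ(n) = φ(m), then m = n or m = 2n with n odd. The following is a small brute-force check of that fact together with the resulting conductor formula. It is a sketch only; the function names and the encoding of the conductor as a pair (finite part, whether the infinite prime divides it) are assumptions made here for illustration.

```python
def phi(n):
    """Euler's totient function by trial-division factorization."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p
        p += 1
    if m > 1:
        result -= result // m
    return result


def equal_totient_divisors(m):
    """All divisors n of m with phi(n) == phi(m), in increasing order."""
    return [n for n in range(1, m + 1) if m % n == 0 and phi(n) == phi(m)]


def cyclotomic_conductor(m):
    """Conductor of Q(zeta_m)/Q as (finite part, divisible by the infinite prime?)."""
    if m <= 2:
        return (1, False)
    if m % 4 == 2:  # m = 2n with n > 1 odd
        return (m // 2, True)
    return (m, True)
```

For example, `equal_totient_divisors(18)` returns `[9, 18]`, matching $\mathbb{Q}(\zeta_{18}) = \mathbb{Q}(\zeta_9)$, while `equal_totient_divisors(12)` returns `[12]`.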

Example 2.8.3. Let K = Q(√D) for a squarefree integer D. Using the definition of
conductor we have

$$\mathfrak{f}(K/\mathbb{Q}) = \begin{cases} (|d_K|) & D > 0 \\ (|d_K|)\infty & D < 0. \end{cases}$$
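To make the example concrete, recall the standard formula for the discriminant of a quadratic field: $d_K = D$ if $D \equiv 1 \pmod 4$ and $d_K = 4D$ otherwise. The sketch below computes the conductor from D under that formula; the function names and the pair encoding (finite part, whether the infinite prime appears) are illustrative assumptions.

```python
def quadratic_discriminant(D):
    """Discriminant d_K of K = Q(sqrt(D)) for a squarefree integer D not 0 or 1."""
    # Python's % returns a value in {0, 1, 2, 3} even for negative D.
    return D if D % 4 == 1 else 4 * D


def quadratic_conductor(D):
    """Conductor of Q(sqrt(D))/Q as (|d_K|, divisible by the infinite prime?)."""
    return (abs(quadratic_discriminant(D)), D < 0)
```

For instance, D = 5 gives the conductor (5), and D = −1 gives (4)∞, consistent with $\mathbb{Q}(\sqrt{5}) \subset \mathbb{Q}(\zeta_5)$ and $\mathbb{Q}(i) = \mathbb{Q}(\zeta_4)$.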

2.9 The Existence and Classification Theorems

Definition. Suppose L/K is an abelian extension and m is a modulus of K. If H is
a congruence subgroup for m and H is the kernel of the Artin map
$\varphi_{L/K} : I_K^{\mathfrak{m}} \to \mathrm{Gal}(L/K)$, then L is said to be a class field of H.

The goal of class field theory is then to classify all abelian extensions by their class

groups. We will prove the Existence Theorem:

Theorem. Let m be a modulus of K and let H be a congruence subgroup for m. Then

there exists an abelian extension L ⊃ K, all of whose ramified primes divide m, such

that H is the kernel of the Artin map ϕL/K : ImK −→ Gal(L/K), that is, L is a class

field of H.

Directly constructing a class field for H is hard, so the usual approach in class

field theory texts [15] is to construct enough extensions to force the existence of L.

Lemma 2.9.1. Let m be divisible by all primes of K ramifying in L and suppose

there is a chain of subgroups

i(Km,1) ≤ H0 ≤ H1 ≤ Im

such that H0 is a congruence subgroup for an abelian extension L/K. Then H1 is a

congruence subgroup for the subfield of L fixed by the subgroup ϕL/K(H1) ≤ Gal(L/K).

Proof. Let G1 = ϕL/K(H1) and let E be the subfield of L fixed by G1. Let r :

Gal(L/K) → Gal(E/K) be the natural restriction, so that r(G1) = 1. For any

a ∈ Im, ϕE/K(a) = (r ◦ ϕL/K)(a) so in particular ϕE/K(a) = 1 when a ∈ H1. Thus

H1 ⊂ kerϕE/K .

On the other hand, since H1 is a congruence subgroup the reciprocity law holds

for (E,K,m) and so

[Im : kerϕE/K ] = [Gal(L/K) : G1] = [Im : H1].

This proves H1 = kerϕE/K and the Artin reciprocity theorem implies the rest.

Lemma 2.9.2. Let H be a congruence subgroup of K for the modulus m. To show

there exists a class field L of H, it suffices to prove this when K contains a primitive

nth root of unity, where n = [Im : H].

Proof. We create a tower

K = K(1) ⊂ K(2) ⊂ · · · ⊂ K(r) = K(ζn)

where each subextension K(i+1)/K(i) is cyclic. Now apply Lemma 2.9.1 and Proposi-

tion V.7.2 from [15].

This allows us to assume K contains the nth roots of unity. Let S1 be a finite set

of primes of K and let

$$\mathfrak{m}_1 = \prod_{\mathfrak{p} \in S_1} \mathfrak{p}^{m_1(\mathfrak{p})}$$

for sufficiently high powers m1(p). Define S2 and m2 in the same way and suppose

S1 ∩ S2 = ∅ and that S1 ∪ S2 contains all primes p satisfying

(i) p | n;

(ii) p | ∞;

(iii) and p | ai where {ai} is a finite set of OK-ideals whose images cover C(OK).

Then any ideal a can be expressed as a = ai(α) for some α ∈ K and ai only divisible

by primes in S := S1 ∪ S2. Define the congruence subgroups

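$H_1 = i(K_{\mathfrak{m}_1,1})\,(I^{\mathfrak{m}_1})^n\, I(S_2)$ and $H_2 = i(K_{\mathfrak{m}_2,1})\,(I^{\mathfrak{m}_2})^n\, I(S_1)$

where $I(S_j)$ denotes the group generated by finite primes in $S_j$. (These are congruence
subgroups since $S_1 \cap S_2 = \emptyset$ implies $H_1 \subseteq I^{\mathfrak{m}_1}$ and $H_2 \subseteq I^{\mathfrak{m}_2}$.) Next we define two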

subgroups of K∗:

$W_1 = K^S K^n \cap K_{\mathfrak{m}_2,1}$ and $W_2 = K^S K^n \cap K_{\mathfrak{m}_1,1}$.

We claim that $L_1 = K(\sqrt[n]{W_1})$ and $L_2 = K(\sqrt[n]{W_2})$ are the respective class fields over
K for $H_1$ and $H_2$. This is proven in detail in section V.9 of [15]. We will end the

discussion here, since our goal is to explore the consequences of the existence theorem.

In any case, the construction of such a class field L1 for H1 allows us to prove

The Existence Theorem. Every congruence subgroup H of K has a class field.

We consolidate the proof here.

Proof. Take a congruence subgroup H and set $[I^{\mathfrak{m}} : H] = n$. Lemma 2.9.2 says that
we may assume K contains the nth roots of unity. Let $S_1$ be a finite set of primes
containing all primes dividing m and satisfying (i) – (iii) above. Let $S_2 = \emptyset$ so that
$S = S_1 \cup S_2 = S_1$. Define $\mathfrak{m}_1$ as above so that $\mathfrak{m} \mid \mathfrak{m}_1$. Then $H_1 = H \cap I^{\mathfrak{m}_1}$ and by
the above work there is an abelian extension $L_1$ with $H_1 = \ker\varphi_{L_1/K}$. Finally, by
Lemma 2.9.1 there is a subfield L of $L_1$ which is a class field for $H \supseteq H_1$.

An important corollary is the classification theorem of class field theory, which

bears a resemblance to the fundamental theorem of Galois theory. Such classification

theorems are a primary tool in many areas of modern mathematics. First, we need

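Lemma 2.9.3. Suppose n and m are moduli of K such that n | m. If $H^{\mathfrak{n}}$ is a
congruence subgroup for n and $H^{\mathfrak{m}} = H^{\mathfrak{n}} \cap I_K^{\mathfrak{m}}$ then the class groups $I_K^{\mathfrak{n}}/H^{\mathfrak{n}}$ and
$I_K^{\mathfrak{m}}/H^{\mathfrak{m}}$ are isomorphic.

Proof. Since $H^{\mathfrak{n}}$ is a congruence subgroup for n, $I_K^{\mathfrak{n}} = I_K^{\mathfrak{m}} H^{\mathfrak{n}}$, so by isomorphism
theorems,

$$\frac{I_K^{\mathfrak{m}}}{H^{\mathfrak{m}}} = \frac{I_K^{\mathfrak{m}}}{I_K^{\mathfrak{m}} \cap H^{\mathfrak{n}}} \cong \frac{I_K^{\mathfrak{m}} H^{\mathfrak{n}}}{H^{\mathfrak{n}}} = \frac{I_K^{\mathfrak{n}}}{H^{\mathfrak{n}}}.$$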

The Classification Theorem. Let K be a number field. There is a one-to-one,
inclusion-reversing correspondence

$$\{\text{finite abelian extensions } L/K\} \longleftrightarrow \{\text{generalized ideal class groups of } K\}.$$

Proof. The existence theorem shows that every congruence subgroup corresponds to
an abelian extension. Conversely, let L and M be abelian extensions of K. Consider
the Artin maps $\varphi_{L/K} : I_K^{\mathfrak{f}(L/K)} \to \mathrm{Gal}(L/K)$ and $\varphi_{M/K} : I_K^{\mathfrak{f}(M/K)} \to \mathrm{Gal}(M/K)$, where
$\mathfrak{f}$ denotes the conductor of each extension. By the conductor theorem, $\ker\varphi_{L/K}$ and
$\ker\varphi_{M/K}$ are both congruence subgroups for K and by Lemma 2.9.3 it suffices to prove
the correspondence for these congruence subgroups. On one hand, Corollary 2.7.10
shows that if $\ker\varphi_{L/K} \subseteq \ker\varphi_{M/K}$ then M ⊂ L. On the other hand, M ⊂ L implies
that $\ker\varphi_{L/K} \subset \ker\varphi_{M/K}$ and so the correspondence is indeed one-to-one.

At this point we return to the defining property of the Hilbert class field which we

have so far neglected to justify. Take the modulus m = 1 on K and the congruence

subgroup PK = PK(m, 1) ≤ ImK = IK . By the existence theorem, there is a unique

abelian extension L/K such that the Artin map induces the isomorphism

C(OK) = IK/PK ∼= Gal(L/K).

Using this, we may now prove

Theorem 2.9.4. For a number field K, the Hilbert class field L/K is the maximal

unramified abelian extension of K.

Proof. Since m = 1, it follows that L is unramified. Let M be another unramified

abelian extension of K. By the conductor theorem, the primes of K dividing the

conductor f(M/K) are exactly those which ramify in M . There are none of these,

so f(M/K) = 1. The conductor theorem also tells us that kerϕM/K is a congruence

subgroup for m = 1. Then PK ⊂ kerϕM/K , but for the Hilbert class field L, PK =

kerϕL/K. Thus kerϕL/K ⊂ kerϕM/K. Finally, Corollary 2.7.10 shows that M ⊂ L.

We have now proven in greater generality all of the main theorems from Sec-

tion 1.8. Finally, we briefly mention a nice property of the Hilbert class field which

was conjectured by Hilbert and proven by Artin and Furtwängler using the transfer

map in group theory.

Theorem 2.9.5 (Principal Ideal Theorem). If L is the Hilbert class field of K, then

every ideal a ⊂ OK becomes principal in OL.

2.10 The Cebotarev Density Theorem

In understanding the connections between the density theorems of Frobenius and

Cebotarev, it is important to study how they fit in with other related results. Frobe-

nius proved his theorem in 1880 (and finally published the result 16 years later; see

[26]), but this came several decades after Dirichlet’s more famous theorem on primes

in arithmetic progression (see Section 2.5). Although his original proof did not refer

to the idea of density, Dirichlet’s result essentially showed that for any m ∈ Z, the

density of the set

S = {p prime | p ≡ a (mod m), (a,m) = 1}

is $\delta(S) = \frac{1}{\varphi(m)}$. Frobenius successfully generalized this result to describe the splitting

behavior of monic polynomials f over Fp, where p is a prime not dividing the dis-

criminant D(f). In loose terms, Frobenius’ result showed that the number of primes

p such that f has a given decomposition over Fp is proportional to the number of au-

tomorphisms σ ∈ Gal(K/Q) with the same cycle type as this decomposition, where

K is a splitting field of f over the rationals. We illustrate this with an example.

Example 2.10.1. Let $f = x^4 - x - 1$. Some decomposition patterns of f over finite
fields are shown below.

$f \equiv (x^3 + 3x^2 + 2x + 5)(x + 4) \pmod{7}$

$f \equiv x^4 - x - 1 \pmod{47}$

$f \equiv (x^2 + 34x + 24)(x^2 + 67x + 21) \pmod{101}.$

(These factorizations are easy to produce with the MAGMA code found in
Appendix A.4.) It turns out [26] that f factors into the different decompositions
(partitions of n = 4) with the following approximate frequencies:

decomposition    proportion of primes
4                1/4
3,1              1/3
2,2              1/8
2,1,1            1/4
1,1,1,1          1/24
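The decomposition patterns in this example can also be reproduced without MAGMA. The following is a self-contained sketch of distinct-degree factorization over $\mathbb{F}_p$, which computes only the degrees of the irreducible factors. It is not the code of Appendix A.4; all function names are illustrative, polynomials are stored as low-to-high coefficient lists, and the input is assumed monic and squarefree mod p (for $f = x^4 - x - 1$, whose discriminant is −283, that means p ≠ 283).

```python
def trim(a):
    """Drop trailing zero coefficients (polynomials are low-to-high lists)."""
    while a and a[-1] == 0:
        a.pop()
    return a


def poly_divmod(a, g, p):
    """Quotient and remainder of a by a monic polynomial g over F_p."""
    a = [c % p for c in a]
    q = [0] * max(len(a) - len(g) + 1, 0)
    for shift in range(len(a) - len(g), -1, -1):
        c = a[shift + len(g) - 1]
        q[shift] = c
        for i, gi in enumerate(g):
            a[shift + i] = (a[shift + i] - c * gi) % p
    return q, trim(a)


def poly_mulmod(a, b, f, p):
    """Product a*b reduced modulo the monic polynomial f over F_p."""
    if not a or not b:
        return []
    prod = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    return poly_divmod(prod, f, p)[1]


def poly_powmod(base, e, f, p):
    """base**e modulo f over F_p by repeated squaring."""
    result = poly_divmod([1], f, p)[1]
    while e:
        if e & 1:
            result = poly_mulmod(result, base, f, p)
        base = poly_mulmod(base, base, f, p)
        e >>= 1
    return result


def poly_gcd(a, b, p):
    """Monic gcd of a and b over F_p (a is assumed monic if b is zero)."""
    a, b = trim([c % p for c in a]), trim([c % p for c in b])
    while b:
        inv = pow(b[-1], -1, p)  # modular inverse; needs Python 3.8+
        b = [(c * inv) % p for c in b]
        a, b = b, poly_divmod(a, b, p)[1]
    return a


def factor_degrees(f, p):
    """Sorted degrees of the irreducible factors of a monic squarefree f mod p."""
    f = trim([c % p for c in f])
    degrees, k = [], 0
    h = poly_divmod([0, 1], f, p)[1]  # h tracks x^(p^k) mod f
    while len(f) - 1 >= 2 * (k + 1):
        k += 1
        h = poly_powmod(h, p, f, p)
        diff = h + [0] * (2 - len(h))
        diff[1] = (diff[1] - 1) % p  # diff = x^(p^k) - x
        g = poly_gcd(f, diff, p)  # product of the irreducible factors of degree k
        if len(g) > 1:
            degrees += [k] * ((len(g) - 1) // k)
            f = poly_divmod(f, g, p)[0]
            h = poly_divmod(h, f, p)[1]
    if len(f) > 1:  # whatever remains is a single irreducible factor
        degrees.append(len(f) - 1)
    return sorted(degrees)


F = [-1, -1, 0, 0, 1]  # x^4 - x - 1


def pattern_tally(bound):
    """Count primes p <= bound (p != 283) by the factorization pattern of F mod p."""
    counts = {}
    for p in range(2, bound + 1):
        if p != 283 and all(p % d for d in range(2, int(p**0.5) + 1)):
            key = tuple(factor_degrees(F, p))
            counts[key] = counts.get(key, 0) + 1
    return counts
```

Calling `factor_degrees(F, 7)` and `factor_degrees(F, 101)` recovers the patterns 1,3 and 2,2 displayed above, and `pattern_tally(N)` for growing N gives empirical frequencies that can be compared with the table.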

For example, the prime 7 falls into the set $C_{1,3}$ of primes p for which f factors mod p
as a linear polynomial times an irreducible cubic, while $47 \in C_4$ and $101 \in C_{2,2}$.
Correspondingly, Frobenius’ theorem says that the number
of automorphisms σ ∈ G = Gal(K/Q) with cycle type 4 is |G|/4; likewise, the number
of σ with cycle type 1,3 is |G|/3; the number with cycle type 2,2 is |G|/8; and so forth.
In every case, the identity automorphism is the only element of G with cycle type
1,1,1,1, which tells us that |G| = 24 and we can go back and compute the number of
elements of each cycle type accordingly.
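Since |G| = 24 and G acts on the four roots of f, G is the full symmetric group $S_4$, and the proportions in the table are exactly the proportions of each cycle type in $S_4$. The following sketch (illustrative names, brute force over all permutations) confirms this count.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations


def cycle_type(perm):
    """Cycle type of a permutation of 0..n-1, as a decreasing tuple of lengths."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start not in seen:
            length, j = 0, start
            while j not in seen:
                seen.add(j)
                j = perm[j]
                length += 1
            lengths.append(length)
    return tuple(sorted(lengths, reverse=True))


def cycle_type_proportions(n):
    """Proportion of elements of S_n having each cycle type."""
    counts = Counter(cycle_type(p) for p in permutations(range(n)))
    total = sum(counts.values())
    return {t: Fraction(c, total) for t, c in counts.items()}
```

`cycle_type_proportions(4)` assigns 1/4, 1/3, 1/8, 1/4 and 1/24 to the types (4), (3,1), (2,2), (2,1,1) and (1,1,1,1), in agreement with the table.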

So far we have seen that for a field K/Q, classes of primes are in a certain cor-

respondence with the various cycle types of elements of the Galois group of this

extension. The natural question arising from this discussion is: given a polynomial

f and a prime p that doesn’t divide D(f), is it possible to find, in some canonical

way, an element in G with the same cycle type as the decomposition of f over Fp?

This would successfully generalize both Dirichlet’s and Frobenius’ results, and in-

deed Frobenius conjectured that it was possible. The solution was finally found by

Cebotarev after 42 years in the form of his density theorem.

For the next few theorems, we will assume K is a number field and E is a normal,

not necessarily abelian, extension of K, with Galois group G = Gal(E/K).

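Let $\mathfrak{m}$ be a modulus divisible by sufficiently high powers of all the primes of K
which ramify in E. Then the group $H^{\mathfrak{m}}(E/K) := N_{E/K}(I_E^{\mathfrak{m}})\, i(K_{\mathfrak{m},1})$ is a congruence
subgroup for $\mathfrak{m}$ and so the Existence Theorem tells us there is a (unique) abelian
extension L/K that is class field for $H^{\mathfrak{m}}(E/K)$. We may ‘enlarge’ $\mathfrak{m}$ by forming a
modulus $\mathfrak{n}$ such that $\mathfrak{m} \mid \mathfrak{n}$ and $N_{E/K}(I_E^{\mathfrak{n}}) \subseteq H^{\mathfrak{n}}(L/K)$. By Corollary 2.7.10, L ⊂ E
so we may as well use $\mathfrak{m}$ after all. This tells us that $H^{\mathfrak{m}}(E/K) = H^{\mathfrak{m}}(L/K)$ and
moreover,

$I_K^{\mathfrak{m}}/H^{\mathfrak{m}}(E/K) = I_K^{\mathfrak{m}}/H^{\mathfrak{m}}(L/K) \cong \mathrm{Gal}(L/K).$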

To identify Hm(E/K) with Gal(E/K), we prove the following theorem which also

serves to generalize the Artin map to the non-abelian case.

Theorem 2.10.2. L is the largest abelian subfield of E and therefore Gal(L/K) ∼=

G/G′ where G′ denotes the commutator subgroup of G.

Proof. First suppose L ⊂M ⊂ E where M/K is abelian. By norm properties,

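$N_{E/K}(I_E^{\mathfrak{m}})\, i(K_{\mathfrak{m},1}) \subseteq N_{M/K}(I_M^{\mathfrak{m}})\, i(K_{\mathfrak{m},1}) \subseteq N_{L/K}(I_L^{\mathfrak{m}})\, i(K_{\mathfrak{m},1})$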

but we showed that the first and last are equal, so it follows that L = M since both

are abelian. Now this tells us by the classification theorem that Gal(L/K) is the

largest possible quotient of G that is abelian. By definition this is the abelianization

of G, so Gal(L/K) ∼= G/G′.

To describe the isomorphism, let $\mathfrak{P}$ be a prime in $I_E^{\mathfrak{m}}$ and let $\mathfrak{p} = \mathfrak{P} \cap K$. By the
proof of Theorem 1.3.3, the primes lying over $\mathfrak{p}$ are Galois conjugates under the action
of G and therefore $\mathfrak{p}$ determines a conjugacy class of the Frobenius automorphism
$\mathrm{Frob}_{E/K}(\mathfrak{P})$. This means that $\mathfrak{p}$ determines a single element in G/G′. We define the
Artin map for non-abelian extensions to be

$$\varphi_{E/K}(\mathfrak{p}) := \left(\frac{E/K}{\mathfrak{P}}\right) G'.$$

By the work above, this extends to a homomorphism ImK → G/G′.

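To complete the description of $\varphi_{E/K}$, we compute its kernel. By Proposition 2.2.2,

$$\left.\left(\frac{E/K}{\mathfrak{P}}\right)\right|_{L} = \left(\frac{L/K}{\mathfrak{P}_L}\right) \quad \text{where } \mathfrak{P}_L = \mathfrak{P} \cap L.$$

Thus $\varphi_{E/K}(\mathfrak{p}) = \varphi_{L/K}(\mathfrak{p})\, G'$ so $\ker\varphi_{L/K} \leq \ker\varphi_{E/K}$. But $\ker\varphi_{L/K} = H^{\mathfrak{m}}(E/K)$
which was shown to have index [G : G′] in $I_K^{\mathfrak{m}}$. Hence $\ker\varphi_{E/K} = H^{\mathfrak{m}}(E/K)$ and our
description is complete.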

Remark. The above proof and discussion shows that [ImK : Hm(E/K)] = [G : G′].

In particular, this means that for a non-abelian extension of number fields the first

fundamental inequality (Theorem 2.5.4) is strict.

As another consequence of the classification theorem, we have the following gen-

eralization of Corollary 2.5.6.

Proposition 2.10.3. Let χ be a nontrivial character of the ray class group
$C_K(\mathfrak{m}) = I_K^{\mathfrak{m}}/P_K(\mathfrak{m}, 1)$. Then $L(1, \chi) \neq 0$.

Proof. Let $H = P_K(\mathfrak{m}, 1)$. Then there is an abelian extension L/K that is the
class field of H; this is called the ray class field for the modulus $\mathfrak{m}$. Note that,
except for a finite number, all the primes of K which split in L are contained in H.
Thus by the Frobenius density theorem the density of this set of primes is $\frac{1}{[L:K]}$. By
the Artin reciprocity theorem, this is equal to $\frac{1}{[I_K^{\mathfrak{m}} : H]}$. Finally, apply the comments
following Theorem 2.4.17 to conclude that $L(1, \chi) \neq 0$ for any nontrivial character of
$I_K^{\mathfrak{m}}/H$.

This can be used to prove the following generalization of Dirichlet’s Theorem.

Theorem 2.10.4 (Dirichlet’s Theorem for Number Fields). Let H be a congruence
subgroup for a modulus $\mathfrak{m}$. Then any coset of H in $I_K^{\mathfrak{m}}$ contains infinitely many primes
and the density of this set of primes is $\frac{1}{[I_K^{\mathfrak{m}} : H]}$.
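In the classical case K = Q, with modulus (m)∞ and H the principal ray subgroup, the cosets of H correspond to the residue classes a mod m with gcd(a, m) = 1, so the theorem specializes to Dirichlet's theorem with density 1/φ(m). The sketch below tallies primes up to a bound by residue class. It counts primes up to a cutoff, which only approximates a density, and the function names are illustrative.

```python
from math import gcd


def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n (for n >= 1)."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n**0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, flag in enumerate(sieve) if flag]


def residue_class_counts(m, bound):
    """Number of primes p <= bound, p not dividing m, in each unit class mod m (m >= 2)."""
    counts = {a: 0 for a in range(1, m) if gcd(a, m) == 1}
    for p in primes_up_to(bound):
        if m % p != 0:
            counts[p % m] += 1
    return counts
```

With m = 4 and bound 100 the two unit classes receive 11 and 13 primes respectively; as the bound grows the counts in every unit class approach the common share 1/φ(m) of all primes.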

We are now ready to state and prove the main theorem of this section.

Cebotarev Density Theorem. Let L/K be a Galois extension of number fields and
suppose an element σ ∈ G = Gal(L/K) belongs to a conjugacy class C. Then the set
S of all primes $\mathfrak{p} \subset \mathcal{O}_K$ divisible by a prime $\mathfrak{P} \subset \mathcal{O}_L$ such that $\mathrm{Frob}_{L/K}(\mathfrak{P}) \in C$ has
density

$$\delta(S) = \frac{|C|}{[L : K]}.$$

Proof. Let E be the subfield of L fixed by the cyclic subgroup 〈σ〉. Then since
Gal(L/E) = 〈σ〉, the extension L/E is abelian. Let T′ be the set of primes $\mathfrak{P} \subset \mathcal{O}_E$
with $\mathrm{Frob}_{L/E}(\mathfrak{P}) = \sigma$. By Theorem 2.10.4, $\delta(T') = \frac{1}{|\langle\sigma\rangle|}$. Recall that Lemma 2.4.16
says we may restrict our attention to the set T of primes in T′ with inertial degree
f(E/K) = 1, since δ(T) = δ(T′).

For any P ∈ T with p = P ∩ K, we will count the number of Pi ∈ T dividing

p. Take Q ⊂ OL lying over P such that FrobL/E(Q) = σ. Let {τi} be a transversal

of 〈σ〉 in Gal(L/K); one will recall that this means 〈σ〉τi are all the distinct cosets of

〈σ〉. By transitivity of the G-action on primes over P, the primes in L dividing p are

τi(Q) and these are distinct. Likewise the primes of E dividing p are Pj := τj(Q)∩E.

It is a property of the Frobenius automorphism (III.2.8 in [15]) that

Pj ∈ T ⇐⇒ 〈σ〉τjσ = 〈σ〉τj.

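So in particular,

$$\mathrm{Frob}_{L/E}(\mathfrak{P}_j) = \left(\frac{L/E}{\tau_j(\mathfrak{Q})}\right) = \tau_j \left(\frac{L/E}{\mathfrak{Q}}\right) \tau_j^{-1} = \tau_j \sigma \tau_j^{-1}.$$

It follows that $\mathfrak{P}_j \in T \iff \tau_j \sigma \tau_j^{-1} = \sigma$. Since the $\tau_j$ and therefore the $\mathfrak{P}_j$ are distinct
(remember that $\{\tau_j\}$ is a transversal of 〈σ〉), the number of primes in T dividing $\mathfrak{p}$ is
equal to $[Z_G(\sigma) : \langle\sigma\rangle]$ where $Z_G(\sigma)$ is the centralizer of σ in G = Gal(L/K).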

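Now let S denote the set of $\mathcal{O}_K$-primes divisible by a prime in T and choose some
$\mathfrak{p} \in S$. There are precisely $[Z_G(\sigma) : \langle\sigma\rangle]$ primes $\mathfrak{P} \in T$ for which $N_{E/K}(\mathfrak{P}) = \mathfrak{p}$. This
implies that $[Z_G(\sigma) : \langle\sigma\rangle] \cdot \delta(S) = \delta(T) = \frac{1}{|\langle\sigma\rangle|}$. Finally, we conclude that

$$\delta(S) = \frac{1}{|\langle\sigma\rangle| \cdot [Z_G(\sigma) : \langle\sigma\rangle]} = \frac{1}{|Z_G(\sigma)|} = \frac{|C|}{|G|} = \frac{|C|}{[L : K]}.$$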

The Cebotarev density theorem immediately gives us the following result for

abelian extensions.

Corollary 2.10.5. Let L/K be abelian, $\mathfrak{m}$ a modulus of K divisible by all primes
that ramify in L, and σ ∈ Gal(L/K). Then the set S of primes $\mathfrak{p} \nmid \mathfrak{m}$ such that
$\left(\frac{L/K}{\mathfrak{p}}\right) = \sigma$ has density $\delta(S) = \frac{1}{[L : K]}$ and in particular S is infinite.

This corollary is similar to the conclusion in the proof of Theorem 2.5.4, and both

density theorems imply the surjectivity of the Artin map (this was originally proven in

Corollary 2.5.2). However, Cebotarev’s result implies surjectivity in a much stronger

sense, in that the density of primes in L is uniformly distributed across the collection of

sets S corresponding to conjugacy classes in G. Recall that with Frobenius’ theorem,

this density was only uniformly distributed across divisions, a much less intuitive

object to work with in the group-theoretic sense.

The Cebotarev density theorem is undoubtedly one of the most useful tools in

modern algebraic number theory, and is beginning to have practical application in

algebraic geometry. One important result for our purposes answers a question posed

back in Section 1.3.

Proposition 2.10.6. For any Galois extension L/K, there are infinitely many primes

of K that split completely in L.

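Proof. Apply the Cebotarev density theorem to the conjugacy class of 1 ∈ Gal(L/K)
to see that the primes $\mathfrak{p} \subset \mathcal{O}_K$ such that $\left(\frac{L/K}{\mathfrak{p}}\right) = 1$ have density $\frac{1}{[L : K]}$. Then
Proposition 1.8.3 says that

$$\left(\frac{L/K}{\mathfrak{p}}\right) = 1 \iff \mathfrak{p} \text{ splits completely in } L.$$

This implies the result.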

Example 2.10.7. To illustrate the differences between conjugacy class, division and

cycle type and their associated densities, consider the group G = Z/3Z×Z/3Z. The

reason is that these three types of partitions are all distinct for G, as we will see in a

moment. To apply the density theorems to G we must find a Galois extension M/Q

such that G = Gal(M/Q). We provide two computational methods of constructing

such an extension below.

The hard way is to find two extensions K/Q and L/Q of degree 3 and take their

compositum. Corollary 14.22 from [10] says if K and L are Galois extensions of

Q and K ∩ L = Q then the Galois group of their compositum is a direct product

Gal(KL/Q) ∼= Gal(K/Q) × Gal(L/Q). There are two concerns: we want M/Q to

be Galois with Gal(M/Q) ∼= Z/3Z× Z/3Z and we also want K and L to be normal

subfields of M .

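[Diagram: lattice of fields. Q lies at the bottom; K = Q(α) and L = Q(β) each have degree 3 over Q; M = Q(α, β) has degree 3 over each of K and L and degree 9 over Q. The fields K, L and M sit inside Q(ζ9), Q(ζ13) and Q(ζ117), respectively.]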

By the Kronecker-Weber Theorem (Section 2.7), we can find all of these abelian
extensions within cyclotomic fields. It is a fact (cf. 14.27 in [10]) that if gcd(m, n) = 1
then $\mathrm{Gal}(\mathbb{Q}(\zeta_{mn})/\mathbb{Q}) \cong \mathrm{Gal}(\mathbb{Q}(\zeta_m)/\mathbb{Q}) \times \mathrm{Gal}(\mathbb{Q}(\zeta_n)/\mathbb{Q})$ where $\zeta_j$ denotes a primitive
jth root of unity. For our purposes we want an integer k = mn such that gcd(m, n) =
1 and 3 divides ϕ(m) and ϕ(n); this way we can find subfields of degree 3.

Along these lines, we chose m = 9 and n = 13. We found subfields K = Q(α)
and L = Q(β), where $\alpha = \zeta_9 + \zeta_9^8$ and $\beta = \zeta_{13} + \zeta_{13}^5 + \zeta_{13}^8 + \zeta_{13}^{12}$. The previous
paragraphs ensure that M = Q(α, β) is a Galois extension of Q with Galois group
Gal(M/Q) ∼= Z/3Z × Z/3Z. All of this is verified with the following Magma code.

> A<x> := PolynomialRing(Integers());

> C<z> := CyclotomicField(9);

> D<w> := CyclotomicField(13);

> f := MinimalPolynomial(z+z^8);

> g := MinimalPolynomial(w+w^5+w^8+w^12);

> K<a> := NumberField(f);

> L<b> := NumberField(g);

> M<c> := Compositum(K,L);

> IsNormal(M);

true

> G := GaloisGroup(M);

> G;

Permutation group G acting on a set of cardinality 9

Order = 9 = 3^2

(1, 5, 6)(2, 3, 8)(4, 7, 9)

(1, 2, 9)(3, 4, 5)(6, 8, 7)

> h<x> := MinimalPolynomial(c);

> h;

x^9 + 3*x^8 - 18*x^7 - 38*x^6 + 93*x^5 + 147*x^4 - 161*x^3 - 201*x^2

+ 57*x + 53
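As a cross-check that does not depend on Magma, the cubic subfields can be examined numerically. The exponent sets {1, 8} and {1, 5, 8, 12} are subgroups of index 3 in $(\mathbb{Z}/9)^\times$ and $(\mathbb{Z}/13)^\times$, so α and β are Gaussian periods with three conjugates each. The sketch below (illustrative names, floating-point arithmetic with rounding) recovers their minimal polynomials from those conjugates. The polynomials f and g are not printed in the Magma session above, so this is an independent computation rather than a reproduction of its output.

```python
import cmath
from math import gcd


def period_conjugates(n, subgroup):
    """One Gaussian period sum_{a in g*subgroup} zeta_n^a per coset of subgroup in (Z/n)^x."""
    units = [a for a in range(1, n) if gcd(a, n) == 1]
    zeta = cmath.exp(2j * cmath.pi / n)
    cosets = {frozenset((g * h) % n for h in subgroup) for g in units}
    return [sum(zeta**a for a in coset) for coset in cosets]


def monic_poly_from_roots(roots):
    """Integer coefficients (high to low) of prod (x - r), rounded from floating point."""
    coeffs = [1]
    for r in roots:
        # Multiply by (x - r): shift up for x, subtract r times the old coefficients.
        coeffs = [c - r * d for c, d in zip(coeffs + [0], [0] + coeffs)]
    return [round(c.real) for c in coeffs]
```

Under these assumptions, `monic_poly_from_roots(period_conjugates(9, [1, 8]))` gives the coefficients of $x^3 - 3x + 1$ and `monic_poly_from_roots(period_conjugates(13, [1, 5, 8, 12]))` gives those of $x^3 + x^2 - 4x + 1$; both are cubics, as expected for fields of degree 3.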

The package galpols provides an easier option for generating a polynomial p ∈

Z[x] such that a splitting field of p over Q has a desired Galois group. For instance,

the code

> load galpols;

> P<x> := PolynomialRing(IntegerRing());

> p := PolynomialWithGaloisGroup(9, 2);

> p;

x^9 - 15*x^7 - 4*x^6 + 54*x^5 + 12*x^4 - 38*x^3 - 9*x^2 + 6*x + 1

returns a polynomial p such that Gal(p) ∼= Z/3Z × Z/3Z — the (9, 2) notation

comes from the numbering scheme for transitive groups created by Butler and McKay

in [4]. The polynomials h and p are different and their splitting fields need not coincide, but
both have the desired Galois group. For the rest of the example we will work with
$h(x) = x^9 + 3x^8 - 18x^7 - 38x^6 + 93x^5 + 147x^4 - 161x^3 - 201x^2 + 57x + 53$.

Consider G = Z/3Z×Z/3Z as above. Since G is abelian, there are nine singleton

conjugacy classes in G. These can be viewed with the ConjugacyClasses(G) intrinsic in

Magma. On the other hand, there are five different divisions and two cycle types in

G. The cycle types are (1) for the identity and (3, 3, 3) for the remaining elements;

the divisions can be viewed with the following code:

> Divisions(G);

[

Id($),

(1, 5, 6)(2, 3, 8)(4, 7, 9),

(1, 3, 7)(2, 4, 6)(5, 8, 9),

(1, 2, 9)(3, 4, 5)(6, 8, 7),

(1, 8, 4)(2, 7, 5)(3, 9, 6)

]
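The count of five divisions can be confirmed by hand or with a few lines of code. The sketch below assumes the usual description of a division in an abelian group: two elements lie in the same division exactly when they generate the same cyclic subgroup. It models G as pairs modulo 3 rather than as the permutation group Magma prints, and it is not the Divisions(G) function of Appendix A.4.

```python
from itertools import product


def cyclic_subgroup(g, n=3):
    """The cyclic subgroup of Z/n x Z/n generated by g, as a frozenset of pairs."""
    elems, x = set(), (0, 0)
    while True:
        x = ((x[0] + g[0]) % n, (x[1] + g[1]) % n)
        elems.add(x)
        if x == (0, 0):
            break
    return frozenset(elems)


def divisions(n=3):
    """Partition Z/n x Z/n into divisions: elements generating the same cyclic subgroup."""
    groups = {}
    for g in product(range(n), repeat=2):
        groups.setdefault(cyclic_subgroup(g, n), []).append(g)
    return list(groups.values())
```

For n = 3 this yields one division containing only the identity and four divisions of size two, one for each subgroup of order 3, matching the five representatives listed above.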

(For the code defining the function Divisions(G), see Appendix A.4.) Now the
next three tables display the distributions of primes p ≤ 10,000 whose Frobenius
elements occur among the different divisions, cycle types and conjugacy classes of G,
where G is identified with Gal(M/Q) for M defined above.

> frobTal := FrobeniusTally(h,10000);

> divTally := DivisionTally(frobTal);

> divTally;

[

<Id($), 126>,

<(1, 5, 6)(2, 3, 8)(4, 7, 9), 272>,

<(1, 3, 7)(2, 4, 6)(5, 8, 9), 277>,

<(1, 2, 9)(3, 4, 5)(6, 8, 7), 273>,

<(1, 8, 4)(2, 7, 5)(3, 9, 6), 277>

]

> cycTally := CycleTypeTally(frobTal);

> cycTally;

[

<[ <1, 9> ], 126>,

<[ <3, 3> ], 1099>

]

> ccTally := ConjClassTally(frobTal);

> ccTally;

[

<Id($), 126>,

<(1, 5, 6)(2, 3, 8)(4, 7, 9), 135>,

<(1, 6, 5)(2, 8, 3)(4, 9, 7), 137>,

<(1, 2, 9)(3, 4, 5)(6, 8, 7), 137>,

<(1, 9, 2)(3, 5, 4)(6, 7, 8), 136>,

<(1, 3, 7)(2, 4, 6)(5, 8, 9), 139>,

<(1, 7, 3)(2, 6, 4)(5, 9, 8), 138>,

<(1, 8, 4)(2, 7, 5)(3, 9, 6), 143>,

<(1, 4, 8)(2, 5, 7)(3, 6, 9), 134>

]

(These functions can also be found in Appendix A.4.) Notice that the distribution is essentially uniform across each of the three types of partitions of G; that is, the proportion of primes whose Frobenius element lies in a given block of a partition is approximately proportional to the size of that block.
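The proportionality can be checked directly from the numbers printed above. The following Python sketch only re-uses the tallies from the Magma output; the block sizes (one identity division and four divisions of the form {g, g^-1}) are read off from the structure of Z/3Z × Z/3Z, and the helper name `compare` is ours.

```python
# Tallies copied from the Magma output above (primes p <= 10,000).
division_tally = [126, 272, 277, 273, 277]
division_sizes = [1, 2, 2, 2, 2]      # identity, then four divisions {g, g^-1}
class_tally = [126, 135, 137, 137, 136, 139, 138, 143, 134]
class_sizes = [1] * 9                 # G is abelian: nine singleton classes


def compare(tally, sizes):
    """Pair each observed frequency with the predicted frequency |block| / |G|."""
    total, order = sum(tally), sum(sizes)
    return [(count / total, size / order) for count, size in zip(tally, sizes)]


for name, tally, sizes in [("division", division_tally, division_sizes),
                           ("conjugacy class", class_tally, class_sizes)]:
    for observed, expected in compare(tally, sizes):
        print(f"{name}: observed {observed:.4f}, expected {expected:.4f}")
```

Each observed frequency should sit close to 1/9 or 2/9, as the Cebotarev density theorem predicts.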

2.11 Ring Class Fields

In the final section of Chapter 2, we will utilize class field theory to construct an extension of an imaginary quadratic field that corresponds to an order O, generalizing the Hilbert class field from Section 1.8. We will use this extension to prove a characterization theorem for when a prime has the form $x^2 + ny^2$, finally answering our motivating question.

Let K be a number field. An ideal m ⊂ OK can be viewed as a modulus of K.

We will usually be working with principal ideals αOK , in which case we will denote

the group of fractional ideals derived from the modulus (α) by IK(α), with principal

subgroup $P_K(\alpha, 1)$. From Theorem 1.9.13, we know the class group for an order O is

$$C(O) = I(O)/P(O) \cong I_K(f)/P_{K,\mathbb{Z}}(f),$$

where f is the conductor of O in $O_K$. Then $P_{K,\mathbb{Z}}(f)$ is a congruence subgroup for the modulus $fO_K$:

$$P_K(f, 1) \le P_{K,\mathbb{Z}}(f) \le I_K(f),$$

so C(O) is a generalized ideal class group for K corresponding to the modulus $fO_K$. The existence theorem (Section 2.9) then says that there is a unique abelian extension L/K whose Artin map for the modulus $fO_K$ has kernel $P_{K,\mathbb{Z}}(f)$, so that Gal(L/K) ≅ C(O).

Definition. For an order O in a number field K, the unique abelian extension L ⊃ K

satisfying Gal(L/K) ∼= C(O) is called the ring class field of the order O.

Some authors ([5] for example) denote a ring class field by KO. It is clear from the

classification theorem that the ring class field of the maximal order OK is precisely the


Hilbert class field of K. We will see that ring class fields are a useful generalization

of the Hilbert class field in many ways.

On the group theory side of things, we have the following characterization of the

Galois group of a ring class field.

Lemma 2.11.1. Let L be the ring class field of the order O in an imaginary quadratic

field K. Then L/Q is Galois and its Galois group can be written as a semidirect

product

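$$\mathrm{Gal}(L/\mathbb{Q}) \cong \mathrm{Gal}(L/K) \rtimes (\mathbb{Z}/2\mathbb{Z}),$$

where the nontrivial element of $\mathbb{Z}/2\mathbb{Z}$ acts on $\mathrm{Gal}(L/K)$ via $\sigma \mapsto \sigma^{-1}$.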

Proof. See [7] section 6 and Lemma 9.3.

As we did with the Hilbert class field, we begin by relating a prime $p = x^2 + ny^2$ to its splitting behavior in the ring class field of $\mathbb{Z}[\sqrt{-n}]$.

Theorem 2.11.2. Fix $n \in \mathbb{N}$, let $K = \mathbb{Q}(\sqrt{-n})$ and let L be the ring class field of the order $\mathbb{Z}[\sqrt{-n}]$ in K. If p is an odd prime not dividing n, then

$$p = x^2 + ny^2 \iff p \text{ splits completely in } L.$$

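Proof. Let $O = \mathbb{Z}[\sqrt{-n}]$ and denote its conductor by f. The discriminant of O is $D = -4n$, so we know from Section 1.9 that $-4n = f^2 d_K$, where $d_K$ is the discriminant of K. If $p \nmid n$ is an odd prime, then $p \nmid f^2 d_K$ and so by Corollary 1.6.13, p is unramified in K. As with the analogous Theorem 1.8.8, we prove the equivalence of the following statements:

$$\begin{aligned}
&p = x^2 + ny^2 && \text{(i)}\\
\iff\ & pO_K = \mathfrak{p}\bar{\mathfrak{p}},\ \mathfrak{p} \ne \bar{\mathfrak{p}} \text{ and } \mathfrak{p} = \alpha O_K \text{ for some } \alpha \in O && \text{(ii)}\\
\iff\ & pO_K = \mathfrak{p}\bar{\mathfrak{p}},\ \mathfrak{p} \ne \bar{\mathfrak{p}} \text{ and } \mathfrak{p} \in P_{K,\mathbb{Z}}(f) && \text{(iii)}\\
\iff\ & pO_K = \mathfrak{p}\bar{\mathfrak{p}},\ \mathfrak{p} \ne \bar{\mathfrak{p}} \text{ and } \left(\frac{L/K}{\mathfrak{p}}\right) = 1 && \text{(iv)}\\
\iff\ & pO_K = \mathfrak{p}\bar{\mathfrak{p}},\ \mathfrak{p} \ne \bar{\mathfrak{p}} \text{ and } \mathfrak{p} \text{ splits completely in } L && \text{(v)}\\
\iff\ & p \text{ splits completely in } L. && \text{(vi)}
\end{aligned}$$

(i) ⟺ (ii) Suppose $p = x^2 + ny^2 = (x + \sqrt{-n}\,y)(x - \sqrt{-n}\,y)$. Let $\mathfrak{p} = (x + \sqrt{-n}\,y)O_K$, so that $pO_K = \mathfrak{p}\bar{\mathfrak{p}}$ is the prime factorization of p in $O_K$. Since p is unramified in K, $\mathfrak{p} \ne \bar{\mathfrak{p}}$. Also note that $x + \sqrt{-n}\,y \in \mathbb{Z}[\sqrt{-n}]$. This entire argument is reversible, as in the proof of Theorem 1.8.8.

(ii) ⟺ (iii) follows from Theorem 1.9.13.

(iii) ⟺ (iv) ⟺ (v) Note that

$$I_K(f)/P_{K,\mathbb{Z}}(f) = C(O) \cong \mathrm{Gal}(L/K),$$

where the isomorphism is the Artin map $\varphi_{L/K}$. This shows that $\mathfrak{p} \in P_{K,\mathbb{Z}}(f)$ if and only if $\left(\frac{L/K}{\mathfrak{p}}\right) = 1$, and Proposition 1.8.3 further implies that $\left(\frac{L/K}{\mathfrak{p}}\right) = 1$ if and only if $\mathfrak{p}$ splits completely in L.

(v) ⟺ (vi) Finally, Lemma 2.11.1 shows that L is Galois over Q and so, as in the proof of Theorem 1.8.8, p splits completely in L if and only if p splits completely in K and some prime lying over p (e.g. $\mathfrak{p}$) splits completely in L. This proves all equivalences and hence the theorem.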

We finally arrive at the main characterization theorem for primes of the form $x^2 + ny^2$. This argument, due to Cox [7], successfully generalizes Theorem 1.8.7 to all positive integers n.

Theorem 2.11.3. For every integer $n > 0$, there is a monic irreducible polynomial $f_n(x)$ of degree $h(-4n)$ with integer coefficients such that for every odd prime p dividing neither n nor the discriminant of $f_n$,

$$p = x^2 + ny^2 \iff \left(\frac{-n}{p}\right) = 1 \text{ and } f_n(x) \equiv 0 \pmod{p} \text{ for some } x \in \mathbb{Z}.$$

Furthermore, any such choice of $f_n(x)$ will be the minimal polynomial of a real algebraic integer α for which L = K(α) is the ring class field of the order $\mathbb{Z}[\sqrt{-n}]$ in $K = \mathbb{Q}(\sqrt{-n})$.

Proof. As in the proof of Theorem 1.8.7, knowing L/Q is Galois allows us to pick a real algebraic integer α that generates L/K, that is, L = K(α). Let $f_n(x)$ be the minimal polynomial of α over K. By definition such a polynomial is monic, irreducible and has integer coefficients. Moreover, $f_n$ must have degree $[L : K] = h(O) = h(-4n)$.

Let p be an odd prime not dividing n or the discriminant of $f_n$. Then $f_n$ is separable mod p, and p splits completely in K if and only if $\left(\frac{-n}{p}\right) = 1$. We may assume p splits completely in K, which means $O_K/\mathfrak{p} \cong \mathbb{Z}/p\mathbb{Z}$ for any $O_K$-prime $\mathfrak{p}$ lying over p. Since $f_n$ is separable over $\mathbb{Z}/p\mathbb{Z}$, it is also separable over $O_K/\mathfrak{p}$. Hence Theorem 1.3.4 shows that

$$\begin{aligned}
\mathfrak{p} \text{ splits completely in } L &\iff f_n(x) \equiv 0 \bmod \mathfrak{p} \text{ has a solution in } O_K\\
&\iff f_n(x) \equiv 0 \bmod p \text{ has a solution in } \mathbb{Z}.
\end{aligned}$$

The main equivalence follows from Theorem 2.11.2.

To address the final claim about $f_n(x)$, note that there are infinitely many choices of such a polynomial since there are infinitely many primitive elements of the extension L/K. We want to prove that the possible $f_n(x)$'s that arise are exactly those which are the minimal polynomials of primitive elements of L/K. Let f be a monic integral polynomial of degree $h(-4n)$ satisfying the main equivalence of the theorem. Let $g \in K[x]$ be an irreducible factor of f(x) and let M = K(α) where α is a root of g. Note that if we knew L ⊂ M, then

$$h(-4n) = [L : K] \le [M : K] = \deg g \le \deg f = h(-4n).$$

Therefore if L ⊂ M then we would be able to conclude that L = K(α) and f is the minimal polynomial of α over K. To verify L ⊂ M we need the next lemma which, once established, will allow us to finish the proof of Theorem 2.11.3.

To borrow the notation in [7], given two sets S and T, we will write $S \mathrel{\dot{\subset}} T$ if S is contained in T except for a finite number of elements. We will apply this in the next lemma to the set $S_{L/K} = \{\mathfrak{p} \subset O_K \mid \mathfrak{p} \text{ is prime and splits completely in } L\}$.

Lemma 2.11.4. Let L and M be Galois extensions of a number field K and define

$$S = S_{L/\mathbb{Q}} = \{p \in \mathbb{Z} \text{ prime} \mid p \text{ splits completely in } L\}$$

and $T = \{p \in \mathbb{Z} \text{ prime} \mid p \text{ is unramified in } L,\ f(\mathfrak{p} \mid p) = 1 \text{ for some } \mathfrak{p} \subset O_M\}$. Then $L \subset M \iff T \mathrel{\dot{\subset}} S$.

Proof. First, if L ⊂ M then T is clearly a subset of S. Conversely, suppose $T \mathrel{\dot{\subset}} S$. Let N be a Galois extension of K containing both L and M as subfields. By the fundamental theorem of Galois theory, it will suffice to show that Gal(N/M) ≤ Gal(N/L).

Take any σ ∈ Gal(N/M); we will show that σ restricts to the identity on L. By the Cebotarev density theorem, there exists an $O_K$-prime $\mathfrak{p}$ that is unramified in N for which $\left(\frac{N/K}{\mathfrak{p}}\right)$ is the conjugacy class of σ. (Recall from Section 2.10 that when N/K is non-abelian, the Artin symbol describes a conjugacy class of the Galois group.) Thus for some $\mathfrak{P} \subset O_N$ lying over $\mathfrak{p}$, $\left(\frac{N/K}{\mathfrak{P}}\right) = \sigma$. Define $\mathfrak{Q} = \mathfrak{P} \cap O_M$. Then for any $\alpha \in O_M$,

$$\alpha \equiv \sigma(\alpha) \equiv \alpha^{N(\mathfrak{p})} \bmod \mathfrak{Q}$$

by definition of the Artin map (and the fact that σ ∈ Gal(N/M)). This shows that $O_M/\mathfrak{Q} \cong O_K/\mathfrak{p}$ so $f(\mathfrak{Q} \mid \mathfrak{p}) = 1$, which further implies that $\mathfrak{p} \in T$. In fact, the Cebotarev density theorem guarantees that there are infinitely many of these primes $\mathfrak{p}$ and since we assumed $T \mathrel{\dot{\subset}} S$, we may therefore assume $\mathfrak{p}$ is one of the primes of T which lies in S. Now this means $\left(\frac{L/K}{\mathfrak{p}}\right) = 1$ and by Proposition 2.2.2,

$$1 = \left(\frac{L/K}{\mathfrak{p}}\right) = \left.\left(\frac{N/K}{\mathfrak{P}}\right)\right|_L = \sigma|_L.$$

Hence σ ∈ Gal(N/L) and the lemma is proved.

To finish the proof of Theorem 2.11.3, let L, M and K be as described previously. Define $S = S_{L/\mathbb{Q}}$ and T as in Lemma 2.11.4. By Theorem 2.11.2, S is (apart from primes dividing 2n) exactly the set of primes $p = x^2 + ny^2$. Since f is assumed to satisfy the main equivalence in Theorem 2.11.3, S contains, with finitely many exceptions, the primes p which split completely in K and for which f(x) ≡ 0 is solvable mod p. If p ∈ T, there is some prime $\mathfrak{P} \subset O_M$ such that $f(\mathfrak{P} \mid p) = 1$. Let $\mathfrak{p} = \mathfrak{P} \cap O_K$ so that by properties of inertial degree,

$$1 = f(\mathfrak{P} \mid p) = f(\mathfrak{P} \mid \mathfrak{p})\, f(\mathfrak{p} \mid p) \implies f(\mathfrak{p} \mid p) = 1.$$

Thus p splits completely in K.

Let $\alpha \in O_M$ be the root of g chosen above. Then since g(α) = f(α) = 0, f(x) ≡ 0 mod $\mathfrak{P}$ has a solution. However, $f(\mathfrak{P} \mid p) = 1$ implies that $\mathbb{Z}/p\mathbb{Z} \cong O_M/\mathfrak{P}$ and so f(x) ≡ 0 (mod p) has a solution in the integers. By definition this means p is in S, which proves S contains T with finitely many exceptions. Applying Lemma 2.11.4 shows that L ⊂ M, which completes the proof of Theorem 2.11.3.
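To see the shape of Theorem 2.11.3 in a concrete case, the following Python sketch tests the criterion for n = 14 by brute force. The polynomial $(x^2+1)^2 - 8$ is not derived in this text; it is the standard example for n = 14 from Cox [7], and we take it as an assumption here. The function names are ours.

```python
from math import isqrt


def is_prime(m):
    """Trial division; adequate for the small range used here."""
    return m >= 2 and all(m % d for d in range(2, isqrt(m) + 1))


def is_represented(p, n):
    """True if p = x^2 + n*y^2 for some integers x, y (exhaustive search)."""
    for y in range(isqrt(p // n) + 1):
        rest = p - n * y * y
        if isqrt(rest) ** 2 == rest:
            return True
    return False


def criterion_14(p):
    """(-14/p) = 1 and (x^2 + 1)^2 - 8 has a root mod p (p odd, p not dividing 14)."""
    minus_n_is_square = any((x * x + 14) % p == 0 for x in range(p))
    f_has_root = any(((x * x + 1) ** 2 - 8) % p == 0 for x in range(p))
    return minus_n_is_square and f_has_root


# Odd primes below 300, excluding 7 (which divides n and the discriminant).
primes = [p for p in range(3, 300) if is_prime(p) and p != 7]
print([p for p in primes if is_represented(p, 14)])
```

The degree of the polynomial is 4, which agrees with h(−56) = 4.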

Let's pause for a moment to see how far we have come. Beginning with Example 1.3.7, where we proved Fermat's theorem on primes of the form $x^2 + y^2$, we utilized the properties of the Hilbert class field to characterize primes of the form $x^2 + ny^2$ for infinitely many n; this was Theorem 1.8.7. In order to answer the $x^2 + ny^2$ question for all positive integers n, we needed the full force of class field theory, notably Cebotarev's density theorem, culminating in Theorem 2.11.3. However, both theorems have the same weakness: they do not provide a method for producing the primitive element α of the ring class field L over $K = \mathbb{Q}(\sqrt{-n})$.

It turns out that there is an element j(O), called the j-invariant of the order O, that generates L/K where L is the ring class field of O. Its defining characteristics are described in the so-called First Fundamental Theorem of Complex Multiplication:

Theorem 2.11.5. Let O be an order in an imaginary quadratic field K.

(1) For any proper fractional O-ideal $\mathfrak{a}$, $j(\mathfrak{a})$ is an algebraic integer.

(2) For any proper fractional O-ideal $\mathfrak{a}$, $K(j(\mathfrak{a}))$ is the ring class field of O.

(3) For any two proper fractional O-ideals $\mathfrak{a}, \mathfrak{b}$, $j(\mathfrak{a})$ and $j(\mathfrak{b})$ are conjugate and therefore they are all roots of a single irreducible polynomial $H_O(x) \in \mathbb{Q}[x]$ which satisfies

$$H_O(x) = \prod_{i=1}^{h(O)} \bigl(x - j(\mathfrak{a}_i)\bigr),$$

where h(O) is the class number of O and the $\mathfrak{a}_i$ are distinct representatives of the class group for O.

(4) The equation $H_O(x) = 0$ is called the class equation for O and there exists an algorithm for computing the class equation.

The First Fundamental Theorem of CM usually refers to (1) and (2). The details

of (3) and (4), i.e. the computation of the minimal polynomial of j(O), can be

found in [7]. In practice, it is rather difficult to compute HO(x) but there have been

significant results in recent years (surveyed in [7]) that make it easier to compute in

special cases.


Chapter 3: Quadratic Forms and n-Fermat Primes

The main focus in the first two chapters was on developing the tools necessary for answering the question “Given a natural number n and a prime p, when does $p = x^2 + ny^2$ have a solution in integers x and y?” The object $x^2 + ny^2$ is an example of a quadratic form. In this chapter we will further explore the theory of quadratic forms and then prove several results about the special case $x^2 + ny^2$. Finally, in Section 3.3 we define a symmetric n-Fermat prime to be a prime $x^2 + ny^2$ such that $y^2 + nx^2$ is also prime and describe the distribution of such primes for various values of n.

3.1 The Theory of Binary Quadratic Forms

There is a rich history of the study of quadratic forms dating back at least to Fermat.

Some of the greatest mathematical minds, from Euler and Gauss to Legendre and

Lagrange, contributed to the theory which we survey here.

Definition. A binary quadratic form is a function $f(x, y) = ax^2 + bxy + cy^2$ where a, b and c are integers.

Fermat was one of the earliest mathematicians to study binary quadratic forms

[7]. His motivation was the study and proof of such theorems as

Theorem 3.1.1 (Fermat). Let p be an odd prime.

(i) $p = x^2 + y^2$ for some $x, y \in \mathbb{Z} \iff p \equiv 1 \pmod 4$.

(ii) $p = x^2 + 2y^2$ for some $x, y \in \mathbb{Z} \iff p \equiv 1, 3 \pmod 8$.

(iii) $p = x^2 + 3y^2$ for some $x, y \in \mathbb{Z} \iff p = 3$ or $p \equiv 1 \pmod 3$.
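The three statements of Theorem 3.1.1 are easy to test numerically. The Python sketch below is only a sanity check over odd primes below 500, not a proof; the helper names are ours.

```python
from math import isqrt


def is_prime(m):
    """Trial division; adequate for the small range used here."""
    return m >= 2 and all(m % d for d in range(2, isqrt(m) + 1))


def is_represented(p, n):
    """True if p = x^2 + n*y^2 for some integers x, y (exhaustive search)."""
    for y in range(isqrt(p // n) + 1):
        rest = p - n * y * y
        if isqrt(rest) ** 2 == rest:
            return True
    return False


# Congruence conditions claimed in parts (i), (ii), (iii), indexed by n.
claims = {
    1: lambda p: p % 4 == 1,
    2: lambda p: p % 8 in (1, 3),
    3: lambda p: p == 3 or p % 3 == 1,
}


def mismatches(n, bound=500):
    """Odd primes below `bound` where representability and the congruence disagree."""
    return [p for p in range(3, bound)
            if is_prime(p) and is_represented(p, n) != claims[n](p)]


for n in claims:
    print(n, mismatches(n))
```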


Euler was able to prove more complicated formulas of this flavor using his two-step

Descent-Reciprocity method (explained in detail in chapter 1 of [7]) which ultimately

evolved into Gauss’s cherished quadratic reciprocity. We have proven (i) ourselves in

Example 1.3.7 and (ii) and (iii) are easy consequences of Theorem 2.11.3 so we have

already done a lot of work on the easiest types of these problems.

Definition. A form $f(x, y) = ax^2 + bxy + cy^2$ is primitive if gcd(a, b, c) = 1.

Since any binary quadratic form is a multiple of a primitive one, we will implicitly

assume any form we are working with is primitive.

Definition. A form f(x, y) represents an integer k if there exist integers x and y such that f(x, y) = k. Further, f(x, y) properly represents k if x and y may be chosen such that gcd(x, y) = 1.

In the theory of quadratic forms, there is a crucial idea of equivalence called proper

equivalence, which we define here:

Definition. Two forms f(x, y) and g(x, y) are properly equivalent if there is an

invertible matrix P ∈ SL2(Z) such that f(x) = g(Px).

It is easy to see that proper equivalence is an equivalence relation on the set of

binary quadratic forms and furthermore, that properly equivalent forms represent the

same integers.

Example 3.1.2. Let $f(x, y) = ax^2 + bxy + cy^2$ and take any integer n. Note that the matrix $T = \begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix}$ has determinant 1 and therefore $T \in SL_2(\mathbb{Z})$. Consider

$$\begin{aligned}
f(T\mathbf{x}) &= f(x + ny, y)\\
&= a(x^2 + 2nxy + n^2y^2) + b(x + ny)y + cy^2\\
&= ax^2 + (b + 2an)xy + (an^2 + bn + c)y^2.
\end{aligned}$$

Therefore f(x, y) is properly equivalent to $ax^2 + (b + 2an)xy + (an^2 + bn + c)y^2$ for any $n \in \mathbb{Z}$.
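The substitution in Example 3.1.2 can be checked mechanically. In the Python sketch below a form is stored as a coefficient triple (a, b, c); the function names are ours, and the sample form is chosen only for illustration.

```python
def evaluate(form, x, y):
    """Value of a*x^2 + b*x*y + c*y^2."""
    a, b, c = form
    return a * x * x + b * x * y + c * y * y


def shift(form, n):
    """Coefficients of f(x + n*y, y), as computed in Example 3.1.2."""
    a, b, c = form
    return (a, b + 2 * a * n, a * n * n + b * n + c)


def discriminant(form):
    a, b, c = form
    return b * b - 4 * a * c


f = (2, 3, 6)
g = shift(f, 5)
print(g, discriminant(f), discriminant(g))
```

Both forms have the same discriminant, as expected of properly equivalent forms.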

Lemma 3.1.3. A form f(x, y) properly represents $k \in \mathbb{Z}$ if and only if f(x, y) is properly equivalent to $kx^2 + b'xy + c'y^2$ for some $b', c' \in \mathbb{Z}$.

Proof. (⟹) Let $f(x, y) = ax^2 + bxy + cy^2$ and suppose k = f(p, q) for relatively prime integers p, q. Then there exist integers r, s such that ps − qr = 1. Set $P = \begin{pmatrix} p & r \\ q & s \end{pmatrix}$ and notice that det P = ps − qr = 1 so $P \in SL_2(\mathbb{Z})$. Then writing $\mathbf{x}^T = (x \ \ y)$ we have

$$\begin{aligned}
f(P\mathbf{x}) &= f(px + ry, qx + sy)\\
&= a(px + ry)^2 + b(px + ry)(qx + sy) + c(qx + sy)^2\\
&= f(p, q)x^2 + (2apr + bps + bqr + 2cqs)xy + f(r, s)y^2,
\end{aligned}$$

which is of the form $kx^2 + b'xy + c'y^2$.

(⟸) If f is properly equivalent to $g(x, y) = kx^2 + b'xy + c'y^2$ then they properly represent the same integers. Notice that g(1, 0) = k so g properly represents k and therefore so does f.
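The forward direction of Lemma 3.1.3 is constructive: the integers r, s come from the extended Euclidean algorithm. The Python sketch below carries this out with the substitution (x, y) ↦ (px + ry, qx + sy), which has determinant ps − qr = 1; the function names and the sample form are ours.

```python
def ext_gcd(a, b):
    """Return (g, u, v) with u*a + v*b == g, where |g| = gcd(a, b)."""
    if b == 0:
        return (a, 1, 0)
    g, u, v = ext_gcd(b, a % b)
    return (g, v, u - (a // b) * v)


def move_to_leading_coefficient(form, p, q):
    """Properly equivalent form whose x^2-coefficient is f(p, q), for coprime p, q."""
    a, b, c = form
    g, u, v = ext_gcd(p, q)
    if abs(g) != 1:
        raise ValueError("p and q must be relatively prime")
    s, r = u * g, -v * g          # then p*s - q*r = g*(u*p + v*q) = 1
    k = a * p * p + b * p * q + c * q * q
    b_new = 2 * a * p * r + b * (p * s + q * r) + 2 * c * q * s
    c_new = a * r * r + b * r * s + c * s * s
    return (k, b_new, c_new)


f = (2, 3, 6)                     # f(1, 1) = 11
print(move_to_leading_coefficient(f, 1, 1))
```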

Definition. The discriminant of a binary quadratic form $ax^2 + bxy + cy^2$ is $D = b^2 - 4ac$.

This is not to be confused with the discriminant of an ideal (Section 1.6) or an

order (Section 1.9). We will see in Section 3.2 that there is a close connection between

quadratic forms and orders in imaginary quadratic fields and the multiple notions of

discriminant will actually coincide in the end.

It’s easy to prove that properly equivalent forms have the same discriminant.

Moreover, the second half of the proof of Lemma 3.1.3 actually shows that every


integer is properly represented by some quadratic form, so the proper equivalence on

forms corresponds to a partition of Z.

If D > 0 is the discriminant of f(x, y) then f represents some positive and negative

integers, but if D < 0, the integers represented by f are either all positive or all

negative. Accordingly, we define

Definition. Let f(x, y) be a binary quadratic form of discriminant D. If D < 0 we

say f is positive definite or negative definite according to the sign of the integers

f represents. If D > 0 we say f is indefinite.

Proposition 3.1.4. Let $f(x, y) = ax^2 + bxy + cy^2$ be a primitive form.

(i) For every prime p, one of f(1, 0), f(0, 1), f(1, 1) is relatively prime to p.

(ii) For every integer M , f(x, y) properly represents an integer relatively prime

to M .

Proof. (i) If p does not divide f(1, 0) = a or does not divide f(0, 1) = c, we are done. Otherwise p | a and p | c, so f(1, 1) = pa′ + b + pc′ where a = pa′ and c = pc′. Since f(x, y) is primitive, gcd(a, b, c) = 1 so p cannot divide b and therefore $p \nmid f(1, 1)$.

(ii) Let M be given. For each prime pi in the prime factorization of M , part (i)

says that one of f(1, 0), f(0, 1), f(1, 1) represents a number that is relatively prime

to pi. We will prove the case where M = p1p2 and then induction on the number of

prime factors will finish the proof of (ii).

Let $k_1$ and $k_2$ be integers such that $p_1 \nmid k_1$ and $p_2 \nmid k_2$. By (i), we may suppose f(x, y) represents $k_1$ (mod $p_1$) via $f(x_1, y_1)$ and it represents $k_2$ (mod $p_2$) via $f(x_2, y_2)$ for some $x_1, x_2, y_1, y_2 \in \mathbb{Z}$. By the Chinese remainder theorem, let K be the unique integer modulo $p_1p_2$ satisfying

$$K \equiv k_1 \pmod{p_1}, \qquad K \equiv k_2 \pmod{p_2}.$$

Also using the Chinese remainder theorem, define A and B to be the unique solutions, modulo $p_1p_2$, to

$$\begin{aligned}
A &\equiv 1 \pmod{p_1}, & B &\equiv 0 \pmod{p_1},\\
A &\equiv 0 \pmod{p_2}, & B &\equiv 1 \pmod{p_2}.
\end{aligned}$$

Then $K \equiv Ak_1 + Bk_2 \pmod{p_1p_2}$. In other words, K is the inverse image of $(k_1, k_2)$ under the isomorphism given by the primary decomposition of M:

$$\mathbb{Z}/(M) \cong \mathbb{Z}/(p_1) \times \mathbb{Z}/(p_2), \qquad Ai + Bj \longmapsto (i, j).$$

We use these ingredients to show that f(x, y) represents an integer congruent to K modulo $p_1p_2$. Consider

$$\begin{aligned}
f(Ax_1 + Bx_2, Ay_1 + By_2) ={}& a(A^2x_1^2 + 2ABx_1x_2 + B^2x_2^2)\\
&+ b(A^2x_1y_1 + ABx_2y_1 + ABx_1y_2 + B^2x_2y_2)\\
&+ c(A^2y_1^2 + 2ABy_1y_2 + B^2y_2^2).
\end{aligned}$$

Reducing mod $p_1$, we have A ≡ 1 and B ≡ 0, so

$$f(Ax_1 + Bx_2, Ay_1 + By_2) \equiv ax_1^2 + bx_1y_1 + cy_1^2 \equiv k_1 \pmod{p_1}.$$

On the other hand, reducing mod $p_2$ yields

$$f(Ax_1 + Bx_2, Ay_1 + By_2) \equiv ax_2^2 + bx_2y_2 + cy_2^2 \equiv k_2 \pmod{p_2}.$$

By our choice of K, this shows that $f(Ax_1 + Bx_2, Ay_1 + By_2)$ is congruent to K (mod $p_1p_2$). Therefore f(x, y) represents an integer congruent to K modulo M, and any such integer is relatively prime to M by construction.

Example 3.1.5. To illustrate Proposition 3.1.4, consider $f(x, y) = 2x^2 + 3xy + 6y^2$. Let $p_1 = 11$ and $p_2 = 13$, whereby $M = p_1p_2 = 143$. By (i) of the proposition, we can represent $k_1 = 2$ using f(1, 0) and $k_2 = 6$ using f(0, 1). Calculations show that A = 78 and B = 66 (e.g. using a computer algorithm for the Chinese remainder theorem) which gives us

$$K = Ak_1 + Bk_2 = 78(2) + 66(6) = 552 \equiv 123 \pmod{143}.$$

Note that K and M are coprime, so it remains to show that f(x, y) represents an integer congruent to K modulo M in order to demonstrate the conclusion in Proposition 3.1.4(ii). Letting $(x_1, y_1) = (1, 0)$ and $(x_2, y_2) = (0, 1)$, we compute

$$\begin{aligned}
f(Ax_1 + Bx_2, Ay_1 + By_2) &= f(A, B)\\
&= 2A^2 + 3AB + 6B^2\\
&= 2(78)^2 + 3(78)(66) + 6(66)^2\\
&= 12168 + 15444 + 26136 = 53748\\
&\equiv 123 \pmod{143}.
\end{aligned}$$

So f(A, B) = 53748 is congruent to K modulo M and is therefore relatively prime to M.
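The arithmetic in Example 3.1.5 can be reproduced in a few lines of Python. This is a sketch assuming Python 3.8 or later (for the modular inverse `pow(x, -1, m)`); the function name `crt_idempotents` is ours.

```python
from math import gcd


def crt_idempotents(p1, p2):
    """Return (A, B) with A = 1 mod p1, A = 0 mod p2, B = 0 mod p1, B = 1 mod p2."""
    A = p2 * pow(p2, -1, p1)
    B = p1 * pow(p1, -1, p2)
    return A, B


def f(x, y):
    """The form 2x^2 + 3xy + 6y^2 from Example 3.1.5."""
    return 2 * x * x + 3 * x * y + 6 * y * y


p1, p2 = 11, 13
M = p1 * p2
A, B = crt_idempotents(p1, p2)
K = (A * f(1, 0) + B * f(0, 1)) % M
value = f(A, B)
print(A, B, K, value, value % M, gcd(value, M))
```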

Lemma 3.1.6. Let D be an integer and suppose k is an odd integer such that

gcd(D, k) = 1.

(i) D ≡ 0, 1 (mod 4) if D is the discriminant of a binary quadratic form.

(ii) k is properly represented by a primitive form of discriminant D if and only if

D is a quadratic residue mod k.


Proof. (i) If D is the discriminant of $f(x, y) = ax^2 + bxy + cy^2$ then $D = b^2 - 4ac$ which means $D \equiv b^2 \pmod 4$. The only squares mod 4 are 0 and 1 so D ≡ 0, 1 (mod 4).

(ii) If k is properly represented by some form f(x, y) of discriminant D, Lemma 3.1.3 allows us to assume $f(x, y) = kx^2 + bxy + cy^2$ for b, c ∈ Z. Then $D = b^2 - 4kc$ so $D \equiv b^2 \pmod k$, that is, D is a quadratic residue mod k. On the other hand, suppose $D \equiv b^2 \pmod k$. Since k is odd, replacing b by b + k if necessary we may assume b ≡ D (mod 2). Then D ≡ 0, 1 (mod 4) gives $D \equiv b^2 \pmod 4$, hence $D \equiv b^2 \pmod{4k}$ and $D = b^2 - 4kc$ for some c ∈ Z. The form $g(x, y) = kx^2 + bxy + cy^2$ properly represents k and since gcd(D, k) = 1, gcd(k, b, c) = 1 so g(x, y) is primitive.

Corollary 3.1.7. Let n ∈ Z and let p be an odd prime not dividing n. Then $\left(\frac{-n}{p}\right) = 1$ if and only if p is represented by a primitive form of discriminant −4n.

Proof. Note that −4n is a quadratic residue mod p $\iff \left(\frac{-4n}{p}\right) = \left(\frac{-n}{p}\right) = 1$. Apply part (ii) of Lemma 3.1.6.

Definition. A positive definite form $ax^2 + bxy + cy^2$ is reduced if it is primitive, |b| ≤ a ≤ c and if either |b| = a or a = c then b ≥ 0.

There is a powerful characterization of primitive, positive definite (p.p.d.) forms

in terms of reduced forms:

Theorem 3.1.8. Every proper equivalence class of primitive, positive definite forms

contains a unique reduced form.

Proof. See [7] or [17].

Example 3.1.9. For any n ∈ N, $x^2 + ny^2$ is a reduced, primitive, positive definite form of discriminant −4n. For this reason, Corollary 3.1.7 explains one of the conditions for p to be represented by $x^2 + ny^2$ in Theorems 1.8.7 and 2.11.3.


Lemma 3.1.10. For every reduced form $ax^2 + bxy + cy^2$ of discriminant D < 0, $a \le \sqrt{-D/3}$.

Proof. Let $f(x, y) = ax^2 + bxy + cy^2$. Since f(x, y) is reduced, $b^2 \le a^2$ and a ≤ c. Thus $-D = 4ac - b^2 \ge 4a^2 - a^2 = 3a^2$, which implies the result.

Definition. For a fixed D < 0, the number h(D) of equivalence classes of primitive,

positive definite forms of discriminant D is called the class number of D.

Theorem 3.1.11. For every D < 0, the class number h(D) is finite.

Proof. By Theorem 3.1.8, h(D) is the number of distinct reduced forms of discriminant D. For a reduced form $ax^2 + bxy + cy^2$ of discriminant D, there are only a finite number of choices for a and b since $|b| \le a \le \sqrt{-D/3}$ by Lemma 3.1.10. Moreover, $D = b^2 - 4ac$ shows that the choices of D, a and b determine c. Therefore there are only a finite number of reduced forms of discriminant D, so h(D) is finite.
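The finiteness argument is effectively an algorithm: search over the finitely many pairs (a, b) allowed by Lemma 3.1.10 and keep those for which c is an integer and the form is reduced. A minimal Python sketch follows; the function name `reduced_forms` is ours.

```python
from math import gcd


def reduced_forms(D):
    """List the reduced primitive positive definite forms (a, b, c) of discriminant D < 0."""
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("D must be negative and congruent to 0 or 1 mod 4")
    forms = []
    a = 1
    while 3 * a * a <= -D:                    # Lemma 3.1.10: a <= sqrt(-D/3)
        for b in range(-a, a + 1):            # reduced forms have |b| <= a
            if (b * b - D) % (4 * a) != 0:
                continue                      # c = (b^2 - D)/(4a) must be an integer
            c = (b * b - D) // (4 * a)
            if c < a or gcd(gcd(a, b), c) != 1:
                continue
            if (abs(b) == a or a == c) and b < 0:
                continue                      # boundary cases require b >= 0
            forms.append((a, b, c))
        a += 1
    return forms


for D in (-4, -20, -23, -56):
    print(D, len(reduced_forms(D)), reduced_forms(D))
```

The length of the returned list is the class number h(D).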

3.2 The Form Class Group

Our first goal in this section is to justify the word group in the following definition.

Definition. For a negative integer D ≡ 0, 1 (mod 4), the set of equivalence classes

of primitive, positive definite forms of discriminant D is called the form class group

for D, denoted C(D). We will sometimes abuse notation and write f(x, y) ∈ C(D)

for a single form f .

Note that |C(D)| = h(D) which is equal to the number of reduced forms of

discriminant D. To prove C(D) is a group, we need to define a law of composition

on classes of quadratic forms. Legendre realized that since each class in C(D) has

a unique representative that is reduced, the composition may be defined on reduced


forms. However, his method was cumbersome to work with, so instead we follow

Dirichlet’s method of form composition.

Lemma 3.2.1. Suppose f and g are p.p.d. forms of discriminant D, where $f(x, y) = ax^2 + bxy + cy^2$ and $g(x, y) = a'x^2 + b'xy + c'y^2$. If $\gcd\left(a, a', \frac{b + b'}{2}\right) = 1$ then there is an integer B, unique modulo 2aa′, satisfying

$$\begin{aligned}
B &\equiv b \pmod{2a}\\
B &\equiv b' \pmod{2a'}\\
B^2 &\equiv D \pmod{4aa'}.
\end{aligned}$$

Proof. See [7].

Definition. Given two p.p.d. forms $f(x, y) = ax^2 + bxy + cy^2$ and $g(x, y) = a'x^2 + b'xy + c'y^2$ of discriminant D which satisfy $\gcd\left(a, a', \frac{b + b'}{2}\right) = 1$, their Dirichlet composition is

$$(f * g)(x, y) = aa'x^2 + Bxy + \frac{B^2 - D}{4aa'}y^2,$$

where B is the unique integer modulo 2aa′ chosen in Lemma 3.2.1.
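For small coefficients the integer B of Lemma 3.2.1 can simply be found by searching one full residue system modulo 2aa′. The Python sketch below does this and returns the composed form; it is a brute-force illustration, not an efficient method, and the function name is ours.

```python
from math import gcd


def dirichlet_compose(f, g):
    """Dirichlet composition of forms (a, b, c) and (a2, b2, c2) of equal discriminant."""
    a, b, c = f
    a2, b2, c2 = g
    D = b * b - 4 * a * c
    if b2 * b2 - 4 * a2 * c2 != D:
        raise ValueError("forms must have the same discriminant")
    if gcd(gcd(a, a2), (b + b2) // 2) != 1:
        raise ValueError("gcd(a, a', (b + b')/2) must be 1")
    for B in range(2 * a * a2):               # B is unique modulo 2aa'
        if ((B - b) % (2 * a) == 0 and (B - b2) % (2 * a2) == 0
                and (B * B - D) % (4 * a * a2) == 0):
            return (a * a2, B, (B * B - D) // (4 * a * a2))
    raise ArithmeticError("no B found; the hypotheses of Lemma 3.2.1 fail")


print(dirichlet_compose((2, 1, 3), (2, 1, 3)))   # discriminant -23
print(dirichlet_compose((1, 1, 6), (2, 1, 3)))   # composing with the principal form
```

The result of the first call is not reduced; it represents the class of the square of the form $2x^2 + xy + 3y^2$.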

Lemma 3.2.2. For any primitive, positive definite forms f and g of discriminant D, if f ∗ g is defined, it is a primitive, positive definite form of discriminant D.

Proof. Suppose $f(x, y) = ax^2 + bxy + cy^2$ and $g(x, y) = a'x^2 + b'xy + c'y^2$ satisfy the conditions of Lemma 3.2.1. Set $C = \frac{B^2 - D}{4aa'}$, which is an integer by the third congruence in Lemma 3.2.1, and $F(x, y) = aa'x^2 + Bxy + Cy^2$. The discriminant of F is $B^2 - 4aa'C = D < 0$ and aa′ > 0, so F(x, y) is positive definite. Suppose m is a number dividing all the coefficients of F. Since B ≡ b (mod 2a) and B ≡ b′ (mod 2a′), Example 3.1.2 shows that f and g are properly equivalent to the quadratic forms $ax^2 + Bxy + a'Cy^2$ and $a'x^2 + Bxy + aCy^2$, respectively. Notice that

$$\begin{aligned}
(ax^2 + Bxy + a'Cy^2)&(a'x^2 + Bxy + aCy^2)\\
&= aa'x^4 + (a + a')Bx^3y + \bigl(a^2C + B^2 + (a')^2C\bigr)x^2y^2 + (a + a')BCxy^3 + aa'C^2y^4\\
&= aa'(x^2 - Cy^2)^2 + B(x^2 - Cy^2)(axy + a'xy + By^2) + C(axy + a'xy + By^2)^2\\
&= aa'z^2 + Bzw + Cw^2,
\end{aligned}$$

where $z = x^2 - Cy^2$ and $w = axy + a'xy + By^2$. So each such product is a value F(z, w) of F. This means m divides every such product, but by Proposition 3.1.4, f and g represent some numbers relatively prime to m. Therefore m = 1 so F(x, y) is primitive.

Definition. Let D ≡ 0, 1 (mod 4) be a negative integer. The principal form of

discriminant D is defined to be

$$F_D(x, y) = \begin{cases} x^2 - \dfrac{D}{4}y^2, & D \equiv 0 \pmod 4\\[1ex] x^2 + xy + \dfrac{1 - D}{4}y^2, & D \equiv 1 \pmod 4. \end{cases}$$

Notice that when D = −4n for an integer n ≥ 1, the principal form is $x^2 + ny^2$.

Theorem 3.2.3. Let D ≡ 0, 1 (mod 4) be a negative integer. The set C(D) is a finite abelian group under Dirichlet composition. Moreover, the identity element is the class containing the principal form and the inverse of the class containing $ax^2 + bxy + cy^2$ is the class containing $ax^2 - bxy + cy^2$.

Proof. First, Theorem 3.1.11 says that |C(D)| = h(D) is finite. If $f(x, y) = ax^2 + bxy + cy^2$ and g(x, y) are p.p.d. forms of discriminant D then Proposition 3.1.4(ii) together with Lemma 3.1.3 shows we can replace g with a properly equivalent form $g'(x, y) = a'x^2 + b'xy + c'y^2$ with gcd(a, a′) = 1. Therefore Dirichlet composition is well-defined on classes of p.p.d. quadratic forms. Moreover, Dirichlet composition is clearly abelian, so it suffices to check the identity and inverses.

Let $f(x, y) = ax^2 + bxy + cy^2 \in C(D)$. Note that for the principal form $F_D(x, y)$, a′ = 1 so gcd(a, a′) = 1 and Dirichlet composition is defined for f and $F_D$. The integer B that satisfies Lemma 3.2.1 is precisely b, so

$$\begin{aligned}
(F_D * f)(x, y) &= aa'x^2 + bxy + \frac{b^2 - D}{4aa'}y^2\\
&= ax^2 + bxy + \frac{4ac}{4a}y^2\\
&= ax^2 + bxy + cy^2 = f(x, y).
\end{aligned}$$

Hence $F_D$ is the identity.

Next, note that Dirichlet composition is not defined on the forms f(x, y) and $f'(x, y) = ax^2 - bxy + cy^2$ (unless a = 1), but by proper equivalence we can replace f′(x, y) with $g(x, y) = f'(-y, x) = cx^2 + bxy + ay^2$; the transformation matrix $S = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ has determinant 1. Since f(x, y) is primitive, gcd(a, b, c) = 1 so (f ∗ g)(x, y) is defined. Again, B = b satisfies Lemma 3.2.1 so

$$(f * g)(x, y) = acx^2 + bxy + \frac{b^2 - D}{4ac}y^2 = acx^2 + bxy + y^2.$$

To finish, we show that $F(x, y) = acx^2 + bxy + y^2$ is properly equivalent to $F_D(x, y)$. Using the matrix S again, F(x, y) is properly equivalent to $F(-y, x) = x^2 - bxy + acy^2$ and by Example 3.1.2 we can replace F(−y, x) with $x^2 + (-b + 2n)xy + (n^2 - bn + ac)y^2$ for any n ∈ Z. If D ≡ 0 (mod 4), b must be even so let $n = \frac{b}{2}$. Then

$$\begin{aligned}
x^2 + (-b + 2n)xy + (n^2 - bn + ac)y^2 &= x^2 + (-b + b)xy + \left(\frac{b^2}{4} - \frac{b^2}{2} + ac\right)y^2\\
&= x^2 + \left(\frac{-b^2 + 4ac}{4}\right)y^2\\
&= x^2 - \frac{D}{4}y^2 = F_D(x, y).
\end{aligned}$$

On the other hand, if D ≡ 1 (mod 4), b is odd so let $n = \frac{b + 1}{2}$. Then

$$\begin{aligned}
x^2 + (-b + 2n)xy + (n^2 - bn + ac)y^2 &= x^2 + (-b + b + 1)xy + \left(\frac{b^2 + 2b + 1}{4} - \frac{b^2 + b}{2} + ac\right)y^2\\
&= x^2 + xy + \left(\frac{1 - b^2 + 4ac}{4}\right)y^2\\
&= x^2 + xy + \frac{1 - D}{4}y^2 = F_D(x, y).
\end{aligned}$$

In both cases, F(x, y) is properly equivalent to the principal form so the inverse of the class containing $ax^2 + bxy + cy^2$ is the class containing $ax^2 - bxy + cy^2$. This completes the proof that C(D) is a finite abelian group.

We now return to a statement in Section 1.8 regarding the relationship between

C(dK) and the ideal class group C(OK). In fact, we will prove a more general relation

between C(D) and C(O) where O is an order in an imaginary quadratic field.

Theorem 3.2.4. Let K be an imaginary quadratic number field, let D ≡ 0, 1 (mod 4) be a negative integer and let O be the order of discriminant D in K.

(1) If $f(x, y) = ax^2 + bxy + cy^2$ is a p.p.d. form of discriminant D then $\left[a, \frac{-b + \sqrt{D}}{2}\right]$ is a proper ideal of O.

(2) There is an isomorphism Ψ : C(D) → C(O) defined by $f(x, y) \mapsto \left[a, \frac{-b + \sqrt{D}}{2}\right]$ and therefore |C(O)| = h(D).

(3) A positive integer m is represented by a form f(x, y) ∈ C(D) if and only if $m = N(\mathfrak{a})$ for some proper ideal $\mathfrak{a} \in \Psi(f(x, y))$.

Proof. We will prove (1) and (2). The details of (3) can be found in [7].

(1) Let $f(x, y) = ax^2 + bxy + cy^2$ be p.p.d. of discriminant D. Then $\alpha = \frac{-b + \sqrt{D}}{2a}$ is a root of the polynomial $f(x, 1) = ax^2 + bx + c$ so by Lemma 1.9.5, a[1, α] is a proper ideal of the order [1, aα]. Notice that $a[1, \alpha] = [a, a\alpha] = \left[a, \frac{-b + \sqrt{D}}{2}\right]$ so it suffices to show [1, aα] = O. Let f be the conductor of O (not to be confused with the form f(x, y)). Then we showed in Section 1.9 that $D = f^2 d_K$ where $d_K$ is the field discriminant, so

$$a\alpha = \frac{-b + \sqrt{D}}{2} = \frac{-b + f\sqrt{d_K}}{2} = -\frac{b + fd_K}{2} + f\,\frac{d_K + \sqrt{d_K}}{2} = -\frac{b + fd_K}{2} + fw_K,$$

where $w_K$ is defined as in Section 1.9. Since $D = b^2 - 4ac = f^2 d_K$, $fd_K$ and b have the same parity, which means that $\frac{b + fd_K}{2}$ is an integer. Therefore $[1, a\alpha] = [1, fw_K]$ by the above work and since every order is determined by its conductor, this shows [1, aα] = O.

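(2) Let f(x, y) and g(x, y) be p.p.d. forms of discriminant D. Let $\alpha, \beta \in \mathbb{C}^*$ be the roots of f(x, 1) and g(x, 1), respectively, with positive imaginary parts. First, we show

$$\begin{aligned}
f(x, y),\ g(x, y) \text{ are properly equivalent} &\iff \beta = \frac{a\alpha + b}{c\alpha + d} \text{ for some } a, b, c, d \in \mathbb{Z} \text{ with } ad - bc = 1\\
&\iff [1, \alpha] = \lambda[1, \beta] \text{ for some } \lambda \in K^*.
\end{aligned}$$

Suppose $f(\mathbf{x}) = g(A\mathbf{x})$ where $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(\mathbb{Z})$. Then since α is a root of f(x, 1),

$$0 = f(\alpha, 1) = g(a\alpha + b, c\alpha + d) = (c\alpha + d)^2\, g\!\left(\frac{a\alpha + b}{c\alpha + d}, 1\right).$$

Thus $\frac{a\alpha + b}{c\alpha + d}$ is a root of g(x, 1) and it is easy to verify that it has positive imaginary part, so $\beta = \frac{a\alpha + b}{c\alpha + d}$. On the other hand, the equation above shows that if $\beta = \frac{a\alpha + b}{c\alpha + d}$ for $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ in $SL_2(\mathbb{Z})$ then f(x, 1) and g(A(x, 1)) have the same root. It follows that $f(\mathbf{x}) = g(A\mathbf{x})$ so the forms are properly equivalent. This proves the first of the equivalences above.

Next, suppose $\beta = \frac{a\alpha + b}{c\alpha + d}$ where ad − bc = 1. Then $c\alpha + d \in K^*$ so set λ = cα + d. This implies

$$\lambda[1, \beta] = (c\alpha + d)\left[1, \frac{a\alpha + b}{c\alpha + d}\right] = [c\alpha + d, a\alpha + b],$$

but since ad − bc = 1, [cα + d, aα + b] = [1, α]. On the other hand, if [1, α] = λ[1, β] = [λ, λβ] for some $\lambda \in K^*$ then

$$\lambda\beta = e\alpha + f \quad\text{and}\quad \lambda = g\alpha + h$$

for some e, f, g, h such that $\begin{pmatrix} e & f \\ g & h \end{pmatrix} \in GL_2(\mathbb{Z})$. Then $\beta = \frac{e\alpha + f}{\lambda} = \frac{e\alpha + f}{g\alpha + h}$ and since α and β both have positive imaginary parts, we must have eh − fg = 1, that is, $\begin{pmatrix} e & f \\ g & h \end{pmatrix}$ is in $SL_2(\mathbb{Z})$. Therefore f and g are properly equivalent if and only if [1, α] = λ[1, β] for some $\lambda \in K^*$. This establishes an injection

$$\Psi : C(D) \longrightarrow C(O), \qquad f(x, y) \longmapsto a[1, \alpha] = \left[a, \frac{-b + \sqrt{D}}{2}\right].$$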

We next show that Ψ is surjective. Let $\mathfrak{a}$ be a fractional O-ideal which, by the proof of Proposition 1.9.6, can be written $\mathfrak{a} = [\alpha, \beta]$ for some α, β ∈ K. Without loss of generality assume $\frac{\beta}{\alpha}$ has positive imaginary part. Set $\gamma = \frac{\beta}{\alpha}$ and let $ax^2 + bx + c$ be the minimal polynomial of γ over Q; we may rescale the coefficients to ensure gcd(a, b, c) = 1 and a > 0. Let $f(x, y) = ax^2 + bxy + cy^2$, which is then a p.p.d. quadratic form. We next check that f(x, y) has discriminant D = disc(O). Writing O = [1, aγ] we compute the discriminant by

$$D = \begin{vmatrix} 1 & a\gamma \\ 1 & a\bar{\gamma} \end{vmatrix}^2 = a^2(\bar{\gamma} - \gamma)^2 = -4a^2\,\mathrm{im}(\gamma)^2.$$

The roots of $ax^2 + bx + c$ are γ and $\bar{\gamma}$, which are given by the quadratic formula:

$$\gamma = \frac{-b + \sqrt{b^2 - 4ac}}{2a} \quad\text{and}\quad \bar{\gamma} = \frac{-b - \sqrt{b^2 - 4ac}}{2a}.$$

So $\mathrm{im}(\gamma) = \frac{\sqrt{4ac - b^2}}{2a}$ and hence $D = -4a^2\left(\frac{\sqrt{4ac - b^2}}{2a}\right)^2 = b^2 - 4ac$. This is precisely the discriminant of f(x, y). Therefore f(x, y) is a primitive, positive definite form of discriminant D which maps to $a[1, \gamma] \sim \alpha[1, \gamma] = \mathfrak{a}$ in C(O). Hence Ψ is surjective.

Now we show that Ψ preserves the group structure of C(D). If f and g are p.p.d.

forms of discriminant D, denote their Dirichlet composition by F (x, y). In the proof

of Theorem 3.2.3, we saw that B = b satisfies the conditions of Lemma 3.2.1 for f

182

and g, so we can write the images of f, g and F under Ψ as:

Ψ([f ]) =

[a,−b+ f

√dK

2

]= [a,∆];

Ψ([g]) =

[a′,−b′ + f

√dK

2

]= [a′,∆];

and Ψ([F ]) =

[aa′,−B + f

√dK

2

]= [aa′,∆] where ∆ =

−b+ f√dK

2.

We want to show $[a,\Delta][a',\Delta] = [aa',\Delta]$ in $C(\mathcal{O})$. Note that the conditions on $B$ from Lemma 3.2.1 give us $\Delta^2 \equiv -B\Delta \pmod{aa'}$, so we have

$$[a,\Delta][a',\Delta] = [aa', a\Delta, a'\Delta, \Delta^2] = [aa', a\Delta, a'\Delta, -B\Delta].$$

Since $f$, $g$ and $F$ are all primitive, the conditions on $B$ also force $\gcd(a, a', B) = 1$, so $[a,\Delta][a',\Delta] = [aa', a\Delta, a'\Delta, -B\Delta] = [aa',\Delta]$ as desired. Hence $\Psi : C(D) \to C(\mathcal{O})$ is an isomorphism.
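Because $\Psi$ is a bijection, the class number of $\mathcal{O}$ can be found by counting classes of forms. The Python sketch below does this by enumerating reduced forms. It assumes the standard reduction theory, which is not restated in this section: each proper equivalence class of p.p.d. forms of discriminant $D$ contains exactly one form with $|b| \le a \le c$, and with $b \ge 0$ whenever $|b| = a$ or $a = c$. The function name `reduced_forms` is ours and is not part of the Magma code in Appendix A.4.

```python
from math import gcd

def reduced_forms(D):
    """List the reduced primitive positive definite forms (a, b, c)
    with b^2 - 4ac = D, for a discriminant D < 0."""
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("D must be negative and congruent to 0 or 1 mod 4")
    forms = []
    a = 1
    while 3 * a * a <= -D:                 # reduced forms satisfy 3a^2 <= |D|
        for b in range(-a + 1, a + 1):     # -a < b <= a
            if (b * b - D) % (4 * a) != 0:
                continue
            c = (b * b - D) // (4 * a)
            if c < a:
                continue
            if a == c and b < 0:           # when a = c we require b >= 0
                continue
            if gcd(gcd(a, abs(b)), c) != 1:
                continue                   # keep primitive forms only
            forms.append((a, b, c))
        a += 1
    return forms

if __name__ == "__main__":
    for D in (-4, -20, -23):
        print(D, reduced_forms(D))
```

Under that assumption, `len(reduced_forms(D))` equals $|C(D)| = |C(\mathcal{O})|$ for the order $\mathcal{O}$ of discriminant $D$.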

3.3 n-Fermat Primes

In the final section of this text, we describe our approach to the following question:

Question 3.3.1. If $p = x^2 + ny^2$ is prime, when is $q = y^2 + nx^2$ also prime?

The following definitions are not standard in the literature. We have introduced

them in order to facilitate our discussion of Theorem 2.11.3 and Question 3.3.1.

Definition. Let $n \ge 1$ be an integer. A number of the form $x^2 + ny^2$, where $x, y \in \mathbb{Z}$, is called an n-Fermat number. If $p = x^2 + ny^2$ is prime, $p$ is said to be an n-Fermat prime.

Definition. An n-Fermat prime $p = x^2 + ny^2$ is a symmetric n-Fermat prime provided $q = y^2 + nx^2$ is also prime.

Question 3.3.1 can therefore be restated: When is an n-Fermat prime symmetric?

The question is stated rather broadly for a reason, as there are several ways we could

answer this.

In this language, Theorems 2.11.3 and 2.11.5 together say the following: Let $f(x)$ be the minimal polynomial of the j-invariant $j(\mathcal{O})$ for the order $\mathcal{O} = \mathbb{Z}[\sqrt{-n}]$ in $\mathbb{Q}(\sqrt{-n})$. Then a prime $p$ not dividing $\mathrm{disc}(f)$ is an n-Fermat prime if and only if $\left(\frac{-n}{p}\right) = 1$ and $f(x) \equiv 0 \pmod{p}$ has an integer solution. In other words, n-Fermat

primes are characterized by congruence conditions in all but finitely many cases. The

best possible situation would therefore be a positive answer to the following question:

Question 3.3.2. For an integer $n \ge 1$, are there congruence conditions that determine when an n-Fermat prime is a symmetric n-Fermat prime?
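For $n = 2$ the characterization of n-Fermat primes can be checked by brute force. The Python sketch below rests on one assumption that we state explicitly: $\mathbb{Z}[\sqrt{-2}]$ has class number 1, so $f(x)$ is linear, the root condition holds automatically, and only the Legendre symbol condition remains. The helper names are ours.

```python
from math import isqrt

def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    if m < 2:
        return False
    for d in range(2, isqrt(m) + 1):
        if m % d == 0:
            return False
    return True

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def is_represented(p, n):
    """Brute-force test for p = x^2 + n*y^2 with integers x, y >= 0."""
    y = 0
    while n * y * y <= p:
        x2 = p - n * y * y
        x = isqrt(x2)
        if x * x == x2:
            return True
        y += 1
    return False

# Compare the two conditions for every odd prime below 2000.
odd_primes = [p for p in range(3, 2000) if is_prime(p)]
mismatches = [p for p in odd_primes
              if is_represented(p, 2) != (legendre(-2, p) == 1)]

if __name__ == "__main__":
    print("primes where the two conditions disagree:", mismatches)
```

For larger $n$ the root condition on $f(x)$ is no longer automatic, which is why the Magma function isNFermatPrime in Appendix A.4 takes $f$ as an argument.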

There is fortunately a case when the answer to Question 3.3.1 is quite trivial.

When n = 1, an n-Fermat prime is always symmetric. This is certainly the only case

when the ratio of symmetric n-Fermat primes to total n-Fermat primes is 1, as the

next example shows.

Example 3.3.3. Let $n = 2$. The first few symmetric 2-Fermat primes are: p = 3, 11, 19, 43, 59, 67, 83, 107, 139, 163, 179, . . . For small primes it appears that $p$ is a symmetric 2-Fermat prime if and only if $p \equiv 3 \pmod{8}$. However, 131 is a 2-Fermat prime with $131 \equiv 3 \pmod{8}$, since it can be written $131 = 9^2 + 2 \cdot 5^2$, but $5^2 + 2 \cdot 9^2 = 187 = 11 \cdot 17$ is not prime. Therefore the condition $p \equiv 3 \pmod{8}$ breaks early on.
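The list in Example 3.3.3 can be reproduced with a few lines of Python. This is a brute-force sketch with names of our own choosing, not the Magma code of Appendix A.4; it searches over $x, y \ge 1$ and collects each prime $p = x^2 + ny^2$ below a bound for which $y^2 + nx^2$ is also prime.

```python
from math import isqrt

def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    if m < 2:
        return False
    for d in range(2, isqrt(m) + 1):
        if m % d == 0:
            return False
    return True

def symmetric_n_fermat_primes(n, bound):
    """Sorted primes p = x^2 + n*y^2 < bound (x, y >= 1) such that
    q = y^2 + n*x^2 is also prime."""
    found = set()
    x = 1
    while x * x + n < bound:
        y = 1
        while x * x + n * y * y < bound:
            p = x * x + n * y * y
            if is_prime(p) and is_prime(y * y + n * x * x):
                found.add(p)
            y += 1
        x += 1
    return sorted(found)

if __name__ == "__main__":
    print(symmetric_n_fermat_primes(2, 180))
```

The prime 131 is absent from the output because its only representation is $9^2 + 2 \cdot 5^2$, and the swapped value 187 is composite.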

The function getSymmRatio(n,M) generates a list (n, symmRat, expSymm, N, S), whose entries are

n          a positive integer
N          the number of n-Fermat primes $p = x^2 + ny^2$ whose solutions $(x, y)$ satisfy $x, y \le M$
S          the number of these that are symmetric n-Fermat primes
symmRat    the ratio of symmetric n-Fermat primes to total n-Fermat primes that satisfy $x, y \le M$, i.e. symmRat = S/N
expSymm    the ratio of observed symmetric n-Fermat primes to the expected number of symmetric n-Fermat primes satisfying $x, y \le M$, as predicted by the Prime Number Theorem.

(See Appendix A.4 for further documentation; a description of our Prime Number

Theorem heuristic for expSymm follows in the next paragraph.) Now consider the

data generated for n = 2 and M = 1000:

> getSymmRatio(2,1000);

[ 2.0, 0.1142830989, 0.9586592239, 2501656.0, 285897.0 ]

Empirically, it appears that the ratio of symmetric 2-Fermat primes to total 2-Fermat primes is about 0.1143; that is, about 11.43% of 2-Fermat primes are symmetric. On the other hand, the data shows that the ratio of the number of symmetric 2-Fermat primes to the expected number of symmetric 2-Fermat primes, under the assumptions of our Prime Number Theorem heuristic below, is about 0.9587. That is, there are slightly fewer symmetric 2-Fermat primes than we expect. Something interesting is going on here.
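The counts N and S and the ratio symmRat can be illustrated on a small search space. The Python sketch below is our own and is not a port of the Magma function: it counts every pair $(x, y)$ with $1 \le x, y \le M$ and both coordinates coprime to $n$, so its output need not match the Magma figures quoted above.

```python
from math import gcd, isqrt

def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    if m < 2:
        return False
    for d in range(2, isqrt(m) + 1):
        if m % d == 0:
            return False
    return True

def symm_counts(n, M):
    """Return (N, S, S/N), where N counts pairs (x, y) with x^2 + n*y^2
    prime and S counts those pairs for which y^2 + n*x^2 is also prime."""
    space = [m for m in range(1, M + 1) if gcd(n, m) == 1]
    pairs = [(x, y) for x in space for y in space
             if is_prime(x * x + n * y * y)]
    symmetric = [(x, y) for (x, y) in pairs
                 if is_prime(y * y + n * x * x)]
    N, S = len(pairs), len(symmetric)
    return N, S, (S / N if N else 0.0)

if __name__ == "__main__":
    print(symm_counts(2, 3))
    print(symm_counts(2, 50))
```

For $n = 2$ and $M = 3$ the pairs are $(1,1)$, $(1,3)$ and $(3,1)$, all of which are symmetric; the ratio drops once pairs such as $(9, 5)$ enter the search space.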

For an integer $n \ge 1$, let $\pi_{\mathrm{sym},n}(M)$ denote the number of primes $y^2 + nx^2$ such that $x^2 + ny^2$ is prime and $x, y \le M$. Notice that if $x^2 + ny^2$ is prime and $x$ and $y$ are both relatively prime to $n$, then $y^2 + nx^2$ is necessarily odd. Of course a number has twice the probability of being prime given that it is odd (this is accounted for in the Magma code in Appendix A.4), so our Prime Number Theorem heuristic says that for each $n \ge 1$, there is a nonnegative real number $\alpha_n$ such that

$$\pi_{\mathrm{sym},n}(M) \sim 2\alpha_n \sum_{q \le M} \frac{1}{\log q},$$

where log is the natural logarithm and the sum is over n-Fermat numbers $q = y^2 + nx^2$, $x, y \le M$, for which $x^2 + ny^2$ is prime. For example, the data above shows that $\alpha_2$ is close to 0.9328. We posit several conjectures related to $\alpha_n$ and the asymptotic behavior of $\pi_{\mathrm{sym},n}(M)$ below, along with empirical results that lead us to believe they might hold.
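The heuristic can be turned into an estimate of $\alpha_n$ by dividing the observed count by $2\sum 1/\log q$. The Python sketch below is our own illustration under the same assumptions as the text (pairs $x, y \le M$ coprime to $n$, natural logarithm); the function name `exp_symm` is hypothetical and the numbers it produces for small $M$ are not meant to match the Magma output.

```python
from math import gcd, isqrt, log

def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    if m < 2:
        return False
    for d in range(2, isqrt(m) + 1):
        if m % d == 0:
            return False
    return True

def exp_symm(n, M):
    """Observed number of prime q = y^2 + n*x^2 divided by the heuristic
    count 2 * sum(1 / log q), over pairs with x^2 + n*y^2 prime.
    Returns None when no pair qualifies."""
    space = [m for m in range(1, M + 1) if gcd(n, m) == 1]
    observed = 0
    expected = 0.0
    for x in space:
        for y in space:
            if not is_prime(x * x + n * y * y):
                continue
            q = y * y + n * x * x
            expected += 2.0 / log(q)   # factor 2: q is known to be odd
            if is_prime(q):
                observed += 1
    return observed / expected if expected > 0 else None

if __name__ == "__main__":
    print(exp_symm(2, 3))
    print(exp_symm(2, 100))
```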

Conjecture. For all $n \ge 1$, $\alpha_n > 0$.

Theorem 2.11.3 characterizes primes of the form $x^2 + ny^2$ up to solvability conditions of $f_n(x) \equiv 0 \pmod{p}$. Moreover, Cox [7] gives a general formula for the Dirichlet density $\delta(f)$ of primes represented by a p.p.d. quadratic form $f$ of discriminant $D < 0$:

$$\delta(f) = \begin{cases} \dfrac{1}{2h(D)} & \text{if } f \text{ is properly equivalent to its opposite} \\[2mm] \dfrac{1}{h(D)} & \text{otherwise.} \end{cases}$$

Therefore there are infinitely many n-Fermat primes for any $n \ge 1$. In other words, the sum $\sum_{q \le M} \frac{1}{\log q}$ over n-Fermat numbers $q$ obtained by switching solutions for n-Fermat primes diverges as $M \to \infty$, so Conjecture 3.3 would imply that there are infinitely many symmetric n-Fermat primes for every $n \ge 1$. To test this conjecture,

we turned Magma loose on some computations with large search spaces, including

> bigData := getSymmRatios(40000,1000);

(The reader should be warned that the above computation took the better part of a weekend, so proceed with caution when running some of the code contained in Appendix A.4.) Through the first 40,000 values for $n$, and with search parameters $x, y \le 1000$, the observed ratio is positive in every case, which is consistent with Conjecture 3.3. There were several other interesting observations made, which are discussed via the next two conjectures.

Conjecture. The average value of αn over all n ≥ 1 is equal to 1.

Informally, Conjecture 3.3 means that, on average, n-Fermat primes are about as

likely to be symmetric as the Prime Number Theorem predicts. This is supported by

the statistical analysis of the list above:

> averageRatio(bigData);

> stdDevRatio(bigData);

This describes a global property of the natural numbers, which reinforces the predictions of the Prime Number Theorem. This shouldn’t be a surprise, as the PNT makes a strong, global statement about the natural numbers and subsets thereof. However, we know from experience that the integers often behave more erratically from a local perspective. The function getInterestingPrimes returns the information from getSymmRatio on the values of $n$ such that $\alpha_n$ exceeds a certain threshold $r$. For example, there are a handful of numbers $n$ in the first 40,000 such that $\alpha_n > 2$:

> getInterestingPrimes(40000,1000,2);

{@

[ 2277.0, 0.2115438844, 2.038176783, 56740.0, 12003.0 ],

[ 12699.0, 0.1919496912, 2.018037461, 44048.0, 8455.0 ],

[ 13629.0, 0.1934944899, 2.042425528, 28130.0, 5443.0 ],

[ 14540.0, 0.1919620775, 2.030119412, 24260.0, 4657.0 ],

[ 15091.0, 0.1888063168, 2.002682571, 64590.0, 12195.0 ],

[ 16615.0, 0.1901929816, 2.025452899, 81044.0, 15414.0 ],

[ 22576.0, 0.1899758230, 2.051227164, 43016.0, 8172.0 ],

[ 24089.0, 0.1846163922, 2.000292558, 30539.0, 5638.0 ],

[ 27250.0, 0.1843967171, 2.011252308, 31801.0, 5864.0 ],

[ 29127.0, 0.1883313546, 2.059048767, 41273.0, 7773.0 ],

[ 29798.0, 0.1849947634, 2.024235890, 21006.0, 3886.0 ],


[ 31927.0, 0.1852485929, 2.034336859, 85280.0, 15798.0 ],

[ 33060.0, 0.1826687458, 2.007948420, 13467.0, 2460.0 ],

[ 34159.0, 0.1816352201, 2.002051748, 71550.0, 12996.0 ],

[ 35814.0, 0.1822754491, 2.010832673, 16700.0, 3044.0 ],

[ 36743.0, 0.1834493426, 2.026459979, 51720.0, 9488.0 ]

@}

These $n$ have the apparent property that there are more than twice as many symmetric n-Fermat primes as expected. The function getInterestingPrimes2 returns similar data for $n$ values such that $\alpha_n$ is less than a threshold $r$. In the future we hope to be able to discern why certain numbers have higher or lower densities of symmetric n-Fermat primes than predicted, but if one is to believe that the values of $\alpha_n$ follow a normal distribution, then such outliers are to be expected in larger and larger data sets.

Conjecture. The set of $\alpha_n$ is bounded. That is, there are positive constants $\varepsilon$ and $M$ such that for all $n$, $\varepsilon \le \alpha_n \le M$.

This conjecture is offered solely based on the observations made for large parameter searches for symmetric n-Fermat primes. It appears so far that $0.4 \le \alpha_n \le 2.1$.

Finally, a question lingering on the edge of this discussion is

Question 3.3.4. If $p$ is an n-Fermat prime, is there an algorithm for finding solutions $x, y \in \mathbb{Z}$ to $p = x^2 + ny^2$? And if so, how many solutions $(x, y)$ are there?

The authors in [8] note that Question 3.3.4 is unsolved and it would be difficult at this time to implement a method of solving $p = x^2 + ny^2$ even for small $n$. However, there is clear motivation for answering such a question, as there are important implications to the theory of quadratic partitions and cryptography [8].

In a related sense, the characterization (Example 1.3.7) of primes of the form $x^2 + y^2$, that is, 1-Fermat primes, forms the basis of a primality test discovered by Euler: $m = x^2 + y^2$ has a single solution $(x, y)$ in positive integers, up to the order of $x$ and $y$, when $m$ is prime. In the future, the complexity of n-Fermat primes and symmetric n-Fermat primes may contribute to the rise of more secure cryptosystems and faster primality test algorithms.
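The counting behind Euler's observation is easy to illustrate. The Python sketch below (our own, with a hypothetical function name) lists the representations $m = x^2 + y^2$ with $1 \le x \le y$. It only illustrates the count: a single representation is necessary for an odd prime of this form but is not sufficient on its own, as $45 = 3^2 + 6^2$ shows.

```python
from math import isqrt

def two_square_reps(m):
    """All pairs (x, y) with 1 <= x <= y and x^2 + y^2 == m."""
    reps = []
    x = 1
    while 2 * x * x <= m:          # x <= y forces 2x^2 <= m
        y2 = m - x * x
        y = isqrt(y2)
        if y * y == y2:
            reps.append((x, y))
        x += 1
    return reps

if __name__ == "__main__":
    for m in (13, 21, 45, 65):
        print(m, two_square_reps(m))
```

The composite $65 = 5 \cdot 13$ has two representations, which is the kind of evidence of compositeness the test exploits.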


Bibliography

[1] Artin, Emil. Collected Papers, edited by Lang, S. and Tate, J. Addison-Wesley

Publishing, Reading (1965).

[2] Artin, Emil and Tate, John. Class Field Theory. W.A. Benjamin, New York

(1968).

[3] Borevich, Z.I. and Shafarevich, I.R. Number Theory. Translated from Russian.

Academic Press, New York (1966).

[4] Butler, Gregory and McKay, John. The transitive groups of degree up to

eleven. Communications in Algebra, 11(8) (1983). pp. 863-911.

[5] Cho, Bumkyu. Primes of the form x2 + ny2 with conditions x ≡ 1 mod N ,

y ≡ 0 mod N . Journal of Number Theory. 130 (2010), pp. 852-861.

[6] Conrad, Keith. The different ideal. http://www.math.uconn.edu/~kconrad/blurbs/ (2009).

[7] Cox, David A. Primes of the Form x2 +ny2: Fermat, Class Field Theory, and

Complex Multiplication, 2nd ed. John Wiley & Sons, Hoboken (2013).

[8] Cusick, T.W., Ding, C. and Renvall, A. Stream Ciphers and Number Theory.

Elsevier Science B.V., Amsterdam (1993).

[9] Dirichlet, Peter Gustav Lejeune. There are infinitely many prime numbers in

all arithmetic progressions with first term and difference coprime. Translated

from German. arXiv:0808.1408v1 [math.HO] (2008).


[10] Dummit, David S. and Foote, Richard M. Abstract Algebra, 3rd ed. Wiley &

Sons, Hoboken (2004).

[11] Golod, E.S. and Shafarevich, I.R. On the class-field tower. Translated from

Russian. Izvestiya Akademii Nauk 28, S.S.S.R. (1964). pp. 261-272.

[12] Gouvêa, Fernando Q. p-adic Numbers: An Introduction, 2nd ed. Springer-Verlag, New York (1997).

[13] Hilbert, David. The Theory of Algebraic Number Fields. Translated from German. Springer-Verlag, New York (1998).

[14] Hungerford, Thomas W. Algebra. Springer-Verlag, New York (1974).

[15] Janusz, Gerald J. Algebraic Number Fields. Academic Press, New York (1973).

[16] Kolmogorov, A.N. and Fomin, S.V. Introductory Real Analysis. Dover, New

York (1970).

[17] Lee, Holden. Algebraic Number Theory, Class Field Theory and Complex Multiplication. http://web.mit.edu/~holden1/ (2012).

[18] Marcus, Daniel A. Number Fields. Springer-Verlag, New York (1977).

[19] Milne, J.S. Algebraic Number Theory, v3.05. http://www.jmilne.org/math/

(2013).

[20] Milne, J.S. Class Field Theory, v4.02. http://www.jmilne.org/math/ (2013).

[21] Schulze, Volker. “Die Primteilerdichte von ganzzahligen Polynomen, I”. Journal für die reine und angewandte Mathematik, 253 (1972). pp. 175-185.

[22] Schoof, René. Class numbers of real cyclotomic fields of prime conductor. Mathematics of Computation, vol. 72, no. 242 (2002). pp. 913-937. Published electronically: http://www.ams.org/journals/mcom/2003-72-242/S0025-5718-02-01432-1/S0025-5718-02-01432-1.pdf.

[23] Serre, Jean-Pierre. A Course in Arithmetic. Springer-Verlag, New York (1973).

[24] Stein, William. A Brief Introduction to Classical and Adelic Algebraic Number

Theory. http://modular.math.washington.edu/129/ant/ (2004).

[25] Stein, William. Introduction to Algebraic Number Theory.

http://modular.math.washington.edu/129-05/notes/129.pdf (2005).

[26] Stevenhagen, P. and Lenstra, H.W., Jr. Chebotarev and his Density Theorem.

The Mathematical Intelligencer, vol. 18, no. 2 (1996). pp. 26-37.


Appendix A: Appendix

A.1 The Four Squares Theorem

We discuss a beautiful application of Minkowski’s theorem from Section 1.7. The four

squares theorem is a famous result in number theory which was proven by Lagrange

in 1770, well over 100 years before Minkowski’s theorem was discovered. Here we

provide a neat proof of the four squares theorem using Minkowski’s geometry of

numbers arguments.

Theorem A.1.1 (Four Squares). Every positive integer is the sum of the squares of

four integers.

Proof. It suffices to prove this for primes $p$, since

$$(a^2 + b^2 + c^2 + d^2)(e^2 + f^2 + g^2 + h^2) = (ae + bf + cg + dh)^2 + (af - be + ch - dg)^2 + (ag - ce + df - bh)^2 + (ah - de + bg - cf)^2.$$

(This is due to Euler.) Also note that $2 = 1^2 + 1^2 + 0^2 + 0^2$ so we may assume $p$ is an odd prime. Consider the congruence

$$x^2 + y^2 + 1 \equiv 0 \pmod{p}.$$

As $x$ runs through $0, 1, \ldots, p - 1$, $x^2$ takes on exactly $\frac{p+1}{2}$ distinct values mod $p$. Similarly, $-1 - y^2$ takes on $\frac{p+1}{2}$ distinct values, so together $x^2$ and $-1 - y^2$ account for $p + 1$ values among only $p$ residue classes, which implies one of them must be shared. This shows $x^2 + y^2 + 1 \equiv 0 \pmod{p}$ has a solution in integers.

Fix one of these solutions, say $(x, y)$, and consider the lattice $\Lambda \subset \mathbb{Z}^4$ consisting of $(a, b, c, d)$ such that

$$c \equiv ax + by \quad\text{and}\quad d \equiv bx - ay \pmod{p}.$$

Then $\mathbb{Z}^4 \supset \Lambda \supset p\mathbb{Z}^4$ and $\Lambda/p\mathbb{Z}^4$ is a two-dimensional subspace of $\mathbb{F}_p^4$ since once we pick $a$ and $b$, the $c$ and $d$ are determined. Thus $\Lambda$ has index $p^2$ in $\mathbb{Z}^4$ so $\mu(D) = p^2$ for $D$ a fundamental parallelepiped for $\Lambda$. Let $T$ be a closed ball about the origin with radius $r$. Then $\mu(T) = \frac{1}{2}\pi^2 r^4$ so we may choose $r$ such that

$$2p > r^2 > 1.9p.$$

This gives us $\mu(T) > 16\mu(D)$ so by Minkowski’s theorem there exists a nonzero point $(a, b, c, d)$ in $T \cap (\Lambda \smallsetminus \{0\})$. This means

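$$\begin{aligned}
a^2 + b^2 + c^2 + d^2 &\equiv a^2 + b^2 + (ax + by)^2 + (bx - ay)^2 \\
&\equiv a^2 + b^2 + a^2x^2 + 2abxy + b^2y^2 + b^2x^2 - 2abxy + a^2y^2 \\
&\equiv a^2(1 + x^2 + y^2) + b^2(1 + x^2 + y^2) \\
&\equiv 0 \pmod{p}.
\end{aligned}$$

Moreover, since $(a, b, c, d) \in T$ we have

$$a^2 + b^2 + c^2 + d^2 \le r^2 < 2p.$$

Since $a^2 + b^2 + c^2 + d^2$ is a positive integer that is divisible by $p$ and less than $2p$, it must equal $p$, that is, $p = a^2 + b^2 + c^2 + d^2$.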
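Two ingredients of this proof lend themselves to a numerical check: Euler's four-square identity and the pigeonhole solution of $x^2 + y^2 + 1 \equiv 0 \pmod{p}$. The Python sketch below is our own illustration with hypothetical function names; it does not implement the Minkowski step.

```python
def euler_product(u, v):
    """Combine two 4-tuples as in Euler's four-square identity."""
    a, b, c, d = u
    e, f, g, h = v
    return (a * e + b * f + c * g + d * h,
            a * f - b * e + c * h - d * g,
            a * g - c * e + d * f - b * h,
            a * h - d * e + b * g - c * f)

def norm(u):
    """Sum of the squares of the entries of u."""
    return sum(t * t for t in u)

def solve_congruence(p):
    """Find (x, y) with x^2 + y^2 + 1 = 0 (mod p) for an odd prime p by
    matching the values of x^2 against the values of -1 - y^2."""
    half = (p + 1) // 2
    squares = {x * x % p: x for x in range(half)}
    for y in range(half):
        target = (-1 - y * y) % p
        if target in squares:
            return squares[target], y
    return None   # not reached when p is an odd prime

if __name__ == "__main__":
    u, v = (1, 2, 3, 4), (5, 6, 7, 8)
    print(norm(u) * norm(v), norm(euler_product(u, v)))
    print(solve_congruence(11))
```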


A.2 The Snake Lemma

The following results from commutative algebra are used in the proof of the finiteness

of ray class groups (Theorem 2.3.4).

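Snake Lemma. Given a commutative diagram of R-modules with exact rows

$$\begin{array}{ccccccccc}
 & & A' & \xrightarrow{\ \alpha'\ } & A & \xrightarrow{\ \alpha\ } & A'' & \longrightarrow & 0 \\
 & & \downarrow{\scriptstyle f} & & \downarrow{\scriptstyle g} & & \downarrow{\scriptstyle h} & & \\
0 & \longrightarrow & B' & \xrightarrow{\ \beta'\ } & B & \xrightarrow{\ \beta\ } & B'' & &
\end{array}$$

there is an exact sequence

$$0 \to \ker\alpha' \to \ker f \to \ker g \to \ker h \to \operatorname{coker} f \to \operatorname{coker} g \to \operatorname{coker} h \to \operatorname{coker}\beta \to 0.$$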

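Proof. We prove the classic snake lemma, which asserts that the following sequence is exact:

$$0 \to \ker f \xrightarrow{\ \alpha'\ } \ker g \xrightarrow{\ \alpha\ } \ker h \xrightarrow{\ \eth\ } \operatorname{coker} f \xrightarrow{\ \beta'\ } \operatorname{coker} g \xrightarrow{\ \beta\ } \operatorname{coker} h \to 0.$$

Note that it is straightforward to verify that $\ker\alpha' \subseteq \ker f$, which shows exactness at the first position, and likewise for $\operatorname{coker} g \subseteq \operatorname{coker}\beta$ at the final position.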

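To begin, note that the left square commutes so $\alpha'$ restricts to $\alpha' : \ker f \to \ker g$. Likewise $\alpha$ restricts to $\alpha : \ker g \to \ker h$ and these both inherit injectivity. Consider the projection $B' \xrightarrow{\ \beta'\ } B \to B/\operatorname{im} g = \operatorname{coker} g$. For any $b' \in \operatorname{im} f$, there is some $a' \in A'$ such that $f(a') = b'$, but also $g\alpha'(a') = \beta'(b')$ by commutativity. So $\beta'(b') \in \operatorname{im} g$, which induces a map $\beta' : \operatorname{coker} f \to \operatorname{coker} g$; in other words, the projection factors through $\operatorname{coker} f$. Likewise, $\beta$ induces a map $\operatorname{coker} g \to \operatorname{coker} h$, and both induced maps inherit surjectivity.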

The interesting part of the sequence is at $\ker h \to \operatorname{coker} f$; we have labelled this with the character ð. How can we define ð? Start with $a'' \in \ker h$, so that $h(a'') = 0$. By exactness, there is some $a \in A$ such that $\alpha(a) = a''$. If $b$ is the image of $a$ under $g$, then $b \mapsto 0$ in $B''$ and so there is some $b' \in B'$ such that $b' \mapsto b \mapsto 0$.

With this notation, we define

$$\eth : \ker h \longrightarrow \operatorname{coker} f, \qquad a'' \longmapsto b' + \operatorname{im} f.$$

Since all maps in this diagram are linear, it suffices to check well-definedness. Once we do so, we will have all maps defined and can then proceed to check exactness. Suppose we have $\alpha(a_1) = \alpha(a_2) = a''$. Then there is some $a' \in A'$ such that $\alpha'(a') = a_1 - a_2$ since this difference lies in $\ker\alpha$. Let $b'_1$ and $b'_2$ be the unique (by injectivity) elements of $B'$ such that $\beta'(b'_1) = g(a_1)$ and $\beta'(b'_2) = g(a_2)$. We want to show that $b'_1 - b'_2 = f(a')$. But this is easily seen, since commutativity of the left square gives us $a' \mapsto a_1 - a_2 \mapsto g(a_1) - g(a_2) =: b_1 - b_2$ around the top, and $a' \mapsto f(a') \mapsto b_1 - b_2$ around the bottom. Then by injectivity, $f(a')$ must equal $b'_1 - b'_2$ as desired. This shows that ð maps into $B'/\operatorname{im} f = \operatorname{coker} f$, i.e. it is well-defined.

We now have all our maps, so we can proceed to show exactness. First $\alpha \circ \alpha' = 0$ implies $\operatorname{im}(\alpha'|_{\ker f}) \subseteq \ker(\alpha|_{\ker g})$. For the other containment, take $a \in \ker\alpha$ that maps to 0 under $g$. Then by exactness, there exists an $a' \mapsto a$, but the left square commutes, so $a' \mapsto b' \mapsto 0$ and injectivity of $\beta'$ shows $b' = f(a') = 0$, that is, $a' \in \ker f$.

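Next suppose $b + \operatorname{im} g \in \operatorname{coker} g$ maps to zero in $\operatorname{coker} h = B''/\operatorname{im} h$, so that $b'' := \beta(b)$ lies in $\operatorname{im} h$. Then there is some $a''$ mapping to $b''$ under $h$, and $\alpha$ is onto, so there also exists $a \in A$ mapping to $a''$. Let $g(a) = \tilde{b} \in B$; by commutativity $\tilde{b} \mapsto b''$. Then $\tilde{b} - b \mapsto 0 \in B''$ implies there exists $b' \in B'$ such that $\beta'(b') = \tilde{b} - b = g(a) - b$. So $b = g(a) - \beta'(b')$ and thus $\beta'(-b' + \operatorname{im} f) = -\beta'(b') + \operatorname{im} g = b + \operatorname{im} g$.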

The hard part is showing exactness around the connecting map ð. For $a'' \in \ker h$, we can lift back to some $a \in A$ which maps to $a'' \in A''$ and has image $g(a) = b$. But $b \mapsto 0$ under $\beta$ so there is some $b' \in B'$ such that $b' \mapsto b$. Recall that we defined $\eth(a'') = b' + \operatorname{im} f$. Then $\operatorname{im}(\alpha|_{\ker g}) \subset \ker\eth$ because $g(a) = 0$. On the other hand, suppose $\eth(a'') = 0$. Then in the diagram chase for the definition of ð, choose $a' \in A'$ such that $f(a') = b'$. Set $\tilde{a} = \alpha'(a')$. By commutativity, $g(a - \tilde{a}) = 0$ and by exactness, $\alpha(a - \tilde{a}) = \alpha(a) = a''$.

Let $b' \in B'$ be such that $\beta'(b' + \operatorname{im} f) = \operatorname{im} g$, so $\beta'(b') \in \operatorname{im} g$. The fact that $\operatorname{im}\eth \subset \ker\beta'$ is just by definition of the connecting map. Conversely, suppose $b' \mapsto b$ such that there is some $a \in A$ with $g(a) = b$. If we set $\alpha(a) = a''$, we have already constructed the connecting map for $a''$. Moreover we know $b' \mapsto b \mapsto 0$ so $a''$ must map to 0 along $h$, i.e. $a'' \in \ker h$. This shows exactness at every part of the sequence.

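We use the Snake Lemma to prove

Lemma A.2.1. Given a pair of homomorphisms $A \xrightarrow{\ f\ } B \xrightarrow{\ g\ } C$ there is an exact sequence

$$0 \to \ker f \to \ker g \circ f \to \ker g \to \operatorname{coker} f \to \operatorname{coker} g \circ f \to \operatorname{coker} g \to 0.$$

Proof. Apply the Snake Lemma to

$$\begin{array}{ccccccc}
 & A & \xrightarrow{\ f\ } & B & \longrightarrow & \operatorname{coker} f & \longrightarrow 0 \\
 & \downarrow{\scriptstyle g \circ f} & & \downarrow{\scriptstyle g} & & \downarrow & \\
0 \longrightarrow & C & \xrightarrow{\ \mathrm{id}\ } & C & \longrightarrow & 0 &
\end{array}$$

to produce the desired exact sequence.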


A.3 Cyclic Group Cohomology

Here we briefly introduce the basic concepts in group cohomology and use them to

define the Herbrand quotient of a G-module. Familiarity with Tor, Ext and group

algebras is assumed. The Herbrand quotient is used extensively in Section 2.6 in the

proof of the second fundamental inequality.

Definition. Let G be a group and consider an abelian group A on which G acts on

the left. Then A is called a G-module. Alternatively, A may be thought of as a left

module over the group algebra ZG.

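Group homology/cohomology is the study of Tor and Ext over these group algebras. If G-Mod is the category of G-modules, there are two important functors:

$$(-)^G : G\text{-Mod} \longrightarrow \mathrm{Ab}, \qquad A \longmapsto A^G := \{a \in A \mid g \cdot a = a \text{ for all } g \in G\}$$

$$\text{and}\quad (-)_G : G\text{-Mod} \longrightarrow \mathrm{Ab}, \qquad A \longmapsto A_G := A/\langle g - 1\rangle A.$$

The functor $(-)^G$ sends $A$ to the set of G-invariants of $A$, that is, $A^G$ is the largest subgroup of $A$ which is fixed by $G$. On the other hand, $(-)_G$ sends $A$ to the quotient $A_G = A/\langle g - 1\rangle A$, called the G-coinvariants of $A$. It turns out that for any G-module $A$, $A^G \cong \operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}, A)$ and $A_G \cong \mathbb{Z} \otimes_{\mathbb{Z}G} A$ (these are natural isomorphisms of functors). By Hom-tensor adjointness, this means $(-)_G$ is left adjoint and $(-)^G$ is right adjoint to the functor $\mathrm{Ab} \to G\text{-Mod}$ that gives an abelian group the trivial G-action.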

Definition. For a left G-module $A$, we define the nth group cohomology of $A$ by

$$H^n(G; A) := \operatorname{Ext}^n_{\mathbb{Z}G}(\mathbb{Z}, A).$$

Note that $H^0(G; A) = A^G$. The dual construction defining group homology relates to $A_G$ in exactly the same way. One perspective of group cohomology is that it measures how far $G$ is from being a finite group. For practical purposes, we can use $H^n(G; \mathbb{Z})$ to measure this property of $G$.

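For the rest of this section, let $G = \langle t \rangle$ be a cyclic group of order $n$. We construct a free resolution of $\mathbb{Z}$ over $\mathbb{Z}G$,

$$\cdots \xrightarrow{\ N\ } \mathbb{Z}G \xrightarrow{\ t-1\ } \mathbb{Z}G \xrightarrow{\ N\ } \mathbb{Z}G \xrightarrow{\ t-1\ } \mathbb{Z}G \to \mathbb{Z} \to 0$$

where $N = 1 + t + t^2 + \ldots + t^{n-1} = \sum_{g \in G} g$ is called the norm element of $G$. This resolution is an infinite, 2-periodic resolution of $\mathbb{Z}$ as a left $\mathbb{Z}G$-module. Therefore there are only two distinct cohomology groups for $k \ge 1$, one for odd homological degrees and one for even:

$$H^{\mathrm{even}}(G; A) := H^{2k}(G; A) \quad\text{and}\quad H^{\mathrm{odd}}(G; A) := H^{2k-1}(G; A).$$

(Recall that the 0th cohomology is $H^0(G; A) = A^G$.)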
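As a quick check on this resolution, the first line below verifies that consecutive maps compose to zero, using only $t^n = 1$. The second line records the descriptions of the two periodic groups obtained by applying $\operatorname{Hom}_{\mathbb{Z}G}(-, A)$; these are the standard identifications for cyclic groups and are stated here without proof as an aid to the reader, not as part of the original argument.

```latex
\begin{align*}
(t-1)N &= (t-1)(1 + t + \cdots + t^{n-1}) = t^n - 1 = 0 \quad \text{in } \mathbb{Z}G, \\
H^{2k}(G;A) &\cong A^G / NA, \qquad
H^{2k-1}(G;A) \cong \ker(N : A \to A)\,/\,(t-1)A \qquad (k \ge 1).
\end{align*}
```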

Lemma A.3.1 (Exact Hexagon). Given an exact sequence $0 \to A \to B \to C \to 0$ of G-modules, the long exact sequence in cohomology is an exact hexagon:

$$H^0(G;A) \to H^0(G;B) \to H^0(G;C) \to H^1(G;A) \to H^1(G;B) \to H^1(G;C) \to H^0(G;A).$$

Proof. The exact hexagon is just the long exact sequence in cohomology when $G$ is cyclic and the cohomologies are 2-periodic after the 0th homological degree.


Definition. Let $A$ be a G-module. The Herbrand quotient of $A$ is

$$q(A) = \frac{|H^1(A)|}{|H^0(A)|}$$

which is defined whenever the cohomology groups of $A$ are finite.

Lemma A.3.2. Let 0 → A → B → C → 0 be an exact sequence of G-modules. If

any two of q(A), q(B), q(C) are defined then so is the third, and q(A)q(C) = q(B).

Proof. Apply the exact hexagon.

Corollary A.3.3. If A ⊂ B are G-modules and C = B/A is a finite quotient, then

q(A) = q(B) whenever either of these are defined.

Proof. If $C$ is finite, we have

$$q(C) = \frac{[\ker N : \operatorname{im}(t-1)]}{[\ker(t-1) : \operatorname{im} N]} = \frac{|\ker N|\,|\operatorname{im} N|}{|\ker(t-1)|\,|\operatorname{im}(t-1)|} = \frac{|C|}{|C|} = 1.$$

Then apply Lemma A.3.2.
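The computation in this proof can be checked numerically on small examples. The Python sketch below is our own illustration, with a hypothetical function name: it takes $C = \mathbb{Z}/m$ with the generator $t$ of a cyclic group of order $n$ acting as multiplication by a unit $u$ satisfying $u^n \equiv 1 \pmod{m}$, and returns the two indices that appear in $q(C)$.

```python
def herbrand_indices(m, u, n):
    """For C = Z/m with t acting as multiplication by u (u^n = 1 mod m),
    return ([ker N : im(t-1)], [ker(t-1) : im N])."""
    if pow(u, n, m) != 1 % m:
        raise ValueError("u must satisfy u^n = 1 (mod m)")
    elems = range(m)
    norm_coeff = sum(pow(u, i, m) for i in range(n)) % m

    def t_minus_1(c):
        return (u - 1) * c % m

    def norm_map(c):
        return norm_coeff * c % m

    ker_N = {c for c in elems if norm_map(c) == 0}
    im_N = {norm_map(c) for c in elems}
    ker_t = {c for c in elems if t_minus_1(c) == 0}
    im_t = {t_minus_1(c) for c in elems}
    # N(t-1) = 0, so each image sits inside the other map's kernel.
    assert im_t <= ker_N and im_N <= ker_t
    return len(ker_N) // len(im_t), len(ker_t) // len(im_N)

if __name__ == "__main__":
    for m, u, n in [(7, 2, 3), (8, 3, 2), (12, 1, 4), (15, 4, 2)]:
        print((m, u, n), herbrand_indices(m, u, n))
```

In each case the two indices agree, so their quotient is 1, as the corollary's proof predicts for a finite module.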

There is a special case of cyclic cohomology for finite, Galois extensions L/K,

famously listed as Theorem 90 in [13].

Theorem (Hilbert’s Theorem 90). If $G = \mathrm{Gal}(L/K)$ is the Galois group for $L/K$, a finite, Galois extension of number fields, then $H^1(G; L^*) = 1$ where $L^*$ denotes the invertible elements of $L$.


A.4 Helpful Magma Functions

Here we list some functions we developed to compute the distribution of primes

among divisions, cycle types and conjugacy classes of G in the sense of Frobenius’ and

Cebotarev’s density theorems (Sections 2.5 and 2.10). These are referenced frequently

in Example 2.10.7.

FrobeniusTally := function(f,n)

// INPUT: f - monic polynomial with integer coefficients

// n - positive integer

// OUTPUT: Outputs a tally of how often each element of the Galois

// group occurs as the Frobenius element of a prime p for

// p less than n.

K := NumberField(f);

disc := Discriminant(f);

discPrimes := [p[1] : p in Factorization(Integers() ! disc)];

validPrimes := [ p : p in [2..n] | IsPrime(p) and p notin discPrimes ];

frobElts := [ FrobeniusElement(K,p) : p in validPrimes ];

uniqueFrobElts := SequenceToSet(frobElts);

frobTally := [ <p,#[q : q in frobElts | q eq p]> : p in uniqueFrobElts ];

return frobTally;

end function;

Divisions := function(G)

// INPUT: G - a finite group

// OUTPUT : A set of representatives of the divisions of G.

ccG := ConjugacyClasses(G);

ccSubgroups := SequenceToSet([ sub < G | cc[3] > : cc in ccG ]);

divReps := [ GeneratorsSequence(H)[1] : H in ccSubgroups ];

return divReps;

end function;

Divisionmates := function(G,x)
// INPUT: G - a finite group
// x - an element of G
// OUTPUT: The set of elements of G that are in the same division as x.
    subgp := sub < G | x >;
    ordx := Order(subgp);
    // k runs up to ordx so that the identity (ordx = 1) still yields one generator
    gens := [ x^k : k in [1..ordx] | Gcd(k,ordx) eq 1 ];
    ccs := [ Conjugates(G,y) : y in gens ];
    return (&join ccs);
end function;

Cyclemates := function(G,x)
// INPUT: G - a finite permutation group
// x - an element of G
// OUTPUT: The set of elements of G that have the same cycle type as x.
    ccG := ConjugacyClasses(G);
    cycleCCGs := [ Conjugates(G,cc[3]) : cc in ccG | CycleStructure(cc[3])
        eq CycleStructure(x) ];
    return (&join cycleCCGs);
end function;

DivisionTally := function(T)

// INPUT: T - tally from FrobeniusTally

// OUTPUT: Collect all elements that are in the same division in T.

G := Parent(T[1][1]);

divsG := Divisions(G);

divTally := [ < divRep, &+[ p[2] : p in T | (sub < G | p[1] >) eq (

sub < G | divRep >) ] > : divRep in divsG ];

return divTally;

end function;

CycleTypeTally := function(T)
// INPUT: T - tally from FrobeniusTally
// OUTPUT: Collect all elements that are in the same cycle type in T.
    G := Parent(T[1][1]);
    ccG := ConjugacyClasses(G);
    cycleTypesG := SequenceToSet ([ CycleStructure(cc[3]) : cc in ccG ]);
    ccTally := [ <ct, &+[ p[2] : p in T | CycleStructure(p[1]) eq ct] > :
        ct in cycleTypesG ];
    return ccTally;
end function;

ConjClassTally := function(T)
// INPUT: T - tally from FrobeniusTally
// OUTPUT: Collect all elements that were conjugate from T.
    G := Parent(T[1][1]);
    ccG := ConjugacyClasses(G);
    ccTally := [ < cc[3], &+[ p[2] : p in T | IsConjugate(G,cc[3],p[1])
        ] > : cc in ccG ];
    return ccTally;
end function;

The function FrobeniusElement(K,p) is a built-in Magma function that computes a Frobenius element $\mathrm{Frob}_{N/K}(\mathfrak{p})$ for a prime $\mathfrak{p} \in \mathcal{O}_K$, where $N$ is the Galois closure of $K$. By the definition of $\mathrm{Frob}_{N/K}(\mathfrak{p})$ in Section 2.10 we know that such an element is unique up to conjugacy, and is completely unique in the abelian case.

In light of the importance of Theorem 2.11.3, it is desirable to be able to compute the minimal polynomial of the j-invariant of $\mathcal{O}_K$ for $K = \mathbb{Q}(\sqrt{-n})$ and $n \ge 2$. The following Magma function does just that.

getJInvMinPoly := function(n)

// INPUT : n - an integer

// OUTPUT : The minimal polynomial of the j invariant of the ring of

// integers in number field QQ(\sqrt{-n}), i.e. a primitive

// element for the Hilbert class field of QQ(\sqrt{-n}).

R<x> := PolynomialRing(Integers());

f := x^2+n;

K<y> := NumberField(f);

H := HilbertClassField(K);

z := PrimitiveElement(H);

return MinimalPolynomial(z,RationalField());

end function;

The next function makes it easy to create a list of class numbers for the first n

negative discriminants. This is useful when creating examples such as Example ??

where a particular class number is desired.

getJInvDegrees := function(n);

// INPUT : n - a positive integer

// OUTPUT : A list of tuples [i,d] where d is the degree of the minimal

// polynomial of the j-invariant of the ring of integers of

// QQ(\sqrt{-i}) for i less than or equal to n.

jInvDegs := {@ [i,Degree(getJInvMinPoly(i))] : i in [2..n] @};

return jInvDegs;

end function;

The next sequence of functions is used in Section 3.3 to study n-Fermat primes and symmetric n-Fermat primes. Note that if $x^2 + ny^2$ is prime then $\gcd(x, n) = 1$; if in addition $y^2 + nx^2$ is prime, $\gcd(y, n) = 1$. So when searching for n-Fermat primes and symmetric n-Fermat primes, we can reduce our search parameters on $x$ and $y$ to those solutions that are relatively prime to $n$.

isNFermatPrime := function(f,n,p)

// INPUT: f - minimal polynomial of the jInvariant of the order

// Z[sqrt(-n)]

// n - positive integer

// p - prime integer (primality is checked below)

// OUTPUT: true if p = x^2+ny^2 and false otherwise.

if not IsPrime(p) then

return false;

end if;

if n eq p then

return true;

end if;

// At this point, p is prime and n neq p.

// Create polynomial ring over F_p

R<x> := PolynomialRing(FiniteField(p));

// Test if Legendre symbol is 1

jacobiCond := LegendreSymbol(-n,p) eq 1;

// Check if f mod p has a root

congruenceCond := HasRoot(R ! f,R);

return (jacobiCond and congruenceCond);

end function;

// This is the brute force factoring method.

factorNFermatPrimeNaive := function(n,p)
// INPUT : n - a positive integer
// p - a positive integer
// OUTPUT : Returns a sequence [p, x, y] if p = x^2 + ny^2.
// If no such x or y exist, [p,0,0] is returned.

factor := {@ [x,y] : x in [0..p], y in [0..p] | p eq x^2 + n*y^2 @};

if #factor eq 0 then

return [p,0,0];

else

return [p,factor[1][1],factor[1][2]];

end if;

end function;
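The brute-force search above ranges over all $x, y \in [0, p]$, which costs on the order of $p^2$ tests. A cheaper variant is sketched below in Python (our own sketch with a hypothetical name, not a replacement for the Magma function): it loops only over $y$ with $ny^2 \le p$ and checks whether $p - ny^2$ is a perfect square. It returns the representation with the smallest $y$, which may differ from the first one found by the Magma search order.

```python
from math import isqrt

def factor_n_fermat(n, p):
    """Return (x, y) with p = x^2 + n*y^2 and x, y >= 0, choosing the
    smallest possible y, or None if no representation exists."""
    y = 0
    while n * y * y <= p:
        x2 = p - n * y * y
        x = isqrt(x2)
        if x * x == x2:
            return (x, y)
        y += 1
    return None

if __name__ == "__main__":
    print(factor_n_fermat(2, 131))
    print(factor_n_fermat(1, 13))
    print(factor_n_fermat(2, 7))
```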

getNFPNaive := function(n,M)
// INPUT : n - a positive integer
// M - a positive integer
// OUTPUT : Return a list of tuples [p,x,y] where p is an n-Fermat
// prime with x and y less than or equal to M.

searchSpace := [m : m in [1..M] | Gcd(n,m) eq 1];

primesList := {@ [x^2+n*y^2,x,y] : x in searchSpace, y in searchSpace |

IsPrime(x^2 + n*y^2) @};

return primesList;

end function;

getNFPNaive2 := function(n,M)
// INPUT : n - a positive integer
// M - a positive integer
// OUTPUT : Return a list of tuples [p,x,y] where p is an n-Fermat
// prime with x and y less than or equal to M.
// Similar to getNFPNaive, but the search space is traversed in the
// opposite order.

searchSpace := [m : m in [1..M] | Gcd(n,m) eq 1];

primesList := {@ [x^2+n*y^2,x,y] : y in searchSpace, x in searchSpace |

IsPrime(x^2 + n*y^2) @};

return primesList;

end function;

getSymmNFPNaiveList := function(n,primesList);
// INPUT : n - a positive integer
// primesList - a list of n-Fermat primes in the form of a
// tuple [p,x,y] with p = x^2 + ny^2.
// OUTPUT : Return a list of tuples [p,x,y,q] where q = y^2 + nx^2 is
// prime. Only tuples with x less than y are kept.

symmPrimes := {@ [p[1],p[2],p[3],p[3]^2+n*p[2]^2] : p in primesList |

IsPrime(p[3]^2+n*p[2]^2) and p[2] lt p[3] @};

return symmPrimes;

end function;

getSymmNFPNaive := function(n,M);
// INPUT : n - a positive integer
// M - a positive integer
// OUTPUT : Return a list of tuples [p,x,y,q] where p = x^2 + ny^2 and
// q = y^2 + nx^2 are both prime with x and y less than or equal
// to M.

primesList := getNFPNaive(n,M);

return getSymmNFPNaiveList(n,primesList);

end function;


In Section 3.3, we are interested in the number of symmetric n-Fermat primes. This is best understood in terms of the Prime Number Theorem. We will use the following heuristic: let $\pi(L)$ denote the number of primes in a set $L$ of positive integers; then

$$\pi(L) \sim C \sum_{m \in L} \frac{1}{\log m}$$

where log is the natural logarithm and $C$ is a constant. For our purposes, we are interested in $\pi_{\mathrm{sym},n}(M) = \pi(L)$ where $L$ is the set of n-Fermat numbers $y^2 + nx^2$ such that $x^2 + ny^2$ is prime and $x, y \le M$. We have written a Magma function that takes a list of n-Fermat primes $x^2 + ny^2$ and calculates the expected number of $y^2 + nx^2$ that are prime.

expectedNumberOfPrimes := function(n,L);
// INPUT : n - a positive integer
// L - A list of tuples of the form [p,x,y] where p = x^2 + ny^2
// and p is prime.
// OUTPUT : The number of the swapped values y^2 + nx^2 that are
// expected to be prime, as predicted by PNT. This is the sum
// of the reciprocals of the logs of those values.
    // p[2] = x and p[3] = y, so the swapped value is y^2 + n*x^2
    return &+[1/Log(p[3]^2 + n*p[2]^2) : p in L];
end function;

The next four functions generate data on $\pi_{\mathrm{sym},n}(M)$ for a range of values for $n$ and $M$. The getInterestingPrimes and getInterestingPrimes2 functions in particular are useful for picking out $n$ values for which there are either many more or many fewer symmetric n-Fermat primes than predicted by our PNT heuristic. In the getSymmRatio function, the factor of 2 in the output for expSymm is explained in Section 3.3.

getSymmRatio := function(n,M);
// INPUT : n - a positive integer
// M - a positive integer
// OUTPUT : A tuple of the form [n,symmRat,expSymm,N,S] where:
// symmRat - the number of symmetric n-Fermat primes divided by the
// number of n-Fermat primes
// expSymm - the ratio of the number of symmetric n-Fermat tuples to
// the expected number of symmetric n-Fermat tuples as
// predicted by PNT
// N - the number of n-Fermat primes with x and y leq M
// S - the number of symmetric n-Fermat primes with x and y leq M.

R := RealField(10);

primeList := getNFPNaive(n,M);

symmList := getSymmNFPNaiveList(n,primeList);

// In the below command, the 2 is present in the third entry because

// if x^2 + ny^2 is prime and (x,n) = (y,n) = 1, y^2 + nx^2 is odd,

// so is already twice as likely to be prime.

return [

n,

R ! #(symmList)/#(primeList),

#(symmList)/(2*expectedNumberOfPrimes(n,primeList)),

#primeList,

#symmList];

end function;

getSymmRatios := function(n,M);
// INPUT : n - a positive integer
// M - a positive integer
// OUTPUT : A list of tuples returned by getSymmRatio(i,M) for i
// between 2 and n.
// See the description above for the entries in the tuple.

symRats := {@ getSymmRatio(i,M) : i in [2..n] @};

return symRats;

end function;

getInterestingPrimes := function(n,M,r);



// INPUT  : r - a real number
// OUTPUT : The tuples returned by getSymmRatio(i,M) for i
//          between 2 and n such that the ratio of symmetric n-Fermat
//          primes to the number of symmetric n-Fermat primes expected
//          by PNT *exceeds* r.

symRats := getSymmRatios(n,M);

return {@ p : p in symRats | p[3] gt r @};

end function;

getInterestingPrimes2 := function(n,M,r);



// INPUT  : r - a real number
// OUTPUT : The tuples returned by getSymmRatio(i,M) for i
//          between 2 and n such that the ratio of symmetric n-Fermat
//          primes to the number of symmetric n-Fermat primes expected
//          by PNT *is less than* r.

symRats := getSymmRatios(n,M);

return {@ p : p in symRats | p[3] lt r @};

end function;

The following functions are used to generate statistics for the data generated by

the functions above.

averageRatio := function(primeData)

retVal := &+[ p[3] : p in primeData ] / (#primeData);

return retVal;

end function;

stdDevRatio := function(primeData)

N := #primeData;

avg := averageRatio(primeData);

retVal := SquareRoot(&+ [ (p[3] - avg)^2 : p in primeData ]/N);

return retVal;

end function;

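ratioFrequency := function(primeData,i)

// Assumes primeData was produced by getSymmRatios, whose entries start
// at n = 2, so that entry i-1 holds the data for n = i.

avg := averageRatio(primeData);

stdDev := stdDevRatio(primeData);

// Number of standard deviations between the ratio for n = i and the mean.
numStdDev := Abs(primeData[i-1][3] - avg) / stdDev;

// Reciprocal of the two-sided normal tail probability at that distance.
freq := 1/(1-Erf(numStdDev/SquareRoot(2)));

return freq;

end function;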
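
The quantity computed by ratioFrequency can be read as follows. The symbols r_i, μ, σ, k and Z are our own labels for this note (the expSymm value for n = i, the mean and standard deviation returned by averageRatio and stdDevRatio, the resulting number of standard deviations, and a standard normal variable). Treating the ratios as normally distributed is a modeling assumption, not something established here.

```latex
\[
  k \;=\; \frac{\lvert r_i - \mu \rvert}{\sigma},
  \qquad
  \mathrm{freq} \;=\; \frac{1}{1 - \operatorname{erf}\!\left(k/\sqrt{2}\right)}
  \;=\; \frac{1}{P\left(\lvert Z \rvert \ge k\right)}
\]
```

For example, k = 2 gives 1 - erf(√2) ≈ 0.0455, so freq ≈ 22: under the normal model, a ratio this far from the mean would be expected roughly once in every 22 values of n. Note that stdDevRatio divides by N rather than N - 1, so σ is the population standard deviation of the data.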

Curriculum Vitae

————————————————————————————————————–

EDUCATION

Wake Forest University, Winston-Salem, North Carolina

M.A. Mathematics, expected graduation May, 2015

GPA: 4.00

Honors: Teaching Assistantship (2013-2015)

Outstanding Graduate Student Award (2014)

Wake Forest University, Winston-Salem, North Carolina

B.S. Mathematics; B.A. Spanish

GPA: 3.88; Dean’s List (8 semesters); Summa Cum Laude

Honors: Kenneth Tyson Raynor scholar (2011 and 2012)

John Y. Phillips Prize in Mathematics (2013)

Phi Beta Kappa

Pi Mu Epsilon (NC Lambda Chapter President)

————————————————————————————————————–

RESEARCH EXPERIENCE

Class Field Theory, with Dr. Frank Moore, Wake Forest University, 2013-2015

• Master’s thesis entitled Class Field Theory and the Study of n-Fermat Primes

Knot Mosaic Theory, with Dr. Hugh Howards, Wake Forest University, 2012-2014

• Co-authored paper, Crossing Number Bounds in Knot Mosaics; submitted for consideration to Journal of Knot Theory and Its Ramifications

• Available at http://arxiv.org/abs/1405.7683 (arXiv:1405.7683v2 [math.GT])

————————————————————————————————————–

TEACHING EXPERIENCE

Teaching Assistant, Wake Forest University Dept. of Mathematics, 2013-2015

• Tutoring, study sessions, grading and some lecture experience

• Statistics, discrete math, linear algebra, multivariable calculus and real analysis

Mathematics Tutor, Wake Forest University Math Center, 2010-2013

• Statistics, calculus (single and multivariable), algebra, discrete math

Spanish Tutor, Wake Forest University Dept. of Romance Languages, 2012-2014

————————————————————————————————————–
