CLASS FIELD THEORY AND THE THEORY OF N-FERMAT PRIMES
BY
ANDREW KOBIN
A Thesis Submitted to the Graduate Faculty of
WAKE FOREST UNIVERSITY GRADUATE SCHOOL OF ARTS AND SCIENCES
in Partial Fulfillment of the Requirements
for the Degree of
MASTER OF ARTS
Mathematics
May, 2015
Winston-Salem, North Carolina
Approved By:
Frank Moore, Ph.D., Advisor
Hugh Howards, Ph.D., Chair
Jeremy Rouse, Ph.D.
Acknowledgments
This was probably the hardest page of this thesis to write, as no number of words is sufficient to praise those who have helped and supported me along the way to my Master’s degree.
First, I would like to thank the members of my committee. Thank you to Dr. Jeremy Rouse for his seemingly infinite wisdom and even greater generosity in sharing his knowledge with me. We had numerous discussions on the finer details of algebraic number theory that helped shape the direction of my research. He is also a fast runner. Thank you to Dr. Hugh Howards for his mentorship and advice going all the way back to 2010 when I was a mere freshman at Wake Forest. My identity as a mathematician is in large part due to the dedication of Dr. Howards as an educator. Lastly, thank you to my adviser, Dr. Frank Moore, for his selfless devotion to this project over nearly two years’ time. I would not be where I am today without his mathematical knowledge, worldly advice and sincere friendship.
The Department of Mathematics at Wake Forest has been like a second family to me for some years now. Thank you to everyone here that has helped me to survive and thrive at Wake Forest.
To my office mates, Mackenzie, Elliott, Elena and Amelie: thank you for your support and for putting up with my loud music!
Finally, I would like to thank my family for their love and for providing me with opportunities in life that have allowed me to succeed. They are my biggest supporters and I love them dearly.
Table of Contents
Acknowledgments
Abstract
Chapter 1 Algebraic Number Fields
1.1 Rings of Algebraic Integers
1.2 Dedekind Domains
1.3 Ramification of Primes
1.4 The Decomposition and Inertia Groups
1.5 Norms of Ideals
1.6 Discriminant and Different
1.7 The Class Group
1.8 The Hilbert Class Field
1.9 Orders
1.10 Units in a Number Field
Chapter 2 Class Field Theory
2.1 Valuations and Completions
2.2 Frobenius Automorphisms and the Artin Map
2.3 Ray Class Groups
2.4 L-series and Dirichlet Density
2.5 The Frobenius Density Theorem
2.6 The Second Fundamental Inequality
2.7 The Artin Reciprocity Theorem
2.8 The Conductor Theorem
2.9 The Existence and Classification Theorems
2.10 The Cebotarev Density Theorem
2.11 Ring Class Fields
Chapter 3 Quadratic Forms and n-Fermat Primes
3.1 The Theory of Binary Quadratic Forms
3.2 The Form Class Group
3.3 n-Fermat Primes
Bibliography
Appendix A Appendix
A.1 The Four Squares Theorem
A.2 The Snake Lemma
A.3 Cyclic Group Cohomology
A.4 Helpful Magma Functions
Curriculum Vitae
Abstract
Andrew J. Kobin
Most problems in number theory are exceedingly simple to state, yet many continue to elude mathematicians even centuries after they were originally posed. One such question, “Given a positive integer $n$, when can a prime number be written in the form $x^2 + ny^2$?”, was solved by Cox [7], and although the statement is elementary, the solution requires the depth and power of class field theory to understand. In our approach to this question, we will explore a variety of topics, including: algebraic number fields; types of class groups and class fields; two density theorems; the main theorems in class field theory; and the theory of quadratic forms. Our discussion will culminate in Theorem 2.11.3, a full characterization of primes of the form $x^2 + ny^2$.
However, the intrigue doesn’t end there. In Chapter 3, we pose the related question: “If $p$ is a prime of the form $x^2 + ny^2$, when is $y^2 + nx^2$ also prime?” This question turns out to be much harder to approach, but we will investigate the symmetric n-Fermat prime question thoroughly.
In certain sections (1.8, 2.10 and 3.3) we use the Magma Computational Algebra System to handle large or complicated computations. Many of the basic commands can be found in the Magma handbook, available at http://magma.maths.usyd.edu.au/magma/handbook/ through the University of Sydney’s Computational Algebra Group.
Chapter 1: Algebraic Number Fields
In the first chapter we provide a detailed description of the main topics in algebraic
number theory: algebraic number fields, rings of integers, the behavior of prime ideals
in extensions, norms of ideals, the discriminant and different, the class group, the
Hilbert class field, orders and Dirichlet’s unit theorem.
1.1 Rings of Algebraic Integers
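Let $\overline{\mathbb{Q}}$ be an algebraic closure of $\mathbb{Q}$. Then $\overline{\mathbb{Q}}$ is an infinite dimensional $\mathbb{Q}$-vector space
and every polynomial $f \in \mathbb{Q}[x]$ splits in $\overline{\mathbb{Q}}[x]$. An example of such an algebraic closure
is $\overline{\mathbb{Q}} = \{u \in \mathbb{C} \mid f(u) = 0 \text{ for some nonzero } f \in \mathbb{Q}[x]\}$. Then $\mathbb{Q} \subset \overline{\mathbb{Q}} \subset \mathbb{C}$. Note that any two
choices of $\overline{\mathbb{Q}}$ are isomorphic.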
One of the most important elements of a number field we will be working with is:
Definition. An element $\alpha \in \overline{\mathbb{Q}}$ is an algebraic integer if it is a root of some monic
polynomial with coefficients in $\mathbb{Z}$.
Example 1.1.1. $\sqrt{2}$ is an algebraic integer since it is a root of $x^2 - 2$. However, $\tfrac{1}{2}$, $\pi$
and $e$ are not algebraic integers. We will see in a moment why $\tfrac{1}{2}$
is not an algebraic integer; $\pi$ and $e$ are not even algebraic over $\mathbb{Q}$, but
the proof of this is famously difficult.
Note that the set of algebraic integers in Q is precisely the integers Z. In a moment
we will generalize this set to fields other than Q.
Definition. The minimal polynomial of $\alpha \in \overline{\mathbb{Q}}$ is the monic polynomial $f \in \mathbb{Q}[x]$
of minimal degree such that $f(\alpha) = 0$.
The minimal polynomial of $\alpha$ is unique, as the following lemma shows.
Lemma 1.1.2. Suppose $\alpha \in \overline{\mathbb{Q}}$. Then the minimal polynomial $f$ of $\alpha$ divides any
other polynomial $h \in \mathbb{Q}[x]$ such that $h(\alpha) = 0$.
Proof. Suppose h(α) = 0. Then by the division algorithm, h = fq + r with deg r <
deg f . Note that r(α) = h(α) − f(α)q(α) = 0 so α is a root of r. But since deg f is
minimal among all polynomials of which α is a root, r must be 0. This shows that f
divides h.
Lemma 1.1.3. If $\alpha \in \overline{\mathbb{Q}}$ is an algebraic integer then its minimal polynomial has
coefficients in $\mathbb{Z}$.
Proof. Let $f \in \mathbb{Q}[x]$ be the minimal polynomial of $\alpha$. Since $\alpha$ is an algebraic integer,
there is some monic $g \in \mathbb{Z}[x]$ such that $g(\alpha) = 0$. By Lemma 1.1.2, $g = fh$ for some monic
$h \in \mathbb{Q}[x]$. Suppose $f \notin \mathbb{Z}[x]$. Then there is some prime $p$ dividing the denominator
of at least one of the coefficients of $f$; let $p^i$ be the largest power of $p$ that divides the
denominator of a coefficient of $f$. Likewise let $p^j$ be the largest power of $p$ that divides the denominator of
a coefficient of $h$. Then $p^{i+j} g = (p^i f)(p^j h)$ and, since $i + j \geq 1$, reducing mod $p$ gives 0 on the left, but a product of
two nonzero polynomials in $\mathbb{F}_p[x]$ on the right, a contradiction. Hence $f \in \mathbb{Z}[x]$.
An important characterization of algebraic integers is provided in the following
proposition.
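Proposition 1.1.4. $\alpha \in \overline{\mathbb{Q}}$ is an algebraic integer if and only if
$$\mathbb{Z}[\alpha] = \left\{ \sum_{i=0}^{n} c_i \alpha^i : c_i \in \mathbb{Z},\ n \geq 0 \right\}$$
is a finitely generated $\mathbb{Z}$-module.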
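Proof. ($\Longrightarrow$) Suppose $\alpha$ is integral with minimal polynomial $f \in \mathbb{Z}[x]$, where $\deg f = k$. Then $\mathbb{Z}[\alpha]$ is generated by $1, \alpha, \ldots, \alpha^{k-1}$.
($\Longleftarrow$) Suppose $\alpha \in \overline{\mathbb{Q}}$ and $\mathbb{Z}[\alpha]$ is generated by $f_1(\alpha), \ldots, f_n(\alpha)$ with each $f_i \in \mathbb{Z}[x]$. Let $d > M$
where $M = \max\{\deg f_i \mid 1 \leq i \leq n\}$. Then
$$\alpha^d = \sum_{i=1}^{n} a_i f_i(\alpha)$$
for some choice of $a_i \in \mathbb{Z}$. Hence $\alpha$ is a root of $x^d - \sum_{i=1}^{n} a_i f_i(x)$, which is monic because $d > M$, so $\alpha$ is integral.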
Example 1.1.5. $\alpha = \tfrac{1}{2}$ is not an algebraic integer since $\mathbb{Z}\left[\tfrac{1}{2}\right]$ is not finitely generated
as a $\mathbb{Z}$-module.
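The contrast between Proposition 1.1.4 and Example 1.1.5 can be seen numerically. The following Python sketch is only an illustration; the helper names `power_in_basis` and `half_power_denominators` are our own and do not come from the thesis or from Magma. It reduces powers of a root of a monic integer polynomial to integer combinations of $1, \alpha, \ldots, \alpha^{k-1}$, and then lists the denominators of the powers of $\tfrac{1}{2}$, which grow without bound.

```python
from fractions import Fraction


def power_in_basis(lower_coeffs, d):
    """Coordinates of alpha^d in the basis 1, alpha, ..., alpha^(k-1).

    lower_coeffs = [a_0, ..., a_(k-1)] describes the monic integer polynomial
    x^k + a_(k-1) x^(k-1) + ... + a_0 that has alpha as a root.
    """
    k = len(lower_coeffs)
    vec = [1] + [0] * (k - 1)  # alpha^0 = 1
    for _ in range(d):
        top = vec[-1]  # coefficient of alpha^(k-1) before multiplying by alpha
        shifted = [0] + vec[:-1]  # multiply every basis element by alpha
        # replace alpha^k by -(a_0 + a_1 alpha + ... + a_(k-1) alpha^(k-1))
        vec = [s - top * a for s, a in zip(shifted, lower_coeffs)]
    return vec


def half_power_denominators(count):
    """Denominators of (1/2)^0, (1/2)^1, ..., (1/2)^(count-1)."""
    return [(Fraction(1, 2) ** d).denominator for d in range(count)]


print(power_in_basis([-2, 0], 5))   # sqrt(2)^5 written in the basis 1, sqrt(2)
print(power_in_basis([-1, -1], 5))  # fifth power of the golden ratio
print(half_power_denominators(6))   # these denominators are unbounded
```

Every power of $\sqrt{2}$ or of the golden ratio stays inside the integer span of two elements, whereas any finitely generated $\mathbb{Z}$-submodule of $\mathbb{Q}$ has bounded denominators, so it cannot contain all powers of $\tfrac{1}{2}$.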
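Definition. For a given algebraic closure $\overline{\mathbb{Q}}$ of $\mathbb{Q}$, we will denote the set of all algebraic
integers in $\overline{\mathbb{Q}}$ by $\overline{\mathbb{Z}}$.
This set inherits the binary operations $+$ and $\cdot$ from $\overline{\mathbb{Q}}$, and an important property
is that $\overline{\mathbb{Z}}$ is closed under these operations:
Proposition 1.1.6. The set $\overline{\mathbb{Z}}$ of all algebraic integers is a ring.
Proof. Note that 0 and 1 are roots of the monic polynomials $x$ and $x - 1$, so $0, 1 \in \overline{\mathbb{Z}}$, and if $\alpha$ is a root of a monic $f \in \mathbb{Z}[x]$ of degree $m$ then $-\alpha$ is a root of the monic polynomial $(-1)^m f(-x)$. Then it suffices to
prove closure under addition and multiplication.
Suppose $\alpha, \beta \in \overline{\mathbb{Z}}$ and let $m$ and $n$ be the degrees of their respective minimal
polynomials. Then $1, \alpha, \ldots, \alpha^{m-1}$ span $\mathbb{Z}[\alpha]$ and $1, \beta, \ldots, \beta^{n-1}$ likewise span $\mathbb{Z}[\beta]$. So
the elements $\alpha^i\beta^j$ for $0 \leq i \leq m-1$, $0 \leq j \leq n-1$ span $\mathbb{Z}[\alpha, \beta]$, so this $\mathbb{Z}$-module is finitely
generated. This implies that the submodules $\mathbb{Z}[\alpha + \beta]$ and $\mathbb{Z}[\alpha\beta]$ of $\mathbb{Z}[\alpha, \beta]$ are also
finitely generated, so it follows by Proposition 1.1.4 that $\alpha + \beta$ and $\alpha\beta$ are algebraic
integers.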
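To make the proof of Proposition 1.1.6 concrete, take $\alpha = \sqrt{2}$ and $\beta = \sqrt{3}$. Multiplication by $\alpha + \beta$ (or by $\alpha\beta$) is an integer matrix on the spanning set $1, \sqrt{2}, \sqrt{3}, \sqrt{6}$ of $\mathbb{Z}[\alpha, \beta]$, and by the Cayley-Hamilton theorem its characteristic polynomial is a monic integer polynomial with the given element as a root. The Python sketch below is our own illustration, not part of the thesis; the matrices are written out by hand and the characteristic polynomial is computed with the Faddeev-LeVerrier recursion.

```python
from fractions import Fraction

# Column j holds the coordinates of (element) * (j-th basis vector) in the
# basis 1, sqrt(2), sqrt(3), sqrt(6).
MULT_BY_SUM = [      # element = sqrt(2) + sqrt(3)
    [0, 2, 3, 0],
    [1, 0, 0, 3],
    [1, 0, 0, 2],
    [0, 1, 1, 0],
]
MULT_BY_PRODUCT = [  # element = sqrt(2) * sqrt(3) = sqrt(6)
    [0, 0, 0, 6],
    [0, 0, 3, 0],
    [0, 2, 0, 0],
    [1, 0, 0, 0],
]


def characteristic_polynomial(A):
    """Coefficients [c_0, ..., c_n] of det(xI - A), lowest degree first."""
    n = len(A)
    coeffs = [Fraction(0)] * n + [Fraction(1)]
    # M starts as the identity matrix (M_1 in the Faddeev-LeVerrier recursion)
    M = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(1, n + 1):
        AM = [[sum(A[i][t] * M[t][j] for t in range(n)) for j in range(n)]
              for i in range(n)]
        c = -sum(AM[i][i] for i in range(n)) / k  # c_(n-k) = -tr(A M_k) / k
        coeffs[n - k] = c
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)]
             for i in range(n)]
    return [int(c) for c in coeffs]


print(characteristic_polynomial(MULT_BY_SUM))      # x^4 - 10x^2 + 1
print(characteristic_polynomial(MULT_BY_PRODUCT))  # (x^2 - 6)^2
```

Both outputs are monic with integer coefficients, which is exactly what the finite generation argument predicts for $\sqrt{2} + \sqrt{3}$ and $\sqrt{6}$.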
The two most important objects of study in algebraic number theory are number
fields and their associated rings of integers, which are defined below.
Definition. A number field is a subfield $K \subset \overline{\mathbb{Q}}$ such that $K$ is a finite dimensional
vector space over $\mathbb{Q}$. The dimension of $K/\mathbb{Q}$ is called the degree of the field extension,
denoted $[K : \mathbb{Q}]$.
Definition. The ring of integers of a number field $K$ is
$$\mathcal{O}_K = K \cap \overline{\mathbb{Z}} = \{\alpha \in K \mid \alpha \text{ is an algebraic integer}\}.$$
Example 1.1.7. Q is the unique number field of degree 1, and its ring of integers is
the rational integers Z.
Example 1.1.8. Q(i) is a number field of degree 2. Its ring of integers is Z[i], the
Gaussian integers.
Example 1.1.9. $K = \mathbb{Q}(\sqrt{5})$ has ring of integers $\mathcal{O}_K = \mathbb{Z}[(1 + \sqrt{5})/2]$. The reader
may recognize the number $(1 + \sqrt{5})/2$ as the golden ratio.
An object we will study in Section 1.9 is:
Definition. An order in OK is any subring O ⊂ OK such that the quotient OK/O
of abelian groups is finite.
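Example 1.1.10. For $\mathcal{O}_{\mathbb{Q}(i)} = \mathbb{Z}[i]$, the subring $\mathbb{Z} + ni\mathbb{Z}$ is an order for every nonzero
$n \in \mathbb{Z}$. However, $\mathbb{Z} \subset \mathbb{Z}[i]$ is not an order since $\mathbb{Z}$ does not have finite index in $\mathbb{Z}[i]$.
Example 1.1.11. For $K = \mathbb{Q}(\alpha)$ where $\alpha$ is an algebraic integer, $\mathbb{Z}[\alpha]$ is an order in
$\mathcal{O}_K$ but in general $\mathbb{Z}[\alpha] \neq \mathcal{O}_K$. We study orders in further detail in Section 1.9 and
see some important examples where $\mathbb{Z}[\alpha] \neq \mathcal{O}_K$.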
Lemma 1.1.12. For any number field K, OK ∩Q = Z and QOK = K.
Proof. Suppose $\alpha \in \mathcal{O}_K \cap \mathbb{Q}$ and write $\alpha = \tfrac{a}{b}$ in lowest terms. We may assume $b > 0$.
Since $\alpha$ is integral, $\mathbb{Z}\left[\tfrac{a}{b}\right]$ is finitely generated as a $\mathbb{Z}$-module so $b = 1$.
On the other hand, suppose $\alpha \in K$ with minimal polynomial $f(x) \in \mathbb{Q}[x]$, where
$\deg f = n$. For any positive integer $d$, the minimal polynomial of $d\alpha$ is $d^n f\left(\tfrac{x}{d}\right)$. In
particular, let $d$ be the least common multiple of the denominators of the coefficients
of $f$. Then $d^n f\left(\tfrac{x}{d}\right)$ has integer coefficients, so $d\alpha \in \mathcal{O}_K$. Hence $\mathbb{Q}\mathcal{O}_K = K$.
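The denominator-clearing step in this proof is easy to carry out explicitly. The short Python sketch below is our own illustration (the function name `integral_multiple` is hypothetical). Given the coefficients of a monic $f \in \mathbb{Q}[x]$, it returns $d$ and the integer coefficients of $d^n f(x/d)$, using the fact that the coefficient of $x^i$ is multiplied by $d^{n-i}$.

```python
from fractions import Fraction
from math import lcm


def integral_multiple(coeffs):
    """Return (d, g) where g lists the coefficients of d^n f(x/d).

    coeffs = [a_0, ..., a_(n-1), 1] are the coefficients of a monic
    polynomial f over Q, lowest degree first.
    """
    coeffs = [Fraction(c) for c in coeffs]
    n = len(coeffs) - 1
    d = lcm(*(c.denominator for c in coeffs))
    # the x^i coefficient of d^n f(x/d) is a_i * d^(n - i)
    g = [c * d ** (n - i) for i, c in enumerate(coeffs)]
    return d, [int(c) for c in g]


# alpha = (1 + sqrt(5))/4 has minimal polynomial x^2 - x/2 - 1/4
print(integral_multiple([Fraction(-1, 4), Fraction(-1, 2), 1]))
```

Here $d = 4$ and the resulting polynomial is $x^2 - 2x - 4$, whose roots are $1 \pm \sqrt{5}$. The choice of $d$ in the proof is convenient rather than minimal: in this example $2\alpha$ is already an algebraic integer.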
Definition. A lattice in a number field $K$ is a subset $L$ such that $\mathbb{Q}L = K$ and $L$ is
a finitely generated abelian group of rank $[K : \mathbb{Q}]$.
Proposition 1.1.13. For any number field K, the ring of integers is a lattice in K.
Proof. QOK = K was proven in Lemma 1.1.12, and the second statement can be
shown by choosing a basis for K consisting of elements in OK .
Corollary 1.1.14. OK is a noetherian ring.
Proof. By the proposition, OK is finitely generated as a Z-module, so it is clearly
finitely generated as a ring. It is well known (cf. 2.2.7 in [25]) that this implies OK
is noetherian.
1.2 Dedekind Domains
A nice property of the integers $\mathbb{Z}$ is unique factorization: every integer $n > 1$ can be
written uniquely (up to order) as a product of powers of prime numbers. This unique factorization
property fails in general for rings of algebraic integers. However, $\mathcal{O}_K$ has the special
property that every nonzero ideal factors uniquely as a product of prime ideals.
Definition. An integral domain R is integrally closed in its field of fractions K if
every α ∈ K that is a root of a monic polynomial f ∈ R[x] is itself in R.
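Proposition 1.2.1. $\overline{\mathbb{Z}}$ is integrally closed and for any number field $K$, its ring of
integers $\mathcal{O}_K$ is integrally closed.
Proof. First suppose $\alpha \in \overline{\mathbb{Q}}$ is integral over $\overline{\mathbb{Z}}$. Then there is some monic polynomial
$f(x)$ in $\overline{\mathbb{Z}}[x]$ such that $f(\alpha) = 0$. If $f(x) = a_0 + a_1 x + \ldots + x^n$ then the $a_i$ all lie
in $\mathcal{O}_K$, where $K = \mathbb{Q}(a_0, \ldots, a_{n-1})$. Since $\mathcal{O}_K$ is finitely generated as a $\mathbb{Z}$-module,
so is $\mathbb{Z}[a_0, \ldots, a_{n-1}]$. Now $f(\alpha) = 0$ means that we can write $\alpha^n$ as a combination
of $\alpha^i$ for $i < n$, with weights $c_i \in \mathbb{Z}[a_0, \ldots, a_{n-1}]$. Thus $\mathbb{Z}[a_0, \ldots, a_{n-1}, \alpha]$ is also a
finitely generated $\mathbb{Z}$-module. But notice that $\mathbb{Z}[\alpha]$ is a submodule of $\mathbb{Z}[a_0, \ldots, a_{n-1}, \alpha]$,
so it too is finitely generated. Hence $\alpha$ is integral over $\mathbb{Z}$, meaning $\alpha \in \overline{\mathbb{Z}}$. This
proves the first statement, and the second statement now follows easily. Suppose
$\alpha \in K$ is integral over $\mathcal{O}_K$. Then since $\mathcal{O}_K \subset \overline{\mathbb{Z}}$ and $\overline{\mathbb{Z}}$ is integrally closed, $\alpha \in \overline{\mathbb{Z}}$, implying
$\alpha \in K \cap \overline{\mathbb{Z}} = \mathcal{O}_K$.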
This property of OK is important in establishing it as a special type of domain,
called a Dedekind domain.
Definition. An integral domain R is a Dedekind domain if
(1) R is noetherian.
(2) R is integrally closed in its field of fractions.
(3) Every nonzero prime ideal p ⊂ R is maximal.
Example 1.2.2. $\mathbb{Z}[\sqrt{5}]$ is not integrally closed since for example $(1+\sqrt{5})/2 \in \mathbb{Q}(\sqrt{5})$
is integral over $\mathbb{Z}[\sqrt{5}]$ but is not itself an element of $\mathbb{Z}[\sqrt{5}]$. Therefore $\mathbb{Z}[\sqrt{5}]$ is not
a Dedekind domain, but as we shall see in Section 1.9 it is an order in the ring of
integers $\mathbb{Z}[(1 + \sqrt{5})/2]$.
Example 1.2.3. Any field is (trivially) a Dedekind domain.
Example 1.2.4. $\overline{\mathbb{Z}}$ is integrally closed and every nonzero prime ideal is maximal, but
$\overline{\mathbb{Z}}$ is not noetherian and hence not a Dedekind domain.
Proposition 1.2.5. OK is a Dedekind domain.
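Proof. We have shown (Corollary 1.1.14 and Proposition 1.2.1) that $\mathcal{O}_K$ is noetherian and integrally
closed, so it suffices to show that nonzero prime ideals are maximal.
Suppose $\mathfrak{p}$ is a nonzero prime ideal of $\mathcal{O}_K$. Let $\alpha \in \mathfrak{p}$ be nonzero and let $f(x) = x^n + a_{n-1}x^{n-1} + \ldots + a_0$ be its minimal polynomial, so that $a_0 \neq 0$ because $f$ is irreducible and $\alpha \neq 0$. Then $f(\alpha) = 0$ so
$$a_0 = -\alpha^n - a_{n-1}\alpha^{n-1} - \ldots - a_1\alpha \in \mathfrak{p}.$$
Then $a_0 \in \mathbb{Z} \cap \mathfrak{p}$ so every element of the quotient $\mathcal{O}_K/\mathfrak{p}$ is killed by $a_0$. Since $\mathcal{O}_K$ is a finitely generated $\mathbb{Z}$-module, this implies
$\mathcal{O}_K/\mathfrak{p}$ is finite. Since $\mathfrak{p}$ is prime, $\mathcal{O}_K/\mathfrak{p}$ is an integral domain, and every finite integral
domain is a field, which proves that $\mathfrak{p}$ is maximal. Hence $\mathcal{O}_K$ is a Dedekind domain.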
The crucial property of Dedekind domains is that their nonzero ideals factor
uniquely into prime ideals. In fact, unique factorization holds for a more general
class of objects in a Dedekind domain called fractional ideals.
Definition. Let $R$ be a Dedekind domain and $K$ be its field of fractions. A fractional
ideal of $R$ is a nonzero $R$-submodule of $K$ that is finitely generated as an $R$-module.
Note that since fractional ideals are finitely generated, we can clear denominators
of a generating set to realize every fractional ideal in the form
aI = {ab | b ∈ I}
where a ∈ K and I is an integral ideal of the ring R.
Example 1.2.6. $\tfrac{1}{2}\mathbb{Z}$ is a fractional ideal of $\mathbb{Z}$.
Lemma 1.2.7. Let R be a Dedekind domain. For every nonzero ideal I ⊂ R, there
exist prime ideals p1, . . . , pn such that p1 · · · pn ⊂ I.
Proof. Let $S$ be the set of nonzero ideals in $R$ that do not satisfy the conclusion of the
lemma. The idea here is to use the fact that $R$ is noetherian to show that $S$ must be
empty. Supposing to the contrary that $S$ is not empty, the noetherian property allows
us to choose a maximal element $I \in S$. If $I$ were prime, it would trivially contain a
product of primes so we know this is not the case. Then there exist $a, b \in R \setminus I$ such
that $ab \in I$. Let $J_1 = I + (a)$ and $J_2 = I + (b)$. Then neither $J_1$ nor $J_2$ is in $S$ since
each properly contains $I$ and $I$ is maximal in $S$, so each contains a product of primes, say
$$\mathfrak{p}_1 \cdots \mathfrak{p}_r \subset J_1 \quad \text{and} \quad \mathfrak{q}_1 \cdots \mathfrak{q}_s \subset J_2.$$
Then $\mathfrak{p}_1 \cdots \mathfrak{p}_r \mathfrak{q}_1 \cdots \mathfrak{q}_s \subset J_1 J_2 = I^2 + I(b) + (a)I + (ab) \subset I$. We have shown $I$ to
contain a product of primes, producing the necessary contradiction to show that $S$ is
empty. Hence every nonzero ideal of $R$ contains a product of primes.
The critical property of fractional ideals is proven next.
Theorem 1.2.8. The set of fractional ideals of a Dedekind domain R forms an abelian
group under ideal multiplication, with identity R.
Proof. The product of two fractional ideals is again finitely generated, hence a fractional
ideal. Also, for any fractional ideal $I$, $IR = I$, so $R$ is the identity and it suffices to show the existence
of inverses.
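First we prove that if $\mathfrak{p} \subset R$ is a nonzero prime ideal, it has an inverse. Let $I = \{a \in K \mid a\mathfrak{p} \subset R\}$;
we will show this is an inverse of $\mathfrak{p}$. Fix a nonzero $b \in \mathfrak{p}$. Since $I$ is an $R$-module, $bI$
is an ideal in $R$, so $I$ is a fractional ideal. And since $R \subset I$ we have $\mathfrak{p} \subset I\mathfrak{p} \subset R$, but $\mathfrak{p}$ is maximal ($R$ is a
Dedekind domain) so either $\mathfrak{p} = I\mathfrak{p}$ or $I\mathfrak{p} = R$.
If $I\mathfrak{p} = R$ then $I$ is an inverse of $\mathfrak{p}$ and we’re done. Instead suppose $I\mathfrak{p} = \mathfrak{p}$. By
Lemma 1.2.7 we can choose a product of prime ideals $\mathfrak{p}_1\mathfrak{p}_2 \cdots \mathfrak{p}_m \subset (b) \subset \mathfrak{p}$ with $m$ minimal.
If no $\mathfrak{p}_i$ is contained in $\mathfrak{p}$ then for each $i$ there is some $a_i \in \mathfrak{p}_i$ with $a_i \notin \mathfrak{p}$, but $\prod a_i \in \mathfrak{p}$,
which contradicts that $\mathfrak{p}$ is a prime ideal. Thus there is some $\mathfrak{p}_i \subset \mathfrak{p}$; relabel so that $\mathfrak{p}_1 \subset \mathfrak{p}$. However, every
nonzero prime is maximal so $\mathfrak{p}_1 = \mathfrak{p}$. Since $m$ was minimal, $\mathfrak{p}_2\mathfrak{p}_3 \cdots \mathfrak{p}_m \not\subset (b)$ and so there is
some $c \notin (b)$ that lies in $\mathfrak{p}_2\mathfrak{p}_3 \cdots \mathfrak{p}_m$. Then $\mathfrak{p}(c) \subset (b)$ so we have $d := \tfrac{c}{b} \in I$. However
$d \notin R$ since if it were, $c = db$ would lie in $(b)$. But note that $d$ preserves $\mathfrak{p}$ as an $R$-module,
that is, $d\mathfrak{p} \subset I\mathfrak{p} = \mathfrak{p}$. Since $\mathfrak{p}$ is a finitely generated $R$-module, this makes $d$ integral over $R$, and $R$ is integrally closed,
so $d$ must be in $R$, a contradiction. Hence $I\mathfrak{p} = R$, so
every nonzero prime ideal has an inverse.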
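Now we turn to fractional ideals. Every fractional ideal is of the form $aI$ for some nonzero
$a \in K$ and $I$ a nonzero ideal of $R$, and $(a)$ has inverse $(a^{-1})$, so it suffices to invert nonzero ideals of $R$. Suppose some nonzero ideal has no inverse; since $R$ is noetherian we may choose $I$ maximal among such ideals. Then $I \neq R$, so $I \subset \mathfrak{p}$ for some
prime $\mathfrak{p}$. Multiplying both sides of this containment by $\mathfrak{p}^{-1}$, we have
$$I \subset \mathfrak{p}^{-1}I \subset \mathfrak{p}^{-1}\mathfrak{p} = R.$$
By the same argument as above, $\mathfrak{p}^{-1}I \neq I$ (otherwise every element of $\mathfrak{p}^{-1}$ would be integral over $R$, forcing $\mathfrak{p}^{-1} \subset R$ and hence $\mathfrak{p}^{-1}\mathfrak{p} \subset \mathfrak{p}$). By maximality of $I$, the ideal $\mathfrak{p}^{-1}I$ has an inverse $J$, and then $J\mathfrak{p}^{-1}$ is an inverse of $I$, a contradiction. Hence every fractional ideal has an inverse.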
In the next two theorems we show that unique factorization of ideals holds in any
Dedekind domain.
Theorem 1.2.9. Every nonzero ideal I in a Dedekind domain R can be written as a
unique (up to order) product of prime ideals.
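Proof. Suppose the set of nonzero proper ideals that cannot be factored into primes is nonempty, and let $I$ be maximal in it.
Every proper ideal is contained in a maximal ideal so $I \subset \mathfrak{p}$ for some maximal $\mathfrak{p}$ which is
also prime. If $I\mathfrak{p}^{-1} = I$ then $\mathfrak{p}^{-1} = R$ by group properties, but this is impossible.
However, $R \subset \mathfrak{p}^{-1}$ which implies $I \subsetneq I\mathfrak{p}^{-1} \subset R$. By maximality of $I$, either $I\mathfrak{p}^{-1} = R$, so that $I = \mathfrak{p}$, or $I\mathfrak{p}^{-1} = \mathfrak{p}_1 \cdots \mathfrak{p}_n$
for prime ideals $\mathfrak{p}_i$, so that $I = \mathfrak{p}_1 \cdots \mathfrak{p}_n\mathfrak{p}$. Either way $I$ can in fact be written as
a product of primes, contradicting our initial assumption. Hence every nonzero proper ideal has a
prime factorization.
To prove uniqueness, suppose $\mathfrak{p}_1 \cdots \mathfrak{p}_n = \mathfrak{q}_1 \cdots \mathfrak{q}_m$. If no $\mathfrak{q}_i$ is contained in $\mathfrak{p}_1$ then
for each $i$ there is some $a_i \in \mathfrak{q}_i \setminus \mathfrak{p}_1$. But then $a_1 \cdots a_m \in \mathfrak{q}_1 \cdots \mathfrak{q}_m = \mathfrak{p}_1 \cdots \mathfrak{p}_n \subset \mathfrak{p}_1$
which contradicts primality of $\mathfrak{p}_1$. Thus $\mathfrak{q}_i \subset \mathfrak{p}_1$ for some $i$, and $\mathfrak{q}_i = \mathfrak{p}_1$ since nonzero primes are maximal. Cancelling this common factor (Theorem 1.2.8) and
repeating the argument for each remaining $\mathfrak{p}_j$ shows that the two factorizations agree
up to order.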
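Theorem 1.2.10. If $I$ is a fractional ideal of $R$ then there exist prime ideals $\mathfrak{p}_1, \ldots, \mathfrak{p}_n$
and $\mathfrak{q}_1, \ldots, \mathfrak{q}_m$ so that
$$I = (\mathfrak{p}_1 \cdots \mathfrak{p}_n)(\mathfrak{q}_1 \cdots \mathfrak{q}_m)^{-1}$$
and this factorization is unique up to order.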
Proof. We can clear denominators to write aI = J for some a ∈ R and J an integral
ideal of R. Apply unique factorization to J and (a) and the result follows from
Theorem 1.2.8.
Example 1.2.11. Let $K = \mathbb{Q}(\sqrt{-6})$ with ring of integers $\mathcal{O}_K = \mathbb{Z}[\sqrt{-6}]$. If $ab = \sqrt{-6}$ with neither a unit, then $\mathrm{Norm}(a)\mathrm{Norm}(b) = 6$ (see Section 1.5). Without loss
of generality let $\mathrm{Norm}(a) = 2$ and $\mathrm{Norm}(b) = 3$. If $a = x + y\sqrt{-6}$ then $\mathrm{Norm}(a) = x^2 + 6y^2 = 2$, which has no solutions in $\mathbb{Z}$. This shows that $\sqrt{-6}$ is irreducible. The same argument shows that 2 and 3 are irreducible, since $x^2 + 6y^2 = 3$ has no integer solutions either. So $6 = 2 \cdot 3 = -(\sqrt{-6})^2$ cannot be written uniquely as a product of irreducibles in $\mathcal{O}_K$. However, $(6)$
factors uniquely into prime ideals as Theorem 1.2.9 guarantees:
$$(6) = (2, 2 + \sqrt{-6})^2 (3, 3 + \sqrt{-6})^2.$$
This is not trivial to calculate, but we will develop the techniques required to determine
such a factorization in subsequent sections.
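The norm computations in Example 1.2.11 amount to a finite search. The Python sketch below is an illustration of our own (the names `norm_solutions` and `no_proper_norm_factor` are hypothetical); it assumes the norm form $x^2 + 6y^2$ on $\mathbb{Z}[\sqrt{-6}]$ from the example. It lists the elements of a given norm and checks that no proper divisor of 4, 9 or 6 is a norm, which is what makes $2$, $3$ and $\sqrt{-6}$ irreducible.

```python
from math import isqrt


def norm_solutions(target, n=6):
    """All integer pairs (x, y) with x^2 + n*y^2 == target (target >= 0)."""
    solutions = set()
    bound = isqrt(target // n)
    for y in range(-bound, bound + 1):
        rest = target - n * y * y
        x = isqrt(rest)
        if x * x == rest:
            solutions.update({(x, y), (-x, y)})
    return sorted(solutions)


def no_proper_norm_factor(norm, n=6):
    """True if no divisor d of `norm` with 1 < d < norm is itself a norm.

    When this holds, every element of that norm is irreducible.
    """
    return all(not norm_solutions(d, n)
               for d in range(2, norm) if norm % d == 0)


print(norm_solutions(2), norm_solutions(3))  # no elements of norm 2 or 3
print(norm_solutions(6))                     # the elements +-sqrt(-6)
print([no_proper_norm_factor(N) for N in (4, 9, 6)])
```

The test is only a sufficient condition for irreducibility, but it is all this example needs: the two factorizations $2 \cdot 3$ and $-(\sqrt{-6})^2$ of 6 involve irreducibles of norms 4, 9 and 6, so they cannot differ by units.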
A special case of a Dedekind domain is:
Definition. An integral domain R is a discrete valuation ring if it is noetherian,
integrally closed and contains exactly one nonzero prime ideal.
In Section 2.1, we will see where the name ‘discrete valuation ring’ comes from,
as well as study some of the properties of a DVR as they relate to absolute values on
a field.
We proved in Theorem 1.2.8 that the set of fractional ideals of a Dedekind domain
forms a group under ideal multiplication, but there is an even stronger characteriza-
tion.
Theorem 1.2.12. Let $R$ be an integral domain. Then the following are equivalent:
(1) $R$ is a Dedekind domain.
(2) $R$ is noetherian and for every nonzero prime ideal $\mathfrak{p} \subset R$, the local ring $R_\mathfrak{p}$ is a discrete valuation ring.
(3) The fractional ideals of $R$ form a group.
(4) For every fractional ideal $I$ of $R$ there is a fractional ideal $J$ of $R$ such that $IJ = R$.
Proof. See VIII.6.10 in [14].
In the case of OK , there are some important groups that arise from fractional
ideals, the most important being the class group.
Definition. Let $I_K$ denote the group of fractional ideals of $\mathcal{O}_K$ and let $P_K$ denote
the subgroup of all principal fractional ideals of $\mathcal{O}_K$:
$$P_K = \{\alpha\mathcal{O}_K \mid \alpha \in K^*\}.$$
Then the quotient $I_K/P_K$ is called the ideal class group of $K$, denoted $C(\mathcal{O}_K)$.
In Section 1.7 we explore this group fully. A major result we will prove is
Theorem. C(OK) is a finite group.
1.3 Ramification of Primes
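Let $K$ be a number field and suppose $L/K$ is any finite extension. If $\mathfrak{p}$ is a nonzero prime
ideal of $\mathcal{O}_K$ then $\mathfrak{p}\mathcal{O}_L$ is an ideal of $\mathcal{O}_L$ and hence has prime factorization
$$\mathfrak{p}\mathcal{O}_L = \mathfrak{P}_1^{e_1} \cdots \mathfrak{P}_g^{e_g}$$
where the $\mathfrak{P}_i$ are the distinct prime ideals of $\mathcal{O}_L$ containing $\mathfrak{p}$. We will sometimes say a
prime $\mathfrak{P}_i$ lies over $\mathfrak{p}$, $\mathfrak{P}_i$ contains $\mathfrak{p}$ or $\mathfrak{P}_i$ divides $\mathfrak{p}\mathcal{O}_L$.
Definition. For each $\mathfrak{P}_i$, the integer $e_i$ is called the ramification index of $\mathfrak{p}$ in $\mathfrak{P}_i$.
If any of the $e_i$ is greater than 1, $\mathfrak{p}$ is said to ramify in $L$.
Definition. Each ideal $\mathfrak{P}_i$ lying over $\mathfrak{p}$ gives a residue field extension $\mathcal{O}_L/\mathfrak{P}_i \supset \mathcal{O}_K/\mathfrak{p}$.
The degree of this extension, denoted $f_i$, is called the inertial degree of $\mathfrak{p}$ in $\mathfrak{P}_i$.
Definition. A prime $\mathfrak{p}$ is said to split completely in $L$ if $e_i = f_i = 1$ for all $\mathfrak{P}_i$ in
the prime factorization of $\mathfrak{p}\mathcal{O}_L$. If instead $\mathfrak{p}\mathcal{O}_L$ is itself a prime ideal, i.e. $g = 1$ and $e_1 = 1$,
we say $\mathfrak{p}$ is inert.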
The set of all prime ideals of a ring R together with (0) is called the spectrum
of R, denoted Spec(R). We will also occasionally use Spec(p) to denote the set of
primes P ⊂ OL lying over p ⊂ OK .
Example 1.3.1. In $\mathbb{Z}[i]$, $(2) = (1 + i)^2$ so $(2)$ ramifies with $e_1 = 2$. By contrast, $(3)$
is inert in $\mathbb{Q}(i)$ with residue field $\mathbb{Z}[i]/(3) \cong \mathbb{F}_9$, and $(5) = (2 + i)(2 - i)$ splits completely.
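The three behaviors in Example 1.3.1 can be read off from the roots of $x^2 + 1$ modulo $p$, anticipating Theorem 1.3.4. The following Python sketch is our own illustration (the function name `gaussian_splitting` is hypothetical): a repeated root signals ramification, two distinct roots signal splitting, and no roots means the prime stays inert.

```python
def gaussian_splitting(p):
    """Classify the prime p in Z[i] from the roots of x^2 + 1 mod p."""
    roots = [x for x in range(p) if (x * x + 1) % p == 0]
    if (-4) % p == 0:  # the discriminant -4 of x^2 + 1 vanishes mod p
        return "ramified", roots
    return ("split" if roots else "inert"), roots


for p in (2, 3, 5):
    print(p, gaussian_splitting(p))
```

For $p = 5$ the roots 2 and 3 correspond to the two prime factors: $i - 2$ and $i - 3$ generate the same ideals as $2 - i$ and $2 + i$ once $5$ is adjoined.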
The next lemma characterizes the primes of OL which divide pOL.
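Lemma 1.3.2. A prime ideal $\mathfrak{P} \subset \mathcal{O}_L$ divides $\mathfrak{p}\mathcal{O}_L$ if and only if $\mathfrak{p} = \mathfrak{P} \cap K$.
Proof. ($\Longrightarrow$) Clearly $\mathfrak{p} \subset \mathfrak{P} \cap K \neq \mathcal{O}_K$, where properness holds because $1 \notin \mathfrak{P}$. Since $\mathfrak{p}$ is maximal, this implies $\mathfrak{p} = \mathfrak{P} \cap K$.
($\Longleftarrow$) If $\mathfrak{p} \subset \mathfrak{P}$ then $\mathfrak{p}\mathcal{O}_L \subset \mathfrak{P}$, and since containment of ideals in a Dedekind domain is the same as divisibility, this implies that $\mathfrak{P}$ occurs
in the prime factorization of $\mathfrak{p}\mathcal{O}_L$.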
There is an important relation between the ramification indices, inertial degrees
and number of primes in Spec(p) that is described in the next theorem, known as the
efg theorem.
Theorem 1.3.3. Let $m = [L : K]$ and let $\mathfrak{P}_1, \ldots, \mathfrak{P}_g$ be the prime $\mathcal{O}_L$-ideals containing
$\mathfrak{p} \subset \mathcal{O}_K$. Then
$$\sum_{i=1}^{g} e_i f_i = m.$$
Furthermore, if $L/K$ is Galois, then all the ramification indices are equal to $e = e_1$
and all the inertial degrees are equal to $f = f_1$, so $efg = m$.
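Proof. The first statement is proven by showing both sides are equal to $[\mathcal{O}_L/\mathfrak{p}\mathcal{O}_L : \mathcal{O}_K/\mathfrak{p}]$. By the Chinese remainder theorem,
$$\frac{\mathcal{O}_L}{\mathfrak{p}\mathcal{O}_L} = \frac{\mathcal{O}_L}{\prod \mathfrak{P}_i^{e_i}} \cong \prod \frac{\mathcal{O}_L}{\mathfrak{P}_i^{e_i}}.$$
For each $i = 1, \ldots, g$, $f_i$ is the degree of the extension $\mathcal{O}_L/\mathfrak{P}_i \supset \mathcal{O}_K/\mathfrak{p}$, and for each
$r_i \geq 0$, $\mathfrak{P}_i^{r_i}/\mathfrak{P}_i^{r_i+1}$ is an $\mathcal{O}_L/\mathfrak{P}_i$-module. Since there is no ideal strictly between $\mathfrak{P}_i^{r_i}$ and $\mathfrak{P}_i^{r_i+1}$
(because $(\mathcal{O}_L)_{\mathfrak{P}_i}$ is a DVR), this module has dimension 1 as an $\mathcal{O}_L/\mathfrak{P}_i$-vector space, and hence
dimension $f_i$ as an $\mathcal{O}_K/\mathfrak{p}$-vector space. Therefore each successive quotient in the chain
$$\mathcal{O}_L \supset \mathfrak{P}_i \supset \mathfrak{P}_i^2 \supset \cdots \supset \mathfrak{P}_i^{e_i}$$
has dimension $f_i$ over $\mathcal{O}_K/\mathfrak{p}$. Thus $[\mathcal{O}_L/\mathfrak{P}_i^{e_i} : \mathcal{O}_K/\mathfrak{p}] = e_i f_i$. This shows that the left
side equals $[\mathcal{O}_L/\mathfrak{p}\mathcal{O}_L : \mathcal{O}_K/\mathfrak{p}]$.
For the other equality, we first prove it when $\mathcal{O}_L$ is a free $\mathcal{O}_K$-module (e.g. when
$\mathcal{O}_K$ is a PID). On one hand, an isomorphism $\mathcal{O}_K^n \xrightarrow{\cong} \mathcal{O}_L$ induces an isomorphism $K^n \to L$ which
shows that $n = m$. On the other hand, it also induces an isomorphism
$(\mathcal{O}_K/\mathfrak{p})^n \to \mathcal{O}_L/\mathfrak{p}\mathcal{O}_L$ which shows that $m = n = [\mathcal{O}_L/\mathfrak{p}\mathcal{O}_L : \mathcal{O}_K/\mathfrak{p}]$. In the general
case, localize $\mathcal{O}_K$ at $\mathfrak{p}$ to obtain a DVR $\mathcal{O}'_K = (\mathcal{O}_K)_\mathfrak{p}$. Since a DVR is always a PID,
$\mathcal{O}'_L = (\mathcal{O}_L)_\mathfrak{p}$ is a free $\mathcal{O}'_K$-module, and it satisfies
$$\mathfrak{p}\mathcal{O}'_L = \prod (\mathfrak{P}_i\mathcal{O}'_L)^{e_i}$$
so $[\mathcal{O}'_L/\mathfrak{p}\mathcal{O}'_L : \mathcal{O}'_K/\mathfrak{p}\mathcal{O}'_K] = m$, and localizing changes neither the $e_i$ nor the residue fields. This completes the first part of the proof.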
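Now assume $L$ is Galois over $K$. Take $\sigma \in G = \mathrm{Gal}(L/K)$. Then if $\mathfrak{P} \subset \mathcal{O}_L$ is
a prime ideal, so is $\sigma(\mathfrak{P})$. Moreover, if $\mathfrak{P}$ contains $\mathfrak{p}$ then by Lemma 1.3.2 so must
$\sigma(\mathfrak{P})$. Clearly $e(\sigma(\mathfrak{P}) \mid \mathfrak{p}) = e(\mathfrak{P} \mid \mathfrak{p})$ and $f(\sigma(\mathfrak{P}) \mid \mathfrak{p}) = f(\mathfrak{P} \mid \mathfrak{p})$.
To complete the proof, we will show that $G$ acts transitively on $\mathrm{Spec}(\mathfrak{p})$, the set
of prime ideals of $\mathcal{O}_L$ lying over $\mathfrak{p}$. Suppose $\mathfrak{P}$ and $\mathfrak{Q}$ both contain $\mathfrak{p}$ but are not
Galois conjugates. By the Chinese remainder theorem we can find an element $\beta \in \mathfrak{Q}$
that does not lie in $\sigma(\mathfrak{P})$ for any $\sigma \in G$. Define $b = N(\beta) = \prod_{\sigma \in G} \sigma(\beta)$, where $N$ denotes the
norm (see Section 1.5). Then $b \in \mathcal{O}_K$ and since $\beta \in \mathfrak{Q}$, $b \in \mathfrak{Q}$ as well. Thus
$b \in \mathcal{O}_K \cap \mathfrak{Q} = \mathfrak{p} \subset \mathfrak{P}$. On the other hand, $\beta \notin \sigma^{-1}(\mathfrak{P})$ for any $\sigma \in G$ so $\sigma(\beta) \notin \mathfrak{P}$ for every $\sigma$.
Thus the product $b$ of the $\sigma(\beta)$ lies in $\mathfrak{P}$ although no factor does, which contradicts primality of
$\mathfrak{P}$. Hence $\mathrm{Gal}(L/K)$ acts transitively on the primes containing $\mathfrak{p}$ and the result follows
by the preceding paragraph since $e$ and $f$ are invariant under $\mathrm{Gal}(L/K)$.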
As we saw in Example 1.2.11, it is hardly easy to determine the factorization of
ideals in a number field. The next theorem will be of immense importance going
forward, as it allows us to describe the splitting behavior of a prime p ⊂ OK as we
pass to an extension L/K.
Theorem 1.3.4. Let L/K be Galois, where L = K(α) for some α ∈ OL. Let
f(x) ∈ OK [x] be the minimal polynomial of α over K. Suppose p is a prime ideal of
OK and f(x) is separable mod p. Then
(1) p is unramified in L.
(2) If f(x) ≡ f1(x) · · · fg(x) mod p for distinct fi(x) which are irreducible mod p,
then Pi = pOL + fi(α)OL is a prime ideal of OL, and the prime factorization
of pOL is
pOL = P1 · · ·Pg.
Furthermore, deg fi = f(Pi | p) for all i, and since L/K is Galois, these are
all the same.
(3) p splits completely in L ⇐⇒ f(x) ≡ 0 mod p has a solution in OK.
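Proof. (1) and (3) will follow during the course of proving (2). To prove (2), observe
that since $f(x)$ is separable mod $\mathfrak{p}$, $f(x) \equiv f_1(x) \cdots f_g(x) \bmod \mathfrak{p}$ for distinct, irreducible
(mod $\mathfrak{p}$) polynomials $f_i(x)$. If $\mathfrak{P} \subset \mathcal{O}_L$ is a prime lying over $\mathfrak{p}$, then $f_i(\alpha) \in \mathfrak{P}$
for some $i$; we may relabel the $f_i$ so that $f_1(\alpha) \in \mathfrak{P}$. Since the image of $\alpha$ in $\mathcal{O}_L/\mathfrak{P}$ is then a root of $f_1$, which is irreducible mod $\mathfrak{p}$,
$$[\mathcal{O}_L/\mathfrak{P} : \mathcal{O}_K/\mathfrak{p}] \geq \deg f_1.$$
Now for any $\sigma \in \mathrm{Gal}(L/K)$ such that $\sigma(\mathfrak{P}) = \mathfrak{P}$, $f_1(\sigma(\alpha)) \in \mathfrak{P}$, and the $\sigma(\alpha)$ are distinct mod $\mathfrak{P}$ because $f(x)$ is separable
mod $\mathfrak{p}$ by hypothesis. There are $ef$ such $\sigma$ (see Section 1.4), so $\deg f_1 \geq ef$, where $e$ and $f$ are the ramification index and inertial
degree, respectively, of $\mathfrak{p}$ in $\mathfrak{P}$. Combining the two inequalities gives $f \geq \deg f_1 \geq ef$, so $e = 1$ and $f = \deg f_1$, and (1) is proved.
Now let $\mathfrak{p}\mathcal{O}_L = \mathfrak{P}_1 \cdots \mathfrak{P}_g$ be the prime factorization of $\mathfrak{p}\mathcal{O}_L$ into prime ideals of
$\mathcal{O}_L$. Theorem 1.3.3 implies that $\deg f_i = f$ for all $i$, so it remains to prove that each
$\mathfrak{P}_i$ is generated by $\mathfrak{p}$ and $f_i(\alpha)$. On one hand, $\mathfrak{p}\mathcal{O}_L + f_i(\alpha)\mathcal{O}_L$ is contained in $\mathfrak{P}_i$ since
$f_i(\alpha) \in \mathfrak{P}_i$ (reindexing if necessary). On the other hand,
$$\prod_{i=1}^{g} (\mathfrak{p}\mathcal{O}_L + f_i(\alpha)\mathcal{O}_L) \subset \mathfrak{p}\mathcal{O}_L.$$
Each ideal on the left is contained in a prime ideal in the factorization of $\mathfrak{p}\mathcal{O}_L$, and
this must be $\mathfrak{P}_i$ for each $i$. This completes the proof of (2).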
We will develop further techniques for deciding when a prime ramifies/splits/stays
inert in Section 1.6. For the moment, we do not even know if there are an infinite
number of primes splitting in an extension L/K; this question will finally be given
an answer in Section 2.10.
Example 1.3.5. In this example we provide a full characterization of the splitting
behavior of primes in quadratic extensions. Suppose $K = \mathbb{Q}(\sqrt{n})$ where $n \neq 1$ is a squarefree
integer. Then $K/\mathbb{Q}$ is Galois, so for each prime $p \in \mathbb{Z}$ we have $2 = efg$ by
Theorem 1.3.3. There are exactly three possibilities for $e$, $f$ and $g$:
• $e = 2$ and $f = g = 1$. In this case $p$ ramifies in $\mathcal{O}_K$ so $p\mathcal{O}_K = \mathfrak{P}^2$ for some prime
ideal $\mathfrak{P}$. It turns out that there are only finitely many such primes since by (1)
of the previous theorem, $p$ can ramify in $K$ only if $x^2 - n$ has a
multiple root mod $p$, which forces $p \mid 4n$. This ties in with the idea that the discriminant of a polynomial
detects its repeated roots; in Section 1.6, we will see that the connection
between ramification and discriminants runs even deeper.
• $f = 2$ and $e = g = 1$. In this case $p$ is inert, so $p\mathcal{O}_K$ is prime. It turns out that this
happens for half of all primes, in the sense of density.
• $g = 2$ and $e = f = 1$. Here $p$ splits completely in $\mathcal{O}_K$, so $p\mathcal{O}_K = \mathfrak{P}_1\mathfrak{P}_2$ for prime
ideals $\mathfrak{P}_1 \neq \mathfrak{P}_2$. This happens for the other half.
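The trichotomy above is easy to tabulate for odd primes. The Python sketch below is a rough illustration of our own (the names `odd_primes_up_to`, `efg` and `splitting_counts` are hypothetical). It assumes $n$ is squarefree and decides the case by looking for a root of $x^2 - n$ modulo $p$; the prime 2 is left out because its behavior depends on $n$ modulo 8.

```python
from math import isqrt


def odd_primes_up_to(bound):
    """Odd primes p <= bound, by trial division."""
    return [p for p in range(3, bound + 1, 2)
            if all(p % q for q in range(3, isqrt(p) + 1, 2))]


def efg(n, p):
    """(e, f, g) for an odd prime p in Q(sqrt(n)), n squarefree."""
    if n % p == 0:
        return (2, 1, 1)  # x^2 - n has a double root mod p: ramified
    has_root = any((x * x - n) % p == 0 for x in range(p))
    return (1, 1, 2) if has_root else (1, 2, 1)  # split or inert


def splitting_counts(n, bound):
    """Count ramified, inert and split odd primes up to bound."""
    names = {(2, 1, 1): "ramified", (1, 2, 1): "inert", (1, 1, 2): "split"}
    counts = {"ramified": 0, "inert": 0, "split": 0}
    for p in odd_primes_up_to(bound):
        counts[names[efg(n, p)]] += 1
    return counts


print(splitting_counts(-1, 100))
print(splitting_counts(5, 1000))
```

In every case $efg = 2$, and the split and inert counts stay close to each other as the bound grows, consistent with the "half of all primes" heuristic stated in the example.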
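Definition. For a quadratic field $K = \mathbb{Q}(\sqrt{n})$ with $n$ squarefree, the discriminant of $K$ is
$$d_K = \begin{cases} n & \text{if } n \equiv 1 \pmod{4} \\ 4n & \text{otherwise.} \end{cases}$$
For any integer $q \equiv 0, 1 \pmod{4}$ we also define the Kronecker symbol by
$$\left(\frac{q}{2}\right) = \begin{cases} 0 & \text{if } q \equiv 0 \pmod{4} \\ 1 & \text{if } q \equiv 1 \pmod{8} \\ -1 & \text{if } q \equiv 5 \pmod{8}. \end{cases}$$
As a consequence of the above description of primes in $\mathcal{O}_K$, where $K = \mathbb{Q}(\sqrt{n})$,
we have the following characterization of the splitting of primes in a quadratic extension.
Proposition 1.3.6. A prime $p$ ramifies in $K = \mathbb{Q}(\sqrt{n})$ if and only if $p \mid d_K$, and $p$
splits completely in $K$ if and only if $\left(\frac{d_K}{p}\right) = 1$.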
The first statement follows from the general case in Section 1.6 and the second
is a consequence of (3) of Theorem 1.3.4, since for an odd prime $p$, $\left(\frac{4n}{p}\right) = \left(\frac{n}{p}\right) = 1$ if and only if $p \nmid n$ and
$x^2 - n \equiv 0 \pmod{p}$ for some integer $x$. For now, let’s take a look at a familiar
example.
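Before turning to that example, here is a numerical sanity check of Proposition 1.3.6 for odd primes. The Python sketch is our own and uses hypothetical helper names; the Legendre symbol is computed with Euler's criterion, $\left(\frac{a}{p}\right) \equiv a^{(p-1)/2} \pmod{p}$, and compared against a direct search for a root of $x^2 - n$ modulo $p$.

```python
def field_discriminant(n):
    """Discriminant of Q(sqrt(n)) for a squarefree integer n != 1."""
    return n if n % 4 == 1 else 4 * n


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def splits_by_root_search(n, p):
    """True if p does not divide n and x^2 - n has a root mod p."""
    return n % p != 0 and any((x * x - n) % p == 0 for x in range(p))


for n in (-6, -1, 5):
    d = field_discriminant(n)
    print(n, d, [(p, legendre(d, p)) for p in (3, 5, 7, 11, 13)])
```

For odd $p$ the symbol is 0 exactly when $p \mid d_K$, matching the ramification statement, and it equals 1 exactly when the root search succeeds.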
Example 1.3.7. Let $K = \mathbb{Q}(i)$ and recall that the Gaussian integers $\mathbb{Z}[i]$ are the ring
of integers for $K$. In this example we will describe the splitting behavior of primes in
$\mathbb{Z}[i]$. From the last few results, we claim that for an odd prime integer $p$
the following are equivalent:
(i) p ≡ 1 (mod 4).
(ii) (p) splits completely in Z[i].
(iii) p = x2 + y2 for some integers x, y.
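Proof. To prove our claim, note that $\mathbb{Z}[i]$ is the ring of integers for $K = \mathbb{Q}(i)$ so we
may take the $\alpha$ in Theorem 1.3.4 to be $i$, which has minimal polynomial $x^2 + 1$ over
$\mathbb{Q}$. Thus $(p)$ splits completely in $\mathbb{Z}[i]$ if and only if $x^2 + 1$ has a root mod $p$. This in turn
happens if and only if $\mathbb{F}_p$ contains a primitive fourth root of unity, i.e. $\mathbb{F}_p^\times$ contains an element
of order 4. Since $\mathbb{F}_p^\times$ is cyclic of order $p - 1$, this happens exactly when $4 \mid p - 1$ and so (i) $\iff$ (ii) is proven.
Next suppose $(p)$ splits in $\mathbb{Z}[i]$; let $(p) = \mathfrak{p}_1\mathfrak{p}_2$ for prime ideals $\mathfrak{p}_1, \mathfrak{p}_2 \subset \mathbb{Z}[i]$. In
Example 1.7.2, we will prove that the ring of Gaussian integers $\mathbb{Z}[i]$ is a PID. Using
this fact, we know $\mathfrak{p}_1 = (x + yi)$ for integers $x$ and $y$, but then $\mathfrak{p}_2$ must be $(x - yi)$.
Therefore $p = x^2 + y^2$ up to multiplication by a unit in $\mathbb{Z}[i]$. However the only
units are $\pm 1, \pm i$ and both $p$ and $x^2 + y^2$ are positive integers, so $p$ must equal $x^2 + y^2$. Conversely, if $p = x^2 + y^2$ then
$p = (x + yi)(x - yi)$ in $\mathbb{Z}[i]$, so $(p)$ is not prime; since $p$ is odd it is unramified by Theorem 1.3.4, and hence $(p)$ splits.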
Note that this solves Fermat’s theorem characterizing primes of the form x2 + y2.
It will be a continuing theme in these notes to fully characterize primes of the form
x2 + ny2 for all integers n.
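The equivalence in Example 1.3.7 is easy to test numerically. The following is a minimal sketch, not part of the original argument; the helper names `is_prime`, `two_squares` and `has_sqrt_of_minus_one` are hypothetical and chosen only for this illustration.

```python
from math import isqrt


def is_prime(n: int) -> bool:
    """Trial-division primality test; adequate for small n."""
    if n < 2:
        return False
    return all(n % d for d in range(2, isqrt(n) + 1))


def two_squares(p: int):
    """Return (x, y) with x*x + y*y == p and 0 < x <= y, or None."""
    for x in range(1, isqrt(p) + 1):
        rest = p - x * x
        y = isqrt(rest)
        if y >= x and y * y == rest:
            return (x, y)
    return None


def has_sqrt_of_minus_one(p: int) -> bool:
    """True when x^2 + 1 has a root mod p, i.e. x^2 + 1 splits mod p."""
    return any((x * x + 1) % p == 0 for x in range(p))


# Compare conditions (i), (ii) and (iii) for the odd primes below 60.
for p in (q for q in range(3, 60) if is_prime(q)):
    print(p, p % 4 == 1, has_sqrt_of_minus_one(p), two_squares(p))
```

For each odd prime in the range the three columns agree: they are all "true" exactly when $p \equiv 1 \pmod 4$.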
1.4 The Decomposition and Inertia Groups
In this section we describe two important subgroups of Gal(L/K) for a Galois exten-
sion L/K of number fields.
Definition. For a Galois extension L/K and a prime ideal P ⊂ OL lying over
p ⊂ OK , the decomposition group of P is
DP = {σ ∈ Gal(L/K) | σ(P) = P}
and the inertia group of P is
IP = {σ ∈ Gal(L/K) | σ(α) ≡ α mod P for all α ∈ OL}.
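Let $k = \mathcal{O}_K/\mathfrak{p}$ and $\ell = \mathcal{O}_L/\mathfrak{P}$ denote the respective residue fields of $\mathfrak{p}$ and $\mathfrak{P}$. We will prove that there is an exact sequence
\[
1 \to I_{\mathfrak{P}} \to D_{\mathfrak{P}} \to \mathrm{Gal}(\ell/k) \to 1.
\]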
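Recall from the proof of Theorem 1.3.3 that $G = \mathrm{Gal}(L/K)$ acts transitively on Spec($\mathfrak{p}$), the set of primes of $\mathcal{O}_L$ lying over $\mathfrak{p}$. Then we can interpret $D_{\mathfrak{P}}$ as the stabilizer of $\mathfrak{P}$ under this action. The Orbit-Stabilizer Theorem tells us that $[G : D_{\mathfrak{P}}] = g$, where $g$ is the number of distinct primes in the factorization of $\mathfrak{p}\mathcal{O}_L$. Hence $|D_{\mathfrak{P}}| = |G|/g = efg/g = ef$.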
Lemma 1.4.1. For a fixed prime ideal p ⊂ OK, the decomposition groups DP of the
prime ideals lying over p are conjugate subgroups of Gal(L/K).
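Proof. This is a more general fact about the stabilizers of a transitive group action. Note that for $\sigma, \tau \in \mathrm{Gal}(L/K)$,
\[
\tau^{-1}\sigma\tau \in D_{\mathfrak{P}} \iff \tau^{-1}\sigma\tau\mathfrak{P} = \mathfrak{P} \iff \sigma\tau\mathfrak{P} = \tau\mathfrak{P} \iff \sigma \in D_{\tau\mathfrak{P}},
\]
which implies that $\sigma \in D_{\mathfrak{P}} \iff \tau\sigma\tau^{-1} \in D_{\tau\mathfrak{P}}$. Hence $\tau D_{\mathfrak{P}}\tau^{-1} = D_{\tau\mathfrak{P}}$, and since the action is transitive every decomposition group over $\mathfrak{p}$ arises this way.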
The decomposition group is useful because we can view an extension L/K as a
tower of extensions so that we understand the splitting of primes better in each step
of the tower.
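Proposition 1.4.2. Let $L/K$ be a Galois extension and fix a prime $\mathfrak{p} \subset \mathcal{O}_K$. Let $D = D_{\mathfrak{P}}$ be the decomposition group for a particular prime $\mathfrak{P}$ lying over $\mathfrak{p}$. Then the fixed field
\[
L^D = \{\alpha \in L \mid \sigma(\alpha) = \alpha \text{ for all } \sigma \in D\}
\]
is the smallest subfield $E$ of $L$ containing $K$ such that $g = 1$ for $\mathfrak{P} \cap \mathcal{O}_E$.
Proof. First suppose $E = L^D$. By Galois theory, $\mathrm{Gal}(L/E) = D$ and as in the last section, $D$ acts transitively on the set of primes of $\mathcal{O}_L$ lying over $\mathfrak{P}_E := \mathfrak{P} \cap \mathcal{O}_E$. One of these primes is $\mathfrak{P}$ itself, and $D$ fixes $\mathfrak{P}$ by definition, so this must be the only prime lying over $\mathfrak{P}_E$, i.e. $g = 1$.
On the other hand, if $g = 1$ for $\mathfrak{P}_E$ then $\mathrm{Gal}(L/E)$ fixes $\mathfrak{P}$, since it is the only prime over $\mathfrak{P}_E$. So $\mathrm{Gal}(L/E) \leq D$ and by the Galois correspondence, $L^D \subset E$.
This shows that $\mathfrak{P}_E$ does not split when moving from $L^D$ to $L$: it either ramifies or stays inert. Let $E = L^D$ and denote $\mathfrak{P} \cap \mathcal{O}_E$ by $\mathfrak{P}_E$, $e = e(\mathfrak{P} \mid \mathfrak{p})$, $f = f(\mathfrak{P} \mid \mathfrak{p})$ and $g = g(\mathfrak{P} \mid \mathfrak{p})$. To piece together more of the puzzle, we have the following.
Proposition 1.4.3. Given $K, L, E, \mathfrak{p}, \mathfrak{P}$ and $\mathfrak{P}_E$ as above, $e(\mathfrak{P}_E \mid \mathfrak{p}) = f(\mathfrak{P}_E \mid \mathfrak{p}) = 1$, $g = [E : K]$, $e = e(\mathfrak{P} \mid \mathfrak{P}_E)$ and $f = f(\mathfrak{P} \mid \mathfrak{P}_E)$.
Proof. As mentioned in the remarks preceding Lemma 1.4.1, the Orbit-Stabilizer Theorem implies that $g = [\mathrm{Gal}(L/K) : D]$. Then by Galois theory, $[\mathrm{Gal}(L/K) : D] = [E : K]$, so this equals $g$.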
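The previous proposition gives us $g(\mathfrak{P} \mid \mathfrak{P}_E) = 1$, and by Theorem 1.3.3 we have
\[
e(\mathfrak{P} \mid \mathfrak{P}_E)\, f(\mathfrak{P} \mid \mathfrak{P}_E) = [L : E] = \frac{[L : K]}{[E : K]} = \frac{efg}{[E : K]} = ef.
\]
Now $e(\mathfrak{P} \mid \mathfrak{P}_E) \leq e$ and $f(\mathfrak{P} \mid \mathfrak{P}_E) \leq f$, so we must have that $e(\mathfrak{P} \mid \mathfrak{P}_E) = e$ and $f(\mathfrak{P} \mid \mathfrak{P}_E) = f$. Since ramification indices and inertial degrees are multiplicative in towers, it follows that $e(\mathfrak{P}_E \mid \mathfrak{p})$ and $f(\mathfrak{P}_E \mid \mathfrak{p})$ are both 1.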
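Fix a prime $\mathfrak{P} \subset \mathcal{O}_L$ lying over $\mathfrak{p} \subset \mathcal{O}_K$. Observe that each $\sigma \in D_{\mathfrak{P}}$ acts on the finite field $\ell = \mathcal{O}_L/\mathfrak{P}$ and fixes $k = \mathcal{O}_K/\mathfrak{p}$ so we obtain a group homomorphism
\[
\varphi : D_{\mathfrak{P}} \longrightarrow \mathrm{Gal}(\ell/k).
\]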
The next two results establish that ϕ is surjective, which we will use to prove exactness
of the sequence described at the start of this section.
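Lemma 1.4.4. The residue field extension $\ell/k$ is Galois.
Proof. First we show that $\ell/k$ is normal. To do this, take any $\bar{\alpha} \in \ell$ and let $\bar{f}(x)$ be its minimal polynomial over $k$. Let $\alpha \in \mathcal{O}_L$ be a lift of $\bar{\alpha}$. Then
\[
g(x) = \prod_{\sigma \in \mathrm{Gal}(L/K)} (x - \sigma(\alpha)) \in \mathcal{O}_K[x]
\]
splits completely over $\mathcal{O}_L$ and has $\bar{\alpha}$ as a root when taken mod $\mathfrak{P}$. Hence $\bar{f}$ divides the reduction of $g$ and therefore splits completely in $\ell$. Thus $\ell/k$ is normal. Furthermore, $\ell/k$ will be Galois whenever it is separable, but since $\mathcal{O}_K/\mathfrak{p}$ is a finite field, it is perfect and therefore any finite extension is separable [10].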
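Proposition 1.4.5. $\varphi$ is surjective.
Proof. By the lemma, $\ell/k$ is Galois, and since $\ell$ is a finite field we may write $\ell = k(\bar{\alpha})$ for some nonzero $\bar{\alpha} \in \ell$. We will show that $\varphi(D_{\mathfrak{P}})$ acts transitively on the conjugates of $\bar{\alpha}$ over $k$. By the Chinese remainder theorem, one may choose $\alpha \in \mathcal{O}_L$ such that
\[
\alpha \equiv \begin{cases} \bar{\alpha} & \bmod\ \mathfrak{P} \\ 0 & \bmod\ \mathfrak{P}' \text{ for any other } \mathfrak{P}' \text{ lying over } \mathfrak{p}. \end{cases}
\]
Then for any $\sigma \in G \smallsetminus D_{\mathfrak{P}}$ we have $\alpha \equiv 0 \bmod \sigma^{-1}\mathfrak{P}$ and hence $\sigma(\alpha) \equiv 0 \bmod \mathfrak{P}$. Let $f(x) = \prod_{\sigma \in G}(x - \sigma(\alpha)) \in \mathcal{O}_K[x]$. Its reduction mod $\mathfrak{P}$ is
\[
\bar{f}(x) = \prod_{\sigma \in D_{\mathfrak{P}}} \left(x - \overline{\sigma(\alpha)}\right) \prod_{\sigma \notin D_{\mathfrak{P}}} x = \prod_{\sigma \in D_{\mathfrak{P}}} \left(x - \varphi(\sigma)(\bar{\alpha})\right) \prod_{\sigma \notin D_{\mathfrak{P}}} x,
\]
which lies in $k[x]$. The first product also lies in $k[x]$ and has $\bar{\alpha}$ as a root, so it is divisible by the minimal polynomial of $\bar{\alpha}$ over $k$. So given any conjugate $\bar{\alpha}'$ of $\bar{\alpha}$, $(x - \bar{\alpha}')$ divides the first product above and thus $\bar{\alpha}'$ must equal $\varphi(\sigma)(\bar{\alpha})$ for some $\sigma \in D_{\mathfrak{P}}$. Hence the action of $\varphi(D_{\mathfrak{P}})$ on the conjugates of $\bar{\alpha}$ is transitive, and since $\ell = k(\bar{\alpha})$ it follows that $\varphi$ is surjective: the image has at least $[\ell : k] = |\mathrm{Gal}(\ell/k)|$ elements.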
Next we relate the inertia group IP to the map ϕ, and use it to prove that the
original sequence we defined is exact.
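Proposition 1.4.6. The inertia group $I_{\mathfrak{P}}$ is the kernel of $\varphi : D_{\mathfrak{P}} \to \mathrm{Gal}(\ell/k)$.
Proof. By definition
\[
\ker\varphi = \{\sigma \in D_{\mathfrak{P}} \mid \sigma(\alpha) \equiv \alpha \bmod \mathfrak{P} \text{ for all } \alpha \in \mathcal{O}_L\}
\]
so it suffices to show that if $\sigma \notin D_{\mathfrak{P}}$ then there exists an $\alpha \in \mathcal{O}_L$ such that $\sigma(\alpha) \not\equiv \alpha \bmod \mathfrak{P}$. If $\sigma \notin D_{\mathfrak{P}}$ then of course $\sigma^{-1} \notin D_{\mathfrak{P}}$ so $\sigma^{-1}(\mathfrak{P}) \neq \mathfrak{P}$. Since both $\sigma^{-1}(\mathfrak{P})$ and $\mathfrak{P}$ are maximal ideals, there exists some $\alpha \in \mathfrak{P}$ with $\alpha \notin \sigma^{-1}(\mathfrak{P})$, which implies $\sigma(\alpha) \notin \mathfrak{P}$. Thus $\sigma(\alpha) \not\equiv \alpha \bmod \mathfrak{P}$ and it follows that $I_{\mathfrak{P}} = \ker\varphi$.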
We now summarize our findings.
Corollary 1.4.7. If $L/K$ is a Galois extension, the sequence
\[
1 \to I_{\mathfrak{P}} \to D_{\mathfrak{P}} \to \mathrm{Gal}(\ell/k) \to 1
\]
is exact. Moreover, $|I_{\mathfrak{P}}| = e$ and $|D_{\mathfrak{P}}| = ef$, where $e$ and $f$ are the ramification index and inertial degree, respectively, for $L/K$.
Notice that the inertia group is a very useful measure of how a prime p ramifies
in a Galois extension L. This is a common theme in algebraic number theory: the
behavior of primes in an extension is often encoded in the automorphisms of the field
itself.
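As a small sanity check on Corollary 1.4.7, one can tabulate the group orders for $L = \mathbb{Q}(i)$ over $K = \mathbb{Q}$, where $|G| = 2$. This is only a sketch: the splitting types are taken from Proposition 1.3.6 and Example 1.3.7 rather than computed from the groups themselves, and the function names are hypothetical.

```python
def splitting_type_gaussian(p: int):
    """(e, f, g) for a rational prime p in Z[i]."""
    if p == 2:
        return (2, 1, 1)  # 2 divides d_K = -4, so it ramifies
    if p % 4 == 1:
        return (1, 1, 2)  # splits completely
    return (1, 2, 1)      # inert


def group_orders(p: int):
    """Orders predicted by the exact sequence 1 -> I -> D -> Gal(l/k) -> 1."""
    e, f, g = splitting_type_gaussian(p)
    return {"|I_P|": e, "|D_P|": e * f, "|Gal(l/k)|": f, "[G:D_P]": g}


for p in (2, 3, 5, 7, 13):
    print(p, group_orders(p))
```

In each row $|D_{\mathfrak{P}}| = |I_{\mathfrak{P}}| \cdot |\mathrm{Gal}(\ell/k)|$ and $|D_{\mathfrak{P}}| \cdot [G : D_{\mathfrak{P}}] = 2$, as exactness and the Orbit-Stabilizer Theorem require.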
1.5 Norms of Ideals
In this section we define the norm of an ideal. As in previous sections, all of these
definitions and results generalize to any Dedekind domain A with integral closure B
– see [19] for the general cases. In our context, we will replace A with OK and B with
OL, which have fields of fractions K and L, respectively.
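Let $I_K$ and $I_L$ denote the groups of fractional ideals of $\mathcal{O}_K$ and $\mathcal{O}_L$, respectively. We want to define a group homomorphism $\mathcal{N} : I_L \to I_K$. Since $I_L$ is the free abelian group on the set of prime ideals in $\mathcal{O}_L$, we only have to define $\mathcal{N}(\mathfrak{P})$ for $\mathfrak{P}$ prime.
Let $\mathfrak{p}$ be a prime ideal of $\mathcal{O}_K$ and factor
\[
\mathfrak{p}\mathcal{O}_L = \prod \mathfrak{P}_i^{e_i}
\]
for $\mathfrak{P}_i$ prime. Suppose $\mathfrak{p} = (\pi)$ is principal. Then we should have
\[
\mathcal{N}(\mathfrak{p}\mathcal{O}_L) = \mathcal{N}(\pi\mathcal{O}_L) = N(\pi)\mathcal{O}_K = (\pi)^m = \mathfrak{p}^m
\]
where $m = [L : K]$. We also want $\mathcal{N}$ to be a homomorphism, so we must have
\[
\mathcal{N}(\mathfrak{p}\mathcal{O}_L) = \mathcal{N}\left(\prod \mathfrak{P}_i^{e_i}\right) = \prod \mathcal{N}(\mathfrak{P}_i)^{e_i}.
\]
Recall that $m = \sum e_i f_i$, so the correct definition for $\mathcal{N}$ is
Definition. For a prime $\mathfrak{P} \subset \mathcal{O}_L$ lying over $\mathfrak{p} \subset \mathcal{O}_K$, the norm of $\mathfrak{P}$ is defined to be
\[
\mathcal{N}(\mathfrak{P}) = \mathfrak{p}^f
\]
where $f = [\mathcal{O}_L/\mathfrak{P} : \mathcal{O}_K/\mathfrak{p}]$.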
To distinguish this norm from a similar norm to be defined shortly, we will sometimes refer to $\mathcal{N}$ as the ideal norm. If the norm is taken with respect to an extension $L/K$, we write $\mathcal{N}_{L/K}$, but when the context is clear we will often drop the decoration.
Remark. By the properties of inertial degree f , it is easy to see that for a tower
M ⊃ L ⊃ K,
NL/K(NM/L(a)) = NM/K(a).
Next we check that the properties discussed above hold for the norm we have
defined.
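Proposition 1.5.1. Let $L/K$, $\mathcal{O}_K$ and $\mathcal{O}_L$ be as above.
(a) For any nonzero ideal $\mathfrak{a} \subset \mathcal{O}_K$, $\mathcal{N}(\mathfrak{a}\mathcal{O}_L) = \mathfrak{a}^m$ where $m = [L : K]$.
(b) If $L/K$ is Galois and $\mathfrak{P} \subset \mathcal{O}_L$ is any nonzero prime ideal with $\mathfrak{p} = \mathfrak{P} \cap \mathcal{O}_K$ and $\mathfrak{p}\mathcal{O}_L = (\mathfrak{P}_1 \cdots \mathfrak{P}_g)^e$, then
\[
\mathcal{N}(\mathfrak{P})\mathcal{O}_L = (\mathfrak{P}_1 \cdots \mathfrak{P}_g)^{ef} = \prod_{\sigma \in \mathrm{Gal}(L/K)} \sigma(\mathfrak{P}).
\]
(c) For any nonzero element $\beta \in \mathcal{O}_L$, $N(\beta)\mathcal{O}_K = \mathcal{N}(\beta\mathcal{O}_L)$, where $N$ denotes the regular field norm.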
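Proof. (a) It suffices to prove this for prime ideals, for which we have
\[
\mathcal{N}(\mathfrak{p}\mathcal{O}_L) = \mathcal{N}\left(\prod \mathfrak{P}_i^{e_i}\right) = \mathfrak{p}^{\sum e_i f_i} = \mathfrak{p}^m
\]
using Theorem 1.3.3.
(b) Since $\mathcal{N}(\mathfrak{P}_i) = \mathfrak{p}^f$ for any prime $\mathfrak{P}_i$ in the prime factorization of $\mathfrak{p}\mathcal{O}_L$, the left equality is clear. Recall that $G = \mathrm{Gal}(L/K)$ acts transitively on the set Spec($\mathfrak{p}$) $= \{\mathfrak{P}_1, \ldots, \mathfrak{P}_g\}$. Then by the Orbit-Stabilizer Theorem, each $\mathfrak{P}_i$ occurs
\[
\frac{|\mathrm{Gal}(L/K)|}{|\mathrm{Spec}(\mathfrak{p})|} = \frac{m}{g} = ef
\]
times in the collection $\{\sigma(\mathfrak{P}) \mid \sigma \in G\}$, which implies the right equality.
(c) First suppose $L/K$ is Galois. Denote $\beta\mathcal{O}_L$ by $\mathfrak{b}$. The map $I_K \to I_L$ given by $\mathfrak{a} \mapsto \mathfrak{a}\mathcal{O}_L$ is injective since $I_K$ and $I_L$ are free on nonzero prime ideals, so it suffices to show that $N(\beta)\mathcal{O}_L = \mathcal{N}(\mathfrak{b})\mathcal{O}_L$. But by (b),
\[
\mathcal{N}(\mathfrak{b})\mathcal{O}_L = \prod_{\sigma \in G} \sigma(\mathfrak{b}) = \prod_{\sigma \in G} (\sigma(\beta)\mathcal{O}_L) = \left(\prod_{\sigma \in G} \sigma(\beta)\right)\mathcal{O}_L = N(\beta)\mathcal{O}_L.
\]
In the general case, let $E$ be a finite Galois extension of $K$ containing $L$, with $d = [E : L]$ and $\mathcal{O}_E$ the integral closure of $\mathcal{O}_L$ in $E$. Then we have
\begin{align*}
\mathcal{N}_{L/K}(\beta\mathcal{O}_L)^d &= \mathcal{N}_{E/K}(\beta\mathcal{O}_E) && \text{by the remark and part (a)} \\
&= N_{E/K}(\beta)\mathcal{O}_K && \text{by the Galois case} \\
&= N_{L/K}(\beta)^d\mathcal{O}_K.
\end{align*}
Lastly since $I_K$ is torsion-free, the above implies that $\mathcal{N}_{L/K}(\beta\mathcal{O}_L) = N_{L/K}(\beta)\mathcal{O}_K$ for all nonzero $\beta \in \mathcal{O}_L$.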
For a Galois extension K/Q, we define a different norm taking ideals of OK to
integers. We will see that the definition below coincides with the ideal norm.
Definition. Let a ⊂ OK be a nonzero ideal. The numerical norm of a is its index
in the lattice of integers: N(a) = [OK : a].
In order to justify this definition, we need to check that [OK : a] is always finite.
Proposition 1.5.2. Every nonzero ideal a in OK has finite index in the lattice OK.
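Proof. Let $\mathfrak{a}$ be a nonzero $\mathcal{O}_K$-ideal. The proof of Proposition 1.2.5 shows that $\mathfrak{a}$ contains a nonzero integer $m$ ($a_0$ from that proof). So consider the natural map $\varphi : \mathcal{O}_K/m\mathcal{O}_K \to \mathcal{O}_K/\mathfrak{a}$, which is clearly surjective. By Proposition 1.1.13, $\mathcal{O}_K$ is a free $\mathbb{Z}$-module of rank $n = [K : \mathbb{Q}]$. This means that
\[
\mathcal{O}_K/m\mathcal{O}_K \cong \mathbb{Z}^n/m\mathbb{Z}^n
\]
is a finite quotient of order $|m|^n$. Since $\varphi$ is surjective, it follows that $|\mathcal{O}_K/\mathfrak{a}| \leq |m|^n < \infty$.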
Notice that the ideal norm is defined for any extension L/K and outputs an ideal
of OK . On the other hand, the numerical norm is defined on K/Q and outputs
an integer in Z. The connection between the two norms is described in the next
proposition.
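Proposition 1.5.3. Let $K$ be any number field.
(a) For any nonzero ideal $\mathfrak{a} \subset \mathcal{O}_K$, $\mathcal{N}_{K/\mathbb{Q}}(\mathfrak{a}) = (N(\mathfrak{a}))$ and therefore $N(\mathfrak{a}\mathfrak{b}) = N(\mathfrak{a})N(\mathfrak{b})$.
(b) For any fractional ideals $\mathfrak{b} \subset \mathfrak{a}$ of $\mathcal{O}_K$, $[\mathfrak{a} : \mathfrak{b}] = N(\mathfrak{a}^{-1}\mathfrak{b})$.
Proof. (a) Write $\mathfrak{a} = \prod \mathfrak{p}_i^{e_i}$ and let $f_i = f(\mathfrak{p}_i \mid p_i)$ where $(p_i) = \mathbb{Z} \cap \mathfrak{p}_i$. Then $\mathcal{N}(\mathfrak{p}_i) = (p_i)^{f_i}$. By the Chinese remainder theorem, $\mathcal{O}_K/\mathfrak{a} \cong \prod \mathcal{O}_K/\mathfrak{p}_i^{e_i}$ and thus
\[
[\mathcal{O}_K : \mathfrak{a}] = \prod [\mathcal{O}_K : \mathfrak{p}_i^{e_i}].
\]
We previously proved that $[\mathcal{O}_K : \mathfrak{p}_i^{e_i}] = p_i^{e_i f_i}$, thus $[\mathcal{O}_K : \mathfrak{a}] = \prod p_i^{e_i f_i}$, which is the positive generator of $\mathcal{N}_{K/\mathbb{Q}}(\mathfrak{a})$. When we identify the set of nonzero ideals of $\mathbb{Z}$ with the set of positive integer generators, $N$ and $\mathcal{N}$ are seen to coincide, and multiplicativity of $N$ follows from the same property of the ideal norm.
(b) We can multiply by some integer $d$ to make $\mathfrak{a}$ and $\mathfrak{b}$ integral ideals. Then part (a) gives us
\[
[\mathfrak{a} : \mathfrak{b}] = [d\mathfrak{a} : d\mathfrak{b}] = \frac{[\mathcal{O}_K : d\mathfrak{b}]}{[\mathcal{O}_K : d\mathfrak{a}]} = \frac{N(d\mathfrak{b})}{N(d\mathfrak{a})} = N(\mathfrak{a}^{-1}\mathfrak{b}),
\]
where the last step uses $d\mathfrak{b} = (d\mathfrak{a})(\mathfrak{a}^{-1}\mathfrak{b})$ and multiplicativity.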
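To make the numerical norm concrete, here is a minimal sketch for principal ideals of $\mathbb{Z}[i]$. It relies on the standard fact that the index of a full-rank sublattice of $\mathbb{Z}^2$ is the absolute value of the determinant of a basis matrix; the function names are hypothetical.

```python
def index_of_principal_ideal(a: int, b: int) -> int:
    """Index [Z[i] : (a+bi)], as |det| of the Z-basis {(a+bi)*1, (a+bi)*i}."""
    # In the basis {1, i}:  (a+bi)*1 -> (a, b)   and   (a+bi)*i -> (-b, a).
    det = a * a - b * (-b)
    return abs(det)


def gaussian_product(z, w):
    """(a+bi)(c+di) = (ac - bd) + (ad + bc)i, with elements stored as pairs."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)


z, w = (2, 1), (3, -2)
zw = gaussian_product(z, w)
print(index_of_principal_ideal(*z), index_of_principal_ideal(*w),
      index_of_principal_ideal(*zw))
```

The three printed indices are 5, 13 and 65, illustrating both $N((a + bi)) = a^2 + b^2$ and the multiplicativity in Proposition 1.5.3(a).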
1.6 Discriminant and Different
One may recall the definition of discriminant from field theory. Here we present it in
the context of extensions of number fields.
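Definition. For a number field extension $L/K$ with rings of integers $\mathcal{O}_L \supset \mathcal{O}_K$, suppose $\mathcal{O}_L$ has a basis $\{\beta_1, \ldots, \beta_m\}$ over $\mathcal{O}_K$. Then the discriminant of $\mathcal{O}_L$ is
\[
D(\mathcal{O}_L) = D(\beta_1, \ldots, \beta_m) = \det(\mathrm{Tr}_{L/K}(\beta_i\beta_j))
\]
where Tr denotes the trace.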
In this section we use the discriminant to characterize which primes ramify in
Galois extensions of a number field.
Definition. Let D = D(OL) as above. The discriminant ideal of OL, denoted
∆(L/K) or simply ∆, is the ideal of OK generated by D.
We will prove
Theorem. The primes which ramify in OL are those that divide ∆.
First, we establish some properties of the discriminant. The details can be found
in [19].
Definition. Let $L = K(\beta)$ for some $\beta \in L$ and let $f$ be the minimal polynomial of $\beta$ over $K$, setting $\deg f = m$. Then the discriminant of $f$ is defined to be
\[
D(f) := D(1, \beta, \beta^2, \ldots, \beta^{m-1}) = (-1)^{m(m-1)/2} N_{L/K}(f'(\beta)).
\]
Proposition 1.6.1. D(f) = 0 if and only if f has a repeated root, i.e. is not sepa-
rable.
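Lemma 1.6.2. Let $\mathcal{O}_L$ have basis $\{\beta_1, \ldots, \beta_m\}$ over $\mathcal{O}_K$. Then for any $\mathcal{O}_K$-ideal $\mathfrak{a}$, the set of residues $\{\bar{\beta}_1, \ldots, \bar{\beta}_m\}$ is a basis for $\mathcal{O}_L/\mathfrak{a}\mathcal{O}_L$ over $\mathcal{O}_K/\mathfrak{a}$ and the discriminant satisfies
\[
D(\bar{\beta}_1, \ldots, \bar{\beta}_m) \equiv D(\beta_1, \ldots, \beta_m) \bmod \mathfrak{a},
\]
where the discriminant on the left is taken with respect to $\mathcal{O}_L/\mathfrak{a}\mathcal{O}_L$ (as a module over $\mathcal{O}_K/\mathfrak{a}$) and on the right with respect to $\mathcal{O}_L$ over $\mathcal{O}_K$.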
Proof. See 3.36 in [19].
We are now ready to prove the main result.
Theorem 1.6.3. A prime p ⊂ OK ramifies in OL if and only if p | ∆(L/K).
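Proof. By definition $\Delta = \Delta(L/K)$ is the ideal generated by $D = D(\mathcal{O}_L)$. Thus $\mathfrak{p} \mid \Delta$ if and only if $D \in \mathfrak{p}$, which in turn happens if and only if
\[
D(\bar{\beta}_1, \ldots, \bar{\beta}_m) = \det(\mathrm{Tr}(\bar{\beta}_i\bar{\beta}_j)) = 0
\]
in $\mathcal{O}_K/\mathfrak{p}$ by Lemma 1.6.2. Let $\mathfrak{p}$ have factorization $\mathfrak{p}\mathcal{O}_L = \mathfrak{P}_1^{e_1} \cdots \mathfrak{P}_g^{e_g}$. By the Chinese remainder theorem,
\[
\mathcal{O}_L/\mathfrak{p}\mathcal{O}_L \cong \mathcal{O}_L/\mathfrak{P}_1^{e_1} \oplus \cdots \oplus \mathcal{O}_L/\mathfrak{P}_g^{e_g}.
\]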
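First suppose $\mathfrak{p}$ is not ramified in $L$. Then each $e_i = 1$ and $\mathcal{O}_L/\mathfrak{P}_i$ is a separable extension of $\mathcal{O}_K/\mathfrak{p}$. Let $t_i$ denote the trace map $\mathcal{O}_L/\mathfrak{P}_i \to \mathcal{O}_K/\mathfrak{p}$. Select a basis $\{u_i\}$ for $\mathcal{O}_L/\mathfrak{p}\mathcal{O}_L$ such that $\{u_1, \ldots, u_k\}$ is a basis for $\mathcal{O}_L/\mathfrak{P}_1$, $\{u_{k+1}, \ldots, u_{k+l}\}$ is a basis for $\mathcal{O}_L/\mathfrak{P}_2$, etc. Then each $y \in \mathcal{O}_L/\mathfrak{p}\mathcal{O}_L$ decomposes as $y = y_1 + \ldots + y_g$ with $y_i \in \mathcal{O}_L/\mathfrak{P}_i$. Each multiplication map $r_i : x \mapsto xy_i$ takes $\mathcal{O}_L/\mathfrak{P}_i$ to itself, and if $r_i$ has standard matrix $A_i$ then the matrix for $r : x \mapsto xy$ decomposes into the block matrix
\[
A = \begin{pmatrix} A_1 & & & 0 \\ & A_2 & & \\ & & \ddots & \\ 0 & & & A_g \end{pmatrix}.
\]
Then $\mathrm{Tr}(y) = t_1(y_1) + \ldots + t_g(y_g)$. More importantly, the discriminant matrix has block form
\[
B = \begin{pmatrix} \Delta_1 & & & \\ & \Delta_2 & & \\ & & \ddots & \\ & & & \Delta_g \end{pmatrix}
\]
where $\Delta_i$ is the discriminant matrix of the chosen basis of $\mathcal{O}_L/\mathfrak{P}_i$ over $\mathcal{O}_K/\mathfrak{p}$. But $\mathcal{O}_L/\mathfrak{P}_i$ is separable over $\mathcal{O}_K/\mathfrak{p}$ if and only if $\det \Delta_i \neq 0$. Hence if $C$ denotes the change-of-basis matrix between $\{\bar{\beta}_i\}$ and $\{u_i\}$, we have
\[
D(\bar{\beta}_1, \ldots, \bar{\beta}_m) = (\det C)^2 (\det \Delta_1)(\det \Delta_2) \cdots (\det \Delta_g) \neq 0.
\]
By the initial comments, this shows that when $\mathfrak{p}$ is unramified, $\mathfrak{p} \nmid \Delta$.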
On the other hand, suppose some $e_i > 1$, so that $\mathcal{O}_L/\mathfrak{P}_i^{e_i}$ contains nonzero nilpotent elements. We may reindex the primes lying over $\mathfrak{p}$ so that $e_1 > 1$. Choose a basis $\{v_i\}$ for $\mathcal{O}_L/\mathfrak{P}_1^{e_1}$ such that $v_1 \in \mathfrak{P}_1/\mathfrak{P}_1^{e_1}$. In the quotient, $v_1^{e_1} = 0$ so the multiplication map $r_{v_1} : x \mapsto xv_1$ from above (now defined for the $v_i$) is nilpotent. Moreover, $(v_1v_j)^{e_1} = 0$ so the map $r_{v_1v_j}$ has zero as its only eigenvalue. Thus $t_1(v_1v_j) = \mathrm{Tr}(r_{v_1v_j}) = 0$ for every $j$, so the discriminant matrix for $\mathcal{O}_L/\mathfrak{P}_1^{e_1}$ over $\mathcal{O}_K/\mathfrak{p}$ has a row of zeros. Hence $\det \Delta_1 = 0$ which implies $D(\bar{\beta}_1, \ldots, \bar{\beta}_m) = 0$. By the preliminary comments, this shows that $\mathfrak{p}$ divides $\Delta$.
This tells us when a prime in OK ramifies in L, but the discriminant misses some
critical information:
• Which primes in OL lying over p ramify? That is, which primes P in the
factorization of pOL have ramification index greater than 1?
• How do we determine the multiplicity of a prime dividing the discriminant?
The rest of this section follows K. Conrad’s paper “The Different Ideal”, which
outlines the techniques required to answer these questions. To motivate the problem,
consider the following example.
Example 1.6.4. Let $K = \mathbb{Q}(\alpha)$ where $\alpha$ is a root of $f(x) = x^3 - x - 1$. This polynomial has discriminant $-23$ so 23 is the only integer prime which ramifies in $\mathcal{O}_K$. Since $[K : \mathbb{Q}] = 3$ and
\[
x^3 - x - 1 \equiv (x - 3)(x - 10)^2 \bmod 23,
\]
the factorization we obtain from Theorem 1.3.4 is $23\mathcal{O}_K = \mathfrak{p}\mathfrak{q}^2$ where $\mathfrak{p} \neq \mathfrak{q}$ and both are prime. In general, how do we know that $\mathfrak{q}$ ramifies but $\mathfrak{p}$ doesn't?
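The two computational claims in Example 1.6.4 can be checked directly. The sketch below uses the standard formula $-4p^3 - 27q^2$ for the discriminant of $x^3 + px + q$; the helper names are hypothetical.

```python
def poly_mul_mod(f, g, p):
    """Multiply coefficient lists (lowest degree first) modulo p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out


def cubic_discriminant(p_coef: int, q_coef: int) -> int:
    """Discriminant of the depressed cubic x^3 + p_coef*x + q_coef."""
    return -4 * p_coef**3 - 27 * q_coef**2


p = 23
x_minus_3 = [(-3) % p, 1]
x_minus_10 = [(-10) % p, 1]
product = poly_mul_mod(x_minus_3, poly_mul_mod(x_minus_10, x_minus_10, p), p)
target = [(-1) % p, (-1) % p, 0, 1]          # x^3 - x - 1 reduced mod 23
roots = [x for x in range(p) if (x**3 - x - 1) % p == 0]

print(cubic_discriminant(-1, -1), product == target, roots)
```

The discriminant evaluates to $-23$, the product of the factors agrees with $x^3 - x - 1$ modulo 23, and the only roots modulo 23 are 3 and 10.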
Definition. For a lattice $L$ in a number field $K$, its dual lattice is
\[
L^\vee = \{\alpha \in K \mid \mathrm{Tr}_{K/\mathbb{Q}}(\alpha L) \subset \mathbb{Z}\}.
\]
Proposition 1.6.5. For a lattice $L$ with $\mathbb{Z}$-basis $\{e_1, \ldots, e_n\}$, the dual lattice may be written
\[
L^\vee = \bigoplus_{i=1}^n \mathbb{Z}e_i^\vee
\]
where $\{e_i^\vee\}$ is the dual basis of $\{e_i\}$ relative to the trace product on $K/\mathbb{Q}$. In particular, $L^\vee$ is a lattice.
Proof. See 3.4 in [6].
We will see in the next section just how useful lattices can be in algebraic number
theory, but for now we will focus on the lattice OK and its fractional ideals. Consider
the dual lattice O∨K . First, we should recognize that O∨K is not just the elements of
K with trace in Z – actually it’s smaller. But since algebraic integers have integral
trace, we see that OK ⊂ O∨K .
Proposition 1.6.6. For any fractional ideal a in K, a∨ is a fractional ideal and
a∨ = a−1O∨K. Moreover, O∨K is the largest fractional ideal of K whose elements all
have integral trace.
Proof. By definition, a∨ = {α ∈ K | TrK/Q(αa) ⊂ Z}. First, since any dual lattice
is a lattice by the previous proposition, we know a∨ is a finitely generated Z-module.
Take α ∈ a∨ and x ∈ OK . Then for any β ∈ a,
TrK/Q((xα)β) = TrK/Q(α(xβ)) ∈ Z
since xβ ∈ a and α ∈ a∨. Thus xα ∈ a∨ so a∨ is a fractional OK-ideal.
Next we check the formula for a∨. Take α ∈ a∨ again. Then for any β ∈ a,
Tr(αβOK) ⊂ Z since βOK ⊂ a. Thus αa ⊂ O∨K which implies α ∈ a−1O∨K . This
shows a∨ ⊂ a−1O∨K and the reverse containment is similarly shown.
For the last statement, note that any fractional OK-ideal satisfies a = aOK .
Therefore
Tr(a) ⊂ Z ⇐⇒ Tr(aOK) ⊂ Z ⇐⇒ a ⊂ O∨K .
Since O∨K is a fractional ideal containing OK , its inverse is an integral ideal con-
tained in OK , from which we define:
Definition. The different of K is the ideal
DK = (O∨K)−1 = {x ∈ K | xO∨K ⊂ OK}.
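Example 1.6.7. For $K = \mathbb{Q}(i)$ with $\mathcal{O}_K = \mathbb{Z}[i]$, we have $\mathrm{Tr}(a + bi) = 2a$ and $\mathrm{Tr}((a + bi)i) = -2b$, so $a + bi$ lies in $\mathbb{Z}[i]^\vee$ precisely when $2a, 2b \in \mathbb{Z}$. Thus $\mathbb{Z}[i]^\vee = \frac{1}{2}\mathbb{Z}[i]$ and the different of $K$ is $2\mathbb{Z}[i]$. This can be verified with the next proposition.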
Proposition 1.6.8. If OK = Z[α] then DK = (f ′(α)) where f(x) is the minimal
polynomial of α over Q.
Proof. See 4.3 in [6].
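Example 1.6.9. For a quadratic field $K = \mathbb{Q}(\sqrt{n})$ where $n$ is squarefree,
\[
\mathcal{D}_K = \begin{cases} (2\sqrt{n}) & \text{if } n \not\equiv 1 \pmod 4 \\ (\sqrt{n}) & \text{if } n \equiv 1 \pmod 4. \end{cases}
\]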
The different is related to the field discriminant dK by the following.
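Theorem 1.6.10. For a number field $K$ of discriminant $d_K$, $N_{K/\mathbb{Q}}(\mathcal{D}_K) = |d_K|$.
Proof. Let $\{\beta_1, \ldots, \beta_n\}$ be a $\mathbb{Z}$-basis for $\mathcal{O}_K$ so that we have
\[
\mathcal{O}_K = \bigoplus_{i=1}^n \mathbb{Z}\beta_i.
\]
Then $\mathcal{D}_K^{-1} = \mathcal{O}_K^\vee = \bigoplus_{i=1}^n \mathbb{Z}\beta_i^\vee$ by Proposition 1.6.5. Using the definition of norm, we have
\[
N(\mathcal{D}_K) = [\mathcal{O}_K : \mathcal{D}_K] = [\mathcal{D}_K^{-1} : \mathcal{O}_K] = [\mathcal{O}_K^\vee : \mathcal{O}_K].
\]
We can calculate $[\mathcal{O}_K^\vee : \mathcal{O}_K]$ by finding $|\det A|$ where $A$ is the matrix expressing the basis $\{\beta_1, \ldots, \beta_n\}$ in terms of the dual basis $\{\beta_1^\vee, \ldots, \beta_n^\vee\}$. Since $\{\beta_i^\vee\}$ is a dual basis of $\{\beta_i\}$ it follows that
\[
A = (\mathrm{Tr}_{K/\mathbb{Q}}(\beta_i\beta_j))
\]
and by definition $\det A = D(\mathcal{O}_K) = d_K$. The result follows.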
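The determinant $\det(\mathrm{Tr}(\beta_i\beta_j))$ appearing at the end of the proof can be computed by hand for quadratic fields. The sketch below assumes $n$ is squarefree and uses the usual integral basis $\{1, \sqrt{n}\}$ or $\{1, (1 + \sqrt{n})/2\}$; elements $a + b\sqrt{n}$ are stored as pairs $(a, b)$, and all function names are hypothetical.

```python
from fractions import Fraction


def trace(elem):
    """Trace of a + b*sqrt(n) over Q, which is 2a."""
    a, _b = elem
    return 2 * a


def mul(x, y, n):
    """(a + b sqrt n)(c + d sqrt n) = (ac + bdn) + (ad + bc) sqrt n."""
    a, b = x
    c, d = y
    return (a * c + b * d * n, a * d + b * c)


def integral_basis(n):
    """Z-basis of the ring of integers of Q(sqrt n), n squarefree (assumed)."""
    one = (Fraction(1), Fraction(0))
    if n % 4 == 1:
        return [one, (Fraction(1, 2), Fraction(1, 2))]
    return [one, (Fraction(0), Fraction(1))]


def trace_form_discriminant(n):
    """det(Tr(b_i b_j)) for the integral basis above."""
    b1, b2 = integral_basis(n)
    g11 = trace(mul(b1, b1, n))
    g12 = trace(mul(b1, b2, n))
    g22 = trace(mul(b2, b2, n))
    return g11 * g22 - g12 * g12


for n in (-1, -2, -5, -19, 5):
    print(n, trace_form_discriminant(n))
```

The output reproduces $d_K = n$ when $n \equiv 1 \pmod 4$ and $d_K = 4n$ otherwise, which agrees in absolute value with the norms of the differents listed in Example 1.6.9.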
Lemma 1.6.11. For any nonzero ideal a ⊂ OK, a | DK if and only if Tr(a−1) ⊂ Z.
Proof. This may be stated as a ⊃ DK = (O∨K)−1 which in turn is equivalent to
a−1 ⊂ O∨K . By Proposition 1.6.6, this is equivalent to Tr(a−1) ⊂ Z.
Dedekind proved the following characterization of ramified primes in terms of the
different ideal. The proof can be found in [6].
Theorem 1.6.12 (Dedekind). The prime factors of $\mathcal{D}_K$ are exactly the primes in $K$ that ramify over $\mathbb{Q}$. In particular, for any prime ideal $\mathfrak{p} \subset \mathcal{O}_K$ lying over a prime $p \in \mathbb{Z}$ with ramification index $e$, the multiplicity of $\mathfrak{p}$ in $\mathcal{D}_K$ is $e - 1$ when $e \not\equiv 0 \pmod p$, and at least $e$ when $p \mid e$.
Corollary 1.6.13. The primes in Z that ramify in K are precisely the prime divisors
of dK.
Proof. Use the fact that |dK | = N (DK) and Theorem 1.6.12.
Note that this also proves the rest of Proposition 1.3.6 which characterized ramified
primes in quadratic extensions.
Theorem 1.6.12 shows the true power of the different: while the discriminant tells us whether a prime $p$ ramifies in an extension, it does not tell us anything about the ramification indices of the individual primes lying over $p$ in the larger field. This information is conveyed by the different, and if we know the full factorization of $p\mathcal{O}_K$, we can relate these multiplicities to $d_K$:
Corollary 1.6.14. Suppose $p\mathcal{O}_K = \mathfrak{p}_1^{e_1} \cdots \mathfrak{p}_g^{e_g}$ with inertial degrees denoted by $f_i$. Then the multiplicity of $p$ in $d_K$ is at least
\[
(e_1 - 1)f_1 + \ldots + (e_g - 1)f_g = n - (f_1 + \ldots + f_g).
\]
Furthermore, if $p \nmid e_i$ for all $i$ then this is the exact multiplicity of $p$ in $d_K$.
It turns out that the multiplicity of a prime $\mathfrak{p}$ in $\mathcal{D}_K$ is bounded by
\[
e - 1 \leq \mathrm{ord}_{\mathfrak{p}}(\mathcal{D}_K) \leq e - 1 + e \cdot \mathrm{ord}_p(e).
\]
The left is Theorem 1.6.12 and the right was proven by Hensel (see [6]).
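We can extend the ideas of $\mathcal{O}_K^\vee$ and $\mathcal{D}_K$ to an arbitrary extension of number fields $L/K$ in the following way. Define the fractional ideal
\[
\mathcal{O}_L^\vee = \{x \in L \mid \mathrm{Tr}_{L/K}(xy) \in \mathcal{O}_K \text{ for all } y \in \mathcal{O}_L\}.
\]
Then we have
Definition. For an extension of number fields $L/K$, the relative different is
\[
\delta_{L/K} = (\mathcal{O}_L^\vee)^{-1} = \{x \in L \mid x\mathcal{O}_L^\vee \subset \mathcal{O}_L\}
\]
which is an integral ideal of $\mathcal{O}_L$.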
As in the case with $K/\mathbb{Q}$ we have several important results for the relative different. See section 15 of [13] for details.
Theorem 1.6.15. For any extension of number fields $L/K$, the discriminant ideal and relative different are related by $\Delta(L/K) = \mathcal{N}_{L/K}(\delta_{L/K})$.
Theorem 1.6.16. Let $d_L$ and $d_K$ be the field discriminants of $L$ and $K$, respectively. Then $d_L = \pm d_K^{[L:K]} N_{K/\mathbb{Q}}(\Delta(L/K))$.
Theorem 1.6.17. For all number field extensions $L/K$, $\mathcal{D}_L = \mathcal{D}_K\delta_{L/K}$.
Corollary 1.6.18. The primes of OL that ramify over K are precisely those that
divide the relative different δL/K.
1.7 The Class Group
Recall that the class group of a number field K is C(OK) = IK/PK , where IK is the
group of fractional ideals of OK and PK is the subgroup of principal fractional ideals.
There is an exact sequence
0→ O∗K → K∗ → IK → C(OK)→ 0.
In this section we explore the structure of the class group and prove that its order,
called the class number hK of K, is finite.
In the previous section we defined the discriminant of a number field K to be
dK = D(OK). We will prove
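Theorem. Let $K$ be a number field with $[K : \mathbb{Q}] = n$ and discriminant $d_K$. Let $2s$ be the number of nonreal embeddings of $K$ into $\mathbb{C}$. Then there exists a set of representatives for the ideal class group $C(\mathcal{O}_K)$ consisting of ideals $\mathfrak{a} \subset \mathcal{O}_K$ with
\[
N(\mathfrak{a}) \leq \frac{n!}{n^n}\left(\frac{4}{\pi}\right)^s \sqrt{|d_K|}.
\]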
The value on the right is called the Minkowski bound and is often denoted BK .
According to [24], BK is currently the best known bound for a generating set of
C(OK) that does not depend on unproven conjectures.
In the statement of the main theorem, $2s$ counted the number of nonreal embeddings $K \hookrightarrow \mathbb{C}$. Alternatively, by the primitive element theorem we may write $K = \mathbb{Q}(\alpha)$ for some $\alpha \in K$ with minimal polynomial $f(x) \in \mathbb{Q}[x]$. Then $2s$ is the number of nonreal roots of $f$. We will also denote by $r$ the number of real roots of $f$, so that we have
\[
K \otimes_{\mathbb{Q}} \mathbb{R} \cong \mathbb{R}^r \times \mathbb{C}^s \cong \mathbb{R}^{r+2s} \quad \text{(as $\mathbb{R}$-vector spaces)}.
\]
Before proving the main theorem, we present some applications and examples
using the Minkowski bound. The first result is an important property of the class
group.
Theorem 1.7.1. The class number hK := |C(OK)| is finite for any number field K.
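Proof. By the Minkowski bound, it suffices to show there are only finitely many ideals $\mathfrak{a} \subset \mathcal{O}_K$ whose norms fall under the bound. Let $\mathfrak{a} = \prod \mathfrak{p}_i^{r_i}$ so that $N(\mathfrak{a}) = \prod p_i^{r_i f_i}$ where $(p_i) = \mathbb{Z} \cap \mathfrak{p}_i$. Since $N(\mathfrak{a})$ is bounded by $B_K$, there are only finitely many possibilities for the rational primes $p_i$, hence for the prime ideals $\mathfrak{p}_i$ lying over them, and only finitely many possibilities for the $r_i$. Hence the number of such $\mathfrak{a}$ is finite, and it follows that the class group is finite.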
Note that hK = 1 if and only if OK is a principal ideal domain. Thus the
class group is a direct measure of how far the ring of integers is from being a PID.
Since every PID is also a UFD, the class number is related to how badly unique
factorization fails in OK . An open question in class field theory asks if there are
infinitely many number fields with hK = 1. However, it is known [7] that there
are only nine imaginary quadratic fields with class number 1; these are Q(√n) for
n = −1,−2,−3,−7,−11,−19,−43,−67,−163.
Example 1.7.2. Let $K = \mathbb{Q}(i)$. Then $n = 2$, $s = 1$ and $|d_K| = 4$ so the Minkowski bound is
\[
\frac{2!}{2^2}\left(\frac{4}{\pi}\right)^1 \sqrt{4} = \frac{4}{\pi} < 2.
\]
Thus every fractional ideal is equivalent to an ideal of norm 1. Since the only ideal
of norm 1 is (1), every ideal is principal. Hence hK = 1, which reflects the fact that
Z[i] is a PID.
Example 1.7.3. Let $K = \mathbb{Q}(\sqrt{-5})$, so that $n = 2$, $s = 1$ and $d_K = -20$. Then a set of representatives for $C(\mathcal{O}_K)$ may be chosen with $N(\mathfrak{a}) \leq B_K \approx 0.637\sqrt{20} < 3$. Every ideal that satisfies this has norm 1 or 2 and so must divide $2\mathcal{O}_K$. In fact we can use Theorem 1.3.4 to compute the factorization of 2:
\[
2\mathcal{O}_K = (2, 1 + \sqrt{-5})^2.
\]
Since $N(2\mathcal{O}_K) = 2^2 = 4$, it must be that $N((2, 1 + \sqrt{-5})) = 2$. This shows that $\mathcal{O}_K$ and $(2, 1 + \sqrt{-5})$ form a set of representatives for $C(\mathcal{O}_K)$. Further, $(2, 1 + \sqrt{-5})$ cannot be principal because there is no element $\alpha = a + b\sqrt{-5}$ with $N(\alpha) = a^2 + 5b^2 = 2$. Hence $|C(\mathcal{O}_K)| = 2$.
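The Minkowski bounds used in these examples are quick to evaluate. The following is a minimal sketch of the formula $B_K = \frac{n!}{n^n}\left(\frac{4}{\pi}\right)^s\sqrt{|d_K|}$; the function name `minkowski_bound` is hypothetical.

```python
from math import factorial, pi, sqrt


def minkowski_bound(n: int, s: int, d_K: int) -> float:
    """B_K for a field of degree n with 2s nonreal embeddings and discriminant d_K."""
    return factorial(n) / n**n * (4 / pi) ** s * sqrt(abs(d_K))


# (label, n, s, d_K) for a few imaginary quadratic fields.
for label, n, s, d_K in [("Q(i)", 2, 1, -4), ("Q(sqrt(-5))", 2, 1, -20),
                         ("Q(sqrt(-19))", 2, 1, -19), ("Q(sqrt(-2))", 2, 1, -8)]:
    print(label, round(minkowski_bound(n, s, d_K), 3))
```

For $\mathbb{Q}(i)$ the value is $4/\pi$, below 2, and for $\mathbb{Q}(\sqrt{-5})$ it lies strictly between 2 and 3, matching the two examples above.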
Another useful application of the Minkowski bound is to prove
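Theorem 1.7.4. Every nontrivial extension of $\mathbb{Q}$ ramifies at some prime.
Proof. We will prove this for $K/\mathbb{Q}$ a finite extension of degree $n \geq 2$. A set of representatives of $C(\mathcal{O}_K)$ has at least one element and the element has numerical norm $\geq 1$, so $B_K \geq 1$. Define a sequence $a_n$ by $a_n = \frac{n^n}{n!}\left(\frac{\pi}{4}\right)^{n/2}$ and note that by the Minkowski bound, together with $2s \leq n$ and $\pi/4 < 1$,
\[
a_n \leq \frac{n^n}{n!}\left(\frac{\pi}{4}\right)^s \leq \sqrt{|d_K|}.
\]
We can also see that $a_2 = \pi/2 > 1$ and for all $n \geq 2$,
\[
\frac{a_{n+1}}{a_n} = \left(\frac{\pi}{4}\right)^{1/2}\left(1 + \frac{1}{n}\right)^n > 1.
\]
So the sequence $(a_n)$ is monotone increasing. This implies $|d_K| > 1$ and Corollary 1.6.13 tells us that some prime ramifies.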
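The two numerical facts about the sequence $a_n$ in this proof can be checked for small $n$. This is only a finite sanity check of the inequalities, not a replacement for the argument, and the function name `a` simply mirrors the notation above.

```python
from math import factorial, pi


def a(n: int) -> float:
    """a_n = n^n / n! * (pi/4)^(n/2), the lower bound for sqrt(|d_K|) in degree n."""
    return n**n / factorial(n) * (pi / 4) ** (n / 2)


values = [a(n) for n in range(2, 12)]
print([round(v, 3) for v in values])
```

The first value is $\pi/2$ and the list is strictly increasing, consistent with $|d_K| \geq a_n^2 > 1$ for every $n \geq 2$.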
This shows that the only unramified extension of Q is Q itself. However, there
may exist unramified extensions of a number field other than Q. In the next section
we will describe the Hilbert class field of K, which is the maximal unramified abelian
extension L of K. This field has the special property that Gal(L/K) is isomorphic to
the class group C(OK).
Constructing further abelian extensions of the Hilbert class field is called the class
field tower problem. Let K1 be a number field with class number h1 > 1. Let K2 be
the Hilbert class field of K1, let K3 be the Hilbert class field of K2, and so on. It is
an open question to decide when the tower
· · · ⊃ K3 ⊃ K2 ⊃ K1 ⊃ Q
is infinite, or terminates with a field of class number 1 in a finite number of steps.
Golod and Shafarevich [11] proved that there are fields K1 with infinite class field
towers.
For the rest of the section, we develop the mechanics required to prove the
Minkowski bound. First we redefine the notion of a lattice in a vector space – a slight
generalization of the definition given in Section 1.1, where lattices were assumed to
have full rank.
Definition. Let V be an n-dimensional vector space over R. A lattice in V is a
subgroup of the form
Λ = Ze1 + . . .+ Zer
where e1, . . . , er are linearly independent vectors in V . When r = n, Λ is said to be
a full lattice in V .
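Remark. A lattice is a free abelian subgroup of $V$ generated by elements of $V$ that are linearly independent over $\mathbb{R}$. A full lattice $\Lambda \subset V$ is a subgroup such that the map
\[
\mathbb{R} \otimes_{\mathbb{Z}} \Lambda \longrightarrow V, \qquad \sum r_i \otimes x_i \longmapsto \sum r_i x_i
\]
is an isomorphism.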
Since V is isomorphic to Rn, this induces a topology on V .
Definition. A subgroup $W \subset V$ is a discrete subgroup if every point of $W$ is open in the subspace topology on $W$, i.e. if every $w \in W$ has a neighborhood $U$ in $V$ such that $U \cap W = \{w\}$.
Proposition 1.7.5. A subgroup Λ ⊂ V is a lattice if and only if it is a discrete
subgroup.
Proof. ( =⇒ ) is clear. See 4.15 in [19] for the other direction.
Definition. Let $\Lambda = \sum \mathbb{Z}e_i$ be a full lattice in $V$. Then for any $\lambda_0 \in \Lambda$, the set
\[
D = \left\{\lambda_0 + \sum a_i e_i : 0 \leq a_i < 1\right\}
\]
is called a fundamental parallelopiped for $\Lambda$.
The shape of the parallelopiped depends on the choice of the $e_i$, but for a fixed basis we may vary $\lambda_0 \in \Lambda$ so that the parallelopipeds cover $\mathbb{R}^n$ without overlaps. Furthermore, if $D$ is a fundamental parallelopiped for $\Lambda = \sum \mathbb{Z}e_i$, the volume of $D$ is given by
\[
\mu(D) = |\det(e_1, \ldots, e_n)|.
\]
(Here µ is actually Lebesgue measure, but all our sets will have well-defined volumes.)
Notice that if Λ = Zf1 + . . .+ Zfn then the change-of-basis matrix between {ei} and
{fi} has determinant ±1, so the volume of D does not depend on the choice of basis
for Λ.
Definition. For a set $T \subset \mathbb{R}^n$, we say $T$ is convex if every pair of points in $T$ is connected by a line segment that lies in $T$.
Definition. A set T ⊂ Rn is symmetric about the origin if α ∈ T implies −α ∈ T .
Lemma 1.7.6. Let D be a fundamental parallelopiped for a full lattice Λ in V and
suppose S is a measurable subset of V . If µ(S) > µ(D) then S contains distinct points
α and β such that β − α ∈ Λ.
Proof. See 4.17 in [19].
It will be useful to let $T$ be a subset of $V$ such that for any $\alpha, \beta \in T$, $\frac{1}{2}(\alpha - \beta) \in T$, and let $S = \frac{1}{2}T$. Then $T$ contains the difference of any two points in $S$, so by Lemma 1.7.6, $T$ will contain a point of $\Lambda \smallsetminus \{0\}$ whenever $\mu(D) < \mu\left(\frac{1}{2}T\right) = 2^{-n}\mu(T)$.
The main theorem en route to proving the Minkowski bound is a classic theorem
in the geometry of numbers in its own right:
Minkowski's Theorem. Let $T$ be a subset of $V$ that is compact, convex and symmetric about the origin. If $\Lambda$ is a full lattice in $V$ with fundamental parallelopiped $D$ such that
\[
\mu(T) \geq 2^n\mu(D)
\]
then $T$ contains a point of $\Lambda$ other than the origin.
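Proof. A convex set that is symmetric about the origin contains $\frac{1}{2}(\alpha - \beta)$ whenever it contains $\alpha$ and $\beta$, so the preceding comments apply to $T$ and to each of its dilates. Let $\varepsilon > 0$. Then
\[
\mu((1 + \varepsilon)T) = (1 + \varepsilon)^n\mu(T) > 2^n\mu(D)
\]
so by the preceding comments, $(1 + \varepsilon)T$ contains a point of $\Lambda \smallsetminus \{0\}$. Moreover $(1 + \varepsilon)T$ contains only finitely many points of $\Lambda \smallsetminus \{0\}$, since $\Lambda$ is discrete and $(1 + \varepsilon)T$ is compact. Now since $T$ is closed,
\[
T = \bigcap_{\varepsilon > 0} (1 + \varepsilon)T
\]
so if none of the finitely many points in $\Lambda \cap (1 + \varepsilon)T$ other than the origin were in $T$, we could shrink $\varepsilon > 0$ until $(1 + \varepsilon)T$ contains no point of $\Lambda$ other than the origin. This contradicts the preceding comments, hence $T$ contains a point in $\Lambda \smallsetminus \{0\}$.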
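A small planar example illustrates the statement. The sketch below takes the lattice in $\mathbb{R}^2$ spanned by $(2, 1)$ and $(1, 3)$ and the square $T = [-c, c]^2$ whose area is exactly $2^2\mu(D)$. The lattice, the choice of $T$ and the search cutoff are assumptions made for this illustration only.

```python
from math import sqrt


def nonzero_lattice_points_in_square(e1, e2, c, search=50):
    """Nonzero points m*e1 + n*e2 with both coordinates in [-c, c].

    `search` bounds |m| and |n|; it is an ad hoc cutoff for illustration.
    """
    points = []
    for m in range(-search, search + 1):
        for n in range(-search, search + 1):
            if (m, n) == (0, 0):
                continue
            x = m * e1[0] + n * e2[0]
            y = m * e1[1] + n * e2[1]
            if abs(x) <= c and abs(y) <= c:
                points.append((x, y))
    return points


e1, e2 = (2, 1), (1, 3)
covolume = abs(e1[0] * e2[1] - e1[1] * e2[0])   # mu(D) = |det(e1, e2)|
c = sqrt(covolume)                              # area of [-c, c]^2 is 4 * mu(D)
points = nonzero_lattice_points_in_square(e1, e2, c)
print(covolume, sorted(points))
```

The square is compact, convex and symmetric about the origin, and as the theorem predicts the list of nonzero lattice points inside it is not empty; it contains $\pm(2, 1)$ and $\pm(1, -2)$.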
For a fascinating application of Minkowski’s theorem to the proof of the Four
Squares Theorem, see Appendix A.1.
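Moving forward, let $K$ be a number field with $[K : \mathbb{Q}] = n$. Suppose $K$ has $r$ real embeddings $\{\sigma_1, \ldots, \sigma_r\}$ and $2s$ complex embeddings $\{\sigma_{r+1}, \bar{\sigma}_{r+1}, \ldots, \sigma_{r+s}, \bar{\sigma}_{r+s}\}$, so that $n = r + 2s$. Then we have an embedding
\[
\sigma : K \hookrightarrow \mathbb{R}^r \times \mathbb{C}^s, \qquad \alpha \mapsto (\sigma_1(\alpha), \ldots, \sigma_{r+s}(\alpha)).
\]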
Let V = Rr × Cs and identify V with Rn using {1, i} as a basis for C. The relation
between ideals of OK and lattices in V is contained in the following proposition.
Proposition 1.7.7. Let $\mathfrak{a} \subset \mathcal{O}_K$ be any nonzero ideal. Then $\sigma(\mathfrak{a})$ is a full lattice in $V$ and the volume of any fundamental parallelopiped for $\sigma(\mathfrak{a})$ is $2^{-s}N(\mathfrak{a})\sqrt{|d_K|}$.
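Proof. Let $\{\alpha_1, \ldots, \alpha_n\}$ be a basis for $\mathfrak{a}$ as a $\mathbb{Z}$-module. We claim that $\{\sigma(\alpha_1), \ldots, \sigma(\alpha_n)\}$ is a basis for $\sigma(\mathfrak{a})$ consisting of vectors that are linearly independent over $\mathbb{R}$. To prove this, we will show that the matrix $A$ whose $i$th row is
\[
(\sigma_1(\alpha_i), \ldots, \sigma_r(\alpha_i), \mathrm{Re}(\sigma_{r+1}(\alpha_i)), \mathrm{Im}(\sigma_{r+1}(\alpha_i)), \ldots, \mathrm{Re}(\sigma_{r+s}(\alpha_i)), \mathrm{Im}(\sigma_{r+s}(\alpha_i)))
\]
has nonzero determinant. First consider the matrix $B$ with $i$th row
\[
(\sigma_1(\alpha_i), \ldots, \sigma_r(\alpha_i), \sigma_{r+1}(\alpha_i), \bar{\sigma}_{r+1}(\alpha_i), \ldots, \sigma_{r+s}(\alpha_i), \bar{\sigma}_{r+s}(\alpha_i)).
\]
By definition of the discriminant, $(\det B)^2 = D(\alpha_1, \ldots, \alpha_n) \neq 0$.
Next we relate the determinants of $A$ and $B$. If we perform the column operations $C_{r+2} + C_{r+1} \to C_{r+1}$ and $-\frac{1}{2}C_{r+1} + C_{r+2} \to C_{r+2}$ to matrix $B$, we will have $2\,\mathrm{Re}(\sigma_{r+1}(\alpha_i))$ in $C_{r+1}$ and $-i \cdot \mathrm{Im}(\sigma_{r+1}(\alpha_i))$ in $C_{r+2}$. Repeat this for the remaining pairs of columns to obtain a matrix $A'$. These column operations do not change $\det B$ and it's easy to scale $A'$ to obtain $A$, so we have $\det B = \det A' = (-2i)^s \det A$, or
\[
\det A = (-2i)^{-s}\det B = \pm(-2i)^{-s}D(\alpha_1, \ldots, \alpha_n)^{1/2} \neq 0.
\]
Thus $\{\sigma(\alpha_i)\}$ is a basis for $\sigma(\mathfrak{a})$, which proves $\sigma(\mathfrak{a})$ is a lattice in $V$ of rank $n$.
Now we can write $\sigma(\mathfrak{a}) = \sum \mathbb{Z}\sigma(\alpha_i)$ so the fundamental parallelopiped for $\sigma(\mathfrak{a})$ has volume $|\det A|$. One can prove that
\[
|D(\alpha_1, \ldots, \alpha_n)| = [\mathcal{O}_K : \mathfrak{a}]^2\,|D(\mathcal{O}_K/\mathbb{Z})| = N(\mathfrak{a})^2|d_K|
\]
(see 4.26 in [19] for details). Hence $\mu(D) = |\det A| = 2^{-s}|D(\alpha_1, \ldots, \alpha_n)|^{1/2} = 2^{-s}N(\mathfrak{a})\sqrt{|d_K|}$.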
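The volume formula can be tested in the simplest case $\mathfrak{a} = \mathcal{O}_K$ for an imaginary quadratic field, where $r = 0$, $s = 1$ and $N(\mathfrak{a}) = 1$. The sketch assumes $n < 0$ is squarefree and uses the integral basis $\{1, \sqrt{n}\}$ or $\{1, (1 + \sqrt{n})/2\}$; the function names are hypothetical.

```python
from math import isclose, sqrt


def embedded_basis_volume(n: int) -> float:
    """|det A| for a = O_K with K = Q(sqrt n), n < 0 squarefree (assumed)."""
    root = sqrt(-n)                     # imaginary part of sqrt(n)
    if n % 4 == 1:
        rows = [(1.0, 0.0), (0.5, root / 2)]   # 1 and (1 + sqrt n)/2
    else:
        rows = [(1.0, 0.0), (0.0, root)]       # 1 and sqrt n
    (a, b), (c, d) = rows               # each row is (Re, Im) of the embedding
    return abs(a * d - b * c)


def predicted_volume(n: int) -> float:
    """2^(-s) * N(a) * sqrt(|d_K|) with s = 1 and N(a) = 1."""
    d_K = n if n % 4 == 1 else 4 * n
    return 0.5 * sqrt(abs(d_K))


for n in (-1, -2, -5, -19):
    print(n, embedded_basis_volume(n), predicted_volume(n))
```

In each case the determinant of the embedded basis agrees with $2^{-s}N(\mathfrak{a})\sqrt{|d_K|}$ up to floating-point rounding.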
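Lemma 1.7.8. Let $\mathfrak{a} \subset \mathcal{O}_K$ be an integral ideal. Then $\mathfrak{a}$ contains a nonzero element $\alpha$ whose norm is bounded by
\[
|N(\alpha)| \leq \frac{n!}{n^n}\left(\frac{4}{\pi}\right)^s N(\mathfrak{a})\sqrt{|d_K|}.
\]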
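Proof. For a fixed positive $t \in \mathbb{R}$, define $X(t) = \{v \in V : \|v\| \leq t\}$, where for $v = (x_1, \ldots, x_r, z_1, \ldots, z_s) \in \mathbb{R}^r \times \mathbb{C}^s$ we set $\|v\| = \sum_{i=1}^r |x_i| + 2\sum_{j=1}^s |z_j|$. One can calculate (see 4.27 in [19])
\[
\mu(X(t)) = 2^r\left(\frac{\pi}{2}\right)^s \frac{t^n}{n!}.
\]
The set $X(t)$ is compact (it is closed and bounded), convex and symmetric about the origin. Suppose $t$ is large enough that
\[
\mu(X(t)) \geq 2^n\mu(D)
\]
where $D$ is a fundamental parallelopiped for $\sigma(\mathfrak{a})$. Then Minkowski's theorem says that $X(t)$ contains some $\sigma(\alpha) \neq 0$, for some $\alpha \in \mathfrak{a}$. Consider
\begin{align*}
|N(\alpha)| &= |\sigma_1(\alpha)| \cdots |\sigma_r(\alpha)| \cdot |\sigma_{r+1}(\alpha)|^2 \cdots |\sigma_{r+s}(\alpha)|^2 \\
&\leq \frac{\left(\sum_{i=1}^{r} |\sigma_i(\alpha)| + \sum_{i=r+1}^{r+s} 2|\sigma_i(\alpha)|\right)^n}{n^n} && \text{(geometric mean $\leq$ arithmetic mean)} \\
&\leq \frac{t^n}{n^n}.
\end{align*}
By Proposition 1.7.7, the condition $\mu(X(t)) \geq 2^n\mu(D)$ reads
\[
2^r\left(\frac{\pi}{2}\right)^s \frac{t^n}{n!} \geq 2^n 2^{-s}N(\mathfrak{a})\sqrt{|d_K|} \iff t^n \geq n!\,\frac{2^{n-r}}{\pi^s}\,N(\mathfrak{a})\sqrt{|d_K|}.
\]
So choose $t \in \mathbb{R}$ so that $t^n = n!\,\frac{2^{n-r}}{\pi^s}\,N(\mathfrak{a})\sqrt{|d_K|}$. Since $n - r = 2s$, the above work gives the result:
\[
|N(\alpha)| \leq \frac{n!}{n^n} \cdot \frac{2^{n-r}}{\pi^s}\,N(\mathfrak{a})\sqrt{|d_K|} = \frac{n!}{n^n}\left(\frac{4}{\pi}\right)^s N(\mathfrak{a})\sqrt{|d_K|}.
\]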
We are now ready to prove the main theorem:
Theorem 1.7.9. For a number field $K$, there exists a set of representatives for the class group $C(\mathcal{O}_K)$ consisting of integral ideals $\mathfrak{a}$ whose norms satisfy
\[
N(\mathfrak{a}) \leq \frac{n!}{n^n}\left(\frac{4}{\pi}\right)^s \sqrt{|d_K|}.
\]
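Proof. Write $B_K$ for the bound in the statement. Let $\mathfrak{c}$ be a fractional ideal in $I_K$. There is some $d \in K^*$ that clears the denominators of $\mathfrak{c}^{-1}$, so $\mathfrak{b} := d\mathfrak{c}^{-1}$ is an integral ideal. By Lemma 1.7.8 there exists a nonzero element $\beta \in \mathfrak{b}$ with $|N(\beta)| \le B_K N(\mathfrak{b})$. Note that $\beta\mathcal{O}_K \subset \mathfrak{b}$, which implies $\beta\mathcal{O}_K = \mathfrak{a}\mathfrak{b}$ for some integral ideal $\mathfrak{a}$, with $\mathfrak{a} \sim \mathfrak{b}^{-1} \sim \mathfrak{c}$ in the class group. Then $N(\mathfrak{a})N(\mathfrak{b}) = |N(\beta)| \le B_K N(\mathfrak{b})$. Since $N(\mathfrak{b}) = [\mathcal{O}_K : \mathfrak{b}] > 0$, we can cancel to obtain $N(\mathfrak{a}) \le B_K$.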
In general we can compute C(OK) with the following approach:
1) Use the results from Section 1.3 to list all prime ideals p ⊂ OK that appear in the
factorization of any prime p ≤ BK .
2) Find the group generated by the ideal classes [p] for the primes found in Step 1.
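As a small illustration of Step 1, the following Python sketch evaluates the Minkowski bound and lists the rational primes below it. The function names `minkowski_bound` and `primes_up_to` are hypothetical helpers introduced here, not part of the thesis, and the discriminants are entered by hand.

```python
import math

def minkowski_bound(n, s, disc):
    """(n!/n^n) * (4/pi)^s * sqrt(|disc|) for a field of degree n with
    s pairs of complex embeddings and discriminant disc."""
    return math.factorial(n) / n**n * (4 / math.pi)**s * math.sqrt(abs(disc))

def primes_up_to(bound):
    """Rational primes p <= bound, by trial division (fine for small bounds)."""
    return [p for p in range(2, int(bound) + 1)
            if all(p % q for q in range(2, math.isqrt(p) + 1))]

# Imaginary quadratic fields: n = 2, s = 1; the discriminants are typed in by hand.
for label, disc in [("Q(sqrt(-19))", -19), ("Q(sqrt(-2))", -8), ("Q(sqrt(-17))", -68)]:
    bound = minkowski_bound(2, 1, disc)
    print(label, round(bound, 3), primes_up_to(bound))
```

The primes printed for each field are the ones whose prime ideal factors need to be examined in Step 2; an empty list means the class group is trivial.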
Example 1.7.10. Let $K = \mathbb{Q}(\sqrt{-19})$ with ring of integers $\mathcal{O}_K = \mathbb{Z}[(1+\sqrt{-19})/2]$. Since $n = 2$, $r = 0$, $s = 1$ and $d_K = -19$, the Minkowski bound for $K$ is
\[ B_K = \frac{2!}{2^2}\left(\frac{4}{\pi}\right)^1 \sqrt{19} \approx 2.775. \]
So every class in $C(\mathcal{O}_K)$ is represented by an integral ideal of norm 1 or 2. The ideal $2\mathcal{O}_K$ is unramified in $K$ since $2 \nmid d_K$. The minimal polynomial of $\alpha = (1+\sqrt{-19})/2$ is $f(x) = x^2 - x + 5$, so because $\left(\frac{-19}{2}\right) = -1$ and $f$ has no roots mod 2, Theorem 1.3.4 tells us that $2\mathcal{O}_K$ is inert and thus prime in $K$. Hence there is no ideal of norm 2, and the only ideal of norm 1 is $\mathcal{O}_K$ itself, which is principal, so the class group is trivial. By previous comments $h(-19) = 1$ implies that $\mathbb{Z}[(1+\sqrt{-19})/2]$ is a PID.
Example 1.7.11. Let $K = \mathbb{Q}(\sqrt{-2})$ with $\mathcal{O}_K = \mathbb{Z}[\sqrt{-2}]$. Note that $n = 2$, $r = 0$, $s = 1$ and $d_K = -8$ so the Minkowski bound is calculated to be
\[ B_K = \frac{2!}{2^2}\left(\frac{4}{\pi}\right)^1 \sqrt{8} \approx 1.801. \]
It easily follows that $C(\mathcal{O}_K)$ is trivial and hence $\mathbb{Z}[\sqrt{-2}]$ is a PID. In particular, $\mathbb{Z}[\sqrt{-2}]$ has unique factorization. We will use this fact to deduce a famous theorem of Fermat whose proof was first discovered by Euler.
Theorem 1.7.12 (Fermat). The only integer solutions to $x^3 = y^2 + 2$ are $(x, y) = (3, \pm 5)$.
Proof. First suppose $ab = u^3$ in $\mathbb{Z}[\sqrt{-2}]$ where $a$ and $b$ are relatively prime. We will show that $a$ and $b$ must be cubes in $\mathbb{Z}[\sqrt{-2}]$. Since $\mathbb{Z}[\sqrt{-2}]$ is a UFD, we may write $u = \gamma \prod p_i^{e_i}$ for primes $p_i \in \mathbb{Z}[\sqrt{-2}]$, integers $e_i$ and some unit $\gamma$. Then
\[ ab = u^3 = \left(\gamma \prod p_i^{e_i}\right)^3 = \gamma^3 \prod p_i^{3e_i}. \]
Since $a$ and $b$ are relatively prime, each $p_i$ appears in exactly one of the factorizations of $a$ and $b$. So by the above equality, $a$ and $b$ each factor into products of primes whose exponents are of the form $3e_i$. We have not worried about units yet, but the only units in $\mathbb{Z}[\sqrt{-2}]$ are $\pm 1$, each of which is a cube. Thus we conclude that $a$ and $b$ are both cubes in $\mathbb{Z}[\sqrt{-2}]$.
Now suppose $(x, y)$ is an integer solution to $x^3 = y^2 + 2 = (y + \sqrt{-2})(y - \sqrt{-2})$. If $d$ divides both $y + \sqrt{-2}$ and $y - \sqrt{-2}$, then it divides their difference:
\[ (y + \sqrt{-2}) - (y - \sqrt{-2}) = 2\sqrt{-2}. \]
Since the norm is multiplicative, $N(d)$ divides $N(2\sqrt{-2}) = 8$. Suppose $x$ were even. Then we would have $y^2 + 2 \equiv x^3 \equiv 0 \pmod 8$, or $y^2 \equiv -2 \pmod 8$. Of course $-2$ is not a square mod 8, so $x$ must be odd. This forces $y$ to be odd as well, so $N(d)$ also divides the odd integer $N(y + \sqrt{-2}) = y^2 + 2$. Hence $N(d) = 1$, so $d$ is a unit, and $y + \sqrt{-2}$ and $y - \sqrt{-2}$ are coprime.
By the first part of the proof, $y + \sqrt{-2}$ and $y - \sqrt{-2}$ are both cubes in $\mathbb{Z}[\sqrt{-2}]$. Write $y + \sqrt{-2} = (a + b\sqrt{-2})^3 = (a^3 - 6ab^2) + (3a^2b - 2b^3)\sqrt{-2}$. We now solve for $a$ and $b$ to show that $(3, \pm 5)$ are the only valid choices for $(x, y)$. From the above, we see that $1 = 3a^2b - 2b^3 = b(3a^2 - 2b^2)$. Since $a$ and $b$ are integers, this implies $b = \pm 1$. If $b = -1$, the other factor must be $3a^2 - 2 = -1$, which can be written $3a^2 = 1$. This of course is impossible. So $b = 1$ and this means $3a^2 - 2 = 1$, which has solutions $a = \pm 1$. Plugging these values in above, we see that $y = a^3 - 6ab^2 = \mp 5$ and $x = a^2 + 2b^2 = 3$.
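A brute-force search gives an independent sanity check of the theorem over a finite range. This is only a sketch: the helper `fermat_solutions` and the search limit are choices made here, and a finite search of course does not replace the proof.

```python
def fermat_solutions(limit):
    """All integer pairs (x, y) with |y| <= limit and x^3 = y^2 + 2."""
    solutions = []
    for y in range(-limit, limit + 1):
        target = y * y + 2
        guess = round(target ** (1 / 3))      # floating-point cube root
        for x in (guess - 1, guess, guess + 1):   # guard against rounding error
            if x ** 3 == target:
                solutions.append((x, y))
    return solutions

print(fermat_solutions(100000))   # [(3, -5), (3, 5)]
```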
1.8 The Hilbert Class Field
Prime ideals p ⊂ OK are often referred to as finite primes to distinguish them from
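infinite primes, which are defined as follows.
Definition. A real infinite prime of a number field $K$ is an embedding $\sigma : K \hookrightarrow \mathbb{R}$, while a complex infinite prime is a pair of conjugate embeddings $\sigma, \bar{\sigma} : K \hookrightarrow \mathbb{C}$ with $\sigma \neq \bar{\sigma}$.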
We will see why it is useful to include infinite primes in our list of primes of K in
Section 2.1. For now, we use them to define unramified extensions of a number field, but
first we need to define when an infinite prime ramifies.
Definition. Given an extension L/K, an infinite prime σ of K is said to ramify in
L if σ is real and has an extension to L which is complex.
Example 1.8.1. The infinite prime $\sigma : \mathbb{Q} \hookrightarrow \mathbb{R}$ is unramified in $\mathbb{Q}(\sqrt{2})$ but $\sigma$ is ramified in $\mathbb{Q}(\sqrt{-2})$.
Definition. We say an extension of number fields L/K is unramified if every prime
in K, finite or infinite, is unramified in L.
A number field may have unramified extensions of arbitrary degree – the work
of Golod and Shafarevich [11] in the 1960s was famous for its rather complicated
examples. However, if we restrict our focus to unramified abelian extensions, the
theory becomes more tractable.
Theorem. For every number field K, there exists a finite Galois extension L ⊃ K
such that L is an unramified abelian extension of K, and L contains every other
unramified abelian extension of K.
Proof. This will follow from a more general result established in Section 2.9.
Definition. The Hilbert class field of a number field K is the maximal unramified
abelian extension of K.
For now we will assume the existence of the Hilbert class field and further develop
the connections between Hilbert class fields and rings of integers. The main tool
in describing this relationship is the Artin symbol, whose existence is proved in the
following lemma.
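Lemma 1.8.2. Let $L/K$ be a Galois extension, $\mathfrak{p} \subset \mathcal{O}_K$ an unramified prime and $\mathfrak{P}$ a prime of $\mathcal{O}_L$ lying over $\mathfrak{p}$. Then there is a unique element $\sigma \in \mathrm{Gal}(L/K)$ such that for all $\alpha \in \mathcal{O}_L$,
\[ \sigma(\alpha) \equiv \alpha^{N(\mathfrak{p})} \bmod \mathfrak{P} \]
where $N(\mathfrak{p}) = [\mathcal{O}_K : \mathfrak{p}]$ is the norm of $\mathfrak{p}$.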
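Proof. Let $D = D_{\mathfrak{P}}$ and $I = I_{\mathfrak{P}}$ be the decomposition and inertia groups of $\mathfrak{P} \supset \mathfrak{p}$. Let $\ell = \mathcal{O}_L/\mathfrak{P}$ and $k = \mathcal{O}_K/\mathfrak{p}$, with $G = \mathrm{Gal}(\ell/k)$. Recall from Proposition 1.4.5 that each $\sigma \in D$ maps via $\varphi$ to an element $\bar{\sigma} \in G$. Since $\mathfrak{p}$ is unramified in $L$, $|I| = e(\mathfrak{P} \mid \mathfrak{p}) = 1$ and since $\ker\varphi = I$ by Proposition 1.4.6, $\varphi$ is an isomorphism. Let $q = N(\mathfrak{p}) = |\mathcal{O}_K/\mathfrak{p}|$. It is well known that $G$ is a cyclic group generated by the Frobenius automorphism $x \mapsto x^q$. Thus there is a unique $\sigma \in D$ which maps to the Frobenius automorphism. Since $q = N(\mathfrak{p})$, this $\sigma$ satisfies the congruence in the lemma. Finally, any $\sigma \in \mathrm{Gal}(L/K)$ satisfying the congruence maps $\mathfrak{P}$ into itself and hence lies in $D$, so $\sigma$ is unique in $\mathrm{Gal}(L/K)$.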
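Definition. For a given prime $\mathfrak{P} \subset \mathcal{O}_L$, the unique element $\sigma \in D_{\mathfrak{P}}$ described above is called the Artin symbol, denoted $\left(\frac{L/K}{\mathfrak{P}}\right)$. For all $\alpha \in \mathcal{O}_L$, it satisfies
\[ \left(\frac{L/K}{\mathfrak{P}}\right)(\alpha) \equiv \alpha^{N(\mathfrak{p})} \bmod \mathfrak{P}, \]
where $\mathfrak{p} = \mathfrak{P} \cap \mathcal{O}_K$. The element $\left(\frac{L/K}{\mathfrak{P}}\right)$ is also called a Frobenius element for $\mathfrak{p}$.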
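To make the defining congruence concrete, consider the simplest case $L = \mathbb{Q}(i)$, $K = \mathbb{Q}$ (an example chosen here, not taken from the text). For an odd prime $p$ the sketch below checks, for every residue class of $\mathbb{Z}[i]$ modulo $p$, whether $\alpha \mapsto \alpha^p$ acts as the identity or as complex conjugation; a congruence modulo $p\mathbb{Z}[i]$ implies the same congruence modulo any prime above $p$. The helper names are hypothetical.

```python
def gauss_pow_mod(a, b, e, p):
    """Return (a + b*i)^e reduced mod p, as a pair (real, imag) with entries in [0, p)."""
    result = (1, 0)
    base = (a % p, b % p)
    while e > 0:                      # square-and-multiply
        if e & 1:
            result = ((result[0] * base[0] - result[1] * base[1]) % p,
                      (result[0] * base[1] + result[1] * base[0]) % p)
        base = ((base[0] * base[0] - base[1] * base[1]) % p,
                (2 * base[0] * base[1]) % p)
        e >>= 1
    return result

def frobenius_type(p):
    """Classify alpha -> alpha^p on Z[i]/(p), p an odd prime."""
    residues = [(a, b) for a in range(p) for b in range(p)]
    if all(gauss_pow_mod(a, b, p, p) == (a, b) for a, b in residues):
        return "identity"
    if all(gauss_pow_mod(a, b, p, p) == (a, (-b) % p) for a, b in residues):
        return "conjugation"
    return "neither"

for p in (3, 5, 7, 11, 13):
    print(p, frobenius_type(p))
```

In this example the Frobenius element is trivial exactly for the primes $p \equiv 1 \pmod 4$, which are the odd primes that split in $\mathbb{Q}(i)$.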
We will describe Frobenius automorphisms in greater detail in Section 2.2 but for
now we will focus on their relation to the Hilbert class field.
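Proposition 1.8.3. For a Galois extension $L/K$, an unramified prime $\mathfrak{p} \subset \mathcal{O}_K$ and a prime $\mathfrak{P} \supset \mathfrak{p}$, the Artin symbol has the following properties.
(i) For all $\sigma \in \mathrm{Gal}(L/K)$,
\[ \left(\frac{L/K}{\sigma(\mathfrak{P})}\right) = \sigma \left(\frac{L/K}{\mathfrak{P}}\right) \sigma^{-1}. \]
(ii) The order of $\left(\frac{L/K}{\mathfrak{P}}\right)$ in $D_{\mathfrak{P}}$ is the inertial degree $f = f(\mathfrak{P} \mid \mathfrak{p})$.
(iii) $\mathfrak{p}$ splits completely in $L \iff \left(\frac{L/K}{\mathfrak{P}}\right) = 1$.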
Proof. (i) follows from the uniqueness of $\left(\frac{L/K}{\mathfrak{P}}\right)$ and the fact, proven in Section 1.3, that all primes lying over $\mathfrak{p}$ are conjugates under the action of $\mathrm{Gal}(L/K)$.
(ii) From Lemma 1.8.2, $D_{\mathfrak{P}} \cong G = \mathrm{Gal}(\ell/k)$ and the order of $G$ is $[\mathcal{O}_L/\mathfrak{P} : \mathcal{O}_K/\mathfrak{p}] = f$. By definition, the Artin symbol maps to a generator of $G$ so the order of $\left(\frac{L/K}{\mathfrak{P}}\right)$ is $f$.
(iii) Recall that $\mathfrak{p}$ splits completely if and only if $e = f = 1$. Then $e = 1$ since we are assuming $\mathfrak{p}$ is unramified in $L$, and $f = 1 \iff \left(\frac{L/K}{\mathfrak{P}}\right) = 1$ follows from part (ii).
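When $L/K$ is abelian, the Artin symbol only depends on the underlying prime $\mathfrak{p}$: if $\mathfrak{P}$ and $\mathfrak{P}'$ are both primes of $\mathcal{O}_L$ containing $\mathfrak{p}$, then $\mathfrak{P}' = \sigma(\mathfrak{P})$ for some $\sigma \in \mathrm{Gal}(L/K)$ as we have already shown. Thus (i) of the proposition implies
\[ \left(\frac{L/K}{\mathfrak{P}'}\right) = \left(\frac{L/K}{\sigma(\mathfrak{P})}\right) = \sigma\left(\frac{L/K}{\mathfrak{P}}\right)\sigma^{-1} = \sigma\sigma^{-1}\left(\frac{L/K}{\mathfrak{P}}\right) = \left(\frac{L/K}{\mathfrak{P}}\right). \]
In this case we will write the Artin symbol as $\left(\frac{L/K}{\mathfrak{p}}\right)$ to indicate that it is determined by the underlying prime $\mathfrak{p} \subset \mathcal{O}_K$.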
The Artin symbol is the first step in establishing a powerful tool in class field
theory called Artin reciprocity (Section 2.7). The name comes from the fact that it
is a generalization of more elementary reciprocity laws, such as quadratic, cubic and
biquadratic reciprocities established by Euler, Legendre and Gauss.
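When $L/K$ is an unramified abelian extension, things are especially nice. Let $I_K$ be the group of fractional ideals of $\mathcal{O}_K$. For any $\mathfrak{a} \in I_K$ with prime factorization $\mathfrak{a} = \prod \mathfrak{p}_i^{r_i}$ we can define the Artin symbol on $\mathfrak{a}$ by
\[ \left(\frac{L/K}{\mathfrak{a}}\right) = \prod \left(\frac{L/K}{\mathfrak{p}_i}\right)^{r_i}. \]
Definition. The Artin map for an extension $L/K$ is the homomorphism
\[ \left(\frac{L/K}{\cdot}\right) : I_K \longrightarrow \mathrm{Gal}(L/K). \]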
Notice that if L/K is ramified at any primes, the Artin map is not defined for all
of IK . Likewise if Gal(L/K) is not abelian, the Artin symbol may not be uniquely
defined for all p ∈ IK . For this reason many of the main theorems in class field theory
are complicated to state, as we will see in Chapter 2. However when L is the Hilbert
class field of K we have the following characterization of the Artin map.
Theorem 1.8.4 (Artin Reciprocity for the Hilbert Class Field). If $L$ is the Hilbert class field of a number field $K$, the Artin map $\left(\frac{L/K}{\cdot}\right) : I_K \longrightarrow \mathrm{Gal}(L/K)$ is surjective and its kernel is $P_K$. Therefore the Artin map induces an isomorphism $C(\mathcal{O}_K) \cong \mathrm{Gal}(L/K)$ where $C(\mathcal{O}_K) = I_K/P_K$ is the ideal class group.
Proof. This will follow from the full Artin reciprocity theorem in Section 2.7.
Using Galois theory, we have the following classification of unramified abelian
extensions of K.
Corollary 1.8.5. For a number field $K$, there is a one-to-one correspondence
\[ \{\text{unramified abelian extensions } M \supset K\} \longleftrightarrow \{\text{subgroups } H \le C(\mathcal{O}_K)\}. \]
Furthermore, if the extension $M/K$ corresponds to the subgroup $H$, then the Artin map induces an isomorphism $C(\mathcal{O}_K)/H \cong \mathrm{Gal}(M/K)$.
Proof. This too will be proven in a more general setting in Section 2.9.
This is a good example of the general strategy employed in class field theory:
describe a certain type of extension of K (in this case unramified abelian extensions)
using information encoded in K itself, e.g. subgroups of the class group.
Corollary 1.8.6. Let $L$ be the Hilbert class field of a number field $K$ and let $\mathfrak{p} \subset \mathcal{O}_K$ be a prime ideal. Then $\mathfrak{p}$ splits completely in $L \iff \mathfrak{p}$ is a principal ideal.
Proof. By (iii) of Proposition 1.8.3, $\mathfrak{p}$ splits completely if and only if $\left(\frac{L/K}{\mathfrak{p}}\right) = 1$. Since the Artin map induces $C(\mathcal{O}_K) \cong \mathrm{Gal}(L/K)$ by the Artin reciprocity theorem (Theorem 1.8.4), $\left(\frac{L/K}{\mathfrak{p}}\right) = 1 \iff [\mathfrak{p}]$ is trivial in the class group, which is equivalent to $\mathfrak{p}$ being a principal ideal.
The Hilbert class field has an important application to the study of primes of the form $p = x^2 + ny^2$.
Theorem 1.8.7 ([7]). Let $n > 0$ be a squarefree integer such that $n \not\equiv 3 \pmod 4$. Then there is a monic irreducible polynomial $f_n(x) \in \mathbb{Z}[x]$ of degree $h(-4n)$ (the class number of $K = \mathbb{Q}(\sqrt{-n})$) such that if $p$ is an odd prime that does not divide $n$ or the discriminant of $f_n$, then
\[ p = x^2 + ny^2 \iff \left(\frac{-n}{p}\right) = 1 \text{ and } f_n(x) \equiv 0 \pmod p \text{ has an integer solution.} \]
Furthermore, any choice of $f_n(x)$ will be the minimal polynomial of a real algebraic integer $\alpha$ for which $L = K(\alpha)$ is the Hilbert class field of $K$.
We devote the rest of this section to the proof of Theorem 1.8.7 and its applications. The first step is to relate $p = x^2 + ny^2$ to the splitting behavior of $p$ in the Hilbert class field.
Theorem 1.8.8. Let $L$ be the Hilbert class field of $K = \mathbb{Q}(\sqrt{-n})$, where $n > 0$ is squarefree and $n \not\equiv 3 \pmod 4$, so that $\mathcal{O}_K = \mathbb{Z}[\sqrt{-n}]$. If $p$ is an odd prime not dividing $n$, then
\[ p = x^2 + ny^2 \iff p \text{ splits completely in } L. \]
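Proof. We will prove
\[ d_K = -4n \iff \mathcal{O}_K = \mathbb{Z}[\sqrt{-n}] \iff n \text{ is squarefree and } n \not\equiv 3 \pmod 4 \]
in the next section. For now, assume the conditions on $n$ imply that $d_K = -4n$. Let $p$ be an odd prime not dividing $n$, so that $p \nmid d_K$. By Theorem 1.6.3 this means that $p$ is unramified in $K$. To prove the theorem, we will prove
\[ \begin{aligned}
\text{(i)}\;\; p = x^2 + ny^2 &\iff p\mathcal{O}_K = \mathfrak{p}\mathfrak{q} \text{ where } \mathfrak{p} \neq \mathfrak{q} \text{ and } \mathfrak{p} \text{ is principal in } \mathcal{O}_K && \text{(ii)} \\
&\iff p\mathcal{O}_K = \mathfrak{p}\mathfrak{q},\ \mathfrak{p} \neq \mathfrak{q} \text{ and } \mathfrak{p} \text{ splits completely in } L && \text{(iii)} \\
&\iff p \text{ splits completely in } L. && \text{(iv)}
\end{aligned} \]
(i) $\iff$ (ii) Suppose $p = x^2 + ny^2 = (x + y\sqrt{-n})(x - y\sqrt{-n})$. Let $\mathfrak{p} = (x + y\sqrt{-n})\mathcal{O}_K$. Then $p\mathcal{O}_K = \mathfrak{p}\mathfrak{q}$ must be the prime factorization of $p\mathcal{O}_K$, where $\mathfrak{q} = \bar{\mathfrak{p}} = (x - y\sqrt{-n})\mathcal{O}_K$. Since $p$ is unramified, $\mathfrak{p} \neq \mathfrak{q}$. This entire argument is reversible, so we have proved the first equivalence.
(ii) $\iff$ (iii) follows from Corollary 1.8.6.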
(iii) $\iff$ (iv) First we prove that $L$ is Galois over $\mathbb{Q}$. To do this, let $\tau$ denote complex conjugation. It is easy to see that $\tau(L)$ is an unramified abelian extension of $\tau(K) = K$. Then since $[\tau(L) : K] = [L : K]$ and $L$ is the maximal unramified abelian extension of $K$ by definition, we must have $\tau(L) = L$. Hence $\tau$ restricts to an automorphism of $L$ that is nontrivial on $K$, and together with $\mathrm{Gal}(L/K)$ this implies $L/\mathbb{Q}$ is Galois by conventional Galois theory arguments.
To finish the final equivalence, note that condition (iii) says that $p$ splits in $K$ and some prime of $K$ lying over $p$ splits completely in $L$. Since $L/\mathbb{Q}$ is Galois, this is the same as $p$ splitting completely in $L$. Hence $p = x^2 + ny^2$ if and only if $p$ splits completely in $L$.
The next step is to further describe the criteria for when p splits in L.
Theorem 1.8.9. Let $K$ be an imaginary quadratic field and $L$ be a finite extension of $K$ that is Galois over $\mathbb{Q}$. Then
(1) There exists a real algebraic integer $\alpha$ such that $L = K(\alpha)$.
(2) Let $f$ denote the minimal polynomial of $\alpha$ over $\mathbb{Q}$, with $f(x) \in \mathbb{Z}[x]$. If $p$ is an odd prime not dividing the discriminant of $f(x)$, then
\[ p \text{ splits completely in } L \iff \left(\frac{d_K}{p}\right) = 1 \text{ and } f(x) \equiv 0 \pmod p \text{ has an integer solution.} \]
Proof. (1) By hypothesis, L/Q is Galois so [L ∩ R : Q] = [L : K] since L ∩ R is
the fixed field of complex conjugation. Then for any α ∈ L ∩ R, L ∩ R = Q(α)
precisely when L = K(α). Hence if α ∈ OL ∩ R such that L ∩ R = Q(α) then α is a
real algebraic integer generating the extension L/K. Such an element exists by the
primitive element theorem.
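(2) Now let $f$ be the minimal polynomial of $\alpha$ over $\mathbb{Q}$. By the first part, $[L \cap \mathbb{R} : \mathbb{Q}] = [L : K]$ so $f$ is also the minimal polynomial of $\alpha$ over $K$. Let $p$ be a prime not dividing the discriminant of $f(x)$. Then $f(x)$ is separable mod $p$, and by Proposition 1.3.6,
\[ p\mathcal{O}_K = \mathfrak{p}\bar{\mathfrak{p}} \text{ where } \mathfrak{p} \neq \bar{\mathfrak{p}} \iff \left(\frac{d_K}{p}\right) = 1. \]
We may assume $p$ splits completely in $K$, so that $\mathbb{Z}/p\mathbb{Z} \cong \mathcal{O}_K/\mathfrak{p}$. Since $f(x)$ is separable over $\mathbb{Z}/p\mathbb{Z}$, it is separable over $\mathcal{O}_K/\mathfrak{p}$. Then Theorem 1.3.4 gives us
\[ \begin{aligned}
\mathfrak{p} \text{ splits completely in } L &\iff f(x) \equiv 0 \bmod \mathfrak{p} \text{ is solvable in } \mathcal{O}_K \\
&\iff f(x) \equiv 0 \bmod p \text{ is solvable in } \mathbb{Z}.
\end{aligned} \]
Finally (2) is proven using (iii) $\iff$ (iv) from the previous proof.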
We are now ready to prove Theorem 1.8.7.
Proof. Since the Hilbert class field $L$ of $K = \mathbb{Q}(\sqrt{-n})$ is Galois over $\mathbb{Q}$, Theorem 1.8.9 says there is a real algebraic integer $\alpha$ which is a primitive element of the extension $L/K$. Let $f_n$ be its minimal polynomial and let $p$ be an odd prime that does not divide $n$ or the discriminant of $f_n$. Then the previous two theorems show that
\[ \begin{aligned}
p = x^2 + ny^2 &\iff p \text{ splits completely in } L \\
&\iff \left(\frac{-n}{p}\right) = 1 \text{ and } f_n(x) \equiv 0 \bmod p \text{ is solvable in } \mathbb{Z}.
\end{aligned} \]
As discussed in the proof of Theorem 1.8.8, the hypotheses imply that $d_K = -4n$ so
\[ \left(\frac{d_K}{p}\right) = \left(\frac{-n}{p}\right). \]
It remains to show that $\deg f_n = h(-4n)$, but by Artin reciprocity, $[L : K] = |\mathrm{Gal}(L/K)| = |C(\mathcal{O}_K)|$, and $h(-4n) = |C(\mathcal{O}_K)|$ when $K = \mathbb{Q}(\sqrt{-n})$, so the theorem is proved.
The polynomial $f_n(x)$ is not unique since $L/K$ has infinitely many primitive elements. We can at least use this theorem to predict $\deg f_n$, and later we will see that $f_n(x)$ completely describes the Hilbert class field, quite an amazing result indeed!
The Hilbert class field also allows us to relate the ideal class group C(OK) to the
form class group C(dK) for binary quadratic forms. In Section 3.2 we prove
Theorem. Let $K$ be an imaginary quadratic field of discriminant $d_K = -4n$, $n \ge 1$.
(1) If $f(x, y) = ax^2 + bxy + cy^2$ is a primitive positive definite quadratic form of discriminant $d_K$, then
\[ [a, (-b + \sqrt{d_K})/2] = \{ma + n(-b + \sqrt{d_K})/2 \mid m, n \in \mathbb{Z}\} \]
is an ideal of $\mathcal{O}_K$.
(2) The map $f(x, y) \mapsto [a, (-b + \sqrt{d_K})/2]$ induces an isomorphism $C(d_K) \cong C(\mathcal{O}_K)$, and hence $|C(\mathcal{O}_K)| = h(d_K)$, which is the number of reduced forms of discriminant $d_K$.
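Since the class number equals the number of reduced forms, it can be computed by direct enumeration. The sketch below assumes the usual reduction conditions $|b| \le a \le c$, with $b \ge 0$ whenever $|b| = a$ or $a = c$; the function name `reduced_forms` is a hypothetical helper introduced here.

```python
import math

def reduced_forms(D):
    """Primitive reduced positive definite forms (a, b, c) with b^2 - 4ac = D < 0."""
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("D must be negative and congruent to 0 or 1 mod 4")
    forms = []
    a = 1
    while 3 * a * a <= -D:                 # a reduced form satisfies 3a^2 <= |D|
        for b in range(-a + 1, a + 1):     # -a < b <= a
            numerator = b * b - D          # must equal 4ac
            if numerator % (4 * a):
                continue
            c = numerator // (4 * a)
            if c < a or (c == a and b < 0):
                continue
            if math.gcd(math.gcd(a, b), c) == 1:
                forms.append((a, b, c))
        a += 1
    return forms

for D in (-8, -19, -47, -68):
    print(D, len(reduced_forms(D)), reduced_forms(D))
```

For the discriminants appearing in this chapter the counts are 1, 1, 5 and 4, matching the class numbers found by the Minkowski bound arguments and by Magma.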
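Lemma 1.8.10. Let $L = K(\sqrt{\beta})$ for some $\beta \in \mathcal{O}_K$ and let $\mathfrak{p} \subset \mathcal{O}_K$ be a prime ideal. Then $\mathfrak{p}$ is unramified in $L$ if either of the following two conditions is met:
(i) $2\beta \notin \mathfrak{p}$, or
(ii) $2 \in \mathfrak{p}$, $\beta \notin \mathfrak{p}$ and $\beta = b^2 - 4c$ for some $b, c \in \mathcal{O}_K$.
Proof. (i) Since the discriminant of $x^2 - \beta$ is $4\beta \notin \mathfrak{p}$, $x^2 - \beta$ is separable mod $\mathfrak{p}$ and hence $\mathfrak{p}$ is unramified by Theorem 1.3.4.
(ii) Note that $L = K(\gamma)$ as well, where $\gamma = \frac{-b + \sqrt{\beta}}{2}$ is a root of $x^2 + bx + c$. The discriminant of $x^2 + bx + c$ is $b^2 - 4c = \beta \notin \mathfrak{p}$ so by Theorem 1.3.4 again, $\mathfrak{p}$ is unramified.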
Example 1.8.11. Let $K = \mathbb{Q}(\sqrt{-17})$. Our goal is to prove a characterization of primes of the form $p = x^2 + 17y^2$ using Theorem 1.8.7. Note that the degree is 2, $r = 0$, $s = 1$ and $d_K = -68$ so the Minkowski bound is computed as
\[ B_K = \frac{2!}{2^2}\left(\frac{4}{\pi}\right)^1 \sqrt{68} \approx 5.250. \]
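Thus the class group $C(\mathcal{O}_K)$ is generated by prime ideals with norm $\le 5$. These are the prime factors of the ideals $p\mathcal{O}_K$ for $p = 2, 3$ and $5$. Corollary 1.6.13 tells us that of these, only 2 ramifies, so we have the following factorizations:
• $2\mathcal{O}_K = \mathfrak{p}_2^2$ where $\mathfrak{p}_2$ is prime.
• Using quadratic reciprocity, we calculate
\[ \left(\frac{-17}{3}\right) = \left(\frac{-1}{3}\right)\left(\frac{17}{3}\right) = \left(\frac{-1}{3}\right)\left(\frac{2}{3}\right) = (-1)(-1) = 1. \]
Thus by our characterization of quadratic extensions in Proposition 1.3.6, 3 splits in $K$ and we write $3\mathcal{O}_K = \mathfrak{p}_3\mathfrak{p}_3'$ for prime ideals $\mathfrak{p}_3 \neq \mathfrak{p}_3'$.
• Likewise, for 5 we have
\[ \left(\frac{-17}{5}\right) = \left(\frac{-1}{5}\right)\left(\frac{17}{5}\right) = \left(\frac{-1}{5}\right)\left(\frac{2}{5}\right) = (1)(-1) = -1. \]
So 5 is inert, i.e. $5\mathcal{O}_K$ is prime.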
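The Legendre symbol computations above can be double-checked with Euler's criterion. This is a minimal sketch; `legendre` and `splitting_type` are hypothetical helper names, and the splitting rule used is the one for odd primes in a quadratic field of discriminant $d$.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def splitting_type(d, p):
    """Splitting of an odd prime p in the quadratic field of discriminant d."""
    return {1: "splits", -1: "inert", 0: "ramifies"}[legendre(d, p)]

for p in (3, 5, 17):
    print(p, legendre(-17, p), splitting_type(-68, p))
```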
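This shows that $C(\mathcal{O}_K)$ may be generated by $[\mathfrak{p}_2]$ and $[\mathfrak{p}_3]$, since $\mathfrak{p}_3\mathfrak{p}_3' = 3\mathcal{O}_K$ is principal and so $[\mathfrak{p}_3'] = [\mathfrak{p}_3]^{-1}$. Suppose $\mathfrak{p}_2$ is principal, say $\mathfrak{p}_2 = \alpha\mathcal{O}_K$ for $\alpha = a + b\sqrt{-17}$. Then $2\mathcal{O}_K = \mathfrak{p}_2^2 = \alpha^2\mathcal{O}_K$ so we must have $4 = N(2\mathcal{O}_K) = N(\alpha)^2$, or $N(\alpha) = \pm 2$. However the equation $a^2 + 17b^2 = \pm 2$ has no integer solutions, so $\mathfrak{p}_2$ must not be principal. Thus its ideal class is an element of order 2 in the class group. Similar arguments show that $\mathfrak{p}_3$ is not principal, and that $[\mathfrak{p}_3]^2 = [\mathfrak{p}_2]$. Therefore $[\mathfrak{p}_3]$ has order 4 and $|C(\mathcal{O}_K)| = 4$.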
We claim that the Hilbert class field of $K$ is $L = K(\alpha)$, where $\alpha = \sqrt{(1 + \sqrt{17})/2}$, following a suggestion in [7]. The work above shows the Hilbert class field is a degree 4 extension of $K$, so it suffices to show that $L = K(\alpha)$ is an unramified abelian extension of degree 4 over $K$; the claim will then follow from the uniqueness of the Hilbert class field.
It is easy to verify, using the minimal polynomial $x^2 - x - 4$ for $\alpha^2 = (1 + \sqrt{17})/2$, that the minimal polynomial for $\alpha$ is $f(x) = x^4 - x^2 - 4$, which splits in $L$. This shows that $L/K$ is Galois, so $[L : K] = 4$. Of course every group of order 4 is abelian, so $L/K$ is an abelian extension. It remains to check that $L/K$ is unramified at every prime of $\mathcal{O}_K$.
Of course any infinite prime is unramified since $K = \mathbb{Q}(\sqrt{-17})$ is imaginary quadratic and thus has no real embeddings. We will use Lemma 1.8.10 to show that $E/K$ and $L/E$, where $E = K(\sqrt{17})$, are both unramified extensions and it will follow that $L/K$ is unramified. As a sidenote, observe that $\alpha^2 = (1 + \sqrt{17})/2$ implies $\sqrt{17} \in L$, so $K \subset K(\sqrt{17}) \subset L$ and thus it makes sense to define the extensions $E/K$ and $L/E$.
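Let $\mathfrak{p}$ be a prime ideal of $\mathcal{O}_K$. Since $\sqrt{-17}/\sqrt{17} = \sqrt{-1}$, we also have $E = K(\sqrt{-1})$, so (i) of Lemma 1.8.10 with $\beta = -1$ tells us that $\mathfrak{p}$ is unramified in $E$ whenever $2 \notin \mathfrak{p}$. So let us assume $2 \in \mathfrak{p}$. Note that $17 \notin \mathfrak{p}$ and 17 can be written
\[ 17 = 1^2 - 4(-4) \]
and $1, -4 \in \mathbb{Z} \subset \mathcal{O}_K$ so (ii) of the lemma tells us that $\mathfrak{p}$ is unramified in $E$. Thus $E/K$ is an unramified extension.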
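Now we turn our attention to $L/E$. Let $\mu = (1 + \sqrt{17})/2$ and $\mu' = (1 - \sqrt{17})/2$, so that $L = E(\sqrt{\mu}) = E(\sqrt{\mu'})$. Suppose $\mathfrak{p} \subset \mathcal{O}_E$ is a prime ideal; we may assume $2 \in \mathfrak{p}$ by (i), and furthermore $1 \notin \mathfrak{p}$, else it is the whole ring of integers. Notice that $\mu + \mu' = 1 \notin \mathfrak{p}$, so that either $\mu \notin \mathfrak{p}$ or $\mu' \notin \mathfrak{p}$. But each of these satisfies $x = x^2 - 4$, which has the form $b^2 - 4c$ with $b = x$ and $c = 1$, so (ii) of the lemma tells us that $\mathfrak{p}$ is unramified.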
We have shown L/K to be an unramified abelian extension of degree 4, so by
uniqueness it is the Hilbert class field. We now use this to prove a theorem for primes
of the form x2 + 17y2.
Theorem 1.8.12. Let $p \neq 17$ be an odd prime. Then
\[ p = x^2 + 17y^2 \iff \left(\frac{-17}{p}\right) = 1 \text{ and } x^2(x^2 - 1) \equiv 4 \bmod p \text{ has an integer solution.} \]
Proof. Let $K = \mathbb{Q}(\sqrt{-17})$. We proved that the Hilbert class field of $K$ is $L = K(\alpha)$ where $\alpha = \sqrt{(1 + \sqrt{17})/2}$. We also know that the minimal polynomial for $\alpha$ is $f_{17}(x) = x^4 - x^2 - 4 = x^2(x^2 - 1) - 4$. Note that the discriminant of $f_{17}$ is $-2^6 \cdot 17^2$, which explains why we remove $p = 2$ and $17$ from consideration. The result follows from Theorem 1.8.7.
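The statement of Theorem 1.8.12 is easy to test numerically. The sketch below compares both sides of the equivalence for every odd prime $p \neq 17$ below a chosen limit; all function names are hypothetical helpers, and a finite check is only a sanity test, not a proof.

```python
import math

def is_prime(m):
    return m >= 2 and all(m % q for q in range(2, math.isqrt(m) + 1))

def is_x2_plus_17y2(p):
    """True if p = x^2 + 17 y^2 for some integers x, y."""
    for y in range(math.isqrt(p // 17) + 1):
        r = p - 17 * y * y
        x = math.isqrt(r)
        if x * x == r:
            return True
    return False

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def quartic_has_root(p):
    """True if x^4 - x^2 - 4 has a root modulo p."""
    return any((x**4 - x * x - 4) % p == 0 for x in range(p))

def counterexamples(limit):
    """Odd primes p != 17 below limit where the two sides of the theorem disagree."""
    bad = []
    for p in range(3, limit):
        if p == 17 or not is_prime(p):
            continue
        if is_x2_plus_17y2(p) != (legendre(-17, p) == 1 and quartic_has_root(p)):
            bad.append(p)
    return bad

print(counterexamples(2000))   # an empty list is consistent with the theorem
```

For instance $53 = 6^2 + 17 \cdot 1^2$, and correspondingly $x = 8$ is a root of $x^4 - x^2 - 4$ modulo 53, while 13 satisfies the Legendre condition but the quartic has no root modulo 13.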
It is clear that even when K is only quadratic, the Hilbert class field is nontrivial
to compute. In the next example we show how Magma can be used to facilitate these
computations.
Example 1.8.13. Let K = Q(√−47). According to Magma, the class number of K
is 5:
> R<x> := PolynomialRing(RationalField());
> K<a> := NumberField(x^2 + 47);
> ClassNumber(K);
5
The next command produces the minimal polynomial for a primitive element of
L/K, where L is the Hilbert class field of K.
> L<b> := HilbertClassField(K);
> f<x> := MinimalPolynomial(b,RationalField());
> f;
x^10 + 10*x^8 - 295*x^6 + 17200*x^4 + 726840*x^2 + 6539063
Set $f(x) = x^{10} + 10x^8 - 295x^6 + 17200x^4 + 726840x^2 + 6539063$ and take $\alpha$ to be a root of $f$. Let $L = \mathbb{Q}(\alpha)$. It is easy to verify that $L$ is indeed the Hilbert class field of $K$. Since $f$ splits in $L$, $L/\mathbb{Q}$ is Galois and $[L : \mathbb{Q}] = 10$. This of course means $L/K$ is Galois and $[L : K] = 5$. Further, the only group of order 5 is $\mathbb{Z}/5\mathbb{Z}$ which is abelian, so $L/K$ is an abelian extension. Finally, Magma shows the extension is unramified:
> OL := MaximalOrder(L);
> IsUnramified(OL);
true
The commands Discriminant(OL) and Different(OL) may also be used to verify
ramification. In any case, we have shown L = Q(α) to be the maximal unramified
abelian extension of K. Unfortunately, we cannot prove a theorem for primes of the
form x2 + 47y2 just yet, as 47 ≡ 3 (mod 4) and Theorem 1.8.7 won’t apply.
1.9 Orders
In the previous section we were able to prove a full characterization of when a prime
is of the form p = x2 + ny2 given certain restrictions on n. We have thus described
the main question for infinitely many n, but what about the rest?
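In general, if $K = \mathbb{Q}(\sqrt{n})$ for a squarefree integer $n \neq 0, 1$, we have the following characterization (see [7]) of the ring of integers:
\[ \mathcal{O}_K = \begin{cases} \mathbb{Z}[\sqrt{n}] & \text{if } n \not\equiv 1 \pmod 4 \\ \mathbb{Z}\left[\frac{1 + \sqrt{n}}{2}\right] & \text{if } n \equiv 1 \pmod 4. \end{cases} \]
Recall that for a quadratic extension, the field discriminant is given by
\[ d_K = \begin{cases} n & \text{if } n \equiv 1 \pmod 4 \\ 4n & \text{otherwise.} \end{cases} \]
Using this allows us to write the ring of integers more succinctly:
\[ \mathcal{O}_K = \mathbb{Z}\left[\frac{d_K + \sqrt{d_K}}{2}\right]. \]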
The important thing is that when $n$ does not satisfy the criteria in Section 1.8, i.e. when $\mathbb{Z}[\sqrt{-n}]$ is not the full ring of integers for $\mathbb{Q}(\sqrt{-n})$, we still have a characterization that involves $\mathbb{Z}[\sqrt{-n}]$. We will make some headway on the $x^2 + ny^2$ question towards the end of this section, but a full characterization of primes of the form $x^2 + ny^2$ will not be possible until we have the theorems of class field theory at our disposal.
The ring Z[√−n] is an example of an order. In Lemma 1.9.2 we will prove that
the following definition is equivalent to the one given in Section 1.1.
Definition. Let K be a number field. Then a subring O ⊂ K is an order if
• 1K ∈ O
• O is finitely generated as a Z-module
• O contains a Q-basis of K.
There is a more general notion of an order in an arbitrary ring R, but the behavior
is quite different even when R is not a field. We will primarily make use of orders in
quadratic fields.
Proposition 1.9.1. Let O be an order in a quadratic number field K.
(1) O is a free Z-module of rank 2.
(2) K is the field of fractions of O.
(3) OK is an order in K containing every other order. In other words OK is the
maximal order in K.
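Proof. (1) Clearly $\mathcal{O}$ is torsion free, so since it is a finitely generated $\mathbb{Z}$-module it is free. Also, since $\mathcal{O}$ contains a $\mathbb{Q}$-basis of the quadratic field $K$, it has rank at least 2, and since $\mathcal{O} \subset K$ it has rank at most 2, so it must be exactly rank 2.
(2) follows from the fact that $\mathcal{O}$ contains a $\mathbb{Q}$-basis for $K$.
(3) Since $1_K \in \mathcal{O}_K$ and $\mathcal{O}_K$ is a $\mathbb{Z}$-module of rank $[K : \mathbb{Q}] = 2$ by Proposition 1.1.13, it suffices to show that $\mathcal{O}_K$ contains a basis for $K/\mathbb{Q}$. But this follows from the discussion above: $\mathcal{O}_K$ is generated by 1 and $\frac{d_K + \sqrt{d_K}}{2}$.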
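Now let $\mathcal{O}$ be any order in $K$. Since $\mathcal{O}$ is a finitely generated $\mathbb{Z}$-module, it is noetherian. Let $\alpha \in \mathcal{O}$ and consider the chain of $\mathbb{Z}$-submodules $I_0 \subset I_1 \subset I_2 \subset \cdots$ where $I_0 = \mathbb{Z}$ and for $n \ge 1$,
\[ I_n = \mathbb{Z} + \alpha\mathbb{Z} + \alpha^2\mathbb{Z} + \ldots + \alpha^n\mathbb{Z}. \]
By the noetherian condition, there is some $n$ such that for all $m \ge n$, $I_m = I_n$. So for all such $m$ we have $\mathbb{Z} + \alpha\mathbb{Z} + \ldots + \alpha^m\mathbb{Z} = \mathbb{Z} + \alpha\mathbb{Z} + \ldots + \alpha^n\mathbb{Z}$. This implies that every power $\alpha^m$ is a $\mathbb{Z}$-linear combination of $1, \alpha, \ldots, \alpha^n$. This shows that $\mathbb{Z}[\alpha]$ is finitely generated as a $\mathbb{Z}$-module, so Proposition 1.1.4 shows $\alpha \in \mathcal{O}_K$. Thus $\mathcal{O} \subset \mathcal{O}_K$.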
Recall that Z+ niZ is an order in Q(i) for every nonzero n ∈ Z. The next lemma
shows that this is essentially the form of every order in a quadratic field.
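Lemma 1.9.2. Let $\mathcal{O}$ be an order in a quadratic field $K$ with discriminant $d_K$ and ring of integers $\mathcal{O}_K$. Then $f = [\mathcal{O}_K : \mathcal{O}]$ is finite and $\mathcal{O} = \mathbb{Z} + f\mathcal{O}_K$.
Proof. The finiteness of $f$ is a result of the fact that $\mathcal{O}$ and $\mathcal{O}_K$ are both free $\mathbb{Z}$-modules of rank 2. On one hand, since $f = [\mathcal{O}_K : \mathcal{O}]$ we have
\[ f\mathcal{O}_K \subset \mathcal{O} \implies \mathbb{Z} + f\mathcal{O}_K \subset \mathcal{O}. \]
On the other hand, our description of $\mathcal{O}_K$ at the beginning of the section allows us to write $\mathbb{Z} + f\mathcal{O}_K = [1, fw_K]$, where
\[ w_K = \frac{d_K + \sqrt{d_K}}{2}. \]
Clearly $[1, fw_K]$ has index $f$ in $[1, w_K] = \mathcal{O}_K$. Since $\mathbb{Z} + f\mathcal{O}_K \subset \mathcal{O}$ and both have index $f$ in $\mathcal{O}_K$, they are equal, which proves the result.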
Definition. The index f = [OK : O] is called the conductor of the order.
This is not to be confused with the conductor of an extension in class field theory,
which will be discussed in Section 2.8. To add to the clutter, each order has an
associated value called the discriminant which is distinct from, although related to,
the field discriminant.
Definition. For an order $[\alpha, \beta]$, its discriminant is defined to be
\[ D = \left(\det \begin{bmatrix} \alpha & \beta \\ \alpha' & \beta' \end{bmatrix}\right)^2 \]
where $\alpha'$ and $\beta'$ denote the respective images of $\alpha$ and $\beta$ under the nontrivial automorphism of $K/\mathbb{Q}$.
The discriminant of an order is independent of the basis chosen: if $A = \begin{bmatrix} \alpha & \beta \\ \alpha' & \beta' \end{bmatrix}$, then changing basis amounts to multiplying $A$ on the right by a matrix $B \in \mathrm{GL}_2(\mathbb{Z})$, and since $\det B = \pm 1$ this does not change the square of the determinant above. Therefore we can let $\mathcal{O} = [1, fw_K]$ as in Lemma 1.9.2 and have $D = f^2 d_K$. This shows that an order is determined by its conductor. Moreover, the maximal order $\mathcal{O}_K$ has conductor 1, which shows that the discriminant of $\mathcal{O}_K$ is $d_K$.
By our description of $d_K$ for quadratic fields, we see that $D \equiv 0, 1 \pmod 4$. Let $K = \mathbb{Q}(\sqrt{-n})$ for any positive integer $n$. Then $\mathbb{Z}[\sqrt{-n}]$ is an order in $K$ with discriminant $-4n$. By the comments above, $-4n = f^2 d_K$ which makes it relatively easy to compute the conductor of $\mathbb{Z}[\sqrt{-n}]$.
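The relation $-4n = f^2 d_K$ translates directly into a computation. The sketch below first extracts the squarefree part of $n$ to find $d_K$ and then solves for $f$; the function names are hypothetical helpers introduced for illustration.

```python
import math

def squarefree_part(n):
    """Return (m, k) with n = m * k^2 and m squarefree, for an integer n >= 1."""
    m, k, d = n, 1, 2
    while d * d <= m:
        while m % (d * d) == 0:   # strip square factors d^2
            m //= d * d
            k *= d
        d += 1
    return m, k

def field_discriminant(m):
    """d_K for K = Q(sqrt(m)), where m is squarefree and m != 0, 1."""
    return m if m % 4 == 1 else 4 * m

def conductor_of_order(n):
    """Return (d_K, f) for the order Z[sqrt(-n)] in Q(sqrt(-n)), n >= 1."""
    m, _ = squarefree_part(n)
    d_K = field_discriminant(-m)
    f_squared = (-4 * n) // d_K
    f = math.isqrt(f_squared)
    if f * f != f_squared:
        raise ArithmeticError("-4n is not of the form f^2 * d_K")
    return d_K, f

for n in (2, 3, 12, 17, 47):
    print(n, conductor_of_order(n))
```

For example $\mathbb{Z}[\sqrt{-3}]$ and $\mathbb{Z}[\sqrt{-47}]$ both have conductor 2, while $\mathbb{Z}[\sqrt{-17}]$ is the maximal order.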
In fact, if $D \equiv 0$ or $1 \pmod 4$ is not a perfect square, there will be an order in a quadratic field whose discriminant is $D$. For $D \equiv 0 \pmod 4$, we may write $D = 4n$ and see that the order $\mathbb{Z}[\sqrt{n}] = [1, \sqrt{n}]$ in $K = \mathbb{Q}(\sqrt{n})$ has discriminant $4n = D$. On the other hand, if $D \equiv 1 \pmod 4$, the order $\mathbb{Z}\left[\frac{1 + \sqrt{D}}{2}\right]$ in $\mathbb{Q}(\sqrt{D})$ has discriminant $D$.
Recall that OK is a Dedekind domain and has unique factorization of ideals.
Unfortunately this is not true in general for an order $\mathcal{O} \subsetneq \mathcal{O}_K$ so our description of
the ideals of O requires a bit more care. It turns out that we can still define a class
group C(O) by restricting to certain types of ideals. One should view the subsequent
construction as a precursor to the types of constructions used in class field theory in
Chapter 2.
Proposition 1.9.3. Let a be a nonzero ideal in an order O of K. Then the quotient
O/a is finite.
Proof. By Proposition 1.5.2, every nonzero ideal a of the maximal order OK has finite
index in OK . If b is a nonzero ideal in an order O of K, Proposition 1.9.1 tells us
that O ⊂ OK so that b ⊂ OK . Then [OK : b] = [OK : O][O : b] and the left side is
finite, so [O : b] must also be finite.
This allows us to define
Definition. For an order O, the norm of an O-ideal a is N(a) = [O : a].
For any nonzero ideal a ⊂ O, O ⊆ {β ∈ K : βa ⊂ a}, but equality may not always
hold. The ideals for which equality does hold have a special name.
Definition. An ideal a of an order O is a proper ideal if O = {β ∈ K : βa ⊂ a}.
Notice that principal ideals are always proper. Also, every ideal of the maximal
order OK is proper. From this definition we proceed with our construction of a class
group for O by defining an analog of fractional ideals.
Definition. For an order O, a fractional O-ideal is a subset of K which is finitely
generated as an O-module. We say a fractional O-ideal b is proper if O = {β ∈ K :
βb ⊂ b}.
Proposition 1.9.4. Every fractional O-ideal is of the form αa for some nonzero
α ∈ K and ideal a ⊂ O.
Proof. This is identical to the property for fractional ideals of a Dedekind domain;
see Section 1.2.
Lemma 1.9.5. Let K = Q(α) be a quadratic field and suppose ax2 + bx + c is the
minimal polynomial for α – we may assume (a, b, c) = 1. Then [1, α] is a proper
fractional ideal of the order [1, aα] in K.
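Proof. First, $[1, a\alpha] = \mathbb{Z} + a\alpha\mathbb{Z}$ is an order, since $a\alpha$ is a root of $x^2 + bx + ac$ and hence an algebraic integer. Now suppose $\beta \in K$ such that $\beta[1, \alpha] \subset [1, \alpha]$. Equivalently,
\[ \beta \cdot 1 \in [1, \alpha] \quad \text{and} \quad \beta \cdot \alpha \in [1, \alpha]. \]
The first of these gives us $\beta = j + k\alpha$ for $j, k \in \mathbb{Z}$, so we can write the second as
\[ \beta \cdot \alpha = (j + k\alpha)\alpha = j\alpha + k\alpha^2 = j\alpha + \frac{k}{a}(-b\alpha - c) = -\frac{ck}{a} + \left(-\frac{bk}{a} + j\right)\alpha. \]
By hypothesis $(a, b, c) = 1$ so the above shows $\beta \cdot \alpha \in [1, \alpha]$ if and only if $a \mid k$. Thus
\[ \{\beta \in K : \beta[1, \alpha] \subset [1, \alpha]\} = [1, a\alpha], \]
proving $[1, \alpha]$ is a proper fractional ideal of $[1, a\alpha]$.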
For orders in a quadratic field, we have the following characterization of their
fractional ideals.
Proposition 1.9.6. A fractional O-ideal a is proper if and only if a is invertible.
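Proof. ($\Leftarrow$) If $\mathfrak{a}$ is invertible, there exists some fractional $\mathcal{O}$-ideal $\mathfrak{b}$ such that $\mathfrak{a}\mathfrak{b} = \mathcal{O}$. Suppose $\beta \in K$ such that $\beta\mathfrak{a} \subset \mathfrak{a}$. Then $\beta\mathcal{O} = \beta(\mathfrak{a}\mathfrak{b}) = (\beta\mathfrak{a})\mathfrak{b} \subset \mathfrak{a}\mathfrak{b} = \mathcal{O}$. This implies $\beta \in \mathcal{O}$ so $\mathfrak{a}$ is a proper fractional $\mathcal{O}$-ideal.
($\Rightarrow$) Suppose $\mathfrak{a}$ is a proper fractional $\mathcal{O}$-ideal. Since $K$ is quadratic, $\mathfrak{a}$ is a free $\mathbb{Z}$-module of rank 2, so $\mathfrak{a} = [\beta, \gamma]$ for some $\beta, \gamma \in K$. Let $\alpha = \gamma/\beta$; then $\mathfrak{a} = \beta[1, \alpha]$ and Lemma 1.9.5 implies that $\mathcal{O} = [1, a\alpha]$ where $ax^2 + bx + c$, with $(a, b, c) = 1$, is the minimal polynomial of $\alpha$ over $\mathbb{Q}$. Let $z \mapsto z'$ be the nontrivial automorphism in $\mathrm{Gal}(K/\mathbb{Q})$. Since $\alpha'$ is also a root of $ax^2 + bx + c$, Lemma 1.9.5 also shows that $\mathfrak{a}' = \beta'[1, \alpha']$ is a fractional $\mathcal{O}$-ideal. We will show that $a\,\mathfrak{a}\mathfrak{a}' = N(\beta)\mathcal{O}$. Note that
\[ a\,\mathfrak{a}\mathfrak{a}' = a\beta\beta'[1, \alpha][1, \alpha'] = N(\beta)[a, a\alpha, a\alpha', a\alpha\alpha']. \]
Also observe that $\alpha + \alpha' = -\frac{b}{a}$ and $\alpha\alpha' = \frac{c}{a}$, so
\[ a\,\mathfrak{a}\mathfrak{a}' = N(\beta)[a, a\alpha, -b, c] = N(\beta)[1, a\alpha] = N(\beta)\mathcal{O} \]
since $(a, b, c) = 1$. This proves the claim, and it follows that $\mathfrak{a}$ is invertible.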
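Example 1.9.7. $\mathcal{O} = \mathbb{Z}[\sqrt{-3}]$ is an order of conductor 2 in $K = \mathbb{Q}(\sqrt{-3})$. Consider the ideal $[2, 1 + \sqrt{-3}]$ in $\mathcal{O}$. It is easy to see that
\[ \mathcal{O} \subsetneq \{\beta \in K : \beta[2, 1 + \sqrt{-3}] \subset [2, 1 + \sqrt{-3}]\} = \mathcal{O}_K. \]
Further, $2$, $1 + \sqrt{-3}$ and $1 - \sqrt{-3}$ are all irreducible in $\mathcal{O}$, but $4 = 2 \cdot 2 = (1 + \sqrt{-3})(1 - \sqrt{-3})$ showing that unique factorization fails in $\mathcal{O}$.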
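The irreducibility claims in this example reduce to a norm computation: each of $2$ and $1 \pm \sqrt{-3}$ has norm 4, so a proper factorization would need a factor of norm 2, and $a^2 + 3b^2 = 2$ has no integer solutions. The sketch below checks this, representing $a + b\sqrt{-3}$ as the pair `(a, b)`; the helper names are hypothetical.

```python
import math

def norm(a, b):
    """Norm of a + b*sqrt(-3)."""
    return a * a + 3 * b * b

def elements_of_norm(m):
    """All (a, b) with a^2 + 3 b^2 = m."""
    bound = math.isqrt(m)
    return [(a, b) for a in range(-bound, bound + 1)
            for b in range(-bound, bound + 1) if norm(a, b) == m]

def multiply(u, v):
    """Product of two elements of Z[sqrt(-3)] given as pairs."""
    (a, b), (c, d) = u, v
    return (a * c - 3 * b * d, a * d + b * c)

print(elements_of_norm(1))            # units: [(-1, 0), (1, 0)]
print(elements_of_norm(2))            # [] so elements of norm 4 are irreducible
print(multiply((2, 0), (2, 0)), multiply((1, 1), (1, -1)))   # (4, 0) (4, 0)
```

Since the only units are $\pm 1$, the element 2 is not an associate of $1 \pm \sqrt{-3}$, so the two factorizations of 4 are genuinely different.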
In the next theorem we construct a class group C(O) for an order in a quadratic
number field. As with the class group in Section 1.7, we take a quotient of a fractional
ideal group by some principal fractional ideals, but in this context we must restrict
our consideration to proper fractional ideals in O.
Theorem 1.9.8. Given an order O in a quadratic number field, the set I(O) of
proper fractional O-ideals forms a group under ideal multiplication. Moreover, the
set P (O) of principal O-ideals is a subgroup of I(O) and hence the ideal class group
C(O) = I(O)/P (O) is defined.
Proof. Let a and b be proper fractional ideals of the order O. By Proposition 1.9.6,
it is equivalent to consider invertible ideals. First note that O is clearly the identity
in I(O). Since a is invertible, there is some fractional O-ideal which we will denote
a−1, such that aa−1 = O. This shows that a−1 is also invertible and hence proper, so
I(O) has inverses.
Now consider the product (ab)c, where we set c = b−1a−1. Then
(ab)c = abb−1a−1 = aOa−1 = aa−1 = O
so we see that ab is invertible and hence proper. This proves that I(O) is a group.
Clearly P (O) is a subgroup of I(O) since every principal ideal is proper, and the
product of principal ideals is again principal. C(O) = I(O)/P (O) is a quotient of
abelian groups, so it is a group. This completes the proof of the theorem.
In order to make our work on orders in quadratic fields more compatible with
the rest of class field theory, it will be advantageous to translate O-ideals into the
language of OK-ideals.
Definition. Given an order O of conductor f , we say that a nonzero O-ideal a is
prime to f if a + fO = O.
Lemma 1.9.9. Let O be an order of conductor f .
(1) An O-ideal a is prime to f ⇐⇒ N(a) is relatively prime to f .
(2) Every O-ideal that is prime to f is proper.
Proof. (1) Define the map ϕf : O/a→ O/a to be multiplication by f . Note that
a + fO = O ⇐⇒ ϕf is surjective
⇐⇒ ϕf is an isomorphism
⇐⇒ f and |O/a| are relatively prime
where the last equivalence comes from the fundamental theorem of finite abelian
groups. Then by definition of numerical norm, |O/a| = N(a) so (1) is proved.
(2) Suppose a is prime to the conductor. Let β ∈ K and suppose βa ⊂ a. Then
βO = β(a + fO) = βa + βfO ⊂ a + fOK .
But fOK ⊂ O so βO ⊂ O which proves β ∈ O. Hence a is proper.
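The last equivalence in part (1) rests on the fact that multiplication by f on a finite abelian group of order n is a bijection exactly when gcd(f, n) = 1. As a small illustration, the Python sketch below tests this for the cyclic groups Z/nZ. Using a cyclic group is our simplifying assumption, since O/a need not be cyclic, and the function name is ours.

```python
from math import gcd

def multiplication_is_bijective(f, n):
    """Is x -> f*x a bijection of Z/nZ? Checked by counting distinct images."""
    return len({(f * x) % n for x in range(n)}) == n

# Compare with the gcd criterion on a small range of f and n.
agreement = all(
    multiplication_is_bijective(f, n) == (gcd(f, n) == 1)
    for n in range(1, 40)
    for f in range(1, 40)
)
```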
Note that since norm is multiplicative, (1) can be used to show that the set of
O-ideals prime to the conductor forms a subgroup I(O, f) ≤ I(O). Moreover, the set
P (O, f) = {αO | α ∈ O, (N(α), f) = 1}
is a subgroup of I(O, f). The next proposition describes the class group C(O) in
terms of O-ideals prime to the conductor.
Proposition 1.9.10. I(O, f)/P (O, f) ∼= I(O)/P (O) = C(O).
Proof. A result in Chapter 3 will imply that every ideal class in C(O) contains a
proper O-ideal whose norm is prime to a fixed M ∈ Z. Thus the map I(O, f)→ C(O)
is surjective with kernel I(O, f) ∩ P (O), so it suffices to show P (O, f) = I(O, f) ∩
P (O).
On one hand, P (O, f) ⊂ I(O, f) ∩ P (O) is clear from the definitions of these
subgroups. On the other hand, every element of I(O, f) ∩ P (O) is a fractional ideal
of the form αO = ab⁻¹, where α ∈ K and a, b are O-ideals prime to f. Let m = N(b).
Then mO = bb̄ ∈ P(O, f), where b̄ denotes the conjugate ideal of b, and mb⁻¹ = b̄, which implies
mαO = mab⁻¹ = a(mb⁻¹) = ab̄ ⊂ O.
So mαO ∈ P(O, f). It follows that αO = (mαO)(mO)⁻¹ ∈ P(O, f) and hence the
kernel is equal to P(O, f).
Given any positive integer m, an OK-ideal a is prime to m provided that a +
mOK = OK . By Lemma 1.9.9, this is equivalent to (N(a),m) = 1. This implies
that for every ring of integers OK , inside the group of fractional OK-ideals we have a
subgroup IK(m) ≤ IK . In Section 2.3 we will generalize this construction using class
field theory, but for now we have
Theorem 1.9.11. Let O be the order of conductor f in a quadratic field K.
(1) If a is an OK-ideal prime to f , then a ∩ O is an O-ideal prime to f and
N(a ∩ O) = N(a), where the first norm is taken with respect to O and the
second with respect to OK.
(2) If b is an O-ideal prime to f , then bOK is an OK-ideal prime to f with the
same norm.
(3) IK(f) ∼= I(O, f).
Proof. (1) Let a be an OK-ideal prime to f. By the natural injection ν : O/(a ∩ O) ↪
OK/a, (N(a), f) = 1 implies (N(a ∩ O), f) = 1 as well. This shows a ∩ O is prime to
f . As in Lemma 1.9.9, the map ϕf is an automorphism of OK/a, but fOK ⊂ O so
the injection ν is also a surjection. Hence the norms are equal.
(2) and (3) Let b be an O-ideal prime to f . Then
bOK + fOK = (b + fO)OK = OOK = OK
which shows that bOK is an OK-ideal prime to f . In a moment we will show the
norms are equal, but first consider
bOK ∩ O = (bOK ∩ O)O
= (bOK ∩ O)(b + fO)
⊂ b + f(bOK ∩ O)
⊂ b + b(fOK).
Since fOK ⊂ O this proves bOK ∩ O ⊂ b. The other containment, b ⊂ bOK ∩ O, is
clear so we have bOK ∩ O = b.
On the other hand, suppose a is an OK-ideal prime to f . Then
a = aO = a(a ∩ O + fO) ⊂ (a ∩ O)OK + fa,
but fa ⊂ fOK ⊂ O so fa ⊂ a ∩ O ⊂ (a ∩ O)OK and it follows that a ⊂ (a ∩ O)OK .
Again the other inclusion is obvious, so we have (a∩O)OK = a. These two identities
for O- and OK-ideals, along with (1), prove the equality of norms in (2). Furthermore
we have established a bijection
IK(f) ←→ I(O, f)
a ↦ a ∩ O
bOK ↤ b.
To show this is an isomorphism, we must simply check that it is multiplicative:
(aa′)OK = (aOK)(a′OK)
and we have proven the theorem.
Corollary 1.9.12. Every O-ideal prime to the conductor has a unique decomposition
as a product of prime O-ideals which are prime to the conductor.
Proof. Apply unique factorization of ideals in OK and Theorem 1.9.11.
Finally we describe C(O) in terms of the maximal order.
Theorem 1.9.13. Let O be the order of conductor f in an imaginary quadratic field
K and define the subgroup PK,Z(f) of IK(f) by
PK,Z(f) = {αOK | α ∈ OK and α ≡ a mod fOK for some a ∈ Z, (a, f) = 1}.
Then C(O) ∼= IK(f)/PK,Z(f).
Proof. We have proven that C(O) ∼= I(O, f)/P (O, f). In the proof of Theorem 1.9.11
we saw that I(O, f) ∼= IK(f), so it suffices to show that the image of P (O, f) under
this isomorphism is PK,Z(f). To do so, we will prove that for α ∈ OK ,
α ≡ a mod fOK , a ∈ Z, (a, f) = 1 ⇐⇒ α ∈ O, (N(α), f) = 1.
( =⇒ ) Assume α ≡ a mod fOK where a ∈ Z is relatively prime to f. By
definition of the numerical norm in a quadratic field, N(α) ≡ a² (mod f), which
implies (N(α), f) = (a², f) = 1. Since fOK ⊂ O we see that α ∈ O.
( ⇐= ) Conversely, suppose α ∈ O = [1, fwK] with (N(α), f) = 1. We may write
α = a + bfwK for a, b ∈ Z, so α ≡ a mod fOK. Since (N(α), f) = 1, N(α) ≡ a²
(mod f) again implies (a, f) = 1. This proves the stated equivalence.
Now by definition P(O, f) is generated by ideals αO, where α ∈ O and (N(α), f) = 1.
Thus we see that the image of P(O, f) under the isomorphism I(O, f) → IK(f)
is generated by the corresponding ideals αOK . By the equivalence proven above, this
proves the image is precisely PK,Z(f).
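As a sanity check of the equivalence used in this proof, consider the special case K = Q(i), chosen by us for illustration, where OK = Z[i], wK = i and the order of conductor f is Z[fi]. For α = x + yi, the condition α ≡ a mod fOK for some integer a amounts to f | y with a ≡ x mod f. The Python sketch below (function names ours) compares the two sides of the equivalence on a small box of values.

```python
from math import gcd

def congruent_to_integer_prime_to_f(x, y, f):
    """alpha = x + y*i is congruent mod f*Z[i] to some integer a with gcd(a, f) = 1."""
    # alpha - a lies in f*Z[i] exactly when f | y and a = x (mod f).
    return y % f == 0 and gcd(x, f) == 1

def in_order_with_norm_prime_to_f(x, y, f):
    """alpha lies in the order Z[f*i] and N(alpha) = x^2 + y^2 is prime to f."""
    return y % f == 0 and gcd(x * x + y * y, f) == 1

# Compare the two conditions for nonzero alpha in a box, for several conductors.
equivalent = all(
    congruent_to_integer_prime_to_f(x, y, f) == in_order_with_norm_prime_to_f(x, y, f)
    for f in range(2, 13)
    for x in range(-30, 31)
    for y in range(-30, 31)
    if (x, y) != (0, 0)
)
```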
We are by no means finished working with orders. In Section 2.11 we will re-
alize PK,Z(f) as a congruence subgroup for the conductor, and show that there is
a corresponding field extension L/K with the special property that Gal(L/K) ∼=
IK(f)/PK,Z(f). This will allow us to provide a full solution to the question of when
a prime is of the form p = x² + ny², which we have only answered partially as of
Section 1.8.
1.10 Units in a Number Field
In this section we further describe the structure of OK by characterizing the group of
units UK , which is the group of all elements in OK which have a multiplicative inverse.
In particular we will prove Dirichlet’s unit theorem, which is the main structure
theorem for the group of units in a number field. At the end of the section we will
discuss a nice application of Dirichlet’s unit theorem to Pell’s equation x² − dy² = 1.
By the theory of finitely generated abelian groups, $\mathcal{O}_K^\times \cong \mathbb{Z}^t \times T$ where t is the
rank of $\mathcal{O}_K^\times$ and T is the torsion subgroup of $\mathcal{O}_K^\times$. Henceforth we will write $\mathcal{O}_K^\times = U_K$
and T = µ(K), which is equivalently the set of roots of unity which lie in OK.
Definition. A set of units u1, . . . , ut is called a fundamental system of units if
it forms a basis for UK modulo torsion, i.e. if every unit u ∈ UK can be written
$u = \zeta u_1^{m_1} \cdots u_t^{m_t}$ for ζ ∈ µ(K) and mi ∈ Z.
Proposition 1.10.1. The torsion subgroup µ(K) is finite and cyclic.
Proof. Recall that if ζ is a primitive mth root of unity then Q(ζ) is a Galois extension
of Q with Gal(Q(ζ)/Q) ∼= (Z/mZ)×. Moreover, [Q(ζ) : Q] = |Gal(Q(ζ)/Q)| = φ(m)
where φ is Euler’s function, defined to be the number of positive integers at most m
that are relatively prime to m – we will provide a proof of this well-known fact using
the Frobenius density theorem in Section 2.5. There is a product formula for φ(m):
if $m = p_1^{r_1} \cdots p_s^{r_s}$ is the prime factorization of m, then $\varphi(m) = \prod_i p_i^{r_i - 1}(p_i - 1)$. Now
note that for any number field K,
ζ ∈ µ(K) =⇒ ζ ∈ K =⇒ Q(ζ) ⊂ K =⇒ φ(m) | [K : Q].
Since [K : Q] is finite and φ(m) → ∞ as m → ∞, only finitely many orders m can occur,
and each contributes finitely many roots of unity, so µ(K) is finite. Finally, every finite
subgroup of the multiplicative group of a field is cyclic, so µ(K) is cyclic.
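The divisibility φ(m) | [K : Q] gives an explicit finite list of candidate orders m for roots of unity in a field of given degree. The Python sketch below enumerates that list. It relies on the standard bound φ(m) ≥ √(m/2), which we assume here in order to cap the search at m ≤ 2n²; the function names are ours.

```python
from math import gcd

def phi(m):
    """Euler's function by direct count (fine for small m)."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)

def candidate_orders(degree):
    """All m with phi(m) dividing the given degree [K : Q].

    Assumes phi(m) >= sqrt(m / 2), so phi(m) <= degree forces m <= 2 * degree**2.
    """
    return [m for m in range(1, 2 * degree * degree + 1) if degree % phi(m) == 0]

quadratic = candidate_orders(2)
quartic = candidate_orders(4)
```

For degree 2 the candidates are m = 1, 2, 3, 4, 6, in line with Example 1.10.3. The list is only a necessary condition: a given field of that degree need not contain a primitive mth root of unity for each listed m.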
Using the field norm N = NK/Q, we have the following characterization of the
units of a number field K.
Proposition 1.10.2. An element α ∈ K is a unit if and only if α ∈ OK and
NK/Q(α) = ±1.
Proof. ( =⇒ ) If α is a unit, then there exists a β ∈ OK such that αβ = 1. Then
N(α), N(β) ∈ Z and since norm is multiplicative, 1 = N(αβ) = N(α)N(β). There-
fore N(α) = ±1.
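( ⇐= ) Fix an embedding σ0 : K ↪ C and identify K with its image σ0(K). By properties of the field norm,
$$N(\alpha) = \prod_{\sigma : K \hookrightarrow \mathbb{C}} \sigma(\alpha) = \alpha \prod_{\sigma \neq \sigma_0} \sigma(\alpha).$$
Let $\beta = \prod_{\sigma \neq \sigma_0} \sigma(\alpha)$. Each σ(α) is an algebraic integer and β = N(α)/α lies in K, so α ∈ OK implies β ∈ OK as well. If N(α) = ±1, then
β = ±α⁻¹ so α has an inverse ±β in OK and is therefore a unit.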
For every number field K with a real embedding, µ(K) = {±1}. This turns out to be the group of
roots of unity for many nonreal fields as well.
Example 1.10.3. Let K = Q(√d), a quadratic field. Recall that
$$\mathcal{O}_K = \begin{cases} \mathbb{Z}[\sqrt{d}] & d \not\equiv 1 \pmod 4 \\ \mathbb{Z}\left[\frac{1+\sqrt{d}}{2}\right] & d \equiv 1 \pmod 4. \end{cases}$$
In each case, the units in OK are the solutions to one of
$$x^2 - dy^2 = \pm 1$$
$$(2x + y)^2 - dy^2 = \pm 4.$$
When d < 0 (i.e. K is imaginary quadratic), these equations only have finitely many
solutions, so UK = µ(K). In fact, ζm ∈ K implies φ(m) ≤ 2, which only happens
for m dividing 4 or 6. Thus µ(K) = {±1} except for the following cases:
Q(i) : µ(K) = {±1, ±i}
Q(√−3) : µ(K) = {±1, ±ρ, ±ρ²} where ρ = (1 + √−3)/2.
For d > 0, there are infinitely many solutions. The equation x² − dy² = ±1 is known
as Pell’s equation; we will describe this case further after proving the unit theorem.
Proposition 1.10.4. For any m,M ∈ Z, the set of all algebraic integers α such that
the degree of α is ≤ m and |α′| ≤M for all conjugates α′ of α is finite.
Proof. If the degree of α is bounded by m then α is equivalently the root of a monic
irreducible polynomial with degree at most m. On the other hand, |α′| ≤ M for all
conjugates α′ of α implies the coefficients of the polynomial are all bounded. Since
the degree and coefficients of such a polynomial are all integers, and in this case
they are all bounded, there are only a finite number of polynomials satisfying the
requirements, each one having only finitely many roots. Hence the set of these α
described above is finite.
Corollary 1.10.5. An algebraic integer α is a root of unity ⇐⇒ |α′| = 1 for all
conjugates α′ of α.
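Proof. If |α′| = 1 for all conjugates α′ of α, apply the proposition to the set {1, α, α², . . .} to see that it is finite, and hence
αⁿ = 1 for some n ∈ N. The converse is clear, since every conjugate of a root of unity is again a root of unity.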
As in Section 1.7, consider the map
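$$\sigma : K \longrightarrow \mathbb{R}^r \times \mathbb{C}^s, \qquad \alpha \longmapsto (\sigma_1(\alpha), \ldots, \sigma_r(\alpha), \sigma_{r+1}(\alpha), \ldots, \sigma_{r+s}(\alpha))$$
where $\{\sigma_1, \ldots, \sigma_r, \sigma_{r+1}, \bar\sigma_{r+1}, \ldots, \sigma_{r+s}, \bar\sigma_{r+s}\}$ is the set of all embeddings of K into C. This
map takes sums to sums, but to describe a multiplicative group such as UK we want to
map products to sums instead. The solution is to construct a map using logarithms:
$$L : U_K \longrightarrow \mathbb{R}^{r+s}, \qquad \alpha \longmapsto (\log|\sigma_1(\alpha)|, \ldots, \log|\sigma_r(\alpha)|, \log|\sigma_{r+1}(\alpha)|, \ldots, \log|\sigma_{r+s}(\alpha)|).$$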
If u is a unit in OK , then by Proposition 1.10.2, N(u) = ±1 and so
$$|\sigma_1(u)| \cdots |\sigma_r(u)| \, |\sigma_{r+1}(u)|^2 \cdots |\sigma_{r+s}(u)|^2 = 1.$$
Taking the log of both sides shows that the image L(u) lands in the hyperplane
$$H : x_1 + \ldots + x_r + 2x_{r+1} + \ldots + 2x_{r+s} = 0.$$
H is cut out by a single nontrivial linear equation, so we see that H is isomorphic
to $\mathbb{R}^{r+s-1}$. This suggests the main result in this section, Dirichlet’s unit theorem.
Dirichlet’s Unit Theorem. Let K be a number field with ring of integers OK.
Then $U_K \cong \mathbb{Z}^{r+s-1} \times \mu(K)$ where r is the number of real embeddings of K and s is
the number of pairs of complex embeddings.
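For a concrete instance, take K = Q(√2), so r = 2, s = 0 and the theorem predicts rank 1. The Python sketch below is a floating-point illustration with helper names of our choosing. It applies the logarithmic map L to powers of the unit 1 + √2 and checks that each image lies on the hyperplane x1 + x2 = 0 and is the corresponding multiple of L(1 + √2).

```python
import math

SQRT2 = math.sqrt(2.0)

def log_embedding(a, b):
    """L(a + b*sqrt(2)) using the two real embeddings sqrt(2) -> +sqrt(2), -sqrt(2)."""
    return (math.log(abs(a + b * SQRT2)), math.log(abs(a - b * SQRT2)))

def power(a, b, n):
    """(a + b*sqrt(2))^n for n >= 0, returned as an exact integer pair."""
    x, y = 1, 0
    for _ in range(n):
        x, y = x * a + 2 * y * b, x * b + y * a
    return x, y

# Images of (1 + sqrt(2))^n for n = 1..5 under L.
images = [log_embedding(*power(1, 1, n)) for n in range(1, 6)]
base = images[0][0]  # log(1 + sqrt(2)), the first coordinate of L(1 + sqrt(2))
```

The value `base` is log(1 + √2) ≈ 0.8814. That 1 + √2 actually generates the units modulo ±1 is a known fact about Q(√2) that this sketch does not prove.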
We will prove this theorem by establishing that L(UK) is a lattice in $\mathbb{R}^{r+s-1}$ with
full rank. First we have
Lemma 1.10.6. The image of L : UK → H is a lattice in H and the kernel is µ(K).
Proof. Let C be a bounded subset of H, say C = {(xi) ∈ H : |xi| ≤ M for all i}.
If L(u) ∈ C for a unit u ∈ UK then $|\sigma_j(u)| \le e^M$ for each embedding σj. By
Proposition 1.10.4, this implies there are only a finite number of such u, and thus
L(UK) ∩ C is finite. This implies L(UK) is a lattice in H. Further, u ∈ ker L if and
only if |σi(u)| = 1 for all i and thus Corollary 1.10.5 says that ker L = µ(K).
We will also need
Lemma 1.10.7. Suppose A = (aij) is an m×m matrix such that
• aij < 0 for all i ≠ j
• ai1 + ai2 + . . . + aim > 0 for each i.
Then A is invertible.
Proof. If A is not invertible, then the system of equations
$$\sum_{j=1}^{m} a_{ij} x_j = 0, \qquad i = 1, \ldots, m$$
has a nontrivial solution x = (x1, . . . , xm). Suppose xk is a component such that
|xk| = max{|xj|}. We may scale x so that xk = 1. Then |xj| ≤ 1 for each j ≠ k, and since akj < 0 for j ≠ k this
implies
$$0 = \sum_{j=1}^{m} a_{kj} x_j = a_{kk} + \sum_{j \neq k} a_{kj} x_j \ge a_{kk} + \sum_{j \neq k} a_{kj} > 0,$$
a contradiction. Thus A must be invertible.
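To see the lemma in action, the Python sketch below computes determinants exactly over the rationals for one matrix satisfying both hypotheses and for one matrix whose row sums are zero. Both matrices are examples chosen by us, and the helper `det` is a plain Gaussian elimination, not anything taken from the text.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result

# Negative off-diagonal entries, positive row sums (2, 3, 2): hypotheses hold.
A = [[5, -1, -2],
     [-3, 7, -1],
     [-1, -1, 4]]

# Negative off-diagonal entries but row sums equal to 0: hypotheses fail.
B = [[1, -1],
     [-1, 1]]

det_A = det(A)
det_B = det(B)
```

The second matrix shows that the positivity of the row sums cannot simply be dropped.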
We now proceed to the proof of the Unit Theorem.
Proof. We will show that L(UK) is a full lattice in H, which, since kerL is finite, will
imply that UK has rank r + s − 1. Again we will make use of the map $\sigma : K \hookrightarrow \mathbb{R}^r \times \mathbb{C}^s$.
Set $V = \mathbb{R}^r \times \mathbb{C}^s$. For each x ∈ V, define
$$N(x) = x_1 \cdots x_r \, x_{r+1} \bar{x}_{r+1} \cdots x_{r+s} \bar{x}_{r+s}.$$
Then $N(\sigma(\alpha)) = N_{K/\mathbb{Q}}(\alpha)$. Also note that $|N(x)| = |x_1| \cdots |x_r| \, |x_{r+1}|^2 \cdots |x_{r+s}|^2$.
Recall from Proposition 1.7.7 that σ(OK) is a full lattice in V and the volume of its
fundamental parallelepiped is $2^{-s}\sqrt{|d_K|}$. Equivalently, if {α1, . . . , αn} is a Z-basis for
OK then we showed that the absolute value of the determinant of the matrix whose
ith row is
$$\sigma(\alpha_i) = (\sigma_1(\alpha_i), \ldots, \operatorname{Re}(\sigma_{r+1}(\alpha_i)), \operatorname{Im}(\sigma_{r+1}(\alpha_i)), \ldots, \operatorname{Im}(\sigma_{r+s}(\alpha_i)))$$
is equal to $2^{-s}\sqrt{|d_K|}$. For the remainder of the proof, let x ∈ V with $\tfrac{1}{2} \le N(x) \le 1$.
Define an action of V on σ(OK) by x · σ(OK) = {x · σ(α) | α ∈ OK}, where u · v is the
multiplication operation of V as a ring. Then x · σ(OK) is again a lattice in V and
the volume of its fundamental parallelepiped is the absolute value of the determinant of the matrix with
ith row
$$(x_1 \cdot \sigma_1(\alpha_i), \ldots, \operatorname{Re}(x_{r+1} \cdot \sigma_{r+1}(\alpha_i)), \operatorname{Im}(x_{r+1} \cdot \sigma_{r+1}(\alpha_i)), \ldots)$$
which equals $2^{-s}\sqrt{|d_K|}\,|N(x)|$. Observe that as x ranges over the set of points with
$\tfrac{1}{2} \le N(x) \le 1$ these volumes remain bounded.
Let T be a compact subset of V such that T is symmetric about the origin and
convex, and large enough in volume so that by Minkowski’s Theorem T contains a
point x · σ(γ) for some nonzero γ ∈ OK . The points in T have bounded coordinates
(by compactness and the Heine-Borel Theorem) and thus bounded norms; say M is
a bound on their norms. Then since x · σ(γ) ∈ T, |N(x · σ(γ))| ≤ M and hence
$$|N_{K/\mathbb{Q}}(\gamma)| \le \frac{M}{N(x)} \le 2M.$$
Consider the set of ideals γOK where γ runs through the chosen algebraic integers
above for each x, i.e. those γ such that x · σ(γ) ∈ T . The norm of these ideals
is bounded by 2M, so there are only finitely many: {γ1OK, . . . , γtOK}. If γ is any
nonzero algebraic integer in OK such that x · σ(γ) ∈ T then γOK = γiOK for some
1 ≤ i ≤ t, and thus for some unit u we have γ = γiu. This shows that $x \cdot \sigma(u) \in \sigma(\gamma_i^{-1})T$.
Note that the set $T' = \bigcup_{i=1}^{t} \sigma(\gamma_i^{-1})T$ is bounded and does not depend on
x. We have therefore shown that for each x with $\tfrac{1}{2} \le N(x) \le 1$ there exists a unit u
for which the coordinates of x · σ(u) are bounded uniformly.
To prove L(UK) is a full lattice in H, we may assume r+s−1 ≥ 1, since otherwise
the proof is trivial. For each i between 1 and r + s, we may choose an x ∈ V with
$\tfrac{1}{2} \le N(x) \le 1$ such that all the coordinates xj for j ≠ i are large compared to all
y ∈ T′, and xi is small enough so that |N(x)| = 1. For each i = 1, . . . , r + s − 1, we
have shown that there exists a unit ui such that the coordinates of x · σ(ui) are all
bounded, and so for all j ≠ i,
|σj(ui)| < 1 =⇒ log |σj(ui)| < 0.
We claim that L(u1), . . . , L(ur+s−1) are linearly independent vectors in L(UK), which
we prove by showing that the matrix with ith row
$$(\log|\sigma_1(u_i)|, \ldots, \log|\sigma_{r+s-1}(u_i)|)$$
is invertible. The non-diagonal entries of the matrix are all negative, and L(ui) ∈ H, so, writing nj = 1 for j ≤ r and nj = 2 for j > r,
$$n_1 \log|\sigma_1(u_i)| + \ldots + n_{r+s} \log|\sigma_{r+s}(u_i)| = 0$$
$$\implies n_1 \log|\sigma_1(u_i)| + \ldots + n_{r+s-1} \log|\sigma_{r+s-1}(u_i)| = -n_{r+s} \log|\sigma_{r+s}(u_i)| > 0.$$
Scaling the jth column of our matrix by nj > 0 does not affect invertibility, and in the
scaled matrix the sum of the entries across each row is positive. Then Lemma 1.10.7
implies that our matrix is invertible. Thus rank L(UK) ≥ r + s − 1, but H has dimension
r + s − 1. It follows that L(UK) is a full lattice in H, completing the proof of the Unit
Theorem.
We now apply the Unit Theorem to Pell’s equation.
Example 1.10.8. The famous problem of Pell is to find all positive integer solutions
(x, y) to x² − dy² = 1, where d > 0 is squarefree. Let K = Q(√d). Note that for any
α = x + y√d ∈ K,
N(α) = (x + y√d)(x − y√d) = x² − dy².
Thus the integer solutions to Pell’s equation correspond to the units of norm 1 in Z[√d],
which form a finite-index subgroup of UK, the group
of units. By Dirichlet’s Unit Theorem, this subgroup is an infinite abelian group of
rank 1, a fact that is considerably harder to show with elementary number theory.
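The rank 1 statement says that, up to sign, every solution is a power of a single fundamental one. The Python sketch below is a brute-force illustration rather than an efficient method, and the function names and the search limit are ours. It finds the smallest positive solution and then produces further solutions by repeated multiplication in Z[√d].

```python
from math import isqrt

def fundamental_solution(d, y_limit=10**6):
    """Smallest positive (x, y) with x^2 - d*y^2 = 1, by brute force over y."""
    for y in range(1, y_limit):
        x_squared = d * y * y + 1
        x = isqrt(x_squared)
        if x * x == x_squared:
            return x, y
    raise ValueError("no solution found below y_limit")

def solutions(d, count):
    """The first `count` positive solutions, as powers of the fundamental one."""
    x1, y1 = fundamental_solution(d)
    x, y = x1, y1
    found = []
    for _ in range(count):
        found.append((x, y))
        # (x + y*sqrt(d)) * (x1 + y1*sqrt(d))
        x, y = x * x1 + d * y * y1, x * y1 + y * x1
    return found

pell_2 = solutions(2, 3)
pell_7 = fundamental_solution(7)
```

For d = 2 this yields (3, 2), (17, 12), (99, 70). For values of d whose fundamental solution is large, the brute-force search would need to be replaced by the continued fraction method.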
Definition. Let t = r+ s− 1 and let u1, . . . , ut be a fundamental system of units for
UK. The regulator of K is the determinant of the matrix whose ith row is
L(ui) = (log |σ1(ui)|, . . . , 2 log |σt(ui)|).
In other words, the regulator is the signed volume of the fundamental parallelepiped
for L(UK). We will encounter the regulator again in Section 2.4.
Chapter 2: Class Field Theory
In this chapter we develop the main concepts and theorems in class field theory,
including valuations on a field, the Artin map, ray class groups, Dirichlet L-series,
Artin reciprocity, the Conductor and Existence Theorems and two density theorems.
At the end of the chapter, we prove the main characterization of primes of the form
x² + ny² by applying class field theory to the ring class field of Z[√−n].
2.1 Valuations and Completions
Definition. A function | · | : K → R on a field K is called an absolute value, or
valuation, if it satisfies
(1) |x| ≥ 0 for all x ∈ K and |x| = 0 ⇐⇒ x = 0.
(2) |xy| = |x| |y| for all x, y ∈ K.
(3) |x+ y| ≤ |x|+ |y|. This is called the triangle inequality.
There is an additional axiom that defines an important type of valuation.
Definition. An absolute value on K is nonarchimedean if |x + y| ≤ max{|x|, |y|}
for all x, y ∈ K. Otherwise, | · | is said to be archimedean.
By (1) and (2), | · | is a multiplicative homomorphism from K× to the positive
reals. Moreover, since R>0 is torsion-free, | · | maps every root of unity in K× to 1. This implies
(a) | − 1| = 1.
(b) | − x| = |x| for all x ∈ K.
Examples.
1 For any number field K, let σ : K ↪→ C be any complex embedding. Then
|x| := |σ(x)| is a valuation, where the second set of absolute values denotes the
usual absolute value on C.
2 Let ord : K× → Z be a discrete (additive) valuation. Then for any real number
c > 1, the following gives us a nonarchimedean absolute value on K:
$$|x| = \begin{cases} c^{-\operatorname{ord}(x)} & \text{if } x \neq 0 \\ 0 & \text{if } x = 0. \end{cases}$$
3 The most important example of the valuation in 2 is the p-adic valuation for
any prime number p. It is a map | · |p : Q → R defined by $|a|_p = p^{-\operatorname{ord}_p(a)} = p^{-v}$
where $a = p^v m$ and p divides neither the numerator nor the denominator of m.
4 This can be generalized to extensions of Q in the following way. For any prime
ideal $\mathfrak{p}$ in the ring of integers of a number field K, we define the normalized
$\mathfrak{p}$-adic valuation
$$|\cdot|_{\mathfrak{p}} : K \longrightarrow \mathbb{R}, \qquad a \longmapsto |a|_{\mathfrak{p}} := N(\mathfrak{p})^{-\operatorname{ord}_{\mathfrak{p}}(\mathfrak{a})}$$
where $\mathfrak{a} = a\mathcal{O}_K$, $N(\mathfrak{p})$ is the ideal norm from Section 1.5 and $\operatorname{ord}_{\mathfrak{p}}(\mathfrak{a})$ is the
largest integer n such that $\mathfrak{p}^n$ divides $\mathfrak{a}$.
5 On any field, the trivial absolute value is |x| = 1 for all x ≠ 0. Note that the
only absolute value on a finite field F is the trivial absolute value since every
nonzero element of F is a root of unity.
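The p-adic absolute value on Q from these examples is easy to compute exactly. The Python sketch below is an illustration with function names of our choosing. It evaluates the p-adic order and absolute value of a rational number and checks multiplicativity and the nonarchimedean inequality on one pair of values.

```python
from fractions import Fraction

def ord_p(x, p):
    """Exponent v with x = p^v * m, where p divides neither part of m."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("ord_p(0) is +infinity by convention")
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def abs_p(x, p):
    """p-adic absolute value |x|_p = p^(-ord_p(x)), with |0|_p = 0."""
    x = Fraction(x)
    return Fraction(0) if x == 0 else Fraction(p) ** (-ord_p(x, p))

x, y = Fraction(18, 5), Fraction(7, 9)
values = (abs_p(x, 3), abs_p(y, 3), abs_p(x + y, 3))
```

Here |18/5|₃ = 1/9 and |7/9|₃ = 9, and the sum 197/45 has absolute value 9, which equals the larger of the two.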
Remark. The condition |x + y| ≤ max{|x|, |y|}, called the nonarchimedean condition,
is equivalent to $\left|\sum x_i\right| \le \max\{|x_i|\}$ for every finite sum.
We also have the following characterization.
Proposition 2.1.1. An absolute value | · | is nonarchimedean if and only if it is
bounded on the set {m · 1K | m ∈ Z}.
Proof. ( =⇒ ) If | · | is nonarchimedean, then for any m > 0,
|m · 1| = |1 + 1 + . . .+ 1| ≤ |1| = 1.
By property (a) following the definition of absolute values, | −m · 1| = |m · 1| so the
values are bounded for all m ∈ Z.
( ⇐= ) Conversely, suppose there is some N such that |m · 1| ≤ N for all m ∈ Z.
Then
$$|x+y|^n = \left|\sum_{r=0}^{n} \binom{n}{r} x^r y^{n-r}\right| \le \sum_{r=0}^{n} \left|\binom{n}{r}\right| |x|^r |y|^{n-r}$$
by the triangle inequality for sums. Clearly $|x|^r |y|^{n-r} \le \max\{|x|^n, |y|^n\} = \max\{|x|, |y|\}^n$
and $\binom{n}{r}$ is an integer, so we see that
$$|x+y|^n \le N(1+n)\max\{|x|,|y|\}^n \implies |x+y| \le N^{1/n}(1+n)^{1/n}\max\{|x|,|y|\}.$$
As n → ∞, $N^{1/n}(1+n)^{1/n}$ tends to 1, so we have |x + y| ≤ max{|x|, |y|}.
Hence | · | is nonarchimedean.
Corollary 2.1.2. If K is a field of characteristic p ≠ 0, then every absolute value on
K is nonarchimedean.
Proof. If char K ≠ 0 then the set {m · 1K | m ∈ Z} is finite and therefore bounded
under | · |. Apply Proposition 2.1.1.
In example 2, we saw that an additive map ord on K induces an absolute value
$|x| = c^{-\operatorname{ord}(x)}$, where c > 1 is any real number. Taking logs, we have
$$\log_c |x| = -\operatorname{ord}(x), \quad \text{or} \quad \operatorname{ord}(x) = -\log_c |x|$$
which suggests the following connection between additive and multiplicative valua-
tions.
Definition. An absolute value | · | on K is said to be discrete if |K×|, the image of
K× under | · |, is a discrete subgroup of R>0.
Proposition 2.1.3. For any nonarchimedean absolute value | · | on K, define v :
K× → R by v(x) = − log |x| (with the convention v(0) = ∞). Then
(i) v(xy) = v(x) + v(y),
(ii) v(x + y) ≥ min{v(x), v(y)}.
Furthermore, if the image v(K×) is discrete in R, then v is a multiple of some discrete
valuation ord : K× ↠ Z ⊂ R.
Proof. Obvious. See [19] for the details.
We next define several important objects which arise from a nonarchimedean
valuation on a field.
Definition. For any nonarchimedean valuation | · | on K, define
A = {a ∈ K : |a| ≤ 1}
U = {a ∈ K : |a| = 1}
m = {a ∈ K : |a| < 1}.
Proposition 2.1.4. Let | · | be a nonarchimedean valuation on K. Then A is a
local subring of K with U as its group of units and m as its unique maximal ideal.
Furthermore, | · | is discrete if and only if m is principal, in which case A is a DVR.
Proof. It follows easily from the properties of a nonarchimedean valuation that A is
a ring. By property (a) following the definition of valuations, it is easy to see that U
is the group of units of A. By the nonarchimedean condition, m is an ideal of A. To
see that it is the unique maximal ideal, let y ∈ A ∖ m. Then |y| = 1 by definition,
and |y⁻¹| = 1 so y⁻¹ ∈ A as well. Thus every element in A outside of m is a unit,
which implies that m is the unique maximal ideal of A.
Turning to the last statement, if | · | is discrete and nontrivial then the value group |K×| is
infinite cyclic; fix an isomorphism v : |K×| → Z. Then v(|x|⁻¹) = −v|x| implies that
one of v(|A|), −v(|A|) contains all the positive integers. We may assume 1 ∈ v(|A|).
Let π ∈ A such that v|π| = 1. For any nonzero x ∈ A, v|x| = n ≥ 0 and so v|xπ⁻ⁿ| = 0.
However, v : |K×| → Z is an isomorphism so |xπ⁻ⁿ| = 1. Thus u = xπ⁻ⁿ is a unit, and
x = uπⁿ. We have proven that every nonzero x ∈ A can be written as a unit times
a power of π, so the only nonzero ideals in A are powers of πA. Hence m must be principal,
which of course implies A is a DVR. The other direction follows from showing that
v : |K×| → Z is again an isomorphism and using the fact that Z is a discrete subgroup
of R.
Definition. The ring A defined above is called the valuation ring for | · |, and m is
called its valuation ideal.
The last statement in Proposition 2.1.4 says that | · | is discrete exactly when A
is a discrete valuation ring, which explains where this term comes from.
Every absolute value defines a metric on K given by d(x, y) = |x − y| and hence
induces a metric topology on K. For each x ∈ K, the sets
B(x, ε) = {y ∈ K : |x − y| < ε}
form a basis of neighborhoods around x.
Example 2.1.5. The p-adics are a perfect case study of absolute values and their
related groups, rings and topologies. In the p-adic topology induced on Q by | · |p,
two rationals a and b are considered “close” if their difference is divisible by a high
power of p. For example, the sequence $x_n = p^n$, that is (1, p, p², . . .), converges to 0 quite
rapidly in the p-adic topology. Even the series $\sum_{n=1}^{\infty} p^n$ converges in this topology.
Gouvea [12] provides a great introductory-level examination of the p-adics.
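The convergence of the series p + p² + p³ + . . . can be made explicit: the geometric series formula suggests the limit p/(1 − p), and since 1 − p is a p-adic unit, the partial sum S_N is within p^(−(N+1)) of this limit exactly when (1 − p)·S_N ≡ p modulo p^(N+1). The Python sketch below (function names ours) checks this congruence with integer arithmetic only.

```python
def partial_sum(p, n_terms):
    """S_N = p + p^2 + ... + p^N as an ordinary integer."""
    return sum(p ** n for n in range(1, n_terms + 1))

def agrees_with_limit(p, n_terms):
    """Check (1 - p) * S_N = p modulo p^(N+1)."""
    modulus = p ** (n_terms + 1)
    return ((1 - p) * partial_sum(p, n_terms) - p) % modulus == 0

checks = [agrees_with_limit(p, n) for p in (2, 3, 5, 7) for n in range(1, 12)]
```

For p = 2 the limit p/(1 − p) is −2, so in the 2-adic topology 2 + 4 + 8 + . . . = −2.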
Definition. Two absolute values | · |1 and | · |2 on a field K are said to be equivalent
if they induce the same topology on K.
Proposition 2.1.6. For two absolute values | · |1, | · |2 : K → R, the following are
equivalent:
(1) | · |1 and | · |2 are equivalent.
(2) |x|1 < 1 ⇐⇒ |x|2 < 1 for all x ∈ K.
(3) There exists some c > 0 such that $|x|_1 = |x|_2^c$ for all x.
Proof. This is a standard proof in the study of absolute values. See [19] or any
topology book for details.
Nonarchimedean absolute values cause the metric topology on K to have some
strange properties. For example,
• If y ∈ B(x, ε) then B(y, ε) = B(x, ε).
• The open ball B(x, ε) is also closed.
• Under this topology, K is totally disconnected.
The next theorem characterizes all possible (equivalence classes of) absolute values
on Q. We will write | · |∞ for the usual absolute value on R.
Theorem 2.1.7 (Ostrowski). Let | · | be a nontrivial absolute value on Q.
(1) If | · | is archimedean, it is equivalent to | · |∞.
(2) If | · | is nonarchimedean, it is equivalent to | · |p for exactly one prime p.
Proof. Let m, n > 1 be integers. Then we can write $m = a_0 + a_1 n + \ldots + a_r n^r$ where ai ∈ Z,
0 ≤ ai < n and $n^r \le m$. Let N = max{1, |n|}. By the triangle inequality,
$$|m| \le \sum_{i=0}^{r} |a_i|\,|n|^i \le \sum_{i=0}^{r} |a_i| N^r.$$
Note that $r \le \frac{\log m}{\log n}$ and $|a_i| = |1 + \ldots + 1| \le a_i |1| = a_i \le n$. So we have
$$|m| \le (1+r)\,n N^r \le \left(1 + \frac{\log m}{\log n}\right) n N^{\frac{\log m}{\log n}}.$$
If we replace m with $m^t$ for a positive integer t and take tth roots, this gives us
$$|m| \le \left(1 + \frac{t \log m}{\log n}\right)^{1/t} n^{1/t} N^{\frac{\log m}{\log n}}.$$
Then as t → ∞, this becomes $|m| \le N^{\frac{\log m}{\log n}}$.
For the first case, suppose that |n| > 1 for all integers n > 1. Here N = |n|, so
we see that $|m|^{1/\log m} \le |n|^{1/\log n}$. By symmetry, these must be equal so there exists
a number c > 1 such that $c = |m|^{1/\log m}$ for all integers m > 1. Hence
$$|m| = c^{\log m} = e^{\log c \log m} = m^{\log c}$$
for all m > 1. This shows that $|m| = |m|_\infty^{\log c}$ for all m ∈ Z, m > 1. Since | · | and $|\cdot|_\infty^{\log c}$
agree on the positive integers, which together with −1 generate Q×, multiplicativity shows that
they agree on all of Q, so | · | and | · |∞ are equivalent
absolute values.
Now suppose there exists some integer n > 1 such that |n| ≤ 1. In this case N = 1
and the inequality proved in the first part of the proof implies |m| ≤ 1 for all m ∈ Z.
By Proposition 2.1.1, | · | is nonarchimedean. Let A and m be its valuation ring and
ideal, respectively. By definition, Z ⊂ A and so m ∩ Z is a prime ideal in Z, which is
nonzero since | · | is nontrivial. Thus m ∩ Z = (p) for some prime p. This implies that |m| = 1 if p ∤ m, so for
all rationals q such that p does not divide its numerator or denominator, $|qp^r| = |p|^r$.
Let c be a number such that $|p| = p^{-c}$. Then $|x| = |x|_p^c$ for all x ∈ Q and we have
shown that | · | is equivalent to the p-adic valuation | · |p.
Definition. For a number field K, an equivalence class of valuations on K is called
a place or prime of K.
By Theorem 2.1.7, the places on Q are in one-to-one correspondence with the
prime integers – it is for this reason that places are also called primes – with the
exception of ∞. To rectify this, we refer to ∞ as the infinite prime, which thus
corresponds to the equivalence class of the archimedean absolute values on Q. This
motivates our use of the term infinite prime in Section 1.8.
Definition. For each finite place p of Q, let p be its prime integer representative. There is
a valuation | · |p in p that satisfies |p|p = 1/p, called the normalized absolute value
of p. By convention, the normalized absolute value of the infinite place is taken to be
| · |∞.
The next theorem shows an important relation between the values of any x ∈ Q
under the normalized absolute values of all primes on Q.
Theorem 2.1.8 (The Product Formula). For each prime p = 2, 3, 5, . . . , ∞, let | · |p
be the corresponding normalized absolute value on Q. Then for all nonzero x ∈ Q,
$$\prod_p |x|_p = 1.$$
Proof. Let x = a/b for nonzero a, b ∈ Z. Then |x|p = 1 for every finite prime p except when p | a or p | b. (Importantly,
this establishes that the product above really is finite.) The map
$$\mathbb{Q}^\times \longrightarrow \mathbb{R}^\times, \qquad x \longmapsto \prod_p |x|_p$$
is a homomorphism, so it suffices to show that the image of −1 and all primes q ∈ Z
is 1. But | − 1| = 1 for any absolute value, and since q is prime we have
$$|q|_p = \begin{cases} q & p = \infty \\ \frac{1}{q} & p = q \\ 1 & p \neq q, \infty. \end{cases}$$
Therefore the product over all primes of Q is equal to 1.
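The product formula is easy to test numerically, since only the infinite place and the primes dividing the numerator or denominator contribute. The Python sketch below is an illustration with helper names of our choosing. It multiplies the usual absolute value by the normalized p-adic absolute values at those primes, using exact rational arithmetic.

```python
from fractions import Fraction

def prime_divisors(n):
    """Set of primes dividing the positive integer n (trial division)."""
    primes, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            primes.add(d)
            n //= d
        d += 1
    if n > 1:
        primes.add(n)
    return primes

def abs_p(x, p):
    """Normalized p-adic absolute value of a nonzero rational x."""
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return Fraction(p) ** (-v)

def product_over_all_places(x):
    """Product of |x|_p over p = infinity and every prime p dividing x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("the product formula requires x != 0")
    result = abs(x)  # contribution of the infinite place
    for p in prime_divisors(abs(x.numerator)) | prime_divisors(x.denominator):
        result *= abs_p(x, p)
    return result

samples = [Fraction(18, 5), Fraction(-7, 12), Fraction(1001, 360), Fraction(-1)]
products = [product_over_all_places(x) for x in samples]
```

For example, x = −7/12 has |x|∞ = 7/12, |x|₂ = 4, |x|₃ = 3 and |x|₇ = 1/7, whose product is 1.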
One of the first objectives in class field theory is to prove an analog of the product
formula for finite extensions K/Q. This will require a description of how to extend
an absolute value to a field extension of Q.
Recall that a field K is said to be complete with respect to an absolute value | · |
if every Cauchy sequence in K converges with respect to | · |. Of course not every
field is complete: for example, many sequences in Q itself do not converge to rational
numbers. For this reason we embed K into a complete field $\hat{K}$, called the completion
of K. The most common method is to define $\hat{K}$ to consist of equivalence classes
of Cauchy sequences in K. This is a common construction in many first courses in
analysis; the reader may consult [16] for the general case or [15] for the construction
in the number field context. We highlight one important result.
Theorem 2.1.9. The completion $(\hat{K}, |\cdot|)$ of (K, | · |) is unique up to valuation-preserving
isomorphism.
Example 2.1.10. For Q, the completion with respect to | · |∞ is isomorphic to R.
For any prime p, the completion of (Q, | · |p) is called the p-adic numbers and is denoted Qp. The ring
of integers in this completion is called the p-adic integers, denoted Zp.
We briefly state, without proof, some results about nonarchimedean valuations
in completions. Our objective is to obtain a description of all possible completions
of a number field K, so we will content ourselves with highlights from the lengthier
discussions in [15] and [19].
The objects related to nonarchimedean absolute values extend nicely in a comple-
tion.
Proposition 2.1.11. If | · | is a nonarchimedean absolute value on K whose valuation
ring A is a DVR, then the valuation ring $\hat{A}$ of the completion $(\hat{K}, |\cdot|)$ is also a DVR,
and its maximal ideal has the same generator as m ⊂ A.
Proposition 2.1.12. Let $\hat{K}$ be a completion of a number field K with respect to a
nonarchimedean absolute value | · |, and fix a set S ⊂ A of representatives for A/m with 0 ∈ S.
Every element $\alpha \in \hat{K} \setminus \{0\}$ has a unique representation as a power series
$$\alpha = \pi^r (s_0 + s_1 \pi + s_2 \pi^2 + \ldots)$$
where r ∈ Z, si ∈ S with s0 ≠ 0, and π is a generator of m.
Theorem 2.1.13. Let K be complete with respect to a nonarchimedean absolute value
| · | and let L be a finite extension of K, with n = [L : K]. Then | · | extends uniquely
to L by
$$|x| = |N_{L/K}(x)|^{1/n}.$$
The next result is a famous one in the study of absolute values.
Hensel’s Lemma. Suppose A is a commutative local ring that is complete with respect
to its maximal ideal m. Let k = A/m denote the residue field for A, and for any f(x) ∈ A[x] denote
its image in k[x] by $\bar{f}(x)$. Then for every monic polynomial f(x) ∈ A[x] such that
$\bar{f}(x) = g_0(x) h_0(x)$ where g0 and h0 are monic and relatively prime in k[x], we can
factor f(x) = g(x)h(x) with g, h monic in A[x], where $\bar{g} = g_0$ and $\bar{h} = h_0$.
Corollary 2.1.14. In Qp, $x^{p-1} - 1$ has p − 1 distinct roots.
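The corollary can be made concrete by lifting each nonzero residue a mod p to a root of x^(p−1) − 1 modulo p^k. The Python sketch below uses Newton’s iteration, the usual constructive form of Hensel’s Lemma for simple roots; this choice of method and the function name are ours, and modular inverses via `pow` require Python 3.8 or later.

```python
def lift_roots_of_unity(p, k):
    """Roots of x^(p-1) - 1 modulo p^k, one above each residue 1, ..., p-1.

    Intended for odd primes p. Each residue a is a simple root mod p, so
    Newton's iteration x -> x - f(x)/f'(x) converges p-adically to the
    unique root congruent to a mod p.
    """
    modulus = p ** k
    roots = []
    for a in range(1, p):
        x = a
        for _ in range(k):  # precision at least doubles per step, so k steps suffice
            fx = pow(x, p - 1, modulus) - 1
            dfx = (p - 1) * pow(x, p - 2, modulus)
            x = (x - fx * pow(dfx, -1, modulus)) % modulus
        roots.append(x)
    return roots

roots_5 = lift_roots_of_unity(5, 4)   # modulo 5^4 = 625
roots_7 = lift_roots_of_unity(7, 3)   # modulo 7^3 = 343
```

Modulo 625 the lifts of 2 and 3 are 182 and 443, which are the two square roots of −1 to this precision.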
Let K be an algebraic number field, A a DVR such that K is its field of fractions,
and p the maximal ideal of A. Let L be a finite extension of K and let B be a DVR
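which is the integral closure of A in L, with $\mathfrak{P}$ the maximal ideal of B. Now complete
both fields with respect to the $\mathfrak{p}$-adic valuation $|\cdot|_{\mathfrak{p}}$, which by Theorem 2.1.13 extends
uniquely to $|\cdot|_{\mathfrak{P}}$ on $\hat{L}$. The next proposition completely describes the behavior of
primes in the extension $\hat{L}/\hat{K}$ in the language of Section 1.3.
Proposition 2.1.15. Let $\hat{A}$, $\hat{B}$, $\hat{\mathfrak{p}}$ and $\hat{\mathfrak{P}}$ denote the DVRs and maximal ideals,
respectively, of the extension $\hat{L}/\hat{K}$. Then
(a) $\hat{\mathfrak{p}} = \mathfrak{p}\hat{A}$ and $\hat{\mathfrak{P}} = \mathfrak{P}\hat{B}$.
(b) $\hat{\mathfrak{p}}\hat{B} = \hat{\mathfrak{P}}^e$ where $e = e(\mathfrak{P} \mid \mathfrak{p})$, the ramification index in L/K.
(c) Moreover, $e(\hat{\mathfrak{P}} \mid \hat{\mathfrak{p}}) = e(\mathfrak{P} \mid \mathfrak{p}) = e$ and $f(\hat{\mathfrak{P}} \mid \hat{\mathfrak{p}}) = f(\mathfrak{P} \mid \mathfrak{p}) = f$.
(d) $[\hat{L} : \hat{K}] = ef$.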
Proof. It suffices to prove (a) and (b), since the main results in Section 1.3 will then
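imply that (c) and (d) are true. Proposition 2.1.11 directly implies (a), since $\hat{\mathfrak{p}}$ has
the same generator as $\mathfrak{p}$ and $\hat{\mathfrak{P}}$ has the same generator as $\mathfrak{P}$.
To prove (b), let $\mathfrak{p}B$ have prime factorization $\mathfrak{p}B = \mathfrak{P}_1^{e_1} \cdots \mathfrak{P}_g^{e_g}$ where $\mathfrak{P} = \mathfrak{P}_1$
– this is possible since B is a DVR and hence Dedekind. Each $\mathfrak{P}_i$, i ≥ 2, contains
elements of B outside of the maximal ideal $\mathfrak{P}$, and since Proposition 2.1.11 says that
$\hat{B}$ is a DVR with $\mathfrak{P}\hat{B}$ as its maximal ideal, this implies that $\mathfrak{P}_i\hat{B} = \hat{B}$ for i ≥ 2.
Hence
$$\hat{\mathfrak{p}}\hat{B} = (\mathfrak{p}\hat{A})\hat{B} = (\mathfrak{P}_1^{e_1} \cdots \mathfrak{P}_g^{e_g})\hat{B} = (\mathfrak{P}_1\hat{B})^{e_1} = \hat{\mathfrak{P}}^{e_1}.$$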
For now, this concludes our description of nonarchimedean absolute values in ex-
tensions. We conclude the section by discussing how to extend archimedean absolute
values.
Theorem 2.1.16. If K is complete with respect to an archimedean absolute value | · |,
then K is isomorphic to either R or C, and | · | is equivalent to the usual absolute
value on these.
Proof. Since | · | is archimedean, the values for |n| where n ∈ Z are unbounded. Thus
K must have characteristic zero and therefore |·| restricts to an archimedean valuation
on Q. By Theorem 2.1.7 (Ostrowski’s theorem), we may replace | · | with the usual
(archimedean) absolute value on Q.
K is complete, so the completion of Q with respect to | · | is contained in K. We
know this completion is isomorphic to R, with | · | equivalent to the usual absolute
value on R, so it remains to show that either K = R already or K = C (up to
isomorphism).
Suppose $K$ contains $i$, a root of $x^2 + 1 = 0$. Then $K \supseteq \mathbb{R}(i) = \mathbb{C}$. If not, adjoin $i$ to obtain a field $K(i)$. Then $|\cdot|$ extends to $K(i)$ by
$$|a + bi| = \sqrt{|a|^2 + |b|^2}$$
and $K(i)$ is complete under this valuation. It is straightforward to check that this is equivalent to the usual absolute value on $\mathbb{C}$. In any case we may at this point assume $\mathbb{C} \subseteq K$ or replace $K$ with $K(i)$. Janusz [15] shows that $\mathbb{C} \neq K$ (or $K(i)$) produces a contradiction, so we must have $K = \mathbb{C}$ or $K(i) = \mathbb{C}$, proving the theorem.
Corollary 2.1.17. Let $K$ be an algebraic number field and $\{\sigma_1, \ldots, \sigma_r, \sigma_{r+1}, \ldots, \sigma_{r+s}\}$ be the set of embeddings of $K$ into $\mathbb{C}$. Then every archimedean absolute value on $K$ is equivalent to exactly one of the form $|x|_i = |\sigma_i(x)|$.
From this description of absolute values on K, both archimedean and nonar-
chimedean, we have a generalized product formula for the places of K.
The Product Formula. Let $K$ be an algebraic number field. For each prime $p$ of $K$, finite or infinite, we may select a valuation $|\cdot|_p$ in $p$ such that for all nonzero $x \in K$,
$$\prod_p |x|_p = 1.$$
2.2 Frobenius Automorphisms and the Artin Map
Fix a Galois extension L of a number field K and let G be the Galois group of this
extension. Recall from Section 1.8 that for an unramified prime $P \subset \mathcal{O}_L$, there is an automorphism $\sigma \in G$ called the Artin symbol such that $\sigma(\alpha) = \alpha^q$ for all $\alpha \in \mathcal{O}_L/P$, where $q = |\mathcal{O}_K/p|$ if $p = P \cap \mathcal{O}_K$. Cox [7] denotes the Artin symbol by $\left(\frac{L/K}{P}\right)$ since it is used to define the Artin map $\left(\frac{L/K}{\cdot}\right) : I_K \to G$ in the abelian case. On the other hand, Janusz [15] and many other authors refer to this element as the Frobenius automorphism, denoted $\mathrm{Frob}_{L/K}(P)$. We will use these names and notations interchangeably, since each has its uses in particular contexts and neither is really preferred in the literature. There should be no confusion.
We’ve already proven the existence and uniqueness of the Frobenius automorphism
(Lemma 1.8.2) and in Proposition 1.8.3 we gave some nice properties:
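(i) For all $\sigma \in G$, $\left(\frac{L/K}{\sigma(P)}\right) = \sigma \left(\frac{L/K}{P}\right) \sigma^{-1}$.
(ii) $\mathrm{Frob}_{L/K}(P)$ has order $f = [\mathcal{O}_L/P : \mathcal{O}_K/p]$ in $G$.
(iii) $p$ splits completely in $L \iff \left(\frac{L/K}{P}\right) = 1$ for any prime $P$ lying over $p$.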
Note that (i) means that in general, the set $\{\mathrm{Frob}_{L/K}(P) \mid P \subset \mathcal{O}_L \text{ divides } p\}$ is a conjugacy class in $G$. If $L/K$ is abelian, this represents a single element of $G$ which we denote with $\left(\frac{L/K}{p}\right)$ or $\mathrm{Frob}_{L/K}(p)$. One sometimes sees $\left(\frac{L/K}{p}\right)$ denote the conjugacy class as well.
It will be useful to know how the Frobenius automorphism behaves in towers.
Suppose L ⊃ E ⊃ K and denote P∩E by pE. If p = P∩K is unramified in L, pE is
clearly also unramified in L so there is a Frobenius automorphism FrobL/E(P) which
relates to FrobL/K(P) by the next few results.
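Proposition 2.2.1. Let $f_0 = f(p_E \mid p)$. Then
$$\left(\frac{L/K}{P}\right)^{f_0} = \left(\frac{L/E}{P}\right).$$
Proof. The residue fields are related in the following way:
$$\mathcal{O}_L/P \supset \mathcal{O}_E/p_E \supset \mathcal{O}_K/p$$
and they have orders $q^f$, $q^{f_0}$ and $q$, respectively. Consider $G' = \mathrm{Gal}(\ell/\varepsilon)$, where $\ell = \mathcal{O}_L/P$ and $\varepsilon = \mathcal{O}_E/p_E$. This group is generated by the automorphism $x \mapsto x^{q^{f_0}}$, which is the $f_0$th power of the generator $x \mapsto x^q$ of $\mathrm{Gal}(\ell/k)$, where $k = \mathcal{O}_K/p$. The proposition then follows from the definitions of the Frobenius automorphisms.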
Proposition 2.2.2. Suppose L ⊃ E ⊃ K is a tower of fields so that L/K is abelian
and E/K is normal. Let m be a modulus on K and let mE denote the modulus of E
defined by the primes lying over each p | m. Then the following diagram commutes:
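[Diagram: vertices $I_K^m$, $I_E^{m_E}$, $\mathrm{Gal}(L/K)$ and $\mathrm{Gal}(E/K)$; arrows $\mathrm{Frob}_{L/K}(\cdot)$ into $\mathrm{Gal}(L/K)$, $\mathrm{Frob}_{E/K}(\cdot)$ into $\mathrm{Gal}(E/K)$, and the restriction $\sigma \mapsto \sigma|_E$ from $\mathrm{Gal}(L/K)$ to $\mathrm{Gal}(E/K)$.]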
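Proof. Let $P \subset \mathcal{O}_L$ be a prime and set $p_E = P \cap E$. Since $E/K$ is normal, $\mathrm{Frob}_{E/K}(p_E)$ is defined. To show the diagram commutes, it suffices to prove that the restriction of $\mathrm{Frob}_{L/K}(P)$ to $E$ is exactly $\mathrm{Frob}_{E/K}(p_E)$. For any $\alpha \in \mathcal{O}_E$, $\sigma(\alpha) \equiv \alpha^q \bmod P$ if and only if $\sigma(\alpha) \equiv \alpha^q \bmod p_E$ since $p_E = P \cap E$ is fixed by all of $G$ when $E/K$ is normal. Therefore
$$\mathrm{Frob}_{L/K}(P)\big|_E = \mathrm{Frob}_{E/K}(p_E).$$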
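Corollary 2.2.3. Suppose $E_1$ and $E_2$ are normal extensions of $K$ and $L = E_1E_2$. Define $p_1 = P \cap E_1$ and $p_2 = P \cap E_2$ so that their Frobenius elements are all defined. Then the homomorphism
$$\mathrm{Gal}(L/K) \longrightarrow \mathrm{Gal}(E_1/K) \times \mathrm{Gal}(E_2/K), \qquad \sigma \longmapsto (\sigma|_{E_1}, \sigma|_{E_2})$$
is one-to-one and therefore
$$\left(\frac{L/K}{P}\right) = \left(\frac{E_1/K}{p_1}\right) \times \left(\frac{E_2/K}{p_2}\right).$$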
Proof. The previous proposition shows that the map is a well-defined homomorphism.
Then the fact that p splits completely in L ⇐⇒ p splits completely in E1 and E2
proves the map is one-to-one.
Let’s take a look at Frobenius automorphisms in our favourite example.
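Example 2.2.4. Let $K = \mathbb{Q}(i)$ and take any odd prime integer $p$. Since $K/\mathbb{Q}$ is abelian, $\left(\frac{K/\mathbb{Q}}{p}\right)$ represents a single element. We claim that
$$\left(\frac{K/\mathbb{Q}}{p}\right) = \begin{cases} \text{complex conjugation} & \text{if } p \equiv 3 \pmod 4 \\ 1 & \text{if } p \equiv 1 \pmod 4. \end{cases}$$
To prove this, first let $p \equiv 3 \pmod 4$. Then $p$ remains prime in $\mathbb{Q}(i)$ and the residue fields are given by
$$\ell = \mathbb{Z}[i]/p\mathbb{Z}[i] = \mathbb{F}_{p^2} \quad\text{and}\quad k = \mathbb{Z}/p\mathbb{Z} = \mathbb{F}_p.$$
The Frobenius element for $p$ in $\ell/k$ must be $x \mapsto x^p$. Since $i^p = -i$ when $p \equiv 3 \pmod 4$,
$$(a + bi)^p \equiv a^p + b^p i^p \equiv a - bi \pmod p.$$
So the Frobenius element of any prime $p \equiv 3 \pmod 4$ is complex conjugation.
On the other hand, recall that if $p \equiv 1 \pmod 4$, $(p)$ splits completely in $\mathbb{Q}(i)$. If $p\mathbb{Z}[i] = p_1p_2$, these prime ideals must be complex conjugates. Then we have
$$\mathbb{Z}[i]/p_1 = \mathbb{Z}[i]/p_2 = \mathbb{F}_p \quad\text{and}\quad \mathbb{Z}/p\mathbb{Z} = \mathbb{F}_p$$
so the Frobenius automorphism is the identity.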
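The congruence in Example 2.2.4 can be checked by brute force for small primes. The following is a minimal sketch, not part of the original argument: the helper names `gauss_mul`, `gauss_pow`, `frobenius_is_conjugation` and `frobenius_is_identity` are our own, and elements of $\mathbb{Z}[i]/p\mathbb{Z}[i]$ are represented as pairs `(a, b)` standing for $a + bi$.

```python
def gauss_mul(x, y, p):
    """Multiply x = (a, b) ~ a + bi and y = (c, d) ~ c + di in Z[i]/pZ[i]."""
    a, b = x
    c, d = y
    return ((a * c - b * d) % p, (a * d + b * c) % p)


def gauss_pow(x, e, p):
    """Compute x**e in Z[i]/pZ[i] by square-and-multiply."""
    result = (1, 0)
    base = (x[0] % p, x[1] % p)
    while e > 0:
        if e & 1:
            result = gauss_mul(result, base, p)
        base = gauss_mul(base, base, p)
        e >>= 1
    return result


def frobenius_is_conjugation(p):
    """True if (a + bi)^p == a - bi for every residue a + bi mod p."""
    return all(gauss_pow((a, b), p, p) == (a, (-b) % p)
               for a in range(p) for b in range(p))


def frobenius_is_identity(p):
    """True if (a + bi)^p == a + bi for every residue a + bi mod p."""
    return all(gauss_pow((a, b), p, p) == (a, b)
               for a in range(p) for b in range(p))
```

For instance, `frobenius_is_conjugation(7)` and `frobenius_is_identity(13)` should both hold, matching the two cases of the claim.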
Next we describe Frobenius automorphisms in general cyclotomic extensions.
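Example 2.2.5. Let $K = \mathbb{Q}(\zeta_n)$ where $\zeta_n = e^{2\pi i/n}$ for some $n \geq 2$. Then $\mathrm{Gal}(K/\mathbb{Q}) \cong (\mathbb{Z}/n\mathbb{Z})^\times$ via the isomorphism identifying $[k] \in (\mathbb{Z}/n\mathbb{Z})^\times$ with the map $\zeta_n \mapsto \zeta_n^k$. For a prime $p \nmid n$, this implies that
$$\left(\frac{K/\mathbb{Q}}{p}\right) = (\zeta_n \mapsto \zeta_n^p) \longleftrightarrow p \pmod n.$$
In particular, $(p)$ splits completely in $\mathbb{Q}(\zeta_n)$ if and only if $p \equiv 1 \pmod n$.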
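As a small illustration of Example 2.2.5, the order of the Frobenius element is the multiplicative order of $p$ modulo $n$, and $p$ splits completely exactly when that order is 1. The sketch below uses hypothetical helper names of our own choosing. It also counts elements of exact order $n$ in $\mathbb{F}_p^\times$, using the standard fact (assumed here, not proven in the text) that $\mathbb{F}_p$ contains a primitive $n$th root of unity precisely when $n \mid p - 1$.

```python
from math import gcd


def multiplicative_order(p, n):
    """Order of p in (Z/nZ)^x; requires gcd(p, n) == 1."""
    if gcd(p, n) != 1:
        raise ValueError("p must be relatively prime to n")
    k, x = 1, p % n
    while x != 1:
        x = (x * p) % n
        k += 1
    return k


def splits_completely(n, p):
    """p splits completely in Q(zeta_n) iff Frobenius is trivial, i.e. p = 1 mod n."""
    return multiplicative_order(p, n) == 1


def primitive_roots_of_unity(n, p):
    """Elements of exact multiplicative order n in F_p^x, found by brute force."""
    proper_divisors = [d for d in range(1, n) if n % d == 0]
    return [x for x in range(1, p)
            if pow(x, n, p) == 1
            and all(pow(x, d, p) != 1 for d in proper_divisors)]
```

With $n = 5$, the prime 11 splits completely and $\mathbb{F}_{11}$ contains four primitive fifth roots of unity, while 7 has Frobenius of order 4 and $\mathbb{F}_7$ contains none.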
For the rest of the section, we focus on setting up the right conditions for a
generalization of the Artin map. The definition is simpler when it is a map on
unramified primes of OK so we need a way to restrict to these primes.
Definition. For a number field $K$, let $I_K$ be the group of fractional $\mathcal{O}_K$-ideals and let $S$ be a finite set of primes in $\mathcal{O}_K$. Then $I_K^S$ is defined to be the subgroup of $I_K$ generated by those prime ideals which are not in $S$.
In practice we will take $S$ to be the set of primes that ramify in a given extension. For this choice of $S$, we define
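Definition. Suppose $L/K$ is abelian and let $S = \{\text{primes } p \subset \mathcal{O}_K \mid p \text{ ramifies in } L\}$ so that $I_K^S$ is generated by the unramified primes in $\mathcal{O}_K$. Define the Artin map to be the homomorphism
$$\varphi_{L/K} : I_K^S \longrightarrow G = \mathrm{Gal}(L/K), \qquad a \longmapsto \prod_{p_i} \left(\frac{L/K}{p_i}\right)^{e_i}$$
where $a$ is a fractional ideal with prime factorization $a = \prod p_i^{e_i}$.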
Since L/K is abelian, this map is well-defined. We will later (Section 2.10) generalize
the Artin map to non-abelian extensions.
Suppose $E$ is a finite extension of $K$. Then $EL/E$ is an abelian extension whose Galois group, say $H$, is a subgroup of $\mathrm{Gal}(L/K)$ when we restrict elements of $H$ to $L$. Let $I_E^S$ denote the subgroup of $I_E$ generated by primes in $\mathcal{O}_E$ that do not lie over any prime in $S$. Note that this is equivalent to saying $I_E^S$ is generated by the primes of $\mathcal{O}_E$ which have norm in $I_K^S$.
Proposition 2.2.6. Let $G = \mathrm{Gal}(L/K)$ and $H = \mathrm{Gal}(EL/E)$. Then restricting $H$ to $L$ gives us $\varphi_{EL/E} = \varphi_{L/K} \circ N_{E/K}$ on $I_E^S$.
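Proof. Let $P \subset \mathcal{O}_{EL}$ be prime and let $P_E = P \cap E$, $P_L = P \cap L$ and $p = P \cap K$. Then $q := N_{K/\mathbb{Q}}(p)$ is a prime power and $N_{E/K}(P_E) = p^f$. Let $\sigma = \mathrm{Frob}_{EL/E}(P_E)$. Then for each $\alpha \in \mathcal{O}_{EL}$ we have $\sigma(\alpha) \equiv \alpha^{q^f} \bmod P$. Recall that $\sigma(P) = P$ and $\sigma(P_L) = P_L$. Let $\tau = \mathrm{Frob}_{L/K}(p)$. Then when $\alpha \in \mathcal{O}_L$ we have
$$\tau(\alpha) \equiv \alpha^q \bmod P_L \implies \tau^f(\alpha) \equiv \alpha^{q^f} \bmod P_L.$$
Since the Frobenius automorphism is unique, $\tau^f = \sigma$ on $L$. This proves the property for all primes in $I_E^S$ and since they generate $I_E^S$ we're done.
Corollary 2.2.7. Let $\varphi$ be the Artin map in an extension $L/K$. Then $N_{L/K}(I_L^S) \subseteq \ker\varphi$.
Proof. Let $E = L$ and apply Proposition 2.2.6 to obtain $\varphi_{L/K} \circ N_{L/K} = \varphi_{L/L} = 1$.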
From this we obtain a nice description of ϕ for any abelian extension K of Q.
Theorem 2.2.8. Let K/Q and let S be the set of prime ideals containing (m) for
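some positive integer $m$. Then the Artin map $\varphi : I_{\mathbb{Q}}^S \to \mathrm{Gal}(K/\mathbb{Q})$ is surjective with
$$\ker\varphi = \left\{ \text{fractional ideals } \left(\tfrac{a}{b}\right) : a \equiv b \pmod m \right\}.$$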
Proof. See III.3.3 of [15]. Surjectivity of ϕ will follow from the Frobenius Density
Theorem in Section 2.5.
When L/K is not an abelian extension, a description of the Artin map becomes
more difficult. For this reason many theorems in class field theory are complicated
to state. It is our goal in the next few sections to provide a glimpse of some of the
constructions required to prove a more general description of the Artin map.
2.3 Ray Class Groups
In this section we generalize the class group from Chapter 1.
Definition. A modulus $m$ is a formal product of places of $K$:
$$m = \prod_p p^{n(p)}.$$
This product is taken over all places of K, and the n(p) are nonnegative integers
subject to the following conditions:
(1) If p is finite then n(p) ≥ 0 and only finitely many of these are nonzero.
(2) If p is a real infinite prime, n(p) = 0 or 1.
(3) If p is a complex infinite prime, n(p) = 0.
It is common to write a modulus as m = m0m∞ where m0 denotes the product
of all finite primes with positive exponent and m∞ denotes the product of the real
primes in m. In this way m0 may be realized as an integral ideal in OK .
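Fix a place $p$ of $K$ and take $\alpha \in K^*$. If $p$ is a real infinite place, we say $\alpha \equiv 1 \bmod p$ if $\alpha_p > 0$, where $\alpha_p$ denotes the image of $\alpha$ under the real embedding corresponding to $p$. Otherwise $\alpha \not\equiv 1 \bmod p$. If $p$ is finite, we say $\alpha \equiv 1 \bmod p^{n(p)}$ if $\alpha$ is in the valuation ring corresponding to $p$ and $\alpha - 1 \in p^{n(p)}$. We can extend this notion of congruence for elements of $K^*$ to any modulus $m$ by $\alpha \equiv 1 \bmod m$ if and only if $\alpha \equiv 1 \bmod p^{n(p)}$ for all primes with $n(p) > 0$.
Definition. For a modulus $m$ of a field $K$, define the following subgroups of $K^*$:
$$K_m = \left\{ \tfrac{a}{b} \;\middle|\; a, b \in \mathcal{O}_K \text{ and } a\mathcal{O}_K,\ b\mathcal{O}_K \text{ are relatively prime to } m_0 \right\}$$
$$K_{m,1} = \{\alpha \in K_m \mid \alpha \equiv 1 \bmod m\}.$$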
Let $I_K^S$ be as in the last section; that is, for any set of primes $S$, $I_K^S$ is the subgroup of $I_K$ generated by primes outside $S$. We define a special case of this for a modulus.
Definition. Let $S$ be the set of primes dividing $m_0$ for some modulus $m$. Then we denote the subgroup $I_K^S \leq I_K$ by $I^m$.
There is a natural inclusion $i : K^* \to I_K$ given by $\alpha \mapsto (\alpha)$; we denote the image of $K_{m,1}$ under this map by $P_K(m, 1) := i(K_{m,1})$. This allows us to define
Definition. The ray class group of a modulus $m$ is $C_K(m) = I^m/P_K(m, 1)$. The cosets of $P_K(m, 1)$ in this quotient are referred to as ray classes mod $m$.
Example 2.3.1. If $m = 1$ then $P_K(m, 1)$ is just the subgroup of principal ideals and thus $C_K(m)$ is the full ideal class group $C(\mathcal{O}_K)$.
Example 2.3.2. If $m = \prod_{\nu \text{ real}} \nu$ then $C_K(m) = I_K/\{(a) : a_\nu > 0 \text{ for all real } \nu\}$ is called the narrow class group of $K$.
Example 2.3.3. Let $m = (2)^3(17)^2(19) \cdot \infty$, a modulus of $\mathbb{Q}$. Then $m_0 = (2)^3(17)^2(19)$ so $\mathbb{Q}_{m,1}$ consists of all $x \in \mathbb{Q}$ satisfying
$$x > 0, \qquad x \equiv 1 \bmod 2^3, \qquad x \equiv 1 \bmod 17^2, \qquad x \equiv 1 \bmod 19.$$
For example, if $x = \frac{a}{b}$ for $a, b \in \mathbb{Z}$ and $b \neq 0$ then the condition at the place 2 tells us $a$ and $b$ are odd and $ab^{-1} \equiv 1 \bmod 8$. This looks similar to the Chinese remainder theorem. The connection is made clear in the weak approximation theorem.
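The conditions of Example 2.3.3 translate directly into a membership test for $\mathbb{Q}_{m,1}$. The following sketch is only illustrative: the name `in_Q_m1` and the dictionary `PRIME_POWERS` are our own. It relies on the observation that, for $x = a/b$ in lowest terms with $b$ a unit at $p$, the condition $x \equiv 1 \bmod p^n$ is the same as $a \equiv b \bmod p^n$.

```python
from fractions import Fraction

# Finite part of the modulus m = (2)^3 (17)^2 (19) * infinity from Example 2.3.3,
# stored as {prime: exponent}.
PRIME_POWERS = {2: 3, 17: 2, 19: 1}


def in_Q_m1(x):
    """Test whether the rational number x lies in Q_{m,1} for the modulus above."""
    x = Fraction(x)
    if x <= 0:
        return False  # fails the condition at the infinite place
    a, b = x.numerator, x.denominator  # Fraction keeps a/b in lowest terms
    for p, n in PRIME_POWERS.items():
        if a % p == 0 or b % p == 0:
            return False  # x is not a unit at p
        if (a - b) % p**n != 0:
            return False  # x is not congruent to 1 mod p^n
    return True
```

For example, $x = 9$ satisfies the condition at 2 but fails at 17, so it is rejected, while $x = 1 + 8 \cdot 289 \cdot 19 = 43929$ is accepted.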
Weak Approximation Theorem. Let | · |1, . . . , | · |n be inequivalent, nontrivial
valuations on K and let β1, . . . , βn ∈ K∗. Then for any ε > 0 there exists an element
α ∈ K such that |α− βi|i < ε for all i = 1, . . . , n.
Proof. Following the proof in [15], we first prove the existence of elements $y_1, \ldots, y_n$ in $K$ that satisfy
$$|y_i|_i > 1 \quad\text{and}\quad |y_i|_j < 1 \text{ for all } j \neq i.$$
We do this by inducting on $n$. The base case $n = 2$ is proven by the negation of the definition of equivalent valuations. Now suppose there is an element $y \in K$ satisfying
$$|y|_1 > 1 \quad\text{and}\quad |y|_j < 1 \text{ for } j = 2, \ldots, n - 1.$$
By the base case there exists some $t \in K$ such that $|t|_1 > 1$ and $|t|_n < 1$. Now choose $y_1$ according to
$$y_1 = \begin{cases} y & \text{if } |y|_n < 1 \\ y^r t & \text{if } |y|_n = 1 \\ \dfrac{y^r t}{1 + y^r} & \text{if } |y|_n > 1 \end{cases}$$
for a real number $r$ yet to be chosen. If $|y|_n = 1$ then $|y_1|_j = |y|_j^r |t|_j$ for all $2 \leq j \leq n$. Thus for sufficiently large $r$, $|y_1|_j < 1$ for all $2 \leq j \leq n$. In the case that $|y|_n > 1$, note that
$$\frac{|y^r|_j}{|1 + y^r|_j} < \frac{1}{|y^{-r}|_j} \longrightarrow 0 \text{ as } r \to \infty.$$
In all cases we have $y_1 \in K$ that is "large" at $|\cdot|_1$ and "small" at the other valuations. We could have picked another valuation to start with, so the same proof produces $y_2, \ldots, y_n$ with the desired properties.
Now let
$$\alpha = \sum_{i=1}^n \frac{y_i^r}{1 + y_i^r} \beta_i.$$
We claim that $\alpha$ is the element prescribed by the theorem when $r$ is chosen appropriately. By the triangle inequality,
$$|\alpha - \beta_i|_i \leq \left| \frac{\beta_i}{1 + y_i^r} \right|_i + \sum_{j \neq i} \left| \frac{y_j^r}{1 + y_j^r} \beta_j \right|_i.$$
For any $\varepsilon > 0$ we can choose $r$ large enough so that both terms on the right are less than $\varepsilon$. This completes the proof.
Remark. When $p$ is an infinite place of $K$, the statement $|\alpha - \beta|_p < \varepsilon$ for small $\varepsilon > 0$ is equivalent to $\left(\frac{\alpha}{\beta}\right)_p > 0$, i.e. $\alpha \equiv \beta \bmod p$. When $p$ is a finite place, recall that $|\alpha|_p = c^{v(\alpha)}$ for some real number $c$, $0 < c < 1$. Then we see that $|\alpha - \beta|_p < \varepsilon$ is equivalent to
$$\left| \frac{\alpha}{\beta} - 1 \right|_p < \frac{\varepsilon}{|\beta|_p} =: \varepsilon'.$$
In turn when $\varepsilon'$ is small, say $\varepsilon' < c^n$ for some $n$, then $v\left(\frac{\alpha}{\beta} - 1\right) > n$, which means $\alpha\beta^{-1}$ is in the valuation ring for $p$. Recall that this is the same as saying $\alpha \equiv \beta \bmod p^n$. So in general we see that $|\alpha - \beta|_p < \varepsilon$ is equivalent to $\alpha \equiv \beta \bmod p^n$ for a sufficiently large $n$. As suggested in Example 2.3.3, the reformulation of the weak approximation theorem in terms of congruences allows us to view it as a generalization of the Chinese remainder theorem.
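Theorem 2.3.4. For every modulus $m$ of $K$, there is an exact sequence
$$0 \to U_K/U_{m,1} \to K_m/K_{m,1} \to C_K(m) \to C(\mathcal{O}_K) \to 0$$
and isomorphisms
$$K_m/K_{m,1} \cong \prod_{\substack{p \text{ real} \\ p \mid m}} \{\pm 1\} \times \prod_{p \mid m_0} (\mathcal{O}_K/p^{n(p)})^\times \cong \prod_{\substack{p \text{ real} \\ p \mid m}} \{\pm 1\} \times (\mathcal{O}_K/m_0)^\times$$
where $U_{m,1} = U_K \cap K_{m,1}$.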
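Proof. First, the inclusion $I^m \hookrightarrow I_K$ induces a homomorphism $C_K(m) \to C(\mathcal{O}_K)$. Consider the sequence
$$0 \to U_K \to K_m \to I^m \to C(\mathcal{O}_K) \to 0.$$
We will show that it is exact. In particular, to show $I^m \to C(\mathcal{O}_K)$ is surjective, we must prove that every ideal class is represented by an ideal in $I^m$. Let $\mathfrak{a}$ be a fractional ideal; we may write $\mathfrak{a} = \mathfrak{b}\mathfrak{c}^{-1}$ where $\mathfrak{b}$ and $\mathfrak{c}$ are integral ideals. For any $c \in \mathfrak{c}$, $\mathfrak{a} \cdot (c) = \mathfrak{b}\mathfrak{c}^{-1}(c)$ is integral so we may assume $\mathfrak{a}$ is integral in the first place. Write
$$\mathfrak{a} = \prod_{p \mid m} p^{n(p)}\, \mathfrak{b}$$
where $\mathfrak{b} \in I^m$. For each $p \mid m$, choose $\pi_p \in p \smallsetminus p^2$ such that $\pi_p \equiv 1 \bmod p$. By the weak approximation theorem, there is some $a \in \mathcal{O}_K$ so that $a \equiv \pi_p^{n(p)} \bmod p^{n(p)+1}$ for all $p \mid m$. This means we can write
$$(a) = \prod_{p \mid m} p^{n(p)}\, \mathfrak{b}' \quad\text{where } \mathfrak{b}' \in I^m$$
but then $a^{-1}\mathfrak{a} \in I^m$ and this belongs to the same ideal class as $\mathfrak{a}$. Hence $I^m \to C(\mathcal{O}_K)$ is surjective. Next, if $\mathfrak{a} \in I^m$ maps to the trivial class in $C(\mathcal{O}_K)$ then $\mathfrak{a} = (\alpha)$ for some $\alpha \in K_m$ and this $\alpha$ is uniquely determined up to multiplication by a unit $u \in U_K$. This implies exactness of the rest of the sequence.
Now consider the maps $K_{m,1} \xrightarrow{f} K_m \xrightarrow{g} I^m$. By the work above, $\ker g = U_K$ and $\operatorname{coker} g = C(\mathcal{O}_K)$. By definition, $\operatorname{coker}(g \circ f) = C_K(m)$ and $\ker(g \circ f) = K_{m,1} \cap U_K = U_{m,1}$. Finally, $f$ is injective by the definitions of $K_m$ and $K_{m,1}$. Hence by Lemma A.2.1 we have an exact sequence
$$0 \to U_{m,1} \to U_K \to K_m/K_{m,1} \to C_K(m) \to C(\mathcal{O}_K) \to 0.$$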
Next we prove the isomorphisms. Let $p \mid m$. If $p$ is an infinite prime we map $\alpha \in K_m$ to the sign ($+$ or $-$) of the image of $\alpha$ under the embedding $(\cdot)_p : K \hookrightarrow \mathbb{C}$. If $p$ is finite, we map $\alpha$ to $[a][b]^{-1} \in (\mathcal{O}_K/p^{n(p)})^\times$ where $a, b \in \mathcal{O}_K$ such that $a \equiv b \equiv 1 \bmod m_0$. Since $a$ and $b$ are in particular relatively prime to $p$, it makes sense to define their equivalence classes and take inverses in $(\mathcal{O}_K/p^{n(p)})^\times$. Consider the map we have defined:
$$\varphi : K_m \longrightarrow \prod_{p \text{ real}} \{\pm 1\} \times \prod_{p \mid m_0} (\mathcal{O}_K/p^{n(p)})^\times.$$
By the weak approximation theorem and subsequent remark, $\varphi$ is surjective. Moreover, its kernel is $K_{m,1}$ by the way this subgroup is defined. This shows the first isomorphism; the second is easily concluded from the Chinese remainder theorem.
Corollary 2.3.5. For any $m$, the ray class group $C_K(m)$ is a finite group of order
$$h_m = \frac{h_K\, 2^{r_0}\, N(m_0)}{[U_K : U_{m,1}]} \prod_{p \mid m_0} \left(1 - \frac{1}{N(p)}\right)$$
where $r_0$ is the number of real primes dividing $m$.
Proof. First, $\mathcal{O}_K/p^n$ is a local ring with maximal ideal $p/p^n$; this can be seen by the correspondence between its ideals and the ideals of $\mathcal{O}_K$ containing $p^n$. Moreover, the units in $\mathcal{O}_K/p^n$ are precisely those elements not in $p/p^n$. It follows that $(\mathcal{O}_K/p^n)^\times$ has order $q^{n-1}(q - 1)$ where $q = N(p) = [\mathcal{O}_K : p]$. Then by Theorem 2.3.4,
$$\begin{aligned} |C_K(m)| &= |(K_m/K_{m,1})/(U_K/U_{m,1})| \cdot |C(\mathcal{O}_K)| \\ &= \left| \prod_{p \text{ real}} \{\pm 1\} \times \prod_{p \mid m_0} (\mathcal{O}_K/p^{n(p)})^\times \right| [U_K : U_{m,1}]^{-1} \cdot h_K \\ &= h_K\, 2^{r_0}\, [U_K : U_{m,1}]^{-1} \prod_{p \mid m_0} N(p)^{n(p)-1}(N(p) - 1). \end{aligned}$$
Furthermore, this expression is equal to the desired one when we factor out $N(m_0)$ from the product on the right, using that $N$ is multiplicative.
The most important implication of Corollary 2.3.5 is that every ray class group
CK(m) is finite. Let’s take a look at some examples.
Example 2.3.6. For K = Q, the narrow class group is trivial.
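Example 2.3.7. Let $K = \mathbb{Q}(\sqrt{n})$ for $n > 0$. Here there are two real primes and $U_K = \{\pm\varepsilon^m\} \cong \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}$ for a fundamental unit $\varepsilon$. Let $\bar{\varepsilon}$ be the conjugate of $\varepsilon$. Then
$$h_m = \begin{cases} 2h_K & \text{if } \varepsilon, \bar{\varepsilon} \text{ have the same sign} \\ h_K & \text{otherwise.} \end{cases}$$
Also note that $N(\varepsilon) = -1$ if and only if $\varepsilon$ and $\bar{\varepsilon}$ have different signs. For the first few values of $n$ we have

n    h_K    ε             N(ε)
2    1      1 + √2        −1
3    1      2 + √3        1
5    1      (1 + √5)/2    −1
6    1      5 + 2√6       1

so we see that the narrow class numbers for $\mathbb{Q}(\sqrt{3})$ and $\mathbb{Q}(\sqrt{6})$ are 2, whereas the others have narrow class number 1.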
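The table in Example 2.3.7 can be recomputed from the norm $N(a + b\sqrt{n}) = a^2 - nb^2$. The sketch below takes the class numbers and fundamental units from the table as given data rather than computing them; the names `unit_norm`, `narrow_class_number` and `TABLE` are our own.

```python
from fractions import Fraction


def unit_norm(a, b, n):
    """Norm of a + b*sqrt(n) over Q, namely a^2 - n*b^2."""
    a, b = Fraction(a), Fraction(b)
    return a * a - n * b * b


# Rows (n, h_K, (a, b)) with fundamental unit a + b*sqrt(n), copied from the table.
TABLE = [
    (2, 1, (1, 1)),
    (3, 1, (2, 1)),
    (5, 1, (Fraction(1, 2), Fraction(1, 2))),
    (6, 1, (5, 2)),
]


def narrow_class_number(h, a, b, n):
    """Norm -1 means the conjugates have different signs, giving h; otherwise 2h."""
    return h if unit_norm(a, b, n) == -1 else 2 * h


results = {n: narrow_class_number(h, a, b, n) for n, h, (a, b) in TABLE}
```

Here `results` maps each $n$ in the table to the narrow class number of $\mathbb{Q}(\sqrt{n})$.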
Example 2.3.8. Let's look at the important example of cyclotomic extensions. Let $L = \mathbb{Q}(\zeta_m)$ where $\zeta_m = e^{2\pi i/m}$ for $m > 2$. Define the modulus $\mathfrak{m} = (m)\infty$ on $\mathbb{Q}$. We claim that all ramified primes of $L$ divide $\mathfrak{m}$. The minimal polynomial of $\zeta_m$ over $\mathbb{Q}$ is well known: it is the $m$th cyclotomic polynomial $\Phi_m(x)$. These polynomials are constructed by setting $\Phi_1(x) = x - 1$ and recursively defining
$$\Phi_m(x) = \frac{x^m - 1}{\prod_{d \mid m,\ d < m} \Phi_d(x)}.$$
The relevant property we will use is that $\Phi_m(x)$ is a factor of $x^m - 1$. For a prime $p$, consider $x^m - 1$ over the finite field $\mathbb{F}_p$. Since the formal derivative of $x^m - 1$ is $mx^{m-1}$, these polynomials are relatively prime unless $m = 0$ in $\mathbb{F}_p$, i.e. $p \mid m$. In particular this shows that if $p \nmid m$, $x^m - 1$ is separable mod $p$ and so are all of its factors, including $\Phi_m(x)$. Hence by Theorem 1.3.4, $p$ is unramified in $L$.
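The recursive definition of $\Phi_m(x)$ is easy to carry out by exact polynomial division. The following is a minimal sketch with helper names of our own; polynomials are stored as coefficient sequences with the constant term first, and the division routine assumes a monic divisor, which holds here because every $\Phi_d$ is monic.

```python
from functools import lru_cache


def poly_mul(f, g):
    """Multiply two integer polynomials given as coefficient sequences."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


def poly_divide_exact(num, den):
    """Divide num by a monic polynomial den, checking that the remainder is zero."""
    num = list(num)
    quotient = [0] * (len(num) - len(den) + 1)
    for i in range(len(quotient) - 1, -1, -1):
        coeff = num[i + len(den) - 1]  # den is monic, so no division is needed
        quotient[i] = coeff
        for j, d in enumerate(den):
            num[i + j] -= coeff * d
    if any(num):
        raise ValueError("division was not exact")
    return quotient


@lru_cache(maxsize=None)
def cyclotomic(m):
    """Coefficients of Phi_m(x), constant term first, via the recursion above."""
    numerator = [-1] + [0] * (m - 1) + [1]  # x^m - 1
    denominator = [1]
    for d in range(1, m):
        if m % d == 0:
            denominator = poly_mul(denominator, cyclotomic(d))
    return tuple(poly_divide_exact(numerator, denominator))
```

For example, `cyclotomic(6)` returns the coefficients of $x^2 - x + 1$ and `cyclotomic(12)` those of $x^4 - x^2 + 1$.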
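This allows us to consider the Artin map $\varphi_{L/\mathbb{Q}} : I_{\mathbb{Q}}^{\mathfrak{m}} \to \mathrm{Gal}(L/\mathbb{Q}) \cong (\mathbb{Z}/m\mathbb{Z})^\times$. We know from Section 2.2 that in any abelian extension $L/K$, the Artin map takes a prime $\mathfrak{p} \in I_K^{\mathfrak{m}}$ to the Frobenius automorphism $x \mapsto x^q$ where $q = |\mathcal{O}_K/\mathfrak{p}|$. In this example $K = \mathbb{Q}$ and $L = \mathbb{Q}(\zeta_m)$ so $\mathcal{O}_K = \mathbb{Z}$ and $\mathfrak{p} = (p)$ for a prime integer $p$. The isomorphism $\mathrm{Gal}(L/\mathbb{Q}) \cong (\mathbb{Z}/m\mathbb{Z})^\times$ is exhibited by $(\sigma : \zeta_m \mapsto \zeta_m^k) \mapsto [k]$. Using this description, we can extend $\varphi_{L/\mathbb{Q}}$ to all fractional ideals. If $a = \prod p^{s_p}$ and $b = \prod r^{t_r}$ then we should have
$$\begin{aligned} \varphi_{L/\mathbb{Q}}\left(\tfrac{a}{b}\mathbb{Z}\right) &= \prod_{p \mid a} \varphi_{L/\mathbb{Q}}(p\mathbb{Z})^{s_p} \Bigl(\prod_{r \mid b} \varphi_{L/\mathbb{Q}}(r\mathbb{Z})^{t_r}\Bigr)^{-1} \\ &= \prod_{p \mid a} |\mathbb{Z}/p\mathbb{Z}|^{s_p} \Bigl(\prod_{r \mid b} |\mathbb{Z}/r\mathbb{Z}|^{t_r}\Bigr)^{-1} \\ &= \prod_{p \mid a} p^{s_p} \Bigl(\prod_{r \mid b} r^{t_r}\Bigr)^{-1} = [a][b]^{-1}. \end{aligned}$$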
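It's easy to see that the kernel of the Artin map is precisely $P_{\mathbb{Q}}(\mathfrak{m}, 1)$:
$$\begin{aligned} P_{\mathbb{Q}}(\mathfrak{m}, 1) &= \{(\alpha) \in I_{\mathbb{Q}} \mid \alpha \equiv 1 \bmod \mathfrak{m}\} \\ &= \{(a/b)\mathbb{Z} \mid a, b \in \mathbb{Z} \text{ and } a \equiv b \bmod m\} \\ &= \{(a/b)\mathbb{Z} \mid [a][b]^{-1} = 1 \in (\mathbb{Z}/m\mathbb{Z})^\times\}. \end{aligned}$$
Moreover, the Artin map in this case is clearly surjective (this will be proven in general in Section 2.5). This implies that the ray class group for $\mathfrak{m} = (m)\infty$ is isomorphic to $(\mathbb{Z}/m\mathbb{Z})^\times$.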
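The description of the Artin map for $\mathbb{Q}(\zeta_m)/\mathbb{Q}$ as $(a/b)\mathbb{Z} \mapsto [a][b]^{-1}$ can be made concrete. The sketch below is an illustration under our own naming (`artin_map`); it represents the ideal $(x)\mathbb{Z}$ by its positive generator and uses Python's three-argument `pow` with exponent $-1$ (available from Python 3.8) for the inverse modulo $m$.

```python
from fractions import Fraction
from math import gcd


def artin_map(x, m):
    """Image of the ideal (x)Z in (Z/mZ)^x under the Artin map for Q(zeta_m)/Q."""
    x = Fraction(x)
    # The ideals (x) and (-x) coincide, so work with the positive generator.
    a, b = abs(x.numerator), x.denominator
    if gcd(a, m) != 1 or gcd(b, m) != 1:
        raise ValueError("ideal is not relatively prime to m")
    return (a * pow(b, -1, m)) % m
```

For $m = 8$, the ideal $(11/3)\mathbb{Z}$ maps to 1 because $11 \equiv 3 \bmod 8$, which is the kernel condition $a \equiv b \bmod m$ described above.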
We can use Corollary 2.3.5 to get even more information out of this example. For $\mathfrak{m} = (m)\infty$, the above shows that $|C_{\mathbb{Q}}(\mathfrak{m})| = \phi(m)$. Plugging this into the ray class formula, we have
$$\phi(m) = \frac{h_K\, 2^{r_0}\, N(\mathfrak{m}_0)}{[U_K : U_{\mathfrak{m},1}]} \prod_{p \mid \mathfrak{m}_0} \left(1 - \frac{1}{N(p)}\right).$$
Notice that the numerical norm on $\mathbb{Q}$ just evaluates to the integer itself, so we can multiply $N(\mathfrak{m}_0) = m$ back into the product on the right to obtain
$$\phi(m) = \frac{h_K\, 2^{r_0}}{[U_K : U_{\mathfrak{m},1}]} \prod_{p \mid m} p^{n(p)-1}(p - 1)$$
where $n(p)$ is the exponent of $p$ in the prime factorization of $m$. This product is now easily recognized as $\phi(m)$, so we can cancel this from both sides and rearrange:
$$[U_K : U_{\mathfrak{m},1}] = h_K\, 2^{r_0}.$$
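For $K = \mathbb{Q}$ the identity above can be checked directly. In the sketch below we plug in $h_K = 1$, $r_0 = 1$ and $[U_K : U_{\mathfrak{m},1}] = 2$ (since $U_{\mathbb{Q}} = \{\pm 1\}$ and $-1$ fails the positivity condition at the infinite place), and compare the formula of Corollary 2.3.5 with a direct count of $(\mathbb{Z}/m\mathbb{Z})^\times$. The function names are our own.

```python
from fractions import Fraction
from math import gcd


def prime_factorization(m):
    """Return {prime: exponent} for a positive integer m by trial division."""
    factors = {}
    p = 2
    while p * p <= m:
        while m % p == 0:
            factors[p] = factors.get(p, 0) + 1
            m //= p
        p += 1
    if m > 1:
        factors[m] = factors.get(m, 0) + 1
    return factors


def ray_class_number_Q(m):
    """h_m from Corollary 2.3.5 for K = Q and the modulus (m) * infinity."""
    h_K, r0 = 1, 1   # Q has class number 1; the one real prime divides the modulus
    unit_index = 2   # U_Q = {1, -1}, and only 1 is positive, so U_{m,1} = {1}
    value = Fraction(h_K * 2**r0 * m, unit_index)
    for p in prime_factorization(m):
        value *= 1 - Fraction(1, p)
    return value


def euler_phi(m):
    """Count the residues in 1..m that are relatively prime to m."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)
```

Under these assumptions the two functions agree for every $m$, which is the statement $[U_K : U_{\mathfrak{m},1}] = h_K 2^{r_0} = 2$ in this case.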
In general it is a very hard problem to compute the class number of a cyclotomic field
so we end the discussion here. The study of the cyclotomic fields is closely related
to 20th Century pursuits of a proof of Fermat’s Last Theorem. For example, unique
factorization can be used to prove FLT when Q(ζm) has class number 1 but this fails
for m as small as 23. To worsen matters, the class number of Q(ζm) is not even known
for sure for m > 70, and even assuming the Generalized Riemann Hypothesis only
allows for computations [22] up to m = 163.
2.4 L-series and Dirichlet Density
In these next two sections we delve into one of my favourite topics in number theory:
Dirichlet series. At the end of Section 2.5 we will be able to prove Dirichlet’s theorem
on primes in arithmetic progression, one of the cornerstones of early analytic number
theory.
Definition. For any positive integer $m$, a Dirichlet character mod $m$ is a homomorphism $\chi : (\mathbb{Z}/m\mathbb{Z})^\times \to \mathbb{C}^\times$. It is typical to extend a character to the entire ring of integers by
$$\chi(n) = \begin{cases} \chi([n]) & \text{if } \gcd(n, m) = 1 \\ 0 & \text{if } \gcd(n, m) \neq 1. \end{cases}$$
Note that since (Z/mZ)× is a finite group for all m ∈ Z+, χ([n]) is a root of unity
for all congruence classes [n] ∈ (Z/mZ)×. In other words, a Dirichlet character is a
multiplicative homomorphism from (Z/mZ)× to the circle group S1 ⊂ C.
Example 2.4.1. The trivial character mod m, which takes every [n] ∈ (Z/mZ)× to 1
(and every other integer to 0), is called the principal Dirichlet character, denoted
χ0.
Definition. For a Dirichlet character $\chi$, we define a complex-valued function
$$L(s, \chi) = \sum_{n=1}^\infty \frac{\chi(n)}{n^s}$$
called a Dirichlet L-series.
The product form for L-series is
$$L(s, \chi) = \prod_{p \nmid m} \frac{1}{1 - \chi(p)p^{-s}}$$
which may be obtained by using unique factorization of $n$ and multiplicativity of $\chi$.
Note that both expressions for L(s, χ) converge when Re(s) > 1. The most important
and probably the most thoroughly studied example of an L-series is the Riemann zeta
function:
Example 2.4.2. The Riemann zeta function is the L-series
$$\zeta(s) = \sum_{n=1}^\infty \frac{1}{n^s} = L(s, \chi_0),$$
where $\chi_0$ is the principal Dirichlet character for $m = 1$. Notice that for any $m > 1$, $L(s, \chi_0)$ differs from $\zeta(s)$ only by factors $\frac{1}{1 - p^{-s}}$ for $p \mid m$. It is well known in analytic number theory [23] that $\zeta(s)$ extends to a meromorphic function on the half-plane $\mathrm{Re}(s) > 0$ and satisfies
$$\zeta(s) = \frac{1}{s - 1} + g(s)$$
for some holomorphic function $g(s)$ defined on $\mathrm{Re}(s) > 0$.
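The series and product forms of a Dirichlet L-series can be compared numerically. The sketch below uses the nonprincipal character mod 4 as a sample; the names `chi4`, `l_series`, `primes_up_to` and `euler_product` are our own, and the truncation points are arbitrary choices. It also uses the classical value $L(1, \chi) = \pi/4$ for this character (the Leibniz series), which is not derived in the text.

```python
import math


def chi4(n):
    """Nonprincipal Dirichlet character mod 4, extended by 0 to even integers."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1


def l_series(s, chi, terms):
    """Partial sum of the Dirichlet series for L(s, chi)."""
    return sum(chi(n) / n**s for n in range(1, terms + 1))


def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit**0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(limit + 1) if sieve[i]]


def euler_product(s, chi, limit):
    """Partial Euler product over primes p <= limit."""
    product = 1.0
    for p in primes_up_to(limit):
        product *= 1.0 / (1.0 - chi(p) / p**s)
    return product
```

At $s = 2$ the truncated sum and truncated product agree closely, and at $s = 1$ the partial sums approach a nonzero value, in line with what one expects for a nonprincipal character.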
As a result of the relation between L(s, χ) and ζ(s), we have the following analytic
properties of L-series. Because we are focusing on the algebra of number fields, we
leave out many analytic proofs but provide references for where one may find them.
Proposition 2.4.3 ([23]). If $\chi$ is a nonprincipal Dirichlet character, then $L(s, \chi)$ converges for all $\mathrm{Re}(s) > 0$ and $L(1, \chi) \neq 0$.
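Proposition 2.4.4 ([15]). For an L-series $L(s, \chi)$, define
$$s(x) = \sum_{n \leq x} \chi(n)$$
and suppose there exist real numbers $a, b > 0$ such that $|s(x)| \leq ax^b$ for all $x \geq 1$. Then
(1) For any $\varepsilon, \delta > 0$, $L(s, \chi)$ is uniformly convergent on the domain
$$D = \left\{ s \in \mathbb{C} : \mathrm{Re}(s) \geq b + \delta,\ |\mathrm{Arg}(s - b)| \leq \tfrac{\pi}{2} - \varepsilon \right\}.$$
(2) $L(s, \chi)$ is analytic on the half-plane $\mathrm{Re}(s) > b$.
(3) For all $s \in D_0 = \left\{ s \in \mathbb{C} : \mathrm{Re}(s) \geq 1,\ |\mathrm{Arg}(s - 1)| \leq \tfrac{\pi}{2} - \varepsilon \right\}$,
$$\lim_{s \to 1} (s - 1)L(s, \chi) = \lim_{x \to \infty} \frac{s(x)}{x}.$$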
We can extend the idea of Riemann’s zeta function to an arbitrary algebraic
number field in the following way.
Definition. Let $K$ be an algebraic number field and for any nonzero ideal $a \subset \mathcal{O}_K$, let $N(a)$ denote its numerical norm. Then the Dedekind zeta function for $K$ is the complex-valued function
$$\zeta_K(s) = \sum_{a \subset \mathcal{O}_K} \frac{1}{N(a)^s}.$$
Notice that when $K = \mathbb{Q}$, the zeta function is simply the Riemann zeta function. An even further generalization of $\zeta_K(s)$ is obtained by taking a modulus $m$ of $K$ and letting $k$ be a class in the ray class group $C_K(m)$, and defining
$$\zeta(s, k) = \sum_{a \in k} \frac{1}{N(a)^s}.$$
In particular when $m = 1$, $\zeta_K(s) = \sum_{k \in C(\mathcal{O}_K)} \zeta(s, k)$.
We are interested in computing the limit of $(s - 1)\zeta(s, k)$ as $s \to 1$. If we write
$$\zeta(s, k) = \sum_{a \subset \mathcal{O}_K} \frac{\chi(a)}{N(a)^s}$$
where $\chi(a) = 1$ if $a \in k$ and 0 otherwise, then $s(x)$ simply counts the number of ideals of $\mathcal{O}_K$ in $k$ with norm less than or equal to $x$. By Proposition 2.4.4,
$$\lim_{s \to 1} (s - 1)\zeta(s, k) = \lim_{x \to \infty} \frac{s(x)}{x}.$$
To evaluate the limit on the right, we require a bit more machinery.
For a lattice $L$ in an $n$-dimensional vector space $V$ (as in Section 1.7), and any bounded region $D \subset V$, let $T(\gamma)$ denote the number of points of $\gamma L_v$ in $D$, where $\gamma > 0$ is real and $L_v := v + L$ for some vector $v \in V$. Define the function $M(t) = T(t^{-1})$. Then the Euclidean volume (or Lebesgue measure) of $D$ is given by [15]
$$\mathrm{vol}(D) = \lim_{t \to \infty} \frac{M(t)}{t^n}.$$
The plan is to identify $\frac{s(x)}{x}$ with $\frac{M(t)}{t^n}$ for suitably chosen $L$, $D$ and $M(t)$. First we observe the following.
Lemma 2.4.5. Each ray class $k \in C_K(m)$ contains an integral ideal.
Proof. Since $C_K(m)$ is finite, each prime not dividing $m$ has some power in the trivial class. If $a = a_1a_2^{-1}$ is an ideal in the class $k$, where $a_1$ and $a_2$ are integral ideals, then $a_2^t$ is trivial for some $t > 1$. Thus $aa_2^t$ is an integral ideal in $k = ka_2^t$.
Now suppose $a$ is an integral ideal in $k$ with $N(a) \leq n$ for a fixed $n \in \mathbb{N}$. Then for any integral ideal $b \in k^{-1}$, $ab$ is trivial in $C_K(m)$ so $ab = (\alpha)$ for some $\alpha \in b \cap K_{m,1}$ with $N(\alpha) \leq nN(b)$. On the other hand, if we have such an $\alpha$, then $a = (\alpha)b^{-1} \in k$ has norm less than or equal to $n$. We summarize this in the following lemma.
Lemma 2.4.6. For any $n$, the value $s(n)$ is the number of principal ideals $(\alpha)$ such that $\alpha \in b \cap K_{m,1}$ and $N(\alpha) \leq nN(b)$. Furthermore, there is some $\alpha_0 \in K$ satisfying
$$\alpha_0 \equiv 1 \bmod m_0 \quad\text{and}\quad \alpha_0 \equiv 0 \bmod b$$
such that $\alpha \equiv \alpha_0 \bmod m_0b$ for every $\alpha$ counted by $s(n)$.
The existence of such an $\alpha_0$ is guaranteed by the weak approximation theorem (Section 2.3) and the fact that $b \in I^m$ implies $b \nmid m$.
Now let $\beta_1, \ldots, \beta_n$ be a basis for the ideal $m_0b$, where $n = [K : \mathbb{Q}]$. Then we may write any $\alpha$ from Lemma 2.4.6 in the form
$$\alpha = \alpha_0 + \sum_{i=1}^n a_i\beta_i.$$
Moreover, $\alpha_0 = \sum h_i\beta_i$ for some $h_i \in \mathbb{Q}$. To connect ideals with lattices once again, let $L$ be the lattice in $\mathbb{R}^n$ of points with integer coordinates, i.e. $L = \mathbb{Z}^n$. Take $v = (h_i)$ and recall the notation $L_v = v + L$. Then the map
$$L_v \longrightarrow K^*, \qquad (x_i) \longmapsto \sum x_i\beta_i$$
gives a one-to-one correspondence between points in $L_v$ and elements $\alpha \in K^*$ which satisfy Lemma 2.4.6. We also need
Lemma 2.4.7. $U_{m,1} = U_K \cap K_{m,1}$ is the direct product of a finite cyclic group with a free abelian group of rank $r + s - 1$, where $r$ and $s$ are respectively the number of real embeddings and the number of pairs of complex embeddings $K \hookrightarrow \mathbb{C}$.
Proof. Recall from Section 1.10 the map $L : U_K \to \mathbb{R}^{r+s}$. We used $L$ to embed $U_K$ as an $(r + s - 1)$-dimensional lattice in $\mathbb{R}^{r+s}$. Corollary 2.3.5 says that $U_K/U_{m,1}$ is a finite group, which implies that $L(U_{m,1})$ has finite index in $L(U_K)$ and so it too is an $(r + s - 1)$-dimensional lattice.
From this, we have
Lemma 2.4.8 ([15]). Let $w_m$ denote the number of roots of unity in $U_{m,1}$. Then there are exactly $w_m \cdot s(n)$ points $(x_1, \ldots, x_n) \in L_v$ which satisfy
(1) $\alpha = \sum_{i=1}^n x_i\beta_i$.
(2) $\alpha \equiv 1 \bmod m_\infty$.
(3) $0 < N(\alpha) \leq nN(b)$.
(4) $L(\alpha) = c_0w_0 + \sum_{i=1}^n c_iw_i$, where $0 \leq c_i < 1$, $w_0 = (\overbrace{1, \ldots, 1}^{r}, \overbrace{2, \ldots, 2}^{s})$ and $w_i = L(u_i)$, the images of the generators of the unit group $U_{m,1}$.
Proof sketch. We know there are $s(n)$ principal ideals $(\alpha)$ satisfying (2) and (3) by Lemma 2.4.6. Each ideal $(\alpha)$ may be generated by any $\alpha' = u\alpha$, where $u \in U_{m,1}$. Out of all these elements, exactly $w_m$ satisfy (4). Finally, the map $L : U_K \to \mathbb{R}^{r+s}$ restricted to $U_{m,1}$ provides the connection between these ideals and points in $L_v$.
Now let $D$ be the set of all points $(x_1, \ldots, x_n) \in \mathbb{R}^n$ satisfying Lemma 2.4.8 such that each $x_i \geq 0$. We skip straight to the statement of the volume; see section IV.2 of [15] to see how it is derived.
Proposition 2.4.9. As before, let $r_0$ be the number of real primes dividing a modulus $m$. For $D$ defined above,
$$\mathrm{vol}(D) = \frac{2^{r-r_0}\, \mathrm{reg}(m)\, (2\pi)^s}{N(m_0b)\sqrt{|d_K|}}$$
where $\mathrm{reg}(m)$ is the regulator for $U_{m,1}$, as defined in Section 1.10.
Recall that $\mathrm{reg}(m)$ is the determinant of the matrix whose $i$th row is $L(u_i)$. Above we defined $r_0$ to be the number of real primes dividing $m_\infty$. We can extend the norm to any modulus by setting $N(m_\infty) = 2^{r_0}$, so that $N(m) = 2^{r_0}N(m_0)$. This leads to the main result.
Theorem 2.4.10. Let $K$ be a number field, $m$ a modulus of $K$ and $k$ a class of ideals in $C_K(m)$. Then
$$\lim_{s \to 1} (s - 1)\zeta(s, k) = \frac{2^r(2\pi)^s\, \mathrm{reg}(m)}{N(m)\, w_m \sqrt{|d_K|}}$$
where $r$ is the number of real primes of $K$, $s$ is the number of pairs of complex primes of $K$ and $w_m$ is the number of roots of unity in $U_{m,1}$.
Corollary 2.4.11. Let $\zeta_K(s)$ be the Dedekind zeta function for a number field $K$. Then
$$\lim_{s \to 1} (s - 1)\zeta_K(s) = \frac{2^r(2\pi)^s\, \mathrm{reg}(K)}{w_K\sqrt{|d_K|}}\, h_K$$
where $w_K = |\mu(K)|$ and $h_K$ is the class number.
Proof. Remember that $\zeta_K(s)$ coincides with the sum of all the $\zeta(s, k)$ for $m = 1$, i.e. $k$ are the distinct ideal classes in $C(\mathcal{O}_K)$. Taking the sum of the formula in Theorem 2.4.10 over all $k \in C(\mathcal{O}_K)$ gives the result.
Example 2.4.12. In the case when $K = \mathbb{Q}$, the Riemann zeta function has a simple pole at $s = 1$ since by Corollary 2.4.11,
$$\lim_{s \to 1} (s - 1)\zeta(s) = 1.$$
This is a well-known fact about the Riemann zeta function; however, our work on $\zeta_K(s)$ gives us a simple proof. What's more, the Dedekind zeta function for any number field can be analytically continued to the whole complex plane except for a simple pole at $s = 1$. For more information on this, see section 5.1 in [3].
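As a second sanity check of Corollary 2.4.11, consider $K = \mathbb{Q}(i)$. The sketch below assumes the standard invariants $r = 0$, $s = 1$, $\mathrm{reg}(K) = 1$, $w_K = 4$, $d_K = -4$ and $h_K = 1$, none of which are computed in this section, and compares the formula with the ideal-counting limit $s(x)/x$ from Proposition 2.4.4. Since $\mathbb{Z}[i]$ is a PID with four units, ideals of norm at most $x$ correspond to nonzero lattice points in the disk of radius $\sqrt{x}$, counted up to the four units. The function names are our own.

```python
import math


def ideals_of_norm_at_most(x):
    """Count nonzero ideals of Z[i] with norm <= x."""
    bound = math.isqrt(x)
    points = sum(1 for a in range(-bound, bound + 1)
                 for b in range(-bound, bound + 1)
                 if 0 < a * a + b * b <= x)
    # Each ideal (alpha) has exactly four generators: alpha times 1, -1, i, -i.
    return points // 4


def residue_from_formula():
    """Right-hand side of Corollary 2.4.11 with the assumed invariants of Q(i)."""
    r, s = 0, 1
    reg, w, d, h = 1, 4, -4, 1
    return 2**r * (2 * math.pi)**s * reg * h / (w * math.sqrt(abs(d)))
```

The formula evaluates to $\pi/4$, and `ideals_of_norm_at_most(x) / x` approaches the same value as `x` grows.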
Next we extend L-series to arbitrary number fields in a similar fashion to what we did with zeta functions. Let $m$ be a modulus of $K$ and let $\chi$ be any multiplicative function $\chi : C_K(m) \to \mathbb{C}^\times$. We extend $\chi$ to a character on all of $I^m$ by defining $\chi(a)$ for an ideal $a \in I^m$ to be the value of $\chi$ at the ideal class $[a]$ in $C_K(m)$.
Definition. The L-series for $\chi$ is
$$L(s, \chi) = \sum_a \frac{\chi(a)}{N(a)^s}$$
where the sum is taken over all integral ideals $a \in I^m$, i.e. all integral ideals relatively prime to $m$.
Note that since $\chi(a)$ only depends on $k = [a]$, we may express $L(s, \chi)$ in terms of zeta functions as we did with the Dedekind zeta function:
$$L(s, \chi) = \sum_{k \in C_K(m)} \chi(k)\zeta(s, k).$$
As with L-series over the rational field, we have
Proposition 2.4.13 (Product Formula). Fix a modulus $m$ of a number field $K$. For all $s \in \mathbb{C}$ with $\mathrm{Re}(s) > 1$ and for any character $\chi : I^m \to \mathbb{C}^\times$, $L(s, \chi)$ may be expressed as the uniform limit of the product
$$L(s, \chi) = \prod_{p \nmid m} \left(1 - \frac{\chi(p)}{N(p)^s}\right)^{-1}.$$
Proof. Let $p$ be any prime ideal in $\mathcal{O}_K$. Then the series
$$\left(1 - \frac{\chi(p)}{N(p)^s}\right)^{-1} = 1 + \frac{\chi(p)}{N(p)^s} + \frac{\chi(p^2)}{N(p^2)^s} + \frac{\chi(p^3)}{N(p^3)^s} + \ldots$$
converges absolutely. Suppose $p_1, \ldots, p_r$ are all the primes in $I^m$ with norm at most $n$; by the Minkowski bound from Section 1.7 there are finitely many of these. Then
$$\prod_{i=1}^r \left(1 - \frac{\chi(p_i)}{N(p_i)^s}\right)^{-1} = \sum \frac{\chi(p_1^{a_1} \cdots p_r^{a_r})}{N(p_1^{a_1} \cdots p_r^{a_r})^s} = \sum_{\substack{a \in I^m \\ N(a) \leq n}} \frac{\chi(a)}{N(a)^s}.$$
Rearranging the terms of the L-series, we see that
$$\left| L(s, \chi) - \prod_{N(p) \leq n} \left(1 - \frac{\chi(p)}{N(p)^s}\right)^{-1} \right| \leq \left| \sum_{N(a) > n} \frac{\chi(a)}{N(a)^s} \right|.$$
$L(s, \chi)$ converges for all $\mathrm{Re}(s) > 1$ (in fact for all $\mathrm{Re}(s) > 0$ as with L-series over $\mathbb{Q}$; see section 5.2 of [3]) so the remainder term on the right must tend to 0 as $n \to \infty$. Hence for all $\mathrm{Re}(s) > 1$,
$$L(s, \chi) = \prod_{p \nmid m} \left(1 - \frac{\chi(p)}{N(p)^s}\right)^{-1}.$$
The function $\log z$ is well-known from complex analysis. One typically restricts the argument of $z$ to $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$ for $\mathrm{Re}(z) > 0$, called the principal branch of the logarithm, and writes its series expansion, valid for $|z| < 1$, as
$$-\log(1 - z) = z + \frac{z^2}{2} + \frac{z^3}{3} + \ldots = \sum_{n=1}^\infty \frac{z^n}{n}.$$
It is also known that every L-series satisfies
$$\log L(s, \chi) = \sum_{p \in I^m} \frac{\chi(p)}{N(p)^s} + g_\chi(s)$$
for some function $g_\chi$ which is bounded on a neighborhood of $s = 1$. Consult [15] or [23] for details of these and other analytic properties of $L(s, \chi)$.
Example 2.4.14. Suppose there are only a finite number of primes $p \in \mathbb{Z}$. Then $\zeta(s) = \zeta_{\mathbb{Q}}(s)$ would have to be bounded near $s = 1$. Recall that $\lim_{s \to 1} (s - 1)\zeta(s) = 1$ by Example 2.4.12. Then $(s - 1)\zeta(s)$ is also bounded near $s = 1$. This means
$$\log(s - 1) = \log((s - 1)\zeta(s)) - \log\zeta(s)$$
is bounded near $s = 1$, which of course is impossible since $\log(s - 1) \to -\infty$ as $s \to 1$. This is a rather neat proof that there are an infinite number of rational primes using the Riemann zeta function. Moreover, we showed that
$$\log\zeta(s) \sim -\log(s - 1)$$
where $f(z) \sim g(z)$ as usual means
$$\lim_{z \to 1} |f(z) - g(z)| < \infty.$$
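The relation $\log\zeta(s) \sim -\log(s - 1)$ can be observed numerically for real $s$ slightly larger than 1. The sketch below approximates $\zeta(s)$ by a partial sum with the first Euler-Maclaurin tail correction, $\zeta(s) \approx \sum_{n \leq N} n^{-s} + \frac{N^{1-s}}{s-1} - \frac{1}{2}N^{-s}$; this approximation is a standard tool that we bring in for illustration, not something used in the text, and the function names and the cutoff $N$ are our own choices.

```python
import math


def zeta(s, terms=10000):
    """Approximate zeta(s) for real s > 1: partial sum plus a tail correction."""
    partial = sum(n**-s for n in range(1, terms + 1))
    return partial + terms**(1 - s) / (s - 1) - 0.5 * terms**-s


def log_difference(s):
    """log zeta(s) + log(s - 1), which should stay bounded as s -> 1 from above."""
    return math.log(zeta(s)) + math.log(s - 1)
```

Evaluating `log_difference` at values such as 1.1, 1.01 and 1.001 gives small numbers tending to 0, consistent with $(s - 1)\zeta(s) \to 1$.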
This generalizes in an important way.
Definition. Let $K$ be an algebraic number field and $S$ a set of prime ideals in $\mathcal{O}_K$. If there exists a real number $\delta$ such that
$$\sum_{p \in S} \frac{1}{N(p)^s} \sim -\delta\log(s - 1)$$
then $S$ is said to have Dirichlet density $\delta$, denoted $\delta(S) = \delta$.
Example 2.4.14 shows that the set of rational primes has Dirichlet density δ = 1.
In general, establishing that a set has nonzero density is important for the following
reason.
Proposition 2.4.15. For any set $S$ whose Dirichlet density $\delta(S)$ is defined, $0 \leq \delta(S) \leq 1$, and if $\delta(S) \neq 0$ then $S$ is an infinite set.
Proof. The first statement comes from the more general fact that if $T \subseteq S$ then $\delta(T) \leq \delta(S)$. This in turn is a result of the fact that $\sum_{p \in S} \frac{1}{N(p)^s}$ cannot be negative for $s \in \mathbb{R}$ sufficiently close to $s = 1$. To prove the second statement, consider the contrapositive: if $S$ is finite then
$$\sum_{p \in S} \frac{1}{N(p)^s} \sim 0.$$
This is true by definition of $\sim$ and the desired statement follows.
Consider the set S of primes p ⊂ OK having inertial degree f = 1. We call S the
set of degree 1 primes of K. In the following lemma we prove that there are infinitely
many of these primes in any number field.
Lemma 2.4.16. The set S of degree 1 primes of a number field K is an infinite set.
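Proof. Since there are only a finite number of primes that ramify in $K$, we may assume $S$ excludes these. Then $S$ consists of precisely those primes $\mathfrak{p} \subset \mathcal{O}_K$ whose norm $N(\mathfrak{p})$ is a prime integer. Then
$$\log \zeta_K(s) \sim \sum_{\mathfrak{p} \subset \mathcal{O}_K} \frac{1}{N(\mathfrak{p})^s}$$
where the $\mathfrak{p}$ are all primes in $\mathcal{O}_K$. For $\mathfrak{p} \notin S$ (again excluding ramified primes, since finite sums are bounded at $s = 1$), $N(\mathfrak{p}) = p^f \ge p^2$, where $p$ is the rational prime with $\mathfrak{p} \cap \mathbb{Z} = (p)$. At most $[K : \mathbb{Q}]$ of these $\mathfrak{p}$ have their norms equal to a power of the same prime. Therefore we bound the sum by
$$\left| \sum_{\mathfrak{p} \notin S} \frac{1}{N(\mathfrak{p})^s} \right| \le [K : \mathbb{Q}] \sum_{p \text{ prime}} \frac{1}{p^{2s}}.$$
The sum on the right is bounded at $s = 1$, so
$$\log \zeta_K(s) \sim \sum_{\mathfrak{p} \in S} \frac{1}{N(\mathfrak{p})^s}.$$
Lemma 2.4.11 now tells us that $\log\big((s-1)\zeta_K(s)\big)$ is bounded at $s = 1$, but since $\log(s-1)$ is clearly not bounded at $s = 1$, we must have
$$\sum_{\mathfrak{p} \in S} \frac{1}{N(\mathfrak{p})^s} \sim \log \zeta_K(s) \sim -\log(s-1).$$
This shows that $S$ is an infinite set; in fact, we have shown that $\delta(S) = 1$. This will be important in Section 2.5.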
We will need the next theorem in the course of proving Dirichlet’s theorem on
arithmetic progressions in Section 2.5.
Theorem 2.4.17. Let $\mathfrak{m}$ be a modulus of $K$ and take $H$ to be a subgroup with $P_K(\mathfrak{m}, 1) \le H \le I^{\mathfrak{m}}$, setting $h = [I^{\mathfrak{m}} : H]$. If $S$ is a set of primes in $H$ with density $\delta(S)$, then
$$\delta(S) \le \frac{1}{h}.$$
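Proof. First note that Corollary 2.3.5 ensures that the index $h$ will be finite. Let $\chi$ be a character defined on $I^{\mathfrak{m}}/H$; we may view $\chi$ as a homomorphism $I^{\mathfrak{m}} \to \mathbb{C}^*$ whose kernel contains $H$. Then by previous remarks,
$$\log L(s,\chi) = \sum_{\mathfrak{p} \nmid \mathfrak{m}} \frac{\chi(\mathfrak{p})}{N(\mathfrak{p})^s} + g_\chi(s)$$
for $g_\chi(s)$ convergent on $\operatorname{Re}(s) > 0$ and bounded at $s = 1$. For any $\mathfrak{p} \in I^{\mathfrak{m}}$, the sum $\sum_\chi \chi(\mathfrak{p})$ taken over all characters $\chi$ of $I^{\mathfrak{m}}/H$ is either $h$ if $\mathfrak{p} \in H$ or 0 otherwise. Then we see that
$$\sum_{\mathfrak{p} \in H} \frac{h}{N(\mathfrak{p})^s} = \sum_{\chi \ne \chi_0} \big(\log L(s,\chi) - g_\chi(s)\big) + \log\big((s-1)L(s,\chi_0)\big) - \log(s-1) - g_{\chi_0}(s).$$
We also have that
$$\sum_{\mathfrak{p} \in S} \frac{1}{N(\mathfrak{p})^s} = -\delta(S)\log(s-1) + g(s)$$
for some $g(s)$ bounded at $s = 1$. Since $S \subseteq H$, Proposition 2.4.15 implies that
$$\sum_{\mathfrak{p} \in H} \frac{1}{N(\mathfrak{p})^s} - \sum_{\mathfrak{p} \in S} \frac{1}{N(\mathfrak{p})^s} \ge 0$$
for all real $s > 1$. Hence for all such $s$,
$$-\left(\frac{1}{h} - \delta(S)\right)\log(s-1) + \sum_{\chi \ne \chi_0} \big(\log L(s,\chi) - g_\chi(s)\big) + \log\big((s-1)L(s,\chi_0)\big) - g_{\chi_0}(s) - g(s) > 0.$$
Each of the $\log L(s,\chi)$ terms is bounded at $s = 1$ unless $L(1,\chi) = 0$, in which case the term becomes negatively infinite at $s = 1$. However since we are assuming that $s$ is real and $s > 1$, $\log(s-1)$ is negative near $s = 1$. Hence for the above expression to be positive, we must have $\frac{1}{h} - \delta(S) \ge 0$, which implies $\delta(S) \le \frac{1}{h}$ as claimed.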
Our proof implies that if $\delta(S) = \frac{1}{h}$ then $L(1,\chi) \ne 0$ for any nonprincipal character $\chi$ of $I^{\mathfrak{m}}/H$. In Section 2.10 we will see that the condition $\delta(S) = \frac{1}{[I^{\mathfrak{m}} : H]}$ holds when $S$ is the set of splitting primes and use this to prove a generalization of the Frobenius density theorem for non-abelian extensions.
2.5 The Frobenius Density Theorem
In this section we prove the first main density theorem used in class field theory. In
some ways the Frobenius density theorem has been rendered obsolete by the more
powerful Cebotarev density theorem (Section 2.10), but we felt it is important to see
Frobenius’ earlier result which was intimately related to Dirichlet’s study of primes
in arithmetic progression. At the end of the section, we present a proof of Dirichlet’s
Theorem using the Frobenius density theorem.
For this section, fix a number field K, a Galois extension L/K and let G =
Gal(L/K).
Definition. Let $\sigma \in G$ be an element of order $n$. The division of $\sigma$ is the set of all elements of $G$ which are conjugate to some $\sigma^m$ where $m \in \mathbb{Z}$ is relatively prime to $n$. Equivalently, the division of $\sigma$ is the union of the conjugacy classes of all generators of the cyclic subgroup $\langle \sigma \rangle$.
Lemma 2.5.1. Let $\sigma \in G$ have order $n$, let $H = \langle \sigma \rangle$ and let $t$ be the number of elements in the division of $\sigma$. Then $t = \phi(n)[G : N_G(H)]$ where $\phi$ is Euler’s function and $N_G(H)$ denotes the normalizer of $H$.
Proof. For all $m$ relatively prime to $n = |\sigma|$, $Z_G(\sigma^m) = Z_G(\sigma)$, where $Z_G$ denotes the centralizer of an element, so each such $\sigma^m$ has exactly $[G : Z_G(\sigma)]$ conjugates. Thus as $m$ ranges over the integers relatively prime to $n$, we count $\phi(n)[G : Z_G(\sigma)]$ conjugates. However, these need not all be distinct. An element is counted $q$ times if it is conjugate to $q$ distinct powers of $\sigma$. Equivalently, $q$ counts the number of conjugates of $\sigma^m$ which are also powers of $\sigma$, i.e. $q$ is the number of distinct automorphisms of $H$ induced under the conjugation action of $G$. Thus $q = [N_G(H) : Z_G(\sigma)]$. Putting this together,
$$t = \frac{\phi(n)[G : Z_G(\sigma)]}{[N_G(H) : Z_G(\sigma)]} = \phi(n)[G : N_G(H)].$$
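The counting formula of Lemma 2.5.1 can be checked by brute force in small groups. The following Python sketch is an illustration only; all function names are our own, permutations are stored as tuples, and the test groups $S_4$ and a cyclic group of order 5 are chosen for convenience. It computes each division directly from the definition and compares its size with $\phi(n)[G : N_G(H)]$.

```python
from itertools import permutations
from math import gcd

def compose(p, q):
    """Return the permutation i -> p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def powers(sigma):
    """List [sigma^0, ..., sigma^(n-1)], where n is the order of sigma."""
    identity = tuple(range(len(sigma)))
    result, current = [identity], sigma
    while current != identity:
        result.append(current)
        current = compose(sigma, current)
    return result

def euler_phi(n):
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def conjugate(g, x):
    return compose(compose(g, x), inverse(g))

def division(sigma, group):
    """All conjugates of sigma^m with gcd(m, n) = 1, n the order of sigma."""
    pw = powers(sigma)
    n = len(pw)
    generators = [pw[m] for m in range(n) if gcd(m, n) == 1]
    return {conjugate(g, x) for g in group for x in generators}

def normalizer_size(sigma, group):
    H = set(powers(sigma))
    return sum(1 for g in group if {conjugate(g, h) for h in H} == H)

def lemma_holds(sigma, group):
    """Check t = phi(n) * [G : N_G(H)] for H = <sigma>."""
    n = len(powers(sigma))
    index = len(group) // normalizer_size(sigma, group)
    return len(division(sigma, group)) == euler_phi(n) * index

S4 = list(permutations(range(4)))
four_cycle = (1, 2, 3, 0)
five_cycle = (1, 2, 3, 4, 0)
C5 = powers(five_cycle)          # cyclic group of order 5 inside S_5
print(len(division(four_cycle, S4)), len(division(five_cycle, C5)))
```

In the cyclic group every conjugacy class is a single element, yet the division of a generator has four elements, which shows how a division can be strictly larger than a conjugacy class.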
We now state and prove the Frobenius density theorem.
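Frobenius Density Theorem. Let $\sigma \in G = \operatorname{Gal}(L/K)$, let $t$ denote the number of elements in the division of $\sigma$ and let $S$ be the set of primes $\mathfrak{p} \subset \mathcal{O}_K$ such that there is some prime $\mathfrak{P} \subset \mathcal{O}_L$ dividing $\mathfrak{p}$ whose Frobenius automorphism $\operatorname{Frob}_{L/K}(\mathfrak{P})$ is in the division of $\sigma$. Then
$$\delta(S) = \frac{t}{|G|}.$$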
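Proof. We induct on $n = |\langle \sigma \rangle|$. For the base case, $n = 1$ means $\sigma$ is the identity and $S$ is the set of primes of $K$ which split completely in $L$. Let $S^*$ denote the set of primes $\mathfrak{P} \subset \mathcal{O}_L$ dividing some prime in $S$. For each $\mathfrak{p} \in S$, there are exactly $|G| = [L : K]$ primes in $S^*$ dividing $\mathfrak{p}$, each of which has norm $N_{L/K}(\mathfrak{P}) = \mathfrak{p}$. Then
$$\sum_{\mathfrak{P} \in S^*} \frac{1}{N_{L/\mathbb{Q}}(\mathfrak{P})^s} = \sum_{\mathfrak{P} \in S^*} \frac{1}{N_{K/\mathbb{Q}}(N_{L/K}(\mathfrak{P}))^s} = |G| \sum_{\mathfrak{p} \in S} \frac{1}{N_{K/\mathbb{Q}}(\mathfrak{p})^s}.$$
Let $T$ be the set of degree 1 primes of $L$ (those having inertial degree $f = 1$ over $\mathbb{Q}$). Recall that in the proof of Lemma 2.4.16 we showed that $\delta(T) = 1$. By properties of Dirichlet density, $T \subseteq S^*$ implies that $\delta(S^*) \ge \delta(T) = 1$, so $\delta(S^*) = 1$. This combines with the above work to give us
$$\sum_{\mathfrak{p} \in S} \frac{1}{N(\mathfrak{p})^s} \sim \frac{1}{|G|}\big(-\log(s-1)\big)$$
and hence $\delta(S) = \frac{1}{|G|}$, proving the base case.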
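Now assume that $n = |\langle \sigma \rangle| > 1$. Let $H = \langle \sigma \rangle$ and $E = L^H$, the subfield of $L$ fixed by $H$. The primes $\mathfrak{p} \subset \mathcal{O}_K$ which have at least one degree 1 prime factor in $\mathcal{O}_E$ are exactly those divisible by a prime $\mathfrak{P} \subset \mathcal{O}_L$ such that $\operatorname{Frob}_{L/K}(\mathfrak{P})$ is conjugate to some power of $\sigma$. In other words $\mathfrak{p} \in S_d$ for some $d \mid n$, with $S_d$ as defined next.
For each $d \mid n$, let $t_d$ denote the size of the division of $\sigma^d$. Let $S_d$ denote the set of $\mathcal{O}_K$-primes lying under an $\mathcal{O}_L$-prime whose Frobenius automorphism lies in the division of $\sigma^d$. By induction, we have $\delta(S_d) = \frac{t_d}{|G|}$ when $d \ne 1$.
Let $S_E$ denote the primes of $E$ having inertial degree 1 over $K$. For each $\mathfrak{p} \in S_d$ let $n(\mathfrak{p})$ denote the number of primes in $S_E$ dividing $\mathfrak{p}$. Then each $\mathfrak{p} \in S_d$ is the norm of exactly $n(\mathfrak{p})$ distinct primes in $S_E$. As in the base case, $S_E$ contains all the degree 1 primes of $E$ (over $\mathbb{Q}$), so $\delta(S_E) = 1$. Therefore
$$-\log(s-1) \sim \sum_{\mathfrak{P} \in S_E} \frac{1}{N_{K/\mathbb{Q}}(N_{E/K}(\mathfrak{P}))^s} = \sum_{d \mid n} \sum_{\mathfrak{p} \in S_d} \frac{n(\mathfrak{p})}{N(\mathfrak{p})^s}.$$
Note that for any $\mathfrak{p} \in S_d$, $n(\mathfrak{p})$ is exactly the number of distinct cosets $H\tau_i$ such that $H\tau_i\sigma^d = H\tau_i$ (see section IV.4 of [15] for details). This coset equivalence occurs if and only if $\tau_i \sigma^d \tau_i^{-1} \in H$, but since $H$ is cyclic, this can only happen if $\tau_i \in N_G(\langle \sigma^d \rangle)$. Thus $n(\mathfrak{p}) = [N_G(\langle \sigma^d \rangle) : H]$ and using the inductive hypothesis, we write
$$[N_G(H) : H] \sum_{\mathfrak{p} \in S} \frac{1}{N(\mathfrak{p})^s} \sim \left(-1 + \sum_{\substack{d \mid n \\ d \ne 1}} [N_G(\langle \sigma^d \rangle) : H]\, \frac{t_d}{|G|}\right) \log(s-1).$$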
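By Lemma 2.5.1, the coefficient on the right becomes
$$\begin{aligned}
-1 + \sum_{\substack{d \mid n \\ d \ne 1}} \frac{\phi\!\left(\frac{n}{d}\right)[G : N_G(\langle \sigma^d \rangle)]\,[N_G(\langle \sigma^d \rangle) : H]}{|G|}
&= -1 + \sum_{\substack{d \mid n \\ d \ne 1}} \frac{\phi\!\left(\frac{n}{d}\right)}{|H|} \\
&= -1 + \sum_{\substack{d \mid n \\ d \ne 1}} \frac{1}{n}\,\phi\!\left(\frac{n}{d}\right) \\
&= -1 - \frac{\phi(n)}{n} + \frac{1}{n} \sum_{d \mid n} \phi\!\left(\frac{n}{d}\right).
\end{aligned}$$
A well-known property of Euler’s function states that
$$\sum_{d \mid n} \phi\!\left(\frac{n}{d}\right) = n$$
so the whole coefficient is $-1 - \frac{\phi(n)}{n} + \frac{1}{n} \cdot n = -\frac{\phi(n)}{n}$. Finally, this implies
$$\sum_{\mathfrak{p} \in S} \frac{1}{N(\mathfrak{p})^s} \sim -\frac{\phi(n)}{[N_G(H) : H]\,n} \log(s-1) = -\frac{t}{|G|} \log(s-1)$$
using Lemma 2.5.1 again. Hence $\delta(S) = \frac{t}{|G|}$.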
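The arithmetic behind the coefficient in the inductive step can be confirmed with exact rational arithmetic. The following Python sketch is an illustration only; the names `euler_phi`, `divisors` and `coefficient` are our own. It checks the divisor-sum identity for Euler's function and the simplification of the coefficient to $-\phi(n)/n$.

```python
from fractions import Fraction
from math import gcd

def euler_phi(n):
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def coefficient(n):
    """-1 + sum over d | n, d != 1, of phi(n/d)/n, as an exact fraction."""
    return -1 + sum(Fraction(euler_phi(n // d), n) for d in divisors(n) if d != 1)

for n in (1, 2, 6, 12, 30):
    print(n, sum(euler_phi(n // d) for d in divisors(n)), coefficient(n))
```

For example, with $n = 6$ the divisors $d \ne 1$ contribute $\phi(3) + \phi(2) + \phi(1) = 4$, so the coefficient is $-1 + \frac{4}{6} = -\frac{1}{3} = -\frac{\phi(6)}{6}$.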
Now we can prove an important property of the Artin map:
Corollary 2.5.2. Let $L/K$ be an abelian extension of number fields and suppose $S$ is a finite set of primes of $K$ that contains all the primes that ramify in $L$. Then the Artin map $\varphi_{L/K} : I_K^S \to \operatorname{Gal}(L/K)$ is surjective.
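Proof. Let $G = \operatorname{Gal}(L/K)$ and take $\sigma \in G$. Since $G$ is abelian, the division of $\sigma$ is precisely the set of generators of the cyclic group $\langle \sigma \rangle$. By the Frobenius density theorem, there exist infinitely many primes $\mathfrak{P} \subset \mathcal{O}_L$ such that $\operatorname{Frob}_{L/K}(\mathfrak{P})$ generates $\langle \sigma \rangle$ and so one can certainly be found outside the finite set $S$. Recall that when $L/K$ is abelian, $\varphi_{L/K}$ is well-defined on the ideals of $\mathcal{O}_K$. Thus we can find $\mathfrak{p} \subset \mathcal{O}_K$ such that $\varphi_{L/K}(\mathfrak{p}) = \sigma'$, a generator of $\langle \sigma \rangle$. Then $\sigma$ is a power of $\sigma'$ and so lies in the image of $\varphi_{L/K}$. Since $\sigma \in G$ was arbitrary, $\varphi_{L/K}$ is onto.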
Corollary 2.5.3 ([15]). Let L1 and L2 be Galois extensions of a number field K
and let S1 and S2 be the sets of primes of K which split completely in L1 and L2,
respectively. Then S1 ⊆ S2 if and only if L2 ⊆ L1.
Another important result we can prove now that we have the Frobenius density theorem is known as the first fundamental inequality of class field theory. Recall the map $i : K^* \to I_K$ that takes $\alpha \mapsto (\alpha)$. In Section 2.3 we denoted the image of $K_{\mathfrak{m},1}$ under this map by $P_K(\mathfrak{m}, 1)$; it is also common in the literature to write $i(K_{\mathfrak{m},1})$ so we will use them interchangeably.
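Theorem 2.5.4 (First Inequality). Let $L/K$ be a Galois extension of number fields, let $\mathfrak{m}$ be a modulus of $K$ and let $I_L^{\mathfrak{m}}$ denote the subgroup of $I_L$ generated by all primes $\mathfrak{P} \subset \mathcal{O}_L$ for which $\mathfrak{P} \cap K$ lies in $I_K^{\mathfrak{m}}$. Then
$$[I_K^{\mathfrak{m}} : N_{L/K}(I_L^{\mathfrak{m}})\, i(K_{\mathfrak{m},1})] \le [L : K].$$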
Proof. With finitely many exceptions, the primes that split completely in $L$ lie in $N_{L/K}(I_L^{\mathfrak{m}})$. By Frobenius density, the density of the set of these primes is
$$\frac{1}{|G|} = \frac{1}{[L : K]}$$
since it is the set of primes $\mathfrak{p}$ such that $\operatorname{Frob}_{L/K}(\mathfrak{p}\mathcal{O}_L) = 1 \in G$. Then by properties of Dirichlet density (Theorem 2.4.17),
$$\frac{1}{[L : K]} \le \frac{1}{[I_K^{\mathfrak{m}} : N_{L/K}(I_L^{\mathfrak{m}})\, i(K_{\mathfrak{m},1})]}$$
which implies the first fundamental inequality.
Under certain conditions the reverse inequality holds. This is called, as one might
expect, the second fundamental inequality of class field theory and will be discussed
in the next section.
We conclude the section with a proof of Dirichlet’s famous theorem on the infinitude of primes in arithmetic progression. We first use the Frobenius density theorem to prove a nice fact that is often hard to come by: the cyclotomic polynomials are irreducible.
Proposition 2.5.5. Let $\zeta_m$ denote a primitive $m$th root of unity. Then $[\mathbb{Q}(\zeta_m) : \mathbb{Q}] = \phi(m)$.
Proof. For $m \in \mathbb{Z}^+$, let $\mathfrak{m} = (m)\infty$ which is a modulus of $\mathbb{Q}$. Set $H = i(\mathbb{Q}_{\mathfrak{m},1}) \le I_{\mathbb{Q}}^{\mathfrak{m}}$. Then by Example 2.3.8, the set of primes in $\mathbb{Q}$ that split completely in $K = \mathbb{Q}(\zeta_m)$ is precisely the primes in $H$. The Frobenius density theorem says that the density of this set is $\frac{1}{[K : \mathbb{Q}]}$. Therefore by properties of Dirichlet density, this is at most
$$\frac{1}{[I_{\mathbb{Q}}^{\mathfrak{m}} : H]} = \frac{1}{\phi(m)}$$
which implies $[K : \mathbb{Q}] \ge \phi(m)$. On the other hand, the minimal polynomial of $\zeta_m$ over $\mathbb{Q}$, which is by definition the $m$th cyclotomic polynomial, has degree $\le \phi(m)$ since $|G| = |(\mathbb{Z}/m\mathbb{Z})^\times| = \phi(m)$. Hence we conclude that $[K : \mathbb{Q}] = \phi(m)$.
Corollary 2.5.6. For any nonprincipal character $\chi$ of the ray class group $C_{\mathbb{Q}}(\mathfrak{m})$, where $\mathfrak{m} = (m)\infty$ as above, $L(1, \chi) \ne 0$.
Proof. Apply Theorem 2.4.17 and Proposition 2.5.5 to see that
$$\sum_{\chi \ne \chi_0} \big(\log L(s,\chi) - g_\chi(s)\big) + \log\big((s-1)L(s,\chi_0)\big) - g_{\chi_0}(s) - g(s) > 0$$
since the $\log(s-1)$ term from the proof of Theorem 2.4.17 vanishes. The terms in the expression above are either all bounded at $s = 1$, or become negatively infinite when $L(1, \chi) = 0$. Since the expression must be positive, $L(1, \chi)$ must be nonzero.
The next result is the main step towards proving Dirichlet’s theorem. It is an
interesting result in its own right, since it unites the theories of L-series, Dirichlet
density and ray class groups we have studied so far.
Theorem 2.5.7. Let $k_0$ be any ray class in $C_{\mathbb{Q}}(\mathfrak{m})$, where $\mathfrak{m} = (m)\infty$. The set of primes in $k_0$ has density $\frac{1}{\phi(m)}$.
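Proof. For any character $\chi$ of $C_{\mathbb{Q}}(\mathfrak{m})$ we have
$$\log L(s,\chi) \sim \sum_{p \text{ prime}} \frac{\chi(p)}{p^s} = \sum_{k \in C_{\mathbb{Q}}(\mathfrak{m})} \chi(k) \sum_{p \in k} \frac{1}{p^s}.$$
Multiplying by $\chi(k_0^{-1})$ and summing over all characters of $C_{\mathbb{Q}}(\mathfrak{m})$ yields
$$\log L(s,\chi_0) + \sum_{\chi \ne \chi_0} \chi(k_0^{-1}) \log L(s,\chi) \sim \sum_{k} \sum_{\chi} \chi(k_0^{-1}k) \sum_{p \in k} \frac{1}{p^s}.$$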
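Note the following orthogonality relations for a finite abelian group $A$:
(1) For $\chi_1, \chi_2$ characters on $A$,
$$\sum_{a \in A} \chi_1(a)\chi_2(a) = \begin{cases} 0 & \text{if } \chi_1 \ne \chi_2^{-1} \\ |A| & \text{if } \chi_1 = \chi_2^{-1}. \end{cases}$$
(2) For any $a, b \in A$,
$$\sum_{\chi} \chi(a)\chi(b) = \begin{cases} 0 & \text{if } ab \ne 1 \\ |A| & \text{if } ab = 1. \end{cases}$$
(For details, see section IV.3 of [15].) These imply
$$\sum_{\chi} \chi(k_0^{-1}k) = \begin{cases} 0 & \text{if } k \ne k_0 \\ \phi(m) & \text{if } k = k_0 \end{cases}$$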
where the sum is over all characters $\chi$ of $C_{\mathbb{Q}}(\mathfrak{m})$. Moreover, Corollary 2.5.6 implies that the sum over nonprincipal characters is bounded at $s = 1$ since $L(1, \chi) \ne 0$ for $\chi \ne \chi_0$. Therefore
$$\log L(s,\chi_0) \sim \phi(m) \sum_{p \in k_0} \frac{1}{p^s}.$$
Recall from Section 2.4 that $L(s, \chi_0)$ differs from the Riemann zeta function $\zeta(s)$ only by finitely many Euler factors, so $\log L(s, \chi_0) \sim \log \zeta(s) \sim -\log(s-1)$. Finally this shows that
$$\sum_{p \in k_0} \frac{1}{p^s} \sim -\frac{1}{\phi(m)} \log(s-1).$$
By definition this means the Dirichlet density of the set of primes in any $k_0$ in the ray class group $C_{\mathbb{Q}}(\mathfrak{m})$ is $\frac{1}{\phi(m)}$.
Now we are prepared to state and prove the famous result.
Dirichlet’s Theorem. For each positive integer m and each integer a relatively
prime to m, there are infinitely many primes p = mb+ a.
Proof. To access our work with the Dirichlet density, we turn the problem into one involving ray classes. Suppose $p$ is a prime in the arithmetic progression $mb + a$, where $b \in \mathbb{Z}$. Then $mb + a \equiv a \pmod{m}$ implies $\frac{mb + a}{a} \in \mathbb{Q}_{\mathfrak{m},1}$, where $\mathfrak{m} = (m)\infty$ as before. This means $p$ lies in the coset $a\mathbb{Q}_{\mathfrak{m},1}$. On the other hand, if $p \in a\mathbb{Q}_{\mathfrak{m},1}$ then $p = a\frac{x}{y}$ with $x \equiv y \pmod{m}$. It follows that $x = mq + y$ for some integer $q$, and so $p = mb + a$ for some $b$. Hence each prime congruent to $a$ mod $m$ generates a prime ideal in a fixed coset of $i(\mathbb{Q}_{\mathfrak{m},1})$, which is a ray class in the ray class group $C_{\mathbb{Q}}(\mathfrak{m})$. By Theorem 2.5.7, the density of such primes is $\frac{1}{\phi(m)}$ so in particular there are infinitely many.
Remarkably, Dirichlet proved his theorem several years before Frobenius had a
proof of the density theorem. We discuss the history of these theorems at greater
length in Section 2.10 and relate everything to Cebotarev’s density theorem.
Dirichlet’s theorem has an important generalization to classes of ideals in generalized ideal class groups which we will examine in Section 2.10. The proof of that
result depends on the condition that L(1, χ) 6= 0 for any nonprincipal character χ of
the class group in question. One should note that such results are highly nontrivial,
as the nonvanishing of L-series in all cases is only guaranteed by a positive proof of
the Generalized Riemann Hypothesis.
2.6 The Second Fundamental Inequality
In Section 2.5, we proved that $N_{L/K}(I_L^{\mathfrak{m}})\, i(K_{\mathfrak{m},1})$ has index less than or equal to $[L : K]$ in $I_K^{\mathfrak{m}}$ for any modulus $\mathfrak{m}$ of $K$ (the first fundamental inequality). We have also seen (courtesy of Corollary 2.5.2) that the Artin map is surjective onto $\operatorname{Gal}(L/K)$, so $\ker \varphi_{L/K}$ has index $[L : K]$ in $I_K^{\mathfrak{m}}$. We want to show $\ker \varphi_{L/K} = N_{L/K}(I_L^{\mathfrak{m}})\, i(K_{\mathfrak{m},1})$ for all abelian extensions $L/K$ precisely when $\mathfrak{m}$ is divisible by all primes of $K$ that ramify in $L$. This is obtained via the second fundamental inequality of class field theory:
Theorem 2.6.1 (Second Inequality). For an abelian extension $L/K$, if $\mathfrak{m}$ is divisible by the primes of $K$ which ramify in $L$, then
$$[I_K^{\mathfrak{m}} : N_{L/K}(I_L^{\mathfrak{m}})\, i(K_{\mathfrak{m},1})] \ge [L : K].$$
In his formulation of the main theorems of class field theory, Takagi proved the general form of the fundamental equality. Since our approach to the Artin reciprocity theorem in Section 2.7 requires and later generalizes the cyclic case, it will suffice to prove the second fundamental inequality for cyclic extensions $L/K$.
Let L/K be a Galois extension with cyclic Galois group G = 〈σ〉. Suppose m
is a modulus of K divisible by all primes that ramify in L. We first compute some
cohomology groups (in the sense of Section A.3 of the Appendix).
Proposition 2.6.2. Let $L$, $K$ and $\mathfrak{m}$ be as above. Then
(i) $H^0(I_L^{\mathfrak{m}}) = I_K^{\mathfrak{m}}/N(I_L^{\mathfrak{m}})$.
(ii) $H^1(I_L^{\mathfrak{m}}) = 1$.
(iii) $H^0(L^*) = K^*/N(L^*)$.
(iv) $H^1(L^*) = 1$.
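Proof. (i) Let $\mathfrak{a} = \prod \mathfrak{P}_i^{a_i}$ be a fractional ideal in $I_L^{\mathfrak{m}}$ which is fixed by $\sigma$, i.e. $\mathfrak{a} \in \ker(\sigma - 1)$. Since $\sigma(\mathfrak{a}) = \mathfrak{a}$, the distinct conjugates $\sigma^j(\mathfrak{P}_i)$ of the primes dividing $\mathfrak{a}$ appear with the same exponent. If we denote $\mathfrak{p} = \mathfrak{P}_i \cap K$, then
$$\mathfrak{p}\mathcal{O}_L = \prod_{j=0}^{g-1} \sigma^j(\mathfrak{P}_i)$$
where $g$ is the smallest positive integer such that $\sigma^g(\mathfrak{P}_i) = \mathfrak{P}_i$. This demonstrates that the $\mathfrak{P}_i$ contribute precisely the factor $\mathfrak{p}^{a_i}$ to the decomposition of $\mathfrak{a}$, and since $\mathfrak{P}_i$ was arbitrary, we conclude that $\mathfrak{a} \in I_K^{\mathfrak{m}}$. Therefore $I_K^{\mathfrak{m}}$ is the subgroup of $I_L^{\mathfrak{m}}$ fixed by $G$, so
$$H^0(I_L^{\mathfrak{m}}) = (I_L^{\mathfrak{m}})^G / N(I_L^{\mathfrak{m}}) = I_K^{\mathfrak{m}}/N(I_L^{\mathfrak{m}}).$$
(ii) Now suppose $\mathfrak{a} \in \ker N$, so $N(\mathfrak{a}) = \mathcal{O}_K$. Let $\mathfrak{P}_0 \subset \mathcal{O}_L$ be a prime in the factorization of $\mathfrak{a}$ which has $g$ distinct images under the $G$-action. For $0 \le i \le g-1$, let $\mathfrak{P}_i = \sigma^i(\mathfrak{P}_0)$ and as above, let $a_i$ be the exponent of $\mathfrak{P}_i$ in $\mathfrak{a}$. Let $\mathfrak{B} = \prod_{i=0}^{g-2} \mathfrak{P}_i^{c_i}$ where for each $i$, $c_i = a_0 + \ldots + a_i$. Then we have $(\sigma - 1)\mathfrak{B} = \mathfrak{P}_0^{a_0} \mathfrak{P}_1^{a_1} \cdots \mathfrak{P}_{g-2}^{a_{g-2}} \mathfrak{P}_{g-1}^{-c_{g-2}}$. Let $\mathfrak{p}^f = N(\mathfrak{P}_0)$. Since $N(\mathfrak{a}) = 1$, we see that
$$N\left(\prod_{i=0}^{g-1} \mathfrak{P}_i^{a_i}\right) = \mathfrak{p}^{f(a_0 + \ldots + a_{g-1})} = 1.$$
Since $f \ge 1$, this shows that $a_0 + \ldots + a_{g-1} = 0$, i.e. $-c_{g-2} = a_{g-1}$. Thus $(\sigma - 1)\mathfrak{B}$ is precisely the part of $\mathfrak{a}$ contributed by the $\mathfrak{P}_i$. Since $\mathfrak{P}_0$ was arbitrary, $\mathfrak{a} \in \operatorname{im}(\sigma - 1)$ so $\ker N = \operatorname{im}(\sigma - 1)$. By definition, this proves $H^1(I_L^{\mathfrak{m}}) = 1$.
(iii) comes from the fact that $\ker(\sigma - 1)|_{L^*} = K^*$.
(iv) is just Hilbert’s Theorem 90 (Appendix A.3).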
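Definition. For a modulus $\mathfrak{m}$ of $K$ divisible by the primes ramifying in $L$, we define a $G$-module homomorphism $j_{\mathfrak{m}} : I_L \to I_L^{\mathfrak{m}}$ by
$$j_{\mathfrak{m}}(\mathfrak{P}) = \begin{cases} \mathfrak{P} & \text{if } \mathfrak{P} \nmid \mathfrak{m} \\ 1 & \text{if } \mathfrak{P} \mid \mathfrak{m}. \end{cases}$$
We further define a homomorphism $f_{\mathfrak{m}} : L^* \to I_L^{\mathfrak{m}}$ as the composite $f_{\mathfrak{m}} = j_{\mathfrak{m}} \circ i$, where $i : L^* \to I_L$ is the map $\alpha \mapsto (\alpha)$.
Let $S$ be the set of primes dividing $\mathfrak{m}$ and set $L^S = \ker f_{\mathfrak{m}}$. Then we see that
$$L^S = \{\alpha \in L^* \mid i(\alpha) \text{ is divisible only by primes in } S\}.$$
The following relates the Herbrand quotients (Appendix A.3) of $L^S$, $U_L$ and $\ker j_{\mathfrak{m}}$.
Lemma 2.6.3. If $q(U_L)$ and $q(\ker j_{\mathfrak{m}})$ are defined then $q(L^S) = q(U_L)\, q(\ker j_{\mathfrak{m}})$.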
Proof. Since $f_{\mathfrak{m}}(L^S) = j_{\mathfrak{m}} \circ i(L^S) = 1$, we get an exact sequence
$$1 \to i(L^S) \to \ker j_{\mathfrak{m}} \to C \to 1$$
for some $G$-module $C$ satisfying
$$C \cong \frac{\ker j_{\mathfrak{m}}}{i(L^S)} \cong \frac{\ker j_{\mathfrak{m}}}{i(L^*) \cap \ker j_{\mathfrak{m}}} \cong \frac{i(L^*) \ker j_{\mathfrak{m}}}{i(L^*)}.$$
Notice that $C$ is itself a subgroup of $C(\mathcal{O}_L)$, and since the class group is finite by Corollary 2.3.5, $C$ is finite as well. Therefore by Corollary A.3.3, $q(i(L^S)) = q(\ker j_{\mathfrak{m}})$. Finally, the exact sequence
$$1 \to U_L \to L^S \to i(L^S) \to 1$$
and Corollary A.3.3 can similarly be used to conclude $q(L^S) = q(U_L)\, q(i(L^S)) = q(U_L)\, q(\ker j_{\mathfrak{m}})$.
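This lemma shows that computing $q(L^S)$ comes down to finding $q(U_L)$ and $q(\ker j_{\mathfrak{m}})$. One can obtain the following results using local class field theory [15] or ideles [20].
Theorem 2.6.4. Let $r_0$ be the number of infinite primes ramifying in the extension $L/K$. Then
$$q(U_L) = \frac{[L : K]}{2^{r_0}}.$$
Theorem 2.6.5. Let $j_{\mathfrak{m}} : I_L \to I_L^{\mathfrak{m}}$ be the homomorphism defined above for a modulus $\mathfrak{m}$ of $K$ containing every prime that ramifies in $L$. Then
$$q(\ker j_{\mathfrak{m}}) = \frac{1}{\prod_{\mathfrak{p} \mid \mathfrak{m}_0} e_{\mathfrak{p}} f_{\mathfrak{p}}}$$
where the product is over all primes $\mathfrak{p}$ dividing $\mathfrak{m}_0$, the finite part of $\mathfrak{m}$, and $e_{\mathfrak{p}}$ and $f_{\mathfrak{p}}$ denote respectively the ramification index and inertial degree of $\mathfrak{p}$.
Corollary 2.6.6. Let $S$ be the set of primes which divide $\mathfrak{m}$, a modulus of $K$ containing all ramified primes of $L/K$. Then the Herbrand quotient of $L^S$ is
$$q(L^S) = \frac{[L : K]}{\prod_{\mathfrak{p} \mid \mathfrak{m}} e_{\mathfrak{p}} f_{\mathfrak{p}}}.$$
Theorem 2.6.7. For a cyclic extension $L/K$, suppose $\mathfrak{m}$ is a modulus of $K$ divisible by sufficiently high powers of the ramified primes in $L/K$. Then
$$a(\mathfrak{m}) := [K^* : N(L^*) K_{\mathfrak{m},1}] = \prod_{\mathfrak{p} \mid \mathfrak{m}} e_{\mathfrak{p}} f_{\mathfrak{p}}.$$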
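Denote the main index in the fundamental inequality by
$$h_{\mathfrak{m}}(L/K) = [I_K^{\mathfrak{m}} : N_{L/K}(I_L^{\mathfrak{m}})\, i(K_{\mathfrak{m},1})].$$
To prove Theorem 2.6.1, we will prove $h_{\mathfrak{m}}(L/K) = [L : K]$ under certain conditions on a cyclic extension $L/K$.
For the set $S$ of primes dividing $\mathfrak{m}$, the map $f_{\mathfrak{m}} = j_{\mathfrak{m}} \circ i$ gives us an exact sequence
$$1 \to L^S \to L^* \xrightarrow{\ f_{\mathfrak{m}}\ } I_L^{\mathfrak{m}} \to V \to 1$$
for some group $V$. Looking closer, this sequence contains two short exact sequences:
$$1 \to L^S \xrightarrow{\ \gamma\ } L^* \xrightarrow{\ \alpha\ } f_{\mathfrak{m}}(L^*) \to 1 \qquad (2.1)$$
and
$$1 \to f_{\mathfrak{m}}(L^*) \xrightarrow{\ \beta\ } I_L^{\mathfrak{m}} \to V \to 1. \qquad (2.2)$$
It is from these two sequences (and their cohomologies) that we derive the ingredients for the second fundamental inequality. Define
$$P = \{\alpha \in K^* \mid f_{\mathfrak{m}}(\alpha) \in N(I_L^{\mathfrak{m}})\}$$
and
$$Q = \{\alpha \in K^* \mid j_{\mathfrak{m}}(\alpha) \in N(I_L^{\mathfrak{m}})\, i(K_{\mathfrak{m},1})\}.$$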
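Consider the following commutative diagram, which is constructed using the sequences (2.1) and (2.2) above.
$$\begin{array}{ccccccccc}
 & & \dfrac{N(L^*)K_{\mathfrak{m},1}}{N(L^*)} & \xrightarrow{\ f_0^*\ } & \dfrac{N(I_L^{\mathfrak{m}})\, i(K_{\mathfrak{m},1})}{N(I_L^{\mathfrak{m}})} & \xrightarrow{\ p^*\ } & X & \to & 1 \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
\dfrac{P}{N(L^*)} & \to & \dfrac{K^*}{N(L^*)} & \xrightarrow{\ f_0\ } & \dfrac{I_K^{\mathfrak{m}}}{N(I_L^{\mathfrak{m}})} & \xrightarrow{\ p\ } & \operatorname{coker} f_0 & \to & 1 \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
\dfrac{Q}{N(L^*)K_{\mathfrak{m},1}} & \to & \dfrac{K^*}{N(L^*)K_{\mathfrak{m},1}} & \xrightarrow{\ g\ } & \dfrac{I_K^{\mathfrak{m}}}{N(I_L^{\mathfrak{m}})\, i(K_{\mathfrak{m},1})} & \xrightarrow{\ p'\ } & \operatorname{coker} g & \to & 1 \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
 & & 1 & & 1 & & 1 & &
\end{array}$$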
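Set $n(\mathfrak{m}) = [K_{\mathfrak{m}} \cap i^{-1}(N(I_L^{\mathfrak{m}})) : K_{\mathfrak{m},1} \cap N(L^*)]$. A standard diagram chase (cf. section V.4 in [15]) shows that $\operatorname{coker} f_0 \cong \operatorname{coker} g$ and $|\ker f_0| = |\ker g| \cdot n(\mathfrak{m})$. Note that
$$\ker f_0 = \frac{P}{N(L^*)} \quad \text{and} \quad \ker g = \frac{Q}{N(L^*)K_{\mathfrak{m},1}}.$$
Next we relate $\ker f_0$ and $\operatorname{coker} f_0$ to $q(L^S)$. Recall from Proposition 2.6.2 that $H^1(L^*)$ and $H^1(I_L^{\mathfrak{m}})$ are trivial. Then the exact sequences (2.1) and (2.2) from above give us exact hexagons (see Lemma A.3.1) which may be laid flat:
$$1 \to H^1(f_{\mathfrak{m}}(L^*)) \xrightarrow{\ \delta_1\ } H^0(L^S) \xrightarrow{\ \gamma_0\ } H^0(L^*) \xrightarrow{\ \alpha_0\ } H^0(f_{\mathfrak{m}}(L^*)) \xrightarrow{\ \delta_2\ } H^1(L^S) \to 1$$
$$1 \to H^1(V) \xrightarrow{\ \delta_3\ } H^0(f_{\mathfrak{m}}(L^*)) \xrightarrow{\ \beta_0\ } H^0(I_L^{\mathfrak{m}}) \xrightarrow{\ \gamma_0\ } H^0(V) \xrightarrow{\ \delta_4\ } H^1(f_{\mathfrak{m}}(L^*)) \to 1$$
In the diagram, a dashed arrow joins the two copies of $H^0(f_{\mathfrak{m}}(L^*))$ and a vertical arrow $f_0$ runs from $H^0(L^*)$ to $H^0(I_L^{\mathfrak{m}})$.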
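The dashed arrow is the identity map on $H^0(f_{\mathfrak{m}}(L^*))$, and correspondingly the vertical arrow is $f_0 = \beta_0\alpha_0$. Then
$$\begin{aligned}
|\operatorname{coker} f_0| &= [H^0(I_L^{\mathfrak{m}}) : \operatorname{im} \beta_0\alpha_0] = [H^0(I_L^{\mathfrak{m}}) : \operatorname{im} \beta_0]\,[\operatorname{im} \beta_0 : \operatorname{im} \beta_0\alpha_0] \\
&= [H^0(I_L^{\mathfrak{m}}) : \operatorname{im} \beta_0]\, \frac{[H^0(f_{\mathfrak{m}}(L^*)) : \operatorname{im} \alpha_0]}{[\ker \beta_0 : \ker \beta_0 \cap \operatorname{im} \alpha_0]} && \text{by isomorphism theorems} \\
&= |\operatorname{coker} \beta_0|\, \frac{|\operatorname{coker} \alpha_0|}{[\ker \beta_0 : \ker \beta_0 \cap \operatorname{im} \alpha_0]} \\
&= |\operatorname{im} \gamma_0|\, \frac{|\operatorname{im} \delta_2|}{[\ker \beta_0 : \ker \beta_0 \cap \operatorname{im} \alpha_0]} && \text{by exactness} \\
&= |\operatorname{im} \gamma_0|\, \frac{|H^1(L^S)|}{[\ker \beta_0 : \ker \beta_0 \cap \operatorname{im} \alpha_0]}.
\end{aligned}$$
Also note that $|H^0(V)| = |\operatorname{im} \gamma_0|\, |H^1(f_{\mathfrak{m}}(L^*))|$ by the second exact hexagon, so
$$|\operatorname{coker} f_0| = \frac{|H^0(V)|\, |H^1(L^S)|}{|H^1(f_{\mathfrak{m}}(L^*))|\, [\ker \beta_0 : \ker \beta_0 \cap \operatorname{im} \alpha_0]}.$$
In a similar fashion, we use the exact hexagons to compute $|\ker f_0|$:
$$\begin{aligned}
|\ker f_0| &= |\ker \beta_0\alpha_0| \\
&= |\ker \beta_0 \cap \operatorname{im} \alpha_0|\, |\ker \alpha_0| \\
&= |\ker \beta_0 \cap \operatorname{im} \alpha_0|\, |\operatorname{im} \gamma_0| \\
&= |\ker \beta_0 \cap \operatorname{im} \alpha_0|\, \frac{|H^0(L^S)|}{|H^1(f_{\mathfrak{m}}(L^*))|}.
\end{aligned}$$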
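Lemma 2.6.8. $q(L^S) = \dfrac{|\operatorname{coker} f_0|}{|\ker f_0|}$.
Proof. By the computations above,
$$\begin{aligned}
\frac{|\operatorname{coker} f_0|}{|\ker f_0|} &= \frac{|H^0(V)|\, |H^1(L^S)|}{|H^1(f_{\mathfrak{m}}(L^*))|\, [\ker \beta_0 : \ker \beta_0 \cap \operatorname{im} \alpha_0]} \cdot \frac{|H^1(f_{\mathfrak{m}}(L^*))|}{|\ker \beta_0 \cap \operatorname{im} \alpha_0|\, |H^0(L^S)|} \\
&= \frac{|H^1(L^S)|}{|H^0(L^S)|} \cdot \frac{|H^0(V)|}{|\ker \beta_0|} = \frac{|H^1(L^S)|}{|H^0(L^S)|} \cdot \frac{|H^0(V)|}{|H^1(V)|} = \frac{q(L^S)}{q(V)}.
\end{aligned}$$
Now, notice that since $V$ is a quotient of the class group of $L$, which by Corollary 2.3.5 is finite, $V$ is also finite. Then applying Corollary A.3.3 shows that $q(V) = 1$. The result follows.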
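We now focus on the bottom row of the big commutative diagram from above,
$$1 \to \ker g \to \frac{K^*}{N(L^*)K_{\mathfrak{m},1}} \xrightarrow{\ g\ } \frac{I_K^{\mathfrak{m}}}{N(I_L^{\mathfrak{m}})\, i(K_{\mathfrak{m},1})} \xrightarrow{\ p'\ } \operatorname{coker} g \to 1.$$
Using this and Theorem 2.6.7, we know that when $\mathfrak{m}$ is divisible by sufficiently high powers of the ramified primes in $L/K$,
$$h_{\mathfrak{m}}(L/K) = |\operatorname{im} g|\, |\operatorname{coker} g| = a(\mathfrak{m})\, \frac{|\operatorname{coker} g|}{|\ker g|}.$$
Then by Lemma 2.6.8, together with $\operatorname{coker} f_0 \cong \operatorname{coker} g$ and $|\ker f_0| = |\ker g| \cdot n(\mathfrak{m})$, this can be written
$$h_{\mathfrak{m}}(L/K) = a(\mathfrak{m})\, n(\mathfrak{m})\, \frac{|\operatorname{coker} f_0|}{|\ker f_0|} = a(\mathfrak{m})\, n(\mathfrak{m})\, q(L^S).$$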
We are now ready to prove the second inequality for cyclic extensions.
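Theorem 2.6.9 (Second Inequality for Cyclic Extensions). For $L/K$ a cyclic extension of number fields and $\mathfrak{m}$ a modulus of $K$ divisible by sufficiently high powers of the ramified primes of the extension,
$$h_{\mathfrak{m}}(L/K) = [I_K^{\mathfrak{m}} : N(I_L^{\mathfrak{m}})\, i(K_{\mathfrak{m},1})] \ge [L : K].$$
Proof. By the work directly preceding the theorem, $h_{\mathfrak{m}}(L/K) = a(\mathfrak{m})\, n(\mathfrak{m})\, q(L^S)$. The hypotheses allow us to apply Corollary 2.6.6 and Theorem 2.6.7, which say
$$q(L^S) = \frac{[L : K]}{\prod_{\mathfrak{p} \mid \mathfrak{m}} e_{\mathfrak{p}} f_{\mathfrak{p}}} \quad \text{and} \quad a(\mathfrak{m}) = \prod_{\mathfrak{p} \mid \mathfrak{m}} e_{\mathfrak{p}} f_{\mathfrak{p}}.$$
Putting these together with the expression for $h_{\mathfrak{m}}(L/K)$ yields
$$h_{\mathfrak{m}}(L/K) = n(\mathfrak{m})\,[L : K]$$
so in particular $h_{\mathfrak{m}}(L/K) \ge [L : K]$. This proves the second inequality.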
Finally, combining the results from Theorems 2.5.4 and 2.6.9 gives us the fundamental equality for cyclic extensions.
Corollary 2.6.10 (Fundamental Equality for Cyclic Extensions). Let $L/K$ be a Galois extension of number fields such that $\operatorname{Gal}(L/K)$ is cyclic. If $\mathfrak{m}$ is a modulus of $K$ that is divisible by sufficiently high powers of every prime ramifying in $L$, then
$$[I_K^{\mathfrak{m}} : N(I_L^{\mathfrak{m}})\, i(K_{\mathfrak{m},1})] = [L : K].$$
2.7 The Artin Reciprocity Theorem
Recall the subgroup $P_K(\mathfrak{m}, 1) \le I_K^{\mathfrak{m}}$ for a modulus $\mathfrak{m}$ of $K$. In Section 2.3 it was used to define the ray class group $C_K(\mathfrak{m}) = I_K^{\mathfrak{m}}/P_K(\mathfrak{m}, 1)$, and Corollary 2.3.5 showed that $P_K(\mathfrak{m}, 1)$ has finite index in $I_K^{\mathfrak{m}}$.
Definition. Let $K$ be a number field. A subgroup $H$ of the group of fractional ideals prime to a modulus $\mathfrak{m}$ of $K$ is a congruence subgroup for $\mathfrak{m}$ if $P_K(\mathfrak{m}, 1) \le H \le I_K^{\mathfrak{m}}$. The quotient $I_K^{\mathfrak{m}}/H$ is called a generalized ideal class group for $\mathfrak{m}$.
Corollary 2.3.5 implies that every congruence subgroup has finite index in $I_K^{\mathfrak{m}}$.
Example 2.7.1. Let $\mathfrak{m} = 1$ so that $I_K^{\mathfrak{m}}$ is the full group of fractional ideals $I_K$. Then $P_K = P_K(\mathfrak{m}, 1)$ is a congruence subgroup for $\mathfrak{m}$. This shows that generalized ideal class groups properly encompass the class group.
Example 2.7.2. Let $\mathcal{O}$ be the order of conductor $f$ in $K = \mathbb{Q}(\sqrt{-n})$ for $n \in \mathbb{N}$. We proved in Section 1.9 that the ideal class group for $\mathcal{O}$ can be written $C(\mathcal{O}) \cong I_K(f)/P_{K,\mathbb{Z}}(f)$ where $P_{K,\mathbb{Z}}(f)$ is the subgroup generated by principal fractional ideals $\alpha\mathcal{O}_K$ with generators satisfying $\alpha \equiv a \bmod f\mathcal{O}_K$, $a \in \mathbb{Z}$ and $(a, f) = 1$. Since $f\mathcal{O}_K$ is a modulus,
$$P_K(f\mathcal{O}_K, 1) \le P_{K,\mathbb{Z}}(f) \le I_K(f)$$
so $C(\mathcal{O})$ is a generalized ideal class group for $f\mathcal{O}_K$.
It turns out that the generalized ideal class groups are exactly the Galois groups of all abelian extensions of $K$. This correspondence is encoded in the Artin map
$$\varphi_{L/K} : I_K^{\mathfrak{m}} \to \operatorname{Gal}(L/K)$$
where $\mathfrak{m}$ is chosen so that it is divisible by every prime of $K$ that ramifies in $L$. We have seen (courtesy of Corollary 2.5.2) that the Artin map is surjective onto $\operatorname{Gal}(L/K)$, so $\ker \varphi_{L/K}$ has index $[L : K]$ in $I_K^{\mathfrak{m}}$.
The main result in this section is one of central importance in class field theory:
Artin Reciprocity Theorem. Let $L/K$ be an abelian extension of number fields with $G = \operatorname{Gal}(L/K)$. If $\mathfrak{m}$ is a modulus divisible by sufficiently high powers of every prime in $K$ that ramifies in $L$, then the Artin map
$$\varphi_{L/K} : I_K^{\mathfrak{m}} \to G$$
is surjective and $\ker \varphi_{L/K} = N_{L/K}(I_L^{\mathfrak{m}})\, i(K_{\mathfrak{m},1})$. In particular, $G$ is a generalized ideal class group for $\mathfrak{m}$.
We now focus on developing the tools to prove Artin reciprocity.
Definition. Let $L/K$ be an abelian extension of number fields and take $\mathfrak{m}$ a modulus of $K$. We say the reciprocity law holds for the triple $(L, K, \mathfrak{m})$ provided $i(K_{\mathfrak{m},1}) \subseteq \ker \varphi_{L/K}$.
The reciprocity law is important to the proof of Artin reciprocity for the following
reason.
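Lemma 2.7.3. If $\mathfrak{m}$ is divisible by all primes ramifying in $L$ and the reciprocity law holds for $(L, K, \mathfrak{m})$ then $\ker \varphi_{L/K} = N_{L/K}(I_L^{\mathfrak{m}})\, i(K_{\mathfrak{m},1})$.
Proof. By Corollary 2.2.7 we know $N_{L/K}(I_L^{\mathfrak{m}}) \subseteq \ker \varphi_{L/K}$ and so $N_{L/K}(I_L^{\mathfrak{m}})\, i(K_{\mathfrak{m},1}) \subseteq \ker \varphi_{L/K}$ as long as the reciprocity law holds. The first fundamental inequality says that
$$[I_K^{\mathfrak{m}} : N_{L/K}(I_L^{\mathfrak{m}})\, i(K_{\mathfrak{m},1})] \le [L : K],$$
but since $[I_K^{\mathfrak{m}} : \ker \varphi_{L/K}] = |\operatorname{Gal}(L/K)| = [L : K]$ by surjectivity, the subgroup $N_{L/K}(I_L^{\mathfrak{m}})\, i(K_{\mathfrak{m},1})$ is contained in $\ker \varphi_{L/K}$ and has index no larger, so we must have $N_{L/K}(I_L^{\mathfrak{m}})\, i(K_{\mathfrak{m},1}) = \ker \varphi_{L/K}$.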
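Example 2.7.4. We have previously shown (Example 2.3.8) that for a primitive $m$th root of unity $\zeta_m$ and the modulus $\mathfrak{m} = (m)\infty$, the reciprocity law holds for $(\mathbb{Q}(\zeta_m), \mathbb{Q}, \mathfrak{m})$; in fact we proved that $i(\mathbb{Q}_{\mathfrak{m},1}) = \ker \varphi_{\mathbb{Q}(\zeta_m)/\mathbb{Q}}$.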
Remark. By properties of the Artin map (Section 2.2), one can easily prove that
• If the reciprocity law holds for $(L, K, \mathfrak{m})$ and $E$ is any finite extension of $K$, then the reciprocity law holds for $(LE, E, \mathfrak{m})$.
• If the reciprocity law holds for $(L, K, \mathfrak{m})$, then it holds for $(L, K, \mathfrak{m}\mathfrak{n})$ where $\mathfrak{n}$ is any modulus of $K$.
• Combining these with the previous example, we see that for any primitive $m$th root of unity $\zeta_m$ and any modulus $\mathfrak{m}$ of $K$ divisible by $(m)\infty$, reciprocity holds for $(K(\zeta_m), K, \mathfrak{m})$.
It is clear that creating certain cyclotomic extensions of number fields is critical
to preserving the reciprocity law. This connection runs deep throughout this section,
culminating in the Kronecker-Weber Theorem at the end.
Before proving some properties of cyclotomic extensions of K, we need two results
from number theory.
Lemma 2.7.5 ([15]). Let $a, r \in \mathbb{Z}$ such that $a, r \ge 2$ and let $q$ be prime. Then there exists a prime $p$ such that $\operatorname{ord}_p(a) = q^r$.
Lemma 2.7.6 ([15]). Let $n$ be an integer with prime factorization $n = p_1^{r_1} \cdots p_s^{r_s}$. Then for any integer $a > 1$ there exist infinitely many squarefree integers $m$ such that $n \mid \operatorname{ord}_m(a)$. Furthermore, there exists an integer $b > 1$ such that $a \not\equiv b \pmod{m}$ and $n \mid \operatorname{ord}_m(b)$.
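These two lemmas are statements about multiplicative orders, and small instances can be found by direct search. The following Python sketch is an illustration only; it assumes that $\operatorname{ord}_m(a)$ denotes the order of $a$ in $(\mathbb{Z}/m\mathbb{Z})^\times$, and the function names and search limits are our own. It lists squarefree moduli $m$ for which $n$ divides $\operatorname{ord}_m(a)$.

```python
from math import gcd

def multiplicative_order(a, m):
    """Order of a in (Z/mZ)^x; requires m >= 2 and gcd(a, m) == 1."""
    if m < 2 or gcd(a, m) != 1:
        raise ValueError("a must be a unit modulo m >= 2")
    order, power = 1, a % m
    while power != 1:
        power = (power * a) % m
        order += 1
    return order

def is_squarefree(m):
    return all(m % (d * d) != 0 for d in range(2, int(m ** 0.5) + 1))

def squarefree_moduli(a, n, limit):
    """Squarefree m <= limit, coprime to a, with n dividing ord_m(a)."""
    return [m for m in range(2, limit + 1)
            if gcd(a, m) == 1 and is_squarefree(m)
            and multiplicative_order(a, m) % n == 0]

print(squarefree_moduli(2, 4, 20))   # moduli m with 4 | ord_m(2)
print(multiplicative_order(2, 5))    # an instance of Lemma 2.7.5 with q = r = 2
```

Here $\operatorname{ord}_5(2) = 4 = 2^2$ gives an instance of Lemma 2.7.5 with $a = 2$, $q = 2$, $r = 2$ and $p = 5$.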
Now let L/K be an abelian extension of number fields.
Proposition 2.7.7. Let $n = [L : K]$ and suppose $s$ is a positive integer. Take a prime $\mathfrak{p} \subset \mathcal{O}_K$ which is unramified in $L$. Then there exists a primitive $m$th root of unity $\zeta_m$, with $E = K(\zeta_m)$, such that $m$ is relatively prime to $\mathfrak{p}$ and $s$, and the following conditions are met:
(i) $L \cap E = K$.
(ii) The element $\varphi_{E/K}(\mathfrak{p})$ in $\operatorname{Gal}(E/K)$ has order divisible by $n$.
(iii) There is some element $\sigma \in \operatorname{Gal}(E/K)$ whose order is divisible by $n$ that satisfies $\langle \sigma \rangle \cap \langle \varphi_{E/K}(\mathfrak{p}) \rangle = \{1\}$.
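Proof. (i) We apply Lemma 2.7.6 to $a = N(\mathfrak{p})$. Since $L$ only has finitely many subfields, there is some $M$ such that $\mathbb{Q}(e^{2\pi i/M})$ contains every cyclotomic subfield of $L$. Lemma 2.7.6 allows us to select $m$ with no prime divisors less than $M \cdot s$. Then $\mathbb{Q}(e^{2\pi i/M}) \cap \mathbb{Q}(\zeta_m) = \mathbb{Q}$ and $L \cap \mathbb{Q}(\zeta_m) = \mathbb{Q}$. Taking $E = K(\zeta_m)$ it follows that $L \cap E = K$.
(ii) Let $\tau = \varphi_{E/K}(\mathfrak{p}) \in \operatorname{Gal}(E/K)$. By definition $\varphi_{E/K}(\mathfrak{p})$ is a Frobenius automorphism satisfying $\tau(\zeta_m) = \zeta_m^{N(\mathfrak{p})} = \zeta_m^a$. Thus $\tau$ has order divisible by $n$ by the choice of $m$.
(iii) Finally, choose $b \in \mathbb{Z}$ according to Lemma 2.7.6 and define $\sigma \in \operatorname{Gal}(E/K)$ on the primitive element of $E/K$ by $\sigma(\zeta_m) = \zeta_m^b$. Then $\sigma$ has order divisible by $n$. Since $(a, b) = 1$, it is clear that $\langle \sigma \rangle \cap \langle \tau \rangle = \{1\}$ as desired.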
Lemma 2.7.8 (Artin). Let $L/K$ be a cyclic extension and $\mathfrak{p} \subset \mathcal{O}_K$ a prime that is unramified in $L$. Then there exists an $m$th root of unity $\zeta_m$ and an extension $F/K$ such that
(1) $L \cap F = K$.
(2) $L \cap K(\zeta_m) = K$.
(3) $L(\zeta_m) = F(\zeta_m)$.
(4) $\mathfrak{p}$ splits completely in $F$.
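Proof. Choose $m$ and $\zeta = \zeta_m$ as in Proposition 2.7.7. Then $L(\zeta) = LE$ and $L \cap E = K$ (so (2) is done). This means that $\operatorname{Gal}(L(\zeta)/K) \cong \operatorname{Gal}(L/K) \times \operatorname{Gal}(E/K)$. Let $\sigma$ be a generator of $\operatorname{Gal}(L/K)$ and choose $\tau \in \operatorname{Gal}(E/K)$ according to (iii) of Proposition 2.7.7. Define $H$ to be the subgroup of $\operatorname{Gal}(L(\zeta)/K)$ generated by $(\sigma, \tau)$ and $(\varphi_{L/K}(\mathfrak{p}), \varphi_{E/K}(\mathfrak{p}))$. We claim that $F = (LE)^H$ is the desired field extension of $K$.
By Corollary 2.2.3, $\varphi_{LE/K}(\mathfrak{p}) = (\varphi_{L/K}(\mathfrak{p}), \varphi_{E/K}(\mathfrak{p}))$ which generates the decomposition group of $\mathfrak{p}$ (rather, of a prime lying over $\mathfrak{p}$) in $\operatorname{Gal}(LE/K)$, so in particular the decomposition group is contained in $H$. Since $LE/K$ is abelian, it follows that $\mathfrak{p}$ splits completely in $F = (LE)^H$, proving (4).
Next, note that $F(\zeta) = FE$ is the fixed field of $H \cap (\operatorname{Gal}(L/K) \times \{1\})$. Suppose we have an element $(\sigma, \tau)^a (\varphi_{L/K}(\mathfrak{p}), \varphi_{E/K}(\mathfrak{p}))^b$ of $H$ that lies in $\operatorname{Gal}(L/K) \times \{1\}$. Then $\tau^a \in \langle \varphi_{E/K}(\mathfrak{p}) \rangle$ so $\tau^a = 1$ since $\langle \tau \rangle \cap \langle \varphi_{E/K}(\mathfrak{p}) \rangle = 1$ by (iii) of Proposition 2.7.7. This implies $n = [L : K]$ divides $a$, and since the order of $\sigma$ is $n$ we have $\sigma^a = 1$. This further shows that $\varphi_{E/K}(\mathfrak{p})^b = 1$ and $n \mid b$ by Proposition 2.7.7. Thus $\varphi_{L/K}(\mathfrak{p})^b = 1$. All of this shows that $H \cap (\operatorname{Gal}(L/K) \times \{1\}) = \{1\}$ so $F(\zeta) = LE = L(\zeta)$, proving (3).
Finally, observe that $L \cap F$ is the subfield of $L$ fixed by $H$. Since $(\sigma, \tau) \in H$, $L \cap F$ is really the subfield fixed by $\sigma$, which is $K$. This proves (1) and we’re finished.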
We next prove an intermediate result for cyclic extensions which we will use to
prove the Artin Reciprocity Theorem for all abelian extensions.
Theorem 2.7.9. Let $L/K$ be a cyclic extension, $G = \operatorname{Gal}(L/K)$, $\mathfrak{m}$ a modulus of $K$ divisible by all primes of $\mathcal{O}_K$ that ramify in $L$. Then the reciprocity law holds for $(L, K, \mathfrak{m})$.
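Proof. By Corollary 2.6.10, the fundamental equality holds for the cyclic extension $L/K$, so it suffices to prove $\ker \varphi_{L/K} \subseteq N_{L/K}(I_L^{\mathfrak{m}})\, i(K_{\mathfrak{m},1})$. Take an ideal $\mathfrak{a} \in \ker \varphi_{L/K}$ and write its prime factorization $\mathfrak{a} = \mathfrak{p}_1^{a_1} \cdots \mathfrak{p}_r^{a_r}$. The $\mathfrak{p}_i$ are all unramified in $L$ since $\mathfrak{a} \in I_K^{\mathfrak{m}}$ and $\mathfrak{m}$ is assumed to contain all the ramified primes. For each $\mathfrak{p}_i$ we may use Artin’s Lemma to select a root of unity $\zeta_{m_i}$ such that $(m_i, m_j) = 1$ for all $i \ne j$, $i, j = 1, \ldots, r$. By Proposition 2.7.7, we can also force $K \cap \mathbb{Q}(\zeta_{m_i}) = \mathbb{Q}$ for each $i$. Define $G_i := \operatorname{Gal}(K(\zeta_{m_i})/K)$. Then $G_i \cong \operatorname{Gal}(\mathbb{Q}(\zeta_{m_i})/\mathbb{Q})$ and the automorphism group of $L(\zeta_{m_1}, \ldots, \zeta_{m_r})/K$ is $G \times G_1 \times \cdots \times G_r$.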
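Suppose $G = \langle \sigma \rangle$. For each $i$ let $\tau_i$ be the element in $G_i$ chosen via (iii) of Proposition 2.7.7. Let $H_i$ be the subgroup of $G \times G_i$ generated by the elements
$$(\sigma, \tau_i) \quad \text{and} \quad (\varphi_{L/K}(\mathfrak{p}_i), \varphi_{K(\zeta_{m_i})/K}(\mathfrak{p}_i)).$$
Furthermore, let $F_i$ be the fixed field of $H_i \times \prod_{j \ne i} G_j$ and set $F = F_1 \cdots F_r$. We take a moment to verify that $L \cap F = K$ and $\operatorname{Gal}(L/K) = \operatorname{Gal}(LF/F)$. Note that the intersection of all the $\operatorname{Gal}(LF/F_i)$ fixes $F$ and contains $(\sigma, \tau_1, \ldots, \tau_r)$. The field $L \cap F$ is also fixed by this element and by $(1, \tau_1, \ldots, \tau_r)$ so $L \cap F$ is fixed by $\sigma$ and therefore $L \cap F = K$.
Now let $\varphi_{L/K}(\mathfrak{p}_i^{a_i}) = \sigma^{d_i}$ where $d_i \ge 0$. Then $1 = \varphi_{L/K}(\mathfrak{a}) = \sigma^d$ where $d = d_1 + \ldots + d_r$ and $[L : K] \mid d$. For a sufficiently large modulus $\mathfrak{m}'$, the Artin map
$$\varphi_{LF/F} : I_F^{\mathfrak{m}'} \to \operatorname{Gal}(LF/F)$$
is surjective so there is an ideal $\mathfrak{b}_0$ relatively prime to $\mathfrak{m}$ and all the $m_i$ such that $\varphi_{LF/F}(\mathfrak{b}_0) = \sigma$. Let $\mathfrak{b} = N_{F/K}(\mathfrak{b}_0) \in I_K^{\mathfrak{m}}$. By properties of the Artin map in extensions (Proposition 2.2.6), we see that $\varphi_{L/K}(\mathfrak{b}) = \sigma$. For each $i$, $\mathfrak{p}_i$ splits completely in $F_i$ so there exists an ideal $\mathfrak{c}_i$ relatively prime to $\mathfrak{m}$ and each $m_j$ such that $N_{F_i/K}(\mathfrak{c}_i) = \mathfrak{p}_i^{a_i} \mathfrak{b}^{-d_i}$. By our choice of $d_i$,
$$\varphi_{LF_i/F_i}(\mathfrak{c}_i) = \varphi_{L/K}(N_{F_i/K}(\mathfrak{c}_i)) = 1.$$
By properties of the reciprocity law, $F_i \subset LF_i \subset F_i(\zeta_{m_i})$ and so the reciprocity law holds for $(LF_i, F_i, \mathfrak{m}')$ as long as $\mathfrak{m}'$ is divisible by $(m_i)\infty$.
We chose $\mathfrak{c}_i$ prime to the $m_i$ so we may select $\mathfrak{m}'$ so that $\mathfrak{c}_i \in I_{F_i}^{\mathfrak{m}'}$. Then there exist $\gamma_i \in F_i$, $\gamma_i \equiv 1 \bmod \mathfrak{m}'$ and an ideal $\mathfrak{d}_i \in I_{LF_i}^{\mathfrak{m}'}$ such that $\mathfrak{c}_i = (\gamma_i)\, N_{LF_i/F_i}(\mathfrak{d}_i)$. Taking $K$-norms yields
$$\mathfrak{p}_i^{a_i} \mathfrak{b}^{-d_i} = (N_{F_i/K}(\gamma_i))\, N_{LF_i/K}(\mathfrak{d}_i).$$
Selecting $\mathfrak{m}'$ so that $\mathfrak{m} \mid \mathfrak{m}'$ ensures that $\alpha_i := N_{F_i/K}(\gamma_i)$ lies in $K_{\mathfrak{m},1}$. Now taking products of the above pieces over all $i$ gives us
$$\mathfrak{a}\mathfrak{b}^{-d} = \prod_{i=1}^{r} \mathfrak{p}_i^{a_i} \mathfrak{b}^{-d_i} = \prod_{i=1}^{r} (\alpha_i) \prod_{i=1}^{r} N_{LF_i/K}(\mathfrak{d}_i).$$
Write $\mathfrak{d}_i' = N_{LF_i/L}(\mathfrak{d}_i)$. Then $\mathfrak{a} = \mathfrak{b}^d (\alpha_1 \cdots \alpha_r)\, N_{L/K}(\mathfrak{d}_1' \cdots \mathfrak{d}_r')$. Above we saw that $[L : K]$ divides $d$, so $\mathfrak{b}^d$ is a norm on $L/K$. Hence we have shown that $\mathfrak{a} \in N_{L/K}(I_L^{\mathfrak{m}})\, i(K_{\mathfrak{m},1})$ and the theorem is proved.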
A small bit of work remains to prove the main result, which we restate here.
Artin Reciprocity Theorem. Let L/K be an abelian extension with G = Gal(L/K).
Suppose $\mathfrak m$ is a modulus of K divisible by all primes in K which ramify in L and
assume their exponents are sufficiently large. Then the Artin map
$$\varphi_{L/K} : I_K^{\mathfrak m} \longrightarrow G$$
is surjective with $\ker\varphi_{L/K} = N_{L/K}(I_L^{\mathfrak m})\, i(K_{\mathfrak m,1})$.
Proof. Surjectivity was proven in Corollary 2.5.2. By the fundamental theorem of
finite abelian groups we can express G as the product of cyclic groups:
$$G = C_1 \times \cdots \times C_s.$$
Set $H_j = \prod_{i \neq j} C_i$ so that $G = C_j \times H_j$ for any j. Let $E_i$ denote the subfield of L fixed
by $H_i$. Then $E_i/K$ is a cyclic extension with Galois group $C_i$ and by Theorem 2.7.9
there is a modulus $\mathfrak m_i$ such that the reciprocity law holds for $(E_i, K, \mathfrak m_i)$. We may
choose each $\mathfrak m_i$ so that $\mathfrak m_i \mid \mathfrak m$, meaning the reciprocity law also holds for $(E_i, K, \mathfrak m)$
and thus
$$i(K_{\mathfrak m,1}) \subseteq \bigcap_{i=1}^{s} \ker\varphi_{E_i/K}.$$
By properties of the Frobenius automorphism (Proposition 2.2.6), we have
$\varphi_{L/K}(\mathfrak a)|_{E_i} = \varphi_{E_i/K}(\mathfrak a)$ for any fractional ideal $\mathfrak a$ of $\mathcal O_K$. In particular, if
$\mathfrak a \in i(K_{\mathfrak m,1})$ then $\varphi_{L/K}(\mathfrak a)|_{E_i} = 1$ for all i. But $E_1 \cdots E_s = L$ because the group that
fixes all the $E_i$ is $\bigcap_i H_i = \{1\}$.
Thus any automorphism acting trivially on all the $E_i$ is the identity on L, which gives
us $i(K_{\mathfrak m,1}) \subseteq \ker\varphi_{L/K}$. The theorem follows at once from Lemma 2.7.3.
We have therefore also proven Theorem 1.8.4 which was instrumental in construct-
ing the connection between the Hilbert class field and the class group C(OK). Here
we have proven a much stronger connection between Artin maps for a large class of
moduli and generalized ideal class groups. The full picture will become clear in Sec-
tion 2.9 when we show that the finite abelian extensions of K and generalized ideal
class groups are in correspondence.
Corollary 2.7.10 ([15]). Let L/K be abelian and suppose $\mathfrak m$ is a modulus of K such
that the reciprocity law holds for $(L, K, \mathfrak m)$. If E is a normal extension of K such that
$$N_{E/K}(I_E^{\mathfrak m}) \subseteq N_{L/K}(I_L^{\mathfrak m})\, i(K_{\mathfrak m,1})$$
then L ⊂ E.
We use this corollary to prove another important result in class field theory. One
has probably noticed by now that the roots of unity are an important tool in describ-
ing Artin reciprocity for abelian extensions. The famous Kronecker-Weber Theorem
characterizes every abelian extension of Q as a subfield of some cyclotomic field.
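Kronecker-Weber Theorem. Every abelian extension K of Q is contained in
$\mathbb Q(\zeta_m)$ for some primitive mth root of unity $\zeta_m$.
Proof. Our proof of the Artin Reciprocity Theorem shows that the reciprocity law
holds for $(K, \mathbb Q, \mathfrak m)$ for some modulus $\mathfrak m$. We may write $\mathfrak m = (m)\infty$ where m is
a positive integer. Let $\zeta_m = e^{2\pi i/m}$, a primitive mth root of unity, and consider
$L = \mathbb Q(\zeta_m)$. In Example 2.3.8 we computed the kernel of $\varphi_{L/\mathbb Q}$ to be $i(\mathbb Q_{\mathfrak m,1})$, so we
have
$$i(\mathbb Q_{\mathfrak m,1}) = N_{L/\mathbb Q}(I_L^{\mathfrak m})\, i(\mathbb Q_{\mathfrak m,1}) \subseteq N_{K/\mathbb Q}(I_K^{\mathfrak m})\, i(\mathbb Q_{\mathfrak m,1}) = \ker\varphi_{K/\mathbb Q}.$$
By Corollary 2.7.10, we conclude that $K \subset L = \mathbb Q(\zeta_m)$.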
For a proof of Kronecker-Weber that does not rely on class field theory, see the
exercises in chapter 4 of [18]. This completes our discussion of Artin reciprocity and
the Kronecker-Weber Theorem for now, although these concepts continue to crop up
in future discussions as they are integral to class field theory as a whole.
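A small numerical illustration of the Kronecker-Weber Theorem for quadratic fields may help here. It relies on a classical fact that is not proved in this text: for an odd prime p, the quadratic Gauss sum g (the sum of (a/p) ζ_p^a over a = 1, ..., p − 1) satisfies g² = p*, where p* = (−1)^((p−1)/2) p, so that Q(√p*) ⊂ Q(ζ_p). The Python sketch below only checks this identity in floating point; the function names are illustrative and not taken from the thesis.

```python
import cmath

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def gauss_sum(p):
    """Quadratic Gauss sum: sum over a = 1..p-1 of (a/p) * zeta_p^a."""
    zeta = cmath.exp(2j * cmath.pi / p)
    return sum(legendre(a, p) * zeta ** a for a in range(1, p))

def p_star(p):
    """p* = (-1)^((p-1)/2) * p; Q(sqrt(p*)) is the quadratic subfield of Q(zeta_p)."""
    return p if p % 4 == 1 else -p

# Numerically confirm g^2 = p* for a few small odd primes.
for p in (3, 5, 7, 11, 13):
    g = gauss_sum(p)
    print(p, p_star(p), abs(g * g - p_star(p)) < 1e-9)
```

Since g is by construction an element of Q(ζ_p), the identity exhibits √5 inside Q(ζ_5) and √−7 inside Q(ζ_7), for example.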
2.8 The Conductor Theorem
For an abelian extension L/K, the Artin reciprocity theorem and its corollary (2.7.10)
imply that Gal(L/K) is a generalized ideal class group for an infinite number of
moduli m, namely those divisible by the primes of K that ramify in L. There is in
fact a ‘best’ modulus for a particular extension L/K, called the conductor, which is
divisible by only those primes that ramify.
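Fix a prime $\mathfrak p \subset \mathcal O_K$ and take $\mathfrak m$ to be any modulus divisible by $\mathfrak p$. Theorem 2.3.4
gives us an exact sequence
$$0 \to (\mathcal O_K/\mathfrak p^{m(\mathfrak p)})^\times \to K_{\mathfrak m}/K_{\mathfrak m,1} \to C_K(\mathfrak m) \xrightarrow{\ \varphi_{L/K}\ } C(\mathcal O_K) \to 0,$$
where $\varphi_{L/K}$ is the Artin map for $\mathfrak m$. There is a smallest integer $f(\mathfrak p) \leq m(\mathfrak p)$ such
that this sequence factors through $(\mathcal O_K/\mathfrak p^{f(\mathfrak p)})^\times$.
Definition. Let $f(\mathfrak p)$ be as above and let $\mathfrak m_\infty$ be the modulus of all infinite primes
of K. The modulus $\mathfrak f(L/K) = \mathfrak m_\infty \prod_{\mathfrak p} \mathfrak p^{f(\mathfrak p)}$ is called the conductor of the extension
L/K. It is the smallest modulus $\mathfrak f$ such that the Artin map $\varphi_{L/K}$ factors through
$C_K(\mathfrak f)$.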
Proposition 2.8.1. If the reciprocity law holds for (L,K,m) then f(L/K) | m.
Proof. Obvious.
So far we do not know if the reciprocity law holds for f(L/K); of particular concern
is that some ramified primes might not divide the conductor. The Conductor Theorem
states that this does not happen.
The Conductor Theorem. Let L/K be abelian with conductor f = f(L/K). Then
a prime of K (finite or infinite) ramifies in L if and only if it divides f. Moreover, a
modulus m is divisible by f if and only if kerϕL/K is a congruence subgroup for m.
The proof of the conductor theorem is rather interesting, as it makes extensive use
of the local Artin map and thus establishes one of the powerful local-global connections
in class field theory. For details, consult sections V.11–12 of [15].
Proposition 2.8.2. Let $L = \mathbb Q(\zeta_m)$ where $\zeta_m$ is a primitive mth root of unity. The
conductor of L/Q is determined by
$$\mathfrak f(L/\mathbb Q) = \begin{cases} 1 & m \leq 2 \\ (n)\infty & m = 2n \text{ where } n > 1 \text{ is odd} \\ (m)\infty & \text{otherwise.} \end{cases}$$
Proof. The conductor theorem says that $\mathfrak f(L/\mathbb Q)$ is the modulus of Q divisible by
exactly those primes, finite and infinite, which ramify in L. Every modulus of L/Q
is of the form $(n)\infty$ for some integer n, so write $\mathfrak f = (n)\infty$. When m = 1, 2 the
conductor is clearly 1 since $\mathbb Q(\zeta_m) = \mathbb Q$ in both cases. When m > 2, Example 2.3.8
tells us that all ramified primes divide the modulus $\mathfrak m = (m)\infty$, so the
conductor $\mathfrak f = (n)\infty$ divides $(m)\infty$, that is, n | m.
What's more, $\mathfrak m$ is a modulus that is divisible by every ramified prime of
both L and $M = \mathbb Q(\zeta_n)$. This implies that $\ker\varphi_{M/\mathbb Q}(\mathfrak m)$ is a subgroup of $\ker\varphi_{L/\mathbb Q}(\mathfrak m)$,
which by Corollary 2.7.10 shows that L ⊂ M. Since both extensions are Galois, we
must have that |Gal(L/Q)| divides |Gal(M/Q)|, that is, φ(m) | φ(n). It is well known
that n | m always implies φ(n) | φ(m) so in this case we see that φ(n) = φ(m). Now,
under the condition n | m, this can only happen when m and n are equal or differ by
a single factor of 2. Notice that this corresponds precisely with the second and third
lines of the formula for $\mathfrak f(L/\mathbb Q)$ given above, so we are done.
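The case distinction in Proposition 2.8.2 and the totient fact used at the end of its proof can be checked by brute force. The following Python sketch is only an illustration under stated assumptions: it returns the finite part n of the conductor (the infinite prime is present exactly when m > 2), and the helper names are hypothetical.

```python
from math import gcd

def euler_phi(n):
    """Euler's totient, computed naively."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def cyclotomic_conductor(m):
    """Finite part n of the conductor of Q(zeta_m)/Q, following Proposition 2.8.2.

    For m <= 2 the field is Q itself and the conductor is trivial.
    """
    if m <= 2:
        return 1
    if m % 2 == 0 and (m // 2) % 2 == 1:   # m = 2n with n > 1 odd
        return m // 2
    return m

def check_totient_claim(bound):
    """For n | m, verify phi(n) == phi(m) exactly when m == n or m == 2n with n odd."""
    for m in range(1, bound + 1):
        for n in range(1, m + 1):
            if m % n:
                continue
            same_phi = euler_phi(n) == euler_phi(m)
            expected = (m == n) or (m == 2 * n and n % 2 == 1)
            if same_phi != expected:
                return False
    return True

print([(m, cyclotomic_conductor(m)) for m in (1, 2, 5, 6, 10, 12)])
print(check_totient_claim(200))
```

For instance, Q(ζ_6) = Q(ζ_3), which is why the sketch reports 3 rather than 6 for m = 6.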
Example 2.8.3. Let $K = \mathbb Q(\sqrt{D})$ for a squarefree integer D. Using the definition of
conductor we have
$$\mathfrak f(K/\mathbb Q) = \begin{cases} (|d_K|) & D > 0 \\ (|d_K|)\infty & D < 0. \end{cases}$$
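To make Example 2.8.3 concrete, here is a minimal Python sketch that evaluates the formula. It assumes the standard description of the field discriminant (d_K = D when D ≡ 1 mod 4 and d_K = 4D otherwise), which is not restated in this section; the function names are illustrative.

```python
def field_discriminant(D):
    """Discriminant d_K of K = Q(sqrt(D)) for a squarefree integer D other than 0 and 1."""
    return D if D % 4 == 1 else 4 * D

def quadratic_conductor(D):
    """Conductor of Q(sqrt(D))/Q as a pair (finite part, infinite prime present?)."""
    return abs(field_discriminant(D)), D < 0

for D in (5, 2, -1, -3, -5):
    print(D, quadratic_conductor(D))
```

Python's `%` returns a nonnegative remainder for negative D, so the test `D % 4 == 1` also handles cases such as D = −3.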
2.9 The Existence and Classification Theorems
Definition. Suppose L/K is an abelian extension and m is a modulus of K. If
H = kerϕL/K is a congruence subgroup for m then L is said to be a class field of H.
The goal of class field theory is then to classify all abelian extensions by their class
groups. We will prove the Existence Theorem:
Theorem. Let m be a modulus of K and let H be a congruence subgroup for m. Then
there exists an abelian extension L ⊃ K, all of whose ramified primes divide m, such
that H is the kernel of the Artin map ϕL/K : ImK −→ Gal(L/K), that is, L is a class
field of H.
Directly constructing a class field for H is hard, so the usual approach in class
field theory texts [15] is to construct enough extensions to force the existence of L.
Lemma 2.9.1. Let m be divisible by all primes of K ramifying in L and suppose
there is a chain of subgroups
i(Km,1) ≤ H0 ≤ H1 ≤ Im
such that H0 is a congruence subgroup for an abelian extension L/K. Then H1 is a
congruence subgroup for the subfield of L fixed by the subgroup ϕL/K(H1) ≤ Gal(L/K).
Proof. Let G1 = ϕL/K(H1) and let E be the subfield of L fixed by G1. Let r :
Gal(L/K) → Gal(E/K) be the natural restriction, so that r(G1) = 1. For any
a ∈ Im, ϕE/K(a) = (r ◦ ϕL/K)(a) so in particular ϕE/K(a) = 1 when a ∈ H1. Thus
H1 ⊂ kerϕE/K .
On the other hand, since H1 is a congruence subgroup the reciprocity law holds
for (E,K,m) and so
[Im : kerϕE/K ] = [Gal(L/K) : G1] = [Im : H1].
This proves H1 = kerϕE/K and the Artin reciprocity theorem implies the rest.
Lemma 2.9.2. Let H be a congruence subgroup of K for the modulus m. To show
there exists a class field L of H, it suffices to prove this when K contains a primitive
nth root of unity, where n = [Im : H].
Proof. We create a tower
$$K = K^{(1)} \subset K^{(2)} \subset \cdots \subset K^{(r)} = K(\zeta_n)$$
where each subextension $K^{(i+1)}/K^{(i)}$ is cyclic. Now apply Lemma 2.9.1 and
Proposition V.7.2 from [15].
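This allows us to assume K contains the nth roots of unity. Let $S_1$ be a finite set
of primes of K and let
$$\mathfrak m_1 = \prod_{\mathfrak p \in S_1} \mathfrak p^{m_1(\mathfrak p)}$$
for sufficiently high powers $m_1(\mathfrak p)$. Define $S_2$ and $\mathfrak m_2$ in the same way and suppose
$S_1 \cap S_2 = \emptyset$ and that $S_1 \cup S_2$ contains all primes $\mathfrak p$ satisfying
(i) $\mathfrak p \mid n$;
(ii) $\mathfrak p \mid \infty$;
(iii) and $\mathfrak p \mid \mathfrak a_i$ where $\{\mathfrak a_i\}$ is a finite set of $\mathcal O_K$-ideals whose images cover $C(\mathcal O_K)$.
Then any ideal $\mathfrak a$ can be expressed as $\mathfrak a = \mathfrak a_i(\alpha)$ for some $\alpha \in K$ and $\mathfrak a_i$ only divisible
by primes in $S := S_1 \cup S_2$. Define the congruence subgroups
$$H_1 = i(K_{\mathfrak m_1,1})\,(I^{\mathfrak m_1})^n\, I(S_2) \quad\text{and}\quad H_2 = i(K_{\mathfrak m_2,1})\,(I^{\mathfrak m_2})^n\, I(S_1)$$
where $I(S_j)$ denotes the group generated by finite primes in $S_j$. (These are congruence
subgroups since $S_1 \cap S_2 = \emptyset$ implies $H_1 \subseteq I^{\mathfrak m_1}$ and $H_2 \subseteq I^{\mathfrak m_2}$.) Next we define two
subgroups of $K^*$:
$$W_1 = K^S K^n \cap K_{\mathfrak m_2,1} \quad\text{and}\quad W_2 = K^S K^n \cap K_{\mathfrak m_1,1}.$$
We claim that $L_1 = K(\sqrt[n]{W_1})$ and $L_2 = K(\sqrt[n]{W_2})$ are the respective class fields over
K for $H_1$ and $H_2$. This is proven in detail in section V.9 of [15]. We will end the
discussion here, since our goal is to explore the consequences of the existence theorem.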
In any case, the construction of such a class field L1 for H1 allows us to prove
The Existence Theorem. Every congruence subgroup H of K has a class field.
We consolidate the proof here.
Proof. Take a congruence subgroup H and set $[I^{\mathfrak m} : H] = n$. Lemma 2.9.2 says that
we may assume K contains the nth roots of unity. Let $S_1$ be a finite set of primes
containing all primes dividing $\mathfrak m$ and satisfying (i) – (iii) above. Let $S_2 = \emptyset$ so that
$S = S_1 \cup S_2 = S_1$. Define $\mathfrak m_1$ as above so that $\mathfrak m \mid \mathfrak m_1$. Then $H_1 = H \cap I^{\mathfrak m_1}$ and by
the above work there is an abelian extension $L_1$ with $H_1 = \ker\varphi_{L_1/K}$. Finally, by
Lemma 2.9.1 there is a subfield L of $L_1$ which is a class field for $H \supseteq H_1$.
An important corollary is the classification theorem of class field theory, which
bears a resemblance to the fundamental theorem of Galois theory. Such classification
theorems are a primary tool in many areas of modern mathematics. First, we need
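Lemma 2.9.3. Suppose $\mathfrak n$ and $\mathfrak m$ are moduli of K such that $\mathfrak n \mid \mathfrak m$. If $H^{\mathfrak n}$ is a
congruence subgroup for $\mathfrak n$ and $H^{\mathfrak m} = H^{\mathfrak n} \cap I_K^{\mathfrak m}$ then the class groups $I_K^{\mathfrak n}/H^{\mathfrak n}$ and
$I_K^{\mathfrak m}/H^{\mathfrak m}$ are isomorphic.
Proof. Since $H^{\mathfrak n}$ is a congruence subgroup for $\mathfrak n$, $I_K^{\mathfrak n} = I_K^{\mathfrak m} H^{\mathfrak n}$, so by isomorphism
theorems,
$$\frac{I_K^{\mathfrak m}}{H^{\mathfrak m}} = \frac{I_K^{\mathfrak m}}{I_K^{\mathfrak m} \cap H^{\mathfrak n}} \cong \frac{I_K^{\mathfrak m} H^{\mathfrak n}}{H^{\mathfrak n}} = \frac{I_K^{\mathfrak n}}{H^{\mathfrak n}}.$$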
The Classification Theorem. Let K be a number field. There is a one-to-one,
inclusion-reversing correspondence
$$\{\text{finite abelian extensions } L/K\} \longleftrightarrow \{\text{generalized ideal class groups of } K\}.$$
Proof. The existence theorem shows that every congruence subgroup corresponds to
an abelian extension. Conversely, let L and M be abelian extensions of K. Consider
the Artin maps $\varphi_{L/K} : I_K^{\mathfrak f(L/K)} \to \mathrm{Gal}(L/K)$ and $\varphi_{M/K} : I_K^{\mathfrak f(M/K)} \to \mathrm{Gal}(M/K)$, where
$\mathfrak f$ denotes the conductor of each extension. By the conductor theorem, $\ker\varphi_{L/K}$ and
$\ker\varphi_{M/K}$ are both congruence subgroups for K and by Lemma 2.9.3 it suffices to prove
the correspondence for these congruence subgroups. On one hand, Corollary 2.7.10
shows that if $\ker\varphi_{L/K} \subseteq \ker\varphi_{M/K}$ then M ⊂ L. On the other hand, M ⊂ L implies
that $\ker\varphi_{L/K} \subset \ker\varphi_{M/K}$ and so the correspondence is indeed one-to-one.
At this point we return to the defining property of the Hilbert class field which we
have so far neglected to justify. Take the modulus m = 1 on K and the congruence
subgroup PK = PK(m, 1) ≤ ImK = IK . By the existence theorem, there is a unique
abelian extension L/K such that the Artin map induces the isomorphism
C(OK) = IK/PK ∼= Gal(L/K).
Using this, we may now prove
Theorem 2.9.4. For a number field K, the Hilbert class field L/K is the maximal
unramified abelian extension of K.
Proof. Since m = 1, it follows that L is unramified. Let M be another unramified
abelian extension of K. By the conductor theorem, the primes of K dividing the
conductor f(M/K) are exactly those which ramify in M . There are none of these,
so f(M/K) = 1. The conductor theorem also tells us that kerϕM/K is a congruence
subgroup for m = 1. Then PK ⊂ kerϕM/K , but for the Hilbert class field L, PK =
kerϕL/K . Thus kerϕL/K ⊂ kerϕM/K . Finally Corollary 2.7.10 shows that M ⊂ L.
We have now proven in greater generality all of the main theorems from Sec-
tion 1.8. Finally, we briefly mention a nice property of the Hilbert class field which
was conjectured by Hilbert and proven by Artin and Furtwängler using the transfer
map in group theory.
Theorem 2.9.5 (Principal Ideal Theorem). If L is the Hilbert class field of K, then
every ideal a ⊂ OK becomes principal in OL.
2.10 The Cebotarev Density Theorem
In understanding the connections between the density theorems of Frobenius and
Cebotarev, it is important to study how they fit in with other related results. Frobe-
nius proved his theorem in 1880 (and finally published the result 16 years later; see
[26]), but this came several decades after Dirichlet’s more famous theorem on primes
in arithmetic progression (see Section 2.5). Although his original proof did not refer
to the idea of density, Dirichlet’s result essentially showed that for any m ∈ Z, the
density of the set
$$S = \{p \text{ prime} \mid p \equiv a \pmod m,\ (a, m) = 1\}$$
is $\delta(S) = \frac{1}{\varphi(m)}$. Frobenius successfully generalized this result to describe the splitting
behavior of monic polynomials f over Fp, where p is a prime not dividing the dis-
criminant D(f). In loose terms, Frobenius’ result showed that the number of primes
p such that f has a given decomposition over Fp is proportional to the number of au-
tomorphisms σ ∈ Gal(K/Q) with the same cycle type as this decomposition, where
K is a splitting field of f over the rationals. We illustrate this with an example.
Example 2.10.1. Let $f = x^4 - x - 1$. Some decomposition patterns of f over finite
fields are shown below.
$$f \equiv (x^3 + 3x^2 + 2x + 5)(x + 4) \pmod 7$$
$$f \equiv x^4 - x - 1 \pmod{47}$$
$$f \equiv (x^2 + 34x + 24)(x^2 + 67x + 21) \pmod{101}.$$
(These factorizations are easy to produce with the MAGMA code found in
Appendix A.4.) It turns out [26] that f factors into the different decompositions
(partitions of n = 4) with the following approximate frequencies:
decomposition    proportion of primes
4                1/4
3,1              1/3
2,2              1/8
2,1,1            1/4
1,1,1,1          1/24
For example, the prime 7 falls into the set $C_{1,3} = \{p \text{ prime} \mid f = g h_3 \pmod p\}$, while
$47 \in C_4$ and $101 \in C_{2,2}$. Correspondingly, Frobenius’ theorem says that the number
of automorphisms σ ∈ G = Gal(K/Q) with cycle type 4 is $|G|/4$; likewise, the number
of σ with cycle type 1,3 is $|G|/3$; the number with cycle type 2,2 is $|G|/8$; and so forth.
In every case, the identity automorphism is the only element of G with cycle type
1,1,1,1, which tells us that |G| = 24 and we can go back and compute the number of
elements of each cycle type accordingly.
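The decomposition patterns in Example 2.10.1 can also be reproduced without a computer algebra system. The Python sketch below uses distinct-degree factorization over F_p: the irreducible factors of degree d of a squarefree f are collected by gcd(f, x^(p^d) − x). This is a substitute for the MAGMA code in Appendix A.4, not a transcription of it; polynomials are coefficient lists with the constant term first, and it assumes p does not divide the discriminant of f (which is −283 here). All function names are illustrative.

```python
def trim(a, p):
    """Reduce coefficients mod p and drop leading zeros (constant term first)."""
    a = [c % p for c in a]
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_divmod(a, b, p):
    """Quotient and remainder of a by b in F_p[x]; b must be nonzero mod p."""
    a, b = trim(a, p), trim(b, p)
    inv = pow(b[-1], -1, p)
    q = [0] * max(len(a) - len(b) + 1, 0)
    while len(a) >= len(b):
        shift, coef = len(a) - len(b), a[-1] * inv % p
        q[shift] = coef
        for i, c in enumerate(b):
            a[i + shift] = (a[i + shift] - coef * c) % p
        a = trim(a, p)
    return trim(q, p), a

def poly_mulmod(a, b, f, p):
    """Product a*b reduced modulo f in F_p[x]."""
    prod = [0] * max(len(a) + len(b) - 1, 0)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] = (prod[i + j] + x * y) % p
    return poly_divmod(prod, f, p)[1]

def poly_gcd(a, b, p):
    """A gcd of a and b in F_p[x] (not normalized to be monic)."""
    a, b = trim(a, p), trim(b, p)
    while b:
        a, b = b, poly_divmod(a, b, p)[1]
    return a

def frobenius_step(h, f, p):
    """Return h^p mod f by square-and-multiply."""
    result, base, e = [1], poly_divmod(h, f, p)[1], p
    while e:
        if e & 1:
            result = poly_mulmod(result, base, f, p)
        base = poly_mulmod(base, base, f, p)
        e >>= 1
    return result

def factor_degrees(f, p):
    """Sorted degrees of the irreducible factors of f mod p (f squarefree mod p)."""
    f = trim(f, p)
    degrees, h, d = [], [0, 1], 0
    while len(f) > 1:
        d += 1
        if 2 * d > len(f) - 1:            # what remains must be irreducible
            degrees.append(len(f) - 1)
            break
        h = frobenius_step(h, f, p)       # h = x^(p^d) mod f
        diff = list(h) + [0] * max(2 - len(h), 0)
        diff[1] -= 1                      # diff = h - x
        g = poly_gcd(f, diff, p)          # product of the degree-d factors
        count = (len(g) - 1) // d
        degrees += [d] * count
        if count:
            f = poly_divmod(f, g, p)[0]
    return tuple(sorted(degrees))

F = [-1, -1, 0, 0, 1]                     # x^4 - x - 1
for p in (7, 47, 101):
    print(p, factor_degrees(F, p))
```

Tallying `factor_degrees(F, p)` over many primes p gives empirical frequencies that can be compared with the table above.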
So far we have seen that for a field K/Q, classes of primes are in a certain cor-
respondence with the various cycle types of elements of the Galois group of this
extension. The natural question arising from this discussion is: given a polynomial
f and a prime p that doesn’t divide D(f), is it possible to find, in some canonical
way, an element in G with the same cycle type as the decomposition of f over Fp?
This would successfully generalize both Dirichlet’s and Frobenius’ results, and in-
deed Frobenius conjectured that it was possible. The solution was finally found by
Cebotarev after 42 years in the form of his density theorem.
For the next few theorems, we will assume K is a number field and E is a normal,
not necessarily abelian, extension of K, with Galois group G = Gal(E/K).
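Let $\mathfrak m$ be a modulus divisible by sufficiently high powers of all the primes of K
which ramify in E. Then the group $H^{\mathfrak m}(E/K) := N_{E/K}(I_E^{\mathfrak m})\, i(K_{\mathfrak m,1})$ is a congruence
subgroup for $\mathfrak m$ and so the Existence Theorem tells us there is a (unique) abelian
extension L/K that is the class field for $H^{\mathfrak m}(E/K)$. We may ‘enlarge’ $\mathfrak m$ by forming a
modulus $\mathfrak n$ such that $\mathfrak m \mid \mathfrak n$ and $N_{E/K}(I_E^{\mathfrak n}) \subseteq H^{\mathfrak n}(L/K)$. By Corollary 2.7.10, L ⊂ E
so we may as well use $\mathfrak m$ after all. This tells us that $H^{\mathfrak m}(E/K) = H^{\mathfrak m}(L/K)$ and
moreover,
$$I_K^{\mathfrak m}/H^{\mathfrak m}(E/K) = I_K^{\mathfrak m}/H^{\mathfrak m}(L/K) \cong \mathrm{Gal}(L/K).$$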
To identify Hm(E/K) with Gal(E/K), we prove the following theorem which also
serves to generalize the Artin map to the non-abelian case.
Theorem 2.10.2. L is the largest abelian subfield of E and therefore Gal(L/K) ∼=
G/G′ where G′ denotes the commutator subgroup of G.
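Proof. First suppose L ⊂ M ⊂ E where M/K is abelian. By norm properties,
$$N_{E/K}(I_E^{\mathfrak m})\, i(K_{\mathfrak m,1}) \subseteq N_{M/K}(I_M^{\mathfrak m})\, i(K_{\mathfrak m,1}) \subseteq N_{L/K}(I_L^{\mathfrak m})\, i(K_{\mathfrak m,1})$$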
but we showed that the first and last are equal, so it follows that L = M since both
are abelian. Now this tells us by the classification theorem that Gal(L/K) is the
largest possible quotient of G that is abelian. By definition this is the abelianization
of G, so Gal(L/K) ∼= G/G′.
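To describe the isomorphism, let $\mathfrak P$ be a prime in $I_E^{\mathfrak m}$ and let $\mathfrak p = \mathfrak P \cap K$. By the
proof of Theorem 1.3.3, the primes lying over $\mathfrak p$ are Galois conjugates under the action
of G and therefore $\mathfrak p$ determines a conjugacy class of the Frobenius automorphism
$\mathrm{Frob}_{E/K}(\mathfrak P)$. This means that $\mathfrak p$ determines a single element in G/G′. We define the
Artin map for non-abelian extensions to be
$$\varphi_{E/K}(\mathfrak p) := \left(\frac{E/K}{\mathfrak P}\right) G'.$$
By the work above, this extends to a homomorphism $I_K^{\mathfrak m} \to G/G'$.
To complete the description of $\varphi_{E/K}$, we compute its kernel. By Proposition 2.2.2,
$$\left.\left(\frac{E/K}{\mathfrak P}\right)\right|_{L} = \left(\frac{L/K}{\mathfrak P_L}\right) \quad\text{where } \mathfrak P_L = \mathfrak P \cap L.$$
Thus $\varphi_{E/K}(\mathfrak p) = \varphi_{L/K}(\mathfrak p) G'$ so $\ker\varphi_{L/K} \leq \ker\varphi_{E/K}$. But $\ker\varphi_{L/K} = H^{\mathfrak m}(E/K)$
which was shown to have index [G : G′] in $I_K^{\mathfrak m}$. Hence $\ker\varphi_{E/K} = H^{\mathfrak m}(E/K)$ and our
description is complete.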
Remark. The above proof and discussion shows that [ImK : Hm(E/K)] = [G : G′].
In particular, this means that for a non-abelian extension of number fields the first
fundamental inequality (Theorem 2.5.4) is strict.
As another consequence of the classification theorem, we have the following gen-
eralization of Corollary 2.5.6.
Proposition 2.10.3. Let χ be a nontrivial character of the ray class group
$C_K(\mathfrak m) = I_K^{\mathfrak m}/P_K(\mathfrak m, 1)$. Then $L(1, \chi) \neq 0$.
Proof. Let $H = P_K(\mathfrak m, 1)$. Then there is an abelian extension L/K that is the
class field of H; this is called the ray class field for the modulus $\mathfrak m$. Note that,
except for a finite number, all the primes of K which split in L are contained in H.
Thus by the Frobenius density theorem the density of this set of primes is $\frac{1}{[L:K]}$. By
the Artin reciprocity theorem, this is equal to $\frac{1}{[I_K^{\mathfrak m} : H]}$. Finally, apply the comments
following Theorem 2.4.17 to conclude that $L(1, \chi) \neq 0$ for any nontrivial character of
$I_K^{\mathfrak m}/H$.
This can be used to prove the following generalization of Dirichlet’s Theorem.
Theorem 2.10.4 (Dirichlet’s Theorem for Number Fields). Let H be a congruence
subgroup for a modulus $\mathfrak m$. Then any coset of H in $I_K^{\mathfrak m}$ contains infinitely many primes
and the density of this set of primes is $\frac{1}{[I_K^{\mathfrak m} : H]}$.
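In the classical case K = Q with H the ray group modulo m, Theorem 2.10.4 reduces to Dirichlet's statement that each invertible residue class mod m receives a proportion 1/ϕ(m) of the primes. The Python sketch below tallies primes up to a bound by residue class; it is only a numerical illustration (a finite count approximates, but does not equal, a density), and the function names are hypothetical.

```python
from collections import Counter
from math import gcd

def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]

def residue_class_counts(m, bound):
    """Count primes p <= bound with gcd(p, m) = 1, grouped by p mod m."""
    return Counter(p % m for p in primes_up_to(bound) if gcd(p, m) == 1)

for m in (4, 5, 12):
    counts = residue_class_counts(m, 10000)
    total = sum(counts.values())
    print(m, {a: round(c / total, 3) for a, c in sorted(counts.items())})
```

Each printed proportion should be close to 1/ϕ(m), that is, roughly 0.5 for m = 4 and 0.25 for m = 5 and m = 12.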
We are now ready to state and prove the main theorem of this section.
Cebotarev Density Theorem. Let L/K be a Galois extension of number fields and
suppose an element σ ∈ G = Gal(L/K) belongs to a conjugacy class C. Then the set
S of all primes $\mathfrak p \subset \mathcal O_K$ divisible by a prime $\mathfrak P \subset \mathcal O_L$ such that $\mathrm{Frob}_{L/K}(\mathfrak P) \in C$ has
density
$$\delta(S) = \frac{|C|}{[L : K]}.$$
Proof. Let E be the subfield of L fixed by the cyclic subgroup $\langle\sigma\rangle$. Then since
$\mathrm{Gal}(L/E) = \langle\sigma\rangle$, the extension L/E is abelian. Let T′ be the set of primes $\mathfrak P \subset \mathcal O_E$
with $\mathrm{Frob}_{L/E}(\mathfrak P) = \sigma$. By Theorem 2.10.4, $\delta(T') = \frac{1}{|\langle\sigma\rangle|}$. Recall that Lemma 2.4.16
says we may restrict our attention to the set T of primes in T′ with inertial degree
f(E/K) = 1, since δ(T) = δ(T′).
For any P ∈ T with p = P ∩ K, we will count the number of Pi ∈ T dividing
p. Take Q ⊂ OL lying over P such that FrobL/E(Q) = σ. Let {τi} be a transversal
of 〈σ〉 in Gal(L/K); one will recall that this means 〈σ〉τi are all the distinct cosets of
〈σ〉. By transitivity of the G-action on primes over P, the primes in L dividing p are
τi(Q) and these are distinct. Likewise the primes of E dividing p are Pj := τj(Q)∩E.
It is a property of the Frobenius automorphism (III.2.8 in [15]) that
Pj ∈ T ⇐⇒ 〈σ〉τjσ = 〈σ〉τj.
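So in particular,
$$\mathrm{Frob}_{L/E}(\mathfrak P_j) = \left(\frac{L/E}{\tau_j(\mathfrak Q)}\right) = \tau_j \left(\frac{L/E}{\mathfrak Q}\right) \tau_j^{-1} = \tau_j \sigma \tau_j^{-1}.$$
It follows that $\mathfrak P_j \in T \iff \tau_j \sigma \tau_j^{-1} = \sigma$. Since the $\tau_j$ and therefore the $\mathfrak P_j$ are distinct
(remember that $\{\tau_j\}$ is a transversal of $\langle\sigma\rangle$), the number of primes in T dividing $\mathfrak p$ is
equal to $[Z_G(\sigma) : \langle\sigma\rangle]$ where $Z_G(\sigma)$ is the centralizer of σ in G = Gal(L/K).
Now let S denote the set of $\mathcal O_K$-primes divisible by a prime in T and choose some
$\mathfrak p \in S$. There are precisely $[Z_G(\sigma) : \langle\sigma\rangle]$ primes $\mathfrak P \in T$ for which $N_{E/K}(\mathfrak P) = \mathfrak p$. This
implies that $[Z_G(\sigma) : \langle\sigma\rangle] \cdot \delta(S) = \delta(T) = \frac{1}{|\langle\sigma\rangle|}$. Finally, we conclude that
$$\delta(S) = \frac{1}{|\langle\sigma\rangle| \cdot [Z_G(\sigma) : \langle\sigma\rangle]} = \frac{1}{|Z_G(\sigma)|} = \frac{|C|}{|G|} = \frac{|C|}{[L : K]}.$$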
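The last chain of equalities uses the orbit-stabilizer identity |C| · |Z_G(σ)| = |G|. As a sanity check, the Python sketch below verifies this in the symmetric group S4 and computes the densities |C|/|G| by cycle type. It assumes, as the frequency table in Example 2.10.1 suggests, that the Galois group there is all of S4, where conjugacy classes coincide with cycle types; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

G = list(permutations(range(4)))          # the symmetric group S4 as tuples

def compose(s, t):
    """Composition (s o t)(i) = s[t[i]]."""
    return tuple(s[t[i]] for i in range(len(t)))

def inverse(s):
    """Inverse permutation of s."""
    inv = [0] * len(s)
    for i, j in enumerate(s):
        inv[j] = i
    return tuple(inv)

def cycle_type(s):
    """Cycle lengths of s, sorted in decreasing order."""
    seen, lengths = set(), []
    for i in range(len(s)):
        if i in seen:
            continue
        n, j = 0, i
        while j not in seen:
            seen.add(j)
            j = s[j]
            n += 1
        lengths.append(n)
    return tuple(sorted(lengths, reverse=True))

def conjugacy_class(s):
    """All conjugates t s t^{-1} with t in G."""
    return {compose(compose(t, s), inverse(t)) for t in G}

def centralizer(s):
    """All t in G commuting with s."""
    return [t for t in G if compose(t, s) == compose(s, t)]

def class_densities():
    """Map each cycle type to |C|/|G|, checking |C| * |Z_G(s)| = |G| along the way."""
    densities = {}
    for s in G:
        C = conjugacy_class(s)
        assert len(C) * len(centralizer(s)) == len(G)
        densities[cycle_type(s)] = Fraction(len(C), len(G))
    return densities

for ctype, density in sorted(class_densities().items(), reverse=True):
    print(ctype, density)
```

The resulting values can be compared directly with the proportions listed in Example 2.10.1.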
The Cebotarev density theorem immediately gives us the following result for
abelian extensions.
Corollary 2.10.5. Let L/K be abelian, $\mathfrak m$ a modulus of K divisible by all primes
that ramify in L, and σ ∈ Gal(L/K). Then the set S of primes $\mathfrak p \nmid \mathfrak m$ such that
$\left(\frac{L/K}{\mathfrak p}\right) = \sigma$ has density $\delta(S) = \frac{1}{[L : K]}$ and in particular S is infinite.
This corollary is similar to the conclusion in the proof of Theorem 2.5.4, and both
density theorems imply the surjectivity of the Artin map (this was originally proven in
Corollary 2.5.2). However, Cebotarev’s result implies surjectivity in a much stronger
sense, in that the density of primes in L is uniformly distributed across the collection of
sets S corresponding to conjugacy classes in G. Recall that with Frobenius’ theorem,
this density was only uniformly distributed across divisions, a much less intuitive
object to work with in the group-theoretic sense.
The Cebotarev density theorem is undoubtedly one of the most useful tools in
modern algebraic number theory, and is beginning to have practical application in
algebraic geometry. One important result for our purposes answers a question posed
back in Section 1.3.
Proposition 2.10.6. For any Galois extension L/K, there are infinitely many primes
of K that split completely in L.
Proof. Apply the Cebotarev density theorem to the conjugacy class of 1 ∈ Gal(L/K)
to see that the primes $\mathfrak p \subset \mathcal O_K$ such that $\left(\frac{L/K}{\mathfrak p}\right) = 1$ have density $\frac{1}{[L : K]}$. Then
Proposition 1.8.3 says that
$$\left(\frac{L/K}{\mathfrak p}\right) = 1 \iff \mathfrak p \text{ splits completely in } L.$$
This implies the result.
Example 2.10.7. To illustrate the differences between conjugacy class, division and
cycle type and their associated densities, consider the group G = Z/3Z×Z/3Z. The
reason is that these three types of partitions are all distinct for G, as we will see in a
moment. To apply the density theorems to G we must find a Galois extension M/Q
such that G = Gal(M/Q). We provide two computational methods of constructing
such an extension below.
The hard way is to find two extensions K/Q and L/Q of degree 3 and take their
compositum. Corollary 14.22 from [10] says if K and L are Galois extensions of
Q and K ∩ L = Q then the Galois group of their compositum is a direct product
Gal(KL/Q) ∼= Gal(K/Q) × Gal(L/Q). There are two concerns: we want M/Q to
be Galois with Gal(M/Q) ∼= Z/3Z× Z/3Z and we also want K and L to be normal
subfields of M .
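[Field diagram: Q ⊂ K = Q(α) and Q ⊂ L = Q(β), each of degree 3; M = Q(α, β) has
degree 3 over each of K and L and degree 9 over Q. Alongside, K ⊂ Q(ζ9), L ⊂ Q(ζ13)
and M ⊂ Q(ζ117).]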
By the Kronecker-Weber Theorem (Section 2.7), we can find all of these abelian
extensions within cyclotomic fields. It is a fact (cf. 14.27 in [10]) that if gcd(m, n) = 1
then $\mathrm{Gal}(\mathbb Q(\zeta_{mn})/\mathbb Q) \cong \mathrm{Gal}(\mathbb Q(\zeta_m)/\mathbb Q) \times \mathrm{Gal}(\mathbb Q(\zeta_n)/\mathbb Q)$ where $\zeta_j$ denotes a primitive
jth root of unity. For our purposes we want an integer k = mn such that gcd(m, n) =
1 and 3 divides ϕ(m) and ϕ(n); this way we can find subfields of degree 3.
Along these lines, we chose m = 9 and n = 13. We found subfields K = Q(α)
and L = Q(β), where $\alpha = \zeta_9 + \zeta_9^8$ and $\beta = \zeta_{13} + \zeta_{13}^5 + \zeta_{13}^8 + \zeta_{13}^{12}$. The previous
paragraphs ensure that M = Q(α, β) is a Galois extension of Q with Galois group
Gal(M/Q) ∼= Z/3Z × Z/3Z. All of this is verified with the following Magma code.
> A<x> := PolynomialRing(Integers());
> C<z> := CyclotomicField(9);
> D<w> := CyclotomicField(13);
> f := MinimalPolynomial(z+z^8);
> g := MinimalPolynomial(w+w^5+w^8+w^12);
> K<a> := NumberField(f);
> L<b> := NumberField(g);
> M<c> := Compositum(K,L);
> IsNormal(M);
true
> G := GaloisGroup(M);
> G;
Permutation group G acting on a set of cardinality 9
Order = 9 = 3^2
(1, 5, 6)(2, 3, 8)(4, 7, 9)
(1, 2, 9)(3, 4, 5)(6, 8, 7)
> h<x> := MinimalPolynomial(c);
> h;
x^9 + 3*x^8 - 18*x^7 - 38*x^6 + 93*x^5 + 147*x^4 - 161*x^3 - 201*x^2
+ 57*x + 53
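The exponents appearing in α and β are not arbitrary: they are the cubes in (Z/9Z)^× and (Z/13Z)^×. Each of these unit groups is cyclic of order divisible by 3, so the cubes form the unique subgroup of index 3, and summing ζ^a over that subgroup gives an element fixed by it, which is how a generator of the degree 3 subfield arises. The Python sketch below just recomputes these subgroups; it is an illustration alongside the Magma session, and the function names are hypothetical.

```python
from math import gcd

def units(m):
    """Representatives of the unit group (Z/mZ)^x."""
    return [a for a in range(1, m) if gcd(a, m) == 1]

def cubes(m):
    """The subgroup of cubes in (Z/mZ)^x, as a sorted list."""
    return sorted({pow(a, 3, m) for a in units(m)})

for m in (9, 13):
    H = cubes(m)
    print(m, H, "index", len(units(m)) // len(H))
```

The two lists printed are exactly the exponent sets used in α = ζ_9 + ζ_9^8 and β = ζ_13 + ζ_13^5 + ζ_13^8 + ζ_13^12.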
The package galpols provides an easier option for generating a polynomial p ∈
Z[x] such that a splitting field of p over Q has a desired Galois group. For instance,
the code
> load galpols;
> P<x> := PolynomialRing(IntegerRing());
> p := PolynomialWithGaloisGroup(9, 2);
> p;
x^9 - 15*x^7 - 4*x^6 + 54*x^5 + 12*x^4 - 38*x^3 - 9*x^2 + 6*x + 1
returns a polynomial p such that Gal(p) ∼= Z/3Z × Z/3Z — the (9, 2) notation
comes from the numbering scheme for transitive groups created by Butler and McKay
in [4]. As the polynomials h and p are different, their splitting fields may differ, but
both have the desired Galois group. For the rest of the example we will work with
$$h(x) = x^9 + 3x^8 - 18x^7 - 38x^6 + 93x^5 + 147x^4 - 161x^3 - 201x^2 + 57x + 53.$$
Consider G = Z/3Z×Z/3Z as above. Since G is abelian, there are nine singleton
conjugacy classes in G. These can be viewed with the ConjugacyClasses(G) command in
Magma. On the other hand, there are five different divisions and two cycle types in
G. The cycle types are (1) for the identity and (3, 3, 3) for the remaining elements;
the divisions can be viewed with the following code:
> Divisions(G);
[
Id($),
(1, 5, 6)(2, 3, 8)(4, 7, 9),
(1, 3, 7)(2, 4, 6)(5, 8, 9),
(1, 2, 9)(3, 4, 5)(6, 8, 7),
(1, 8, 4)(2, 7, 5)(3, 9, 6)
]
(For the code defining the function Divisions(G), see Appendix A.4.) Now the
next three tables display the distributions of primes p ≤ 10,000 whose Frobenius
elements occur among the different divisions, cycle types and conjugacy classes of G,
where G is identified with Gal(M/Q) for M defined above.
> frobTal := FrobeniusTally(h,10000);
> divTally := DivisionTally(frobTal);
> divTally;
[
<Id($), 126>,
<(1, 5, 6)(2, 3, 8)(4, 7, 9), 272>,
<(1, 3, 7)(2, 4, 6)(5, 8, 9), 277>,
<(1, 2, 9)(3, 4, 5)(6, 8, 7), 273>,
<(1, 8, 4)(2, 7, 5)(3, 9, 6), 277>
]
> cycTally := CycleTypeTally(frobTal);
> cycTally;
[
<[ <1, 9> ], 126>,
<[ <3, 3> ], 1099>
]
> ccTally := ConjClassTally(frobTal);
> ccTally;
[
<Id($), 126>,
<(1, 5, 6)(2, 3, 8)(4, 7, 9), 135>,
<(1, 6, 5)(2, 8, 3)(4, 9, 7), 137>,
<(1, 2, 9)(3, 4, 5)(6, 8, 7), 137>,
<(1, 9, 2)(3, 5, 4)(6, 7, 8), 136>,
<(1, 3, 7)(2, 4, 6)(5, 8, 9), 139>,
<(1, 7, 3)(2, 6, 4)(5, 9, 8), 138>,
<(1, 8, 4)(2, 7, 5)(3, 9, 6), 143>,
<(1, 4, 8)(2, 5, 7)(3, 6, 9), 134>
]
(These functions can also be found in Appendix A.4.) Notice that the distribution
is essentially uniform across each of the three types of partitions of G; that is, the
distribution of primes in an element of a given partition is proportional to the size of
the element of the partition.
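The uniformity claim can be quantified directly from the tallies printed above. The Python sketch below copies those counts and compares them with the densities predicted for G ≅ Z/3Z × Z/3Z: 1/9 for each conjugacy class, 1/9 and 2/9 for the divisions, and 1/9 and 8/9 for the two cycle types. It is a rough check only; the variable names are illustrative.

```python
from fractions import Fraction

# Counts copied from the Magma output above (primes p <= 10,000).
conj_class_counts = [126, 135, 137, 137, 136, 139, 138, 143, 134]
division_counts = [126, 272, 277, 273, 277]
cycle_type_counts = [126, 1099]

total = sum(conj_class_counts)

# Expected densities: each part of a partition gets (size of part) / |G|, with |G| = 9.
expected = {
    "conjugacy classes": (conj_class_counts, [Fraction(1, 9)] * 9),
    "divisions": (division_counts, [Fraction(1, 9)] + [Fraction(2, 9)] * 4),
    "cycle types": (cycle_type_counts, [Fraction(1, 9), Fraction(8, 9)]),
}

def max_relative_error(counts, densities):
    """Largest relative deviation of an observed count from density * total."""
    return max(abs(c - float(d) * total) / (float(d) * total)
               for c, d in zip(counts, densities))

for name, (counts, densities) in expected.items():
    print(name, sum(counts), round(max_relative_error(counts, densities), 4))
```

The identity class is the furthest from its expected share in this sample, which is typical at such a small bound.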
2.11 Ring Class Fields
In the final section of Chapter 2, we will utilize class field theory to construct an
extension of an imaginary quadratic field that corresponds to an order O, general-
izing the Hilbert class field from Section 1.8. We will use this extension to prove a
characterization theorem for when a prime has the form $x^2 + ny^2$, finally answering
our motivating question.
Let K be a number field. An ideal m ⊂ OK can be viewed as a modulus of K.
We will usually be working with principal ideals αOK , in which case we will denote
the group of fractional ideals derived from the modulus (α) by IK(α), with principal
subgroup $P_K(\alpha, 1)$. From Theorem 1.9.13, we know the class group for an order $\mathcal O$ is
$$C(\mathcal O) = I(\mathcal O)/P(\mathcal O) \cong I_K(f)/P_{K,\mathbb Z}(f)$$
where f is the conductor of $\mathcal O$ in $\mathcal O_K$. Then clearly $P_{K,\mathbb Z}(f)$ is a congruence subgroup:
$$P_K(f, 1) \leq P_{K,\mathbb Z}(f) \leq I_K(f)$$
so $C(\mathcal O)$ is a generalized ideal class group for K corresponding to the modulus $f\mathcal O_K$.
The existence theorem (Section 2.9) then says that there is a unique abelian extension
L/K such that Gal(L/K) ∼= C(O).
Definition. For an order O in a number field K, the unique abelian extension L ⊃ K
satisfying Gal(L/K) ∼= C(O) is called the ring class field of the order O.
Some authors ([5] for example) denote a ring class field by KO. It is clear from the
classification theorem that the ring class field of the maximal order OK is precisely the
Hilbert class field of K. We will see that ring class fields are a useful generalization
of the Hilbert class field in many ways.
On the group theory side of things, we have the following characterization of the
Galois group of a ring class field.
Lemma 2.11.1. Let L be the ring class field of the order O in an imaginary quadratic
field K. Then L/Q is Galois and its Galois group can be written as a semidirect
product
$$\mathrm{Gal}(L/\mathbb Q) \cong \mathrm{Gal}(L/K) \rtimes (\mathbb Z/2\mathbb Z),$$
where the nontrivial element in Z/2Z acts on Gal(L/K) via $\sigma \mapsto \sigma^{-1}$.
Proof. See [7] section 6 and Lemma 9.3.
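To get a feel for the group in Lemma 2.11.1, consider the simplest situation where Gal(L/K) is cyclic of order h. The Python sketch below builds Z/hZ ⋊ Z/2Z with the inversion action and checks a few basic properties; the cyclic assumption and all names are for illustration only, since the class group C(O) need not be cyclic in general.

```python
def inversion_semidirect(h):
    """Elements and multiplication of Z/hZ x| Z/2Z, where Z/2Z acts by a -> -a."""
    elements = [(a, e) for a in range(h) for e in (0, 1)]

    def mul(x, y):
        (a, e), (b, f) = x, y
        # (a, e)(b, f) = (a + (-1)^e * b, e + f)
        return ((a + (-1) ** e * b) % h, (e + f) % 2)

    return elements, mul

def is_abelian(h):
    """True if the semidirect product for this h is commutative."""
    elements, mul = inversion_semidirect(h)
    return all(mul(x, y) == mul(y, x) for x in elements for y in elements)

def is_associative(h):
    """True if the multiplication is associative (a basic sanity check)."""
    elements, mul = inversion_semidirect(h)
    return all(mul(mul(x, y), z) == mul(x, mul(y, z))
               for x in elements for y in elements for z in elements)

for h in (1, 2, 3, 4):
    print(h, 2 * h, is_abelian(h))
```

For h ≥ 3 the group is the dihedral group of order 2h, so L/Q is then a non-abelian Galois extension even though L/K is abelian.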
As we did with the Hilbert class field, we begin by relating a prime $p = x^2 + ny^2$
to its splitting behavior in the ring class field of $\mathbb Z[\sqrt{-n}]$.
Theorem 2.11.2. Fix n ∈ N, let $K = \mathbb Q(\sqrt{-n})$ and let L be the ring class field of
the order $\mathbb Z[\sqrt{-n}]$ in K. If p is an odd prime not dividing n, then
$$p = x^2 + ny^2 \iff p \text{ splits completely in } L.$$
Proof. Let $\mathcal O = \mathbb Z[\sqrt{-n}]$ and denote its conductor by f. The discriminant of $\mathcal O$ is
D = −4n, so we know from Section 1.9 that $-4n = f^2 d_K$, where $d_K$ is the discriminant
of K. If $p \nmid n$ is an odd prime, then of course $p \nmid f^2 d_K$ and so by Corollary 1.6.13, p
is unramified in K. As with the analogous Theorem 1.8.8, we prove the equivalence
of the following statements:
(i) $p = x^2 + ny^2$
(ii) $\iff p\mathcal O_K = \mathfrak p \bar{\mathfrak p}$, $\mathfrak p \neq \bar{\mathfrak p}$ and $\mathfrak p = \alpha\mathcal O_K$ for some $\alpha \in \mathcal O$
(iii) $\iff p\mathcal O_K = \mathfrak p \bar{\mathfrak p}$, $\mathfrak p \neq \bar{\mathfrak p}$ and $\mathfrak p \in P_{K,\mathbb Z}(f)$
(iv) $\iff p\mathcal O_K = \mathfrak p \bar{\mathfrak p}$, $\mathfrak p \neq \bar{\mathfrak p}$ and $\left(\frac{L/K}{\mathfrak p}\right) = 1$
(v) $\iff p\mathcal O_K = \mathfrak p \bar{\mathfrak p}$, $\mathfrak p \neq \bar{\mathfrak p}$ and $\mathfrak p$ splits in L
(vi) $\iff p$ splits in L.
(i) ⇐⇒ (ii) Suppose $p = x^2 + ny^2 = (x + \sqrt{-n}\,y)(x - \sqrt{-n}\,y)$. Let
$\mathfrak p = (x + \sqrt{-n}\,y)\mathcal O_K$, so that $p\mathcal O_K = \mathfrak p \bar{\mathfrak p}$ is the prime factorization of p in $\mathcal O_K$. Since p is
unramified in K, $\mathfrak p \neq \bar{\mathfrak p}$. Also note that $x + \sqrt{-n}\,y \in \mathbb Z[\sqrt{-n}]$. This entire argument
is reversible, as in the proof of Theorem 1.8.8.
(ii) ⇐⇒ (iii) follows from Theorem 1.9.13.
(iii) ⇐⇒ (iv) ⇐⇒ (v) Note that
$$I_K(f)/P_{K,\mathbb Z}(f) = C(\mathcal O) \cong \mathrm{Gal}(L/K)$$
where the isomorphism is the Artin map $\varphi_{L/K}$. This shows that $\mathfrak p \in P_{K,\mathbb Z}(f)$ if and
only if $\left(\frac{L/K}{\mathfrak p}\right) = 1$, and Proposition 1.8.3 further implies that $\left(\frac{L/K}{\mathfrak p}\right) = 1$ if and
only if $\mathfrak p$ splits completely in L.
(v) ⇐⇒ (vi) Finally, Lemma 2.11.1 shows that L is Galois over Q and so as in
the proof of Theorem 1.8.8, p splits in L if and only if p splits in K and some prime lying
over p (e.g. $\mathfrak p$) splits in L. This proves all equivalences and hence the theorem.
We finally arrive at the main characterization theorem for primes of the form
x2 + ny2. This argument, due to Cox [7], successfully generalizes Theorem 1.8.7 to
all positive integers n.
Theorem 2.11.3. For every integer $n > 0$, there is a monic irreducible polynomial $f_n(x)$ of degree $h(-4n)$ with integer coefficients such that for all odd primes $p$ dividing neither $n$ nor the discriminant of $f_n$,
\[ p = x^2 + ny^2 \iff \left(\frac{-n}{p}\right) = 1 \text{ and } f_n(x) \equiv 0 \pmod{p} \text{ for some } x \in \mathbb{Z}. \]
Furthermore, any such choice of $f_n(x)$ will be the minimal polynomial of a real algebraic integer $\alpha$ for which $L = K(\alpha)$ is the ring class field of the order $\mathbb{Z}[\sqrt{-n}]$ in $K = \mathbb{Q}(\sqrt{-n})$.
Proof. As in the proof of Theorem 1.8.7, knowing L/Q is Galois allows us to pick a real
algebraic integer α that generates L/K, that is L = K(α). Let fn(x) be the minimal
polynomial of α over K. By definition such a polynomial is monic, irreducible and
has integer coefficients. Moreover, fn must have degree [L : K] = h(O) = h(−4n).
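Let $p$ be a prime not dividing $n$ or the discriminant of $f_n$. Then $f_n$ is separable mod $p$, and $p$ splits completely in $K$ if and only if $\left(\frac{-n}{p}\right) = 1$. We may assume $p$ splits completely in $K$, which means $\mathcal{O}_K/\mathfrak{p} \cong \mathbb{Z}/p\mathbb{Z}$ for an $\mathcal{O}_K$-prime $\mathfrak{p}$ such that $p\mathbb{Z} = \mathfrak{p} \cap \mathbb{Z}$. Since $f_n$ is separable over $\mathbb{Z}/p\mathbb{Z}$, it is also separable over $\mathcal{O}_K/\mathfrak{p}$. Hence Theorem 1.3.4 shows that
\begin{align*}
\mathfrak{p} \text{ splits completely in } L &\iff f_n(x) \equiv 0 \bmod \mathfrak{p} \text{ has a solution in } \mathcal{O}_K \\
&\iff f_n(x) \equiv 0 \bmod p \text{ has a solution in } \mathbb{Z}.
\end{align*}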
The main equivalence follows from Theorem 2.11.2.
To address fn(x), note that there are infinitely many choices of such a polynomial
since there are infinitely many primitive elements of the extension L/K. We want
to prove that the possible fn(x)’s that arise are exactly those which are the minimal
polynomials of primitive elements of L/K. Let f be a monic integral polynomial of
degree h(−4n) satisfying the main equivalence of the theorem. Let g ∈ K[x] be an
irreducible factor of f(x) and let M = K(α) where α is a root of g. Note that if we
knew L ⊂M , then
h(−4n) = [L : K] ≤ [M : K] = deg g ≤ deg f = h(−4n).
Therefore if $L \subset M$ then we would be able to conclude that $L = K(\alpha)$ and $f$ is the minimal polynomial of $\alpha$ over $K$. To verify $L \subset M$ we need the next lemma which, once established, will allow us to finish the proof of Theorem 2.11.3.
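To borrow the notation in [7], given two sets $S$ and $T$, we will write $S \mathrel{\dot\subset} T$ if $S$ is contained in $T$ except for a finite number of elements. We will apply this in the next lemma to the set $S_{L/K} = \{\mathfrak{p} \subset \mathcal{O}_K \mid \mathfrak{p} \text{ is prime and splits completely in } L\}$.
Lemma 2.11.4. Let $L$ and $M$ be Galois extensions of a number field $K$ and define
\[ S = S_{L/\mathbb{Q}} = \{p \in \mathbb{Z} \text{ prime} \mid p \text{ splits completely in } L\} \]
and $T = \{p \in \mathbb{Z} \text{ prime} \mid p \text{ is unramified in } L,\ f(\mathfrak{P} \mid p) = 1 \text{ for some } \mathfrak{P} \subset \mathcal{O}_M\}$.
Then $L \subset M \iff T \mathrel{\dot\subset} S$.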
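Proof. First, if $L \subset M$ then $T$ is clearly a subset of $S$. Conversely, suppose $T \mathrel{\dot\subset} S$. Let $N$ be a Galois extension of $K$ containing both $L$ and $M$ as subfields. By the fundamental theorem of Galois theory, it will suffice to show that $\mathrm{Gal}(N/M) \leq \mathrm{Gal}(N/L)$.
Take any $\sigma \in \mathrm{Gal}(N/M)$; we will show that $\sigma$ restricts to the identity on $L$. By the Cebotarev density theorem, there exists an $\mathcal{O}_K$-prime $\mathfrak{p}$ that is unramified in $N$ for which $\left(\frac{N/K}{\mathfrak{p}}\right)$ is the conjugacy class of $\sigma$ (recall from Section 2.10 that when $N/K$ is non-abelian, the Artin symbol describes a conjugacy class of the Galois group). Thus for some $\mathfrak{P} \subset \mathcal{O}_N$ lying over $\mathfrak{p}$, $\left(\frac{N/K}{\mathfrak{P}}\right) = \sigma$. Define $\mathfrak{Q} = \mathfrak{P} \cap \mathcal{O}_M$. Then for any $\alpha \in \mathcal{O}_M$,
\[ \alpha \equiv \sigma(\alpha) \equiv \alpha^{N(\mathfrak{p})} \bmod \mathfrak{Q} \]
by definition of the Artin map (and the fact that $\sigma \in \mathrm{Gal}(N/M)$). This shows that $\mathcal{O}_M/\mathfrak{Q} \cong \mathcal{O}_K/\mathfrak{p}$ so $f(\mathfrak{Q} \mid \mathfrak{p}) = 1$, which further implies that $\mathfrak{p} \in T$. In fact, the Cebotarev density theorem guarantees that there are infinitely many of these primes $\mathfrak{p}$ and since we assumed $T \mathrel{\dot\subset} S$, we may therefore assume $\mathfrak{p}$ is one of the primes of $T$ which lies in $S$. Now this means $\left(\frac{L/K}{\mathfrak{p}}\right) = 1$ and by Proposition 2.2.2,
\[ 1 = \left(\frac{L/K}{\mathfrak{p}}\right) = \left.\left(\frac{N/K}{\mathfrak{P}}\right)\right|_L = \sigma|_L. \]
Hence $\sigma \in \mathrm{Gal}(N/L)$ and the lemma is proved.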
To finish the proof of Theorem 2.11.3, let $L$, $M$ and $K$ be as described previously. Define $S = S_{L/\mathbb{Q}}$ and $T$ as in Lemma 2.11.4. By Theorem 2.11.2, $S$ is exactly the set of primes $p = x^2 + ny^2$. Since $f$ is assumed to satisfy the main equivalence in Theorem 2.11.3, $S$ contains, with finitely many exceptions, the primes $p$ which split completely in $K$ and for which $f(x) \equiv 0$ is solvable mod $p$. If $p \in T$, there is some prime $\mathfrak{P} \subset \mathcal{O}_M$ such that $f(\mathfrak{P} \mid p) = 1$. Let $\mathfrak{p} = \mathfrak{P} \cap \mathcal{O}_K$ so that by properties of inertial degree,
\[ 1 = f(\mathfrak{P} \mid p) = f(\mathfrak{P} \mid \mathfrak{p})\, f(\mathfrak{p} \mid p) \implies f(\mathfrak{p} \mid p) = 1. \]
Thus $p$ splits completely in $K$.
Let $\alpha \in \mathcal{O}_M$ be the algebraic integer for $f$ from the theorem. Then since $g(\alpha) = f(\alpha) = 0$, $f(x) \equiv 0 \bmod \mathfrak{P}$ has a solution. However, $f(\mathfrak{P} \mid p) = 1$ implies that $\mathbb{Z}/p\mathbb{Z} \cong \mathcal{O}_M/\mathfrak{P}$ and so $f(x) \equiv 0 \pmod{p}$ has a solution in integers. By definition this means $p$ is in $S$, which proves $S$ contains $T$ with finitely many exceptions. Applying Lemma 2.11.4 shows that $L \subset M$ and therefore we have finished checking everything in the proof of Theorem 2.11.3.
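As a concrete instance of Theorem 2.11.3, take $n = 27$. A classical result (conjectured by Euler and proved by Gauss) says that for primes $p > 3$, $p = x^2 + 27y^2$ if and only if $p \equiv 1 \pmod 3$ and 2 is a cubic residue mod $p$, so one may take $f_{27}(x) = x^3 - 2$, of degree $h(-108) = 3$. This example is not worked out in the text above. The sketch below is written in Python rather than Magma, all helper names are our own, and it only checks the equivalence by brute force for small primes.

```python
from math import isqrt

def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    if m < 2:
        return False
    return all(m % d for d in range(2, isqrt(m) + 1))

def is_n_fermat(p, n):
    """True if p = x^2 + n*y^2 for some integers x, y >= 0."""
    y = 0
    while n * y * y <= p:
        r = p - n * y * y
        if isqrt(r) ** 2 == r:
            return True
        y += 1
    return False

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    s = pow(a % p, (p - 1) // 2, p)
    return -1 if s == p - 1 else s

def has_root_mod_p(coeffs, p):
    """True if the integer polynomial (coefficients listed from the
    highest degree down) has a root modulo p."""
    def value(x):
        acc = 0
        for c in coeffs:          # Horner evaluation mod p
            acc = (acc * x + c) % p
        return acc
    return any(value(x) == 0 for x in range(p))

F27 = [1, 0, 0, -2]               # x^3 - 2 (assumed choice of f_27)

def criterion(p):
    """Right-hand side of Theorem 2.11.3 for n = 27."""
    return legendre(-27, p) == 1 and has_root_mod_p(F27, p)

small_primes = [p for p in range(5, 400) if is_prime(p)]
mismatches = [p for p in small_primes if is_n_fermat(p, 27) != criterion(p)]
print(mismatches)
```

The Legendre condition cannot be dropped here: when $p \equiv 2 \pmod 3$ every residue is a cube, so $x^3 \equiv 2$ is always solvable even though $p$ is never of the form $x^2 + 27y^2$.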
Let’s pause for a moment to see how far we have come. Beginning with Example 1.3.7, where we proved Fermat’s theorem on primes of the form $x^2 + y^2$, we utilized the properties of the Hilbert class field to characterize primes of the form $x^2 + ny^2$ for infinitely many $n$; this was Theorem 1.8.7. In order to answer the $x^2 + ny^2$ question for all positive integers $n$, we needed the full force of class field theory, notably Cebotarev’s density theorem, culminating in Theorem 2.11.3. However, both theorems have the same weakness: they do not provide a method for producing the primitive element $\alpha$ of the ring class field $L$ for $\mathbb{Q}(\sqrt{-n})$.
It turns out that there is an element $j(\mathcal{O})$, called the j-invariant of the order $\mathcal{O}$, that generates $L/K$ where $L$ is the ring class field of $\mathcal{O}$. Its defining characteristics are described in the so-called First Fundamental Theorem of Complex Multiplication:
Theorem 2.11.5. Let $\mathcal{O}$ be an order in an imaginary quadratic field $K$.
(1) For any proper fractional $\mathcal{O}$-ideal $\mathfrak{a}$, $j(\mathfrak{a})$ is an algebraic integer.
(2) For any proper fractional $\mathcal{O}$-ideal $\mathfrak{a}$, $K(j(\mathfrak{a}))$ is the ring class field of $\mathcal{O}$.
(3) For any two proper fractional ideals $\mathfrak{a}, \mathfrak{b} \subset \mathcal{O}$, $j(\mathfrak{a})$ and $j(\mathfrak{b})$ are conjugate and therefore they are all roots of a single irreducible polynomial $H_{\mathcal{O}}(x) \in \mathbb{Q}[x]$ which satisfies
\[ H_{\mathcal{O}}(x) = \prod_{i=1}^{h(\mathcal{O})} (x - j(\mathfrak{a}_i)), \]
where $h(\mathcal{O})$ is the class number of $\mathcal{O}$ and the $\mathfrak{a}_i$ are distinct representatives of the class group for $\mathcal{O}$.
(4) The equation $H_{\mathcal{O}}(x) = 0$ is called the class equation for $\mathcal{O}$ and there exists an algorithm for computing the class equation.
The First Fundamental Theorem of CM usually refers to (1) and (2). The details
of (3) and (4), i.e. the computation of the minimal polynomial of j(O), can be
found in [7]. In practice, it is rather difficult to compute HO(x) but there have been
significant results in recent years (surveyed in [7]) that make it easier to compute in
special cases.
Chapter 3: Quadratic Forms and n-Fermat Primes
The main focus in the first two chapters was on developing the tools necessary
for answering the question “Given a natural number n and a prime p, when does
p = x2 +ny2 have a solution in integers x and y?” The object x2 +ny2 is an example
of a quadratic form. In this chapter we will further explore the theory of quadratic
forms and then prove several results about the special case x2 + ny2. Finally, in
Section 3.3 we define a symmetric n-Fermat prime to be a prime x2 + ny2 such that
y2 + nx2 is also prime and describe the distribution of such primes for various values
of n.
3.1 The Theory of Binary Quadratic Forms
There is a rich history of the study of quadratic forms dating back at least to Fermat.
Some of the greatest mathematical minds, from Euler and Gauss to Legendre and
Lagrange, contributed to the theory which we survey here.
Definition. A binary quadratic form is a function $f(x, y) = ax^2 + bxy + cy^2$ where $a$, $b$ and $c$ are integers.
Fermat was one of the earliest mathematicians to study binary quadratic forms
[7]. His motivation was the study and proof of such theorems as
Theorem 3.1.1 (Fermat). Let $p$ be an odd prime.
(i) $p = x^2 + y^2$, $x, y \in \mathbb{Z} \iff p \equiv 1 \pmod 4$.
(ii) $p = x^2 + 2y^2$, $x, y \in \mathbb{Z} \iff p \equiv 1, 3 \pmod 8$.
(iii) $p = x^2 + 3y^2$, $x, y \in \mathbb{Z} \iff p = 3$ or $p \equiv 1 \pmod 3$.
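The three statements of Theorem 3.1.1 are easy to test numerically. The following Python sketch (not taken from the thesis; the function names are illustrative) compares brute-force representability with the stated congruence conditions for all odd primes below 500.

```python
from math import isqrt

def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    if m < 2:
        return False
    return all(m % d for d in range(2, isqrt(m) + 1))

def represented(p, n):
    """True if p = x^2 + n*y^2 has a solution in integers x, y >= 0."""
    y = 0
    while n * y * y <= p:
        r = p - n * y * y
        if isqrt(r) ** 2 == r:
            return True
        y += 1
    return False

# Congruence conditions from Theorem 3.1.1, keyed by n.
conditions = {
    1: lambda p: p % 4 == 1,
    2: lambda p: p % 8 in (1, 3),
    3: lambda p: p == 3 or p % 3 == 1,
}

odd_primes = [p for p in range(3, 500) if is_prime(p)]
failures = [(n, p) for n, cond in conditions.items()
            for p in odd_primes if represented(p, n) != cond(p)]
print(failures)
```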
Euler was able to prove more complicated formulas of this flavor using his two-step
Descent-Reciprocity method (explained in detail in chapter 1 of [7]) which ultimately
evolved into Gauss’s cherished quadratic reciprocity. We have proven (i) ourselves in
Example 1.3.7 and (ii) and (iii) are easy consequences of Theorem 2.11.3 so we have
already done a lot of work on the easiest types of these problems.
Definition. A form f(x, y) = ax2 + bxy + cy2 is primitive if gcd(a, b, c) = 1.
Since any binary quadratic form is a multiple of a primitive one, we will implicitly
assume any form we are working with is primitive.
Definition. A form $f(x, y)$ represents an integer $k$ if there exist integers $x$ and $y$ such that $f(x, y) = k$. Further, $f(x, y)$ properly represents $k$ if $x$ and $y$ may be chosen such that $\gcd(x, y) = 1$.
In the theory of quadratic forms, there is a crucial idea of equivalence called proper
equivalence, which we define here:
Definition. Two forms f(x, y) and g(x, y) are properly equivalent if there is an
invertible matrix P ∈ SL2(Z) such that f(x) = g(Px).
It is easy to see that proper equivalence is an equivalence relation on the set of
binary quadratic forms and furthermore, that properly equivalent forms represent the
same integers.
Example 3.1.2. Let $f(x, y) = ax^2 + bxy + cy^2$ and take any integer $n$. Note that the matrix $T = \begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix}$ has determinant 1 and therefore $T \in SL_2(\mathbb{Z})$. Consider
\begin{align*}
f(T\mathbf{x}) &= f(x + ny, y) \\
&= a(x^2 + 2nxy + n^2y^2) + b(x + ny)y + cy^2 \\
&= ax^2 + (b + 2an)xy + (an^2 + bn + c)y^2.
\end{align*}
Therefore $f(x, y)$ is properly equivalent to $ax^2 + (b + 2an)xy + (an^2 + bn + c)y^2$ for any $n \in \mathbb{Z}$.
Lemma 3.1.3. A form f(x, y) properly represents k ∈ Z if and only if f(x, y) is
properly equivalent to kx2 + b′xy + c′y2 for some b′, c′ ∈ Z.
Proof. ($\Longrightarrow$) Let $f(x, y) = ax^2 + bxy + cy^2$ and suppose $k = f(p, q)$ for relatively prime integers $p, q$. Then there exist integers $r, s$ such that $ps - qr = 1$. Set $P = \begin{pmatrix} p & r \\ q & s \end{pmatrix}$ and notice that $\det P = ps - qr = 1$ so $P \in SL_2(\mathbb{Z})$. Then writing $\mathbf{x}^T = (x\ y)$ we have
\begin{align*}
f(P\mathbf{x}) &= f(px + ry, qx + sy) \\
&= a(px + ry)^2 + b(px + ry)(qx + sy) + c(qx + sy)^2 \\
&= f(p, q)x^2 + (2apr + bps + brq + 2cqs)xy + f(r, s)y^2
\end{align*}
which is of the form $kx^2 + b'xy + c'y^2$.
($\Longleftarrow$) If $f$ is properly equivalent to $g(x, y) = kx^2 + b'xy + c'y^2$ then they represent the same integers. Notice that $g(1, 0) = k$ so $g$ properly represents $k$ and therefore so does $f$.
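The forward direction of Lemma 3.1.3 is constructive: the integers $r, s$ come from the extended Euclidean algorithm. The Python sketch below (our own illustration, with hypothetical function names) builds a matrix of determinant 1 whose first column is $(p, q)$ and returns the coefficients of the resulting equivalent form, whose leading coefficient is $f(p, q)$.

```python
def ext_gcd(a, b):
    """Return (g, u, v) with u*a + v*b == g == gcd(a, b)."""
    if b == 0:
        return (abs(a), 1 if a >= 0 else -1, 0)
    g, u, v = ext_gcd(b, a % b)
    return (g, v, u - (a // b) * v)

def substitute(form, matrix):
    """Coefficients of f(p*x + r*y, q*x + s*y) for matrix ((p, r), (q, s))."""
    a, b, c = form
    (p, r), (q, s) = matrix
    return (a * p * p + b * p * q + c * q * q,            # f(p, q)
            2 * a * p * r + b * (p * s + q * r) + 2 * c * q * s,
            a * r * r + b * r * s + c * s * s)            # f(r, s)

def form_with_leading_value(form, p, q):
    """Properly equivalent form whose x^2 coefficient is f(p, q),
    assuming gcd(p, q) = 1."""
    g, u, v = ext_gcd(p, q)
    if g != 1:
        raise ValueError("p and q must be relatively prime")
    r, s = -v, u                  # p*s - q*r = u*p + v*q = 1
    return substitute(form, ((p, r), (q, s)))

def discriminant(form):
    a, b, c = form
    return b * b - 4 * a * c

example = form_with_leading_value((2, 3, 6), 1, 1)   # f(1, 1) = 11
print(example, discriminant(example))
```

Since the substitution has determinant 1, the discriminant is unchanged, which gives a quick sanity check on the output.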
Definition. The discriminant of a binary quadratic form ax2 + bxy + cy2 is D =
b2 − 4ac.
This is not to be confused with the discriminant of an ideal (Section 1.6) or an
order (Section 1.9). We will see in Section 3.2 that there is a close connection between
quadratic forms and orders in imaginary quadratic fields and the multiple notions of
discriminant will actually coincide in the end.
It’s easy to prove that properly equivalent forms have the same discriminant.
Moreover, the second half of the proof of Lemma 3.1.3 actually shows that every
integer is properly represented by some quadratic form, so the proper equivalence on
forms corresponds to a partition of Z.
If D > 0 is the discriminant of f(x, y) then f represents some positive and negative
integers, but if D < 0, the integers represented by f are either all positive or all
negative. Accordingly, we define
Definition. Let f(x, y) be a binary quadratic form of discriminant D. If D < 0 we
say f is positive definite or negative definite according to the sign of the integers
f represents. If D > 0 we say f is indefinite.
Proposition 3.1.4. Let f(x, y) = ax2 + bxy + cy2 be a primitive form.
(i) For every prime p, one of f(1, 0), f(0, 1), f(1, 1) is relatively prime to p.
(ii) For every integer M , f(x, y) properly represents an integer relatively prime
to M .
Proof. (i) If $p$ divides $f(1, 0)$ and $f(0, 1)$, this implies $p \mid a$ and $p \mid c$, so $f(1, 1) = pa' + b + pc'$ where $a = pa'$ and $c = pc'$. Since $f(x, y)$ is primitive, $\gcd(a, b, c) = 1$ so $p$ cannot divide $b$ and therefore $p \nmid f(1, 1)$. Similarly, if $p$ divides $f(1, 0)$ and $f(1, 1)$, $p$ must divide $a$ and $a + b$ which implies $p \mid b$ as well. Then $f(0, 1) = c$ but since $\gcd(a, b, c) = 1$, $p$ cannot divide $c$. Thus $p \nmid f(0, 1)$. The third case is identical to the second.
(ii) Let M be given. For each prime pi in the prime factorization of M , part (i)
says that one of f(1, 0), f(0, 1), f(1, 1) represents a number that is relatively prime
to pi. We will prove the case where M = p1p2 and then induction on the number of
prime factors will finish the proof of (ii).
Let $k_1$ and $k_2$ be integers such that $p_1 \nmid k_1$ and $p_2 \nmid k_2$. By (i), we may suppose $f(x, y)$ represents $k_1 \pmod{p_1}$ via $f(x_1, y_1)$ and it represents $k_2 \pmod{p_2}$ via $f(x_2, y_2)$ for some $x_1, x_2, y_1, y_2 \in \mathbb{Z}$. By the Chinese remainder theorem, let $K$ be the unique integer modulo $p_1p_2$ satisfying
\begin{align*}
K &\equiv k_1 \pmod{p_1} \\
K &\equiv k_2 \pmod{p_2}.
\end{align*}
Also using the Chinese remainder theorem, define A and B to be the unique solutions,
modulo p1p2, to
A ≡ 1 (mod p1) B ≡ 1 (mod p2)
A ≡ 0 (mod p2) B ≡ 0 (mod p1).
Then we can write $K = Ak_1 + Bk_2$. In other words, $K$ is the inverse image of $(k_1, k_2)$ under the isomorphism given by the primary decomposition of $M$:
\begin{align*}
\mathbb{Z}/(M) &\cong \mathbb{Z}/(p_1) \times \mathbb{Z}/(p_2) \\
Ai + Bj &\longmapsto (i, j).
\end{align*}
We use these ingredients to show that $f(x, y)$ represents $K$ modulo $p_1p_2$. Consider
\begin{align*}
f(Ax_1 + Bx_2, Ay_1 + By_2) &= a(A^2x_1^2 + 2ABx_1x_2 + B^2x_2^2) \\
&\quad + b(A^2x_1y_1 + ABx_2y_1 + ABx_1y_2 + B^2x_2y_2) \\
&\quad + c(A^2y_1^2 + 2ABy_1y_2 + B^2y_2^2).
\end{align*}
Reducing mod $p_1$, the $B$s are all 0 and $A \equiv 1$, so we have
\[ f(Ax_1 + Bx_2, Ay_1 + By_2) \equiv ax_1^2 + bx_1y_1 + cy_1^2 \equiv k_1 \pmod{p_1}. \]
On the other hand, reducing mod $p_2$ yields
\[ f(Ax_1 + Bx_2, Ay_1 + By_2) \equiv ax_2^2 + bx_2y_2 + cy_2^2 \equiv k_2 \pmod{p_2}. \]
By our choice of $K$, this shows that $f(Ax_1 + Bx_2, Ay_1 + By_2)$ is congruent to $K \pmod{p_1p_2}$. Therefore $f(x, y)$ represents an integer congruent to $K$ modulo $M$, and any such integer is relatively prime to $M$ by construction.
Example 3.1.5. To illustrate Proposition 3.1.4, consider f(x, y) = 2x2 + 3xy + 6y2.
Let p1 = 11 and p2 = 13, whereby M = p1p2 = 143. By (i) of the proposition, we
can represent k1 = 2 using f(1, 0) and k2 = 6 using f(0, 1). Calculations show that
A = 78 and B = 66 (e.g. using a computer algorithm for the Chinese remainder
theorem) which gives us
K = Ak1 +Bk2 = 78(2) + 66(6) ≡ 123 (mod 143).
Note that K and M are coprime, so we can show that f(x, y) represents K in order
to demonstrate the conclusion in Proposition 3.1.4(ii). Letting (x1, y1) = (1, 0) and
(x2, y2) = (0, 1), we compute
\begin{align*}
f(Ax_1 + Bx_2, Ay_1 + By_2) &= f(A, B) \\
&= 2A^2 + 3AB + 6B^2 \\
&= 2(78)^2 + 3(78)(66) + 6(66)^2 \\
&= 12168 + 15444 + 26136 = 53748 \\
&\equiv 123 \pmod{143}.
\end{align*}
So $f(A, B)$ is congruent to $K$ modulo $M$, and it is relatively prime to $M$.
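The arithmetic in Example 3.1.5 can be reproduced in a few lines. This Python sketch (an illustration, not the computer algorithm referred to in the example) computes the Chinese remainder idempotents $A$ and $B$ with the built-in three-argument `pow`, which accepts the exponent $-1$ for modular inverses in Python 3.8 and later.

```python
from math import gcd

def crt_idempotents(p1, p2):
    """Return (A, B) modulo p1*p2 with A = 1 mod p1, A = 0 mod p2,
    B = 0 mod p1 and B = 1 mod p2, for coprime p1 and p2."""
    m = p1 * p2
    A = (p2 * pow(p2, -1, p1)) % m
    B = (p1 * pow(p1, -1, p2)) % m
    return A, B

def f(x, y):
    """The form 2x^2 + 3xy + 6y^2 from Example 3.1.5."""
    return 2 * x * x + 3 * x * y + 6 * y * y

p1, p2 = 11, 13
M = p1 * p2
A, B = crt_idempotents(p1, p2)
k1, k2 = f(1, 0), f(0, 1)          # k1 = 2, k2 = 6
K = (A * k1 + B * k2) % M
value = f(A, B)

print(A, B, K, value, value % M, gcd(value, M))
```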
Lemma 3.1.6. Let D be an integer and suppose k is an odd integer such that
gcd(D, k) = 1.
(i) D ≡ 0, 1 (mod 4) if D is the discriminant of a binary quadratic form.
(ii) k is properly represented by a primitive form of discriminant D if and only if
D is a quadratic residue mod k.
Proof. (i) If D is the discriminant of f(x, y) = ax2 + bxy + cy2 then D = b2 − 4ac
which means D ≡ b2 (mod 4). The only squares mod 4 are 0 and 1 so D ≡ 0, 1
(mod 4).
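(ii) If $k$ is properly represented by some form $f(x, y)$ of discriminant $D$, Lemma 3.1.3 allows us to assume $f(x, y) = kx^2 + bxy + cy^2$ for $b, c \in \mathbb{Z}$. Then $D = b^2 - 4kc$ so $D \equiv b^2 \pmod k$, that is, $D$ is a quadratic residue mod $k$. On the other hand, suppose $D \equiv b^2 \pmod k$. Since $k$ is odd, we may replace $b$ by $b + k$ if necessary so that $b$ and $D$ have the same parity. Then $D \equiv 0, 1 \pmod 4$ gives $D \equiv b^2 \pmod 4$, hence $D \equiv b^2 \pmod{4k}$, that is, $D = b^2 - 4kc$ for some $c \in \mathbb{Z}$. The form $g(x, y) = kx^2 + bxy + cy^2$ properly represents $k$ and since $\gcd(D, k) = 1$, $\gcd(k, b, c) = 1$ so $g(x, y)$ is primitive.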
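The second half of this proof is effectively an algorithm: search for $b$ with $b^2 \equiv D \pmod{4k}$ and read off $c$. The Python sketch below is our own illustration under the hypotheses of the lemma ($k$ odd and positive, $\gcd(D, k) = 1$, $D \equiv 0, 1 \pmod 4$); the function name is hypothetical. It suffices to search $0 \le b < 2k$ because $b^2 \bmod 4k$ depends only on $b \bmod 2k$.

```python
from math import gcd

def form_representing(k, D):
    """Return (k, b, c) with b^2 - 4*k*c == D, or None when D is not
    a quadratic residue modulo k. Assumes k odd and positive,
    gcd(D, k) == 1 and D congruent to 0 or 1 modulo 4."""
    for b in range(2 * k):
        if (b * b - D) % (4 * k) == 0:
            return (k, b, (b * b - D) // (4 * k))
    return None

for k in (3, 7, 11):
    print(k, form_representing(k, -20))
```

For $D = -20$ this finds forms with leading coefficient 3 and 7, both of which are represented by a form of discriminant $-20$, and reports that no such form exists for $k = 11$, since $-20$ is not a square modulo 11.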
Corollary 3.1.7. Let $n \in \mathbb{Z}$ and $p$ be an odd prime not dividing $n$. Then $\left(\frac{-n}{p}\right) = 1$ if and only if $p$ is represented by a primitive form of discriminant $-4n$.
Proof. Note that $-4n$ is a quadratic residue mod $p$ $\iff \left(\frac{-4n}{p}\right) = \left(\frac{-n}{p}\right) = 1$. Apply part (ii) of the lemma.
Definition. A positive definite form ax2 + bxy + cy2 is reduced if it is primitive,
|b| ≤ a ≤ c and if either |b| = a or a = c then b ≥ 0.
There is a powerful characterization of primitive, positive definite (p.p.d.) forms
in terms of reduced forms:
Theorem 3.1.8. Every proper equivalence class of primitive, positive definite forms
contains a unique reduced form.
Proof. See [7] or [17].
Example 3.1.9. For any n ∈ N, x2+ny2 is a reduced, primitive, positive definite form
of discriminant −4n. For this reason, Corollary 3.1.7 explains one of the conditions
for p to be represented by x2 + ny2 in Theorems 1.8.7 and 2.11.3.
Lemma 3.1.10. For every reduced form $ax^2 + bxy + cy^2$ of discriminant $D < 0$,
\[ a \leq \sqrt{\frac{-D}{3}}. \]
Proof. Let $f(x, y) = ax^2 + bxy + cy^2$. Since $f(x, y)$ is reduced, $b^2 \leq a^2$ and $a \leq c$. Thus $-D = 4ac - b^2 \geq 4a^2 - a^2 = 3a^2$ which implies the result.
Definition. For a fixed D < 0, the number h(D) of equivalence classes of primitive,
positive definite forms of discriminant D is called the class number of D.
Theorem 3.1.11. For every D < 0, the class number h(D) is finite.
Proof. By Theorem 3.1.8, $h(D)$ is the number of distinct reduced forms of discriminant $D$. For a reduced form $ax^2 + bxy + cy^2$ of discriminant $D$, there are only a finite number of choices for $a$ and $b$ since $|b| \leq a \leq \sqrt{-D/3}$ by Lemma 3.1.10. Moreover, $D = b^2 - 4ac$ shows that the choices of $D$, $a$ and $b$ determine $c$. Therefore there are only a finite number of reduced forms of discriminant $D$, so $h(D)$ is finite.
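The finiteness argument doubles as a procedure for computing $h(D)$: loop over the finitely many $(a, b)$ allowed by Lemma 3.1.10, solve for $c$, and keep the triples satisfying the definition of a reduced form. A minimal Python sketch of this enumeration (our own, not the thesis's Magma code):

```python
from math import gcd, isqrt

def reduced_forms(D):
    """List all reduced primitive positive definite forms (a, b, c)
    of discriminant D < 0, where D is 0 or 1 modulo 4."""
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("D must be negative and congruent to 0 or 1 mod 4")
    forms = []
    for a in range(1, isqrt(-D // 3) + 1):        # a <= sqrt(-D/3)
        for b in range(-a, a + 1):                # |b| <= a
            if (b * b - D) % (4 * a) != 0:
                continue                          # c would not be an integer
            c = (b * b - D) // (4 * a)
            if c < a or gcd(gcd(a, abs(b)), c) != 1:
                continue                          # need a <= c and primitivity
            if (abs(b) == a or a == c) and b < 0:
                continue                          # boundary cases require b >= 0
            forms.append((a, b, c))
    return forms

def class_number(D):
    return len(reduced_forms(D))

for D in (-4, -20, -23, -56, -108):
    print(D, class_number(D), reduced_forms(D))
```

For instance, $D = -56$ yields the four forms $x^2 + 14y^2$, $2x^2 + 7y^2$ and $3x^2 \pm 2xy + 5y^2$, so $h(-56) = 4$.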
3.2 The Form Class Group
Our first goal in this section is to justify the word group in the following definition.
Definition. For a negative integer D ≡ 0, 1 (mod 4), the set of equivalence classes
of primitive, positive definite forms of discriminant D is called the form class group
for D, denoted C(D). We will sometimes abuse notation and write f(x, y) ∈ C(D)
for a single form f .
Note that |C(D)| = h(D) which is equal to the number of reduced forms of
discriminant D. To prove C(D) is a group, we need to define a law of composition
on classes of quadratic forms. Legendre realized that since each class in C(D) has
a unique representative that is reduced, the composition may be defined on reduced
forms. However, his method was cumbersome to work with, so instead we follow
Dirichlet’s method of form composition.
Lemma 3.2.1. Suppose $f$ and $g$ are p.p.d. forms of discriminant $D$, where $f(x, y) = ax^2 + bxy + cy^2$ and $g(x, y) = a'x^2 + b'xy + c'y^2$. If $\gcd\left(a, a', \frac{b + b'}{2}\right) = 1$ then there is an integer $B$, unique modulo $2aa'$, satisfying
\begin{align*}
B &\equiv b \pmod{2a} \\
B &\equiv b' \pmod{2a'} \\
B^2 &\equiv D \pmod{4aa'}.
\end{align*}
Proof. See [7].
Definition. Given two p.p.d. forms $f(x, y) = ax^2 + bxy + cy^2$ and $g(x, y) = a'x^2 + b'xy + c'y^2$ of discriminant $D$ which satisfy $\gcd\left(a, a', \frac{b + b'}{2}\right) = 1$, their Dirichlet composition is
\[ (f * g)(x, y) = aa'x^2 + Bxy + \frac{B^2 - D}{4aa'}y^2, \]
where $B$ is the unique integer modulo $2aa'$ chosen in Lemma 3.2.1.
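For small coefficients the integer $B$ of Lemma 3.2.1 can simply be found by searching the residues modulo $2aa'$. The following Python sketch is a naive illustration of the definition (not an efficient composition algorithm, and not the thesis's code); the function name is our own.

```python
from math import gcd

def dirichlet_compose(f, g):
    """Dirichlet composition of forms f = (a, b, c) and g = (a2, b2, c2)
    of equal discriminant, assuming gcd(a, a2, (b + b2)/2) = 1."""
    a, b, c = f
    a2, b2, c2 = g
    D = b * b - 4 * a * c
    if b2 * b2 - 4 * a2 * c2 != D:
        raise ValueError("forms must have the same discriminant")
    if gcd(gcd(a, a2), (b + b2) // 2) != 1:
        raise ValueError("gcd condition of Lemma 3.2.1 fails")
    # Brute-force search for B modulo 2*a*a2.
    for B in range(2 * a * a2):
        if ((B - b) % (2 * a) == 0 and (B - b2) % (2 * a2) == 0
                and (B * B - D) % (4 * a * a2) == 0):
            return (a * a2, B, (B * B - D) // (4 * a * a2))
    raise ArithmeticError("no B found; check the hypotheses")

square = dirichlet_compose((3, 2, 5), (3, 2, 5))      # discriminant -56
with_identity = dirichlet_compose((1, 0, 14), (3, 2, 5))
print(square, with_identity)
```

Composing $3x^2 + 2xy + 5y^2$ with itself gives $9x^2 + 14xy + 7y^2$, again of discriminant $-56$, while composing with the form $x^2 + 14y^2$ returns the original form unchanged.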
Lemma 3.2.2. For any primitive, positive definite forms f and g of discriminant D,
if f ∗ g is defined, it is a primitive, positive definite form of discriminant D.
Proof. Suppose $f(x, y) = ax^2 + bxy + cy^2$ and $g(x, y) = a'x^2 + b'xy + c'y^2$ satisfy the conditions of Lemma 3.2.1. Set $C = \frac{B^2 - D}{4aa'}$ and $F(x, y) = aa'x^2 + Bxy + Cy^2$. The discriminant of $F$ is $B^2 - 4aa'\left(\frac{B^2 - D}{4aa'}\right) = D$ so $F(x, y)$ is positive definite. Suppose $m$ is a number dividing all the coefficients of $F$. By Lemma 3.1.3, $f$ and $g$ are properly equivalent to the quadratic forms $ax^2 + Bxy + a'Cy^2$ and $a'x^2 + Bxy + aCy^2$, respectively. Notice that
\begin{align*}
f(x, y)g(x, y) &\sim (ax^2 + Bxy + a'Cy^2)(a'x^2 + Bxy + aCy^2) \\
&= aa'x^4 + aBx^3y + a^2Cx^2y^2 + a'Bx^3y + B^2x^2y^2 + aBCxy^3 \\
&\quad + (a')^2Cx^2y^2 + a'BCxy^3 + aa'C^2y^4 \\
&= aa'(x^4 + C^2y^4) + B(ax^3y + a'x^3y + Bx^2y^2 + aCxy^3 + a'Cxy^3) \\
&\quad + C(a^2x^2y^2 + (a')^2x^2y^2) \\
&= aa'(x^2 - Cy^2)^2 + B(x^2 - Cy^2)(axy + a'xy + By^2) \\
&\quad + C(axy + a'xy + By^2)^2 \\
&= aa'z^2 + Bzw + Cw^2,
\end{align*}
where $z = x^2 - Cy^2$ and $w = axy + a'xy + By^2$.
So the product f(x, y)g(x, y) is properly equivalent to F (x, y). This means m divides
every number represented by f(x, y)g(x, y) but by Proposition 3.1.4, f and g represent
some numbers relatively prime to m. Therefore m = 1 so F (x, y) is primitive.
Definition. Let $D \equiv 0, 1 \pmod 4$ be a negative integer. The principal form of discriminant $D$ is defined to be
\[ F_D(x, y) = \begin{cases} x^2 - \frac{D}{4}y^2, & D \equiv 0 \pmod 4 \\ x^2 + xy + \frac{1 - D}{4}y^2, & D \equiv 1 \pmod 4. \end{cases} \]
Notice that when $D = -4n$ for an integer $n \geq 1$, the principal form is $x^2 + ny^2$.
Theorem 3.2.3. Let D ≡ 0, 1 (mod 4) be a negative integer. The set C(D) is a finite
abelian group under Dirichlet composition. Moreover, the identity element is the class
containing the principal form and the inverse of the class containing ax2 + bxy + cy2
is the class containing ax2 − bxy + cy2.
Proof. First, Theorem 3.1.11 says that |C(D)| = h(D) is finite. If f(x, y) = ax2 +
bxy + cy2 and g(x, y) are p.p.d. forms of discriminant D then Proposition 3.1.4(ii)
shows we can replace g with a properly equivalent form g′(x, y) = a′x2 + b′xy + c′y2
with gcd(a, a′) = 1. Therefore Dirichlet composition is well-defined on classes of p.p.d.
quadratic forms. Moreover, Dirichlet composition is clearly abelian, so it suffices to
check the identity and inverses.
Let $f(x, y) = ax^2 + bxy + cy^2 \in C(D)$. Note that for the principal form $F_D(x, y)$, $a' = 1$ so $\gcd(a, a') = 1$ and Dirichlet composition is well-defined for $f$ and $F_D$. The integer $B$ that satisfies Lemma 3.2.1 is precisely $b$, so
\begin{align*}
F_D * f(x, y) &= aa'x^2 + bxy + \frac{b^2 - D}{4aa'}y^2 \\
&= ax^2 + bxy + \frac{4ac}{4a}y^2 \\
&= ax^2 + bxy + cy^2 = f(x, y).
\end{align*}
Hence $F_D$ is the identity.
Next, note that Dirichlet composition is not defined on the forms $f(x, y)$ and $f'(x, y) = ax^2 - bxy + cy^2$ but by proper equivalence we can replace $f'(x, y)$ with $g(x, y) = f'(-y, x) = cx^2 + bxy + ay^2$, since the transformation matrix $S = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ has determinant 1. Since $f(x, y)$ is primitive, $\gcd(a, b, c) = 1$ so $f * g(x, y)$ is defined. Again, $B = b$ satisfies Lemma 3.2.1 so
\[ f * g(x, y) = acx^2 + bxy + \frac{b^2 - D}{4ac}y^2 = acx^2 + bxy + y^2. \]
To finish, we show that $F(x, y) = acx^2 + bxy + y^2$ is properly equivalent to $F_D(x, y)$. Using the matrix $S$ again, $F(x, y)$ is properly equivalent to $F(-y, x) = x^2 - bxy + acy^2$ and by Example 3.1.2 we can replace $F(-y, x)$ with $x^2 + (-b + 2n)xy + (n^2 - bn + ac)y^2$ for any $n \in \mathbb{Z}$. If $D \equiv 0 \pmod 4$, $b$ must be even so let $n = \frac{b}{2}$. Then
\begin{align*}
x^2 + (-b + 2n)xy + (n^2 - bn + ac)y^2 &= x^2 + (-b + b)xy + \left(\frac{b^2}{4} - \frac{b^2}{2} + ac\right)y^2 \\
&= x^2 + \left(\frac{-b^2 + 4ac}{4}\right)y^2 \\
&= x^2 - \frac{D}{4}y^2 = F_D(x, y).
\end{align*}
On the other hand, if $D \equiv 1 \pmod 4$, $b$ is odd so let $n = \frac{b + 1}{2}$. Then
\begin{align*}
x^2 + (-b + 2n)xy + (n^2 - bn + ac)y^2 &= x^2 + (-b + b + 1)xy + \left(\frac{b^2 + 2b + 1}{4} - \frac{b^2 + b}{2} + ac\right)y^2 \\
&= x^2 + xy + \left(\frac{1 - b^2 + 4ac}{4}\right)y^2 \\
&= x^2 + xy + \frac{1 - D}{4}y^2 = F_D(x, y).
\end{align*}
In both cases, F (x, y) is properly equivalent to the principal form so the inverse of
the class containing ax2 + bxy + cy2 is the class containing ax2 − bxy + cy2. This
completes the proof that C(D) is a finite abelian group.
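The last step of the proof, moving from $acx^2 + bxy + y^2$ to the principal form, is an instance of the usual reduction procedure: alternately shift $x \mapsto x + ny$ (Example 3.1.2) and apply $S$. The Python sketch below implements that standard reduction as an illustration; it is not code from the thesis, and the function names are our own.

```python
def reduce_form(form):
    """Return the reduced form properly equivalent to the positive
    definite form (a, b, c), using shifts x -> x + n*y and the swap
    (a, b, c) -> (c, -b, a)."""
    a, b, c = form
    while True:
        if not (-a < b <= a):
            n = (a - b) // (2 * a)        # puts b + 2*a*n into (-a, a]
            c = a * n * n + b * n + c
            b = b + 2 * a * n
        if a > c:
            a, b, c = c, -b, a            # apply S, then normalize again
            continue
        if a == c and b < 0:
            b = -b                        # boundary case requires b >= 0
        return (a, b, c)

def principal_form(D):
    """Principal form of discriminant D < 0, D = 0 or 1 mod 4."""
    if D % 4 == 0:
        return (1, 0, -D // 4)
    return (1, 1, (1 - D) // 4)

def product_with_opposite(form):
    """The form a*c*x^2 + b*x*y + y^2 obtained in the proof above."""
    a, b, c = form
    return (a * c, b, 1)

for f in [(3, 2, 5), (2, 0, 7), (2, 1, 3)]:
    D = f[1] ** 2 - 4 * f[0] * f[2]
    print(f, reduce_form(product_with_opposite(f)), principal_form(D))
```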
We now return to a statement in Section 1.8 regarding the relationship between
C(dK) and the ideal class group C(OK). In fact, we will prove a more general relation
between C(D) and C(O) where O is an order in an imaginary quadratic field.
Theorem 3.2.4. Let K be an imaginary quadratic number field, let D ≡ 0, 1 (mod 4)
be a negative integer and let O be the order of discriminant D in K.
(1) If $f(x, y) = ax^2 + bxy + cy^2$ is a p.p.d. form of discriminant $D$ then $\left[a, \frac{-b + \sqrt{D}}{2}\right]$ is a proper ideal of $\mathcal{O}$.
(2) There is an isomorphism $\Psi : C(D) \to C(\mathcal{O})$ defined by $f(x, y) \mapsto \left[a, \frac{-b + \sqrt{D}}{2}\right]$ and therefore $|C(\mathcal{O})| = h(D)$.
(3) A positive integer $m$ is represented by a form $f(x, y) \in C(D)$ if and only if $m = N(\mathfrak{a})$ for some proper ideal $\mathfrak{a} \in \Psi(f(x, y))$.
Proof. We will prove (1) and (2). The details of (3) can be found in [7].
(1) Let $f(x, y) = ax^2 + bxy + cy^2$ be p.p.d. of discriminant $D$. Then $\alpha = \frac{-b + \sqrt{D}}{2a}$ is a root of the polynomial $f(x, 1) = ax^2 + bx + c$ so by Lemma 1.9.5, $a[1, \alpha]$ is a proper ideal of the order $[1, a\alpha]$. Notice that $a[1, \alpha] = [a, a\alpha] = \left[a, \frac{-b + \sqrt{D}}{2}\right]$ so it suffices to show $[1, a\alpha] = \mathcal{O}$. Let $f$ be the conductor of $\mathcal{O}$. Then we showed in Section 1.9 that $D = f^2 d_K$ where $d_K$ is the field discriminant, so
\begin{align*}
a\alpha = \frac{-b + \sqrt{D}}{2} &= \frac{-b + f\sqrt{d_K}}{2} \\
&= \frac{-b - fd_K}{2} + f\,\frac{d_K + \sqrt{d_K}}{2} \\
&= \frac{-b - fd_K}{2} + fw_K
\end{align*}
where $w_K$ is defined as in Section 1.9. Since $D = b^2 - 4ac = f^2 d_K$, $fd_K$ and $b$ have the same parity which means that $\frac{-b - fd_K}{2}$ is an integer. Therefore $[1, a\alpha] = [1, fw_K]$ by the above work and since every order is determined by its conductor, this shows $[1, a\alpha] = \mathcal{O}$.
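(2) Let $f(x, y)$ and $g(x, y)$ be p.p.d. forms of discriminant $D$. Let $\alpha, \beta \in \mathbb{C}^*$ be the roots of $f(x, 1)$ and $g(x, 1)$, respectively, with positive imaginary parts. First, we show
\begin{align*}
f(x, y), g(x, y) \text{ are properly equivalent} &\iff \beta = \frac{a\alpha + b}{c\alpha + d} \text{ for } a, b, c, d \in \mathbb{Z},\ ad - bc = 1 \\
&\iff [1, \alpha] = \lambda[1, \beta] \text{ for some } \lambda \in K^*.
\end{align*}
Suppose $f(\mathbf{x}) = g(A\mathbf{x})$ where $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(\mathbb{Z})$. Then since $\alpha$ is a root of $f(x, 1)$,
\[ 0 = f(\alpha, 1) = g(a\alpha + b, c\alpha + d) = (c\alpha + d)^2\, g\!\left(\frac{a\alpha + b}{c\alpha + d}, 1\right). \]
Thus $\frac{a\alpha + b}{c\alpha + d}$ is a root of $g(x, 1)$ and it is easy to verify that it has positive imaginary part, so $\beta = \frac{a\alpha + b}{c\alpha + d}$. On the other hand, the equation above shows that if $\beta = \frac{a\alpha + b}{c\alpha + d}$ for $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ in $SL_2(\mathbb{Z})$ then $f(x, 1)$ and $g(A(x, 1))$ have the same root. It follows that $f(\mathbf{x}) = g(A\mathbf{x})$ so the forms are properly equivalent. This proves the first of the equivalences above.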
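Next, suppose $\beta = \frac{a\alpha + b}{c\alpha + d}$ where $ad - bc = 1$. Then $c\alpha + d \in K^*$ so set $\lambda = c\alpha + d$. This implies
\[ \lambda[1, \beta] = (c\alpha + d)\left[1, \frac{a\alpha + b}{c\alpha + d}\right] = [c\alpha + d, a\alpha + b] \]
but since $ad - bc = 1$, $[c\alpha + d, a\alpha + b] = [1, \alpha]$. On the other hand, if $[1, \alpha] = \lambda[1, \beta] = [\lambda, \lambda\beta]$ for some $\lambda \in K^*$ then
\[ \lambda\beta = e\alpha + f \quad \text{and} \quad \lambda = g\alpha + h \]
for some $e, f, g, h$ such that $\begin{pmatrix} e & f \\ g & h \end{pmatrix} \in GL_2(\mathbb{Z})$. Then $\beta = \frac{e\alpha + f}{\lambda} = \frac{e\alpha + f}{g\alpha + h}$ and since $\alpha$ and $\beta$ both have positive imaginary parts, we must have $eh - fg = 1$, that is $\begin{pmatrix} e & f \\ g & h \end{pmatrix}$ is in $SL_2(\mathbb{Z})$. Therefore $f$ and $g$ are properly equivalent if and only if $[1, \alpha] = \lambda[1, \beta]$ for some $\lambda \in K^*$. This establishes an injection
\begin{align*}
\Psi : C(D) &\longrightarrow C(\mathcal{O}) \\
f(x, y) &\longmapsto a[1, \alpha] = \left[a, \frac{-b + \sqrt{D}}{2}\right].
\end{align*}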
We next show that $\Psi$ is surjective. Let $\mathfrak{a}$ be a fractional $\mathcal{O}$-ideal which, by the proof of Proposition 1.9.6, can be written $\mathfrak{a} = [\alpha, \beta]$ for some $\alpha, \beta \in K$. Without loss of generality assume $\frac{\beta}{\alpha}$ has positive imaginary part. Set $\gamma = \frac{\beta}{\alpha}$ and let $ax^2 + bx + c$ be the minimal polynomial of $\gamma$ over $\mathbb{Q}$, where we rescale the coefficients to ensure $\gcd(a, b, c) = 1$ and $a > 0$. Let $f(x, y) = ax^2 + bxy + cy^2$ which is then a p.p.d. quadratic form. We next check that $f(x, y)$ has discriminant $D = \mathrm{disc}(\mathcal{O})$. Writing $\mathcal{O} = [1, a\gamma]$ we compute the discriminant by
\[ D = \begin{vmatrix} 1 & a\gamma \\ 1 & a\bar{\gamma} \end{vmatrix}^2 = a^2(\gamma - \bar{\gamma})^2. \]
The roots of $ax^2 + bx + c$ are $\gamma$ and $\bar{\gamma}$, which are given by the quadratic formula:
\[ \gamma = \frac{-b + \sqrt{b^2 - 4ac}}{2a} \quad \text{and} \quad \bar{\gamma} = \frac{-b - \sqrt{b^2 - 4ac}}{2a}. \]
So $\gamma - \bar{\gamma} = \frac{\sqrt{b^2 - 4ac}}{a}$ and hence $D = a^2\left(\frac{\sqrt{b^2 - 4ac}}{a}\right)^2 = b^2 - 4ac$. This is precisely the discriminant of $f(x, y)$. Therefore $f(x, y)$ is a primitive, positive definite form of discriminant $D$ which maps to $a[1, \gamma] \sim \alpha[1, \gamma] = [\alpha, \beta] = \mathfrak{a}$ in $C(\mathcal{O})$. Hence $\Psi$ is surjective.
Now we show that $\Psi$ preserves the group structure of $C(D)$. If $f$ and $g$ are p.p.d. forms of discriminant $D$, denote their Dirichlet composition by $F(x, y)$. Since $B \equiv b \pmod{2a}$ and $B \equiv b' \pmod{2a'}$ by Lemma 3.2.1, we can write the images of $f$, $g$ and $F$ under $\Psi$ as:
\begin{align*}
\Psi([f]) &= \left[a, \frac{-b + f\sqrt{d_K}}{2}\right] = [a, \Delta]; \\
\Psi([g]) &= \left[a', \frac{-b' + f\sqrt{d_K}}{2}\right] = [a', \Delta]; \\
\text{and } \Psi([F]) &= \left[aa', \frac{-B + f\sqrt{d_K}}{2}\right] = [aa', \Delta] \quad \text{where } \Delta = \frac{-B + f\sqrt{d_K}}{2}.
\end{align*}
We want to show $[a, \Delta][a', \Delta] = [aa', \Delta]$ in $C(\mathcal{O})$. Note that the conditions on $B$ from Lemma 3.2.1 give us $\Delta^2 \equiv -B\Delta \bmod aa'$ so we have
\[ [a, \Delta][a', \Delta] = [aa', a\Delta, a'\Delta, \Delta^2] = [aa', a\Delta, a'\Delta, -B\Delta]. \]
Since $f$, $g$ and $F$ are all primitive, the conditions on $B$ also force $\gcd(a, a', B) = 1$ so $[a, \Delta][a', \Delta] = [aa', a\Delta, a'\Delta, -B\Delta] = [aa', \Delta]$ as desired. Hence $\Psi : C(D) \to C(\mathcal{O})$ is an isomorphism.
3.3 n-Fermat Primes
In the final section of this text, we describe our approach to the following question:
Question 3.3.1. If p = x2 + ny2 is prime, when is q = y2 + nx2 also prime?
The following definitions are not standard in the literature. We have introduced
them in order to facilitate our discussion of Theorem 2.11.3 and Question 3.3.1.
Definition. Let $n \geq 1$ be an integer. A number of the form $x^2 + ny^2$, where $x, y \in \mathbb{Z}$, is called an n-Fermat number. If $p = x^2 + ny^2$ is prime, $p$ is said to be an n-Fermat prime.
Definition. An n-Fermat prime $p = x^2 + ny^2$ is a symmetric n-Fermat prime provided $q = y^2 + nx^2$ is also prime.
Question 3.3.1 can therefore be restated: When is an n-Fermat prime symmetric?
The question is stated rather broadly for a reason, as there are several ways we could
answer this.
In this language, Theorems 2.11.3 and 2.11.5 together say the following: Let $f(x)$ be the minimal polynomial of the j-invariant $j(\mathcal{O})$ for the order $\mathcal{O} = \mathbb{Z}[\sqrt{-n}]$ in $\mathbb{Q}(\sqrt{-n})$. Then a prime $p$ not dividing $\mathrm{disc}(f)$ is an n-Fermat prime if and only if $\left(\frac{-n}{p}\right) = 1$ and $f(x) \equiv 0 \pmod p$ has an integer solution. In other words, n-Fermat primes are characterized by congruence conditions in all but finitely many cases. The
best possible situation would therefore be a positive answer to the following question:
Question 3.3.2. For an integer n ≥ 1, are there congruence conditions that deter-
mine when an n-Fermat prime is a symmetric n-Fermat prime?
There is fortunately a case when the answer to Question 3.3.1 is quite trivial.
When n = 1, an n-Fermat prime is always symmetric. This is certainly the only case
when the ratio of symmetric n-Fermat primes to total n-Fermat primes is 1, as the
next example shows.
Example 3.3.3. Let $n = 2$. The first few symmetric 2-Fermat primes are: $p = 3$, 11, 19, 43, 59, 67, 83, 107, 139, 163, 179, . . . For small primes it appears that $p$ is a symmetric 2-Fermat prime if and only if $p \equiv 3 \pmod 8$. However, 131 is a 2-Fermat prime since it can be written $131 = 9^2 + 2 \cdot 5^2$, but $5^2 + 2 \cdot 9^2 = 187 = 11 \cdot 17$ is not prime. Therefore the condition $p \equiv 3 \pmod 8$ breaks early on.
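The list in Example 3.3.3 and the counterexample 131 can be recovered by a direct search. The sketch below is in Python rather than Magma and is not the code of Appendix A.4; in particular, restricting to solutions with $x, y \geq 1$ is a convention chosen here for illustration.

```python
from math import isqrt

def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    if m < 2:
        return False
    return all(m % d for d in range(2, isqrt(m) + 1))

def n_fermat_primes(n, bound):
    """Map each prime p = x^2 + n*y^2 <= bound (with x, y >= 1) to the
    list of swapped values q = y^2 + n*x^2 over its representations."""
    found = {}
    x = 1
    while x * x + n <= bound:
        y = 1
        while x * x + n * y * y <= bound:
            p = x * x + n * y * y
            if is_prime(p):
                found.setdefault(p, []).append(y * y + n * x * x)
            y += 1
        x += 1
    return found

table = n_fermat_primes(2, 180)
symmetric = sorted(p for p, qs in table.items()
                   if any(is_prime(q) for q in qs))
not_symmetric_3_mod_8 = sorted(p for p in table
                               if p % 8 == 3 and p not in symmetric)
print(symmetric)
print(not_symmetric_3_mod_8, table[131])
```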
The function getSymmRatio(n,M) generates a list (n, symmRat, expSymm, N, S), whose entries are
n          a positive integer
N          the number of n-Fermat primes $p = x^2 + ny^2$ whose solutions $(x, y)$ satisfy $x, y \leq M$
S          the number of these that are symmetric n-Fermat primes
symmRat    the ratio of symmetric n-Fermat primes to total n-Fermat primes that satisfy $x, y \leq M$, i.e. symmRat = S/N
expSymm    the ratio of observed symmetric n-Fermat primes to the expected number of symmetric n-Fermat primes satisfying $x, y \leq M$, as predicted by the Prime Number Theorem.
(See Appendix A.4 for further documentation; a description of our Prime Number Theorem heuristic for expSymm follows in the next paragraph.) Now consider the data generated for $n = 2$ and $M = 1000$:
> getSymmRatio(2,1000);
[ 2.0, 0.1142830989, 0.9586592239, 2501656.0, 285897.0 ]
Empirically, it appears that the ratio of symmetric 2-Fermat primes to total 2-Fermat primes is about 0.1143; that is, about 11.43% of 2-Fermat primes are symmetric. On the other hand, the data shows that the ratio of the number of symmetric 2-Fermat primes to the expected number of symmetric 2-Fermat primes, under the assumptions of our Prime Number Theorem heuristic below, is about 0.9587. That is, there are slightly fewer symmetric 2-Fermat primes than we expect. Something interesting is going on here.
For an integer $n \geq 1$, let $\pi_{\mathrm{sym},n}(M)$ denote the number of primes $y^2 + nx^2$ such that $x^2 + ny^2$ is prime and $x, y \leq M$. Notice that if $x^2 + ny^2$ is prime and $x$ and $y$ are both relatively prime to $n$, then $y^2 + nx^2$ is necessarily odd. Of course a number has twice the probability of being prime given that it is odd (this is accounted for in the Magma code in Appendix A.4), so our Prime Number Theorem heuristic says that for each $n \geq 1$, there is a nonnegative real number $\alpha_n$ such that
\[ \pi_{\mathrm{sym},n}(M) \sim 2\alpha_n \sum_{q \leq M} \frac{1}{\log q}, \]
where $\log$ is the natural logarithm and the sum is over n-Fermat numbers $q = y^2 + nx^2$, $x, y \leq M$, for which $x^2 + ny^2$ is prime. For example, the data above shows that $\alpha_2$ is close to 0.9328. We posit several conjectures related to $\alpha_n$ and the asymptotic behavior of $\pi_{\mathrm{sym},n}(M)$ below, along with empirical results that lead us to believe they might hold.
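To make the heuristic concrete, the following Python sketch computes the counts $N$ and $S$, the ratio $S/N$, and a crude estimate $S / \left(2\sum 1/\log q\right)$ of $\alpha_n$ for small $M$. It is only an illustration of the formula above: the function name, the restriction to $1 \leq x, y \leq M$ and the way pairs are counted are assumptions made here, so the output should not be expected to reproduce the figures returned by the Magma function getSymmRatio.

```python
from math import isqrt, log

def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    if m < 2:
        return False
    return all(m % d for d in range(2, isqrt(m) + 1))

def symmetry_statistics(n, M):
    """Return (N, S, S/N, alpha_estimate) over pairs 1 <= x, y <= M.

    N counts pairs with x^2 + n*y^2 prime, S counts those for which
    y^2 + n*x^2 is also prime, and alpha_estimate divides S by twice
    the sum of 1/log(q) over the swapped values q (assumed convention)."""
    N = S = 0
    weight = 0.0
    for x in range(1, M + 1):
        for y in range(1, M + 1):
            if not is_prime(x * x + n * y * y):
                continue
            N += 1
            q = y * y + n * x * x          # q >= n + 1 >= 2, so log(q) > 0
            weight += 1.0 / log(q)
            if is_prime(q):
                S += 1
    if N == 0:
        return (0, 0, None, None)
    return (N, S, S / N, S / (2.0 * weight))

print(symmetry_statistics(1, 20))
print(symmetry_statistics(2, 20))
```

For $n = 1$ the swapped value equals the original prime, so the ratio $S/N$ is exactly 1, in line with the remark that every 1-Fermat prime is symmetric.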
Conjecture. For all n ≥ 1, αn > 0.
Theorem 2.11.3 characterizes primes of the form $x^2 + ny^2$ up to solvability conditions of $f_n(x) \equiv 0 \pmod p$. Moreover, Cox [7] gives a general formula for the Dirichlet density $\delta(f)$ of primes represented by a p.p.d. quadratic form $f$ of discriminant $D < 0$:
\[ \delta(f) = \begin{cases} \frac{1}{2h(D)} & \text{if } f \text{ is properly equivalent to its opposite} \\ \frac{1}{h(D)} & \text{otherwise.} \end{cases} \]
Therefore there are infinitely many n-Fermat primes for any $n \geq 1$. In other words, the sum $\sum_{q \leq M} \frac{1}{\log q}$ over n-Fermat numbers $q$ obtained by switching solutions for n-Fermat primes diverges as $M \to \infty$, so Conjecture 3.3 would imply that there are infinitely many symmetric n-Fermat primes for every $n \geq 1$. To test this conjecture,
we turned Magma loose on some computations with large search spaces, including
> bigData := getSymmRatios(40000,1000);
(The reader should be warned that the above computation took the better part
of a weekend, so proceed with caution when running some of the code contained in
Appendix A.4.) Through the first 40,000 values for n, and with search parameters
$x, y \le 1{,}000$, the conjecture above is seen to hold. There were several other interesting
observations made, which are discussed via the next two conjectures.
Conjecture. The average value of $\alpha_n$ over all $n \ge 1$ is equal to 1.
Informally, this conjecture means that, on average, n-Fermat primes are about as
likely to be symmetric as the Prime Number Theorem predicts. This is supported by
the statistical analysis of the list above:
> averageRatio(bigData);
> stdDevRatio(bigData);
This describes a global property of the natural numbers, which reinforces the
predictions of the Prime Number Theorem. This shouldn't be a surprise, as the PNT
makes a strong, global statement about the natural numbers and subsets thereof.
However, we know from experience that the integers often behave more erratically
from a local perspective. The function getInterestingPrimes returns the information
from getSymmRatio on the values of n such that $\alpha_n$ exceeds a certain threshold r.
For example, there are a handful of numbers n in the first 40,000 such that $\alpha_n > 2$:
> getInterestingPrimes(40000,1000,2);
{@
[ 2277.0, 0.2115438844, 2.038176783, 56740.0, 12003.0 ],
[ 12699.0, 0.1919496912, 2.018037461, 44048.0, 8455.0 ],
[ 13629.0, 0.1934944899, 2.042425528, 28130.0, 5443.0 ],
[ 14540.0, 0.1919620775, 2.030119412, 24260.0, 4657.0 ],
[ 15091.0, 0.1888063168, 2.002682571, 64590.0, 12195.0 ],
[ 16615.0, 0.1901929816, 2.025452899, 81044.0, 15414.0 ],
[ 22576.0, 0.1899758230, 2.051227164, 43016.0, 8172.0 ],
[ 24089.0, 0.1846163922, 2.000292558, 30539.0, 5638.0 ],
[ 27250.0, 0.1843967171, 2.011252308, 31801.0, 5864.0 ],
[ 29127.0, 0.1883313546, 2.059048767, 41273.0, 7773.0 ],
[ 29798.0, 0.1849947634, 2.024235890, 21006.0, 3886.0 ],
[ 31927.0, 0.1852485929, 2.034336859, 85280.0, 15798.0 ],
[ 33060.0, 0.1826687458, 2.007948420, 13467.0, 2460.0 ],
[ 34159.0, 0.1816352201, 2.002051748, 71550.0, 12996.0 ],
[ 35814.0, 0.1822754491, 2.010832673, 16700.0, 3044.0 ],
[ 36743.0, 0.1834493426, 2.026459979, 51720.0, 9488.0 ]
@}
These n have the apparent property that there are more than twice as many
symmetric n-Fermat primes as expected. The function getInterestingPrimes2
returns similar data for n values such that $\alpha_n$ is less than a threshold r. In the future
we hope to be able to discern why certain numbers have higher or lower densities of
symmetric n-Fermat primes than predicted, but if one is to believe that the values of
$\alpha_n$ follow a normal distribution, then such outliers are to be expected in larger and
larger data sets.
Conjecture. The set of $\alpha_n$ is bounded. That is, there are positive constants $\varepsilon$ and
$M$ such that for all n, $\varepsilon \le \alpha_n \le M$.
This conjecture is offered solely based on the observations made for large parameter
searches for symmetric n-Fermat primes. It appears so far that $0.4 \le \alpha_n \le 2.1$.
Finally, a question lingering on the edge of this discussion is the following.
Question 3.3.4. If p is an n-Fermat prime, is there an algorithm for finding solutions
$x, y \in \mathbb{Z}$ to $p = x^2 + ny^2$? And if so, how many solutions (x, y) are there?
The authors in [8] note that Question 3.3.4 is unsolved and it would be difficult at this
time to implement a method of solving $p = x^2 + ny^2$ even for small n. However, there
is clear motivation for answering such a question, as there are important implications
for the theory of quadratic partitions and cryptography [8].
In a related sense, the characterization (Example 1.3.7) of primes of the form
$x^2 + y^2$, that is 1-Fermat primes, forms the basis of a primality test discovered by
Euler: $m = x^2 + y^2$ has a single solution (x, y) in positive integers, up to order, when m is prime.
In the future, the complexity of n-Fermat primes and symmetric n-Fermat primes
may contribute to the rise of more secure cryptosystems and faster primality test
algorithms.
Bibliography
[1] Artin, Emil. Collected Papers, edited by Lang, S. and Tate, J. Addison-Wesley
Publishing, Reading (1965).
[2] Artin, Emil and Tate, John. Class Field Theory. W.A. Benjamin, New York
(1968).
[3] Borevich, Z.I. and Shafarevich, I.R. Number Theory. Translated from Russian.
Academic Press, New York (1966).
[4] Butler, Gregory and McKay, John. The transitive groups of degree up to
eleven. Communications in Algebra, 11(8) (1983). pp. 863-911.
[5] Cho, Bumkyu. Primes of the form $x^2 + ny^2$ with conditions $x \equiv 1 \bmod N$,
$y \equiv 0 \bmod N$. Journal of Number Theory, 130 (2010), pp. 852-861.
[6] Conrad, Keith. The different ideal. http://www.math.uconn.edu/~kconrad/blurbs/
(2009).
[7] Cox, David A. Primes of the Form $x^2 + ny^2$: Fermat, Class Field Theory, and
Complex Multiplication, 2nd ed. John Wiley & Sons, Hoboken (2013).
[8] Cusick, T.W., Ding, C. and Renvall, A. Stream Ciphers and Number Theory.
Elsevier Science B.V., Amsterdam (1993).
[9] Dirichlet, Peter Gustav Lejeune. There are infinitely many prime numbers in
all arithmetic progressions with first term and difference coprime. Translated
from German. arXiv:0808.1408v1 [math.HO] (2008).
[10] Dummit, David S. and Foote, Richard M. Abstract Algebra, 3rd ed. Wiley &
Sons, Hoboken (2004).
[11] Golod, E.S. and Shafarevich, I.R. On the class-field tower. Translated from
Russian. Izvestiya Akademii Nauk 28, S.S.S.R. (1964). pp. 261-272.
[12] Gouvêa, Fernando Q. p-adic Numbers: An Introduction, 2nd ed. Springer-Verlag,
New York (1997).
[13] Hilbert, David. The Theory of Algebraic Number Fields. Translated from German.
Springer-Verlag, New York (1998).
[14] Hungerford, Thomas W. Algebra. Springer-Verlag, New York (1974).
[15] Janusz, Gerald J. Algebraic Number Fields. Academic Press, New York (1973).
[16] Kolmogorov, A.N. and Fomin, S.V. Introductory Real Analysis. Dover, New
York (1970).
[17] Lee, Holden. Algebraic Number Theory, Class Field Theory and Complex
Multiplication. http://web.mit.edu/~holden1/ (2012).
[18] Marcus, Daniel A. Number Fields. Springer-Verlag, New York (1977).
[19] Milne, J.S. Algebraic Number Theory, v3.05. http://www.jmilne.org/math/
(2013).
[20] Milne, J.S. Class Field Theory, v4.02. http://www.jmilne.org/math/ (2013).
[21] Schulze, Volker. “Die Primteilerdichte von ganzzahligen Polynomen, I”.
Journal für die reine und angewandte Mathematik, 253 (1972). pp. 175-185.
[22] Schoof, René. Class numbers of real cyclotomic fields of prime conductor.
Mathematics of Computation, vol. 72, no. 242 (2002). pp. 913-937. Published electronically:
http://www.ams.org/journals/mcom/2003-72-242/S0025-5718-02-01432-1/S0025-5718-02-01432-1.pdf.
[23] Serre, Jean-Pierre. A Course in Arithmetic. Springer-Verlag, New York (1973).
[24] Stein, William. A Brief Introduction to Classical and Adelic Algebraic Number
Theory. http://modular.math.washington.edu/129/ant/ (2004).
[25] Stein, William. Introduction to Algebraic Number Theory.
http://modular.math.washington.edu/129-05/notes/129.pdf (2005).
[26] Stevenhagen, P. and Lenstra, H.W., Jr. Chebotarev and his Density Theorem.
The Mathematical Intelligencer, vol. 18, no. 2 (1996). pp. 26-37.
Appendix A: Appendix
A.1 The Four Squares Theorem
We discuss a beautiful application of Minkowski’s theorem from Section 1.7. The four
squares theorem is a famous result in number theory which was proven by Lagrange
in 1770, well over 100 years before Minkowski’s theorem was discovered. Here we
provide a neat proof of the four squares theorem using Minkowski’s geometry of
numbers arguments.
Theorem A.1.1 (Four Squares). Every positive integer is the sum of the squares of
four integers.
Proof. It suffices to prove this for primes p, since
$$(a^2 + b^2 + c^2 + d^2)(e^2 + f^2 + g^2 + h^2) = (ae + bf + cg + dh)^2 + (af - be + ch - dg)^2 + (ag - ce + df - bh)^2 + (ah - de + bg - cf)^2.$$
(This is due to Euler.) Also note that $2 = 1^2 + 1^2 + 0^2 + 0^2$ so we may assume p is an
odd prime. Consider the congruence
$$x^2 + y^2 + 1 \equiv 0 \pmod p.$$
As x runs through $0, 1, \ldots, p - 1$, $x^2$ takes on exactly $\frac{p+1}{2}$ distinct values mod p.
Similarly, $-1 - y^2$ takes on $\frac{p+1}{2}$ distinct values, so together $x^2$ and $-1 - y^2$ take on
$p + 1$ values. Since there are only p residue classes mod p, one of them must be shared.
This shows $x^2 + y^2 + 1 \equiv 0 \pmod p$ has a solution in integers.
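The counting argument above only shows that a solution exists. For small primes one can be found by exhaustive search, as in the following Magma sketch; the function name findXY is hypothetical and the search is deliberately naive.

```magma
// Hypothetical helper: brute-force search for a solution of
// x^2 + y^2 + 1 = 0 (mod p), with p an odd prime.
findXY := function(p)
    for x in [0..p-1] do
        for y in [0..p-1] do
            if (x^2 + y^2 + 1) mod p eq 0 then
                return [x, y];
            end if;
        end for;
    end for;
    // Not reached when p is prime, by the counting argument above.
    return [];
end function;

findXY(7);   // [ 2, 3 ], since 4 + 9 + 1 = 14
```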
Fix one of these solutions, say (x, y), and consider the lattice $\Lambda \subset \mathbb{Z}^4$ consisting
of (a, b, c, d) such that
$$c \equiv ax + by \quad \text{and} \quad d \equiv bx - ay \pmod p.$$
Then $\mathbb{Z}^4 \supset \Lambda \supset p\mathbb{Z}^4$ and $\Lambda/p\mathbb{Z}^4$ is a two-dimensional subspace of $\mathbb{F}_p^4$ since once we
pick a and b, the c and d are determined. Thus $\Lambda$ has index $p^2$ in $\mathbb{Z}^4$ so $\mu(D) = p^2$ for
D a fundamental parallelepiped for $\Lambda$. Let T be a closed ball about the origin with
radius r. Then $\mu(T) = \frac{1}{2}\pi^2 r^4$ so we may choose r such that
$$2p > r^2 > 1.9p.$$
This gives us $\mu(T) > 16\mu(D)$ so by Minkowski's theorem there exists a nonzero point
(a, b, c, d) in $T \cap (\Lambda \setminus \{0\})$. This means
$$\begin{aligned} a^2 + b^2 + c^2 + d^2 &\equiv a^2 + b^2 + (ax + by)^2 + (bx - ay)^2 \\ &\equiv a^2 + b^2 + a^2x^2 + 2abxy + b^2y^2 + b^2x^2 - 2abxy + a^2y^2 \\ &\equiv a^2(1 + x^2 + y^2) + b^2(1 + x^2 + y^2) \\ &\equiv 0 \pmod p. \end{aligned}$$
Moreover, since $(a, b, c, d) \in T$ we have
$$a^2 + b^2 + c^2 + d^2 \le r^2 < 2p.$$
But since $a^2 + b^2 + c^2 + d^2$ is a positive integer divisible by p and less than 2p, $p = a^2 + b^2 + c^2 + d^2$.
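The volume inequality used in the proof can be checked directly. The estimate below uses only the volume of the ball T, the covolume $p^2$ of the lattice and the choice $r^2 > 1.9p$; the numerical value of $\pi^2$ is rounded.

```latex
\mu(T) = \tfrac{1}{2}\pi^2 r^4
       > \tfrac{1}{2}\pi^2 (1.9p)^2
       = 1.805\,\pi^2 p^2
       \approx 17.81\,p^2
       > 16\,p^2 = 2^4 \mu(D).
```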
A.2 The Snake Lemma
The following results from commutative algebra are used in the proof of the finiteness
of ray class groups (Theorem 2.3.4).
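Snake Lemma. Given a commutative diagram of R-modules with exact rows
$$\begin{array}{ccccccccc}
 & & A' & \xrightarrow{\ \alpha'\ } & A & \xrightarrow{\ \alpha\ } & A'' & \longrightarrow & 0 \\
 & & \downarrow{\scriptstyle f} & & \downarrow{\scriptstyle g} & & \downarrow{\scriptstyle h} & & \\
0 & \longrightarrow & B' & \xrightarrow{\ \beta'\ } & B & \xrightarrow{\ \beta\ } & B'' & &
\end{array}$$
there is an exact sequence
$$0 \to \ker\alpha' \to \ker f \to \ker g \to \ker h \to \operatorname{coker} f \to \operatorname{coker} g \to \operatorname{coker} h \to \operatorname{coker}\beta \to 0.$$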
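Proof. We prove the classic snake lemma, which asserts that the following sequence
is exact:
$$\ker f \xrightarrow{\ \bar\alpha'\ } \ker g \xrightarrow{\ \bar\alpha\ } \ker h \xrightarrow{\ ð\ } \operatorname{coker} f \xrightarrow{\ \bar\beta'\ } \operatorname{coker} g \xrightarrow{\ \bar\beta\ } \operatorname{coker} h.$$
Note that it is straightforward to verify that $\ker\alpha' \subseteq \ker f$, which shows exactness at
the first position, and likewise that $\operatorname{im} h \subseteq \operatorname{im}\beta$, so that $\operatorname{coker} h$ maps onto
$\operatorname{coker}\beta$ at the final position.
To begin, note that the left square commutes so $\alpha'$ restricts to $\bar\alpha' : \ker f \to \ker g$.
Likewise $\alpha$ restricts to $\bar\alpha : \ker g \to \ker h$, and $\bar\alpha'$ is injective whenever $\alpha'$ is. Consider
the composition $B' \xrightarrow{\beta'} B \to B/\operatorname{im} g = \operatorname{coker} g$. For any $b' \in \operatorname{im} f$, there is some $a' \in A'$
such that $f(a') = b'$, and then $g\alpha'(a') = \beta'(b')$ by commutativity. So $\beta'(b') \in \operatorname{im} g$,
which induces a map $\bar\beta' : \operatorname{coker} f \to \operatorname{coker} g$; in other words, the composition factors
through $\operatorname{coker} f$. Likewise, $\beta$ induces a map $\bar\beta : \operatorname{coker} g \to \operatorname{coker} h$, and $\bar\beta$ is
surjective whenever $\beta$ is.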
The interesting part of the sequence is at $\ker h \to \operatorname{coker} f$; we have labelled this
with the character ð. How can we define ð? Start with $a'' \in \ker h$, so that $h(a'') = 0$.
By exactness, there is some $a \in A$ such that $\alpha(a) = a''$. If b is the image of a under
g, then by commutativity of the right square $\beta(b) = h(\alpha(a)) = h(a'') = 0$, so $b \mapsto 0$ in $B''$
and there is some $b' \in B'$ such that $b' \mapsto b \mapsto 0$.
With this notation, we define
$$ð : \ker h \longrightarrow \operatorname{coker} f, \qquad a'' \longmapsto b' + \operatorname{im} f.$$
Since all maps in this diagram are linear, it suffices to check well-definedness. Once
we do so, we will have all maps defined and can then proceed to check exactness.
Suppose we have $\alpha(a_1) = \alpha(a_2) = a''$. Then there is some $a' \in A'$ such that
$\alpha'(a') = a_1 - a_2$ since this difference lies in $\ker\alpha$. Let $b'_1$ and $b'_2$ be the unique (by
injectivity) elements of $B'$ such that $\beta'(b'_1) = g(a_1)$ and $\beta'(b'_2) = g(a_2)$. We want
to show that $b'_1 - b'_2 = f(a')$. But this is easily seen, since commutativity of the
left square gives us $a' \mapsto a_1 - a_2 \mapsto g(a_1) - g(a_2) =: b_1 - b_2$ around the top, and
$a' \mapsto f(a') \mapsto b_1 - b_2$ around the bottom. Then by injectivity, $f(a')$ must equal $b'_1 - b'_2$
as desired. This shows that ð maps into $B'/\operatorname{im} f = \operatorname{coker} f$, i.e. it is well-defined.
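We now have all our maps, so we can proceed to show exactness. First, $\alpha \circ \alpha' = 0$
implies $\operatorname{im}(\alpha'|_{\ker f}) \subseteq \ker(\alpha|_{\ker g})$. For the other containment, take $a \in \ker\alpha$ that
maps to 0 under g. Then by exactness, there exists an $a' \mapsto a$; but the left square
commutes, so $a' \mapsto b' \mapsto 0$ and injectivity of $\beta'$ shows $b' = f(a') = 0$, that is, $a' \in \ker f$.
Next suppose $\bar b \in B$ represents a class in $\operatorname{coker} g$ that maps to zero in
$\operatorname{coker} h = B''/\operatorname{im} h$, and write $b'' = \beta(\bar b)$. Then there is some
$a''$ mapping to $b''$ under h, and $\alpha$ is onto so there also exists $a \in A$ mapping to $a''$.
Let $g(a) = b \in B$; by commutativity $b \mapsto b''$. Then $\bar b - b \mapsto 0 \in B''$ implies there
exists $b' \in B'$ such that $\beta'(b') = \bar b - b = \bar b - g(a)$. So $\bar b = g(a) + \beta'(b')$ and thus
$\bar\beta'(b' + \operatorname{im} f) = \beta'(b') + \operatorname{im} g = \bar b + \operatorname{im} g$.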
The hard part is showing exactness around the connecting map ð. For $a'' \in \ker h$,
we can lift back to some $a \in A$ which maps to $a'' \in A''$ and has image $g(a) = b$. But
$b \mapsto 0$ under $\beta$ so there is some $b' \in B'$ such that $b' \mapsto b$. Recall that we defined
$ð(a'') = b' + \operatorname{im} f$. Then $\operatorname{im}(\alpha|_{\ker g}) \subset \ker ð$, because if $a \in \ker g$ then $g(a) = 0$ and
so $b' = 0$. On the other hand, suppose $ð(a'') = 0$. Then in the diagram chase for the
definition of ð, choose $a' \in A'$ such that $f(a') = b'$. Set $\bar a = \alpha'(a')$. By commutativity,
$g(a - \bar a) = 0$ and by exactness, $\alpha(a - \bar a) = \alpha(a) = a''$.
Let $b' \in B'$ be such that $\bar\beta'(b' + \operatorname{im} f) = \operatorname{im} g$, so $\beta'(b') \in \operatorname{im} g$. The fact that
$\operatorname{im} ð \subset \ker\bar\beta'$ is just by definition of the connecting map. Conversely, suppose $b' \mapsto b$
such that there is some $a \in A$ with $g(a) = b$. If we set $\alpha(a) = a''$, we have already
constructed the connecting map for $a''$. Moreover we know $b' \mapsto b \mapsto 0$ so $a''$ must map
to 0 along h, i.e. $a'' \in \ker h$. This shows exactness at every part of the sequence.
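We use the Snake Lemma to prove the following.
Lemma A.2.1. Given a pair of homomorphisms $A \xrightarrow{f} B \xrightarrow{g} C$ there is an exact
sequence
$$0 \to \ker f \to \ker(g \circ f) \to \ker g \to \operatorname{coker} f \to \operatorname{coker}(g \circ f) \to \operatorname{coker} g \to 0.$$
Proof. Apply the Snake Lemma to
$$\begin{array}{ccccccccc}
 & & A & \xrightarrow{\ f\ } & B & \longrightarrow & \operatorname{coker} f & \longrightarrow & 0 \\
 & & \downarrow{\scriptstyle g \circ f} & & \downarrow{\scriptstyle g} & & \downarrow & & \\
0 & \longrightarrow & C & \xrightarrow{\ \mathrm{id}\ } & C & \longrightarrow & 0 & &
\end{array}$$
to produce the desired exact sequence.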
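As a small sanity check of Lemma A.2.1 (an illustrative example, not taken from the sources cited here), take $A = B = C = \mathbb{Z}$ with f multiplication by 2 and g multiplication by 3. All three kernels vanish, and the cokernels fit together as follows, where the first nontrivial map is induced by g and the second by the identity on C.

```latex
\ker f = \ker(g \circ f) = \ker g = 0, \qquad
\operatorname{coker} f = \mathbb{Z}/2, \quad
\operatorname{coker}(g \circ f) = \mathbb{Z}/6, \quad
\operatorname{coker} g = \mathbb{Z}/3,
\\[2mm]
0 \to 0 \to 0 \to 0 \to \mathbb{Z}/2 \xrightarrow{\ \cdot 3\ } \mathbb{Z}/6
  \xrightarrow{\ \bmod 3\ } \mathbb{Z}/3 \to 0.
```

The image of $\mathbb{Z}/2$ in $\mathbb{Z}/6$ is $\{0, 3\}$, which is exactly the kernel of reduction mod 3, so the sequence is exact.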
A.3 Cyclic Group Cohomology
Here we briefly introduce the basic concepts in group cohomology and use them to
define the Herbrand quotient of a G-module. Familiarity with Tor, Ext and group
algebras is assumed. The Herbrand quotient is used extensively in Section 2.6 in the
proof of the second fundamental inequality.
Definition. Let G be a group and consider an abelian group A on which G acts on
the left. Then A is called a G-module. Alternatively, A may be thought of as a left
module over the group algebra ZG.
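Group homology/cohomology is the study of Tor and Ext over these group algebras.
If G-Mod is the category of G-modules, there are two important functors:
$$(-)^G : G\text{-Mod} \longrightarrow \mathbf{Ab}, \qquad A \longmapsto A^G := \{a \in A \mid g \cdot a = a \text{ for all } g \in G\}$$
and
$$(-)_G : G\text{-Mod} \longrightarrow \mathbf{Ab}, \qquad A \longmapsto A_G := A/\langle g - 1 \rangle A.$$
The functor $(-)^G$ sends A to the set of G-invariants of A, that is, $A^G$ is the largest
submodule of A which is fixed pointwise by G. On the other hand, the images under $(-)_G$ are the
quotients $A_G = A/\langle g - 1 \rangle A$, which comprise the G-coinvariants of A. It turns out
that for any G-module A, $A^G \cong \operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}, A)$ and $A_G \cong \mathbb{Z} \otimes_{\mathbb{Z}G} A$ (these are natural
isomorphisms of functors). By Hom-tensor adjointness, this means $(-)_G$ and $(-)^G$
are, respectively, left and right adjoint to the functor $\mathbf{Ab} \to G\text{-Mod}$ that gives an
abelian group the trivial G-action.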
Definition. For a left G-module A, we define the nth group cohomology of A by
$$H^n(G; A) := \operatorname{Ext}^n_{\mathbb{Z}G}(\mathbb{Z}, A).$$
Note that $H^0(G; A) = A^G$. The dual construction defining group homology relates
to $A_G$ in exactly the same way. One perspective of group cohomology is that it
measures how far G is from being a finite group. For practical purposes, we can use
$H^n(G; \mathbb{Z})$ to measure this property of G.
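For the rest of this section, let $G = \langle t \rangle$ be a cyclic group of order n. We construct
a free resolution of $\mathbb{Z}$ over $\mathbb{Z}G$,
$$\cdots \xrightarrow{\ N\ } \mathbb{Z}G \xrightarrow{\ t-1\ } \mathbb{Z}G \xrightarrow{\ N\ } \mathbb{Z}G \xrightarrow{\ t-1\ } \mathbb{Z}G \to \mathbb{Z} \to 0$$
where $N = 1 + t + t^2 + \cdots + t^{n-1} = \sum_{g \in G} g$ is called the norm element of G. This resolution
is an infinite, 2-periodic resolution of $\mathbb{Z}$ as a left $\mathbb{Z}G$-module. Therefore there are only
two distinct cohomology groups for $k \ge 1$, one for odd homological degrees and one
for even:
$$H^{\mathrm{even}}(G; A) := H^{2k}(G; A) \quad \text{and} \quad H^{\mathrm{odd}}(G; A) := H^{2k-1}(G; A).$$
(Recall that the 0th cohomology is $H^0(G; A) = A^G$.)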
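The two periodic groups can be written down explicitly. Applying $\operatorname{Hom}_{\mathbb{Z}G}(-, A)$ to the resolution (with the augmentation term $\mathbb{Z}$ removed) and identifying $\operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, A)$ with A gives a 2-periodic complex. The following is a sketch of this standard computation; the last line is an illustrative example with $A = \mathbb{Z}$ carrying the trivial action.

```latex
0 \to A \xrightarrow{\ t-1\ } A \xrightarrow{\ N\ } A \xrightarrow{\ t-1\ } A \xrightarrow{\ N\ } \cdots
\\[2mm]
H^{\mathrm{odd}}(G; A) = \ker N / \operatorname{im}(t-1), \qquad
H^{\mathrm{even}}(G; A) = \ker(t-1) / \operatorname{im} N = A^G / NA
\\[2mm]
A = \mathbb{Z} \text{ (trivial action)}: \quad t - 1 = 0,\ N = n
\ \Longrightarrow\
H^{\mathrm{odd}}(G; \mathbb{Z}) = 0, \quad
H^{\mathrm{even}}(G; \mathbb{Z}) = \mathbb{Z}/n\mathbb{Z}.
```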
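Lemma A.3.1 (Exact Hexagon). Given an exact sequence $0 \to A \to B \to C \to 0$ of
G-modules, the long exact sequence in cohomology is an exact hexagon:
$$\begin{array}{ccccc}
 & H^0(G; A) & \longrightarrow & H^0(G; B) & \\
\nearrow & & & & \searrow \\
H^1(G; C) & & & & H^0(G; C) \\
\nwarrow & & & & \swarrow \\
 & H^1(G; B) & \longleftarrow & H^1(G; A) &
\end{array}$$
Proof. The exact hexagon is just the long exact sequence in cohomology when G is
cyclic and the cohomologies are 2-periodic after the 0th homological degree.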
Definition. Let A be a G-module. The Herbrand quotient of A is
$$q(A) = \frac{|H^1(A)|}{|H^0(A)|}$$
which is defined whenever the cohomology groups of A are finite.
Lemma A.3.2. Let 0 → A → B → C → 0 be an exact sequence of G-modules. If
any two of q(A), q(B), q(C) are defined then so is the third, and q(A)q(C) = q(B).
Proof. Apply the exact hexagon.
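Corollary A.3.3. If $A \subset B$ are G-modules and $C = B/A$ is a finite quotient, then
$q(A) = q(B)$ whenever either of these is defined.
Proof. If C is finite, we have
$$q(C) = \frac{[\ker N : \operatorname{im}(t-1)]}{[\ker(t-1) : \operatorname{im} N]} = \frac{|\ker N| \, |\operatorname{im} N|}{|\ker(t-1)| \, |\operatorname{im}(t-1)|} = \frac{|C|}{|C|} = 1,$$
since $|\ker\varphi| \, |\operatorname{im}\varphi| = |C|$ for any endomorphism $\varphi$ of the finite group C.
Then apply Lemma A.3.2.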
There is a special case of cyclic cohomology for finite, Galois extensions L/K,
famously listed as Theorem 90 in [13].
Theorem (Hilbert's Theorem 90). If $G = \operatorname{Gal}(L/K)$ is the Galois group for L/K,
a finite, Galois extension of number fields, then $H^1(G; L^*) = 1$ where $L^*$ denotes the
invertible elements of L.
A.4 Helpful Magma Functions
Here we list some functions we developed to compute the distribution of primes
among divisions, cycle types and conjugacy classes of G in the sense of Frobenius’ and
Cebotarev’s density theorems (Sections 2.5 and 2.10). These are referenced frequently
in Example 2.10.7.
FrobeniusTally := function(f,n)
// INPUT: f - monic polynomial with integer coefficients
// n - positive integer
// OUTPUT: Outputs a tally of how often each element of the Galois
// group occurs as the Frobenius element of a prime p for
// p less than n.
K := NumberField(f);
disc := Discriminant(f);
discPrimes := [p[1] : p in Factorization(Integers() ! disc)];
validPrimes := [ p : p in [2..n] | IsPrime(p) and p notin discPrimes ];
frobElts := [ FrobeniusElement(K,p) : p in validPrimes ];
uniqueFrobElts := SequenceToSet(frobElts);
frobTally := [ <p,#[q : q in frobElts | q eq p]> : p in uniqueFrobElts ];
return frobTally;
end function;
Divisions := function(G)
// INPUT: G - a finite group
// OUTPUT : A set of representatives of the divisions of G.
ccG := ConjugacyClasses(G);
ccSubgroups := SequenceToSet([ sub < G | cc[3] > : cc in ccG ]);
divReps := [ GeneratorsSequence(H)[1] : H in ccSubgroups ];
return divReps;
end function;
Divisionmates := function(G,x)
// INPUT: G - a finite group
// x - an element of G
// OUTPUT: The set of elements of G that are in the same division as x.
subgp := sub < G | x >;
ordx := Order(subgp);
gens := [ x^k : k in [1..ordx-1] | Gcd(k,ordx) eq 1 ];
ccs := [ Conjugates(G,y) : y in gens ];
return (&join ccs);
end function;
Cyclemates := function(G,x)
// INPUT: G - a finite group
// x - an element of G
// OUTPUT: The set of elements of G that have the same cycle type as x.
ccG := ConjugacyClasses(G);
cycleCCGs := [ Conjugates(G,cc[3]) : cc in ccG | CycleStructure(cc[3])
eq CycleStructure(x) ];
return (&join cycleCCGs);
end function;
DivisionTally := function(T)
// INPUT: T - tally from FrobeniusTally
// OUTPUT: Collect all elements that are in the same division in T.
G := Parent(T[1][1]);
divsG := Divisions(G);
divTally := [ < divRep, &+[ p[2] : p in T | (sub < G | p[1] >) eq (
sub < G | divRep >) ] > : divRep in divsG ];
return divTally;
end function;
CycleTypeTally := function(T)
// INPUT: T - tally from FrobeniusTally
// OUTPUT: Collect all elements that are in the same cycle type in T.
G := Parent(T[1][1]);
ccG := ConjugacyClasses(G);
cycleTypesG := SequenceToSet ([ CycleStructure(cc[3]) : cc in ccG ]);
ccTally := [ <ct, &+[ p[2] : p in T | CycleStructure(p[1]) eq ct] > :
ct in cycleTypesG ];
return ccTally;
end function;
ConjClassTally := function(T)
// INPUT: T - tally from FrobeniusTally
// OUTPUT: Collect all elements that were conjugate from T.
G := Parent(T[1][1]);
ccG := ConjugacyClasses(G);
ccTally := [ < cc[3], &+[ p[2] : p in T | IsConjugate(G,cc[3],p[1])
] > : cc in ccG ];
return ccTally;
end function;
The function FrobeniusElement(K,p) is a built-in Magma function that computes
a Frobenius element $\mathrm{Frob}_{N/K}(p)$ for a prime $p \in \mathcal{O}_K$, where N is the Galois
closure of K. By the definition of $\mathrm{Frob}_{N/K}(p)$ in Section 2.10 we know that such an
element is unique up to conjugacy, and is unique in the abelian case.
In light of the importance of Theorem 2.11.3, it is desirable to be able to compute
the minimal polynomial of the j-invariant of $\mathcal{O}_K$ for $K = \mathbb{Q}(\sqrt{-n})$ and $n \ge 2$. The
following Magma function does just that.
getJInvMinPoly := function(n)
// INPUT : n - an integer
// OUTPUT : The minimal polynomial of the j invariant of the ring of
// integers in number field QQ(\sqrt{-n}), i.e. a primitive
// element for the Hilbert class field of QQ(\sqrt{-n}).
R<x> := PolynomialRing(Integers());
f := x^2+n;
K<y> := NumberField(f);
H := HilbertClassField(K);
z := PrimitiveElement(H);
return MinimalPolynomial(z,RationalField());
end function;
The next function makes it easy to create a list of class numbers for the first n
negative discriminants. This is useful when creating examples such as Example ??
where a particular class number is desired.
getJInvDegrees := function(n);
// INPUT : n - a positive integer
// OUTPUT : A list of tuples [i,d] where d is the degree of the minimal
// polynomial of the j-invariant of the ring of integers of
// QQ(\sqrt{-i}) for i less than or equal to n.
jInvDegs := {@ [i,Degree(getJInvMinPoly(i))] : i in [2..n] @};
return jInvDegs;
end function;
The next sequence of functions is used in Section 3.3 to study n-Fermat primes
and symmetric n-Fermat primes. Note that if $x^2 + ny^2$ is prime with $x, y \ge 1$ then
$\gcd(x, n) = 1$; if in addition $y^2 + nx^2$ is prime, $\gcd(y, n) = 1$. So when searching for
n-Fermat primes and symmetric n-Fermat primes, we can reduce our search parameters
on x and y to those solutions that are relatively prime to n.
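The coprimality claim follows from a short divisibility argument, sketched here under the assumption (implicit in the search ranges used below) that $x, y \ge 1$. Suppose $d = \gcd(x, n) > 1$ and $p = x^2 + ny^2$ is prime.

```latex
d \mid x \ \text{and}\ d \mid n
\ \Longrightarrow\ d \mid x^2 + ny^2 = p
\ \Longrightarrow\ d = p
\ \Longrightarrow\ p \mid x
\ \Longrightarrow\ x^2 \ge p^2 > p = x^2 + ny^2 > x^2,
```

which is a contradiction. The same argument with the roles of x and y exchanged handles $y^2 + nx^2$.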
isNFermatPrime := function(f,n,p)
// INPUT: f - minimal polynomial of the jInvariant of the order
// Z[sqrt(-n)]
// n - positive integer
// p - prime integer (primality is checked below)
// OUTPUT: true if p = x^2+ny^2 and false otherwise.
if not IsPrime(p) then
return false;
end if;
if n eq p then
return true;
end if;
// At this point, p is prime and n neq p.
// Create polynomial ring over F_p
R<x> := PolynomialRing(FiniteField(p));
// Test if Legendre symbol is 1
jacobiCond := LegendreSymbol(-n,p) eq 1;
// Check if f mod p has a root
congruenceCond := HasRoot(R ! f,R);
return (jacobiCond and congruenceCond);
end function;
// This is the brute force factoring method.
factorNFermatPrimeNaive := function(n,p)
// INPUT : n - a positive integer
// p - a positive integer
// OUTPUT : Returns a sequence [p, x, y] if p = x^2 + ny^2.
// If no such x or y exist, [p,0,0] is returned.
factor := {@ [x,y] : x in [0..p], y in [0..p] | p eq x^2 + n*y^2 @};
if #factor eq 0 then
return [p,0,0];
else
return [p,factor[1][1],factor[1][2]];
end if;
end function;
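The brute force method above examines on the order of $p^2$ pairs. A cheaper exhaustive search loops over y alone and tests whether $p - ny^2$ is a perfect square. The following Magma sketch shows one way to do this; the name factorNFermatPrimeSqrt is hypothetical, and the function is still a naive search rather than an answer to Question 3.3.4.

```magma
// Hypothetical variant of factorNFermatPrimeNaive: loop over y only and
// test whether p - n*y^2 is a perfect square.
// INPUT : n - a positive integer
//         p - a positive integer
// OUTPUT : A sequence [p, x, y] with p = x^2 + n*y^2, or [p,0,0] if no
//          such x and y exist.
factorNFermatPrimeSqrt := function(n, p)
    for y in [0..Isqrt(p div n)] do
        r := p - n*y^2;        // nonnegative because n*y^2 <= p
        x := Isqrt(r);         // integer part of the square root of r
        if x^2 eq r then
            return [p, x, y];
        end if;
    end for;
    return [p, 0, 0];
end function;

factorNFermatPrimeSqrt(1, 13);   // [ 13, 3, 2 ]
factorNFermatPrimeSqrt(2, 11);   // [ 11, 3, 1 ]
```

Because y is scanned upward from 0, the representation returned may differ from the one found by the brute force method when several exist.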
getNFPNaive := function(n,M)
// INPUT : n - a positive integer
// M - a positive integer
// OUTPUT : Return a list of tuples [p,x,y] where p is an n-Fermat
// prime with x and y less than or equal to M.
searchSpace := [m : m in [1..M] | Gcd(n,m) eq 1];
primesList := {@ [x^2+n*y^2,x,y] : x in searchSpace, y in searchSpace |
IsPrime(x^2 + n*y^2) @};
return primesList;
end function;
getNFPNaive2 := function(n,M)
// INPUT : n - a positive integer
// M - a positive integer
// OUTPUT : Return a list of tuples [p,x,y] where p is an n-Fermat
// prime with x and y less than or equal to M.
// Similar to getNFPNaive, but the search space is traversed in the
// opposite order.
searchSpace := [m : m in [1..M] | Gcd(n,m) eq 1];
primesList := {@ [x^2+n*y^2,x,y] : y in searchSpace, x in searchSpace |
IsPrime(x^2 + n*y^2) @};
return primesList;
end function;
getSymmNFPNaiveList := function(n,primesList);
// INPUT : n - a positive integer
// primesList - a list of n-Fermat primes in the form of a
// tuple [p,x,y] with p = x^2 + ny^2.
// OUTPUT : Return a list of tuples [p,x,y,q] where q = y^2 + nx^2 is
// prime.
symmPrimes := {@ [p[1],p[2],p[3],p[3]^2+n*p[2]^2] : p in primesList |
IsPrime(p[3]^2+n*p[2]^2) and p[2] lt p[3] @};
return symmPrimes;
end function;
getSymmNFPNaive := function(n,M);
// INPUT : n - a positive integer
// M - a positive integer
// OUTPUT : Return a list of tuples [p,x,y,q] where p = x^2 + ny^2 and
// q = y^2 + nx^2 are both prime with x and y less than or equal
// to M.
primesList := getNFPNaive(n,M);
return getSymmNFPNaiveList(n,primesList);
end function;
In Section 3.3, we are interested in the number of symmetric n-Fermat primes.
This is best understood in terms of the Prime Number Theorem. We will use the
following heuristic: let $\pi(L)$ denote the number of primes in a set L of positive
integers; then
$$\pi(L) \sim C \sum_{m \in L} \frac{1}{\log m}$$
where log is the natural logarithm and C is a constant. For our purposes, we are
interested in $\pi_{\mathrm{sym},n}(M) = \pi(L)$ where L is the set of n-Fermat numbers $y^2 + nx^2$ such
that $x^2 + ny^2$ is prime and $x, y \le M$. We have written a Magma function that takes
a list of n-Fermat primes $x^2 + ny^2$ and calculates the expected number of $y^2 + nx^2$
that are prime.
expectedNumberOfPrimes := function(n,L);
// INPUT : n - a positive integer
// L - A list of tuples of the form [p,x,y] where p = x^2 + ny^2
// and p is prime.
// OUTPUT : The number of values q = y^2 + nx^2, for [p,x,y] in L, that
// are expected to be prime, as predicted by PNT. This is the sum
// of the reciprocals of the logs of these values q.
return &+[1/Log(p[3]^2 + n*p[2]^2) : p in L];
end function;
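The heuristic itself can be sanity-checked on the simplest possible set, L = {2, ..., M}, where $\pi(L)$ is the ordinary prime-counting function and the constant C is taken to be 1. The following Magma sketch is an illustration only; the variable names and the choice M = 10^5 are assumptions made for the example.

```magma
// Compare the number of primes up to M with the sum of 1/log m over
// all integers 2 <= m <= M (the heuristic with C = 1).
M := 10^5;
RR := RealField(10);
actual := #PrimesUpTo(M);
heuristic := &+[ 1/Log(RR ! m) : m in [2..M] ];
printf "primes: %o, heuristic sum: %o, ratio: %o\n",
       actual, heuristic, actual/heuristic;
```

The ratio printed should be close to 1, which is the behavior the heuristic assumes for the sets of n-Fermat numbers studied in Section 3.3.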
The next four functions generate data on $\pi_{\mathrm{sym},n}(M)$ for a range of values for
n and M. The getInterestingPrimes and getInterestingPrimes2 functions in
particular are useful for picking out n values for which there are either many more
or many fewer symmetric n-Fermat primes than predicted by our PNT heuristic. In
the getSymmRatio function, the factor of 2 in the output for expSymm is explained in
Section 3.3.
getSymmRatio := function(n,M);
// INPUT : n - a positive integer
// M - a positive integer
// OUTPUT : A tuple of the form [n,symmRat,expSymm,N,S] where:
// symmRat - the number of symmetric n-Fermat primes divided by the
// number of n-Fermat primes
// expSymm - the ratio of the number of symmetric n-Fermat tuples to
// the expected number of symmetric n-Fermat tuples as
// predicted by PNT
// N - the number of n-Fermat primes with x and y leq M
// S - the number of symmetric n-Fermat primes with x and y leq M.
R := RealField(10);
primeList := getNFPNaive(n,M);
symmList := getSymmNFPNaiveList(n,primeList);
// In the below command, the 2 is present in the third entry because
// if x^2 + ny^2 is prime and (x,n) = (y,n) = 1, y^2 + nx^2 is odd,
// so is already twice as likely to be prime.
return [
n,
R ! #(symmList)/#(primeList),
#(symmList)/(2*expectedNumberOfPrimes(n,primeList)),
#primeList,
#symmList];
end function;
getSymmRatios := function(n,M);
// INPUT : n - a positive integer
// M - a positive integer
// OUTPUT : A list of tuples returned by getSymmRatio(i,M) for i
// between 2 and n.
// See the description above for the entries in the tuple.
symRats := {@ getSymmRatio(i,M) : i in [2..n] @};
return symRats;
end function;
getInterestingPrimes := function(n,M,r);
// INPUT : n - a positive integer
// M - a positive integer
// r - a real number
// OUTPUT : A list of tuples returned by getSymmRatio(i,M) for i
// between 2 and n such that the ratio of symmetric n-Fermat
// primes to the number of symmetric n-Fermat primes expected
// by PNT *exceeds* r.
symRats := getSymmRatios(n,M);
return {@ p : p in symRats | p[3] gt r @};
end function;
getInterestingPrimes2 := function(n,M,r);
// INPUT : n - a positive integer
// M - a positive integer
// r - a real number
// OUTPUT : A list of tuples returned by getSymmRatio(i,M) for i
// between 2 and n such that the ratio of symmetric n-Fermat
// primes to the number of symmetric n-Fermat primes expected
// by PNT *is less than* r.
symRats := getSymmRatios(n,M);
return {@ p : p in symRats | p[3] lt r @};
end function;
The following functions are used to generate statistics for the data generated by
the functions above.
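averageRatio := function(primeData)
// INPUT : primeData - a list of tuples as returned by getSymmRatios
// OUTPUT : The mean of the third entries (expSymm) of the tuples.
retVal := &+[ p[3] : p in primeData ] / (#primeData);
return retVal;
end function;
stdDevRatio := function(primeData)
// INPUT : primeData - a list of tuples as returned by getSymmRatios
// OUTPUT : The (population) standard deviation of the third entries.
N := #primeData;
avg := averageRatio(primeData);
retVal := SquareRoot(&+ [ (p[3] - avg)^2 : p in primeData ]/N);
return retVal;
end function;
ratioFrequency := function(primeData,i)
// INPUT : primeData - a list of tuples as returned by getSymmRatios
// i - an integer at least 2 (the tuple for n = i is entry i-1,
// since the list starts at n = 2)
// OUTPUT : 1/P, where P is the probability that a normally distributed
// value lies at least as many standard deviations from the
// mean as the ratio for n = i does.
avg := averageRatio(primeData);
stdDev := stdDevRatio(primeData);
numStdDev := Abs(primeData[i-1][3] - avg) / stdDev;
freq := 1/(1-Erf(numStdDev/SquareRoot(2)));
return freq;
end function;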
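The quantity returned by ratioFrequency can be read as "one in how many values of n" should show a deviation this large, under the working hypothesis from Section 3.3 that the ratios follow a normal distribution. The formula is the standard two-sided tail probability of a normal variable Z; here $\mu$ and $\sigma$ denote the outputs of averageRatio and stdDevRatio, and the numerical values in the example are rounded.

```latex
k = \frac{|\alpha_i - \mu|}{\sigma}, \qquad
P(|Z| > k) = 1 - \operatorname{erf}\!\left(\frac{k}{\sqrt{2}}\right), \qquad
\mathrm{freq} = \frac{1}{P(|Z| > k)}
\\[2mm]
k = 2: \quad P(|Z| > 2) \approx 0.0455, \qquad \mathrm{freq} \approx 22.
```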
Curriculum Vitae
————————————————————————————————————–
EDUCATION
Wake Forest University, Winston-Salem, North Carolina
M.A. Mathematics, expected graduation May, 2015
GPA: 4.00
Honors: Teaching Assistantship (2013-2015)
Outstanding Graduate Student Award (2014)
Wake Forest University, Winston-Salem, North Carolina
B.S. Mathematics; B.A. Spanish
GPA: 3.88; Dean’s List (8 semesters); Summa Cum Laude
Honors: Kenneth Tyson Raynor scholar (2011 and 2012)
John Y. Phillips Prize in Mathematics (2013)
Phi Beta Kappa
Pi Mu Epsilon (NC Lambda Chapter President)
————————————————————————————————————–
RESEARCH EXPERIENCE
Class Field Theory, with Dr. Frank Moore, Wake Forest University, 2013-2015
• Master’s thesis entitled Class Field Theory and the Study of n-Fermat Primes
Knot Mosaic Theory, with Dr. Hugh Howards, Wake Forest University, 2012-2014
• Co-authored paper, Crossing Number Bounds in Knot Mosaics; submitted for consideration to Journal of Knot Theory and Its Ramifications
• Available at http://arxiv.org/abs/1405.7683 (arXiv:1405.7683v2 [math.GT])
————————————————————————————————————–
TEACHING EXPERIENCE
Teaching Assistant, Wake Forest University Dept. of Mathematics, 2013-2015
• Tutoring, study sessions, grading and some lecture experience
• Statistics, discrete math, linear algebra, multivariable calculus and real analysis
Mathematics Tutor, Wake Forest University Math Center, 2010-2013
• Statistics, calculus (single and multivariable), algebra, discrete math
Spanish Tutor, Wake Forest University Dept. of Romance Languages, 2012-2014
————————————————————————————————————–