cosmo math


Page 1: Cosmo Math

7/30/2019 Cosmo Math

http://slidepdf.com/reader/full/cosmo-math 1/133

Omnipotence: The cosmologyVersion 0.0

Mikael Astner

December 12, 2012

Abstract

This is an article centering around the maths behind the cosmology. While it isn’t necessary for the fictional article, it’s educational and gives you a brief, yet rigorous, insight into the intricacy behind the cosmology, and hopefully a better understanding of the concept of infinity.

Furthermore, this article is intended to grow. On the one hand I hope to fix inevitable typos, and on the other to elaborate on a number of topics, such as large cardinal numbers (whose existence is not provable in ZFC by itself), forced cardinals, very large cardinals, etc. Topics for surreal numbers that I want to cover are the length and subsystems, sums as subshuffles, a bit about the number theory, generalized epsilon numbers, surreal exponentiation, and large and very large cardinals in the surreal numbers. These elements are all accounted for in the fictional article.



Copyright © 2012, by Mikael Astner

Contents

1 Axioms
  1.1 Peano’s axiomatic
    1.1.1 The axiom of zero
    1.1.2 The axiom of the successor
    1.1.3 The axiom of successor inequality
    1.1.4 The axiom of predecessor of zero
    1.1.5 The axiom of induction
  1.2 Zermelo-Fraenkel’s axiomatic
    1.2.1 The axiom of extensionality
    1.2.2 The axiom of pairing
    1.2.3 The axiom schema of separation
    1.2.4 The axiom of union
    1.2.5 The axiom of the power set
    1.2.6 The axiom of infinity
    1.2.7 The axiom schema of replacement
    1.2.8 The axiom of regularity
    1.2.9 The axiom of choice
  1.3 Consistency and independence of Zermelo-Fraenkel’s axiomatic

2 The construction of N, Z, Q, and Q̄
  2.1 The construction of the natural numbers
  2.2 The construction of the integers
  2.3 The construction of the rational numbers
  2.4 The construction of the algebraic numbers

3 The construction of R
  3.1 The construction of the real numbers through Dedekind cuts

4 Why axiomatic set theory?
  4.1 Language of set theory, formulas
  4.2 Classes
  4.3 Extensionality
  4.4 Pairing
  4.5 Separation schema
  4.6 Union
  4.7 Power set
  4.8 Infinity
  4.9 Replacement schema

5 Ordinal numbers
  5.1 Linear and partial ordering
  5.2 Well-ordering
  5.3 Ordinal numbers
  5.4 Induction and recursion
  5.5 Ordinal arithmetic
  5.6 Well-founded relations

6 Cardinal numbers
  6.1 Cardinality
  6.2 Alephs
  6.3 The canonical well-ordering of α × α
  6.4 Cofinality

7 Real numbers
  7.1 The cardinality of the continuum
  7.2 The ordering of R
  7.3 Suslin’s problem
  7.4 Borel sets
  7.5 Lebesgue
  7.6 The Baire space
  7.7 Polish spaces

8 The axiom of choice and cardinal arithmetic
  8.1 The axiom of choice
  8.2 Using the axiom of choice in mathematics
  8.3 The countable axiom of choice
  8.4 Cardinal arithmetic
  8.5 Infinite sums and products
  8.6 The continuum function
  8.7 Cardinal exponentiation
  8.8 The singular cardinal hypothesis

9 The axiom of regularity
  9.1 The cumulative hierarchy of sets
  9.2 ∈-induction
  9.3 Well-founded relations

10 Surreal numbers
  10.1 The definition and the fundamental existence theorem
    10.1.1 Definition
    10.1.2 Fundamental existence theorem
    10.1.3 Order properties
  10.2 The basic operations
    10.2.1 Addition
    10.2.2 Multiplication
    10.2.3 Division
    10.2.4 Square root
  10.3 Real numbers and ordinals
    10.3.1 Integers
    10.3.2 Dyadic fractions
    10.3.3 Real numbers
  10.4 Ordinal
  10.5 Normal form
    10.5.1 Combinatory lemma on semigroups
    10.5.2 The ω-map
  10.6 Normal form
    10.6.1 Application to real closure
    10.6.2 Sign sequence

11 Miscellaneous
  11.1 Uncategorized proofs


1 Axioms

1.1 Peano’s axiomatic

Before jumping into the axioms we’ll clarify the equality symbols. The equal to symbol denotes equality: A = B means that A and B are the same number. The not equal to symbol denotes non-equality: A ≠ B means that A and B aren’t the same number. Equality has the following three properties: reflexivity, symmetry, and transitivity.

i. Reflexivity: A = A for every A.

ii. Symmetry: If A = B then B = A.

iii. Transitivity: If A = B and B = C, then A = C.

1.1.1 The axiom of zero

Zero, denoted: 0, is a natural number.

1.1.2 The axiom of the successor

The successor, denoted S (n) for a natural number n, is a natural number.

1.1.3 The axiom of successor inequality

For any given natural number there exists either no number or exactly one number whose successor is the given number. That is, if S(m) = S(n), then m = n.

1.1.4 The axiom of predecessor of zero

Zero has no predecessor among the natural numbers. That is, there’s no natural number n such that

S(n) = 0.

1.1.5 The axiom of induction

Let N be a set of natural numbers with the following properties:

i. 0 belongs to N.

ii. If n belongs to N, then S(n) belongs to N.

Then N contains every natural number.

1.2 Zermelo-Fraenkel’s axiomatic

Zermelo-Fraenkel’s axiomatic, commonly abbreviated ZFC (ZF for Zermelo-Fraenkel, and C for the axiom of choice), centers around the construction of sets. So, what is a set? A set is a bin, denoted by curly brackets, that holds a number of mathematical objects. These mathematical objects will be referred to as elements and are separated by commas.


1.2.1 The axiom of extensionality

If  X  and Y  have the same elements, then X  = Y .

1.2.2 The axiom of pairing

For any a and b there exists a set {a, b} that contains exactly a and b.

1.2.3 The axiom schema of separation

If P is a property (with parameter p), then for any X and p, there exists a set Y = {u ∈ X : P(u, p)} that contains all those u ∈ X that have property P.

1.2.4 The axiom of union

For any X there exists a set Y = ∪X, the union of all elements of X.

1.2.5 The axiom of the power set

For any set X  there exists a set Y  = P (X ), the set of all subsets of  X .

1.2.6 The axiom of infinity

There exists an infinite set.

1.2.7 The axiom schema of replacement

If a class F is a function, then for any X there exists a set Y = F(X) = {F(x) : x ∈ X}.

1.2.8 The axiom of regularity

Every nonempty set has an ∈-minimal element.

1.2.9 The axiom of choice

Every family of nonempty sets has a choice function.
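As a rough illustration of how some of these axioms act on concrete sets, the sketch below models hereditarily finite sets with Python frozensets. The function names (pair, union, power_set, separation, replacement) are illustrative choices and not notation from this article, and a finite model of this kind says nothing about the axiom of infinity or the axiom of choice.

```python
from itertools import chain, combinations


def pair(a, b):
    """Axiom of pairing: the set {a, b}."""
    return frozenset({a, b})


def union(x):
    """Axiom of union: the set of all elements of the elements of x."""
    return frozenset(chain.from_iterable(x))


def power_set(x):
    """Axiom of the power set: the set of all subsets of x."""
    elems = list(x)
    return frozenset(
        frozenset(c)
        for r in range(len(elems) + 1)
        for c in combinations(elems, r)
    )


def separation(x, prop):
    """Axiom schema of separation: {u in x : prop(u)}."""
    return frozenset(u for u in x if prop(u))


def replacement(x, f):
    """Axiom schema of replacement: {f(u) : u in x}."""
    return frozenset(f(u) for u in x)


empty = frozenset()          # the empty set
one = pair(empty, empty)     # {∅}, since {∅, ∅} has a single element
two = pair(empty, one)       # {∅, {∅}}
```

Because frozensets compare by their elements, equality in this sketch behaves as the axiom of extensionality demands.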

1.3 Consistency and independence of Zermelo-Fraenkel’s axiomatic

While most of these axioms are intuitively consistent and independent, there are a few whose consistency and independence may seem less obvious. These axioms are 1.2.1 the axiom of extensionality, 1.2.4 the axiom of union (called the axiom of the sum set below), 1.2.5 the axiom of the power set, and 1.2.9 the axiom of choice. For mathematical rigor we’re going to prove the consistency and independence of these axioms, using one finite model and three denumerable models.


Proof. To prove the consistency of the four axioms, consider the following model:

A ∈ A

In this model the axiom of extensionality is valid. Furthermore, P(A) = ∪A = A, and A is disjointed and its own selection set. So the remaining axioms are valid as well, proving the consistency of the axioms.

To prove that the axiom of the power set is independent of the remaining axioms, consider the following model:

B ∉ B

In this model the axiom of extensionality is valid. However, in this model there is no set whose element is B, meaning that the axiom of the power set is invalid. Furthermore, since in this model ∪B = B, and since B isn’t its own selection set, the remaining axioms are valid.

To prove the independence of the axiom of extensionality, consider the model whose domain consists of the sets A1, A2, A3, ... according to (1):

A1 = {A1}
A2 = {A1}
A3 = {A1, A2}
A4 = {A1, A2, A3}
...
An = {A1, A2, A3, ..., A_{n−1}}          (1)
...

We see that in the above:

i. A1 and A2 have the same elements, and A1 ∈ A1; however, A2 ∉ A1, so A1 ≠ A2. Thus, in (1) the axiom of extensionality is invalid.

ii. ∪A1 = ∪A2 = ∪A3 = A1 and ∪An = A_{n−1} for n > 3. Thus, in (1) the axiom of the sum set is valid.

iii. P(A1) = A3 and P(An) = A_{n+1} for n > 1. Thus, in (1) the axiom of the power set is valid.

iv. In the set of equations above the set A1 is a selection set of A1 and of A2, which are the only disjointed sets of the model. Thus, in (1) the axiom of choice is valid.
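The claims about extensionality and the sum set in model (1) can be spot-checked on a finite truncation of the model. The sketch below is only an illustration: it stores each set An under the index n together with the indices of its elements, and the names model_one and union_of are hypothetical helpers introduced here.

```python
def model_one(size):
    """Membership table for A1..A_size of model (1): index -> indices of its elements."""
    members = {1: frozenset({1}), 2: frozenset({1})}
    for n in range(3, size + 1):
        members[n] = frozenset(range(1, n))   # An = {A1, ..., A_{n-1}}
    return members


def union_of(members, n):
    """Indices of the elements of the sum set of An (elements of elements of An)."""
    return frozenset().union(*(members[k] for k in members[n]))


m = model_one(8)

# A1 and A2 are different objects of the model with the same elements,
# which is exactly how extensionality fails.
same_elements_but_distinct = (m[1] == m[2])

# The sum set of An has the elements of A_{n-1} for n > 3.
sum_sets_exist = all(union_of(m, n) == m[n - 1] for n in range(4, 9))
```

Since the model is not extensional, the check compares element lists rather than the sets themselves.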

To prove the independence of the axiom of the sum set, let’s consider a model whose domain consists of the sets A1, B1, A2, B2, A3, B3, ... and which is described by (2):

A1 = {A1}                               B1 = {A1, A2}
A2 = {A1, B1, B2}                       B2 = {A1, B1}
A3 = {A1, A2, B2, B3}                   B3 = {A1, B2}
A4 = {A1, A3, B1, B3, B4}               ...
A5 = {A1, A4, B2, B4, B5}               ...
...
An = {A1, A_{n−1}, B_{n−3}, B_{n−1}, Bn}    Bn = {A1, B_{n−1}}          (2)
...


We see that in the above:

i. No two different sets have the same elements. Thus, in (2) the axiom of extensionality is valid.

ii. There’s no set whose elements are A1, A2, and B1. Therefore, there is no sum set of the set A2. Thus, in (2) the axiom of the sum set is invalid.

iii. P(A1) = A1 and P(An) = A_{n+1}, for n > 1. Also, P(Bn) = B_{n+1}, for n ≥ 1. Thus, in (2) the axiom of the power set is valid.

iv. In (2) the set A1 is a selection set of A1, which is the only disjointed set of the model. Thus, in (2) the axiom of choice is valid.

To prove the independence of the axiom of choice, consider a model whose domain consists of the sets A1, B1, C1, A2, B2, C2, A3, B3, C3, ... and which is described by (3):

A1 = {A2, B1}          B1 = {C2}
A2 = {A1, B2}          B2 = {B1}
A3 = {A2, B3}          ...
A4 = {A3, B4}          ...
...
An = {A_{n−1}, Bn}     Bn = {B_{n−1}}
...

C1 = {A1, B2, C2}
C2 = {A2, B1, B3, C1}
C3 = {A1, A3, B2, B4, C2}
C4 = {A2, A4, B1, B3, B5, C1, C3}
...
C_{2n} = {A2, A4, ..., A_{2n}, B1, B3, ..., B_{2n+1}, C1, C3, ..., C_{2n−1}}
C_{2n+1} = {A1, A3, ..., A_{2n+1}, B2, B4, ..., B_{2n+2}, C2, C4, ..., C_{2n}}          (3)
...

Now, we see that in the above:

i. No two different sets have the same elements. Thus, in (3) the axiom of extensionality is valid.

ii. It can be readily checked that in (3)

∪A1 = C1 and ∪An = A_{n−1}, for n > 1,
∪B1 = C2 and ∪Bn = B_{n−1}, for n > 1,
∪C1 = C2 and ∪Cn = C_{n−1}, for n > 1.

Thus, in (3) the axiom of the sum set is valid.

iii. It can readily be checked that in (3), for n > 1,

P(An) = A_{n+1}, P(Bn) = B_{n+1}, P(Cn) = C_{n+1}.

Thus, in (3) the axiom of the power set is valid.


iv. In (3) we see that

A1 = {A2, B1} = {{A1, B2}, {C2}}

is a disjointed set. However, neither {A1, C2} nor {B2, C2} exists in (3). Hence, A1 cannot possibly have a selection set in the model described by (3). Thus, in (3) the axiom of choice is invalid.

2 The construction of N, Z, Q, and Q̄

2.1 The construction of the natural numbers

As already hinted at in 1.1.5, the natural numbers exist. But what is a natural number? We’ll identify them with sets (based on the empty set) according to the following model:

0 = ∅
1 = {∅}
2 = {∅, {∅}}
3 = {∅, {∅}, {∅, {∅}}}
...
n = {0, 1, 2, ..., n − 1}
...

From this model we have that S (n) = n ∪ {n}.
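The identity S(n) = n ∪ {n} can be tried out directly. The following is a minimal sketch using Python frozensets; the names successor and von_neumann are illustrative and do not appear in the article.

```python
def successor(n):
    """S(n) = n ∪ {n}."""
    return n | frozenset({n})


def von_neumann(k):
    """The set that represents the natural number k in the model above."""
    n = frozenset()          # 0 = ∅
    for _ in range(k):
        n = successor(n)
    return n


three = von_neumann(3)
```

In this representation the set for k has exactly k elements, and m < n corresponds to the set for m being an element of the set for n.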

2.2 The construction of the integers

We’ve postulated the successor operator in 1.1. We’re now going to define the addition operator as

m ⊕ n = S(S(...S(m)...)), with n applications of S.

In other words, m ⊕ n is the nth successor of m. Further, we’re going to define the inverse operator ⊖ by

m ⊖ n = k if and only if m = n ⊕ k.

This gives rise to negative numbers for n > m. The natural numbers together with the negative numbers are the integers, denoted Z.
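A small sketch of addition as iterated succession, and of its inverse, is given below. It uses ordinary Python integers as stand-ins for natural numbers, and the names add and subtract are illustrative. The point of interest is that the inverse has no answer among the natural numbers when n exceeds m, which is what motivates the negative numbers.

```python
def successor(n):
    """S(n) on natural numbers, modelled by Python integers."""
    return n + 1


def add(m, n):
    """m ⊕ n: apply the successor n times to m."""
    result = m
    for _ in range(n):
        result = successor(result)
    return result


def subtract(m, n):
    """The natural number k with n ⊕ k = m, or None if no such k exists."""
    for k in range(m + 1):
        if add(n, k) == m:
            return k
    return None
```

For example, subtract(7, 3) finds k = 4, while subtract(2, 5) returns None because no natural number k satisfies 5 ⊕ k = 2.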

2.3 The construction of the rational numbers

Define the multiplication operator as

m ⊗ n = m ⊕ m ⊕ ... ⊕ m, with n terms,

with the following attribute: m ⊗ n ≥ 0 if either m ≤ 0 and n ≤ 0, or m ≥ 0 and n ≥ 0. Then define the division operator ⊘ by

m ⊘ n = k if and only if m = k ⊗ n, provided that n ≠ 0.

The set of rational numbers, denoted Q, consists of all the numbers m/n, where m is any integer and n is a natural number strictly greater than zero.

Note. Since the operators aren’t of significant interest I won’t explain them to a greater extent, nor why said attributes and restrictions are added. However, the attributes are significant in the construction of the real numbers and will be elaborated on in 3.1.
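For readers who prefer to see the two operators in action, here is a minimal sketch. It assumes Python integers for Z and the standard library Fraction type as a stand-in for Q; the function names multiply and divide are illustrative.

```python
from fractions import Fraction


def multiply(m, n):
    """m ⊗ n for an integer m and a natural number n, as n-fold repeated addition."""
    total = 0
    for _ in range(n):
        total += m
    return total


def divide(m, n):
    """The rational k with k ⊗ n = m, for an integer m and a natural number n > 0."""
    if n <= 0:
        raise ValueError("n must be a natural number strictly greater than zero")
    return Fraction(m, n)


k = divide(6, 4)   # stored in lowest terms, so 6/4 and 3/2 are the same rational
```

Fraction keeps every rational in lowest terms, which mirrors the idea that one number may be written in several ways but occurs only once in the set.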

2.4 The construction of the algebraic numbers

The algebraic numbers are constructed following the same sequence of hyperoperators (successor, addition, multiplication, exponentiation, tetration, etc.) and their respective inverses, this step of course being exponentiation and rooting. In which case an algebraic number would be written as:

a_1 + a_2 √(x_1) + a_3 ∛(x_2) + ... + a_n ⁿ√(x_{n−1}),

where n is a natural number, the a_i are rational numbers, and the x_i are positive rational numbers, chosen so that no algebraic number is listed twice. In other words, the algebraic number √(1/4) is the same as the algebraic number 1/2 (which also happens to be a rational number), and while they can be written in different ways the set of algebraic numbers may only contain one of them.

3 The construction of R

Definition 3.1. Let A be a set. An order on A is a relation, denoted by <, with the following two properties:

i. If x ∈ A and y ∈ A then one and only one of the following statements

x < y, x = y, y < x

is true.

ii. If x,y,z ∈ A, x < y, and y < z , then x < z.

Definition 3.2. An ordered set is a set A in which an order is defined. (For instance, Q is an ordered set if, for x, y ∈ Q, x < y is defined to mean that y − x is a positive rational number.)

Definition 3.3. Suppose that A is an ordered set, and B ⊂ A. If there exists an x ∈ A such that y ≤ x for every y ∈ B, we say that B is bounded above, and refer to x as an upper bound of B. Lower bounds are defined in the same manner, with ≥ in place of ≤.


Definition 3.4. Suppose that A is an ordered set, B ⊂ A, and B is bounded above. Suppose there exists an x ∈ A with the following properties:

i. x is an upper bound of B.

ii. If z < x then z is not an upper bound of B.

Then x is called the least upper bound of B (that there’s at most one such x is clear from ii) or the supremum of B, denoted:

x = sup (B)

The greatest lower bound, or infimum, of a set B which is bounded below is defined in the same manner. The statement

x = inf (B)

means that x is a lower bound of B and that no y with y > x is a lower bound of B.

Definition 3.5. An ordered set A is said to have the least-upper-bound property if the following is true: if B ⊂ A, B is not empty, and B is bounded above, then sup (B) exists in A.
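A standard example, not taken from this article, shows why Q itself lacks the least-upper-bound property: the set B of positive rationals x with x² < 2 is bounded above in Q, yet every rational upper bound can be replaced by a smaller one. The sketch below assumes the well-known map p ↦ (2p + 2)/(p + 2); the names in_B and smaller_upper_bound are illustrative.

```python
from fractions import Fraction


def in_B(x):
    """Membership in B = {x in Q : x > 0 and x*x < 2}."""
    return x > 0 and x * x < 2


def smaller_upper_bound(p):
    """Given a rational upper bound p of B (p > 0, p*p > 2), return a smaller one.

    With q = (2p + 2)/(p + 2) we get p - q = (p*p - 2)/(p + 2) > 0 and
    q*q - 2 = 2(p*p - 2)/(p + 2)**2 > 0, so q is again an upper bound of B.
    """
    return (2 * p + 2) / (p + 2)


bounds = [Fraction(2)]
for _ in range(4):
    bounds.append(smaller_upper_bound(bounds[-1]))
```

Each entry of bounds is a rational upper bound of B that is strictly smaller than the one before it, so no rational number can serve as sup (B).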

Definition 3.6. A field is a set A with two operations, called addition and multiplication, which satisfy the following axioms (commonly referred to as the field axioms).

Axioms for addition

i. if x ∈ A and y ∈ A, then their sum x + y is in A.

ii. Addition is commutative: x + y = y + x for all x, y ∈ A.

iii. Addition is associative: (x + y) + z = x + (y + z) for all x,y,z ∈ A.

iv. A contains an element 0 such that 0 + x = x for every x ∈ A.

v. To every x ∈ A corresponds an element −x ∈ A such that x + (−x) = 0.

Axioms for multiplication

i. If x ∈ A and y ∈ A, then their product xy is in A.

ii. Multiplication is commutative: xy = yx for all x, y ∈ A.

iii. Multiplication is associative: (xy) z = x (yz) for all x, y, z ∈ A.

iv. A contains an element 1 ≠ 0 such that 1 · x = x for every x ∈ A.

v. If x ∈ A and x ≠ 0, then there exists an element 1/x ∈ A such that x · (1/x) = 1.

The distributive law

x(y + z) = xy + xz ∀x,y,z ∈ A.
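The field axioms can be verified exhaustively on a small finite example. The sketch below uses the integers modulo a given modulus, which is an assumption made purely for illustration and is not a structure discussed in this article; the function name satisfies_field_axioms is hypothetical.

```python
from itertools import product


def satisfies_field_axioms(modulus):
    """Check the field axioms for the integers modulo `modulus` by brute force."""
    elems = range(modulus)

    def add(x, y):
        return (x + y) % modulus

    def mul(x, y):
        return (x * y) % modulus

    for x, y, z in product(elems, repeat=3):
        if add(x, y) != add(y, x) or mul(x, y) != mul(y, x):
            return False                      # commutativity
        if add(add(x, y), z) != add(x, add(y, z)):
            return False                      # associativity of addition
        if mul(mul(x, y), z) != mul(x, mul(y, z)):
            return False                      # associativity of multiplication
        if mul(x, add(y, z)) != add(mul(x, y), mul(x, z)):
            return False                      # distributive law
    for x in elems:
        if add(0, x) != x or mul(1 % modulus, x) != x:
            return False                      # neutral elements
        if not any(add(x, y) == 0 for y in elems):
            return False                      # additive inverse
        if x != 0 and not any(mul(x, y) == 1 for y in elems):
            return False                      # multiplicative inverse
    return modulus > 1                        # the axiom 1 ≠ 0
```

The integers modulo 5 pass every check, whereas the integers modulo 6 fail because 2 has no multiplicative inverse there.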

Definition 3.7. An ordered field is a field A which is also an ordered set, such that

i. x + y < x + z if  x,y,z ∈ A and y < z .


ii. xy > 0 if  x, y ∈ A, x > 0, and y > 0.

If  x > 0, we call x positive ; if  x < 0, we call x negative .

Proposition 3.8. The axioms for addition imply the following statements.

i. If x + y = x + z then y = z.

ii. If x + y = x then y = 0.

iii. If x + y = 0 then y = −x.

iv. −(−x) = x.

Statement i is the cancellation law. Note that ii asserts the uniqueness of the element whose existence is assumed in axiom iv for addition, and iii does the same for axiom v for addition.

Proof. If x + y = x + z, the axioms give

y = 0 + y = (−x + x) + y = −x + (x + y) = −x + (x + z) = (−x + x) + z = 0 + z = z.

This proves i. Take z = 0 in i to obtain ii. Take z = −x in i to obtain iii. Since −x + x = 0, iii (with −x in place of x) gives iv.

Proposition 3.9. The axioms for multiplication imply the following statements.

i. If x ≠ 0 and xy = xz then y = z.

ii. If x ≠ 0 and xy = x then y = 1.

iii. If x ≠ 0 and xy = 1 then y = 1/x.

iv. If x ≠ 0 then 1/(1/x) = x.

The proof of proposition 3.9 is analogous to that of proposition 3.8 and hence omitted.

Proposition 3.10. The field axioms imply the following statements, for any x, y, z ∈ A.

i. 0 · x = 0.

ii. If x ≠ 0 and y ≠ 0, then xy ≠ 0.

iii. (−x) y = − (xy) = x (−y).

iv. (−x) (−y) = xy.

Proof. 0 · x + 0 · x = (0 + 0) x = 0 · x. Hence ii in proposition 3.8 implies that 0 · x = 0, and i holds. Next, assume that x ≠ 0, y ≠ 0, but xy = 0. Then i gives

1 = (x/x) · (y/y) = ((1/x) · (1/y)) xy = ((1/x) · (1/y)) · 0 = 0,

a contradiction. Thus ii holds. The first equality in iii comes from

(−x) y + xy = (−x + x) y = 0 · y = 0,


combined with iii in proposition 3.8; the other half of iii is proved in the same way. Finally,

(−x) (−y) = − (x (−y)) = − (− (xy))

by iii, and iv in proposition 3.8.

Proposition 3.11. The following statements are true in every ordered field.

i. If x > 0 then −x < 0, and vice versa.

ii. If x > 0 and y < z , then xy < xz.

iii. If x < 0 and y < z , then xy > xz.

iv. If x ≠ 0 then x² > 0. In particular, 1 > 0.

v. If 0 < x < y then 0 < 1/y < 1/x.

Proof. If x > 0 then 0 = −x + x > −x + 0, so that −x < 0. If x < 0 then 0 = −x + x < −x + 0, so that −x > 0. This proves i.

Since z > y, we have z − y > y − y = 0, hence x (z − y) > 0, and therefore

xz = x (z − y) + xy > 0 + xy = xy.

By i, ii, and iii in proposition 3.10 (only iii is from proposition 3.10),

−(x (z − y)) = (−x) (z − y) > 0,

so that x (z − y) < 0, hence xz < xy.

If x > 0, part ii of definition 3.7 gives x² > 0. If x < 0, then −x > 0, hence (−x)² > 0. But x² = (−x)², by iv in proposition 3.10. Since 1 = 1², 1 > 0.

If y > 0 and s ≤ 0, then ys ≤ 0. But y · (1/y) = 1 > 0. Hence 1/y > 0. Likewise, 1/x > 0. If we multiply both sides of the inequality x < y by the positive quantity (1/x) · (1/y), we obtain 1/y < 1/x.
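The five statements of proposition 3.11 can be spot-checked in the ordered field Q. The sketch below runs through a small grid of rationals using the standard library Fraction type; a finite sample is of course no substitute for the proof, and the names SAMPLES and check_ordered_field_statements are illustrative.

```python
from fractions import Fraction
from itertools import product

# A small grid of rationals between -4 and 4 (duplicates are harmless).
SAMPLES = [Fraction(n, d) for n in range(-4, 5) for d in (1, 2, 3)]


def check_ordered_field_statements(samples):
    """Return True if statements i to v of proposition 3.11 hold on all sample triples."""
    for x, y, z in product(samples, repeat=3):
        if x > 0 and not (-x < 0):
            return False                       # statement i
        if x < 0 and not (-x > 0):
            return False                       # statement i, vice versa
        if x > 0 and y < z and not (x * y < x * z):
            return False                       # statement ii
        if x < 0 and y < z and not (x * y > x * z):
            return False                       # statement iii
        if x != 0 and not (x * x > 0):
            return False                       # statement iv
        if 0 < x < y and not (0 < 1 / y < 1 / x):
            return False                       # statement v
    return True
```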

3.1 The construction of the real numbers through Dedekind cuts

There exists an ordered field R which has the least-upper-bound property. Moreover, R contains Q as a subfield.

Proof. The proof will (for the sake of convenience) be divided into nine consecutive steps.

Step 3.12. The members of R will be certain subsets of Q, called cuts. A cut is, by definition, any set α ⊂ Q with the following properties.

i. α is not empty, and α ≠ Q.

ii. If x ∈ α, y ∈ Q, and y < x, then y ∈ α.


iii. If x ∈ α, then x < y for some y ∈ α.

The letters x, y, z, ... will in this section always denote rational numbers, and α, β, γ, ... will be used to denote cuts.
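To make the three properties of a cut concrete, the sketch below represents a cut by its membership test on rationals. The cut sqrt2_cut, which plays the role of the square root of 2, is a standard example that is assumed here and not taken from this article; the helper names are illustrative as well.

```python
from fractions import Fraction


def rational_cut(x):
    """Membership test of the cut {z in Q : z < x} attached to a rational x."""
    return lambda z: z < x


def sqrt2_cut(z):
    """Membership test of the cut {z in Q : z < 0 or z*z < 2}."""
    return z < 0 or z * z < 2


def larger_member(z):
    """For z >= 0 in sqrt2_cut, return a larger rational that is still in the cut.

    With q = (2z + 2)/(z + 2) we get q - z = (2 - z*z)/(z + 2) > 0 and
    q*q - 2 = 2(z*z - 2)/(z + 2)**2 < 0, which illustrates property iii.
    """
    return (2 * z + 2) / (z + 2)


z = Fraction(7, 5)
bigger = larger_member(z)
```

Property i shows up as sqrt2_cut(1) being true while sqrt2_cut(2) is false, and property ii as the fact that every rational below a member is again a member.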

Step 3.13. Define "α < β" to mean: α is a proper subset of β. Does this definition meet the requirements of definition 3.1? If α < β and β < γ it’s clear that α < γ, as a proper subset of a proper subset is a proper subset of the initial set. It’s also clear that at most one of the three relations

α < β, α = β, β < α

can hold for any pair α, β. To show that at least one holds, assume that the first two fail. Then α is not a subset of β. Hence there is an x ∈ α with x ∉ β. If y ∈ β, it follows that y < x (since x ∉ β), hence y ∈ α, by ii in step 3.12. Thus β ⊂ α. Since β ≠ α, we conclude: β < α. Thus R is now an ordered set.

Step 3.14. The ordered set R has the least-upper-bound property. To prove this, let A be a nonempty subset of R, and assume that β ∈ R is an upper bound of A. Define γ to be the union of all α ∈ A. We shall prove that γ ∈ R and that γ = sup (A).

Since A is not empty, there exists an α0 ∈ A. This α0 is not empty (as cuts cannot be empty). Since α0 ⊂ γ, γ is not empty. Next, γ ⊂ β (since α ⊂ β for every α ∈ A), and therefore γ ≠ Q. Thus γ satisfies property i in step 3.12. Pick x ∈ γ. Then x ∈ α1 for some α1 ∈ A. If y < x, then y ∈ α1, hence y ∈ γ; this proves property ii in step 3.12. If z ∈ α1 is so chosen that z > x, we see that z ∈ γ (since α1 ⊂ γ), and therefore γ satisfies property iii in step 3.12.

Thus γ ∈ R. It’s clear that α ≤ γ for every α ∈ A. Suppose δ < γ. Then there is an s ∈ γ so that s ∉ δ. Since s ∈ γ, s ∈ α for some α ∈ A. Hence δ < α, and δ is not an upper bound of A. This gives the desired result: γ = sup (A).

Step 3.15. If α ∈ R and β ∈ R we define α + β to be the set of all sums x + y, where x ∈ α and y ∈ β.

We define 0∗ to be the set of all negative rational numbers. It’s clear that 0∗ is a cut. We verify that the axioms for addition (see definition 3.6) hold in R, with 0∗ playing the role of 0.

i. We have to show that α + β is a cut. It’s clear that α + β is a nonempty subset of Q. Take x′ ∉ α and y′ ∉ β. Then x′ + y′ > x + y for all choices of x ∈ α and y ∈ β. Thus x′ + y′ ∉ α + β. It follows that α + β has property i in step 3.12. Pick z ∈ α + β. Then z = x + y, with x ∈ α and y ∈ β. If s < z, then s − y < x, so s − y ∈ α, and s = (s − y) + y ∈ α + β. Thus property ii in step 3.12 holds. Choose t ∈ α so that t > x. Then z < t + y and t + y ∈ α + β. Thus property iii in step 3.12 holds.

ii. α + β is the set of all x + y, with x ∈ α and y ∈ β. By the same definition, β + α is the set of all y + x. Since x + y = y + x for all x ∈ Q, y ∈ Q, we have α + β = β + α.

iii. As above, this follows from the associative law in Q.


iv. If x ∈ α and y ∈ 0∗, then x + y < x, hence x + y ∈ α. Thus α + 0∗ ⊂ α. To obtain the opposite inclusion, pick z ∈ α, and pick x ∈ α, x > z. Then z − x ∈ 0∗, and z = x + (z − x) ∈ α + 0∗. We conclude that α + 0∗ = α.

v. Fix α ∈ R. Let β be the set of all z with the following property: There exists x > 0 such that −z − x ∉ α. In other words, some rational number smaller than −z fails to be in α.

If y ∉ α and z = −y − 1, then −z − 1 ∉ α, hence z ∈ β. So β is not empty. If s ∈ α, then −s ∉ β. So β ≠ Q. Hence β satisfies property i in step 3.12.

Pick z ∈ β, and pick x > 0, so that −z − x ∉ α. If s < z, then −s − x > −z − x, hence −s − x ∉ α. Thus s ∈ β, and property ii in step 3.12 holds. Put t = z + x/2. Then t > z, and −t − x/2 = −z − x ∉ α, so that t ∈ β. Hence β satisfies property iii in step 3.12.

We’ve now proved that β ∈ R.

If x ∈ α and y ∈ β, then −y ∉ α, hence x < −y, x + y < 0. Thus α + β ⊂ 0∗.

To prove the opposite inclusion, pick u ∈ 0∗, put v = −u/2. Then v > 0, and there is an integer n such that nv ∈ α but (n + 1) v ∉ α. (Note that this depends on the fact that Q has the archimedean property.) Put z = −(n + 2) v. Then z ∈ β, since −z − v ∉ α, and

u = nv + z ∈ α + β.

Thus 0∗ ⊂ α + β. We conclude that α + β = 0∗. This β will of course be denoted by −α.
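The archimedean search in the last part of the argument can be traced numerically. The sketch below assumes a cut given by its membership test, using the cut of all rationals z with z < 0 or z² < 2 as an example that is not taken from this article, and produces for a given u < 0 the decomposition u = nv + z. The function names are illustrative.

```python
from fractions import Fraction


def sqrt2_cut(z):
    """Membership test of the example cut {z in Q : z < 0 or z*z < 2}."""
    return z < 0 or z * z < 2


def last_multiple_in_cut(in_cut, v):
    """Return the integer n with n*v in the cut and (n + 1)*v outside it (v > 0)."""
    n = 0
    if in_cut(n * v):
        while in_cut((n + 1) * v):
            n += 1
    else:
        while not in_cut(n * v):
            n -= 1
    return n


def inverse_witness(in_cut, u):
    """For a rational u < 0, return (n*v, z) with n*v in the cut, z in β, n*v + z = u."""
    v = -u / 2
    n = last_multiple_in_cut(in_cut, v)
    z = -(n + 2) * v
    return n * v, z


a, z = inverse_witness(sqrt2_cut, Fraction(-1, 5))
```

Here v = 1/10 and the search stops at n = 14, so a = 7/5 lies in the cut, while −z − v = 3/2 does not, which certifies that z belongs to β.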

Step 3.16. Having proved that the addition defined in step 3.15 satisfies the axioms for addition of definition 3.6, it follows that proposition 3.8 is valid in R, and we can prove one of the requirements of definition 3.7: If α, β, γ ∈ R and β < γ, then α + β < α + γ.

Indeed, it’s obvious from the definition of + in R that α + β ⊂ α + γ; if we had α + β = α + γ, the cancellation law in proposition 3.8 would imply β = γ.

It also follows that α > 0∗ if and only if −α < 0∗.

Step 3.17. Multiplication is a bit more bothersome than addition in the present context, since products of negative rational numbers are positive. For this reason we confine ourselves first to R+, the set of all α ∈ R with α > 0∗.

If α ∈ R+ and β ∈ R+, we define αβ to be the set of all z such that z ≤ xy for some choice of x ∈ α, y ∈ β, x > 0, and y > 0.

We define 1∗ to be the set of all s < 1.

Then the axioms for multiplication and the distributive law in definition 3.6 hold, with R+ in place of A, and with 1∗ in the role of 1. The proofs are analogous to those given in detail in step 3.15, so we omit them.

Note, in particular, that the second requirement of definition 3.7 holds: If α > 0∗ and β > 0∗ then αβ > 0∗.


Step 3.18. We complete the definition of multiplication by setting α · 0∗ = 0∗ · α = 0∗ and by setting

αβ = (−α) (−β) if α < 0∗, β < 0∗,
αβ = −((−α) β) if α < 0∗, β > 0∗,
αβ = −(α (−β)) if α > 0∗, β < 0∗.

The products on the right were defined in step 3.17.

Having proved in step 3.17 that the axioms for multiplication hold in R+, it’s now perfectly simple to prove them in R, by repeated application of the identity γ = −(−γ) in proposition 3.8. (See step 3.16.)

The proof of the distributive law

α (β + γ) = αβ + αγ

breaks into cases. For instance, suppose α > 0∗, β < 0∗, β + γ > 0∗. Then γ = (β + γ) + (−β), and (since we already know that the distributive law holds in R+)

αγ = α (β + γ) + α (−β).

But α (−β) = −(αβ). Thus

αβ + αγ = α (β + γ).

The other cases are handled analogously.

We’ve now completed the proof that R is an ordered field with the least-upper-bound property.

Step 3.19. We associate with each x ∈ Q the set x∗ which consists of all z ∈ Q such that z < x. It’s clear that each x∗ is a cut; that is, x∗ ∈ R. These cuts satisfy the following relations:

i. x∗ + y∗ = (x + y)∗,

ii. x∗y∗ = (xy)∗,

iii. x∗ < y∗ if and only if  x < y .

To prove i, choose z ∈ x∗ + y∗. Then z = s + t, where s < x, t < y. Hence z < x + y, which says that z ∈ (x + y)∗.

Conversely, suppose z ∈ (x + y)∗. Then z < x + y. Choose u so that 2u = x + y − z, and put

x′ = x − u, y′ = y − u.

Then x′ ∈ x∗, y′ ∈ y∗, and z = x′ + y′, so that z ∈ x∗ + y∗. This proves i. The proof of ii is analogous.

If x < y then x ∈ y∗, but x ∉ x∗; hence x∗ < y∗.

If x∗ < y∗, then there is a z ∈ y∗ such that z ∉ x∗. Hence x ≤ z < y, so that x < y. This proves iii.
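The converse half of the proof of relation i can be checked numerically. The sketch below is only an illustration with exact rational arithmetic; the function name split_sum and the sample values are assumptions made for this example and do not come from the article.

```python
from fractions import Fraction

def split_sum(x, y, z):
    """Given rationals with z < x + y, return (x1, y1) with
    x1 < x, y1 < y and x1 + y1 == z, as in the proof of relation i."""
    if not z < x + y:
        raise ValueError("need z < x + y")
    u = (x + y - z) / 2          # 2u = x + y - z, hence u > 0
    return x - u, y - u

# Hypothetical sample values: z = 1 lies in (x + y)* because 1 < 1/3 + 5/7.
x, y, z = Fraction(1, 3), Fraction(5, 7), Fraction(1)
x1, y1 = split_sum(x, y, z)
print(x1, y1, x1 + y1)
```

Since x1 < x and y1 < y, the two summands lie in x∗ and y∗ respectively, which is exactly what the proof needs.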


Step 3.20. We saw in 3.19 that the replacement of the rational numbers x by the corresponding “rational cuts” x∗ ∈ R preserves sums, products, and order. This fact may be expressed by saying that the ordered field Q is isomorphic to the ordered field Q∗ whose elements are the rational cuts. Of course, x∗ is by no means the same as x, but the properties we are concerned with (arithmetic and order) are the same in the two fields.

It is this identification of Q with Q∗ which allows us to regard Q as a subfield of R. The second part of the theorem is to be understood in terms of this identification. It’s a fact, which we will not prove here, that any two ordered fields with the least-upper-bound property are isomorphic. The first part of the theorem therefore characterizes the real field R completely.

4 Why axiomatic set theory?

Intuitively, a set is a collection of all elements that satisfy a certain given property.

In other words, we might be tempted to postulate the following rule of formation for sets.

Axiom 4.1 (Schema of comprehension, false). If P is a property then there exists a set X = {x : P(x)}.

This principle, however, is false as proven by the following theorem.

Theorem 4.2 (Russell’s paradox). Consider the set S whose elements are those (and only those) sets that are not members of themselves: S = {X : X ∉ X}.

Question: Does S belong to S? If S belongs to S, then S isn’t a member of itself, so S ∉ S. On the other hand, if S ∉ S, then S belongs to S. In either case, we have a contradiction. Thus we must conclude that

{X : X ∉ X}

isn’t a set, and we must revise the intuitive notion of a set.

The safe way to eliminate paradoxes of this type is to abandon the schema of comprehension and keep its weak version, the schema of separation, 4.5.

Once we give up the full comprehension schema, Russell’s paradox is no longer a threat; moreover, it provides this useful information: The set of all sets doesn’t exist. (Otherwise, apply the separation schema to the property x ∉ x.)

In other words, it’s the concept of the set of all sets that is paradoxical, not theidea of the comprehension itself.

Replacing full comprehension by separation presents us with a new problem. The separation axioms are too weak to develop set theory with its usual operations and constructions. Notably, these axioms aren’t sufficient to prove that, e.g., the union X ∪ Y of two sets exists, or to define the notion of a real number.

Thus we have to add further construction principles that postulate the existence of sets obtained from other sets by means of certain operations.


The axioms of ZFC are generally accepted as a correct formalization of those principles that mathematicians apply when dealing with sets.

4.1 Language of set theory, formulas

The axiom schema of separation as formulated above uses the vague notion of a property. To give the axioms a precise form, we develop axiomatic set theory in the framework of the first order predicate calculus. Apart from the equality predicate =, the language of set theory consists of the binary predicate ∈, the membership relation.

The formulas  of set theory are built up from the atomic formulas 

x ∈ y, x = y

by means of  connectives 

ϕ ∧ ψ, ϕ ∨ ψ, ¬ϕ, ϕ ⇒ ψ, ϕ ⇔ ψ

(conjunction, disjunction, negation, implication, equivalence) and quantifiers

∀xϕ, ∃xϕ.

In practice, we shall use in formulas other symbols, namely defined predicates, operations, and constants, and even use formulas informally; but it will be tacitly understood that each such formula can be written in a form that only involves ∈ and = as nonlogical symbols.

Concerning formulas with free variables, we adopt the notation convention that all free variables of a formula

ϕ(u1,...,un)

are among u1,...,un (possibly some ui are not free, or even don’t occur, in ϕ). A formula without free variables is called a sentence.

4.2 Classes

Although we work in ZFC which, unlike alternative axiomatic set theories, has only one type of object, namely sets, we introduce the informal notion of a class. We do this for practical reasons: It’s easier to manipulate classes than formulas.

If ϕ(x, p1,...,pn) is a formula, we call

C = {x : ϕ(x, p1,...,pn)}

a class. Members of the class C are all those sets x that satisfy ϕ(x, p1,...,pn):

x ∈ C if and only if ϕ(x, p1,...,pn).

We say that C is definable from p1,...,pn; if ϕ(x) has no parameters pi then the class C is definable.


Two classes are considered equal if they have the same elements: If 

C = {x : ϕ(x, p1,...,pn)}, D = {x : ψ(x, q1,...,qm)},

then C  = D if and only if for all x

ϕ (x, p1,...,pn) ⇔ ψ (x, q 1,...,q m) .

The universal class , or universe , is the class of all sets:

V  = {x : x = x} .

We define inclusion of classes (C is a subclass of D) by

C ⊂ D if and only if for all x, x ∈ C implies x ∈ D,

and the following operations on classes:

C ∩ D = {x : x ∈ C ∧ x ∈ D},

C ∪ D = {x : x ∈ C ∨ x ∈ D},

C − D = {x : x ∈ C ∧ x ∉ D},

⋃C = {x : x ∈ S for some S ∈ C} = ⋃{S : S ∈ C}.

Every set can be considered a class. If  S  is a set, consider the formula x ∈ S  and theclass

{x : x ∈ S } .

That the set S is uniquely determined by its elements follows from the axiom of extensionality, 4.3.

A class that isn’t a set is a proper class.

4.3 Extensionality

If  X  and Y  have the same elements, then X  = Y :

∀u (u ∈ X  ⇔ u ∈ Y ) ⇒ X  = Y 

The converse, namely, if X = Y then u ∈ X ⇔ u ∈ Y, is an axiom of predicate calculus. Thus we have

X  = Y  if and only if  ∀u (u ∈ X  ⇔ u ∈ Y ) .

The axiom expresses the basic idea of a set: A set is determined by its elements.


4.4 Pairing

For any a and b there exists a set {a, b} that contains exactly a and b:

∀a∀b∃c∀x (x ∈ c ⇔ x = a ∨ x = b) .

By extensionality, the set c is unique, and we can define the pair 

{a, b} = the unique c such that ∀x (x ∈ c ⇔ x = a ∨ x = b) .

The singleton  {a} is the set

{a} = {a, a} .

Since {a, b} = {b, a}, we further define an ordered pair

(a, b)

so as to satisfy the following condition:

(4) (a, b) = (c, d) if and only if  a = c and b = d.

For the formal definition of an ordered pair, we take

(a, b) = {{a} , {a, b}} .

The verification of (4) is rather simple.

Proof. If  a = c and b = d, then

{{a} , {a, b}} = {{c} , {c, d}} .

Thus (a, b) = (c, d).

Conversely, suppose (a, b) = (c, d). If a = b, then

(a, b) = {{a}, {a, b}} = {{a}, {a, a}} = {{a}};

(c, d) = {{c}, {c, d}} = {{a}}.

Thus {c} = {c, d} = {a}, which implies that a = c and a = d. By hypothesis a = b, hence b = d.

If a ≠ b, then (a, b) = (c, d) implies

{{a}, {a, b}} = {{c}, {c, d}}.

Suppose {c, d} = {a}. Then c = d = a, and so

{{c}, {c, d}} = {{a}, {a, a}} = {{a}}.

But then {{a}, {a, b}} would also equal {{a}}, so that a = b, contradicting a ≠ b. Suppose {c} = {a, b}. Then a = b = c, which also contradicts a ≠ b. Therefore {c} = {a}, so that c = a and {c, d} = {a, b}. If d = a were true, then

{c, d} = {a, a} = {a} ≠ {a, b},

a contradiction. Thus d = b is the case, so that a = c and b = d.
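Property (4) can also be checked by brute force on a small universe. The sketch below models sets as Python frozensets; the helper names pair and unpair are assumptions made for this illustration, not notation from the article.

```python
def pair(a, b):
    """Kuratowski ordered pair (a, b) = {{a}, {a, b}} as nested frozensets."""
    return frozenset({frozenset({a}), frozenset({a, b})})

def unpair(p):
    """Recover (a, b) from a Kuratowski pair built by pair()."""
    common = frozenset.intersection(*p)    # {a}: the element shared by all members
    (a,) = common
    rest = frozenset.union(*p) - common    # {b} if a != b, empty if a == b
    b = next(iter(rest)) if rest else a
    return a, b

# (a, b) = (c, d) exactly when a = c and b = d, checked on a 3-element universe.
universe = range(3)
ok = all(
    (pair(a, b) == pair(c, d)) == (a == c and b == d)
    for a in universe for b in universe
    for c in universe for d in universe
)
print(ok, unpair(pair(1, 2)), unpair(pair(2, 2)))
```

The degenerate case a = b collapses to {{a}}, which is why unpair has to treat an empty remainder separately, mirroring the case split in the proof.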


We further define ordered triples, quadruples, etc. as follows:

(a_1, a_2, a_3) = ((a_1, a_2), a_3),

(a_1, a_2, a_3, a_4) = ((a_1, a_2, a_3), a_4),

...

(a_1, ..., a_{n+1}) = ((a_1, ..., a_n), a_{n+1}).

It follows that two ordered n-tuples (a_1, ..., a_n) and (b_1, ..., b_n) are equal if and only if a_1 = b_1, ..., a_n = b_n.

4.5 Separation schema

Let ϕ (u, p) be a formula. For any X  and p, there exists a set Y  = {u ∈ X  : ϕ (u, p)}:

(5) ∀X ∀ p∃Y ∀u (u ∈ Y  ⇔ u ∈ X ∧ ϕ (u, p)) .

For each formula ϕ(u, p), the formula (5) is an axiom (of separation). The set Y in (5) is unique by extensionality.

Note that a more general version of the separation axioms can be proved by using ordered n-tuples: Let ψ(u, p1,...,pn) be a formula. Then

(6) ∀X ∀ p1...∀ pn∃Y ∀u (u ∈ Y  ⇔ u ∈ X ∧ ψ (u, p1,...,pn)) .

Simply let ϕ (u, p) be the formula

∃ p1...∃ pn ( p = ( p1,...,pn) and ψ (u, p1,...,pn))

and then given X  and p1,...,pn, let

Y  = {u ∈ X  : ϕ (u, ( p1,...,pn))} .

We can give the separation axioms the following form: Consider the class C = {u : ϕ(u, p1,...,pn)}; then by (6)

∀X ∃Y  (C ∩ X  = Y ) .

Thus the intersection of a class C with any set is a set; or, we can say even more informally, a subclass of a set is a set. One consequence of the separation axioms is that the intersection and the difference of two sets is a set, and so we can define the operations

X ∩ Y = {u ∈ X : u ∈ Y} and X − Y = {u ∈ X : u ∉ Y}.

Similarly, it follows that the empty class

∅ = {u : u ≠ u}

is a set, the empty set; this, of course, only under the assumption that at least one set X exists (because ∅ ⊂ X):

(7) ∃X (X  = X ) .


We have not included (7) among the axioms, because it follows from the axiom of infinity.

Two sets X, Y are called disjoint if X ∩ Y = ∅.

If C is a nonempty class of sets, we let

⋂C = ⋂{X : X ∈ C} = {u : u ∈ X for all X ∈ C}.

Note that ⋂C is a set (it’s a subset of any X ∈ C). Also, X ∩ Y = ⋂{X, Y}.

Another consequence of the separation axioms is that the universal class V is a proper class; otherwise,

S = {x ∈ V : x ∉ x}

would be a set.

4.6 Union

For any X there exists a set Y = ⋃X:

(8) ∀X∃Y∀u (u ∈ Y ⇔ ∃z (z ∈ X ∧ u ∈ z)).

Let us introduce the abbreviations

(∃z ∈ X ) ϕ for ∃z (z ∈ X ∧ ϕ) ,

and

(∀z ∈ X) ϕ for ∀z (z ∈ X ⇒ ϕ).

By (8), for every X  there is a unique set

Y = {u : (∃z ∈ X) u ∈ z} = ⋃{z : z ∈ X} = ⋃X,

the union of X.

Now we can define

X ∪ Y = ⋃{X, Y},

X ∪ Y ∪ Z = (X ∪ Y) ∪ Z,

...

X_1 ∪ X_2 ∪ ... ∪ X_{n+1} = (X_1 ∪ ... ∪ X_n) ∪ X_{n+1},

and also

{a,b,c} = {a, b} ∪ {c} ,

and in general

{a1,...,an} = {a1} ∪ ... ∪ {an} .

We also let

X △ Y = (X − Y) ∪ (Y − X),

the symmetric difference  of  X  and Y .


4.7 Power set

For any X  there exists a set Y  = P (X ):

∀X∃Y∀u (u ∈ Y ⇔ u ⊂ X).

A set U  is a subset  of  X , U  ⊆ X , if 

∀z (z ∈ U  ⇒ z ∈ X ) .

If U ⊂ X and U ≠ X, then U is a proper subset of X.

The set of all subsets of X,

P (X ) = {u : u ⊆ X } ,

is called the power set of X.

Using the power set axiom we can define other basic notions of set theory.

The cartesian product of X and Y is the set of all pairs (x, y) such that x ∈ X and y ∈ Y:

(9) X × Y = {(x, y) : x ∈ X ∧ y ∈ Y}.

The notation {(x, y) : ...} in (9) is justified because

{(x, y) : ϕ (x, y)} = {u : ∃x∃y (u = (x, y) ∧ ϕ (x, y))} .

The cartesian product X × Y  is a set because

X × Y  ⊂ P (P (X ∪ Y )) .
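The inclusion X × Y ⊂ P(P(X ∪ Y)) can be verified directly for small finite sets. This is a sketch under the assumption that sets are modelled as frozensets and pairs as Kuratowski pairs; power_set, pair and cartesian are illustrative names.

```python
from itertools import chain, combinations

def power_set(s):
    """All subsets of s, each as a frozenset."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return frozenset(frozenset(c) for c in subsets)

def pair(a, b):
    """Kuratowski ordered pair {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})

def cartesian(X, Y):
    """X × Y as a set of Kuratowski pairs."""
    return frozenset(pair(x, y) for x in X for y in Y)

X, Y = frozenset({0, 1}), frozenset({1, 2})
product = cartesian(X, Y)
bound = power_set(power_set(X | Y))
print(len(product), len(bound), product <= bound)
```

Each pair {{x}, {x, y}} is a set of subsets of X ∪ Y, so it is an element of the double power set; separation then carves X × Y out of that set.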

Further, we define

X × Y  × Z  = (X × Y ) × Z ,

and in general

X 1 × ... × X n+1 = (X 1 × ... × X n) × X n+1.

Thus

X 1 × ... × X n = {(x1,...,xn) : x1 ∈ X 1 ∧ ... ∧ xn ∈ X n} .

We also let

X^n = X × ... × X (n times).

An n-ary relation R is a set of n-tuples. R is a relation on X if R ⊆ X^n. It’s customary to write R(x1,...,xn) instead of (x1,...,xn) ∈ R, and in case R is binary, we also use x R y for (x, y) ∈ R.


If  R is a binary relation, then the domain  of  R is the set

dom(R) = {u : ∃v (u, v) ∈ R} ,

and the range  of  R is the set

ran(R) = {v : ∃u (u, v) ∈ R} .

Note that dom(R) and ran(R) are sets because

dom(R) ⊂ ⋃⋃R and ran(R) ⊂ ⋃⋃R.

The field of a relation R is the set field(R) = dom(R) ∪ ran(R).

In general, we call a class R an n-ary relation if all its elements are n-tuples; in other words, if

R ⊆ C^n = the class of all n-tuples,

where C^n (and C × D) is defined in the obvious way.

A binary relation f is a function if (x, y) ∈ f and (x, z) ∈ f implies y = z. The unique y such that (x, y) ∈ f is the value of f at x; we use the standard notation y = f(x) or its variations f : x ↦ y, y = f_x, etc. for (x, y) ∈ f.

f is a function on X if X = dom(f). If dom(f) = X^n, then f is an n-ary function on X.

f is a function from X to Y, f : X → Y, if dom(f) = X and ran(f) ⊆ Y. The set of all functions from X to Y is denoted by Y^X. Note that Y^X is a set:

Y^X ⊆ P(X × Y).

If Y = ran(f), then f is a function onto Y. A function f is one-to-one if f(x) = f(y) implies x = y. An n-ary operation on X is a function f : X^n → X.

The restriction of a function f to a set X (usually a subset of dom(f)) is the function

f ↾ X = {(x, y) ∈ f : x ∈ X}.

A function g is an extension of a function f if g ⊇ f, i.e., dom(f) ⊆ dom(g) and g(x) = f(x) for all x ∈ dom(f).

If f and g are functions such that ran(g) ⊆ dom(f), then the composition of f and g is the function f ∘ g with domain dom(f ∘ g) = dom(g) such that (f ∘ g)(x) = f(g(x)).

We denote the image  of  X  by f  as f →X :

f →X  = {y : (∃x ∈ X ) y = f (x)} ,

and the inverse image

f←X = {x : f(x) ∈ X}.

If f is one-to-one, then f⁻¹ denotes the inverse of f:

f⁻¹(x) = y if and only if x = f(y).
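The definitions of this subsection (function, domain, range, restriction, image, inverse image) translate almost literally into code when a relation is stored as a finite set of pairs. The sketch below uses built-in Python tuples for pairs instead of Kuratowski pairs, purely for readability; all function names are assumptions made for this illustration.

```python
def is_function(rel):
    """A binary relation is a function if (x, y) and (x, z) in it imply y == z."""
    seen = {}
    for x, y in rel:
        if seen.setdefault(x, y) != y:
            return False
    return True

def dom(rel):
    return {x for x, _ in rel}

def ran(rel):
    return {y for _, y in rel}

def restrict(f, X):
    """The restriction of f to X: {(x, y) in f : x in X}."""
    return {(x, y) for x, y in f if x in X}

def image(f, X):
    """The image of X by f: {y : y = f(x) for some x in X}."""
    return {y for x, y in f if x in X}

def inverse_image(f, X):
    """The inverse image of X by f: {x : f(x) in X}."""
    return {x for x, y in f if y in X}

f = {(0, "a"), (1, "b"), (2, "a")}
print(is_function(f), dom(f), ran(f), image(f, {0, 2}), inverse_image(f, {"a"}))
```

Note that f here is not one-to-one, so the inverse image of {"a"} contains two points and f has no inverse function.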


The previous definitions can also be applied to classes instead of sets. A class F is a function if it’s a relation such that (x, y) ∈ F and (x, z) ∈ F implies that y = z. For example, F→C denotes the image of the class C by the function F.

It should be noted that a function is often called a mapping or a correspondence (and similarly, a set is called a family or a collection).

An equivalence relation on a set X is a binary relation ≡ which is reflexive, symmetric, and transitive: For all x, y, z ∈ X,

x ≡ x,

x ≡ y implies y ≡ x,

if x ≡ y and y ≡ z, then x ≡ z.

A family of sets is disjoint if any two of its members are disjoint. A partition of a set X is a disjoint family P of nonempty sets such that

X = ⋃{Y : Y ∈ P}.

Let ≡ be an equivalence relation on X. For every x ∈ X, let

[x] = {y ∈ X  : y ≡ x}

(the equivalence class of x). The set

X/≡ = {[x] : x ∈ X}

is a partition of X (the quotient of X by ≡). Conversely, each partition P of X defines an equivalence relation on X:

x ≡ y if and only if (∃Y  ∈ P ) (x ∈ Y  ∧ y ∈ Y ) .
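The passage from an equivalence relation to its quotient can be sketched for a finite set as follows. The function name quotient and the example relation (congruence modulo 3 on {0, ..., 9}) are assumptions chosen for illustration.

```python
def quotient(X, equiv):
    """Return X/≡ as a set of frozensets, given a predicate equiv(x, y)
    that is assumed to be an equivalence relation on X."""
    return {frozenset(y for y in X if equiv(y, x)) for x in X}

X = range(10)
classes = quotient(X, lambda a, b: (a - b) % 3 == 0)

# The classes are nonempty, pairwise disjoint, and their union is X.
covers = set().union(*classes) == set(X)
disjoint = all(c == d or not (c & d) for c in classes for d in classes)
print(sorted(sorted(c) for c in classes), covers, disjoint)
```

Because each class is built as [x] = {y ∈ X : y ≡ x}, equal classes coincide as frozensets and the set comprehension removes the duplicates automatically.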

If an equivalence relation is a class, then its equivalence classes may be proper classes.

4.8 Infinity

There exists an infinite set.

To give a precise formulation of the axiom of infinity, we have to define first the notion of finiteness. The most obvious definition of finiteness uses the notion of a natural number, which is as yet undefined. We shall define the natural numbers (as finite ordinals) in 5.

In principle, it’s possible to give a definition of finiteness that doesn’t mentionnumbers, but such definitions necessarily look artificial.

We therefore formulate the axiom of infinity differently:

∃S (∅ ∈ S ∧ (∀x ∈ S) x ∪ {x} ∈ S).

We call a set S with the property above inductive. Thus we have:


Axiom of infinity. There exists an inductive set.

The axiom provides for the existence of infinite sets. In 5 we show that an inductive set is infinite (and that an inductive set exists if there exists an infinite set).

We shall introduce natural numbers and finite sets in 5, as a part of the introduction of ordinal numbers.

4.9 Replacement schema

If a class F is a function, then for every set X, F(X) is a set.

For each formula ϕ(x, y, p), the formula (10) is an axiom (of replacement):

(10)

∀x∀y∀z (ϕ (x,y,p) ∧ ϕ (x,z,p) ⇒ y = z) ⇒ ∀X ∃Y ∀y (y ∈ Y  ⇔ (∃x ∈ X ) ϕ (x,y,p)) .

As in the case of the separation axioms, we can prove the version of the replacement axioms with several parameters: Replace p by p1,...,pn.

If F = {(x, y) : ϕ(x, y, p)}, then the premise of (10) says that F is a function, and we get the formulation above. We can also formulate the axioms in the following ways:

“If a class F is a function and dom(F) is a set, then ran(F) is a set.”

“If a class F is a function, then ∀X∃f (F ↾ X = f).”

5 Ordinal numbers

5.1 Linear and partial ordering

Definition 5.1. A binary relation < on a set A is a partial ordering  of  A if:

i. a ≮ a for any a ∈ A,

ii. if a < b and b < c, then a < c.

(A, <) is called a partially ordered set . A partial ordering < of A is a linear ordering  if 

iii. a < b, a = b, or a > b for all a, b ∈ A.

If < is a partial (linear) ordering, then the relation ≤ (where a ≤ b if either a < b or a = b) is also called a partial (linear) ordering. < is sometimes called a strict ordering.

Definition 5.2. If (A, <) is a partially ordered set, X is a nonempty subset of A, and a ∈ A, then:

a is a maximal  element of  X  if  a ∈ X  and ∀x ∈ X : a ≮ x;

a is a minimal  element of  X  if  a ∈ X  and ∀x ∈ X : x ≮ a;

a is the greatest  element of  X  if  a ∈ X  and ∀x ∈ X : x ≤ a;

a is the least element of X if a ∈ X and ∀x ∈ X: a ≤ x;

a is an upper bound of X if ∀x ∈ X: x ≤ a;

a is a lower bound  of  X  if ∀x ∈ X : a ≤ x;


a is the supremum  of  X  if  a is the least upper bound of  X ;

a is the infimum  of  X  if  a is the greatest lower bound of  X .

The supremum and infimum of a set X (if they exist) are denoted sup(X) and inf(X) respectively. Note that if X is linearly ordered by <, then a maximal element of X is its greatest element (similarly for a minimal element).

If (A, <) and (B, <) are partially ordered sets and f : A → B, then f is order-preserving if x < y implies f(x) < f(y). If A and B are linearly ordered, then an order-preserving function is also called increasing.

A one-to-one function f of A onto B is an isomorphism of A and B if both f and f⁻¹ are order-preserving; (A, <) is then isomorphic to (B, <). An isomorphism of A onto itself is an automorphism of (A, <).

5.2 Well-ordering

Definition 5.3. A linear ordering < of a set A is a well-ordering  if every nonemptysubset of  A has a least element.

The concept of well-ordering is of fundamental importance. It is shown below that well-ordered sets can be compared by their lengths; ordinal numbers will be introduced as order-types of well-ordered sets.

Lemma 5.4. If (W, <) is a well-ordered set and f : W → W is an increasing function, then f(x) ≥ x for each x ∈ W.

Proof. Assume that the set X = {x ∈ W : f(x) < x} is nonempty and let y be the least element of X. If w = f(y), then w < y, and since f is increasing, f(w) < f(y) = w. Hence w ∈ X and w < y, contradicting the minimality of y.

Corollary 5.5. The only automorphism of a well-ordered set is the identity.

Proof. By 5.4 we have that f(x) ≥ x for all x, and f⁻¹(x) ≥ x for all x.

Corollary 5.6. If two well-ordered sets W_1 and W_2 are isomorphic, then the isomorphism of W_1 onto W_2 is unique.

If W is well-ordered and a ∈ W, then {x ∈ W : x < a} is an initial segment of W (given by a).

Lemma 5.7. No well-ordered set is isomorphic to an initial segment of itself.

Proof. If ran(f ) = {x : x < a}, then f (a) < a, contrary to 5.4.

Theorem 5.8. If  W 1 and W 2 are well-ordered sets, then exactly one of the following

three cases holds:

i. W 1 is isomorphic to W 2;

ii. W 1 is isomorphic to an initial segment of  W 2;


iii. W 2 is isomorphic to an initial segment of  W 1.

Proof. For a ∈ W_i (i = 1, 2), let W_i(a) denote the initial segment of W_i given by a. Let

f = {(x, y) ∈ W_1 × W_2 : W_1(x) is isomorphic to W_2(y)}.

Using 5.7, it’s easy to see that f is a one-to-one function. If g is an isomorphism between W_1(x) and W_2(y), and x′ < x, then W_1(x′) and W_2(g(x′)) are isomorphic. It follows that f is order-preserving.

If dom(f) = W_1 and ran(f) = W_2, then case i holds.

If y_1 < y_2 and y_2 ∈ ran(f), then y_1 ∈ ran(f). Thus if ran(f) ≠ W_2 and y_0 is the least element of W_2 − ran(f), we have ran(f) = W_2(y_0). Necessarily, dom(f) = W_1, for otherwise we would have (x_0, y_0) ∈ f, where x_0 is the least element of W_1 − dom(f). Thus case ii holds.

Similarly, if dom(f) ≠ W_1, then case iii holds.

In view of 5.7, the three cases are mutually exclusive.

If W_1 and W_2 are isomorphic, we say that they have the same order-type.

Informally, an ordinal number is the order-type of a well-ordered set. We shall now give a formal definition of ordinal numbers.

5.3 Ordinal numbers

The idea is to define ordinal numbers so that

α < β  if and only if  α ∈ β , α = {β  : β < α} .

Definition 5.9. A set T is transitive if every element of T is a subset of T.

Equivalently,

⋃T ⊂ T, or T ⊂ P(T).

Definition 5.10. A set is an ordinal number (an ordinal) if it’s transitive and well-ordered by ∈.

We’re going to denote ordinal numbers by lowercase Greek letters α, β, γ, ..., and the class of ordinals will be denoted Ord.

We define

α < β  if and only if  α ∈ β .

Lemma 5.11.

i. 0 = ∅ is an ordinal.

ii. If α is an ordinal and β ∈ α, then β  is an ordinal.

iii. If α ≠ β are ordinals and α ⊂ β, then α ∈ β.

iv. If α, β are ordinals, then either α ⊂ β or β ⊂ α.

Proof.


i. True by definition.

ii. True by definition.

iii. If α ⊂ β, let γ be the least element of the set β − α. Since α is transitive, it follows that α is the initial segment of β given by γ. Thus α = {ξ ∈ β : ξ < γ} = γ, and so α ∈ β.

iv. α ∩ β = γ is clearly an ordinal. We have that either γ = α or γ = β; otherwise γ ∈ α and γ ∈ β by iii. Then γ ∈ γ, which contradicts the definition of an ordinal (specifically, that ∈ is a strict ordering of α).

Using 5.11 one gets the following facts about ordinal numbers (the proofs are routine):

i. < is a linear ordering of the class Ord.

ii. For each α, α = {β  : β < α}.

iii. If C is a nonempty class of ordinals, then ⋂C is an ordinal, ⋂C ∈ C, and ⋂C = inf(C).

iv. If X is a nonempty set of ordinals, then ⋃X is an ordinal, and ⋃X = sup(X).

v. For every α, α ∪ {α} is an ordinal and α ∪ {α} = inf({β : β > α}).

We thus define α + 1 = α ∪ {α} (the successor of α). In view of iv, the class Ord is a proper class; otherwise, consider sup(Ord) + 1.

We can now prove that the above definition of ordinals provides us with order-typesof well-ordered sets.

Theorem 5.12. Every well-ordered set is isomorphic to a unique ordinal number.

Proof. The uniqueness follows from 5.7. Given a well-ordered set W, we find an isomorphic ordinal as follows: Define f(x) = α if α is isomorphic to the initial segment of W given by x. If such an α exists, then it’s unique. By the axiom of replacement, f(W) is a set. For each x ∈ W, such an α exists (otherwise consider the least x for which such an α doesn’t exist). If γ is the least ordinal with γ ∉ f(W), then f(W) = γ and we have an isomorphism of W onto γ.

If α = β + 1, then α is a successor ordinal. If α is not a successor ordinal, then α = sup({β : β < α}) = ⋃α; α is called a limit ordinal. We also consider 0 a limit ordinal and define sup(∅) = 0.

The existence of limit ordinals other than 0 follows from the axiom of infinity.

Definition 5.13 (Natural numbers). We denote the least nonzero limit ordinal ω (or N). The ordinals less than ω (elements of N) are called finite ordinals, or natural numbers.

Specifically,

0 = ∅, 1 = 0 + 1, 2 = 1 + 1, 3 = 2 + 1, etc.

A set X is finite if there’s a one-to-one mapping of X onto some n ∈ N. X is infinite if it’s not finite.
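The finite ordinals 0 = ∅, 1 = 0 + 1, 2 = 1 + 1, ... can be built explicitly as nested frozensets, which makes the slogan "α < β if and only if α ∈ β" easy to test. This is a sketch only; successor, von_neumann and is_transitive are names chosen for the illustration.

```python
def successor(a):
    """alpha + 1 = alpha ∪ {alpha}."""
    return a | frozenset({a})

def von_neumann(n):
    """The finite ordinal n, built from the empty set by n successor steps."""
    a = frozenset()
    for _ in range(n):
        a = successor(a)
    return a

def is_transitive(t):
    """A set is transitive if every element is also a subset
    (elements are assumed to be frozensets themselves)."""
    return all(x <= t for x in t)

two, three = von_neumann(2), von_neumann(3)
print(len(three), two in three, two < three, is_transitive(three))
```

For these sets, membership (two in three) and proper inclusion (two < three) agree, as Lemma 5.11 predicts, and the ordinal n has exactly n elements.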


5.4 Induction and recursion

Theorem 5.14 (Transfinite induction). Let C  be a class of ordinals and assume that:

i. 0 ∈ C;

ii. if α ∈ C , then α + 1 ∈ C ;

iii. if α is a nonzero limit ordinal and β ∈ C  for all β < α, then α ∈ C .

Then C  is the class of ordinals.

Proof. Otherwise, let α be the least ordinal with α ∉ C and apply i, ii, or iii.

A function whose domain is the set N is called an (infinite) sequence (a sequence in X is a function f : N → X). The standard notation for a sequence is

⟨a_n : n < ω⟩

or variants thereof. A finite sequence is a function s such that dom(s) = {i : i < n} for some n ∈ N; then s is a sequence of length n.

A transfinite sequence is a function whose domain is an ordinal:

⟨a_ξ : ξ < α⟩.

It is also called an α-sequence or a sequence of length α. We also say that a sequence ⟨a_ξ : ξ < α⟩ is an enumeration of its range {a_ξ : ξ < α}. If s is a sequence of length α, then s⌢x or simply sx denotes the sequence of length α + 1 that extends s and whose αth term is x:

s⌢x = sx = s ∪ {(α, x)}.

Sometimes we call a “sequence”

⟨a_α : α ∈ Ord⟩

a function (a proper class) on Ord.

Definition by transfinite recursion usually takes the following form: Given a function f (on the class of transfinite sequences), then for every θ there exists a unique θ-sequence

⟨a_α : α < θ⟩

such that

a_α = f(⟨a_ξ : ξ < α⟩)

for every α < θ.

We shall give a general version of this theorem, so that we can also construct sequences ⟨a_α : α ∈ Ord⟩.


Theorem 5.15 (Transfinite recursion). Let f be a function (on V). Then (11) below defines a unique function g on Ord such that

g(α) = f(g ↾ α)

for each α. In other words, if we let a_α = g(α), then for each α,

a_α = f(⟨a_ξ : ξ < α⟩).

(Note that we tacitly use replacement: g ↾ α is a set for each α.)

Corollary 5.16. Let X be a set and θ an ordinal number. For every function f on the set of all transfinite sequences in X of length < θ such that ran(f) ⊂ X there exists a unique θ-sequence ⟨a_α : α < θ⟩ in X such that a_α = f(⟨a_ξ : ξ < α⟩) for every α < θ.

Proof.

(11) g(α) = x ⇔ there is a sequence ⟨a_ξ : ξ < α⟩ such that:

i. (∀ξ < α) a_ξ = f(⟨a_η : η < ξ⟩);

ii. x = f(⟨a_ξ : ξ < α⟩).

For every α, if there is an α-sequence that satisfies i, then such a sequence is unique: If ⟨a_ξ : ξ < α⟩ and ⟨b_ξ : ξ < α⟩ are two α-sequences satisfying i, one shows a_ξ = b_ξ by induction on ξ. Thus g(α) is determined uniquely by ii, and therefore g is a function. It follows, again by induction, that for each α there’s an α-sequence that satisfies i (at limit steps, we use replacement to get the α-sequence as the union of all ξ-sequences, ξ < α). Thus g is defined for all α ∈ Ord. It obviously satisfies

g(α) = f(g ↾ α).

If g′ is any function on Ord that satisfies

g′(α) = f(g′ ↾ α),

then it follows by induction that g′(α) = g(α) for all α.
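For finite lengths, the recursion scheme a_α = f(⟨a_ξ : ξ < α⟩) is ordinary course-of-values recursion and can be run directly. The sketch below is a finite analogue only (it says nothing about limit stages); the name recurse and the two sample functions are assumptions made for this illustration.

```python
def recurse(f, theta):
    """Build the sequence (a_0, ..., a_{theta-1}) with a_n = f((a_0, ..., a_{n-1})).

    f receives the whole earlier sequence as a tuple, mirroring g(alpha) = f(g restricted to alpha).
    """
    seq = ()
    for _ in range(theta):
        seq = seq + (f(seq),)
    return seq

# a_n = 1 + (sum of all earlier terms) yields the powers of two.
powers = recurse(lambda s: sum(s) + 1, 6)

# a_n = a_{n-1} + a_{n-2}, with value 1 while fewer than two terms exist.
fib = recurse(lambda s: s[-1] + s[-2] if len(s) >= 2 else 1, 8)

print(powers, fib)
```

Uniqueness is visible in the loop: each new term is forced by the terms already computed, which is the inductive argument of the proof in miniature.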

Definition 5.17. Let α > 0 be a limit ordinal and let ⟨γ_ξ : ξ < α⟩ be a nondecreasing sequence of ordinals (i.e., ξ < η implies γ_ξ ≤ γ_η). We define the limit of the sequence by

lim_{ξ→α} γ_ξ = sup({γ_ξ : ξ < α}).

A sequence of ordinals ⟨γ_α : α ∈ Ord⟩ is normal if it’s increasing and continuous, i.e., for every limit α, γ_α = lim_{ξ→α} γ_ξ.


5.5 Ordinal arithmetic

We shall now define addition, multiplication, and exponentiation of ordinal numbers using transfinite recursion.

Definition 5.18 (Addition). For all ordinal numbers α

i. α + 0 = α,

ii. α + (β + 1) = (α + β ) + 1, for all β ,

iii. α + β = lim_{ξ→β} (α + ξ) for all limit β > 0.

Definition 5.19 (Multiplication). For all ordinal numbers α

i. α · 0 = 0,

ii. α · (β + 1) = α · β + α for all β,

iii. α · β = lim_{ξ→β} α · ξ for all limit β > 0.

Definition 5.20 (Exponentiation). For all ordinal numbers α

i. α^0 = 1,

ii. α^(β+1) = α^β · α for all β,

iii. α^β = lim_{ξ→β} α^ξ for all limit β > 0.

As defined, the operations α + β, α · β, and α^β are normal functions in the second variable β. Their properties can be proved by transfinite induction. For instance, + and · are associative:

Lemma 5.21. For all ordinals α, β , and γ ,

i. α + (β + γ ) = (α + β ) + γ ,

ii. α · (β · γ ) = (α · β ) · γ .

Proof. By induction on γ .

Neither + nor · are commutative:

1 + ω = ω ≠ ω + 1, 2 · ω = ω ≠ ω · 2 = ω + ω.

Ordinal sums and products can also be defined geometrically, as can sums and products of arbitrary linear orders:

Definition 5.22. Let (A, <_A) and (B, <_B) be disjoint linearly ordered sets. The sum of these linear orders is the set A ∪ B with the ordering defined as follows: x < y if and only if

i. x, y ∈ A and x <_A y, or

ii. x, y ∈ B and x <_B y, or

iii. x ∈ A and y ∈ B.


Definition 5.23. Let (A, <) and (B, <) be linearly ordered sets. The product of these linear orders is the set A × B with the ordering defined by

(a1, b1) < (a2, b2) if and only if either b1 < b2 or (b1 = b2 and a1 < a2).
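For finite linear orders the product ordering can be listed explicitly, which shows why the second coordinate is compared first. The sketch below assumes that the elements of A and of B are ordered by Python's built-in comparison; product_order is an illustrative name.

```python
def product_order(A, B):
    """List A × B in the order of Definition 5.23:
    (a1, b1) < (a2, b2) iff b1 < b2, or b1 == b2 and a1 < a2."""
    return sorted(((a, b) for a in A for b in B), key=lambda p: (p[1], p[0]))

order = product_order([0, 1], ["x", "y", "z"])
print(order)
```

The listing consists of one copy of A for each element of B, laid end to end, which matches the reading of α · β as "β copies of α".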

Lemma 5.24. For all ordinals α and β, α + β and α · β are, respectively, isomorphic to the sum and to the product of α and β.

Proof. By induction on β .

Ordinal sums and products have some properties of ordinary addition and multiplication of integers. For instance:

Lemma 5.25.

i. If β < γ  then α + β < α + γ .

ii. If α < β  then there exists a unique δ  such that α + δ  = β .

iii. If β < γ and α > 0, then α · β < α · γ.

iv. If α > 0 and γ is arbitrary, then there exist a unique β and a unique ρ < α such that γ = α · β + ρ.

v. If β < γ and α > 1, then α^β < α^γ.

Proof.

i. By induction on γ .

ii. Let δ  be the order-type of the set {ξ  : α ≤ ξ < β }; δ  is unique by i.

iii. By induction on γ .

iv. Let β  be the greatest ordinal such that α · β ≤ γ .

v. By induction on γ .

Theorem 5.26 (Cantor’s normal form theorem). Every ordinal α < 0 can be repre-sented uniquely in the form

α = ωβ1 · k1 + ... + ωβn · kn,

where n ≥ 1, α ≥ β 1 > ... > β n, and k1,...,kn are nonzero natural numbers.

Proof. By induction on α. For α = 1 we have 1 = ω0 · 1; for arbitrary α > 0 let β  bethe greatest ordinal such that ωβ ≤ α. By 5.25, iv there exists a unique δ  and a unique < ωβ such that α = ωβ · δ + ; this δ  must necessarily be finite. The uniqueness of the normal form is proved by induction.

In the normal form it's possible to have α = ω^α. The least ordinal with this property is called ε_0.
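The normal form also gives a concrete way to compute with small ordinals. Below is a minimal sketch of ordinal addition, assuming we restrict to ordinals below ω^ω so that every exponent is a natural number. An ordinal is represented as a list of (exponent, coefficient) pairs with strictly decreasing exponents; this representation and the name `cnf_add` are assumptions made for illustration.

```python
def cnf_add(a, b):
    """Add two ordinals below omega^omega given in Cantor normal form.

    Each ordinal is a list of (exponent, coefficient) pairs with strictly
    decreasing natural-number exponents and positive coefficients; the
    empty list stands for 0.
    """
    if not b:
        return list(a)
    lead = b[0][0]
    # Terms of a with exponent below the leading exponent of b are absorbed.
    head = [(e, k) for (e, k) in a if e > lead]
    same = [k for (e, k) in a if e == lead]
    if same:
        # Equal leading exponents: the coefficients add up.
        return head + [(lead, same[0] + b[0][1])] + list(b[1:])
    return head + list(b)

ONE = [(0, 1)]      # 1 = omega^0 * 1
OMEGA = [(1, 1)]    # omega = omega^1 * 1
```

With these conventions `cnf_add(ONE, OMEGA)` gives ω again while `cnf_add(OMEGA, ONE)` gives ω + 1, which shows that ordinal addition isn't commutative.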


5.6 Well-founded relations

Now we shall define an important generalization of well-ordered sets.

A binary relation E on a set A is well-founded if every nonempty X ⊂ A has an E-minimal element, that is, a ∈ X such that there is no x ∈ X with x E a.

Clearly, a well-ordering of A is a well-founded relation.

Given a well-founded relation E on a set A, we can define the height of E, and assign to each x ∈ A an ordinal number, the rank of x in E.

Theorem 5.27. If E is a well-founded relation on A, then there exists a unique function f from A into the ordinals such that for all x ∈ A,

(12) f(x) = sup({f(y) + 1 : y E x}).

The range of f is an initial segment of the ordinals, thus an ordinal number. This ordinal is called the height of E.

Proof. We shall define a function f satisfying (12) and then prove its uniqueness. By induction, let

A_0 = ∅, A_{α+1} = {x ∈ A : ∀y (y E x ⇒ y ∈ A_α)},

A_α = ⋃_{ξ<α} A_ξ if α is a limit ordinal.

Let θ be the least ordinal such that A_{θ+1} = A_θ (such θ exists by replacement). First, it should be easy to see that A_α ⊂ A_{α+1} for each α (by induction). Thus A_0 ⊂ A_1 ⊂ ... ⊂ A_θ. We claim that A_θ = A. Otherwise, let a be an E-minimal element of A − A_θ. It follows that each x E a is in A_θ, and so a ∈ A_{θ+1}, a contradiction. Now we define f(x) as the least α such that x ∈ A_{α+1}. It's obvious that if x E y, then f(x) < f(y), and (12) is easily verified. The ordinal θ is the height of E.

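The uniqueness of f is established as follows: Let f′ be another function satisfying (12) and consider an E-minimal element of the set

{x ∈ A : f(x) ≠ f′(x)}.

If a were such an element, then f(y) = f′(y) for all y E a, and (12) would give f(a) = f′(a), a contradiction. Hence the set is empty.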
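For a finite well-founded relation the supremum in (12) is just a maximum, so the rank function can be computed directly. The sketch below assumes E is given as a set of pairs (y, x) meaning y E x, and that E has no cycles (otherwise the recursion wouldn't terminate); the names `ranks` and `height` are illustrative.

```python
def ranks(A, E):
    """Compute f(x) = sup({f(y) + 1 : y E x}) on a finite set A.

    E is a set of pairs (y, x) meaning "y E x". The relation is assumed
    to be well-founded, i.e. to contain no cycles.
    """
    predecessors = {x: [y for (y, z) in E if z == x] for x in A}
    rank = {}

    def f(x):
        if x not in rank:
            # The supremum of the empty set is 0.
            rank[x] = max((f(y) + 1 for y in predecessors[x]), default=0)
        return rank[x]

    for x in A:
        f(x)
    return rank

def height(rank):
    """The range of f is an initial segment {0, ..., h - 1}; return h."""
    return max(rank.values()) + 1 if rank else 0
```

Applied to proper divisibility on {1, 2, 3, 4, 6, 12}, the rank of 12 is 3 and the height is 4.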

6 Cardinal numbers

6.1 Cardinality

Two sets X and Y have the same cardinality,

(13) card(X) = card(Y),

if there exists a one-to-one mapping of X onto Y.

The relation (13) is an equivalence relation. We assume that we can assign to each set X its cardinal number card(X) so that two sets are assigned the same cardinal just in case they satisfy (13). Cardinal numbers can be defined either using the axiom of regularity (via equivalence classes of (13)) or using the axiom of choice. In this chapter we define cardinal numbers of well-ordered sets; since under the axiom of choice every set can be well-ordered, this defines the cardinals in ZFC.


We recall that a set X is finite if card(X) = card(n) for some n ∈ N; then X is said to have n elements. Clearly, card(n) = card(m) if and only if m = n, and so we define finite cardinals as natural numbers, i.e., card(n) = n for all n ∈ N.

The ordering of cardinal numbers is defined as follows:

(14) card(X ) ≤ card(Y )

if there exists a one-to-one mapping of X into Y. We also define the strict ordering card(X) < card(Y) to mean that card(X) ≤ card(Y) and card(X) ≠ card(Y). The relation ≤ in (14) is clearly transitive. Theorem 6.2 below shows that it's indeed a partial order, and it follows from the axiom of choice that the ordering is linear (any two sets are comparable in this ordering).

The concept of cardinality is central to the study of infinite sets. The following theorem tells us that this concept isn't trivial.

Theorem 6.1 (Cantor). For every set X , card(X ) < card(P (X )).

Proof. Let f be a function from X into P(X). The set

Y = {x ∈ X : x ∉ f(x)}

isn't in the range of f: if z ∈ X were such that f(z) = Y, then z ∈ Y if and only if z ∉ Y, a contradiction. Thus f isn't a function of X onto P(X), and so card(X) ≠ card(P(X)). Since x ↦ {x} is a one-to-one function of X into P(X), we have card(X) ≤ card(P(X)). Hence card(X) < card(P(X)).

In view of the following theorem, < is a partial ordering of cardinal numbers.

Theorem 6.2 (Cantor-Bernstein). If card(A) ≤ card(B) and card(B) ≤ card(A), then card(A) = card(B).

Proof. If f_1 : A → B and f_2 : B → A are one-to-one, then if we let B′ = f_2(B) and A_1 = f_2(f_1(A)), we have A_1 ⊂ B′ ⊂ A and card(A_1) = card(A). Thus we may assume that A_1 ⊂ B ⊂ A and that f is a one-to-one function of A onto A_1; we'll show that card(A) = card(B).

We define (by induction) for all n ∈ N:

A_0 = A, A_{n+1} = f(A_n),

B_0 = B, B_{n+1} = f(B_n).

Let g be the function on A defined as follows:

g(x) = f(x) if x ∈ A_n − B_n for some n,
g(x) = x otherwise.

Then g is a one-to-one mapping of A onto B. Thus card(A) = card(B).
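The construction of g can be followed on a concrete instance. The sketch below assumes A = N, B = N − {1} and f(x) = x + 2, so that A_1 = {2, 3, ...} ⊂ B ⊂ A; these choices and all function names are illustrative only.

```python
def in_B(x):
    """B = N minus {1} (an illustrative choice)."""
    return x != 1

def f(x):
    """One-to-one map of A = N onto A_1 = {2, 3, 4, ...}, a subset of B."""
    return x + 2

def f_inverse(x):
    """Preimage of x under f, or None when x is outside the range of f."""
    return x - 2 if x >= 2 else None

def in_some_difference(x):
    """True iff x lies in A_n - B_n for some n.

    Since f is one-to-one, A_n - B_n = f^n(A - B), so we pull x back
    through f until we either leave B or leave the range of f.
    """
    while x is not None:
        if not in_B(x):
            return True
        x = f_inverse(x)
    return False

def g(x):
    """The map from the proof: f on the differences, identity elsewhere."""
    return f(x) if in_some_difference(x) else x
```

In this instance A_n − B_n = {2n + 1}, so g shifts every odd number by 2 and fixes every even number, which is a one-to-one mapping of N onto N − {1}.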


The arithmetic operations on the cardinals are defined as follows:

κ + λ = card(A ∪ B) where card(A) = κ, card(B) = λ, and A and B are disjoint;

(15) κ · λ = card(A × B) where card(A) = κ, card(B) = λ;

κ^λ = card(A^B) where card(A) = κ and card(B) = λ.

Naturally, the definitions (15) are meaningful only if they're independent of the choice of A and B. Thus one has to check that, e.g., if card(A) = card(A′) and card(B) = card(B′), then card(A × B) = card(A′ × B′).

Lemma 6.3. If card(A) = κ, then card(P(A)) = 2^κ.

Proof. For every X ⊂ A, let χ_X be the function

χ_X(x) = 1 if x ∈ X,
χ_X(x) = 0 if x ∈ A − X.

The mapping f : X ↦ χ_X is a one-to-one correspondence between P(A) and {0, 1}^A.

Thus Cantor's theorem 6.1 can be formulated as follows:

κ < 2^κ for every cardinal κ.

A few simple facts about cardinal arithmetic:

(16) + and · are associative, commutative, and distributive.

(17) (κ · λ)^µ = κ^µ · λ^µ.

(18) κ^{λ+µ} = κ^λ · κ^µ.

(19) (κ^λ)^µ = κ^{λ·µ}.

(20) If κ ≤ λ, then κ^µ ≤ λ^µ.

(21) If 0 < λ ≤ µ, then κ^λ ≤ κ^µ.

(22) κ^0 = 1; 1^κ = 1; 0^κ = 0 if κ > 0.

To prove (16)–(22), one only has to find the appropriate one-to-one functions.

6.2 Alephs

An ordinal α is called a cardinal number (a cardinal) if card(α) ≠ card(β) for all β < α. We shall use κ, λ, µ, ... to denote cardinal numbers.

If W is a well-ordered set, then there exists an ordinal number α such that card(W) = card(α). Thus we let

card(W) = the least ordinal α such that card(W) = card(α).

Clearly card(W ) is a cardinal number.

Every natural number is a finite cardinal; and if S is a finite set, then card(S) = n for some n.

The ordinal ω is the least infinite cardinal. Note that all infinite cardinals are limit ordinals. The infinite ordinal numbers that are cardinals are called alephs.


Lemma 6.4.

i. For every α there is a cardinal number greater than α.

ii. If X  is a set of cardinals then sup (X ) is a cardinal.

For every α let α^+ be the least cardinal number greater than α, the cardinal successor of α.

Proof. i. For any set X, let

(23) h(X) = the least α such that there's no one-to-one function of α into X.

There's only a set of possible well-orderings of subsets of X. Hence there is only a set of ordinals for which a one-to-one function of α into X exists. Thus h(X) exists.

If α is an ordinal, then card(α) < h(α) by (23). That proves i.

ii. Let α = sup(X). If f is a one-to-one mapping of α onto some β < α, let κ ∈ X be such that β < κ ≤ α. Then card(κ) = card({f(ξ) : ξ < κ}) ≤ β, a contradiction. Thus α is a cardinal.

Using lemma 6.4, we define the increasing enumeration of all alephs. We usually use ℵ_α when referring to the cardinal number, and ω_α to denote the order-type:

ℵ_0 = ω_0 = ω, ℵ_{α+1} = ω_{α+1} = ℵ_α^+,

ℵ_α = ω_α = sup({ω_β : β < α}) if α is a limit ordinal.

Sets whose cardinality is ℵ_0 are called countable; a set is at most countable if it's either finite or countable. Infinite sets that aren't countable are uncountable.

A cardinal ℵ_{α+1} is a successor cardinal. A cardinal ℵ_α whose index is a limit ordinal is a limit cardinal.

Addition and multiplication of alephs is a trivial matter, due to the following fact.

Theorem 6.5. ℵ_α · ℵ_α = ℵ_α.

To prove theorem 6.5 we use a pairing function for ordinal numbers:

6.3 The canonical well-ordering of α× α

We define a well-ordering of the class Ord × Ord of ordinal pairs. Under this well-ordering, each α × α is an initial segment of Ord^2; the induced well-ordering of the class Ord^2 is isomorphic to the class Ord, and we have a one-to-one function Γ of Ord^2 onto Ord. For many α's the order-type of α × α is α; in particular for those α that are alephs.

We define:

(24) (α, β) < (γ, δ) ⇔ either max({α, β}) < max({γ, δ}),
or max({α, β}) = max({γ, δ}) and α < γ,
or max({α, β}) = max({γ, δ}), α = γ and β < δ.


The relation < defined in (24) is a linear ordering of the class Ord × Ord. Moreover, if X ⊆ Ord × Ord is nonempty, then X has a least element. Also, for each α, α × α is the initial segment given by (0, α). If we let

Γ(α, β) = the order-type of the set {(ξ, η) : (ξ, η) < (α, β)},

then Γ is a one-to-one mapping of Ord^2 onto Ord, and

(25) (α, β ) < (γ, δ ) if and only if Γ(α, β ) < Γ (γ, δ ) .

Note that Γ(ω × ω) = ω, and since γ(α) = Γ(α × α) is an increasing function of α, we have γ(α) ≥ α for every α. However, γ(α) is also continuous, and so Γ(α × α) = α for arbitrarily large α.
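Restricted to pairs of natural numbers, the canonical well-ordering and the function Γ can be computed by brute force. This is a minimal sketch for finite ordinals only; the names `canonical_key` and `gamma` are illustrative.

```python
def canonical_key(pair):
    """Sort key realizing (24): compare the maximum, then the first, then the second coordinate."""
    a, b = pair
    return (max(a, b), a, b)

def gamma(a, b):
    """Gamma(a, b) for natural numbers: the number of pairs strictly below (a, b).

    Every predecessor of (a, b) has maximum at most max(a, b), so it is
    enough to search the square (m + 1) x (m + 1).
    """
    m = max(a, b)
    target = canonical_key((a, b))
    return sum(
        1
        for x in range(m + 1)
        for y in range(m + 1)
        if canonical_key((x, y)) < target
    )
```

Here `gamma(0, n)` equals n · n, reflecting that n × n is the initial segment given by (0, n).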

Proof of theorem 6.5. Consider the canonical one-to-one mapping Γ of Ord × Ord onto Ord. We shall show that Γ(ω_α × ω_α) = ω_α. This is true for α = 0. Thus let α be the least ordinal such that Γ(ω_α × ω_α) ≠ ω_α. Let β, γ < ω_α be such that Γ(β, γ) = ω_α. Pick δ < ω_α such that δ > β and δ > γ. Since δ × δ is an initial segment of Ord × Ord in the canonical well-ordering and contains (β, γ), we have Γ(δ × δ) ⊇ ω_α, and so card(δ × δ) ≥ ℵ_α. However, card(δ × δ) = card(δ) · card(δ), and by the minimality of α, card(δ) · card(δ) = card(δ) < ℵ_α. A contradiction.

Corollary 6.6.

(26) ℵ_α + ℵ_β = ℵ_α · ℵ_β = max({ℵ_α, ℵ_β}).

Exponentiation of cardinals will be dealt with in section 8. Without the axiom of choice, one cannot prove that 2^{ℵ_α} is an aleph (or that P(ω_α) can be well-ordered), and there is very little one can prove about 2^{ℵ_α} or ℵ_α^{ℵ_β}.

6.4 Cofinality

Let α > 0 be a limit ordinal. We say that an increasing β-sequence ⟨α_ξ : ξ < β⟩, β a limit ordinal, is cofinal in α if lim_{ξ→β} α_ξ = α. Similarly, A ⊆ α is cofinal in α if sup(A) = α. If α is an infinite limit ordinal, the cofinality of α is

cf(α) = the least limit ordinal β such that there is an increasing β-sequence ⟨α_ξ : ξ < β⟩ with lim_{ξ→β} α_ξ = α.

Obviously, cf(α) is a limit ordinal, and cf(α) ≤ α. Examples: cf(ω + ω) = cf(ℵ_ω) = ω.

Lemma 6.7. cf(cf(α)) = cf(α).

Proof. If ⟨α_ξ : ξ < β⟩ is cofinal in α and ⟨ξ(ν) : ν < γ⟩ is cofinal in β, then ⟨α_{ξ(ν)} : ν < γ⟩ is cofinal in α.

Lemma 6.8. Let α > 0 be a limit ordinal.


i. If A ⊆ α and sup (A) = α, then the order-type of  A is at least cf (α).

ii. If β_0 ≤ β_1 ≤ ... ≤ β_ξ ≤ ..., ξ < γ, is a nondecreasing γ-sequence of ordinals in α and lim_{ξ→γ} β_ξ = α, then cf(γ) = cf(α).

Proof. i. The order-type of A is the length of the increasing enumeration of A, which is an increasing sequence with limit α.

ii. If γ = lim_{ν→cf(γ)} ξ(ν), then α = lim_{ν→cf(γ)} β_{ξ(ν)}, and the nondecreasing sequence ⟨β_{ξ(ν)} : ν < cf(γ)⟩ has an increasing subsequence of length ≤ cf(γ), with the same limit. Thus cf(α) ≤ cf(γ).

To show that cf(γ) ≤ cf(α), let α = lim_{ν→cf(α)} α_ν. For each ν < cf(α), let ξ(ν) be the least ξ greater than all ξ(ι), ι < ν, such that β_ξ > α_ν. Since lim_{ν→cf(α)} β_{ξ(ν)} = α, it follows that lim_{ν→cf(α)} ξ(ν) = γ, and so cf(γ) ≤ cf(α).

An infinite cardinal ℵ_α is regular if cf(ω_α) = ω_α. It's singular if cf(ω_α) < ω_α.

Lemma 6.9. For every limit ordinal α, cf (α) is a regular cardinal.

Proof. It's easy to see that if α isn't a cardinal, then using a mapping of card(α) onto α, one can construct a cofinal sequence in α of length ≤ card(α), and therefore cf(α) < α.

Since cf (cf (α)) = cf(α), it follows that cf (α) is a cardinal and is regular.

Let κ be a limit ordinal. A subset X  ⊆ κ is bounded  if sup(X ) < κ, and unbounded if sup(X ) = κ.

Lemma 6.10. Let κ be an aleph.

i. If X  ⊆ κ and card (X ) < cf (κ), then X  is bounded.

ii. If λ < cf (κ) and f  : λ → κ, then the range of  f  is bounded.

It follows from i that every unbounded subset of a regular cardinal κ has cardinality κ.

Proof. i. Apply lemma 6.8.

ii. If X  = ran (f ) then card (X ) ≤ λ, and use i.

There are arbitrarily large singular cardinals. For each α, ℵ_{α+ω} is a singular cardinal of cofinality ω.

Using the axiom of choice, we shall show in section 8 that every ℵ_{α+1} is regular. (The axiom of choice is necessary.)

Lemma 6.11. An infinite cardinal κ is singular if and only if there exists a cardinal λ < κ and a family {S_ξ : ξ < λ} of subsets of κ such that card(S_ξ) < κ for each ξ < λ, and κ = ⋃_{ξ<λ} S_ξ. The least cardinal λ that satisfies the condition is cf(κ).


Proof. If κ is singular, then there is an increasing sequence ⟨α_ξ : ξ < cf(κ)⟩ with lim_{ξ→cf(κ)} α_ξ = κ. Let λ = cf(κ), and S_ξ = α_ξ for all ξ < λ.

If the condition holds, let λ < κ be the least cardinal for which there's a family {S_ξ : ξ < λ} such that κ = ⋃_{ξ<λ} S_ξ and card(S_ξ) < κ for each ξ < λ. For every ξ < λ, let β_ξ be the order-type of ⋃_{ν<ξ} S_ν. The sequence ⟨β_ξ : ξ < λ⟩ is nondecreasing, and by the minimality of λ, β_ξ < κ for all ξ < λ. We shall show that lim_{ξ→λ} β_ξ = κ, thus proving that cf(κ) ≤ λ.

Let β = lim_{ξ→λ} β_ξ. There's a one-to-one mapping f of κ = ⋃_{ξ<λ} S_ξ into λ × β: If α ∈ κ, let f(α) = (ξ, γ), where ξ is the least ξ such that α ∈ S_ξ and γ is the order-type of S_ξ ∩ α. Since λ < κ and card(λ × β) = λ · card(β), it follows that β = κ.

One cannot prove without the axiom of choice that ω1 isn’t a countable union of countable sets.

The only cardinal inequality we have proved so far is theorem 6.1, κ < 2^κ. It follows that κ < λ^κ for every λ > 1, and in particular κ < κ^κ for κ ≠ 1. The following theorem gives a better inequality. This and the other cardinal inequalities will also follow from theorem 8.10, to be proved in section 8.

Theorem 6.12. If κ is an infinite cardinal, then κ < κ^{cf(κ)}.

Proof. Let F be a collection of κ functions from cf(κ) to κ: F = {f_α : α < κ}. It's enough to find f : cf(κ) → κ that is different from all the f_α. Let κ = lim_{ξ→cf(κ)} α_ξ. For ξ < cf(κ), let

f(ξ) = least γ such that γ ≠ f_α(ξ) for all α < α_ξ.

Such γ exists since card({f_α(ξ) : α < α_ξ}) ≤ card(α_ξ) < κ. Obviously f ≠ f_α for all α < κ.

Consequently, κ^λ > κ whenever λ ≥ cf(κ).

An uncountable cardinal κ is weakly inaccessible if it's a limit cardinal and is regular. The existence of weakly inaccessible cardinals isn't provable in ZFC.

To get an idea of the size of an inaccessible cardinal, note that if ℵ_α > ℵ_0 is a limit and regular cardinal, then ℵ_α = cf(ℵ_α) = cf(α) ≤ α, and so ℵ_α = α.

Since the sequence of alephs is a normal sequence, it has arbitrarily large fixed points; the problem is whether some of them are regular cardinals. For instance, the least fixed point ℵ_α = α has cofinality ω:

κ = lim ⟨ω, ω_ω, ω_{ω_ω}, ...⟩ = lim_{n→ω} κ_n where κ_0 = ω, κ_{n+1} = ω_{κ_n}.

7 Real numbers

The set of all real numbers R (the real line, or the continuum) is the unique ordered field in which every nonempty bounded set has a least upper bound. The proof of the following theorem marks the beginning of Cantor's theory of sets.


Theorem 7.1 (Cantor). The set of all real numbers is uncountable.

Proof. Let us assume that the set R of all reals is countable, and let c_0, c_1, ..., c_n, ..., where n ∈ N, be an enumeration of R. We shall find a real number different from each c_n.

Let a_0 = c_0 and b_0 = c_{k_0} where k_0 is the least k such that a_0 < c_k. For each n, let a_{n+1} = c_{i_n} where i_n is the least i such that a_n < c_i < b_n, and b_{n+1} = c_{k_n} where k_n is the least k such that a_{n+1} < c_k < b_n. If we let a = sup({a_n : n ∈ N}), then a ≠ c_k for all k.

7.1 The cardinality of the continuum

Let c denote the cardinality of R. As the set Q of all rational numbers is dense in R, every real number r is equal to sup({q ∈ Q : q < r}), and because Q is countable, it follows that c ≤ card(P(Q)) = 2^{ℵ_0}.

Let C (the Cantor set) be the set of all reals of the form Σ_{n=1}^∞ a_n / 3^n, where each a_n = 0 or 2. C is obtained by removing from the closed interval [0, 1] the open intervals (1/3, 2/3), (1/9, 2/9), (7/9, 8/9), etc. (the middle-third intervals). C is in a one-to-one correspondence with the set of all ω-sequences of 0's and 2's, and so card(C) = 2^{ℵ_0}.

Therefore c ≥ 2^{ℵ_0}, and so by theorem 6.2 (the Cantor-Bernstein theorem) we have

(27) c = 2^{ℵ_0}.

By theorem 7.1 (Cantor's theorem), or by theorem 6.1, c > ℵ_0. Cantor conjectured that every set of reals is either at most countable or has the cardinality of the continuum. In ZFC, every infinite cardinal is an aleph, and so 2^{ℵ_0} ≥ ℵ_1. Cantor's conjecture then becomes the statement

2^{ℵ_0} = ℵ_1

known as the Continuum hypothesis (CH).

Among sets of cardinality c are the set of all sequences of natural numbers, the set of all sequences of real numbers, and the set of all complex numbers. This is because ℵ_0^{ℵ_0} = (2^{ℵ_0})^{ℵ_0} = 2^{ℵ_0} and 2^{ℵ_0} · 2^{ℵ_0} = 2^{ℵ_0}.

7.2 The ordering of R

A linear ordering (P, <) is complete if every nonempty bounded subset of P has a least upper bound. We stated above that R is the unique complete ordered field. We shall generally disregard the field properties of R and will concern ourselves more with the order properties.

One consequence of being a complete ordered field is that R contains the set Q of all rational numbers as a dense subset. The set Q is countable and its ordering is dense.

Definition 7.2. A linear ordering (P, <) is dense if for all a < b there exists a c such that a < c < b.

A set D ⊂ P is a dense subset if for all a < b in P there exists a d ∈ D such that a < d < b.


The following theorem proves the uniqueness of the ordered set (R, <). We say that an ordered set is unbounded if it has neither a least nor a greatest element.

Theorem 7.3.

i. Any two countable unbounded dense linearly ordered sets are isomorphic.

ii. (R, <) is the unique complete linear ordering that has a countable dense subset isomorphic to (Q, <).

Proof. i. Let P_1 = {a_n : n ∈ N} and let P_2 = {b_n : n ∈ N} be two such linearly ordered sets. We construct an isomorphism f : P_1 → P_2 in the following way: We first define f(a_0), then f^{-1}(b_0), then f(a_1), then f^{-1}(b_1), etc., so as to keep f order-preserving. For example, to define f(a_n), if it's not yet defined, we let f(a_n) = b_k where k is the least index such that f remains order-preserving (such a k always exists because f has been defined for only finitely many a ∈ P_1, and because P_2 is dense and unbounded).

ii. To prove the uniqueness of R, let C and C′ be two complete dense unbounded linearly ordered sets, let P and P′ be dense in C and C′, respectively, and let f be an isomorphism of P onto P′. Then f can be extended (uniquely) to an isomorphism f* of C and C′: For x ∈ C, let f*(x) = sup({f(p) : p ∈ P and p ≤ x}).

The existence of (R, <) is proved by means of Dedekind cuts in (Q, <). The following theorem is a general version of this construction.

Theorem 7.4. Let (P, <) be a dense unbounded linearly ordered set. Then there is a complete unbounded linearly ordered set (C, ≺) such that:

i. P ⊆ C, and < and ≺ agree on P;

ii. P  is dense in C .

Proof. As we recall from 3.1, a Dedekind cut in P is a pair (A, B) of disjoint nonempty subsets of P such that

i. A ∪ B = P ;

ii. a < b for any a ∈ A and b ∈ B;

iii. A doesn’t have a greatest element.

Let C be the set of all Dedekind cuts in P and let (A_1, B_1) ≼ (A_2, B_2) if A_1 ⊂ A_2 (and B_1 ⊃ B_2). The set C is complete: If {(A_i, B_i) : i ∈ I} is a nonempty bounded subset of C, then (⋃_{i∈I} A_i, ⋂_{i∈I} B_i) is its supremum. For p ∈ P, let

A_p = {x ∈ P : x < p}, B_p = {x ∈ P : x ≥ p}.

Then P′ = {(A_p, B_p) : p ∈ P} is isomorphic to P and is dense in C.


7.3 Suslin’s problem

The real line is, up to isomorphism, the unique linearly ordered set that is dense, unbounded, complete, and contains a countable dense subset.

Since Q is dense in R, every nonempty open interval of R contains a rational number. Hence if S is a disjoint collection of open intervals, S is at most countable. (Let ⟨r_n : n ∈ N⟩ be an enumeration of the rationals. To each J ∈ S assign r_n ∈ J with the least possible index n.)

Let P be a dense linearly ordered set. If every disjoint collection of open intervals in P is at most countable, then we say that P satisfies the countable chain condition.

Problem 7.5 (Suslin's). Let P be a complete dense unbounded linearly ordered set that satisfies the countable chain condition. Is P isomorphic to the real line?

This question cannot be decided in ZFC.

The topology of the real line. The real line is a metric space with the metric d(a, b) = |a − b|. Its metric topology coincides with the order topology of (R, <). Since Q is a dense set in R and since every Cauchy sequence of real numbers converges, R is a separable complete metric space. (A metric space is separable if it has a countable dense set; it's complete if every Cauchy sequence converges.)

Open sets are unions of open intervals, and in fact, every open set is the union of open intervals with rational endpoints. This implies that the number of all open sets in R is the continuum and so is the number of all closed sets in R.

Every open interval has cardinality c, therefore every nonempty open set has cardinality c. Proving this was Cantor's first step in the search for the proof of the Continuum hypothesis.

A nonempty closed set is perfect if it has no isolated points. Theorems 7.6 and 7.7 below show that every uncountable closed set contains a perfect set.

Theorem 7.6. Every perfect set has cardinality c.

Proof. Given a perfect set P, we want to find a one-to-one function F from {0, 1}^ω into P. Let S be the set of all finite sequences of 0's and 1's. By induction on the length of s ∈ S one can find closed intervals I_s such that for each n and all s ∈ S of length n,

i. I_s ∩ P is perfect,

ii. the diameter of I_s is ≤ 1/n,

iii. I_{s⌢0} ⊆ I_s, I_{s⌢1} ⊆ I_s, and I_{s⌢0} ∩ I_{s⌢1} = ∅.

For each f ∈ {0, 1}^ω, the set P ∩ ⋂_{n=1}^∞ I_{f↾n} has exactly one element, and we let F(f) be this element of P.

Theorem 7.7 (Cantor-Bendixson). If F is an uncountable closed set, then F = P ∪ S, where P is perfect and S is at most countable.

Corollary 7.8. If  F  is a closed set, then either card(F ) ≤ ℵ0 or card (F ) = 2ℵ0 .


Proof. For every A ⊂ R, let

A′ = the set of all limit points of A.

It's easy to see that A′ is closed, and if A is closed then A′ ⊂ A. Thus we let

F_0 = F, F_{α+1} = F_α′,

F_α = ⋂_{γ<α} F_γ if α > 0 is a limit ordinal.

Since F_0 ⊃ F_1 ⊃ ... ⊃ F_α ⊃ ..., there exists an ordinal θ such that F_α = F_θ for all α ≥ θ. (In fact, the least θ with this property must be countable, by the argument below.) We let P = F_θ.

If P is nonempty, then P′ = P and so it's perfect. Thus the proof is completed by showing that F − P is at most countable.

Let ⟨J_k : k ∈ N⟩ be an enumeration of rational intervals. We have F − P = ⋃_{α<θ} (F_α − F_α′); hence if a ∈ F − P, then there is a unique α such that a is an isolated point of F_α. We let k(a) be the least k such that a is the only point of F_α in the interval J_k. Note that if α ≤ β, b ≠ a, and b ∈ F_β − F_β′, then b ∉ J_{k(a)}, and hence k(b) ≠ k(a). Thus the correspondence a ↦ k(a) is one-to-one, and it follows that F − P is at most countable.

A set of reals is called nowhere dense if its closure has empty interior. The following theorem shows that R isn't the union of countably many nowhere dense sets (R isn't of the first category).

Theorem 7.9 (The Baire category theorem). If D_0, D_1, ..., D_n, ..., n ∈ N, are dense open sets of reals, then the intersection D = ⋂_{n=0}^∞ D_n is dense in R.

Proof. We show that D intersects every nonempty open interval I. First note that for each n, D_0 ∩ ... ∩ D_n is dense and open. Let ⟨J_k : k ∈ N⟩ be an enumeration of rational intervals. Let I_0 = I, and let, for each n, I_{n+1} = J_k = (q_k, r_k), where k is the least k such that the closed interval [q_k, r_k] is included in I_n ∩ D_n. Then a ∈ D ∩ I, where a = lim_{k→∞} q_k.

7.4 Borel sets

Definition 7.10. An algebra of sets is a collection 𝒮 of subsets of a given set S such that

i. S ∈ 𝒮,

(28) ii. if X ∈ 𝒮 and Y ∈ 𝒮, then X ∪ Y ∈ 𝒮,

iii. if X ∈ 𝒮 then S − X ∈ 𝒮.

(Note that 𝒮 is also closed under intersections.)

A σ-algebra is additionally closed under countable unions (and intersections):

iv. If X_n ∈ 𝒮 for all n, then ⋃_{n=0}^∞ X_n ∈ 𝒮.


For any collection 𝒳 of subsets of S there is a smallest algebra (σ-algebra) 𝒮 of subsets of S for which 𝒳 ⊆ 𝒮.

Definition 7.11. A set of reals B is Borel if it belongs to the smallest σ-algebra ℬ of sets of reals that contains all open sets.

7.5 Lebesgue measure

Lebesgue measurable sets form a σ-algebra and contain all open intervals (the measure of an interval is its length). Thus all Borel sets are Lebesgue measurable.

7.6 The Baire space

The Baire space is the space 𝒩 = ω^ω of all infinite sequences of natural numbers, ⟨a_n : n ∈ N⟩, with the following topology: For every finite sequence s = ⟨a_k : k < n⟩, let

(29) O(s) = {f ∈ 𝒩 : s ⊂ f} = {⟨c_k : k ∈ N⟩ : (∀k < n) c_k = a_k}.

The sets (29) form a basis for the topology of 𝒩. Note that each O(s) is also closed.

The Baire space is separable and is metrizable: consider the metric d(f, g) = 1/2^{n+1} where n is the least number such that f(n) ≠ g(n). The countable set of all eventually constant sequences is dense in 𝒩. This separable metric space is complete, as every Cauchy sequence converges.
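The metric on the Baire space is easy to evaluate when sequences are given as functions. This is a minimal sketch, assuming sequences are Python callables from indices to natural numbers and that only the first `horizon` entries are inspected; `baire_distance` and `horizon` are illustrative names, and returning 0 when no difference is found within the horizon is an approximation made by this sketch.

```python
from fractions import Fraction

def baire_distance(f, g, horizon=64):
    """d(f, g) = 1 / 2^(n + 1), with n the least index where f and g differ.

    Only indices below `horizon` are inspected; if no difference is found
    there, the sequences are treated as equal and 0 is returned.
    """
    for n in range(horizon):
        if f(n) != g(n):
            return Fraction(1, 2 ** (n + 1))
    return Fraction(0)
```

Two sequences are close exactly when they share a long common initial segment, which is why the sets O(s) form a basis for the topology.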

Every infinite sequence ⟨a_n : n ∈ N⟩ of positive integers defines a continued fraction 1/(a_0 + 1/(a_1 + 1/(a_2 + ...))), an irrational number between 0 and 1. Conversely, every irrational number in the interval (0, 1) can be so represented, and the one-to-one correspondence is a homeomorphism. It follows that the Baire space is homeomorphic to the space of all irrational numbers.
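Finite truncations of such a continued fraction are rational and can be evaluated exactly. This is a minimal sketch, assuming the first few terms are supplied as a nonempty list of positive integers; `convergent` is an illustrative name.

```python
from fractions import Fraction

def convergent(terms):
    """Evaluate 1/(a_0 + 1/(a_1 + ... + 1/a_{m-1})) exactly.

    `terms` is a nonempty list of positive integers; the result is the
    rational approximation obtained by cutting the continued fraction off.
    """
    value = Fraction(0)
    for a in reversed(terms):
        value = 1 / (a + value)
    return value
```

For the constant sequence 1, 1, 1, ... the truncations approach (√5 − 1)/2, an irrational number in (0, 1).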

For various reasons, modern descriptive set theory uses the Baire space rather than the real line. Often functions in ω^ω are called reals.

Clearly, the space 𝒩 satisfies the Baire category theorem (theorem 7.9); the proof is analogous to the proof of theorem 7.9 above. The Cantor-Bendixson theorem (theorem 7.7) holds as well. For completeness we give a description of perfect sets in 𝒩.

Let Seq denote the set of all finite sequences of natural numbers. A (sequential) tree is a set T ⊂ Seq that satisfies

(30) if t ∈ T and s = t↾n for some n, then s ∈ T.

If T ⊂ Seq is a tree, let [T] be the set of all infinite paths through T:

(31) [T] = {f ∈ 𝒩 : f↾n ∈ T for all n ∈ N}.

The set [T] is a closed set in the Baire space: Let f ∈ 𝒩 be such that f ∉ [T]. Then there's n ∈ N such that f↾n = s is not in T. In other words, the open set O(s) = {g ∈ 𝒩 : g ⊃ s}, a neighborhood of f, is disjoint from [T]. Hence [T] is closed.

Conversely, if F is a closed set in 𝒩, then the set

(32) T_F = {s ∈ Seq : s ⊂ f for some f ∈ F}


is a tree, and it's easy to verify that [T_F] = F: If f ∈ 𝒩 is such that f↾n ∈ T_F for all n ∈ N, then for each n there is some g ∈ F such that g↾n = f↾n; and since F is closed, it follows that f ∈ F. If f is an isolated point of a closed set F in 𝒩, then there is n such that no g ∈ F other than f satisfies g↾n = f↾n. Thus the following definition:

A nonempty sequential tree T is perfect if for every t ∈ T there exist s_1 ⊃ t and s_2 ⊃ t, both in T, that are incomparable, i.e., neither s_1 ⊂ s_2 nor s_1 ⊃ s_2.

Lemma 7.12. A closed set F ⊂ 𝒩 is perfect if and only if the tree T_F is a perfect tree.

The Cantor-Bendixson analysis for closed sets in the Baire space is carried out as follows: For each tree T ⊂ Seq, we let

(33) T′ = {t ∈ T : there exist incomparable s_1 ⊃ t and s_2 ⊃ t in T}.

(Thus T is perfect if and only if ∅ ≠ T = T′.)

The set [T] − [T′] is at most countable: For each f ∈ [T] such that f ∉ [T′], let s_f = f↾n where n is the least number such that f↾n ∉ T′. If f, g ∈ [T] − [T′] are distinct, then s_f ≠ s_g, by (33). Hence the mapping f ↦ s_f is one-to-one, and [T] − [T′] is at most countable.

Now we let

(34) T_0 = T, T_{α+1} = T_α′,

T_α = ⋂_{β<α} T_β if α > 0 is a limit ordinal.

Since T_0 ⊃ T_1 ⊃ ... ⊃ T_α ⊃ ..., and T_0 is at most countable, there is an ordinal θ < ω_1 such that T_{θ+1} = T_θ. If T_θ ≠ ∅, then it's perfect.

Now it's easy to see that [⋂_{β<α} T_β] = ⋂_{β<α} [T_β], and so

(35) [T] − [T_θ] = ⋃_{α<θ} ([T_α] − [T_α′]);

hence (35) is at most countable. Thus if [T] is an uncountable closed set in 𝒩, the sets [T_θ] and [T] − [T_θ] constitute the decomposition of [T] into a perfect and an at most countable set.

In modern descriptive set theory one often speaks about the Lebesgue measure on 𝒩. This measure is the extension of the product measure m on Borel sets in the Baire space induced by the probability measure on N that gives the singleton {n} measure 1/2^{n+1}. Thus for every sequence s ∈ Seq of length n ≥ 1 we have

(36) m(O(s)) = ∏_{k=0}^{n−1} 1/2^{s(k)+1}.
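Formula (36) is a finite product and can be evaluated exactly. This is a minimal sketch, assuming a finite sequence s is given as a tuple of natural numbers; `cylinder_measure` is an illustrative name.

```python
from fractions import Fraction

def cylinder_measure(s):
    """m(O(s)) as the product over k < len(s) of 1 / 2^(s(k) + 1)."""
    measure = Fraction(1)
    for value in s:
        measure *= Fraction(1, 2 ** (value + 1))
    return measure
```

The measures of the one-step extensions of s add up to the measure of O(s) in the limit; any finite partial sum falls short by a factor of the form 1 − 1/2^N.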

7.7 Polish spaces

Definition 7.13. A Polish space is a topological space that is homeomorphic to a separable complete metric space.


Examples of Polish spaces include R, 𝒩, the Cantor space, the unit interval [0, 1], the unit circle T, the Hilbert cube [0, 1]^ω, etc.

Every Polish space is a continuous image of the Baire space.

8 The axiom of choice and cardinal arithmetic

8.1 The axiom of choice

The axiom of choice. Every family of nonempty sets has a choice function.

If S is a family of sets and ∅ ∉ S, then a choice function for S is a function f on S such that

(37) f(X) ∈ X

for every X ∈ S.

The axiom of choice postulates that for every S such that ∅ ∉ S there exists a function f on S that satisfies (37).

The axiom of choice differs from the other axioms of ZF by postulating the existence of a set (i.e., a choice function) without defining it (unlike, for instance, the axiom of pairing or the axiom of power set). Thus it's often interesting to know whether a mathematical statement can be proved without using the axiom of choice. It turns out that the axiom of choice is independent of the other axioms of set theory and that many mathematical theorems are unprovable in ZF without AC.

In some trivial cases, the existence of a choice function can be proven outright in ZF:

i. when every X  ∈ S  is a singleton X  = {x};

ii. when S is finite; the existence of a choice function for S is proved by induction on the size of S;

iii. when every X ∈ S is a finite set of real numbers; let f(X) = the least element of X.

On the other hand, one cannot prove the existence of a choice function (in ZF) just from the assumption that the sets in S are finite; even when every X ∈ S has just two elements (e.g., sets of reals), we cannot necessarily prove that S has a choice function.

Using the axiom of choice, one proves that every set can be well-ordered, and therefore every infinite set has cardinality equal to some ℵ_α. In particular, any two sets have comparable cardinals, and the ordering

card(X ) ≤ card(Y )

is a well-ordering of the class of all cardinals.

Theorem 8.1 (Zermelo’s well-ordering theorem). Every set can be well-ordered.

Proof. Let $A$ be a set. To well-order $A$, it suffices to construct a transfinite one-to-one sequence $\langle a_\alpha : \alpha < \theta \rangle$ that enumerates $A$. That we can do by induction, using a choice function $f$ for the family $S$ of all nonempty subsets of $A$. We let, for every $\alpha$,

$a_\alpha = f(A - \{a_\xi : \xi < \alpha\})$

if $A - \{a_\xi : \xi < \alpha\}$ is nonempty. Let $\theta$ be the least ordinal such that $A = \{a_\xi : \xi < \theta\}$. Clearly, $\langle a_\alpha : \alpha < \theta \rangle$ enumerates $A$.
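For a finite set the construction in this proof can be run directly. The following Python sketch assumes a finite $A$ and a choice function supplied by the caller; `well_order` and `choose` are illustrative names only, and the transfinite case is of course not captured by a loop.

```python
def well_order(A, choose):
    """Enumerate A as a_0, a_1, ... with a_n = choose(A - {a_0, ..., a_{n-1}}).

    `choose` plays the role of the choice function f on nonempty subsets of A.
    """
    remaining = set(A)
    sequence = []
    while remaining:                      # stop at the first stage where nothing is left
        a = choose(frozenset(remaining))  # a_n = f(A - {a_k : k < n})
        sequence.append(a)
        remaining.remove(a)
    return sequence


order_min = well_order({"c", "a", "b"}, choose=min)
order_max = well_order({"c", "a", "b"}, choose=max)
```

Different choice functions give different well-orderings of the same set, which is why the theorem asserts existence only.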

In fact, Zermelo’s theorem (theorem 8.1) is equivalent to the axiom of choice: If every set can be well-ordered, then every family $S$ of nonempty sets has a choice function. To see this, well-order $\bigcup S$ and let $f(X)$ be the least element of $X$ for every $X \in S$.

Of particular importance is the fact that the set of all real numbers can be well-ordered. It follows that $2^{\aleph_0}$ is an aleph and so $2^{\aleph_0} \geq \aleph_1$.

The existence of a well-ordering of R yields some interesting counterexamples. Well known is Vitali’s construction of a nonmeasurable set.

If every set can be well-ordered, then every infinite set has a countable subset: Well-order the set and take the first $\omega$ elements. Thus every infinite set is Dedekind-infinite, and so finiteness and Dedekind finiteness coincide.

Dealing with cardinalities of sets is much easier when we have the axiom of choice. In the first place, any two sets have comparable cardinals. Another consequence is:

(38) if $f$ maps $A$ onto $B$ then $\mathrm{card}(B) \leq \mathrm{card}(A)$.

To show (38), we have to find a one-to-one function from $B$ to $A$. This is done by choosing one element from $f^{-1}(\{b\})$ for each $b \in B$.
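The argument for (38) can be sketched for finite sets, where the choice from each fibre can be made explicitly. The helper name `right_inverse` and the example map are assumptions made for illustration.

```python
def right_inverse(f, A, B):
    """Given f mapping A onto B, pick one element of the fibre f^{-1}({b}) for each b."""
    g = {}
    for b in B:
        fibre = [a for a in A if f(a) == b]  # nonempty because f is onto B
        g[b] = min(fibre)                    # explicit choice; in general this step needs AC
    return g


A = range(10)
B = {0, 1, 2}
g = right_inverse(lambda a: a % 3, A, B)

# g is one-to-one from B into A, and f(g(b)) = b for every b.
assert len(set(g.values())) == len(B)
```

Distinct elements of $B$ have disjoint fibres, so the resulting function is automatically one-to-one.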

Another consequence of the axiom of choice is:

(39) The union of a countable family of countable sets is countable.

(By the way, this often used fact cannot be proved in ZF alone.) To prove (39), let $A_n$ be a countable set for each $n \in \mathbb{N}$. For each $n$, let us choose an enumeration $\langle a_{n,k} : k \in \mathbb{N} \rangle$ of $A_n$. That gives us a projection of $\mathbb{N} \times \mathbb{N}$ onto $\bigcup_{n=0}^{\infty} A_n$:

$(n, k) \mapsto a_{n,k}$.

Thus $\bigcup_{n=0}^{\infty} A_n$ is countable.

In a similar fashion, one can prove a more general statement.

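Lemma 8.2. $\mathrm{card}(\bigcup S) \leq \mathrm{card}(S) \cdot \sup(\{\mathrm{card}(X) : X \in S\})$.

Proof. Let $\kappa = \mathrm{card}(S)$ and $\lambda = \sup(\{\mathrm{card}(X) : X \in S\})$. We have $S = \{X_\alpha : \alpha < \kappa\}$ and for each $\alpha < \kappa$, we choose an enumeration $X_\alpha = \{a_{\alpha,\beta} : \beta < \lambda_\alpha\}$, where $\lambda_\alpha \leq \lambda$. Again we have a projection

$(\alpha, \beta) \mapsto a_{\alpha,\beta}$

of $\kappa \times \lambda$ onto $\bigcup S$, and so $\mathrm{card}(\bigcup S) \leq \kappa \cdot \lambda$.

In particular, the union of $\aleph_\alpha$ sets, each of cardinality $\aleph_\alpha$, has cardinality $\aleph_\alpha$.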

Corollary 8.3. Every $\aleph_{\alpha+1}$ is a regular cardinal.

Proof. This is because otherwise $\omega_{\alpha+1}$ would be the union of at most $\aleph_\alpha$ sets of cardinality at most $\aleph_\alpha$.

8.2 Using the axiom of choice in mathematics

In algebra and point set topology, one often uses the following version of the axiom of choice. We recall that if $(P, <)$ is a partially ordered set, then $a \in P$ is called maximal in $P$ if there’s no $x \in P$ such that $a < x$. If $X$ is a nonempty subset of $P$, then $c \in P$ is an upper bound of $X$ if $x \leq c$ for every $x \in X$.

We say that a nonempty $C \subset P$ is a chain in $P$ if $C$ is linearly ordered by $<$.

Lemma 8.4 (Zorn’s lemma). If $(P, <)$ is a nonempty partially ordered set such that every chain in $P$ has an upper bound, then $P$ has a maximal element.

Proof. We construct (using a choice function for nonempty subsets of $P$) a chain in $P$ that leads to a maximal element of $P$. We let, by induction,

$a_\alpha$ = an element of $P$ such that $a_\alpha > a_\xi$ for every $\xi < \alpha$, if there is one.

Clearly, if $\alpha > 0$ is a limit ordinal, then $C_\alpha = \{a_\xi : \xi < \alpha\}$ is a chain in $P$ and $a_\alpha$ exists by the assumption. Eventually, there is $\theta$ such that there’s no $a_{\theta+1} \in P$ such that $a_{\theta+1} > a_\theta$. Thus $a_\theta$ is a maximal element of $P$.

Like Zermelo’s theorem (theorem 8.1), Zorn’s lemma (lemma 8.4) is equivalent to the axiom of choice (in ZF).

There are numerous examples of proofs using Zorn’s lemma (lemma 8.4). To mention a few:

Every vector space has a basis.

Every field has a unique algebraic closure.

The Hahn-Banach extension theorem.

Tikhonov’s product theorem for compact spaces.

8.3 The countable axiom of choice

Many important consequences of the axiom of choice, particularly many concerning the real numbers, can be proven from a weaker version of the axiom of choice.

The countable axiom of choice. Every countable family of nonempty sets has a choice function.

For instance, the countable AC implies that the union of countably many countable sets is countable. In particular, the real line isn’t a countable union of countable sets. Similarly, it follows that $\aleph_1$ is a regular cardinal. On the other hand, the countable AC doesn’t imply that the set of all reals can be well-ordered.

Several basic theorems about Borel sets and Lebesgue measure use the countable AC; for instance, one needs it to show that the union of countably many $F_\sigma$ sets is $F_\sigma$. In modern descriptive set theory one often works without the axiom of choice and uses the countable AC instead. In some instances, descriptive set theorists use a somewhat stronger principle (that follows from AC):

The principle of dependent choices (DC). If $E$ is a binary relation on a nonempty set $A$, and if for every $a \in A$ there exists $b \in A$ such that $b \mathrel{E} a$, then there’s a sequence $a_0, a_1, \ldots, a_n, \ldots$ in $A$ such that

(40) $a_{n+1} \mathrel{E} a_n$ for all $n \in \mathbb{N}$.

The principle of dependent choices is stronger than the countable axiom of choice.

As an application of DC we have the following characterization of well-founded relations and well-orderings:

Lemma 8.5.

i. A linear ordering $<$ of a set $P$ is a well-ordering of $P$ if and only if there’s no infinite descending sequence $a_0 > a_1 > \cdots > a_n > \cdots$ in $P$.

ii. A relation $E$ on $P$ is well-founded if and only if there’s no infinite sequence $\langle a_n : n \in \mathbb{N} \rangle$ in $P$ such that

(41) $a_{n+1} \mathrel{E} a_n$ for all $n \in \mathbb{N}$.

Proof. Note that i is a special case of ii since a well-ordering is a well-founded linear ordering.

If $a_0, a_1, \ldots, a_n, \ldots$ is a sequence that satisfies (41), then the set $\{a_n : n \in \mathbb{N}\}$ has no $E$-minimal element and hence $E$ is not well-founded.

Conversely, if $E$ isn’t well-founded, then there’s a nonempty set $A \subset P$ with no $E$-minimal element. Using the principle of dependent choices we construct a sequence $a_0, a_1, \ldots, a_n, \ldots$ that satisfies (41).

8.4 Cardinal arithmetic

In the presence of the axiom of choice, every set can be well-ordered and so every infinite set has the cardinality of some $\aleph_\alpha$. Thus addition and multiplication of infinite cardinal numbers is simple: If $\kappa$ and $\lambda$ are infinite cardinals then

$\kappa + \lambda = \kappa \cdot \lambda = \max(\{\kappa, \lambda\})$.

The exponentiation of cardinals is more interesting. The rest of section 8 is devoted to the operations $2^\kappa$ and $\kappa^\lambda$, for infinite cardinals $\kappa$ and $\lambda$.

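Lemma 8.6. If $2 \leq \kappa \leq \lambda$ and $\lambda$ is infinite, then $\kappa^\lambda = 2^\lambda$.

Proof.

(42) $2^\lambda \leq \kappa^\lambda \leq (2^\kappa)^\lambda = 2^{\kappa \cdot \lambda} = 2^\lambda$.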


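If $\kappa$ and $\lambda$ are infinite cardinals and $\lambda < \kappa$ then the evaluation of $\kappa^\lambda$ is more complicated. First, if $2^\lambda \geq \kappa$ then we have $\kappa^\lambda = 2^\lambda$ (because $\kappa^\lambda \leq (2^\lambda)^\lambda = 2^\lambda$), but if $2^\lambda < \kappa$ then (because $\kappa^\lambda \leq \kappa^\kappa = 2^\kappa$) we can only conclude

(43) $\kappa \leq \kappa^\lambda \leq 2^\kappa$.

Not much more can be claimed at this point, except that by theorem 6.12 ($\kappa^{\mathrm{cf}(\kappa)} > \kappa$) we have

(44) $\kappa < \kappa^\lambda$ if $\lambda \geq \mathrm{cf}(\kappa)$.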

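If $\lambda$ is a cardinal and $\mathrm{card}(A) \geq \lambda$, let

(45) $[A]^\lambda = \{X \subset A : \mathrm{card}(X) = \lambda\}$.

Lemma 8.7. If $\mathrm{card}(A) = \kappa \geq \lambda$, then the set $[A]^\lambda$ has cardinality $\kappa^\lambda$.

Proof. On the one hand, every $f : \lambda \to A$ is a subset of $\lambda \times A$, and $\mathrm{card}(f) = \lambda$. Thus $\kappa^\lambda \leq \mathrm{card}([\lambda \times A]^\lambda) = \mathrm{card}([A]^\lambda)$. On the other hand, we construct a one-to-one function $F : [A]^\lambda \to A^\lambda$ as follows: If $X \subset A$ and $\mathrm{card}(X) = \lambda$, let $F(X)$ be some function $f$ on $\lambda$ whose range is $X$. Clearly, $F$ is one-to-one.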

If $\lambda$ is a limit cardinal, let

(46) $\kappa^{<\lambda} = \sup(\{\kappa^\mu : \mu \text{ is a cardinal and } \mu < \lambda\})$.

For the sake of completeness, we also define $\kappa^{<\lambda^+} = \kappa^\lambda$ for infinite successor cardinals $\lambda^+$.

If $\kappa$ is an infinite cardinal and $\mathrm{card}(A) \geq \kappa$, let

(47) $[A]^{<\kappa} = P_\kappa(A) = \{X \subset A : \mathrm{card}(X) < \kappa\}$.

It follows from lemma 8.7 and lemma 8.8 below that the cardinality of $P_\kappa(A)$ is $\mathrm{card}(A)^{<\kappa}$.

8.5 Infinite sums and products

Let $\{\kappa_i : i \in I\}$ be an indexed set of cardinal numbers. We define

(48) $\sum_{i \in I} \kappa_i = \mathrm{card}\left(\bigcup_{i \in I} X_i\right)$,

where $\{X_i : i \in I\}$ is a disjoint family of sets such that $\mathrm{card}(X_i) = \kappa_i$ for each $i \in I$.

This definition doesn’t depend on the choice of $\{X_i\}_i$; this follows from the axiom of choice.

Note that if $\kappa$ and $\lambda$ are cardinals and $\kappa_i = \kappa$ for each $i < \lambda$, then

$\sum_{i < \lambda} \kappa_i = \lambda \cdot \kappa$.

In general, we have the following:


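Lemma 8.8. If $\lambda$ is an infinite cardinal and $\kappa_i > 0$ for each $i < \lambda$, then

(49) $\sum_{i < \lambda} \kappa_i = \lambda \cdot \sup_{i < \lambda} \kappa_i$.

Proof. Let $\kappa = \sup_{i < \lambda} \kappa_i$ and $\sigma = \sum_{i < \lambda} \kappa_i$. On the one hand, since $\kappa_i \leq \kappa$ for all $i$, we have $\sigma \leq \sum_{i < \lambda} \kappa = \lambda \cdot \kappa$. On the other hand, since $\kappa_i \geq 1$ for all $i$, we have $\lambda = \sum_{i < \lambda} 1 \leq \sigma$, and since $\sigma \geq \kappa_i$ for all $i$, we have $\sigma \geq \sup_{i < \lambda} \kappa_i = \kappa$. Therefore $\sigma \geq \lambda \cdot \kappa$.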

In particular, if $\lambda \leq \sup_{i < \lambda} \kappa_i$, we have

$\sum_{i < \lambda} \kappa_i = \sup_{i < \lambda} \kappa_i$.

Thus we can characterize singular cardinals as follows: An infinite cardinal $\kappa$ is singular just in case

$\kappa = \sum_{i < \lambda} \kappa_i$

where $\lambda < \kappa$ and for each $i$, $\kappa_i < \kappa$.

An infinite product of cardinals is defined by using infinite products of sets. If $\{X_i : i \in I\}$ is a family of sets, then the product is defined as follows:

(50) $\prod_{i \in I} X_i = \{f : f \text{ is a function on } I \text{ and } f(i) \in X_i \text{ for each } i \in I\}$.

Note that if some $X_i$ is empty, then the product is empty. If all the $X_i$ are nonempty, then AC implies that the product is nonempty.

If $\{\kappa_i : i \in I\}$ is a family of cardinal numbers, we define

(51) $\prod_{i \in I} \kappa_i = \mathrm{card}\left(\prod_{i \in I} X_i\right)$,

where $\{X_i : i \in I\}$ is a family of sets such that $\mathrm{card}(X_i) = \kappa_i$ for each $i \in I$. (We abuse the notation by using $\prod$ both for the product of sets and for the product of cardinals.)

Again it follows from the axiom of choice that the definition doesn’t depend on the choice of the sets $X_i$.

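If $\kappa_i = \kappa$ for each $i \in I$, and $\mathrm{card}(I) = \lambda$, then $\prod_{i \in I} \kappa_i = \kappa^\lambda$. Also, infinite sums and products satisfy some of the rules satisfied by finite sums and products. For instance, $\prod_i \kappa_i^\lambda = \left(\prod_i \kappa_i\right)^\lambda$, or $\prod_i \kappa^{\lambda_i} = \kappa^{\sum_i \lambda_i}$. Or if $I$ is a disjoint union $I = \bigcup_{j \in J} A_j$, then

(52) $\prod_{i \in I} \kappa_i = \prod_{j \in J} \left(\prod_{i \in A_j} \kappa_i\right)$.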


If $\kappa_i \geq 2$ for each $i \in I$, then

(53) $\sum_{i \in I} \kappa_i \leq \prod_{i \in I} \kappa_i$.

(The assumption $\kappa_i \geq 2$ is necessary: $1 + 1 > 1 \cdot 1$.) If $I$ is finite, then (53) is certainly true; thus assume that $I$ is infinite. Since $\prod_{i \in I} \kappa_i \geq \prod_{i \in I} 2 = 2^{\mathrm{card}(I)} > \mathrm{card}(I)$, it suffices to show that $\sum_i \kappa_i \leq \mathrm{card}(I) \cdot \prod_i \kappa_i$. If $\{X_i : i \in I\}$ is a disjoint family, we assign to each $x \in \bigcup_i X_i$ a pair $(i, f)$ such that $x \in X_i$, $f \in \prod_i X_i$ and $f(i) = x$. Thus we have (53).

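Infinite products of cardinals can be evaluated using the following lemma:

Lemma 8.9. If $\lambda$ is an infinite cardinal and $\langle \kappa_i : i < \lambda \rangle$ is a nondecreasing sequence of nonzero cardinals, then

$\prod_{i < \lambda} \kappa_i = \left(\sup_i \kappa_i\right)^\lambda$.

Proof. Let $\kappa = \sup_i \kappa_i$. Since $\kappa_i \leq \kappa$ for each $i < \lambda$, we have

$\prod_{i < \lambda} \kappa_i \leq \prod_{i < \lambda} \kappa = \kappa^\lambda$.

To prove that $\kappa^\lambda \leq \prod_{i < \lambda} \kappa_i$, we consider a partition of $\lambda$ into $\lambda$ disjoint sets $A_j$, each of cardinality $\lambda$:

(54) $\lambda = \bigcup_{j < \lambda} A_j$.

(To get a partition (54), we can, e.g., use the canonical pairing function $\Gamma : \lambda \times \lambda \to \lambda$ and let $A_j = \Gamma(\lambda \times \{j\})$.) Since a product of nonzero cardinals is greater than or equal to each factor, we have $\prod_{i \in A_j} \kappa_i \geq \sup_{i \in A_j} \kappa_i = \kappa$, for each $j < \lambda$. Thus, by (52),

$\prod_{i < \lambda} \kappa_i = \prod_{j < \lambda} \left(\prod_{i \in A_j} \kappa_i\right) \geq \prod_{j < \lambda} \kappa = \kappa^\lambda$.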

The strict inequalities in cardinal arithmetic that we proved in section 6 can be obtained as special cases of the following general theorem.

Theorem 8.10 (König). If $\kappa_i < \lambda_i$ for every $i \in I$, then

$\sum_{i \in I} \kappa_i < \prod_{i \in I} \lambda_i$.

Proof. We shall show that $\sum_i \kappa_i \not\geq \prod_i \lambda_i$. Let $T_i$, $i \in I$, be such that $\mathrm{card}(T_i) = \lambda_i$ for each $i \in I$. It suffices to show that if $Z_i$, $i \in I$, are subsets of $T = \prod_{i \in I} T_i$, and $\mathrm{card}(Z_i) \leq \kappa_i$ for each $i \in I$, then $\bigcup_{i \in I} Z_i \neq T$.

For every $i \in I$, let $S_i$ be the projection of $Z_i$ into the $i$th coordinate:

$S_i = \{f(i) : f \in Z_i\}$.

Since $\mathrm{card}(Z_i) < \mathrm{card}(T_i)$, we have $S_i \subset T_i$ and $S_i \neq T_i$. Now let $f \in T$ be a function such that $f(i) \notin S_i$ for every $i \in I$. Obviously, $f$ doesn’t belong to any $Z_i$, $i \in I$, and so $\bigcup_{i \in I} Z_i \neq T$.
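The diagonal step of this proof can be illustrated in a finite setting. The Python sketch below assumes finitely many finite coordinate sets, with functions in the product represented as tuples; the name `diagonal` and the sample data are hypothetical.

```python
from itertools import product


def diagonal(T, Z):
    """T[i]: coordinate sets; Z[i]: sets of tuples with len(Z[i]) < len(T[i]).

    Returns a tuple f in the product of the T[i] that lies outside every Z[i].
    """
    f = []
    for i, T_i in enumerate(T):
        S_i = {z[i] for z in Z[i]}     # projection of Z_i into the i-th coordinate
        f.append(min(set(T_i) - S_i))  # nonempty since |S_i| <= |Z_i| < |T_i|
    return tuple(f)


T = [{0, 1, 2}] * 3
Z = [{(0, 0, 0), (1, 1, 1)}, {(2, 0, 2), (0, 1, 0)}, {(2, 2, 2), (1, 0, 1)}]
f = diagonal(T, Z)

assert f in set(product(*T))           # f belongs to the product T
assert all(f not in Z_i for Z_i in Z)  # but f is missed by every Z_i
```

Each $Z_i$ is avoided at its own coordinate $i$, which is the whole content of the argument.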

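Corollary 8.11. $\kappa < 2^\kappa$ for every $\kappa$.

Proof.

$\underbrace{1 + 1 + \cdots}_{\kappa \text{ times}} < \underbrace{2 \cdot 2 \cdot \cdots}_{\kappa \text{ times}}$.

Corollary 8.12. $\mathrm{cf}(2^{\aleph_\alpha}) > \aleph_\alpha$.

Proof. It suffices to show that if $\kappa_i < 2^{\aleph_\alpha}$ for $i < \omega_\alpha$, then $\sum_{i < \omega_\alpha} \kappa_i < 2^{\aleph_\alpha}$. Let $\lambda_i = 2^{\aleph_\alpha}$.

$\sum_{i < \omega_\alpha} \kappa_i < \prod_{i < \omega_\alpha} \lambda_i = (2^{\aleph_\alpha})^{\aleph_\alpha} = 2^{\aleph_\alpha}$.

Corollary 8.13. $\mathrm{cf}(\aleph_\alpha^{\aleph_\beta}) > \aleph_\beta$.

Proof. We show that if $\kappa_i < \aleph_\alpha^{\aleph_\beta}$ for $i < \omega_\beta$, then $\sum_{i < \omega_\beta} \kappa_i < \aleph_\alpha^{\aleph_\beta}$. Let $\lambda_i = \aleph_\alpha^{\aleph_\beta}$.

$\sum_{i < \omega_\beta} \kappa_i < \prod_{i < \omega_\beta} \lambda_i = (\aleph_\alpha^{\aleph_\beta})^{\aleph_\beta} = \aleph_\alpha^{\aleph_\beta}$.

Corollary 8.14. $\kappa^{\mathrm{cf}(\kappa)} > \kappa$ for every infinite cardinal $\kappa$.

Proof. Let $\kappa_i < \kappa$, $i < \mathrm{cf}(\kappa)$, be such that $\kappa = \sum_{i < \mathrm{cf}(\kappa)} \kappa_i$. Then

$\kappa = \sum_{i < \mathrm{cf}(\kappa)} \kappa_i < \prod_{i < \mathrm{cf}(\kappa)} \kappa = \kappa^{\mathrm{cf}(\kappa)}$.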


8.6 The continuum function

Cantor’s theorem (theorem 6.1) states that $2^{\aleph_\alpha} > \aleph_\alpha$, and therefore $2^{\aleph_\alpha} \geq \aleph_{\alpha+1}$, for all $\alpha$. The generalized continuum hypothesis (GCH) is the statement

$2^{\aleph_\alpha} = \aleph_{\alpha+1}$

for all $\alpha$. GCH is independent of the axioms of ZFC. Under the assumption of GCH, cardinal exponentiation is evaluated as follows:

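Theorem 8.15. If GCH holds and $\kappa$ and $\lambda$ are infinite cardinals then:

i. If $\kappa \leq \lambda$, then $\kappa^\lambda = \lambda^+$.

ii. If $\mathrm{cf}(\kappa) \leq \lambda < \kappa$, then $\kappa^\lambda = \kappa^+$.

iii. If $\lambda < \mathrm{cf}(\kappa)$, then $\kappa^\lambda = \kappa$.

Proof.

i. By lemma 8.6.

ii. This follows from (43) and (44).

iii. By lemma 6.10, ii, the set $\kappa^\lambda$ is the union of the sets $\alpha^\lambda$, $\alpha < \kappa$, and $\mathrm{card}(\alpha^\lambda) \leq 2^{\mathrm{card}(\alpha) \cdot \lambda} = (\mathrm{card}(\alpha) \cdot \lambda)^+ \leq \kappa$.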

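The beth function is defined by induction:

$\beth_0 = \aleph_0$, $\beth_{\alpha+1} = 2^{\beth_\alpha}$,

$\beth_\alpha = \sup(\{\beth_\beta : \beta < \alpha\})$ if $\alpha$ is a limit ordinal.

Thus GCH is equivalent to the statement $\beth_\alpha = \aleph_\alpha$ for all $\alpha$.

We shall now investigate the general behavior of the continuum function $2^\kappa$, without assuming GCH.

Theorem 8.16.

i. If $\kappa < \lambda$ then $2^\kappa \leq 2^\lambda$.

ii. $\mathrm{cf}(2^\kappa) > \kappa$.

iii. If $\kappa$ is a limit cardinal then $2^\kappa = (2^{<\kappa})^{\mathrm{cf}(\kappa)}$.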

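Proof.

ii. By corollary 8.12.

iii. Let $\kappa = \sum_{i < \mathrm{cf}(\kappa)} \kappa_i$, where $\kappa_i < \kappa$ for each $i$. We have

$2^\kappa = 2^{\sum_i \kappa_i} = \prod_i 2^{\kappa_i} \leq \prod_i 2^{<\kappa} = (2^{<\kappa})^{\mathrm{cf}(\kappa)} \leq (2^\kappa)^{\mathrm{cf}(\kappa)} \leq 2^\kappa$.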


For regular cardinals, the only conditions theorem 8.16 places on the continuum function are $2^\kappa > \kappa$ and $2^\kappa \leq 2^\lambda$ if $\kappa < \lambda$. We shall see that these are the only restrictions on $2^\kappa$ for regular $\kappa$ that are provable in ZFC.

Corollary 8.17. If $\kappa$ is a singular cardinal and if the continuum function is eventually constant below $\kappa$, with value $\lambda$, then $2^\kappa = \lambda$.

Proof. If $\kappa$ is a singular cardinal that satisfies the assumption of the corollary, then there is $\mu$ such that $\mathrm{cf}(\kappa) \leq \mu < \kappa$ and $2^{<\kappa} = \lambda = 2^\mu$. Thus

$2^\kappa = (2^{<\kappa})^{\mathrm{cf}(\kappa)} = (2^\mu)^{\mathrm{cf}(\kappa)} = 2^\mu$.

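The gimel function is the function

(55) $\gimel(\kappa) = \kappa^{\mathrm{cf}(\kappa)}$.

If $\kappa$ is a limit cardinal and if the continuum function below $\kappa$ isn’t eventually constant, then the cardinal $\lambda = 2^{<\kappa}$ is a limit of a nondecreasing sequence

$\lambda = 2^{<\kappa} = \lim_{\alpha \to \kappa} 2^{\mathrm{card}(\alpha)}$

of length $\kappa$. By 6.8, ii, we have

$\mathrm{cf}(\lambda) = \mathrm{cf}(\kappa)$.

Using theorem 8.16, iii, we get

(56) $2^\kappa = (2^{<\kappa})^{\mathrm{cf}(\kappa)} = \lambda^{\mathrm{cf}(\lambda)}$.

If $\kappa$ is a regular cardinal, then $\kappa = \mathrm{cf}(\kappa)$; and since $2^\kappa = \kappa^\kappa$, we have

(57) $2^\kappa = \kappa^{\mathrm{cf}(\kappa)}$.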

Thus (56) and (57) show that the continuum function can be defined in terms of the gimel function:

Corollary 8.18.

i. If $\kappa$ is a successor cardinal, then $2^\kappa = \gimel(\kappa)$.

ii. If $\kappa$ is a limit cardinal and if the continuum function below $\kappa$ is eventually constant, then $2^\kappa = 2^{<\kappa} \cdot \gimel(\kappa)$.

iii. If $\kappa$ is a limit cardinal and if the continuum function below $\kappa$ isn’t eventually constant, then $2^\kappa = \gimel(2^{<\kappa})$.


8.7 Cardinal exponentiation

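We shall now investigate the function $\kappa^\lambda$ for infinite cardinal numbers $\kappa$ and $\lambda$.

We start with the following observation: If $\kappa$ is a regular cardinal and $\lambda < \kappa$, then every function $f : \lambda \to \kappa$ is bounded (i.e., $\sup(\{f(\xi) : \xi < \lambda\}) < \kappa$). Thus

$\kappa^\lambda = \bigcup_{\alpha < \kappa} \alpha^\lambda$,

and so

$\kappa^\lambda = \sum_{\alpha < \kappa} \mathrm{card}(\alpha)^\lambda$.

In particular, if $\kappa$ is a successor cardinal, we obtain the Hausdorff formula

(58) $\aleph_{\alpha+1}^{\aleph_\beta} = \aleph_\alpha^{\aleph_\beta} \cdot \aleph_{\alpha+1}$.

(Note that (58) holds for all $\alpha$ and $\beta$.)

In general, we can compute $\kappa^\lambda$ using the following lemma. If $\kappa$ is a limit cardinal, we use the notation $\lim_{\alpha \to \kappa} \alpha^\lambda$ to abbreviate $\sup(\{\mu^\lambda : \mu \text{ is a cardinal and } \mu < \kappa\})$.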

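Lemma 8.19. If $\kappa$ is a limit cardinal, and $\lambda \geq \mathrm{cf}(\kappa)$, then

$\kappa^\lambda = \left(\lim_{\alpha \to \kappa} \alpha^\lambda\right)^{\mathrm{cf}(\kappa)}$.

Proof. Let $\kappa = \sum_{i < \mathrm{cf}(\kappa)} \kappa_i$, where $\kappa_i < \kappa$ for each $i$. We have

$\kappa^\lambda \leq \left(\prod_{i < \mathrm{cf}(\kappa)} \kappa_i\right)^\lambda = \prod_i \kappa_i^\lambda \leq \prod_i \left(\lim_{\alpha \to \kappa} \alpha^\lambda\right) = \left(\lim_{\alpha \to \kappa} \alpha^\lambda\right)^{\mathrm{cf}(\kappa)} \leq (\kappa^\lambda)^{\mathrm{cf}(\kappa)} = \kappa^\lambda$.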

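Theorem 8.20. Let $\lambda$ be an infinite cardinal. Then for all infinite cardinals $\kappa$, the value of $\kappa^\lambda$ is computed as follows, by induction on $\kappa$:

i. If $\kappa \leq \lambda$ then $\kappa^\lambda = 2^\lambda$.

ii. If there exists some $\mu < \kappa$ such that $\mu^\lambda \geq \kappa$, then $\kappa^\lambda = \mu^\lambda$.

iii. If $\kappa > \lambda$ and if $\mu^\lambda < \kappa$ for all $\mu < \kappa$, then:

(a) if $\mathrm{cf}(\kappa) > \lambda$, then $\kappa^\lambda = \kappa$;

(b) if $\mathrm{cf}(\kappa) \leq \lambda$, then $\kappa^\lambda = \kappa^{\mathrm{cf}(\kappa)}$.

Proof.

i. By lemma 8.6.

ii. $\mu^\lambda \leq \kappa^\lambda \leq (\mu^\lambda)^\lambda = \mu^\lambda$.

iii. If $\kappa$ is a successor cardinal, we use the Hausdorff formula (equation (58)). If $\kappa$ is a limit cardinal, we have $\lim_{\alpha \to \kappa} \alpha^\lambda = \kappa$. If $\mathrm{cf}(\kappa) > \lambda$ then every $f : \lambda \to \kappa$ is bounded and we have $\kappa^\lambda = \lim_{\alpha \to \kappa} \alpha^\lambda = \kappa$. If $\mathrm{cf}(\kappa) \leq \lambda$, then by lemma 8.19, $\kappa^\lambda = \left(\lim_{\alpha \to \kappa} \alpha^\lambda\right)^{\mathrm{cf}(\kappa)} = \kappa^{\mathrm{cf}(\kappa)}$.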


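Theorem 8.20 shows that all cardinal exponentiation can be defined in terms of the gimel function:

Corollary 8.21. For every $\kappa$ and $\lambda$, the value of $\kappa^\lambda$ is either $2^\lambda$, or $\kappa$, or $\gimel(\mu)$ for some $\mu$ such that $\mathrm{cf}(\mu) \leq \lambda < \mu$.

Proof. If $\kappa^\lambda > 2^\lambda \cdot \kappa$, let $\mu$ be the least cardinal such that $\mu^\lambda = \kappa^\lambda$, and by theorem 8.20 (for $\mu$ and $\lambda$), $\mu^\lambda = \mu^{\mathrm{cf}(\mu)}$.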

A cardinal $\kappa$ is a strong limit cardinal if

$2^\lambda < \kappa$ for every $\lambda < \kappa$.

Obviously, every strong limit cardinal is a limit cardinal. If the GCH holds, then every limit cardinal is a strong limit.

It’s easy to see that if $\kappa$ is a strong limit cardinal, then

$\lambda^\nu < \kappa$ for all $\lambda, \nu < \kappa$.

An example of a strong limit cardinal is $\aleph_0$. Actually, the strong limit cardinals form a proper class: If $\alpha$ is an arbitrary cardinal, then the cardinal

$\kappa = \sup\{\alpha, 2^\alpha, 2^{2^\alpha}, \ldots\}$

(of cofinality $\omega$) is a strong limit cardinal.

Another fact worth mentioning is:

(59) If $\kappa$ is a strong limit cardinal, then $2^\kappa = \kappa^{\mathrm{cf}(\kappa)}$.

We recall that $\kappa$ is weakly inaccessible if it’s uncountable, regular, and limit. We say that a cardinal $\kappa$ is inaccessible (strongly) if $\kappa > \aleph_0$, $\kappa$ is regular, and $\kappa$ is strong limit.

Every inaccessible cardinal is weakly inaccessible. If the GCH holds, then every weakly inaccessible cardinal $\kappa$ is inaccessible.

The inaccessible cardinals owe their name to the fact that they cannot be obtained from smaller cardinals by the usual set-theoretical operations.

If $\kappa$ is inaccessible and $\mathrm{card}(X) < \kappa$, then $\mathrm{card}(P(X)) < \kappa$. If $\mathrm{card}(S) < \kappa$ and if $\mathrm{card}(X) < \kappa$ for every $X \in S$, then $\mathrm{card}(\bigcup S) < \kappa$.

In fact, $\aleph_0$ has this property too. Thus we can say that in a sense an inaccessible cardinal is to smaller cardinals what $\aleph_0$ is to finite cardinals. This is one of the main themes of the theory of large cardinals.


8.8 The singular cardinal hypothesis

The singular cardinal hypothesis (SCH) is the statement: For every singular cardinal $\kappa$, if $2^{\mathrm{cf}(\kappa)} < \kappa$, then $\kappa^{\mathrm{cf}(\kappa)} = \kappa^+$.

Obviously, the singular cardinal hypothesis follows from GCH. If $2^{\mathrm{cf}(\kappa)} \geq \kappa$ then $\kappa^{\mathrm{cf}(\kappa)} = 2^{\mathrm{cf}(\kappa)}$. If $2^{\mathrm{cf}(\kappa)} < \kappa$, then $\kappa^+$ is the least possible value of $\kappa^{\mathrm{cf}(\kappa)}$.

We shall prove later in the book that if SCH fails then a large cardinal axiom holds. In fact, the failure of SCH is equiconsistent with the existence of a certain large cardinal.

Under the assumption of SCH, cardinal exponentiation is determined by the continuum function on regular cardinals:

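Theorem 8.22. Assume that SCH holds.

i. If $\kappa$ is a singular cardinal then:

(a) $2^\kappa = 2^{<\kappa}$ if the continuum function is eventually constant below $\kappa$,

(b) $2^\kappa = (2^{<\kappa})^+$ otherwise.

ii. If $\kappa$ and $\lambda$ are infinite cardinals, then:

(a) If $\kappa \leq 2^\lambda$, then $\kappa^\lambda = 2^\lambda$.

(b) If $2^\lambda < \kappa$ and $\lambda < \mathrm{cf}(\kappa)$, then $\kappa^\lambda = \kappa$.

(c) If $2^\lambda < \kappa$ and $\mathrm{cf}(\kappa) \leq \lambda$, then $\kappa^\lambda = \kappa^+$.

Proof.

i. If $\kappa$ is a singular cardinal, then by theorem 8.16, $2^\kappa$ is either $\lambda$ or $\lambda^{\mathrm{cf}(\kappa)}$ where $\lambda = 2^{<\kappa}$. The latter occurs if $2^\alpha$ isn’t eventually constant below $\kappa$. Then $\mathrm{cf}(\lambda) = \mathrm{cf}(\kappa)$, and since $2^{\mathrm{cf}(\kappa)} < 2^{<\kappa} = \lambda$, we have $\lambda^{\mathrm{cf}(\lambda)} = \lambda^+$ by the singular cardinal hypothesis.

ii. We proceed by induction on $\kappa$, for a fixed $\lambda$. Let $\kappa > 2^\lambda$. If $\kappa$ is a successor cardinal, $\kappa = \nu^+$, then $\nu^\lambda \leq \kappa$ (by the induction hypothesis), and $\kappa^\lambda = (\nu^+)^\lambda = \nu^+ \cdot \nu^\lambda = \kappa$, by the Hausdorff formula (equation (58)).

If $\kappa$ is a limit cardinal, then $\nu^\lambda < \kappa$ for all $\nu < \kappa$. By theorem 8.20, $\kappa^\lambda = \kappa$ if $\lambda < \mathrm{cf}(\kappa)$, and $\kappa^\lambda = \kappa^{\mathrm{cf}(\kappa)}$ if $\lambda \geq \mathrm{cf}(\kappa)$. In the latter case, $2^{\mathrm{cf}(\kappa)} \leq 2^\lambda < \kappa$, and by the singular cardinal hypothesis, $\kappa^{\mathrm{cf}(\kappa)} = \kappa^+$.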

9 The axiom of regularity

The axiom of regularity states that the $\in$ relation on any family of sets is well-founded:

Axiom of regularity. Every nonempty set has an $\in$-minimal element:

$\forall S\, (S \neq \emptyset \Rightarrow (\exists x \in S)\ S \cap x = \emptyset)$.

As a consequence, there’s no infinite sequence

$x_0 \ni x_1 \ni x_2 \ni \cdots$

(Consider the set $S = \{x_0, x_1, x_2, \ldots\}$ and apply the axiom.) In particular, there’s no set $x$ such that

$x \in x$

and there are no "cycles"

$x_0 \in x_1 \in \cdots \in x_n \in x_0$.

The axiom of regularity postulates that sets of a certain type don’t exist. This restriction on the universe of sets isn’t contradictory (i.e., the axiom is consistent with the other axioms) and is irrelevant for the development of ordinal and cardinal numbers, natural and real numbers, and in fact of all ordinary mathematics. However, it’s extremely useful in the metamathematics of set theory, in the construction of models. In particular, all sets can be assigned ranks and can be arranged in a cumulative hierarchy.

We recall that a set $T$ is transitive if $x \in T$ implies that $x \subset T$.

Lemma 9.1. For every set $S$ there exists a transitive set $T \supset S$.

Proof. We define by induction

$S_0 = S$, $S_{n+1} = \bigcup S_n$;

(60) $T = \bigcup_{n=0}^{\infty} S_n$.

Clearly, $T$ is transitive and $T \supset S$.

Since every transitive set $T$ must satisfy $\bigcup T \subset T$, it follows that the set in (60) is the smallest transitive $T \supset S$; it’s called the transitive closure of $S$:

$\mathrm{TC}(S) = \bigcap \{T : T \supset S \text{ and } T \text{ is transitive}\}$.
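For hereditarily finite sets the construction (60) terminates after finitely many steps and can be computed. The sketch below models sets as nested Python `frozenset` values, which is an assumption of the illustration; `transitive_closure` is a hypothetical name.

```python
def transitive_closure(S):
    """TC(S) for a hereditarily finite set S built from nested frozensets."""
    T = set(S)      # S_0 = S
    layer = set(S)
    while layer:
        layer = {y for x in layer for y in x}  # S_{n+1} = union of S_n
        T |= layer
    return frozenset(T)


empty = frozenset()
one = frozenset({empty})           # 1 = {0}
S = frozenset({frozenset({one})})  # S = {{1}}
TC = transitive_closure(S)

# TC contains S as a subset and is transitive: every element is also a subset.
assert S <= TC
assert all(x <= TC for x in TC)
```

Here the layers are $S_0 = \{\{1\}\}$, $S_1 = \{1\}$, $S_2 = \{0\}$ and $S_3 = \emptyset$, so the loop stops after three passes.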

Lemma 9.2. Every nonempty class $C$ has an $\in$-minimal element.

Proof. Let $S \in C$ be arbitrary. If $S \cap C = \emptyset$, then $S$ is a minimal element of $C$; if $S \cap C \neq \emptyset$, we let $X = T \cap C$ where $T = \mathrm{TC}(S)$. $X$ is a nonempty set and by the axiom of regularity, there is $x \in X$ such that $x \cap X = \emptyset$. It follows that $x \cap C = \emptyset$; otherwise, if $y \in x$ and $y \in C$, then $y \in T$ since $T$ is transitive, and so $y \in x \cap T \cap C = x \cap X$. Hence $x$ is a minimal element of $C$.

9.1 The cumulative hierarchy of sets

We define, by transfinite induction,

$V_0 = \emptyset$, $V_{\alpha+1} = P(V_\alpha)$;

$V_\alpha = \bigcup_{\beta < \alpha} V_\beta$ if $\alpha$ is a limit ordinal.

The sets $V_\alpha$ have the following properties (by induction):


i. Each $V_\alpha$ is transitive.

ii. If $\alpha < \beta$, then $V_\alpha \subset V_\beta$.

iii. $\alpha \subset V_\alpha$.
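The first finite levels of the hierarchy are small enough to enumerate. The following Python sketch assumes sets are modelled as nested `frozenset` values; `power_set` and `V` are illustrative helper names.

```python
from itertools import combinations


def power_set(X):
    """All subsets of the finite set X, each as a frozenset."""
    items = list(X)
    return frozenset(
        frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)
    )


def V(n):
    """V_0 = empty set and V_{k+1} = P(V_k), for a finite stage n."""
    level = frozenset()
    for _ in range(n):
        level = power_set(level)
    return level


sizes = [len(V(n)) for n in range(5)]

# Properties i and ii on the computed levels.
assert all(x <= V(4) for x in V(4))  # V_4 is transitive
assert V(3) <= V(4)                  # V_3 is a subset of V_4
```

The sizes grow as iterated powers of two, so this is only feasible for the first five or so levels ($V_5$ already has 65536 elements).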

The axiom of regularity implies that every set is in some $V_\alpha$:

Lemma 9.3. For every $x$ there is $\alpha$ such that $x \in V_\alpha$:

(61) $\bigcup_{\alpha \in \mathrm{Ord}} V_\alpha = V$.

Proof. Let $C$ be the class of all $x$ that aren’t in any $V_\alpha$. If $C$ is nonempty, then $C$ has an $\in$-minimal element $x$. That is, $x \in C$, and $z \in \bigcup_\alpha V_\alpha$ for every $z \in x$. Hence $x \subset \bigcup_{\alpha \in \mathrm{Ord}} V_\alpha$. By replacement, there exists an ordinal $\gamma$ such that $x \subset \bigcup_{\alpha < \gamma} V_\alpha$. Hence $x \subset V_\gamma$ and so $x \in V_{\gamma+1}$. Thus $C$ is empty and we have (61).

Since every $x$ is in some $V_\alpha$, we may define the rank of $x$:

(62) $\mathrm{rank}(x)$ = the least $\alpha$ such that $x \in V_{\alpha+1}$.

Thus each $V_\alpha$ is the collection of all sets of rank less than $\alpha$, and we have

i. If $x \in y$, then $\mathrm{rank}(x) < \mathrm{rank}(y)$.

ii. $\mathrm{rank}(\alpha) = \alpha$.
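For hereditarily finite sets the rank can be computed recursively, using the standard equivalent description $\mathrm{rank}(x) = \sup\{\mathrm{rank}(z) + 1 : z \in x\}$. The Python sketch below assumes the nested-`frozenset` model; `rank` and `von_neumann` are illustrative names.

```python
def rank(x):
    """rank(x) = sup{rank(z) + 1 : z in x}, for nested frozensets; rank of the empty set is 0."""
    return max((rank(z) + 1 for z in x), default=0)


def von_neumann(n):
    """The finite ordinal n = {0, 1, ..., n-1} as a nested frozenset."""
    ordinal = frozenset()
    for _ in range(n):
        ordinal = ordinal | {ordinal}  # successor: n + 1 = n united with {n}
    return ordinal


three = von_neumann(3)

# Property ii for a finite ordinal, and property i for its elements.
assert rank(three) == 3
assert all(rank(z) < rank(three) for z in three)
```

The recursion terminates precisely because membership on these sets is well-founded, which is what the axiom of regularity guarantees in general.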

One of the uses of the rank function is a definition of equivalence classes for equivalence relations on a proper class. The basic trick is the following:

Given a class $C$, let

(63) $\hat{C} = \{x \in C : (\forall z \in C)\ \mathrm{rank}(x) \leq \mathrm{rank}(z)\}$.

$\hat{C}$ is always a set, and if $C$ is nonempty, then $\hat{C}$ is nonempty. Moreover, (63) can be applied uniformly.

Thus, for example, if $\equiv$ is an equivalence on a proper class $C$, we apply (63) to each equivalence class of $\equiv$, and define

$[x] = \{y \in C : y \equiv x \text{ and } \forall z \in C\, (z \equiv x \Rightarrow \mathrm{rank}(y) \leq \mathrm{rank}(z))\}$

and

$C/{\equiv} = \{[x] : x \in C\}$.

In particular, this trick enables us to define isomorphism types for a given isomorphism. For instance, one can define order-types of linearly ordered sets, or cardinal numbers (even without AC).

We use the same argument to prove the following.


Collection principle.

(64) $\forall X\, \exists Y\, (\forall u \in X)\, (\exists v\, \varphi(u, v, p) \Rightarrow (\exists v \in Y)\, \varphi(u, v, p))$

($p$ is a parameter).

The collection principle is a schema of formulas. We can formulate it as follows:

Given a "collection of classes" $C_u$, $u \in X$ ($X$ is a set), then there’s a set $Y$ such that for every $u \in X$,

if $C_u \neq \emptyset$, then $C_u \cap Y \neq \emptyset$.

To prove (64), we let

$Y = \bigcup_{u \in X} \hat{C}_u$

where $C_u = \{v : \varphi(u, v, p)\}$, i.e.,

$v \in Y \Leftrightarrow (\exists u \in X)\, (\varphi(u, v, p) \text{ and } \forall z\, (\varphi(u, z, p) \Rightarrow \mathrm{rank}(v) \leq \mathrm{rank}(z)))$.

That $Y$ is a set follows from the replacement schema.

Note that the collection principle implies the replacement schema: Given a function $F$, then for every set $X$ we let $Y$ be a set such that

$(\forall u \in X)\, (\exists v \in Y)\ F(u) = v$.

Then

$F \restriction X = F \cap (X \times Y)$

is a set by the separation schema.

9.2 ∈-induction

The method of transfinite induction can be extended to an arbitrary transitive class (instead of Ord), both for the proof and for the definition by induction:

Theorem 9.4 ($\in$-induction). Let $T$ be a transitive class, let $\Phi$ be a property. Assume that

i. $\Phi(\emptyset)$;

ii. if $x \in T$ and $\Phi(z)$ holds for every $z \in x$, then $\Phi(x)$.

Then every $x \in T$ has property $\Phi$.

Proof. Let $C$ be the class of all $x \in T$ that don’t have the property $\Phi$. If $C$ is nonempty, then it has an $\in$-minimal element $x$; apply i or ii.


Theorem 9.5 ($\in$-recursion). Let $T$ be a transitive class and let $G$ be a function (defined for all $x$). Then there’s a function $F$ on $T$ such that

(65) $F(x) = G(F \restriction x)$

for every $x \in T$.

Moreover, $F$ is the unique function that satisfies (65).

Proof. We let, for every $x \in T$,

$F(x) = y \Leftrightarrow$ there exists a function $f$ such that $\mathrm{dom}(f)$ is a transitive subset of $T$ and:

i. $(\forall z \in \mathrm{dom}(f))\ f(z) = G(f \restriction z)$,

ii. $f(x) = y$.

That $F$ is a (unique) function on $T$ satisfying (65) is proved by $\in$-induction.

Corollary 9.6. Let $A$ be a class. There’s a unique class $B$ such that

(66) $B = \{x \in A : x \subset B\}$.

Proof. Let

$F(x) = 1$ if $x \in A$ and $F(z) = 1$ for all $z \in x$; $F(x) = 0$ otherwise.

Let $B = \{x : F(x) = 1\}$. The uniqueness of $B$ is proved by $\in$-induction.

We say that each $x \in B$ is hereditarily in $A$.

One consequence of the axiom of regularity is that the universe doesn’t admit nontrivial $\in$-automorphisms. More generally:

Theorem 9.7. Let $T_1$ and $T_2$ be transitive classes and let $\pi$ be an $\in$-isomorphism of $T_1$ onto $T_2$; i.e., $\pi$ is one-to-one and

(67) $u \in v \Leftrightarrow \pi u \in \pi v$.

Then $T_1 = T_2$ and $\pi u = u$ for every $u \in T_1$.

Proof. We show, by $\in$-induction, that $\pi x = x$ for every $x \in T_1$. Assume that $\pi z = z$ for each $z \in x$ and let $y = \pi x$.

We have $x \subset y$ because if $z \in x$, then $z = \pi z \in \pi x = y$.

We also have $y \subset x$: Let $t \in y$. Since $y \subset T_2$, there’s $z \in T_1$ such that $\pi z = t$. Since $\pi z \in y$, we have $z \in x$, and so $t = \pi z = z$. Thus $t \in x$.

Therefore $\pi x = x$ for all $x \in T_1$, and $T_2 = T_1$.


9.3 Well-founded relations

The notion of well-founded relations that was introduced in section 5 can be generalized to relations on proper classes, and one can extend the method of induction to well-founded relations.

Let $E$ be a binary relation on a class $P$. For each $x \in P$, we let

$\mathrm{ext}_E(x) = \{z \in P : z \mathrel{E} x\}$

be the extension of $x$.

Definition 9.8. A relation $E$ on $P$ is well-founded, if:

i. every nonempty set $x \subset P$ has an $E$-minimal element;

ii. $\mathrm{ext}_E(x)$ is a set, for every $x \in P$.

(Condition ii is vacuous if $P$ is a set.) Note that the relation $\in$ is well-founded on any class, by the axiom of regularity.

Lemma 9.9. If $E$ is a well-founded relation on $P$, then every nonempty class $C \subset P$ has an $E$-minimal element.

Proof. We follow the proof of lemma 9.2; we are looking for $x \in C$ such that $\mathrm{ext}_E(x) \cap C = \emptyset$. Let $S \in C$ be arbitrary and assume that $\mathrm{ext}_E(S) \cap C \neq \emptyset$. We let $X = T \cap C$ where

$T = \bigcup_{n=0}^{\infty} S_n$ and

$S_0 = \mathrm{ext}_E(S)$, $S_{n+1} = \bigcup \{\mathrm{ext}_E(z) : z \in S_n\}$.

As in lemma 9.2, it follows that an $E$-minimal element $x$ of $X$ is $E$-minimal in $C$.

Theorem 9.10 (Well-founded induction). Let $E$ be a well-founded relation on $P$. Let $\Phi$ be a property. Assume that:

i. Every $E$-minimal element $x$ has property $\Phi$;

ii. if $x \in P$ and if $\Phi(z)$ holds for every $z$ such that $z \mathrel{E} x$, then $\Phi(x)$.

Then every $x \in P$ has property $\Phi$.

Proof. A modification of the proof of theorem 9.4.

Theorem 9.11 (Well-founded recursion). Let $E$ be a well-founded relation on $P$. Let $G$ be a function (on $V \times V$). Then there’s a unique function $F$ on $P$ such that

(68) $F(x) = G(x, F \restriction \mathrm{ext}_E(x))$

for every $x \in P$.


Proof. A modification of the proof of theorem 9.5.

(Note that if $F(x) = G(F \restriction \mathrm{ext}(x))$ for some $G$, then $F(x) = F(y)$ whenever $\mathrm{ext}(x) = \mathrm{ext}(y)$; in particular, $F(x)$ is the same for all minimal elements.)

Example 9.12 (The rank function). We define, by induction, for all $x \in P$:

$\rho(x) = \sup(\{\rho(z) + 1 : z \mathrel{E} x\})$

(compare with lemma 5.7). The range of $\rho$ is either an ordinal or the class Ord. For all $x, y \in P$,

$x \mathrel{E} y \Rightarrow \rho(x) < \rho(y)$.
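On a finite set the recursion defining the rank function can be evaluated directly. The Python sketch below assumes a finite $P$ and a well-founded (acyclic) $E$ given as a set of pairs `(z, x)` meaning $z \mathrel{E} x$; the name `rank_function` and the sample relation are hypothetical.

```python
from functools import lru_cache


def rank_function(P, E):
    """rho(x) = sup{rho(z) + 1 : z E x} on a finite set P.

    Assumes E is well-founded on P; a cycle would make the recursion diverge.
    """
    ext = {x: [z for z in P if (z, x) in E] for x in P}  # ext_E(x)

    @lru_cache(maxsize=None)
    def rho(x):
        return max((rho(z) + 1 for z in ext[x]), default=0)

    return {x: rho(x) for x in P}


P = {"a", "b", "c", "d"}
E = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")}
rho = rank_function(P, E)

# z E x implies rho(z) < rho(x).
assert all(rho[z] < rho[x] for (z, x) in E)
```

In this example the minimal element `a` gets rank 0, the incomparable elements `b` and `c` both get rank 1, and `d` gets rank 2.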

Example 9.13 (The transitive collapse). By induction, let

$\pi(x) = \{\pi(z) : z \mathrel{E} x\}$

for every $x \in P$. The range of $\pi$ is a transitive class, and for all $x, y \in P$,

$x \mathrel{E} y \Rightarrow \pi(x) \in \pi(y)$.

The transitive collapse of a well-founded relation isn’t necessarily a one-to-one function. It’s one-to-one if $E$ satisfies an additional condition, extensionality.
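A finite example shows how the collapse can fail to be one-to-one. The Python sketch below assumes a finite $P$, a well-founded $E$ given as pairs `(z, x)` meaning $z \mathrel{E} x$, and nested `frozenset` values as the model of sets; `transitive_collapse` and the sample relation are hypothetical.

```python
def transitive_collapse(P, E):
    """pi(x) = {pi(z) : z E x} on a finite set P with a well-founded relation E."""
    cache = {}

    def pi(x):
        if x not in cache:
            cache[x] = frozenset(pi(z) for z in P if (z, x) in E)
        return cache[x]

    return {x: pi(x) for x in P}


P = {"a", "b", "c", "d"}
E = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")}
collapse = transitive_collapse(P, E)

# b and c have the same extension {a}, so E is not extensional
# and the collapse identifies them.
assert collapse["b"] == collapse["c"]

# The range of pi is transitive: every element of an image is itself an image.
image = set(collapse.values())
assert all(y in image for x in image for y in x)
```

Here `a` collapses to the empty set, `b` and `c` both collapse to $\{\emptyset\}$, and `d` collapses to $\{\{\emptyset\}\}$, so four points of $P$ yield only three sets.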

Definition 9.14. A well-founded relation $E$ on a class $P$ is extensional if

(69) $\mathrm{ext}_E(X) \neq \mathrm{ext}_E(Y)$

whenever $X$ and $Y$ are distinct elements of $P$.

A class $M$ is extensional if the relation $\in$ on $M$ is extensional, i.e., if for any distinct $X, Y \in M$, $X \cap M \neq Y \cap M$.

The following theorem shows that the transitive collapse of an extensional well-founded relation is one-to-one, and that every extensional class is $\in$-isomorphic to a transitive class.

Theorem 9.15 (Mostowski’s collapsing theorem).

i. If $E$ is a well-founded and extensional relation on a class $P$, then there’s a transitive class $M$ and an isomorphism $\pi$ between $(P, E)$ and $(M, \in)$. The transitive class $M$ and the isomorphism $\pi$ are unique.

ii. In particular, every extensional class $P$ is isomorphic to a transitive class $M$. The transitive class $M$ and the isomorphism $\pi$ are unique.

iii. In case ii, if $T \subset P$ is transitive, then $\pi(x) = x$ for every $x \in T$.

Proof. Since ii is a special case of i (E = ∈ in case of ii), we shall prove the existence of an isomorphism in the general case.

Since E is a well-founded relation, we can define π by well-founded induction (theorem 9.11), i.e., π(x) can be defined in terms of the π(z)’s, where z E x. We let, for each x ∈ P,

(70) π (x) = {π (z) : z E x} .

In particular, in case of  E  = ∈, (70) becomes

(71) π (x) = {π (z) : z ∈ x ∩ P } .

The function π maps P onto a class M = π(P), and it’s immediate from equation (70) that M is transitive.

We use the extensionality of E to show that π is one-to-one. Let z ∈ M be of least rank such that z = π(x) = π(y) for some x ≠ y. Then ext_E(x) ≠ ext_E(y) and there’s, e.g., some u ∈ ext_E(x) such that u ∉ ext_E(y). Let t = π(u). Since t ∈ z = π(y), there’s v ∈ ext_E(y) such that t = π(v). Thus we have t = π(u) = π(v), u ≠ v, and t is of lesser rank than z (since t ∈ z). A contradiction!

Now it follows easily that

(72) x E y ⇔ π (x) ∈ π (y) .

If x E y, then π(x) ∈ π(y) by equation (70). Conversely, if π(x) ∈ π(y), then by (70), π(x) = π(z) for some z E y. Since π is one-to-one, we have x = z and so x E y.

The uniqueness follows from theorem 9.7: if π1 and π2 are two isomorphisms of P onto M1 and M2, respectively, then π2π1⁻¹ is an isomorphism between M1 and M2, and therefore the identity mapping. Hence π1 = π2.

It remains to prove iii. If T ⊂ P is transitive, then we first observe that x ⊂ P for every x ∈ T (since x ⊂ T), so that x ∩ P = x, and so we have

π (x) = {π (z) : z ∈ x}

for all x ∈ T . It follows easily by ∈-induction that π (x) = x for all x ∈ T .

10 Surreal numbers

10.1 The definition and the fundamental existence theorem

10.1.1 Definition

Definition 10.1. A surreal number is a function from an initial segment of the ordinals into the set {+, −}, i.e. informally, an ordinal sequence consisting of pluses and minuses which terminates. The empty sequence is included as a possibility.

Example 10.2. One example is the function f defined as f(0) = +, f(1) = −, and f(2) = +, which informally is written as (+−+). An example of infinite length is the sequence of ω pluses followed by ω minuses.

The length ℓ(a) of a surreal number a is the least ordinal α for which it’s undefined. (Since an ordinal is the set of all its predecessors this is the same as the domain of a, but I prefer to avoid this point of view.) An initial segment of a is a surreal number b such that ℓ(b) ≤ ℓ(a) and b(α) = a(α) for all α where b(α) is defined. The tail of b in a is the surreal number c of length ℓ(a) − ℓ(b) satisfying c(α) = a(ℓ(b) + α). Informally, this is the sequence obtained from a by chopping off b from the beginning. Then a may be regarded as the juxtaposition of b and c, written bc.

For stylistic reasons I shall occasionally say that a(α) = 0 if a is undefined at α. This should be regarded as an abuse of notation since we do not want the domain of a to be the proper class of all ordinals.

Definition 10.3. If  a and b are surreal numbers we define an order as follows:

a < b, if  a (α) < b (α)

where α is the first place where a and b differ, with the convention that − < 0 < +, e.g. (+−) < (+) < (++).

It’s clear that this is a linear order. In fact, this is essentially a lexicographical order.
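For sequences of finite length the order of definition 10.3 is easy to experiment with. The sketch below is only an illustration under the assumption that a surreal number of finite length is stored as a Python string over the characters `+` and `-`; the helper names `sign` and `less` are hypothetical.

```python
def sign(a, i):
    """Value of a at position i: +1, -1, or 0 where a is undefined."""
    if i >= len(a):
        return 0
    return 1 if a[i] == "+" else -1

def less(a, b):
    """a < b: compare at the first place where a and b differ (- < 0 < +)."""
    for i in range(max(len(a), len(b))):
        if sign(a, i) != sign(b, i):
            return sign(a, i) < sign(b, i)
    return False  # the sequences are equal

# The example from the definition: (+-) < (+) < (++).
print(less("+-", "+"), less("+", "++"))
```

Exactly one of `less(a, b)`, `less(b, a)` or `a == b` holds for any two strings, which is the linearity claimed above.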

10.1.2 Fundamental existence theorem

Theorem 10.4 (Fundamental existence theorem). Let F and G be two sets of surreal numbers such that a ∈ F and b ∈ G ⇒ a < b. Then there’s a unique c of minimal length such that a ∈ F ⇒ a < c and b ∈ G ⇒ c < b. Furthermore c is an initial segment of any surreal number strictly between F and G (note that F or G may be empty).

Notation 10.5. Henceforth I’ll use the natural convention that if F and G are sets then “F < G” means “a ∈ F and b ∈ G ⇒ a < b”, “F < c” means “a ∈ F ⇒ a < c”, and “c < G” means “b ∈ G ⇒ c < b”. Thus we may rewrite the hypothesis as F < G.

Example 10.6. Let F consist of all finite sequences of pluses, and G be the unit set whose only member is the sequence of ω pluses. Then F < G. It’s trivial to verify directly that c consists of ω pluses followed by a minus, i.e., F < c < G, and that any sequence d satisfying F < d < G begins with c.

Proof. Clearly, it suffices to prove the initial segment property.

Case 10.7. If  F  and G are empty, then clearly the empty sequence works.

Case 10.8. G is empty but F is nonempty.

Let α be the least ordinal such that there doesn’t exist a ∈ F such that a(β) = + for all β < α. Thus α cannot equal zero, since any a vacuously satisfies the condition a(β) = + for all β < 0.

Subcase 10.9. α is a limit ordinal. I claim that the desired c is the sequence of α pluses, i.e., ℓ(c) = α and c(β) = + if β < α.

Since, by choice of α, no element a ∈ F exists such that a(β) = + for all β < α, every element of F is less than c. Now let d be any surreal number such that F < d.

Suppose that γ < α. Then γ + 1 < α, since α is a limit ordinal. Hence, by choice of α, there exists a ∈ F such that a(β) = + for all β < γ + 1, i.e. β ≤ γ. Since a < d, d(β) = + for all β ≤ γ. In particular, d(γ) = +. Thus c is an initial segment of d.

Subcase 10.10. α is a non-limit ordinal, γ + 1. In this case there exists an a ∈ F such that a(β) = + for all β < γ, but there is no a ∈ F such that a(β) = + for all β ≤ γ. Hence any a ∈ F satisfying:

(β < γ ⇒ a(β) = +)

must satisfy:

(a(γ) = − or 0).

If all such a satisfy a(γ) = − then it’s easy to see that the sequence of γ pluses works for c. If there exists such an a ∈ F with a(γ) = 0, i.e. the sequence of γ pluses belongs to F, then the sequence of (γ + 1) pluses works for c.

Case 10.11. F  is empty but G is nonempty. This is similar to case 10.8.

Case 10.12. Both F and G are nonempty.

Let α be the least ordinal such that there do not exist a ∈ F, b ∈ G, such that a(β) = b(β) for all β < α. Again α ≠ 0.

Subcase 10.13. α is a limit ordinal. Suppose γ < α; then γ + 1 < α. Hence there exist a ∈ F, b ∈ G such that a(β) = b(β) for all β ≤ γ. The value a(γ) is well-defined in the following sense. If (a1, b1) is another pair satisfying the above properties then a(β) = a1(β) for all β ≤ γ. Otherwise, suppose δ ≤ γ is the least ordinal for which a(δ) ≠ a1(δ). Without loss of generality assume a(δ) < a1(δ). Then by the lexicographical order b < a1, which is a contradiction since b ∈ G and a1 ∈ F. Thus there exists a sequence d of length α, such that for all γ < α there exist a ∈ F and b ∈ G such that a(β) = d(β) = b(β) for β ≤ γ.

By hypothesis on α, d cannot be an initial segment of an element in F as well as an element in G. Furthermore, an element of F which doesn’t have d as an initial segment must be less than d. (Otherwise we obtain the same contradiction as before.) Similarly an element of G which doesn’t have d as an initial segment must be greater than d.

It follows that if d is neither an initial segment of an element of F nor an initial segment of an element of G then d works.

Now suppose F has elements with initial segment d. Then G doesn’t have such elements. Let F′ be the set of tails with respect to d of all such elements in F. Apply case 10.8 to F′ and ∅ to obtain d′. Then the juxtaposition dd′ works.

First, as before, the required inequality is satisfied with respect to all elements in F or G which don’t begin with d. Since F′ < d′ it follows from the lexicographical order that dd′ is larger than all elements in F beginning with d.

On the other hand, let e be any element satisfying F < e < G. Recall that for all γ < α there exist a ∈ F and b ∈ G such that a(β) = d(β) = b(β) for β ≤ γ. This implies by the lexicographical order that e(β) = d(β) for β < α. Thus d is an initial segment of e. Again using the lexicographical order, the tail e′ of d in e must satisfy F′ < e′. Hence d′ is an initial segment of e′. Therefore dd′ is an initial segment of e.

A similar argument applies if G has elements with initial segment d.

Subcase 10.14. α is a non-limit ordinal γ + 1. Then there exist a ∈ F, b ∈ G such that a(β) = b(β) for all β < γ, but there don’t exist a ∈ F, b ∈ G which agree for all β ≤ γ. As before, the values a(β) are well-defined, and we obtain a sequence d of length γ. Again, as before, any element in F which doesn’t have d as an initial segment must be less than d, and an element in G which doesn’t have d as an initial segment must be greater than d.

Let F′ be the set of tails with respect to d of elements in F which begin with d, and similarly for G′. Then, as stated previously, there don’t exist a ∈ F′, b ∈ G′ such that a(0) = b(0). (Note that in contrast to subcase 10.13, F′ and G′ are nonempty although one of these sets might contain the empty sequence as its only element.) Since F′ < G′, it follows that a(0) < b(0) for all a ∈ F′, b ∈ G′.

Now suppose that d ∉ F and d ∉ G. This means that neither F′ nor G′ contains the empty sequence, i.e. a(0) and b(0) are never undefined. Since a(0) < b(0), we obtain: a(0) = − and b(0) = +. It’s then clear that d works.

Since F and G are disjoint, d belongs to at most one of F and G. Suppose that d ∈ G. A similar argument will apply if d ∈ F. Then every a ∈ F′ satisfies a(0) = −. Let F″ be the set of tails of F′ with respect to this −. (Such an iterated tail is clearly the tail with respect to the sequence (d−).) Apply case 10.8 to F″ and ∅ to obtain d′. Then the juxtaposition c = d−d′ works. We already know that c satisfies the required inequality with respect to those elements that do not begin with d. Since no b ∈ G′ has b(0) = −, this takes care of all of G. The choice of d′ takes care of all elements in F beginning with d (the next term of which is necessarily −). On the other hand, any element e satisfying F < e < G must begin with d. Since d ∈ G, the next term must be −. By choice of d′, it must be an initial segment of the tail of e with respect to d−, i.e. e must begin with d−d′.

Definition 10.15. F |G is the unique c of minimal length such that F < c < G.
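When F and G are finite sets of sequences of finite length, F|G can be found by exhaustive search, shortest candidates first. The sketch below is not the construction used in the proof; it is a brute-force illustration that assumes sequences are stored as strings over `+` and `-`, and the names `sign`, `less` and `between` are hypothetical. The search bound relies on the fact, proved in section 10.1.3, that F|G is no longer than one more than the longest element of F ∪ G.

```python
from itertools import product

def sign(a, i):
    """Value of a at position i: +1, -1, or 0 where a is undefined."""
    if i >= len(a):
        return 0
    return 1 if a[i] == "+" else -1

def less(a, b):
    """a < b in the order of definition 10.3."""
    for i in range(max(len(a), len(b))):
        if sign(a, i) != sign(b, i):
            return sign(a, i) < sign(b, i)
    return False

def between(F, G):
    """Brute-force F|G for finite collections of finite sign sequences."""
    F, G = list(F), list(G)
    limit = max((len(x) for x in F + G), default=-1) + 1
    for n in range(limit + 1):            # try the shortest lengths first
        for signs in product("+-", repeat=n):
            c = "".join(signs)
            if all(less(a, c) for a in F) and all(less(c, b) for b in G):
                return c
    raise ValueError("F < G does not hold, so F|G is undefined")

print(between(["", "+", "++"], []))       # a finite analogue of example 10.6
```

The cost grows exponentially with the length of the answer, so this is only suitable for very short sequences.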

Remark 10.16. A slightly easier but less constructive proof is possible. First one extracts what is needed from the above proof to obtain an element c such that F < c < G. Although this is all that’s required for the conclusion, the proof does not simplify tremendously. Nevertheless, it simplifies slightly since there is no concern about the initial segment property. Once a c is obtained, the well-ordering principle gives us a c of minimal length. At this stage it’s useful to have a definition.

Definition 10.17. The common initial segment of a and b where a ≠ b is the element c whose length is the least α such that a(α) ≠ b(α) and such that c(β) = a(β) = b(β) for β < α. If a = b then c = a = b.

If one of a or b is an initial segment of the other, then c is the shorter element. If neither is an initial segment of the other, then, with γ = ℓ(c), either a(γ) = + and b(γ) = −, or a(γ) = − and b(γ) = +. In either case c is strictly between a and b.

Now let c satisfy F < c < G and be of minimal length. Suppose F < d < G. Let e be the common initial segment of c and d. Then F < e < G. Since c has minimal length and e is an initial segment of c, e = c. Hence c = e is an initial segment of d.
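For finite sequences the common initial segment is simply the longest common prefix. A small sketch, assuming sequences are stored as strings over `+` and `-` (the function names are illustrative only), checks the two cases described above.

```python
def sign(a, i):
    """Value of a at position i: +1, -1, or 0 where a is undefined."""
    if i >= len(a):
        return 0
    return 1 if a[i] == "+" else -1

def less(a, b):
    """a < b in the order of definition 10.3."""
    for i in range(max(len(a), len(b))):
        if sign(a, i) != sign(b, i):
            return sign(a, i) < sign(b, i)
    return False

def common_initial_segment(a, b):
    """Longest sequence that is an initial segment of both a and b."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return a[:n]

# Neither sequence is an initial segment of the other:
a, b = "+-+", "++"
c = common_initial_segment(a, b)
print(repr(c), less(a, c) and less(c, b))   # c lies strictly between a and b
```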

10.1.3 Order properties

Theorem 10.18. If  G = ∅ then F |G consists solely of pluses.

Proof. This follows immediately from the construction in the proof of the fundamental existence theorem (theorem 10.4). It can also be seen trivially as follows. Suppose c = F|G has minuses. Let d be the initial segment of c of length γ where γ is the least ordinal at which c has the value minus. Then clearly F < d and d has a shorter length than c. This contradicts the minimality of the length of c.

Theorem 10.19. If  F  = ∅ then F |G consists solely of minuses.

Proof. Analogous to theorem 10.18.

Note that the empty sequence consists solely of pluses and solely of minuses!

Theorem 10.20. ℓ(F|G) ≤ the least α such that

∀a (a ∈ F ∪ G ⇒ ℓ(a) < α).

This is trivial because of the lexicographical order, since otherwise the initial segment b of F|G of length α would also satisfy F < b < G, contradicting the minimality of F|G.

Note that < cannot be replaced by ≤. For example, if F = {(+)} and G = {(++)}, then F|G = (++−). The result also follows from the construction in the proof of the fundamental existence theorem (theorem 10.4). In fact, the construction gives the more detailed information that every proper initial segment of F|G is an initial segment of an element of F ∪ G. (An initial segment b of a is proper if a ≠ b.)

Theorem 10.20 has a kind of converse.

Theorem 10.21. Any a of length α can be expressed in the form F|G where all elements of F ∪ G have length less than α.

Proof. Let

F = {b : ℓ(b) < α ∧ b < a}

and

G = {b : ℓ(b) < α ∧ b > a}.

Then F < a < G and every element of length less than α is, by definition, in F or G, so that a satisfies the minimum length condition. Note that the argument is valid even if a is the empty sequence.

Theorem 10.22. Suppose F|G = c and F′|G′ = d. Then c ≤ d if and only if c < G′ and F < d.

Proof. We know that F < c < G and F′ < d < G′. Suppose that c ≤ d; then c ≤ d < G′ and F < c ≤ d. For the converse, assume c < G′ and F < d. We show that d < c leads to a contradiction. This assumption yields F < d < c < G. Hence c is an initial segment of d. Also F′ < d < c < G′, so d is an initial segment of c. Hence c = d, which contradicts d < c.

Of fundamental importance here will be what are called the cofinality theorems. They are analogous to classical results such as: in the ε, δ definition of a limit, it suffices to consider rational ε; and in the definition of a direct limit of objects with respect to a directed set, a cofinal subset gives rise to an isomorphic object.

Definition 10.23. (F′, G′) is cofinal in (F, G) if

(∀a ∈ F : ∃b ∈ F′ : b ≥ a) ∧ (∀a ∈ G : ∃b ∈ G′ : b ≤ a).

It’s clear that (F, G) is cofinal in (F, G), and that (F″, G″) cofinal in (F′, G′) and (F′, G′) cofinal in (F, G) implies (F″, G″) cofinal in (F, G). Also if F ⊂ F′ and G ⊂ G′ then (F′, G′) is cofinal in (F, G).

The following theorems are important although they’re trivial to prove.

Theorem 10.24 (The cofinality theorem 1). Suppose F|G = a, F′ < a < G′, and (F′, G′) is cofinal in (F, G); then F′|G′ = a.

Proof. Suppose ℓ(b) < ℓ(a) and F′ < b < G′. It follows immediately from cofinality that F < b < G, contradicting the minimality of ℓ(a). Hence a = F′|G′.

Theorem 10.25 (Cofinality theorem 2). Suppose (F, G) and (F′, G′) are mutually cofinal in each other. Then F|G = F′|G′.

Note that it’s enough to assume that F|G has meaning since F < G ⇒ F′ < G′.

Proof.

{x : F < x < G} = {x : F′ < x < G′}.

Hence the element of minimal length is the same.

Although the two above theorems are closely related they are not quite the same. The cofinality theorem (theorem 10.24) will be especially useful in the sequel. I emphasize that in spite of the simplicity of the proof, it’s more convenient to quote the term cofinality than to repeat the trivial argument every time it’s used. Also it’s often convenient, with abuse of notation, to say that H′ is cofinal in H. However, this is unambiguous only if it’s clearly understood whether H and H′ appear on the left or right, i.e. we must consider whether we are comparing (H, G) with (H′, G′), or (F, H) with (F′, H′). This is usually clear from the context.

Cofinality will be used to sharpen theorem 10.21 to obtain the canonical representation of a as F|G. Of course, the representation in theorem 10.21 itself may be regarded as the canonical representation. The choice is a matter of taste.

Theorem 10.26. Let a be a surreal number. Suppose that

F′ = {b : b < a and b is an initial segment of a}

and

G′ = {b : b > a and b is an initial segment of a}.

Then a = F′|G′. (In the sequel F′|G′ will be called the canonical representation of a.)

Proof. We first use the representation a = F|G of theorem 10.21. Then F′ ⊂ F and G′ ⊂ G. Since it’s clear that F′ < a < G′, it suffices by the cofinality theorem (theorem 10.24) to show that (F′, G′) is cofinal in (F, G). Let b ∈ F. Then ℓ(b) < ℓ(a). Suppose c is the common initial segment of a and b. Then b ≤ c < a. Hence c ∈ F′. A similar argument shows that G′ is cofinal in G.

The above representation is especially succinct. It’s easy to see that F′ is the set of all initial segments of a of length β for those β such that a(β) = +, and similarly G′ is the set of all initial segments of a of length β for those β such that a(β) = −. Thus the elements of F′ and G′ are naturally parameterized by ordinals. Furthermore, the elements of F′ form an increasing function of β, and the elements of G′ form a decreasing function of β. Thus by further use of the cofinality theorem we may restrict F′ or G′ to initial segments of length γ where the set of γ is cofinal in the set of β. For example, let a = (++−+−−+). Then

F′ = {( ), (+), (++−), (++−+−−)},

and

G′ = {(++), (++−+), (++−+−)}.

To avoid confusion it’s important to recall that the ordinals begin with 0. So, e.g., a(3) = +. Hence the initial segment of length 3 is (a(0), a(1), a(2)) = (++−).

In other words, this terminates just before a(3) = + so that it really belongs to F′. Note the way F′ and G′ get closer and closer to a in a manner analogous to that of partial sums of an alternating series approximating its sum. However, the analogy is limited by the possibility of having many alike signs in a row; e.g., in the extreme case of all pluses, there are no approximations from above. As an application of the last remark on cofinal sets of ordinals we also have

a = {(+) , (++−+−−)}|{(++) , (++−+−)}

or even more simply as

a = {(++−+−−)}|{(++−+−)},

since any subset containing the largest ordinal is cofinal in a finite set of ordinals.

In view of the above it seems natural to use F′ and G′ for the canonical representation of a. F and G on the other hand appear to contain lots of extra garbage.

Finally we need a result which may be regarded as a partial converse to the cofinality theorem. First, it’s unreasonable to expect a true converse; in fact, it’s surprising at first that any kind of converse is possible. If a = F|G choose b so that F < b < a. Such b exists by the fundamental existence theorem (theorem 10.4). By the cofinality theorem F ∪ {b}|G = a. However, F isn’t cofinal in F ∪ {b} by choice of b.
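The canonical representation of the example a = (++−+−−+) can be recomputed mechanically. The sketch below assumes finite sequences stored as strings over `+` and `-` and uses a brute-force search for F|G; the helper names (`canonical`, `between`, and so on) are hypothetical and the search is only an illustration, not the construction from the existence proof. It also checks that the two cofinal reductions given above still represent a.

```python
from itertools import product

def sign(a, i):
    """Value of a at position i: +1, -1, or 0 where a is undefined."""
    if i >= len(a):
        return 0
    return 1 if a[i] == "+" else -1

def less(a, b):
    """a < b in the order of definition 10.3."""
    for i in range(max(len(a), len(b))):
        if sign(a, i) != sign(b, i):
            return sign(a, i) < sign(b, i)
    return False

def between(F, G):
    """Brute-force F|G for finite collections of finite sign sequences."""
    F, G = list(F), list(G)
    limit = max((len(x) for x in F + G), default=-1) + 1
    for n in range(limit + 1):
        for signs in product("+-", repeat=n):
            c = "".join(signs)
            if all(less(x, c) for x in F) and all(less(c, y) for y in G):
                return c
    raise ValueError("F < G does not hold")

def canonical(a):
    """Initial segments of a cut just before a plus (lower) or a minus (upper)."""
    lower = [a[:i] for i in range(len(a)) if a[i] == "+"]
    upper = [a[:i] for i in range(len(a)) if a[i] == "-"]
    return lower, upper

a = "++-+--+"
lower, upper = canonical(a)
print(lower)   # initial segments below a
print(upper)   # initial segments above a
print(between(lower, upper) == a, between(["++-+--"], ["++-+-"]) == a)
```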

Theorem 10.27 (The inverse cofinality theorem). Let a = F|G be the canonical representation of a. Also let a = F′|G′ be an arbitrary representation. Then (F′, G′) is cofinal in (F, G).

Proof. Suppose b ∈ F. Then b < a < G′. Since a has minimal length among elements satisfying F′ < x < G′ and b has smaller length than a, F′ < b is impossible, i.e. ∃c ∈ F′ : c ≥ b. This is precisely what we need. A similar argument applies to G and G′.

The same proof works if the representation in theorem 10.21 is used. At any rate, we now have what we need to build up the algebraic structure on the surreal numbers. It’s hard to believe at this stage, but the relatively simple-minded system we have supports a rich algebraic structure.

10.2 The basic operations

10.2.1 Addition

We define addition by induction on the natural sum of the lengths of the addends. Recall that the natural sum is obtained by expressing the ordinals in normal form in terms of sums of powers of ω and then using ordinary polynomial addition, in contrast to ordinary ordinal addition which has absorption. Thus the natural sum is a strictly increasing function of each addend.
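The difference between the natural sum and ordinary ordinal addition can be seen on small examples. The following sketch makes a simplifying assumption that is not in the text: it only handles ordinals below ω^ω, stored as dictionaries mapping a natural-number exponent n to the positive integer coefficient of ωⁿ. The function names are hypothetical.

```python
def natural_sum(a, b):
    """Natural sum: add coefficients of equal powers, like polynomials."""
    out = dict(a)
    for exponent, coeff in b.items():
        out[exponent] = out.get(exponent, 0) + coeff
    return out

def ordinal_sum(a, b):
    """Ordinary ordinal sum a + b: terms of a below b's leading power are absorbed."""
    if not b:
        return dict(a)
    top = max(b)                                   # leading exponent of b
    out = {e: c for e, c in a.items() if e > top}  # surviving terms of a
    out[top] = a.get(top, 0) + b[top]
    out.update({e: c for e, c in b.items() if e < top})
    return out

one, omega = {0: 1}, {1: 1}
print(ordinal_sum(one, omega))   # 1 + ω = ω: the 1 is absorbed
print(natural_sum(one, omega))   # the natural sum keeps both terms: ω + 1
```

Because no term is ever absorbed, increasing either argument strictly increases the natural sum, which is what the induction needs.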

The following notation will be convenient. If a = F|G is the canonical representation of a, then a′ is a typical element of F and a″ is a typical element of G. Hence a′ < a < a″. We are now ready to give the definition.

Definition 10.28.

a + b = {a′ + b, a + b′}|{a″ + b, a + b″}.

Several remarks are appropriate here. First, since the induction is on the natural sum of the lengths, we are permitted to use sums such as a′ + b in the definition. Secondly, no further definition is needed for the beginning. Since at the beginning we have only the empty set, we can use the trite remark that {f(x) : x ∈ ∅} = ∅ regardless of f. For example, ∅|∅ + ∅|∅ = ∅|∅. Thirdly, there’s the a priori possibility that the sets F and G used in the definition of a + b do not satisfy F < G. To make the definition formally precise, one can use the convention that F|G = u, for some special symbol u, if F < G fails or if u ∈ F ∪ G. In the sequel when a definition of an operation is given in the above form, we will show that F is always less than G so that the operation is really defined, i.e. u is never obtained as a value.

Note that since we use a specific representation of elements in the form of F|G, the operations are automatically well defined. Nevertheless, in order to advance it’s necessary to have the fact that the result is independent of the representation.

Let’s illustrate the definition with several simple examples. Denote the empty sequence ( ) by 0 and the sequence (+) by 1. Now (+) = {0}|∅. (It’s easy to get confused. Note that G is the empty set and F is the unit set whose only element is the empty sequence. They’re thus not the same.) Then

1 + 0 = {0}|∅+∅|∅ = {0 + ∅|∅}|∅ = {0}|∅ = 1.

Similarly 0 + 1 = 1. Also

1 + 1 = {0}|∅+ {0}|∅ = {0 + 1, 1 + 0}|∅ = {1}|∅ = {(+)}|∅ = (++)

which is natural to call 2.

It does look rather cumbersome to work directly with the definition, but so would ordinary arithmetic if we were forced to use {∅}, {∅, {∅}}, instead of 1, 2, etc. and go back to inductive definitions.
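The small sums above can be reproduced by following definition 10.28 literally on sequences of finite length. The sketch below assumes sequences are stored as strings over `+` and `-`, takes the canonical representation of each addend, and finds F|G by brute-force search; all function names are hypothetical and the approach is only practical for very short sequences.

```python
from functools import lru_cache
from itertools import product

def sign(a, i):
    """Value of a at position i: +1, -1, or 0 where a is undefined."""
    if i >= len(a):
        return 0
    return 1 if a[i] == "+" else -1

def less(a, b):
    """a < b in the order of definition 10.3."""
    for i in range(max(len(a), len(b))):
        if sign(a, i) != sign(b, i):
            return sign(a, i) < sign(b, i)
    return False

def between(F, G):
    """Brute-force F|G for finite collections of finite sign sequences."""
    limit = max((len(x) for x in F + G), default=-1) + 1
    for n in range(limit + 1):
        for signs in product("+-", repeat=n):
            c = "".join(signs)
            if all(less(x, c) for x in F) and all(less(c, y) for y in G):
                return c
    raise ValueError("F < G does not hold")

def canonical(a):
    """Canonical lower and upper sets of a (theorem 10.26)."""
    lower = [a[:i] for i in range(len(a)) if a[i] == "+"]
    upper = [a[:i] for i in range(len(a)) if a[i] == "-"]
    return lower, upper

@lru_cache(maxsize=None)
def add(a, b):
    """a + b = {a' + b, a + b'} | {a'' + b, a + b''}."""
    a_low, a_up = canonical(a)
    b_low, b_up = canonical(b)
    F = [add(x, b) for x in a_low] + [add(a, y) for y in b_low]
    G = [add(x, b) for x in a_up] + [add(a, y) for y in b_up]
    return between(F, G)

print(repr(add("+", "")), repr(add("+", "+")))   # 1 + 0 and 1 + 1
```

The recursion terminates because every recursive call replaces one addend by a strictly shorter initial segment.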

Theorem 10.29. a + b is always defined, i.e. never u, and furthermore b > c ⇒ a + b > a + c and b > c ⇒ b + a > c + a.

Remark 10.30. Although the first part is what is most urgent, we need the secondpart to carry through the induction.

Remark 10.31. As a matter of style, one can prove commutativity first (which istrivial) and then simplify the statement of the above theorem and its proof. Howeverit seems preferable to prove that a + b exists as a surreal number before proving anyof its properties.

Proof. We use induction on the natural sum of the lengths. In other words, suppose theorem 10.29 is true for all pairs (a, b) of surreal numbers such that the natural sum of ℓ(a) and ℓ(b) is less than α. We show that the statements remain valid if we include pairs whose natural sum is α.

Now a + b = {a′ + b, a + b′}|{a″ + b, a + b″}. First, we must show that F < G.

Since a′ < a″, it follows from the inductive hypothesis that a′ + b < a″ + b. Similarly a + b′ < a + b″. Also a′ + b < a′ + b″ < a + b″ and a + b′ < a″ + b′ < a″ + b. Hence a + b is defined.

By definition a′ + b < a + b < a″ + b and a + b′ < a + b < a + b″. This proves the required inequality when one of b and c is an initial segment of the other.

Now suppose that neither b nor c is an initial segment of the other, where the natural sum of ℓ(a) and ℓ(b) and the natural sum of ℓ(a) and ℓ(c) are both at most α. Let d be the common initial segment of b and c. Now assume b > c. Hence b > d > c. Hence a + b > a + d > a + c and b + a > d + a > c + a.

It follows immediately that a > b and c > d ⇒ a + c > b + d.

Theorem 10.32. Suppose a = F|G and b = H|K; then

a + b = {f + b, a + h}|{g + b, a + k}

where f ∈ F, g ∈ G, h ∈ H, k ∈ K. I.e. although the definition of a + b is given in terms of the canonical representation of a and b, all representations give the same answer.

Remark 10.33. We shall call this the uniformity theorem for addition, and say that the uniformity property holds for addition.

Proof. Let a = F|G, b = H|K. Suppose the canonical representations are a = F′|G′, b = H′|K′.

By the inverse cofinality theorem (theorem 10.27), F is cofinal in F′ and similarly for the other sets involved. Consider {f + b, a + h}|{g + b, a + k}. It’s now easy to check that the hypotheses of the cofinality theorem (theorem 10.24) are satisfied. The betweenness property of a + b follows immediately from theorem 10.29, e.g. f + b < a + b. Also suppose a′ + b is one of the typical lower elements in the canonical representation of a + b as in the definition. Since F is cofinal in F′, (∃f ∈ F)(f ≥ a′). By theorem 10.29, f + b ≥ a′ + b. A similar argument applies to the other typical elements. Hence the cofinality condition is satisfied, so by cofinality theorem 1 (theorem 10.24) we get a + b.

Theorem 10.34. The surreal numbers form an Abelian group with respect to addition. The empty sequence is the identity, and the inverse is obtained by reversing all signs. (Note that one should be aware of potential set-theoretic problems since the system of surreal numbers is a proper class.)

Proof.

i. Commutative law. This is trivial because of the symmetric nature of the definition.

ii. Associative law. We use induction on the natural sum of the lengths of the addends:

(a + b) + c = {(a + b)′ + c, (a + b) + c′}|{(a + b)″ + c, (a + b) + c″}.

By theorem 10.32 we may use a′ + b and a + b′ instead of (a + b)′, and similarly for (a + b)″. I.e. it’s convenient to use the representation in the definition of addition rather than the canonical representation. We thus obtain

(a + b) + c = {(a′ + b) + c, (a + b′) + c, (a + b) + c′}|{(a″ + b) + c, (a + b″) + c, (a + b) + c″}.

A similar result is obtained for a + (b + c). Associativity follows by induction.

iii. The identity: Denote the empty sequence by 0. Then 0 = ∅|∅. We again use induction:

a + 0 = {a′ + 0, a + 0′}|{a″ + 0, a + 0″}.

There are no terms 0′, 0″, so this simplifies to {a′ + 0}|{a″ + 0}, which is {a′}|{a″} by the inductive hypothesis. We thus get a + 0 = a.

iv. The inverse: We use induction. Let −a be obtained from a by reversing all signs, and let F|G be the canonical representation of a. Again, let a′ and a″ be the typical elements of F and G respectively. Note that, in general, if b is an initial segment of c then −b is an initial segment of −c, and b < c ⇒ −b > −c. Hence the canonical representation of −a may be expressed as {−a″}|{−a′}. Therefore

a + (−a) = {a′ + (−a), a + (−a″)}|{a″ + (−a), a + (−a′)}.

Since a′ < a < a″, it’s clear from the lexicographical order that −a″ < −a < −a′. Using induction and the fact that addition preserves order, we obtain

a′ + (−a) < a′ + (−a′) = 0;

a + (−a″) < a″ + (−a″) = 0;

a″ + (−a) > a″ + (−a″) = 0;

a + (−a′) > a′ + (−a′) = 0.

Hence in the representation of a + (−a) as H|K, H < 0 < K. Since 0 vacuously satisfies any minimality property, a + (−a) = H|K = 0.

Thus we now know that the surreal numbers form an ordered Abelian group. The identity and inverses are obtained in a way which is heuristically natural.

10.2.2 Multiplication

The definition of multiplication is more complicated than that of addition; in fact, it took some time before the standard definition of multiplication was discovered.

Definition 10.35.

a·b = {a′·b + a·b′ − a′·b′, a″·b + a·b″ − a″·b″}|{a′·b + a·b″ − a′·b″, a″·b + a·b′ − a″·b′}.

Note, from this point on we’re going to omit the multiplication symbol, i.e. a · b is going to be written ab.

As partial motivation note that if a, b, a′, and b′ are ordinary real numbers such that a′ < a, b′ < b, then (b − b′)(a − a′) > 0, i.e. a′b + ab′ − a′b′ < ab. Similar computations apply to get the appropriate inequalities if either a′ is replaced by a″ or b′ is replaced by b″. Thus the inequalities are consistent with what is desired.
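The definition of the product can be followed literally for sequences of finite length, in the same brute-force style one might use for addition. The sketch below assumes sequences are stored as strings over `+` and `-`; every function name is hypothetical, and the exhaustive search for F|G limits it to very short sequences. Under the usual reading of finite sign sequences as dyadic rationals, (++) is 2 and (+−) is 1/2.

```python
from functools import lru_cache
from itertools import product

def sign(a, i):
    """Value of a at position i: +1, -1, or 0 where a is undefined."""
    if i >= len(a):
        return 0
    return 1 if a[i] == "+" else -1

def less(a, b):
    """a < b in the order of definition 10.3."""
    for i in range(max(len(a), len(b))):
        if sign(a, i) != sign(b, i):
            return sign(a, i) < sign(b, i)
    return False

def between(F, G):
    """Brute-force F|G for finite collections of finite sign sequences."""
    limit = max((len(x) for x in F + G), default=-1) + 1
    for n in range(limit + 1):
        for signs in product("+-", repeat=n):
            c = "".join(signs)
            if all(less(x, c) for x in F) and all(less(c, y) for y in G):
                return c
    raise ValueError("F < G does not hold")

def canonical(a):
    """Canonical lower and upper sets of a (theorem 10.26)."""
    lower = [a[:i] for i in range(len(a)) if a[i] == "+"]
    upper = [a[:i] for i in range(len(a)) if a[i] == "-"]
    return lower, upper

def neg(a):
    """-a: reverse all signs."""
    return "".join("-" if s == "+" else "+" for s in a)

@lru_cache(maxsize=None)
def add(a, b):
    """a + b = {a' + b, a + b'} | {a'' + b, a + b''}."""
    (al, au), (bl, bu) = canonical(a), canonical(b)
    F = [add(x, b) for x in al] + [add(a, y) for y in bl]
    G = [add(x, b) for x in au] + [add(a, y) for y in bu]
    return between(F, G)

@lru_cache(maxsize=None)
def mul(a, b):
    """Product as in definition 10.35, built from terms x*b + a*y - x*y."""
    (al, au), (bl, bu) = canonical(a), canonical(b)

    def term(x, y):
        return add(add(mul(x, b), mul(a, y)), neg(mul(x, y)))

    lower = [term(x, y) for x in al for y in bl] + [term(x, y) for x in au for y in bu]
    upper = [term(x, y) for x in al for y in bu] + [term(x, y) for x in au for y in bl]
    return between(lower, upper)

print(repr(mul("++", "++")), repr(mul("+-", "++")))   # 2 * 2 and 1/2 * 2
```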

Theorem 10.36. ab is always defined. Furthermore a > b and c > d ⇒ ac − bc > ad − bd.

Proof. We use induction on the natural sum of the lengths of the factors. We shall refer to the inequality ac − bc > ad − bd as P(a, b, c, d), and to the expression a◦b + ab◦ − a◦b◦, where a◦, b◦ are proper initial segments of a and b respectively, as f(a◦, b◦). Note that a◦ is of the form a′ or a″. In the former case we call a◦ a lower element and in the latter case an upper element.

It follows immediately from the definition that the relation P is transitive on the last two variables. Since, at this point, we can freely use the properties of addition in ordinary algebra, P(a, b, c, d) is equivalent to ac − ad > bc − bd. This makes it clear that P is transitive on the first two variables.

Now let b◦₁ and b◦₂ be initial segments of b and consider f(a◦, b◦₂) − f(a◦, b◦₁). This is

(a◦b + ab◦₂ − a◦b◦₂) − (a◦b + ab◦₁ − a◦b◦₁) = (ab◦₂ − a◦b◦₂) − (ab◦₁ − a◦b◦₁).

If a > a◦ and b◦₂ > b◦₁, the inductive hypothesis says that the above expression is positive, i.e. P(a, a◦, b◦₂, b◦₁), so f(a◦, b◦) is an increasing function of b◦ if a◦ < a. If a◦ > a and b◦₁ > b◦₂, the above expression may be written (a◦b◦₁ − ab◦₁) − (a◦b◦₂ − ab◦₂) which, again, is positive. Hence if a◦ > a, f(a◦, b◦) is a decreasing function of b◦. Similarly, f(a◦, b◦) is an increasing function of a◦ if b◦ < b, and decreasing if b◦ > b.

To show that ab is defined, we must check inequalities such as f(a′₁, b′) < f(a′₂, b″). This follows easily from the above. If a′₁ = a′₂, this is immediate since b′ < b″ and a′₁ < a. More generally, let a′ be max(a′₁, a′₂); then

f(a′₁, b′) ≤ f(a′, b′) < f(a′, b″) ≤ f(a′₂, b″).

We now consider f(a″₁, b″), f(a″₂, b′), and a″ = min(a″₁, a″₂). Then

f(a″₁, b″) ≤ f(a″, b″) < f(a″, b′) ≤ f(a″₂, b′).

Since the remaining cases are similar to the ones we already checked, this shows that ab is defined.

To prove the second statement of the theorem, assume first that in each of the pairs {a, b}, {c, d} one element is an initial segment of the other. By definition f(a′, b′) < ab, f(a″, b″) < ab, f(a′, b″) > ab, and f(a″, b′) > ab. Now

ab − f(a′, b′) = ab − (a′b + ab′ − a′b′) = (ab − a′b) − (ab′ − a′b′).

Thus the first inequality says P(a, a′, b, b′). Similarly, the other inequalities give P(a″, a, b″, b), P(a, a′, b″, b), and P(a″, a, b, b′). This proves the statement in this special case.

Next, remove the above restriction on {a, b}, but still assume that one of c or d is an initial segment of the other. Let e be the common initial segment of a and b. Then a > e > b. By the above P(a, e, c, d) and P(e, b, c, d). Hence by transitivity we obtain P(a, b, c, d).

Finally, suppose neither c nor d is an initial segment of the other and let f be their common initial segment. By the above, we have P(a, b, c, f) and P(a, b, f, d), so we finally obtain P(a, b, c, d) by transitivity. This takes care of all cases.

Theorem 10.37 (The uniformity theorem for multiplication). The uniformity property holds for multiplication.

Proof. This is similar to the proof of theorem 10.32 and, in fact, all theorems of thistype have a similar proof once we have basic inequalities of a suitable kind.

Now that we have theorem 10.36 and, in particular, the fact that the inequality stated there is valid in general, the same computation as in the beginning of the proof of the theorem gives us the behavior of $f(c^\circ, d^\circ)$ in general for $c^\circ \neq a$ and $d^\circ \neq b$. (We no longer require that $c^\circ$ and $d^\circ$ be initial segments of $a$ and $b$ respectively.)

Suppose $a = F|G$, $b = F'|G'$, $c^\circ \in F \cup G$, $d^\circ \in F' \cup G'$. Then $f(c^\circ, d^\circ)$ is an increasing function of $d^\circ$ if $c^\circ < a$ and a decreasing function of $d^\circ$ for $c^\circ > a$, and similarly for fixed $d^\circ$.

We now check the hypotheses of cofinality theorem i (theorem 10.24). The betweenness property of $ab$ follows from the same computation as in the latter part of the proof of theorem 10.36. For example, since

$$ab - f(c, d) = (ab - cb) - (ad - cd),$$

$P(a, c, b, d)$ says that $ab > f(c, d)$. The other parts of the betweenness property follow the same way. Note that we are now going in the reverse direction to the one we went earlier, i.e. we have $P$ and we obtain the betweenness property.

To check the cofinality let e.g. $f(a', b')$ be a lower element using the canonical representation of $ab$. By the inverse cofinality theorem (theorem 10.27) $(\exists c \in F)(c \ge a')$ and $(\exists d \in F')(d \ge b')$. Then $f(c, d) \ge f(c, b') \ge f(a', b')$. A similar argument applies to the other cases. Actually all the cases may be elegantly unified by noting that $f(c^\circ, x)$ maintains the side of $x$ if $c^\circ < a$ and reverses it if $c^\circ > a$. Thus $f(c^\circ, x)$ maintains the side of $x$ if and only if it's an increasing function of $x$. The same holds in all cases for $f(x, d^\circ)$. This is just what is needed to obtain cofinality.

Theorem 10.38. The surreal numbers form an ordered commutative ring with identity with respect to the above definitions of addition and multiplication. The multiplicative identity 1 is the sequence $(+) = \{0\}|\emptyset$.

Proof.

i. Commutative law. Because of symmetry this is just as trivial here as it was for addition.

ii. Distributive law. We use induction on $\ell(a) + \ell(b) + \ell(c)$.

$$(a + b)c = \{(a + b)'c + (a + b)c' - (a + b)'c', \dots\} \,|\, \dots$$

By the uniformity theorem for multiplication (theorem 10.37) we may use $a' + b$ and $a + b'$ instead of $(a + b)'$. For unification purposes we consider a typical term $(a + b)^\circ c + (a + b)c^\circ - (a + b)^\circ c^\circ$ in the representation of $(a + b)c$, where an element is lower if and only if an even number of naughts correspond to double primes. For $(a + b)^\circ$ we use $\{a^\circ + b, a + b^\circ\}$ by the above. Thus, typical terms become $(a^\circ + b)c + (a + b)c^\circ - (a^\circ + b)c^\circ$ and $(a + b^\circ)c + (a + b)c^\circ - (a + b^\circ)c^\circ$. By induction the first term becomes

$$a^\circ c + bc + ac^\circ + bc^\circ - a^\circ c^\circ - bc^\circ = a^\circ c + ac^\circ - a^\circ c^\circ + bc.$$

A similar result is obtained if $a + b^\circ$ is used instead of $a^\circ + b$.

A typical term in the representation of $ac + bc$ is $(ac)^\circ + bc$ which is $(a^\circ c + ac^\circ - a^\circ c^\circ) + bc$. Note that this is justified by theorem 10.32. Since a similar result applies if we take $ac + (bc)^\circ$ and since the parity rule as to which element is upper or lower is the same as before, we see that $(a + b)c = ac + bc$.

We are now permitted to write $ab - (a^\circ b + ab^\circ - a^\circ b^\circ)$ as $(a - a^\circ)(b - b^\circ)$.

iii. Associative law. We use induction on $\ell(a) + \ell(b) + \ell(c)$. A typical term in the representation of $(ab)c$ is $(ab)^\circ c + (ab)c^\circ - (ab)^\circ c^\circ$ which by the uniformity theorem for multiplication (theorem 10.37) may be written

$$(a^\circ b + ab^\circ - a^\circ b^\circ)c + (ab)c^\circ - (a^\circ b + ab^\circ - a^\circ b^\circ)c^\circ.$$

Again, an element is lower if and only if an even number of naughts correspond to double primes. By the distributive law the above expression is

$$(a^\circ b)c + (ab^\circ)c - (a^\circ b^\circ)c + (ab)c^\circ - (a^\circ b)c^\circ - (ab^\circ)c^\circ + (a^\circ b^\circ)c^\circ.$$

Of crucial importance is the following kind of symmetry in the expression: the terms are all obtained from $(ab)c$ by putting a superscript on at least one of the factors, and a term has a minus if and only if there is an even number of superscripts.

Exactly the same thing happens with the expansion of $a(bc)$ except for the bracketing. I.e. the parity rules as to which term is upper or lower, and which addends in a term have a minus, are the same as before. In fact, a typical term of $a(bc)$ is

$$a^\circ(bc) + a(b^\circ c + bc^\circ - b^\circ c^\circ) - a^\circ(b^\circ c + bc^\circ - b^\circ c^\circ) =$$

$$a^\circ(bc) + a(b^\circ c) + a(bc^\circ) - a(b^\circ c^\circ) - a^\circ(b^\circ c) - a^\circ(bc^\circ) + a^\circ(b^\circ c^\circ).$$

The result now follows by induction.

iv. The identity. First note that $a \cdot 0 = 0$. This follows from the distributive law. It also follows immediately from the definition: since $0 = \emptyset|\emptyset$, and all terms used in representing a product must contain lower or upper representatives of each factor, $a \cdot 0 = \emptyset|\emptyset$. (Note that this wasn't the case for addition.)

We again use induction to compute $a \cdot 1$. Since $1 = \{0\}|\emptyset$ the expression for $a \cdot 1$ reduces to

$$a \cdot 1 = \{a' \cdot 1 + a \cdot 0 - a' \cdot 0\} \,|\, \{a'' \cdot 1 + a \cdot 0 - a'' \cdot 0\}$$

which is $\{a' \cdot 1\}|\{a'' \cdot 1\}$. By induction this is $\{a'\}|\{a''\}$ which is $a$.

v. Compatibility of ordering. Suppose $a > 0$ and $b > 0$. Then by theorem 10.36 we have $P(a, 0, b, 0)$, i.e. $ab - 0 \cdot b > a \cdot 0 - 0 \cdot 0$. Hence $ab > 0$.

Thus we now know that the surreal numbers form an integral domain. We saw that multiplication behaves somewhat more subtly than addition. We shall see in the next section that division is handled much more subtly. At any rate, it's remarkable that all this is possible.

It's possible to get a nice form for the representation of an $n$-fold sum and product by inductive use of the uniformity theorems. It's trivial that $a_1 + a_2 + \dots + a_n$ may be represented as

$$\{a'_1 + a_2 + \dots + a_n,\ a_1 + a'_2 + \dots + a_n,\ \dots,\ a_1 + a_2 + \dots + a'_n\} \,|\, \{a''_1 + a_2 + \dots + a_n,\ a_1 + a''_2 + \dots + a_n,\ \dots,\ a_1 + a_2 + \dots + a''_n\}.$$

We claim that $a_1 a_2 \cdots a_n$ may be represented by terms

$$a_1 a_2 \cdots a_n - (a_1 - a^\circ_1)(a_2 - a^\circ_2) \cdots (a_n - a^\circ_n)$$

where an element is lower if and only if an even number of naughts correspond to double primes.

The identity $ab - (a^\circ b + ab^\circ - a^\circ b^\circ) = (a - a^\circ)(b - b^\circ)$ may be written in the succinct form $ab - (ab)^\circ = (a - a^\circ)(b - b^\circ)$. The uniformity theorem for multiplication (theorem 10.37) allows us to use this representation for $ab$ if we multiply by other factors. Thus it's clear by induction that

$$(a_1 a_2 \cdots a_n) - (a_1 a_2 \cdots a_n)^\circ = (a_1 - a^\circ_1)(a_2 - a^\circ_2) \cdots (a_n - a^\circ_n).$$

The parity rule is easily checked. In fact, it's essentially the same as the one in ordinary algebra for multiplying pluses and minuses.
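As a purely numerical sanity check of the product identity, the following sketch expands a typical term by the parity rule and compares it with the factored form. It uses ordinary rational numbers as stand-ins for surreal numbers, and the function name `typical_term` and the sample values are hypothetical choices made only for illustration.

```python
from fractions import Fraction
from itertools import product
from math import prod

def typical_term(factors, options):
    """Signed sum over all ways of putting a superscript on at least one factor.

    factors: stand-in values for a_1, ..., a_n
    options: stand-in values for the chosen representatives a_1°, ..., a_n°
    An addend gets a minus sign when an even number of factors are replaced.
    """
    total = Fraction(0)
    for mask in product([False, True], repeat=len(factors)):
        replaced = sum(mask)
        if replaced == 0:
            continue  # at least one factor must carry a superscript
        addend = prod(o if m else f for f, o, m in zip(factors, options, mask))
        total += addend if replaced % 2 == 1 else -addend
    return total

# Hypothetical sample values (rationals standing in for surreal numbers).
factors = [Fraction(3), Fraction(5, 2), Fraction(7, 4)]
options = [Fraction(2), Fraction(3), Fraction(3, 2)]

lhs = prod(factors) - typical_term(factors, options)
rhs = prod(f - o for f, o in zip(factors, options))
print(lhs, rhs)
```

For three factors the loop produces exactly the seven addends that appear in the associative-law computation, with the same signs.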

It's possible to use the above computation to prove the associative law. Of course, one would have to be more cautious with the bracketing before one has the law.

10.2.3 Division

We will define a reciprocal for all $a > 0$. As usual, induction will be used, but the definition is more involved than the ones for the earlier operations. Let $a = A'|A''$ be the canonical representation. One naive attempt would be as follows. We try

$$x = \left\{0, \tfrac{1}{a''}\right\} \,\Big|\, \left\{\tfrac{1}{a'}\right\}$$

where $a'' \in A''$ and $a' \in A' - \{0\}$. (Note that $0 \in A'$ since $a > 0$.) Unfortunately this doesn't work. Although $x$ has some of the properties expected of a candidate for $\frac{1}{a}$, $xa \neq 1$ in general. It turns out that more elements are needed to get a representation for the reciprocal. Roughly speaking, the idea is to insert as many elements into the representation of $x$ as are needed to force the crucial inequalities, i.e. in the standard representation of $ax$ as a product we want the lower elements to be less than one and the upper elements to be greater than one.

What is needed is more complicated. First we define objects $\langle a_1, a_2, \dots, a_n \rangle$ for every finite sequence where $a_i \in A' \cup A'' - \{0\}$. For arbitrary $b$ we define $b \circ a_i$ as the unique solution of $(a - a_i)b + a_i x = 1$. This exists by the inductive hypothesis, which guarantees that $a_i$, as an initial segment of $a$, has an inverse. Uniqueness is automatic. Now let $\langle\rangle = 0$ and $\langle a_1, a_2, \dots, a_{n+1} \rangle = \langle a_1, a_2, \dots, a_n \rangle \circ a_{n+1}$. For example $\langle a_1 \rangle = 0 \circ a_1 = a_1^{-1}$.

We now claim that $a^{-1} = F|G$ where $F = \{\langle a_1, \dots, a_n \rangle : \text{the number of } a_i \in A' \text{ is even}\}$ and $G = \{\langle a_1, \dots, a_n \rangle : \text{the number of } a_i \in A' \text{ is odd}\}$.
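The construction can be illustrated numerically. The sketch below takes $a = 3$ (so every nonzero element of the canonical representation is a lower element) and repeatedly applies $b \mapsto b \circ 2$, starting from the empty sequence with value 0. The helper name `circ` and the choice $a_1 = 2$ are assumptions made for the example; ordinary rationals stand in for surreal numbers.

```python
from fractions import Fraction

def circ(b, a1, a):
    """Return b ∘ a1, the unique x solving (a - a1)*b + a1*x = 1."""
    return (1 - (a - a1) * b) / a1

a = Fraction(3)    # a = {0, 1, 2} | {} : all nonzero a_i are lower elements
a1 = Fraction(2)   # hypothetical choice of a_i used at every step

chain = [Fraction(0)]                      # value of the empty sequence
for _ in range(4):
    chain.append(circ(chain[-1], a1, a))   # append one more entry a1

for n, x in enumerate(chain):
    side = "F" if n % 2 == 0 else "G"      # parity of the number of lower a_i
    print(n, x, side, a * x)
```

The values alternate between the two sides and close in on $\frac{1}{3}$, with $ax < 1$ on the $F$ side and $ax > 1$ on the $G$ side, which is the pair of inequalities the construction is designed to force.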

Theorem 10.39. The surreal numbers form a field.

Proof. We first show that $x \in F \Rightarrow ax < 1$ and $x \in G \Rightarrow ax > 1$. This will show that $F < G$. Since $\langle\rangle \in F$, $\langle\rangle = 0$, and $a \cdot 0 = 0 < 1$, the result is valid for $\langle\rangle$. We now use induction on the length of the finite sequence. In other words it's enough to show that if $b$ has this property so does $x = b \circ a_1$. Now by definition $(a - a_1)b + a_1 x = 1$. Clearly $(a - a_1)b + a_1 b = ab$. Since $a_1 > 0$ it follows that $x > b$ if and only if $1 > ab$. Also it follows from the above identity that $ax = 1 + (a - a_1)(x - b)$.

Now for fixed $a_1$, the map $b \to b \circ a_1$ preserves being in $F$ or $G$ if and only if $a_1$ is upper. For example, $b \in F$ and $a_1 \in A' \Rightarrow b \circ a_1 \in G$. In that case $ab < 1$ by the inductive hypothesis, thus $x > b$. Since $a > a_1$ it follows that $ax = 1 + (a - a_1)(x - b) > 1$. Hence $x \in G$ and $ax > 1$ as desired. The other three cases can be unified by noting that any change in $b$ or $a_1$ from lower to upper leads to a change in $b \circ a_1$ and hence reverses the desired inequality. At the same time, any change in $a_1$ or $b$ reverses the sign of $a - a_1$ and $x - b$ respectively so it also reverses the inequality we actually have. This proves what we desired, so that now we know that $F < G$ and therefore $F|G$ has meaning. Let $F|G = c$.

Finally we compute $ac$. A typical element used in the definition of the product has the form $a_1 c + a c_1 - a_1 c_1$ where $a_1 \in A' \cup A''$ and $c_1 \in F \cup G$. First, $0 \in A'$ and $0 \in F$. Thus we get a lower element in the representation of $ac$ by choosing $a_1 = c_1 = 0$. Hence $0 \cdot c + a \cdot 0 - 0 \cdot 0 = 0$ is a lower element.

Suppose $a_1 = 0$. Then the elements in the definition of the product reduce to $ac_1$, and $ac_1$ is an upper element for $ac$ if and only if $c_1 \in G$. However, we know that $c_1 \in G \Rightarrow ac_1 > 1$ and $c_1 \in F \Rightarrow ac_1 < 1$. Hence if $ac_1$ is an upper element $ac_1 > 1$ and if $ac_1$ is a lower element $ac_1 < 1$.

Now suppose $a_1 \neq 0$. Then $c_1 \circ a_1$ is defined, is contained in $F \cup G$, and satisfies the equation $(a - a_1)c_1 + a_1 x = 1$. Now $a_1 c + a c_1 - a_1 c_1$ is a lower element for $ac$ if and only if $a_1$ and $c_1$ are on the same side of $a$ and $c$ respectively if and only if $c_1 \circ a_1 \in G$ if and only if $c_1 \circ a_1 > c$. (This follows from the earlier statement regarding the map $b \to b \circ a_1$.) Since $c_1 \circ a_1$ satisfies the equation $(a - a_1)c_1 + a_1 x = 1$ and $a_1 > 0$, the inequality $c_1 \circ a_1 > c$ is equivalent to $(a - a_1)c_1 + a_1 c < 1$. The left-hand side of this inequality is nothing but $a_1 c + a c_1 - a_1 c_1$. Hence the lower elements for $ac$ are less than 1 and the upper elements greater than 1. (Note that since $c_1 \circ a_1 \in F \cup G$, it follows that $c_1 \circ a_1 \neq c$ so that the negation of $>$ may be taken to be $<$ in the proof.)

We have shown that in the expression for $ac$, 1 satisfies the betweenness property but 0, being in the lower part, does not. Hence $1 = (+)$ is the number of minimal length satisfying the betweenness property. Therefore $ac = 1$.

Thus we finally know that we have a field. Although we don't need the information, it's of passing interest to note how $b \circ a_1$ varies as a function of $b$ and $a_1$. First, by solving the defining equation for $b \circ a_1$ we get

$$b \circ a_1 = \frac{1 - (a - a_1)b}{a_1} = b + \frac{1 - ab}{a_1}.$$

The first expression implies that $b \circ a_1$ is an increasing function of $b$ if and only if $a < a_1$, and the second expression that $b \circ a_1$ is an increasing function of $a_1$ if and only if $ab > 1$. Hence $b \circ a_1$ is an increasing function of one of the variables if and only if the other variable is upper. At the same time the function preserves sides if and only if the fixed variable is upper. All this can be unified by saying that $b \circ a_1$ gets closer to $c$ if $b$ gets closer to $c$ and if $a_1$ gets closer to $a$. As we already saw in dealing with addition and multiplication, this is essentially what's needed to prove a uniformity theorem. Since the details are routine and since we don't need it, this will not be pursued.

Finally it's recommended to any reader who's confused by the unified arguments to think first in terms of individual cases, e.g. in the above assume $b$ and $a_1$ are upper and regard $b \circ a_1$ as a function of $b$ for fixed $a_1$.

10.2.4 Square root

We assume $a > 0$ and use induction. I.e. we assume that all initial segments of $a$ (they're necessarily non-negative) have square roots. Let $a = A'|A''$ be the canonical representation. Then all elements in $A' \cup A''$ have square roots. Let $H$ be the free groupoid, with product denoted by $\circ$, generated by the elements of $A' \cup A''$. We shall define inductively a partial map $f$ from $H$ into the surreal numbers.

If $b \in A' \cup A''$ then $f(b) = \sqrt{b}$.

If $b, c \in H$, $f(b)$ and $f(c)$ are defined and are not both 0, then

$$f(b \circ c) = \frac{a + f(b) f(c)}{f(b) + f(c)}.$$

(In analogy with this case it was possible to use the concept of free semigroup to deal with division, but we preferred to be more concrete. Here we are stuck with this formalism since we are dealing with non-associative juxtaposition.) By induction $(\forall x)(f(x) \ge 0)$. Furthermore $f(x) > 0$ unless $x = 0$.

$F$ and $G$ are now defined inductively as follows.

If $b \in A'$ then $f(b) \in F$. If $b \in A''$ then $f(b) \in G$. $f(b \circ c) \in G$ if $f(b)$ and $f(c)$ are both in $F$, or both in $G$. If one of $f(b)$ and $f(c)$ is in $F$ and the other in $G$ then $f(b \circ c) \in F$. Since we are not assuming that $f$ is one-one, a priori, it may seem possible that $F$ and $G$ have elements in common. However, we shall prove that $F < G$ which in particular guarantees that $F$ and $G$ are disjoint.
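A small numerical sketch may make the side rule concrete. It takes $a = 2 = \{0, 1\}|\emptyset$, so that $\sqrt{0} = 0$ and $\sqrt{1} = 1$ both start in $F$, and builds a few products with $x \circ y = \frac{a + xy}{x + y}$. The names `circ` and `side_of_product` are hypothetical helpers, and rationals stand in for surreal numbers.

```python
from fractions import Fraction

A = Fraction(2)   # a = 2 = {0, 1} | {}; its lower elements 0 and 1 have roots 0 and 1

def circ(x, y, a=A):
    """x ∘ y = (a + x*y) / (x + y); undefined when x = y = 0."""
    if x + y == 0:
        raise ZeroDivisionError("x ∘ y is undefined when both arguments are 0")
    return (a + x * y) / (x + y)

def side_of_product(side_x, side_y):
    """Side rule from the text: same sides give G, mixed sides give F."""
    return "G" if side_x == side_y else "F"

one = (Fraction(1), "F")
g1 = (circ(one[0], one[0]), side_of_product(one[1], one[1]))   # 1 ∘ 1
f1 = (circ(one[0], g1[0]), side_of_product(one[1], g1[1]))     # 1 ∘ (1 ∘ 1)
g2 = (circ(g1[0], g1[0]), side_of_product(g1[1], g1[1]))       # (1 ∘ 1) ∘ (1 ∘ 1)

for value, side in (one, g1, f1, g2):
    print(value, side, value * value < A)
```

Every value placed in $F$ has square below 2 and every value placed in $G$ has square above 2. Note that $x \circ x = \frac{a + x^2}{2x}$ is the familiar averaging step for approximating a square root.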

Theorem 10.40. Every positive element has a square root.

Proof. We will show that $F|G = \sqrt{a}$. First we show that $x \in F \Rightarrow x^2 < a$ and $x \in G \Rightarrow x^2 > a$. This is clearly true for the square roots of elements of $A' \cup A''$. In order to carry through an induction it's necessary to study the behavior of $x \circ y = \frac{a + xy}{x + y}$ as a function of $x$ and $y$. First

$$x_1 \circ y - x_2 \circ y = \frac{(y^2 - a)(x_1 - x_2)}{(x_1 + y)(x_2 + y)}.$$

Hence, for fixed $y$, $x \circ y$ is an increasing function of $x$ if and only if $y^2 > a$ if and only if $y \in G$, using induction. Also $y \in G$ if and only if the map $x \to x \circ y$ preserves the presence in $F$ or $G$, according to the definition given previously, i.e. preserving sides is equivalent to being an increasing function if one variable is fixed.

Now assume $x_1 \circ x_2 \in G$. First, if $x_1, x_2 \in F$, let $x = \max(x_1, x_2)$ (if $x_1, x_2 \in G$, take $x = \min(x_1, x_2)$ instead). Then, by the above, $x_1 \circ x_2 \ge x \circ x$. Now $x \circ x = \frac{a + x^2}{2x}$.

By the inductive hypothesis, if $x \in F$, then $x^2 < a$ and if $x \in G$, then $x^2 > a$. In either case, $x^2 \neq a$. Hence $(a - x^2)^2 > 0$. Thus $a^2 + 2ax^2 + x^4 > 4ax^2$. Therefore

$$\left(\frac{a + x^2}{2x}\right)^2 = \frac{a^2 + 2ax^2 + x^4}{4x^2} > a.$$

A fortiori $(x_1 \circ x_2)^2 > a$.

Now assume $x \circ y \in F$. Without loss of generality (because $x \circ y$ is symmetric in $x$ and $y$) suppose $x \in F$ and $y \in G$. Then $x^2 < a$ and $y^2 > a$.

Assume first that $xy = a$. Then $x \circ y = \frac{a + xy}{x + y} = \frac{2xy}{x + y}$ and $(x \circ y)^2 = \frac{4xya}{(x + y)^2}$. Clearly $x \neq y$ since $x^2 < a < y^2$; hence $\frac{4xy}{(x + y)^2} < 1$. Therefore $(x \circ y)^2 < a$ as desired.

If $xy \neq a$, then either $x < \frac{a}{y}$ or $x > \frac{a}{y}$. If $x < \frac{a}{y}$, we apply the above to $\frac{a}{y}$ and $y$ to obtain $\left(\frac{a}{y} \circ y\right)^2 < a$. The above applies since membership in $F$ or $G$ is not required in the argument. All we want is that $\left(\frac{a}{y}\right)^2 < a$. (Even the latter isn't really needed since we can get by even if all we have is $\left(\frac{a}{y} \circ y\right)^2 \le a$.) Since $y^2 > a$, $x \circ y$ is an increasing function of $x$ and since $x < \frac{a}{y}$, we have

$$(x \circ y)^2 < \left(\frac{a}{y} \circ y\right)^2 < a.$$

This finally shows that $x \in F \Rightarrow x^2 < a$ and $x \in G \Rightarrow x^2 > a$. Since $x \ge 0$ this shows that $F < G$. Hence $F|G$ has meaning. Let $F|G = c$.

We now compute $c^2$. A typical term in the representation of $c^2$ is $c_1 c + c_2 c - c_1 c_2$. This is lower if and only if $c_1$ and $c_2$ are on the same side of $c$ if and only if $c_1 \circ c_2$ is an upper element, i.e. $c < c_1 \circ c_2 = \frac{a + c_1 c_2}{c_1 + c_2}$ if and only if $c(c_1 + c_2) < a + c_1 c_2$ if and only if $c_1 c + c_2 c - c_1 c_2 < a$.

The argument breaks down if $c_1 = c_2 = 0$ since $c_1 \circ c_2$ is undefined. But this case leads to a lower element which is 0, which is less than $a$. So lower elements are less than $a$ and upper elements greater than $a$, i.e. $a$ satisfies the betweenness property for $c^2$.

Now $0, \sqrt{a'} \in F$. Hence one of the lower elements in the representation of $c^2$ is

$$c\sqrt{a'} + c(0) - \sqrt{a'}(0) = c\sqrt{a'} \ge \sqrt{a'}\sqrt{a'} = a'.$$

Now $\sqrt{a''} \in G$. Hence one of the upper elements is

$$c\sqrt{a''} + c(0) - \sqrt{a''}(0) = c\sqrt{a''} \le \sqrt{a''}\sqrt{a''} = a''.$$

By cofinality theorem i (theorem 10.24), $c^2 = a$.

Note that as in the case of division, what we did was to insert just enough terms into $F$ and $G$ in order to force the betweenness condition. Again, just as in the case of division, we have what's needed to prove a uniformity theorem.

10.3 Real numbers and ordinals

The main task of this section is to show that the surreal numbers contain both the real numbers and the ordinals. (The distinction as to whether the surreal numbers contain the real numbers or a set isomorphic to the real numbers is very much like the distinction as to whether the Call of Cthulhu was written by Lovecraft or someone else of the same name.) Along the way we shall see the explicit representation of ordinary numbers as sequences of pluses and minuses.

10.3.1 Integers

So far we know that the additive identity 0 is the empty sequence and that the multiplicative identity 1 is the sequence (+). Since the surreal numbers form an ordered field, the expression

$$\underbrace{1 + 1 + \dots + 1}_{n \text{ times}}$$

may be identified with the positive integer $n$. We now have the following result, which is consistent with one's heuristic expectations.

Theorem 10.41. The positive integer $n$ is the sequence

$$(\underbrace{+ + \dots +}_{n \text{ times}}).$$

Proof. We use complete induction, i.e. suppose the theorem is true for all integers $m \le n$. Then

$$\underbrace{1 + 1 + \dots + 1}_{n + 1 \text{ times}} = (\underbrace{1 + 1 + \dots + 1}_{n \text{ times}}) + 1 = (\underbrace{+ + \dots +}_{n \text{ times}}) + (+)$$

by the inductive hypothesis. By applying cofinality to the canonical representation we know that

$$(\underbrace{+ + \dots +}_{n \text{ times}})$$

may be expressed as

$$\{(\underbrace{+ + \dots +}_{n - 1 \text{ times}})\} \,|\, \emptyset = \{n - 1\} \,|\, \emptyset$$

by the inductive hypothesis. $(+)$ is clearly $\{0\}|\emptyset$. Hence by definition

$$n + 1 = \{(n - 1) + 1,\ n + 0\} \,|\, \emptyset = \{n\} \,|\, \emptyset = \{(\underbrace{+ + \dots +}_{n \text{ times}})\} \,|\, \emptyset$$

which is

$$(\underbrace{+ + \dots +}_{n + 1 \text{ times}})$$

again by applying cofinality to the canonical representation of

$$(\underbrace{+ + \dots +}_{n + 1 \text{ times}}).$$

Note that the symbol + has been used in two different senses, once for addition (denoted +), and once as one of the symbols used in the ordinal sequences which we consider (denoted +). This happens also in ordinary algebra where, for example, in + (a + b) the two pluses have different meanings. However, the meanings are related in such a way that it's convenient in practice to use the same symbol for each. In our case the use of the symbols + and − is consistent with the intuitive feeling of order, i.e. plus is above zero which is above minus. In any case, the meaning should be clear from the context.

Corollary 10.42. The negative integer $-n$ is the sequence

$$(\underbrace{- - \dots -}_{n \text{ times}}).$$

Proof. This is an immediate consequence of the theorem and the formula for the additive inverse obtained previously.

10.3.2 Dyadic fractions

Since the class of surreal numbers contains the rational numbers, it seems natural to consider them next and even to conjecture that the rational numbers correspond to finite sequences of pluses and minuses. Since 0 = ( ) < (+−) < (+) = 1, it's natural to conjecture that (+−) = $\frac{1}{2}$. A heuristic guess for (+−−) would be a toss-up between $\frac{1}{3}$ and $\frac{1}{4}$. Actually (+−) = $\frac{1}{2}$ and (+−−) = $\frac{1}{4}$.

It turns out that the finite sequences correspond to the dyadic fractions, i.e. rationals of the form $\frac{a}{2^n}$. Although they form a proper subset of the rationals, they are dense in the reals. Thus they can be used just as well as the rationals as building blocks later in developing the reals.

Lemma 10.43. If $\{2a\}|\{2b\} = a + b$ then $\{a\}|\{b\} = \frac{a + b}{2}$.

Proof. Let $\{a\}|\{b\} = c$. Then $2c = c + c = \{a + c\}|\{b + c\}$ by definition of addition. We show that the latter is $a + b$ by cofinality. First, $a < c < b$. Hence $a + c < a + b < b + c$ which is the betweenness property. Now $\{2a\}|\{2b\} = a + b$. Also it follows from $a < c < b$ that $2a < a + c$ and $b + c < 2b$. Thus we have the cofinality property. So $2c = a + b$; therefore

$$\{a\}|\{b\} = \frac{a + b}{2}.$$

The above is the key lemma for dealing with dyadic fractions. For example $1 = \{0\}|\emptyset = \{0\}|\{2\}$ by cofinality. Hence the hypothesis of lemma 10.43 is satisfied if $a = 0$ and $b = 1$. Hence $\{0\}|\{1\} = \frac{1}{2}$. So $\frac{1}{2}$ = {( )}|{(+)} = (+−). This says that the hypothesis of lemma 10.43 is valid for $a = 0$ and $b = \frac{1}{2}$. Hence $\{0\}|\{\frac{1}{2}\} = \frac{1}{4}$. So $\frac{1}{4}$ = {( )}|{(+−)} = (+−−). This sets up an induction, but only numbers of the form $\frac{1}{2^n}$ will be reached. However, we also have, for example, $3 = \{0, 1, 2\}|\emptyset = \{2\}|\{4\}$ by cofinality. Again, by the lemma we obtain $\frac{3}{2}$ = {1}|{2} = {(+)}|{(++)} = (++−).

We now show that this process enables us to determine the rational number which corresponds to an arbitrary surreal number of finite length.
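The examples above can be replayed with a short sketch. It assumes two things that are stated elsewhere in the article rather than here: that $\{a\}|\{b\}$ is the number of shortest length strictly between $a$ and $b$, and that finite sign sequences are valued by the rule of theorem 10.44 below. The function name `simplest_between` is hypothetical, and the walk is only meant for rational end points.

```python
from fractions import Fraction

def simplest_between(a, b):
    """Value of the shortest sign sequence strictly between a and b (a < b).

    The walk starts at the empty sequence (value 0) and appends a plus while
    the current value is too small, a minus while it is too large. Steps are
    worth 1 until the first change of sign and are halved from then on.
    """
    if not a < b:
        raise ValueError("need a < b")
    value, step = Fraction(0), Fraction(1)
    first_sign, changed = 0, False
    while not (a < value < b):
        sign = 1 if value <= a else -1
        if first_sign == 0:
            first_sign = sign
        elif sign != first_sign:
            changed = True
        if changed:
            step /= 2
        value += sign * step
    return value

print(simplest_between(0, 1))               # {0}|{1}
print(simplest_between(0, Fraction(1, 2)))  # {0}|{1/2}
print(simplest_between(1, 2))               # {1}|{2}
print(simplest_between(0, 2))               # {0}|{2}
```

For consecutive integers or consecutive multiples of a power of two the walk lands on the midpoint, as lemma 10.43 predicts, whereas $\{0\}|\{2\}$ and $\{2\}|\{4\}$ give the integers 1 and 3 used as hypotheses above.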

Theorem 10.44. Surreal numbers of finite length correspond to dyadic fractions. Specifically, let $d$ be a surreal number of length $m + n$ which satisfies $i, j < m \Rightarrow d(i) = d(j)$ and $d(m) \neq d(0)$. Let

$$b(i) = \begin{cases} 1 & \text{if } i < m \text{ and } d(i) = +; \\ -1 & \text{if } i < m \text{ and } d(i) = -; \\ \frac{1}{2^{i - m + 1}} & \text{if } i \ge m \text{ and } d(i) = +; \\ -\frac{1}{2^{i - m + 1}} & \text{if } i \ge m \text{ and } d(i) = -. \end{cases}$$

Then

$$d = \sum_{i=0}^{m+n-1} b(i).$$

Remark 10.45. The above says informally that a plus is counted as a 1 and a minus as a −1 until a change in sign occurs, at which point the sequence of pluses and minuses is treated like a binary decimal (with 1 and −1 rather than 1 and 0). For example +++−+− is $3 - \frac{1}{2} + \frac{1}{4} - \frac{1}{8} = 2 + \frac{5}{8}$.
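The rule in the theorem translates directly into a few lines of code. The sketch below is an illustration only: the function name `sign_sequence_value` is hypothetical, and sequences are written as strings over the ASCII characters `+` and `-`.

```python
from fractions import Fraction

def sign_sequence_value(signs):
    """Sum of b(i) over a finite sign sequence, following theorem 10.44."""
    if any(ch not in "+-" for ch in signs):
        raise ValueError("a sign sequence may only contain '+' and '-'")
    # m is the length of the initial run of equal signs.
    m = 0
    while m < len(signs) and signs[m] == signs[0]:
        m += 1
    total = Fraction(0)
    for i, ch in enumerate(signs):
        unit = Fraction(1) if i < m else Fraction(1, 2 ** (i - m + 1))
        total += unit if ch == "+" else -unit
    return total

print(sign_sequence_value("+++-+-"))   # the example of remark 10.45
print(sign_sequence_value("+--"))
print(sign_sequence_value("---"))
```

The empty string gives 0, in line with the additive identity being the empty sequence.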

Proof. Let $d(0) = +$. A similar argument applies if $d(0) = -$. (As an alternative one can take the result for $d(0) = +$ and take the negative of both sides.)

The case $n = 0$, which is the case where there's no change in sign, is essentially the statement in theorem 10.41.

We do the case $n = 1$ individually since this case is special. Here we have $d(m) = -$. The sequence consists of $m$ pluses followed by a minus. To avoid confusion recall that the ordinals begin with 0 and the length is the least ordinal for which $d$ is not defined. (This may seem unnatural in the finite case, but is required if one wants a general definition.) In any case, the two unnatural conventions cancel so that the length is really the number of terms in the sequence!

Of course $m \ge 1$ (since $d(m) \neq d(0)$). Now $2m - 1 = \{0, 1, 2, \dots, 2m - 2\}|\emptyset$. This is the canonical representation by theorem 10.41. By cofinality we obtain $2m - 1 = \{2m - 2\}|\{2m\}$.

We can now apply lemma 10.43 with $a = m - 1$ and $b = m$ to conclude that $\{m - 1\}|\{m\} = m - \frac{1}{2}$. It's easy to see directly from the definition that $\{m - 1\}|\{m\} = d$. $m - 1$ consists of $m - 1$ pluses, and $m$ of $m$ pluses. Any surreal number between $m - 1$ and $m$ must by the lexicographical order begin with $m$ pluses followed by a minus, i.e. have $d$ as an initial segment. Hence $d = m - \frac{1}{2}$, which is exactly what the theorem says.

We now use induction on $n$. Assume that the theorem is true for all $n \le r$ and let $n = r + 1$.

We note that an immediate application of induction to lemma 10.43 shows that

$$\{2a\}|\{2b\} = a + b \Rightarrow \left\{\frac{a}{2^s}\right\} \Big| \left\{\frac{b}{2^s}\right\} = \frac{a + b}{2^{s+1}}$$

for all positive integers $s$. We already noted that the hypothesis is valid for consecutive integers $c$ and $c + 1$. Hence

$$\left\{\frac{c}{2^s}\right\} \Big| \left\{\frac{c + 1}{2^s}\right\} = \frac{c}{2^s} + \frac{1}{2^{s+1}}.$$

Let $d'$ be the initial segment of $d$ of length $m + r$. Then $d = d'+$ or $d'-$ since $d$ has length $m + n = m + r + 1$. Assume $d = d'+$. A similar argument applies if $d = d'-$. Since the case $n = 1$ has been done separately, we can assume that $r \ge 1$, i.e. that $d'$ begins with $m$ pluses followed by a minus.

Now let $d = F|G$ be the canonical representation. $F$ and $G$ are finite, so by cofinality $d$ has the form $\{x\}|\{y\}$ where $x$ is the largest element in $F$ and $y$ is the smallest element in $G$. Since $d = d'+$, clearly $x = d'$. We can't be as explicit with $y$, since in this general situation we have very little information about the minuses in $G$. We know that there is a minus after $m$ pluses; hence $y \le m$. Also, we can apply the inductive hypothesis to $x$ and $y$. Hence $d' = x = \frac{c}{2^r}$ for some integer $c$. If we can show that $y = \frac{c + 1}{2^r}$ then we can apply the above formula to obtain

$$d = \{x\}|\{y\} = \frac{c}{2^r} + \frac{1}{2^{r+1}}$$

which is exactly what we need to prove the theorem.

Our apparent lack of control over $y$ will be overcome by the box principle. Let $H$ be the set of all surreal numbers of length not greater than $m + r$ that begin with $m$ pluses followed by a minus. The cardinality of $H$ is $1 + 2 + 2^2 + \dots + 2^{r-1} = 2^r - 1$. By the inductive hypothesis every element of $H$ is of the form $\frac{k}{2^r}$ for some integer $k$ and by the lexicographical order is strictly between $m - 1$ and $m$. Since there are precisely $2^r - 1$ such numbers, by the box principle every number of the form $\frac{k}{2^r}$ strictly between $m - 1$ and $m$ is in $H$. In particular, $\frac{c + 1}{2^r} \in H$ unless $\frac{c + 1}{2^r} = m$. In either case $\ell\left(\frac{c + 1}{2^r}\right) \le m + r$.

Now $\frac{c + 1}{2^r} > \frac{c}{2^r} = d'$. Since $d = d'+$ it follows from the lexicographical order that $\frac{c + 1}{2^r} > d$. Now $G \subset H \cup \{m\}$. Hence every element of $G$ has the form $\frac{k}{2^r}$. Since $\frac{c}{2^r} = d' < d$ and $d < G$, $\frac{c + 1}{2^r}$ is a lower bound to $G$. In fact, $\frac{c + 1}{2^r}$ is actually an initial segment of $d$. Otherwise, by considering the common initial segment of $d$ and $\frac{c + 1}{2^r}$ we obtain an element of $G$ below $\frac{c + 1}{2^r}$ which is a contradiction. Since $\frac{c + 1}{2^r} > d$, it follows that $\frac{c + 1}{2^r} \in G$ and hence is the least element of $G$, i.e. $y = \frac{c + 1}{2^r}$.

During the proof we showed that all dyadic fractions are obtained this way. Also it's easy to see how to express a dyadic fraction constructively as a sequence. Heuristically speaking, we always go in the right direction to close in on the fraction. For example, consider $2 + \frac{3}{8}$. Since $2 < 2 + \frac{3}{8} < 3$, we begin with +++−. This is $2 + \frac{1}{2}$. Since $2 + \frac{3}{8} < 2 + \frac{1}{2}$ we want a minus next. Now +++−− is $2 + \frac{1}{4}$ so we need a +. Finally +++−−+ hits what we desire on the nose. More formally, one can set up an elementary induction. We assume that all fractions of the form $\frac{a}{2^n}$ with $a$ odd correspond to sequences of length $m + n$. Consider $\frac{2b + 1}{2^{n+1}}$. Precisely one of $b$ and $b + 1$ is odd.

$$\frac{2b + 1}{2^{n+1}} = \frac{b}{2^n} + \frac{1}{2^{n+1}} = \frac{b + 1}{2^n} - \frac{1}{2^{n+1}}.$$

Thus depending on which is odd we take the corresponding sequence of length $m + n$ and juxtapose a plus or minus, e.g. if $b$ is odd we juxtapose a plus to the sequence of $\frac{b}{2^n}$. Note the lack of choice. Since $b + 1$ is even in that case, $\frac{b + 1}{2^n}$ is a sequence of length less than $m + n$. Tacking on a minus thus does not give $\frac{b + 1}{2^n} - \frac{1}{2^{n+1}}$. This is, of course, what we expect from the beginning since the numbers are sequences and not equivalence classes of sequences. Anyway, whatever appearance there may be of choice, it's clearly deceptive.
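The heuristic of always stepping towards the target can be written as a short loop. The following is a sketch under the valuation rule of theorem 10.44; the function name `dyadic_to_signs` is hypothetical, and the sequence is returned as a string over the ASCII characters `+` and `-`.

```python
from fractions import Fraction

def dyadic_to_signs(q):
    """Sign sequence of a dyadic fraction q, built by closing in on q."""
    q = Fraction(q)
    if q.denominator & (q.denominator - 1):
        raise ValueError("q must be a dyadic fraction")
    signs = []
    value, step, changed = Fraction(0), Fraction(1), False
    while value != q:
        sign = "+" if q > value else "-"
        if signs and sign != signs[0]:
            changed = True          # first change of sign: start halving
        if changed:
            step /= 2
        value += step if sign == "+" else -step
        signs.append(sign)
    return "".join(signs)

print(dyadic_to_signs(Fraction(19, 8)))   # 2 + 3/8, the example in the text
print(dyadic_to_signs(Fraction(3, 2)))
print(dyadic_to_signs(Fraction(-3, 4)))
```

At every step exactly one sign moves the value towards the target, which mirrors the remark about the lack of choice. The loop terminates only because the denominator is a power of two, hence the explicit check.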

The whole idea of a shift from ordinary counting to a binary decimal computation at the first change in sign may seem unnatural at first. However, such phenomena seem inevitable in a sufficiently rich system.

It's an amusing exercise in arithmetic to add numbers in this form. Carrying exists as usual but since we deal with pluses and minuses and don't have zeros, an adjustment is necessary if we would otherwise obtain 0 in a place. Specifically, 0+ must be replaced by +−. Also, one must be aware of the dividing line where the shift from ordinary counting to the binary decimal computation occurs.

Finally, a suitable succinct way of expressing the result of arithmetical operations on dyadic fractions given in the above form may be useful in studying certain problems in the theory of surreal numbers. Although the dyadic fractions look like a drop in the ocean of surreal numbers, they form an important building block.

10.3.3 Real numbers

At this stage we develop a theory which is somewhat analogous to that of Dedekind cuts. However, there is at least one important difference. All the objects and operations are already present as a subsystem of the surreal numbers. The analogy arises because of the theory in subsection 10.1. First, roughly speaking, the fundamental existence theorem (theorem 10.4) gives a well-defined element for every cut. Secondly, the canonical representation gives a natural cut associated with any element. This correspondence, of course, occurs at every stage, although the resemblance to Dedekind cuts is closer in some stages than others.

Definition 10.46. A real number is a surreal number $a$ which is either of finite length or is of length $\omega$ and satisfies

$$(\forall n_0)(\exists n_1)(\exists n_2)\{(n_1 > n_0) \wedge (n_2 > n_0) \wedge (a(n_1) = +) \wedge (a(n_2) = -)\}.$$

In other words, the definition simply requires that the terms of the sequence $a(n)$ of pluses and minuses don't eventually have constant signs. This is analogous to the situation for ordinary decimals where one might rule out an eventual sequence of nines to ensure that each number has a unique representation. In our case a sequence consisting eventually of pluses will be a surreal number outside the set of reals.
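To see what a sign sequence of length $\omega$ looks like in practice, one can compute a finite prefix for a non-dyadic rational such as $\frac{1}{3}$. The sketch below reuses the closing-in walk for dyadic fractions and simply stops after a fixed number of signs; the function name `sign_prefix` is hypothetical, and the code illustrates the definition only, not the construction of the reals in this section.

```python
from fractions import Fraction

def sign_prefix(q, n):
    """First n signs of the sequence closing in on the rational q.

    The walk stops early if q is reached exactly, which happens precisely
    when q is a dyadic fraction with a sequence of length at most n.
    """
    q = Fraction(q)
    signs = []
    value, step, changed = Fraction(0), Fraction(1), False
    while len(signs) < n and value != q:
        sign = "+" if q > value else "-"
        if signs and sign != signs[0]:
            changed = True          # first change of sign: start halving
        if changed:
            step /= 2
        value += step if sign == "+" else -step
        signs.append(sign)
    return "".join(signs)

print(sign_prefix(Fraction(1, 3), 8))
print(sign_prefix(Fraction(1, 2), 8))
```

For $\frac{1}{3}$ the signs keep alternating after the third place, so neither sign is eventually constant, whereas for the dyadic fraction $\frac{1}{2}$ the walk stops after two signs.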

To show that the set of real numbers forms a field, it suffices to check the closure properties. However, it's convenient to have several lemmas in order to carry this through.

Note first that the distinction between surreal numbers of length $\omega$ which are real and those which aren't can be expressed in terms of the canonical representation. If $a = F|G$ is the canonical representation, then $a$ is real precisely when $F$ and $G$ are non-empty, $F$ has no maximum and $G$ has no minimum; e.g. if there is a last plus, then $F$ has a maximal element. (It's clear from what we already know that this element is infinitesimally close to $a$ but this issue will not be pursued now.)

The elements of $F \cup G$ have finite length so they're dyadic fractions.

Lemma 10.47. Let $F$ and $G$ be non-empty sets of dyadic fractions such that $F < G$, $F$ has no maximum element, and $G$ has no minimum. Then $F|G$ is a real number.

Proof. $F|G$ exists by the fundamental existence theorem (theorem 10.4). Let $F|G = a$. By theorem 10.20, $\ell(a) \le \omega$. It suffices to rule out the possibility that $\ell(a) = \omega$ and $a$ has eventually constant sign. Suppose that the constant sign is +. (A similar argument applies if it's −.) The possibility that $a$ consists exclusively of pluses is ruled out since $a < G$ and $G$ is nonempty.

Now suppose $a(n_0) = -$ but $n > n_0 \Rightarrow a(n) = +$. Let $b$ be the initial segment of $a$ of length $n_0$. Then $b > a$. By the inverse cofinality theorem (theorem 10.27), $(\exists c \in G)(c \le b)$ since $b$ is an upper element in the canonical representation of $a$. Since $G$ has no minimum, $\exists d \in G : d < c$. Since $b$, $c$, and $d$ are all dyadic fractions we can choose $m$ so that $d \le b - \frac{1}{2^m}$.

Let $c'$ be an initial segment of $a$ of length $n$ for some $n > n_0$. Then $c'$ is a lower element. Also, $c' - b$ has the form

$$-\frac{1}{2^r} + \frac{1}{2^{r+1}} + \dots + \frac{1}{2^{r+s}} = -\frac{1}{2^{r+s}}$$

for some $r$ and $s$. (Actually $s = n - n_0 - 1$.) In any case, for $n$ sufficiently high, $c' > b - \frac{1}{2^m}$. Hence $a > c' > b - \frac{1}{2^m}$. Also $a < d \le b - \frac{1}{2^m}$. Thus we have the desired contradiction.

Lemma 10.48. Let $a = F|G$. Suppose that $(\forall x \in F)(\exists \text{ a positive dyadic fraction } r)(\exists y)(y \ge x + r \wedge y \in F)$ and $(\forall x \in G)(\exists \text{ a positive dyadic fraction } r)(\exists y)(y \le x - r \wedge y \in G)$. Also let $F' < a < G'$. Suppose that $(\forall \text{ positive dyadic } r)(\exists x \in F')(\exists y \in G' : y - x \le r)$. Then $a = F'|G'$.

Proof. It's enough to check the cofinality. We do this for $F$. The case for $G$ is similar. Suppose $x \in F$. Choose $y$ and $r$ such that $r$ is positive dyadic, $y \ge x + r$ and $y \in F$. For this $r$ choose $w \in F'$ and $z \in G'$ such that $z - w \le r$. Since $F < a$ we have $a > y \ge x + r$. Since $G' > a$ we have $w \ge z - r > a - r \ge x$. Since $w \in F'$ this proves cofinality.

Note that the hypothesis doesn't require any of the sets to consist only of dyadic fractions.

Lemma 10.49. There are infinitely many dyadic fractions between any two distinct real numbers a and b.

Proof. It clearly suffices to obtain one dyadic fraction. If neither a nor b is dyadic then the common segment works. More generally, if neither a nor b is an initial segment of the other, we can use the common segment. Suppose a is a proper initial segment of b. (a is necessarily dyadic.) If b is dyadic the result is trivial. Otherwise consider the canonical representation F|G of b. Then a ∈ F ∪ G. Since F has no maximum and G no minimum we obtain a dyadic fraction between a and b whether a ∈ F or a ∈ G.
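Lemma 10.49 can be checked numerically for rational endpoints. The sketch below does not follow the sign-sequence argument of the proof; it assumes ordinary rational arithmetic and simply picks a denominator 2^m fine enough to separate the two numbers. The name dyadic_between is a hypothetical helper.

```python
from fractions import Fraction
import math


def dyadic_between(a: Fraction, b: Fraction) -> Fraction:
    """Return a dyadic fraction strictly between two distinct rationals."""
    lo, hi = min(a, b), max(a, b)
    if lo == hi:
        raise ValueError("the two numbers must be distinct")
    m = 0
    while Fraction(1, 2**m) >= hi - lo:
        m += 1                      # refine until the grid spacing 1/2^m is below hi - lo
    k = math.floor(lo * 2**m) + 1   # smallest grid point strictly above lo
    return Fraction(k, 2**m)
```

Since the grid point found is at most lo + 1/2^m, it stays below hi. Repeating the search between the result and either endpoint produces further dyadic fractions, which is why one suffices.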

Remark 10.50. Note that if a has a last plus and b is the initial segment of a which stops just before the last plus, then there’s no dyadic fraction between a and b. Thus the requirement in the lemma that the numbers be real is essential.

Lemma 10.51. Let a = F|G be the canonical representation of a real number a which is not a dyadic fraction. Then for all positive dyadic r there exist b ∈ F, c ∈ G such that c − b ≤ r.

Proof. Since there’s no last + and no last − in a, for all n there are elements b ∈ F, c ∈ G which agree in the first n terms. Thus c − b is bounded above by an expression of the form
\[ \frac{1}{2^s} + \frac{1}{2^{s+1}} + \dots + \frac{1}{2^{s+n}} \le \frac{1}{2^{s-1}}. \]
Since s can be made arbitrarily large by a suitable choice of n, this proves the lemma.

Note that it’s easy to see that the requirement that a be real can be relaxed but this is of no special concern.

We are now ready to check the closure properties.

i. Addition. Let a and b be real numbers. If both are dyadic fractions then so is the sum. Suppose a is a dyadic fraction and b isn’t. Let $a = F \mid G$ and $b = F' \mid G'$ be the canonical representations. Then
\[ a + b = \{a' + b,\ a + b'\} \mid \{a'' + b,\ a + b''\}. \]
We claim that numbers of the form $a + b'$ are cofinal on the left. Consider a number of the form $a' + b$. Since $a$, $a'$ are dyadic fractions and $a > a'$, $a - a'$ is a positive dyadic fraction. By lemma 10.51, $(\exists b' \in F')(\exists b'' \in G')(b'' - b' \le a - a')$. Then $a + b' \ge a' + b'' \ge a' + b$. Similarly, we can see that numbers of the form $a + b''$ are cofinal on the right. By cofinality theorem i (theorem 10.24), $a + b = \{a + b'\} \mid \{a + b''\}$. Note that $a + b'$ and $a + b''$ are dyadic fractions. Also since $F' = \{b'\}$ has no maximum, neither does $\{a + b'\}$. Similarly $\{a + b''\}$ has no minimum. By lemma 10.47, a + b is a real number.

Now suppose neither a nor b is a dyadic fraction. Again let
\[ a + b = \{a' + b,\ a + b'\} \mid \{a'' + b,\ a + b''\}. \]
As before it’s clear that the left elements have no maximum and the right elements no minimum. More specifically, since the numbers $a'$, $a''$, $b'$, and $b''$ are all dyadic, the above representation of a + b satisfies the first condition of lemma 10.48. Now let $F = \{a' + b'\}$ and $G = \{a'' + b''\}$. Then F < a + b < G. Also lemma 10.51 together with the usual ε/2 argument shows that F and G satisfy the other conditions of lemma 10.48. Hence a + b = F|G. Finally, F and G satisfy the hypothesis of lemma 10.47 so a + b is real. 0 and 1 are clearly real. The additive inverse of a real is a real since the definition is symmetric in pluses and minuses.

ii. Multiplication. Let a and b be real numbers. If both are dyadic fractions then so is the product. Now let a be a real number which is not a dyadic fraction. There is no restriction on b. A typical element in the representation of ab is
\[ c = ab - (a - a^0)(b - b^0). \]
(Recall that, e.g., $a^0$ is either of the form $a'$ or $a''$.) Since a is real we can choose $a^0_1$ which is on the same side of a as $a^0$ and also closer, to obtain the element
\[ c_1 = ab - (a - a^0_1)(b - b^0). \]
Then
\[ c_1 - c = (a^0_1 - a^0)(b - b^0). \]
Now by applying lemma 10.49 to an arbitrary positive real r and 0 we obtain a positive dyadic fraction d satisfying d < r. For a product of positive reals $r_1$ and $r_2$ we obtain in this manner the positive dyadic fraction $d_1 d_2$ satisfying $d_1 d_2 < r_1 r_2$. Thus we have what we need to show that the first condition of lemma 10.48 is satisfied for the representation of ab. The choice of the bounding sets required by lemma 10.48 would depend on the signs of a and b; however it suffices to assume that a, b > 0.

Let

A = {a₁b₁ : (a₁ is dyadic ∧ 0 ≤ a₁ < a) ∧ (b₁ is dyadic ∧ 0 ≤ b₁ < b)}

and

B = {a₁b₁ : (a₁ is dyadic ∧ a < a₁) ∧ (b₁ is dyadic ∧ b < b₁)}.

(This formulation is convenient since we can obtain the required conclusion if b is a dyadic fraction even though lemma 10.51 doesn’t apply in that case.) For arbitrary positive dyadic r we can choose dyadic a₁, a₂, b₁, and b₂ with a₁ < a < a₂ and b₁ < b < b₂ such that a₂ − a₁ ≤ r and b₂ − b₁ ≤ r. The usual argument proving that the limit of a product is the product of the limits shows that the other conditions of lemma 10.48 are satisfied. Note that we can use the fact that for every real number a there exists an integer n such that |a| ≤ n. This is clear since in the canonical representation of a as F|G, both F and G are nonempty. (Incidentally, for the surreal number of length ω which consists solely of pluses the argument would break down.)

Hence by lemma 10.48 ab = A|B. Again, as in the case of addition, A and B satisfy the hypothesis of lemma 10.47, so that ab is real.

Note that during the proofs of closure we obtained nice representations of sums and products of reals with the help of lemma 10.48. Since such intuitive-looking representations fail in general, it’s interesting that they work in the special case of real numbers.

iii. Reciprocals. It’s best to ignore the previous construction of reciprocals. It suffices to consider the case where a is a real number larger than 0. Let

F = {d : (d is a dyadic fraction) ∧ (da < 1)}

and

G = {d : (d is a dyadic fraction) ∧ (da > 1)}.

Clearly F < G, F and G are nonempty, and 0 ∈ F. Also by lemma 10.49 we can find m so that $a > \frac{1}{2^m}$. Hence $2^m a > 1$.

We now show that F has no maximum. Suppose that d is a dyadic fraction satisfying da < 1. Then da − 1 is a real number because of the closure properties, and 1 − da > 0. By lemma 10.49 one can find m so that $1 - da > \frac{1}{2^m}$. Now we

r < d, contradicting the fact that F < r. Finally, suppose s < r and s is an upper bound to H. Let s < d < r for some dyadic d. Then d is also an upper bound to H. Hence d ∈ G. But d < r. This contradicts the fact that r < G.

Theorem 10.52. The real numbers form an ordered field with the least upper bound property (i.e. they are essentially the same as the reals defined in more traditional ways).

We close this section with several remarks. Let a be a real number which isn’t a dyadic fraction, thus a has length ω. If a = F|G is the canonical representation, we know that F < a < G and that the elements of F form an increasing sequence and that the elements of G form a decreasing sequence. Since F has no maximum and G no minimum, both sequences are infinite. It’s clear from this and from lemma 10.51 that if $a_n$ is the initial segment of a of length n then $\lim_{n \to \infty} a_n = a$. Of course, in the definition of a limit it makes no difference whether ε is taken to be real, rational, or dyadic; however we certainly can’t use general surreal numbers.
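The convergence of initial segments can be watched on a concrete rational. The sketch below generates the first n signs of a rational number by walking from 0 towards it (unit steps until the first sign change, halving afterwards) and evaluates the resulting initial segment. Both function names are illustrative, the walking rule is assumed to agree with the earlier treatment of dyadic fractions, and '-' is the ASCII stand-in for the minus sign.

```python
from fractions import Fraction


def sign_prefix(x: Fraction, n: int) -> str:
    """First n signs of x; shorter if x is dyadic and is reached earlier."""
    signs = []
    current, step, turned = Fraction(0), Fraction(1), False
    while len(signs) < n and current != x:
        sign = '+' if x > current else '-'
        if signs and sign != signs[-1]:
            turned = True              # halving starts at the first sign change
        if turned:
            step /= 2
        current += step if sign == '+' else -step
        signs.append(sign)
    return ''.join(signs)


def value(signs: str) -> Fraction:
    """Dyadic fraction denoted by a finite sign sequence."""
    current, step, turned, prev = Fraction(0), Fraction(1), False, None
    for s in signs:
        if prev is not None and s != prev:
            turned = True
        if turned:
            step /= 2
        current += step if s == '+' else -step
        prev = s
    return current
```

For 1/3 the signs never become constant, and the initial segments 1, 1/2, 1/4, 3/8, 5/16, 11/32, ... approach 1/3 from alternating sides with an error below the last step taken.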

Theorem 10.53. The following three conditions are equivalent.

i. a is a real number (by our definition);

ii. for some integer n, −n < a < n, and a has no initial segment $a_\alpha$ such that $|a - a_\alpha|$ is infinitesimal;

iii. for some integer n, −n < a < n, and
\[ a = \{a - 1,\ a - \tfrac{1}{2},\ a - \tfrac{1}{3},\ \dots\} \mid \{a + 1,\ a + \tfrac{1}{2},\ a + \tfrac{1}{3},\ \dots\}. \]

Remark 10.54. An element a is infinitesimal if it’s nonzero but |a| < r for every positive rational r. It’s clear, for example, from what we already know that the element a of length ω which begins with + and then after that consists only of minuses is an infinitesimal. The heuristic idea in statement iii is that of the possibility of writing a in the form F|G without forcing either F or G to be too close to a. Of course, by cofinality theorem i (theorem 10.24), if a = F|G then one still gets a if F and G are both enlarged to include elements closer to a. The challenge lies in the opposite direction. How far away can F and G be from a and still have a = F|G? As a rough rule of thumb, the larger the length of a, the closer F and G must be to a.

Proof. i⇒ii. Since all initial segments are dyadic fractions, this is clear.

¬i⇒¬ii. (This contrapositive notation seems natural here even if it may be unusual.) Suppose that a is not real. Then a has length at least ω. If a(n) has a fixed sign for all n less than ω, then the condition −n < a < n fails, as is clear from the ordering, so that case is clear. Now let $a_\omega$ be the initial segment of a of length ω. Assume first that $a_\omega$ consists eventually of pluses only. (A similar argument will apply if $a_\omega$ consists eventually of minuses only.) Then the argument is essentially the same as the one used in the proof of lemma 10.47. We let $a_n$ be the initial segment of $a_\omega$ obtained by stopping just before the last minus. (We already ruled out the case where $a_\omega$ contains no minuses.) Then $a_n > a$. For all positive rational r we can find m sufficiently high such that the initial segment $a_m$ of a of length m satisfies $a_m < a$ and $a_n - a_m < r$. We use a computation of the form
\[ \frac{1}{2^s} + \frac{1}{2^{s+1}} + \frac{1}{2^{s+2}} + \dots \]
Hence $a_n - a < r$ for all r. Since n is fixed, $|a_n - a|$ is infinitesimal. Note that this part of the argument is independent of whether a is the same as $a_\omega$ or not.

Now assume that $a_\omega$ doesn’t eventually have constant sign. Since a isn’t real, $a \ne a_\omega$, i.e. $a_\omega$ is a proper initial segment of a. Let r be an arbitrary positive rational. Suppose $a_\omega = F \mid G$. Then by lemma 10.51 there exist b ∈ F and c ∈ G such that c − b ≤ r. Now $b < a_\omega < c$. It’s also clear that b < a < c by the lexicographical ordering. (In fact, b and c occur in the canonical representation of a as well as $a_\omega$.) Hence $|a_\omega - a| < r$. Therefore $|a_\omega - a|$ is infinitesimal.

ii⇒iii. Express a canonically as F|G. Then the conclusion is immediate by cofinality theorem i (theorem 10.24). Given $a' \in F$, $a - a'$ is not infinitesimal. Choose n so that $a - a' > \frac{1}{n}$. Then $a - \frac{1}{n} > a'$. A similar argument applies to $a'' \in G$.

iii⇒ii. Let
\[ a = \{a - 1,\ a - \tfrac{1}{2},\ a - \tfrac{1}{3},\ \dots\} \mid \{a + 1,\ a + \tfrac{1}{2},\ a + \tfrac{1}{3},\ \dots\} \]
and let a = F|G canonically. Then by the inverse cofinality theorem (theorem 10.27) the set $\{a - 1, a - \frac{1}{2}, a - \frac{1}{3}, \dots\}$ is cofinal in F. Let $a' \in F$. Then for some n, $a - \frac{1}{n} \ge a'$, i.e. $a - a' \ge \frac{1}{n}$. Therefore $a - a'$ isn’t infinitesimal. A similar argument applies to G.

In line with what was said earlier, note as a point of caution that the theorem doesn’t rule out the possibility that a is real of the form F|G with elements in F or G infinitesimally close to a. This is ruled out only for the canonical representation.

10.4 Ordinal

Our next task is to show that the surreal numbers contain the ordinals. Once this is done it will be legitimate to deal with expressions of the form ω − 1 or $\frac{1}{2}\omega$. First, of course, we have to make precise what we mean by the statement that the ordinals are a subclass of the surreal numbers. Recall that ordinary addition and multiplication on the ordinals are not commutative. However, there do exist natural commutative operations on the ordinals that have been considered in the literature and they do correspond to the operations we defined on the surreal numbers.

We identify the ordinal α with the sequence $a_\alpha$ of length α such that $a_\alpha(\beta) = +$ for all β < α.

First note that by theorem 10.41 this is consistent with the situation for positive integers. Also it’s immediate from the lexicographical order that $\alpha < \beta \Rightarrow a_\alpha < a_\beta$. Furthermore, the canonical representation of $a_\alpha$ is $\{a_\beta : \beta < \alpha\} \mid \emptyset$. Again, if H is a set of ordinals then H|∅ is the least ordinal α with α > H; if H has no maximum then α is the least upper bound of H. This α is called the sequent of H and denoted by seq(H) in the literature. In summary, as far as the order properties are concerned, the identification is reasonable. Thus for convenience of notation we use α instead of $a_\alpha$. So the ordinals are the sequences which consist only of pluses. (Incidentally, note that our definition of cofinality (theorem 10.24 and theorem 10.25) is consistent with the usual one for ordinals.)

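We now show that the surreal operations on the ordinals agree with natural addition and natural multiplication. The natural operations are obtained by taking the usual expansion in powers of ω and operating as if they’re ordinary polynomials (i.e. no absorption). In order to state the next theorem precisely we tentatively use $+$ for ordinary addition, $\oplus$ for natural addition and $\boxplus$ for surreal addition.

Theorem 10.55. For any ordinals α and β, $\alpha \boxplus \beta = \alpha \oplus \beta$.

Proof. We use induction as usual. In view of our earlier remarks
\[ \alpha \boxplus \beta = \operatorname{seq}(\alpha \boxplus \gamma,\ \delta \boxplus \beta) = \operatorname{seq}_{[\gamma < \beta,\ \delta < \alpha]}(\alpha \oplus \gamma,\ \delta \oplus \beta) \]
by the inductive hypothesis. The problem is now reduced to an elementary exercise in the ordinary theory of the ordinals. Specifically, we may express α and β respectively in the forms $\alpha' + \omega^r$ and $\beta' + \omega^s$, where r or s may be 0. Then typical lower elements are $(\alpha' + \omega^r) \oplus (\beta' + \gamma)$ with $\gamma < \omega^s$ and $(\alpha' + \delta) \oplus (\beta' + \omega^s)$ with $\delta < \omega^r$. Without loss of generality suppose r ≥ s. Then the set of elements $(\alpha' + \omega^r) \oplus (\beta' + \gamma)$ with $\gamma < \omega^s$ is cofinal and we clearly obtain $(\alpha' + \omega^r) \oplus (\beta' + \omega^s)$ as the sequent, which is what we want.

In order to state the next theorem precisely we tentatively use $\otimes$ for natural multiplication and $\boxtimes$ for surreal multiplication.

Theorem 10.56. For any ordinals α and β, $\alpha \boxtimes \beta = \alpha \otimes \beta$.

Proof. Again we use induction. Now every ordinal has the form $\sum_{i=1}^{k} \omega^{r_i} n_i$ with $r_i > r_{i+1}$ for all i and all $n_i$ positive integers. This may also be written in the form $\sum_{i=1}^{k} \omega^{r_i}$ with $r_i \ge r_{i+1}$ by breaking up all $\omega^{r_i} n_i$ for which $n_i > 1$. The sum is also a natural sum and therefore, by theorem 10.55, a surreal sum. Suppose $\alpha = \sum \omega^{r_i}$ and $\beta = \sum \omega^{s_j}$. By the distributive law we obtain $\alpha \boxtimes \beta = \sum_{i,j} \omega^{r_i} \boxtimes \omega^{s_j}$. If at least one of α and β is not a power of ω, then we can use the inductive hypothesis to obtain
\[ \alpha \boxtimes \beta = \sum_{i,j} \omega^{r_i} \otimes \omega^{s_j} = \alpha \otimes \beta \]
by the distributive law for natural multiplication over natural addition.

Now suppose that both α and β are powers of ω. Let $\alpha = \omega^r$ and $\beta = \omega^s$. Then by definition
\[ \alpha \boxtimes \beta = \{(\omega^r \boxtimes \delta) \boxplus (\gamma \boxtimes \omega^s) - (\gamma \boxtimes \delta)\} \mid \emptyset \]
where $\delta < \omega^s$ and $\gamma < \omega^r$. Note that unlike in the case of addition, we are stuck with a surreal operation (here the subtraction) which doesn’t correspond to an operation on ordinals. First, since there are no upper elements, by theorem 10.18 $\alpha \boxtimes \beta$ is an ordinal. The typical lower element is certainly not larger than $(\omega^r \boxtimes \delta) \boxplus (\gamma \boxtimes \omega^s)$, which by the inductive hypothesis and theorem 10.55 is $(\omega^r \otimes \delta) \oplus (\gamma \otimes \omega^s)$. This is clearly less than $\omega^{r \oplus s}$. Hence $\omega^{r \oplus s}$ satisfies the betweenness property in the definition of $\alpha \boxtimes \beta$. Hence the length of $\alpha \boxtimes \beta$ is at most $\omega^{r \oplus s}$ and, since $\alpha \boxtimes \beta$ is an ordinal, $\alpha \boxtimes \beta \le \omega^{r \oplus s}$.

Now if $r' < r$ and n is an arbitrary positive integer we have $\alpha \boxtimes \beta > \omega^{r'} n \boxtimes \omega^s = \omega^{r'} n \otimes \omega^s$ by the inductive hypothesis. Furthermore this is $\omega^{r' \oplus s} n$. Similarly if $s' < s$ we have $\alpha \boxtimes \beta > \omega^{r \oplus s'} n$. We’ve already noted during our proof of theorem 10.55 that $\operatorname{seq}(r' \oplus s,\ r \oplus s') = r \oplus s$. By the basic properties of the expansion of ordinals in powers of ω, it follows that $\operatorname{seq}(\omega^{r' \oplus s} n,\ \omega^{r \oplus s'} n) = \omega^{r \oplus s}$. Hence $\alpha \boxtimes \beta \ge \omega^{r \oplus s}$. So finally $\alpha \boxtimes \beta = \omega^{r \oplus s} = \alpha \otimes \beta$.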
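The difference between ordinary and natural operations can be made concrete for ordinals below ω^ω. The sketch below assumes such an ordinal is stored as a dictionary mapping each exponent of ω (a natural number) to its positive integer coefficient; this encoding and the three function names are illustrative assumptions, not part of the text. Natural operations treat the expansions as polynomials, while ordinary addition absorbs smaller terms on the left.

```python
from collections import Counter


def natural_sum(a: dict, b: dict) -> dict:
    """Add the expansions term by term, as for polynomials (no absorption)."""
    out = Counter(a)
    out.update(b)                       # Counter.update adds coefficients
    return {e: c for e, c in out.items() if c}


def natural_product(a: dict, b: dict) -> dict:
    """Multiply the expansions as polynomials in ω."""
    out = Counter()
    for ea, ca in a.items():
        for eb, cb in b.items():
            out[ea + eb] += ca * cb
    return dict(out)


def ordinary_sum(a: dict, b: dict) -> dict:
    """Ordinary ordinal addition: terms of a below the leading term of b are absorbed."""
    if not b:
        return dict(a)
    top = max(b)
    out = {e: c for e, c in a.items() if e > top}
    out.update(b)
    out[top] = a.get(top, 0) + b[top]
    return out
```

With one = {0: 1} and omega = {1: 1}, ordinary_sum(one, omega) gives ω again, whereas natural_sum(one, omega) gives ω + 1 in either order, which is the commutative behaviour the surreal operations share.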

In view of theorem 10.55 and theorem 10.56 we no longer need separate symbols for surreal addition and surreal multiplication. For convenience we will even drop the special symbols for natural addition and natural multiplication, since it will be clear from the context whether natural or ordinary operations are intended. In fact, as a rule of thumb, in discussing elements we use natural operations which, as we have just shown, are the same as the surreal operations. On the other hand, when discussing lengths and juxtaposition of sequences, the ordinary operations are appropriate. This issue involving choice of notation will occur in other places as well. In general, readability and reliance on context will take precedence over a picayune attitude which complicates the notation unnecessarily.

We now know the exciting fact that the surreal numbers form a field containing both the reals and the ordinals. So, for example, elements such as ω − 1 and $\frac{1}{2}\omega$ have meaning. Incidentally, our ω − 1 has nothing to do with the meaning used in the literature, namely ω − 1 = ω, since 1 + ω = ω where + stands for ordinal addition. We shall compute the sign sequence of several such strange elements. These are all special cases of the general theory of the next subsection (subsection 10.5). However, it’s worthwhile to see some elementary concrete examples before getting involved with a general representation theorem. In fact there’s some pedagogical value for the reader to experiment with other elementary examples before coping with a more general and abstract situation.

We begin with ω − 1 which is about the simplest looking exotic element. This is ω + (−1). The canonical representation of ω is {n}|∅ where n stands for an arbitrary nonnegative integer, and −1 = ∅|{0}. Hence using the definition, ω − 1 = {n − 1}|{ω + 0}. By cofinality theorem ii (theorem 10.25), the left-hand side may be replaced by {n}; hence we have {n}|{ω} which we can see immediately from the definition is the sequence of length ω + 1 which consists of ω pluses followed by a minus.

We make several remarks before continuing with other examples. It’s often convenient to use cofinality theorem i (theorem 10.24). Cofinality theorem ii (theorem 10.25) has the convenience of avoiding specific reference to a = F|G but referring only to F and G themselves. This helps to streamline a computation; in fact, in future we shall use cofinality theorem ii (theorem 10.25) freely without quoting it if its use is obvious (just as in elementary algebra one doesn’t bother to quote the distributive law every time it’s used!). For example, an expression such as $\{n + \frac{5}{2}\} \mid \{\omega - \frac{1}{2^m}\}$ where m and n run through all positive integers can be replaced by $\{n\} \mid \{\omega - \frac{1}{m}\}$.

Note that the last part of the computation can be done in two ways. One can directly use the definition and see that any x satisfying n < x < ω for all n must necessarily begin with the sequence of length ω + 1 referred to above. One can also take a good guess at the answer and note that {n}|{ω} happens to be the canonical representation. This is worth emphasizing, because in more complicated situations we often have a representation which is cofinal in a canonical representation. In these cases it’s much easier to use the second method.

Recall that by the uniformity theorems, the computations may be performed using any representations; thus we may choose ones which appear to be most tractable. For example, for any real number a we may use the representation a = F|G where F is the set of all dyadic fractions below a and G is the set of all dyadic fractions above a. This is a unified formula which applies whether or not a is a dyadic fraction. If a isn’t a dyadic fraction, this is mutually cofinal with the standard representation, but it definitely isn’t mutually cofinal if a is a dyadic fraction. I.e. in the latter case the representation can’t be justified by cofinality theorem ii (theorem 10.25), although cofinality theorem i (theorem 10.24) applies.

As obvious as the above remarks may be, they’re worthwhile to state and get out of the way once and for all, since it would be clumsy to make them in the middle of a proof. With this in mind, we can now present computations and proofs more efficiently, i.e. using convenient representations and simplifications without explicitly stating the obvious justification.

Now consider ω − 2. This is

ω + (−2) = {n}|∅ + ∅|{−1} = {n − 2}|{ω − 1} = {n}|{ω − 1}

which is the sequence of length ω + 2 consisting of ω pluses followed by two minuses.

Using induction we can compute ω − (m + 1) for any positive integer m. In fact,

ω − (m + 1) = {n}|∅ + ∅|{−m} = {n − m − 1}|{ω − m}.

(To avoid confusion note that m is fixed although n varies.) This is {n}|{ω − m} which, using the obvious inductive hypothesis, gives the sequence of length ω + m + 1 consisting of ω pluses followed by m + 1 minuses.

It’s natural now to ask what ω pluses followed by ω minuses represents. The pattern suggests naively that it may be ω − ω but this is obviously impossible. The element is, of course, still infinite, i.e. above every positive integer because of the lexicographical order. In fact, no sequence of minuses no matter how large can undo the effect of the first ω pluses. We shall return to the above example shortly.
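The lexicographical comparison behind these statements can be sketched for sign sequences of length below ω². The encoding is an assumption made for illustration only: a sequence is a list of runs (sign, (a, b)), where the pair (a, b) stands for the run length ω·a + b, and a missing sign counts as lying between minus and plus. The names SIGN, left_subtract and compare are hypothetical.

```python
SIGN = {'-': -1, '+': 1}


def left_subtract(p, length):
    """Ordinal x with p + x == length, for p <= length; (a, b) encodes ω·a + b."""
    if p[0] < length[0]:
        return (length[0] - p[0], length[1])
    return (0, length[1] - p[1])


def compare(x, y):
    """Return -1, 0 or 1 according to whether x < y, x == y or x > y."""
    xs, ys = list(x), list(y)
    while xs and ys:
        (sx, lx), (sy, ly) = xs[0], ys[0]
        if sx != sy:
            return 1 if SIGN[sx] > SIGN[sy] else -1
        common = min(lx, ly)            # tuple order agrees with ordinal order here
        for runs, length in ((xs, lx), (ys, ly)):
            rest = left_subtract(common, length)
            if rest == (0, 0):
                runs.pop(0)             # this run is used up
            else:
                runs[0] = (runs[0][0], rest)
    if xs:
        return SIGN[xs[0][0]]           # y is an initial segment of x
    if ys:
        return -SIGN[ys[0][0]]          # x is an initial segment of y
    return 0
```

With a = [('+', (1, 0)), ('-', (1, 0))], that is ω pluses followed by ω minuses, compare confirms that a lies above every positive integer and below every ω − m.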

The computation for ω − m works for any limit ordinal as well as ω. For example $\omega^2 + 6\omega - 3$ is the sequence of $\omega^2 + 6\omega$ pluses, followed by 3 minuses. One can also apply the computation to non-limit ordinals to check the consistency of what we’ve already done. For example (ω + 6) − 1 is simply ω + 5. The computation will thus not give you the sequence of ω + 6 pluses followed by a minus if it’s done correctly. The step to beware of is the following. For a limit ordinal α we can simplify a lower set such as {β − 1 : β < α} by replacing it by {β : β < α}. This is, of course, not valid for non-limit ordinals, since the former set is not cofinal in the latter.

We now compute $\omega + \frac{1}{2}$. This is
\[ \{n\} \mid \emptyset + \{0\} \mid \{1\} = \{n + \tfrac{1}{2},\ \omega + 0\} \mid \{\omega + 1\} = \{\omega\} \mid \{\omega + 1\}. \]
This is the sequence of ω + 1 pluses followed by a minus. In line with our earlier computations, this example illustrates the contrast between the case where α is a limit ordinal and where α is a non-limit ordinal in the value of a sequence with α pluses followed by a minus.

In the same manner induction can be used to evaluate ω + r for any positive real number r. ω + r = {n}|∅ + F|G where F|G is the canonical representation of r. This is {ω + F, n + r}|{ω + G}. Since 0 ∈ F, the left-hand side may be replaced by ω + F and we obtain {ω + F}|{ω + G}. By the lexicographical order and the inductive hypothesis, this is the sequence with ω pluses followed by the sequence for r. Incidentally, this reasoning with the lexicographic order and juxtaposition is similar to what has already been used back in the proof for the fundamental existence theorem (theorem 10.4) and represents an important skill to facilitate computation.

This argument works just as well when r is negative as long as r isn’t an integer, in which case F is empty. The latter case has in fact been done earlier and the general conclusion is still valid although ω + F can no longer be used as a lower set.

As before, similar results apply if ω is replaced by other limit ordinals.

We now consider $\frac{1}{2}\omega$. This is ({0}|{1}) · ({n}|∅). By the definition of multiplication this is
\[ \{\tfrac{1}{2}n + \omega \cdot 0 - n \cdot 0\} \mid \{\tfrac{1}{2}n + \omega \cdot 1 - n \cdot 1\} = \{\tfrac{1}{2}n\} \mid \{\omega - \tfrac{1}{2}n\} = \{n\} \mid \{\omega - n\}. \]
Using the earlier result for ω − n, this is the sequence of length ω · 2 beginning with ω pluses followed by ω minuses. This answers an earlier question which was left open.

It’s instructive to see another proof. Let a be the sequence of ω pluses followed by ω minuses. Then a = {n}|{ω − n}. Hence 2a = a + a = {n + a}|{ω − n + a}. We now prove that this is ω by cofinality. We know that for all positive integers n, n < a < ω − n. Hence n + a < ω. Since a > n, it follows that ω < ω − n + a. So ω satisfies the betweenness condition. Since ω = {n}|∅ and n + a ≥ n, the cofinality condition is satisfied; hence 2a = {n + a}|{ω − n + a} = ω.

The first proof involves more computation but is more routine. The second method requires a good guess at the answer and being able to use cofinality in spite of a limited knowledge of a.

$\frac{1}{2}\omega - 1$ and more generally $\frac{1}{2}\omega + r$ for arbitrary real r can be handled similarly to ω − 1 and ω + r. In fact, the sequence for $\frac{1}{2}\omega + r$ is the sequence for $\frac{1}{2}\omega$ followed by the sequence for r. Such simple results make the subject quite tractable. However, the reader is warned that crude juxtaposition doesn’t work in every case.

Several more remarks are worthwhile to mention at this point. First, we take for granted obvious inequalities involving infinite or infinitesimal elements (e.g., for all positive real r and integers n, ωr − n is positive) and apply them to obtain information about cofinality. Secondly, recall the definition of multiplication and consider $a = F' \mid F''$ and $b = G' \mid G''$. Then ab is
\[ \{a'b + ab' - a'b',\ a''b + ab'' - a''b''\} \mid \{a'b + ab'' - a'b'',\ a''b + ab' - a''b'\}. \]
It’s often convenient to think of the lower sums in the form $a'b + (a - a')b'$ and $a''b - (a'' - a)b''$ and the first upper sum in the form $a'b + (a - a')b''$ or $ab'' - a'(b'' - b)$. In spite of the triviality of the algebra, a suitable form supplies a considerable gain in intuition.

We now compute $\frac{3}{4}\omega$. $\frac{3}{4} = (+-+) = \{\frac{1}{2}\} \mid \{1\}$ by our work on dyadic fractions. Hence
\[ \tfrac{3}{4}\omega = (\{\tfrac{1}{2}\} \mid \{1\}) \cdot (\{n\} \mid \emptyset) = \{\tfrac{1}{2}\omega + (\tfrac{3}{4} - \tfrac{1}{2})n\} \mid \{1 \cdot \omega - (1 - \tfrac{3}{4})n\}. \]
Note the use of the above remark. Also note that a typical lower term may alternatively be written as $\frac{3}{4}n + \frac{1}{2}(\omega - n)$, but the form we have exhibits the order of magnitude more clearly. In fact, it’s immediate by cofinality that $\frac{3}{4}\omega = \{\frac{1}{2}\omega + n\} \mid \{\omega - n\}$. If we accept the juxtaposition results for $\frac{1}{2}\omega + n$ we see that this is the sequence of length ω · 3 consisting of ω pluses, followed by ω minuses, and then ω pluses.

By a similar process we can use a double induction to obtain the sequence for ωr, one for n in ωr ± n and one for r. It turns out to be just like the sequence for r except that each sign is repeated ω times. The earlier remarks on multiplication which are petty in an individual numerical example gain in value as we generalize.
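The stated pattern for ωr is easy to tabulate when r is a positive dyadic fraction given by its finite sign sequence. The sketch below only encodes that pattern and is no substitute for the double induction; the function name times_omega and the run-length output format are assumptions made for illustration.

```python
def times_omega(signs: str):
    """Run-length description of the sequence for ωr.

    Each returned pair (sign, k) means that the sign is repeated ω·k times,
    so consecutive equal signs of r are merged into one block.
    """
    runs = []
    for s in signs:
        if runs and runs[-1][0] == s:
            runs[-1] = (s, runs[-1][1] + 1)
        else:
            runs.append((s, 1))
    return runs
```

For r = 3/4 = (+−+) this returns three blocks of ω signs each, matching the sequence of length ω · 3 obtained above, and for r = 1/2 = (+−) it returns ω pluses followed by ω minuses.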

Now we consider $\omega^2 - \omega$. It’s easy to see that in this case the induction procedure for $\omega^2 - n$ extends even to $\omega^2 - \omega$, so we obtain that $\omega^2 - \omega$ consists of $\omega^2$ pluses followed by ω minuses. The key point is the elementary fact about ordinals which says that $\alpha < \omega^2 \Rightarrow \alpha + \omega < \omega^2$. The reader can experiment with obvious generalizations in this direction and see that there are no further dramatic surprises.

Before studying the ultimate representation theory, it’s worthwhile to see what happens in the other direction, i.e. with expressions such as $\frac{1}{\omega}$. We prefer to ignore our previous construction of reciprocals just as we did in our study of the real numbers. Our primary interest in the latter construction was, of course, in the issue of existence since it’s usually not convenient to use such an inductive construction for computation even if it’s possible.

Instead, we use the good guess approach. Let ε be the sequence of length ω which consists of a single plus followed by ω minuses. This is clearly a positive infinitesimal. In fact, it’s immediate that it’s the unique positive infinitesimal of length ω; hence it may be regarded as the canonical infinitesimal. Heuristically, this is a reasonable candidate for being the reciprocal of the canonical infinitely large number ω. We now prove this fact. Note that the canonical representation of ε is $\{0\} \mid \{\frac{1}{2^n}\}$. Hence
\[ \varepsilon\omega = (\{0\} \mid \{\tfrac{1}{2^n}\}) \cdot (\{m\} \mid \emptyset) = (\{0\} \mid \{\tfrac{1}{n}\}) \cdot (\{m\} \mid \emptyset) = \{\varepsilon m + 0 \cdot \omega - 0 \cdot m\} \mid \{\varepsilon m + \tfrac{1}{n}\omega - \tfrac{1}{n}m\} = \{\varepsilon m\} \mid \{\varepsilon m + \tfrac{1}{n}\omega - \tfrac{1}{n}m\}. \]
Now 1 = {0}|∅. We now check the conditions of cofinality theorem i (theorem 10.24). Since ε is infinitesimal, a typical lower element, which is εm, is less than 1. A typical upper element $\frac{1}{n}\omega - \frac{1}{n}m + \varepsilon m$ is clearly infinite. Hence the number 1 satisfies the betweenness condition. The cofinality part requires minimal work. Regardless of m, εm ≥ 0. Hence we do obtain 1 as required.

Note that in spite of the existence of infinitesimals there is no connection with nonstandard analysis. So far we’ve not had any transfer principle. In fact, no model-theoretic ideas of any kind have played a role.

Let’s now consider r + ε where r is a real number. First, let r be a dyadic fraction. Then r has the form {s}|{t} for suitable unit sets where s and t are also dyadic. Then
\[ r + \varepsilon = \{s\} \mid \{t\} + \{0\} \mid \{\tfrac{1}{n}\} = \{r + 0,\ s + \varepsilon\} \mid \{r + \tfrac{1}{n},\ t + \varepsilon\} = \{r\} \mid \{r + \tfrac{1}{n}\} = \{r\} \mid G \]
where G is the set of all dyadic fractions above r. It’s immediate from the definition that this is the juxtaposition of the sequence for r with the sequence for $\frac{1}{\omega}$. Note that this argument applies to all dyadic fractions including integers since by cofinality we can always insert an s or t. Note also that we can now fill a gap which remained since our discussion of the real numbers, since we see from the above that all sequences whose terms are eventually minus are obtained this way. Thus all sequences of length ω are either real, ±ω, of the form $r + \frac{1}{\omega}$ for dyadic r, or of the form $r - \frac{1}{\omega}$ for dyadic r.

Now let r be non-dyadic. Let $r = R' \mid R''$. Then
\[ r + \varepsilon = R' \mid R'' + \{0\} \mid \{\tfrac{1}{n}\} = \{r + 0,\ r' + \varepsilon\} \mid \{r + \tfrac{1}{n},\ r'' + \varepsilon\} = \{r\} \mid \{r + \tfrac{1}{n}\} = \{r\} \mid G \]
where G is the set of all reals above r. (G may be taken to be the set of dyadics instead. It makes no difference.) It’s immediate from the definition that we now obtain the sequence for r followed by a single plus. Thus we have a sequence of length ω + 1. We’ve here a counter-example to naive juxtaposition. In fact, the final poor plus is worth only ε. Note that in both cases we are juxtaposing a sequence to the sequence for r but the sequence added on depends on whether r is dyadic or not. This brings some subtlety to the subject.

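Next, we compute 2ε. This is
\[ \{0\} \mid \{\tfrac{1}{n}\} + \{0\} \mid \{\tfrac{1}{n}\} = \{0 + \varepsilon\} \mid \{\tfrac{1}{n} + \varepsilon\} = \{\varepsilon\} \mid \{\tfrac{1}{n} + \varepsilon\} = \{\varepsilon\} \mid \{\tfrac{1}{n}\} \]
by mutual cofinality. Hence 2ε is the sequence for ε followed by a plus. Note the contrast between this case and that of 2ω. Again the final plus has a value of only ε.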

Consider $\frac{1}{2}\varepsilon$. This is $(\{0\} \mid \{1\}) \cdot (\{0\} \mid \{\frac{1}{n}\})$ which is $\{0,\ \frac{1}{2n} + \varepsilon - \frac{1}{n}\} \mid \{\frac{1}{2n},\ \varepsilon\} = \{0\} \mid \{\varepsilon\}$. This is the sequence of length ω + 1 which is that of ε followed by a minus. We can say that the value of the last sign is only $\frac{1}{2}\varepsilon$. The inflation rate is growing rapidly! Again note the contrast between this case and that of $\frac{1}{2}\omega$ where ω minus signs took care of the $\frac{1}{2}$.

Next we consider rε for a typical positive real number r which isn’t an integer. This has the form
\[ (R' \mid R'') \cdot (\{0\} \mid \{\tfrac{1}{n}\}) = \{\varepsilon r',\ \varepsilon r'' - \tfrac{1}{n}(r'' - r)\} \mid \{\varepsilon r'',\ \varepsilon r' + \tfrac{1}{n}(r - r')\}. \]
By mutual cofinality this simplifies to $\{\varepsilon r'\} \mid \{\varepsilon r''\}$. This enables us to obtain the pattern by induction. We obtain the sequence for ε followed by the tail of the sequence for r after the first +. For example $\frac{3}{4} = (+-+)$. Hence $\frac{3}{4}\varepsilon$ is the sequence for ε followed by (−+). To justify the induction we need to verify the pattern for positive integers as well. In that case $R''$ is empty and the expression simplifies instead to $\{\varepsilon r'\} \mid \{\frac{1}{n}\}$. However, the pattern works in any case although here there are no infinitesimal upper elements as before.
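The pattern for rε can likewise be tabulated for a positive dyadic r given by its finite sign sequence. The sketch below merely encodes the stated rule (the sequence for ε followed by the tail of r after its first plus); the function name times_epsilon and the use of the strings '1' and 'ω' as run lengths are assumptions made for illustration.

```python
def times_epsilon(signs: str):
    """Run-length description of the sequence for rε, r a positive dyadic."""
    if not signs or signs[0] != '+':
        raise ValueError("r must be positive, so its sequence starts with '+'")
    epsilon = [('+', '1'), ('-', 'ω')]          # one plus, then ω minuses
    tail = [(s, '1') for s in signs[1:]]        # signs of r after the first plus
    return epsilon + tail
```

For r = 2 = (++) this gives ε followed by a plus, for r = 1/2 = (+−) it gives ε followed by a minus, and for r = 3/4 it gives ε followed by (−+), in agreement with the three computations above.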

A natural expression to consider next is $\varepsilon^2 = 1/\omega^2$. In analogy with $\varepsilon = 1/\omega$, one natural candidate for the sequence is a plus followed by $\omega^2$ minuses. But

\[ \varepsilon^2 = \varepsilon \cdot \varepsilon = \{0\} \mid \{1/n\} \cdot \{0\} \mid \{1/n\} = \{0,\ \tfrac{\varepsilon}{n} + \tfrac{\varepsilon}{n} - \tfrac{1}{n^2}\} \mid \{\tfrac{\varepsilon}{n}\} = \{0\} \mid \{\tfrac{\varepsilon}{2^n}\}. \]

It follows from the previous result for $r\varepsilon$ that the result is the sequence consisting of a single plus followed by $\omega \cdot 2$ minuses. In particular, we do not have $\omega^2$ minuses as one may naively guess. Of course, the contrast between the behavior of $r\omega$ and $r\varepsilon$ should make an alert reader suspicious of such a guess at the outset.

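It’s next of interest to investigate $\varepsilon + \varepsilon^2$. Is it the sequence for $\varepsilon$ followed by the sequence for $\varepsilon^2$? Well,

\[ \{0\} \mid \{1/n\} + \{0\} \mid \{\varepsilon/m\} = \{\varepsilon,\ \varepsilon^2\} \mid \{\varepsilon + \varepsilon/m,\ \varepsilon^2 + 1/n\} = \{\varepsilon\} \mid \{d\varepsilon\} \]

where d runs through the set of all dyadic fractions larger than 1. Using the previous result for $r\varepsilon$ thus leads to the sequence for $\varepsilon$ followed by the sequence for $\varepsilon$. Thus, again, juxtaposition behaves subtly. The $\varepsilon^2$ term is contributed by the annexation of the term for $\varepsilon$.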

The examples we’ve done illustrate the most basic tricky phenomena that occur in attempting to find the sign sequence for various algebraic expressions. We close by considering the expression $\sqrt{\omega}$. We ignore our previous construction of square roots and use a good guess method. Let b be the sequence consisting of $\omega$ pluses followed by $\omega^2$ minuses. We consider the canonical representation and obtain a cofinal upper set by restricting ourselves to those sequences for which the number of minuses is a multiple of $\omega$. So by earlier computations we have $b = \{n\} \mid \{\omega/2^m\} = \{n\} \mid \{\omega/m\}$. Note first that this is a priori a plausible candidate for $\sqrt{\omega}$, since $n < \sqrt{\omega}$ and $\sqrt{\omega} = \omega/\sqrt{\omega} < \omega/n$.

\[ b^2 = \Bigl\{ bn_1 + bn_2 - n_1 n_2,\ \frac{b\omega}{m_1} + \frac{b\omega}{m_2} - \frac{\omega^2}{m_1 m_2} \Bigr\} \Bigm| \Bigl\{ nb + \frac{\omega b}{m} - \frac{\omega n}{m} \Bigr\}. \]

Now $\omega = \{n\} \mid \emptyset$. So we must verify the conditions for cofinality theorem i (theorem 10.24). As usual in a proof of this form, all we know about b in advance is that $n < b < \omega/n$, i.e. that b and $\omega/b$ are both infinite. There is a trap, which is a temptation to use circular reasoning (i.e., that $b^2 = \omega$). In this respect this proof is similar to the one dealing with $1/\omega$ and the second proof for the sign sequence for $\tfrac{1}{2}\omega$.

The conditions are easily verified. $bn_1 + bn_2 - n_1 n_2 \le b(n_1 + n_2) < \omega$ since $\omega/b$ is infinite. For the same reason $\frac{b\omega}{m_1} + \frac{b\omega}{m_2} - \frac{\omega^2}{m_1 m_2} < 0 < \omega$. Also $nb + \frac{\omega b}{2^m} - \frac{\omega n}{2^m} \ge \frac{\omega(b - n)}{2^m} > \omega$ since b is infinite.

This verifies the betweenness condition. If we let $n_1 = n_2 = 1$, we obtain $2b - 1$ as a lower element, which clinches the cofinality condition since b is infinite. In fact, the same element $2b - 1$ works for all n. This shows $b^2 = \omega$.

The supply of surreal numbers is very rich. Continuing in the above manner is like using a teaspoon to empty an ocean. It’s time now to get some sort of hold on more general elements.


10.5 Normal form

10.5.1 Combinatory lemma on semigroups

So far we’ve more or less accepted everything we needed from outside the theory of surreal numbers, since the material was elementary. However, in order to study the normal form we need a combinatorial lemma which isn’t well known, and since it’s interesting in its own right we shall prove it here.

Lemma 10.57. Let T be a well-ordered set of positive elements in a linearly ordered semigroup. Then the set S of finite sums of elements of T is also well-ordered, and each element of S can be expressed as a sum of elements of T in only a finite number of ways.

Proof. Both parts will follow if we show that any sequence $\{s_n\}$ of elements of S such that $s_n \ge s_{n+1}$ for all n, in which the representations are distinct, must eventually terminate. We’ll assume an infinite sequence and obtain a contradiction.

Case 10.58. Suppose that we have only binary sums. Let $s_n = a_n + b_n$ where $a_n \in T$ and $b_n \in T$. Since T is well ordered, there exists a subsequence $a_{i_n}$ of $a_n$ such that $a_{i_{n+1}} \ge a_{i_n}$ for all n. We can obtain such a subsequence as follows. Choose $i_1$ such that $a_{i_1}$ is the least value of the a’s. If $i_1 < i_2 < \dots < i_n$ have been chosen, then choose $i_{n+1} > i_n$ such that $a_{i_{n+1}}$ is the least value of the a’s with index larger than $i_n$. Similarly there exists a subsequence $b_{j_n}$ of $b_{i_n}$ such that $b_{j_{n+1}} \ge b_{j_n}$ for all n. Then $a_{j_{n+1}} \ge a_{j_n}$ and $b_{j_{n+1}} \ge b_{j_n}$ for all n. Since $a_{j_{n+1}} + b_{j_{n+1}} \le a_{j_n} + b_{j_n}$, we obtain that $a_{j_{n+1}} = a_{j_n}$ and $b_{j_{n+1}} = b_{j_n}$, contradicting the hypothesis that the representations are distinct.
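The selection step in this case (repeatedly pick the least remaining value with a larger index) can be tried out on a finite list. The sketch below is only an illustration under that finiteness assumption; the proof applies the same idea to an infinite sequence in a well-ordered set, where a least value always exists. The name `nondecreasing_subsequence` is hypothetical.

```python
def nondecreasing_subsequence(values):
    """Return indices i_1 < i_2 < ... chosen as in the proof.

    Each index picks a least value among the entries that come after the
    previously chosen index, so the selected values never decrease.
    """
    indices = []
    start = 0
    while start < len(values):
        # first position of a least value among values[start:]
        i = min(range(start, len(values)), key=lambda k: values[k])
        indices.append(i)
        start = i + 1
    return indices
```

For the list `[3, 1, 2, 1, 5, 4]` the chosen indices are `[1, 3, 5]`, with values `1, 1, 4`.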

Case 10.59. Suppose that we’ve only sums of k terms for fixed k. Then the proof is essentially the same, but the notation is minutely more complicated. If $s_n$ has the form $a_{n,1} + a_{n,2} + \dots + a_{n,k}$ then we simply iterate the process of taking subsequences for all fixed $j \le k$.

Case 10.60. If only a finite number of k’s occur, then we can reduce to case 10.59 since at least one of these k’s must appear infinitely often.

Case 10.61. Suppose an infinite number of k’s occur, i.e. the value of k is unbounded. Then we choose a subsequence $\{s'_n\}$ of $\{s_n\}$ as follows: $s'_i = a_{i,1} + a_{i,2} + \dots + a_{i,n_i}$ where $n_i$ is a strictly increasing function of i. In particular, $n_i \ge i$. Also, we express the sums in non-increasing order, i.e. $a_{i,1} \ge a_{i,2} \ge \dots \ge a_{i,n_i}$. We now choose a subsequence of $\{s'_n\}$ such that $a_{i,1}$ is monotonic increasing, a subsequence of that such that $a_{i,2}$ is monotonic increasing, etc. By the usual diagonal method we obtain a subsequence $s''_n$ such that $a_{i,j}$ for fixed j is monotonic increasing as a function of i for $i \ge j$. (Since $n_i \ge i$, $a_{i,j}$ is defined for $i \ge j$.) From now on we write $s_i$ for the terms of this subsequence. First we show that necessarily $n_i > i$ for all i. Suppose $n_i = i$. Then consider

\[ s_i = a_{i,1} + a_{i,2} + \dots + a_{i,i} \]

and

\[ s_{i+1} = a_{i+1,1} + a_{i+1,2} + \dots + a_{i+1,i} + a_{i+1,i+1} + \dots . \]


Since $a_{i+1,j} \ge a_{i,j}$ for all $j \le i$, and $s_{i+1}$ contains an extra term $a_{i+1,i+1}$ which has no analogue in $s_i$, it follows that $s_{i+1} > s_i$. This contradicts the assumption that the sequence is monotonic decreasing.

For arbitrary i we now compare

\[ s_i = a_{i,1} + a_{i,2} + \dots + a_{i,i} + \dots + a_{i,n_i} \]

with

\[ s_{n_i} = a_{n_i,1} + a_{n_i,2} + \dots + a_{n_i,i} + a_{n_i,i+1} + \dots + a_{n_i,n_i} + \dots . \]

Since $a_{n_i,j} \ge a_{i,j}$ for $j \le i$, since $s_{n_i}$ contains more terms than $s_i$, and since $s_n$ is monotonic decreasing, there must exist k such that $i < k \le n_i$ with $a_{i,k} > a_{n_i,k}$. Hence $a_{i,i} \ge a_{i,k} > a_{n_i,k} \ge a_{n_i,n_i}$. By induction we may now define a function $b(i)$ as follows: $b(1) = 1$ and $b(i + 1) = n_{b(i)}$. Then $\{a_{b(i),b(i)}\}$ is an infinite strictly decreasing sequence of elements of T, which is our final contradiction.

This completes the proof.
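The second claim of lemma 10.57 (only finitely many representations) can be checked by brute force in a small case. The sketch below assumes a finite set T of positive rationals, which is far more restrictive than the lemma; it counts the ways of writing a target as an unordered sum of elements of T. The name `count_representations` is hypothetical.

```python
from fractions import Fraction


def count_representations(target, generators):
    """Count unordered sums of elements of `generators` equal to `target`.

    Assumption: `generators` is a finite collection of positive rationals,
    so every search branch terminates.
    """
    gens = sorted(set(Fraction(g) for g in generators))
    if any(g <= 0 for g in gens):
        raise ValueError("generators must be positive")

    def count(remaining, first):
        if remaining == 0:
            return 1
        total = 0
        # only use generators from index `first` on, so that each
        # multiset of summands is counted exactly once
        for k in range(first, len(gens)):
            if gens[k] > remaining:
                break
            total += count(remaining - gens[k], k)
        return total

    return count(Fraction(target), 0)
```

With $T = \{1, \tfrac{3}{2}, 2\}$ the value 3 has exactly three representations: $1 + 1 + 1$, $1 + 2$ and $\tfrac{3}{2} + \tfrac{3}{2}$.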

10.5.2 The ω-map

Up to now we have considered real numbers, ordinals, and algebraic combinations of these. What we need now is a more tractable way of looking at a general surreal number. We begin by studying orders of magnitude, a concept which has meaning in any linearly ordered field containing the real numbers.

We define an equivalence relation on the positive surreal numbers.

Definition 10.62. $a \sim b$ if and only if there exists an integer n such that $na \ge b$ and $nb \ge a$. This is trivially an equivalence relation. The equivalence classes are called orders of magnitude. Related to this is another definition.

Definition 10.63. $a \gg b$ if for all integers n: $nb \le a$. $a \ll b$ if and only if $b \gg a$. We say in words that a has a higher order of magnitude than b. Clearly we have $a \gg b$, $b \gg a$, or $a \sim b$.

We shall assume that the reader has no trouble seeing the most obvious consequences of these relations, so that they will be freely used without a detailed explanation when needed. For example,

\[ a \ll b \Rightarrow na \ll b; \qquad a \ll c \text{ and } b \ll c \Rightarrow a + b \ll c. \]

One property of special interest is: $a \sim b$ and $a < c < b \Rightarrow a \sim c$, i.e. the equivalence classes are convex.

One basic fact which is special about the surreal numbers is that each equivalence class has a canonical member.

Theorem 10.64. Let a be a positive surreal number. Then there exists a unique x of minimal length such that x ∼ a.

Proof. The argument uses only the convexity. Because of well-ordering there certainly exists an x of minimal length such that $x \sim a$. Suppose x and y are distinct, both have minimal length, and $x \sim a \sim y$. Let z be the common initial segment. By the convexity property $z \sim a$. Also $\ell(z) < \ell(x)$, which contradicts the minimality of $\ell(x)$.


Remark 10.65. The same argument shows that the element of minimal length is an initial segment of every other element equivalent to a.

Similarly to the above, one can define additive orders of magnitude using addition rather than multiplication.

Notation 10.66. $a \sim b$ (in the additive sense) if and only if $\exists n \in \mathbb{Z}$: $a + n \ge b$ and $b + n \ge a$.

Since this is less important for our purposes and, besides, is similar to the case of multiplicative orders of magnitude, including the possession of the convexity property, we don’t give the details. It suffices to note that here also every equivalence class has a canonical member which is defined in a similar way.

One of the more remarkable discoveries about surreal numbers is that the canonical elements can be parameterized by the surreal numbers in a natural way.

For every surreal number b we shall define an element written $\omega^b$ which may be thought of as the canonical element of the bth order of magnitude. (Although there are philosophical objections to the use of exponential notation which have some validity, there is enough in common with exponentiation to make the notation psychologically convenient.) As usual, we use induction and assume that $\omega^c$ has been defined for all proper initial segments c of b.

Definition 10.67.

\[ \omega^b = \{0,\ r\omega^{b'}\} \mid \{s\omega^{b''}\} \]

where r and s run through the set of all positive real numbers, and $b'$ and $b''$ run, as in our usual notation, through the lower and upper elements of the canonical representation of b.

By cofinality we can, of course, limit r to integers and s to dyadic fractions with numerator 1.

Theorem 10.68. $\omega^b$ is always defined and greater than 0. Furthermore, $b < c \rightarrow \omega^b \ll \omega^c$.

Proof. We prove this by induction on the length of b. Since $b' < b''$ we have $\omega^{b'} \ll \omega^{b''}$ by the inductive hypothesis. Hence, for all positive reals r and s, $r\omega^{b'} < s\omega^{b''}$. Also, $0 < s\omega^{b''}$. Hence $\omega^b$ is defined. Since 0 is a lower element in the definition, $0 < \omega^b$.

To conclude the proof we use a method which is similar to the one we used for the arithmetical operations. Here the computation is immediate. Suppose $b < c$ and d is the common initial segment. If $d = b$ or $d = c$ then the conclusion is immediate from the definition. Otherwise we have $\omega^b \ll \omega^d \ll \omega^c$.

Corollary 10.69. The uniformity theorem holds for $\omega^b$, i.e. if $b = F \mid G$ is an arbitrary representation, the same formula holds: $\omega^b = \{0,\ r\omega^F\} \mid \{s\omega^G\}$. (Here $\omega^F = \{\omega^x : x \in F\}$ and similarly for G.)

Proof. As usual, this follows from the inequality $b < c \rightarrow \omega^b < \omega^c$ using the inverse cofinality and the cofinality theorems (theorem 10.27, theorem 10.24, and theorem 10.25).


Theorem 10.70. An element has the form $\omega^b$ if and only if it’s the element of minimal length in an equivalence class under $\sim$.

Proof. First consider an element of the form $\omega^b$. We have $\omega^b = \{0,\ r\omega^{b'}\} \mid \{s\omega^{b''}\}$. If $x \sim \omega^b$ then x also satisfies $r\omega^{b'} < x < s\omega^{b''}$, since r and s are arbitrary positive reals. Hence $\omega^b$ is an initial segment of x, so $\ell(\omega^b) \le \ell(x)$.

For the converse, we show that every positive element is equivalent to an element of the form $\omega^b$. In view of the inequality $b < c \rightarrow \omega^b \ll \omega^c$, such an element is unique if it exists. We use induction. Let $a = A' \mid A''$. Then $0 \in A'$. By the inductive hypothesis every element in $A' \cup A''$ other than 0 is equivalent to an element of the form $\omega^b$. Let $F = \{y : (\exists x \in A')\,(x \sim \omega^y)\}$ and $G = \{y : (\exists x \in A'')\,(x \sim \omega^y)\}$. Suppose $F \cap G \ne \emptyset$ and let $y \in F \cap G$. Then $\omega^y \sim x \in A'$ and $\omega^y \sim z \in A''$. Since $x < a < z$ it follows that $a \sim \omega^y$.

Now suppose that $F \cap G = \emptyset$. We claim that $F < G$. For suppose $x \in F$, $y \in G$, and $x > y$. Then $\omega^x \sim a' \in A'$ and $\omega^y \sim a'' \in A''$. Hence

\[ x > y \rightarrow \omega^x \gg \omega^y \rightarrow a' \gg a''. \]

This is impossible since $a' < a''$. Since $F < G$, $F \mid G$ has meaning. Let $z = F \mid G$. Then $\omega^F$ is a complete set of representatives for the equivalence classes containing the elements of $A' - \{0\}$, and similarly $\omega^G$ with respect to $A''$. We now consider three cases.

Case 10.71. $r\omega^x \ge a$ for some positive real r and some $x \in F$. Let $a' \in A'$ satisfy $a' \sim \omega^x$. Then $a' \le a \le r\omega^x$ but $a' \sim \omega^x \sim r\omega^x$; hence $a \sim \omega^x$.

Case 10.72. $r\omega^x \le a$ for some positive real r and some $x \in G$. Let $a'' \in A''$ satisfy $a'' \sim \omega^x$. Then $r\omega^x \le a \le a''$ but $r\omega^x \sim \omega^x \sim a''$; hence $a \sim \omega^x$.

Case 10.73. Neither case 10.71 nor case 10.72 is satisfied. This says that $r\omega^F < a < s\omega^G$. Now let $a' \in A' - \{0\}$. Then $(\exists x \in F)\,(a' \sim \omega^x)$. In particular, for some real r, $r\omega^x \ge a'$. Similarly for $a'' \in A''$ we have $(\exists x \in G)\,(a'' \sim \omega^x)$. Hence for some positive real s, $s\omega^x \le a''$. Since $a = A' \mid A''$ this shows that the cofinality condition is satisfied for $\{0,\ r\omega^F\} \mid \{s\omega^G\}$. Hence $a = \{0,\ r\omega^F\} \mid \{s\omega^G\} = \omega^z$.

The theorem now follows immediately, for if a has minimal length in its equivalence class, then $a \sim \omega^b \rightarrow a = \omega^b$ since, as we’ve already shown, $\omega^b$ has the minimal length property.

Our next result gives some justification for the exponential notation.

Theorem 10.74.

i. $\omega^0 = 1$,

ii. $\omega^a \omega^b = \omega^{a+b}$,

iii. for ordinals a our $\omega^a$ is the same as the ordinal $\omega^a$ in the usual sense.


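Proof. By definition $\omega^0 = \{0\} \mid \emptyset = 1$. We prove ii by induction as usual. Let $a = A' \mid A''$ and $b = B' \mid B''$. Then by the formula for addition and the uniformity theorem, using the facts that $\omega^a = \{0,\ r\omega^{a'}\} \mid \{s\omega^{a''}\}$ and $\omega^b = \{0,\ r_1\omega^{b'}\} \mid \{s_1\omega^{b''}\}$, we obtain that

\[ \omega^{a+b} = \{0,\ r\omega^{a'+b},\ r_1\omega^{a+b'}\} \mid \{s\omega^{a''+b},\ s_1\omega^{a+b''}\}. \]

Similarly for multiplication we obtain that

\[ \omega^a\omega^b = \{0,\ r\omega^{a'}\omega^b,\ r_1\omega^{b'}\omega^a,\ r\omega^{a'}\omega^b + r_1\omega^{b'}\omega^a - rr_1\omega^{a'}\omega^{b'},\ s\omega^{a''}\omega^b + s_1\omega^{b''}\omega^a - ss_1\omega^{a''}\omega^{b''}\} \]
\[ \mid \{s\omega^{a''}\omega^b,\ s_1\omega^{b''}\omega^a,\ r\omega^{a'}\omega^b + s_1\omega^{b''}\omega^a - rs_1\omega^{a'}\omega^{b''},\ s\omega^{a''}\omega^b + r_1\omega^{b'}\omega^a - r_1 s\omega^{b'}\omega^{a''}\}. \]

By the inductive hypothesis this may be written

\[ \{0,\ r\omega^{a'+b},\ r_1\omega^{b'+a},\ r\omega^{a'+b} + r_1\omega^{b'+a} - rr_1\omega^{a'+b'},\ s\omega^{a''+b} + s_1\omega^{b''+a} - ss_1\omega^{a''+b''}\} \]
\[ \mid \{s\omega^{a''+b},\ s_1\omega^{b''+a},\ r\omega^{a'+b} + s_1\omega^{b''+a} - rs_1\omega^{a'+b''},\ s\omega^{a''+b} + r_1\omega^{b'+a} - r_1 s\omega^{b'+a''}\}. \]

We now show that the representations for $\omega^{a+b}$ and $\omega^a\omega^b$ are mutually cofinal. One direction is immediate since the terms for $\omega^{a+b}$ are among the terms for $\omega^a\omega^b$. Since $b > c \Rightarrow \omega^b \gg \omega^c$, the other direction follows easily by elementary reasoning with orders of magnitude. First,

\[ r\omega^{a'+b} + r_1\omega^{b'+a} - rr_1\omega^{a'+b'} \le (r + r_1)\,\omega^{\max(a'+b,\ a+b')}. \]

Next, $s\omega^{a''+b} + s_1\omega^{b''+a} - ss_1\omega^{a''+b''} < 0$ because the term containing $\omega^{a''+b''}$ dominates. Finally, in $r\omega^{a'+b} + s_1\omega^{b''+a} - rs_1\omega^{a'+b''}$ the term containing $\omega^{b''+a}$ dominates over the other terms, so it follows that for any $s_2 < s_1$ the element is above $s_2\omega^{b''+a}$. Since the same argument applies if a and b are interchanged, this verifies the cofinality.

iii follows easily by induction. Let $a = \{a'\} \mid \emptyset$. For the purpose of this proof let us temporarily use $F(c)$ instead of $\omega^c$ for our order of magnitude, and use $\omega^c$ in its usual sense in the theory of ordinals. Then $F(a) = \{0,\ rF(a')\} \mid \emptyset = \{0,\ n\omega^{a'}\} \mid \emptyset = \omega^a$. The last equality is a basic fact concerning the ordering of the ordinals.

Corollary 10.75. $\omega^a\omega^{-a} = 1$.

Theorem 10.74 gives some justification for the exponential expression. However, the main justification comes from the nice way the operations behave on the normal forms of surreal numbers, which we’ll see later.

Remark 10.76. Note that if F contains no maximum, then $r\omega^F$ may be replaced simply by $\omega^F$ because of cofinality. For suppose $r\omega^x$ is a lower element with $x \in F$. If $y \in F$ and $y > x$ then $\omega^y \gg \omega^x$; hence $\omega^y > r\omega^x$. A similar remark applies to $s\omega^G$.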

10.6 Normal form

We now obtain something which is analogous to the normal form for ordinals. Here we need transfinite sums. We shall define expressions of the form $\sum_{i<\alpha} \omega^{a_i} r_i$ where $(a_i)$ is a strictly decreasing transfinite sequence of order type $\alpha$ and $r_i$ is a real number distinct from 0 for all i.

Case 10.77. $\alpha$ is a non-limit ordinal. Let $\alpha = \beta + 1$. Then

\[ \sum_{i<\alpha} \omega^{a_i} r_i = \Bigl(\sum_{i<\beta} \omega^{a_i} r_i\Bigr) + \omega^{a_\beta} r_\beta . \]


Case 10.78. $\alpha$ is a limit ordinal. We obtain $\sum_{i<\alpha} \omega^{a_i} r_i$ in the form $F \mid G$.

A typical element of F has the form $\sum_{i \le \beta} \omega^{a_i} s_i$, where $\beta < \alpha$, such that $s_i = r_i$ for $i < \beta$ and $s_\beta = r_\beta - \varepsilon$, where $\varepsilon$ is a positive real.

Similarly a typical element of G has the form $\sum_{i \le \beta} \omega^{a_i} s_i$, where $\beta < \alpha$, such that $s_i = r_i$ for $i < \beta$ and $s_\beta = r_\beta + \varepsilon$, where $\varepsilon$ is a positive real. (We use the natural notation $\sum_{i \le \beta} \omega^{a_i} s_i$ as an alternative for $\sum_{i<\beta+1} \omega^{a_i} s_i$.)

If 0 is regarded as a limit ordinal the definition leads to the empty sum being $\emptyset \mid \emptyset = 0$. For $\alpha$ finite, the expression is just the ordinary finite sum. For $\alpha$ infinite, a proof that $F < G$ is needed in order to show that the definition makes sense. In fact, we shall show that the ordering on surreal numbers is consistent with the lexicographic ordering with respect to the a’s and r’s in the normal form. First we define the lexicographical order on expressions $\sum_{i<\alpha} \omega^{a_i} r_i$.

Let $x = \sum_{i<\alpha} \omega^{a_i} r_i$ and $y = \sum_{i<\beta} \omega^{b_i} s_i$.

Let $\gamma$ be the least ordinal such that $(a_\gamma, r_\gamma) \ne (b_\gamma, s_\gamma)$. If $\gamma < \min(\alpha, \beta)$ then $x > y$ if and only if $a_\gamma > b_\gamma$, or $a_\gamma = b_\gamma$ and $r_\gamma > s_\gamma$. If $\gamma = \beta$, then $x > y$ if and only if $r_\gamma > 0$. If $\gamma = \alpha$ then $x > y$ if and only if $s_\gamma < 0$.

Note that this is consistent with the situation for the normal form of ordinals.
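For sums with finitely many terms the comparison can be carried out mechanically. The sketch below is an illustration under strong assumptions: a sum is a finite list of `(exponent, coefficient)` pairs with rational exponents, whereas the text allows surreal exponents and transfinite length. It decides the order by the sign of the leading coefficient of $x - y$, treating an absent exponent as coefficient 0; when the coefficients at the first differing position are positive this coincides with the rule stated above. The name `compare_normal_forms` is hypothetical.

```python
def compare_normal_forms(x, y):
    """Compare two finite sums given as lists of (exponent, coefficient).

    Returns 1 if x > y, -1 if x < y and 0 if they are equal.
    Assumption: exponents are ordinary numbers (e.g. int or Fraction),
    standing in for the surreal exponents of the general theory.
    """
    diff = {}
    for a, r in x:
        diff[a] = diff.get(a, 0) + r
    for b, s in y:
        diff[b] = diff.get(b, 0) - s
    # the largest exponent with a nonzero coefficient decides the sign
    for a in sorted(diff, reverse=True):
        if diff[a] != 0:
            return 1 if diff[a] > 0 else -1
    return 0
```

For example, `compare_normal_forms([(1, 1)], [(1, 1), (0, -1)])` compares $\omega$ with $\omega - 1$ and returns `1`, matching the case $\gamma = \alpha$ with $s_\gamma < 0$.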

Theorem 10.79. The expression $\sum_{i<\alpha} \omega^{a_i} r_i$ is defined for all strictly decreasing sequences $(a_i)$ and all nonzero real $r_i$. The ordering is given by the lexicographical order. Furthermore, for all $\beta < \alpha$,

\[ \Bigl| \sum_{i<\alpha} \omega^{a_i} r_i - \sum_{i<\beta} \omega^{a_i} r_i \Bigr| \ll \omega^{a_j} \]

if $j < \beta$. (We call the latter inequality the tail property.)

Proof. We use induction on α.

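Case 10.80. $\alpha$ is a non-limit ordinal. Let $\alpha = \beta + 1$. Then

\[ \sum_{i<\alpha} \omega^{a_i} r_i = \sum_{i<\beta} \omega^{a_i} r_i + \omega^{a_\beta} r_\beta . \]

To begin with, let $x = \sum_{i<\alpha} \omega^{a_i} r_i$ and $y = \sum_{i<\alpha} \omega^{b_i} s_i$, and suppose $x > y$ in the lexicographical order. We must show that $x > y$ as surreal numbers. If $(\forall i < \beta)\,((a_i, r_i) = (b_i, s_i))$ then either $a_\beta > b_\beta$, or $a_\beta = b_\beta$ and $r_\beta > s_\beta$. In either case $\omega^{a_\beta} r_\beta > \omega^{b_\beta} s_\beta$ by elementary reasoning with orders of magnitude. Since addition preserves order, it follows that $x > y$.

Next assume $(a_\gamma, r_\gamma) \ne (b_\gamma, s_\gamma)$ for some $\gamma < \beta$ but $(a_\delta, r_\delta) = (b_\delta, s_\delta)$ for $\delta < \gamma$. Then either $a_\gamma > b_\gamma$, or $a_\gamma = b_\gamma$ and $r_\gamma > s_\gamma$. In either case $\omega^{a_\gamma} r_\gamma - \omega^{b_\gamma} s_\gamma \ge \omega^{a_\gamma} t$ for some positive real t. Hence

\[ \sum_{i<\gamma+1} \omega^{a_i} r_i - \sum_{i<\gamma+1} \omega^{b_i} s_i = \Bigl(\sum_{i<\gamma} \omega^{a_i} r_i + \omega^{a_\gamma} r_\gamma\Bigr) - \Bigl(\sum_{i<\gamma} \omega^{b_i} s_i + \omega^{b_\gamma} s_\gamma\Bigr) = \omega^{a_\gamma} r_\gamma - \omega^{b_\gamma} s_\gamma \ge \omega^{a_\gamma} t. \]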


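However, $\sum_{i<\beta} \omega^{a_i} r_i$ and $\sum_{i<\beta} \omega^{b_i} s_i$ have the tail property by the inductive hypothesis. Hence

\[ \Bigl| \sum_{i<\beta} \omega^{a_i} r_i - \sum_{i<\gamma+1} \omega^{a_i} r_i \Bigr| \ll \omega^{a_\gamma} \quad\text{and}\quad \Bigl| \sum_{i<\beta} \omega^{b_i} s_i - \sum_{i<\gamma+1} \omega^{b_i} s_i \Bigr| \ll \omega^{a_\gamma}. \]

Therefore $\sum_{i<\beta} \omega^{a_i} r_i - \sum_{i<\beta} \omega^{b_i} s_i \ge \omega^{a_\gamma} t'$ for a positive real $t'$. (We can use any $t'$ less than t.) Again, since $\omega^{b_\beta} \ll \omega^{b_\gamma} \le \omega^{a_\gamma}$, we obtain

\[ x - y = \sum_{i<\alpha} \omega^{a_i} r_i - \sum_{i<\alpha} \omega^{b_i} s_i = \sum_{i<\beta} \omega^{a_i} r_i - \sum_{i<\beta} \omega^{b_i} s_i + \omega^{a_\beta} r_\beta - \omega^{b_\beta} s_\beta \ge \omega^{a_\gamma} t'' \]

for a positive real $t''$. In particular $x > y$.

The same argument applies if only one of x and y has a representation of length $\alpha$. A slight difference in notation is needed if one is picayune, since $a_\beta$ and $b_\beta$ aren’t both present, but in any case the superiority x gains in the $\gamma$th stage is necessarily maintained, since whichever one of the above is present is still of lower order of magnitude than $\omega^{a_\gamma}$. (There’s even a possible case where $a_\gamma$ is not present. Then we use $\omega^{b_\gamma}$ instead.)

We now check the tail property. Let $\gamma < \alpha$ and $j < \gamma$. Consider $\bigl| \sum_{i<\alpha} \omega^{a_i} r_i - \sum_{i<\gamma} \omega^{a_i} r_i \bigr|$. This is

\[ \Bigl| \omega^{a_\beta} r_\beta + \sum_{i<\beta} \omega^{a_i} r_i - \sum_{i<\gamma} \omega^{a_i} r_i \Bigr| . \]

If $\gamma = \beta$ this reduces to $|\omega^{a_\beta} r_\beta|$, which is certainly of lower magnitude than $\omega^{a_j}$. Otherwise $|\omega^{a_\beta} r_\beta|$ is certainly still of lower magnitude than $\omega^{a_j}$, and so is $\bigl| \sum_{i<\beta} \omega^{a_i} r_i - \sum_{i<\gamma} \omega^{a_i} r_i \bigr|$ by the inductive hypothesis. Hence the absolute value of the sum is also, and we’re done.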

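Case 10.81. $\alpha$ is a limit ordinal.

By the inductive hypothesis the ordering for elements in $F \cup G$ is given by the lexicographical ordering; hence $F < G$.

We now verify the tail property. Let $\beta < \alpha$ and $j < \beta$. We must consider $\bigl| \sum_{i<\alpha} \omega^{a_i} r_i - \sum_{i<\beta} \omega^{a_i} r_i \bigr|$. Now among the typical elements of F in the representation of $\sum_{i<\alpha} \omega^{a_i} r_i$ is $\sum_{i \le j} \omega^{a_i} r_i - \omega^{a_j}\varepsilon$. (One must be cautious in reasoning with these infinite sums. Other results which may appear to be just as obvious might require a technical proof because of our specialized definition of infinite sums.) Similarly, among the typical elements of G is $\sum_{i \le j} \omega^{a_i} r_i + \omega^{a_j}\varepsilon$. Hence

\[ \sum_{i \le j} \omega^{a_i} r_i - \omega^{a_j}\varepsilon < \sum_{i<\alpha} \omega^{a_i} r_i < \sum_{i \le j} \omega^{a_i} r_i + \omega^{a_j}\varepsilon . \]

By the lexicographical order and the inductive hypothesis, we have

\[ \sum_{i \le j} \omega^{a_i} r_i - \omega^{a_j}\varepsilon < \sum_{i<\beta} \omega^{a_i} r_i < \sum_{i \le j} \omega^{a_i} r_i + \omega^{a_j}\varepsilon . \]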


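Therefore

\[ \Bigl| \sum_{i<\alpha} \omega^{a_i} r_i - \sum_{i<\beta} \omega^{a_i} r_i \Bigr| < \omega^{a_j}(2\varepsilon). \]

Since $\varepsilon$ is an arbitrary positive real, this shows that $\bigl| \sum_{i<\alpha} \omega^{a_i} r_i - \sum_{i<\beta} \omega^{a_i} r_i \bigr| \ll \omega^{a_j}$ as desired. The proof that the lexicographical order is the correct order is similar to the proof in the non-limit ordinal case. As before, let $x = \sum_{i<\alpha} \omega^{a_i} r_i$ and $y = \sum_{i<\alpha} \omega^{b_i} s_i$, and suppose $x > y$ in the lexicographical order. Suppose $(a_\delta, r_\delta) = (b_\delta, s_\delta)$ for all $\delta < \gamma$ for some $\gamma < \alpha$, but $(a_\gamma, r_\gamma) \ne (b_\gamma, s_\gamma)$. Then $\sum_{i \le \gamma} \omega^{a_i} r_i - \sum_{i \le \gamma} \omega^{b_i} s_i \ge \omega^{a_\gamma} t$ for some positive real t. By the tail property,

\[ \Bigl| \sum_{i<\alpha} \omega^{a_i} r_i - \sum_{i \le \gamma} \omega^{a_i} r_i \Bigr| \ll \omega^{a_\gamma} \quad\text{and}\quad \Bigl| \sum_{i<\alpha} \omega^{b_i} s_i - \sum_{i \le \gamma} \omega^{b_i} s_i \Bigr| \ll \omega^{b_\gamma} \le \omega^{a_\gamma}. \]

Therefore $\sum_{i<\alpha} \omega^{a_i} r_i - \sum_{i<\alpha} \omega^{b_i} s_i \ge \omega^{a_\gamma} t'$ for some positive real $t'$.

If only one of x and y has a representation of length $\alpha$, then the earlier remark in the non-limit ordinal case remains valid.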

The next theorem shows the importance of the transfinite sums we’ve been discussing.

Theorem 10.82. Every surreal number can be expressed uniquely in the form $\sum_{i<\alpha} \omega^{a_i} r_i$.

Proof. Uniqueness is immediate from the fact that the ordering is given by the lexicographical ordering.

Now let x be an arbitrary nonzero surreal number. We know by theorem 10.70 that $|x| \sim \omega^a$ for some a. Let S be the set of all real numbers s such that $s\omega^a \le x$. Since $|x| \sim \omega^a$, S is nonempty and is bounded above. Let r be the least upper bound of S. Then $(r + \varepsilon)\,\omega^a > x$ and $(r - \varepsilon)\,\omega^a < x$ for all positive real $\varepsilon$; hence

\[ |x - \omega^a r| \ll \omega^a . \]

Since $|x| \sim \omega^a$ it follows that $r \ne 0$. It’s clear that the above property determines r uniquely. For convenience in the proof we shall use the notation $A(x) = \omega^a r$ if $x \ne 0$.

Now assume that x cannot be expressed in the form $\sum_{i<\alpha} \omega^{a_i} r_i$. We define a sequence $(a_i, r_i)$ where i runs through all the ordinals. Suppose $(a_i, r_i)$ is defined for all $i < \alpha$. Then $A\bigl(x - \sum_{i<\alpha} \omega^{a_i} r_i\bigr) = \omega^{a_\alpha} r_\alpha$.

Intuitively speaking, we are getting better and better approximations to x as $\alpha$ is increasing. We first show that the a’s are decreasing, so that the sums make sense.

First let $\alpha = \beta + 1$. Then $A\bigl(x - \sum_{i<\beta} \omega^{a_i} r_i\bigr) = \omega^{a_\beta} r_\beta$, and

\[ \omega^{a_\alpha} r_\alpha = A\Bigl(x - \sum_{i<\alpha} \omega^{a_i} r_i\Bigr) = A\Bigl(\Bigl(x - \sum_{i<\beta} \omega^{a_i} r_i\Bigr) - \omega^{a_\beta} r_\beta\Bigr) \ll \omega^{a_\beta} \]

by the inductive definition of $(a_\beta, r_\beta)$. Hence $a_\alpha < a_\beta$.


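Now let $\alpha$ be a limit ordinal and let $\beta < \alpha$. We already know that

\[ \Bigl| x - \sum_{i \le \beta} \omega^{a_i} r_i \Bigr| \sim A\Bigl(x - \sum_{i \le \beta} \omega^{a_i} r_i\Bigr) \ll \omega^{a_\beta} . \]

By the tail property $\bigl| \sum_{i<\alpha} \omega^{a_i} r_i - \sum_{i \le \beta} \omega^{a_i} r_i \bigr| \ll \omega^{a_\beta}$. Hence $\bigl| x - \sum_{i<\alpha} \omega^{a_i} r_i \bigr| \ll \omega^{a_\beta}$. Therefore $\omega^{a_\alpha} r_\alpha = A\bigl(x - \sum_{i<\alpha} \omega^{a_i} r_i\bigr) \ll \omega^{a_\beta}$. So finally $a_\alpha < a_\beta$.

Since by hypothesis x cannot be expressed as a sum, $\sum_{i<\alpha} \omega^{a_i} r_i$ has meaning for all $\alpha$.

We next show that $\ell\bigl(\sum_{i<\alpha} \omega^{a_i} r_i\bigr) \ge \alpha$ for any general sum. Although this inequality is crude, it suffices for our immediate purpose.

Let $\alpha < \beta$. Then the elements of F and G used in the representation of $\sum_{i<\alpha} \omega^{a_i} r_i$ are also used in the representation of $\sum_{i<\beta} \omega^{a_i} r_i$. Hence the former is an initial segment of the latter and thus it has smaller length. The uniqueness of representations as sums guarantees that the length is strictly smaller. This is enough to verify the inequality, since every strictly increasing function f from ordinals to ordinals necessarily satisfies $f(x) \ge x$ for all x.

If $\alpha$ is a limit ordinal we already know that $\bigl| x - \sum_{i<\alpha} \omega^{a_i} r_i \bigr| \ll \omega^{a_\beta}$ for any $\beta < \alpha$. This shows that x satisfies $F < x < G$ for the F and G used in the representation of $\sum_{i<\alpha} \omega^{a_i} r_i$. Hence $\ell\bigl(\sum_{i<\alpha} \omega^{a_i} r_i\bigr) < \ell(x)$. This is true for all $\alpha$. In view of the earlier inequality this implies that $\ell(x)$ is above every ordinal, which is absurd. A contradiction.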

We’ve now established the normal form of surreal numbers. The usual representation of ordinals in terms of powers of $\omega$ is a special case of this, since finite sums correspond to ordinary addition and this agrees with ordinal addition if terms are arranged so that there is no absorption.

Next we shall show the fundamental fact that the basic operations can be performed on elements in the normal form analogously to the usual operations on polynomials. This is the main justification for the summation as well as the exponential notation.

If $x = \sum_{i<\alpha} \omega^{a_i} r_i$, we shall call $\alpha$ the normal length of x, abbreviated $n(x)$. This is quite different from $\ell(x)$, which was defined at the beginning. We shall study $n(x)$ in more detail later.

We now need some lemmas which will help us to deal with the normal form.

Lemma 10.83. $\omega^a r = \{\omega^a (r - \varepsilon)\} \mid \{\omega^a (r + \varepsilon)\}$ where $\varepsilon$ is an arbitrary positive real number.

Proof. We know that $r = \{r - \varepsilon\} \mid \{r + \varepsilon\}$ for any real r. Also, $\omega^a = \{0,\ s\omega^{a'}\} \mid \{t\omega^{a''}\}$ by definition, where s and t are arbitrary positive reals. Hence $\omega^a r$ has the form

\[ \{\omega^a (r - \varepsilon),\ \omega^a (r - \varepsilon) + s\omega^{a'}\varepsilon,\ \omega^a (r + \varepsilon) - t\omega^{a''}\varepsilon\} \mid \{\omega^a (r + \varepsilon),\ \omega^a (r + \varepsilon) - s\omega^{a'}\varepsilon,\ \omega^a (r - \varepsilon) + t\omega^{a''}\varepsilon\}. \]


Let $\varepsilon_1$ be positive, real, and less than $\varepsilon$. Since $\omega^{a'} \ll \omega^a \ll \omega^{a''}$, it’s immediate that the lower terms are below $\omega^a (r - \varepsilon_1)$ and the upper terms above $\omega^a (r + \varepsilon_1)$. Hence the result follows by the cofinality theorem.

The proof is a good typical example of reasoning with orders of magnitude and cofinality. This technique helps to give the normal form its tractability.

Lemma 10.84. $\sum_{i \le \alpha} \omega^{a_i} r_i = \bigl\{\sum_{i<\alpha} \omega^{a_i} r_i + \omega^{a_\alpha}(r_\alpha - \varepsilon)\bigr\} \mid \bigl\{\sum_{i<\alpha} \omega^{a_i} r_i + \omega^{a_\alpha}(r_\alpha + \varepsilon)\bigr\}$.

Proof. Assume first that $\alpha$ is a limit ordinal. Then since $\sum_{i \le \alpha} \omega^{a_i} r_i = \sum_{i<\alpha} \omega^{a_i} r_i + \omega^{a_\alpha} r_\alpha$, we can obtain the representation of $\sum_{i \le \alpha} \omega^{a_i} r_i$ by using the definition for the first addend and lemma 10.83 for the second addend. Typical lower terms are $\sum_{i \le \beta} \omega^{a_i} r_i - \omega^{a_\beta}\varepsilon + \omega^{a_\alpha} r_\alpha$, where $\beta < \alpha$, and $\sum_{i<\alpha} \omega^{a_i} r_i + \omega^{a_\alpha}(r_\alpha - \varepsilon)$. The latter terms are clearly cofinal by the lexicographical order. Since the argument is similar for the upper terms, the result follows in this case.

In general, every ordinal $\alpha$ has the form $\alpha' + n$ where $\alpha'$ is a limit ordinal. We can now use induction on n. For convenience of notation we can assume the result for $\sum_{i \le \alpha} \omega^{a_i} r_i$ and then prove it for $\sum_{i \le \alpha+1} \omega^{a_i} r_i$. The proof is now similar to the above. The only difference is that the typical lower terms which are discarded because of cofinality now have the form $\sum_{i \le \alpha} \omega^{a_i} r_i - \omega^{a_\alpha}\varepsilon$ by the inductive hypothesis; the upper terms are similar.

It’s convenient to extend the definition of $\sum_{i<\alpha} \omega^{a_i} r_i$ to the case where $r_i$ may take the value 0. In fact, we use exactly the same definition, but we of course no longer have unique representation.

Lemma 10.85. Let $r_i$ be a sequence of length $\alpha$, and let $\{n_i\}$ for $i < \beta$ be the subsequence of i’s such that $r_i \ne 0$. Furthermore, suppose $b_i = a_{n_i}$ and $s_i = r_{n_i}$. Then

\[ \sum_{i<\alpha} \omega^{a_i} r_i = \sum_{i<\beta} \omega^{b_i} s_i . \]

Proof. Although this appears to be completely trivial, a proof is needed because of the special definition of infinite summation. Essentially we must show that including terms with $r_i = 0$ doesn’t affect sums. We do this inductively on the length of the partial sums of $\sum_{i<\alpha} \omega^{a_i} r_i$. For a non-limit ordinal this is clear since we are dealing with ordinary addition. For limit ordinals caution is required since in the expression $\sum_{i<\alpha} \omega^{a_i} r_i$ the $\beta$th term plays a role even if $r_\beta = 0$, since it leads to elements in $F \cup G$. Consider, for instance, $\sum_{i<\beta+\omega} \omega^{a_i} r_i = \sum_{i \le \beta} \omega^{a_i} r_i$ where $r_i = 0$ for $\beta \le i < \beta + \omega$. The left-hand side is defined in terms of an F and G whereas the right-hand side is an ordinary sum. A typical lower term in the definition of $\sum_{i<\beta+\omega} \omega^{a_i} r_i$ is $\sum_{i \le \gamma} \omega^{a_i} r_i - \omega^{a_\gamma}\varepsilon$ with $\gamma < \beta + \omega$, and there’s a similar expression for a typical upper term. By lemma 10.84 the right-hand side is $\bigl\{\sum_{i<\beta} \omega^{a_i} r_i - \omega^{a_\beta}\varepsilon\bigr\} \mid \bigl\{\sum_{i<\beta} \omega^{a_i} r_i + \omega^{a_\beta}\varepsilon\bigr\}$. Hence by the cofinality theorem the right-hand side equals the left-hand side.

The case just considered represents a transition where we invoked lemma 10.84. In all other cases cofinality is all that is needed.


If we are given two surreal numbers, the above lemma permits us to write them in the form $\sum_{i<\alpha} \omega^{a_i} r_i$ and $\sum_{i<\alpha} \omega^{a_i} s_i$, using the same $\alpha$ and $a_i$’s, by inserting zeros where needed.

Lemma 10.86 (the associative law). $\sum_{i<\alpha+\beta} \omega^{a_i} r_i = \sum_{i<\alpha} \omega^{a_i} r_i + \sum_{j<\beta} \omega^{a_{\alpha+j}} r_{\alpha+j}$.

Proof. We use induction on $\beta$. If $\beta$ is a non-limit ordinal this is just an instance of the ordinary associative law. If $\beta$ is a limit ordinal we compute the right-hand side using the representation in the definition for the right addend, whereas for the left addend we use the representation in the definition or in lemma 10.84, depending on whether $\alpha$ is a limit or non-limit ordinal. If $\sum_{j<\beta} \omega^{a_{\alpha+j}} r_{\alpha+j} = F \mid G$ then by cofinality the right-hand side may be expressed as $\bigl\{\sum_{i<\alpha} \omega^{a_i} r_i + F\bigr\} \mid \bigl\{\sum_{i<\alpha} \omega^{a_i} r_i + G\bigr\}$. (We invoke the lexicographical order and reason as in the proof of lemma 10.84.) Thus a typical lower term has the form

\[ \sum_{i<\alpha} \omega^{a_i} r_i + \sum_{j \le \gamma} \omega^{a_{\alpha+j}} r_{\alpha+j} - \varepsilon\,\omega^{a_{\alpha+\gamma}} \quad\text{for } \gamma < \beta, \]

which by the inductive hypothesis is $\sum_{i \le \alpha+\gamma} \omega^{a_i} r_i - \varepsilon\,\omega^{a_{\alpha+\gamma}}$. A typical upper term can be written similarly. But this is cofinal in the representation for $\sum_{i<\alpha+\beta} \omega^{a_i} r_i$.

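The above lemmas show that, in spite of the apparently artificial definition of infinite sums, they behave in many ways in a manner expected of sums. We now come to the important fact that formal polynomial addition works.

Theorem 10.87. $\sum_{i<\alpha} \omega^{a_i} r_i + \sum_{i<\alpha} \omega^{a_i} s_i = \sum_{i<\alpha} \omega^{a_i} (r_i + s_i)$.

Proof. First note that lemma 10.85 allows us to express the fact that formal polynomial addition works in this convenient form. As usual we use induction on $\alpha$. If $\alpha = \beta + 1$ this is immediate. In fact,

\[ \sum_{i<\alpha} \omega^{a_i} r_i + \sum_{i<\alpha} \omega^{a_i} s_i = \sum_{i<\beta} \omega^{a_i} r_i + \omega^{a_\beta} r_\beta + \sum_{i<\beta} \omega^{a_i} s_i + \omega^{a_\beta} s_\beta . \]

By the inductive hypothesis this is $\sum_{i<\beta} \omega^{a_i} (r_i + s_i) + \omega^{a_\beta} r_\beta + \omega^{a_\beta} s_\beta = \sum_{i<\alpha} \omega^{a_i} (r_i + s_i)$, using the ordinary distributive law and the definition of summation. Now suppose that $\alpha$ is a limit ordinal. One typical lower element of the sum is

\[ \Bigl(\sum_{i \le \beta} \omega^{a_i} r_i - \omega^{a_\beta}\varepsilon\Bigr) + \sum_{i<\alpha} \omega^{a_i} s_i = \Bigl(\sum_{i \le \beta} \omega^{a_i} r_i - \omega^{a_\beta}\varepsilon\Bigr) + \sum_{i \le \beta} \omega^{a_i} s_i + \sum_{\beta<i<\alpha} \omega^{a_i} s_i \]

by lemma 10.86. ($\sum_{\beta<i<\alpha} \omega^{a_i} s_i$ has the natural meaning $\sum_{i<\alpha-(\beta+1)} \omega^{a_{\beta+1+i}} s_{\beta+1+i}$.) By the inductive hypothesis this is $\sum_{i \le \beta} \omega^{a_i} (r_i + s_i) - \omega^{a_\beta}\varepsilon + \sum_{\beta<i<\alpha} \omega^{a_i} s_i$. By the lexicographical order this is mutually cofinal with $\sum_{i \le \beta} \omega^{a_i} (r_i + s_i) - \omega^{a_\beta}\varepsilon$, which is a typical lower element of $\sum_{i<\alpha} \omega^{a_i} (r_i + s_i)$ by definition, or by lemma 10.85 if $r_i + s_i = 0$ for some i.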
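For sums with finitely many terms, theorem 10.87 says that addition is performed exponent by exponent, exactly as for polynomials. The following sketch illustrates this under simplifying assumptions: a sum is a dictionary mapping an exponent to its coefficient, and exponents are ordinary numbers rather than arbitrary surreal numbers. The name `add_normal_forms` is hypothetical.

```python
def add_normal_forms(x, y):
    """Add two finite sums given as {exponent: coefficient} dictionaries.

    Coefficients with equal exponents are added; terms whose coefficient
    becomes 0 are dropped, in the spirit of lemma 10.85.
    """
    total = dict(x)
    for a, s in y.items():
        total[a] = total.get(a, 0) + s
    return {a: r for a, r in total.items() if r != 0}
```

For example, adding $2\omega + 3$ and $-2\omega + \omega^{-1}$, i.e. `add_normal_forms({1: 2, 0: 3}, {1: -2, -1: 1})`, leaves `{0: 3, -1: 1}`, which stands for $3 + \omega^{-1}$.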


We now turn to multiplication, and prove the remarkable fact that formal polyno-mial multiplication works. This is, of course, the main justification for the exponentialnotation and the normal form.

First we prove a special case which can be thought of as the infinite distributive

law.Lemma 10.88. ωb

i<α ωairi

=

i<α ωb+airi.

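Proof. We use induction on $\alpha$. If $\alpha = \beta + 1$ then we have

$$\omega^{b}\sum_{i<\alpha}\omega^{a_i}r_i = \omega^{b}\Big(\sum_{i<\beta}\omega^{a_i}r_i + \omega^{a_\beta}r_\beta\Big) = \omega^{b}\sum_{i<\beta}\omega^{a_i}r_i + \omega^{b}\omega^{a_\beta}r_\beta$$

by the ordinary distributive law. By the inductive hypothesis and theorem 10.74 this is $\sum_{i<\beta}\omega^{b+a_i}r_i + \omega^{b+a_\beta}r_\beta = \sum_{i<\alpha}\omega^{b+a_i}r_i$. Of course we need the fact that addition preserves order so that $(b + a_i)$ is also a strictly decreasing sequence for the above to make sense.

Now suppose $\alpha$ is a limit ordinal. We then compute the product using the standard representations $\omega^{b} = \{0, \omega^{b'}s\}|\{\omega^{b''}t\}$ and $\sum_{i<\alpha}\omega^{a_i}r_i = \{\sum_{i<\beta}\omega^{a_i}r_i - \omega^{a_\beta}\varepsilon\}|\{\sum_{i<\beta}\omega^{a_i}r_i + \omega^{a_\beta}\varepsilon\}$. In order to simplify the notation, if the latter is written in the form $d = \{d'\}|\{d''\}$ then both $d - d'$ and $d'' - d$ have the form $\omega^{a_\beta}\varepsilon + c$ where $|c| \ll \omega^{a_\beta}$. Hence for $\varepsilon_1 < \varepsilon < \varepsilon_2$ they are between $\omega^{a_\beta}\varepsilon_1$ and $\omega^{a_\beta}\varepsilon_2$. We can write the product in the form

$$\{\omega^{b}d',\ \omega^{b}d' + \omega^{b'}s\,(d-d'),\ \omega^{b}d'' - \omega^{b''}t\,(d''-d)\}\,|\,\{\omega^{b}d'',\ \omega^{b}d' + \omega^{b''}t\,(d-d'),\ \omega^{b}d'' - \omega^{b'}s\,(d''-d)\}.$$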

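Since $\omega^{b'} \ll \omega^{b} \ll \omega^{b''}$, we obtain by elementary reasoning with orders of magnitude that

$$\omega^{b''}t\,(d''-d) + \omega^{b'}s\,(d-d') \ge \omega^{b''}t\,\omega^{a_\beta}\varepsilon_1 > \omega^{b}\omega^{a_\beta}(2\varepsilon_2) \ge \omega^{b}(d''-d').$$

Hence $\omega^{b}d'' - \omega^{b''}t\,(d''-d) \le \omega^{b}d' + \omega^{b'}s\,(d-d')$. Similarly $\omega^{b''}t\,(d-d') + \omega^{b'}s\,(d''-d) \ge \omega^{b}(d''-d')$; hence $\omega^{b}d'' - \omega^{b'}s\,(d''-d) \le \omega^{b}d' + \omega^{b''}t\,(d-d')$. Therefore by cofinality $\omega^{b}d$ can be expressed in the form $\{\omega^{b}d',\ \omega^{b}d' + \omega^{b'}s\,(d-d')\}|\{\omega^{b}d'',\ \omega^{b}d'' - \omega^{b'}s\,(d''-d)\}$. Let $d_1'$ and $d_1''$ be the lower and upper elements respectively corresponding to $\varepsilon_1$ for the same $\beta$. Then

$$\omega^{b}(d_1'-d') = \omega^{b}\big(\omega^{a_\beta}(\varepsilon-\varepsilon_1)\big) > \omega^{b'}s\,\omega^{a_\beta}\varepsilon_2 \ge \omega^{b'}s\,(d-d').$$

Hence $\omega^{b}d_1' > \omega^{b}d' + \omega^{b'}s\,(d-d')$. Similarly $\omega^{b}(d''-d_1'') > \omega^{b'}s\,(d''-d)$. Hence $\omega^{b}d_1'' < \omega^{b}d'' - \omega^{b'}s\,(d''-d)$.

Thus again we can simplify by cofinality and obtain $\omega^{b}d = \{\omega^{b}d'\}|\{\omega^{b}d''\}$. Finally we can use the inductive hypothesis and obtain that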

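$$\omega^{b}d = \Big\{\omega^{b}\Big(\sum_{i<\beta}\omega^{a_i}r_i - \omega^{a_\beta}\varepsilon\Big)\Big\}\,\Big|\,\Big\{\omega^{b}\Big(\sum_{i<\beta}\omega^{a_i}r_i + \omega^{a_\beta}\varepsilon\Big)\Big\} = \Big\{\sum_{i<\beta}\omega^{b+a_i}r_i - \omega^{b+a_\beta}\varepsilon\Big\}\,\Big|\,\Big\{\sum_{i<\beta}\omega^{b+a_i}r_i + \omega^{b+a_\beta}\varepsilon\Big\},$$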


which by definition is $\sum_{i<\alpha}\omega^{b+a_i}r_i$.

We are now ready to consider formal polynomial multiplication. If $x = \sum_{i<\alpha}\omega^{a_i}r_i$ and $y = \sum_{i<\beta}\omega^{b_i}s_i$ then we define the formal product $x\cdot y$ to be $\sum_{i<\alpha}\sum_{j<\beta}\omega^{a_i+b_j}r_is_j$. By lemma 10.57 each exponent $a_i + b_j$ occurs only finitely many times and the set of all $a_i + b_j$ is well-ordered. Hence the expression has meaning. (To be technical, when we consider an expression such as $\sum_{i<\alpha}\omega^{a_i}r_i$ we are applying the lemma to the positive elements $a_0 - a_i$. This is adequate because we are dealing with binary products only. In more general situations we want $a_0$ to be 0 to avoid trouble.)

It’s well-known and easy to verify that with respect to formal polynomial multiplication and addition one gets a ring. In fact, it’s an ordered ring with respect to the ordering we have.
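The definition of the formal product can be tried out on finite sums. The sketch below is only an illustration under simplifying assumptions: a series is stored as a Python dictionary mapping a rational exponent to its coefficient, and the helper name `formal_product` is hypothetical rather than notation from the text. It collects the terms $\omega^{a_i+b_j}r_is_j$ and drops exponents whose coefficients cancel.

```python
from fractions import Fraction


def formal_product(x, y):
    """Formal product of two finite series given as {exponent: coefficient}."""
    out = {}
    for a, r in x.items():
        for b, s in y.items():
            # Each pair of terms contributes omega^(a+b) * r * s.
            out[a + b] = out.get(a + b, 0) + r * s
    # Drop cancelled exponents and list the rest in decreasing order.
    kept = ((e, c) for e, c in out.items() if c != 0)
    return dict(sorted(kept, reverse=True))


# (omega + 1)(omega - 1): the omega^1 terms cancel.
product = formal_product({Fraction(1): 1, Fraction(0): 1},
                         {Fraction(1): 1, Fraction(0): -1})
```

The cancelled $\omega^{1}$ term is an instance of the situation where an exponent of the form $a_i + b_j$ does not actually appear in the product.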

Theorem 10.89.

$$\Big(\sum_{i<\alpha}\omega^{a_i}r_i\Big)\Big(\sum_{i<\beta}\omega^{b_i}s_i\Big) = \sum_{i<\alpha}\sum_{j<\beta}\omega^{a_i+b_j}r_is_j,$$

i.e. the product agrees with the formal product.

Proof. Again we use induction. Also we tentatively use the symbol $\cdot$ for formal multiplication.

First suppose that either $\alpha$ or $\beta$ is a non-limit ordinal. Assume $\alpha = \gamma + 1$. The same argument applies if $\beta = \gamma + 1$. Then

$$\Big(\sum_{i<\alpha}\omega^{a_i}r_i\Big)\Big(\sum_{i<\beta}\omega^{b_i}s_i\Big) = \Big(\sum_{i<\gamma}\omega^{a_i}r_i + \omega^{a_\gamma}r_\gamma\Big)\Big(\sum_{i<\beta}\omega^{b_i}s_i\Big) = \Big(\sum_{i<\gamma}\omega^{a_i}r_i\Big)\Big(\sum_{i<\beta}\omega^{b_i}s_i\Big) + \omega^{a_\gamma}r_\gamma\sum_{i<\beta}\omega^{b_i}s_i$$

by the ordinary distributive law. We now apply the inductive hypothesis to the left addend and lemma 10.88 to the right addend to obtain

$$\sum_{i<\gamma}\sum_{j<\beta}\omega^{a_i+b_j}r_is_j + \sum_{i<\beta}\omega^{a_\gamma+b_i}r_\gamma s_i.$$

The argument is now completed by theorem 10.87 which tells us that formal addition works.

Now suppose that $\alpha$ and $\beta$ are both limit ordinals. We simplify the notation as follows: $y = \sum_{i<\alpha}\omega^{a_i}r_i$, $z = \sum_{i<\beta}\omega^{b_i}s_i$, and $y_0$, $z_0$ are lower or upper elements in the representation of $y$ and $z$ respectively in the basic definition. Then a typical lower or upper element of $yz$ has the form $yz_0 + y_0z - y_0z_0$. (As at times in the past we use a unified notation since the four kinds of terms involved are dealt with similarly.) By the inductive hypothesis this may be written

$$y\cdot z_0 + y_0\cdot z - y_0\cdot z_0 = (y\cdot z) - (y-y_0)\cdot(z-z_0)$$

by ordinary algebra. (Recall that we already know that surreal addition agrees with formal addition.) Recall now that we get a lower element for $yz$ if and only if $y_0$ and $z_0$ are on the same side.


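Now $y - y_0$ has the form $\pm\omega^{a_\gamma}\varepsilon_1 + c_1$ for some $\gamma < \alpha$, some positive real $\varepsilon_1$, and some $|c_1| \ll \omega^{a_\gamma}$. Similarly $z - z_0$ has the form $\pm\omega^{b_\delta}\varepsilon_2 + c_2$ for some $\delta < \beta$, some positive real $\varepsilon_2$, and some $|c_2| \ll \omega^{b_\delta}$. Then $(y-y_0)\cdot(z-z_0)$ has the form $\pm\omega^{a_\gamma+b_\delta}\varepsilon_1\varepsilon_2 + c_3$, where $|c_3| \ll \omega^{a_\gamma+b_\delta}$. By the sign rule for multiplication for lower elements we have a plus in front of $\omega^{a_\gamma+b_\delta}$ and for upper elements a minus. By mutual cofinality we can now write $yz = \{y\cdot z - \omega^{a_\gamma+b_\delta}\varepsilon\}|\{y\cdot z + \omega^{a_\gamma+b_\delta}\varepsilon\}$. (Mutual cofinality follows from the observation that if $|c_1|, |c_2| \ll \omega^{a}$ and $r_1 < r_2$ are two real numbers then $\omega^{a}r_1 + c_1 < \omega^{a}r_2 + c_2$.)

We must now show that the right-hand side is $y\cdot z$. Now $y\cdot z = \{y\cdot z - \omega^{a_\mu}\varepsilon\}|\{y\cdot z + \omega^{a_\mu}\varepsilon\}$, where $a_\mu$ is a typical exponent in the series for $y\cdot z$. This again follows by cofinality. By lemma 10.84 this is valid even if $y\cdot z$ has a last term. Now every exponent of $y\cdot z$ has the form $a_\gamma + b_\delta$ (though the converse isn’t necessarily valid because of the possibility of cancellation). Hence by cofinality $y\cdot z$ does equal the right-hand side in the representation of $yz$, i.e. $y\cdot z = yz$.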

Remark 10.90. Note that in view of the above remark about the converse we do not have mutual cofinality. Fortunately, since the required inequality is trivially satisfied, we don’t need it. At any rate, although very often it makes no difference, in general we must be careful as to which cofinality theorem is being used. For example, in the first part of the proof, we need mutual cofinality since otherwise we would require an inequality which isn’t at all obvious.

Theorems 10.87 and 10.89 give us a powerful tool for dealing with surreal numbers. In fact, for many purposes we can simply work with these generalized power series and ignore what surreal numbers are in the first place. This is an example of the whole spirit of abstraction in mathematics. However, there are limits to what can be accomplished by general power series methods since the surreal numbers are somewhat special. Note, for example, that the class of exponents is precisely the class of all surreal numbers, which in itself is unusual.

Let us see what the power series methods accomplish. First we have an alternative way of dealing with inverses and square roots which is much easier than the direct methods used in section 10.2. Let us consider, for example, the inverse. The essential idea is as follows. Let $x = \omega^{a_0}r_0\big(\sum_{i<\alpha}\omega^{b_i}s_i\big)$ where $b_i = a_i - a_0$ and $s_i = \frac{r_i}{r_0}$. Since the inverse of $\omega^{a_0}$ is $\omega^{-a_0}$ and since $r_0$, of course, has an inverse, it suffices to find the inverse of expressions of the form $\sum_{i<\alpha}\omega^{b_i}s_i$ where $b_0 = 0$ and $s_0 = 1$, i.e. of series which begin with 1. In fact, if $1 + \sum_{i<\alpha}\omega^{a_i}r_i$ is a series beginning with 1, we get the inverse by formally substituting $\sum_{i<\alpha}\omega^{a_i}r_i$ for $x$ in $1 - x + x^2 - x^3 + \dots$. First, by lemma 10.57 this leads to a series which has meaning so that we obtain a surreal number. Then, theorems 10.87 and 10.89 guarantee that this is the inverse of $1 + x$ since $(1 - x + x^2 - x^3 + \dots)(1 + x) = 1$ for ordinary formal series.
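The substitution into $1 - x + x^2 - x^3 + \dots$ can be sketched for a truncated toy case. The code below is an illustration under stated assumptions, not the general transfinite construction: $x$ is a finite sum with negative rational exponents, series are dictionaries `{exponent: coefficient}`, every term with exponent below a chosen `cutoff` is discarded, and the helper names (`mul`, `add`, `inverse_of_one_plus`) are hypothetical.

```python
from fractions import Fraction


def mul(p, q, cutoff):
    """Product of two series, keeping only exponents >= cutoff."""
    out = {}
    for a, r in p.items():
        for b, s in q.items():
            if a + b >= cutoff:
                out[a + b] = out.get(a + b, 0) + r * s
    return {e: c for e, c in out.items() if c != 0}


def add(p, q):
    """Sum of two series, dropping cancelled terms."""
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) + c
    return {e: c for e, c in out.items() if c != 0}


def inverse_of_one_plus(x, cutoff):
    """Inverse of 1 + x down to exponent cutoff, via 1 - x + x^2 - ..."""
    assert x and all(e < 0 for e in x), "x must be a nonzero infinitesimal"
    neg_x = {e: -c for e, c in x.items()}
    term = {Fraction(0): Fraction(1)}      # current power (-x)^k
    result = dict(term)
    while True:
        term = mul(term, neg_x, cutoff)
        if not term:                        # all further powers lie below cutoff
            return result
        result = add(result, term)


# x = 2*omega^(-1/2) + omega^(-1), truncated at exponent -3.
x = {Fraction(-1, 2): Fraction(2), Fraction(-1): Fraction(1)}
cutoff = Fraction(-3)
inv = inverse_of_one_plus(x, cutoff)
check = mul(add({Fraction(0): Fraction(1)}, x), inv, cutoff)
```

Under these assumptions `check` reduces to the single term 1, mirroring the identity $(1 - x + x^2 - \dots)(1 + x) = 1$ up to the truncation.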

There is another method of using generalized power series to obtain existence results which doesn’t depend on familiarity with identities for ordinary formal series. We shall apply this method to show that every positive surreal number has an nth root for any positive integer n. The same method can also be used to prove the existence of inverses. It’s a generalization of the well-known procedure for ordinary formal power series $\sum_{i=0}^{\infty}a_ix^i$ where the coefficients of the various powers of $x$ are obtained recursively. I like this


method because of its elementary self-contained algebraic nature. We avoid any use of analysis and in particular the binomial theorem for fractional exponents.

Theorem 10.91. Every positive surreal number has an nth root for every positive integer n.

Proof. Let $\sum_{i<\alpha}\omega^{a_i}r_i$ be a positive surreal number. This can be expressed in the form $\omega^{a_0}r_0\big(1 + \sum_{0<i<\alpha}\omega^{b_i}s_i\big)$. Now $\omega^{a_0}r_0$ has an nth root, namely $\omega^{a_0/n}\sqrt[n]{r_0}$, from theorem 10.74 and the fact that $r_0$ is positive. Hence it suffices to consider series which begin with 1.

Consider a series $1 + \sum_{i<\alpha}\omega^{a_i}r_i$. We shall express it in the form $\big(1 + \sum_{i<\beta}\omega^{x_i}y_i\big)^n$ by determining $x_i$ and $y_i$ inductively. (For ordinary power series $\alpha = \beta = \omega$ and the $a_i$’s and $x_i$’s are simply the integers. In our case the situation is slightly trickier. For example, $\beta$ might be different from $\alpha$.)

Suppose that $\big(1 + \sum_{i<\gamma}\omega^{x_i}y_i\big)^n$ agrees with $1 + \sum_{i<\alpha}\omega^{a_i}r_i$ for all terms $\omega^{z}$ where $z \ge x_i$ for some $i < \gamma$, but $\big(1 + \sum_{i<\gamma}\omega^{x_i}y_i\big)^n \ne 1 + \sum_{i<\alpha}\omega^{a_i}r_i$. (Recall that in our generalized power series the exponents are decreasing.) Then we claim that there exist $x$ and $y$ such that $x < x_i$ for all $i < \gamma$ and that $\big(1 + \sum_{i<\gamma}\omega^{x_i}y_i + \omega^{x}y\big)^n$ agrees with $1 + \sum_{i<\alpha}\omega^{a_i}r_i$ for all terms $\omega^{z}$ where $z \ge x$. Furthermore, if all $x_i$ are finite linear combinations of the $a_i$ with integral coefficients, then so is $x$. (The fact that $x_i$ is not simply the same as $a_i$ makes the process trickier than the one for ordinary power series.)

In fact, let $x$ be the first exponent for which the coefficients of $\omega^{x}$ in $\big(1 + \sum_{i<\gamma}\omega^{x_i}y_i\big)^n$ and $1 + \sum_{i<\alpha}\omega^{a_i}r_i$ differ. Then $x < x_i$ for all $i$, and the respective coefficients, $s$ and $t$, satisfy $s \ne t$. Note that $s$ or $t$ may be 0. Now consider an expression of the form $\big(1 + \sum_{i<\gamma}\omega^{x_i}y_i + \omega^{x}y\big)^n$. This agrees with $1 + \sum_{i<\alpha}\omega^{a_i}r_i$ for all terms $\omega^{z}$ where $z \ge x_i$ for some $i$. The earliest term for which there is possible disagreement is $\omega^{x}$ and, in fact, its coefficient is $s + ny$. Since $s \ne t$ there exists a nonzero $y$ satisfying $s + ny = t$. (Uniqueness doesn’t concern us.) With the above values for $x$ and $y$ the claim is clearly satisfied. Since $x$ is either of the form $a_i$ or a sum of $x_i$’s, the second condition is clearly satisfied.

We now assume that $1 + \sum_{i<\alpha}\omega^{a_i}r_i$ doesn’t have an nth root and obtain a contradiction. We use the claim above to define a sequence $(x_i, y_i)$ inductively where $i$ runs through the class of ordinals by letting $(x_\gamma, y_\gamma)$ be the pair $(x, y)$ obtained above. Since later terms have no effect on the coefficients of earlier exponents the induction works. However, since each $x_i$ is a finite sum of $a_i$’s, the collection of possible $x_i$’s is a set so that eventually the sequence $x_i$ must terminate. This contradiction proves the theorem.
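The inductive step of the proof (find the first exponent $x$ of disagreement, then solve $s + ny = t$) can be followed mechanically in a truncated toy setting. The sketch below assumes a finite target series with leading term 1 and negative rational exponents, stores series as dictionaries `{exponent: coefficient}`, and stops at a chosen `cutoff` instead of running through the ordinals; the names `mul`, `power`, and `nth_root` are hypothetical helpers.

```python
from fractions import Fraction


def mul(p, q, cutoff):
    """Product of two series, keeping only exponents >= cutoff."""
    out = {}
    for a, r in p.items():
        for b, s in q.items():
            if a + b >= cutoff:
                out[a + b] = out.get(a + b, 0) + r * s
    return {e: c for e, c in out.items() if c != 0}


def power(p, n, cutoff):
    """n-th power of a series, truncated at cutoff."""
    out = {Fraction(0): Fraction(1)}
    for _ in range(n):
        out = mul(out, p, cutoff)
    return out


def nth_root(target, n, cutoff):
    """Terms of an n-th root of target with exponents >= cutoff."""
    root = {Fraction(0): Fraction(1)}
    while True:
        current = power(root, n, cutoff)
        differing = [e for e in set(current) | set(target)
                     if e >= cutoff and current.get(e, 0) != target.get(e, 0)]
        if not differing:
            return root
        x = max(differing)                   # first exponent of disagreement
        s, t = current.get(x, 0), target.get(x, 0)
        root[x] = Fraction(t - s) / n        # the y solving s + n*y = t


# Square root of 1 + omega^(-1), truncated at exponent -3.
target = {Fraction(0): Fraction(1), Fraction(-1): Fraction(1)}
root = nth_root(target, 2, Fraction(-3))
```

In this toy case the computed coefficients are $\frac12$, $-\frac18$, $\frac1{16}$ at exponents $-1$, $-2$, $-3$, which are the first binomial coefficients for exponent $\frac12$, although the procedure itself never invokes the binomial theorem.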

Remark 10.92. It’s interesting to compare this with the classical situation in which one is interested in power series which permit fractional exponents but only series of length $\omega$. In that case one has the burden of showing that the sequence of exponents approaches $\infty$. Fortunately, we don’t have this problem. For example, consider $1 + \omega^{-1} + \omega^{-\omega} +$


$\omega^{-\omega-1} + \dots + \omega^{-\omega-n} + \dots$. This is a series of length $\omega$. If we compute the square root using the proof of theorem 10.91 we begin with $1 + \frac{1}{2\omega}$ and then obtain $1 + \frac{1}{2\omega} - \frac{1}{8\omega^2}$, etc. It’s clear that we would need a sequence of length greater than $\omega$. (For example, it would take us "forever and a day" to reach the $\omega^{-\omega}$ term!) However, the proof shows that we must eventually terminate at some ordinal.

10.6.1 Application to real closure

We shall use the same technique as in the previous section to show that the class of surreal numbers forms a real closed field. Specifically, we adapt the classical Hensel’s lemma argument to our transfinite series.

Lemma 10.93 (Variation on Hensel’s lemma). Let $f(x) = x^n + \sum_{i=1}^{n}h_ix^{n-i}$ be a polynomial of degree $n$ over the surreal numbers where $h_i$ has the form $r_i + d_i$ with $r_i$ real and $d_i$ infinitesimal. (Thus all terms in the series expansion of $h_i$ have non-positive exponents.) Suppose, furthermore, that the real part $x^n + \sum_{i=1}^{n}r_ix^{n-i}$ factors into two relatively prime polynomials $P_0$ and $Q_0$. Then $f(x) = x^n + \sum_{i=1}^{n}h_ix^{n-i}$ factors into two polynomials, $P$ and $Q$, where $P$ and $Q$ have the same degrees as $P_0$ and $Q_0$ respectively and the first terms of the series expansions of the coefficients of $P$ and $Q$ are the same as the coefficients of $P_0$ and $Q_0$ respectively.

Proof. First, by regrouping we regard the polynomial $f(x)$ as a series of the form $\sum_{i<\alpha}\omega^{a_i}s_i$ where $s_i$ is a polynomial over the reals of degree at most $n - 1$ for $i > 0$. Since a finite union of well-ordered sets is well-ordered, the $a_i$’s are well ordered. By hypothesis $a_0 = 0$ and $s_0 = x^n + \sum_{i=1}^{n}r_ix^{n-i}$.

Let the degrees of $P_0$ and $Q_0$ be $r$ and $s$ respectively so $r + s = n$. We now extend $P_0$ and $Q_0$ in an inductive manner similar to that in our construction of the nth roots.

Suppose $\big(\sum_{i<\beta}\omega^{b_i}P_i\big)\big(\sum_{i<\beta}\omega^{b_i}Q_i\big)$ agrees with $f(x)$ for all exponents $y$ such that $y \ge b_i$ for some $i < \beta$, where the $P_i$’s and $Q_i$’s are polynomials of degree at most $r - 1$ and $s - 1$ respectively for $i > 0$, but $\big(\sum_{i<\beta}\omega^{b_i}P_i\big)\big(\sum_{i<\beta}\omega^{b_i}Q_i\big) \ne f(x)$.

We shall find $b_\beta$ such that $b_\beta < b_i$ for all $i < \beta$ and polynomials $P_\beta$ and $Q_\beta$ of degrees at most $r - 1$ and $s - 1$ respectively so that $\big(\sum_{i<\beta}\omega^{b_i}P_i + \omega^{b_\beta}P_\beta\big)\big(\sum_{i<\beta}\omega^{b_i}Q_i + \omega^{b_\beta}Q_\beta\big)$ agrees with $f(x)$ for all exponents $y$ such that $y \ge b_\beta$.

Let $b_\beta$ be the first exponent $x$ for which the coefficients of $\omega^{x}$ in $\big(\sum_{i<\beta}\omega^{b_i}P_i\big)\big(\sum_{i<\beta}\omega^{b_i}Q_i\big)$ and $f(x)$ differ. Then $b_\beta < b_i$ for all $i$. Now consider the series $\sum_{i<\beta}\omega^{b_i}P_i + \omega^{b_\beta}G$ and $\sum_{i<\beta}\omega^{b_i}Q_i + \omega^{b_\beta}H$ where $G$ and $H$ are polynomials to be determined later. Then $\big(\sum_{i<\beta}\omega^{b_i}P_i + \omega^{b_\beta}G\big)\big(\sum_{i<\beta}\omega^{b_i}Q_i + \omega^{b_\beta}H\big)$ agrees with $f(x)$ for all terms up to $\omega^{b_\beta}$. The condition for agreement for the coefficients of $\omega^{b_\beta}$ is an equation of the form $HP_0 + GQ_0 = S$ for some polynomial $S$ of degree at most $n - 1$ because of the bounds on the degrees of $P_i$ and $Q_i$. Since $P_0$ and $Q_0$ are relatively prime there exist $G$ of degree at most $r - 1$ and $H$ of degree at most $s - 1$ satisfying the above equation. Let $P_\beta = G$ and $Q_\beta = H$. Also, as in the case of nth roots, if the $b_i$’s are finite sums of the $a_i$ then so is $b_\beta$. The rest of the argument is identical to the argument for nth roots.


We regroup at the end so that we end up with monic polynomials of degree $r$ and $s$ over the surreals. (The justification for regrouping is the same as for ordinary polynomials in two variables. Of course the existence of bounds for the degrees of the polynomials is crucial for the regrouping to make sense.)

We are now ready for the main result of this section.

Theorem 10.94. Every polynomial equation of odd degree with surreal coefficients has a root. Furthermore the exponents which occur in the series expansion of the roots are rational linear combinations of the exponents which occur in the series expansions of the coefficients of the polynomial.

Proof. Let $P(x) = b_0x^n + b_1x^{n-1} + b_2x^{n-2} + \dots + b_n$ be a polynomial of odd degree. We may assume that $b_0 = 1$ and $b_1 = 0$ by making the substitution $x = y - \frac{b_1}{n}$. The polynomial now has the form $x^n + \sum_{i=2}^{n}a_ix^{n-i}$. Now suppose that the normal form of $a_i$ begins with $\omega^{c_i}r_i$. Assume that the polynomial is not simply $x^n$. Let $c = \max\frac{c_i}{i}$ for $i = 2, 3, \dots, n$. We now make the substitution $x = y\omega^{c}$. The equation becomes $(y\omega^{c})^n + \sum_{i=2}^{n}a_i(y\omega^{c})^{n-i} = 0$, which can be written in the form $y^n + \sum_{i=2}^{n}a_i\omega^{-ic}y^{n-i} = 0$.

The coefficient of $y^{n-i}$ begins with $(\omega^{c_i}r_i)\,\omega^{-ic}$. By choice of $c$ we have $\frac{c_i}{i} \le c$ with equality for at least one $i$, i.e. $c_i - ic \le 0$. Thus all coefficients begin with terms with non-positive exponents and at least one term begins with exponent 0.
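The effect of the substitution $x = y\omega^{c}$ on the leading exponents is easy to tabulate. The sketch below is an illustration only: it assumes the leading exponents $c_i$ of the nonzero coefficients $a_i$ are rational and given as a dictionary `{i: c_i}`, and `normalize_exponents` is a hypothetical helper name.

```python
from fractions import Fraction


def normalize_exponents(leading):
    """Return c = max c_i / i and the new leading exponents c_i - i*c."""
    c = max(Fraction(ci) / i for i, ci in leading.items())
    shifted = {i: Fraction(ci) - i * c for i, ci in leading.items()}
    return c, shifted


# x^3 - omega^2 x + omega: here c_2 = 2 and c_3 = 1.
c, shifted = normalize_exponents({2: 2, 3: 1})
```

For this example $c = 1$, and the equation becomes $y^3 - y + \omega^{-2} = 0$: every new leading exponent is non-positive and the coefficient of $y$ begins with exponent 0.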

If an odd degree polynomial is factored into irreducible factors at least one of its factors must have odd degree. Hence to prove the theorem it’s enough to show that an irreducible polynomial of odd degree must have degree one. If we apply the above construction to an irreducible polynomial the polynomial remains irreducible. Hence by the contrapositive of lemma 10.93 the real part of the polynomial doesn’t have two relatively prime factors, i.e. it has the form $(x - a)^n$ or $(x^2 + bx + c)^{n/2}$. Since the degree is odd the latter possibility is ruled out. Hence the real part of the polynomial has the form $(x - a)^n$. Since the coefficient of $x^{n-1}$ is 0, it follows that $a = 0$. Therefore the real part of the polynomial has the form $x^n$. This contradicts the fact that at least one term besides $x^n$ begins with exponent 0.

Since the assumption leads to a contradiction, the polynomial itself must be $x^n$. (We aren’t talking about the real part.) Since the polynomial is irreducible $n$ must be 1.

The last part of the theorem follows from the same proof. For this purpose we restrict ourselves to surreal numbers whose exponents are of the form referred to in the statement of the theorem.

10.6.2 Sign sequence

Our aim in this section is to obtain a formula which expresses the sign sequence for $\sum_{i<\alpha}\omega^{a_i}r_i$ in terms of the sign sequences for $a_i$ and $r_i$.


It’s natural to look first for the sign sequence for $\omega^{a}$. However, in order to carry through an induction we need to know the sign sequence for certain special finite sums along the way. Thus caution is required with the induction in order to avoid circular reasoning.

Specifically, we deal with finite sums of the form $\sum_{i<n}\omega^{a_i}r_i$, where $a_{i+1}$ is an initial segment of $a_i$ for all $i$ and $r_i$ is an integer or a dyadic fraction with numerator 1. (It’s understood that the $a_i$’s are strictly decreasing, since we are working with normal forms.)

We first need some lemmas which are roughly variations of lemmas 10.83 and 10.84. In proving these lemmas we used the fact that $r = \{r - \varepsilon\}|\{r + \varepsilon\}$, which in the case where $r$ is dyadic involves throwing out information. (This representation is cofinal but not mutually cofinal with the standard representation of $r$.) We now see what happens if we don’t throw out information. In order to cut down on duplication later we shall deal with a general dyadic although for our immediate purpose we need only integers and dyadic fractions with numerator one.

Recall that by applying cofinality to the canonical representation of a dyadic fraction $r$ we obtain $r$ in the form $\{s\}|\{t\}$ if $r$ isn’t an integer and in the form $\{r - 1\}|\emptyset$ if $r$ is a positive integer. For the special case where $r = \frac{1}{2^n}$ with $n > 0$ we have $\frac{1}{2^n} = \{0\}|\{\frac{1}{2^{n-1}}\}$.

Lemma 10.95.

i. If $r = \{r'\}|\{r''\}$, and the set of lower elements of $a$ is non-empty, then $\omega^{a}r = \{\omega^{a}r' + \omega^{a'}n\}|\{\omega^{a}r'' - \omega^{a'}n\}$ where $a'$ as usual is a typical lower element in the canonical representation of $a$ and $n$ is an arbitrary positive integer. If the set is empty then $\omega^{a}r = \{\omega^{a}r'\}|\{\omega^{a}r''\}$.

ii. If $n$ is a positive integer greater than one then $\omega^{a}n = \{\omega^{a}(n-1) + \omega^{a'}m\}|\{\omega^{a''}\varepsilon\}$ if the set of lower elements of $a$ is non-empty, where $m$ is an arbitrary positive integer, $a'$ is as before, $a''$ is as usual a typical upper element in the canonical representation of $a$, and $\varepsilon$ is an arbitrary positive dyadic fraction with numerator 1. If the set is empty then $\omega^{a}n$ is $\{\omega^{a}(n-1)\}|\{\omega^{a''}\varepsilon\}$.

Proof. i. We compute $\omega^{a}r$ as in the proof of lemma 10.83. Again $\omega^{a} = \{0, s\omega^{a'}\}|\{t\omega^{a''}\}$. Hence $\omega^{a}r$ is

$$\{\omega^{a}r',\ \omega^{a}r' + s\omega^{a'}(r-r'),\ \omega^{a}r'' - t\omega^{a''}(r''-r)\}\,|\,\{\omega^{a}r'',\ \omega^{a}r'' - s\omega^{a'}(r''-r),\ \omega^{a}r' + t\omega^{a''}(r-r')\}.$$

By cofinality we can eliminate the third terms among the upper and lower elements and replace terms such as $\omega^{a}r' + s\omega^{a'}(r-r')$ by $\omega^{a}r' + \omega^{a'}m$. Also if the set of $a'$ is non-empty we can eliminate the first terms by cofinality. Thus we get the desired form.

ii. In this case there is no $r''$. The computation is the same except that now we need the third term of the upper elements since the other terms aren’t present. Since $\omega^{a''} \gg \omega^{a}$, by cofinality this term may be replaced by $\omega^{a''}\varepsilon$.

Note that negative integers can be handled by sign reversal. Also, $r' = 0$ for dyadic fractions with numerator 1 so that the formulas simplify. If the set of $a'$’s is non-empty then a typical lower element is $\omega^{a'}n$. Otherwise 0 is the only lower element.


We are now ready to consider finite sums. A convenient representation for n-fold sums has already been mentioned in section 10.2. In fact $\sum_{i=1}^{n}a_i$ can be expressed as $\{a_1 + a_2 + \dots + a_j' + \dots + a_n\}|\{a_1 + a_2 + \dots + a_k'' + \dots + a_n\}$ where $1 \le j \le n$ and $1 \le k \le n$. We now apply this to sums of the form $\sum_{i<n}\omega^{a_i}r_i$ referred to earlier and use cofinality to simplify. It’s understood that $r_i \ne 0$ for all $i$.

Lemma 10.96. Let $\sum_{i<n}\omega^{a_i}r_i$ be a surreal number where for all $i$, $a_{i+1}$ is an initial segment of $a_i$ less than $a_i$, and $r_i$ is a dyadic fraction. Then $\sum_{i<n}\omega^{a_i}r_i$ can be expressed in the form $F|G$ where a typical element $x \in G$ is obtained as follows:

i. If $r_{n-1}$ isn’t a positive integer let $r_{n-1}''$ be the minimum upper element in the canonical representation of $r_{n-1}$. Then

$$x = \sum_{i<n-1}\omega^{a_i}r_i + \omega^{a_{n-1}}r_{n-1}'' - \omega^{a_{n-1}'}m$$

where $a_{n-1}'$ is a typical lower element in the canonical representation of $a_{n-1}$ and $m$ is an arbitrary positive integer. (If there’s no $a_{n-1}'$ the last term is omitted.)

ii. If $r_{n-1}$ is a positive integer but $r_i$ isn’t a positive integer for at least one $i$, then let $j$ be the largest index for which $r_j$ is not a positive integer. Then

$$x = \sum_{i<j}\omega^{a_i}r_i + \omega^{a_j}r_j'' - \omega^{a_j'}m$$

where $r_j''$ is the minimum upper element in the canonical representation of $r_j$, $a_j'$ is a typical lower element in the canonical representation of $a_j$, and $m$ is an arbitrary positive integer.

iii. If $r_i$ is a positive integer for all $i$, and $a_0$ isn’t an ordinal then $x = \omega^{a_0''}\delta$ where $a_0''$ is a typical upper element in the canonical representation of $a_0$ and $\delta$ is a positive dyadic fraction with numerator one.

iv. If $r_i$ is a positive integer for all $i$ and $a_0$ is an ordinal then $G$ is empty. (In fact, such an $x$ is clearly an ordinal.)

A typical element of $F$ is obtained similarly.

Proof. This follows easily from lemma 10.95 and the representation of n-fold sums discussed earlier using cofinality. Specifically, in the expression $\sum_{i<n}\omega^{a_i}r_i$ we replace $\omega^{a_k}r_k$ by $x$ for some $k$ where $x$ is an element in the representation of $\omega^{a_k}r_k$ given by lemma 10.95. We consider $G$. $F$ can be handled similarly.

In part i the situation is analogous to that of lemma 10.84, i.e. by the lexicographical order, the terms obtained by replacing $\omega^{a_{n-1}}r_{n-1}$ by $x$ are cofinal. This gives us the result. For part ii let us analyze more explicitly what happens when $\omega^{a_k}r_k$ is replaced by $x$. If $r_k$ isn’t a positive integer $\sum_{i<n}\omega^{a_i}r_i$ is replaced by $\sum_{i<k}\omega^{a_i}r_i + \omega^{a_k}r_k'' - \omega^{a_k'}m + \sum_{k<i<n}\omega^{a_i}r_i$. In particular, the first term altered is the $\omega^{a_k}$ term. If $r_k$ is a positive integer the sum is replaced by $\sum_{i<k}\omega^{a_i}r_i + \omega^{a_k''}\delta + \sum_{k<i<n}\omega^{a_i}r_i$. The higher


term $\omega^{a_k''}$ is introduced. Thus in either case when $\omega^{a_k}r_k$ is replaced by $x$ the term that is altered is at least the $\omega^{a_k}$ term.

Now if $k = j$, it follows that the $\omega^{a_j}$ term is the first one that is altered. For $k < j$ the term that is altered is necessarily higher (either $\omega^{a_k}$ or still higher). Thus the terms obtained by altering $\omega^{a_j}r_j$ are cofinal with respect to the terms obtained by altering $\omega^{a_k}r_k$ for $k < j$. (So far this may be regarded as a more detailed proof of part i if $j$ is replaced by $n - 1$.) Now suppose $k > j$. Recall that $a_k$ is an initial segment of $a_j$, and so is $a_k''$. Furthermore $a_k'' > a_j$. Since $a_k < a_j$ we appear to have an inequality going in the wrong direction to apply transitivity. Nevertheless the inequality follows from the nature of initial segments. Suppose $a_k''$ has length $\alpha$. Since $a_k'' > a_k$, we have $a_k(\alpha) = -$. Since $a_k$ is an initial segment of $a_j$, then $a_j(\alpha) = -$. Hence $a_k'' > a_j$. Thus the terms obtained by altering $\omega^{a_j}r_j$ are cofinal with respect to the terms obtained by altering $\omega^{a_k}r_k$ for $k > j$.

We have shown that the upper terms may be simplified to $\sum_{i<j}\omega^{a_i}r_i + \omega^{a_j}r_j'' - \omega^{a_j'}m + \sum_{j<i<n}\omega^{a_i}r_i$, i.e. by cofinality the only term we need to alter is the jth. If $b = \max(a_j', a_{j+1})$ then for $m$ sufficiently high $\omega^{b}m - \omega^{a_j'}m > \big|\sum_{j<i<n}\omega^{a_i}r_i\big|$.

Hence it’s clear by mutual cofinality that the above expression can be simplified to $\sum_{i<j}\omega^{a_i}r_i + \omega^{a_j}r_j'' - \omega^{a_j'}m$. This completes the proof of part ii.

In part iii all sums obtained have the form $\sum_{i<k}\omega^{a_i}r_i + \omega^{a_k''}\delta + \sum_{k<i<n}\omega^{a_i}r_i$. During our proof of part ii we already saw that $a_k''$ is an initial segment of $a_0$ and that $a_k'' > a_0$. Hence the above expression is mutually cofinal with $\omega^{a_k''}\delta$. When $k = 0$ this is simply $\omega^{a_0''}\delta$. Finally, since every element of the form $a_k''$ is also of the form $a_0''$, we need only terms for which $k = 0$, thus completing the proof of part iii.

Part iv follows immediately since there are no $a_0''$. (Recall that $a_0$ being an ordinal is equivalent to $a_0$ consisting only of pluses which is in turn equivalent to the non-existence of any $a_0''$.)

It’s interesting to contrast the situations in parts i and iii with regard to cofinality. In part i the last term contributes the cofinal part whereas the reverse is true in part iii.

We are now ready to determine the formula for the sign sequence for $\omega^{a}$. Let $a_\alpha$ be the number of pluses in the initial segment of $a$ of length $\alpha$, and let $a^{+}$ be the total number of pluses in $a$.

Theorem 10.97.

i. The sign sequence of $\omega^{a}$ is as follows. We begin with a plus. Then for each $\alpha$ we have a string of $\omega^{a_\alpha+1}$ pluses if $a(\alpha) = +$ and $\omega^{a_\alpha+1}$ minuses if $a(\alpha) = -$.

ii. The sign sequence of $\omega^{a}n$, where $n$ is a positive integer greater than 1, is obtained by beginning with the sequence for $\omega^{a}$ and following it by $\omega^{a^{+}}(n-1)$ pluses. The sign sequence of $\omega^{a}\frac{1}{2^n}$, where $n$ is a positive integer, is obtained by beginning with the sequence for $\omega^{a}$ and following it by $\omega^{a^{+}}n$ minuses. For negative coefficients we use sign reversal. (Note that we still count the pluses in $a$ since $a$ is unaltered.)

iii. The sign sequence of $\sum_{i<n}\omega^{a_i}r_i$, where $r_i$ is either an integer or a dyadic fraction with numerator one, and, for all $i$, $a_{i+1}$ is an initial segment of $a_i$, is obtained by


juxtaposing all the modified sign sequences for each $i$, where the modified sign sequence of $\omega^{a_i}r_i$ is obtained as follows: For $i = 0$ we use the rule in part ii. For $i > 0$ we apply rule ii to the element $\omega^{b_i}r_i$, where $b_i$ is obtained from $a_i$ by ignoring all minuses.

Before proving the theorem we illustrate several examples. First let $a = (+-)$. Then $\omega^{a}$ begins with a plus. Then the first term in $a$, which is $+$, gives rise to $\omega^{0+1} = \omega$ pluses and the second term, which is $-$, gives rise to $\omega^{1+1} = \omega^2$ minuses. So altogether we have $1 + \omega = \omega$ pluses followed by $\omega^2$ minuses. (In juxtaposing sequences ordinal addition is somewhat relevant.) Incidentally, since $a = \frac{1}{2}$ this is $\sqrt{\omega}$ (by the law of exponents), so that this is consistent with an example which was done in section 10.3.

Now let $a_0 = (-+-++)$, $a_1 = (-+-+)$, and $a_2 = (-+-)$. We compute $\omega^{a_0}5 + \omega^{a_1}\frac{1}{4} - \omega^{a_2}3$. By rule i we begin with a plus, $\omega$ minuses, $\omega$ pluses, $\omega^2$ minuses, $\omega^2$ pluses, and finally $\omega^3$ pluses. (The group of $\omega^2$ pluses gets absorbed by the $\omega^3$ pluses.) By rule ii this is followed by $\omega^3 \cdot 4$ pluses contributed by the 5. The contribution from $\omega^{a_1}\frac{1}{4}$ follows. By rule iii this is the sequence obtained from $\omega^{b_1}\frac{1}{4}$ where $b_1 = (++)$. This consists of $\omega^2$ pluses followed by $\omega^2 \cdot 2$ minuses. (Note that the contribution of the $\frac{1}{4}$ is the same for $\omega^{a_1}\frac{1}{4}$ and $\omega^{b_1}\frac{1}{4}$.) Finally we have $\omega \cdot 3$ minuses because of the sign reversal since we would have had $\omega \cdot 3$ pluses if the term was $+\omega^{a_2}3$.
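Rule i can be checked mechanically when $a$ has finite length. The sketch below is an illustration under that assumption: $a$ is given as a string of `'+'` and `'-'` characters, each block of the result is reported as a pair `(sign, k)` standing for $\omega^{k}$ consecutive copies of that sign (so `k = 0` is a single sign), and the name `omega_power_blocks` is hypothetical. No absorption of adjacent blocks is performed.

```python
def omega_power_blocks(a):
    """Blocks of the sign sequence of omega^a, for a finite sign string a."""
    blocks = [('+', 0)]            # the sequence always begins with one plus
    pluses = 0                     # pluses in the initial segment read so far
    for sign in a:
        # Position alpha contributes omega^(a_alpha + 1) copies of a(alpha).
        blocks.append((sign, pluses + 1))
        if sign == '+':
            pluses += 1
    return blocks


first = omega_power_blocks('+-')       # the example a = (+-)
second = omega_power_blocks('-+-++')   # the example a_0 = (-+-++)
```

For `'+-'` the blocks are one plus, $\omega$ pluses, and $\omega^2$ minuses, and for `'-+-++'` they reproduce the list in the second example before the $\omega^2$ pluses are absorbed into the $\omega^3$ pluses.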

The example suggests that the formula can be simplified if we consider blocks of pluses and minuses in a surreal number $a$ rather than individual signs. In fact, this can and will be done later. However, the present form is appropriate for the inductive proof.

Proof. We do this by induction on the length of the sign sequence $g(x)$ obtained from $x = \sum_{i<n}\omega^{a_i}r_i$ by the formula in the statement of the theorem. We want to prove that $g(x) = x$.

First we show that $g$ is one-to-one. Suppose $x = \sum_{i<m}\omega^{a_i}r_i$, $y = \sum_{i<n}\omega^{b_i}s_i$, and $g(x) = g(y)$. Assume first that $a_0 = b_0$. Then $g(x)$ and $g(y)$ have the same tail after discarding the initial segments corresponding to $\omega^{a_0} = \omega^{b_0}$. Now $a_0^{+}, a_1^{+}, \dots, a_{m-1}^{+}$ is a decreasing sequence of ordinals since $a_{i+1} < a_i$ and $a_{i+1}$ is an initial segment of $a_i$. Hence the length of the tail has the form $\sum_{i<m}\omega^{a_i^{+}}n_i$.

Thus the length of the tail determines $a_i^{+}$ and $n_i$ uniquely. Furthermore, $a_i^{+}$ determines $a_i$ uniquely. This is because $a_i$ is obtained from $a_0$ by stopping at a plus. Finally, the signs of the various strings determine whether $r_i$ is an integer or a dyadic with numerator 1 and the value of $n_i$ determines $r_i$. Thus $x = y$.

Now we rule out the possibility that $a_0 \ne b_0$. If neither $a_0$ nor $b_0$ is an initial segment of the other, then clearly a discrepancy between $g(x)$ and $g(y)$ arises at the point where $a_0$ and $b_0$ differ. Suppose without loss of generality that $a_0$ is an initial segment of $b_0$, and consider the tail following the sequence for $\omega^{a_0}$. The length of the tail of $g(x)$ has the form $\sum_{i<m}\omega^{a_i^{+}}n_i$ which is less than $\omega^{a_0^{+}+1}$. The tail of $g(y)$ begins with a string of $\omega^{a_0^{+}+1}$ identical signs. So certainly $g(x) \ne g(y)$.

It’s clear that $g(x)$ has finite length only if $x$ has the form $\omega^{0}r = r$, in which case the formula is consistent with what we already know about dyadic fractions.

Now let $x = \omega^{a}$. We look at the canonical representation $g(x) = F|G$. An element of $F$ is obtained by stopping just before a plus in the sequence for $g(x)$. If $a$ consists only of minuses then the only plus in $x$ is at the beginning so $F = \{0\}$. Otherwise, a


plus in x rises from a plus in a at some place α where a(α) = +. Let b be the initial

segment of a of length α. There are ωb++1 pluses in x arising from that plus in a. Thenthe typical element of  F  is obtained by juxtaposition of the sequence arising from bwith c pluses where c < ωb++1. By cofinality we may just as well limit ourselves t

values of  c of the form ωb

+

n for positive integers n. But by case ii and the inductivehypothesis, such an element is ωb(n + 1). Similarly an element of  G is obtained bystopping just before a minus in the sequence for g(x). Let b be an initial segment of a of length α where a(α) = −. Then (again using cofinality) we obtain the typical

element of  G by juxtaposition of the sequence arising from b ωb+n minuses. By theinductive hypothesis this is ωb 1

2n . Therefore

g(x) = {0, ωan}|{ωa 1

2n} = ωa = x.

Next let x = ωan, where n is a positive integer larger than 1. Since g(x) is obtainedfrom g(ωa) by adding pluses only, both have the same upper elements. For the lowerelements we may limit ourselves to the contribution of the term n by cofinality. Thuswe add on c pluses to the sequence for ωa, where c < ωa+(n

−1). Again by cofinality we

assume that c has the form ωa+(n−1)+ωbm where b is an ordinal less than a+, and m isa positive integer. Now as a runes through all the initial segments of  a less than a, a+

runs through all ordinals less than a+, so c has the form ωa+(n−2)+ ωa+m. But this isexactly g(ωa(n − 1) + ωam) = ωa(n − 1) + ωam by the inductive hypothesis. (If thereare no terms a this reduces to ωa(n − 1).) In any case the upper and lower elementsfor g(x) are just what we need by lemma 10.95 ii to deduce that g(x) = ωan = x.

Now let $x = \omega^{a}\frac{1}{2^{n}}$. Since this is similar to the previous case it suffices to outline the argument. $g(x)$ has the same lower elements as $g(\omega^{a}) = \omega^{a}$. The upper elements are obtained by adding on $\omega^{a^{+}}(n-1) + \omega^{(a')^{+}} m$ minuses. By the inductive hypothesis this is $\omega^{a}\frac{1}{2^{n-1}} - \omega^{a'} m$. Since $\frac{1}{2^{n}} = \{0\}|\{\frac{1}{2^{n-1}}\}$ we have just what we need by lemma 10.95 i to deduce that $g(x) = \omega^{a}\frac{1}{2^{n}} = x$.

We now let $x = \sum_{i<n}\omega^{a_i} r_i$. The argument is a straightforward application of lemma 10.96. Let $g(x) = F|G$ be the canonical representation of $g(x)$. Suppose first that $r_j$ isn’t a positive integer but $r_i$ is a positive integer for $i > j$. In order to get a set $G'$ cofinal in $G$, it suffices to consider a set of minuses in $g(x)$ which is arbitrarily far out. We obtain this from the contribution of the term $\omega^{a_j} r_j$. Thus a typical element of $G'$ is obtained by juxtaposing the sequence obtained from the truncated sum up to the $j$th term with a typical upper element in the canonical representation of $g(\omega^{b_j} r_j)$, where $b_j$ is obtained from $a_j$ as in the statement of part iii of the theorem. In other words, the typical element of $G'$ has the form

\[ g\Big(\sum_{i<j}\omega^{a_i} r_i + \omega^{a_j} r_j'' - \omega^{a_j'} m\Big) \]

where $r_j''$, $a_j'$, and $m$ are as in the statement of lemma 10.96 ii. This follows by the earlier part of the proof dealing with monomials. By the inductive hypothesis this is $\sum_{i<j}\omega^{a_i} r_i + \omega^{a_j} r_j'' - \omega^{a_j'} m$.

If $r_i$ is a positive integer for all $i$, then $g(x)$ is obtained from $g(\omega^{a_0})$ by adding on pluses only. Hence $g(x)$ and $g(\omega^{a_0})$ have the same upper elements, namely $\omega^{a_0''}\varepsilon$.


A similar argument applies to the lower elements. In all cases the upper and lower elements obtained for $g(x)$ are just what we need by lemma 10.96 to deduce that $g(x) = x$. (For convenience we unified various cases. For example, in lemma 10.96 part i may be regarded as a special case of part ii, and part iv of part iii. There is a pedagogical advantage in separating cases at the beginning for the sake of concreteness, but at a later stage it’s repetitious and tedious.)

We are now ready to determine the sign sequence of a general sum $\sum_{i<\alpha}\omega^{a_i} r_i$. First, we define what we mean by a reduced sequence $a_\beta^{\circ}$ of $a_\beta$, where $\beta < \alpha$.

The reduced sequence $a_\beta^{\circ}$ of $a_\beta$ is obtained from $a_\beta$ by discarding the following minuses occurring in $a_\beta$:

I. if $a_\beta(\delta) = -$ and there exists $\gamma < \beta$ such that $(\forall x \le \delta)(a_\gamma(x) = a_\beta(x))$, then the $\delta$th minus is discarded

II. if $\beta$ is a non-limit ordinal, $a_\beta$ has $a_{\beta-1}$ followed by a minus as an initial segment, and $r_{\beta-1}$ isn’t dyadic, then the last minus is discarded.

For example, if $a_0 = (+−++)$ and $a_1 = (+−+−)$ then $a_1^{\circ} = (++−)$, i.e. the first minus is discarded but not the second one.

If $a_0 = (+++)$ and $a_1 = (+++−)$ then $a_1^{\circ} = (+++)$ if $r_0$ isn’t dyadic but $a_1^{\circ} = a_1$ if $r_0$ is a dyadic fraction.

Roughly speaking, we ignore minuses which occur earlier; however, the second part gives a special situation where even a new minus is ignored.
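The two discarding rules can be tried out on finite sign sequences. The following is only a sketch: the function name `reduced_sequence`, the string encoding of sign sequences, and the `dyadic` flag list are illustrative assumptions, and rule II is read here as discarding the minus that immediately follows $a_{\beta-1}$ inside $a_\beta$.

```python
def reduced_sequence(exponents, dyadic, beta):
    """Reduced sequence of exponents[beta] for finite sign strings like "+-+-".

    exponents: list of sign strings a_0, a_1, ... (hypothetical encoding)
    dyadic:    dyadic[i] is True when the coefficient r_i is dyadic
    """
    a_beta = exponents[beta]
    discard = set()
    # Rule I: drop the minus at position d when some earlier exponent agrees
    # with a_beta on every position up to and including d.
    for d, sign in enumerate(a_beta):
        if sign == "-" and any(
            exponents[g][: d + 1] == a_beta[: d + 1] for g in range(beta)
        ):
            discard.add(d)
    # Rule II (assumed reading): if a_beta starts with a_{beta-1} followed by
    # a minus and r_{beta-1} is not dyadic, drop that minus.
    if beta > 0:
        prev = exponents[beta - 1]
        if a_beta.startswith(prev + "-") and not dyadic[beta - 1]:
            discard.add(len(prev))
    return "".join(s for d, s in enumerate(a_beta) if d not in discard)


print(reduced_sequence(["+-++", "+-+-"], [True, True], 1))
print(reduced_sequence(["+++", "+++-"], [False, True], 1))
```

Both calls reproduce the two examples given above; transfinite sequences are outside the scope of this sketch.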

Theorem 10.98.

i. The sign sequence for $\omega^{a} r$ for positive real $r$ is obtained by juxtaposing the sequence for $\omega^{a}$ with the sequence obtained from $r$ by omitting the first plus and repeating each sign in $r$ $\omega^{a^{+}}$ times.

ii. If $r$ is negative the sign sequence in i is reversed.

iii. The sign sequence for $\sum_{i<\alpha}\omega^{a_i} r_i$ is obtained by juxtaposition of the sign sequences for the successive $\omega^{a_i^{\circ}} r_i$ where $a_i^{\circ}$ is the reduced sequence of $a_i$.

Remark 10.99. Theorem 10.97 is, of course, a special case of theorem 10.98. In fact, recall that in theorem 10.97 we were interested primarily in the sign sequence for $\omega^{a}$ and therefore used only those sums which were needed for the induction.

We illustrate theorem 10.98 i with a simple example. Consider $\omega^{\frac{1}{2}}\cdot\frac{3}{4}$. Here $a = (+−)$ and $r = (+−+)$. We already saw earlier that $\omega^{\frac{1}{2}}$ gives rise to $\omega$ pluses followed by $\omega^{2}$ minuses. Since $a^{+} = 1$ the contribution from $r$ is $\omega$ minuses followed by $\omega$ pluses.

As a simple example of theorem 10.98 iii consider $\omega^{\frac{1}{2}} + \omega^{\frac{1}{8}}$. $\frac{1}{8} = (+−−−)$. Since $\frac{1}{2} = (+−)$ we ignore the first minus in determining the contribution of the term $\omega^{\frac{1}{8}}$. This therefore becomes $\omega$ pluses followed by $\omega^{2}\cdot 2$ minuses.
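For a finite exponent $a$ and a positive dyadic coefficient $r$, the contribution of $r$ described in theorem 10.98 i can be computed mechanically. The sketch below is illustrative only: `dyadic_signs` and `coefficient_blocks` are hypothetical helper names, and a pair `(sign, e)` is an assumed encoding for "$\omega^{e}$ copies of `sign`".

```python
from fractions import Fraction


def dyadic_signs(r):
    """Sign sequence of a dyadic rational, e.g. 3/4 -> "+-+"."""
    r = Fraction(r)
    d = r.denominator
    if d & (d - 1):
        raise ValueError("r must be a dyadic rational")
    value, step, signs, turned = Fraction(0), Fraction(1), "", False
    while value != r:
        sign = "+" if r > value else "-"
        # Steps have size 1 until the first sign change, then halve each time.
        if signs and (turned or sign != signs[-1]):
            turned = True
            step /= 2
        value += step if sign == "+" else -step
        signs += sign
    return signs


def coefficient_blocks(r, a_plus):
    """Blocks contributed by a positive dyadic r to the sequence of omega^a * r.

    a_plus is the number of pluses in a; the first plus of r is omitted and
    every remaining sign is repeated omega^(a_plus) times.
    """
    signs = dyadic_signs(r)
    if not signs.startswith("+"):
        raise ValueError("r must be positive")
    return [(s, a_plus) for s in signs[1:]]


print(dyadic_signs(Fraction(1, 8)))
print(coefficient_blocks(Fraction(3, 4), 1))
```

With $a = (+−)$, so that $a^{+} = 1$, the second call gives one block of $\omega$ minuses followed by one block of $\omega$ pluses, in agreement with the example above.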

Proof. We first consider the case where the surreal number has the form $\sum_{i<n}\omega^{a_i} r_i$ with $r_i$ arbitrary dyadic. This is a slight generalization of theorem 10.97. The proof that $g$ is one-to-one extends immediately to this case. For example, the signs of the strings still determine $r_i$ uniquely. Recall also that lemmas 10.95 and 10.96 deal with general dyadic coefficients. This gives us a head start in imitating the proof of theorem 10.97.

The subcase where $x = \omega^{a} r$ with $r$ a positive dyadic fraction but neither an integer nor a dyadic fraction with numerator one is similar to the cases $x = \omega^{a} n$ and $x = \omega^{a}\frac{1}{2^{n}}$ dealt with in the proof of theorem 10.97. In fact, let $r = \{r'\}|\{r''\}$. Note that $r'$ is the initial segment of $r$ obtained by stopping just before the last plus and $r''$ by stopping just before the last minus. By hypothesis $r$ begins with a plus following which there is at least one plus and one minus. Hence in the canonical representation of $g(x)$ we obtain cofinality on both sides by limiting ourselves to the contribution of the term $r$. By further use of cofinality a typical upper element is obtained by adding on $\omega^{(a')^{+}} m$ minuses to $g(\omega^{a} r'')$ and similarly for typical lower elements. By the inductive hypothesis such a typical upper element is $\omega^{a} r'' - \omega^{a'} m$. This and a similar result for lower elements gives us what we need by lemma 10.95 i to deduce that $g(x) = \omega^{a} r = x$.

For finite sums the proof is identical to that of theorem 10.97 iii since no use is made there of the assumption that the dyadic $r_i$ have numerator one.

Next, we consider the case $x = \omega^{a} r$ where $r$ is not dyadic. The last part of the proof that $g$ is one-to-one, which depends only on length, no longer works. For example $g(\omega^{2})$ and $g(\omega\sqrt{2})$ have the same length $\omega^{2}$ by the formula. (Of course, if one is interested, the proof can be extended to the present case by noting that even if the length of the tail of $g(x)$ is $\omega^{a_0^{+}+1}$ the signs are necessarily not all alike so $g(x) \ne g(y)$. On the other hand, this isn’t too important now since it’s no longer urgent to know in advance that $g$ is one-to-one.)

Let $r = R'|R''$ be the canonical representation. By lemma 10.83 $\omega^{a} r$ may be expressed as $\omega^{a} R'|\omega^{a} R''$ since $r$ isn’t dyadic. (Note the crucial simplification for non-dyadic coefficients where we get by with lemma 10.83 rather than lemma 10.95 i.) Consider the lower elements in the canonical representation of $g(x)$. Since $r$ doesn’t have a last plus we obtain a cofinal subset by taking only those initial segments which stop just before a string of pluses which correspond to a plus in $r$. But this is precisely of the form $g(\omega^{a} r') = \omega^{a} r'$ since we have the result for dyadic coefficients. Similarly a typical upper element has the form $\omega^{a} r''$. Hence $g(x) = \omega^{a} r'|\omega^{a} r'' = \omega^{a} r = x$.

The fact that it’s possible for $x$ to be an initial segment of $y$ and still have $n(x) > n(y)$ has helped to complicate the proof so far. We were unable to use induction on $n(x)$ which a priori seems like the natural type of induction to use. Fortunately, for the rest of the proof we can use induction on a quantity closely related to $n(x)$ instead of the length of $g(x)$.

We now define $rn(x)$ (the reduced normal length of $x$). Let $\sum_{i<\alpha}\omega^{a_i} r_i$ be the normal form of $x$. First we define $h(i)$ for $i < \alpha$.

a. If $r_i$ isn’t dyadic then $h(i) = i + 1$ providing $i + 1 < \alpha$.

b. If $r_i$ is dyadic then $h(i)$ is the least $j$ exceeding $i$ such that either $r_j$ isn’t dyadic or $(\exists k)(i \le k < j)$ and $a_j$ isn’t an initial segment of $a_k$, providing $j < \alpha$. $h(i)$ may be undefined for some $i$.

We obtain a subsequence $d_i$ of $\alpha$ as follows:

a. $d_0 = 0$.

b. $d_{i+1} = h(d_i)$.


c. If $\beta$ is a limit ordinal then $d_\beta = \lim_{\gamma<\beta} d_\gamma$ providing $\lim_{\gamma<\beta} d_\gamma < \alpha$.

Finally, $rn(x)$ is the ordinal type of the sequence $\{d_i\}$ (i.e. the least $i$ such that $d_i$ is undefined).
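For a finite normal form (finite $\alpha$, finite exponents) the definition of $h$ and of the sequence $d_i$ can be followed step by step. This is a minimal sketch under those assumptions; the names `h` and `reduced_normal_length`, the string encoding of the exponents, and the `dyadic` flag list are illustrative, and the limit clause c never comes into play for finite $\alpha$.

```python
def h(i, exponents, dyadic):
    """h(i) for a finite normal form, or None when h(i) is undefined."""
    alpha = len(exponents)
    if not dyadic[i]:
        # Case a: a non-dyadic coefficient forms a block on its own.
        return i + 1 if i + 1 < alpha else None
    # Case b: extend the block while coefficients stay dyadic and each new
    # exponent is an initial segment of all exponents already in the block.
    for j in range(i + 1, alpha):
        if not dyadic[j]:
            return j
        if any(not exponents[k].startswith(exponents[j]) for k in range(i, j)):
            return j
    return None


def reduced_normal_length(exponents, dyadic):
    """Number of defined d_i, with d_0 = 0 and d_{i+1} = h(d_i)."""
    if not exponents:
        return 0
    count, d = 0, 0
    while d is not None:
        count += 1
        d = h(d, exponents, dyadic)
    return count


# omega^2 + omega + 1: every exponent is an initial segment of the previous
# one and all coefficients are dyadic, so the sum counts as a single block.
print(reduced_normal_length(["++", "+", ""], [True, True, True]))
# omega^(1/2) + omega^(1/8): (+---) is not an initial segment of (+-).
print(reduced_normal_length(["+-", "+---"], [True, True]))
```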

Note that the definition says that if $h(i) = j$ then for $i \le k < k + 1 < j$, $a_{k+1}$ is an initial segment of $a_k$. Hence necessarily $j = i + n$ for a finite $n$. Furthermore $\sum_{i \le k<j}\omega^{a_k} r_k$ is a sum of the kind we considered earlier since again by the definition all $r_k$ are necessarily dyadic if the sum doesn’t reduce to a monomial. Thus, informally, we obtain the reduced normal length of $x$ by counting each such sum as a single term.

We now need a lemma which bears the same relation to lemma 10.96 that lemma 10.84 bears to lemma 10.83. Let $x$ have the form $\sum_{i<\alpha+n}\omega^{a_i} r_i$ where for all $i \ge \alpha$, $a_{i+1}$ is an initial segment of $a_i$ and $r_i$ is dyadic. I.e. $x$ has the form $\sum_{i<\alpha}\omega^{a_i} r_i + y$ where $y$ is a finite sum of the kind considered earlier. Note that $rn(x) \le rn\big(\sum_{i<\alpha}\omega^{a_i} r_i\big) + 1$.

Lemma 10.100. Let $x$ have the normal form $\sum_{i<\alpha+n}\omega^{a_i} r_i$ and express $x$ as $\sum_{i<\alpha}\omega^{a_i} r_i + y$. Suppose that $y$ is a surreal number which satisfies the hypothesis of lemma 10.96. Then $x$ can be expressed in the form $F|G$ where a typical element $x''$ of $G$ is obtained as follows:

a. If $y$ satisfies case i or ii of lemma 10.96 then $x'' = \sum_{i<\alpha}\omega^{a_i} r_i + y''$ where $y''$ is as in the lemma.

b. If $y$ satisfies case iii and $(\exists a_\alpha'')(\forall i)(i < \alpha \to a_\alpha'' < a_i)$ then $x'' = \sum_{i<\alpha}\omega^{a_i} r_i + \omega^{a_\alpha''}\varepsilon$.

c. If $y$ satisfies case iii and no such $a_\alpha''$ exists or if $y$ satisfies case iv (in which case there certainly doesn’t exist an $a_\alpha''$) then $x''$ is a typical upper element in the representation of $\sum_{i<\alpha}\omega^{a_i} r_i$ as given by the definition if $\alpha$ is a limit ordinal and by lemma 10.84 if $\alpha$ is a non-limit ordinal.

A typical element of $F$ is obtained similarly.

Proof. Typical elements $x''$ are $\sum_{i<\alpha}\omega^{a_i} r_i + y''$ and $z'' + y$ where $z''$ is a typical upper element of $\sum_{i<\alpha}\omega^{a_i} r_i$. In cases a and b terms of the former type are clearly cofinal by the lexicographical order, thus proving the lemma in these cases. In case c consider a typical element of the former type. This has the form $\sum_{i<\alpha}\omega^{a_i} r_i + \omega^{a_\alpha''}\varepsilon$. Then by definition of case c $(\exists j)(j < \alpha \wedge a_j \le a_\alpha'')$. Now $\sum_{i<j}\omega^{a_i} r_i + \omega^{a_j}\varepsilon'$ is clearly less than $\sum_{i<\alpha}\omega^{a_i} r_i + \omega^{a_\alpha''}\varepsilon$ for $\varepsilon' < \varepsilon$ by the lexicographic order. This guarantees that terms of the type $z'' + y$ are cofinal in this case. As usual, by cofinality such terms may be replaced by $z''$.

The distinction between case b and c can be expressed in terms of the sign sequences of the $a_i$’s. Recall that case c is characterized by the condition $(\forall a_\alpha'')(\exists a_j)(j < \alpha \wedge a_j \le a_\alpha'')$. Since the $a_i$’s are decreasing and since the $a_\alpha''$ are initial segments corresponding to minuses in $a_\alpha$, an $a_j$ corresponds to $a_\alpha''$ by the above condition precisely if it either equals $a_\alpha''$ or has $a_\alpha''$ followed by a minus as an initial segment. Furthermore, the existence of an $a_j$ satisfying $a_j < a_\alpha''$ for a given $a_\alpha''$ is precisely condition I in the definition of reduced sequence (theorem 10.97) for discarding minuses for the minus in $a_\alpha$ corresponding to this $a_\alpha''$.


Let us maintain the condition for case c but assume that for some $a_\alpha''$ there is no $a_j$ satisfying $a_j < a_\alpha''$. Then the corresponding $a_j$ satisfies $a_j = a_\alpha''$. Since there is no $a_j$ satisfying $a_j < a_\alpha''$, $j$ must be an immediate predecessor of $\alpha$, i.e. $\alpha = j + 1$. This is condition II in the definition of reduced sequence (theorem 10.97) except for the lack of reference to the nature of $r_{\alpha-1}$. Furthermore the minus corresponding to $a_\alpha''$ must be the last minus in $a_\alpha$.

We now have what we need for the main induction on $rn(x)$. First suppose that $rn(x)$ is a non-limit ordinal. Let $x = \sum_{i<\alpha+n}\omega^{a_i} r_i$ where (($\forall i \ge \alpha$)($a_{i+1}$ is an initial segment of $a_i$, and $r_i$ is dyadic) or ($n = 1$ and $r_\alpha$ is non-dyadic)). Then $x = \sum_{i<\alpha}\omega^{a_i} r_i + y$ where $y$ is a finite sum of the kind considered earlier. By the inductive hypothesis we may assume the result for $\sum_{i<\alpha}\omega^{a_i} r_i$. We now use induction on $g(y^{\circ})$ where $y^{\circ} = \sum_{\alpha \le i<\alpha+n}\omega^{a_i^{\circ}} r_i$. The argument is similar to the one used in the proof of theorem 10.97; however there is a complication because of the need to consider reduced sequences. In this connection note the obvious fact concerning juxtaposition of sequences that if $A = F|G$ where $F$ and $G$ are both non-empty then $SA = SF|SG$.

As before, we desire to show that $g$ is one-to-one, but now we regard $g$ as a function of $y$ using the reduced sequence for fixed $\sum_{i<\alpha}\omega^{a_i} r_i$. This can differ from the earlier case only by the contribution of $a_\alpha$ to the sign sequence since all minuses occurring in $a_i$ for $i > \alpha$ are automatically ignored. Thus it suffices to show that $a_\alpha \to a_\alpha^{\circ}$ is one-to-one. The immediate reaction may be that it’s unreasonable to expect this but recall that $(\forall i < \alpha)(a_i > a_\alpha)$ so that the apparently obvious way of getting counter-examples fails.

Specifically, suppose $a_\alpha \ne b_\alpha$ but $a_\alpha^{\circ} = b_\alpha^{\circ}$. Assume $a_\alpha(j) \ne b_\alpha(j)$ but $a_\alpha(i) = b_\alpha(i)$ for $i < j$. Without loss of generality the dangerous case occurs if $a_\alpha(j)$ is a minus and the minus is ignored in $a_\alpha^{\circ}$. Hence $(\exists \beta < \alpha)(\forall k < j)(a_\beta(k) = a_\alpha(k))$. By the lexicographic order, since $b_\alpha < b_\beta = a_\beta$, $b_\alpha(j)$ is necessarily a minus regardless of whether condition I or II holds in the definition of reduced sequence, thus leading to a contradiction.

Now the sign sequence for $g\big(\sum_{i<\alpha+n}\omega^{a_i} r_i\big)$ is the juxtaposition $\big(g\big(\sum_{i<\alpha}\omega^{a_i} r_i\big)\big)(g(y^{\circ}))$. (Note that we already know by theorem 10.97 that $g(y) = y$ and similarly for $y'$, $y^{\circ}$, etc. but it’s convenient to maintain the notation $g(y)$ for consistency of notation in dealing with juxtaposition.)

Suppose $y$ is positive. A similar argument will apply if $y$ is negative. Suppose first that $g(y^{\circ})$ contains a minus not contributed by $a_\alpha$. This corresponds to $y''$ in lemma 10.100 a. We can obtain subsets cofinal in the canonical representation of $g(x)$ by considering pluses and minuses in the segment within $g(y^{\circ})$. Now if $y = \{y'\}|\{y''\}$ then $y^{\circ} = \{(y')^{\circ}\}|\{(y'')^{\circ}\}$ and furthermore

\[ \Big(g\Big(\sum_{i<\alpha}\omega^{a_i} r_i\Big)\Big)\big(g(y^{\circ})\big) = \Big\{\Big(g\Big(\sum_{i<\alpha}\omega^{a_i} r_i\Big)\Big)\big(g((y')^{\circ})\big)\Big\}\,\Big|\,\Big\{\Big(g\Big(\sum_{i<\alpha}\omega^{a_i} r_i\Big)\Big)\big(g((y'')^{\circ})\big)\Big\} \]

by trivial reasoning with juxtaposition. By the inductive hypothesis this is $\{\sum_{i<\alpha}\omega^{a_i} r_i + y'\}|\{\sum_{i<\alpha}\omega^{a_i} r_i + y''\}$ which is $\sum_{i<\alpha}\omega^{a_i} r_i + y$ as desired by lemma 10.100 a. It’s worth remarking that juxtaposition works trivially in the argument here since the minuses ignored in $y$ depend only on the nature of $a_i$ for $i < \alpha$ so they are the same for all $y'$ and $y''$.

Now suppose all minuses in $g(y^{\circ})$ are contributed by $a_\alpha$, i.e. by minuses in $a_\alpha$ which aren’t ignored. Such a minus corresponds to an $a_\alpha''$. Assume first that case b of lemma 10.100 holds. Then by cofinality we may limit ourselves to such $a_\alpha''$ and the juxtaposition argument is identical to that of the earlier case.

The most subtle case occurs when case b isn’t satisfied. By remark 10.99 this can happen only when $\alpha$ has the form $j + 1$ and the minus corresponding to $a_\alpha'' = a_j$ is the last minus in $a_\alpha$. Since we are dealing with a minus in $a_\alpha$ which isn’t ignored, $r_{\alpha-1}$ is dyadic. (This case is the most delicate with regard to the issue of ignoring minuses.) We obtain a subset cofinal in the upper elements of the canonical representation of $g(x)$ by taking sequences from $g\big(\sum_{i<\alpha}\omega^{a_i} r_i\big)$ followed by $\omega^{a_\alpha^{+}}$ pluses and $\omega^{a_\alpha^{+}} n$ minuses for some integer $n$. This follows because first of all minuses preceding the one corresponding to $a_\alpha''$ are ignored in $y^{\circ}$. Next it follows from the identity $\sum_{i<\alpha}\omega^{i+1} = \omega^{\alpha}$ that $g(y^{\circ})$ begins with $\omega^{a_\alpha^{+}}$ pluses.

Finally, there are no minuses following the succeeding $\omega^{a_\alpha^{+}+1}$ minuses contributed by the minus corresponding to $a_\alpha''$. Now $\sum_{i<\alpha}\omega^{a_i} r_i = \sum_{i<j}\omega^{a_i} r_i + \omega^{a_j} r_j$. Also the tail of $g\big(\sum_{i<\alpha}\omega^{a_i} r_i\big)$ contributed by $\omega^{a_j} r_j$ followed by $\omega^{a_j^{+}}$ pluses and $\omega^{a_j^{+}} n$ minuses is exactly the contribution of a term of the form $\omega^{a_j}(r_j + \varepsilon)$ for some positive dyadic $\varepsilon$ to $g\big(\sum_{i<j}\omega^{a_i} r_i + \omega^{a_j}(r_j + \varepsilon)\big)$, which is $\sum_{i<\alpha}\omega^{a_i} r_i + \omega^{a_j}\varepsilon$ by the main inductive hypothesis (the one on $rn(x)$). Hence $g(x)$ has the form $\{\sum_{i<\alpha}\omega^{a_i} r_i + y'\}|\{\sum_{i<\alpha}\omega^{a_i} r_i + \omega^{a_j}\varepsilon\}$ which is $x$ by lemma 10.100 c.

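Now suppose $g(y^{\circ})$ doesn’t contain a minus. Then any upper element $(g(x))''$ in the canonical representation of $g(x)$ corresponds to a minus in $g\big(\sum_{i<\alpha}\omega^{a_i} r_i\big)$, i.e. is an upper element in the canonical representation of the latter. Now

\[ g\Big(\sum_{i<\alpha}\omega^{a_i} r_i\Big) = \sum_{i<\alpha}\omega^{a_i} r_i = \Big\{\sum_{i<\gamma}\omega^{a_i} r_i - \omega^{a_\gamma}\varepsilon'\Big\}\,\Big|\,\Big\{\sum_{i<\gamma}\omega^{a_i} r_i + \omega^{a_\gamma}\varepsilon''\Big\} \]

where $\gamma < \alpha$, and $\varepsilon'$ and $\varepsilon''$ are dyadic with numerator 1. By the inverse cofinality theorem for arbitrary $(g(x))''$ there exist $\gamma$ and $\varepsilon''$ such that

\[ (g(x))'' \ge \sum_{i<\gamma}\omega^{a_i} r_i + \omega^{a_\gamma}\varepsilon'' = g\Big(\sum_{i<\gamma}\omega^{a_i} r_i + \omega^{a_\gamma}\varepsilon''\Big) \]

by the main inductive hypothesis.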

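(Note that since $\varepsilon''$ is dyadic it’s guaranteed that $rn\big(\sum_{i<\gamma}\omega^{a_i} r_i + \omega^{a_\gamma}\varepsilon''\big) \le rn\big(\sum_{i<\alpha}\omega^{a_i} r_i\big)$.)

We must now show that $g\big(\sum_{i<\gamma}\omega^{a_i} r_i + \omega^{a_\gamma}\varepsilon''\big)$ is greater than $g\big(\sum_{i<\alpha}\omega^{a_i} r_i\big)(g(y^{\circ})) = g(x)$.

Of course $g\big(\sum_{i<\gamma}\omega^{a_i} r_i + \omega^{a_\gamma}\varepsilon''\big) = \sum_{i<\gamma}\omega^{a_i} r_i + \omega^{a_\gamma}\varepsilon'' > \sum_{i<\alpha}\omega^{a_i} r_i = g\big(\sum_{i<\alpha}\omega^{a_i} r_i\big)$.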

Thus by the lexicographical order the only difficulty arises when $\sum_{i<\alpha}\omega^{a_i} r_i$ is an initial segment of $\sum_{i<\gamma}\omega^{a_i} r_i + \omega^{a_\gamma}\varepsilon''$. In order for this to occur $r_\gamma$ is necessarily dyadic. $\sum_{i<\gamma}\omega^{a_i} r_i + \omega^{a_\gamma}\varepsilon''$ is the sequence $\sum_{i<\gamma}\omega^{a_i} r_i$ followed by $\omega^{a_\gamma^{+}}$ pluses and $\omega^{a_\gamma^{+}} n$ minuses for some integer $n$. Therefore the contribution of $\sum_{\gamma<i<\alpha}\omega^{a_i} r_i$ to $\sum_{i<\alpha}\omega^{a_i} r_i$ consists only of pluses. Necessarily $\gamma < i \to r_i$ is a positive integer and $\gamma \le i \to a_{i+1}$ is an initial segment of $a_i$. The latter follows since $a_{i+1} < a_i$ as, otherwise, $a_{i+1}$ would have a minus not occurring in $a_i$ and, a fortiori, not in any $a_j$ for $j < i$, and would thus contribute a minus to $\sum_{i<\alpha}\omega^{a_i} r_i$. It follows that $\alpha = \gamma + n$ for some integer $n$. So $\sum_{\gamma<i<\alpha}\omega^{a_i} r_i$ contributes $\sum_{\gamma<i<\alpha}\omega^{a_i^{+}} r_i$ pluses to $\sum_{i<\alpha}\omega^{a_i} r_i$ where $\gamma \le i \to a_i^{+} > a_{i+1}^{+}$. Furthermore $g(y^{\circ})$ contains only pluses. Since $r_{\alpha-1}$ is dyadic all minuses in $a_\alpha$ are contained in $a_{\alpha-1}$; hence the number of pluses in $g(y^{\circ})$ is bounded above by a number of the form $\omega^{a_\alpha^{+}} m$ for some integer $m$. Since $a_\alpha^{+} < a_{\alpha-1}^{+}$ this finally shows that $g\big(\sum_{i<\alpha}\omega^{a_i} r_i\big) g(y^{\circ})$ consists of the sequence $\sum_{i<\gamma}\omega^{a_i} r_i$ followed by less than $\omega^{a_\gamma^{+}}$ pluses. This proves the inequality.

We now have what we need by the cofinality theorem to deduce that $g(x)$ may be expressed in the form $\{\sum_{i<\alpha}\omega^{a_i} r_i\}|\{\sum_{i<\gamma}\omega^{a_i} r_i + \omega^{a_\gamma}\varepsilon''\}$. Case c of lemma 10.100 applies, so the above is a representation of $x$; thus finally $g(x) = x$.

This completes the induction for the case where $rn(x)$ is a non-limit ordinal. Now suppose $rn(x)$ is a limit ordinal. Then $rn(x) = n(x)$. (In general, if $n(x)$ has the form $\omega a + m$, then $rn(x)$ has the form $\omega a + n$ since the blocks used in the definition of $rn(x)$ are all finite.) Thus we may assume that $x$ has the form $\sum_{i<\alpha}\omega^{a_i} r_i$ for a limit ordinal $\alpha$ and that the result is known to be valid for $\sum_{i<\beta}\omega^{a_i} r_i$ for $\beta < \alpha$.

We shall show that the representations of $x$ given by the definition and the canonical representation of $g(x)$ are mutually cofinal which is enough to guarantee that $g(x) = x$. It suffices to consider upper sums since the argument is similar for lower sums.

Let $y$ be an arbitrary upper element of $g(x)$. Then $y$ is an initial segment of $x$ determined by a minus sign contributed by a term $\omega^{a_\beta} r_\beta$ for $\beta < \alpha$. Since $\alpha$ is a limit ordinal, $\beta + 1 < \alpha$. Consider $z = \sum_{i<\beta+1}\omega^{a_i} r_i + \omega^{a_{\beta+1}}\varepsilon$ which is an upper element of $x$. By the inductive hypothesis this is $g(z)$ which contains $g\big(\sum_{i<\beta}\omega^{a_i} r_i\big)$ as an initial segment which in turn contains $y$ followed by a minus as an initial segment. Hence $z < y$ by the lexicographic order.

Now let $y$ be an arbitrary upper element of $x$. Then $y$ has the form $\sum_{i<\beta}\omega^{a_i} r_i + \omega^{a_\beta}\varepsilon$ where we may assume that $\varepsilon$ is a dyadic with numerator one.

We claim first that the set of $\gamma$ for which $\omega^{a_\gamma} r_\gamma$ contributes a minus to $g(x)$ is cofinal in $\alpha$. Otherwise suppose there exists a $\gamma$ such that the contribution of $\sum_{\gamma<i<\alpha}\omega^{a_i} r_i$ to $g\big(\sum_{i<\alpha}\omega^{a_i} r_i\big)$ consists only of pluses. As in the proof of the case where $\alpha$ is a non-limit ordinal we obtain that $a_{i+1}$ is an initial segment of $a_i$ for $\gamma \le i$. However, since $\alpha$ is a limit ordinal this already is a contradiction.

It follows that $g(x)$ cannot be an initial segment of $y$ since otherwise all contributions of $\omega^{a_i} r_i$ for $\beta < i$ to $g(x)$ would consist only of pluses which we just noted is impossible.

Hence $g(x)$ is defined at the least ordinal $j$ for which $g(x)$ and $y$ differ. The sign $(g(x))(j)$ is contributed from a term $\omega^{a_\gamma} r_\gamma$. Suppose $(g(x))(j) = +$. By the lexicographical order this would imply that $y < g\big(\sum_{i<\delta}\omega^{a_i} r_i\big) = \sum_{i<\delta}\omega^{a_i} r_i$ where $\delta = \max(\beta + 1, \gamma)$ which is false. Hence $(g(x))(j) = -$. By what was said earlier there exists $\delta > \gamma$ such that $\omega^{a_\delta} r_\delta$ also contributes a minus to $g(x)$. This determines an upper element $z$ of $g(x)$. $g(z)$ contains $g\big(\sum_{i<\gamma}\omega^{a_i} r_i\big)$ as an initial segment since $\delta > \gamma$. Hence by the lexicographical order $z = g(z) < y$ which is precisely what we need.

This finally completes the proof that $g(x) = x$ in all cases.

Now that we have the fundamental relation between the sign sequence of a surreal number and its normal form, we can study the surreal numbers in more detail. For this purpose it will be convenient to express theorem 10.97 i in a form which considers contributions to the sign sequence of $\omega^{a}$ by strings of pluses and minuses. Specifically, if a string begins at the $i$th place in $a$ then the next string begins at the $j$th place where $j$ is the least ordinal larger than $i$ such that $a(j) \ne a(i)$.

Corollary 10.101. The sign sequence of $\omega^{a}$ consists of the following juxtaposition. We begin with a plus. For each string of pluses in $a$ we have a string of $\omega^{b}$ pluses where $b$ is the total number of pluses in $a$ up to and including the string. For each string of minuses in $a$ we have a string of $\omega^{b+1} c$ minuses where $b$ is the total number of pluses in $a$ up to the string and $c$ is the number of minuses in the string. $a$ is regarded as beginning with a string of pluses.

Proof. This follows immediately from theorem 10.97 i. For pluses we use the identity $\sum_{\beta<i<\alpha}\omega^{i+1} = \omega^{\alpha}$ for $\beta < \alpha$. For minuses $\alpha$ remains fixed during a string. (No minus contributes a plus!) Finally if $a$ begins with a minus then $a$ may be regarded as beginning with 0 pluses giving rise to $\omega^{0} = 1$ plus in $\omega^{a}$. Thus the last statement is a convenient way of unifying the cases where $a$ begins with a plus and where $a$ begins with a minus. In the former case the first plus is superfluous by absorption, so the statement gives the plus precisely when it should.

To illustrate this we refer back to the second example in remark 10.99. We had $a_0 = (−+−++)$. The string of two pluses at the end gives rise to $\omega^{3}$ pluses since there’s a total number of three pluses up to and including that string.

Note finally how strings of pluses and strings of minuses are treated in entirely different ways.
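For a finite sign sequence $a$ the corollary can be applied mechanically. The following sketch assumes a string encoding of $a$ and reports each block as a triple `(sign, e, c)` meaning "$\omega^{e} c$ copies of `sign`"; the function name `omega_power_blocks` and this encoding are illustrative choices, and no absorption of earlier blocks into later ones is carried out.

```python
from itertools import groupby


def omega_power_blocks(a):
    """Blocks of the sign sequence of omega^a for a finite sign string a."""
    blocks = [("+", 0, 1)]  # the initial plus: omega^0 * 1 = one sign
    pluses = 0              # pluses of a read so far
    for sign, run in groupby(a):
        length = len(list(run))
        if sign == "+":
            pluses += length
            # omega^b pluses, b counting pluses up to and including the string
            blocks.append(("+", pluses, 1))
        elif sign == "-":
            # omega^(b+1) * c minuses, b pluses so far, c minuses in the string
            blocks.append(("-", pluses + 1, length))
        else:
            raise ValueError("a must consist of '+' and '-' only")
    return blocks


print(omega_power_blocks("+-"))
print(omega_power_blocks("-+-++"))
```

For $a = (+−)$ this gives one plus, $\omega$ pluses and $\omega^{2}$ minuses; the single plus is absorbed by the $\omega$ pluses that follow. For $a = (−+−++)$ the last block is $\omega^{3}$ pluses, as in the illustration above.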

11 Uncategorized proofs

In this section I’ll list some of the proofs which I’ve yet to properly incorporate, and the plan is to ultimately incorporate them. In most cases this will require some alteration of the proof to rigorously fit the context.

However, they are essential to the fictional section of the cosmology, so I might as well cover them.

Theorem 11.1. There are infinitely many prime numbers.

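Proof. Assume that there’s a greatest prime number $p$. The number $p! + 1$ leaves remainder 1 when divided by any integer from 2 to $p$, so no prime up to $p$ divides it. Since $p! + 1 > 1$ it has at least one prime factor, and that factor must be greater than $p$, which contradicts the choice of $p$. (Note that $p! + 1$ need not itself be prime: $5! + 1 = 121 = 11^{2}$.)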

Proof. A variation of the proof is to multiply the highest prime number $p$ with every preceding prime number and then add one. The product $\prod_{k=1}^{n} p_k$, where $p_k$ is the $k$th prime number, is evenly divisible by each of the primes $p_1, \dots, p_n$, so $\prod_{k=1}^{n} p_k + 1$ leaves remainder 1 on division by each $p_k$. Hence none of $p_1, \dots, p_n$ divides this number, and every one of its prime factors is a prime greater than $p$. Once again this proves that there’s no greatest prime number, thus the amount is infinite.
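A small numerical check can make the argument concrete. The sketch below uses hypothetical helper names (`smallest_prime_factor`, `new_prime_from`) and plain trial division; it shows that the product of a list of primes plus one need not be prime itself, but its smallest prime factor always lies outside the list.

```python
from math import factorial, prod


def smallest_prime_factor(n):
    """Smallest prime factor of n >= 2, found by trial division."""
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f
        f += 1
    return n


def new_prime_from(primes):
    """A prime not in the list: the smallest prime factor of prod(primes) + 1."""
    return smallest_prime_factor(prod(primes) + 1)


primes = [2, 3, 5, 7, 11, 13]
print(prod(primes) + 1, new_prime_from(primes))
print(factorial(5) + 1, smallest_prime_factor(factorial(5) + 1))
```

Here $2\cdot3\cdot5\cdot7\cdot11\cdot13 + 1 = 30031 = 59\cdot509$ is composite, yet its prime factors are larger than 13; likewise $5! + 1 = 121$ has the prime factor 11.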


Thus $\operatorname{card}(P(X)) = 2^{n}$ for finite $n$.
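For small finite sets the count can be verified by brute force. This is only an illustrative sketch; `power_set` is a hypothetical helper that lists every subset by size.

```python
from itertools import combinations


def power_set(xs):
    """All subsets of the finite collection xs, as a list of sets."""
    xs = list(xs)
    return [set(c) for k in range(len(xs) + 1) for c in combinations(xs, k)]


for n in range(6):
    # A set with n elements has exactly 2**n subsets.
    print(n, len(power_set(range(n))), 2 ** n)
```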

Note that 11.3 is the reason why the power set is referred to as cardinal exponentiation, and often written $2^{N}$. I dislike this notation because it implies that conventional exponentiation in itself is enough to increase the cardinality of a transfinite number. However, the power set is by no means a conventional definition of exponentiation.

As a follow up to the proof of theorem 11.3, we’re now going to examine the relation between hyperoperators and ordinals, as they’re conventionally defined and apply even to the real numbers. A hyperoperator is part of an arithmetic sequence systemizing the notion of incrementing, addition, multiplication, exponentiation, and so forth in the following manner:

$x \uparrow^{-2} y$    $S(x) = x + 1$
$x \uparrow^{-1} y$    $S(S(\dots S(x)\dots)) = x + y$
$x \uparrow^{0} y$    $x + x + \dots + x = x \cdot y$
$x \uparrow^{1} y$    $x \cdot x \cdot \ldots \cdot x = x^{y}$
$x \uparrow^{2} y$    $x^{x^{\cdot^{\cdot^{\cdot^{x}}}}}$ (nested from the right)
$\vdots$

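We realize quickly that the values grow rapidly as the operator index increases. For instance $3 \uparrow^{1} 5 = 243$, while $3 \uparrow^{2} 5$ is a tower of five threes: even if the tower is evaluated from the left, $(((3^{3})^{3})^{3})^{3} = 3^{81} \approx 4.43 \cdot 10^{38}$ with three significant digits, and the right-nested value prescribed by the table above is vastly larger still.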
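For small natural-number arguments the sequence of operators can be evaluated recursively. The sketch below follows the indexing used in this section ($-2$ for the successor, $-1$ for addition, $0$ for multiplication, $1$ for exponentiation, $2$ for right-nested towers); the function name `hyper` and the restriction to integers $y \ge 1$ for $z \ge 2$ are assumptions made for illustration.

```python
def hyper(z, x, y):
    """x (up-arrow)^z y with this section's indexing, for small integers."""
    if z == -2:
        return x + 1          # successor; y is not used
    if z == -1:
        return x + y
    if z == 0:
        return x * y
    if z == 1:
        return x ** y
    if y == 1:
        return x              # a tower (or higher operator) of height one
    # Right-nested recursion: x ^z y = x ^(z-1) (x ^z (y-1)).
    return hyper(z - 1, x, hyper(z, x, y - 1))


print(hyper(-1, 3, 5), hyper(0, 3, 5), hyper(1, 3, 5))
print(hyper(2, 3, 2), hyper(2, 3, 3))
```

Already $3 \uparrow^{2} 3 = 3^{27}$ has thirteen digits, so larger arguments should be avoided with this naive recursion.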

So, what about the ordinals? Can you get from $\omega_{0}$ to $\omega_{1}$ applying any of these operators? The answer is “no”. This is because a hyperoperator is defined in the following sense

\[ x \uparrow^{z} y = \lim_{\zeta \to z}\,\lim_{\eta \to y}\,\lim_{\xi \to x}\, \xi \uparrow^{\zeta} \eta \]

meaning that as long as all of $x$, $y$, and $z$ remain countable, then so will $x \uparrow^{z} y$, due to induction.